Skip to content

Helper Applications

The byconaut repository provides a number of helper applications with different types of functionalities, e.g.

  • data I/O
  • plotting (see plotting)
  • database maintenance
  • data transformation

These applications are in some way used to populate or manage data resources for bycon driven implementations of the Beacon protokol (i.e. genomic data resources).

Plotting Apps

For more information see the dedicated documentation page).

Data transformation & database maintenance

analysesStatusmapsRefresher

This is one of the housekeeping scripts which has to be run after CNV data has been added or modified in the database. It creates CNV status data for binned genome intervals, used for histogram generation, sample clustering etc., as well as some other statistics (e.g. CNV coverage per chromosomal arms ...).

Arguments

  • -d, --datasetIds ... to select the dataset (only one per run)
  • --filters ... to (optionally) limit the processing to a subset of samples (e.g. after a limited update)

Use

  • bin/analysesStatusmapsRefresher.py -d progenetix
  • bin/analysesStatusmapsRefresher.py -d progenetix --filters "pgx:icdom-81703"
  • bin/analysesStatusmapsRefresher.py -d cellz --filters "cellosaurus:CVCL_0312"

collationsCreator

The collationsCreator script updates the dataset specific collations collections which provide the aggregated data (sample numbers, hierarchy trees etc.) for all individual codes belonging to one of the entities defined in the filter_definitions in the bycon configuration. The (optional) hierarchy data is provided in rsrc/classificationTrees/__filterType__/numbered-hierarchies.tsv as a list of ordered branches in the format code | label | depth | order.

TBD The filter definition should be one of the configuration where users can provide additions and overrides in the byconaut/local directory.

Arguments

  • -d, --datasetIds ... to select the dataset (only one per run)
  • --filters ... to (optionally) limit the processing to a subset of samples (e.g. after a limited update)

Use

  • bin/collationsCreator.py -d progenetix
  • bin/collationsCreator.py -d examplez --collationTypes "PMID"

frequencymapsCreator

This app creates the frequency maps for the "collations" collection. Basically, all samples matching any of the collation codes and representing CNV analyses are selected and the frequencies of CNVs per genomic bin are aggregated. The result contains teh gain and loss frquencies for all genomic intervals, for the given entity.

Arguments

  • -d, --datasetIds ... to select the dataset (only one per run)
  • --collationTypes ... to (optionally) limit the processing to a selected collation types (e.g. NCIT, PMID, icdom ...)

Use

  • bin/frequencymapsCreator.py -d progenetix
  • bin/frequencymapsCreator.py -d examplez --collationTypes "icdot"

Utility apps

ISCNsegmenter

This is a helper app to transform cytogenetic CGH annotations (rev ish) to the canonical tab-delimited .pgxseg segment file format.

Use

  • bin/ISCNsegmenter.py -i imports/ccghtest.tab -o exports/cghtest-with-histo.pgxseg