byconaut¶
Deprecation of byconaut dependency for bycon installations
Since the bycon v2.0 "Taito City" release, the byconaut project has been
reduced to non-standard functionality. Importantly, "beyond Beacon services",
installation support, example data and data import functions have been migrated into
the bycon project itself. The byconaut project now mainly serves as a playground
for temporary utilities and scripts that use bycon functions for additional
tasks.
Installation¶
byconaut depends on the bycon package, which can be downloaded from its
repository. Please see the repository and the corresponding documentation site.
Although bycon can also be installed with pip (pip3 install bycon), a pip
installation will not include the local configuration files necessary e.g. for
processing the databases.
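As a quick sanity check before running any byconaut script, one can ask Python whether a bycon distribution is installed at all. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of bycon or byconaut.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("bycon")
print("bycon:", version if version else "not installed")
```

Note that this only reports whether the package is installed. It says nothing about whether the local configuration files mentioned above are present.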
Create your own databases¶
Core Data¶
A basic setup for a Beacon compatible database, as supported by the bycon package,
consists of the core data collections mirroring the Beacon default data model:
- variants
- analyses (which covers parameters from both the Beacon analysis and run entity schemas)
- biosamples
- individuals
Databases are implemented in an existing MongoDB setup using utility applications
contained in the importers directory, which import data from tab-delimited data
files. In principle, only 2 import files are needed for inserting and updating records:
* a file for the non-variant metadata[1] with specific header values, where, as
the absolute minimum, id values for the different entities have to be provided
* a file for genomic variants, again with specific headers but also containing
the upstream ids for the corresponding analysis, biosample and individual
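The link between the two files can be illustrated with a small consistency check: every variant row should point at an id combination that also exists in the metadata. This is a minimal sketch and not part of the bycon importers; the function name `find_orphan_variants` is hypothetical, rows are assumed to be already parsed into dictionaries, and the column names are assumed to match the id headers of the minimal metadata file example.

```python
# Assumed id headers, shared by the metadata file and the variant file.
ID_COLUMNS = ("individual_id", "biosample_id", "analysis_id")


def find_orphan_variants(metadata_rows, variant_rows):
    """Return variant rows whose upstream id triple is absent from the metadata."""
    known = {tuple(row[c] for c in ID_COLUMNS) for row in metadata_rows}
    return [
        v for v in variant_rows
        if tuple(v.get(c) for c in ID_COLUMNS) not in known
    ]


metadata_rows = [
    {"individual_id": "BRCA-patient-001", "biosample_id": "brca-001",
     "analysis_id": "brca-001-cnv"},
]
variant_rows = [
    # matches the metadata row above
    {"individual_id": "BRCA-patient-001", "biosample_id": "brca-001",
     "analysis_id": "brca-001-cnv"},
    # illustrative row with an analysis id that has no metadata entry
    {"individual_id": "BRCA-patient-001", "biosample_id": "brca-001",
     "analysis_id": "brca-009-cnv"},
]

orphans = find_orphan_variants(metadata_rows, variant_rows)
print(len(orphans), "variant row(s) without matching metadata")
```

Running such a check before an import can catch mistyped ids early, when they are still cheap to fix in the source files.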
Examples¶
Minimal metadata file¶
individual_id biosample_id analysis_id
BRCA-patient-001 brca-001 brca-001-cnv
BRCA-patient-001 brca-001 brca-001-snv
BRCA-patient-002 brca-002 brca-002-cnv
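To show how such a flat file maps onto the separate entities, the following sketch reads the example rows with Python's standard `csv` module and collects the distinct ids per column. It is an illustration only, not the bycon importer; the helper name `unique_ids` is hypothetical, and the columns are assumed to be separated by real tab characters.

```python
import csv
import io

# The example rows from above, written with explicit tab separators.
METADATA_TSV = (
    "individual_id\tbiosample_id\tanalysis_id\n"
    "BRCA-patient-001\tbrca-001\tbrca-001-cnv\n"
    "BRCA-patient-001\tbrca-001\tbrca-001-snv\n"
    "BRCA-patient-002\tbrca-002\tbrca-002-cnv\n"
)


def unique_ids(tsv_text):
    """Collect the distinct ids per header column, preserving first-seen order."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    ids = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value not in ids[name]:
                ids[name].append(value)
    return ids


ids = unique_ids(METADATA_TSV)
for name, values in ids.items():
    print(name, len(values))
```

The three rows describe two individuals, two biosamples and three analyses, since the first biosample carries both a CNV and an SNV analysis.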
Variant file¶
Further and optional procedures¶
- Create database and variants collection
- Update the local bycon installation for your database information and local parameters
    - database name(s)
    - filter_definitions for parameter mapping
- Create metadata collections: analyses, biosamples and individuals
- Create statusmaps and CNV statistics for the analyses collection
    - only relevant for CNV database use cases
- Create the collations collection, which uses filter_definitions and the corresponding values to aggregate information for query matching, term expansion ...
- Create frequencymaps for binned CNV data
    - relies on existence of statusmaps in analyses and collations
    - only needed for CNV data
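Because some of these steps rely on the results of earlier ones, it can help to write the dependencies down and derive a valid processing order from them. The sketch below does this with the standard library's `graphlib`. The step labels are illustrative names rather than bycon commands, and apart from the stated frequencymaps requirements the dependency edges are assumptions read off the list above.

```python
from graphlib import TopologicalSorter  # available from Python 3.9

# step -> set of steps that must exist first (illustrative labels only)
DEPENDENCIES = {
    "variants": {"database"},
    "metadata collections": {"database"},
    "statusmaps": {"variants", "metadata collections"},
    "collations": {"metadata collections", "local configuration"},
    "frequencymaps": {"statusmaps", "collations"},
}

# static_order() yields every step after all of its prerequisites.
order = list(TopologicalSorter(DEPENDENCIES).static_order())
for position, step in enumerate(order, start=1):
    print(position, step)
```

For a database without CNV data, the statusmaps and frequencymaps entries can simply be left out of the mapping.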
Data maintenance scripts¶
Please see the helper apps documentation.
[1] Metadata in biomedical genomics is "everything but the sequence variation".