rRNA contextual data (metadata) for environmental sequences
What are contextual data?
Contextual data (also called metadata) are a standardized set of descriptors recording where a sequence was retrieved, such as geographic location (longitude, latitude) and habitat characteristics (depth, temperature, etc.), and how it was processed (vectors and primers). Metadata need to be attached to the sample files before they are submitted to international databases such as NCBI. Metadata are important because they add value to the samples submitted to a database.
Following the MIGS/MIMS (Minimum Information about a Genome Sequence / Minimum Information about a Metagenome Sequence) standards, at least the GPS position (longitude, latitude), depth or altitude, and time of sampling will become mandatory for environmental samples.
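As an illustration of the minimum fields listed above, the following Python sketch checks whether a metadata record carries a position, a depth or altitude, and a sampling time. The field names (latitude, longitude, depth_or_altitude, collection_time) are hypothetical labels chosen for this example; they are not the official MIGS/MIMS or SILVA template column names, and the check is not part of DNA Baser.

```python
# Hypothetical field names; the real MIGS/MIMS checklist and the SILVA
# template use their own labels.
REQUIRED_FIELDS = ("latitude", "longitude", "depth_or_altitude", "collection_time")


def missing_fields(record):
    """Return the required fields that are absent or blank in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def valid_position(record):
    """Check that latitude and longitude are numbers within their ranges."""
    try:
        lat = float(record["latitude"])
        lon = float(record["longitude"])
    except (KeyError, TypeError, ValueError):
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# Example record with an empty sampling time.
sample = {
    "latitude": "54.18",
    "longitude": "7.90",
    "depth_or_altitude": "10",
    "collection_time": "",
}
print(missing_fields(sample))
print(valid_position(sample))
```

Running such a check before submission makes it easy to spot records that would not meet the minimum requirements.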
Automatic metadata integration
The SILVA team has started a number of projects to facilitate and improve the integration of metadata. DNA Baser can read the metadata template file provided by the SILVA ribosomal RNA database project and automatically integrate the metadata into contigs during sequence assembly or into existing FASTA files.
Please note that the template file is provided in Excel format (XLS), whereas DNA Baser supports only the CSV (comma-separated values) format. In Excel, you can convert the XLS file to CSV with the "Save As..." function.
The Wiki pages of the Genomic Standards Consortium provide an overview of our survey results on the contextual data proposed to be associated with any ribosomal RNA sequence.
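To make the idea of integrating CSV metadata into FASTA files concrete, here is a minimal Python sketch that reads metadata rows from CSV text and appends them to the matching FASTA header lines. It is not DNA Baser's implementation: the column names (sequence_id, latitude, longitude, depth), the function names, and the bracketed [key=value] header layout are all assumptions made for this example, and the actual SILVA template and DNA Baser output may look different.

```python
import csv
import io


def load_metadata(csv_text, id_column="sequence_id"):
    """Parse CSV text into {sequence id: {column: value}}.

    The id_column name is a hypothetical choice for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row[id_column]: {k: v for k, v in row.items() if k != id_column}
        for row in reader
    }


def annotate_fasta(fasta_text, metadata):
    """Append [key=value] tags to each FASTA header that has metadata."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            fields = metadata.get(seq_id, {})
            # Skip empty values and malformed extra columns.
            tags = " ".join(f"[{k}={v}]" for k, v in fields.items() if k and v)
            if tags:
                line = f"{line} {tags}"
        out.append(line)
    return "\n".join(out)


# Example input: one sequence with metadata, one without.
csv_text = "sequence_id,latitude,longitude,depth\nseq1,54.18,7.90,10\n"
fasta_text = ">seq1 sample A\nACGT\n>seq2\nGGCC"
print(annotate_fasta(fasta_text, load_metadata(csv_text)))
```

Headers without a matching metadata row are left unchanged, so the sketch can be applied to a FASTA file that is only partly described by the CSV.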
Copyright © Heracle BioSoft SRL