FAQ about quality values (QV)

 

What is the quality value (QV)?
The quality value (QV), sometimes called base trust information or confidence value, is a number assigned to each base in a chromatogram that indicates how much that base call can be trusted. A low QV means that the predicted base cannot be trusted much (the base caller may be wrong). A high QV (like 100) means that the base can be trusted.

How important is the QV for DNA Baser?
DNA Baser must clean (trim) the unreliable ends of the input sequences before it assembles them. To perform this operation correctly, it relies heavily on the quality value (QV) assigned to each base. If this information is missing, DNA Baser will not trim the ends of a sequence, which may result in a poor or wrong alignment. In this case, the bad ends must be trimmed manually, which can be done using DNA Baser's editing functions.
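To illustrate how per-base quality values can drive end trimming, here is a minimal sketch. It is not DNA Baser's actual algorithm: the function name `trim_ends`, the sliding-window approach, and the `threshold` and `window` defaults are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple


def trim_ends(qv: List[int], threshold: float = 20.0,
              window: int = 3) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the region to keep, with end exclusive.

    Returns None when there are no (or too few) quality values, or when
    no window reaches the threshold. In that case nothing can be trimmed
    automatically. Threshold and window size are hypothetical defaults.
    """
    n = len(qv)
    if n < window:
        return None

    # Start positions of every window whose mean quality is acceptable.
    good = [i for i in range(n - window + 1)
            if sum(qv[i:i + window]) / window >= threshold]
    if not good:
        return None

    start, end = good[0], good[-1] + window

    # Drop any remaining low-quality bases at the edges of the kept region.
    while qv[start] < threshold:
        start += 1
    while qv[end - 1] < threshold:
        end -= 1
    return start, end


seq = "NNAACGTTAGNN"
qv = [2, 3, 4, 30, 35, 40, 38, 36, 33, 5, 3, 2]
region = trim_ends(qv)
if region is not None:
    start, end = region
    print(seq[start:end])  # prints "ACGTTA"
```

With an empty quality list the function returns `None`, which mirrors the situation described above: without QV information there is nothing to base the trimming on.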

Which formats can store quality value information?
Only SCF and ABI file types can store QV information. FASTA, SEQ and TXT can’t store this information.
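As a rough illustration of this rule, a script that sorts input files could check the extension before expecting any QV data. The helper name `can_store_qv` and the extension list are assumptions (in particular `.ab1`, a commonly used extension for ABI traces, is not mentioned above). A matching extension only means the format is able to hold QV, not that a given file actually contains it.

```python
from pathlib import Path

# Formats that are able to hold quality values, per the answer above.
# The exact extension spellings are an assumption for this sketch.
QV_CAPABLE = {".scf", ".abi", ".ab1"}


def can_store_qv(filename: str) -> bool:
    """True if the file's format is able to store quality values."""
    return Path(filename).suffix.lower() in QV_CAPABLE


for name in ["sample1.SCF", "trace.ab1", "reads.fasta", "notes.txt"]:
    print(name, can_store_qv(name))
```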

If I have ABI or SCF files, do they necessarily contain QV?
Not necessarily. If your sequencing machine is set to generate SCF files, those files will almost certainly contain QV information; so far we have encountered only one SCF file without QV.
If your sequencing machine is set to generate ABI files, then in many cases the generated files do not contain the quality value (QV) field. However, your technician can fix this easily. Please read below.

What about corrupted ABI files?
The ABI format was invented a long time ago. It can be considered obsolete because it has problems storing large sequences/chromatograms. The ABI files generated by some machines are often corrupted. This post shows hard evidence of corrupted data in ABI files.
When DNA Baser encounters a corrupted file, it automatically tries to restore it and writes a report for that file to disk, in the same folder as the file.

Why is the SCF format better than the ABI format?
Not only is the SCF format free of the above-mentioned problem, it also offers many additional features. In addition, the SCF format was designed to be easily compressed (using programs like WinRar or WinZip). This is very important for companies storing large amounts of chromatogram files.
We strongly recommend using the SCF format instead of ABI.

What is the difference between files with QV included and files without QV?
DNA Baser is the only software on the market optimized for automatic sequence assembly. For example, where an ambiguity is encountered, DNA Baser will suggest the correct base in 95% of cases. It will also trim the ends with very high precision, as a human would.

To obtain this precision, the program relies on the quality value assigned to each base.
If this information is missing, the contigs generated by DNA Baser are no better than the contigs generated by other programs: the ends are not trimmed and the program may give weak suggestions for ambiguous bases.
Overall, you have to do twice as much work to manually correct all ambiguities/mismatches, exactly as in a regular sequence assembler.

What to do if your ABI files do not have the QV field included?
Always use SCF files instead of ABI files, as the SCF format always has the QV included. If you do not have SCF files, ask your technicians to set the machine to generate SCF files in addition to ABI files. This is a very simple procedure: all they have to do is check a checkbox in the machine's software interface. In addition, your technician can instruct the software to generate ABI files WITH QV included.

Why doesn't DNA Baser automatically trim the low quality regions in my chromatograms?
Usually you don't have to manually trim the ambiguous bases (the 'N's) from your sequences. DNA Baser will clean the ends automatically for you BUT ONLY IF the quality values are included in the file. SCF files always contain QV. ABI files only sometimes have this information included. You can instruct your sequencing machine to always generate ABI files with QV included (you just need to check a checkbox in your machine's interface).

How can I find out if my samples contain information about quality values (QV)?
With DNA Baser, it is very easy to get this information: double-click an ABI/SCF file to open it in DNA Baser. If you see green columns above each base in the chromatogram, then your file has the quality value field included.


Fig 1. ABI file with QV included


Fig 2. ABI file without QV included

Another way to check for quality values is to view the properties of a file. Press Control+Enter to open the properties dialog:

  

 



If my samples do not contain quality values, does it mean that I can’t make a contig?
In most cases, DNA Baser will be able to generate contigs without problems even if your samples do not contain the QV field. However, in some very rare cases (especially when your samples are extremely poor), DNA Baser may fail to create a contig. In this case, you must manually trim the bad portions (the bad ends) of your samples/chromatograms.

Can I mix ABI and SCF files?
Sure.

Can I mix samples containing QV with samples that do not contain QV?
Yes. However, DNA Baser will create the contig in non-QV mode. This means that if you assemble 100 files and a single file does not have QV included, DNA Baser will treat all files as if they had no QV. It may be better to remove that file from the contig entirely than to use it. See the table below:

Files contain QV?    Contig done in:
All                  QV mode
None                 non-QV mode
Some                 non-QV mode
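
The rule in the table above can be expressed as a short sketch. The function name `contig_mode` and the idea of passing one boolean flag per input file are assumptions made for illustration, not DNA Baser's internal API. Treating an empty input as non-QV mode is also an assumption.

```python
from typing import Iterable


def contig_mode(files_have_qv: Iterable[bool]) -> str:
    """Pick the assembly mode from one 'has QV' flag per input file.

    QV mode is used only when every file carries quality values;
    a single file without QV switches the whole contig to non-QV mode.
    """
    flags = list(files_have_qv)
    return "QV mode" if flags and all(flags) else "non-QV mode"


print(contig_mode([True] * 100))            # all files have QV
print(contig_mode([True] * 99 + [False]))   # one file lacks QV
print(contig_mode([False, False]))          # no file has QV
```

This makes the cost of one QV-less file explicit: 99 good files plus one file without QV yields the same mode as having no QV at all.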


In rare situations, DNA Baser generates poor contigs from some ABI files. Why?
Because your files contain no QV information. You must trim the ends manually.


See also:

      - Automatic bad ends trimming
      - What is a good chromatogram?
