What are the untrusted regions?
Because of the nature of the sequencing process, bases at one or both ends of a DNA sample are often not identified correctly. The base caller either assigns the wrong base to a peak, or assigns the correct base but flags the assignment as a potential mistake (see confidence scores). These regions are called untrusted regions, regions of untrusted bases, low-quality ends, or simply bad ends.
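As an illustration, the untrusted ends of a read can be located from its per-base confidence scores. The sketch below is a minimal example and not DNA Baser's algorithm. It assumes the scores are Phred-like integers (higher means more reliable). The function name `trusted_region`, the threshold of 20, and the window of 10 bases are hypothetical choices made for this example.

```python
def trusted_region(quals, threshold=20, window=10):
    """Return (start, end), a half-open index range of the trusted region.

    Assumption: quals is a list of Phred-like per-base confidence scores.
    A window is "good" when its mean score is at least `threshold`.
    Returns (0, 0) when no good window exists (the whole read is untrusted).
    """
    n = len(quals)
    if n < window:
        return (0, 0)

    def window_ok(i):
        return sum(quals[i:i + window]) / window >= threshold

    good = [i for i in range(n - window + 1) if window_ok(i)]
    if not good:
        return (0, 0)

    start = good[0]            # first base of the first good window
    end = good[-1] + window    # one past the last base of the last good window

    # Tighten both boundaries so the region starts and ends on a good base.
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return (start, end)


# Synthetic read: 12 bad bases, 30 good bases, 9 bad bases.
scores = [5] * 12 + [40] * 30 + [8] * 9
print(trusted_region(scores))
```

Everything outside the returned range corresponds to the bad ends described above. A read with no good window yields an empty range.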
Why do they need to be cut?
If untrusted regions are left uncut, the sequence assembly software may assemble the samples in the wrong order. The resulting contig may contain many errors that have to be corrected manually, or it may be simply unusable.
Therefore, before starting the assembly process, the end user has to inspect all bases and confirm that the assignments are correct. If they are not, the user has to correct them manually. This process is painfully slow: it may take up to 15 minutes to manually prepare a couple of samples for sequence assembly.
How to speed up the process?
Users who have only a few samples to process (fewer than 100) usually open the chromatograms, cut the ends (by visual inspection and common sense), save the chromatograms back to disk, and then start the sequence assembly. This reduces the preparation time to a few minutes. However, most users have hundreds or thousands of samples. In that case, they typically open each sample, cut about 100 bases at each end without visually inspecting the chromatogram, and save the sample back to disk. The number 100 is more or less arbitrary: for most samples, the base caller mispredicts bases only within the first and last 100 bases. However, the untrusted region can span anything from zero bases to the entire read. This method therefore saves a lot of time but is extremely inaccurate.
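The inaccuracy of a fixed cut can be made concrete with a small calculation. The sketch below counts how many good bases a fixed cut throws away and how many bad bases it leaves in the sample. It assumes Phred-like per-base confidence scores. The function name `fixed_cut_cost`, the threshold of 20, and the synthetic reads are hypothetical and serve only as an illustration.

```python
def fixed_cut_cost(quals, cut=100, threshold=20):
    """Return (good_lost, bad_kept) for a fixed cut of `cut` bases per end.

    Assumption: quals holds Phred-like per-base confidence scores, and a
    base counts as "good" when its score is at least `threshold`.
    """
    quals = list(quals)
    n = len(quals)
    if 2 * cut >= n:
        # The cut removes the whole read.
        kept, dropped = [], quals
    else:
        kept = quals[cut:n - cut]
        dropped = quals[:cut] + quals[n - cut:]
    good_lost = sum(q >= threshold for q in dropped)  # good bases thrown away
    bad_kept = sum(q < threshold for q in kept)       # bad bases left in place
    return good_lost, bad_kept


# A clean read: the fixed cut discards 200 perfectly good bases.
print(fixed_cut_cost([40] * 600))

# A read with a 250-base bad start: 150 bad bases survive the cut.
print(fixed_cut_cost([5] * 250 + [40] * 300 + [5] * 50))
```

In the first read the cut wastes good data. In the second it leaves untrusted bases that can still mislead the assembly.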
Introducing automatic end trimming
Sanger sequencing technology was invented decades ago, yet before DNA Baser no software was able to provide automatic end trimming.
FAQ - Automatic trimming
FAQ - Chromatogram files
Copyright © Heracle BioSoft SRL 2020