
Automatic trimming

 


 

 

 

 

 

 

 

Why are my alignments wrong? Why are my BLAST hits totally crazy?


Wrong alignments often appear when the low-quality (untrusted) ends of the chromatograms have not been properly cleaned. DNA Baser Sequence Assembler does the trimming automatically if the sample stores quality value information. If it does not, you should cut the untrusted regions manually.

 

Fig 1 - A chromatogram showing an untrusted region (gray) and quality values (vertical green bars above each base).

 

 

Can DNA Baser clean the untrusted regions automatically if my sequences don't have quality values (QV)?

 

No. The end trimming process is based on the quality values assigned to each base. Bases are removed when their quality values are lower than a threshold set by the user. Besides this threshold, two more parameters influence the trimming algorithm: the length of the trimming window and the percentage of bases in the trimming window whose quality values are higher than the threshold:

 

(Screenshot: the end trimming settings for chromatograms.)

 

Increasing the value of these parameters will make DNA Baser Assembler cut and discard more bases from your sequence(s). Decreasing these values will make DNA Baser keep more bases even if their quality is poor. Adjust these settings according to the quality of your chromatogram file(s).

 


  

 

Details about how the Automatic Trimming Engine works

 

The following example shows how DNA Baser Assembler automatically cleans the low-quality region in a sample. We assume that you have set up the trimming parameters like this:

(Screenshot: the end trimming settings, with a trimming window of 18 bases, 75% good bases, and a QV threshold of 25.)

 

DNA Baser Assembler will create an imaginary window (the blue rectangle in the picture below) and place it at the beginning of the sequence. This window will be 18 bases wide. If 75% of the bases inside this window are good, DNA Baser has found the first high-quality region in your sample and stops the trimming process.
If the condition is not met, it moves the window one base to the left and repeats the process until the condition is met.
In the picture below, DNA Baser had to move the window 10 bases until it found a region where more than 75% of the bases are good. A good (trusted) base is a base with a QV of at least 25. All bases to the right will be cut (discarded). As you can see, most of the bases that will be cut are marked in red (low quality).

Fig 2. The trimming algorithm in action.
The window is marked in light blue, bad bases are marked in red; good (trusted) bases are marked in green.
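
The window procedure described above can be sketched in a few lines of Python. This is only an illustration, not DNA Baser's actual implementation: the function names, the use of "at least" for both the QV threshold and the percentage, and the choice to handle the other end of the read by scanning the reversed list are all assumptions. The default values (window of 18 bases, 75%, QV 25) are taken from the example settings.

```python
from typing import Optional, Sequence, Tuple


def find_trusted_start(
    qvs: Sequence[int],
    window: int = 18,               # width of the trimming window, in bases
    min_good_percent: float = 75.0, # share of good bases the window must contain
    qv_threshold: int = 25,         # a base is "good" if its QV is at least this
) -> Optional[int]:
    """Slide a window one base at a time from the start of the read.

    Returns the index where the first acceptable window begins, or None
    if no window satisfies the condition (hypothetical helper).
    """
    if window <= 0 or len(qvs) < window:
        return None
    for start in range(len(qvs) - window + 1):
        good = sum(1 for q in qvs[start:start + window] if q >= qv_threshold)
        # Compare as integers/floats without dividing: good/window >= percent/100
        if good * 100 >= min_good_percent * window:
            return start
    return None


def trusted_range(qvs: Sequence[int], **params) -> Optional[Tuple[int, int]]:
    """Return (first_trusted, one_past_last_trusted) for both ends.

    The bases outside this range are only marked as untrusted; the input
    list itself is never modified.
    """
    left = find_trusted_start(qvs, **params)
    if left is None:
        return None
    # Assumption: the other end is treated symmetrically by scanning backwards.
    right_offset = find_trusted_start(qvs[::-1], **params)
    return left, len(qvs) - right_offset


# Made-up quality values: a poor start, a good middle, a poor end.
example_qvs = [10] * 10 + [40] * 30 + [5] * 7
print(trusted_range(example_qvs))
```

For the made-up read above, the sketch reports the range (6, 44). Note that a few low-quality bases next to the good region stay inside the trusted range, because the window only needs 75% good bases. Raising the percentage or the QV threshold shrinks the trusted range.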

 

 

 

 

When is the trimming performed?


DNA Baser Assembler automatically removes the low-quality ends from the chromatograms before starting the assembly process. It doesn't really cut away the bad ends; it just marks them as untrusted, and they won't be taken into consideration during assembly. Your original files will remain untouched.

 

 



Automatic trimming engine in action - Video tutorial

 

This video shows how the automatic trimming engine can save time by reducing the need for manual ambiguity corrections. Note that there are 381 ambiguities if the trimming engine is not used. After automatically cleaning the low-quality ends, there are only 12 ambiguities.

 

 

 

 

Details about chromatogram files

 

 
