Dear LO Seong Loong,
Thank you very much for working with DNA Baser and for sharing your experience with us. Interaction with users is very important to us.
To answer your questions:
A) Will the order of adding files affect the assembly in QV and non-QV mode?
Our sequence assembly is based on special algorithms for pairwise alignment. Therefore, for a successful sequence assembly, it is imperative to clean the sequences before starting the assembly.
For sequence files (SCF/ABI) that include QVs (quality values, also known as trust information), DNA Baser Assembler performs the cleaning automatically. However, automatic cleaning is not possible when your sequences do not contain QVs.
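To illustrate why quality values make automatic cleaning possible, here is a minimal sketch of end trimming by quality. It is not the DNA Baser algorithm: the function name `trim_by_quality`, the sliding-window approach, and the default `threshold` and `window` values are all assumptions chosen for the example.

```python
def trim_by_quality(bases, quals, threshold=20, window=5):
    """Keep the region between the first and last window whose mean
    quality reaches `threshold`. Hypothetical helper, for illustration only."""
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    n = len(bases)
    if n < window:
        return ""

    # Scan from the left for the first good window.
    start = None
    for i in range(n - window + 1):
        if sum(quals[i:i + window]) / window >= threshold:
            start = i
            break
    if start is None:
        return ""  # no region of acceptable quality

    # Scan from the right for the last good window.
    end = start + window
    for j in range(n, window - 1, -1):
        if sum(quals[j - window:j]) / window >= threshold:
            end = j
            break
    return bases[start:end]


if __name__ == "__main__":
    seq = "NNACGTACGTNN"
    qv = [5, 5, 30, 30, 30, 30, 30, 30, 30, 30, 5, 5]
    print(trim_by_quality(seq, qv, threshold=20, window=2))
```

Without the `quals` list there is nothing for such a routine to measure, which is why sequences without QVs have to be cleaned by hand.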
Therefore, when assembling uncleaned sequences that have no QVs, the quality of the assembly is not much better than in competing products. This means that, in very rare cases, changing the order of your sequences may improve the quality of the resulting contig.
Why does this happen? In non-QV mode, DNA Baser Assembler parses all sequences and assigns a (more or less precise) quality score to each of them. When the assembly process starts, if two sequences have the same score, the one that comes first in alphabetical order is processed before the other. Because the ends are not trimmed, two sequences that should be identical may in fact differ considerably. This may result in a slightly different assembly.
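The tie-breaking rule described above can be sketched as follows. This is only an illustration, not the actual DNA Baser code: the function name `processing_order` is hypothetical, and processing higher scores first is an assumption (only the alphabetical tie-break is stated above).

```python
def processing_order(scored_files):
    """Return file names in processing order.

    scored_files: list of (filename, score) pairs.
    Assumption: higher scores are processed first.
    Ties are broken alphabetically by file name.
    """
    ranked = sorted(scored_files, key=lambda item: (-item[1], item[0].lower()))
    return [name for name, _score in ranked]


if __name__ == "__main__":
    files = [("b.seq", 0.90), ("a.seq", 0.90), ("c.seq", 0.95)]
    print(processing_order(files))
```

In this sketch the result does not depend on the order in which the files were added, but renaming a file that ties with another one would change which of the two is processed first.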
IMPORTANT: This does not happen when the user has cleaned the sequences manually or when the sequences include QVs. This is why we strongly recommend using sequences that contain quality values. Not only is the assembly process much faster, but the results are also more accurate, and in case of ambiguities the assembler can make suggestions with a precision close to 100%.
Please ask your technician to set your sequencing machine to produce ONLY chromatogram files (SCF or ABI) with QVs included. This is possible even on older machines.
B) Is it possible to have automatic trimming in non-QV mode before assembly? Maybe something like anything after first 500 nucleotides will be trimmed. If not, it will be also nice if we can do it manually (for *.seq files) in the software.
Your idea about cleaning in non-QV mode is a very good one. We will use it in the development of the next version. Meanwhile, you can use the edit functions to clean your sequences. If you double-click a sequence (in the File List panel), a window will open where you can edit the sequence/chromatogram (cut the ends or change/insert bases) and then save it (as FASTA, SEQ or SCF).
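Until such a feature exists, the fixed-length trimming you describe (keeping only the first 500 nucleotides) could also be done outside the software. The sketch below is only an example under stated assumptions: `truncate_sequence` is a hypothetical helper, and it assumes the input text contains nothing but bases and whitespace (no header lines).

```python
def truncate_sequence(raw_text, keep=500):
    """Remove whitespace and line breaks, then keep the first `keep` bases.

    Assumption: raw_text holds only bases, with no header or comment lines.
    """
    bases = "".join(raw_text.split()).upper()
    return bases[:keep]


if __name__ == "__main__":
    example = "acgtacgt\nacgtacgt\n"
    print(truncate_sequence(example, keep=10))
```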
Hope your questions are answered now.
4….It will also be nice if a sequence can be copied and pasted in a temporary window for assembly purpose.
Could you describe this idea in a little more detail? Do you want to put the file name in a temporary window, or the sequence itself (the bases)?
5. Alignment could be done between the reference and contig to make sure there’s no mismatch.
This means that the user has to start a new assembly task, which may be a little slow. However, based on your original idea, we came up with a better solution: we can show discrepancies between the contig and the reference in the same assembly window by highlighting them in the contig in a special color (such as dark purple). Thanks.
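To make the idea concrete, here is a small sketch of how such discrepancies could be listed. It is not our implementation: `find_discrepancies` is a hypothetical name, and the sketch assumes the contig and the reference are already aligned to the same length, with "-" marking gaps.

```python
def find_discrepancies(contig, reference):
    """Return (position, contig_base, reference_base) for every mismatch.

    Assumption: both strings are already aligned and of equal length.
    Positions are 1-based; comparison is case-insensitive.
    """
    if len(contig) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, c, r)
        for i, (c, r) in enumerate(zip(contig, reference))
        if c.upper() != r.upper()
    ]


if __name__ == "__main__":
    for pos, c, r in find_discrepancies("ACGTAC", "ACCTAG"):
        print(f"position {pos}: contig={c} reference={r}")
```

The returned positions are the ones a viewer would highlight in the contig.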
We appreciate your suggestions and will take them into consideration (especially point 2) for the next version, which we are currently developing.
DNA Baser Assembler - Easy to use molecular biology software for automatic DNA sequence analysis, sequence editing and sequence assembly.
http://www.dnabaser.com