DNA Sequence Assembler FAQ|DNA sequence assembly & contig editing software. Supported file formats

Sequence assembly FAQ
DNA Sequence Assembler

DNA Baser shows the correct contig on screen but when I save it to disk it is incomplete

This happens because your Vector Removal settings are wrong. You need to use two vector recognition sequences that are not the reverse complement (RC) of each other. Please see this page for details.

Why can't DNA Baser assemble (some of) my files?

A quick look at the LOG will tell you why. In most cases it is because DNA Sequence Assembler automatically removed some of your samples from the job list. This happens when the quality of the samples is below a certain threshold. To force DNA Baser to keep even the low quality samples, you need to relax the “Trimming engine” parameters (DNA Baser can automatically trim the low quality ends of your samples) and the “Assembler engine” parameters. However, this may sometimes result in more errors/ambiguities in your contig, which means more manual work to correct them. Therefore, if you have enough redundant samples in the set of files that you want to assemble, it is better to let DNA Baser exclude the low quality samples (the samples that may introduce ambiguities).

Another reason may be that the sample does not really belong to the current contig. In this case you will get a message saying "Sample [xxx] has no neighbors".

DNA Baser does not create the contig with the right orientation

The SCF/ABI/FASTA samples you receive from your sequencing machine are in random order (each user can use his or her own naming scheme). Therefore, DNA Baser cannot know the orientation of your contig. In the near future we will add an input box where you can designate one of your samples as being forward or reverse. The orientation of the contig will then be changed according to this information.

Workaround:

You can use one of your samples (the one that you know is "forward") as a reference and align the rest of your samples to it.
You can use the "View reverse complement of contig" function (available in the "Contig" menu of the "Assembly" window) to change the orientation of the contig.
You can save the contig using the 'Save contig as RC' function (available in DNA Baser v3.0).

Do I have to save the contig every time I assemble some samples?

No. DNA Sequence Assembler will automatically save the contig for you at the end of the assembly process. After you press the 'Start assembly' button, the program will automatically trim the bad ends, assemble the contig, correct the errors and save the contig too. This is just one of the many automations you will find in DNA Sequence Assembler.

I have an error saying: “Cannot open file… The process cannot access the file because it is being used by another process” or “EFOpenError”.

DNA Baser does not lock the files that it opens. This allows you to open the same file multiple times without generating sharing violations.
However, other programs do not allow you to do that. For example, if you open a FASTA file in Microsoft Word, Word will lock that file. Any other program that tries to open it will fail as long as the file is open in Word. We recently created a workaround for this issue, but it works only for FASTA/SEQ/TXT files. The SCF and ABI files are still under this restriction.
Workaround: do not open the same file in DNA Sequence Assembler and in other programs (that use locking techniques) at the same time.

DNA Baser does not cut the recognition sequence (vector) correctly OR

DNA Baser saves only a small part (only the vector) of the contig to disk

This happens when the two recognition sequences used are the exact reverse complement of each other. To solve this, you need to change one of the recognition sequences (e.g. by adding more bases) so that, when reverse complemented, it is no longer identical to the other recognition sequence.
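The condition described above can be checked before you enter the sequences in the program. The following is a minimal sketch in Python, not part of DNA Baser; the function names and the two example sequences are hypothetical and chosen only for illustration.

```python
# Illustrative check only; this is not DNA Baser code.
# Assumption: recognition sequences contain only A, C, G and T.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def are_reverse_complements(seq_a: str, seq_b: str) -> bool:
    """True if seq_b is exactly the reverse complement of seq_a (case-insensitive)."""
    return reverse_complement(seq_a.upper()) == seq_b.upper()


if __name__ == "__main__":
    left = "GAATTCGC"   # hypothetical recognition sequence
    right = "GCGAATTC"  # its exact reverse complement: a problematic pair
    print(are_reverse_complements(left, right))         # True
    print(are_reverse_complements(left, right + "AT"))  # False, after adding bases
```

If the check returns True for your pair, extend one of the two sequences by a few bases, as suggested above, and check again.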

I need a small feature. Can you implement it?

BioSoft provides effective, easy-to-use sequence assembly and analysis software tools to meet the ever-changing needs of today’s genetic/medical researcher and diagnostician. All of our software tools are the result of close, effective collaborations with the genetic community. Therefore, if one of our software programs doesn't meet your exact needs, please do not hesitate to contact us. After all, DNA Baser is made for you.

How do I copy/print chromatograms from DNA Baser?

This can be achieved by copying the chromatograms to the clipboard and then pasting them into your preferred image editor (example: Corel, Photoshop, Paint Shop Pro, PaintBrush) or text editor (example: Microsoft Word, Wordpad, Open Office). From these editors you can easily edit or print the image (screenshot).

To copy the chromatogram to the clipboard, click the chromatogram you want to copy, then:

click the 'Copy visible chromatogram' item located under the 'Chromatograms' menu if you want to copy a single chromatogram from DNA Baser
press the 'Alt' and 'PrintScreen' keys at the same time if you want to copy multiple chromatograms or the whole DNA Baser window

Note: new features for chromatogram manipulation have been added in DNA Baser v4.0.

How much RAM does DNA Baser Assembler need?

For small contig assemblies, 1 GB of RAM should be more than enough. Please see this page for details about memory requirements.

FAQ about file formats

What file formats does DNA Baser Assembler recognize?

DNA Sequence Assembler supports ABI (all varieties), SCF, TXT, SEQ, GBK, multiGbk, multiFasta and FASTA files. Whenever possible, you should use chromatogram files (SCF/ABI) instead of FASTA files. SFF files are also supported (by SFF Workbench).

Why should I prefer chromatogram files (SCF/ABI) over FASTA files?

DNA Sequence Assembler is a tool like no other. One of its outstanding features is the capability to instantly perform time-consuming tasks (tasks that until now could only be done manually). For example, by using DNA Sequence Assembler you do not have to inspect your contig or manually correct the ambiguities. DNA Sequence Assembler will do it for you. However, this feature works only if your input samples contain information about base quality (also known as QV, confidence score or basecaller confidence). Therefore, whenever possible, you should use chromatogram files (SCF/ABI) instead of FASTA files, so that DNA Baser can automatically remove untrusted regions and automatically correct the ambiguities. The accuracy of the suggestions can be as high as 95-99%, and in most cases you don't have to do any manual editing/corrections in your contig. This will save you a lot of precious time!
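To show why base quality matters, here is a minimal sketch in Python of how a disagreement between overlapping reads could be resolved when quality values are available. This is not DNA Baser's actual algorithm, which is not described in this FAQ; the summed-quality rule, the `resolve_column` name and the `min_margin` parameter are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def resolve_column(calls: List[Tuple[str, int]], min_margin: int = 10) -> str:
    """Pick a consensus base for one aligned position of a contig.

    calls: one (base, quality value) pair per read covering the position.
    Returns the base with the highest summed quality, or 'N' when the
    winner does not lead the runner-up by at least min_margin.
    This rule is an illustrative assumption, not DNA Baser's method.
    """
    totals: Dict[str, int] = {}
    for base, qv in calls:
        key = base.upper()
        totals[key] = totals.get(key, 0) + qv

    # Rank candidate bases from highest to lowest summed quality.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return "N"  # no read covers this position
    if len(ranked) == 1:
        return ranked[0][0]  # all reads agree

    (best, best_q), (_, second_q) = ranked[0], ranked[1]
    return best if best_q - second_q >= min_margin else "N"


if __name__ == "__main__":
    # One read calls A with high confidence, the other calls G with low confidence.
    print(resolve_column([("A", 40), ("G", 8)]))
    # Both calls have similar confidence, so the position stays ambiguous.
    print(resolve_column([("A", 20), ("G", 18)]))
```

A FASTA file carries only the bases, so every call would have the same weight and a conflict like the first one could not be resolved automatically.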

Does DNA Baser support FNA or FA files?

Because of the lack of standardization, miscellaneous software may output FASTA files using exotic extensions like “FST”, “FNA”, “FAS” or “FA”. We try to make the program compatible with as many file formats as possible. If DNA Baser does not recognize your file(s) but you are sure it is a valid FASTA file, then you need to change its extension to FASTA.
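If you have many such files, the renaming can be scripted. The following is a minimal sketch in Python, not a DNA Baser feature; the function names are hypothetical, the lowercase `.fasta` spelling of the extension is an assumption, and the check is only a rough one (it tests whether the first non-empty line starts with ">").

```python
from pathlib import Path


def looks_like_fasta(path: Path) -> bool:
    """Rough check: the first non-empty line of a FASTA file starts with '>'."""
    with path.open("r", encoding="ascii", errors="replace") as handle:
        for line in handle:
            if line.strip():
                return line.startswith(">")
    return False  # empty file


def rename_to_fasta(path: Path) -> Path:
    """Rename a file so that its extension becomes .fasta; return the new path."""
    if path.suffix.lower() == ".fasta":
        return path  # nothing to do
    if not looks_like_fasta(path):
        raise ValueError(f"{path} does not look like a FASTA file")
    target = path.with_suffix(".fasta")
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    path.rename(target)
    return target


if __name__ == "__main__":
    # Hypothetical usage: rename every *.fna file in the current folder.
    for sample in Path(".").glob("*.fna"):
        print(sample, "->", rename_to_fasta(sample))
```

The script refuses to overwrite an existing file and skips nothing silently, so a failed rename is reported instead of leaving the sample set in an unknown state.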

How can I change the file extension of a file (sample)?

In some cases you cannot change the extension of a file because, by default, Windows hides this information. Here is how to make Windows show it:

1. Double-click "My Computer".
2. Click the "Tools" menu and select "Folder Options".
3. When the "Folder Options" multi-tabbed dialog box appears, select the "View" tab.
4. Uncheck "Hide file extensions for known file types".
5. Press OK to close the dialog box.
