How to use DNA Baser
Part 4
How to assemble using the batch mode
Level: beginner
 
This tutorial will show you how to assemble contigs from chromatogram/fasta/seq files using the batch mode.
During the batch assembly, the following steps are automatically performed for each individual assembling process:
- cleaning of the bad ends
- assembling into a contig
- error correction
- vector removal using primer sequences (if option is checked)
- assembly using a reference sequence (if option is checked)
Example: you have a clone library with 500 clones, and you sequence each clone with two primers (Forward and Reverse). You end up with a folder of 1000 sequences that need to be assembled into 500 contigs. Assembling the contigs one at a time would be rather tedious. A faster option is to assemble them all in one step, using the batch mode button. The prerequisite is that the sequences are named according to a pattern that lets DNA Baser recognize which sequences belong to the same contig.
1. Open DNA Baser by clicking its icon on your desktop. When DNA Baser starts, the Project Manager should open by default. Navigate to the folder that contains your sequences (the sample's folder).

2. Open the Settings window and go to the Manager tab, which holds the settings for the batch assembly mode.

Here you tell DNA Baser which name pattern to use. The name pattern means that files belonging to the same contig share the first letters of their names (which identify the contig) and differ in the last letters (which identify each sequence).
In this case, the sequences belonging to the same contig have the same name, except for the last letter. For example, E10B082TF and E10B082TR belong to the same contig. The first 8 letters are identical in both names (E10B082T), which enables DNA Baser to assign them to the same contig.
In this window, you can also set DNA Baser to either move or copy the sequences that cannot be assembled into separate folders.
When you are done, press OK and go to the Project Manager window.
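The name pattern described in step 2 can be illustrated with a short Python sketch. This is not DNA Baser's own code, only a minimal illustration of grouping by a shared name prefix, assuming a fixed prefix length of 8 letters as in the example. The function name `group_by_prefix` and the third file name (E10B083TF) are hypothetical, added only to show a sequence without a pair.

```python
from collections import defaultdict


def group_by_prefix(file_names, prefix_length=8):
    """Group sequence names by their first `prefix_length` letters.

    Illustrative only: this mimics the idea of the name pattern,
    not the actual DNA Baser implementation.
    """
    groups = defaultdict(list)
    for name in file_names:
        groups[name[:prefix_length]].append(name)
    return dict(groups)


# E10B082TF and E10B082TR come from the tutorial; E10B083TF is a
# hypothetical extra file that has no partner.
names = ["E10B082TF", "E10B082TR", "E10B083TF"]
groups = group_by_prefix(names)

# Groups with at least two members are candidates for a contig;
# groups with a single member have no pair.
paired = {prefix: members for prefix, members in groups.items() if len(members) > 1}
unpaired = {prefix: members for prefix, members in groups.items() if len(members) == 1}

print(paired)
print(unpaired)
```

Running a check like this on the file names before starting the batch can help confirm that the chosen number of identical letters separates the files as intended.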
3. Next, press the START BATCH ASSEMBLY button. DNA Baser will start assembling the sequences into contigs.
4. During the assembly process, a log file is generated and automatically saved in the same folder as the sequences used. If DNA Baser has not already shown the log window, press the button that displays the log. The log contains information about each individual assembly process, a batch job summary, and the list of parameters used for assembling.

5. DNA Baser creates the following folders (located in the sample's folder):
.../Output - this is the folder where the contigs are saved. It contains two types of files: individual contig files (in fasta format) and one file with all the generated contigs, in multifasta format. The individual contig files are named with the prefix "Contig" and a suffix indicating the invariable part of the original sequence names (in this example, the contig from E10B082TF and E10B082TR will be named "Contig - E10B082T"). The multifasta file is saved with a name derived from the name of the sample's folder.
.../Unassembled - the sequences that do not fulfill the assembly parameters, and hence cannot be assembled into contigs, are moved/copied here. In order to assemble them, the assembly parameters must be relaxed. Click here to see where you can change the assembly parameters.
.../Unpaired - the sequences that do not have a pair are moved/copied here.
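To check the result of a batch job, the multifasta file from the Output folder can be read with a few lines of Python. The sketch below is a generic multifasta reader, not part of DNA Baser. The header lines in the example data are an assumption (the tutorial does not show what the headers inside the file look like), and the sequences are made up for illustration.

```python
def read_multifasta(lines):
    """Parse multifasta text lines into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical file content: headers and sequences are invented here.
example = [
    ">Contig - E10B082T",
    "ACGTAC",
    "GGTA",
    ">Contig - E10B083T",
    "TTGACA",
]
contigs = read_multifasta(example)

print(len(contigs), "contigs found")
for name, sequence in contigs.items():
    print(name, len(sequence))
```

With a real file, the list passed to `read_multifasta` would be an open file object. Comparing the number of records with the number of expected contigs (500 in the clone library example) gives a quick indication of how many pairs ended up in the Unassembled or Unpaired folders.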