How to use DNA Baser Assembler
This tutorial will show you how to assemble contigs from chromatogram (sequence) files using the batch mode. During the batch assembly, the following steps are automatically performed for each individual assembling process:
- cleaning of the bad ends
Suppose you have a clone library with 500 clones and you use two primers (Forward and Reverse) to sequence each clone. At the end of the sequencing process, you will have a folder with 1000 sequences, which need to be assembled into 500 contigs. Assembling one contig at a time would be rather tedious. DNA Baser, however, can assemble all the sequences in one step. The prerequisite is that the sequences are named after a pattern that helps DNA Baser recognize the sequences that belong to the same contig.
There are four simple steps to assemble all your files in batch mode:
1. Start DNA Baser
2. Set parameters (only two parameters to set)
3. Press the START button
4. Done (inspect the Log window)
Let's start:
1. Start DNA Baser. The Project Manager opens by default. Navigate to the folder that contains your sequences (in our case it is "E:\Batch mode Test\").
Fig.1 - The Project manager
2. Open the Settings window and navigate to the Manager tab. Here locate the 'Batch mode' section.
Fig.2 - Setting parameters for batch assembly
Here, you tell DNA Baser which file name pattern to expect. The 'file name pattern' means that the files belonging to the same contig share the same first letters in their names (to identify the contig) and differ in the last ones (to identify each sequence). For example, files E10B082TF and E10B082TR belong to the same contig. If we set the 'Length of fixed part' parameter to 8, DNA Baser will see that the first 8 letters are identical in both names (E10B082T) and will assign the two sequences to the same contig.
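The grouping rule can be pictured with a short Python sketch. It is only an illustration of the naming rule, not DNA Baser's own code: the function name `group_by_fixed_part`, the parameter `fixed_len` (standing in for 'Length of fixed part'), the `.ab1` extension and the third file name are assumptions made for the example.

```python
from collections import defaultdict
from pathlib import PurePath


def group_by_fixed_part(file_names, fixed_len):
    """Group file names by the first `fixed_len` characters of their name.

    Hypothetical helper that mimics the 'Length of fixed part' rule;
    the file extension is ignored when comparing names.
    """
    groups = defaultdict(list)
    for name in file_names:
        stem = PurePath(name).stem  # name without extension, e.g. "E10B082TF"
        groups[stem[:fixed_len]].append(name)
    return dict(groups)


# Example file names (the extension and the third file are made up).
files = ["E10B082TF.ab1", "E10B082TR.ab1", "E10B083TF.ab1"]
groups = group_by_fixed_part(files, fixed_len=8)
```

With `fixed_len=8`, the two E10B082T files end up in one group, while E10B083TF.ab1 forms a group of its own.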
In this Settings window, you can also set DNA Baser to either move or copy the sequences that cannot be assembled into separate folders. After preparing DNA Baser for the batch assembly job, we press OK to close the Settings window and go back to the Project Manager window.
3. Here we press the START BATCH ASSEMBLY button.
4. During the assembly process, a log file is generated and automatically saved in the current folder. The log can also be viewed in the Log window. It contains information about each individual assembly process, a batch job summary, and the list of parameters used for the assembly.
Fig.3 - The log window
5. During the batch assembly process, DNA Baser creates the following folders (located in the current folder):
Output This is the folder where the newly created contigs are saved. The contigs are saved both as individual contig files (in FASTA format) and in multiFASTA format. The individual contig files are named with the prefix "Contig" and a suffix giving the invariable part of the original sequence names (in this example, the contig from E10B082TF and E10B082TR will be named "Contig - E10B082T"). The multiFASTA file is saved with a name derived from the name of the current folder.
Unassembled DNA Baser moves/copies to this folder the sequences that do not fulfill the assembly parameters and hence cannot be assembled into contigs. To assemble them, the assembly parameters must be relaxed.
Unpaired The sequences that do not have a pair are moved/copied here.
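How the files are sorted between contigs and the Unpaired folder can be sketched as follows. This is a simplified illustration, not DNA Baser's implementation: it assumes that a group containing a single file counts as "unpaired", it does not model the Unassembled case (which depends on the assembly parameters), and the function name `split_paired_unpaired` and the E10B083TF file are hypothetical.

```python
def split_paired_unpaired(groups):
    """Separate groups that can form a contig from files without a partner.

    Assumption for this sketch: a group with only one file is 'unpaired'.
    Contig names follow the "Contig - <fixed part>" convention described above.
    """
    paired = {}
    unpaired = []
    for fixed_part, names in groups.items():
        if len(names) >= 2:
            paired[f"Contig - {fixed_part}"] = sorted(names)
        else:
            unpaired.extend(names)
    return paired, sorted(unpaired)


groups = {
    "E10B082T": ["E10B082TR", "E10B082TF"],
    "E10B083T": ["E10B083TF"],  # made-up clone whose reverse read is missing
}
paired, unpaired = split_paired_unpaired(groups)
```

Here the E10B082T pair is assigned to "Contig - E10B082T", and E10B083TF is the only file that would go to the Unpaired folder.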
Note: Different file types can be mixed together. For example, you can assemble ABI, SCF and FASTA sequences together.
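For readers unfamiliar with the multiFASTA format used in the Output folder, the sketch below shows how several contigs can be combined into one multiFASTA text. It is a generic illustration of the format, not DNA Baser's writer: the function name `to_multifasta`, the line width and the sequences themselves are placeholders.

```python
def to_multifasta(contigs, line_width=60):
    """Combine named sequences into one multiFASTA string.

    `contigs` maps a contig name to its sequence. Each record starts with a
    ">" header line, followed by the sequence wrapped at `line_width` characters.
    """
    lines = []
    for name, sequence in contigs.items():
        lines.append(f">{name}")
        for start in range(0, len(sequence), line_width):
            lines.append(sequence[start:start + line_width])
    return "\n".join(lines) + "\n"


# Placeholder sequences, far shorter than real contigs.
contigs = {
    "Contig - E10B082T": "ACGTACGTAC",
    "Contig - E10B083T": "TTGACC",
}
multifasta = to_multifasta(contigs, line_width=8)
```

A small `line_width` is used here only to make the wrapping visible; the width DNA Baser actually uses is not stated in this tutorial.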