
How to use DNA Baser Assembler
Part 4


How to assemble using the batch mode


 

Level: beginner

 


     This tutorial will show you how to assemble contigs from chromatogram (sequence) files using the batch mode. During the batch assembly, the following steps are automatically performed for each individual assembling process:

 

   - cleaning of the bad ends
   - assembling into a contig
   - error correction
   - vector removal using primer sequences (if this option is activated in the Settings window)
   - assembly using a reference sequence (if this option is activated in the Project Manager window)


Scenario:

 

You have a clone library with 500 clones and you use two primers (Forward and Reverse) to sequence each clone. At the end of the sequencing process, you will have a folder with 1000 sequences, which need to be assembled into 500 contigs. Assembling one contig at a time would be rather tedious. With DNA Baser, you can assemble all the sequences in one step. The prerequisite is that the sequences are named after a pattern that helps DNA Baser recognize which sequences belong to the same contig.

 

There are four simple steps to assemble all your files in batch mode:

 1. Start DNA Baser

 2. Set parameters (only two parameters to set)

 3. Press the START button

 4. Done (Inspect the Log window)

 

Let's start:

 

1. Start DNA Baser. The Project Manager should open by default. Navigate to the folder that contains your sequences (in our case, "E:\Batch mode Test\").


Fig.1 - The Project Manager

 

2. Open the Settings window and navigate to the Manager tab. There, locate the 'Batch mode' section.

 

Fig.2 - Setting parameters for batch assembly

 

Here, you tell DNA Baser what the file name pattern is. The 'file name pattern' means that files belonging to the same contig share the first letters of their names (which identify the contig) and differ in the last ones (which identify each sequence). For example, files E10B082TF and E10B082TR belong to the same contig. If we set the 'Length of fixed part' parameter to 8, DNA Baser will see that the first 8 letters (E10B082T) are identical in both names and will assign both sequences to the same contig.
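The grouping rule described above can be illustrated with a short Python sketch. This is not DNA Baser's own code; the function name group_by_fixed_part, the ".ab1" extension and the third file name are assumptions made up for the illustration. It only shows how a fixed-length prefix can assign files to contigs.

```python
from collections import defaultdict
from pathlib import PurePath


def group_by_fixed_part(file_names, fixed_length):
    """Group file names by the first `fixed_length` characters of their stem.

    Hypothetical helper: mimics the 'Length of fixed part' idea only.
    """
    groups = defaultdict(list)
    for name in file_names:
        stem = PurePath(name).stem          # file name without its extension
        groups[stem[:fixed_length]].append(name)
    return dict(groups)


# Example file names (the .ab1 extension is an assumption for illustration)
files = ["E10B082TF.ab1", "E10B082TR.ab1", "E10B083TF.ab1"]
groups = group_by_fixed_part(files, 8)

for fixed_part, members in groups.items():
    print(fixed_part, members)
```

With a fixed part of 8 letters, the two E10B082T files fall into one group, while the E10B083T file stands alone.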

 

In the Settings window, you can also set DNA Baser to either move or copy the sequences that cannot be assembled into separate folders. Once DNA Baser is prepared for the batch assembly job, press OK to close the Settings window and return to the Project Manager window.

 

3. Here we press the START BATCH ASSEMBLY button. DNA Baser will start assembling the sequences into contigs.

4. During the assembly process, a log file is generated and automatically saved in the current folder. The log can also be viewed in the Log window. The log contains information about each individual assembly process, a batch job summary, and the list of parameters used for the assembly.

 


Fig.3 - The log window


5. During the batch assembly process, DNA Baser creates the following folders (located in the current folder):

 

Output

This is the folder where the newly created contigs are saved. The contigs are saved both as individual contig files (in FASTA format) and together in one multiFASTA file. The individual contig files are named with the prefix "Contig" and a suffix indicating the invariable part of the original sequence names (in this example, the contig from E10B082TF and E10B082TR will be named "Contig - E10B082T"). The multiFASTA file is saved with a name derived from the name of the current folder.
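To make the difference between the two output formats concrete, here is a minimal Python sketch of a single FASTA record and a multiFASTA file (several records concatenated). The function names, the 60-character line width and the sequences are invented for the illustration; the exact layout DNA Baser writes may differ.

```python
def fasta_record(name, sequence, width=60):
    """Format one sequence as a FASTA record: a '>' header line plus wrapped sequence lines."""
    lines = [f">{name}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"


def multifasta(records):
    """Concatenate several (name, sequence) pairs into one multiFASTA text."""
    return "".join(fasta_record(name, seq) for name, seq in records)


# Made-up contig names and sequences, for illustration only
contigs = [
    ("Contig - E10B082T", "ACGTACGTAC"),
    ("Contig - E10B083T", "TTGACCA"),
]

single = fasta_record(*contigs[0])   # content of one individual contig file
combined = multifasta(contigs)       # content of the multiFASTA file
print(combined, end="")
```

An individual contig file holds one such record, whereas the multiFASTA file holds all records one after another.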

 

Unassembled

DNA Baser moves or copies into this folder the sequences that do not fulfill the assembly parameters and hence cannot be assembled into contigs. To assemble them, the assembly parameters must be relaxed. Click here to see where you can change the assembly parameters.

 

Unpaired

Sequences that do not have a pair are moved or copied here.
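As a rough illustration of what "unpaired" means, the Python sketch below flags files whose fixed name part is not shared with any other file. It assumes that a sequence counts as unpaired when it is the only one with its fixed part; the function name find_unpaired and the file names are hypothetical, and DNA Baser's actual rule may be more elaborate.

```python
from collections import Counter


def find_unpaired(file_names, fixed_length):
    """Return the files whose first `fixed_length` characters match no other file.

    Assumption: a file is 'unpaired' when its fixed part occurs exactly once.
    """
    counts = Counter(name[:fixed_length] for name in file_names)
    return [name for name in file_names if counts[name[:fixed_length]] == 1]


# Example file names (extension and third file are assumptions)
files = ["E10B082TF.ab1", "E10B082TR.ab1", "E10B083TF.ab1"]
unpaired = find_unpaired(files, 8)
print(unpaired)
```

Here only the E10B083T file would end up in the Unpaired folder, because no reverse read with the same fixed part exists.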

 

 

Note: Different file types can be mixed together. For example, you can assemble ABI, SCF and FASTA sequences together.
