De Novo Assembly of Sanger Sequences
How do I assemble overlapping Sanger reads to create a single contiguous sequence?
The SnapGene "Assemble Contigs" tool uses the CAP3 assembler to assemble reads into one or more contiguous assemblies.
This tool is designed primarily for assembly of a small set of Sanger reads, all derived from the same clonal source, and all of which are expected to overlap to form a contiguous sequence.
Click Tools → Assemble Contigs... to open the "Assemble Contigs" window.
Drag and drop Sanger trace files (.ab1 format) into the window to import them. Alternatively, click Import Sequences to Assemble → Import Sequence Files.
As a general rule, you should only assemble reads derived from the same clonal source.
Assemble the Sequences
Ensure the "Trim low-quality ends of sequences before running CAP3" option is checked.
Provide a name for your assembly and set the Save destination for the assembly.
Click Assemble to run the CAP3 assembler.
SnapGene trims reads with the same algorithm it uses to hide chromatogram ends at medium stringency, except that low-quality ends are removed rather than hidden before the reads are passed to the CAP3 assembler (see Set the Default Stringency for Hiding Chromatogram Ends).
Note that the CAP3 algorithm may separately trim reads based on quality.
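To illustrate what quality-based end trimming does in general, here is a minimal Python sketch. It is not SnapGene's or CAP3's actual algorithm, which is not described here. The function name trim_low_quality_ends, the sliding-window approach, and the window and min_mean_q values are all assumptions chosen for illustration.

```python
def trim_low_quality_ends(bases, quals, window=10, min_mean_q=20):
    """Trim both ends of a read until a window of bases reaches a mean quality.

    Hypothetical helper: `window` and `min_mean_q` are illustrative values,
    not the thresholds used by SnapGene or CAP3.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    n = len(quals)
    if n < window:
        return "", []

    def window_ok(start):
        # Mean Phred quality of the window beginning at `start`.
        return sum(quals[start:start + window]) / window >= min_mean_q

    # Advance from the 5' end until a window passes the threshold.
    left = 0
    while left + window <= n and not window_ok(left):
        left += 1
    if left + window > n:
        return "", []  # no window of acceptable quality anywhere

    # Retreat from the 3' end until the last window passes the threshold.
    right = n
    while right - window >= left and not window_ok(right - window):
        right -= 1

    return bases[left:right], quals[left:right]


trimmed, _ = trim_low_quality_ends(
    "NNACGTNN", [5, 5, 30, 30, 30, 30, 5, 5], window=2, min_mean_q=20
)
print(trimmed)  # ACGT
```

In this sketch the low-quality bases are removed outright, which mirrors the difference noted above between trimming before assembly and merely hiding chromatogram ends.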
A Settings... button is provided to allow users to alter the default CAP3 assembler settings. However, unless you are familiar with CAP3 we recommend you DO NOT alter these settings.
View the Assembly
The new sequence file opens with the Alignment side panel displayed. The initial sequence (labelled Original Sequence in Sequence view) in the file is the consensus created by the CAP3 assembler. In Map view all aligned reads will be depicted above the consensus.
If reads assemble into two or more contigs then a Collection will be created and each sequence in the Collection will represent a Contig.
Validate the Assembly
Switch to Sequence view to view and edit the assembly. The top panel #1 shows the initial CAP3 consensus (Original Sequence). The bottom panel #2 shows the CAP3 consensus (Original Sequence) and the aligned trace sequences within the field of view.
Click the right "Jump" triangle in the bottom panel to jump to the first mismatch or gap discrepancy in the alignment.
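Conceptually, jumping to a discrepancy means finding the next alignment column where a read disagrees with the consensus. The following Python sketch shows the idea under assumed conventions: the consensus and reads are equal-length strings, "-" marks a gap, and "." marks positions a read does not cover. The function name find_discrepancies and this representation are hypothetical, not SnapGene's internal format.

```python
def find_discrepancies(consensus, aligned_reads, uncovered="."):
    """Return 0-based column indices where any read differs from the consensus.

    Hypothetical representation: every read is padded to the consensus length,
    with `uncovered` marking columns the read does not span.
    """
    positions = []
    for i, ref_base in enumerate(consensus):
        for read in aligned_reads:
            base = read[i]
            # Skip columns the read does not cover; compare case-insensitively.
            if base != uncovered and base.upper() != ref_base.upper():
                positions.append(i)
                break  # one disagreeing read is enough to flag the column
    return positions


consensus = "ACGTAACGT"
reads = [
    "ACGT-ACG.",  # a compressed AA called as a single A leaves a gap
    "..GTAACGT",
]
print(find_discrepancies(consensus, reads))  # [4]
```

The first element of the returned list corresponds to the first mismatch or gap; later elements correspond to the discrepancies reached by subsequent jumps.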
View the Trace Sequence Chromatogram
Click on the disclosure triangles to expand each trace view.
Determine the cause of the disagreement. In the example above, a compression has resulted in two A peaks being called as a single A.
Option-click (macOS) or Alt-click (Windows/Linux) on any triangle to expand all traces simultaneously.
The expanded trace view provides controls to change display of the peaks, and provides information on trace length and orientation, and a summary of differences compared to the reference.
Edit the Traces
Correct Miscalled/Disagreeing Errors in Traces
Select the sequence to be corrected and type to add sequence, or hit delete to remove sequence. In the above example, the sequencer has miscalled a compressed AA double peak as a single A, so we have selected the gap and typed "a".
Click Insert to accept the insertion of an "a".
Click the right "Jump" triangle in the bottom panel to jump to the next mismatch or gap in the alignment. Continue editing until all reads are in agreement.
Update the "Original Sequence"
After correcting miscalls or gaps it is likely that the reads no longer agree with the Original Sequence (the initial CAP3 consensus).
To replace the "Original Sequence" with the corrected sequence defined by the edited reads, select all reads, then click Aligned Sequences → Replace Original with Aligned → Update this File.
Click File → Save to save the sequence file.
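As a rough illustration of how a corrected sequence can be derived from edited reads, the Python sketch below takes a per-column majority vote across aligned reads. How SnapGene derives the replacement sequence is not described here, so the function name majority_consensus, the "." and "-" conventions, and the tie-breaking behavior are assumptions.

```python
from collections import Counter


def majority_consensus(aligned_reads, uncovered=".", gap="-"):
    """Build a consensus by majority vote over equal-length aligned reads.

    Hypothetical sketch: `uncovered` marks columns a read does not span and
    `gap` marks an alignment gap. Ties go to the base seen first in the column.
    """
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    consensus = []
    for i in range(length):
        column = [read[i].upper() for read in aligned_reads if read[i] != uncovered]
        if not column:
            consensus.append("N")  # no read covers this column
            continue
        base, _count = Counter(column).most_common(1)[0]
        if base != gap:
            consensus.append(base)  # a majority gap contributes no base
    return "".join(consensus)


edited_reads = [
    "ACGTAACG.",  # read after the missing "a" has been inserted
    "..GTAACGT",
]
print(majority_consensus(edited_reads))  # ACGTAACGT
```

Once all reads agree, as in the workflow above, every covered column has a single candidate base and the vote is trivial.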