Pairwise and multiple sequence alignment can be performed either before or after analysis, using the Align operation under the Post-processing tab.
To align sequences in the Files table, (1) select two or more sequences, (2) click Post-processing, and (3) select Align in the dropdown.
To align sequences within a Biologics Annotator Result document, (1) select the Biologics Annotator Result document, (2) select two or more sequences in the Sequence Table, (3) click Post-processing, and (4) select Align in the dropdown.
Region to align
You can choose to either align the entire sequence or just a region of your sequences. To align the entire sequence, click Entire sequence under Region to align. This operation will align all the selected sequences from the 5' end to the 3' end.
To align a selected region, click Extract region with name and select the appropriate region name in the dropdown. This operation will align the sequences from the 5' end to the 3' end of the selected region rather than the entire sequence.
To align a region found in both the heavy and light chains, use Filter by to choose either Heavy Chain or Light Chain from the dropdown. For example, to align the heavy CDR3 regions of the selected sequences, select Extract region with name: CDR3 and Filter by: Heavy Chain in the respective dropdowns.
If the selected sequences are not annotated, a warning message is displayed. To align on a region, please ensure that every selected sequence contains that region annotation.
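As a rough illustration of what extracting a named region with a chain filter amounts to, the sketch below uses a hypothetical in-memory representation. The `Annotation` and `Record` classes and the `extract_region` helper are assumptions made for this example only; they are not part of Geneious Biologics.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    name: str    # region name, e.g. "CDR3"
    chain: str   # "Heavy" or "Light"
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive


@dataclass
class Record:
    name: str
    sequence: str
    annotations: List[Annotation] = field(default_factory=list)


def extract_region(record: Record, region: str,
                   chain: Optional[str] = None) -> Optional[str]:
    """Return the subsequence of the first matching annotation, or None."""
    for ann in record.annotations:
        if ann.name == region and (chain is None or ann.chain == chain):
            return record.sequence[ann.start:ann.end]
    return None  # the record does not carry the requested region


# Hypothetical record with a CDR3 annotated on each chain.
rec = Record(
    name="clone_1",
    sequence="AAACCCGGGTTT",
    annotations=[
        Annotation("CDR3", "Heavy", 3, 6),
        Annotation("CDR3", "Light", 6, 9),
    ],
)

heavy_cdr3 = extract_region(rec, "CDR3", chain="Heavy")
light_cdr3 = extract_region(rec, "CDR3", chain="Light")
missing = extract_region(rec, "CDR1")
```

A record for which `extract_region` returns `None` corresponds to a sequence that lacks the chosen region annotation and therefore cannot take part in a region-based alignment.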
You can also align translated sequences by selecting the Translate nucleotide sequence(s) prior to alignment option. This operation will translate the nucleotide sequence according to the selected genetic code and translation frame before aligning the sequences. The available genetic codes are obtained from NCBI.
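To make the effect of the translation frame concrete, here is a minimal, self-contained sketch that translates a nucleotide sequence with the standard genetic code (NCBI translation table 1). The `translate` function and its `frame` parameter are illustrative assumptions, not the implementation used by the Align operation.

```python
from itertools import product

# NCBI translation table 1 (standard code), codons ordered with
# T, C, A, G at each position and the third base varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str, frame: int = 1) -> str:
    """Translate in forward frame 1, 2 or 3; unknown codons become 'X'."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = nucleotides.upper().replace("U", "T")[frame - 1:]
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for i in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


frame1 = translate("ATGGCCTGA", frame=1)
frame2 = translate("ATGGCCTGA", frame=2)
frame3 = translate("ATGGCCTGA", frame=3)
```

The same nucleotide sequence yields a different protein in each frame, which is why the frame must be chosen before the translated sequences are aligned.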
The standard start codon, AUG, codes for methionine (Met) in eukaryotes and a modified Met (fMet) in prokaryotes. Alternative start codons are still translated as Met when they are located at the start of a coding sequence. When Consider Alternative Start Codons is selected, you can choose from the following options:
- Auto detect: Alternative start codons are translated as M when the annotation is of type CDS, ORF or gene
- Always consider: Alternative start codons are translated as M regardless of annotation type
- Always ignore: Alternative start codons are not translated as M
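The three options can be read as a small decision rule. The sketch below is one possible way to express it; the `StartCodonMode` enum, the helper names, and the `ALT_STARTS` set (the alternative initiation codons that NCBI lists for the standard code) are assumptions for illustration and do not describe the product's internals.

```python
from enum import Enum
from typing import Optional


class StartCodonMode(Enum):
    AUTO_DETECT = "Auto detect"
    ALWAYS_CONSIDER = "Always consider"
    ALWAYS_IGNORE = "Always ignore"


# Annotation types that trigger Met translation under "Auto detect".
CODING_ANNOTATION_TYPES = {"CDS", "ORF", "gene"}

# Illustrative set: alternative initiation codons of NCBI table 1.
ALT_STARTS = {"TTG", "CTG"}


def translate_start_as_met(mode: StartCodonMode,
                           annotation_type: Optional[str] = None) -> bool:
    """Decide whether an alternative start codon is translated as M."""
    if mode is StartCodonMode.ALWAYS_CONSIDER:
        return True
    if mode is StartCodonMode.ALWAYS_IGNORE:
        return False
    # Auto detect: depends on the type of the enclosing annotation.
    return annotation_type in CODING_ANNOTATION_TYPES


def first_residue(codon: str, normal_residue: str, mode: StartCodonMode,
                  annotation_type: Optional[str] = None) -> str:
    """Residue for the first codon of a sequence under the chosen mode."""
    if codon in ALT_STARTS and translate_start_as_met(mode, annotation_type):
        return "M"
    return normal_residue


auto_cds = first_residue("TTG", "L", StartCodonMode.AUTO_DETECT, "CDS")
auto_other = first_residue("TTG", "L", StartCodonMode.AUTO_DETECT, "misc_feature")
always = first_residue("TTG", "L", StartCodonMode.ALWAYS_CONSIDER)
never = first_residue("TTG", "L", StartCodonMode.ALWAYS_IGNORE, "CDS")
```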
The Alignment algorithm dropdown allows you to select the alignment algorithm you wish to use. We currently support two methods: the iteration-based MUSCLE (multiple sequence comparison by log-expectation) and MAFFT (Multiple Alignment using Fast Fourier Transform). MAFFT is selected automatically if you select more than 1,000 sequences, as it performs much faster on large datasets.
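The automatic selection described above boils down to a simple threshold rule. A minimal sketch, assuming a hypothetical `default_algorithm` helper and a `requested` argument standing in for the dropdown choice:

```python
MAFFT_THRESHOLD = 1000  # MAFFT is pre-selected above this many sequences


def default_algorithm(n_sequences: int, requested: str = "MUSCLE") -> str:
    """Return the algorithm that would be pre-selected for a given input size."""
    if n_sequences > MAFFT_THRESHOLD:
        return "MAFFT"
    return requested


small_set = default_algorithm(250)
at_limit = default_algorithm(1000)
large_set = default_algorithm(1001)
```

Note that exactly 1,000 sequences does not cross the threshold in this sketch, since the rule applies to more than 1,000 sequences.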
To build a tree from the alignment, select Build tree from alignment with in the Expected Output section. You can then select one of the following tree builder algorithms: RAxML (Randomized Axelerated Maximum Likelihood) or Geneious (Neighbour-joining algorithm, Saitou & Nei 1987).
To view the aligned sequences in a tree view, (1) select a Tree file and (2) ensure that Show Tree in the Sequence Viewer Sidebar is selected.
**Note that the tree will be dissolved when the sequences are sorted. To view the alignment in tree format again, select Show Tree in the Sequence Viewer Sidebar.
To align sequences with a reference sequence, first upload the reference sequence into the same folder as the sequences selected for alignment, or into its parent folder. You can then choose the reference sequence in the Align with reference dropdown to align it with the selected sequences.
The reference sequence must meet the following requirements:
- Be located in the same folder as the sequences to be aligned or in the parent folder
- Contain the selected region annotation when aligning by region
- Be of File type Nucleotide Sequence when aligning with both nucleotide and amino acid sequences
- Be of File type Amino Acid Sequence when aligning with amino acid sequences
**Note that "Align with reference" is an alpha feature. Please contact us if you would like early access.
To include external assay data or metadata in the alignment document as a heatmap, first import the assay data into the results table before running the alignment. Learn more on how to add assay data to your results in the following article.
Upon importing assay data, select the sequences you would like to align and proceed with the alignment operation as usual. To view the metadata in the tree or alignment view, click Sidebar in the right Sequence Viewer panel and select the assay data you would like to be included in the view. To view the values of the metadata, click Show values for metadata and hover over the heatmap to see the details of the metadata.