This article describes the Repair Sequences operation. Sequences can be repaired after annotation if you have observed that some of them could not be annotated correctly or contain low-quality or ambiguous bases. Repairing consists of replacing low-quality regions in your sequences with the corresponding sequences from the germline reference database used to annotate them. Repair Sequences is currently only supported for results produced by the Antibody Annotator pipeline.
The purpose of repair is to let you salvage sequences that would otherwise be difficult to compare against your high-quality sequences because of sequencing errors or missing data.
Repair Sequences is an alpha feature. If you would like access to it, please enquire with support staff.
How to Get Started
To repair your sequences, select a Biologics Annotator Result document in the documents viewer, then select all sequences or a subset that you wish to check for repair. In the dropdown labelled Post-processing above the sequence selection viewer, click on Repair Sequences... to bring up the operation options.
You may choose to repair one or more light and heavy chain regions in your sequences, where regions consist of FR1, CDR1, FR2, CDR2, FR3 and FR4. Note that CDR3 is intentionally excluded due to high variability. To select more than one region, hold Control/Command and click on multiple options.
The most common case is sequences with truncated ends. In this case, you could select the FR1 or FR4 regions for repair.
Conditions for Repair
The Repair Sequences operation runs on each selected sequence and determines whether the region(s) you selected require repair. A region requires repair if at least one of the following conditions is met:
- One or more ambiguous bases are present in the sequence of the region.
- The region was truncated relative to the reference database sequence.
- Only part of the region could be identified.
A region will not be repaired if any of the following applies:
- There is no gene annotation overlapping the target region.
- The region is completely missing from the target sequence.
- The overlapping gene does not contain the annotation to repair.
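The conditions above can be summarised as a small decision function. The following Python sketch is illustrative only and is not the product's implementation. The `Region` record, its fields and the `needs_repair` function are hypothetical names introduced for this example, and "truncated" and "only partly identified" are both simplified to "shorter than the reference region".

```python
from dataclasses import dataclass
from typing import Optional

UNAMBIGUOUS_BASES = set("ACGT")


@dataclass
class Region:
    """Hypothetical record for one annotated region of a query sequence."""
    name: str                  # e.g. "FR1"
    sequence: Optional[str]    # None if the region is completely missing
    gene: Optional[str]        # overlapping gene annotation, None if absent
    reference: Optional[str]   # same region in the matched gene, None if the gene lacks it


def needs_repair(region: Region) -> bool:
    # Cases in which the region is never repaired.
    if region.gene is None:        # no gene annotation overlaps the region
        return False
    if not region.sequence:        # region is completely missing
        return False
    if region.reference is None:   # matched gene has no such region annotation
        return False

    # Cases in which the region is marked for repair.
    has_ambiguous = any(b not in UNAMBIGUOUS_BASES for b in region.sequence.upper())
    is_short = len(region.sequence) < len(region.reference)  # simplifying assumption
    return has_ambiguous or is_short
```

For instance, under these assumptions a region with the placeholder sequence `ACGNT` and a matched reference is marked for repair, while the same region with no overlapping gene annotation is skipped.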
How is Repair Conducted?
If a region has been marked for repair, its sequence is entirely replaced by the sequence of the same region from the closest gene match.
For example, if you have selected to repair FR1 and one of your sequences has an ambiguous nucleotide inside its FR1 annotation, that region will be marked for repair. If FR1 was annotated as matching the gene IGKV1-9*01, the FR1 sequence for this gene will be taken from the reference database originally used to annotate your sequences. This reference FR1 sequence will entirely replace the existing FR1 sequence containing the ambiguous nucleotide. When a region is replaced, an annotation covering the entire replaced region is added, and it includes the original sequence information for future reference.
Note that the rest of the sequence to either side of the repaired region will remain unaltered.
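The replacement step can be pictured as a simple splice. The Python sketch below is a minimal illustration under stated assumptions, not the actual implementation: `repair_region`, its parameters and the annotation dictionary are hypothetical, and the sequences are short placeholders rather than real FR1 or IGKV1-9*01 data.

```python
from typing import Dict, Tuple


def repair_region(
    sequence: str, start: int, end: int, reference_region: str
) -> Tuple[str, Dict[str, object]]:
    """Replace sequence[start:end] with reference_region, leaving the flanks unchanged."""
    original = sequence[start:end]
    repaired = sequence[:start] + reference_region + sequence[end:]
    # Hypothetical annotation spanning the replaced region, keeping the
    # original bases for future reference.
    annotation = {
        "start": start,
        "end": start + len(reference_region),
        "original_sequence": original,
    }
    return repaired, annotation


# Placeholder example: the region at positions 3..8 contains an ambiguous N.
repaired, note = repair_region("TTTACGNTGGG", 3, 8, "ACGTT")
print(repaired)                   # TTTACGTTGGG
print(note["original_sequence"])  # ACGNT
```

Only the bases between `start` and `end` change; the flanking `TTT` and `GGG` are carried over untouched, which mirrors the behaviour described above.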
What Happens Next?
Your selected sequences are then re-annotated with Antibody Annotator, using the same options that were used to create the original document. Both the repaired sequences and any selected sequences that did not require repair are re-annotated. In the new result, you should see fewer errors across your sequences.