You can use custom reference sequences, but only admin users can create custom reference sequence databases. Please refer to the following article to learn how to create these databases.
To be included in a custom reference sequence database, the reference sequence(s) must satisfy certain requirements, which differ depending on the selected pipeline. The requirements for the Scaffold and Antibody annotators are outlined below:
The scaffold and reference sequence annotations must match in order to produce correct results. For example, if the reference sequence FR1 region is annotated as Type: FR, the scaffold FR1 region must also use FR as its annotation type. If the scaffold and reference sequence annotations do not match, the scaffold analysis will most likely produce incorrect results.
Note that there are currently NO checks for this.
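Because no check is built in, you may want to compare annotation types yourself before running the analysis. The following is a minimal Python sketch of such a comparison, not part of the product. It assumes you have already exported each sequence's annotations as (name, type) pairs; the function name `find_type_mismatches` and the data layout are hypothetical.

```python
def find_type_mismatches(reference_annotations, scaffold_annotations):
    """Return (name, reference_type, scaffold_type) for every annotation name
    present in both lists whose annotation types differ."""
    scaffold_types = dict(scaffold_annotations)
    mismatches = []
    for name, ref_type in reference_annotations:
        scaffold_type = scaffold_types.get(name)
        # Only compare annotations that exist on both sequences.
        if scaffold_type is not None and scaffold_type != ref_type:
            mismatches.append((name, ref_type, scaffold_type))
    return mismatches


# Hypothetical example data: FR1 has type FR on the reference
# but type FR1 on the scaffold, which is a mismatch.
reference = [("FR1", "FR"), ("CDR1", "CDR")]
scaffold = [("FR1", "FR1"), ("CDR1", "CDR")]

print(find_type_mismatches(reference, scaffold))
```

An empty result means every annotation shared by the two sequences uses the same type.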
The Antibody annotator requires each reference sequence to be identifiable as a heavy or light chain, through either its annotations or its name. To make heavy and light chain reference sequences distinguishable, use one of the following options:
- Annotate heavy chains with VDJ-region and light chains with VJ-region annotation names (CDS type).
- Include in each reference sequence an annotation named V-region, J-region or D-region (V_segment, J_segment and D_segment type respectively) with a gene name property starting with IGH (for heavy chain) or IGL or IGK (for light chain); e.g. IGHV1-1*01.
- In the absence of annotations, include the words Heavy-Chain or Light-Chain in the reference sequence name.
Frameworks and CDRs can also be annotated, using the annotation types FR and CDR, respectively.
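The identification rules above can be sketched in Python as a quick way to pre-check your reference sequences. This is an illustration only, not the annotator's actual implementation. The function name `classify_chain`, the dictionary keys (`name`, `type`, `gene`), the order in which the rules are tried, and the case-sensitive matching are all assumptions.

```python
# Expected annotation type for each segment annotation name (from the rules above).
SEGMENT_TYPES = {
    "V-region": "V_segment",
    "D-region": "D_segment",
    "J-region": "J_segment",
}


def classify_chain(sequence_name, annotations):
    """Return "heavy", "light", or None if the chain cannot be determined.

    annotations is a list of dicts with keys "name", "type" and, optionally,
    "gene" (a hypothetical layout chosen for this sketch).
    """
    # Rule 1: CDS annotations named VDJ-region (heavy) or VJ-region (light).
    for ann in annotations:
        if ann.get("type") == "CDS":
            if ann.get("name") == "VDJ-region":
                return "heavy"
            if ann.get("name") == "VJ-region":
                return "light"

    # Rule 2: V/D/J segment annotations whose gene name starts with IGH, IGK or IGL.
    for ann in annotations:
        expected_type = SEGMENT_TYPES.get(ann.get("name"))
        if expected_type is not None and ann.get("type") == expected_type:
            gene = ann.get("gene", "")
            if gene.startswith("IGH"):
                return "heavy"
            if gene.startswith(("IGK", "IGL")):
                return "light"

    # Rule 3: with no annotations at all, fall back to the sequence name.
    if not annotations:
        if "Heavy-Chain" in sequence_name:
            return "heavy"
        if "Light-Chain" in sequence_name:
            return "light"

    return None


print(classify_chain("ref A", [{"name": "VDJ-region", "type": "CDS"}]))
print(classify_chain("ref B", [{"name": "V-region", "type": "V_segment", "gene": "IGHV1-1*01"}]))
print(classify_chain("My Light-Chain reference", []))
```

A result of `None` indicates that the sequence, as annotated and named, does not satisfy any of the three options and would need to be updated before use.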
If a database contains more than one reference sequence, these reference sequences must be grouped into a sequence list.
To group multiple sequences into a single sequence list, select two or more sequences and click Group Sequences in the Pre-processing dropdown.
Note that a reference sequence database may contain one or more reference sequences and multiple databases may be used simultaneously (each of which may be from a different chain).
Please contact support if you require assistance with this.