Why do I get error messages when I try to annotate a coding region (CDS) in BankIt?

Your GenBank submission may include nucleotide sequences from protein-coding genes. In BankIt, you must annotate the coding region (CDS) on such sequences. GenBank indexers can't verify your sequences without the correct CDS annotation.

BankIt will report annotation errors if you do any of the following:

Enter incorrect CDS locations
Select the wrong coding strand (for example, the plus strand where it should be minus)
Select the wrong reading frame (codon_start) for a 5’ partial CDS
Submit a poor-quality sequence
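To see why each of these choices matters, you can translate the annotated region yourself and look for stop codons in the middle of the protein. The sketch below is only an illustration and is not BankIt's validation logic. It assumes the standard genetic code, 1-based inclusive coordinates, and a single-interval CDS. The function names (`translate_cds`, `internal_stops`) and the example sequence are hypothetical.

```python
# Minimal sketch: translate an annotated CDS and count internal stop codons.
# Assumptions: standard genetic code (transl_table=1), 1-based inclusive
# coordinates, one continuous interval. All names here are illustrative.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_cds(seq, start, end, strand="+", codon_start=1):
    """Translate seq[start..end] on the given strand and reading frame."""
    region = seq.upper()[start - 1:end]
    if strand == "-":
        region = reverse_complement(region)
    region = region[codon_start - 1:]          # skip 0, 1, or 2 leading bases
    usable = len(region) - len(region) % 3     # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(region[i:i + 3], "X")  # X for codons with N or other symbols
        for i in range(0, usable, 3)
    )


def internal_stops(protein):
    """Count stop codons anywhere except the final position."""
    return protein[:-1].count("*")


example = "ATGTTAACTGGCTGA"  # made-up 15-base sequence

correct = translate_cds(example, 1, 15, "+", 1)       # "MLTG*"
wrong_frame = translate_cds(example, 1, 15, "+", 2)   # "C*LA"
wrong_strand = translate_cds(example, 1, 15, "-", 1)  # "SAS*H"

for label, protein in [("correct", correct),
                       ("wrong frame", wrong_frame),
                       ("wrong strand", wrong_strand)]:
    print(label, protein, "internal stops:", internal_stops(protein))
```

In this example, only the correct strand and frame give a translation without an internal stop. A wrong location, strand, or codon_start, or a sequencing error that shifts the frame, usually shows up the same way.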

You can avoid annotation problems if you analyze your sequences before you submit, for example with Nucleotide BLAST (blastn). The blastn search result page offers the CDS feature display. This option shows protein translations of the coding regions. It can help you find the correct locations, reading frame, and strand for your CDS. In the same analysis, you can check for sequencing errors. We recommend fixing any errors before you start working in BankIt.
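As a quick local check alongside the blastn analysis, you can scan all six reading frames of a sequence and note which ones translate without an internal stop codon. This is a rough sketch under stated assumptions (standard genetic code, the whole sequence treated as coding). It does not replace a comparison against annotated database sequences, and the function names are hypothetical.

```python
# Minimal sketch: count internal stop codons in all six reading frames.
# Assumptions: standard genetic code (transl_table=1); the entire sequence
# is treated as coding. Function names are illustrative only.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip([a + b + c for a in BASES for b in BASES for c in BASES], AMINO)
)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_stop_counts(seq):
    """Map (strand, codon_start) to the number of internal stop codons."""
    seq = seq.upper()
    counts = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for codon_start in (1, 2, 3):
            frame = template[codon_start - 1:]
            usable = len(frame) - len(frame) % 3
            protein = "".join(
                CODON_TABLE.get(frame[i:i + 3], "X")
                for i in range(0, usable, 3)
            )
            # A stop in the last position may be the real end of the CDS.
            counts[(strand, codon_start)] = protein[:-1].count("*")
    return counts


def candidate_frames(seq):
    """Return the (strand, codon_start) pairs with no internal stop."""
    counts = six_frame_stop_counts(seq)
    return sorted(key for key, n in counts.items() if n == 0)


example = "ATGTTAACTGGCTGA"  # made-up 15-base sequence
counts = six_frame_stop_counts(example)
candidates = candidate_frames(example)
print(counts)
print(candidates)
```

A sequence this short leaves several frames open, so treat the result as a hint only. On a real CDS of a few hundred bases, most wrong frames contain stops, but the blastn CDS feature display remains the better guide because it compares your sequence with annotated records.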

See these articles for blastn methods to determine CDS properties:

See these articles for blastn methods to check for sequencing errors in CDS:

Try other BLAST tools if your blastn results are not adequate:

blastx to check for frameshifts in the CDS
Genomic BLAST for organisms with annotated assemblies

Keywords: GenBank submission, BankIt feature annotation, coding region annotation, CDS annotation problems, nucleotide sequence analysis, Nucleotide BLAST, blastn, CDS feature
