A nucleotide sequence may represent a 5’ (five prime) partial coding region (CDS). A 5’ partial CDS encodes a protein with an incomplete N-terminus. Such a nucleotide sequence can start with either the first nucleotide of a complete codon (coding triplet) or with an incomplete codon (lacking the first nucleotide or the first and the second nucleotides of the codon). Codon completion determines the reading frame for translating a 5’ partial CDS into protein. GenBank uses the term “codon_start” as a synonym for the reading frame.
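As an illustration of how the reading frame controls translation, here is a minimal Python sketch. It assumes the standard genetic code and an uppercase or lowercase DNA sequence. The function name translate_partial_cds and the example sequence are hypothetical and chosen only for this illustration; they are not part of GenBank or BLAST.

```python
# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_partial_cds(seq, codon_start):
    """Translate a 5' partial CDS, starting at the given codon_start (1, 2, or 3).

    Hypothetical helper: nucleotides before the first complete codon are
    skipped, and an incomplete codon at the 3' end is ignored.
    """
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2, or 3")
    seq = seq.upper()
    offset = codon_start - 1  # number of leading nucleotides to skip
    protein = []
    for i in range(offset, len(seq) - 2, 3):
        # Codons with ambiguous bases are reported as "X".
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# Made-up example sequence, translated in each of the three frames.
example = "CTGGCAGGT"
for frame in (1, 2, 3):
    print(frame, translate_partial_cds(example, frame))
```

The same nucleotide sequence yields a different protein in each frame, which is why the codon_start value must match the first complete codon.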
Nucleotide BLAST (blastn) can help determine the correct reading frame of a 5’ partial CDS. Use the CDS feature display on the BLAST search results page. See the article on blastn and CDS feature setup.
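To determine the reading frame for a 5’ partial CDS, note which nucleotide of the Query sequence carries the first single-letter amino acid (AA) code in the CDS feature display. That placement gives the reading frame as follows: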
AA code placed on the 2nd nucleotide: reading frame (codon_start) is 1
Explanation: BLAST places the single-letter AA codes in the middle of the complete codons. In this case, nucleotides 1, 2, and 3 represent a complete codon. The translation therefore starts with nucleotide 1.
AA code placed on the 3rd nucleotide: reading frame (codon_start) is 2
Explanation: The translation skips the first nucleotide of the sequence to start at the first complete codon (nucleotides 2, 3, and 4).
AA code placed on the 4th nucleotide: reading frame (codon_start) is 3
Explanation: The translation skips the first two nucleotides of the sequence to start at the first complete codon (nucleotides 3, 4, and 5).
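The three rules above reduce to one relationship: codon_start is the position of the first AA code minus 1. The following Python sketch shows this under a simplifying assumption: the AA annotation is given as a plain text line padded with spaces so that each column matches one nucleotide of the sequence, starting at nucleotide 1. This is a hypothetical representation for illustration, not the actual BLAST output format, and both function names are made up.

```python
def first_aa_position(annotation_line):
    """Return the 1-based column of the first AA code in a space-padded line.

    Assumes (hypothetically) that column 1 corresponds to nucleotide 1.
    """
    stripped = annotation_line.lstrip(" ")
    if not stripped:
        raise ValueError("annotation line contains no AA code")
    return len(annotation_line) - len(stripped) + 1


def codon_start_from_aa_position(position):
    """Map the nucleotide position under the first AA code to codon_start.

    The AA code sits on the middle nucleotide of its codon, so the codon
    begins one nucleotide earlier: positions 2, 3, 4 give frames 1, 2, 3.
    """
    if position not in (2, 3, 4):
        raise ValueError("the first AA code must sit on nucleotide 2, 3, or 4")
    return position - 1


# Made-up annotation lines for the three possible frames.
for line in (" L  A  G", "  W  Q", "   G  R"):
    pos = first_aa_position(line)
    print(pos, codon_start_from_aa_position(pos))
```

A first AA code on nucleotide 1, or on nucleotide 5 or later, does not match any of the three cases and is rejected rather than guessed at.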
See Figures 1, 2, and 3 for examples of the three reading frames.
Figure 1: A pairwise BLAST alignment with the CDS feature display. Query aligns to Subject from base 1. The lack of an initiation codon (ATG) indicates a 5’ partial CDS. The first complete codon on Query (underlined in red) consists of bases 1, 2, and 3, with the AA residue “L” in the middle of the codon. Query's reading frame is 1.
Figure 2: A pairwise BLAST alignment with the CDS feature display. Query aligns to Subject from base 1. The lack of an initiation codon (ATG) indicates a 5’ partial CDS. The first complete codon on Query (underlined in red) consists of bases 2, 3, and 4, with the AA residue “A” in the middle of the codon. Query's reading frame is 2.
Figure 3: A pairwise BLAST alignment with the CDS feature display. Query aligns to Subject from base 1. The lack of an initiation codon (ATG) indicates a 5’ partial CDS. The first complete codon on Query (underlined in red) consists of bases 3, 4, and 5, with the AA residue “G” in the middle of the codon. Query's reading frame is 3.