Hidden Markov Models for gene finding - the real picture
In reality, the situation is much more complicated. Coding
regions of genes are characterized not by the frequencies of
single nucleotides but by those of nucleotide triplets (codons)
and hexamers. Additional information is also used, such as
signals that indicate the beginning or end of a gene or a
splice site. Further difficulties arise because of:
existence of six possible reading frames
existence of introns in eukaryotes
variable codon usage frequencies in different species
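
To make the reading-frame and triplet/hexamer points concrete, here is a minimal Python sketch. It is an illustration only, not part of any particular gene finder: the function names (reverse_complement, six_frames, kmer_counts) are hypothetical, and the input is assumed to be an uppercase DNA string containing only A, C, G and T.

```python
from collections import Counter

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq):
    """Split a sequence into codons for all six reading frames.

    Frames "+1".."+3" read the given strand at offsets 0, 1, 2;
    frames "-1".."-3" do the same on the reverse complement.
    Trailing bases that do not fill a whole codon are dropped.
    """
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            frames[f"{strand}{offset + 1}"] = codons
    return frames


def kmer_counts(seq, k, step=1):
    """Count k-mers (k=3 for triplets, k=6 for hexamers) in a sequence.

    step=1 counts overlapping k-mers; step=3 with k=3 counts the
    codons of a single reading frame.
    """
    return Counter(seq[i:i + k] for i in range(0, len(seq) - k + 1, step))


if __name__ == "__main__":
    dna = "ATGGCCTAA"
    for name, codons in six_frames(dna).items():
        print(name, codons)
    print(kmer_counts(dna, 6))
```

In a gene-finding HMM, counts like these would typically be gathered separately from known coding and non-coding sequence of the species in question and turned into emission probabilities. Because codon usage varies between species, the counts have to be re-estimated for each organism.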