Hidden Markov Models for gene finding (caricature)

Most current gene-finding programs are based on Hidden Markov Models. These work as follows. Assume (wrongly) that the DNA sequence has been generated randomly by a Markov model that can be in one of two states: “gene” or “intergenic region.” Each state has a characteristic probability of “emitting” a given nucleotide, and a characteristic (low) probability of switching to the other state. The observer sees the sequence of emissions (nucleotides), but which state emitted a given nucleotide is hidden from the observer.
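
To make the caricature concrete, here is a minimal sketch of how the hidden states might be recovered from the observed nucleotides. It uses the standard Viterbi algorithm, which finds the single most probable state path. All numbers are assumptions made up for illustration (a GC-rich "gene" state, an AT-rich "intergenic" state, and a 0.1 probability of switching), as are the names STATES, START, TRANS, EMIT and viterbi. Real gene finders use far richer models.

```python
import math

STATES = ("gene", "intergenic")

# Hypothetical parameters, chosen only for illustration.
START = {"gene": 0.5, "intergenic": 0.5}
TRANS = {  # low probability of switching to the other state
    "gene": {"gene": 0.9, "intergenic": 0.1},
    "intergenic": {"gene": 0.1, "intergenic": 0.9},
}
EMIT = {  # per-state probability of emitting each nucleotide
    "gene": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "intergenic": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq):
    """Return the most probable hidden-state path for seq (log-space Viterbi)."""
    if not seq:
        return []
    # Log-probability of the best path ending in each state after the first symbol.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for nt in seq[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this position.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointer[s] = prev
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][nt])
            )
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    dna = "ATATATGCGCGCGC"
    for nt, state in zip(dna, viterbi(dna)):
        print(nt, state)
```

With these assumed parameters, an AT-rich stretch followed by a GC-rich stretch is labelled "intergenic" and then "gene", while a single stray G inside an AT-rich stretch is not enough to trigger a switch, because two state changes cost more than one unlikely emission.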
