Basic architecture
of standard speech recognition technology
1. Bayes’ Rule:  P(W|S) ∝ P(S|W)P(W)
2. Approximate P(S|W)P(W)
as a Hidden Markov Model
     a probabilistic function      [ to get P(S|W)]
         of a Markov chain         [  to get P(W) ]
3. Use Baum/Welch (=EM) algorithm
       to “learn” HMM parameters
4. Use Viterbi decoding
       to find the most probable W given S
       in terms of the estimated HMM
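
As an illustration of step 4, here is a minimal sketch of Viterbi decoding on a toy discrete HMM, reading W as the hidden word-level sequence and S as the observed signal. Everything in it is an assumption made for the example: the state names w1/w2, the observation symbols a/b/c, and all probabilities are invented, and a real recognizer would take its parameters from Baum/Welch training (step 3) rather than from a hand-written table. Log-probabilities are used so that long sequences do not underflow.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical toy model: every number below is made up for illustration.
states = ["w1", "w2"]
log_start = logs({"w1": 0.6, "w2": 0.4})
# Markov chain over hidden states (the P(W) part)
log_trans = {
    "w1": logs({"w1": 0.7, "w2": 0.3}),
    "w2": logs({"w1": 0.4, "w2": 0.6}),
}
# Probabilistic function of the chain (the P(S|W) part)
log_emit = {
    "w1": logs({"a": 0.5, "b": 0.4, "c": 0.1}),
    "w2": logs({"a": 0.1, "b": 0.3, "c": 0.6}),
}

log_prob, path = viterbi(["a", "b", "c"], states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

For the observation sequence a, b, c the decoded path is w1, w1, w2, with joint probability 0.6 x 0.5 x 0.7 x 0.4 x 0.3 x 0.6 = 0.01512. The maximization runs over P(S|W)P(W), which by step 1 picks the same W as maximizing P(W|S).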