IRCS/CCN Summer Workshop: Statistical Speech Processing


Other typical details:

Complex elaborations of the basic ideas

• HMM states ← triphones ← words
  – each triphone → 3-5 states + connection pattern
  – phone sequence from pronouncing dictionary
  – clustering for estimation
• Acoustic features
  – RASTA-PLP etc.
  – Vocal tract length normalization, speaker clustering
• Output pdf for each state as mixture of Gaussians
• Language model as N-gram model over words
  – recency/topic effects
• Empirical weighting of language vs. acoustic models
• etc. etc.
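
A minimal sketch of three of the pieces listed above, under stated assumptions rather than as any particular recognizer's implementation: expanding a dictionary phone sequence into triphones, scoring one feature vector under a state's Gaussian-mixture output pdf, and combining acoustic and language-model log probabilities with an empirical weight. The function names, the `left-center+right` triphone notation, the `sil` boundary context, the diagonal covariances, and the example weight of 10.0 are all illustrative assumptions, not taken from the slide.

```python
import math

def expand_to_triphones(phones, boundary="sil"):
    """Map a phone sequence to context-dependent triphone labels.

    Assumption: 'left-center+right' labels, with a hypothetical
    boundary symbol standing in for the missing context at each edge.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]

def gmm_log_likelihood(x, weights, means, variances):
    """Log output density of one HMM state modeled as a Gaussian mixture.

    Assumption: diagonal covariances, so each component is a product
    of independent one-dimensional Gaussians.
    """
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_gauss = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mu, var)
        )
        component_logs.append(math.log(w) + log_gauss)
    # Log-sum-exp over components, shifted by the maximum for stability.
    top = max(component_logs)
    return top + math.log(sum(math.exp(c - top) for c in component_logs))

def combined_score(acoustic_logprob, lm_logprob, lm_weight):
    """Hypothesis score: acoustic log prob plus weighted LM log prob.

    lm_weight is tuned empirically on held-out data; no value is
    implied by the source.
    """
    return acoustic_logprob + lm_weight * lm_logprob

# Hypothetical one-word pronouncing dictionary, for illustration only.
pronouncing_dict = {"cat": ["k", "ae", "t"]}
print(expand_to_triphones(pronouncing_dict["cat"]))
# → ['sil-k+ae', 'k-ae+t', 'ae-t+sil']
```

The language-model weight is needed in practice because the acoustic and language-model scores are on very different scales, so their raw log probabilities cannot simply be added; a weight such as `combined_score(-100.0, -5.0, lm_weight=10.0)` would be chosen by tuning on held-out data.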