Dictation machine: A large-vocabulary, speaker-independent, continuous speech recognition system is usually called a dictation machine. Its architecture is an HMM topology built on the acoustic model and language model described above. In training, the model parameters of each primitive are estimated with the forward-backward algorithm. In recognition, the primitives are concatenated into words, a silence model is inserted between words, and the language model supplies the transition probabilities between words. The result is a looped network, which is decoded with the Viterbi algorithm. Because Chinese is easy to segment, a simplified way to improve efficiency is to segment the input first and then decode it segment by segment.
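The decoding step can be illustrated with a minimal sketch of Viterbi search over a toy network. Everything in the example is an assumption made for illustration and not part of any real system: the two "words" (yes, no), the single silence state, the three abstract acoustic symbols, and all probabilities. A real dictation machine expands each word into a chain of primitive HMM states and uses continuous acoustic scores; here each word is collapsed to one state, and the transition table plays the role of the language model.

```python
import math

def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Choose the predecessor that maximizes the path score into s
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path

# Toy model (all values are illustrative assumptions)
states = ["yes", "no", "sil"]
log_start = to_log({"yes": 0.4, "no": 0.4, "sil": 0.2})
# Word-to-word transitions stand in for the language model
log_trans = to_log({
    "yes": {"yes": 0.5, "no": 0.1, "sil": 0.4},
    "no":  {"yes": 0.1, "no": 0.5, "sil": 0.4},
    "sil": {"yes": 0.4, "no": 0.4, "sil": 0.2},
})
# Emission table stands in for the acoustic model
log_emit = to_log({
    "yes": {"y": 0.8, "n": 0.1, "_": 0.1},
    "no":  {"y": 0.1, "n": 0.8, "_": 0.1},
    "sil": {"y": 0.1, "n": 0.1, "_": 0.8},
})

score, path = viterbi(["y", "y", "_", "n"], states, log_start, log_trans, log_emit)
print(path)  # ['yes', 'yes', 'sil', 'no']
```

Working in log-probabilities avoids numerical underflow on long utterances. Segment-by-segment decoding would correspond to calling viterbi separately on each segment instead of on the whole observation sequence.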
Dialogue system: A system that carries out spoken human-machine dialogue is called a dialogue system. Limited by current technology, dialogue systems are usually restricted to a narrow domain with a limited vocabulary; typical topics include travel inquiries, reservations, and database retrieval. The front end is a speech recognizer, which produces an N-best candidate list or a word lattice. A parser analyzes this output to extract semantic information, the dialogue manager then determines the response, and a speech synthesizer outputs it. Because the vocabulary of current systems is often limited, semantic information can also be obtained by keyword extraction.
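The keyword-extraction shortcut can be sketched roughly as follows. This is only an illustration under stated assumptions: the keyword table, the slot names (intent, mode, city), the function names, and the sample N-best list with its scores are all hypothetical, and the speech recognizer and synthesizer are replaced by plain text input and output.

```python
# Hypothetical keyword table for a travel domain: keyword -> (slot, value)
KEYWORDS = {
    "book": ("intent", "reserve"),
    "reserve": ("intent", "reserve"),
    "when": ("intent", "inquire"),
    "train": ("mode", "train"),
    "flight": ("mode", "flight"),
    "beijing": ("city", "Beijing"),
    "shanghai": ("city", "Shanghai"),
}

def extract_slots(hypothesis):
    """Map the keywords found in one recognizer hypothesis to semantic slots."""
    slots = {}
    for word in hypothesis.lower().split():
        if word in KEYWORDS:
            slot, value = KEYWORDS[word]
            slots.setdefault(slot, value)  # keep the first value seen per slot
    return slots

def understand(n_best):
    """Pick the hypothesis that fills the most slots.

    n_best is a list of (text, recognizer_score) pairs. Ties are broken in
    favor of the higher recognizer score, because max() returns the first
    maximal item of the score-sorted list.
    """
    ranked = sorted(n_best, key=lambda h: h[1], reverse=True)
    best_text, _ = max(ranked, key=lambda h: len(extract_slots(h[0])))
    return extract_slots(best_text)

def respond(slots):
    """A toy dialogue manager: ask for missing slots, otherwise act."""
    if "intent" not in slots:
        return "What would you like to do?"
    if "city" not in slots:
        return "Which city?"
    mode = slots.get("mode", "trip")
    if slots["intent"] == "reserve":
        return f"Reserving a {mode} to {slots['city']}."
    return f"Looking up {mode} times for {slots['city']}."

# Illustrative N-best list; the top-scoring hypothesis contains a misrecognition
n_best = [
    ("book a fright to shanghai", -12.1),
    ("book a flight to shanghai", -12.4),
    ("look a flight to shanghai", -13.0),
]
slots = understand(n_best)
reply = respond(slots)  # this text would be passed to the speech synthesizer
print(reply)
```

Choosing the hypothesis that fills the most slots is just one simple heuristic for exploiting an N-best list; a full parser would instead analyze the sentence structure of each candidate.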