2. Text analysis refers to the representation of text and the selection of its feature items; it is a basic problem in text mining and information retrieval. It quantifies the feature words extracted from a text in order to represent the information the text carries, transforming unstructured raw text into structured information that a computer can recognize and process. In other words, the text is abstracted into a mathematical model that describes and stands in for it, and the computer works with the text by computing on this model. Because text is unstructured data, mining useful information from a large collection of texts first requires converting each text into a manageable structured form. At present, the vector space model is the usual choice for representing a text as a vector. However, if the feature items obtained by word segmentation and word frequency statistics are used directly, one per dimension of the text vector, the dimensionality of the vector becomes very large. Such an unprocessed text vector not only imposes a heavy computational cost on later stages and makes the whole pipeline inefficient, but also degrades the accuracy of classification and clustering algorithms, leading to unsatisfactory results. It is therefore necessary to refine the text vector further and identify the most representative features while preserving the original meaning of the text. The most effective way to address this problem is to reduce the dimensionality through feature selection.
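The effect of feature selection on the size of the vector can be illustrated with a minimal sketch. It rests on assumptions that are not part of the text above: whitespace splitting stands in for a real word segmentation algorithm, raw term counts serve as the weights, and the selection criterion is a simple document-frequency filter, chosen only because it is short to write down. The function names, the thresholds `min_df` and `max_df`, and the toy corpus are all hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Hypothetical stand-in for a word segmentation algorithm:
    # lowercase the text and split on whitespace.
    return text.lower().split()


def build_vocabulary(docs):
    # One dimension per distinct term in the corpus (no selection).
    return sorted({term for doc in docs for term in tokenize(doc)})


def to_vector(doc, vocabulary):
    # Term-frequency vector of `doc` over the dimensions in `vocabulary`.
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocabulary]


def select_by_document_frequency(docs, min_df, max_df):
    # Keep terms that occur in at least `min_df` and at most `max_df`
    # documents. Terms that are too rare or too common are dropped.
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return sorted(t for t, n in df.items() if min_df <= n <= max_df)


# Toy corpus, invented for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]

full_vocab = build_vocabulary(docs)
reduced_vocab = select_by_document_frequency(docs, min_df=2, max_df=2)

full_vectors = [to_vector(d, full_vocab) for d in docs]
reduced_vectors = [to_vector(d, reduced_vocab) for d in docs]

print(len(full_vocab), len(reduced_vocab))  # prints: 8 4
```

In this toy corpus the filter halves the number of dimensions, from 8 to 4, by discarding the term that appears in every document and the terms that appear in only one. Other criteria, such as information gain or the chi-square statistic, would fit into the same structure by replacing `select_by_document_frequency`.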