Automatic scoring algorithm for articles
Automated essay scoring is abbreviated AES (Automated Essay Scoring). An AES system uses NLP techniques to grade essays automatically, which reduces the burden on human raters. Many large-scale exams already use AES algorithms to score essays, such as the GRE: each essay is scored by a human rater and by the AES system, and if the two scores differ too much, a second human rater is brought in. This article introduces two classic automatic scoring algorithms.

By optimization objective (loss function), automatic scoring algorithms can be roughly divided into three types:

Traditional scoring algorithms usually rely on many hand-crafted features, such as grammatical errors, n-grams, word count, and sentence length, and then train a machine learning model on those features to predict the score. More recent approaches instead learn essay features with neural networks.

The two scoring algorithms are introduced below.

The first comes from the paper "Regression-Based Automated Essay Scoring". Given a large set of essays to grade, features must first be constructed for each essay: hand-crafted features and vector-space features.

Spelling errors: the pyenchant package is used to compute the proportion of misspelled words among all words.
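A minimal sketch of this feature, using a toy dictionary in place of pyenchant's English word list:

```python
import re

def misspelling_ratio(text, dictionary):
    """Fraction of tokens not found in the dictionary."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    if not words:
        return 0.0
    errors = sum(1 for w in words if w not in dictionary)
    return errors / len(words)

# Toy dictionary standing in for a real spell-checker's word list.
vocab = {"the", "cat", "sat", "on", "mat"}
ratio = misspelling_ratio("The cat zat on the mat", vocab)  # 1 of 6 tokens unknown
```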

Statistical features: counts of characters, words, sentences, paragraphs, stop words, named entities, and punctuation marks (reflecting the essay's organization); text length (reflecting writing fluency); and the proportion of distinct words among all words (reflecting vocabulary level).
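Most of these surface counts need only the standard library; a small illustrative sketch (the feature names here are mine, not the paper's):

```python
import re

def basic_stats(text):
    """Surface features: character, word, and sentence counts, plus the
    type/token ratio (distinct words over all words)."""
    words = re.findall(r"\b\w+\b", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "n_chars": len(text),
        "n_words": len(words),
        "n_sents": len(sentences),
        "type_token_ratio": len(set(w.lower() for w in words)) / max(len(words), 1),
    }

feats = basic_stats("The cat sat. The cat ran!")
```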

Part-of-speech statistics: frequencies of the various parts of speech, such as nouns, verbs, adjectives, and adverbs. Part-of-speech tags are obtained with the nltk package.

Grammatical fluency: parse sentences with Link Grammar and count the number of links; compute n-gram probabilities; compute part-of-speech n-gram probabilities.
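The n-gram probability features can be estimated from a reference corpus. A minimal bigram sketch with add-one smoothing (the smoothing choice is an assumption for illustration, not taken from the paper):

```python
import math
from collections import Counter

def train_bigrams(corpus_tokens):
    """Count bigrams and unigrams from a token list."""
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    unigrams = Counter(corpus_tokens)
    return bigrams, unigrams

def avg_bigram_logprob(tokens, bigrams, unigrams, vocab_size):
    """Average log P(w_i | w_{i-1}) with add-one smoothing; higher values
    suggest more fluent text under the training corpus."""
    pairs = list(zip(tokens, tokens[1:]))
    logp = sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
               for a, b in pairs)
    return logp / max(len(pairs), 1)

corpus = "the cat sat on the mat".split()
bi, uni = train_bigrams(corpus)
score = avg_bigram_logprob("the cat sat".split(), bi, uni, vocab_size=len(set(corpus)))
```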

Readability: a readability score measures a text's organization and its syntactic and semantic complexity. The Flesch-Kincaid grade level is used as the feature; it is computed as

Kincaid = 0.39 * (total words / total sentences) + 11.8 * (total syllables / total words) - 15.59
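A sketch of the computation, using a naive vowel-run syllable estimate (a real implementation would use a proper syllable counter):

```python
import re

def count_syllables(word):
    """Very rough syllable estimate: count runs of vowels. A naive stand-in
    for a real syllable counter."""
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)

def kincaid_grade(text):
    """Flesch-Kincaid grade level of a text."""
    words = re.findall(r"[a-zA-Z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

grade = kincaid_grade("The cat sat on the mat.")
```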

Ontology features: label each sentence with its rhetorical role, such as research, hypothesis, proposition, quotation, support, or opposition.

An essay can also be projected into a vector space model (VSM) and represented by a feature vector in that space. For example, an essay can be represented by a one-hot (binary bag-of-words) vector whose length equals the vocabulary size: if a word appears in the essay, the corresponding position is set to 1.
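A minimal sketch of this binary bag-of-words representation:

```python
def binary_vsm_vector(essay_tokens, vocabulary):
    """Binary bag-of-words vector: position i is 1 if vocabulary[i] occurs
    anywhere in the essay, else 0."""
    token_set = set(essay_tokens)
    return [1 if w in token_set else 0 for w in vocabulary]

vocab = ["cat", "dog", "mat", "sat"]
vec = binary_vsm_vector("the cat sat on the mat".split(), vocab)
```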

In addition, a TF-IDF vector can represent the text, but this representation captures no correlation between words. To address this, the paper introduces word correlations through a word-correlation matrix W applied as a linear transformation.

The word-correlation matrix W is computed from word vectors produced by word2vec: W(i, j) is the cosine similarity between the vectors of word i and word j.
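A sketch of building W from a given set of word vectors (pure Python; in practice the vectors would come from a trained word2vec model):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def correlation_matrix(word_vectors):
    """W[i][j] = cosine similarity between the vectors of word i and word j."""
    n = len(word_vectors)
    return [[cosine(word_vectors[i], word_vectors[j]) for j in range(n)]
            for i in range(n)]

W = correlation_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The transformed representation is then W applied to the TF-IDF vector, so that related words reinforce each other.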

Finally, to take word order into account, the essay is split into k sections; vector-space features are computed for each section and then combined.

With these features in hand, an SVR model is trained for regression. The dataset is the Kaggle ASAP competition data, comprising 8 essay sets; the evaluation metrics are kappa and the correlation coefficient. Some experimental results follow.
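The kappa conventionally reported on ASAP is the quadratic weighted kappa; a self-contained sketch for integer ratings:

```python
def quadratic_weighted_kappa(a, b, min_rating, max_rating):
    """Quadratic weighted kappa between two lists of integer ratings:
    1 is perfect agreement, 0 is chance-level agreement."""
    n = max_rating - min_rating + 1
    # Observed confusion matrix.
    obs = [[0] * n for _ in range(n)]
    for x, y in zip(a, b):
        obs[x - min_rating][y - min_rating] += 1
    # Marginal histograms for the expected (chance) matrix.
    hist_a = [0] * n
    hist_b = [0] * n
    for x in a:
        hist_a[x - min_rating] += 1
    for y in b:
        hist_b[y - min_rating] += 1
    total = len(a)
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            w = ((i - j) ** 2) / ((n - 1) ** 2) if n > 1 else 0.0
            expected = hist_a[i] * hist_b[j] / total
            num += w * obs[i][j]
            den += w * expected
    return 1.0 - num / den if den else 1.0
```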

These are the results of using a linear kernel and an RBF kernel on each of the 8 essay sets.

This is a comparison with human raters.

The following comes from the paper "Neural Networks for Automated Essay Scoring". The model can be trained either by regression or by classification, and is shown in the figure below.

The paper mainly uses three methods to construct the essay's feature vector:

The paper adopts three neural network architectures: NN (feedforward neural network), LSTM, and BiLSTM. Each network outputs a vector h(out), from which the loss function is constructed. The regression and classification losses are given below.

Regression loss

Classification loss
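Sketches of typical choices for the two losses, mean squared error for regression and cross-entropy for classification; the paper's exact formulations may differ:

```python
import math

def regression_loss(pred, target):
    """Mean squared error between predicted and gold scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def classification_loss(probs, label):
    """Cross-entropy for one example: probs is the softmax output over
    score categories, label is the index of the gold score."""
    return -math.log(probs[label])
```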

The first model: NN (feedforward neural network)

A two-layer feedforward network is used; its input feature vector is the average of the essay's word vectors (pre-trained GloVe vectors or vectors trained with the model), and its output is h(out).
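A sketch of such a two-layer network in pure Python; the ReLU activation and the shapes are assumptions for illustration, not taken from the paper:

```python
def average_vectors(word_vectors):
    """Average the essay's word vectors into a single input vector."""
    dim = len(word_vectors[0])
    return [sum(v[d] for v in word_vectors) / len(word_vectors) for d in range(dim)]

def relu(x):
    return [max(0.0, v) for v in x]

def linear(W, b, x):
    """Affine map: W @ x + b, with W as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def feedforward_h_out(word_vectors, W1, b1, W2, b2):
    """h(out) = W2 @ relu(W1 @ avg + b1) + b2 -- a two-layer sketch."""
    avg = average_vectors(word_vectors)
    return linear(W2, b2, relu(linear(W1, b1, avg)))

h_out = feedforward_h_out(
    [[1.0, 3.0], [3.0, 1.0]],              # two word vectors, averaged to [2, 2]
    [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],  # layer 1: identity
    [[1.0, 1.0]], [0.0],                   # layer 2: sum of components
)
```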

The second model: LSTM

The LSTM takes as input the sequence of word vectors for all the words in the essay, and the last output vector of the LSTM is used as the essay's feature vector h(out).

The third model: BiLSTM

Because essays are usually long, a unidirectional LSTM easily loses early information, so the authors also use a BiLSTM model, adding the outputs of the forward and backward LSTMs together to form h(out).

Adding a TF-IDF vector

The output h(out) of any of the above models can be improved by adding a TF-IDF vector: the TF-IDF vector is first reduced in dimension and then concatenated with the model output, as shown in the figure below (using BiLSTM as the example).
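A sketch of the augmentation; plain truncation stands in here for the paper's dimensionality reduction (which is not specified above), and the smoothed IDF follows the common sklearn-style convention:

```python
import math
from collections import Counter

def tfidf_vector(doc_tokens, corpus, vocabulary):
    """TF-IDF weights for one document against a small corpus of token lists."""
    tf = Counter(doc_tokens)
    N = len(corpus)
    vec = []
    for w in vocabulary:
        df = sum(1 for d in corpus if w in d)
        idf = math.log((N + 1) / (df + 1)) + 1  # smoothed IDF
        vec.append(tf[w] / len(doc_tokens) * idf)
    return vec

def augment_h_out(h_out, tfidf_vec, k):
    """Concatenate h(out) with the first k TF-IDF dimensions (truncation as a
    stand-in for a real dimensionality reduction)."""
    return list(h_out) + list(tfidf_vec[:k])

corpus = [["a", "b"], ["a", "c"]]
tv = tfidf_vector(["a", "a", "b"], corpus, ["a", "b", "c"])
augmented = augment_h_out([0.5], tv, 2)
```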

Regression-Based Automated Essay Scoring

Neural Networks for Automated Essay Scoring