First, the advantages of corpus linguistics
Before the rise of corpus linguistics, the language system was described mostly in the traditional non-empirical way, relying on the intuition of linguists. In recent years, by analyzing authentic examples in large corpora, corpus-based research has identified many structures that were previously overlooked or judged ungrammatical, and has thereby supplemented the traditional description of the language system. Corpus linguistics is good at revealing the most typical features of a language and at discovering how the language is actually used in real life. Taking English as an example, a corpus can answer questions that are central to English linguistics: Which words and phrases are most common in English? Which tense do people use most? Which verbs and nouns are used most, and what are their most common collocations? How do people use modal verbs? Which words are used in informal situations? Which basic words do people need in daily conversation? How do people use English grammatical structures? Are these structures more or less frequent in different registers? Do the words that occur in these structures share semantic features? Because it reveals such features and usage patterns, corpus linguistics is extremely important for foreign language teaching: it can make teaching far less haphazard and improve its efficiency.
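Questions such as "which words are most common" and "what are their most common collocations" reduce, in the simplest case, to counting. The following is only a minimal sketch under stated assumptions: the sample sentence is invented for illustration, the function names `top_words` and `top_collocates` are hypothetical, and a collocate is approximated as the word immediately to the right of a node word. Real corpus studies use much larger corpora and statistical association measures.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into simple alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def top_words(tokens, n):
    """Return the n most frequent word types with their counts."""
    return Counter(tokens).most_common(n)

def top_collocates(tokens, node, n):
    """Return the n words that most often follow `node` directly.

    Assumption: adjacency to the right stands in for collocation here.
    """
    following = Counter(
        nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == node
    )
    return following.most_common(n)

# Invented sample text, not taken from any real corpus.
sample = "make a decision make a mistake make progress make a decision quickly"
tokens = tokenize(sample)

most_common_word = top_words(tokens, 1)
collocates_of_a = top_collocates(tokens, "a", 1)
```

On a real corpus the same two counts would give a first approximation of a frequency list and a collocation list that could be checked against a textbook.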
Second, corpus linguistics reveals the asymmetric distribution of language
Although English words occur with very unequal frequency in discourse, the imbalance is regular. Through frequency statistics, corpus linguistics shows this asymmetric distribution clearly. About 95% of a typical written text consists of the 4,000 to 5,000 most frequent words, and the first 1,000 of these account for 85% of the text. In spoken English, 50 high-frequency function words account for 60% of the text. According to research, the 700 most frequent English words account for about 70% of English; in other words, about 70% of the English that people use in everyday listening, speaking, reading and writing is made up of these 700 words. When the range is extended to the first 1,500 words, coverage rises to 76%, which means that the additional 800 words add only 6 percentage points. At 2,500 words coverage reaches 80%, so a further 1,000 words add only 4 percentage points. These figures reveal the asymmetry of language and have made linguists aware of the importance of distinguishing typical from atypical language features. Since the 1960s, a growing number of corpus-based studies have described the language system at every level and explained its typical features from many angles. Corpus investigation also makes it possible to map the high- and low-frequency distribution of collocation patterns, to separate common from uncommon language information, and to help people extract the most valuable information from complex language phenomena.
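The diminishing return described above can be made explicit with a little arithmetic. The sketch below is illustrative only: `coverage` computes the share of running words covered by the top N word types in a tokenized text, and `marginal_gains` divides each increase in coverage by the number of extra words needed to obtain it. The function names and the sample sentence are assumptions introduced here; the three coverage bands are the figures quoted in this section.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def coverage(tokens, top_n):
    """Share of all running words accounted for by the top_n most frequent types."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)

def marginal_gains(bands):
    """Average coverage gained per additional word in each band.

    `bands` is a list of (vocabulary_size, coverage_percent) pairs with
    strictly increasing vocabulary sizes.
    """
    gains = []
    prev_size, prev_cov = 0, 0.0
    for size, cov in bands:
        gains.append((size, (cov - prev_cov) / (size - prev_size)))
        prev_size, prev_cov = size, cov
    return gains

# Invented sample sentence, used only to exercise coverage().
sample = "the cat sat on the mat and the dog sat on the rug"
tokens = tokenize(sample)
top1_share = coverage(tokens, 1)

# Coverage figures quoted in the text: 700 -> 70%, 1,500 -> 76%, 2,500 -> 80%.
article_bands = [(700, 70.0), (1500, 76.0), (2500, 80.0)]
gains = marginal_gains(article_bands)
```

With the quoted figures, each of the first 700 words contributes on average 0.1 percentage points of coverage, each of the next 800 about 0.0075, and each of the following 1,000 about 0.004, which is the asymmetry the section describes.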
Third, how college English teaching currently handles language features
The asymmetry of language revealed by corpus linguistics has led linguists to ask: Is the language knowledge that teachers teach in class, and that students are required to master for examinations, the language most British and American people commonly use? Many studies abroad have found that the content and sequencing of foreign language textbooks often differ considerably from the language native speakers actually use. These studies compare particular words or structures in textbooks and in corpora, and find that some textbooks give attention to less common expressions while neglecting important usages. They include Kennedy's (1987) survey of expressions of quantity and frequency, Holmes's (1988) study of expressions of epistemic modality, and Ljung's (1990) comparative study of common words. All of these studies found large differences between the English described in textbooks and the English actually used, and they stress that textbook descriptions of the language system should be revised with corpus information so that they present a true picture of the language in use. They also indicate, indirectly, that textbooks which fail to describe actual usage faithfully mislead learners and become one of the root causes of learners' errors. The conclusions of these studies further show that corpus information should guide the design of syllabuses and teaching materials, so that common words in the language system receive more attention than uncommon ones.
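The kind of textbook-versus-corpus comparison described above can be sketched as a set comparison against a corpus frequency list. This is only an illustration under stated assumptions, not the method of the studies cited: the function name `compare_with_corpus`, the tiny corpus and the textbook word list are all invented here, and "high frequency" is simply defined as the top N word types.

```python
from collections import Counter

def compare_with_corpus(corpus_tokens, textbook_words, top_n):
    """Compare a textbook word list with the top_n most frequent corpus words.

    Returns high-frequency corpus words the textbook omits, and textbook
    words that fall outside the high-frequency set.
    """
    counts = Counter(corpus_tokens)
    high_freq = {word for word, _ in counts.most_common(top_n)}
    textbook = set(textbook_words)
    return {
        "missing_from_textbook": sorted(high_freq - textbook),
        "outside_top_n": sorted(textbook - high_freq),
    }

# Invented miniature corpus and textbook list, for illustration only.
corpus_tokens = "i think you may be right i think it may rain you may go".split()
textbook_words = ["i", "you", "think", "shall", "perchance"]

result = compare_with_corpus(corpus_tokens, textbook_words, top_n=4)
```

In this toy example the frequent word "may" is absent from the textbook list, while "shall" and "perchance" are taught although they do not appear among the frequent corpus words.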
Fourth, the implications of corpus linguistics for college English teaching
The asymmetry of language requires English teachers to treat typical and atypical language features differently, and to give priority to high-frequency words, high-frequency meanings and collocations, high-frequency grammatical structures and so on. Statistically, high-frequency language items are generally the ones learners are most likely to encounter and most need to learn. Linguists have long suggested that high-frequency items should be learned first, to reduce the learning burden and avoid confusion. Because foreign language learners have limited time and energy for contact with the target language, what they learn in a lifetime may be only a drop in the ocean, so it is very important to center teaching on the typical rules of the language. Compared with treating all language phenomena equally, this approach can often achieve twice the result with half the effort. So far, the design of English syllabuses and textbooks has relied mostly on traditional language descriptions based on limited evidence, and mainly on experience and subjective judgment, to decide the difficulty, importance and learning order of language features and vocabulary. A college English syllabus built on such limited experience is often not accurate enough. Corpus research findings and frequency information should be widely applied in key areas of foreign language teaching such as syllabus design, textbook compilation and classroom practice. Linguists now broadly agree that at the elementary and intermediate stages, high-frequency language features rather than difficult ones should be the focus in textbook content, syllabus sequencing and teaching emphasis.
Therefore, shifting from equal treatment of all language features to an emphasis on typical features is of great significance for improving teaching quality. Teachers, who direct teaching activities, should be aware that language phenomena differ in frequency, and should apply the principle of focusing on typical phenomena in every aspect of teaching. High-frequency language items should be the focus of teaching and receive the attention of both teachers and students. The same principle should be reflected in tests and exercises, which should avoid difficult questions divorced from actual language use and should not examine relatively uncommon words, phenomena and expressions; otherwise students may lose confidence in language learning through failure and frustration. Teachers should also actively help students form a sense of frequency, such as word frequency and the frequency of meanings, and help them allocate their study time sensibly. Corpus linguistics is thus an important tool for both teachers and learners. Over the past decades, corpus research has accumulated a large body of findings on vocabulary and grammar, and these resources provide valuable reference information for syllabus design. When designing a college English syllabus, teachers can use corpus information to select content that reflects the typical features of the target language, to arrange the teaching order, and to adjust the emphasis of the content.
Fifth, conclusion
Corpus linguistics provides quantitative evidence of the asymmetric distribution of language features. This evidence can guide the design of college English syllabuses and the ordering of language features in teaching materials, and it helps clarify the importance and priority of different learning tasks. The implication for college English teaching is that typical and atypical language features should be treated differently. However, the typicality of a feature, that is, its frequency, is not the only variable to consider in college English teaching and syllabus design. While attaching importance to typicality, college English teaching should therefore also take into account the differences between the learners' mother tongue and the target language.
Author: Su Dongying. Affiliation: School of Foreign Languages, Yinchuan University of Energy