First, the corpus and the concept of corpus linguistics
A corpus (plural: corpora) is a repository of language material: a database in which language material is collected and scientifically organized. The material occurs naturally in a given language, in either written or spoken form. These raw materials are the basis of language statistics and the first-hand material for analyzing and studying the laws of language [2 ~ 3]. Corpus linguistics is the study of language on the basis of text corpora. Researchers hold different views on its status. Some regard corpus linguistics as a theoretical framework and a new discipline on a par with other branches of linguistics. Others believe that it is not an independent discipline, but that it provides a methodological basis and a new way of thinking for linguistic research. To resolve this disagreement, Chinese researchers, summarizing the work of the well-known foreign linguists Halliday, Leech and Tognini-Bonelli, pointed out that corpus linguistics is the name of a new discipline only when linguists use corpus materials and facts to criticize existing linguistic theories and put forward new viewpoints or theories [3 ~ 6].
Generally speaking, a corpus is a method and means of studying some aspect of language through real language material. With the methods provided by corpus linguistics, linguists can not only verify existing language rules, but also describe evolving grammatical and pragmatic rules according to the data a corpus provides. In the past, the material in a corpus was collected and arranged by hand, and was usually used to calculate word frequencies as a basis for compiling textbooks and dictionaries. Today, computers have greatly improved the efficiency and scale of corpus building. A corpus consists of samples that are randomly selected from representative language material and entered into a computer, and it can comprise a large number of computer-processed texts. The more texts a corpus contains, the wider its coverage and the more reliable the information it provides. Building a corpus involves collecting original material to form a raw corpus and then tagging it to produce a tagged corpus. Corpus analysis software can then be used to run various statistical analyses on the tagged corpus and so reveal features of the target language.
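The word-frequency counting mentioned above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the three sentences in RAW_CORPUS and the function names are hypothetical, the tokenizer is a simple regular expression, and the tagging step of corpus construction is left out.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real corpus would contain far more text.
RAW_CORPUS = [
    "The corpus is a collection of texts.",
    "A corpus provides real language data.",
    "Linguists study the language in the corpus.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(texts):
    """Count how often each word form occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


frequencies = word_frequencies(RAW_CORPUS)

# Print the most frequent word forms, e.g. as input for a word list.
for word, count in frequencies.most_common(5):
    print(word, count)
```

A frequency list of this kind is the sort of information that was once compiled by hand for textbooks and dictionaries; with a larger corpus the same counting scales without any change to the procedure.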
Second, the role of corpus linguistics in language research
The development of corpus linguistics has played a positive role in promoting the in-depth study of language. On the one hand, it affects the concepts and methods of language learning; on the other hand, it provides a foundation for putting some of those concepts into practice. This paper discusses the influence of corpora on language research in three respects only.
(1) Is language descriptive or prescriptive?
Whether language study should be descriptive or prescriptive has been answered differently in different periods. In the 18th century, the major European languages were studied prescriptively. Linguists tried to formulate rules for the correct use of a language, emphasizing correctness and the application of standard Latin patterns. Grammar was therefore the focus of language research and language learning, and prescribed usages had to be memorized through repetition, "because this is a question of black and white, right or wrong" [7]. Under the influence of this view, language teaching adopted the teacher-centered grammar-translation method, which relied on explaining and memorizing a large number of definitions and rules and gave more attention to written language. The opposing view holds that linguistics is a descriptive science. Guided by this view, linguists try to discover and record the language actually used by a speech community, without correcting it against external rules. Corpus linguistics makes this view practicable: by studying and analyzing a large number of examples in a corpus, we can summarize the rules of language as it is actually used. In language learning, this view means paying more attention to learners' individual needs and replacing teacher-centered cramming with student-centered exploration. Teachers no longer simply instill rules and knowledge. Instead, they ask learners to encounter real, natural language by searching a corpus, to observe language phenomena, to analyze and summarize regularities, to form hypotheses, and to keep testing and correcting those hypotheses in use. Teachers have also changed from traditional givers of knowledge into explorers of knowledge and language researchers on an equal footing with their students.
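The corpus search described above is commonly presented as a keyword-in-context (KWIC) concordance, in which every occurrence of a word is listed with a few words of context on each side. The following Python sketch shows the idea. The sentences, the function name concordance and the width parameter are assumptions made for illustration and do not come from any particular corpus tool.

```python
import re

# Hypothetical mini-corpus for illustration only.
SENTENCES = [
    "She did not have any money left.",
    "You can take any bus from here.",
    "Do you have any questions?",
]


def concordance(sentences, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    lines = []
    for sentence in sentences:
        tokens = re.findall(r"[A-Za-z']+", sentence)
        for i, token in enumerate(tokens):
            if token.lower() == keyword.lower():
                # Keep up to `width` words on either side of the keyword.
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((left, token, right))
    return lines


# Print the hits aligned on the keyword so patterns are easy to compare.
for left, key, right in concordance(SENTENCES, "any"):
    print(f"{left:>20}  [{key}]  {right}")
```

Reading the aligned lines, a learner can observe which words tend to precede and follow the keyword and form a hypothesis about its use before checking it against more data.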
Admittedly, a one-sided emphasis on either prescription or description is not objective enough. A prescriptive starting point lets researchers and learners simplify a complex language, grasp its basic framework as a whole, and reduce the difficulty of research and learning. On the other hand, learners' motivation and interest may be low, and the learning methods are relatively rigid. From a descriptive starting point, researchers and learners come into contact with rich, real language and master it by observing, analyzing and summarizing its regularities. Because they are active participants in this process, their motivation and interest improve considerably.
(2) Language and speech; language competence and language use
Saussure divided language into language (langue) and speech (parole). Language is the grammatical system shared by all members of a society and exists as a potential in the minds of a group of people. It is social, homogeneous and abstract. Speech is the language actually produced by each individual in society, and it is heterogeneous and diverse [5]. Different understandings of the nature of language and of the division between language and speech gave rise to two schools, structuralism and functionalism, whose research focuses differ. Structuralism studies language as an abstract system of signs that stands above individuals and society. Functionalism emphasizes the functions of language in use, investigates actual language phenomena, and tries to find the structures they have in common. Building on Saussure's work, Chomsky put forward two concepts: language competence and language use (performance). Chomsky holds that language users have an instinctive grasp of the rules of their language. Language competence is the knowledge, rooted in the individual's brain, that allows a speaker to generate an unlimited number of utterances from a limited set of rules, and linguists of this school concentrate on those finite rules. The concept of language use is very close to Saussure's speech: it refers to the real use of language in a specific situation. Corpus linguistics provides a more scientific research method for the further development of functionalism, and its focus is on speech and language use. Through in-depth study of speech and language use, we can verify existing prescriptive rules and try to generalize new ones.
The corpus-based study of speech and language use touches every area of language research. Examples include the study of register; the analysis of native speakers' discourse to summarize typical structures for oral teaching materials; the counting of high-frequency words for syllabus design; and the analysis and comparison of foreign language learners' usage to find more effective learning strategies. The study of language and language competence, by contrast, is carried out mainly in language acquisition research, which asks when and how the rules held subconsciously in the brain are formed. Corpus research therefore studies concrete language, while the study of language and language competence focuses on abstract language. We should not blindly affirm or reject any one research method, because language can be studied from many angles, and the different angles complement each other, serve different purposes and meet different needs. Nevertheless, the emergence of corpora gives language research a new perspective and makes it more objective and true to actual usage.
(3) Syntagmatic (combinatorial) and paradigmatic (aggregation) relations
Saussure is the founder of the structuralist school. He holds that language is a system of signs, so linguists must look for the value of a sign in its relations to other signs and understand its position in the system. Saussure put forward two main kinds of linguistic relation: syntagmatic (combinatorial) relations and paradigmatic (aggregation) relations. A syntagmatic relation is the relation between one unit and the other units in the same sequence, that is, a relation among components that are all present. A paradigmatic relation, also known as an associative relation, is the relation between components that can replace one another at a particular position in a structure, that is, a relation between a component that is present and components that are absent. Words in a paradigmatic relation share the same syntactic features, but they are not semantically interchangeable [7]. The emergence of large-scale corpora opens up a great deal of room for studying both relations, especially syntagmatic relations, because a computer can search for a specific word and retrieve all the words that co-occur with it. This is the collocation relationship that is commonly studied. Halliday (1976) defines collocation as a "linear co-occurrence phenomenon reflecting the combination relationship of lexical items in significant neighborhood". This definition clearly treats collocation as a syntagmatic relation. Jones and Sinclair were the first researchers to study word collocation in a corpus. Since the 1980s, corpus-based and corpus-driven studies of collocation have been widely carried out and have greatly changed the field. Research validity has improved, the share of quantitative research has grown, and collocational behavior has become observable and operable.
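The computer search for co-occurring words can be illustrated with a small Python sketch that counts the words found within a fixed span of a node word. The text, the function name collocates and the span of two words are assumptions chosen for illustration. For simplicity the window is allowed to cross sentence boundaries, and no statistical association measure is applied to the raw counts.

```python
import re
from collections import Counter

# Hypothetical sample text for illustration only.
TEXT = (
    "Strong tea is popular. He drinks strong tea every morning. "
    "She prefers strong coffee, but strong tea is cheaper."
)


def collocates(text, node, span=2):
    """Count words occurring within `span` tokens of each occurrence of `node`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            # Collect the tokens to the left and right of the node word.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# List the most frequent neighbors of the node word "strong".
for word, count in collocates(TEXT, "strong").most_common(3):
    print(word, count)
```

In real collocation studies the raw counts are normally compared with what chance co-occurrence would predict, which this sketch does not attempt.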
As the degree of automation increases, human interference in the research process is greatly reduced. In his paper, Daniel Krieger uses the usage of "any" to affirm the objectivity of corpus linguistics in the study of syntagmatic (combinatorial) relations [9]. According to traditional grammar rules, "any" is usually used in negative and interrogative sentences, but Mindt's statistics show that 50% of its occurrences are in affirmative sentences, 40% in negative sentences, and only 10% in interrogative sentences. Corpora therefore make the study of syntagmatic relations much easier. The study of paradigmatic (aggregation) relations appears mostly in research on synonyms. For example, Cui points out in her article that Rundell used a corpus to make a comparative study of the synonyms "start" and "begin" [10]. These results provide a model for using corpora to study syntagmatic and paradigmatic relations, and language researchers can carry out many similar studies with corpora.
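A distribution of the kind reported for "any" could in principle be computed as in the Python sketch below. It is a rough illustration only. The sample sentences are invented and are not Mindt's data, the function names are hypothetical, and the classification rule (a final question mark for interrogatives, a short list of negative words for negatives) is a crude assumption that a real study would replace with proper grammatical annotation.

```python
import re
from collections import Counter

# Hypothetical sample sentences; these are not Mindt's data.
SAMPLE = [
    "You can take any bus from here.",
    "Any student can answer this.",
    "She did not have any money left.",
    "Do you have any questions?",
    "He doesn't like tea.",
]

# Assumed, deliberately short list of negative words.
NEGATORS = {"not", "no", "never", "nobody", "nothing"}


def sentence_type(sentence):
    """Classify a sentence with simple surface heuristics."""
    if sentence.strip().endswith("?"):
        return "interrogative"
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if any(t in NEGATORS or t.endswith("n't") for t in tokens):
        return "negative"
    return "affirmative"


def distribution(sentences, keyword):
    """Return the percentage of keyword sentences falling in each type."""
    counts = Counter(
        sentence_type(s)
        for s in sentences
        if keyword in re.findall(r"[a-z']+", s.lower())
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: 100 * n / total for kind, n in counts.items()}


shares = distribution(SAMPLE, "any")
for kind, share in shares.items():
    print(f"{kind}: {share:.0f}%")
```

Because the counting is done mechanically over every occurrence, the resulting percentages do not depend on which examples a researcher happens to remember, which is the sense in which automation reduces human interference.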
Third, the current problems in corpus research
There is no doubt that corpora play an active role in linguistic research. The biggest problem at present is that only a few language researchers have mastered corpus research methods, while most language teachers and language learners are unfamiliar with corpora and do not know how to use corpus resources for research. In addition, few articles explain how corpora are used. Training language teachers and language learners on a large scale to use corpus resources would therefore allow corpora to play their role more effectively and would greatly speed up language research and language learning.
Corpora provide new ideas and methods for language research, but the use of corpus resources still needs to improve greatly. Only when more language teachers and language learners work closely with corpora and master their use can corpora really promote language research and language learning.