With the development of computer technology, software has become increasingly complex and system development increasingly important. The widespread application of information technology produces large amounts of data, and mining this data to analyze the patterns within it is of great significance for making effective use of data resources. This paper briefly describes the application of data mining technology in software engineering.
Keywords: data mining technology; software engineering; applied software technology
Software has developed rapidly alongside information technology, but its controllability is not particularly strong. Software produces a large amount of data while it is in use, and this data is a valuable resource that adds value when used effectively. In the software development industry, data mining technology makes this effective use of data resources possible. By studying the patterns found in the data, we can provide guidance for software engineering, deal with system failures more effectively and improve the effectiveness of cost evaluation.
1 Problems in the application of data mining technology
1.1 Complexity of the data itself
The data involved in software engineering falls into two categories: structured and unstructured. Software code is an important part of the unstructured data, while software version information is a typical example of structured data. The two kinds of data are closely related, and using them effectively requires some technique for finding the patterns they contain. Data mining technology meets this need: it can integrate structured and unstructured data and make their use more effective.
1.2 Lack of consistent evaluation criteria
Data mining technology is widely used in everyday life, where it helps to evaluate the actual situation and optimize results. In software engineering, however, there is no unified standard for evaluating it, which adds to the complexity of software information. Because the same information is expressed in different ways, those who obtain it cannot apply or compare it effectively. The root cause of this lack of uniform standards is that evaluation methods are inconsistent.
2 Application of Data Mining Technology in Software Engineering
2.1 Mining execution records
Execution record mining analyzes the execution paths of the main program in order to find the relationships within the program code. In essence, it works by analyzing the relevant execution paths and reverse modeling them, and its purpose is to help verify, maintain and understand the program. The process usually begins with instrumenting the system under analysis. The instrumented program is then run while the state variables of the application programming interfaces, systems and modules are recorded. Finally, the collected information is reduced, filtered and clustered. The resulting model expresses the characteristics of the system.
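The three steps of execution record mining (instrumentation, recording, and reduction with filtering) can be illustrated with a minimal Python sketch. The decorator `instrument`, the in-memory log `TRACE` and the three file functions are hypothetical names introduced only for this illustration, and counting consecutive call pairs is just one simple way to reduce a trace, not the method of any particular tool.

```python
from collections import Counter
from functools import wraps

TRACE = []  # hypothetical in-memory execution log


def instrument(func):
    """Instrumentation step: wrap a function so that every call is recorded."""
    @wraps(func)
    def wrapper(*args):
        TRACE.append((func.__name__, args))  # recording step: call name and arguments
        return func(*args)
    return wrapper


@instrument
def open_file(name):
    return name


@instrument
def read_file(name):
    return "data"


@instrument
def close_file(name):
    return None


def run_session(name):
    """A toy program whose execution path we want to recover."""
    open_file(name)
    read_file(name)
    close_file(name)


def mine_call_pairs(trace):
    """Reduce the trace to function names and count consecutive call pairs."""
    names = [name for name, _ in trace]
    return Counter(zip(names, names[1:]))


for filename in ("a.txt", "b.txt"):
    run_session(filename)

pairs = mine_call_pairs(TRACE)
# Filtering step: keep only the call orderings that occur at least twice.
frequent = {pair for pair, count in pairs.items() if count >= 2}
```

Here `frequent` keeps the orderings open-then-read and read-then-close, which is a very small model of how the program uses the file interface.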
2.2 Vulnerability detection
Systems and software inevitably contain vulnerabilities, and these are often well hidden. Because human reasoning has blind spots, not every vulnerability can be found by inspection, so tool support is needed. The purpose of vulnerability detection is to find the vulnerabilities and errors in the software and fix them, so as to ensure its quality and security. Applying data mining to software testing starts with determining the test items and planning the test content according to users' needs, which in turn determines the test methods and the specific test plan. The main work is then data cleaning and conversion, and its foundation is the collection of vulnerability data. The collected and summarized information is cleaned: the records relating to the software and its defects are selected, missing items are filled in, and attributes are converted into numerical representations. Next, a suitable model is selected, trained and verified. The mining method should be chosen according to the actual needs of the project, by analyzing and comparing the results obtained with different methods. The chosen method is then applied repeatedly to locate and detect vulnerabilities in the software. Finally, the corresponding data in the software library is collected and classified according to the vulnerability descriptions, and the knowledge obtained from mining is applied to the test project.
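The cleaning, conversion and training steps described above can be sketched in Python as follows. The defect records, the attribute names and the choice of a nearest-centroid classifier are all assumptions made for illustration; the text does not prescribe a particular model, and a real project would compare several.

```python
from statistics import median

# Hypothetical defect records; None marks a missing attribute.
records = [
    {"loc": 120, "lang": "c", "changes": 9, "defective": 1},
    {"loc": 40, "lang": "python", "changes": 1, "defective": 0},
    {"loc": None, "lang": "c", "changes": 7, "defective": 1},
    {"loc": 60, "lang": "python", "changes": 2, "defective": 0},
    {"loc": 60, "lang": "python", "changes": 2, "defective": 0},  # exact duplicate
]


def clean(rows):
    """Drop exact duplicates, then fill a missing 'loc' with the median."""
    unique = [dict(items) for items in dict.fromkeys(tuple(r.items()) for r in rows)]
    fill = median(r["loc"] for r in unique if r["loc"] is not None)
    return [{**r, "loc": fill if r["loc"] is None else r["loc"]} for r in unique]


def encode(rows):
    """Convert attributes to numbers: the language name becomes an integer code."""
    code = {name: i for i, name in enumerate(sorted({r["lang"] for r in rows}))}
    features = [[r["loc"], code[r["lang"]], r["changes"]] for r in rows]
    labels = [r["defective"] for r in rows]
    return features, labels


def train_centroids(features, labels):
    """Nearest-centroid model: one mean feature vector per class."""
    centroids = {}
    for label in set(labels):
        members = [x for x, t in zip(features, labels) if t == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


features, labels = encode(clean(records))
model = train_centroids(features, labels)
```

With this toy data, `predict(model, [100, 0, 8])` falls in the defective class and `predict(model, [45, 1, 1])` in the non-defective class. The features are left unscaled here for brevity, which a real study would need to address.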
2.3 Open source software
Because open source software is open, dynamic and global in nature, its management differs from that of traditionally managed software. Mature open source projects generally keep relatively complete records, including error reports and developer activity. The set of people involved in development changes constantly, and this change is a consequence of the software's openness. Mining these dynamic characteristics helps to achieve high-quality management of open source software.
2.4 Version control information
To keep the content edited by different project participants consistent, changes to the system need to be controlled. In software development, the management and protection of development work is achieved through a version control system. Mining version control information mainly means mining the change data to find the relationships between different modules and systems and to detect possible vulnerabilities in the program. Applying this technique can effectively reduce later maintenance costs, and it also helps to avoid vulnerabilities introduced by later changes.
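One simple way to find relationships between modules from change data is to count how often two files are changed in the same commit. The sketch below assumes the commit history has already been extracted into sets of file names; the file names and the threshold of two co-changes are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical change data: each commit is the set of files it touched.
commits = [
    {"parser.py", "lexer.py"},
    {"parser.py", "lexer.py", "docs.md"},
    {"ui.py"},
    {"parser.py", "lexer.py"},
]


def co_change_counts(history):
    """Count, for every pair of files, the commits that changed both."""
    counts = Counter()
    for files in history:
        for pair in combinations(sorted(files), 2):
            counts[pair] += 1
    return counts


counts = co_change_counts(commits)
# Pairs that change together repeatedly are candidates for hidden coupling.
coupled = [pair for pair, n in counts.items() if n >= 2]
```

A later change that touches only one file of a strongly coupled pair can then be flagged for review, since the other file may have been forgotten.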
3 Data Mining Methods Applied in Software Engineering
3.1 Association method
This method finds relevant connections and interesting associations in the data. An association rule has two key measures: ① support; ② confidence. Support is the probability that the two itemsets in the rule appear together in a transaction. Confidence is the probability that, when one itemset appears in a transaction, the other appears as well.
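The two measures can be computed directly from their definitions. In the sketch below, each transaction is assumed to be the set of API calls made by one function; the data and the names `support` and `confidence` are illustrative only.

```python
# Hypothetical transactions: the API calls made by four functions.
transactions = [
    {"lock", "unlock"},
    {"lock", "unlock", "log"},
    {"lock"},
    {"log"},
]


def support(data, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in data) / len(data)


def confidence(data, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(data, both) / support(data, antecedent)


rule_support = support(transactions, {"lock", "unlock"})          # 2 of 4 transactions
rule_confidence = confidence(transactions, {"lock"}, {"unlock"})  # 2 of the 3 with "lock"
```

The rule "lock implies unlock" has support 0.5 and confidence of about 0.67 on this data; the function that calls `lock` without `unlock` is the kind of rule violation that may point to a defect.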
3.2 Classification method
This method is mainly used to predict class labels, which are discrete values. It works in two steps: first a model describing the data is built, and then the model is used to classify data. Commonly used classification methods include decision trees, Bayesian methods and support vector machines. Decision tree induction is based on a greedy algorithm.
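The greedy step in decision tree induction can be illustrated as follows: at each node the algorithm picks the attribute that looks best locally and never revisits the choice. The sketch uses information gain as the selection criterion, which is one common choice and an assumption here; the module attributes and labels are hypothetical.

```python
from collections import Counter
from math import log2

# Hypothetical training data: module attributes and a defect label.
rows = [
    {"size": "large", "reviewed": "no"},
    {"size": "large", "reviewed": "yes"},
    {"size": "small", "reviewed": "no"},
    {"size": "small", "reviewed": "yes"},
]
labels = ["buggy", "clean", "buggy", "clean"]


def entropy(values):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(data, targets, attr):
    """Entropy reduction obtained by splitting the data on `attr`."""
    remainder = 0.0
    for value in {r[attr] for r in data}:
        subset = [t for r, t in zip(data, targets) if r[attr] == value]
        remainder += len(subset) / len(data) * entropy(subset)
    return entropy(targets) - remainder


def best_attribute(data, targets):
    """Greedy choice: the single attribute with the highest gain right now."""
    return max(data[0], key=lambda attr: information_gain(data, targets, attr))


root = best_attribute(rows, labels)
```

On this data the split on `reviewed` separates the classes perfectly, so it is chosen as the root; a full tree builder would repeat the same choice recursively on each subset.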
3.3 Clustering method
Commonly used clustering methods include partitioning methods, density-based methods, model-based methods, grid-based methods and hierarchical methods. The input to cluster analysis is an ordered pair whose two components are the set of samples and a measure of the similarity between them. The analysis then groups the data objects according to the differences between them.
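The idea that clustering takes a set of samples together with a similarity measure can be made concrete with a small sketch. The function below links a sample to any cluster that already contains a sufficiently similar member, which is a simplified single-linkage scheme chosen for brevity; the sample values, the similarity function and the threshold are assumptions.

```python
def cluster(samples, similarity, threshold):
    """Group samples so that each one joins every cluster holding a similar member."""
    clusters = []
    for x in samples:
        linked, rest = [], []
        for c in clusters:
            is_close = any(similarity(x, y) >= threshold for y in c)
            (linked if is_close else rest).append(c)
        # Merge x with all clusters it is linked to; leave the others untouched.
        clusters = rest + [[x] + [y for c in linked for y in c]]
    return clusters


def similarity(a, b):
    """Hypothetical similarity in (0, 1]: identical values score 1."""
    return 1 / (1 + abs(a - b))


# Hypothetical samples, e.g. a size metric measured on five modules.
samples = [1, 2, 10, 11, 3]
groups = cluster(samples, similarity, threshold=0.5)
```

With a threshold of 0.5, values that differ by at most 1 are linked, so the samples fall into the two groups {1, 2, 3} and {10, 11}.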
4 Application of Data Mining in Software Engineering
4.1 Mining clone code
Detecting cloned code was one of the earliest data mining tasks in software engineering. There are two main approaches, one based on text comparison and one based on identifier comparison. The former decides whether code is cloned by comparing the statements contained in the program code. Later improvements to this approach have focused mainly on the efficiency of string matching, and in practice efficiency is optimized by matching through related functions.
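Text-based clone detection can be sketched as follows. The code normalizes whitespace, slides a window of consecutive statements over the source and looks each window up in a dictionary, so that matching relies on hashing instead of pairwise string comparison. The use of a hash table, the window size and the sample source are assumptions for illustration, not details given in the text.

```python
from collections import defaultdict


def normalize(line):
    """Collapse whitespace so that formatting differences do not hide clones."""
    return " ".join(line.split())


def find_clones(source, window=3):
    """Map each repeated run of `window` statements to its starting line numbers."""
    numbered = [(i + 1, normalize(text)) for i, text in enumerate(source.splitlines())]
    numbered = [(n, text) for n, text in numbered if text]  # skip blank lines
    seen = defaultdict(list)
    for k in range(len(numbered) - window + 1):
        chunk = tuple(text for _, text in numbered[k:k + window])
        seen[chunk].append(numbered[k][0])  # dictionary lookup hashes the chunk
    return {chunk: starts for chunk, starts in seen.items() if len(starts) > 1}


# Hypothetical source text containing one duplicated three-line fragment.
source = "\n".join([
    "total = 0",
    "for x in items:",
    "    total += x",
    "print(total)",
    "",
    "total = 0",
    "for x in  items:",
    "    total += x",
    "report(total)",
])

clones = find_clones(source)
```

The fragment starting at line 1 is reported again at line 6, even though the second copy contains an extra space. Identifier-based comparison would go one step further and also match copies whose variable names were changed.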
4.2 Software data retrieval and mining
Retrieval was also one of the earliest mining requirements in software engineering. It involves the following three steps.
① Data input. The user enters the information to be retrieved, which specifies the data to be found.
② Information search. Once the information the user is looking for is confirmed, the system searches the database for matching content and lists the results by category.
③ Export and viewing of the data. Users can export the data or view it online as they prefer. Exporting creates a corresponding record, which makes it quicker and more convenient for users to repeat the query later. Exporting data requires the use of related software.
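The three steps above can be sketched as a small Python example. The record store `RECORDS`, the query log `HISTORY` and the CSV export format are hypothetical; a real system would query a database instead of an in-memory list.

```python
import csv
import io
from collections import defaultdict

# Hypothetical software data store.
RECORDS = [
    {"id": 1, "category": "bug report", "text": "Crash when parsing empty file"},
    {"id": 2, "category": "commit", "text": "Fix parsing of empty input"},
    {"id": 3, "category": "bug report", "text": "Slow start-up on Windows"},
]
HISTORY = []  # queries whose results were exported, kept for quick re-querying


def search(query):
    """Steps 1 and 2: take the query terms and list matching records by category."""
    terms = query.lower().split()
    hits = defaultdict(list)
    for record in RECORDS:
        text = record["text"].lower()
        if all(term in text for term in terms):
            hits[record["category"]].append(record)
    return dict(hits)


def export_csv(query, hits):
    """Step 3: export the results as CSV text and remember the query."""
    HISTORY.append(query)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "category", "text"])
    writer.writeheader()
    for rows in hits.values():
        writer.writerows(rows)
    return buffer.getvalue()


results = search("parsing empty")
exported = export_csv("parsing empty", results)
```

The query matches one bug report and one commit, which are listed under their own categories, and the exported text contains a header line followed by the two matching records.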
4.3 Application in the three stages of the software life cycle
Software engineering generates a great deal of information about software, which is usually stored in the code base, and every cycle of a project produces more data. Making use of this data can improve work efficiency. The life cycle of software engineering can be divided into three stages: analysis and design, iterative development and maintenance, and application.
4.4 Mining Datasets for Project Management
Software development today draws on many disciplines, such as economics, organizational behavior and management. Software developers need to pay attention not only to technological innovation but also to scientific, standardized management. Besides version control information, the relationships within the personnel organization can also be mined. In large-scale software development, the effective allocation and coordination of human resources is a problem that has to be faced. Developing a large system, for example, often involves many people who need to communicate with one another, whether face to face, by file transfer or by electronic messages. Mining the relationships between these people supports management work. The employees' network is a social network, and how reasonably personnel are organized and allocated affects the progress, cost and likelihood of success of the project. Research of this kind usually adopts simulation modeling.
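As a minimal illustration of mining personnel relationships, the sketch below builds an undirected communication network from a message log and counts how many distinct colleagues each person communicates with. The names and the log are hypothetical, and degree is only one of many possible network measures.

```python
from collections import Counter

# Hypothetical communication log: (sender, receiver) pairs.
messages = [
    ("ann", "bob"),
    ("bob", "ann"),
    ("ann", "cara"),
    ("dev", "ann"),
    ("bob", "cara"),
]


def build_network(log):
    """Count messages exchanged between each unordered pair of people."""
    edges = Counter()
    for sender, receiver in log:
        if sender != receiver:  # ignore messages people send to themselves
            edges[frozenset((sender, receiver))] += 1
    return edges


def degree(edges):
    """Number of distinct colleagues each person communicates with."""
    counts = Counter()
    for pair in edges:
        for person in pair:
            counts[person] += 1
    return counts


network = build_network(messages)
contacts = degree(network)
hub, hub_degree = contacts.most_common(1)[0]
```

In this log `ann` communicates with three colleagues and is the hub of the network. A manager could use such a measure to spot coordination bottlenecks before they affect the schedule.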
5 Concluding remarks
Software engineering technology is widely used in many areas of life. As science and technology develop, data mining has become an increasingly important and effective technology. To ensure that mining is reliable and efficient, it is integrated to some degree with other engineering technologies. Data mining has shown great economic benefits in practice, and its use should be actively promoted so that it is applied more widely and in greater depth.