How to manage and control data effectively?
With the advent of the big data era, governments and enterprises have recognized the value of data assets and have quickly begun to explore application scenarios and business models and to build technology platforms. However, if data governance is left out of the big data puzzle, all of that business and technical investment will be in vain. As the classic saying goes: garbage in, garbage out.

If you process or use large amounts of data, the term "data governance" will be familiar to you. What is data governance? Is it right for you? How do you implement it? In short, data governance is a strategy for handling data: how to collect, verify, store, access, protect and use it. Data governance also covers who may view, use and share your data.

As the big data era advances, these questions have become increasingly pressing, and more and more enterprises rely on collecting, managing, storing and analyzing data to achieve their business goals. Data has become a source of profit, a medium of commerce and a trade secret. A data leak can lead to legal disputes and cause consumers to lose confidence in the company's core business.

If you simply let every business department manage its own data, each going its own way, then you lack effective data management. You would not allow every department to produce, store and sell products at will. Improper use of data, like improper use of inventory, can cause great losses to an enterprise. So we need measures that ensure the validity, security and availability of the data we need, and that is what "data governance" means.

A data governance strategy must cover the complete data life cycle, including data collection, cleaning and management. Across this life cycle, data governance must focus on the following:

Where and how did the data come from?

This is the beginning of the data life cycle. The source of the data determines the foundation of the data governance strategy; for example, the size of a dataset is determined by its source. Do you collect data from target markets, existing users and social media? Do you use a third party to collect data, or to analyze the data you have collected? What are the input data streams? Data governance must address these questions and set policies that manage data collection, guide third parties in handling the data they collect or in analyzing the data you collect, and control the path and life cycle of the data.

Data verification

Data sources are usually large and diverse, which is a headache for data managers. Separating noise from important data is just the beginning. If you collect data from affiliated companies, you must make sure that data is reliable. With tens of thousands, hundreds of thousands or even millions of complex relational records, cleaning the data by hand in Excel is not realistic. Querying, replacing, correcting, enriching and storing massive, complex relational data in batches requires professional data cleaning tools or systems. Such tools have metadata, master data, transaction data, reference data and data standards built in, and combine them with management mechanisms and technical standards such as organizational structure, content control and process control to make data managers more efficient. For example, where you once wrote programs by hand to collect metadata, the system collects it automatically; where you once checked data quality by eye or with ad hoc code, the system identifies problems automatically; where the data dictionary was maintained in documents, the system manages it online; where processes ran over e-mail and offline, the system automates them online.

Of course, the system is not omnipotent, and data governance software, like any other software, has no magic. Without the participation of data governance staff and active promotion of data governance, the process cannot be completed, however good the software is. This is why data governance consulting services have always had a market, and why most domestic software-only data governance projects have failed to achieve their expected goals.
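As a rough illustration of what "the system identifies problems automatically" can mean, here is a minimal sketch of rule-based data quality checking in Python. The record fields (customer_id, email, amount) and the three rules are assumptions made up for this example; a real cleaning tool would load its rules from the organization's data standards.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical record layout: dicts with "customer_id", "email" and "amount".

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes

# Illustrative rules only; real rules come from the agreed data standards.
RULES = [
    Rule("customer_id present", lambda r: bool(r.get("customer_id"))),
    Rule("email contains @", lambda r: "@" in str(r.get("email", ""))),
    Rule(
        "amount is non-negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

def validate(records, rules=RULES):
    """Split records into clean ones and (record, failed rule names) pairs."""
    clean, rejected = [], []
    for record in records:
        failures = [rule.name for rule in rules if not rule.check(record)]
        if failures:
            rejected.append((record, failures))
        else:
            clean.append(record)
    return clean, rejected

sample = [
    {"customer_id": "C001", "email": "a@example.com", "amount": 12.5},
    {"customer_id": "", "email": "b@example.com", "amount": 3},
    {"customer_id": "C003", "email": "not-an-email", "amount": -1},
]
clean, rejected = validate(sample)
for record, failures in rejected:
    print(record["customer_id"] or "<missing id>", "failed:", failures)
```

Rejected records keep the names of the rules they failed, so a data steward can review them rather than have them silently dropped. That matches the point above: the tool finds the problems, but people still decide what to do about them.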

Data governance must solve the storage problem

Data storage is closely tied to the size of the data set. Big data must be stored in a secure, redundant system. Tiered systems are commonly used to store data according to how often it is used: expensive online systems serve frequently requested data, while less frequently requested data sits in cheaper systems with lower availability. However, if sensitive data that is rarely requested ends up in a system with weak security, the risk increases greatly. A good data governance strategy must therefore consider every aspect when designing a storage scheme.
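The tiering rule described above can be sketched in a few lines of Python. The tier names and the threshold of 100 requests per day are hypothetical values chosen for illustration; the point is only that sensitivity is checked alongside request frequency, so rarely used sensitive data never lands on the low-security tier.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    name: str
    secure: bool  # whether the tier meets the bar for sensitive data

# Hypothetical tiers and threshold, for illustration only.
HOT = Tier("hot-online", secure=True)
COLD_SECURE = Tier("cold-encrypted-archive", secure=True)
COLD_CHEAP = Tier("cold-cheap-archive", secure=False)
HOT_THRESHOLD = 100  # requests per day

def choose_tier(requests_per_day: int, sensitive: bool) -> Tier:
    """Pick a storage tier from request frequency and sensitivity."""
    if requests_per_day >= HOT_THRESHOLD:
        return HOT
    # Infrequently requested data goes to cold storage, but sensitive
    # data is never placed on the tier that lacks adequate security.
    return COLD_SECURE if sensitive else COLD_CHEAP

print(choose_tier(500, sensitive=False).name)
print(choose_tier(2, sensitive=True).name)
print(choose_tier(2, sensitive=False).name)
```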

Data governance must establish an access management system to find a balance between demand and security.

Define the rights of each user so that users can access only the data their rights cover. Only legitimate requests may access data, and sensitive data requires higher authority and stricter verification: it is open only to users with a specific security level. Access levels should be set both for users and for the data itself. When managing accounts, it is very important to work closely with the human resources and purchasing departments, so that access is revoked promptly for employees who have left the company and for suppliers whose cooperation has ended. Handling these details and making data ownership and responsibility clear is part of a complete data governance strategy.
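A minimal sketch of level-based access control, assuming three made-up security levels and small in-memory tables of users and datasets (all names here are hypothetical). A production system would use a directory service or the access controls of the database itself; this only shows the comparison of a user's level with the data's level, and revocation when someone leaves.

```python
from enum import IntEnum

class Level(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2

# Hypothetical clearance assigned to each user account.
users = {"alice": Level.CONFIDENTIAL, "bob": Level.INTERNAL}

# Hypothetical level required by each dataset.
datasets = {
    "price_list": Level.PUBLIC,
    "sales_ledger": Level.INTERNAL,
    "payroll": Level.CONFIDENTIAL,
}

def can_read(user: str, dataset: str) -> bool:
    """Allow access only when the user's level covers the dataset's level."""
    clearance = users.get(user)
    required = datasets.get(dataset)
    if clearance is None or required is None:
        return False  # unknown user or dataset: deny by default
    return clearance >= required

def revoke(user: str) -> None:
    """Remove an account, e.g. when HR reports that an employee has left."""
    users.pop(user, None)

print(can_read("alice", "payroll"))
print(can_read("bob", "payroll"))
```

Calling revoke("bob") as soon as human resources or purchasing reports a departure makes every later can_read("bob", ...) return False, which is the prompt revocation described above.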

Data usage/sharing/analysis

How data is used is an important concern once governance is in place. Data may be used for customer management, improving customer experience, targeted advertising, initializing base data for user application systems, supporting application system construction, and providing market analysis and data to affiliated companies. We must carefully define which data may be shared or used for marketing and which should be used for purely internal purposes, and protect all of it from attacks and leaks. Let users know that every company collecting their data will abide by data security and protection regulations. Ensuring that data is used reasonably and in compliance is also an important part of data governance.

Collection, verification, storage, access and use are all necessary components of a data security plan.

Collection, verification, storage, access and use are all necessary components of the data security plan, and there must be a comprehensive strategy to address these and other security problems. The data security plan must be effective and highly available, because every part of the data life cycle is vulnerable to attack and to damage caused by carelessness. Your data governance must define the data security scheme, including access control and the encryption of data at rest, data being processed and data in transit.

Management/metadata

A data life cycle without management is incomplete. For example, metadata is attached to a piece of data so that it can be identified and retrieved. Metadata includes the source of the data, the date it was collected or generated, its access level, its semantic classification and any other information the enterprise needs. Data governance can establish a metadata vocabulary and define how long data remains valid. Note that data also expires; after that, it can be used only for historical analysis.
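The metadata fields listed above, together with a validity period, can be sketched as a small Python record. The field names and the sample values are assumptions for illustration, not a standard metadata vocabulary.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Metadata:
    source: str          # where the data came from
    collected_on: date   # date of collection or generation
    access_level: str    # e.g. "public", "internal", "confidential"
    category: str        # semantic classification
    valid_days: int      # validity period defined by data governance

    def expires_on(self) -> date:
        return self.collected_on + timedelta(days=self.valid_days)

    def is_current(self, today: date) -> bool:
        """True while the data may be used for more than historical analysis."""
        return today <= self.expires_on()

# Hypothetical example record.
meta = Metadata(
    source="crm_export",
    collected_on=date(2023, 1, 1),
    access_level="internal",
    category="customer-contact",
    valid_days=365,
)
print(meta.expires_on())
print(meta.is_current(date(2023, 6, 1)))
```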

In the process of establishing data governance, there may be some resistance within the enterprise. For example, some people are afraid of losing access to data, while others are reluctant to share data with competitors. The data governance strategy needs to resolve these concerns in a way that is acceptable to all parties. Companies accustomed to data silos will find it difficult to adapt to a new data governance strategy. However, today's dependence on large data sets and the security problems that come with it make it necessary to create and implement company-wide data policies.

Data is increasingly becoming part of enterprise infrastructure, yet decisions about it tend to be made step by step while dealing with specific situations. Each decision is a one-off, usually made to answer a single question. As a result, the way an enterprise processes data varies from department to department, and even from situation to situation within a department. Even if every department has a reasonable data processing scheme, these schemes may conflict with each other, and the enterprise must find ways to coordinate them. Working out the requirements for data storage is difficult. If it is done badly, the potential of data for marketing and customer retention will go unrealized, and if data is leaked, the enterprise will also bear legal liability.

In addition, in large enterprises the various departments compete for data resources. Each department pays attention only to its own business and lacks an overall view, so it is difficult to reach a compromise without mediation.

Therefore, the company needs a body such as a Data Governance Committee. Its responsibility is to implement existing data policies, uncover unmet needs and potential security problems, and create data governance policies that standardize how data is collected, managed, stored, accessed and used, while taking into account the different needs of the various departments and positions. It balances the conflicting needs of different departments, reconciles security and access requirements, and ensures the most efficient and secure data management strategy.

Establish a data governance committee

The committee is responsible for evaluating the needs of all data users and establishing company-wide data management strategies that meet the needs of internal users, external users and the law. Its members should include stakeholders from every business area, so that the needs of all parties are met and every type of data ownership is represented. The committee also needs data security experts, since data security is an important part of its work. It is important that the goals of the committee are understood, so the reasons why the enterprise needs a data governance strategy should be thought through and clearly explained.

Develop a data governance framework

This framework should cover the enterprise's internal, external and legal data requirements. All parts of the framework should be integrated into a whole that meets the requirements of collection, cleaning, storage, retrieval and security. To this end, the enterprise must clearly state its end-to-end data strategy so that a framework can be designed to meet all requirements and support all necessary operations.

There are many advantages when the parts of the framework are combined in a planned way and support each other, such as being able to serve retrieval requests in a highly secure environment. Compliance also needs to be specifically designed into the framework so that regulatory issues can be tracked and reported. The framework also includes logging and other security measures that can provide early warning of attacks. Validating data before it is used is part of the framework as well. The data governance committee should understand each part of the framework and be clear about its purpose and the role it plays across the whole data life cycle.

Test the data strategy

A data strategy usually needs to be tested in a small-scale business environment first, so that shortcomings in its framework, structure and plan can be found and corrected before it is put into formal use.

Data governance strategy should keep pace with the times

As the data governance strategy expands to new business areas, it will need to be adjusted. The data strategy should also evolve as the security situation, data analysis methods and data management tools change.

Determine what a successful data strategy is.

We need clear standards for what counts as success in data governance, so that progress can be measured. Setting data management goals helps identify the key indicators of success and keeps the data governance strategy directed at the needs of the enterprise.
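One simple way to make such standards concrete is to write each indicator down with a target and compare it against a measured value. The sketch below does this in Python; the three indicators, their targets and the measured values are invented for illustration and are not recommended benchmarks.

```python
# Hypothetical indicators: name -> (target, measured, higher_is_better).
indicators = {
    "records passing quality checks (%)": (98.0, 99.1, True),
    "datasets with complete metadata (%)": (95.0, 87.5, True),
    "days to revoke access after departure": (1.0, 3.0, False),
}

def evaluate(indicators):
    """Return {indicator name: True if the target is met}."""
    results = {}
    for name, (target, measured, higher_is_better) in indicators.items():
        if higher_is_better:
            results[name] = measured >= target
        else:
            results[name] = measured <= target
    return results

report = evaluate(indicators)
for name, met in report.items():
    print(("met    " if met else "not met"), name)
```

Reviewing a report like this at regular intervals gives the committee a record of whether the strategy is moving toward its goals.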

Whatever its size, an enterprise faces similar challenges in using data. The bigger an enterprise is, the more data it has, and the more data it has, the more it needs an effective, formal data governance strategy. Smaller enterprises may need only an informal data governance strategy, but this applies only to companies that are small and depend little on data. Even an informal data governance plan should consider, as far as possible, the collection, verification, access and storage of user and employee data.

A more formal data governance strategy must be put in place when the enterprise grows and its data needs span multiple departments, when data systems and data sets become too large to control, when business development calls for enterprise-level strategies, or when legal or regulatory requirements demand it.