What does BI mainly cover?
Business intelligence is commonly abbreviated as BI. The concept was first put forward in 1996. At that time, business intelligence was defined as a class of technologies and their applications, consisting of data warehouses (or data marts), query and reporting, data analysis, data mining, data backup and recovery, and so on, with the aim of helping enterprises make decisions. Today, business intelligence is usually understood as a set of tools that transform an enterprise's existing data into knowledge and help the enterprise make sound business decisions. The data in question includes orders, inventory, transaction accounts, customers and suppliers, data from the enterprise's industry and competitors, and various data from the wider external environment in which the enterprise operates. Business intelligence can support decisions at the operational, tactical and strategic levels. Transforming data into knowledge requires technologies such as data warehousing, online analytical processing (OLAP) tools and data mining. From a technical standpoint, therefore, business intelligence is not a new technology but a combined application of data warehousing, OLAP, data mining and other technologies. BI can be pictured as a factory:

>> The raw material of BI is massive data;

>> BI's products are the information and knowledge processed from that data;

>> BI delivers these products to enterprise decision makers;

>> Enterprise decision makers use the products of the BI factory to make correct decisions and promote the enterprise's development;

This is business intelligence: connecting data with decision makers and turning data into value.

BI applications fall into two categories, information applications and knowledge applications. Their characteristics are as follows:

Information-type BI applications:

These are applications such as data query, reports and charts, multidimensional analysis and data visualization that work from processed raw data. What they have in common is that they transform data into information that decision makers can take in, and present it to them.

For example, processing bank transaction data into bank financial statements.

They are only responsible for providing information and do not actively analyze the data.

For example, a bank financial statement tool cannot analyze in depth the relationship between customer churn and bank interest rates. It has to rely on decision makers to combine the information and derive knowledge through their own thinking.

Knowledge-type BI applications:

These are data mining technologies and tools that uncover hidden relationships in data, so that the computer processes data directly into knowledge and presents it to decision makers.

They actively explore the associations within the data, uncover hidden knowledge that a decision maker's brain could not find quickly, and present it to the decision maker in an understandable form.

(3) Main BI application modes: overview of data query

Data query is the simplest BI application and a legacy of the MIS era. Old-fashioned as it is, it is still the most direct way for decision makers to obtain information.

Today's data query interfaces have left the traditional SQL command line behind. Drop-down menus, input boxes, list boxes and even drag-and-drop with the mouse wrap the laborious SQL statements in the background into an attractive data acquisition system. In essence, however, a data query still consists of the same few elements:

>> What to query?

>> Where to query it from?

>> Filter conditions

>> Display method

The more popular data query applications abroad have fully unlocked the flexibility of data query. Query Studio, the data query interface of Cognos ReportNet (shown in the picture on the right), lets users define the query elements by dragging with the mouse in a pure browser interface and display the data in various forms such as reports and charts.
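
The four query elements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `orders` table, its columns and the `build_query` helper are hypothetical, and SQLite (from the Python standard library) stands in for whatever database a real query tool would sit on.

```python
import sqlite3

def build_query(columns, table, filters):
    """Assemble a parameterized SELECT from the query elements (hypothetical helper)."""
    # Identifiers cannot be bound as parameters, so this sketch accepts simple names only.
    for name in [table, *columns, *filters]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = f"SELECT {', '.join(columns)} FROM {table}"      # what to query, and from where
    if filters:                                            # filter conditions
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    return sql, list(filters.values())

# Hypothetical sample data, made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("A", "Beijing", 120.0), ("B", "Shanghai", 80.0), ("C", "Beijing", 45.5)],
)

sql, params = build_query(["customer", "amount"], "orders", {"city": "Beijing"})
rows = conn.execute(sql, params).fetchall()

# Display method: a plain text listing here; a chart would be another renderer.
for customer, amount in rows:
    print(f"{customer:<10}{amount:>10.2f}")
```

A graphical query tool does essentially the same thing: the menus and drag-and-drop areas collect these elements and the SQL is generated behind the scenes.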

(4) Main BI application modes: overview of reports

Reporting is one of the most popular BI applications in China, which is inseparable from the historical role of reports in China's state-owned enterprises and public institutions. Chinese reports are known for their peculiar formats, densely packed data and eccentric rules, which have left countless foreign reporting tools and BI tools beating their chests in frustration.

The two elements of a report are data and format. Without format, a report application is almost the same as a data query application. In other words, a report presents the queried data in a specified format.

A report application includes two modules: report presentation and report authoring. Report presentation lets decision makers view the report and select the report data through conditions, such as the report year, department or institution. Report authoring is aimed at report developers. The flexibility it gives developers in format definition, the data mapping it supports and the richness of its calculation methods all affect the quality of a BI report application.

It should be made clear that Microsoft Excel by itself is not a BI reporting tool, because out of the box it is not connected to BI data sources; at best it is a spreadsheet. However, Excel's powerful formatting features have won over report makers, to the point that almost all BI vendors provide plug-ins for Microsoft Excel. Through such a plug-in, Excel can connect to BI data sources and become a BI reporting tool: the ugly duckling becomes a swan.

(5) Advanced BI application modes: overview of online analytical processing (OLAP)

OLAP, or online analytical processing, is a new way of looking at data introduced by business intelligence and one of its core technologies.

As we know, data is stored in the tables of a database. For example, the sales data of a store might be stored in a table like the one below:

Sales time  | Sales place | Product    | Sales quantity | Sales amount
------------|-------------|------------|----------------|-------------
2004-11-1   | Beijing     | soap       | 10             | 342.00
2004-11-6   | Guangzhou   | orange     | 30             | 123.00
2004-12-3   | Beijing     | banana     | 20             | 12.00
2004-12-13  | Shanghai    | orange     | 50             | 189.00
2005-1-8    | Beijing     | soap       | 10             | 342.00
2005-1-23   | Shanghai    | toothbrush | 30             | 150.00
2005-2-4    | Guangzhou   | toothbrush | 20             | 100.00

Decision makers usually want to know macro-level information such as distribution, proportion and trend, for example:

>> Setting the time factor aside, how are sales in Beijing?

>> Which product's sales grew the most from 2004 to 2005?

>> What share of 2004 sales did each product account for? ...

Meeting this demand with SQL statements requires a large number of SUM operations, one for each question asked. With the seven records above the results are easy to obtain, but with millions or even billions of records, such as the call records of a mobile operator, each SQL SUM takes a long time to compute. Decision makers often submit an analysis request one day and receive the result only the next. This style of analysis is "off-line analysis", and it is very inefficient.
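
To make the SQL SUM approach concrete, here is a minimal sketch that loads the seven records into an in-memory SQLite database and answers two of the questions. The table name `sales` and its column names are assumptions chosen for this example, and the dates are written in zero-padded form so that string comparison works.

```python
import sqlite3

# The seven sales records from the table above.
SALES = [
    ("2004-11-01", "Beijing", "soap", 10, 342.00),
    ("2004-11-06", "Guangzhou", "orange", 30, 123.00),
    ("2004-12-03", "Beijing", "banana", 20, 12.00),
    ("2004-12-13", "Shanghai", "orange", 50, 189.00),
    ("2005-01-08", "Beijing", "soap", 10, 342.00),
    ("2005-01-23", "Shanghai", "toothbrush", 30, 150.00),
    ("2005-02-04", "Guangzhou", "toothbrush", 20, 100.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_date TEXT, place TEXT, product TEXT, "
    "quantity INTEGER, amount REAL)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", SALES)

# Question 1: total sales amount in Beijing, ignoring time.
beijing_total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE place = 'Beijing'"
).fetchone()[0]

# Question 3: sales amount per product in 2004, then each product's share.
by_product_2004 = dict(conn.execute(
    "SELECT product, SUM(amount) FROM sales "
    "WHERE sale_date LIKE '2004%' GROUP BY product"
).fetchall())
total_2004 = sum(by_product_2004.values())

print("Beijing total:", beijing_total)
for product, amount in sorted(by_product_2004.items()):
    print(f"{product}: {amount / total_2004:.1%} of 2004 sales")
```

Every new question needs another scan and another SUM over the table, which is exactly the cost that grows painful at millions or billions of rows.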

To make data analysis more efficient, OLAP breaks completely with record-based data browsing and divides data into "dimensions" and "measures":

>> A dimension is a perspective from which data is observed, such as "sales time", "sales place" and "product" in the example above;

>> A measure is a concrete numeric value, such as "sales quantity" and "sales amount" in the example above;

In this way, the flat data table above can be converted into a three-dimensional data cube.

Exploring the data then means fixing a point in this cube and reading the measure values at that point.

Of course, a data cube is not limited to three dimensions. Three are used here only because that is the most a picture can show.

Dimensions can be organized into levels. For example, days can be rolled up into months and years, products into food and daily necessities, and locations into North China and South China. Users can drill down and roll up freely along the levels of a dimension.
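
As a rough illustration of dimensions and measures, the sketch below stores the seven sales records as cells of a small cube, addressed by (time, place, product) and holding the two measures. The `slice_total` helper is hypothetical, and the dictionary is only a teaching device: a real OLAP engine uses far more elaborate storage and precomputed aggregates.

```python
from collections import defaultdict

# (sales time, sales place, product, sales quantity, sales amount)
SALES = [
    ("2004-11-01", "Beijing", "soap", 10, 342.00),
    ("2004-11-06", "Guangzhou", "orange", 30, 123.00),
    ("2004-12-03", "Beijing", "banana", 20, 12.00),
    ("2004-12-13", "Shanghai", "orange", 50, 189.00),
    ("2005-01-08", "Beijing", "soap", 10, 342.00),
    ("2005-01-23", "Shanghai", "toothbrush", 30, 150.00),
    ("2005-02-04", "Guangzhou", "toothbrush", 20, 100.00),
]

DIMENSIONS = ("time", "place", "product")

# Each cell is addressed by one coordinate per dimension
# and holds the measures [quantity, amount].
cube = defaultdict(lambda: [0, 0.0])
for time, place, product, quantity, amount in SALES:
    cell = cube[(time, place, product)]
    cell[0] += quantity
    cell[1] += amount

def slice_total(cube, **fixed):
    """Sum the measures over all cells that match the fixed dimension values."""
    quantity, amount = 0, 0.0
    for coords, (q, a) in cube.items():
        point = dict(zip(DIMENSIONS, coords))
        if all(point[dim] == value for dim, value in fixed.items()):
            quantity += q
            amount += a
    return quantity, amount

# One point in the cube: fix all three dimensions and read the measures.
print(cube[("2004-11-06", "Guangzhou", "orange")])

# Fix only one dimension and aggregate over the others.
print(slice_total(cube, place="Beijing"))
print(slice_total(cube, product="toothbrush"))
```

Adding a fourth dimension, such as a sales channel, would only lengthen the coordinate tuple, which is why the cube idea is not tied to three dimensions.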
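
Rolling up along dimension levels can be sketched as follows. The product-to-category mapping is an assumption made for this example (the text only names the categories), and the `roll_up` and `time_key` helpers are hypothetical. Only the sales amount is aggregated, to keep the sketch short.

```python
from collections import defaultdict

# (sales time, product, sales amount) taken from the sales table above.
SALES = [
    ("2004-11-01", "soap", 342.00),
    ("2004-11-06", "orange", 123.00),
    ("2004-12-03", "banana", 12.00),
    ("2004-12-13", "orange", 189.00),
    ("2005-01-08", "soap", 342.00),
    ("2005-01-23", "toothbrush", 150.00),
    ("2005-02-04", "toothbrush", 100.00),
]

# Assumed product hierarchy: product -> category.
CATEGORY = {
    "soap": "daily necessities",
    "toothbrush": "daily necessities",
    "orange": "food",
    "banana": "food",
}

def time_key(date, level):
    """Map a YYYY-MM-DD date to the requested level of the time hierarchy."""
    year, month, _day = date.split("-")
    return {"year": year, "month": f"{year}-{month}", "day": date}[level]

def roll_up(sales, time_level, by_category=False):
    """Aggregate the sales amount at the chosen level of each dimension."""
    totals = defaultdict(float)
    for date, product, amount in sales:
        item = CATEGORY[product] if by_category else product
        totals[(time_key(date, time_level), item)] += amount
    return dict(totals)

yearly = roll_up(SALES, "year", by_category=True)    # rolled up to year and category
monthly = roll_up(SALES, "month", by_category=True)  # drilled down one level in time

for key in sorted(yearly):
    print(key, yearly[key])
```

Drilling down is the same call with a finer level, for example `roll_up(SALES, "day")` to get back to individual products per day.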

In this way we escape the speed limit of SQL SUM, can quickly locate the detailed data that meets different conditions, and can quickly obtain summary data at any level. OLAP gives decision makers a multi-angle, multi-level and efficient way to explore data. Their thinking is no longer constrained by fixed drop-down menus and query conditions; instead, their own line of thought drives how the data is obtained, and they can combine analysis angles and analysis targets at will. This interactivity and efficiency, which traditional analysis cannot match, is what makes OLAP the core application of a BI system.

(*) Part four: advanced BI application modes - data visualization and data mining.

(6) BI application modes: overview of data visualization

Data visualization applications aim to present information in as rich a variety of forms as possible, so that decision makers can quickly grasp what the information contains, such as trends, distribution and density, through the intuitive expression of graphics. It is worth mentioning that GIS software vendors, represented by MapInfo, are also trying to move into BI applications. MapInfo was the first to put forward the concept of "location intelligence", which uses a geographic information system to display attribute values for each region, such as population density, industrial output or hospitals per capita. This kind of visualization partially overlaps with BI data visualization and is a strong complement to it; the two can sometimes be used together in one project.

The picture above shows the Cognos Visualizer product. It displays data and information in strikingly rich forms, with nearly 50 kinds of graphics including maps, pie charts and waterfall charts, in both two-dimensional and three-dimensional display modes. All graphic elements are interactive: for example, a user can click a province on the map and drill down to information on the cities in that province. This interactivity is a notable difference between BI and ordinary chart-generation software.

(7) BI application modes: overview of data mining

Data mining is the most advanced BI application, because it can take over some functions of the human brain.

Data mining is a special case of knowledge discovery in structured data.

The purpose of data mining is to have a computer analyze large amounts of data, find the rules and knowledge hidden in the data, and present them to users in a form they can understand.

The three elements of data mining are:

>> Technology and algorithms: commonly used data mining techniques currently include automatic cluster detection, decision trees and neural networks.

>> Data: because data mining is a process of mining the unknown from the known, a large accumulation of data is needed as the data source. The more data has been accumulated, the more reference points the data mining tool has.

>> Predictive model: the business logic to be mined, simulated by computer. This is the main task of data mining.

Compared with information-type BI applications, knowledge-type BI applications, represented by data mining, are not yet mature. Seen from another angle, this means data mining still has a lot of room to grow and is a key direction for the future of BI. Knowledge-type BI vendors such as SAS and SPSS have been steadily raising their profile and quietly occupying new areas of profit growth.

In the picture above, the well-known IBM Intelligent Miner is analyzing customers' consumption behavior. It can analyze a large amount of customer data, automatically divide the customers into several groups (automatic cluster detection) and display the consumption characteristics of each group, so that decision makers can see at a glance how to design promotion or advertising plans for customers with different consumption habits.

If the same functions had to be achieved with information-type BI applications alone, decision makers would need to do a great deal of OLAP analysis and data query based on experience, and might still fail to find the rules hidden in the data. Take the customer grouping above: for a bank with 4 million customers, doing it without a data mining tool would be exhausting.
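
To give a feel for automatic customer grouping, here is a minimal sketch using plain k-means, one common clustering technique. It is not a description of how IBM Intelligent Miner works internally. The customer figures (monthly spend, visits per month) are made up for illustration, and the starting centroids are fixed by hand so the run is deterministic.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group; keep the old one if the group is empty.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids, groups

# Hypothetical customers: (monthly spend, visits per month).
customers = [(200, 2), (220, 3), (250, 2), (1500, 12), (1600, 15), (1450, 10)]

# Start from two of the customers as initial centroids (an arbitrary choice for this sketch).
centroids, groups = k_means(customers, centroids=[customers[0], customers[-1]])

for centroid, group in zip(centroids, groups):
    print(f"{len(group)} customers around spend={centroid[0]:.0f}, visits={centroid[1]:.1f}")
```

In this toy data the spend figures dominate the distance; a real analysis would normally rescale the features first and work on far more than two of them.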

(8) The foundation of BI: data warehouse technology

Before we start this topic, let's look at the official definition of a data warehouse:

A data warehouse is a subject-oriented, integrated, nonvolatile and time-variant collection of data used to support management decisions.

An "operational database" is like the database of a bank's bookkeeping system. Every business operation (for example, you deposit 5 yuan) is recorded in this database immediately, and over time the data that accumulates is fragmentary. Such a database is oriented toward business operations, hence the name.

A "data warehouse" is used for decision support and is oriented toward analytical data processing, which sets it apart from an operational database. A data warehouse also integrates multiple heterogeneous data sources: after integration the data is reorganized by subject, it includes historical data, and data stored in the warehouse is generally not modified.

The relationship between an operational database, a data warehouse and a database is like the relationship between the C: drive, the D: drive and the hard disk. The database is the hard disk, the operational database is C: and the data warehouse is D:. Both the operational database and the data warehouse are stored in a database; they differ only in the design and purpose of their table structures.

So why add a "data warehouse" layer between the operational database and BI?

First, the operational database is busy day and night and its main goal is to respond quickly to the business, so it has no capacity left to serve the data needs of BI, which are usually summaries. A single select sum(xx) ... group by xx can consume so many database resources that business processing falls behind, and that means big trouble. Suppose you deposit 5,000 yuan and the money still has not arrived ten minutes later. What would you think? Most likely the bank president is looking at a pie chart.

Second, an enterprise runs many applications, each with its own operational database: human resources, finance, sales documents, inventory and so on. To provide a panoramic view of the data, BI must integrate these scattered sources. For example, OLAP analysis that combines sales and inventory information requires the BI tool to obtain data efficiently from two databases. The most efficient approach is to integrate the data into the data warehouse first and have BI applications read from the data warehouse only.

Integrating the data in scattered operational databases into a data warehouse is a deep subject, and it has given rise to the data integration software market. Integration does not mean simply piling tables together. Instead, the dimensions of each operational database are extracted and the dimensions they have in common are set up as shared dimensions; the database tables that contain the actual measure values are then consolidated by subject into a few large tables (called "fact tables"); the data warehouse table structure is built on this dimension-measure model; and finally the data is extracted and transformed. Subsequent extractions are generally incremental, pulling in new data when the load on the operational database is low (for example in the early morning), so that data steadily accumulates in the data warehouse.
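
The dimension table, fact table and incremental extraction described above can be sketched with two in-memory SQLite databases. Everything named here is hypothetical: the `orders`, `dim_product` and `fact_sales` tables, their columns, and the `extract_increment` helper. The sketch also assumes that orders are append-only with increasing ids, so the highest loaded id can serve as the watermark for the next run.

```python
import sqlite3

# Hypothetical operational database with a single orders table.
ops = sqlite3.connect(":memory:")
ops.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, sold_on TEXT, city TEXT, "
    "product TEXT, amount REAL)"
)
ops.executemany(
    "INSERT INTO orders (sold_on, city, product, amount) VALUES (?, ?, ?, ?)",
    [("2004-11-01", "Beijing", "soap", 342.0),
     ("2004-11-06", "Guangzhou", "orange", 123.0)],
)

# Hypothetical warehouse: one dimension table and one fact table.
dw = sqlite3.connect(":memory:")
dw.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE fact_sales (
    order_id INTEGER PRIMARY KEY,
    sold_on TEXT,
    city TEXT,
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);
""")

def extract_increment(ops, dw):
    """Copy only the orders newer than the highest order id already loaded."""
    last_id = dw.execute(
        "SELECT COALESCE(MAX(order_id), 0) FROM fact_sales"
    ).fetchone()[0]
    new_rows = ops.execute(
        "SELECT id, sold_on, city, product, amount FROM orders "
        "WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    for order_id, sold_on, city, product, amount in new_rows:
        # Register the product in the dimension table if it is not there yet.
        dw.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (product,))
        product_id = dw.execute(
            "SELECT product_id FROM dim_product WHERE name = ?", (product,)
        ).fetchone()[0]
        dw.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
            (order_id, sold_on, city, product_id, amount),
        )
    dw.commit()
    return len(new_rows)

first = extract_increment(ops, dw)    # initial load
ops.execute(
    "INSERT INTO orders (sold_on, city, product, amount) VALUES (?, ?, ?, ?)",
    ("2004-12-03", "Beijing", "banana", 12.0),
)
second = extract_increment(ops, dw)   # later run picks up only the new order
print(first, second)
```

A production extraction job would also handle updates and deletes in the source, data cleansing and several shared dimensions, none of which this sketch attempts.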

Most BI applications do not need real-time data; a decision maker, for example, may simply read last week's report once a week. Some 95% of BI applications are not real-time and tolerate a data lag of anywhere from 1 hour to 1 month. This is a characteristic of decision support systems, and the lag is the window in which data extraction tools do their work. Of course, BI applications occasionally include a few real-time requirements. For these special cases, the BI query software can be connected directly to the business database, but the load must be limited and complex queries prohibited.

Many database products now provide special optimizations for data warehousing. For example, when installing a later version of MySQL, the installer asks whether the database instance should be transaction-oriented or decision-support-oriented. The former is an operational database and the latter is a data warehouse. The database then applies optimizations targeted at the chosen form.

(9) BI odds and ends

That covers the essentials of BI. A few odds and ends to finish.

BI's weak spot: BI cannot handle unstructured data, only numerical information. Yet enterprises hold a great deal of unstructured data such as text, streaming media and pictures, and this data also contains a lot of value; faced with it, current BI tools are powerless. IBM Intelligent Miner for Text is a credible option, but it seems weak at handling Chinese.

BI suppliers and products:

First, the well-known foreign names. For data warehouses there are IBM DB2, Oracle, Sybase IQ, NCR Teradata and others. For BI applications there are Cognos, Business Objects, MicroStrategy, Hyperion, IBM and others. For data mining there are IBM, SAS, SPSS and others. The giant Microsoft has also stepped into the BI field, launching BI-related products such as SQL Server Analysis Services and Reporting Services to stake its claim.

We often pay attention only to the big foreign BI vendors and overlook the emerging BI players in China. Well-known domestic BI products currently include Power-BI from Aowei Zhidong, BlueQuery from Shangnan, and Run Qian Report. Power-BI from Aowei Zhidong deserves particular mention: it is a standardized BI product with a certain market share in China.

The development of the business intelligence market in China:

Period | BI applications in China

Before 2002: Most BI software was regarded as a reporting tool that could pull data from multiple data sources. At first, companies promoting their products told users, "We are the strongest in the BI field ...", and it did not work well. Later the salespeople found the trick and opened with "We can do any report!" Then the orders came in.

2002-2003: Some discerning people finally discovered the value of OLAP. Enterprises under heavy competitive pressure urgently needed to tap the value of their historical data to become more competitive, and they quickly recognized the advantages of OLAP. Salespeople no longer had to say "We can do any report". Government agencies and monopoly enterprises, however, still wanted reports and nothing but reports.

2004: As more and more BI projects were implemented successfully, OLAP finally came into its own, and a sound BI application framework of data query + report presentation + OLAP analysis took shape in China. Users often asked for data visualization as well. Data mining applications began to appear in some enterprises facing fierce competition and large data volumes.

2005: Providing information alone could no longer satisfy many enterprises, especially in fiercely competitive, risk-intensive industries such as banking, telecommunications and securities. Demand for data mining emerged in large numbers, and BI applications finally formed a complete whole of information + knowledge.

Problems encountered by BI tools in China:

* Complex reports: China has the most complex reports in the world. The approach to report design in China differs from that in the West. Western reports tend to use one report to explain one problem, while Chinese reports tend to pack as many problems as possible into a single report, which leads directly to their complex formats and unusual styles.

* Large data volumes: China is the most populous country in the world. Take China Mobile as an example: the number of users in a single Chinese province is comparable to the population of a medium-sized European country. That is a huge amount of data! Foreign databases, data warehouses and BI application software are all being tested in China for their capacity to handle large data volumes. In the United States a customer analysis application may return results in two seconds; in China, with so much more data, two seconds will not be enough.

* Data write-back: China has the most unusual requirements for BI systems in the world. BI systems were originally built on the principle of faithfully reproducing the source data, but in China this principle ran into trouble. Many managers demand the ability to modify data. "If the figures on the report don't look good, they have to be changeable; sometimes they need to be adjusted before the higher-ups see them!" one manager said. At present only two BI products can meet this requirement: Microsoft's and MicroStrategy's. Microsoft knows the Chinese market very well.