Complex network-social network analysis
A "social network" is a set of social actors together with the relationships among them. The nodes (points) of a social network represent the actors, and the edges represent the social relations between them. A relationship may be directed or undirected. Social relations take many forms, such as friendship between people, superior-subordinate relationships, scientific research collaboration, communication among members of an organization, and trade between countries. Social network analysis is a concrete tool of social network theory: it studies the relationships between actors in a social network quantitatively.

Therefore, the focus of social network analysis is on relationships and patterns of relationships, and its approaches and methods differ conceptually from traditional statistical analysis and data processing methods.

A social network usually refers to human individuals connected through various relationships, such as friendship, marriage and business, and these links show certain patterns at the macro level. Early on, some sociologists began to study people's communication patterns. Ebel and others carried out an e-mail version of the small-world experiment, collecting 112 days of e-mail connection data for 5,000 students at Kiel University. Each node is an e-mail address and each connection is a message sent. They obtained a power-law distribution with an exponential cutoff, with exponent r = 1.18, and showed that the network is a small world with an average separation of 4.94.

Social network analysis can solve or try to solve the following problems:

Centrality is one of the main concerns of social network analysis. It is used to analyze how much power an individual or organization has in its social network, or how central a position it occupies. This idea was one of the earliest topics discussed by social network analysts.

The degree centrality of a node is the number of nodes directly connected to it. In an undirected graph with n nodes its largest possible value is n - 1; in a directed graph it is split into in-degree and out-degree.
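The count described above (usually called degree centrality) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the edge-list input format and the example data are all assumptions made here. Dividing by n - 1 gives a value between 0 and 1 for an undirected graph.

```python
from collections import defaultdict

def degree_undirected(edges):
    """Count, for each node, how many distinct neighbours it has."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    return {node: len(adj) for node, adj in neighbours.items()}

def degree_directed(arcs):
    """Return (in_degree, out_degree) dictionaries for a directed edge list."""
    preds, succs = defaultdict(set), defaultdict(set)
    for u, v in arcs:
        if u != v:
            succs[u].add(v)  # u sends a tie to v
            preds[v].add(u)  # v receives a tie from u
    nodes = set(preds) | set(succs)
    in_deg = {n: len(preds[n]) for n in nodes}
    out_deg = {n: len(succs[n]) for n in nodes}
    return in_deg, out_deg

# Hypothetical friendship ties (undirected).
friends = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
deg = degree_undirected(friends)
n = len(deg)
normalized = {node: d / (n - 1) for node, d in deg.items()}

# Hypothetical e-mail messages (directed: sender, receiver).
emails = [("A", "B"), ("A", "C"), ("B", "A"), ("D", "A")]
in_deg, out_deg = degree_directed(emails)
```

In the friendship example node A is tied to all three other nodes, so its normalized degree is 1.0. In the e-mail example A receives from two senders and writes to two receivers.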

The centrality of an individual measures how close the individual is to the center of the network and reflects the importance of that node. Every individual in the network has its own centrality, which characterizes that individual. Besides the centrality of individuals, we can also compute the centralization of the whole network, that is, its overall tendency to concentrate around a few nodes. Centralization describes how much the nodes of the network differ from one another in centrality, and a network has only one centralization value.

Depending on how they are calculated, centrality and centralization come in three types: degree centrality/degree centralization, betweenness centrality/betweenness centralization, and closeness centrality/closeness centralization (sometimes rendered as point, intermediate and near centrality, respectively).

In a social network, an individual who has many direct connections to other individuals occupies a central position in the network and has greater "power". Following this idea, the centrality of a node can be measured by the number of nodes connected to it, that is, by its degree centrality.

Degree centralization describes how strongly the network as a whole is concentrated around its most central node. It is calculated in the following steps: first, find the largest degree centrality in the graph; then calculate the difference between this value and the centrality of every other node; then add up these differences; and finally divide this sum by the largest value that the sum of differences could possibly take.
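The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is undirected, it is given as a dictionary mapping each node to the set of its neighbours, and the largest possible sum of differences is taken to be (n - 1)(n - 2), the value reached by a star graph (Freeman's normalization for raw degree). The function name and example graphs are made up for this sketch.

```python
def degree_centralization(adjacency):
    """Degree centralization of an undirected graph.

    `adjacency` maps each node to the set of its neighbours.
    """
    n = len(adjacency)
    if n < 3:
        raise ValueError("centralization needs at least three nodes")
    degrees = [len(neigh) for neigh in adjacency.values()]
    largest = max(degrees)                                # step 1
    total_difference = sum(largest - d for d in degrees)  # steps 2 and 3
    max_possible = (n - 1) * (n - 2)                      # reached by a star graph
    return total_difference / max_possible                # step 4

# A star: one hub tied to everyone else (most centralized case).
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}

# A ring: every node has the same degree (no concentration at all).
ring = {"a": {"b", "e"}, "b": {"a", "c"}, "c": {"b", "d"},
        "d": {"c", "e"}, "e": {"d", "a"}}

star_value = degree_centralization(star)  # 1.0
ring_value = degree_centralization(ring)  # 0.0
```

The two extremes show how to read the value: 1 means one node holds all the ties, 0 means the nodes are indistinguishable by degree.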

In a network, an individual who lies on the paths between many pairs of other individuals can be regarded as holding an important position, because this individual can control the communication between the two others. This property is described by betweenness centrality (intermediate centrality), which measures the degree to which an individual controls resources. The more often an individual occupies such a position, the higher its betweenness centrality and the more individuals must communicate through it.

Betweenness centralization is defined as the gap between the betweenness centrality of the node with the highest betweenness in the network and that of the other nodes, and it is used to analyze the overall structure of the network. A higher betweenness centralization suggests that the nodes may be split into several small groups that rely heavily on a single node to relay relationships, which puts that node in an extremely important position in the network.
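As a rough sketch of how these two quantities can be computed, the code below counts, for every node, the share of shortest paths between other pairs that pass through it, using Brandes' accumulation scheme for unweighted graphs. The assumptions are spelled out here: the graph is undirected and given as a dictionary of neighbour sets, each pair of nodes is counted once, scores are normalized by (n - 1)(n - 2)/2, and the centralization is scaled so that a star graph scores 1. Function names and example graphs are illustrative only.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Raw betweenness for an unweighted, undirected graph (Brandes' scheme)."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        stack = []                            # nodes in order of discovery
        preds = {v: [] for v in adjacency}    # predecessors on shortest paths
        sigma = {v: 0 for v in adjacency}     # number of shortest paths from source
        dist = {v: -1 for v in adjacency}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                          # breadth-first search
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adjacency}
        while stack:                          # accumulate from the farthest node back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every unordered pair was visited from both endpoints, so halve the totals.
    return {v: s / 2 for v, s in score.items()}

def betweenness_centralization(adjacency):
    """Sum of gaps to the top normalized score, scaled so that a star gives 1."""
    n = len(adjacency)
    scale = (n - 1) * (n - 2) / 2
    normalized = [b / scale for b in betweenness_centrality(adjacency).values()]
    top = max(normalized)
    return sum(top - x for x in normalized) / (n - 1)

star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
         "d": {"c", "e"}, "e": {"d"}}

star_scores = betweenness_centrality(star)
chain_scores = betweenness_centrality(chain)
star_centralization = betweenness_centralization(star)
```

In the star, every pair of outer nodes must go through the hub, so the hub collects all six pairs and the centralization is 1. In the chain, the middle node c lies between four pairs and therefore scores highest.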

Closeness centrality (near centrality) describes an individual's ability to avoid being controlled by others. When calculating closeness centrality we look at shortest paths rather than direct ties. If a node is connected to many other nodes through relatively short paths, we say that it has high closeness centrality.

For a social network, the higher the closeness centralization, the greater the differences between nodes in the network; conversely, a low value shows that the differences between nodes are small.
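A minimal sketch of the closeness calculation follows, assuming a connected, undirected, unweighted graph given as a dictionary of neighbour sets. Closeness is taken here as (n - 1) divided by the sum of shortest-path distances to all other nodes, and the centralization is divided by (n - 1)(n - 2)/(2n - 3), the value a star graph reaches under this normalization. The function names and example graphs are assumptions of this sketch.

```python
from collections import deque

def shortest_path_lengths(adjacency, source):
    """Breadth-first search distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness_centrality(adjacency):
    """Normalized closeness: (n - 1) / sum of distances to all other nodes."""
    n = len(adjacency)
    result = {}
    for v in adjacency:
        dist = shortest_path_lengths(adjacency, v)
        if len(dist) != n:
            raise ValueError("this sketch assumes a connected graph")
        result[v] = (n - 1) / sum(dist.values())
    return result

def closeness_centralization(adjacency):
    """Sum of gaps to the top closeness, divided by the star-graph maximum."""
    n = len(adjacency)
    scores = closeness_centrality(adjacency)
    top = max(scores.values())
    max_possible = (n - 1) * (n - 2) / (2 * n - 3)
    return sum(top - s for s in scores.values()) / max_possible

chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
         "d": {"c", "e"}, "e": {"d"}}
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}

chain_closeness = closeness_centrality(chain)
most_central = max(chain_closeness, key=chain_closeness.get)
star_centralization = closeness_centralization(star)
```

In the chain the middle node reaches everyone in at most two steps, so it has the highest closeness even though its degree equals that of its two neighbours.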

Note: The measures above are defined for undirected graphs. For a directed graph, they can be adapted according to the same definitions.

When some individuals in a network are so closely tied that they form a subgroup, social network analysis calls this group a cohesive subgroup. Cohesive subgroup analysis examines how many such subgroups exist in the network, what characterizes the relationships among the members within a subgroup, what characterizes the relationships between subgroups, and what characterizes the relationships between members of one subgroup and members of another.

Because the members of a cohesive subgroup are closely related, some scholars also refer to cohesive subgroup analysis as "small group analysis" or as the "community phenomenon".

Commonly used community detection methods mainly include the following:

(1) Methods based on graph partitioning, such as the Kernighan-Lin algorithm and spectral bisection;

(2) Methods based on hierarchical clustering, such as the GN (Girvan-Newman) algorithm and Newman's fast algorithm;

(3) Methods based on modularity optimization, such as greedy algorithms, simulated annealing, memetic algorithms, particle swarm optimization (PSO), evolutionary multi-objective optimization, etc.
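The methods in group (3) all search for the partition that maximizes a quality score called modularity. The sketch below only evaluates that score for a given partition; it does not implement any of the search algorithms listed. It assumes the Newman-Girvan definition for an undirected, unweighted graph, where each community contributes its share of internal edges minus the share expected from its total degree. The function name and the example graph are made up for illustration.

```python
def modularity(adjacency, communities):
    """Modularity Q of a partition of an undirected, unweighted graph.

    `adjacency` maps each node to the set of its neighbours;
    `communities` is an iterable of node collections covering every node once.
    """
    m = sum(len(neigh) for neigh in adjacency.values()) / 2  # number of edges
    if m == 0:
        raise ValueError("the graph has no edges")
    q = 0.0
    for group in communities:
        group = set(group)
        # Each internal edge is seen from both of its endpoints, hence / 2.
        internal = sum(1 for v in group for w in adjacency[v] if w in group) / 2
        degree_sum = sum(len(adjacency[v]) for v in group)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q

# Two triangles joined by a single bridge edge c-d.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
         "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"}}

q_two_groups = modularity(graph, [{"a", "b", "c"}, {"d", "e", "f"}])
q_one_group = modularity(graph, [{"a", "b", "c", "d", "e", "f"}])
```

Splitting the graph into its two triangles gives a clearly positive score, while putting every node into one community gives zero, which is why an optimizer would prefer the first partition.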

Cohesive subgroup density, measured by the External-Internal index (E-I index), is mainly used to gauge how pronounced the small-group phenomenon is within a large network, and it is very effective for analyzing organizational management and similar issues.

In the worst case, the large group is loose while a core small group is highly cohesive. Another possibility is that the large group contains many highly cohesive small groups, which are likely to struggle against one another. The cohesive subgroup density ranges over [-1, +1]. The closer the value is to 1, the greater the factionalism; the closer it is to -1, the smaller the factionalism; and the closer it is to 0, the more random the relationships tend to be, with no factional pattern.

The E-I index can be regarded as an important warning indicator for enterprise managers. When an enterprise's E-I index is too high, small groups within the enterprise may be banding together and pursuing their own interests, to the detriment of the enterprise as a whole. The E-I index can also be applied outside enterprise management, for example to study the relationships between scholars in a given field. If the network contains a cohesive subgroup whose density is high, the scholars in that subgroup are closely connected and exchange information and collaborate on research frequently, while those outside the subgroup do not get enough opportunities for information and collaboration. To some extent, this situation hinders the development of the discipline.
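A minimal sketch of one way to compute such an index is given below. It assumes the commonly cited Krackhardt-Stern form, (E - I) / (E + I), where E counts ties between members of different subgroups and I counts ties inside a subgroup. The sign convention is an assumption of this sketch: in this form -1 means every tie stays inside a subgroup and +1 means every tie crosses subgroups, so the orientation described in the text (values near 1 meaning stronger factions) corresponds to the negated value. The function name, the group labels and the example ties are hypothetical.

```python
def ei_index(edges, group_of):
    """E-I index in the (E - I) / (E + I) form for an undirected edge list.

    `group_of` maps every node to the label of its subgroup.
    """
    external = internal = 0
    for u, v in edges:
        if group_of[u] == group_of[v]:
            internal += 1   # tie inside one subgroup
        else:
            external += 1   # tie between two subgroups
    total = external + internal
    if total == 0:
        raise ValueError("the network has no ties")
    return (external - internal) / total

# Hypothetical organization with two departments.
group_of = {"ann": "sales", "bob": "sales", "cat": "sales",
            "dan": "research", "eve": "research"}
ties = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"),  # inside sales
        ("dan", "eve"),                                  # inside research
        ("cat", "dan")]                                  # across departments

index = ei_index(ties, group_of)
factionalism = -index  # orientation used in the text: larger means more factional
```

Here four of the five ties are internal, so the index is (1 - 4) / 5 = -0.6, and the negated value 0.6 points to fairly inward-looking departments.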

The purpose of core-periphery (core-edge) structure analysis is to identify which nodes of a social network are in the core and which are on the periphery. Core-periphery analysis has a wide range of applications: it can be used to analyze elite networks, paper citation networks, organizational relationship networks and many other social phenomena.

The core-periphery structure takes different forms depending on the type of relational data (categorical data or ratio data). Categorical data and ratio data are basic concepts in statistics. Generally speaking, categorical data are represented by categories, which are often coded as numbers, but these numbers cannot be used in arithmetic. Ratio data are expressed as numerical values and can be used in arithmetic. If the data are categorical, a discrete core-periphery model can be built; if the data are ratio data, a continuous core-periphery model can be built.

Depending on whether relationships exist between core members and peripheral members and how close they are, discrete core-periphery (core-edge) models can be divided into three types: the core-periphery full correlation model, the core-periphery local correlation model and the core-periphery missing-relationship model. The missing-relationship model arises when the relationships between core and periphery are treated as missing values.
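One hedged way to make a discrete model concrete is to compare the observed ties with an ideal pattern and measure the fit by their correlation, an approach associated with Borgatti and Everett. The sketch below makes several assumptions that are not stated in the text: core members are expected to be tied to one another, peripheral members are expected to have no ties among themselves, the "full" model expects every core-periphery tie to be present, and the "missing" model simply leaves core-periphery pairs out of the comparison. The function names, the `model` parameter and the example network are illustrative only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(vx * vy)

def core_periphery_fit(adjacency, core, model="full"):
    """Correlate observed ties with an ideal core-periphery pattern.

    `adjacency` maps each node to its set of neighbours (undirected graph);
    `core` is the set of nodes assigned to the core.
    """
    nodes = sorted(adjacency)
    observed, ideal = [], []
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            u_core, v_core = u in core, v in core
            if u_core and v_core:
                expected = 1      # core members are tied to each other
            elif not u_core and not v_core:
                expected = 0      # peripheral members are not tied to each other
            elif model == "full":
                expected = 1      # every core-periphery tie is present
            elif model == "missing":
                continue          # core-periphery pairs are treated as missing
            else:
                raise ValueError("unknown model: " + model)
            observed.append(1 if v in adjacency[u] else 0)
            ideal.append(expected)
    return pearson(observed, ideal)

# Hypothetical network: a, b, c form a clique and d, e, f attach only to them.
network = {"a": {"b", "c", "d", "e", "f"}, "b": {"a", "c", "d", "e", "f"},
           "c": {"a", "b", "d", "e", "f"},
           "d": {"a", "b", "c"}, "e": {"a", "b", "c"}, "f": {"a", "b", "c"}}

good_fit = core_periphery_fit(network, core={"a", "b", "c"}, model="full")
missing_fit = core_periphery_fit(network, core={"a", "b", "c"}, model="missing")
bad_fit = core_periphery_fit(network, core={"d", "e", "f"}, model="full")
```

With the clique assigned to the core the observed ties match the ideal pattern exactly, while swapping core and periphery produces a much poorer fit. A search over candidate core sets would keep the assignment with the highest correlation.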

The following are four discrete core-periphery models suitable for categorical data:
