Here I share two articles with you and discuss some more advanced ideas. One was published in our old friend Oncotarget, and the other in the Journal of Proteome Research (IF = 4.1).
First, look at the Oncotarget article "The Difference of Genome Expression between Red Hair Individuals and Dark Hair Individuals Based on Bioinformatics Analysis". The paper is a bioinformatics analysis of the differentially expressed genes in individuals with two different pigmentation (hair color) phenotypes.
Variants of the MC1R gene are reported to produce two different phenotypes, and one of them, the red hair color (RHC) phenotype, carries a higher incidence of skin cancer. So which genes are affected by MC1R mutation? The authors compared the differential genes between the two phenotypes, RHC (red hair) and BHC (dark hair), separately in normal skin cells and in cancer cells, and analyzed them with a PPI (protein-protein interaction) network. The results showed no difference between the cancer cells, but 23 hub genes were screened from the normal skin cells, of which 8 were abnormally expressed. This suggests that abnormal expression of these 8 genes may be an important reason for the increased cancer risk in the RHC phenotype.
The paper combines three datasets and reaches a novel conclusion. The differential genes in GSE44805 are used to construct a PPI network and screen key genes, and the sequencing results in the other datasets are then used to verify that these genes really are abnormally expressed, which shows that the results of the bioinformatics analysis are reliable. Although the authors did no experiments at all, in terms of data volume and reliability this may be more convincing than running a small-sample sequencing experiment of one's own.
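The workflow just described (differential genes, then a PPI network, then hub genes, then a check against an independent dataset) can be sketched roughly as follows. This is only a minimal illustration and not the authors' pipeline: the gene names, the interaction list, the degree cutoff and the helper functions `build_network` and `hub_genes` are all hypothetical, and node degree is assumed as the hub criterion, which the paper may or may not have used.

```python
from collections import defaultdict

# Hypothetical toy inputs: none of these gene names or edges come from the paper.
deg_genes = {"G1", "G2", "G3", "G4", "G5"}   # differential genes from the discovery dataset
ppi_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
             ("G2", "G3"), ("G4", "G5"), ("G5", "G9")]  # G9 is not a differential gene
validation_degs = {"G1", "G3", "G7"}         # differential genes in an independent dataset


def build_network(genes, edges):
    """Keep only interactions whose two partners are both in `genes`."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a in genes and b in genes and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def hub_genes(adjacency, min_degree):
    """Return genes with at least `min_degree` partners, highest degree first."""
    hubs = [g for g, partners in adjacency.items() if len(partners) >= min_degree]
    return sorted(hubs, key=lambda g: (-len(adjacency[g]), g))


network = build_network(deg_genes, ppi_edges)
hubs = hub_genes(network, min_degree=2)      # assumed cutoff, for illustration only
validated = [g for g in hubs if g in validation_degs]
print("hub genes:", hubs)
print("hub genes also differential in the validation set:", validated)
```

In a real analysis the edge list would come from a PPI database and the gene sets from the differential expression results of each dataset; the point of the sketch is only that the validation step is a simple intersection once the hubs are known.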
The analysis methods in this paper (differential gene analysis and PPI analysis) are very familiar to us: screen out the differential genes, construct PPI networks for the up-regulated and down-regulated genes separately, and you get the four figures in this paper (in any case, these figures look far better than the ones in the last routine analysis article).
How these figures are constructed will not be described here.
Summary
The method in this paper can be borrowed and reproduced. The difficulty lies in finding enough similar, comparable datasets, finding a suitable entry point, and drawing a relatively novel conclusion.
Next, let's look at the article "Weighted Protein Interaction Network Analysis of Frontotemporal Dementia" in the Journal of Proteome Research.
As soon as I saw the flow chart, I felt this article was written by bioinformatics professionals. (When I was at university, it felt as if every student in our School of Life Sciences was a coder: people majoring in bioinformatics, biomedical engineering and biological science wrote code every day, and I couldn't sense anything of a biology major at all.)
What is this article about? First, 13 seed genes are selected, and the first-layer network of these 13 seed genes is constructed from the protein interactions recorded in a PPI database.
Then the first-layer network is taken as the seed to build the second-layer network (and then the computer crashes).
Next, the topology of the second-layer network is analyzed and the hub genes are screened out (the green dots in the figure represent the original 13 seed genes, and the blue dots represent the first-layer genes). As the number of genes grows during construction, the 13 seed genes selected at the start are not necessarily the hub genes at the end. A control group was also established, and the screening method for the 13 seed genes is described in detail. Because the whole analysis is built on existing information and has no experimental footing of its own, the study pays great attention to logical rigor.
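The layer-by-layer expansion can be illustrated with a small sketch. It assumes that each layer is simply the set of direct interaction partners of everything collected so far and that hubs are ranked by node degree; the paper's weighting and filtering steps are not reproduced here, and all gene names, edges and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy interaction list; S1 and S2 stand in for seed genes.
ppi_edges = [("S1", "A"), ("S1", "B"), ("S2", "B"),
             ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("C", "E")]
seeds = {"S1", "S2"}


def neighbors_of(genes, edges):
    """Return every gene that interacts directly with a gene in `genes`,
    excluding the genes already in the set."""
    found = set()
    for a, b in edges:
        if a in genes:
            found.add(b)
        if b in genes:
            found.add(a)
    return found - genes


def two_layer_network(seeds, edges):
    """Expand the seeds by two layers and count each node's degree
    inside the resulting network."""
    layer1 = neighbors_of(seeds, edges)
    layer2 = neighbors_of(seeds | layer1, edges)
    nodes = seeds | layer1 | layer2
    degree = defaultdict(int)
    for a, b in edges:
        if a in nodes and b in nodes:
            degree[a] += 1
            degree[b] += 1
    return layer1, layer2, dict(degree)


layer1, layer2, degree = two_layer_network(seeds, ppi_edges)
ranked = sorted(degree, key=lambda g: (-degree[g], g))
print("first layer:", sorted(layer1))
print("second layer:", sorted(layer2))
print("genes ranked by degree:", ranked)
```

In this toy network the two best-connected nodes are a first-layer gene and a second-layer gene rather than either seed, which is the same effect noted above: once the network grows, the seeds are not necessarily the hubs.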
Summary
I introduce this article because its idea can be borrowed for bioinformatics analysis articles. Seed genes can be selected by how frequently a gene is mutated in a clinical disease; a two-layer PPI network is then constructed and analyzed with GO and KEGG, thereby predicting new, previously unknown disease-related genes. If the expression can also be verified in other datasets or clinical samples, the content of the whole article will be richer.
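GO and KEGG analysis of a gene list is commonly based on an over-representation test, and the core of that test is a hypergeometric tail probability. The sketch below shows only that calculation, under the assumption that a one-sided hypergeometric test is used; the function name and the numbers in the example are hypothetical, and real tools also apply multiple-testing correction, which is omitted here.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """Probability of seeing at least `overlap` annotated genes when
    `selected` genes are drawn without replacement from `population`
    genes, of which `annotated` carry the term of interest."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(comb(annotated, k) * comb(population - annotated, selected - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical numbers: a background of 20 genes, 5 of them annotated to one
# pathway; the network contains 4 genes, 3 of which are in that pathway.
p_value = hypergeom_enrichment_p(population=20, annotated=5, selected=4, overlap=3)
print(f"enrichment p-value: {p_value:.4f}")
```

A small p-value means the term appears in the network more often than chance would predict, which is what makes the network genes sharing that term candidates for further verification.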
Limitation: many of the protein interactions recorded in PPI databases are in fact not meaningful, because they cannot occur under real physiological conditions and only happen with human intervention.