A Gene-disease Association Prediction Algorithm Based on Multi-source Data Fusion
Author: Fei Wang
Journal: International Journal Bioautomation (JCR Q4, Agricultural and Biological Sciences)
Published: 2022-03-01
DOI: https://doi.org/10.7546/ijba.2022.26.1.000870
Accurate gene-disease association prediction is the basis for effective diagnosis and treatment of complex genetic diseases. However, existing studies on this topic generally face two problems: the original data are large in volume and diverse in type, and the data are difficult to fuse. This paper therefore studies a gene-disease association prediction algorithm based on multi-source data fusion. First, it processes multi-dimensional gene phenotype data, analyzes the gene-disease associations of different phenotypes, and selects disease gene loci under multi-dimensional phenotypes. Then, it fuses multi-source data comprising gene expression data, gene sequence data, gene interaction data, and transcriptome sequencing data, and establishes the corresponding gene-disease association prediction model. Finally, experimental results verify the effectiveness of the constructed prediction model. According to the paper, the approach improves the low utilization of gene datasets, restores the main features of the datasets to the greatest extent possible, handles data noise reasonably, enhances the robustness of the model, and further improves the classification accuracy of disease-causing gene prediction.
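The abstract does not describe how the data sources are actually combined, so the following is only a minimal sketch of one common approach, feature-level (early) fusion, and should not be read as the paper's method. It assumes each source provides a numeric feature vector per gene; the function names (`zscore_columns`, `fuse_sources`), the source names, and the toy values are all hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a list of equal-length feature rows."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        # A constant column carries no information, so map it to 0.0.
        [(value - mu) / sigma if sigma > 0 else 0.0
         for value, (mu, sigma) in zip(row, stats)]
        for row in rows
    ]


def fuse_sources(sources):
    """Concatenate standardized features for genes present in every source.

    `sources` maps a source name to {gene_id: [feature values]}.
    Genes missing from any source are dropped (an assumption of this sketch).
    """
    shared = sorted(set.intersection(*(set(s) for s in sources.values())))
    fused = {gene: [] for gene in shared}
    for name in sorted(sources):  # fixed source order keeps columns aligned
        rows = [sources[name][gene] for gene in shared]
        for gene, row in zip(shared, zscore_columns(rows)):
            fused[gene].extend(row)
    return fused


# Toy, made-up values purely for illustration (not data from the paper).
toy_sources = {
    "expression": {"G1": [1.0, 4.0], "G2": [3.0, 4.0], "G3": [5.0, 9.0]},
    "interaction": {"G1": [0.0], "G2": [2.0]},
}
fused = fuse_sources(toy_sources)
```

Standardizing each source before concatenation keeps a source with large raw values from dominating the fused vector. The fused vectors could then be passed to any classifier that labels genes as disease-associated or not.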