Purpose: Despite the progress that next-generation sequencing technologies have achieved in diagnosing the genetic causes of rare Mendelian diseases, the current diagnostic rate is still far from satisfactory because of heterogeneity, imprecision, and noise in disease phenotype descriptions and insufficient use of expert knowledge in clinical genetics. To overcome these difficulties, we present a novel method called Xrare for the prioritization of causative gene variants in rare disease diagnosis. Methods: We propose a new phenotype similarity scoring method called Emission-Reception Information Content (ERIC), which is highly tolerant of noise and imprecision in clinical phenotypes. We incorporate medical genetics domain knowledge by designing genetic features that implement American College of Medical Genetics and Genomics (ACMG) guidelines. Results: The ERIC score ranked disease genes consistently higher than other phenotypic similarity scores did in the presence of imprecise and noisy phenotypes. Extensive simulations and real clinical data demonstrated that Xrare outperforms existing alternative methods by 10–40% in various genetic diagnosis scenarios. Conclusion: The Xrare model is learned from a large database of clinical variants and derives its strength from the tight integration of medical genetics features and phenotypic similarity scores. Xrare provides the clinical community with a robust and powerful tool for variant prioritization.
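The abstract does not define how ERIC is computed, so the following is not ERIC. It is only a minimal sketch of the standard information-content idea that phenotype similarity scores of this kind generally build on: a term's information content is IC(t) = -log p(t), and a Resnik-style similarity between two terms is the IC of their most informative common ancestor. The toy ontology, the disease annotations, and all function names are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical toy ontology: child term -> parent terms (not real HPO identifiers).
PARENTS = {
    "root": [],
    "neuro": ["root"],
    "seizure": ["neuro"],
    "focal_seizure": ["seizure"],
    "cardiac": ["root"],
}

# Hypothetical disease annotations used to estimate term frequencies.
ANNOTATIONS = {
    "disease_a": ["focal_seizure"],
    "disease_b": ["seizure"],
    "disease_c": ["cardiac"],
    "disease_d": ["neuro"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(fraction of diseases annotated with t or one of its descendants)."""
    counts = {t: 0 for t in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set()
        for t in terms:
            covered |= ancestors(t)
        for t in covered:
            counts[t] += 1
    n = len(ANNOTATIONS)
    return {t: -math.log(c / n) for t, c in counts.items() if c > 0}


def resnik(t1, t2, ic):
    """Resnik-style similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic[t] for t in common)


ic = information_content()
# An imprecise query term ("seizure" instead of "focal_seizure") still shares
# an informative ancestor with the precise term, so the score stays positive.
imprecise_score = resnik("seizure", "focal_seizure", ic)
# Terms from unrelated branches only share the root, whose IC is zero.
unrelated_score = resnik("cardiac", "focal_seizure", ic)
```

In this toy setting, a clinician recording the broader term still earns partial credit, which is the kind of tolerance to imprecision the abstract attributes to ERIC; how ERIC itself achieves this is described in the paper, not here.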
Objective: The high incidence of respiratory diseases dramatically increased the medical burden during the COVID-19 pandemic in 2020. It is therefore of considerable significance to use a new generation of information technology to raise the level of artificial intelligence applied to respiratory disease diagnosis. Methods: Based on the semi-structured data of Chinese Electronic Medical Records (CEMRs) from the China Hospital Pharmacovigilance System, this paper proposes a bi-level artificial intelligence model for the risk classification of acute respiratory diseases. The first level is a dedicated “BiLSTM+Dilated Convolution+3D Attention+CRF” deep learning model for Chinese Clinical Named Entity Recognition (CCNER), which extracts valuable information from the unstructured data in the CEMRs. Incorporating transfer learning and semi-supervised learning techniques into the proposed deep learning model achieves higher accuracy and efficiency in the CCNER task than the popular “BERT+BiLSTM+CRF” approach. The second level combines the extracted entity data with the other structured data in the CEMRs and applies a customized XGBoost model to classify the risk of acute respiratory diseases. Results: The empirical study shows that the proposed model could provide practical technical support for improving diagnostic accuracy. Conclusion: Our study provides a proof of concept for implementing a hybrid artificial intelligence-based system as a tool to aid clinicians in handling CEMR data and enhancing diagnostic evaluation under diagnostic uncertainty.
Poorly placed urban bus stops are a common real-world problem that seriously affects residents’ happiness and sense of belonging, as well as the city’s brand. However, existing research on this problem generally suffers from high technical complexity and high cost. We therefore propose a way to optimize the placement of urban public transportation stations using artificial intelligence algorithms, while reducing the technical complexity and cost of existing station optimization approaches. First, we extract and integrate bus GPS data and bus card swipe data from the business system and perform exploratory analysis on the pre-processed data. Second, we improve the original k-NN algorithm and propose an ik-NN algorithm to determine each cardholder’s boarding point. Then, we separate the upstream and downstream lines to calculate the total number of upstream and downstream passengers. Third, we propose an algorithm for calculating the number of passengers getting off at bus stations and calculate the number of passengers getting on and off at each bus station. Finally, from the number of passengers getting on and off at each bus station, we construct the origin-destination (OD) matrix, analyze residents’ travel patterns, and propose optimization suggestions for the placement of urban bus stations. Experiments use the public transit GPS data set and swipe card data set of Shenzhen, China. The experimental results show that: (1) Compared with K-means, the proposed ik-NN algorithm can effectively determine the actual boarding station of each cardholder and is less sensitive to feature dimensions. At the same time, the ik-NN algorithm has high operating efficiency and is less affected by the “[Formula: see text]” value. (2) The algorithm for calculating the number of passengers getting off at bus stations can effectively use the existing data of the business system to determine the number of passengers getting off at each bus station, with low computational cost and high accuracy. (3) The optimization suggestions for bus stations, based on the OD matrix analysis of residents’ travel patterns, meet the needs of urban development and have reference value.
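The two data-handling steps in this abstract, attaching a boarding stop to each card swipe and building an OD matrix from trips, can be sketched as follows. This is an illustration under stated assumptions, not the paper's ik-NN or alighting algorithm, neither of which the abstract specifies: the boarding stop is taken from the GPS record nearest in time to the swipe (a plain 1-nearest-neighbour match on timestamps), and the (origin, destination) pairs are assumed to be already known. The function names `nearest_stop` and `od_matrix` and all sample values are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def nearest_stop(swipe_time, gps_times, gps_stops):
    """Return the stop of the GPS record closest in time to a card swipe.

    Assumes gps_times is sorted ascending and aligned with gps_stops.
    This is plain 1-NN on timestamps, not the paper's ik-NN.
    """
    i = bisect_left(gps_times, swipe_time)
    # Only the records just before and just after the swipe can be nearest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(gps_times)]
    best = min(candidates, key=lambda j: abs(gps_times[j] - swipe_time))
    return gps_stops[best]


def od_matrix(trips, stops):
    """Count trips per (origin, destination) pair and lay them out as a matrix.

    Rows are origins and columns are destinations, both in the order of `stops`.
    """
    counts = Counter(trips)
    return [[counts[(o, d)] for d in stops] for o in stops]


# Hypothetical sample data: seconds since the start of the run, and stop names.
gps_times = [100, 220, 400]
gps_stops = ["A", "B", "C"]
boarding = nearest_stop(210, gps_times, gps_stops)

trips = [("A", "B"), ("A", "C"), ("A", "B"), ("B", "C")]
matrix = od_matrix(trips, ["A", "B", "C"])
boardings_per_stop = [sum(row) for row in matrix]
alightings_per_stop = [sum(col) for col in zip(*matrix)]
```

Row sums of the matrix give boardings per stop and column sums give alightings per stop, which are the per-station counts the abstract uses as the basis for its optimization suggestions.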