Data imbalance is frequently encountered in biomedical applications. Resampling techniques can be used in binary classification to tackle this issue. However, such solutions are undesirable when the number of samples in the minority class is limited. Moreover, the use of inadequate performance metrics, such as accuracy, leads to poor generalization because classifiers tend to predict the majority class. One good approach to this issue is to optimize performance metrics that are designed to handle data imbalance. The Matthews Correlation Coefficient (MCC) is widely used in bioinformatics as a performance metric. We are interested in developing a new classifier based on the MCC metric to handle imbalanced data. We derive an optimal Bayes classifier for the MCC metric using an approach based on the Fréchet derivative. We show that the proposed algorithm has the desirable theoretical property of consistency. Using simulated data, we verify the correctness of our optimality result by searching the space of all possible binary classifiers. The proposed classifier is evaluated on 64 datasets covering a wide range of data imbalance. We compare both classification performance and CPU efficiency for three classifiers: 1) the proposed algorithm (MCC-classifier), 2) the Bayes classifier with a default threshold (MCC-base), and 3) imbalanced SVM (SVM-imba). The experimental evaluation shows that MCC-classifier performs close to SVM-imba while being simpler and more efficient.
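To make the contrast between accuracy and MCC concrete, the following is a minimal sketch, not the authors' algorithm. It computes MCC from confusion-matrix counts and picks the decision threshold on classifier scores by brute-force search, whereas the abstract describes an analytically derived optimal classifier. The function names and the toy scores and labels are hypothetical and exist only for illustration.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def confusion(scores, labels, threshold):
    """Count (TP, FP, TN, FN), predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def best_mcc_threshold(scores, labels):
    """Brute-force search over observed scores (an illustrative stand-in,
    not the closed-form result described in the abstract)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: mcc(*confusion(scores, labels, t)))


# Hypothetical, imbalanced toy data: 2 positives out of 10 samples.
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.60]
labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

default_mcc = mcc(*confusion(scores, labels, 0.5))
tuned_threshold = best_mcc_threshold(scores, labels)
tuned_mcc = mcc(*confusion(scores, labels, tuned_threshold))
print(default_mcc, tuned_threshold, tuned_mcc)
```

On this toy data the default threshold of 0.5 misses one of the two positives, and a lower threshold gives a higher MCC. This is the kind of gap a default-threshold baseline can leave on imbalanced data.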
On the one hand, Support Vector Machines have met with significant success in solving difficult pattern recognition problems with global feature representations. On the other hand, local features in images have been shown to be suitable representations for efficient object recognition. It is therefore natural to try to combine the SVM approach with local feature representations to gain the advantages of both. In this paper we study the Mercer property of matching kernels, which mimic the classical matching algorithms used in techniques based on points of interest. We introduce a new statistical approach to kernel positiveness. We show that, despite the absence of an analytical proof of the Mercer property, we can provide bounds on the probability that the Gram matrix is actually positive definite for kernels in a large class of functions, under reasonable assumptions. A few experiments validate these results on object recognition tasks.
Kernel-based methods such as the Support Vector Machine (SVM) have provided successful tools for solving many recognition problems. One of the reasons for this success is the use of kernels. Positive definiteness has to be checked for kernels to be suitable for most of these methods. For instance, for SVM, the use of a positive definite kernel ensures that the optimization problem is convex and thus that the obtained solution is unique. An alternative class of kernels, called conditionally positive definite kernels, has been studied for a long time from a theoretical point of view but has drawn attention from the community only in the last decade. We propose a new kernel, named the log kernel, which seems particularly interesting for images. Moreover, we prove that this new kernel is conditionally positive definite, as is the power kernel. Finally, we show experimentally that using conditionally positive definite kernels allows us to outperform classical positive definite kernels.
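The difference between positive definiteness and conditional positive definiteness can be illustrated numerically. A kernel K is conditionally positive definite if the quadratic form sum_ij c_i c_j K(x_i, x_j) is nonnegative for every coefficient vector c whose entries sum to zero. The sketch below is a sampled sanity check, not a proof. It assumes the kernel forms K(x, y) = -||x - y||^beta for the power kernel and K(x, y) = -log(1 + ||x - y||^beta) for the log kernel, with 0 < beta <= 2. These formulas are not stated in the abstract, and the helper names are hypothetical.

```python
import math
import random


def power_kernel(x, y, beta=1.0):
    # Assumed form: K(x, y) = -||x - y||^beta
    return -(math.dist(x, y) ** beta)


def log_kernel(x, y, beta=1.0):
    # Assumed form: K(x, y) = -log(1 + ||x - y||^beta)
    return -math.log(1.0 + math.dist(x, y) ** beta)


def gram(kernel, points):
    """Gram matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def quad_form(K, c):
    """Quadratic form c^T K c."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))


def zero_sum_vector(n, rng):
    """Random coefficient vector whose entries sum to zero."""
    c = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(c) / n
    return [v - mean for v in c]


def min_form_on_zero_sum(K, trials, rng):
    """Smallest quadratic form seen over random zero-sum vectors."""
    n = len(K)
    return min(quad_form(K, zero_sum_vector(n, rng)) for _ in range(trials))


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(12)]

for name, kernel in [("power", power_kernel), ("log", log_kernel)]:
    K = gram(kernel, points)
    unconstrained = quad_form(K, [1.0, 1.0] + [0.0] * (len(points) - 2))
    constrained = min_form_on_zero_sum(K, 200, rng)
    # unconstrained < 0 shows the kernel is not positive definite;
    # constrained >= 0 is consistent with conditional positive definiteness.
    print(name, unconstrained, constrained)
```

A nonnegative minimum over sampled zero-sum vectors is consistent with conditional positive definiteness but does not establish it. The negative value for the unconstrained vector shows that neither kernel is positive definite in the ordinary sense.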
Aberrant metabolism is the root cause of several serious health issues, placing a huge burden on health and leading to diminished life expectancy. A dysregulated metabolism induces the secretion of several molecules, which in turn trigger the inflammatory pathway. Inflammation is the natural reaction of the immune system to a variety of stimuli, such as pathogens, damaged cells, and harmful substances. Metabolically triggered inflammation, also called metaflammation or low-grade chronic inflammation, is the consequence of a synergistic interaction between the host and the exposome, a combination of environmental drivers that includes diet, lifestyle, pollutants and other factors throughout the life span of an individual. Various levels of chronic inflammation are associated with several lifestyle-related diseases such as diabetes, obesity, metabolic-associated fatty liver disease (MAFLD), cancers, cardiovascular disorders (CVDs), autoimmune diseases, and chronic lung diseases. Chronic diseases are a growing concern worldwide, placing a heavy burden on individuals, families, governments, and health-care systems. New strategies are needed to empower communities worldwide to prevent and treat these diseases. Precision medicine provides a model for the next generation of lifestyle modification. This will capitalize on the dynamic interaction between an individual’s biology, lifestyle, behavior, and environment. The aim of precision medicine is to design and improve diagnosis, therapeutics and prognostication through the use of large complex datasets that incorporate individual gene, function, and environmental variations. The implementation of high-performance computing (HPC) and artificial intelligence (AI) can predict risks with greater accuracy based on available multidimensional clinical and biological datasets. AI-powered precision medicine provides clinicians with an opportunity to specifically tailor early interventions to each individual.
In this article, we discuss the strengths and limitations of existing and emerging data-driven technologies, such as AI, in preventing, treating and reversing lifestyle-related diseases.
Genetic etiologies of chronic mucocutaneous candidiasis (CMC) disrupt human IL-17A/F–dependent immunity at mucosal surfaces, whereas those of connective tissue disorders (CTDs) often impair the TGF-β–dependent homeostasis of connective tissues. The signaling pathways involved are incompletely understood. We report a three-generation family with an autosomal dominant (AD) combination of CMC and a previously undescribed form of CTD that clinically overlaps with Ehlers-Danlos syndrome (EDS). The patients are heterozygous for a private splice-site variant of MAPK8, the gene encoding c-Jun N-terminal kinase 1 (JNK1), a component of the MAPK signaling pathway. This variant is loss-of-expression and loss-of-function in the patients’ fibroblasts, which display AD JNK1 deficiency by haploinsufficiency. These cells have impaired, but not abolished, responses to IL-17A and IL-17F. Moreover, the development of the patients’ TH17 cells was impaired ex vivo and in vitro, probably due to the involvement of JNK1 in the TGF-β–responsive pathway and further accounting for the patients’ CMC. Consistently, the patients’ fibroblasts displayed impaired JNK1- and c-Jun/ATF-2–dependent induction of key extracellular matrix (ECM) components and regulators, but not of EDS-causing gene products, in response to TGF-β. Furthermore, they displayed a transcriptional pattern in response to TGF-β different from that of fibroblasts from patients with Loeys-Dietz syndrome caused by mutations of TGFBR2 or SMAD3, further accounting for the patients’ complex and unusual CTD phenotype. This experiment of nature indicates that the integrity of the human JNK1-dependent MAPK signaling pathway is essential for IL-17A– and IL-17F–dependent mucocutaneous immunity to Candida and for the TGF-β–dependent homeostasis of connective tissues.
We compared the performance of several prediction techniques for breast cancer prognosis, based on the area under the ROC curve (AU-ROC) for different prognosis periods. The analyzed dataset contained 1,981 patients, and from an initial 25 variables the 11 most common clinical predictors were retained. We compared eight models from a wide spectrum of predictive models, namely Generalized Linear Model (GLM), GLM-Net, Partial Least Squares (PLS), Support Vector Machines (SVM), Random Forests (RF), Neural Networks, k-Nearest Neighbors (k-NN) and Boosted Trees. To compare these models, a paired t-test was applied to the model performance differences obtained from data resampling. Random Forests, Boosted Trees, Partial Least Squares and GLM-Net have superior overall performance, although their scores are only slightly higher than those of the other models. The comparative analysis also allowed us to define a relative variable importance as the average of the variable importance from the different models. Two sets of variables are identified from this analysis. The first includes the number of positive lymph nodes, tumor size, cancer grade and estrogen receptor status, all of which have an important influence on model predictability. The second set includes variables related to histological parameters and treatment types. The short-term versus long-term contributions of the clinical variables are also analyzed from the comparative models. Among the various cancer treatment plans, the combination of chemotherapy and radiotherapy has the largest impact on cancer prognosis.
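The model comparison step can be sketched as follows. This is a minimal illustration of a paired t-test on per-resample AU-ROC differences between two models, not the authors' analysis pipeline. The AU-ROC values are made-up placeholders, not results from the study, and the function name is hypothetical.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for two models
    evaluated on the same resamples."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Placeholder AU-ROC values for two models on five shared resamples
# (illustrative numbers only, not taken from the study).
auc_model_a = [0.74, 0.76, 0.75, 0.77, 0.73]
auc_model_b = [0.72, 0.75, 0.72, 0.74, 0.72]

t_stat, dof = paired_t_statistic(auc_model_a, auc_model_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

The statistic is then compared against a t distribution with the reported degrees of freedom. Scores from overlapping resamples are not independent, so the resulting p-value should be read with some caution.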
A potential role for the long-chain acyl-CoA synthetase family member 1 (ACSL1) in the immunobiology of sepsis was explored during a hands-on training workshop. Participants first assessed the robustness of the potential gap in biomedical knowledge identified via an initial screen of public transcriptome data and of the literature associated with ACSL1. An increase in ACSL1 transcript abundance during sepsis was confirmed in several independent datasets. Querying the ACSL1 literature also confirmed the absence of reports associating ACSL1 with sepsis. Inferences drawn from both the literature (via indirect associations) and public transcriptome data (via correlation) point to the likely participation of ACSL1 and ACSL4, another family member, in inflammasome activation in neutrophils during sepsis. Furthermore, available clinical data indicate that levels of ACSL1 and ACSL4 induction were significantly higher in fatal cases of sepsis. This denotes potential translational relevance and is consistent with involvement in pathways driving potentially deleterious systemic inflammation. Finally, while ACSL1 expression was induced in blood in vitro by a wide range of pathogen-derived factors as well as TNF, induction of ACSL4 appeared restricted to flagellated bacteria and pathogen-derived TLR5 agonists and IFNG. Taken together, this joint review of public literature and omics data records points to two members of the acyl-CoA synthetase family potentially playing a role in inflammasome activation in neutrophils. The translational relevance of these observations in the context of sepsis and other inflammatory conditions remains to be investigated.