Missing data is a widespread problem that can affect the ability to use data to construct effective prediction systems. We investigate a common machine learning technique that can tolerate missing values, namely C4.5, to predict cost using six real-world software project databases. We analyze the predictive performance after using the k-NN missing data imputation technique to see whether it is better to tolerate missing data or to impute missing values and then apply the C4.5 algorithm. For the investigation, we simulated 3 missingness mechanisms, 3 missing data patterns, and 5 missing data percentages. We found that k-NN imputation can improve the prediction accuracy of C4.5. At the same time, both C4.5 and k-NN are little affected by the missingness mechanism, but the missing data pattern and the missing data percentage have a strong negative impact on prediction (or imputation) accuracy, particularly if the missing data percentage exceeds 40%.
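The imputation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the distance measure (Euclidean over the features both rows have observed, rescaled by the fraction of shared features), the choice of k = 2, the function name `knn_impute`, and the toy project data are all assumptions made for illustration. The C4.5 step is not shown.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Hypothetical helper: the abstract does not specify the distance or k.
    """

    def distance(a, b):
        # Euclidean distance over features observed in both rows, rescaled
        # so that rows sharing fewer features are not unfairly favoured.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * len(a) / len(shared))

    completed = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which feature j is observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                new_row[j] = sum(v for _, v in nearest) / len(nearest)
        completed.append(new_row)
    return completed


# Toy data (invented): [team size, duration, effort]; None marks a missing value.
projects = [
    [10.0, 2.0, 100.0],
    [11.0, 2.0, 110.0],
    [50.0, 9.0, 900.0],
    [10.5, None, 105.0],
]
filled = knn_impute(projects, k=2)
```

The completed rows could then be passed to any decision tree learner. Comparing its accuracy against a learner trained on the incomplete data is the kind of comparison the abstract reports.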
A high-throughput, QuEChERS (Quick, Easy, Cheap, Effective, Rugged, Safe) sample preparation and liquid chromatography-tandem mass spectrometry (LC-MS/MS) analytical method has been developed and validated for the determination of 191 pesticides in vegetation and fruit samples. Using identical LC analytical column and MS/MS instrumentation and operation parameters, this method was evaluated at the U.S. Food and Drug Administration (FDA), National Research Centre for Grapes (NRCG), India, and Ontario Ministry of the Environment (MOE) laboratories. Method validation results showed that all but 1 of these 191 pesticides can be analyzed by LC-MS/MS with instrument detection limits (IDL) in the parts per trillion (ppt) range. Matrix-dependent IDL studies showed that due to either the low ionization efficiency or matrix effect exerted, 14 of these 191 pesticides could not be analyzed by this method. Method recovery (%R) and method detection limits (MDLs) were determined by the three laboratories using four sample matrices in replicates (N = 4). With >79% of %R data from the fortification studies in the range from 80 to 120%, MDLs were determined in the low parts per billion range with >94% of MDLs in the range from 0.5 to 5 ppb. Applying this method to the analysis of incurred samples showed that two multiple reaction monitoring (MRM) transitions may not be enough to provide 100% true positive identification of target pesticides; however, quantitative results obtained from the three laboratories had an excellent match with only a few discrepancies in the low parts per billion levels. The %R data from the fortification studies were subjected to principal component analysis and showed the majority of %R fell into the cluster of 80% < %R < 120%. Due to the matrix effect exerted by ginseng and peach, outliers were observed at the lowest spiking levels of 10 and 25 ppb. 
The study also showed that QuEChERS samples should be analyzed as soon as they are prepared, or stored in a freezer, to avoid any adverse effect on the analytes evaluated.
Background: Many clinicians do not encourage breastfeeding in hepatitis B virus (HBV) carriers, since HBV DNA can be detected in breast milk and breast lesions may increase exposure of infants to HBV. The aim of this study was to determine whether breastfeeding adds to the risk of perinatal HBV transmission. Methodology/Principal Findings: A total of 546 children (1–7 years old) of 544 HBV-infected mothers were investigated, with 397 breastfed and 149 formula-fed; 137 were born to HBeAg-positive mothers. All children had been vaccinated against hepatitis B, but only 53.3% received hepatitis B immune globulin (HBIG). The overall prevalence of HBsAg+, HBsAg−/anti-HBc+, and anti-HBs (≥10 mIU/ml) in children was 2.4%, 3.1%, and 71.6%, respectively. The HBsAg prevalence in breastfed and formula-fed children was 1.5% and 4.7%, respectively (P = 0.063); the difference was likely due to the higher maternal HBeAg-positive rate in the formula-fed group (formula-fed 49.0% vs. breastfed 15.9%, P<0.001). Further logistic regression analyses showed that breastfeeding was not associated with HBV infection in the children after adjusting for maternal HBeAg status and other factors that differed between the two groups. Conclusions/Significance: Under the recommended prophylaxis, breastfeeding is not a risk factor for mother-to-child transmission of HBV. Therefore, clinicians should encourage HBV-infected mothers to breastfeed their infants.
We studied the relationship between individuals' group social capital and their lending outcomes in the online peer-to-peer financial credit market, where individual lenders make direct unsecured microloans to other individual borrowers. Despite its ability to facilitate economic exchange, social capital as a public good may also cause free-rider problems, particularly in an online environment. Based on analyses of transaction data collected from one of the largest online peer-to-peer lending platforms in the U.S., we found that the borrower's general group social capital (i.e., group membership) and relational social capital (i.e., group credibility and verifiability, and group trust) yielded inconsistent effects on, and the borrower's structural social capital (i.e., group inclusiveness) had a negative impact on, his/her funding and repayment performance. We discuss the implications of our findings for reconciling two major but conflicting theoretical views of social capital and for improving institutional mechanism design in a decentralized online financial credit market.
Drug-target interaction (DTI) prediction has drawn increasing interest due to its substantial position in the drug discovery process. Many studies have introduced computational models that treat DTI prediction as a regression task, directly predicting the binding affinity of drug-target pairs. However, existing studies (i) ignore the essential correlations between atoms when encoding drug compounds and (ii) model the interaction of drug-target pairs simply by concatenation. Based on these observations, we propose an end-to-end model with multiple attention blocks to predict the binding affinity scores of drug-target pairs. Our proposed model can (i) encode the correlations between atoms with a relation-aware self-attention block and (ii) model the interaction of drug representations and target representations with a multi-head attention block. Experimental results of DTI prediction on two benchmark datasets show that our approach outperforms existing methods, benefiting from the correlation information encoded by the relation-aware self-attention block and the interaction information extracted by the multi-head attention block. Moreover, we conduct experiments on the effect of the maximum relative position length and find that the best values are $k \in \{3, 5\}$. Furthermore, we apply our model to predict the binding affinity of Corona Virus Disease 2019 (COVID-19)-related genome sequences and $3137$ FDA-approved drugs.
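The role of the maximum relative position length $k$ can be illustrated with a small sketch. It assumes, without confirmation from the abstract, that the relation-aware self-attention block follows the common relative-position formulation, in which the offset between positions $i$ and $j$ is clipped to $[-k, k]$ and then used to look up one of $2k + 1$ learned relation embeddings. The function name `clipped_relative_positions` is hypothetical.

```python
def clipped_relative_positions(length, k):
    """Return a length x length table of relation indices in [0, 2k].

    Entry [i][j] is the offset j - i clipped to [-k, k] and shifted by k,
    so it can index a table of 2k + 1 relation embeddings (assumed design).
    """
    return [
        [max(-k, min(k, j - i)) + k for j in range(length)]
        for i in range(length)
    ]


# Example: a 5-token drug sequence with k = 2.
table = clipped_relative_positions(5, 2)
distinct_relations = {index for row in table for index in row}
```

Under this assumption, tokens more than $k$ positions apart share a single relation index, so a small $k$ limits how finely the block can distinguish distant atoms while keeping the number of relation embeddings small.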
As a core subunit of the SCF complex that promotes protein degradation through the 26S proteasome, S-phase kinase-associated protein 1 (SKP1) plays important roles in multiple cellular processes in eukaryotes, including gibberellin (GA), jasmonate, ethylene, auxin and light responses. P7-2 encoded by Rice black streaked dwarf virus (RBSDV), a devastating viral pathogen that causes severe symptoms in infected plants, interacts with SKP1 from different plants. However, whether RBSDV P7-2 forms an SCF complex and targets host proteins is poorly understood. In this study, we conducted yeast two-hybrid assays to further explore the interactions between P7-2 and 25 type I Oryza sativa SKP1-like (OSK) proteins, and found that P7-2 interacted with eight OSK members with different binding affinities. Co-immunoprecipitation assays further confirmed the interaction of P7-2 with OSK1, OSK5 and OSK20. It was also shown that P7-2, together with OSK1 and O. sativa Cullin-1, was able to form the SCF complex. Moreover, yeast two-hybrid assays revealed that P7-2 interacted with gibberellin insensitive dwarf2 (GID2) from rice and maize plants, which is essential for regulating the GA signaling pathway. It was further demonstrated that the N-terminal region of P7-2 was necessary for the interaction with GID2. Overall, these results indicate that P7-2 functions as a component of the SCF complex in rice, and the interaction of P7-2 with GID2 implies possible roles of the GA signaling pathway during RBSDV infection.