An extensive simulation study has been performed comparing cross-validation, resubstitution and bootstrap estimation for three popular classification rules, namely linear discriminant analysis, 3-nearest-neighbor and decision trees (CART), using both synthetic and real breast-cancer patient data. Comparison is via the distribution of differences between the estimated and true errors. Various statistics for the deviation distribution have been computed: mean (for estimator bias), variance (for estimator precision), root-mean-square error (which combines bias and variance) and quartile ranges, including outlier behavior. In general, while cross-validation error estimation is much less biased than resubstitution, it displays excessive variance, which makes individual estimates unreliable for small samples. Bootstrap methods provide improved performance with respect to variance, but at a high computational cost and often with increased bias (albeit much less than with resubstitution).
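The deviation-distribution comparison described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the study's actual protocol: the one-dimensional two-Gaussian data model, the sample sizes, and all function names are hypothetical, and only resubstitution and leave-one-out cross-validation for a 3-nearest-neighbor rule are covered (bootstrap is omitted for brevity). The true error is approximated on a large independent test sample.

```python
import random
import statistics

def sample(n, rng, shift=1.0):
    """Draw n labelled 1-D points: class 0 ~ N(0, 1), class 1 ~ N(shift, 1)."""
    return [(rng.gauss(shift * (i % 2), 1.0), i % 2) for i in range(n)]

def knn_predict(train, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def error_rate(train, test, k=3):
    """Fraction of test points misclassified by the rule built on train."""
    return sum(knn_predict(train, x, k) != y for x, y in test) / len(test)

def resubstitution(train, k=3):
    """Error measured on the same points used to build the classifier."""
    return error_rate(train, train, k)

def leave_one_out(train, k=3):
    """Each point is classified by the rule built on the remaining points."""
    wrong = 0
    for i, (x, y) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        wrong += knn_predict(rest, x, k) != y
    return wrong / len(train)

def deviation_summary(estimator, n=20, trials=200, test_size=500, seed=0):
    """Summarise the distribution of (estimated error - true error)."""
    rng = random.Random(seed)
    devs = []
    for _ in range(trials):
        train = sample(n, rng)
        test = sample(test_size, rng)  # stands in for the true error
        devs.append(estimator(train) - error_rate(train, test))
    return {
        "bias": statistics.mean(devs),
        "variance": statistics.pvariance(devs),
        "rms": (sum(d * d for d in devs) / len(devs)) ** 0.5,
    }

if __name__ == "__main__":
    for name, est in [("resubstitution", resubstitution),
                      ("leave-one-out", leave_one_out)]:
        print(name, deviation_summary(est))
```

Using the same seed for both estimators means they are compared on identical training samples, so differences in the summaries reflect the estimators rather than sampling luck.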
Background: The complement system, a key component that links the innate and adaptive immune responses, has three pathways: the classical, lectin, and alternative pathways. In the present study, we have analyzed the levels of various complement components in blood samples from dengue fever (DF) and dengue hemorrhagic fever (DHF) patients and found that the level of complement activation is associated with disease severity. Methods and Results: Patients with DHF had lower levels of complement factor 3 (C3; p = 0.002) and increased levels of C3a, C4a and C5a (p<0.0001) when compared to those with the less severe form, DF. There were no significant differences between DF and DHF patients in the levels of C1q, immunocomplexes (CIC-C1q) and CRP. However, small but statistically significant differences were detected in the levels of MBL. In contrast, the levels of two regulatory proteins of the alternative pathway varied widely between DF and DHF patients: DHF patients had higher levels of factor D (p = 0.01), which cleaves factor B to yield the active (C3bBb) C3 convertase, and lower levels of factor H (p = 0.03), which inactivates the (C3bBb) C3 convertase, than did DF patients. When we considered the levels of factors D and H together as an indicator of (C3bBb) C3 convertase regulation, we found that the plasma levels of these regulatory proteins in DHF patients favored the formation of the (C3bBb) C3 convertase, whereas its formation was inhibited in DF patients (p<0.0001). Conclusion: The data suggest that an imbalance in the levels of regulatory factors D and H is associated with an abnormal regulation of complement activity in DHF patients.
Background: RNA-Seq is a recently developed high-throughput sequencing technology for profiling the entire transcriptome in any organism. It has several major advantages over current hybridization-based approaches such as microarrays. However, the cost per sample of RNA-Seq is still prohibitive for most laboratories. With continued improvement in sequence output, it would be cost-effective to multiplex several samples and sequence them in a single lane, provided transcriptome coverage remains sufficient. The objective of this analysis is to evaluate what sequencing depth might be sufficient for gene expression profiling in the chicken by RNA-Seq. Results: Two cDNA libraries from chicken lungs were sequenced initially, and 4.9 million (M) and 1.6 M (60 bp) reads were generated, respectively. With significant improvements in sequencing technology, two technical replicate cDNA libraries were re-sequenced. Totals of 29.6 M and 28.7 M (75 bp) reads were obtained with the two samples. More than 90% of annotated genes were detected in the data sets with 28.7-29.6 M reads, while only 68% of genes were detected in the data set with 1.6 M reads. The correlation coefficients of gene expression between technical replicates within the same sample were 0.9458 and 0.8442. To evaluate the appropriate depth needed for mRNA profiling, a random sampling method was used to generate different numbers of reads from each sample. There was a significant increase in correlation coefficients from a sequencing depth of 1.6 M to 10 M for all genes except highly abundant genes. No significant improvement was observed from the depth of 10 M to 20 M (75 bp) reads. Conclusion: The analysis from the current study demonstrated that 30 M (75 bp) reads are sufficient to detect all annotated genes in chicken lungs. Ten million (75 bp) reads could detect about 80% of annotated chicken genes, and RNA-Seq at this depth can serve as a replacement for microarray technology. Furthermore, the depth of sequencing had a significant impact on measuring the expression of low-abundance genes. Finally, the combination of experimental and simulation approaches is a powerful way to address the relationship between the depth of sequencing and transcriptome coverage.
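The random-sampling idea, drawing subsets of reads at several depths and asking how many genes are still detected, can be sketched as follows. This is a toy illustration rather than the study's pipeline: the log-normal abundance model, the gene names, and every function name are assumptions made for the example, and only gene detection (not the replicate correlation analysis) is shown.

```python
import random

def toy_abundance(n_genes, rng):
    """Hypothetical skewed expression levels (log-normal), for illustration only."""
    return {f"gene{i}": rng.lognormvariate(0.0, 2.0) for i in range(n_genes)}

def simulate_library(abundance, depth, rng):
    """Draw `depth` reads; each read is represented by the gene it came from."""
    genes = list(abundance)
    weights = [abundance[g] for g in genes]
    return rng.choices(genes, weights=weights, k=depth)

def detection_curve(reads, depths, n_genes, rng):
    """Fraction of genes with at least one read at each subsampled depth.

    The reads are shuffled once and prefixes are taken, so each shallower
    subsample is a random subset nested inside the deeper ones.
    """
    shuffled = rng.sample(reads, len(reads))
    return {d: len(set(shuffled[:d])) / n_genes for d in depths}

if __name__ == "__main__":
    rng = random.Random(0)
    abundance = toy_abundance(500, rng)
    reads = simulate_library(abundance, 20000, rng)
    for depth, frac in detection_curve(reads, [200, 2000, 20000], 500, rng).items():
        print(f"{depth:>6} reads: {frac:.1%} of genes detected")
```

Because the subsamples are nested, the detected fraction can only stay the same or grow with depth; low-abundance genes are the ones that appear last.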
Physics-Informed Neural Networks (PINNs) have emerged recently as a promising application of deep neural networks to the numerical solution of nonlinear partial differential equations (PDEs). However, the original PINN algorithm is known to suffer from stability and accuracy problems in cases where the solution has sharp spatio-temporal transitions. These "stiff" PDEs require an unreasonably large number of collocation points to be solved accurately. It has been recognized that adaptive procedures are needed to force the neural network to fit accurately the stubborn spots in the solution of stiff PDEs. To accomplish this, previous approaches have used fixed weights hard-coded over regions of the solution deemed to be important. In this paper, we propose a fundamentally new method to train PINNs adaptively, where the adaptation weights are fully trainable, so the neural network learns by itself which regions of the solution are difficult and is forced to focus on them, which is reminiscent of the soft multiplicative-mask attention mechanisms used in computer vision. The basic idea behind these Self-Adaptive PINNs is to make the weights increase where the corresponding loss is higher, which is accomplished by training the network to simultaneously minimize the losses and maximize the weights, i.e., to find a saddle point in the cost surface. We show that this is formally equivalent to solving a PDE-constrained optimization problem using a penalty-based method, though in a way where the monotonically nondecreasing penalty coefficients are trainable. In numerical experiments with an Allen-Cahn "stiff" PDE, the Self-Adaptive PINN outperformed other state-of-the-art PINN algorithms in L2 error by a wide margin, while using a smaller number of training epochs. An Appendix contains additional results with Burgers' and Helmholtz PDEs, which confirmed the trends observed in the Allen-Cahn experiments.
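The saddle-point training idea (gradient descent on the model parameters, gradient ascent on per-point weights) can be shown on a deliberately tiny problem. The sketch below is not the paper's implementation: a two-parameter linear model stands in for the neural network, a plain regression residual stands in for the PDE residual, the weights enter the loss through an identity mask, and the function name and step sizes are hypothetical choices for the example.

```python
def train_self_adaptive(xs, ys, steps=200, lr_theta=0.01, lr_lambda=0.02):
    """Min-max training of L(theta, lam) = sum_i lam_i * r_i(theta)**2.

    theta is updated by gradient descent, lam by gradient ascent.
    """
    theta = [0.0, 0.0]      # toy model: u(x) = theta[0] + theta[1] * x
    lam = [1.0] * len(xs)   # one trainable weight per training point
    for _ in range(steps):
        # Residual at every point (stand-in for a PDE residual).
        res = [theta[0] + theta[1] * x - y for x, y in zip(xs, ys)]
        # Gradient of the weighted loss with respect to theta.
        g0 = sum(2.0 * l * r for l, r in zip(lam, res))
        g1 = sum(2.0 * l * r * x for l, r, x in zip(lam, res, xs))
        theta = [theta[0] - lr_theta * g0, theta[1] - lr_theta * g1]
        # dL/dlam_i = r_i**2 >= 0, so ascent never decreases a weight and
        # weights grow fastest where the residual is largest.
        lam = [l + lr_lambda * r * r for l, r in zip(lam, res)]
    return theta, lam

if __name__ == "__main__":
    xs = [-1.0 + 0.2 * i for i in range(11)]
    ys = [x * x for x in xs]  # a target the linear model cannot fit exactly
    theta, lam = train_self_adaptive(xs, ys)
    for x, l in zip(xs, lam):
        print(f"x = {x:+.1f}  weight = {l:.3f}")
```

In this toy setting the weights end up largest at points the model fits worst, which mirrors the described behavior of focusing training on difficult regions. The step sizes are kept small because the growing weights also increase the effective step on the model parameters.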
Background: We report the detailed development of biomarkers to predict the clinical outcome under dengue infection. Transcriptional signatures from purified peripheral blood mononuclear cells were derived from whole-genome gene-expression microarray data, validated by quantitative PCR and tested in independent samples. Methodology/Principal Findings: The study was performed on patients of a well-characterized dengue cohort from Recife, Brazil. The samples analyzed were collected prospectively from acute febrile dengue patients who evolved with different degrees of disease severity: classic dengue fever or dengue hemorrhagic fever (DHF) samples were compared with similar samples from other non-dengue febrile illnesses. The DHF samples were collected 2–3 days before the presentation of the plasma leakage symptoms. Differentially-expressed genes were selected by univariate statistical tests as well as multivariate classification techniques. The results showed that at early stages of dengue infection, the genes involved in effector mechanisms of the innate immune response presented a weaker activation in patients who later developed hemorrhagic fever, whereas the genes involved in apoptosis were expressed at higher levels. Conclusions/Significance: Some of the gene expression signatures displayed estimated accuracy rates of more than 95%, indicating that expression profiling with these signatures may provide a useful means of DHF prognosis at early stages of infection.
Partially-observed Boolean dynamical systems (POBDS) are a general class of nonlinear models with application in estimation and control of Boolean processes based on noisy and incomplete measurements. The optimal minimum mean square error (MMSE) algorithms for POBDS state estimation, namely, the Boolean Kalman filter (BKF) and Boolean Kalman smoother (BKS), are intractable in the case of large systems, due to computational and memory requirements. To address this, we propose approximate MMSE filtering and smoothing algorithms based on the auxiliary particle filter (APF) method from sequential Monte-Carlo theory. These algorithms are used jointly with maximum-likelihood (ML) methods for simultaneous state and parameter estimation in POBDS models. In the presence of continuous parameters, ML estimation is performed using the expectation-maximization (EM) algorithm; we develop for this purpose a special smoother which reduces the computational complexity of the EM algorithm. The resulting particle-based adaptive filter is applied to a POBDS model of Boolean gene regulatory networks observed through noisy RNA-Seq time series data, and performance is assessed through a series of numerical experiments using the well-known cell cycle gene regulatory model.
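To make the particle-based state estimation concrete, here is a minimal sketch of a plain bootstrap (sequential importance resampling) particle filter for a small Boolean system. It is a simpler relative of the auxiliary particle filter named above, not the proposed algorithm itself, and it does no smoothing or parameter estimation. The 3-gene update rule, the bit-flip process noise, the Gaussian observation model, and all names are assumptions chosen for illustration.

```python
import math
import random

def step(state):
    """Hypothetical 3-gene Boolean network update (not taken from the source)."""
    a, b, c = state
    return (int(not c), a, int(a and b))

def noisy_step(state, p, rng):
    """Apply the network, then flip each bit independently with probability p."""
    return tuple(bit ^ (rng.random() < p) for bit in step(state))

def likelihood(obs, state, sigma):
    """Gaussian likelihood (up to a constant) of a noisy reading of each bit."""
    sq = sum((y - x) ** 2 for y, x in zip(obs, state))
    return math.exp(-sq / (2.0 * sigma ** 2))

def simulate(T, p=0.05, sigma=0.3, seed=1):
    """Generate a hidden Boolean trajectory and its noisy observations."""
    rng = random.Random(seed)
    state, states, observations = (0, 0, 0), [], []
    for _ in range(T):
        state = noisy_step(state, p, rng)
        states.append(state)
        observations.append(tuple(x + rng.gauss(0.0, sigma) for x in state))
    return states, observations

def particle_filter(observations, n_particles=200, p=0.05, sigma=0.3, seed=0):
    """Bootstrap particle filter returning a Boolean state estimate per step."""
    rng = random.Random(seed)
    particles = [tuple(rng.randint(0, 1) for _ in range(3))
                 for _ in range(n_particles)]
    estimates = []
    for obs in observations:
        # Propagate every particle through the noisy dynamics.
        particles = [noisy_step(s, p, rng) for s in particles]
        # Weight particles by how well they explain the observation.
        weights = [likelihood(obs, s, sigma) for s in particles]
        total = sum(weights)
        if total == 0.0:  # numerical safeguard: fall back to uniform weights
            weights, total = [1.0] * n_particles, float(n_particles)
        # Approximate posterior mean of each bit, thresholded at 1/2.
        means = [sum(w * s[j] for w, s in zip(weights, particles)) / total
                 for j in range(3)]
        estimates.append(tuple(int(m > 0.5) for m in means))
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

if __name__ == "__main__":
    states, observations = simulate(50)
    estimates = particle_filter(observations)
    hits = sum(e == x for est, st in zip(estimates, states)
               for e, x in zip(est, st))
    print("per-bit accuracy:", hits / (3 * len(states)))
```

Thresholding each bit's weighted mean at 1/2 is the particle approximation of the bitwise MMSE estimate for a Boolean state. The memory cost depends on the number of particles rather than on the number of possible states, which is the motivation for particle methods in large systems.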
It was hypothesized that a high-concentrate diet fed during early calfhood alters the expression of genes within the arcuate nucleus that subserve reproductive competence. Beef heifers (n = 12) were weaned at approximately 3 mo of age, and after acclimation, were allocated randomly to 1 of 2 nutritional groups: 1) High Concentrate/High Gain (HC/HG), a high concentrate diet fed to promote a gain of 0.91 kg/d; or 2) High Forage/Low Gain (HF/LG), a forage-based diet fed to promote a gain of 0.45 kg/d. Experimental diets were fed under controlled intake for 91 d. At the end of 91 d, heifers were slaughtered by humane procedures, blood samples were collected, brains were removed, liver weights were determined, and rumen fluid was collected for VFA analyses. Tissue blocks containing the hypothalamus were dissected from the brains, frozen, and cut using a cryostat, and frozen sections were mounted on slides. Tissue from the arcuate nucleus (ARC) was dissected from sections for mRNA extraction. Microarray analysis was used to assess genome-wide transcription in the ARC using a 60-mer oligonucleotide 44K bovine expression array. The ADG was greater (P < 0.001) in heifers fed the HC/HG diet than in heifers fed the HF/LG diet. At slaughter, mean propionate to acetate ratios in the ruminal fluid and liver weight as a percentage of BW were increased (P < 0.005) in HC/HG compared with HF/LG heifers. Mean serum concentrations of insulin (P < 0.05) and IGF-1 (P < 0.005) were greater, and leptin tended to be greater (P = 0.1) in HC/HG heifers compared with HF/LG heifers. Approximately 345 genes were observed to be differentially expressed in the HC/HG group with approximately two-thirds of the genes exhibiting increased expression in the HC/HG group. Genes exhibiting decreased expression in the HC/HG group included agouti-related protein and neuropeptide Y, products of which are known to regulate feed intake and energy expenditure. 
Functional annotation of enriched Gene Ontology terms indicates that a number of biological processes within the hypothalamus are affected by consumption of high-concentrate diets, including those related to control of feed intake, regulation of cellular metabolic processes, receptor and intracellular signaling, and neuronal communication. In summary, dietary treatments shown previously to accelerate the timing of pubertal onset in heifers increased ruminal propionate, promoted enhanced metabolic hormone secretion, and altered gene expression in the ARC.