scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
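The shared interface and its composition property can be illustrated with a minimal, library-free sketch. The classes below (`ToyCenterer`, `ToyCentroidClassifier`, `ToyPipeline`) are hypothetical stand-ins written for this illustration, not scikit-learn code. They only mimic the convention the abstract describes: every unit exposes `fit`, transformers add `transform`, predictors add `predict`, and a composite of such units is itself a unit with the same interface.

```python
class ToyCenterer:
    """Toy transformer: fit learns per-column means, transform subtracts them."""

    def fit(self, X, y=None):
        n = len(X)
        self.means_ = [sum(col) / n for col in zip(*X)]
        return self  # returning self allows chaining: est.fit(X).transform(X)

    def transform(self, X):
        return [[v - m for v, m in zip(row, self.means_)] for row in X]


class ToyCentroidClassifier:
    """Toy predictor: assigns each sample to the class with the nearest centroid."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: sq_dist(row, self.centroids_[c]))
            for row in X
        ]


class ToyPipeline:
    """Toy composite: chains transformers and a final predictor behind fit/predict."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X, y):
        for step in self.steps[:-1]:
            X = step.fit(X, y).transform(X)
        self.steps[-1].fit(X, y)
        return self

    def predict(self, X):
        for step in self.steps[:-1]:
            X = step.transform(X)
        return self.steps[-1].predict(X)


X_train = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
y_train = ["a", "a", "b", "b"]

model = ToyPipeline([ToyCenterer(), ToyCentroidClassifier()]).fit(X_train, y_train)
predictions = model.predict([[0.1, 0.1], [5.0, 5.0]])
print(predictions)
```

Because the composite obeys the same contract as its parts, it can be passed anywhere a single unit is accepted, which is the reusability argument the abstract makes.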
In this work, we propose a simple yet effective solution to the problem of connectome inference in calcium imaging data. The proposed algorithm consists of two steps. First, the raw signals are processed to detect neural peak activities. Second, the degree of association between neurons is inferred from partial correlation statistics. This paper summarises the methodology that led us to win the Connectomics Challenge, proposes a simplified version of our method, and finally compares our results with those of other inference methods.
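The second step can be sketched as follows, under stated assumptions. The abstract does not say how the partial correlations are estimated, so this sketch uses the textbook definition based on the inverse covariance (precision) matrix, with partial correlation between i and j equal to -P[i][j] / sqrt(P[i][i] * P[j][j]). The synthetic chain x -> y -> z and all function names are illustrative and are not taken from the paper. The point of the example is that x and z are strongly correlated only through y, so their partial correlation is close to zero.

```python
import random


def covariance_matrix(columns):
    """Sample covariance matrix of a list of equally long signals."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    return [
        [
            sum((a - ma) * (b - mb) for a, b in zip(ci, cj)) / (n - 1)
            for cj, mb in zip(columns, means)
        ]
        for ci, ma in zip(columns, means)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (adequate for small matrices)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def correlations(columns):
    """Ordinary (marginal) Pearson correlation matrix."""
    cov = covariance_matrix(columns)
    n = len(cov)
    return [[cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5 for j in range(n)] for i in range(n)]


def partial_correlations(columns):
    """Partial correlation of each pair given all other signals, via the precision matrix."""
    prec = invert(covariance_matrix(columns))
    n = len(prec)
    return [
        [1.0 if i == j else -prec[i][j] / (prec[i][i] * prec[j][j]) ** 0.5 for j in range(n)]
        for i in range(n)
    ]


# Synthetic chain x -> y -> z: x influences z only through y (illustrative data).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [xi + 0.5 * rng.gauss(0, 1) for xi in x]
z = [yi + 0.5 * rng.gauss(0, 1) for yi in y]

corr = correlations([x, y, z])
pcorr = partial_correlations([x, y, z])
print("corr(x, z)      =", round(corr[0][2], 3))
print("pcorr(x, z | y) =", round(pcorr[0][2], 3))
```

In a connectome setting, this is why partial correlation is attractive: it discounts associations that are explained by other recorded neurons, which plain correlation would report as direct links.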
Background: Platelets have been implicated in both immune surveillance and host defense against severe infection. To date, whether platelet phenotype or other hemostasis components could be associated with predisposition to sepsis in critical illness remains unknown. The aim of this work was to identify platelet markers that could predict sepsis occurrence in critically ill injured patients. Methods: This single-center, prospective, observational, 7-month study was based on a cohort of 99 non-infected adult patients admitted to ICUs for elective cardiac surgery, trauma, acute brain injury, and post-operative prolonged ventilation and followed up during ICU stay. Clinical characteristics and severity score (SOFA) were recorded on admission. Platelet activation markers, including fibrinogen binding to platelets, platelet membrane P-selectin expression, plasma soluble CD40L, and platelet-leukocyte aggregates, were assayed by flow cytometry at admission and 48 h later, and then at the time of sepsis diagnosis (Sepsis-3 criteria) and 7 days later for sepsis patients. Hospitalization data and outcomes were also recorded. Results: Of the 99 patients, 19 developed sepsis after a median time of 5 days. These patients had a higher SOFA score at admission; levels of fibrinogen binding to platelets (platelet-Fg) and of D-dimers were also significantly increased compared to the other patients. Levels 48 h after ICU admission no longer differed between the two patient groups. Platelet-Fg % was an independent predictor of sepsis (P = 0.0031). By ROC curve analysis, the cutoff point for platelet-Fg (AUC = 0.75) was 50%. In patients with a SOFA cutoff of 8, the risk of sepsis reached 87% when platelet-Fg levels were above 50%. Patients with sepsis had longer ICU and hospital stays and a higher death rate. Conclusions: Platelet-bound fibrinogen levels assayed by flow cytometry within 24 h of ICU admission help identify critically ill patients at risk of developing sepsis. Electronic supplementary material: The online version of this article (doi:10.1186/s40635-017-0145-2) contains supplementary material, which is available to authorized users.
We adapt the idea of random projections applied to the output space, so as to enhance tree-based ensemble methods in the context of multi-label classification. We show how learning time complexity can be reduced without affecting the computational complexity or the accuracy of predictions. We also show that random output space projections may be used to reach different bias-variance tradeoffs over a broad panel of benchmark problems, and that this may lead to improved accuracy while significantly reducing the computational burden of the learning stage.
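The core idea of projecting the output space can be sketched as below. This is a minimal illustration, assuming a Gaussian random projection scaled by 1/sqrt(m); the abstract does not specify which projection family is used, and the function names, the toy label matrix, and the choice of m are all hypothetical. It only shows the compression of a label matrix from d label columns to m < d projected columns. How the tree ensemble is grown on the projected outputs is not covered here.

```python
import random


def gaussian_projection_matrix(n_labels, n_components, seed=0):
    """Random d x m matrix with N(0, 1/m) entries (an assumed projection family)."""
    rng = random.Random(seed)
    scale = 1.0 / n_components ** 0.5
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
        for _ in range(n_labels)
    ]


def project_outputs(Y, projection):
    """Map each d-dimensional label vector to m dimensions: Z = Y @ projection."""
    columns = list(zip(*projection))
    return [[sum(y * p for y, p in zip(row, col)) for col in columns] for row in Y]


# Toy multi-label targets: 4 samples, 6 binary labels (illustrative data).
Y = [
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 0],
]

P = gaussian_projection_matrix(n_labels=6, n_components=3, seed=0)
Z = project_outputs(Y, P)

print(len(Z), "samples x", len(Z[0]), "projected outputs")
```

A learner fitted on `Z` evaluates candidate splits over 3 output columns instead of 6, which is where the reduction in learning cost would come from when the number of labels is large.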
Prosody is an integral part of communication, but remains an open problem in state-of-the-art speech synthesis. There are two major issues faced when modelling prosody: (1) prosody varies at a slower rate compared with other content in the acoustic signal (e.g. segmental information and background noise); (2) determining appropriate prosody without sufficient context is an ill-posed problem. In this paper, we propose solutions to both these issues. To mitigate the challenge of modelling a slowly varying signal, we learn to disentangle prosodic information using a word-level representation. To alleviate the ill-posed nature of prosody modelling, we use syntactic and semantic information derived from text to learn a context-dependent prior over our prosodic space. Our context-aware model of prosody (CAMP) outperforms the state-of-the-art technique, closing the gap with natural speech by 26%. We also find that replacing attention with a jointly-trained duration model improves prosody significantly.
In this paper, we introduce Kathaka, a model trained with a novel two-stage training process for neural speech synthesis with contextually appropriate prosody. In Stage I, we learn a prosodic distribution at the sentence level from mel-spectrograms available during training. In Stage II, we propose a novel method to sample from this learnt prosodic distribution using the contextual information available in text. To do this, we use BERT on text, and graph-attention networks on parse trees extracted from text. We show a statistically significant relative improvement of 13.2% in naturalness over a strong baseline when compared to recordings. We also conduct an ablation study on variations of our sampling technique, and show a statistically significant improvement over the baseline in each case.