Over the last few years, many methods have been developed for analyzing functional data with different objectives. The purpose of this paper is to predict a binary response variable from a functional variable whose sample information is given by a set of curves measured without error. To solve this problem, we formulate a functional logistic regression model and propose estimating it by approximating the sample paths in a finite-dimensional space generated by a basis. The problem is thereby reduced to a multiple logistic regression model with highly correlated covariates. To reduce dimension and avoid multicollinearity, two different approaches to functional principal component analysis of the sample paths are proposed. Finally, a simulation study is carried out to evaluate the estimation performance of the proposed principal component approaches.
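As a rough illustration of the idea, the sketch below fits a logistic regression on principal component scores of discretized curves. It is a simplification under stated assumptions, not the procedure of the paper: the curves are simulated, ordinary PCA on grid values stands in for the basis-expansion functional PCA, and the names `pca_scores`, `fit_logistic`, and all simulation settings are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components):
    """Center the rows of X and return (scores, loadings) of the leading PCs."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components]          # shape (n_components, n_grid)
    return Xc @ loadings.T, loadings

def fit_logistic(Z, y, n_iter=50, tol=1e-10):
    """Maximum likelihood logistic regression with intercept (Newton-Raphson)."""
    A = np.column_stack([np.ones(len(Z)), Z])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta)))
        w = p * (1.0 - p)
        hessian = A.T @ (A * w[:, None])
        step = np.linalg.solve(hessian, A.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated curves (assumption): two random amplitudes on a 50-point grid.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
a = rng.normal(size=200)
b = rng.normal(size=200)
curves = a[:, None] * np.sin(2 * np.pi * t) + b[:, None] * np.cos(2 * np.pi * t)
# Binary response whose log-odds depend on the first amplitude only.
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-2.0 * a))).astype(float)

# Uncorrelated PC scores replace the highly correlated grid values.
scores, loadings = pca_scores(curves, n_components=2)
beta = fit_logistic(scores, y)
# Coefficient vector on the grid: linear predictor = beta[0] + centered_curve @ beta_t.
beta_t = loadings.T @ beta[1:]
```

Because the scores are uncorrelated by construction, the Hessian in the Newton step stays well conditioned, which is the practical motivation for regressing on components rather than on the raw grid or basis coefficients.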
CYP1A1, CYP2E1 and GSTM1 polymorphisms were evaluated in Chilean healthy controls and lung cancer patients. In the Chilean healthy group, frequencies of CYP1A1 variant alleles for the MspI (m2 or CYP1A1*2A) and Ile/Val (val or CYP1A1*2B) polymorphisms were 0.25 and 0.33, respectively. Frequencies of the variant alleles C (CYP2E1*6) and c2 (CYP2E1*5B) for CYP2E1 were 0.21 and 0.16, respectively, and the frequency of GSTM1(-) was 0.24. Variant alleles for the GSTM1, MspI and Ile/Val polymorphisms were more frequent in cases than in controls. However, frequencies of the c2 and C alleles did not differ significantly between controls and cases. The estimated relative risk for lung cancer associated with a single mutated allele in CYP1A1, CYP2E1 or GSTM1 was 2.41 for m2, 1.69 for val, 1.16 for C, 0.71 for c2 and 2.46 for GSTM1(-). The estimated relative risk was higher for individuals carrying combined CYP1A1 and GSTM1 mutated alleles (m2/val, OR=6.28; m2/GSTM1(-), OR=3.56) and lower in individuals carrying CYP1A1 and CYP2E1 mutated alleles (m2/C, OR=1.39; m2/c2, OR=2.00; val/C, OR=1.45; val/c2, OR=0.48; not significant). The OR values considering smoking were 4.37 for m2, 4.05 for val, 3.47 for GSTM1(-), 7.38 for m2/val and 3.68 for m2/GSTM1(-), all higher than the values observed without stratification by smoking. Taken together, these findings suggest that Chilean people carrying single or combined GSTM1 and CYP1A1 polymorphisms could be more susceptible to lung cancer induced by environmental pollutants such as polycyclic aromatic hydrocarbons.
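For readers unfamiliar with how an odds ratio (OR) of this kind is obtained, the following minimal sketch computes an OR and a Wald-type 95% confidence interval from a 2x2 case-control table. The counts are purely hypothetical and are not taken from the study; the function name `odds_ratio` and the choice of the Woolf (log-scale) interval are assumptions for illustration, since the abstract does not state how its intervals or significance were computed.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and Woolf (log-scale Wald) confidence interval for a 2x2 table."""
    or_hat = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                   + 1 / exposed_controls + 1 / unexposed_controls)
    low = math.exp(math.log(or_hat) - z * se)
    high = math.exp(math.log(or_hat) + z * se)
    return or_hat, low, high

# Hypothetical counts: 20 of 30 cases and 10 of 30 controls carry the variant allele.
or_hat, ci_low, ci_high = odds_ratio(20, 10, 10, 20)
```

An interval that contains 1 corresponds to an association that is not significant at the 5% level under this approximation.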
Time series statistical analyses (TSSA) have been employed to evaluate the variability of resistive switching memories and to model the set and reset voltages. The conventional procedures of time series theory have been used to obtain autocorrelation and partial autocorrelation functions and to determine the simplest analytical models for forecasting the set and reset voltages in long series of resistive switching processes. To do so, and for the sake of generality in our study, a wide range of devices has been fabricated and measured. Different oxides and electrodes have been employed, including bilayer dielectrics in devices such as:
Residue levels of pyrifenox, pyridaben, and tralomethrin were determined in unprocessed and processed tomatoes, grown in an experimental greenhouse, to evaluate the effect of three different household processes (washing, peeling, and cooking) and the "unit-to-unit" variability of these pesticides in tomatoes. The study was carried out on 11 greenhouse tomato samples collected during a 5-week period in which two successive treatments with the studied pesticides were applied. Residue levels in unprocessed and processed tomato samples were determined by ethyl acetate extraction followed by gas chromatography with electron capture detection. The washing processing factors were 0.9 ± 0.3 for pyridaben, 1.1 ± 0.3 for pyrifenox, and 1.2 ± 0.5 for tralomethrin, whereas the peeling processing factors were 0.3 ± 0.2 for pyridaben and 0.0 ± 0.0 for both pyrifenox and tralomethrin. The average loss of water in the tomato purée samples during the cooking process was approximately 50%; the cooking processing factors were 2.1 ± 0.8 for pyridaben, 3.0 ± 1.1 for pyrifenox, and 1.9 ± 0.8 for tralomethrin. The unit-to-unit variability factors were determined on three different greenhouse samples by analyzing 10 different units of unprocessed tomatoes from each sample. In all cases, the unit-to-unit variability factors were within the range of 1.3-2.2.
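The two quantities reported above can be sketched as follows. A processing factor is the residue level in the processed commodity divided by that in the unprocessed commodity, so values below 1 indicate removal and values above 1 indicate concentration (as expected when cooking removes water). The abstract does not say how its variability factor was computed; the code assumes the common definition as the 97.5th percentile of unit residues divided by their mean. All residue values and function names below are hypothetical.

```python
import numpy as np

def processing_factor(processed, unprocessed):
    """Mean and sample SD of per-sample ratios processed / unprocessed."""
    ratios = np.asarray(processed, dtype=float) / np.asarray(unprocessed, dtype=float)
    return ratios.mean(), ratios.std(ddof=1)

def variability_factor(unit_residues):
    """Assumed definition: 97.5th percentile of unit residues over their mean."""
    units = np.asarray(unit_residues, dtype=float)
    return np.percentile(units, 97.5) / units.mean()

# Hypothetical residue levels (mg/kg) for three paired samples.
raw = [0.40, 0.50, 0.20]
peeled = [0.10, 0.20, 0.05]
pf_mean, pf_sd = processing_factor(peeled, raw)

# Hypothetical residues in 10 individual unprocessed units from one sample.
vf = variability_factor([0.10, 0.12, 0.08, 0.15, 0.11,
                         0.09, 0.20, 0.10, 0.13, 0.12])
```

With only 10 units per sample, the 97.5th percentile is an interpolated value close to the maximum, so the resulting factor should be read as a rough estimate.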
SUMMARY. In recent years, many studies have dealt with predicting a response variable based on the information provided by a functional variable. When the response variable is binary, problems such as multicollinearity and high dimensionality arise, which impair the estimation of the model and the interpretation of its parameters. In this article we address these problems by using functional logistic regression and principal component analysis. In order to obtain a unique solution for the maximum likelihood estimation of the parameter function, quasi-natural cubic spline interpolation of sample paths on their discrete time observations is proposed. We also introduce a new interpretation of the relationship between the response variable and the functional predictor, in which the change in the odds of success is evaluated from the estimated parameter function. Finally, an analysis of climatological data is presented to illustrate the practical performance of the proposed methodologies.
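To make the odds interpretation concrete, the block below writes out the standard functional logit model and the odds multiplier that follows from it when the predictor is raised by a constant on a subinterval. The notation ($\pi_i$, $\alpha$, $\beta(t)$, $c$, $[t_1, t_2]$) is assumed here for illustration, and the exact interpretation developed in the article may differ.

```latex
% Functional logit model for the success probability \pi_i of curve x_i(t), t \in T:
\ln\frac{\pi_i}{1-\pi_i} = \alpha + \int_T x_i(t)\,\beta(t)\,dt

% Raising the curve by a constant c on [t_1, t_2] \subseteq T adds
% c \int_{t_1}^{t_2} \beta(t)\,dt to the log-odds, so the odds are multiplied by
\frac{\mathrm{odds}\bigl(x_i + c\,\mathbf{1}_{[t_1,t_2]}\bigr)}{\mathrm{odds}(x_i)}
  = \exp\!\left( c \int_{t_1}^{t_2} \beta(t)\,dt \right)
```

In practice $\beta(t)$ would be replaced by its estimate, so the multiplier inherits the estimation error of the parameter function.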