ABSTRACT
This article proposes a methodology to discover patterns in observed climatological data, particularly temperatures and rainfall, in subnational political division units using an automatic classification algorithm (a decision tree produced by the C4.5 algorithm). The patterns are thus expressed as classification trees, assuming that: (1) every political division unit contains at least one climatological station, and (2) the recording periods of the stations are relatively similar in duration and in their initial and ending years. A series of classification models is produced by using different subsets of an experimental dataset. This dataset contains information from 3606 climatological stations in Mexico whose recording periods differ in duration and in their initial and ending years. The target (dependent) variable in all these models is the name of the political unit (i.e., the state). The predictors are 36 monthly features for each climatological station: 12 corresponding to a minimum temperature, 12 to a maximum temperature, and 12 to cumulative rainfall. Altitude is also used as a predictor in addition to these 36, but only to quantify its additional contribution to the modeling. The results show that classification trees are effective models for describing and representing the nontrivial patterns that characterize political division units, based on their monthly temperatures and rainfall. One notable finding is that cumulative rainfall in May is the feature with the greatest discriminatory power in this characterization task, which is consistent with the theoretical background of Mexican climatology. In addition, classification trees are highly expressive for people unfamiliar with machine learning.
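To make the notion of "discriminatory power" more concrete, the following is a minimal sketch of the gain ratio, the split criterion generally associated with C4.5, applied to a single numeric feature. It is not the authors' implementation: the function names, the midpoint-based threshold search, and the six toy values are all assumptions made for illustration, and a full C4.5 tree also handles pruning, missing values, and multi-feature recursion.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one numeric feature."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    total = len(labels)
    weights = (len(left) / total, len(right) / total)
    info_gain = entropy(labels) - weights[0] * entropy(left) - weights[1] * entropy(right)
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Evaluate midpoints between consecutive distinct values (a simplification)."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(((t, gain_ratio(values, labels, t)) for t in candidates), key=lambda p: p[1])


# Hypothetical toy values, NOT data from the study: cumulative May rainfall (mm)
# at six imaginary stations located in two imaginary states.
may_rainfall_mm = [5.0, 8.0, 12.0, 60.0, 75.0, 90.0]
state = ["A", "A", "A", "B", "B", "B"]

threshold, ratio = best_threshold(may_rainfall_mm, state)
print(f"best threshold: {threshold} mm, gain ratio: {ratio:.3f}")
```

In this toy case a single threshold separates the two classes perfectly, so the gain ratio reaches its maximum. With 36 or 37 real features, a tree learner would compare such scores across all features at each node and split on the best one.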
This work proposes the Delta score, a simplified nominal measurement of the digital divide of cities. It is implemented as a concatenation of alphabetical scores that represent the percentages of households in a city that have Internet, a PC, a fixed-line telephone, and a cell telephone. Data from the 2010 Mexican Census on Population and Housing are used to create and evaluate this measurement. A proof of concept shows that the proposed measurement facilitates the creation of digital divide rankings and comparisons among cities within a single country. The measurement also appears useful for rankings and comparisons among cities of two or more countries, and its potential incorporation into inferential statistics is suggested. The novelty and merits of this proposal include the flexibility of the measurement's discrete representation and its potential use in inferential statistics; it therefore shows promise for research, public policy definition, and private company planning.
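The idea of concatenating alphabetical scores can be sketched as follows. The abstract does not state how percentages map to letters, so the binning here (ten equal-width bins, with A for 90 to 100 percent down to J for 0 to under 10 percent) is purely an assumption, as are the function names, the indicator order, and the example city.

```python
import string

# Assumed indicator order; the abstract lists the four indicators in this sequence.
INDICATORS = ("internet", "pc", "fixed_phone", "cell_phone")


def letter_for(percent):
    """Map a presence percentage (0-100) to a letter.

    Assumption for illustration only: ten equal-width bins, A = 90-100, J = 0-<10.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    bin_index = min(int(percent // 10), 9)  # 0..9; exactly 100 falls in the top bin
    return string.ascii_uppercase[9 - bin_index]


def delta_score(household_percentages):
    """Concatenate one letter per indicator into a four-letter nominal score."""
    return "".join(letter_for(household_percentages[name]) for name in INDICATORS)


# Hypothetical city, NOT census data.
example_city = {"internet": 35.2, "pc": 48.0, "fixed_phone": 61.5, "cell_phone": 83.9}
score = delta_score(example_city)
print(score)
```

Under this assumed encoding, sorting cities by their score strings would produce a ranking that weights the first indicator most heavily, which is one possible way such a nominal score could support comparisons.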
This paper offers recommendations on adopting the 2014 Mexican open data standard for climatological data, for the purposes of scientific research and public policy making in the climate and climate change fields in Mexico. The major benefit of its adoption is greater accessibility for users who are not climatology or meteorology experts, such as scientists in other research areas, public policy makers, and private company strategists. Four specific sources of Mexican climate data are addressed for these purposes.