Here, we introduce ITEXT-BIO, an intelligent process for extracting biomedical domain terminology from textual documents and analysing it. The proposed methodology consists of two complementary approaches: free and driven term extraction. The first relies on statistical measures, while the second applies morphosyntactic variation rules to extract term variants from the corpus. The combination of two term extraction and analysis strategies is the keystone of ITEXT-BIO: they enable term extraction and analysis either within a single corpus (intra-corpus) or across several corpora (inter-corpus). We assessed the two approaches, the corpus or corpora to be analysed, and the type of statistical measures used. Our experimental findings revealed that the proposed methodology could be used (1) to efficiently extract representative, discriminant and new terms from a given corpus or corpora, and (2) to provide quantitative and qualitative analyses of these terms with respect to the study domain.
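As a minimal illustration of the free, statistics-based approach in an inter-corpus setting, the sketch below ranks the terms of each document by a statistical score. The abstract does not name the measures ITEXT-BIO uses; plain TF-IDF is assumed here purely as a stand-in, and the function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_scores(documents):
    """Return one {term: score} dict per document, using plain TF-IDF.

    TF-IDF is an assumed stand-in for the (unnamed) statistical measures.
    """
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)

    # Document frequency: the number of documents in which each term occurs.
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(score_dict, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    return sorted(score_dict, key=lambda t: (-score_dict[t], t))[:k]


# Hypothetical two-document collection, for illustration only.
docs = [
    "gene expression in tumor cells and gene regulation",
    "traffic emission in urban areas and traffic flow",
]
scores = tf_idf_scores(docs)
print(top_terms(scores[0], 1))  # ['gene']
```

Terms that occur in every document, such as "in" and "and" above, receive a score of zero, which is how a measure of this kind favours discriminant terms over common ones.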
In urban areas, traffic is one of the main causes of air pollution. An effective way to raise public awareness of this phenomenon could help to significantly reduce pollution levels. In this study, we design and implement an agent-based simulation to study how pollutants from road traffic are produced and dispersed in urban areas. The simulation takes into account the factors that influence pollutant production and dispersion in an urban zone (here, the city of Hanoi, Vietnam): roads and streets, vehicles (types, quantity), traffic, wind direction, etc. With this simulation, one can observe and study the emission and dispersion of traffic pollutants by running experiments with various scenarios and parameters. This work offers a way to raise public awareness of traffic-related air pollution in urban areas, so that people can change their behavior to reduce it.
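The emission and dispersion mechanism described above can be illustrated with a toy model. The sketch below is not the authors' simulation: the grid, the one-cell-per-step wind transport, the decay factor and all names and values are assumptions chosen only to show vehicle agents emitting pollutant that the wind then carries away.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    """A vehicle agent on a grid cell (all values are hypothetical)."""
    row: int
    col: int
    emission: float  # pollutant units emitted per time step

    def move(self, n_cols):
        # Drive one cell east along the road, wrapping at the grid edge.
        self.col = (self.col + 1) % n_cols


def step(grid, vehicles, wind, decay):
    """Advance the pollutant grid by one time step.

    wind is a (row_offset, col_offset) pair; decay is the fraction of
    pollutant lost per step. Pollutant blown off the grid is discarded.
    """
    rows, cols = len(grid), len(grid[0])
    d_row, d_col = wind
    new_grid = [[0.0] * cols for _ in range(rows)]

    # Dispersion: shift existing pollutant one cell downwind, with decay.
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:
                new_grid[nr][nc] += grid[r][c] * (1.0 - decay)

    # Emission: each vehicle pollutes its current cell, then moves.
    for vehicle in vehicles:
        new_grid[vehicle.row][vehicle.col] += vehicle.emission
        vehicle.move(cols)
    return new_grid


def run(grid, vehicles, wind, decay, n_steps):
    """Run the simulation for n_steps and return the final grid."""
    for _ in range(n_steps):
        grid = step(grid, vehicles, wind, decay)
    return grid


# One vehicle on the middle row of a 3x4 grid, wind blowing toward row 0.
final = run([[0.0] * 4 for _ in range(3)],
            [Vehicle(row=1, col=0, emission=1.0)],
            wind=(-1, 0), decay=0.5, n_steps=3)
for row in final:
    print(row)
```

Changing the wind vector, the number of vehicles or their emission rates corresponds to the kind of scenario and parameter experiments the abstract mentions, although a realistic model would need a road network and a physically grounded dispersion rule.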
In this paper, we propose a methodology for designing a data lake dedicated to spatial data, together with an implementation of this framework. Inspired by previous proposals on general data lake design and based on the ISO 19115 standard (Geographic information – Metadata), our contribution integrates, with the same philosophy, the spatial and thematic dimensions of heterogeneous data (remote sensing images, textual documents, sensor data, etc.). To support our proposal, the process has been implemented in a real data project in collaboration with Montpellier Métropole Méditerranée (3M), a metropolis in the South of France. The framework offers uniform management of the spatial and thematic information embedded in the elements of the data lake.
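To make the idea of uniform spatial and thematic management concrete, the sketch below describes heterogeneous items with one shared metadata record and queries them by area and theme. It is only an assumed simplification: the record keeps a geographic bounding box and keywords, two notions that ISO 19115 does define, but the class, field names, functions and sample records are hypothetical and do not reproduce the framework's actual schema.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """Simplified, hypothetical metadata record for one data lake item."""
    identifier: str
    data_type: str  # e.g. "image", "text", "sensor"
    west: float     # bounding box, in decimal degrees
    east: float
    south: float
    north: float
    keywords: frozenset = frozenset()  # thematic dimension


def intersects(record, west, east, south, north):
    """Return True if the record's bounding box overlaps the query box."""
    return not (record.east < west or record.west > east
                or record.north < south or record.south > north)


def search(catalog, bbox, keyword=None):
    """Return identifiers of records matching a box and, optionally, a theme."""
    west, east, south, north = bbox
    return [
        record.identifier
        for record in catalog
        if intersects(record, west, east, south, north)
        and (keyword is None or keyword in record.keywords)
    ]


# Made-up sample records, for illustration only.
catalog = [
    MetadataRecord("img-001", "image", 3.80, 3.95, 43.55, 43.65,
                   frozenset({"land cover"})),
    MetadataRecord("doc-002", "text", 3.85, 3.90, 43.60, 43.62,
                   frozenset({"urban planning", "land cover"})),
    MetadataRecord("sensor-003", "sensor", 2.30, 2.40, 48.80, 48.90,
                   frozenset({"air quality"})),
]
query_box = (3.86, 3.88, 43.60, 43.61)  # west, east, south, north
print(search(catalog, query_box))                    # ['img-001', 'doc-002']
print(search(catalog, query_box, "urban planning"))  # ['doc-002']
```

Because an image, a text and a sensor feed all share the same record shape, a single query can return items of any type, which is the sense in which the management is uniform.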