Saptarsi Goswami scite author profile

Natural disasters (e.g., earthquakes, floods, tsunamis, hurricanes and other storms, landslides, cloudbursts, heat waves, forest fires) claim thousands of human lives around the globe every year, apart from causing significant damage to property, animal life, etc. In this paper, we review the data mining and analytical techniques designed so far for i) prediction, ii) detection, and iii) development of appropriate disaster management strategies based on data collected from disasters. A detailed description of the availability of data from geological observatories (seismological, hydrological), satellites, remote sensing, and newer sources such as social networking sites like Twitter is presented. An extensive and in-depth literature study of current techniques for disaster prediction, detection, and management has been done, and the results are summarized by type of disaster. Finally, a framework for building a disaster management database for India, hosted on an open source Big Data platform like Hadoop and built in a phased manner, has been proposed.

The effects are not only immediate: as observed in [61], exposure to a natural disaster in the past months increases the likelihood of acute illnesses such as diarrhea, fever, and acute respiratory illness in children under 5 years by 9-18%. The socioeconomic status of the households has a direct bearing on the magnitude and nature of these effects. Disasters have pronounced effects on business houses as well. As stated in [50], 40% of the companies that were closed for 3 consecutive days failed or closed down within 36 months. Nor are disasters infrequent: for earthquakes alone [7], as many as 20 every year have a Richter scale reading greater than 7.0. The effects of disasters are much more pronounced in developing countries like India.

Meteorologists, geologists, environmental scientists, computer scientists, and scientists from various other disciplines have made concerted efforts to predict the time, place, and severity of disasters. Apart from advanced weather forecasting models, data mining models have also been used for the same purpose. Another line of research has concentrated on disaster management, appropriate flow of information, channelizing the relief work, and analysis of the needs or concerns of the victims. The sources of the underlying data for such tasks have often been social media and other internet media. Diverse data are also collected on a regular basis by satellites, wireless and remote sensors, national meteorological and geological departments, NGOs, and various other international, government, and private bodies before, during, and after a disaster. The data thus collected qualify as "Big Data" because of the volume, variety, and velocity with which they are generated. A brief technical description of some of the major natural disasters:-

A Short Review on Different Clustering Techniques and Their Applications

Ghosal, Nandy, Das, et al. 2019

A Novel Feature Selection Technique for Text Classification Using Naïve Bayes

Sarkar, Goswami, Agarwal, et al. 2014

International Scholarly Research Notices

With the proliferation of unstructured data, text classification or text categorization has found many applications in topic classification, sentiment analysis, authorship identification, spam detection, and so on. There are many classification algorithms available. Naïve Bayes remains one of the oldest and most popular classifiers. On one hand, implementation of naïve Bayes is simple and, on the other hand, it also requires less training data. From the literature review, it is found that naïve Bayes performs poorly compared to other classifiers in text classification. As a result, the naïve Bayes classifier becomes unusable in spite of the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method based first on univariate feature selection and then on feature clustering: the univariate feature selection method reduces the search space, and clustering is then applied to select relatively independent feature sets. We demonstrate the effectiveness of our method by a thorough evaluation and comparison over 13 datasets. The performance improvement thus achieved makes naïve Bayes comparable or superior to other classifiers. The proposed algorithm is shown to outperform other traditional methods like greedy-search-based wrapper or CFS.
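The two-step idea in the abstract above (a univariate filter followed by grouping of similar features) can be illustrated with a minimal sketch. The abstract does not name the univariate score or the clustering algorithm, so everything specific here is an assumption: chi-squared is used as the univariate score, a greedy correlation-threshold grouping stands in for the clustering step, and the names `chi2_score`, `pearson`, `two_step_select`, `top_k`, and `max_corr` are hypothetical.

```python
from math import sqrt


def chi2_score(column, labels):
    """Chi-squared statistic of a 0/1 feature against a 0/1 class label."""
    n = len(labels)
    observed = [[0, 0], [0, 0]]  # observed[feature_value][class_value]
    for f, c in zip(column, labels):
        observed[f][c] += 1
    score = 0.0
    for f in (0, 1):
        for c in (0, 1):
            row_total = observed[f][0] + observed[f][1]
            col_total = observed[0][c] + observed[1][c]
            expected = row_total * col_total / n
            if expected > 0:
                score += (observed[f][c] - expected) ** 2 / expected
    return score


def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def two_step_select(X, y, top_k, max_corr=0.8):
    """X: rows of 0/1 term indicators (one row per document); y: 0/1 labels.

    Step 1 keeps the top_k features by chi-squared score.
    Step 2 (assumed stand-in for clustering) walks the survivors in score
    order and keeps a feature only if it is not strongly correlated with
    a feature that has already been kept.
    """
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    scores = [chi2_score(col, y) for col in columns]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    ranked = ranked[:top_k]
    kept = []
    for j in ranked:
        if all(abs(pearson(columns[j], columns[r])) < max_corr for r in kept):
            kept.append(j)
    return kept
```

The second step matters for naïve Bayes because the classifier assumes features are conditionally independent; dropping near-duplicate terms keeps that assumption closer to true. The selected column indices could then be fed to any naïve Bayes implementation.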

Feature Selection: A Practitioner View

Goswami, Chakrabarti 2014

IJITCS

Feature selection is one of the most important preprocessing steps in data mining and knowledge engineering. In this short review paper, apart from a brief taxonomy of current feature selection methods, we review feature selection methods that are being used in practice. Subsequently, we produce a near-comprehensive list of problems that have been solved using feature selection across technical and commercial domains. This can serve as a valuable tool for practitioners across industry and academia. We also present empirical results of filter-based methods on various datasets. The empirical study covers the tasks of classification, regression, text classification, and clustering. We also compare filter-based ranking methods using rank correlation.
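Comparing two filter-based rankings by rank correlation can be sketched as follows. The abstract does not say which rank correlation coefficient is used; this sketch assumes Spearman's rho, computed as the Pearson correlation of the ranks with tied scores sharing their mean rank. The function names `average_ranks` and `spearman` are hypothetical.

```python
from math import sqrt


def average_ranks(scores):
    """Rank 1 is the highest score; tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman(scores_a, scores_b):
    """Spearman's rho between two per-feature score lists of equal length."""
    ranks_a, ranks_b = average_ranks(scores_a), average_ranks(scores_b)
    n = len(ranks_a)
    mean_a, mean_b = sum(ranks_a) / n, sum(ranks_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    var_a = sum((x - mean_a) ** 2 for x in ranks_a)
    var_b = sum((y - mean_b) ** 2 for y in ranks_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("rank correlation is undefined when all scores are tied")
    return cov / sqrt(var_a * var_b)
```

Here each list holds one score per feature from a given filter method (for example, a chi-squared score and an information-gain score for the same features). A value near 1 suggests the two filters order the features similarly; a value near -1 suggests they order them in reverse.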

Estimating gene expression from DNA methylation and copy number variation: A deep learning regression model for multi-omics integration

Seal, Das, Goswami, et al. 2020

Genomics

Automated Stem Angle Determination for Temporal Plant Phenotyping Analysis

Das, Goswami, Bashyam, et al. 2017

Image-based plant phenotyping analysis refers to the monitoring and quantification of phenotyping traits by analyzing images of the plants captured by different types of cameras at regular intervals in a controlled environment. Extracting meaningful phenotypes for temporal phenotyping analysis by considering individual parts of a plant, e.g., leaves and stem, using computer-vision-based techniques remains a critical bottleneck due to constantly increasing complexity in plant architecture with variations in self-occlusions and phyllotaxy. The paper introduces an algorithm to compute the stem angle, a potential measure of a plant's susceptibility to lodging, i.e., the bending of the plant's stem. Annual yield losses due to stem lodging in the U.S. range between 5 and 25%. In addition to outright yield losses, grain quality may also decline as a result of stem lodging. The algorithm to compute stem angle involves the identification of leaf-tips and leaf-junctions based on a graph-theoretic approach. The efficacy of the proposed method is demonstrated by experimental analysis on a publicly available dataset called Panicoid Phenomap-1. A time-series clustering analysis is also performed on the values of stem angles for a significant time interval during the vegetative stage of the life cycle of the maize plants. This analysis effectively summarizes the temporal patterns of the stem angles into three main groups, which provides further insight into genotype-specific behavior of the plants. A comparison of genotypic purity using time series analysis establishes that the temporal variation of the stem angles is likely to be regulated by genetic variation under similar environmental conditions.
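Once leaf-junctions have been located, one plausible way to turn them into a stem angle is to fit a straight line through the junction coordinates and measure its tilt. The abstract does not define the angle or the fitting step, so this is only a sketch under stated assumptions: the stem axis is taken to be a least-squares line through the junctions, the angle is measured from the vertical image axis, and the function name `stem_angle_from_vertical` is hypothetical. Junction detection itself (the graph-theoretic part) is not shown.

```python
from math import atan, degrees


def stem_angle_from_vertical(junctions):
    """Tilt, in degrees, of a least-squares line through (x, y) junction points.

    x is regressed on y because a stem is close to vertical, which keeps the
    slope finite. 0 means a perfectly upright stem.
    """
    n = len(junctions)
    if n < 2:
        raise ValueError("need at least two junctions to define a stem axis")
    mean_x = sum(x for x, _ in junctions) / n
    mean_y = sum(y for _, y in junctions) / n
    s_yy = sum((y - mean_y) ** 2 for _, y in junctions)
    if s_yy == 0:
        raise ValueError("junctions share one height; stem axis is undefined")
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in junctions)
    slope = s_xy / s_yy  # horizontal drift per unit of height
    return degrees(atan(abs(slope)))
```

Computing this value for each image in a plant's sequence would give one stem-angle time series per plant, which is the kind of input a time-series clustering step operates on.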

A Review on Agricultural Advancement Based on Computer Vision and Machine Learning

Paul, Ghosh, Das, et al. 2019
