We classify low-backscatter regions observed in Synthetic Aperture Radar (SAR) measurements of the ocean surface as either oil slicks or look-alike slicks (radar false targets). Our proposed classification algorithm is based on Linear Discriminant Analyses (LDAs) of RADARSAT-1 measurements (402 scenes off the southeast coast of Brazil from July 2001 to June 2003) and Meteorological-Oceanographic (MetOc) data from other Earth observation sensors: Advanced Very High Resolution Radiometer (AVHRR), Sea-Viewing Wide Field-of-View Sensor (SeaWiFS), Moderate Resolution Imaging Spectroradiometer (MODIS), and Quick Scatterometer (QuikSCAT). Oil slicks are sea-surface expressions of exploration and production oil, ship spills, and orphan spills. False targets are associated with environmental phenomena such as biogenic films, algal blooms, upwelling, low wind, or rain cells. Both categories have been interpreted by domain experts: mineral oil (n = 350; 45.5%) and petroleum-free (n = 419; 54.5%). We explore nine size variables (area, perimeter, etc.) and three types of MetOc information (sea surface temperature, chlorophyll-a, and wind speed) that describe the 769 samples analyzed. Seven attribute–domain combinations are tested with three data transformation options (none, cube root, log10), with and without MetOc, adding up to 39 attribute subdivisions. Classification accuracies are independent of data transformation and improve when selected size attributes are combined with MetOc, leading to overall accuracies of ~80% and sound levels of sensitivity (~90%), specificity (~80%), and positive (~80%) and negative (~90%) predictive values. The effectiveness of this data-driven attempt supports further commercial or academic implementation of our LDA algorithm.
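The accuracy figures quoted in this abstract (overall accuracy, sensitivity, specificity, positive and negative predictive values) all derive from a two-by-two confusion matrix. The sketch below shows how they could be computed from expert labels and classifier output. The function name, the class labels `"oil"` and `"lookalike"`, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float   # true-positive rate (recall of the positive class)
    specificity: float   # true-negative rate
    ppv: float           # positive predictive value (precision)
    npv: float           # negative predictive value


def _ratio(num, den):
    # Return NaN instead of raising when a denominator is empty.
    return num / den if den else float("nan")


def binary_metrics(y_true, y_pred, positive="oil"):
    """Compute confusion-matrix metrics; `positive` is a hypothetical label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Toy example (illustrative values only, not the study's data).
expert = ["oil", "oil", "oil", "lookalike", "lookalike"]
model = ["oil", "oil", "lookalike", "lookalike", "oil"]
print(binary_metrics(expert, model))
```

Treating mineral oil as the positive class is a choice made here; swapping the positive class exchanges sensitivity with specificity and PPV with NPV.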
In supervised learning, an imbalanced number of instances among the classes in a dataset can cause algorithms to misclassify instances of the minority class as belonging to the majority class. To address this problem, the KNN algorithm serves as a basis for several balancing methods. These balancing methods are revisited in this work, and a new and simple KNN undersampling approach is proposed. In the experiments, the KNN undersampling method outperformed the other sampling methods. The proposed method also outperformed the results of other studies, indicating that the simplicity of KNN can serve as a basis for efficient algorithms in machine learning and knowledge discovery.
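The abstract does not spell out the undersampling rule, so the following is only a sketch of one plausible KNN-based scheme, not necessarily the authors' method: a majority-class instance is dropped when at least `t` of its `k` nearest neighbours belong to another class. The function name and the parameters `k` and `t` are assumptions introduced for illustration.

```python
import math


def knn_undersample(X, y, majority_label, k=3, t=1):
    """Remove majority-class points that sit next to other classes.

    Assumed rule (hypothetical): drop a majority point if at least `t`
    of its `k` nearest neighbours carry a different label. Neighbours
    are always searched in the original, unreduced dataset.
    """
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            keep.append(i)          # minority instances are never removed
            continue
        # Indices of the k nearest other points by Euclidean distance.
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )[:k]
        foreign = sum(1 for j in neighbours if y[j] != yi)
        if foreign < t:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: four majority points in a cluster, one majority point
# sitting inside the minority region.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5)]
y = ["maj", "maj", "maj", "maj", "maj", "min", "min"]
X_bal, y_bal = knn_undersample(X, y, majority_label="maj", k=3, t=1)
print(list(zip(X_bal, y_bal)))
```

In this toy run only the majority point at `(5, 5)` is removed, because two of its three nearest neighbours are minority instances. The brute-force neighbour search is quadratic in the number of instances, which is acceptable for a sketch but not for large datasets.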
Linear discriminant analysis (LDA) is a mathematically robust multivariate data analysis approach that is sometimes used for surface oil slick signature classification. Our goal is to rank the effectiveness of LDAs in differentiating oil spills from look-alike slicks. We explored multiple combinations of (i) variables (size information, Meteorological-Oceanographic (MetOc) data, geo-location parameters) and (ii) data transformations (non-transformed, cube root, log10). Active and passive satellite-based measurements from RADARSAT, QuikSCAT, AVHRR, SeaWiFS, and MODIS were used. Results from two experiments are reported and discussed: (i) an investigation of 60 combinations of several attributes subjected to the same data transformation and (ii) a survey of 54 other data combinations of three selected variables subjected to different data transformations. In Experiment 1, the best discrimination was reached using ten cube-root-transformed attributes: ~85% overall accuracy using six pieces of size information, three MetOc variables, and one geo-location parameter. In Experiment 2, two combinations of three variables tied as the most effective: ~81% overall accuracy using area (log-transformed), length-to-width ratio (log- or cube-root-transformed), and number of feature parts (non-transformed). After verifying the classification accuracy of 114 algorithms by comparison with expert interpretations, we concluded that applying different data transformations and accounting for MetOc and geo-location attributes optimize the accuracy of binary classifiers (oil spill vs. look-alike slicks) using the simple LDA technique.
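Experiment 2 applies a different transformation to each variable before the LDA step. A minimal sketch of that preprocessing is given below, using the three transformation options listed in the abstract. The attribute names (`area`, `lw_ratio`, `parts`), the function names, and the sample values are hypothetical; the LDA itself is not shown.

```python
import math


def cube_root(x):
    # Real cube root that also works for negative inputs.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


# The three options named in the abstract: none, cube root, log10.
TRANSFORMS = {
    "none": lambda x: x,
    "cube_root": cube_root,
    "log10": math.log10,      # requires strictly positive values
}


def transform_rows(rows, plan):
    """Apply one transformation per attribute.

    rows: list of dicts mapping attribute name -> raw value
    plan: dict mapping attribute name -> key of TRANSFORMS
    """
    return [
        {name: TRANSFORMS[kind](row[name]) for name, kind in plan.items()}
        for row in rows
    ]


# One of the per-variable mixes described for Experiment 2
# (attribute names and values are made up for illustration).
plan = {"area": "log10", "lw_ratio": "cube_root", "parts": "none"}
rows = [{"area": 1000.0, "lw_ratio": 8.0, "parts": 3}]
print(transform_rows(rows, plan))
```

Keeping the plan as data makes it straightforward to enumerate many variable and transformation combinations, which is the kind of survey the two experiments describe.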
The paper describes an optimized computational implementation of a basic 'building block' for nonlinear structural dynamic analysis programs: the combination of the modified Newton-Raphson iterative technique with an implicit time integration operator (in this case a member of the Newmark family), working in an incremental-iterative formulation for the equations of motion. The objective of this implementation is to attain improved computational efficiency, regarding both CPU time and memory requirements. The basic formulation and derivation are presented, along with the implementation details; the positive aspects related to the computational optimization are highlighted.
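To make the building block concrete, here is a minimal single-degree-of-freedom sketch of a Newmark integrator combined with modified Newton-Raphson iterations, in which the effective stiffness is formed once per time step and reused for every iteration of that step. It is an illustration under stated assumptions, not the paper's optimized implementation: the function name, its arguments, the default average-acceleration parameters (beta = 1/4, gamma = 1/2), and the scalar equation m·a + c·v + f_int(u) = p(t) are all choices made here.

```python
def newmark_modified_nr(m, c, f_int, k_tan, load, u0, v0, dt, n_steps,
                        beta=0.25, gamma=0.5, tol=1e-9, max_iter=50):
    """Integrate m*a + c*v + f_int(u) = load(t) for one degree of freedom.

    f_int(u): internal (restoring) force; k_tan(u): its derivative.
    Returns a list of (t, u, v, a) tuples, including the initial state.
    """
    u, v = u0, v0
    a = (load(0.0) - c * v - f_int(u)) / m      # initial acceleration
    history = [(0.0, u, v, a)]
    for n in range(1, n_steps + 1):
        t = n * dt
        # Modified Newton-Raphson: tangent evaluated once, at the start
        # of the step, and held fixed during the iterations below.
        k_eff = k_tan(u) + m / (beta * dt ** 2) + gamma * c / (beta * dt)
        u_new = u                               # predictor: no displacement change
        for _ in range(max_iter):
            # Newmark relations expressed in terms of the trial displacement.
            a_new = ((u_new - u) / (beta * dt ** 2)
                     - v / (beta * dt)
                     - (0.5 / beta - 1.0) * a)
            v_new = v + dt * ((1.0 - gamma) * a + gamma * a_new)
            residual = load(t) - m * a_new - c * v_new - f_int(u_new)
            if abs(residual) < tol:
                break
            u_new += residual / k_eff           # incremental correction
        else:
            raise RuntimeError(f"no convergence at t = {t}")
        u, v, a = u_new, v_new, a_new
        history.append((t, u, v, a))
    return history


# Usage: free vibration of a hardening (Duffing-type) spring.
hist = newmark_modified_nr(
    m=1.0, c=0.0,
    f_int=lambda u: u + 0.5 * u ** 3,
    k_tan=lambda u: 1.0 + 1.5 * u ** 2,
    load=lambda t: 0.0,
    u0=1.0, v0=0.0, dt=0.01, n_steps=500,
)
print(hist[-1])
```

Holding `k_eff` fixed trades a slower (linear) convergence rate for avoiding a stiffness update in every iteration; in a multi-degree-of-freedom code this means one matrix factorization per step instead of one per iteration, which is where the efficiency gain would come from.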
The evaluation of classifiers is not an easy task. There are various ways of testing them and various measures for estimating their performance. The great majority of these measures were defined for two-class problems, and there is no consensus on how to generalize them to multiclass problems. This paper proposes extending the F-measure and G-mean in the same fashion as has been done for the AUC. Several datasets with diverse characteristics are used to generate fuzzy classifiers and C4.5 trees. The most common evaluation metrics are implemented and compared in terms of their output values: the greater the response, the more optimistic the measure. The results suggest that there are two well-behaved measures in opposite roles: one is always optimistic and the other always pessimistic.
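The abstract does not define its multiclass extensions, so the sketch below shows two common generalizations rather than the paper's own: a macro-averaged one-vs-rest F-measure and a G-mean taken as the geometric mean of the per-class recalls. The function names and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter


def _counts(y_true, y_pred):
    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    return tp, fp, fn


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of the one-vs-rest F1 score of each class."""
    tp, fp, fn = _counts(y_true, y_pred)
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def g_mean(y_true, y_pred):
    """Geometric mean of the recall of every class present in y_true."""
    tp, _, fn = _counts(y_true, y_pred)
    labels = sorted(set(y_true))
    recalls = [tp[label] / (tp[label] + fn[label]) for label in labels]
    return math.prod(recalls) ** (1.0 / len(recalls))


# Toy three-class example (illustrative labels only).
truth = ["a", "a", "a", "b", "b", "c"]
preds = ["a", "a", "b", "b", "b", "c"]
print(macro_f_measure(truth, preds), g_mean(truth, preds))
```

The two measures react differently to a neglected class: the G-mean collapses to zero as soon as one class has zero recall, whereas the macro F-measure only loses that class's share of the average.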