A crime is an action that constitutes an offence punishable by law. Because crime is harmful to society, understanding it is important for preventing criminal activity. Data-driven research is useful for preventing and solving crime. Recent research shows that 50% of crimes are committed by only a handful of offenders. Law enforcement officers need early information about criminal activity in order to respond to and solve spatio-temporal criminal activity. In this research, supervised learning algorithms are used to predict criminal activity. The proposed data-driven system predicts crimes by analyzing a San Francisco criminal activity data set covering 12 years. Decision tree and k-nearest neighbor (KNN) algorithms are applied first to predict crime, but these two algorithms provide low prediction accuracy. Random forest is then applied as an ensemble method and AdaBoost as a boosting method to increase the accuracy of prediction. Log loss is used to measure the performance of the classifiers by penalizing false classifications. Because the data set is highly class-imbalanced, random undersampling combined with the random forest algorithm gives the best accuracy. The final accuracy is 99.16% with 0.17% log loss.
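The two evaluation ideas named in the abstract, log loss and random undersampling, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names `log_loss` and `random_undersample`, the dictionary-based probability format, and the toy labels are assumptions made here for illustration only.

```python
import math
import random
from collections import defaultdict


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of class labels.
    y_prob: list of dicts mapping each label to its predicted probability.
    Probabilities are clipped to [eps, 1 - eps] so log(0) never occurs.
    """
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        p = min(max(probs[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement down to the minority-class size.
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy, hypothetical data: 6 rows of class "A", 2 rows of class "B".
rows = [[i] for i in range(8)]
labels = ["A"] * 6 + ["B"] * 2
balanced_rows, balanced_labels = random_undersample(rows, labels)
```

A confident wrong prediction is penalized far more heavily than an uncertain one, which is why log loss complements plain accuracy on imbalanced data. Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps its original class distribution.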
A crime is a punishable offence that is harmful to an individual and to society. Understanding the patterns of criminal activity is essential to preventing it. Research can help society prevent and solve criminal activities. Studies show that only 10 percent of offenders commit 50 percent of all offences. Enforcement teams can respond faster if they have early information and prior knowledge about criminal activity at different points in a city. In this paper, supervised learning techniques are used to predict crimes with better accuracy. The proposed system predicts crimes by analyzing a data set that contains records of previously committed crimes and their patterns. The system is built on two main algorithms: (i) decision tree and (ii) k-nearest neighbor. The random forest algorithm and AdaBoost are used to increase the accuracy of the prediction. Finally, oversampling is used for better accuracy. The proposed system is fed with a twelve-year criminal activity data set from San Francisco.
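The abstract does not say which oversampling variant is used. As one possibility, the simplest form, random oversampling with replacement, can be sketched with the standard library alone. The function name `random_oversample` and the toy labels are hypothetical and chosen here only for illustration.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many rows as the most frequent class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_max = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Keep every original row, then draw extra copies with replacement.
        extra = [rng.choice(group) for _ in range(n_max - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy, hypothetical data: 5 rows of class "A", 2 rows of class "B".
rows = [[i] for i in range(7)]
labels = ["A"] * 5 + ["B"] * 2
over_rows, over_labels = random_oversample(rows, labels)
class_counts = Counter(over_labels)
```

Unlike undersampling, this keeps every original record, at the cost of repeating minority-class rows. As with any resampling, it is usually applied after the train/test split so that duplicated rows do not leak into the evaluation set.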