Web legal information retrieval systems need the capability to reason with the knowledge modeled by legal ontologies. Using this knowledge, it is possible to represent and make inferences about the semantic content of legal documents. In this paper, a methodology for applying NLP techniques to automatically create a legal ontology is proposed. The ontology is defined in the OWL semantic web language and is used in a logic programming framework, EVOLP+ISCO, to allow users to query the semantic content of the documents. ISCO allows an easy and efficient integration of declarative, object-oriented and constraint-based programming techniques, with the capability to create connections to external databases. EVOLP is a dynamic logic programming framework allowing the definition of rules for actions and events. An application of the proposed methodology to the legal web information retrieval system of the Portuguese Attorney General's Office is described.
This paper describes our participation in SemEval-2015 Task 12 with the opinion mining system sentiue. The general idea is that systems must determine the polarity of the sentiment expressed about a certain aspect of a target entity. For slot 1, entity and attribute category detection, our system applies a supervised machine learning classifier for each label, followed by a selection based on the probability of the entity/attribute pair in that domain. Target expression detection, for slot 2, is achieved by using a catalog of known targets for each entity type, complemented with named entity recognition. For the sentiment polarity slot, we used a 3-class polarity classifier with features based on BoW, lemmas, bigrams after verbs, the presence of polarized terms, and punctuation. Working in unconstrained mode, our results for slot 1 were assessed with precision between 57% and 63% and recall between 42% and 47%. In sentiment polarity, sentiue's accuracy was approximately 79%, reaching the best score in 2 of the 3 domains.
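The per-label category detection described for slot 1 can be sketched as one binary classifier per entity/attribute label, keeping the labels whose predicted probability clears a threshold. This is an illustrative reconstruction, not the authors' exact pipeline: the labels, training sentences, and threshold below are toy assumptions.

```python
# Hypothetical sketch of slot-1 category detection: one binary classifier
# per entity/attribute label over bag-of-words features; a label is kept
# when its predicted probability exceeds a threshold.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

train_texts = [
    "the pasta was delicious",
    "great food and tasty pasta",
    "our waiter was very rude",
    "service was slow but unfriendly",
]
# Multi-label targets (illustrative SemEval-style labels)
labels = {
    "FOOD#QUALITY":    [1, 1, 0, 0],
    "SERVICE#GENERAL": [0, 0, 1, 1],
}

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(train_texts)

# One classifier per label (a stand-in for whatever supervised model was used)
classifiers = {
    label: LogisticRegression().fit(X, y) for label, y in labels.items()
}

def detect_categories(text, threshold=0.5):
    """Return every entity/attribute label whose probability clears the bar."""
    x = vectorizer.transform([text])
    return [
        label for label, clf in classifiers.items()
        if clf.predict_proba(x)[0, 1] >= threshold
    ]
```

In practice the selection step would also use domain-specific priors over entity/attribute pairs, as the abstract notes.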
Traffic accidents are a major concern worldwide, since they result each year in numerous injuries and fatalities, as well as significant economic losses. Many factors are responsible for causing road accidents. If these factors can be better understood and predicted, it might be possible to take measures to mitigate the damage and the severity of accidents. The purpose of this work is to identify these factors using accident data from 2016 to 2019 from the district of Setúbal, Portugal. This work aims at developing models that can select a set of influential factors that may be used to classify the severity of an accident, supporting an analysis of the accident data. In addition, this study also proposes a predictive model for future road accidents based on past data. Various machine learning approaches are used to create these models. Supervised methods such as decision trees (DT), random forests (RF), logistic regression (LR), and naive Bayes (NB) are used, as well as unsupervised techniques including DBSCAN and hierarchical clustering. Results show that a rule-based model using the C5.0 algorithm is capable of accurately detecting the most relevant factors describing road accident severity. Further, the results of the predictive model suggest that the RF model could be a useful tool for forecasting accident hotspots.
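The factor-ranking idea above can be illustrated in miniature: fit a tree model on accident records and read off feature importances. This is a sketch under assumptions, not the authors' method — scikit-learn's CART tree stands in for C5.0, and the data are synthetic.

```python
# Illustrative sketch: a decision tree classifies severity from a few
# candidate factors, and a random forest's feature_importances_ gives
# the kind of factor ranking described above. All data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 500
# Synthetic accident records: [speed_kmh, rain (0/1), night (0/1)]
X = np.column_stack([
    rng.uniform(30, 130, n),
    rng.integers(0, 2, n),
    rng.integers(0, 2, n),
])
# Toy severity rule: severe when fast AND raining ("night" is pure noise)
y = ((X[:, 0] > 90) & (X[:, 1] == 1)).astype(int)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank candidate factors by importance, most relevant first
ranking = sorted(
    zip(["speed", "rain", "night"], forest.feature_importances_),
    key=lambda p: -p[1],
)
```

On data like this, the noise factor ends up at the bottom of the ranking, which is the behaviour the abstract attributes to the rule-based model at full scale.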
Modern information retrieval systems need the capability to reason about the knowledge conveyed by text bases. In this paper a methodology to automatically create ontologies and class instances from documents is proposed. The ontology is defined in the OWL semantic web language and it is used by a logic programming framework, ISCO, to allow users to query the semantic content of the documents. ISCO allows an easy and efficient integration of declarative, object-oriented and constraint-based programming techniques with the capability to create connections with external databases.
Portugal has the sixth highest road fatality rate among European Union members. This is a multidimensional problem with serious consequences for people's lives. This study analyses daily data from police and government authorities on road traffic accidents that occurred between 2016 and 2019 in a district of Portugal. This paper looks for the determinants that contribute to the existence of victims in road traffic accidents, as well as the determinants of fatalities and/or serious injuries in accidents with victims. We use logistic regression models, and the results are compared to the machine-learning model results. For the severity model, where the response variable indicates whether the traffic accident resulted only in property damage or in casualties, we used a large sample with a small imbalance. For the serious injuries model, where the response variable indicates whether an accident with victims involved serious injuries and/or fatalities, we used a small sample with very imbalanced data. Empirical analysis supports the conclusion that, with a small sample of imbalanced data, machine-learning models generally do not perform better than statistical models; however, they perform similarly when the sample is large and has a small imbalance.
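The comparison described above can be sketched as fitting a statistical model (logistic regression) and a machine-learning model (random forest) on the same imbalanced sample and scoring both with an imbalance-aware metric. Sample sizes, class ratio, and features below are illustrative assumptions, not the study's data.

```python
# Sketch: statistical vs machine-learning model on imbalanced data,
# compared with balanced accuracy. Synthetic data mimic the heavily
# imbalanced serious-injuries setting (~5% positives).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.95], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

lr = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)
rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

lr_score = balanced_accuracy_score(y_te, lr.predict(X_te))
rf_score = balanced_accuracy_score(y_te, rf.predict(X_te))
```

Stratified splitting and balanced accuracy matter here: with 95% negatives, plain accuracy would reward a model that never predicts the minority class.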
Current technology facilitates access to the vast amount of information that is produced every day. Both individuals and companies are active consumers of data from the Web and other sources, and these data guide decision making. Due to the huge volume of data to be processed in a business context, managers rely on decision support systems to facilitate data analysis. OLAP tools are Business Intelligence solutions for multidimensional analysis of data, allowing the user to control the perspective and the degree of detail in each dimension of the analysis. A conventional OLAP system is configured for a set of analysis scenarios associated with multidimensional data cubes in the repository. To handle a more spontaneous query, not supported in these provided scenarios, one must have specialized technical skills in data analytics. This makes it very difficult for average users to be autonomous in analyzing their data, as they will always need the assistance of specialists. This article describes an ontology-based natural language interface whose goal is to simplify the interaction between users and OLAP solutions and make it more flexible and intuitive. Instead of programming an MDX query, the user can freely write a question in natural language. The system interprets this question by combining the requested information elements, and generates an answer from the OLAP repository.
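The question-to-MDX step can be shown with a toy sketch: map terms of a free-text question onto cube elements (a measure and a dimension) and assemble the MDX query. The real system uses an ontology for this mapping; here a plain dictionary, and the cube and member names, are illustrative assumptions.

```python
# Toy sketch of an NL-to-MDX interface: keyword lookup stands in for the
# ontology-based interpretation described above.
MEASURES = {"sales": "[Measures].[Sales Amount]"}
DIMENSIONS = {
    "year": "[Date].[Calendar Year].Members",
    "country": "[Customer].[Country].Members",
}

def question_to_mdx(question, cube="Sales"):
    """Build an MDX query from a free-text question, if it can be mapped."""
    words = question.lower().split()
    measure = next((m for w, m in MEASURES.items() if w in words), None)
    dim = next((d for w, d in DIMENSIONS.items() if w in words), None)
    if measure is None or dim is None:
        raise ValueError("could not interpret question")
    return (f"SELECT {measure} ON COLUMNS, "
            f"{dim} ON ROWS FROM [{cube}]")

print(question_to_mdx("show total sales by year"))
```

A production interpreter would resolve synonyms, aggregation levels, and filters through the ontology rather than exact keyword matches, but the pipeline shape — interpret, map to cube elements, generate MDX — is the same.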
This document describes an approach to sentiment analysis of Portuguese social media content. In a single system, we perform polarity classification for both overall sentiment and target-oriented sentiment. In both modes we train a Maximum Entropy classifier. The overall model is based on BoW-type features, together with features derived from POS tagging and from sentiment lexicons. Target-oriented analysis begins with named entity recognition, followed by classification of the sentiment polarity expressed about these entities. This classifier model uses features dedicated to the textual zone of the entity mention, including negation detection, and the syntactic function of the segment where the target occurs. Our experiments achieved an accuracy of 75% for target-oriented polarity classification, and 97% for overall polarity.
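The overall-polarity model can be sketched as a Maximum Entropy classifier (logistic regression) over BoW features plus a sentiment-lexicon count, one of the feature families mentioned above. The lexicon entries and training snippets below are toy assumptions, not the authors' Portuguese resources.

```python
# Sketch of the overall-polarity classifier: MaxEnt over BoW plus a
# net lexicon-hit feature. Lexicon and examples are illustrative.
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

POSITIVE = {"bom", "ótimo", "excelente"}    # toy positive lexicon
NEGATIVE = {"mau", "péssimo", "horrível"}   # toy negative lexicon

texts = ["serviço excelente", "atendimento péssimo",
         "filme muito bom", "produto horrível e mau"]
y = [1, 0, 1, 0]  # 1 = positive, 0 = negative

vec = CountVectorizer()
X_bow = vec.fit_transform(texts)

def lexicon_feature(docs):
    # net count of positive minus negative lexicon hits per document
    return csr_matrix([[sum(w in POSITIVE for w in d.split())
                        - sum(w in NEGATIVE for w in d.split())]
                       for d in docs])

X = hstack([X_bow, lexicon_feature(texts)])
clf = LogisticRegression().fit(X, y)  # MaxEnt = multinomial logistic model

def predict(doc):
    x = hstack([vec.transform([doc]), lexicon_feature([doc])])
    return clf.predict(x)[0]
```

The target-oriented mode would add zone-specific features (negation scope, syntactic function of the target segment) computed only around the entity mention.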