This article provides an overview of BIOASQ, a new competition on biomedical semantic indexing and question answering (QA). BIOASQ aims to push towards systems that will allow biomedical workers to express their information needs in natural language and that will return concise and user-understandable answers by combining information from multiple sources of different kinds, including biomedical articles, databases, and ontologies. BIOASQ encourages participants to adopt semantic indexing as a means to combine multiple information sources and to facilitate the matching of questions to answers. It also adopts a broad semantic indexing and QA architecture that subsumes current relevant approaches, even though no current system instantiates all of its components. Hence, the architecture can also be seen as our view of how relevant work from fields such as information retrieval, hierarchical classification, question answering, ontologies, and linked data can be combined, extended, and applied to biomedical question answering. BIOASQ will develop publicly available benchmarks and it will adopt and possibly refine existing evaluation measures. The evaluation infrastructure of the competition will remain publicly available beyond the end of BIOASQ.
This paper reports on the Large Scale Hierarchical Classification workshop (http://kmi.open.ac.uk/events/ecir2010/workshops-tutorials), held in conjunction with the European Conference on Information Retrieval (ECIR) 2010. The workshop was associated with the PASCAL 2 Large-Scale Hierarchical Text Classification Challenge (http://lshtc.iit.demokritos.gr), which took place in 2009. We first describe the challenge, presenting the data used, the tasks, and the evaluation measures. We then give an overview of the approaches proposed by the workshop participants, together with a summary of the challenge results.
In recent years, there has been increasing interest in analyzing social networks and modeling their dynamics at different scales. This work focuses on predicting the future form of communities, which represent the mesoscale structure of networks and arise as a result of user interaction. We represent each community, along with its past form, using several structural and temporal features, and formulate a supervised learning task to predict whether the community will continue as it currently is, shrink, grow, or completely disappear. To test our methodology, we created a real-life social network dataset consisting of an excerpt of posts from the Mathematics Stack Exchange Q&A site. In the experiments, special care is taken in handling the class imbalance in the dataset and in investigating how the past evolutions of a community affect predictions.
This paper presents a knowledge discovery framework for the construction of Community Web Directories, a concept that we introduced in our recent work, applying personalization to Web directories. In this context, the Web directory is viewed as a thematic hierarchy and personalization is realized by constructing user community models on the basis of usage data. In contrast to most of the work on Web usage mining, the usage data analyzed here correspond to user navigation throughout the Web, rather than within a particular Web site, and as a result exhibit a high degree of thematic diversity. For modeling the user communities, we introduce a novel methodology that combines the users' browsing behavior with thematic information from the Web directories. Following this methodology, we enhance the clustering and probabilistic approaches presented in previous work and also present a new algorithm that combines these two approaches. The resulting community models take the form of Community Web Directories. The proposed personalization methodology is evaluated on both a specialized artificial Web directory and a general-purpose one, indicating its potential value to the Web user. The experiments also assess the effectiveness of the different machine learning techniques on the task.