Abstract: The accelerating progress of science, together with the active role of communication media (mainly the web), confronts people with a difficult task: finding appropriate information in a short time. In a narrower context, much research has been carried out in the expertise retrieval domain, an interesting and complicated task for the scientific community given the huge amount of data scattered across the web. Benefiting from the semantic web technologies and the efforts of data structuring, in th…
“…The set of profiles belong to researchers where each profile contains personal information and the academic work done by the researcher, having the following attributes: name, affiliation, e-mail, location, homepage, summary about his career and a list of all his publications. These profiles are generated in [20] through correlating the information extracted from heterogeneous sources, by taking advantage of data repetition in multiple sources and those existing in one source. On one hand, data is being validated and links between sources are created.…”
With the enormous growth of data, retrieving information from the Web has become both more desirable and more challenging because of Big Data issues (e.g. noise, corruption, and poor quality). Expert seeking, defined as returning a ranked list of expert researchers given a topic, has been an active concern for the last 15 years. The task is useful when building scientific committees, which requires identifying scholars' experience, among other factors, in order to assign them the most suitable roles. Because the Web holds an abundance of data, there is an opportunity to collect different kinds of expertise evidence. In this paper, we propose an expert seeking approach that specifies the most desirable features (i.e. the criteria on which a researcher is evaluated) along with their estimation techniques. We use machine learning techniques in our system and aim to verify the effectiveness of incorporating influential features that go beyond publications.
“…This approach of author entity resolution is the predecessor of our author entity matching and profiling approach (CARP) [11]. In CARP we correlate information from several web resources (DBLP among them) to formulate researcher profiles.…”
Many authors can share the same name, which is a serious problem that affects the relevance of retrieval results and motivates our approach to resolving this issue at the author-name entity level. Solving this problem may yield gains in document retrieval, web search, and data quality. This entity resolution task can be tackled as an unsupervised problem, where a set of features is employed for the resolution, or as a supervised problem, where the similarity between two citations is computed and the pair is then classified as referring to the same author or not. Recent approaches usually use features such as co-authors, venue, topic similarity, affiliations, and publication titles to deal with author ambiguity. In this paper, three attributes are applied sequentially: first co-authorship, a well-known attribute, and then the topic and affiliation extracted from the biographies found inside publications, which is the novel contribution of this paper.
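The sequential use of co-authorship, topic, and affiliation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record fields (`coauthors`, `topics`, `affiliation`), the Jaccard overlap used for topic similarity, the 0.5 threshold, and the union-find grouping are all assumptions made for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_author(r1, r2, topic_threshold=0.5):
    """Check the three attributes in sequence (hypothetical rule set)."""
    # 1. Co-authorship: any shared co-author counts as a match.
    if r1["coauthors"] & r2["coauthors"]:
        return True
    # 2. Topic: keyword overlap, assumed here to be Jaccard similarity.
    if jaccard(r1["topics"], r2["topics"]) >= topic_threshold:
        return True
    # 3. Affiliation: exact match on a non-empty string.
    return bool(r1["affiliation"]) and r1["affiliation"] == r2["affiliation"]


def disambiguate(records):
    """Cluster records that carry the same author name, one cluster per inferred person."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if same_author(records[i], records[j]):
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Four made-up publication records that all carry the same author name.
records = [
    {"coauthors": {"A. Smith"}, "topics": {"retrieval", "ranking"}, "affiliation": "Univ X"},
    {"coauthors": {"A. Smith", "B. Lee"}, "topics": {"databases"}, "affiliation": "Univ Y"},
    {"coauthors": {"C. Kim"}, "topics": {"genomics", "proteins"}, "affiliation": "Univ Z"},
    {"coauthors": {"D. Roy"}, "topics": {"genomics", "proteins", "sequencing"}, "affiliation": ""},
]
print(disambiguate(records))
```

In this toy input, records 0 and 1 are merged by a shared co-author, and records 2 and 3 by topic overlap, so two distinct people are inferred. Because merging is transitive, one weak match can chain unrelated records together; a real system would likely need stricter thresholds or a validation step.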