AuthCrowd: Author Name Disambiguation and Entity Matching using Crowdsourcing

Correia, António; Guimarães, Diogo Lemos; Paulino, Dennis; Jameel, Shoaib; Schneider, Daniel; Fonseca, Benjamim; Paredes, Hugo

doi:10.1109/cscwd49262.2021.9437769

Cited by 6 publications

(2 citation statements)

References 17 publications

“…For instance, Ferreira et al introduced "AuthCrowd," a crowdsourcing system designed to tackle author name disambiguation and entity matching by decomposing tasks for crowd workers. Experimental results on a real-world dataset of publicly available papers published in peer-reviewed venues demonstrate the potential of this approach to improve author name disambiguation [4].…”

Section: Introduction (mentioning)

confidence: 93%

Use Large Language Models for Named Entity Disambiguation in Academic Knowledge Graphs

Liu, Fang

2023

Atlantis Highlights in Computer Sciences

This study investigates the application of large language models (LLMs) in disambiguating homonymous named entities in academic knowledge graphs. Current state-of-the-art methods rely on supervised learning techniques that often necessitate extensive annotated datasets, which may be scarce in specialized domains. For further exploration, we constructed an academic knowledge graph in the science and technology domain using publicly available data and extracted contrasting homonymous named entities from different projects to create a test dataset. We evaluated the performance of the ChatGPT model on this dataset using zero-shot, in-context, and chain-of-thought prompting strategies. The experimental results reveal that while LLMs achieve limited success in a zero-shot setting, chain-of-thought prompting can enhance their reasoning abilities. However, a performance gap persists when compared to supervised learning methods specifically trained on the dataset. These findings suggest that LLMs, such as ChatGPT, present a promising direction for assisting in knowledge graph construction for named entity disambiguation, particularly when labeled data is scarce. The utilization of LLMs could be especially beneficial for domains lacking extensive annotated datasets, offering a competitive alternative for disambiguating homonymous named entities.

“…The core of the system lies in a Bi-LSTM classification model, a type of recurrent neural network (RNN) adept at handling sequential data like text. This model analyzes the pre-processed text, identifying linguistic features and stylistic choices often associated with deceptive content. By continuously learning from user-provided feedback on test sample labels [10], the system refines its classification ability, becoming progressively adept at distinguishing between truthful and deceptive statements. This user-centric approach fosters a collaborative environment, empowering users to contribute to a more reliable and trustworthy digital information landscape.…”

Section: Proposed Model (mentioning)

confidence: 99%

AI Assisted Deceptive Content Analysis System Using Bi-LSTM

Ajayaghosh

2024

IJRASET

In an age of widespread misinformation, creating accurate machine learning algorithms is vital for ensuring the integrity of information sharing. This project aims to develop a machine learning model capable of distinguishing between fake and true news articles [1]. The model utilizes a Bidirectional Long Short-Term Memory (Bi-LSTM) neural network architecture. Exploratory data analysis techniques are employed to gain insights into the characteristics and distributions within the dataset. Data visualization techniques aid in understanding patterns and relationships within the dataset. Additionally, unigram analysis is conducted to extract meaningful features from the text data. The datasets are then prepared for model training, involving preprocessing steps such as tokenization and vectorization. Finally, the Bi-LSTM model is constructed, leveraging its ability to capture long-range dependencies in sequential data. The model is trained on the prepared datasets, optimized using appropriate techniques, and evaluated using metrics such as accuracy, precision, recall, and F1-score. The Bi-LSTM architecture offers the advantage of capturing long-range dependencies in sequential data, thereby enhancing the model’s ability to discern nuanced patterns in news articles. The primary objective of this project is to develop a robust and accurate system for automatically detecting fake news, thus playing a pivotal role in enhancing the dissemination of reliable information in the digital age.

Author name disambiguation literature review with consolidated meta-analytic approach

Rodrigues, Mariano, Ralha

2024

Int J Digit Libr

Name ambiguity is a common problem in many bibliographic repositories affecting data integrity and validity. This article presents an author name disambiguation (AND) literature review using the theory of the consolidated meta-analytic approach, including quantitative techniques and bibliometric aspects. The literature review covers information from 211 documents of the Web of Science and Scopus databases in the period 2003 to 2022. A taxonomy based on the literature was used to organize the identified approaches to solve the AND problem. We identified that the most widely used AND solving approaches are author grouping associated with similarity functions and clustering methods and some works using author assignment allied to classification methods. The countries that publish most in AND are the USA, China, Germany, and Brazil with 21%, 19%, 13% and 8% of the total papers, respectively. The review results provide an overview of AND state-of-the-art research that can direct further investigation based on the quantitative and qualitative information from the AND research history.
