Lushan Han scite author profile

RDF123: From Spreadsheets to RDF

We describe RDF123, a highly flexible open-source tool for translating spreadsheet data to RDF. Existing spreadsheet-to-RDF tools typically map only to star-shaped RDF graphs, i.e. each spreadsheet row is an instance, with each column representing a property. RDF123, on the other hand, allows users to define mappings to arbitrary graphs, thus allowing much richer spreadsheet semantics to be expressed. Further, each row in the spreadsheet can be mapped with a fairly different RDF scheme. Two interfaces are available. The first is a graphical application that allows users to create their mappings in an intuitive manner. The second is a Web service that takes as input a URL to a Google spreadsheet or CSV file and an RDF123 map, and provides RDF as output.
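To make the "star-shaped" baseline concrete, here is a minimal Python sketch of the kind of mapping the abstract contrasts RDF123 with: each row becomes one subject and each column becomes one property of that subject. This is not RDF123's mapping language. The function name, the base URI, and the `row/` and `prop/` naming scheme are all hypothetical, chosen only for illustration.

```python
import csv
import io


def rows_to_star_triples(csv_text, base="http://example.org/"):
    """Map CSV rows to (subject, predicate, object) triples, star-shaped.

    Assumption: the base URI and the row/ and prop/ path segments are
    illustrative only and do not come from RDF123.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for index, row in enumerate(reader, start=1):
        subject = f"{base}row/{index}"  # one instance per spreadsheet row
        for column, value in row.items():
            predicate = f"{base}prop/{column}"  # one property per column
            triples.append((subject, predicate, value))
    return triples


if __name__ == "__main__":
    sample = "name,city\nAda,London\nAlan,Wilmslow\n"
    for triple in rows_to_star_triples(sample):
        print(triple)
```

Every triple produced this way hangs off the row's own subject, which is why the resulting graph is a star. A mapping to an arbitrary graph, as the abstract describes, would also allow a cell value to become a subject or to link to a node derived from another column.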

GoRelations: An Intuitive Query System for DBpedia

Han, Finin, Joshi (2012)

Improving Word Similarity by Augmenting PMI with Estimates of Word Polysemy

Han, Finin, McNamee, et al. (2013), IEEE Trans. Knowl. Data Eng.

Pointwise mutual information (PMI) is a widely used word similarity measure, but it lacks a clear explanation of how it works. We explore how PMI differs from distributional similarity, and we introduce a novel metric, PMImax, that augments PMI with information about a word's number of senses. The coefficients of PMImax are determined empirically by maximizing a utility function based on the performance of automatic thesaurus generation. We show that it outperforms traditional PMI in the application of automatic thesaurus generation and in two word similarity benchmark tasks: human similarity ratings and TOEFL synonym questions. PMImax achieves a correlation coefficient comparable to the best knowledge-based approaches on the Miller-Charles similarity rating dataset.
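For readers unfamiliar with the baseline measure, the following Python sketch computes standard PMI, log2(p(x, y) / (p(x) p(y))), from sentence-level co-occurrence. It illustrates only traditional PMI, not the PMImax metric, whose coefficients are not given in the abstract. The function name and the choice of a whole sentence as the co-occurrence window are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Return {(w1, w2): PMI} for word pairs that co-occur in a sentence.

    Assumption: probabilities are estimated as the fraction of sentences
    containing a word (or both words of a pair).
    """
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        word_counts.update(words)
        # Sorting first gives each unordered pair one canonical key.
        pair_counts.update(combinations(words, 2))

    total = len(sentences)
    table = {}
    for (w1, w2), joint in pair_counts.items():
        p_joint = joint / total
        p_w1 = word_counts[w1] / total
        p_w2 = word_counts[w2] / total
        table[(w1, w2)] = math.log2(p_joint / (p_w1 * p_w2))
    return table


if __name__ == "__main__":
    corpus = ["a b", "a b", "c d", "a c"]
    for pair, score in sorted(pmi_table(corpus).items()):
        print(pair, round(score, 3))
```

A positive score means the pair co-occurs more often than independence would predict, and a negative score means less often. Pairs that never co-occur are simply absent from the table.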

Robust semantic text similarity using LSA, machine learning, and linguistic resources

Kashyap, Han, Yus, et al. (2015), Lang Resources & Evaluation

Semantic textual similarity is a measure of the degree of semantic equivalence between two pieces of text. We describe the SemSim system and its performance in the *SEM 2013 and SemEval-2014 tasks on semantic textual similarity. At the core of our system lies a robust distributional word similarity component that combines Latent Semantic Analysis and machine learning augmented with data from several linguistic resources. We used a simple term alignment algorithm to handle longer pieces of text. Additional wrappers and resources were used to handle task-specific challenges that include processing Spanish text, comparing text sequences of different lengths, handling informal words and phrases, and matching words with sense definitions. In the *SEM 2013 task on Semantic Textual Similarity, our best performing system ranked first among the 89 submitted runs. In the SemEval-2014 task on Multilingual Semantic Textual Similarity, we ranked a close second in both the English and Spanish subtasks. In the SemEval-2014 task on Cross-Level Semantic Similarity, we ranked first in the Sentence-Phrase, Phrase-Word, and Word-Sense subtasks and second in the Paragraph-Sentence subtask.

Meerkat Mafia: Multilingual and Cross-Level Semantic Textual Similarity Systems

Kashyap, Han, Yus, et al. (2014)

We describe UMBC's systems developed for the SemEval 2014 tasks on Multilingual Semantic Textual Similarity (Task 10) and Cross-Level Semantic Similarity (Task 3). Our best submission in the Multilingual task ranked second in both the English and Spanish subtasks using an unsupervised approach. Our best systems for the Cross-Level task ranked second in the Paragraph-Sentence subtask and first in both the Sentence-Phrase and Word-Sense subtasks. The system ranked first for the Phrase-Word subtask but was not included in the official results due to a late submission.
