Dong-Phuong Nguyen scite author profile

Dong-Phuong Nguyen

5 Publications

47 Citation Statements Received

133 Citation Statements Given

Affiliations

University of Twente, Meertens Institute

Publications

What Snippets Say about Pages in Federated Web Search

Demeester, Nguyen, Trieschnigg, et al. 2012

Abstract. What is the likelihood that a Web page is considered relevant to a query, given the relevance assessment of the corresponding snippet? Using a new federated IR test collection that contains search results from over a hundred search engines on the internet, we are able to investigate such research questions from a global perspective. Our test collection covers the main Web search engines like Google, Yahoo!, and Bing, as well as a number of smaller search engines dedicated to multimedia, shopping, etc., and as such reflects a realistic Web environment. Using a large set of relevance assessments, we are able to investigate the connection between snippet quality and page relevance. The dataset is strongly inhomogeneous, and although the assessors' consistency is shown to be satisfying, care is required when comparing resources. To this end, a number of probabilistic quantities, based on snippet and page relevance, are introduced and evaluated.
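The opening question of this abstract can be read as a conditional probability, P(page relevant | snippet label). The following is a minimal sketch of how such a quantity might be estimated from paired judgments. It is not the authors' method: the function name, the label strings, and the toy data are all hypothetical, and the paper's actual probabilistic quantities are not specified in the abstract.

```python
from collections import Counter


def page_relevance_given_snippet(judgments):
    """Estimate P(page relevant | snippet label) by simple counting.

    `judgments` is an iterable of (snippet_label, page_is_relevant) pairs.
    This layout is an assumption made for illustration only.
    """
    totals = Counter()    # number of judged pairs per snippet label
    relevant = Counter()  # number of those pairs whose page was judged relevant
    for snippet_label, page_is_relevant in judgments:
        totals[snippet_label] += 1
        if page_is_relevant:
            relevant[snippet_label] += 1
    # Relative frequency of relevant pages within each snippet label.
    return {label: relevant[label] / totals[label] for label in totals}


# Hypothetical toy judgments, NOT data from the paper.
toy_judgments = [
    ("relevant", True),
    ("relevant", True),
    ("relevant", False),
    ("non-relevant", False),
    ("non-relevant", False),
    ("non-relevant", True),
    ("non-relevant", False),
]

estimates = page_relevance_given_snippet(toy_judgments)
print(estimates)
```

Because the abstract describes the dataset as strongly inhomogeneous, a per-resource breakdown (one such table per search engine) would likely be more informative than a single pooled estimate.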

Text as social and cultural data : a computational perspective on variation in text

Nguyen

Automatic Detection of Intra-Word Code-Switching

Nguyen, Cornips 2016

Welcome to the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology. The workshop aims to bring together researchers interested in applying computational techniques to problems in morphology, phonology, and phonetics. Our program this year highlights the ongoing and important interaction between work in computational linguistics and work in theoretical linguistics. We received 23 submissions and accepted 11. The volume of submissions made it necessary to recruit several additional reviewers. We'd like to thank all of these people for agreeing to review papers on what seemed like impossibly short notice. This year also marks the first SIGMORPHON shared task, on morphological reinflection. The shared task received 9 submissions, all of which were accepted, and greatly advanced the state of the art in this area. We thank all the authors, reviewers and organizers for their efforts on behalf of the community.

Abstract. This paper conceptualizes speech prosody data mining and its potential application in data-driven phonology/phonetics research. We first conceptualize Speech Prosody Mining (SPM) in a time-series data mining framework. Specifically, we propose using efficient symbolic representations for speech prosody time-series similarity computation. We experiment with both symbolic and numeric representations and distance measures in a series of time-series classification and clustering experiments on a dataset of Mandarin tones. Evaluation results show that symbolic representation performs comparably with other representations at a reduced cost, which enables us to efficiently mine large speech prosody corpora while opening up to possibilities of using a wide range of algorithms that require discrete valued data. We discuss the potential of SPM using time-series mining techniques in future works.

Introduction. Current investigations on the phonology of intonation and tones (or pitch accent) typically employ data-driven approaches by building research on top of manual annotations of a large amount of speech prosody data (for example, (Morén and Zsiga, 2006; Zsiga and Zec, 2013), and many others). Meanwhile, researchers are also limited by the amount of resources invested in such expensive endeavor of manual annotations. Given this paradox, we believe that this type of data driven approach in phonology-phonetics interface can benefit from tools that can efficiently index, query, classify, cluster, summarize, and discover meaningful prosodic patterns from a large speech prosody corpus.

The data mining of f0 (pitch) contour patterns from audio data has recently gained success in the domain of Music Information Retrieval (aka MIR, see (Gulati and Serra, 2014; Gulati et al., 2015; Ganguli, 2015) for examples). In contrast, the data mining of speech prosody f0 data (here on referred to as Speech Prosody Mining (SPM)) is a less explored research topic (Raskinis and Kazlauskiene, 2013). Fundamentally, SPM in a large prosody corpus aims at discovering meaningful patterns in the f ...

Audience and the Use of Minority Languages on Twitter

Nguyen, Trieschnigg, Cornips 2021

ICWSM

On Twitter, many users tweet in more than one language. In this study, we examine the use of two Dutch minority languages. Users can engage with different audiences and by analyzing different types of tweets, we find that characteristics of the audience influence whether a minority language is used. Furthermore, while most tweets are written in Dutch, in conversations users often switch to the minority language.

The apocalypse on Twitter

Meder, Nguyen, Gravel 2015

Digital Scholarship in the Humanities
