Zuhair Bandar scite author profile

Sentence similarity based on semantic nets and corpus statistics

Sentence similarity measures play an increasingly important role in text-related research and applications in areas such as text mining, web page retrieval and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high dimensional space and are consequently inefficient, require human input and are not adaptable to some application domains. This paper focuses directly on computing the similarity between very short texts of sentence length. It presents an algorithm that takes account of semantic information and word order information implied in the sentences. The semantic similarity of two sentences is calculated using information from a structured lexical database and from corpus statistics. The use of a lexical database enables our method to model human common sense knowledge and the incorporation of corpus statistics allows our method to be adaptable to different domains. The proposed method can be used in a variety of applications that involve text knowledge representation and discovery. Experiments on two sets of selected sentence pairs demonstrate that the proposed method provides a similarity measure that shows a significant correlation to human intuition.
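The idea of combining a semantic score with a word order score can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract describes a lexical database and corpus statistics, whereas the sketch below substitutes plain word overlap for the semantic part. The function names, the word order formula and the weight `delta` are all assumptions made for illustration.

```python
import math

def word_order_similarity(s1, s2):
    """Compare word positions over the joint word set (1.0 means same order)."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    joint = sorted(set(t1) | set(t2))
    # Position vector: 1-based index of each joint word, 0 if the word is absent.
    r1 = [t1.index(w) + 1 if w in t1 else 0 for w in joint]
    r2 = [t2.index(w) + 1 if w in t2 else 0 for w in joint]
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 1.0

def overlap_similarity(s1, s2):
    """Stand-in semantic score: Jaccard overlap of word sets (an assumption,
    NOT the lexical-database measure described in the abstract)."""
    a, b = set(s1.lower().split()), set(s2.lower().split())
    return len(a & b) / len(a | b) if a | b else 1.0

def sentence_similarity(s1, s2, delta=0.85):
    """Weighted mix of the two scores; delta is an arbitrary illustrative weight."""
    semantic = overlap_similarity(s1, s2)
    order = word_order_similarity(s1, s2)
    return delta * semantic + (1 - delta) * order

print(sentence_similarity("dog bites man", "man bites dog"))
```

With this sketch, "dog bites man" and "man bites dog" share every word, so the overlap score cannot tell them apart; only the word order term lowers their combined similarity below that of two identical sentences.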

ArabChat: An Arabic Conversational Agent

Hijjawi

Bandar

Crockett

et al. 2014

An approach for measuring semantic similarity between words using multiple information sources

Bandar

McLean

2003

IEEE Trans. Knowl. Data Eng.

854

Silent talker: a new computer-based system for the analysis of facial cues to deception

Rothwell

Bandar

O’Shea

et al. 2006

Appl. Cognit. Psychol.

This paper presents the development of a computerised, non-invasive psychological profiling system, 'Silent Talker', for the analysis of non-verbal behaviour. Non-verbal signals hold rich information about mental, behavioural and/or physical states. Previous attempts to extract individual signals and to classify an overall behaviour have been time-consuming, costly, biased, error-prone and complex. Silent Talker overcomes these problems by the use of Artificial Neural Networks. The testing and validation of the system was undertaken by detecting processes associated with 'deception' and 'truth'. In a simulated theft scenario thirty-nine participants 'stole' (or didn't) money, and were interviewed about its location. Silent Talker was able to detect different behaviour patterns indicative of 'deception' and 'truth' significantly above chance. For example, when 15 European men had no prior knowledge of the exact questions, 74% of individual responses (p < 0.001) and 80% (p = 0.035) of interviews were classified correctly.

There is a long history of research in the field of non-verbal behaviour. Non-verbal cues have, with some difficulty and extensive effort, been collected and analysed by humans. In spite of the many problems this has posed, little work has been carried out on the use of machines for the collection and analysis of these cues. A system that is able to non-invasively, quickly and simultaneously analyse many non-verbal cues and their interrelationships has long been needed. This paper describes such a system and its implementation/testing. Silent Talker is an Artificial Neural Network-based system that automatically collects and analyses non-verbal cues to classify an overall behavioural/psychological state. It can be adapted to detect different such states and tuned to particular situations, environments and applications. In this paper the determination of a particular state is described as profiling.

Non-verbal behaviour consists of all the signs and signals (visual, audio, tactile and chemical) used by human beings to express themselves, except speech or manual sign language (Scherer & Ekman, 1982). These cues hold rich information about mental, behavioural and/or physical states and are, at least in part, involuntary and unintended. Physiological cues, such as EEG waveforms, also hold information, but these generally require invasive contact. The work presented in this paper is limited to visible non-verbal

A Comparative Study of Two Short Text Semantic Similarity Measures

O’Shea

Bandar

Crockett

et al.

This paper describes a comparative study of STASIS and LSA. These measures of semantic similarity can be applied to short texts for use in Conversational Agents (CAs). CAs are computer programs that interact with humans through natural language dialogue. Business organizations have spent large sums of money in recent years developing them for online customer self-service, but achievements have been limited to simple FAQ systems. We believe this is due to the labour-intensive process of scripting, which could be reduced radically by the use of short-text semantic similarity measures. "Short texts" are typically 10-20 words long but are not required to be grammatically correct sentences, for example spoken utterances and text messages. We also present a benchmark data set of 65 sentence pairs with human-derived similarity ratings. This data set is the first of its kind, specifically developed to evaluate such measures and we believe it will be valuable to future researchers.

On constructing a fuzzy inference framework using crisp decision trees

Crockett

Bandar

McLean

et al. 2006

Fuzzy Sets and Systems

A new benchmark dataset with production methodology for short text semantic similarity algorithms

O’Shea

Bandar

Crockett

2013

ACM Trans. Speech Lang. Process.

This research presents a new benchmark dataset for evaluating Short Text Semantic Similarity (STSS) measurement algorithms and the methodology used for its creation. The power of the dataset is evaluated by using it to compare two established algorithms, STASIS and Latent Semantic Analysis. This dataset focuses on measures for use in Conversational Agents; other potential applications include e-mail processing and data mining of social networks. Such applications involve integrating the STSS algorithm in a complex system, but STSS algorithms must be evaluated in their own right and compared with others for their effectiveness before systems integration. Semantic similarity is an artifact of human perception; therefore its evaluation is inherently empirical and requires benchmark datasets derived from human similarity ratings. The new dataset of 64 sentence pairs, STSS-131, has been designed to meet these requirements drawing on a range of resources from traditional grammar to cognitive neuroscience. The human ratings are obtained from a set of trials using new and improved experimental methods, with validated measures and statistics. The results illustrate the increased challenge and the potential longevity of the STSS-131 dataset as the Gold Standard for future STSS algorithm evaluation.
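Evaluating a similarity algorithm against human ratings typically comes down to measuring agreement between two lists of scores. The sketch below shows one way this could be done with a Pearson correlation coefficient; the abstract does not state which statistics the authors used, and the rating values here are made-up placeholders, not data from STSS-131.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical values for four sentence pairs (illustration only).
human_ratings = [0.10, 0.45, 0.80, 0.95]
algorithm_scores = [0.20, 0.40, 0.70, 0.90]

print(round(pearson(human_ratings, algorithm_scores), 3))
```

A coefficient close to 1 would indicate that the algorithm ranks and spaces the sentence pairs much as the human raters do.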

A Machine Learning Approach to Speech Act Classification Using Function Words

O’Shea

Bandar

Crockett

2010

This paper presents a novel technique for the classification of sentences as Dialogue Acts, based on structural information contained in function words. It focuses on classifying questions or non-questions as a generally useful task in agent-based systems. The proposed technique extracts salient features by replacing function words with numeric tokens and replacing each content word with a standard numeric wildcard token. The Decision Tree, which is a well-established classification technique, has been chosen for this work. Experiments provide evidence of potential for highly effective classification, with a significant achievement on a challenging dataset, before any optimisation of feature extraction has taken place.
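The feature extraction step described above can be sketched roughly as follows. The function word list, the token numbering, the padding value and the fixed vector length are all assumptions made for this example; the abstract does not specify them.

```python
# Hypothetical, deliberately tiny function word list (illustration only).
FUNCTION_WORDS = ["what", "is", "the", "do", "you", "a", "of", "to", "i", "it"]
TOKEN_IDS = {word: index + 1 for index, word in enumerate(FUNCTION_WORDS)}
WILDCARD = 0   # every content word maps to the same wildcard token
PAD = -1       # assumed filler so that all vectors have equal length

def encode(sentence, length=6):
    """Replace function words with numeric tokens and content words with a wildcard."""
    words = [w.strip(".,?!").lower() for w in sentence.split()]
    tokens = [TOKEN_IDS.get(w, WILDCARD) for w in words if w]
    # Pad or truncate to a fixed length, as a tabular classifier would expect.
    return (tokens + [PAD] * length)[:length]

print(encode("What is the time?"))        # [1, 2, 3, 0, -1, -1]
print(encode("I like the weather."))      # [9, 0, 3, 0, -1, -1]
```

Fixed-length vectors of this kind could then be passed to a decision tree classifier to separate questions from non-questions, since the content words no longer influence the features.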
