Sentence similarity measures play an increasingly important role in text-related research and applications in areas such as text mining, web page retrieval and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high-dimensional space and are consequently inefficient, require human input and are not adaptable to some application domains. This paper focuses directly on computing the similarity between very short texts of sentence length. It presents an algorithm that takes account of semantic information and word order information implied in the sentences. The semantic similarity of two sentences is calculated using information from a structured lexical database and from corpus statistics. The use of a lexical database enables our method to model human common sense knowledge and the incorporation of corpus statistics allows our method to be adaptable to different domains. The proposed method can be used in a variety of applications that involve text knowledge representation and discovery. Experiments on two sets of selected sentence pairs demonstrate that the proposed method provides a similarity measure that shows a significant correlation to human intuition.
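The abstract above names two ingredients, semantic information and word order information, without giving formulas. The following is a minimal sketch of how a word order score could be computed and blended with a semantic score. Everything in it is an assumption made for illustration: the position-vector order measure, exact string matching between words, the function names and the weight `delta`. The semantic score is taken as an input, since computing it would require the lexical database and corpus statistics the abstract mentions.

```python
import math


def order_vector(words, joint):
    """Return the 1-based position of each joint-set word in `words` (0 if absent)."""
    position = {}
    for index, word in enumerate(words, start=1):
        position.setdefault(word, index)  # keep the first occurrence only
    return [position.get(word, 0) for word in joint]


def word_order_similarity(sentence1, sentence2):
    """Assumed order measure: 1 - |r1 - r2| / |r1 + r2| over position vectors."""
    words1 = sentence1.lower().split()
    words2 = sentence2.lower().split()
    joint = list(dict.fromkeys(words1 + words2))  # distinct words, order preserved
    r1 = order_vector(words1, joint)
    r2 = order_vector(words2, joint)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 if total == 0 else 1.0 - diff / total


def combined_similarity(semantic, order, delta=0.85):
    """Blend a semantic score and an order score; `delta` is an assumed weight."""
    return delta * semantic + (1.0 - delta) * order
```

With this sketch, "dog bites man" and "man bites dog" contain the same words but receive an order score below 1, which is the kind of distinction a bag-of-words measure cannot make.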
No abstract
This paper presents the development of a computerised, non-invasive psychological profiling system, 'Silent Talker', for the analysis of non-verbal behaviour. Non-verbal signals hold rich information about mental, behavioural and/or physical states. Previous attempts to extract individual signals and to classify an overall behaviour have been time-consuming, costly, biased, error-prone and complex. Silent Talker overcomes these problems by the use of Artificial Neural Networks. The testing and validation of the system was undertaken by detecting processes associated with 'deception' and 'truth'. In a simulated theft scenario thirty-nine participants 'stole' (or didn't) money, and were interviewed about its location. Silent Talker was able to detect different behaviour patterns indicative of 'deception' and 'truth' significantly above chance. For example, when 15 European men had no prior knowledge of the exact questions, 74% of individual responses (p < 0.001) and 80% (p = 0.035) of interviews were classified correctly. There is a long history of research in the field of non-verbal behaviour. Non-verbal cues have, with some difficulty and extensive effort, been collected and analysed by humans. In spite of the many problems this has posed, little work has been carried out on the use of machines for the collection and analysis of these cues. A system that is able to non-invasively, quickly and simultaneously analyse many non-verbal cues and their interrelationships has long been needed. This paper describes such a system and its implementation/testing. Silent Talker is an Artificial Neural Network-based system that automatically collects and analyses non-verbal cues to classify an overall behavioural/psychological state. It can be adapted to detect different such states and tuned to particular situations, environments and applications.
In this paper the determination of a particular state is described as profiling. Non-verbal behaviour consists of all the signs and signals (visual, audio, tactile and chemical) used by human beings to express themselves, except speech or manual sign language (Scherer & Ekman, 1982). These cues hold rich information about mental, behavioural and/or physical states and are, at least in part, involuntary and unintended. Physiological cues, such as EEG waveforms, also hold information, but these generally require invasive contact. The work presented in this paper is limited to visible non-verbal
APPLIED COGNITIVE PSYCHOLOGY
Abstract. This paper describes a comparative study of STASIS and LSA. These measures of semantic similarity can be applied to short texts for use in Conversational Agents (CAs). CAs are computer programs that interact with humans through natural language dialogue. Business organizations have spent large sums of money in recent years developing them for online customer self-service, but achievements have been limited to simple FAQ systems. We believe this is due to the labour-intensive process of scripting, which could be reduced radically by the use of short-text semantic similarity measures. "Short texts" are typically 10-20 words long but are not required to be grammatically correct sentences, for example spoken utterances and text messages. We also present a benchmark data set of 65 sentence pairs with human-derived similarity ratings. This data set is the first of its kind, specifically developed to evaluate such measures and we believe it will be valuable to future researchers.
This research presents a new benchmark dataset for evaluating Short Text Semantic Similarity (STSS) measurement algorithms and the methodology used for its creation. The power of the dataset is evaluated by using it to compare two established algorithms, STASIS and Latent Semantic Analysis. This dataset focuses on measures for use in Conversational Agents; other potential applications include e-mail processing and data mining of social networks. Such applications involve integrating the STSS algorithm in a complex system, but STSS algorithms must be evaluated in their own right and compared with others for their effectiveness before systems integration. Semantic similarity is an artifact of human perception; therefore its evaluation is inherently empirical and requires benchmark datasets derived from human similarity ratings. The new dataset of 64 sentence pairs, STSS-131, has been designed to meet these requirements drawing on a range of resources from traditional grammar to cognitive neuroscience. The human ratings are obtained from a set of trials using new and improved experimental methods, with validated measures and statistics. The results illustrate the increased challenge and the potential longevity of the STSS-131 dataset as the Gold Standard for future STSS algorithm evaluation.
Abstract. This paper presents a novel technique for the classification of sentences as Dialogue Acts, based on structural information contained in function words. It focuses on classifying questions or non-questions as a generally useful task in agent-based systems. The proposed technique extracts salient features by replacing function words with numeric tokens and replacing each content word with a standard numeric wildcard token. The Decision Tree, which is a well-established classification technique, has been chosen for this work. Experiments provide evidence of potential for highly effective classification, with a significant achievement on a challenging dataset, before any optimisation of feature extraction has taken place.
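The feature extraction step described above (function words become numeric tokens, every content word becomes one wildcard token) can be illustrated with a short sketch. The function-word list, the token numbering, the wildcard value `0`, the padding value `-1` and the fixed length are all hypothetical choices made for this example; the abstract does not specify them.

```python
import re

# Hypothetical, deliberately tiny function-word list (not the paper's list).
FUNCTION_WORDS = [
    "a", "the", "is", "are", "do", "does", "what", "where",
    "who", "how", "you", "it", "to", "of", "in",
]
# Assumed numbering: each function word gets its own positive integer token.
FUNCTION_TOKEN = {word: index + 1 for index, word in enumerate(FUNCTION_WORDS)}
WILDCARD = 0   # assumed token standing in for any content word
PADDING = -1   # assumed filler so every sentence has the same length


def encode(sentence, length=8):
    """Map a sentence to a fixed-length list of numeric tokens."""
    words = re.findall(r"[a-z']+", sentence.lower())
    tokens = [FUNCTION_TOKEN.get(word, WILDCARD) for word in words]
    tokens = tokens[:length]  # truncate long sentences
    return tokens + [PADDING] * (length - len(tokens))  # pad short ones


print(encode("Where is the station?"))
```

Because content words collapse to the same wildcard, "Where is the station?" and "Where is the library?" produce identical vectors, so a classifier such as a decision tree trained on these vectors would see only the function-word structure.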