Aygul Garifullina scite author profile

In closed-domain Question Answering (QA), the goal is to retrieve answers to questions within a specific domain. The main challenge of closed-domain QA is to develop a model that only requires small datasets for training since large-scale corpora may not be available. One approach is a flexible QA model that can adapt to different closed domains and train on their corpora. In this paper, we present a novel versatile reading comprehension style approach for closed-domain QA (called CA-AcdQA). The approach is based on pre-trained contextualized language models, Convolutional Neural Network (CNN), and a self-attention mechanism. The model captures the relevance between the question and context sentences at different levels of granularity by exploring the dependencies between the features extracted by the CNN. Moreover, we include candidate answer identification and question expansion techniques for context reduction and rewriting ambiguous questions. The model can be tuned to different domains with a small training dataset for sentence-level QA. The approach is tested on four publicly-available closed-domain QA datasets: Tesla (person), California (region), EU-law (system), and COVID-QA (biomedical) against nine other QA approaches. Results show that the ALBERT model variant outperforms all approaches on all datasets with a significant increase in Exact Match and F1 score. Furthermore, for the Covid-19 QA in which the text is complicated and specialized, the model is improved considerably with additional biomedical training resources (an F1 increase of 15.9 over the next highest baseline).
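The abstract above reports results as Exact Match and F1. As a rough illustration of how these two metrics are commonly computed for extractive QA, here is a minimal sketch that assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles). The abstract does not state which normalization the authors used, and the function names `normalize`, `exact_match`, and `f1_score` are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumption: SQuAD-style normalization; the paper may use a different scheme.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred == ref)
    # Multiset intersection counts tokens shared by both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Exact Match rewards only answers that agree completely after normalization, while F1 gives partial credit for overlapping tokens, which is why the two scores are usually reported together.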


Evaluation of the TV Customer Experience Using Eye Tracking Technology

Zhang, McClean, Garifullina et al. 2018


As the TV experience evolves to provide customers with a richer, more interactive experience across multiple devices, it is increasingly important to make the best use of subjective and objective techniques to inform the development of TV user interfaces. This paper describes the design of a new experiment to evaluate the TV customer experience using eye tracking technology, focused on the BT Player, a visually-rich Video-on-Demand application. Eye tracking provides an objective assessment which does not interfere with the natural interaction of the user with the system. The evaluation will capture a unique data set through the observation of test subjects exposed to a prioritised set of test conditions presented within a controlled environment. The paper presents the design of the experiments, including requirements capture, hardware and software setup, experimental protocol, data collection and analysis. The paper also outlines the challenges posed by the dynamic nature of the content and user interaction with the TV interface.


Using Eye Tracking to Gain Insight into TV Customer Experience by Markov Modelling

Chen, Zhang, McClean et al. 2019


rx-anon -- A Novel Approach on the De-Identification of Heterogeneous Data based on a Modified Mondrian Algorithm

Garifullina, Kern, Scherp 2021

Preprint


Traditional approaches for data anonymization consider relational data and textual data independently. We propose rx-anon, an anonymization approach for heterogeneous semi-structured documents composed of relational and textual attributes. We map sensitive terms extracted from the text to the structured data. This allows us to use concepts like 𝑘-anonymity to generate a joined, privacy-preserved version of the heterogeneous data input. We introduce the concept of redundant sensitive information to consistently anonymize the heterogeneous data. To control the influence of anonymization over unstructured textual data versus structured data attributes, we introduce a modified, parameterized Mondrian algorithm. The parameter 𝜆 allows different weights to be given to the relational and textual attributes during the anonymization process. We evaluate our approach with two real-world datasets using a Normalized Certainty Penalty score, adapted to the problem of jointly anonymizing relational and textual data. The results show that our approach is capable of reducing information loss by using the tuning parameter to control the Mondrian partitioning while guaranteeing 𝑘-anonymity for relational attributes as well as for sensitive terms. As rx-anon is a framework approach, it can be reused and extended by other anonymization algorithms, privacy models, and textual similarity metrics. CCS Concepts: Security and privacy → Data anonymization and sanitization; Computing methodologies → Information extraction.
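To make the 𝑘-anonymity guarantee mentioned in the abstract concrete, the following minimal sketch checks whether a small table satisfies it for a chosen set of quasi-identifiers. It is not the rx-anon implementation: it covers only the relational check, not the Mondrian partitioning or the handling of sensitive terms, and the function name `is_k_anonymous`, the column names, and the sample records are all hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times.

    Assumption: records are dicts whose quasi-identifier values have already
    been generalized (for example, ages binned into ranges).
    """
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already generalized records for illustration only.
records = [
    {"age": "30-39", "zip": "891**", "note": "visit A"},
    {"age": "30-39", "zip": "891**", "note": "visit B"},
    {"age": "40-49", "zip": "890**", "note": "visit C"},
    {"age": "40-49", "zip": "890**", "note": "visit D"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_k_anonymous(records, ["age", "zip"], k=3))
```

Each equivalence class here contains two records, so the table is 2-anonymous on `age` and `zip` but not 3-anonymous.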


A Study on Extracting Named Entities from Fine-tuned vs. Differentially Private Fine-tuned BERT Models

Diera, Lell, Garifullina et al. 2022

Preprint

