Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), 2021
DOI: 10.18653/v1/2021.semeval-1.72

DeepBlueAI at SemEval-2021 Task 1: Lexical Complexity Prediction with A Deep Ensemble Approach

Abstract: Lexical complexity plays an important role in reading comprehension. Lexical complexity prediction (LCP) can not only be used as a part of Lexical Simplification systems, but also as a stand-alone application to help people read better. This paper presents the winning system we submitted to the LCP Shared Task of SemEval 2021, which is capable of dealing with both subtasks. We first perform fine-tuning on a number of pre-trained language models (PLMs) with various hyperparameters and different training strate…
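The abstract describes fine-tuning several pre-trained language models as complexity predictors. The snippet below is a minimal sketch of one such fine-tuning setup, assuming the Hugging Face `transformers` library and treating complexity as a single regression target; the model name, hyperparameters, and input formatting are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: fine-tune one pre-trained language model as a complexity regressor.
# Model name, hyperparameters, and data handling are illustrative assumptions,
# not the exact configuration used by the DeepBlueAI system.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # the paper ensembles several PLMs; this is one example
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)  # regression head

def encode(sentence: str, target: str):
    # Pair the full sentence with the target word/phrase so the model sees its context.
    return tokenizer(sentence, target, truncation=True, padding="max_length",
                     max_length=128, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = torch.nn.MSELoss()

def train_step(sentence: str, target: str, complexity: float) -> float:
    model.train()
    batch = encode(sentence, target)
    pred = model(**batch).logits.squeeze(-1)          # shape: (1,)
    loss = loss_fn(pred, torch.tensor([complexity]))  # gold complexity score in [0, 1]
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```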

Cited by 8 publications (16 citation statements) | References 22 publications
“…We trained an SVM model given its high performance at binary CWI (Choubey and Pateria, 2016; Sanjay et al., 2016; Kuru, 2016), a BERT model (Devlin et al., 2019) per its competitive performance at LCP-2021 (Shardlow et al., 2021; Yaseen et al., 2021; Pan et al., 2021; Rao et al., 2021), and a BERT + multi-layer perceptron (MLP) model (Gu and Budhkar, 2021) to take full advantage of BERT-inferred contextual features as well as the word-level features fed into our SVM model. Two naive baseline models were used to evaluate the performances of our SVM, BERT, and BERT + MLP models: a random classifier (RC) and a majority classifier (MC).…”
Section: Models (mentioning, confidence: 99%)
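The excerpt above describes a BERT + MLP model that combines BERT's contextual representation with hand-crafted word-level features. The sketch below shows one way such a combination can be wired up, assuming PyTorch and the `transformers` library; the layer sizes and the feature set are placeholders, not the cited authors' implementation.

```python
# Rough sketch of a BERT + MLP combination: concatenate the [CLS] contextual
# vector with hand-crafted word-level features, then score with a small MLP.
# Dimensions and feature choices are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

class BertPlusMLP(nn.Module):
    def __init__(self, model_name: str = "bert-base-uncased", n_word_feats: int = 4):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        self.mlp = nn.Sequential(
            nn.Linear(hidden + n_word_feats, 256),
            nn.ReLU(),
            nn.Linear(256, 1),  # single complexity score
        )

    def forward(self, input_ids, attention_mask, word_feats):
        # word_feats: e.g. word length, corpus frequency, syllable count (hypothetical features)
        cls = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        return self.mlp(torch.cat([cls, word_feats], dim=-1)).squeeze(-1)
```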
“…Gooding and Kochmar, 2018; Kajiwara and Komachi, 2018). The 2021 shared task was won by Pan et al. (2021) using an ensemble of pre-trained Transformers, but submissions with feature-based models also ranked highly (Mosquera, 2021; Rotaru, 2021).…”
Section: Linguistic Complexity (mentioning, confidence: 99%)
“…Table 2.7 shows a brief description of the task's participants. One of the participants with the best scores was the DeepBlueAI [137] team, which used a wide variety of pre-trained language models together with training strategies such as pseudo-labelling and data augmentation, and finally applied a stacking method to produce the final prediction. With these methods, the team obtained the highest Pearson's correlation in the second task and the second-best in the first task.…”
Section: Substitute Ranking (SR) (mentioning, confidence: 99%)
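The excerpt mentions a stacking step that combines the fine-tuned models' outputs into a final prediction. Below is a minimal stacking sketch using scikit-learn, assuming each base model's out-of-fold predictions are already available as columns of a feature matrix; the Ridge meta-learner and the toy numbers are assumptions for illustration, not the authors' meta-model or data.

```python
# Minimal stacking sketch: a meta-regressor learns to combine the predictions
# of several base models. The Ridge meta-learner and synthetic numbers are
# illustrative assumptions, not the DeepBlueAI configuration.
import numpy as np
from sklearn.linear_model import Ridge

# Out-of-fold predictions from k base models (rows = examples, cols = models).
oof_preds = np.array([
    [0.21, 0.25, 0.19],
    [0.55, 0.49, 0.60],
    [0.33, 0.30, 0.35],
])
gold = np.array([0.20, 0.57, 0.31])  # gold complexity scores

meta = Ridge(alpha=1.0).fit(oof_preds, gold)

# At test time, stack the base models' test-set predictions the same way.
test_preds = np.array([[0.40, 0.44, 0.38]])
print(meta.predict(test_preds))  # final ensembled complexity score
```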