Sachin Kumar scite author profile

Despite impressive performance on many text classification tasks, deep neural networks tend to learn frequent superficial patterns that are specific to the training data and do not always generalize well. In this work, we observe this limitation with respect to the task of native language identification. We find that standard text classifiers which perform well on the test set end up learning topical features which are confounds of the prediction task (e.g., if the input text mentions Sweden, the classifier predicts that the author's native language is Swedish). We propose a method that represents the latent topical confounds and a model which "unlearns" confounding features by predicting both the label of the input text and the confound; but we train the two predictors adversarially in an alternating fashion to learn a text representation that predicts the correct label but is less prone to using information about the confound. We show that this model generalizes better and learns features that are indicative of the writing style rather than the content.
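The alternating adversarial training mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scalar representation, the two logistic heads, the "push the adversary toward 0.5" loss, and every name below (`train_alternating`, `TOY_DATA`, `lam`) are assumptions chosen only to show the alternation between an adversary step and a main step.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_alternating(data, epochs=200, lr=0.1, lam=1.0):
    """Toy alternating adversarial training.

    data: list of (style, topic, label, confound) tuples, with label and
    confound in {0, 1}. All parameter names are hypothetical.
    """
    p = {"w_style": 0.1, "w_topic": 0.1, "u": 0.1, "v": 0.1}
    for epoch in range(epochs):
        for s, t, y, c in data:
            # Shared scalar representation of the input.
            h = p["w_style"] * s + p["w_topic"] * t
            if epoch % 2 == 0:
                # Adversary step: encoder frozen, the confound head learns
                # to predict the confound from the representation.
                q = sigmoid(p["v"] * h)
                p["v"] -= lr * (q - c) * h
            else:
                # Main step: adversary frozen. Fit the label and push the
                # adversary's output toward 0.5 (an assumed confusion loss).
                y_hat = sigmoid(p["u"] * h)
                q = sigmoid(p["v"] * h)
                grad_h = (y_hat - y) * p["u"] + lam * (q - 0.5) * p["v"]
                p["u"] -= lr * (y_hat - y) * h
                p["w_style"] -= lr * grad_h * s
                p["w_topic"] -= lr * grad_h * t
    return p


# Hypothetical data: style determines the label, topic agrees with it
# in most training examples (the confound).
TOY_DATA = [
    (1.0, 1.0, 1, 1), (1.0, 1.0, 1, 1), (1.0, -1.0, 1, 0),
    (-1.0, -1.0, 0, 0), (-1.0, -1.0, 0, 0), (-1.0, 1.0, 0, 1),
]

params = train_alternating(TOY_DATA)
print(params)
```

Even epochs update only the confound predictor, and odd epochs update only the encoder and label predictor, which is one simple way to realize an alternating schedule.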

Earth Mover's Distance Pooling over Siamese LSTMs for Automatic Short Answer Grading

Kumar

Chakrabarti

Roy

2017

Automatic short answer grading (ASAG) can reduce tedium for instructors, but is complicated by free-form student inputs. An important ASAG task is to assign ordinal scores to student answers, given some "model" or ideal answers. Here we introduce a novel framework for ASAG by cascading three neural building blocks: Siamese bidirectional LSTMs applied to a model and a student answer, a novel pooling layer based on earth-mover distance (EMD) across all hidden states from both LSTMs, and a flexible final regression layer to output scores. On standard ASAG data sets, our system shows substantial reduction in grade estimation error compared to competitive baselines. We demonstrate that EMD pooling results in substantial accuracy gains, and that a support vector ordinal regression (SVOR) output layer helps outperform softmax. Our system also outperforms recent attention mechanisms on LSTM states.
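To make the earth-mover distance between two sets of hidden states concrete, here is a minimal sketch under strong simplifying assumptions: both sets have the same number of vectors and every vector carries equal mass, so the optimal transport plan is a one-to-one matching that can be found by brute force. The function name `emd_uniform` and the 2-D "hidden states" are hypothetical, and this is not the pooling layer from the paper.

```python
from itertools import permutations
from math import dist


def emd_uniform(states_a, states_b):
    """Earth mover's distance between two equal-sized sets of vectors,
    each vector carrying the same mass 1/n (toy assumption)."""
    n = len(states_a)
    if n == 0 or n != len(states_b):
        raise ValueError("this sketch needs two non-empty sets of equal size")
    # With uniform masses and equal sizes, an optimal transport plan is a
    # one-to-one matching, so brute force over matchings is exact.
    # Only feasible for small n, since there are n! matchings.
    best_total = min(
        sum(dist(states_a[i], states_b[j]) for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    return best_total / n


# Hypothetical 2-D "hidden states" for a model answer and two student answers.
model_states = [(0.0, 0.0), (1.0, 0.0)]
same_content_reordered = [(1.0, 0.0), (0.0, 0.0)]
shifted_content = [(0.0, 1.0), (1.0, 1.0)]

print(emd_uniform(model_states, same_content_reordered))
print(emd_uniform(model_states, shifted_content))
```

The first call compares the same vectors in a different order, so no mass has to move; the second moves every vector by one unit. Real answers differ in length, which requires a general transport solver rather than this matching shortcut.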

Crustal thickness and bulk Poisson ratios in the Dominican Republic from receiver function analysis

et al. 2020

An insight of sodium-ion storage, diffusivity into TiO2 nanoparticles and practical realization to sodium-ion full cell

Ghosh

Kumar

et al. 2019

Electrochimica Acta

Binder less-integrated freestanding carbon film derived from pitch as light weight and high-power anode for sodium-ion battery

Ghosh

Kumar

et al. 2020

Electrochimica Acta

Sachin Kumar

Topics to Avoid: Demoting Latent Confounds in Text Classification