Michael Heilman scite author profile


Question Generation via Overgenerating Transformations and Ranking

Heilman¹,

Smith²

2009

125

104

We describe an extensible approach to generating questions for the purpose of reading comprehension assessment and practice. Our framework for question generation composes general-purpose rules to transform declarative sentences into questions, is modular in that existing NLP tools can be leveraged, and includes a statistical component for scoring questions based on features of the input, output, and transformations performed. In an evaluation in which humans rated questions according to several criteria, we found that our implementation achieves 43.3% precision-at-10 and generates approximately 6.8 acceptable questions per 250 words of source text.
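For readers unfamiliar with the metric named in this abstract, precision-at-10 is the fraction of the ten highest-ranked items that are judged acceptable. The following is a minimal sketch of that general definition only; the function name `precision_at_k` and the sample labels are hypothetical and are not taken from the paper, whose exact rating protocol is not described here.

```python
def precision_at_k(ranked_labels, k=10):
    """Return the fraction of the top-k ranked items judged acceptable.

    ranked_labels: list of booleans ordered from highest- to lowest-ranked
    item, where True means a human rater judged the item acceptable.
    (Hypothetical input format, chosen for illustration.)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_labels[:k]
    # Divide by k (not len(top)) so short lists are not rewarded.
    return sum(1 for ok in top if ok) / k


# Made-up acceptability judgments for ten ranked questions.
labels = [True, False, True, True, False, False, True, False, False, False]
score = precision_at_k(labels, k=10)
```

Dividing by k rather than by the number of items returned is one common convention; other evaluations divide by the number of items actually ranked.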

Validation of automated scoring of science assessments

et al. 2016

Constructed-response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine developed by Educational Testing Service, for scoring eight science inquiry items that require students to use evidence to explain complex phenomena. Automated scoring showed satisfactory agreement with human scoring for all test takers as well as specific subgroups. These findings suggest that c-rater-ML offers a promising solution to scoring constructed-response science items and has the potential to increase the use of these items in both instruction and assessment. © 2015 Wiley Periodicals, Inc. J Res Sci Teach 53: 215–233, 2016

Predicting Grammaticality on an Ordinal Scale

Heilman¹,

Cahill²,

Madnani³

et al. 2014

Automated methods for identifying whether sentences are grammatical have various potential applications (e.g., machine translation, automated essay scoring, computer-assisted language learning). In this work, we construct a statistical model of grammaticality using various linguistic features (e.g., misspelling counts, parser outputs, n-gram language model scores). We also present a new publicly available dataset of learner sentences judged for grammaticality on an ordinal scale. In evaluations, we compare our system to the one from Post (2011) and find that our approach yields state-of-the-art performance.

An analysis of statistical models and features for reading difficulty prediction

Heilman

Collins-Thompson

Eskénazi

2008

A reading difficulty measure can be described as a function or model that maps a text to a numerical value corresponding to a difficulty or grade level. We describe a measure of readability that uses a combination of lexical features and grammatical features that are derived from subtrees of syntactic parses. We also tested statistical models for nominal, ordinal, and interval scales of measurement. The results indicate that a model for ordinal regression, such as the proportional odds model, using a combination of grammatical and lexical features is most effective at predicting reading difficulty.
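As an illustration of the proportional odds model mentioned in this abstract, the sketch below uses its standard textbook form: each cumulative probability P(level ≤ j) is a logistic function of a threshold minus a single linear score shared across levels. The function names, the thresholds, and the scores are all hypothetical; the paper's actual features and fitted parameters are not given here.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def proportional_odds_probs(score, thresholds):
    """Return per-level probabilities under a proportional odds model.

    score: linear predictor w.x for one text (hypothetical value).
    thresholds: ascending cut points, one fewer than the number of levels.
    P(level <= j) = sigmoid(thresholds[j] - score); level probabilities are
    differences of consecutive cumulative probabilities.
    """
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def predict_level(score, thresholds):
    """Return the index of the most probable level (0 = easiest)."""
    probs = proportional_odds_probs(score, thresholds)
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up cut points separating four difficulty levels.
cut_points = [-1.0, 0.0, 1.0]
easy = predict_level(-5.0, cut_points)
hard = predict_level(5.0, cut_points)
```

Because all levels share one score and differ only in their thresholds, the model respects the ordering of the levels, which is what distinguishes it from a nominal (multiclass) classifier.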

Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
