Recently, Shealy and Stout (1993) proposed a DIF detection procedure, SIBTEST, which 1) is IRT-model-based, 2) is nonparametric, 3) does not require IRF estimation, 4) provides a test of significance, and 5) estimates the amount of DIF. Current versions of SIBTEST can be used only for dichotomously scored items; in this paper, an extension to handle polytomous items is developed. This paper presents: (1) a discussion of an appropriate definition of DIF for polytomously scored items, (2) a modified SIBTEST procedure for detecting DIF in polytomous items, and (3) the results of two simulation studies comparing the modified SIBTEST with the Mantel and SMD procedures, one study with data constrained by a Rasch-like partial credit model (the same discrimination across polytomous items) and the other with data having distinctly different discriminations across items. These simulation studies indicate that the strategy of including the studied item in the matching subtest to control impact-induced Type I error (error induced by group ability differences) tends to yield unacceptably inflated Type I error rates when the equal-discrimination condition is violated. The studies provide compelling evidence that the modified SIBTEST procedure is much more robust in controlling impact-induced Type I error inflation than the other procedures.
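Although the abstract does not spell out the statistic, the matching idea behind SIBTEST can be illustrated in code. The sketch below is a simplified Python illustration, not the authors' implementation: it omits the regression correction that full SIBTEST applies to the matched subtest means, and the function name and interface are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def sibtest_simple(subtest_ref, item_ref, subtest_foc, item_foc):
    """Simplified SIBTEST-style DIF statistic (no regression correction).

    All inputs are 1-D NumPy arrays. Examinees are matched on their
    matching-subtest score; beta estimates the weighted difference in
    studied-item means between reference and focal groups. The full
    SIBTEST of Shealy & Stout (1993) also regression-corrects the
    matched means; that step is omitted in this sketch.
    """
    scores = np.union1d(subtest_ref, subtest_foc)
    n_total = len(item_ref) + len(item_foc)
    beta, var = 0.0, 0.0
    for k in scores:
        r = item_ref[subtest_ref == k]
        f = item_foc[subtest_foc == k]
        if len(r) < 2 or len(f) < 2:
            continue  # need both groups at this matching-score level
        p_k = (len(r) + len(f)) / n_total   # weight for this stratum
        beta += p_k * (r.mean() - f.mean())
        var += p_k**2 * (r.var(ddof=1) / len(r) + f.var(ddof=1) / len(f))
    z = beta / np.sqrt(var)
    return beta, z, 2 * norm.sf(abs(z))     # beta-hat, z, two-sided p
```

The omitted regression correction is the step Shealy and Stout introduced to remove impact-induced bias in the matched means, which is central to the robustness findings summarized above.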
A literature review was conducted to determine the current state of knowledge concerning the effects of computer administration of standardized educational and psychological tests on the psychometric properties of these instruments. Studies were grouped according to a number of factors relevant to the administration of tests by computer. Based on the studies reviewed, we arrived at the following conclusions. The rate at which test-takers omit items in an automated test may differ from the rate at which they omit items in a conventional presentation. Scores on automated versions of personality inventories such as the Minnesota Multiphasic Personality Inventory are lower than scores obtained in the conventional testing format; these differences may result in part from differing omit rates, as described above, but some may be caused by other factors. Scores from automated versions of speeded tests are not likely to be comparable with scores from paper-and-pencil versions. The presentation of graphics in an automated test may affect score equivalence: such effects were obtained in studies using the Hidden Figures Test, but not in studies with three Armed Services Vocational Aptitude Battery (ASVAB) tests. Tests containing items based on reading passages can become more difficult when presented on a CRT, as demonstrated in a single study with the ASVAB tests. Because examinees who take both versions of a test may carry over practice differently depending on which mode they encounter first, the possibility of such asymmetric practice effects may make it wise to avoid conducting equating studies based on single-group counterbalanced designs.
Background: The COVID-19 pandemic profoundly affected food systems, including food security. Understanding how the COVID-19 pandemic affected food security is important for providing support and identifying long-term impacts and needs.
Objective: The National Food Access and COVID research Team (NFACT) was formed to assess food security across different U.S. study sites throughout the pandemic, using common instruments and measurements. This study presents results from 18 study sites across 15 states, and nationally, over the first year of the COVID-19 pandemic.
Methods: A validated survey instrument was developed and implemented, in whole or in part, through online surveys of adults across the sites throughout the first year of the pandemic, representing 22 separate surveys. Sampling methods for each study site were convenience, representative, or targeted to high-risk populations. Food security was measured using the USDA six-item module, and its prevalence was analyzed using analysis of variance by sampling method to identify statistically significant differences.
Results: Respondents (n = 27,168) reported a higher prevalence of food insecurity (low or very low food security) since the COVID-19 pandemic than before it. In nearly all study sites, the prevalence of food insecurity was higher among Black, Indigenous, and People of Color (BIPOC), households with children, and those with job disruptions. The findings demonstrate lingering food insecurity, with high prevalence over time in sites with repeat cross-sectional surveys. There were no statistically significant differences between convenience and representative surveys, but the prevalence of food insecurity was statistically significantly higher in high-risk samples than in convenience samples.
Conclusions: This comprehensive study demonstrates a higher prevalence of food insecurity in the first year of the COVID-19 pandemic. These impacts were concentrated in certain demographic groups and most pronounced in surveys targeting high-risk populations. The results especially document continued high levels of food insecurity, as well as variability in estimates due to survey implementation method.
Summary: This multi-site assessment demonstrates widespread food insecurity during COVID-19 across multiple survey methods, especially among households with children, those with job disruptions, and Black, Indigenous, and People of Color.
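To make the measurement and comparison concrete, here is a minimal Python sketch. It assumes the standard USDA six-item short-form scoring, under which a raw score of 2 or more affirmative responses is classified as food insecure, and uses hypothetical array names; the paper's actual analysis pipeline is not specified in the abstract.

```python
import numpy as np
from scipy.stats import f_oneway

def food_insecure(raw_score: np.ndarray) -> np.ndarray:
    """USDA six-item short form: raw scores of 2-6 (low or very low
    food security) are classified as food insecure."""
    return raw_score >= 2

def prevalence_anova(raw_score: np.ndarray, method: np.ndarray):
    """One-way ANOVA on food-insecurity status across sampling methods.

    `raw_score`: each respondent's 0-6 raw score on the six-item module.
    `method`: the sampling method label for each respondent, e.g.
    'convenience', 'representative', or 'high-risk' (hypothetical labels).
    """
    insecure = food_insecure(raw_score).astype(float)
    groups = [insecure[method == m] for m in np.unique(method)]
    f_stat, p_value = f_oneway(*groups)  # compare prevalence across methods
    return f_stat, p_value
```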
item response theory, polytomous item, partial credit model, generalized partial credit model, graded response model, invariance, ordered categories
The purpose of this project was to evaluate statistical procedures for assessing differential item functioning (DIF) in polytomous items (items with more than two score categories). Three descriptive statistics were considered: the Standardized Mean Difference (SMD; Dorans & Schmitt, 1991) and two indices based on SIBTEST (Shealy & Stout, 1993). Five inferential procedures were also considered: two based on SMD, two based on SIBTEST, and the Mantel (1963) method. The DIF procedures were evaluated through applications to simulated data, as well as to data from ETS tests. The simulation included conditions in which the two groups of examinees had the same ability distribution and conditions in which the group means differed by one standard deviation. When the two groups had the same distribution, the descriptive index that performed best was the SMD. When the two groups had different distributions, a modified form of the SIBTEST DIF effect size measure tended to perform best. The five inferential procedures performed almost indistinguishably when the two groups had identical distributions. When the two groups had different distributions and the studied item was highly discriminating, the SIBTEST procedures showed much better Type I error control than did the SMD and Mantel methods, particularly in short tests. The power ranking of the five procedures was inconsistent; it depended on the direction of DIF and other factors. Routine application of these polytomous DIF methods at ETS seems feasible in cases where a reliable test is available for matching examinees. For the Mantel and SMD methods, Type I error control may be a concern under certain conditions. In the case of SIBTEST, the current version cannot easily accommodate matching tests that do not use number-right scoring. Additional research in these areas is likely to be useful.
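For contrast with the SIBTEST sketch given earlier, the following is a minimal sketch of an SMD-style descriptive index under its usual definition (the focal-group-weighted difference of matched item means). The function name and interface are illustrative, not ETS code, and implementations differ in sign convention and additional scaling.

```python
import numpy as np

def smd(subtest_ref, item_ref, subtest_foc, item_foc):
    """Standardized Mean Difference in the spirit of Dorans & Schmitt
    (1991): at each matching-score level, take the focal-minus-reference
    difference in studied-item means, weighted by the proportion of the
    focal group at that level. All inputs are 1-D NumPy arrays.
    """
    total_foc = len(item_foc)
    value = 0.0
    for k in np.unique(subtest_foc):
        f = item_foc[subtest_foc == k]
        r = item_ref[subtest_ref == k]
        if len(r) == 0:
            continue  # level not represented in the reference group
        w_fk = len(f) / total_foc           # focal proportion at level k
        value += w_fk * (f.mean() - r.mean())
    return value
```

Unlike the SIBTEST statistic, this index applies no correction to the matched means, which is consistent with the finding above that its Type I error control degrades when the groups differ in ability and the studied item is highly discriminating.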