2022
DOI: 10.35542/osf.io/y4wvj
Preprint

Using Demographic Data as Predictor Variables: a Questionable Choice

Abstract: Predictive analytics methods in education are seeing widespread use and are producing increasingly accurate predictions of students’ outcomes. With the increased use of predictive analytics comes increasing concern about fairness for specific subgroups of the population. One approach that has been proposed to increase fairness is using demographic variables directly in models, as predictors. In this paper we explore issues of fairness in the use of demographic variables as predictors of long term student outco…

Cited by 6 publications (7 citation statements)
References 48 publications (67 reference statements)
“…In terms of the second most explored concern by participants, data privacy and data use, in line with wider literature, participants were worried about their privacy (Korir et al, 2023; Scott, 2023; Yan et al, 2023), the engrained potential biases in AI (Bond et al, 2024; Tao et al, 2023; W. Zhang et al, 2023), and its potential impact on diverse student populations (Baker et al, 2023; Nguyen et al, 2020). Some participants like I03 indicated that they would be worried their data being used for training an AIDA system, and what would happen with their data.…”
Section: Results From the Interview: Potential Concerns Of Aida Accor...
Citation type: mentioning
confidence: 86%
“…Zhang et al, 2023). Concerns about the dissemination of incorrect information and the potential adverse effects on diverse groups were also highlighted (Baker et al, 2023; Nguyen et al, 2020; Rizvi et al., 2022). Additionally, participants expressed concerns around data privacy, how personal information would be used in AI systems, and issues of data and intellectual property ownership (Korir et al, 2023; Yan et al, 2023).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Therefore, future research should apply NOLFO validation to fairness analyses and investigate accuracy over time for student subgroups. There may be particular concern if, unlike this paper's approach, a model includes demographic identifiers in the training data (a practice that has received recent debate in our community --see discussion in [3]). In that case, the predictive role of demographic identifiers may be susceptible to semantic shift over time in a way that particularly impacts model performance for a specific group but not others.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Second, we have avoided incorporating socio-economic and demographic variables as input to the linear regression and time series prediction models. This choice was based on prior work showing that the addition of demographic features as input to predictive models is not only controversial but also potentially harmful [68]. In fact, it has been argued that using socio-economic or demographic data as predictor may instead reinforce bias and generate predictions based primarily on demographic variables rather than on more actionable parameters, thus perpetuating inequalities [69].…”
Section: Limitations
Citation type: mentioning
confidence: 99%