“…There have been many attempts to improve the recognition of accented speech, with varying degrees of success [7,8,9,10,11]. Some promising approaches include unsupervised adaptation [12,13], multitask learning with accent embeddings [14,15], and domain adversarial training [2,16]. While most approaches have delivered results, they either use massive amounts of accent data (e.g., 23K hours [2]), rely on corpora that are not publicly available [2,3], or use increasingly complex models [10,14,16] that do not shed light on how humans adapt so quickly to new accents.…”