“…For example, van Strien et al. (2020) and Hill and Hengchen (2019) show that for texts transcribed with 85-90% accuracy, good results can be obtained more or less irrespective of the method applied. Our own research on the impact of OCR inaccuracies on collocate extraction shows that, compared with a fully accurate transcription, a transcription that is 80% or more accurate yields almost exactly the same results (Sangiacomo et al. 2022a). Indeed, for collocate extraction, it seems that a truly random distribution of errors would lead to significant problems only at accuracy levels of 70% and below.…”
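
The claim about randomly distributed errors can be explored with a small simulation. The following is a minimal sketch, not the procedure used in the cited studies. It assumes that accuracy is measured per character, that errors are independent substitutions of lowercase letters, and that collocates are raw co-occurrence counts within a fixed token window. The function names (`corrupt`, `collocates`, `top_overlap`) and the toy sentence are hypothetical and serve only as illustration.

```python
import random
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def corrupt(text, char_accuracy, rng):
    """Simulate random OCR noise (assumption: per-character accuracy).

    Each lowercase letter is replaced by a different random letter with
    probability 1 - char_accuracy; all other characters are kept as-is.
    """
    out = []
    for ch in text:
        if ch in ALPHABET and rng.random() >= char_accuracy:
            out.append(rng.choice(ALPHABET.replace(ch, "")))
        else:
            out.append(ch)
    return "".join(out)


def collocates(text, node, window=3):
    """Count tokens occurring within `window` tokens of each `node` hit."""
    tokens = text.lower().split()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def top_overlap(clean, noisy, k=5):
    """Share of the clean top-k collocates that survive in the noisy top-k."""
    top_clean = {w for w, _ in clean.most_common(k)}
    top_noisy = {w for w, _ in noisy.most_common(k)}
    return len(top_clean & top_noisy) / max(1, len(top_clean))


if __name__ == "__main__":
    # Toy corpus (hypothetical); a real test would use a clean transcription.
    toy = "the soul is immortal and the soul is simple " * 200
    clean = collocates(toy, "soul")
    for acc in (1.0, 0.9, 0.8, 0.7, 0.5):
        noisy_text = corrupt(toy, acc, random.Random(0))
        print(acc, top_overlap(clean, collocates(noisy_text, "soul")))
```

On a real corpus, one would replace the toy text with the clean transcription and would probably rank collocates by an association measure rather than by raw frequency.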