“…The classifier, henceforth called the detector, can be used to automatically remove machine-generated text from online platforms such as social media, e-commerce, email clients, and government forums when the TGM-generated text is intended for abuse. An ideal detector should be: (i) accurate, that is, it achieves high accuracy with a trade-off between false positives and false negatives suited to the online platform (e.g., email client, social media) on which the TGM is applied (Solaiman et al., 2019); (ii) data-efficient, that is, it needs as few examples as possible from the TGM used by the attacker (Zellers et al., 2019); (iii) generalizable, that is, it detects text generated under different modeling choices of the attacker's TGM, such as model architecture, TGM training data, TGM conditioning prompt length, model size, and text decoding method (Solaiman et al., 2019; Bakhtin et al., 2020; Uchendu et al., 2020); (iv) interpretable, that is, its decisions need to be understandable to humans (Gehrmann et al., 2019); and (v) robust, that is, it can handle adversarial examples (Wolff, 2020). Given the importance of this problem, there has been a flurry of recent research from both the NLP and ML communities on building useful detectors.…”
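Property (i) can be made concrete with a small sketch of platform-dependent thresholding. It assumes the detector outputs a score in [0, 1], where higher means "more likely machine-generated", and that each platform sets its own budget for false positives (human text wrongly removed). The scores, labels, function names, and budgets below are hypothetical illustrations, not taken from the cited works.

```python
from typing import List, Tuple

# Hypothetical detector scores on a small labeled validation set.
# Label 1 = machine-generated, label 0 = human-written.
scores: List[float] = [0.05, 0.20, 0.40, 0.60, 0.55, 0.70, 0.85, 0.95]
labels: List[int] = [0, 0, 0, 0, 1, 1, 1, 1]


def error_rates(scores: List[float], labels: List[int],
                threshold: float) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    A text is flagged as machine-generated when score >= threshold.
    """
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return fp / labels.count(0), fn / labels.count(1)


def pick_threshold(scores: List[float], labels: List[int],
                   max_fpr: float) -> float:
    """Pick the threshold with the lowest FNR whose FPR stays within max_fpr.

    Candidates are the observed scores plus +inf (flag nothing), so a
    feasible threshold always exists.
    """
    best_t, best_fnr = float("inf"), 1.0
    for t in sorted(set(scores)):
        fpr, fnr = error_rates(scores, labels, t)
        if fpr <= max_fpr and fnr < best_fnr:
            best_t, best_fnr = t, fnr
    return best_t


# A platform that cannot tolerate removing human text (assumed budget: 0%).
strict_t = pick_threshold(scores, labels, max_fpr=0.0)
# A platform that accepts some false positives to catch more abuse (assumed 25%).
lenient_t = pick_threshold(scores, labels, max_fpr=0.25)
print(strict_t, error_rates(scores, labels, strict_t))
print(lenient_t, error_rates(scores, labels, lenient_t))
```

On this toy data the strict budget yields a higher threshold that misses one machine-generated text, while the lenient budget catches all of them at the cost of flagging one human-written text. This is the kind of platform-specific trade-off the first property describes.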