“…Text representations used in propensity score models generally do not yet leverage recent breakthroughs in NLP, and roughly fall into three groups: those using uni- and bigram representations (De Choudhury et al. 2016; Johansson, Shalit, and Sontag 2016; Olteanu, Varol, and Kiciman 2017), those using LDA or topic modeling (Falavarjani et al. 2017; Roberts, Stewart, and Nielsen 2020; Sridhar et al. 2018), and those using neural word embeddings such as GloVe (Pham and Shen 2017), fastText (Joulin et al. 2017; Chen, Montano-Campos, and Zadrozny 2020), or BERT (Veitch, Sridhar, and Blei 2019; Pryzant et al. 2018). Three classes of estimators are commonly used to compute the ATE: inverse probability of treatment weighting (IPTW), propensity score stratification, and matching, either on propensity scores or, less frequently, on some other distance metric.…”
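To make the first of these estimator classes concrete, the following is a minimal illustrative sketch of IPTW estimation of the ATE, not the method of any work cited above. It assumes binary treatment, already-estimated propensity scores, and uses synthetic data with a known treatment effect; the function name `iptw_ate` and the clipping threshold are choices made here for illustration.

```python
import numpy as np

def iptw_ate(y, t, e):
    """Estimate the ATE by inverse probability of treatment weighting.

    y : observed outcomes
    t : binary treatment indicator (0/1)
    e : estimated propensity scores P(T = 1 | X)

    Uses the stabilized (Hajek) form: weighted means over the treated
    and control groups, with weights 1/e and 1/(1 - e) respectively.
    """
    e = np.clip(e, 1e-3, 1 - 1e-3)  # trim extreme scores for numerical stability
    treated = np.sum(t * y / e) / np.sum(t / e)
    control = np.sum((1 - t) * y / (1 - e)) / np.sum((1 - t) / (1 - e))
    return treated - control

# Synthetic check with a known ATE of 2.0 and a single confounder x:
# treatment probability depends on x, and x also enters the outcome,
# so a naive difference in means would be biased.
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
e_true = 1.0 / (1.0 + np.exp(-x))      # confounded treatment assignment
t = rng.binomial(1, e_true)
y = 2.0 * t + x + rng.normal(size=10_000)

print(iptw_ate(y, t, e_true))          # close to the true ATE of 2.0
```

In practice the propensity scores `e` would come from a model fit on the text representations discussed above (e.g. a logistic regression over n-gram or embedding features) rather than being known exactly, and the clipping step corresponds to the common practice of trimming extreme weights.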