Early distantly supervised approaches (Mintz et al., 2009) use multi-instance learning (Riedel et al., 2010) and multi-instance multi-label learning (Surdeanu et al., 2012; Hoffmann et al., 2011) to model the assumption that at least one sentence per relation instance correctly expresses the relation. With the increasing popularity of neural networks, PCNN (Zeng et al., 2014) became the most widely used architecture, with extensions for multi-instance learning (Zeng et al., 2015), selective attention (Lin et al., 2016; Han et al., 2018), adversarial training (Wu et al., 2017; Qin et al., 2018), noise models (Luo et al., 2017), and soft labeling (Liu et al., 2017). Recent work showed graph convolutions (Vashishth et al., 2018) and capsule networks (Zhang et al., 2018a), previously applied to the supervised setting, to be also applicable in a distantly supervised setting.