“…It is divided into smaller communities, so-called subreddits, which have been shown to be a rich source of derivationally complex words (Hofmann et al, 2020c). Hofmann et al (2020a) have published a dataset of derivatives found on Reddit annotated with the subreddits in which they occur (https://github.com/valentinhofmann/dagobert). Inspired by a content-based subreddit categorization scheme (https://www.reddit.com/r/TheoryOfReddit/comments/1f7hqc/the_200_most_active_subreddits_categorized_by), we define two groups of subreddits, an entertainment set (ent) consisting of the subreddits anime, DestinyTheGame, funny, Games, gaming, leagueoflegends, movies, Music, pics, and videos, as well as a discussion set (dis) consisting of the subreddits askscience, atheism, conspiracy, news, Libertarian, politics, science, technology, TwoXChromosomes, and worldnews, and extract all derivationally complex words occurring in them.…”
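
The grouping described in the excerpt can be illustrated with a minimal sketch. The two subreddit sets are copied from the text; everything else is an assumption made for illustration: the record format (pairs of derivative and subreddit), the helper name `split_by_group`, and the toy data are hypothetical and do not reflect the actual layout of the published dataset or the authors' extraction code.

```python
# Subreddit groups exactly as listed in the excerpt.
ENT = {
    "anime", "DestinyTheGame", "funny", "Games", "gaming",
    "leagueoflegends", "movies", "Music", "pics", "videos",
}
DIS = {
    "askscience", "atheism", "conspiracy", "news", "Libertarian",
    "politics", "science", "technology", "TwoXChromosomes", "worldnews",
}


def split_by_group(records):
    """Collect the derivatives seen in each group.

    `records` is assumed to be an iterable of (derivative, subreddit)
    pairs; this format is hypothetical. Subreddit names are matched
    case-sensitively, and subreddits outside both groups are ignored.
    """
    groups = {"ent": set(), "dis": set()}
    for derivative, subreddit in records:
        if subreddit in ENT:
            groups["ent"].add(derivative)
        elif subreddit in DIS:
            groups["dis"].add(derivative)
    return groups


# Made-up toy records, not taken from the dataset.
toy_records = [
    ("unplayable", "gaming"),
    ("rewatchable", "movies"),
    ("antiscience", "science"),
    ("unplayable", "Games"),
    ("overcute", "aww"),  # not in either group, so it is skipped
]
result = split_by_group(toy_records)
```

Using sets for both the subreddit groups and the collected derivatives keeps membership checks cheap and removes duplicates when the same derivative occurs in several subreddits of one group.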