“…That's roughly twenty tokens for each type, a ratio that's high because the book is so long. After all the words have been converted to lowercase, the ten most frequent types in A43998 are the (14,849), of (10,850), and (7,305), to (7,236), is (4,864), that (4,786), in (4,194), a (3,122), by (2,636), and for (2,539). Those are just the top ten, of course, and the list goes on from there.…”
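The counts described in this passage can be sketched in a few lines of Python. This is only a minimal illustration, not the method actually used to produce the figures for A43998: the regex tokenizer, the function names `tokenize` and `summarize`, and the short sample sentence are all assumptions made for the example, and a different tokenizer would give somewhat different totals.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters as word tokens.

    Assumption: a token is a run of a-z letters, optionally with an
    internal apostrophe. Real corpora usually need more careful rules.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def summarize(text, top_n=10):
    """Return (token count, type count, tokens per type, top_n types)."""
    tokens = tokenize(text)
    counts = Counter(tokens)      # maps each type to its frequency
    n_tokens = len(tokens)        # every running word
    n_types = len(counts)         # distinct words only
    tokens_per_type = n_tokens / n_types if n_types else 0.0
    return n_tokens, n_types, tokens_per_type, counts.most_common(top_n)


# Hypothetical sample text, standing in for the full book.
sample = "The cat and the dog saw the bird, and the bird saw the cat."
n_tokens, n_types, tokens_per_type, top = summarize(sample, top_n=3)

print(f"{n_tokens} tokens, {n_types} types, "
      f"{tokens_per_type:.2f} tokens per type")
for word, freq in top:
    print(f"{word} ({freq:,})")
```

To try this on a whole book, one could read the file into a string and pass it to `summarize` in place of `sample`. Longer texts tend to give higher tokens-per-type figures, because common words keep repeating while new types turn up more and more slowly.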