2020
DOI: 10.1017/s0956792520000157

Pull out all the stops: Textual analysis via punctuation sequences

Abstract: Whether enjoying the lucid prose of a favourite author or slogging through some other writer’s cumbersome, heavy-set prattle (full of parentheses, em dashes, compound adjectives, and Oxford commas), readers will notice stylistic signatures not only in word choice and grammar but also in punctuation itself. Indeed, visual sequences of punctuation from different authors produce marvellously different (and visually striking) sequences. Punctuation is a largely overlooked stylistic feature in stylometry, the quant…
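As a rough illustration of what a "punctuation sequence" could look like in practice, the sketch below strips a text down to its punctuation marks, in order, and computes the relative frequency of each mark. The set of marks tracked and the function names (`punctuation_sequence`, `mark_frequencies`) are assumptions made for this example; the truncated abstract does not say which marks or features the authors actually use.

```python
from collections import Counter

# Assumed set of marks to track; the paper's own set is not given in this excerpt.
# Note that apostrophes inside words (e.g. "don't") would also be counted here.
PUNCTUATION_MARKS = set(".,;:!?()\"'-")


def punctuation_sequence(text: str) -> list[str]:
    """Return the punctuation marks of `text` in order, dropping all other characters."""
    return [ch for ch in text if ch in PUNCTUATION_MARKS]


def mark_frequencies(sequence: list[str]) -> dict[str, float]:
    """Return the relative frequency of each mark in a punctuation sequence."""
    total = len(sequence)
    if total == 0:
        return {}
    counts = Counter(sequence)
    return {mark: count / total for mark, count in counts.items()}


sample = "Wait, what? No (really): stop!"
seq = punctuation_sequence(sample)
print("".join(seq))           # ,?():!
print(mark_frequencies(seq))  # each of the six marks has frequency 1/6
```

Comparing such sequences or frequency tables across texts is one simple way to look for the stylistic signatures the abstract mentions; it should not be read as the authors' own pipeline.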

Citations: cited by 7 publications (5 citation statements)
References: 29 publications
“…Each review consists of the following information: <reviewer_ID, review_text, timestamp>. For this data set, we chose the categories to be occurrences of function words and punctuation; both of which have been found to be useful in the stylometric analysis of text [20, 44–49]. We use the R package quanteda [50] to tokenize the text into these categories and use quanteda's built-in list of function words (Appendix B).…”
Section: Data Sets and Event Categories (mentioning)
confidence: 99%
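The categorisation described in this statement (keeping only function words and punctuation as events) can be sketched in Python as follows. This is not the quanteda workflow the citing authors used: the regular expression, the function name `event_categories`, and the tiny `FUNCTION_WORDS` set are all hypothetical stand-ins chosen for illustration, and quanteda's built-in list is much longer.

```python
import re

# Hypothetical, deliberately tiny function-word list (NOT quanteda's built-in list).
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "to", "in", "it", "but"}

# A word is a run of ASCII letters, optionally with internal apostrophes ("it's");
# any other single character that is neither a word character nor whitespace
# is treated as a punctuation token. Digits and underscores are skipped.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*|[^\w\s]")


def event_categories(text: str) -> list[str]:
    """Return the function words (lower-cased) and punctuation marks of `text`, in order."""
    events = []
    for token in TOKEN_PATTERN.findall(text):
        if token[0].isalpha():
            word = token.lower()
            if word in FUNCTION_WORDS:
                events.append(word)  # keep function words, drop content words
        else:
            events.append(token)     # keep every punctuation token
    return events


review = "The food was great, and the service was fast!"
print(event_categories(review))  # ['the', ',', 'and', 'the', '!']
```

Each review would then be reduced to an ordered list of category events, which is the kind of input the quoted passage describes.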
“…Text tokenisation: Each sentence is transformed into a sequence of tokens, i.e., words, punctuation and special characters (e.g., emojis). Despite punctuation being relevant in some cases, e.g., writing style analysis [17], we discard it here and focus only on words and other tokens with emotional content like emojis.…”
Section: Text Analysis In Emoatlas (mentioning)
confidence: 99%
“…Less than 15 minutes 27 More than 15 minutes but less than 30 minutes 17 More than 30 minutes Generally, the internal harmony between the reading process and developing thought process in the students' minds can face a monotonous strenuous problem [6] as can be seen above. A relation between reading fluency, accounting the text speeches, and listening skills have implemented to remove the specific problem.…”
Section: Percentage Of Overall Students Duration Of the Study Time By... (mentioning)
confidence: 99%