This paper describes an XML-based system to identify and visualize some of the structural features of natural-language poetry. Poetic texts are poster children for overlapping hierarchies, since the organization of poems into cantos, stanzas, lines, and feet is largely independent of the sentences and words of the text. Foot boundaries and word boundaries are mutually independent, yet the implementation of caesura depends on their synchronization. Furthermore, the formal organization of poetry is not only overlapping but also massively discontinuous, because underlying formal structures like meter and rhyme are realized only indirectly in natural orthography. In many poetic traditions, stress and pronunciation are only implicit in written texts, so our first challenge is to identify those features automatically and add markup that makes them explicit; we can then use that markup to identify properties such as meter and rhyme. When stress and pronunciation are made explicit within an XML model, the most natural representations often involve mixed content, which poses special challenges for subsequent XML processing.
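To illustrate why mixed content complicates processing, the following minimal sketch marks stressed syllables inline within a line of verse and then recovers both the plain text and the stressed segments. The element names `line` and `stress`, the sample verse, and the helper functions are assumptions made for illustration only; they are not the schema or code used in the system described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <line> and <stress> are illustrative names,
# not the schema of the system described in the paper.
SAMPLE = (
    "<line>The <stress>cur</stress>few <stress>tolls</stress> the "
    "<stress>knell</stress> of <stress>part</stress>ing <stress>day</stress></line>"
)


def plain_text(line):
    """Concatenate every text node to recover the unmarked orthography."""
    return "".join(line.itertext())


def segments(line):
    """Return (text, is_stressed) pairs in document order.

    In mixed content, unstressed text is not wrapped in any element:
    ElementTree exposes it as the parent's .text and each child's .tail.
    """
    out = []
    if line.text:
        out.append((line.text, False))
    for child in line:
        out.append(("".join(child.itertext()), True))
        if child.tail:
            out.append((child.tail, False))
    return out


line = ET.fromstring(SAMPLE)
text = plain_text(line)
segs = segments(line)
stressed = [t for t, is_stressed in segs if is_stressed]

print(text)
print(stressed)
```

Note that a word such as "curfew" is split between an element and the text that follows it, so word boundaries cannot be read from the element structure alone; this is a small instance of the overlap between word and metrical units.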