Abstract.-Large numbers of legacy taxonomic publications are currently being digitized to make them available online and ready for full-text search. The documents are being marked up with XML for two purposes: to preserve the document structure, and to facilitate access via standard query languages like XQuery. With regard to the second purpose, the choice of an appropriate XML schema is crucial. It affects both query performance and the correctness of query results. Over the last few years, several different XML schemas have been proposed as markup standards for taxonomic publications. In this paper, we report on a thorough evaluation and comparison of these schemas. We have examined whether they facilitate the formulation and correct processing of queries that are common for taxonomic literature. We also compare the performance of these queries on documents that are marked up with the different schemas. Finally, we propose extensions to the schemas that enhance the correctness of query results.
Key words.-Heritage literature, quantitative analysis, systematics, taxonomy, TaxonX, XML schema
At present, legacy taxonomic publications are being digitized in large numbers (e.g., Biodiversity Heritage Library 1). The intention is to store these documents in digital archives and to make them available online. The documents are marked up with XML for two purposes. The first is to preserve the original document structure and publication-related information such as publisher, title, and issue. The second is to facilitate the deployment of standard query languages like XPath (XPath) to access the document collection. In the recent past, several institutions and projects have proposed a variety of XML schemas for this purpose, such as ABCD (ABCD), SDD (SDD), TaxonX (TaxonX), and taXMLit (Weitzman & Lyall). In this paper, we compare these schemas and include some additional ones. Our comparison focuses on the second purpose mentioned above, which is relevant to the work of biologists.
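To make the second purpose concrete, the following is a minimal sketch of a name-based query over a marked-up document. The element names (`publication`, `treatment`, `taxonName`), the sample data, and the helper `treatments_by_name` are hypothetical and chosen for illustration only; they are not taken from TaxonX, taXMLit, ABCD, or SDD. The sketch uses the limited XPath subset of Python's standard library rather than a full XPath or XQuery processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: tag names and content are illustrative assumptions,
# not the vocabulary of any of the schemas compared in this paper.
DOCUMENT = """
<publication>
  <treatment id="t1">
    <taxonName>Atta cephalotes</taxonName>
    <description>First description text.</description>
  </treatment>
  <treatment id="t2">
    <taxonName>Atta sexdens</taxonName>
    <description>Second description text.</description>
  </treatment>
</publication>
"""


def treatments_by_name(root, name):
    """Return the ids of all treatments whose taxonName equals `name`."""
    # [tag='text'] selects elements having a child `tag` with exactly that text.
    path = f".//treatment[taxonName='{name}']"
    return [treatment.get("id") for treatment in root.findall(path)]


root = ET.fromstring(DOCUMENT)
print(treatments_by_name(root, "Atta sexdens"))
```

The query only works because the taxonomic name sits in a dedicated element; if the same name appeared only in running text, the path expression would have nothing to select. This is why the choice of schema bears directly on which queries can be formulated.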
1 http://www.bhl.si.edu/
The queries typical for this domain are fine-grained (at the level of individual characters or distribution records), and their results are individual treatments, i.e., descriptions of a particular taxon. We have identified three basic types of criteria on which queries can be based: taxonomic names; collection locations, i.e., the locations where specimens of a particular taxon have been collected; and morphological feature concepts. We investigate both the ease of formulation and the execution performance of these queries. Our results show that the four schemas mentioned above support queries over taxonomic names very well. The same is true for collection locations. However, SDD is the only schema that allows queries over morphological features to be formulated, at least to a certain degree, and these queries execute slowly in the environment investigated here. The other schemas do not provide any markup to identify individual concepts within morphological descriptions. It is not possible to represent the r...