The quantity and complexity of data being generated and published in biology have increased substantially, but few methods exist for capturing knowledge about phenotypes derived from molecular interactions between diverse groups of species in a way that is amenable to data-driven biology and research. To improve access to this knowledge, we have constructed a framework for curating the scientific literature on interspecies interactions, using data curated for the Pathogen-Host Interactions Database (PHI-base) as a case study. The framework provides a curation tool, a phenotype ontology and controlled vocabularies for curating pathogen-host interaction data at the level of the host, pathogen, strain, gene and genotype. The concept of a multispecies genotype, the 'metagenotype', is introduced to capture changes in the pathogens' disease-causing abilities, and in host resistance or susceptibility, that are observed following gene alterations. We report on this framework and describe PHI-Canto, a community curation tool for use by publication authors.