Purpose - With the significant growth in electronic education materials, such as syllabus documents and lecture notes, available on the Internet and intranets, there is a need for robust central repositories of such materials that allow both educators and learners to conveniently share, search and access them. This paper reports on our work to develop a national repository for course syllabi in Ireland.

Design/methodology/approach - The paper describes a prototype syllabus repository system for higher education in Ireland. The system was developed using a number of information extraction and document classification techniques, including a new fully unsupervised document classification method that uses a web search engine to automatically collect the training set for the classification algorithm.

Findings - Preliminary experimental results evaluating the performance of the system and its various units, particularly the information extractor and the classifier, are presented and discussed.

Originality/value - In this paper, we identify three major obstacles associated with creating a large-scale syllabus repository, and provide a comprehensive review of published research related to addressing these problems. We also identify two different types of syllabus documents and describe a rule-based information extraction system capable of extracting structured information from unstructured syllabus documents. Finally, we highlight the importance of classifying resources in a syllabus digital library, introduce a number of standard education classification schemes, and describe our unsupervised automated document classification system, which classifies syllabus documents based on an extended version of the International Standard Classification of Education (ISCED 1997).