This paper describes the development of an information retrieval (IR) model for the indexing, storage and retrieval of documents created in extensible mark-up language (XML). The application area is the software reuse environment, which involves a broader class of documents than conventional IR systems can process. These include design and analysis documents in unified modelling language (UML) notation as well as in textual format, source code, and component interface definitions in both textual and source code form. XML was selected because it is emerging as the key standard for the representation of structured documents on the World Wide Web (WWW) and incorporates methods for the representation of metadata. The model described is easily customisable, since it is based upon an extensible object-oriented framework. This allows the development of an IR architecture that can readily be adapted to cope with the proliferation of XML document type definitions (DTDs) that is likely to characterise the WWW in the near future.

University of Strathclyde.

This paper begins by highlighting the importance of software reuse and then describes the broad structure and function of the AUTOSOFT system. Some of the key technologies underpinning the project are then outlined, with a particular focus on XML and its relevance to AUTOSOFT. This is followed by a discussion of the object-oriented framework that has been selected to support the development