Abstract-The large volume of online and offline information available today has overwhelmed users' ability to process it efficiently and effectively in order to extract what is relevant. The exponential growth of information on the Internet further complicates information access. Thus, accessing relevant information is a time-consuming and complex task for users. Information retrieval (IR) is a branch of artificial intelligence that tackles the problem of accessing and retrieving relevant information. The aim of IR is to enable available data sources to be queried for relevant information efficiently and effectively. This paper describes a robust information retrieval framework designed to assist users in accessing relevant information effectively and efficiently by handling queries based on user preferences. Each component and module of the proposed framework is explained in terms of its functionality and the processes involved.
Index Terms-Information retrieval, information retrieval framework, semantic web.
Rayner Alfred, Gan Kim Soon, and Chin Kim On are with the COESA, Universiti Malaysia Sabah, 88400 Kota Kinabalu, Sabah (e-mail: g_k_s967@yahoo.com, kimonchin@ums.edu.my).
Patricia Anthony is with the Department of Applied Computing, Faculty of Environment, Society and Design, Lincoln University, Christchurch, New Zealand (e-mail: patricia.anthony@lincoln.ac.nz).
I. INTRODUCTION
Information retrieval (IR) is a process that extracts and retrieves information relevant to a user based on the queries the user submits. IR deals with many aspects of information, including its representation, storage, and organization, and its retrieval from data sources. These data sources can be accessed offline or online, and their data can be categorized as structured, semi-structured, or unstructured. The origin of IR research can be traced back to ancient times, when librarians kept information about articles and books on catalogue cards [1], [2], and early work on information retrieval can be found in 1950 [3]. The advent of the computer brought IR systems to a new level, as computers are capable of processing large volumes of data in order to extract and retrieve relevant information [4]. The increase in capacity and computational power has contributed to the rapid growth of unstructured data. For instance, with the advent of the World Wide Web (WWW), which makes information available online through hyperlinks, the research attention of IR has shifted to Web IR, which is increasingly gaining popularity. Search engines are among the most significant IR tools for the WWW. In order to retrieve information from the WWW, search engines with different capabilities and algorithms have been developed. However, with the advancement of the Internet, the volume of available information has grown exponentially over time, and a robust framework for web information extraction and retrieval is critically required to pro...