Abstract. Billions of web pages are available on the Internet, and search engines face the constant challenge of producing the best ranked list for a user's query from this huge number of pages. Many of the search results returned for a query are not relevant to the user's needs. Most page ranking algorithms use link-based ranking (web structure) or content-based ranking to calculate the relevance of information to the user's need, but these ranking algorithms may not be enough to provide a good ranked list for Arabic search. In this paper we therefore propose an efficient Arabic information retrieval system that uses a new hybrid usage-based ranking algorithm called EHURA. The objective of this algorithm is to overcome the drawbacks of these ranking algorithms and to improve the efficiency of web searching. EHURA was applied to the 242 Arabic Corpus to measure its performance. The results show that the proposed EHURA algorithm improves precision over the content-based ranking algorithm, and that recall is affected by this improvement as well.
Introduction
The amount of information in the world has been increasing exponentially over the years, and searching within this huge amount of information has become a critical part of daily life. Millions of users around the globe interact with search engines daily; more than 360 of them are Arab ones [1,2]. Because of the growing number of Internet users around the world, information retrieval (IR) has recently become an essential tool for all search tasks on the web. The number of Arab Internet users has increased steadily over the years as the requirements of daily life have changed. Despite the enormous efforts to satisfy the needs of this growing number of Arabic Internet users, relatively few Arabic search engines are currently available. Moreover, Arabic is a highly inflected language with a complex morphological structure, which makes information retrieval on Arabic texts a challenge [3].

Most existing web search engines calculate the relevance of web pages for a given query by counting the search keywords contained in the web pages, this