The innovation of AJAX resulted in faster, more responsive and more interactive web applications due to the clever amalgamation of JavaScript, HTML, and Cascading Style Sheets (CSS).
INTRODUCTION
The World Wide Web is a vast source of information in which the ways information is stored, retrieved and displayed change continuously [1]. Simple HTML pages are now being replaced with AJAX-embedded web pages, which makes information retrieval (IR) challenging because of the complexities of executing JavaScript, constructing the navigation model and analyzing the Document Object Model (DOM) [2]. AJAX, short for Asynchronous JavaScript and XML (Extensible Markup Language), is one of the prominent techniques used to develop rich and more interactive web applications such as Facebook, YouTube and Google Maps. Rather than being a new programming language, scripting language or technology, AJAX is a new way to think about, design and develop web applications [3,4]. Because content in AJAX-based web applications is produced dynamically and asynchronously, web crawlers are unable to detect AJAX events and execute calls the way humans do using a web browser. Furthermore, many applications on the Web are AJAX-based and remain poorly searchable. A methodology is needed to present the content produced by these applications to crawlers for indexing purposes, just as in traditional Web IR.
A lot of research has already explored the technical aspects of AJAX, its challenges and the benefits it provides to web application developers. However, these articles are limited in evaluating and comparing the relative performance of AJAX web crawlers. In this review article, we critically and analytically review the available literature in order to report the state of the art in crawling AJAX-based web applications, along with some prominent issues and challenges. We also briefly discuss the nature of AJAX-based web applications and how they differ from traditional web applications. Finally, we cover the similarities and differences between traditional web crawlers and existing AJAX crawlers and highlight the limitations of AJAX crawlers.
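To make the crawling problem described above concrete, the following is a minimal sketch, not a method taken from the reviewed literature. It assumes a hypothetical page (`PAGE`), a hypothetical handler `loadComments()` and a hypothetical endpoint `/comments?page=1`, and uses only Python's standard `html.parser` module. It shows that a crawler which only parses static HTML finds the ordinary hyperlink but never reaches the content fetched asynchronously behind a click event.

```python
from html.parser import HTMLParser

# Hypothetical page (an assumption for illustration): one ordinary hyperlink
# plus a button whose content is fetched asynchronously when it is clicked.
PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <button id="more" onclick="loadComments()">Show comments</button>
  <div id="comments"></div>
  <script>
    function loadComments() {
      var xhr = new XMLHttpRequest();
      xhr.open("GET", "/comments?page=1");
      xhr.onload = function () {
        document.getElementById("comments").innerHTML = xhr.responseText;
      };
      xhr.send();
    }
  </script>
</body></html>
"""


class StaticLinkExtractor(HTMLParser):
    """Mimics a traditional crawler: it reads markup but runs no JavaScript."""

    def __init__(self):
        super().__init__()
        self.links = []           # URLs a traditional crawler would follow
        self.event_elements = []  # elements whose content needs an event fired

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        # Attributes such as onclick mark states reachable only via events.
        handlers = [name for name in attrs if name.startswith("on")]
        if handlers:
            self.event_elements.append((tag, attrs.get("id"), handlers))


def extract(page):
    """Return (followable links, event-bearing elements) for a static page."""
    parser = StaticLinkExtractor()
    parser.feed(page)
    return parser.links, parser.event_elements


if __name__ == "__main__":
    links, events = extract(PAGE)
    print("links a static crawler follows:", links)
    print("elements requiring event execution:", events)
```

In this sketch the comments endpoint never appears among the extracted links, because it exists only inside the script body. An AJAX-aware crawler would instead have to fire the `onclick` event in a browser-like environment and inspect the resulting DOM.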
For this purpose, we searched the major Computer Science digital libraries and other related databases to collect relevant articles that best describe this new technology. We carefully reviewed and analyzed all the selected articles and reported the state of the art accordingly. We hope that this review article will open new research avenues for researchers interested in this domain. The rest of the paper is organized as follows: Section 2 presents the journey