The World Wide Web, initially intended as a way to publish static hypertexts on the Internet, is moving toward complex applications. Static Web sites are gradually being replaced by dynamic sites, where information is stored in databases and non-trivial computation is performed.

In such a scenario, ensuring the quality of a Web application from the user's perspective is crucial. Techniques for the analysis and testing of Web applications are being investigated for this purpose. However, a static analysis of the source code may be extremely difficult (and, in general, infeasible), because the HTML code that is part of the application under analysis is generated dynamically.

In this paper, a dynamic analysis technique is proposed for the extraction of a Web application model through its execution. The availability of statistical data about accesses to the pages generated by the Web application is exploited for statistical testing, based on the recovered model. Test cases can be prioritized so as to exercise the most frequently followed paths first. Moreover, statistical reproduction of the users' navigation paths allows for an estimation of the reliability of the application.

In the simplest cases a fixed HTML skeleton is filled in with values computed dynamically; in more complex applications even the structure of the resulting HTML page is not given a priori and is constructed dynamically. In such a situation, a static analysis of the server programs generating the Web pages can hardly result in a useful model of the application. In fact, the problem of determining the HTML code produced by a server program is related to the problem of determining whether a given execution path is feasible, which is known to be undecidable. Moreover, a Web application involves several programming languages. On the server side, at least one programming language is used for the dynamic production of the HTML pages (e.g., PHP, Java, Perl, VBScript).
If databases are accessed, a related query language, such as SQL, is also present. HTML statements are then generated, but typically they are not pure HTML code: they include client-side code for form validation, client-side computation, and graphical event handling (e.g., JavaScript, Java applets). Static analysis of such a variety of languages, and of all their possible interactions, is a technological challenge.

In this paper, we propose a technique for the extraction of a model of a Web application, obtained by statically analyzing the HTML code that is dynamically generated by the server programs. Input values that cover all relevant navigations are pre-specified by the user, and downloaded pages are either unrolled or merged in order to produce an abstraction over the set of HTML pages downloaded. The resulting model is computable in the presence of high, even 'extreme', dynamism and requires the ability to parse just HTML. The problem of statically approximating the HTML code being generated is absent. Alternatively, the model obtained may be partial, if th...