2014
DOI: 10.5120/17104-7737

Preprocessing Techniques in Web Usage Mining: A Survey

Abstract: Because the data available on the web is huge, unstructured and scattered, it is difficult for users to find relevant information quickly. To address this, web site design improvements, content personalization, prefetching and caching are carried out based on analysis of user behavior. A user's activities are captured in a special file called a log file. There are several types of log: server logs, proxy server logs and client/browser logs. Web usage mining uses these log files to analyze …

Citations: cited by 16 publications (9 citation statements)
References: 27 publications
“…The preliminary action of internet utilization mining evaluation is records preprocessing [19], [20], [21] The raw information have especially decreased enterprise properly virtually nicely worth until they'll be transformed similarly to processed to generate actionable information [22] For that cause, as a manner to allow the assessment, uncooked logs want to be preprocessed to throw out stupid requests, to understand person commands and to put together the log to allow its evaluation. The net logs have a take a look at the Typical Language Format well-known (CLF) [23] and moreover deliver uncooked data together with the IP deal with from which the session have become installed, the day and furthermore time of the decision for, the internet internet page URL or the HTTP popularity lower once more to the patron, for instance.…”
Section: Environmental Results and Discussion (mentioning)
confidence: 99%
“…Among them ECLF (extended common log format) are commonly used by Web servers. The attributes of ECLF (extended common log format) are described as follows [11]: IP address/Host name: Host name or IP address (when host name is not available) Rfcname: User's authentication. A "-"sign indicates that this field is not available.…”
Section: Log Data Sources and Its Collection (mentioning)
confidence: 99%
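
The ECLF attributes quoted above can be illustrated with a short parsing sketch. This is only a minimal example, not the method of the cited paper: it assumes ECLF corresponds to the widely used Apache "combined" log layout, and the sample line, the regular expression, and every field name other than the host and rfcname fields named in the quote are assumptions made for illustration.

```python
import re
from datetime import datetime

# Assumed layout (Apache "combined" format):
# host rfcname user [timestamp] "request" status size "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<rfcname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # A "-" sign indicates that the field is not available.
    for key in ("rfcname", "user"):
        if record[key] == "-":
            record[key] = None
    record["size"] = None if record["size"] == "-" else int(record["size"])
    record["status"] = int(record["status"])
    # Timestamp such as 10/Oct/2000:13:55:36 -0700 (English month names assumed).
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


# Hypothetical sample line, used only to exercise the parser.
SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /index.html HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08"'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Lines that do not match the assumed layout return None, so a preprocessing step can count or discard them instead of failing.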
“…are found to complete navigation path of user. The techniques used for data preprocessing can be referred from the literature [8,11]. This paper provides methodology for Data fusion, Data extraction and Data cleaning phases of preprocessing of server log data.…”
Section: Introduction (mentioning)
confidence: 99%
“…By imposing barriers to free-flow of data streaming, a compact representation of raw data is sufficient to produce brief but reliable decision making process (Eggers and Khuon, 1990). In addition, an efficiency and scalability of an object refinement; a subsequent step applied to processed data, can be improved through a proper pre-processing (Mitali et al, 2003).…”
Section: Role Of Pre-processing (mentioning)
confidence: 99%