Creating a Polite Adaptive and Selective Incremental Crawler

Title: Creating a Polite Adaptive and Selective Incremental Crawler
Publication Type: Conference Paper
Year of Publication: 2005
Authors: Bouras, C, Poulopoulos, V, Thanou, A
Conference Name: IADIS International Conference WWW/INTERNET 2005, Lisbon, Portugal, Volume I
Date Published: 19 - 22 October
Abstract

The expansion of the World Wide Web has led to a chaotic state in which Internet users face the major problem of discovering information. To address this problem, many mechanisms have been created that rely on crawlers, which browse the Web and download pages. In this paper we describe a crawling mechanism created to support data mining and processing systems and to obtain a history of the Web’s content. A crawler has to be efficient and polite, trying not to harm or overload the pages it visits. It is therefore extremely important to follow specific rules when crawling. In addition to these rules, the mechanism we created includes a selective incremental algorithm, which makes the crawler both more efficient and more polite. The structure and design of the mechanism are simple, but the experimental results showed that this simplicity makes our crawler a very strong and stable mechanism.