01602nas a2200145 4500008004100000245006400041210006400105260001800169300000800187520108500195100002101280700002601301700001701327856011201344 2010 eng d00aEfficient extraction of news articles based on RSS crawling0 aEfficient extraction of news articles based on RSS crawling c3 - 5 October a1-73 a
The expansion of the World Wide Web has led to a state where a vast number of Internet users face the major problem of discovering desired information. Hundreds of web pages and weblogs are inevitably generated or changed on a daily basis. The main problem that arises from this continuous generation and alteration of web pages is the discovery of useful information, a task that becomes difficult even for experienced Internet users. Many mechanisms have been constructed and presented to address information discovery on the Internet; they are mostly based on crawlers that browse the WWW, download pages, and collect information that might be of interest to the user. In this manuscript we describe a mechanism that fetches web pages that include news articles from major news portals and blogs. This mechanism is constructed to support tools that acquire news articles from all over the world, process them, and present them back to end users in a personalized manner.
1 aBouras, Christos1 aPoulopoulos, Vassilis1 aAdam, George uhttps://telematics.upatras.gr/telematics/publications/efficient-extraction-news-articles-based-rss-crawling