
Title

Intelligent Crawling On Open Web for Business Prospects

Author

Bharat Bhushan, Narender Kumar

Citation

Vol. 12, No. 6, pp. 93-98

Abstract

The dynamic nature of web-based systems requires continuous updating. Information retrieval depends on crawlers that crawl the web exhaustively, but businesses expect their crawlers to retrieve the specific information their applications need. Crawlers download the required information by following the hyperlinks that occur in web pages, but the information retrieved is usually partial and fails to meet the user's expectations. Retrieving updated information from a single link/URL is simple, but when many URLs give the same information, it becomes difficult to determine which URL/link provides the desired, sufficient, and up-to-date information. Moreover, it is difficult to remove duplicate stories from the same link domain. This paper discusses the issues related to intelligent crawling and proposes various techniques to support web mining for business prospects.

Keywords

Web crawler, latency, ethics, reliability, longevity

URL

http://paper.ijcsns.org/07_book/201206/20120612.pdf