by hellyson » Mon Aug 02, 2010 11:00 am
Spiders are automated programs that crawl the web continuously to keep search data up to date. A spider visits a given URL, extracts all the hyperlinks from the visited page, adds the links it has not seen before to the list of URLs to visit (the crawl frontier), saves the page itself to a database file, then picks the next URL from the frontier and repeats the same cycle.
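Here is a minimal sketch of that cycle in Python, using only the standard library. The seed URL, the page limit, and the output file name are made-up placeholders (the original post does not mention a specific language or storage format), and a real crawler would also respect robots.txt, limit itself to certain domains, and use a proper database instead of a flat file.

[code]
# Minimal crawl-cycle sketch: fetch a page, save it, extract links,
# push unseen links onto the frontier, repeat. Placeholder names only.
import collections
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, max_pages=10):
    frontier = collections.deque([seed_url])   # URLs still to visit
    visited = set()                            # URLs already fetched

    while frontier and len(visited) < max_pages:
        url = frontier.popleft()               # pick the next URL
        if url in visited:
            continue
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
        except Exception:
            continue                           # skip unreachable pages
        visited.add(url)

        # "Save the page itself" -- here just appended to a plain text
        # file; the post mentions a database file instead.
        with open("pages.txt", "a", encoding="utf-8") as f:
            f.write(url + "\n" + html + "\n")

        # Get all hyperlinks on the page and add the new ones
        # to the crawl frontier.
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urllib.parse.urljoin(url, link)
            if absolute not in visited:
                frontier.append(absolute)

    return visited


if __name__ == "__main__":
    crawl("http://example.com/")
[/code]

The deque gives breadth-first crawling (oldest discovered links are visited first); swapping it for a stack or a priority queue changes the crawl order without changing the rest of the cycle.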