80legs
Type of site | Web crawler |
---|---|
Available in | English |
Owner | Datafiniti, LLC |
Created by | Shion Deysarkar |
URL | www |
Launched | September 2009 |
80legs is a web crawling service that allows its users to create and run web crawls through its software-as-a-service platform.
History
80legs was created by Computational Crawling, a company in Houston, Texas. The company launched the private beta of 80legs in April 2009 and publicly launched the service at the DEMOfall 09 conference. At the time of its public launch, 80legs offered customized web crawling and scraping services. Later in 2009, it added subscription plans and other product offerings.[1][2]
Technology
80legs is built on top of a distributed grid computing network.[3][needs update?] This grid consists of approximately 50,000 individual computers, distributed across the world, and uses bandwidth monitoring technology to prevent bandwidth cap overages.[4][needs update?]
80legs has been criticised by numerous site owners, who say that its crawler effectively acts as a distributed denial-of-service (DDoS) attack and does not obey robots.txt.[5][6][7] Because the average webmaster is not aware of 80legs, its crawler is typically blocked only after a server has already been overloaded and the source has been identified through time-consuming analysis of the log files.
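For context, the robots.txt check that a compliant crawler is expected to perform before fetching a page can be sketched with Python's standard `urllib.robotparser` module. This is a minimal illustration and not 80legs code: the user-agent token `examplecrawler`, the rules in `ROBOTS_TXT`, and the helper `allowed` are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: bans one crawler entirely and keeps
# every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: examplecrawler
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real crawler would first
    # download them from https://<host>/robots.txt.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("examplecrawler", "otherbot"):
        print(agent, allowed(agent, "https://example.org/page"))
```

A crawler that skips this check, or ignores its result, will request pages that the site owner has asked crawlers to avoid.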
Some rulesets for ModSecurity block 80legs from accessing the web server completely, in order to prevent a DDoS.[citation needed] Because the crawler is distributed, it cannot be blocked by IP address.[citation needed]
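When IP-based blocking is not an option, one common alternative is to reject requests by their User-Agent header. The sketch below shows that general idea as a small WSGI middleware. It is not a ModSecurity rule, and the article's sources do not say which mechanism those rulesets use. The token `80legs` in `BLOCKED_TOKENS` is an assumption about what the crawler's User-Agent string contains, and `is_blocked` and `block_crawlers` are hypothetical helper names.

```python
# Assumption: the crawler's User-Agent string contains this token.
# Check real server logs before relying on it.
BLOCKED_TOKENS = ("80legs",)


def is_blocked(user_agent, tokens=BLOCKED_TOKENS):
    """Return True if any blocked token occurs in the User-Agent."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in tokens)


def block_crawlers(app, tokens=BLOCKED_TOKENS):
    """Wrap a WSGI app so that blocked crawlers receive 403 Forbidden."""

    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", ""), tokens):
            start_response("403 Forbidden",
                           [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Every other client reaches the wrapped application.
        return app(environ, start_response)

    return middleware
```

A filter of this kind works only while the crawler identifies itself honestly, since a client can send any User-Agent string it chooses.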
References
1. Ha, Anthony (2009-12-22). "80legs sets its web crawler free". VentureBeat. Retrieved 2024-11-28.
2. Kirkpatrick, Marshall (2010-04-19). "Thoughts From the Man Who Would Sell The World, Nicely". ReadWriteWeb. Archived from the original on 2010-07-22. Retrieved 2024-11-28.
3. Higginbotham, Stacey (2009-09-22). "80Legs Is Where SETI@home Meets Google". GigaOM. Archived from the original on 2012-08-01. Retrieved 2024-11-28.
4. "80legs Cares About Your Bandwidth Cap". GigaOM. 2009-05-14. Archived from the original on 2009-06-18. Retrieved 2024-11-28.
5. "Reddit robots.txt update reflects new approach to tackle bots – Circuit Bulletin". Circuit Bulletin. 2024-11-02. Archived from the original on 2024-11-21. Retrieved 2024-11-15.
6. "DDoSed by 80legs". DataMadness. 2012-01-16. Archived from the original on 2012-01-16.
7. Complaint from OpenStreetMap. Twitter. https://twitter.com/openstreetmap/status/221188821721681920