Common Crawl – Open Repository of Web Crawl Data Composed Of Over 5 Billion Freely Available Web Pages

Posted by Marcus Zillman

http://www.CommonCrawl.org/

The Common Crawl Foundation is a registered California 501(c)(3) non-profit founded by Gil Elbaz with the goal of democratizing access to web information by producing and maintaining an open repository of web crawl data that is universally accessible and analyzable. Its vision is of a truly open web that allows open access to information and enables greater innovation in research, business and education. The foundation levels the playing field by making wholesale extraction, transformation and analysis of web data cheap and easy. This resource will be added to Statistics Resources and Big Data Subject Tracer™ and to Bot Research Subject Tracer™. It has been added to the tools section of Research Resources Subject Tracer™ Information Blog.
