Index weblogs, mainstream news, and social media.
Supports RSS, Atom, HTML, microformats, and microdata web formats.
All our APIs are powered by JSON for ease of use and
Distributed with a full firehose API which handles
95% of the data indexing requirements. No coding
required. Just start it up and it spools JSON files to
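
As a rough illustration of consuming spooled output, the sketch below reads JSON files from a local directory. It assumes one JSON document per file with a `.json` extension; the directory layout, the file naming, and the helper name `read_spooled_items` are all hypothetical and are not taken from Spinn3r's documentation.

```python
import json
from pathlib import Path
from typing import Iterator


def read_spooled_items(spool_dir: str) -> Iterator[dict]:
    """Yield one parsed JSON document per *.json file found in spool_dir.

    Assumption: each spooled file holds a single JSON document.
    Files are visited in name order; unreadable or malformed files are skipped.
    """
    for path in sorted(Path(spool_dir).glob("*.json")):
        try:
            with path.open(encoding="utf-8") as handle:
                item = json.load(handle)
        except (OSError, ValueError):
            # ValueError covers both JSON syntax errors and bad encodings.
            continue
        yield item
```

A consumer could then loop with `for item in read_spooled_items("/path/to/spool"):` and hand each document to its own storage or processing step.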
Full visibility into our crawl. We provide a comprehensive admin
console for use by our customers.
+300M Sources Indexed
Over 300M sources are indexed and available through the API.
Vast coverage of social media, weblogs, mainstream
news, and more.
Integrated full-text search powered by
and Kibana. Run
powerful queries and aggregations on
raw data. Full-text search allows for precise
queries over vast amounts of data.
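
To give a feel for combining a query with an aggregation, here is a minimal sketch that builds a request body in the Elasticsearch-style query DSL. That the search backend accepts this DSL is an assumption (the text above names only Kibana), and the field names `content` and `lang` are hypothetical placeholders rather than the actual index schema.

```python
import json


def build_search_body(phrase: str, bucket_field: str, size: int = 10) -> dict:
    """Build a phrase query plus a terms aggregation.

    Assumptions: the backend speaks an Elasticsearch-style query DSL,
    and "content" is a hypothetical full-text field in the index.
    """
    return {
        "size": size,  # number of matching documents to return
        "query": {"match_phrase": {"content": phrase}},
        # Count matching documents per distinct value of bucket_field.
        "aggs": {"by_bucket": {"terms": {"field": bucket_field}}},
    }


# Example: find posts mentioning a phrase and count them per language.
print(json.dumps(build_search_body("electric vehicles", "lang"), indent=2))
```

The resulting dictionary would be serialized and sent to whatever search endpoint the deployment exposes; that endpoint is deliberately left out here because it is not described in the text.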
Integrated boilerplate removal and content
extraction based on state-of-the-art information
retrieval techniques. Excludes ads, navigation, and
other miscellaneous text on a page.
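
The general idea behind boilerplate removal can be sketched with a simple link-density heuristic: blocks that are short or consist mostly of link text (menus, footers) are dropped, and longer prose blocks are kept. This is a toy illustration using only the Python standard library, not Spinn3r's actual algorithm; the thresholds and the names `extract_main_text` and `_BlockCollector` are assumptions chosen for the example.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"p", "div", "li", "td", "h1", "h2", "h3", "article", "section"}
SKIP_TAGS = {"script", "style"}


class _BlockCollector(HTMLParser):
    """Split a page into text blocks and count link characters per block."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # finished (text, link_chars) pairs
        self._text = []       # text pieces of the block being built
        self._link_chars = 0  # characters of the current block inside <a>
        self._in_link = 0
        self._skip = 0

    def flush(self):
        text = " ".join("".join(self._text).split())  # normalize whitespace
        if text:
            self.blocks.append((text, self._link_chars))
        self._text, self._link_chars = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._skip:
            return  # ignore script and style contents
        self._text.append(data)
        if self._in_link:
            self._link_chars += len(data.strip())


def extract_main_text(html, min_chars=40, max_link_density=0.3):
    """Keep blocks that are long enough and not dominated by link text."""
    collector = _BlockCollector()
    collector.feed(html)
    collector.close()
    collector.flush()  # emit any trailing text after the last block tag
    kept = [
        text
        for text, link_chars in collector.blocks
        if len(text) >= min_chars and link_chars / len(text) <= max_link_density
    ]
    return "\n".join(kept)
```

Production extractors typically combine many more signals (text density, DOM position, per-site templates); the two thresholds here only show the shape of the decision.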
Language and Spam Detection
Full language detection. Hate spam? Don't worry!
Spinn3r ships with integrated spam prevention.
Spinn3r is built on a fault-tolerant
infrastructure and is monitored 24/7 to ensure