Need access to high volumes of blog, news and social media data?
Spinn3r is a web service that provides raw access to posts, articles, tweets, status updates, and other content as it is published, in real or near real time. That lets you focus on building your application, mashup, or search engine. We find the sources, index their content, and take care of all the heavy lifting around delivering large amounts of relevant data.
How does it work?
Using our open source Reference Client, engineers within your organization call our API every few seconds for the freshest content. You could be up and running in less than an hour.
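As a rough illustration of that polling pattern, here is a minimal Python sketch. It is not the Reference Client and does not use the real Spinn3r API: `fetch_page`, the cursor value, and the 5-second default interval are all hypothetical stand-ins for whatever the actual client provides.

```python
import time
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical contract (an assumption, not the Spinn3r API):
# fetch_page(cursor) returns (items, next_cursor).
FetchPage = Callable[[Optional[str]], Tuple[List[dict], Optional[str]]]


def poll_feed(
    fetch_page: FetchPage,
    interval_seconds: float = 5.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Call fetch_page repeatedly, yielding each item as it arrives."""
    cursor: Optional[str] = None
    polls = 0
    while max_polls is None or polls < max_polls:
        items, next_cursor = fetch_page(cursor)
        polls += 1
        yield from items
        # Keep the last known position if the server returned no new cursor.
        if next_cursor is not None:
            cursor = next_cursor
        # Wait a few seconds before asking for fresher content.
        sleep(interval_seconds)
```

Passing `sleep` and `max_polls` as parameters is only a convenience so the loop can be exercised without waiting or running forever; a production consumer would typically run unbounded and add retry handling.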
Because we have years of crawling expertise and can amortize the cost of our infrastructure across a large customer base, we can provide the data at a significantly lower cost than you would pay to build an equivalent service yourself. By some measures, the savings are up to $45k per month.
Our fire hose feed monitors around 40 million sources and generates approximately 1,200,000 posts per hour (200 GB/day).
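To get a feel for what those fire hose numbers mean for a consumer, the short Python sketch below converts them into per-second rates and an average post size. It assumes decimal gigabytes (1 GB = 10^9 bytes) and a perfectly even flow over the day, neither of which is stated above; `feed_rates` is a name made up for this example.

```python
def feed_rates(posts_per_hour: int = 1_200_000, gb_per_day: float = 200.0) -> dict:
    """Derive rough throughput figures from the quoted feed volume."""
    bytes_per_day = gb_per_day * 10**9  # assumption: decimal GB
    posts_per_day = posts_per_hour * 24
    return {
        "posts_per_second": posts_per_hour / 3600,
        "posts_per_day": posts_per_day,
        "avg_bytes_per_post": bytes_per_day / posts_per_day,
        "megabytes_per_second": bytes_per_day / 86_400 / 10**6,
    }


if __name__ == "__main__":
    for name, value in feed_rates().items():
        print(f"{name}: {value:,.2f}")
```

Under those assumptions the feed averages about 333 posts per second, about 28.8 million posts per day, roughly 6.9 KB per post, and a sustained transfer rate of around 2.3 MB/s.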
High Availability Architecture
Spinn3r is built on a fault-tolerant infrastructure and is monitored 24/7 to ensure availability.
Media monitoring companies, marketing analytics firms, tech startups, and large research organizations rely on Spinn3r.
Hate spam? Don't worry - we've got your back! Spinn3r ships with integrated spam prevention.
We're sharing a lot of our internal stats with the public, including language breakdown, CMS breakdown, post frequency, etc.
Contact us to start a no-commitment evaluation period of our API.
We expose everything about our internal operation, including crawl throughput, language breakdown, publisher type breakdown, throughput per publisher, ping frequency, etc. Take a look at some of our screenshots for more information.
Spinn3r ships with a unified API that allows you to index the full RSS, the full HTML, the content-extracted version of a post, and all associated metadata under the same framework.
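One way to picture a post delivered under a single framework is as one record carrying every representation side by side. The Python sketch below is purely illustrative: the class `PostRecord`, its field names, and the `best_text` preference order are assumptions for this example, not the actual Spinn3r schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PostRecord:
    """Hypothetical container for the representations of one post."""

    permalink: str
    rss_content: Optional[str] = None     # body as published in the feed
    html: Optional[str] = None            # full page markup
    extracted_text: Optional[str] = None  # post text with ads/navigation removed
    metadata: Dict[str, str] = field(default_factory=dict)

    def best_text(self) -> Optional[str]:
        """Return the cleanest available text, preferring the extracted version."""
        for candidate in (self.extracted_text, self.rss_content, self.html):
            if candidate:
                return candidate
        return None
```

With a shape like this, an indexer can call `best_text()` for search content while still keeping the raw HTML and metadata for other uses.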
Spinn3r removes the advertising and navigational content, and isolates only the specific text on a given HTML page. This allows our customers to easily build search indexes to power their applications.