A high-quality social media index with firehose and search APIs

Every hour we index over 4 million unique posts published across more than 200 million URLs, a publishing pace that accelerates every single day as more individuals publish their unique views and perspectives online.

Simple to use

You can be up and running with Spinn3r in less than an hour. We ship a standard reference client that integrates directly with your pipeline. If you're running Java, you can get started in minutes. If you're using another language, you only need to parse a few JSON files every few seconds.
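To give a feel for what parsing a few JSON files might involve outside Java, here is a minimal Python sketch. The payload shape and the field names (`items`, `permalink`, `title`) are assumptions made up for this illustration, not the actual Spinn3r format, and the sample batch is inlined rather than fetched from any real endpoint.

```python
import json

# Hypothetical batch payload: the real response format is not described here,
# so these field names are illustrative assumptions only.
SAMPLE_BATCH = """
{"items": [
  {"permalink": "http://example.com/a", "title": "First post"},
  {"permalink": "http://example.com/b", "title": "Second post"},
  {"permalink": "http://example.com/a", "title": "First post"}
]}
"""


def parse_batch(raw, seen):
    """Return the posts in one JSON batch, skipping permalinks already seen."""
    posts = []
    for item in json.loads(raw).get("items", []):
        link = item.get("permalink")
        if link is None or link in seen:
            continue  # drop entries without a permalink and duplicates
        seen.add(link)
        posts.append(item)
    return posts


seen = set()
new_posts = parse_batch(SAMPLE_BATCH, seen)
for post in new_posts:
    print(post["permalink"], post["title"])
```

In a real integration, a loop of this kind would run each time a new batch arrives, with `seen` kept between runs so that repeated posts are handed to your pipeline only once.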

Built on web standards

Spinn3r is built from the ground up to index raw HTML5. This includes HTML metadata such as microformats and microdata, which is what Google and other search engines rely on when indexing content. We don’t stop there. We also index RSS and Atom (including all 9 different RSS variants). We believe strongly in the robustness principle. Typical RSS parsers are fragile; ours is not. If there are small errors in the source file, we transparently correct them to make sure you get the content you need.
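The robustness principle can be illustrated with a small Python sketch: try a strict XML parse first, and if that fails, repair one common small error and try again. This is only an illustration under stated assumptions, not a description of how Spinn3r's parser works. The function names are hypothetical, and the single repair shown (escaping bare ampersands) stands in for the many corrections a production parser would need.

```python
import re
import xml.etree.ElementTree as ET

# Matches an ampersand that does not start an entity or character reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def parse_feed_leniently(xml_text):
    """Parse strictly first; on failure, escape bare ampersands and retry once.

    Hypothetical helper for illustration. Any other kind of error still
    raises ET.ParseError from the second attempt.
    """
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        repaired = BARE_AMP.sub("&amp;", xml_text)
        return ET.fromstring(repaired)


def item_titles(xml_text):
    """Collect the text of every <title> element in the feed."""
    root = parse_feed_leniently(xml_text)
    return [title.text for title in root.iter("title")]


# A feed with an unescaped ampersand, which a strict XML parser rejects.
BROKEN_FEED = "<rss><channel><item><title>Tom & Jerry</title></item></channel></rss>"
titles = item_titles(BROKEN_FEED)
print(titles)
```

Trying the strict parse first means well-formed feeds are never altered. The repair step only runs on input that would otherwise have been lost.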

Reliable infrastructure

Our infrastructure is state of the art and designed to scale. We’re hosted at SoftLayer (an IBM company) and all data is stored on ultra-fast Intel SSDs. We store data in both Cassandra and Elasticsearch, and everything runs on a horizontally scalable Java crawling platform we’ve developed over the last 8 years.

We have over 40 servers and store more than 10 TB of content. Every piece of our infrastructure is designed with triple redundancy, with additional hardware on standby in case of a failure.

Spinn3r is monitored 24/7 for any potential error in the system. We're so confident in our infrastructure that we back our service with a top-notch SLA so you can sleep well at night.