Riak at Appush, San Francisco NoSQL Meetup

2010, May 13  —  ...

A few weeks ago the San Francisco NoSQL Meetup Group held its first meeting at CBS Interactive in San Francisco, with the topic of Riak at Appush presented by Dan Reverri from Appush. I had not previously heard of Riak, so before attending the talk I very briefly looked up what it is. I discovered it was a key-value store created by Basho, and since I had been doing some reading about Redis and Voldemort, that was enough information to get me interested. After the talk, thanks to Dan's great presentation, I discovered Riak has a lot more capabilities than just a simple key-value store.

Riak is more than a simple single-server key-value store: it is distributed, scalable, and supports replication. Unlike Redis, you don't need to implement your own sharding strategy; scalability is built in and automatic. When you add a new physical node, Riak will automatically redistribute data. It also includes a query language for performing map-reduce queries on the nodes.

For a small team that isn't an Erlang shop, one downside to Riak is that it is primarily written in Erlang and C. Why is this a downside? I've heard a valid recommendation that when you are using these new NoSQL products, it really helps to know the language they were written in so that you can help track down the source of bugs (and maybe even submit patches). If you use that language on a daily basis, it makes that job much easier.

This really seems like a promising solution for a portion of a re-architecture project that we will be conducting at Spoke. My main concern is Erlang under the covers: we just don't have the bandwidth to learn yet another programming language.

Dan Reverri's presentation was quite comprehensive and informative, and you can view his Riak slides on Prezi. Below are my notes from the presentation. Note: these notes were taken using Evernote on an iPad, so there might be spelling and grammar issues.

Presentation by Dan Reverri from Appush

How he came to find out about Riak.

Needed to build RESTful services. Looked into: Rails, Django, Twisted, Sinatra

If built truly compliant RESTful services on Sinatra, would have to code up a lot of work. WTF.

Came across something called Webmachine, which helps you build proper, perfect REST. Developed in Erlang. Webmachine is a Basho tool. They also use Basho Rebar. Finally, came across Riak.

Thought that this is a hard problem to solve and that they probably didn't do it right, but they did.

Riak at the heart is a k/v store. There is something called the ring server and a data node for the ring server.

160-bit integer space divided into equally sized partitions across the entire cluster.

Each partition is claimed by a vnode and several vnodes run per node. Vnodes migrate between nodes as nodes come up and down. Keys are consistently hashed. Every node in the system can work out where any key lives. It is a distributed hashtable.
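To make the ring idea concrete, here is a minimal sketch of hashing a key onto a 160-bit space cut into equal partitions. It is not Riak's code: the use of SHA-1, the partition count of 64, the round-robin claim, and every function name below are assumptions chosen for illustration.

```python
import hashlib
from typing import Dict, List

RING_SIZE = 2 ** 160        # size of the 160-bit integer space
NUM_PARTITIONS = 64         # assumed partition count, chosen for this sketch
PARTITION_SIZE = RING_SIZE // NUM_PARTITIONS


def key_position(bucket: str, key: str) -> int:
    """Hash a bucket/key pair to a point on the ring (SHA-1 is an assumption)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_for(bucket: str, key: str) -> int:
    """Index of the equally sized partition that contains the key."""
    return key_position(bucket, key) // PARTITION_SIZE


def claim_partitions(nodes: List[str]) -> Dict[int, str]:
    """Hand each partition (one vnode) to a physical node, round-robin."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def node_for(bucket: str, key: str, claims: Dict[int, str]) -> str:
    """Any node holding the claims table can answer this without asking a master."""
    return claims[partition_for(bucket, key)]


claims = claim_partitions(["node1", "node2", "node3"])
owner = node_for("users", "dan", claims)
```

A key's partition depends only on its hash, so adding a node changes which node claims a partition but never which partition a key belongs to. The round-robin claim here is a simplification and moves more partitions than a real cluster would want to.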

Any node in the cluster can act as a server and can accept requests.

No node is special. No master and no single point of failure.

Adding a node will add more capacity and more computation.

Pluggable backends. You can change how it stores data. Memory: ETS. File: DETS, Innostore, Bitcask

You can run a couple nodes in memory and a couple on disk.

Built a document store on top of the hashtable. May be inconsistent for a while, but you can control the consistency:

N: number of replicas, per bucket
R: number of replicas needed for a read, per request
W: number of replicas needed for a write, per request
DW: number of replicas needed for a durable write (flushed to disk), per request
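The interplay of N, R, and W is easier to see in a toy model. The sketch below is an assumption-laden simulation, not Riak's client API: the Replica class, the integer version (standing in for a vector clock), and both function names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Replica:
    """One of the N copies of a bucket's data (hypothetical model)."""
    up: bool = True
    data: Dict[str, Tuple[int, str]] = field(default_factory=dict)  # key -> (version, value)


def quorum_write(replicas: List[Replica], key: str, version: int, value: str, w: int) -> bool:
    """Write to every reachable replica; succeed only if at least w acknowledged."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = (version, value)
            acks += 1
    return acks >= w


def quorum_read(replicas: List[Replica], key: str, r: int) -> Optional[str]:
    """Collect replies from reachable replicas; need at least r of them."""
    replies = [rep.data[key] for rep in replicas if rep.up and key in rep.data]
    if len(replies) < r:
        return None
    return max(replies)[1]  # highest version wins in this simplified model


replicas = [Replica() for _ in range(3)]   # N = 3
replicas[2].up = False                     # one replica is unreachable
wrote = quorum_write(replicas, "profile", 1, "v1", w=2)
value = quorum_read(replicas, "profile", r=2)
strict = quorum_read(replicas, "profile", r=3)
```

With one of three replicas down, W=2 and R=2 still succeed, while R=3 cannot be satisfied. Lower R and W buy availability; higher values buy stronger consistency.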

The CAP theorem says that any distributed system has three properties, consistency (cannot read a new value and then an old one), availability (writes are accepted), and partition tolerance, but you can only have two of them at once. In Riak, you may not be able to control which two you have at any given moment.

What you are storing is documents. Any distributed system requires merging of documents. Documents may have multiple values and are merged by vector clocks. Let's say client 1 makes one change to a doc, then makes a second write on node 1. The vector clock records who made each change and where; conflicts are then resolved with the options you specify.
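A vector clock can be modeled as a map from actor to counter. The sketch below shows the generic technique under that assumption; it is not Riak's internal representation, and the function names are hypothetical.

```python
from typing import Dict

VClock = Dict[str, int]  # actor id -> number of writes seen from that actor


def increment(clock: VClock, actor: str) -> VClock:
    """Return a copy of the clock with one more write recorded for actor."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a: VClock, b: VClock) -> bool:
    """True if a has seen every write that b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: VClock, b: VClock) -> str:
    """Classify two clocks as equal, ordered, or concurrent (a conflict)."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"


def merge(a: VClock, b: VClock) -> VClock:
    """Clock for a value that resolves a conflict between a and b."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in set(a) | set(b)}


first = increment({}, "client1")        # client 1 writes once
second = increment(first, "client1")    # client 1 writes again
other = increment(first, "client2")     # client 2 writes from the first version
```

Here `second` descends from `first`, so it simply replaces it. `second` and `other` both started from `first` and neither has seen the other's write, so they are concurrent and the application has to resolve the conflict.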

No concept of locks.

In order to query the documents, they have built in a map-reduce framework. One of the focuses of the team is to run the map-reduce as close to the data as possible. It will run on the server on which the document exists. Reduce is run on the server where the request was initially made, but this might change. It has phases, is chainable, and runs near the data. Great for reporting.
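The split between map (next to the data) and reduce (on the node that took the request) can be sketched in a few lines. This is a single-process simulation under stated assumptions: the node names, the document shape, and run_map_reduce are all hypothetical, and it does not use Riak's query language.

```python
from typing import Callable, Dict, List


def run_map_reduce(
    nodes: Dict[str, List[dict]],
    map_fn: Callable[[dict], list],
    reduce_fn: Callable[[list], list],
) -> list:
    """Run map_fn over each node's local documents, then reduce once."""
    partials: list = []
    for local_docs in nodes.values():
        # In a real cluster this inner loop would run on the node that holds the docs.
        for doc in local_docs:
            partials.extend(map_fn(doc))
    # The reduce phase runs on the coordinating node, over all map outputs.
    return reduce_fn(partials)


# Hypothetical data: order documents spread over two nodes.
cluster = {
    "node1": [{"amount": 5}, {"amount": 7}],
    "node2": [{"amount": 3}],
}

total = run_map_reduce(
    cluster,
    map_fn=lambda doc: [doc["amount"]],
    reduce_fn=lambda values: [sum(values)],
)
# total is [15]
```

Because both phases take and return lists, the output of one phase can feed another, which is what makes the phases chainable.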

Has the concept of link walking built on top of the map-reduce. A link is like a hyperlink to another document. Don't use Riak to implement a graph database. It's a singly linked list and not a doubly linked list.
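As a rough illustration of link walking, the sketch below follows tagged links from one document to the next. The in-memory store, the (target, tag) link shape, and the walk function are assumptions for this example, not Riak's link format or API.

```python
from typing import Dict, List


def walk(store: Dict[str, dict], start_key: str, tag: str, depth: int) -> List[str]:
    """Follow links carrying the given tag for depth hops, returning the keys reached."""
    current = [start_key]
    for _ in range(depth):
        current = [
            target
            for key in current
            for target, link_tag in store[key]["links"]
            if link_tag == tag
        ]
    return current


# Hypothetical documents: each one lists outgoing links as (target key, tag).
store = {
    "alice": {"links": [("bob", "friend")]},
    "bob": {"links": [("carol", "friend")]},
    "carol": {"links": []},
}

one_hop = walk(store, "alice", "friend", 1)    # ["bob"]
two_hops = walk(store, "alice", "friend", 2)   # ["carol"]
backwards = walk(store, "carol", "friend", 1)  # []
```

Links only point one way: starting from "carol" there is no way to find out that "bob" links to it, which is why this is a poor fit for graph-style queries.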

Tip: always use return body to make sure the vclock is used.

Can run queries at 2ms per request.

Using Erlang for the map-reduce might be better than using the JavaScript. Innostore or Bitcask are pretty much the only two choices for a consistent level of performance. Selected Cassandra to build next gen product. Cassandra didn't work well under zero load but works great under load. Then Riak comes out and it was a simpler system. Voldemort is an alternative in the Java world and similar in performance on EC2 in a 10-machine setting. Riak is slightly better at reads and slightly worse at writes. Running a 10-node cluster in testing on EC2.

Virtualization hurts the JVM.

Riak takes 5 minutes to install unless you have not installed Erlang yet. There are packages for it.

IRC and the email list are very responsive.

ElasticSearch is a cluster wrapper around Lucene.