What's kumofs?

Kumofs is a simple and fast distributed key-value store. You can use any memcached client library to set, get, CAS, or delete values in kumofs. The storage backend is Tokyo Cabinet, which gives kumofs high performance.

Kumofs is used at Nico Nico douga, the most popular video sharing service in Japan.

Kumofs vs. Voldemort Speed Test

This test measured the performance of a single server node using three client machines. Each client machine gets 12,800 values of 1KB each from the server using 32 threads. The server is an Athlon64 X2 5000+ with 8GB of DDR2 memory, an Intel PRO/1000 NIC, and linux-2.6.27.10 x86_64. The source code is available from github (kumofs, voldemort).
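
The measurement described above could be reproduced in outline with something like the following Python sketch. It is not the benchmark program the authors used (that source is on github). It assumes the gets are shared among the 32 threads, and `get` stands for any callable that fetches one key, for example a method of a memcached client connected to the server under test. The function name `run_get_benchmark` is hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_get_benchmark(get, keys, threads=32):
    """Fetch every key once using a pool of worker threads.

    Returns (hits, elapsed_seconds, gets_per_second). A result of None
    from `get` is counted as a miss.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(get, keys))
    elapsed = time.perf_counter() - start
    hits = sum(1 for value in results if value is not None)
    rate = (len(keys) / elapsed) if elapsed > 0 else float("inf")
    return hits, elapsed, rate
```

Passing the client as a plain callable keeps the timing loop independent of any particular memcached library. Because Python threads share one interpreter lock, the absolute numbers from a sketch like this will mostly reflect client overhead rather than server capacity.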

Design

Design of kumofs

kumo-servers store data and replicate it to other kumo-servers. kumo-managers monitor whether kumo-servers are alive and perform automatic rebalancing when the number of kumo-servers changes. kumo-gateways relay requests from client applications to kumo-servers. Because kumo-gateway implements the memcached protocol, you can use a memcached client library to access kumofs.

Scalability of kumofs

This test measured the performance of the cluster using 50 client machines. Each client machine gets 1,024,000 entries from the cluster using 32 threads. The servers are QuadCore Xeon X3350 machines running linux 2.6.27.21 x86_64. The clients are Pentium 4 3.20GHz machines running linux 2.6.27.21 i686.

Getting Started

Installation

You can install kumofs from ports on FreeBSD (databases/kumofs) or with emerge on Gentoo Linux (net-misc/kumofs).
On other platforms, download the source package from github and run ./configure && make && sudo make install.
Kumofs requires MessagePack (Ruby and C++) and Tokyo Cabinet. Please install them first.

Run on localhost

[localhost]$ kumo-manager -v -l localhost
[localhost]$ kumo-server  -v -m localhost -l localhost:19801 -L 19901 -s ./kumodb1.tch  # -l is RPC address
[localhost]$ kumo-server  -v -m localhost -l localhost:19802 -L 19902 -s ./kumodb2.tch  # and port, -L is
[localhost]$ kumo-server  -v -m localhost -l localhost:19803 -L 19903 -s ./kumodb3.tch  # stream port.
[localhost]$ kumo-gateway -v -m localhost -t 11211                   # memcached client is accepted on -t port.
[localhost]$ kumoctl localhost attach                                # attach kumo-servers into the cluster.

kumofs doesn't use configuration files. All configuration is done through command-line arguments. You can use a memcached client to get(s)/set/cas/delete values in kumofs.
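
As a minimal sketch of what a memcached client sends to kumo-gateway, the Python below builds standard memcached text-protocol requests for set, gets, cas and delete, and includes a small helper that sends one request over a socket. It assumes the gateway started above is listening on localhost:11211. The function names are illustrative and not part of kumofs, and flags and expiration time are left at 0 because this page does not say how kumofs treats them.

```python
import socket


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol 'set' request."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_gets(key: str) -> bytes:
    """Build a 'gets' request, which returns the value plus its CAS id."""
    return f"gets {key}\r\n".encode("ascii")


def build_cas(key: str, value: bytes, cas_id: int,
              flags: int = 0, exptime: int = 0) -> bytes:
    """Build a 'cas' request that stores only if the CAS id still matches."""
    header = f"cas {key} {flags} {exptime} {len(value)} {cas_id}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_delete(key: str) -> bytes:
    """Build a 'delete' request."""
    return f"delete {key}\r\n".encode("ascii")


def parse_gets(response: bytes):
    """Parse a single-key 'gets' reply into (value, cas_id), or None on a miss."""
    if response.startswith(b"END"):
        return None
    header, rest = response.split(b"\r\n", 1)
    _, _key, _flags, length, cas_id = header.split(b" ")
    return rest[: int(length)], int(cas_id)


def send(request: bytes, host: str = "localhost", port: int = 11211) -> bytes:
    """Send one request and return the first chunk of the reply (small values only)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        return sock.recv(65536)
```

With the local cluster running, `send(build_set("greeting", b"hello"))` followed by `parse_gets(send(build_gets("greeting")))` would exercise a full round trip. In practice an existing memcached client library is the better choice; the raw requests are shown only to make the protocol visible.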

Cluster configuration

[on mgr1]$ kumo-manager -l mgr1 -p mgr2                 # you can run two kumo-managers for redundancy.
[on mgr2]$ kumo-manager -l mgr2 -p mgr1                 # specify their address (and port) mutually.
[on svr1]$ kumo-server  -m mgr1 -p mgr2 -l svr1 -s /var/kumodb.tch  # specify managers’ addresses (-m and -p),
[on svr2]$ kumo-server  -m mgr1 -p mgr2 -l svr2 -s /var/kumodb.tch  # address and port of the server (-l) and
[on svr3]$ kumo-server  -m mgr1 -p mgr2 -l svr3 -s /var/kumodb.tch  # path to the database file (-s).
[on app1]$ kumo-gateway -m mgr1 -p mgr2 -t 11211     # run kumo-gateway on every application server to avoid
[on app2]$ kumo-gateway -m mgr1 -p mgr2 -t 11211     # concentration of load and risk of hardware failure.
[       ]$ kumoctl mgr1 attach    # attach kumo-servers if --auto-replace option is not set for kumo-managers.
[       ]$ kumoctl mgr1 status    # confirm the status of the cluster at the end.
    hash space timestamp:
      Wed Dec 03 22:16:00 +0900 2008 clock 72
    attached node:
      192.168.0.101:19800  (active)
      192.168.0.102:19800  (active)
      192.168.0.103:19800  (active)
    not attached node:

Now use a memcached client library to connect to localhost:11211 on the app1 or app2 host. The kumo-gateway is always available at localhost, even if some servers have crashed or the number of kumo-servers has changed.
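
For scripted health checks, the status output shown above can be parsed with a few lines of Python. This is a sketch that assumes the layout is exactly as in the sample (section headers ending in a colon, followed by indented entries); the function names are hypothetical, and states other than "active" are not documented here.

```python
SAMPLE = """\
    hash space timestamp:
      Wed Dec 03 22:16:00 +0900 2008 clock 72
    attached node:
      192.168.0.101:19800  (active)
      192.168.0.102:19800  (active)
      192.168.0.103:19800  (active)
    not attached node:
"""


def parse_status(text: str) -> dict:
    """Group the lines of the status output under their section headers."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A header such as "attached node:" starts a new section.
            current = line[:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def attached_nodes(sections: dict) -> dict:
    """Map each attached node's address to its state, e.g. 'active'."""
    nodes = {}
    for entry in sections.get("attached node", []):
        address, state = entry.split(None, 1)
        nodes[address] = state.strip().strip("()")
    return nodes
```

A monitoring script could feed the captured output of kumoctl mgr1 status to `parse_status` and alert when any attached node is not active or when the "not attached node" section is non-empty.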

Cluster Management

Monitoring

Kumofs bundles some management tools. kumotop lets you monitor the status of kumo-servers, like the UNIX 'top' command.

$ kumotop -m mgr
[Screenshot: kumotop showing a 6-node cluster of kumofs]

This screenshot shows that the server running on 192.168.10.255:19800 is processing 89,877 Get requests and 15,137 Set requests per second. It has processed 10,854,681 Get requests and 2,062,999 Set requests so far, and it stores 522,748 items.
In total, the 6-node cluster is processing 646,067 requests per second and stores 3,145,344 items.

Adding, removing and recovering servers

To add servers to the cluster, start the new kumo-servers and run kumoctl mgr attach.
To recover crashed servers, restart them and run kumoctl mgr attach.
To remove crashed servers and restore the number of replicas, run kumoctl mgr detach.

kumo-managers attach and detach servers automatically when --auto-replace is specified on the command line.
Note that attach and detach start rebalancing, which can cause heavy network traffic depending on the total amount of stored data and the number of kumo-servers.

If one of the two kumo-managers is down, restart it on the same address and port, and then run kumoctl mgr replace.
Even if both kumo-managers are down, this is not a problem: restart them on the same address and port, and run kumoctl mgr attach.

Tuning

Tuning the database is very important for performance. Use the tchmgr command to optimize I/O performance.
First, check the current status.

$ tchmgr inform -nl /var/kumodb.tch
path: /var/kumodb.tch
database type: hash
additional flags:
bucket number: 131071
alignment: 16
free block pool: 1024
inode number: 372172662
modified time: 2010-02-18T13:57:42+09:00
options:
record number: 59528
file size: 109328752

The most important item is 'bucket number'. It should be 2-4 times larger than the number of records. Here the 'record number' is 59,528 while the bucket number is 131,071, only about 2.2 times as many, which is at the low end of that range. Raise it to 200,000, about 3.4 times the record count.

$ cp /var/kumodb.tch /var/kumodb.tch.backup
$ tchmgr optimize /var/kumodb.tch 200000

You have to stop the kumo-server before optimizing. It is recommended that you create a tuned database file before running kumo-server.

$ tchmgr create /var/kumodb.tch 200000
$ kumo-server -v -m mgr1 -p mgr2 -l svr1 -s /var/kumodb.tch

See the Tokyo Cabinet documentation for details.
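
The bucket-number rule of thumb above can be checked with a short Python sketch. It parses the "key: value" lines printed by tchmgr inform, computes the ratio of buckets to records, and returns the 2-4 times range for a given record count. The function names are hypothetical, and the sample text is an excerpt of the output shown earlier in this section.

```python
INFORM_SAMPLE = """\
path: /var/kumodb.tch
database type: hash
bucket number: 131071
alignment: 16
free block pool: 1024
record number: 59528
file size: 109328752
"""


def parse_inform(text: str) -> dict:
    """Turn 'key: value' lines into a dict, splitting on the first colon only."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def bucket_ratio(info: dict) -> float:
    """Return buckets per record for a parsed inform report."""
    return int(info["bucket number"]) / int(info["record number"])


def suggested_bucket_range(record_number: int, low: int = 2, high: int = 4) -> tuple:
    """Return the (minimum, maximum) bucket number under the 2-4 times rule."""
    return record_number * low, record_number * high
```

For the sample, `bucket_ratio` gives roughly 2.2 and `suggested_bucket_range(59528)` gives 119,056 to 238,112, so the value of 200,000 used above falls inside the range. When sizing a new database, base the calculation on the number of records you expect to store rather than the current count.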

Learn More

Installation

Installation Guide.

Design

Design of kumofs.

HowTo

Tutorials and How-To.

Troubleshooting

Troubleshooting.

Reference

Command line reference.

FAQ

FAQs.

Chat

There are kumofs developers and operators in the #kumofs channel on irc.freenode.net. If you are new to IRC and don't have a client, you can use a web-based client.

Article

Mailing List

Google Groups