What's kumofs?
Kumofs is a simple and fast distributed key-value store. You can use a memcached client library to set, get, CAS, or delete values in kumofs. The backend storage is Tokyo Cabinet, which gives you great performance.
- Data is partitioned and replicated over multiple servers.
- Extreme single node performance; comparable with memcached.
- Both read and write performance improve as servers are added.
- Servers can be added without stopping the system.
- Servers can be added without modifying any configuration files.
- The system does not stop even if one or two servers crash.
- The system does not need to stop to recover crashed servers.
- Automatic rebalancing support with a consistency control algorithm.
- Safe CAS operation support.
- memcached protocol support.
Kumofs is used at Nico Nico douga (Wikipedia), the most popular video sharing service in Japan.
This benchmark measured the performance of one server node using three client machines. Each client machine gets 12,800 values of 1KB each from the server using 32 threads. The server is an Athlon64 X2 5000+ with 8GB DDR2 memory and an Intel PRO/1000 NIC, running linux-2.6.27.10 x86_64. The source code is available on GitHub (kumofs, voldemort).
Design
This benchmark measured the performance of the cluster using 50 client machines. Each client machine gets 1,024,000 entries from the cluster using 32 threads.
The servers are QuadCore Xeon X3350 machines running linux 2.6.27.21 x86_64. The clients are Pentium 4 3.20GHz machines running linux 2.6.27.21 i686.
Getting Started
Installation
You can install kumofs with ports on FreeBSD (databases/kumofs) or emerge on Gentoo Linux (net-misc/kumofs).
On other platforms, download the source package from GitHub and run ./configure && make && sudo make install.
Kumofs requires MessagePack (Ruby and C++) and Tokyo Cabinet to install. Please install them first.
Run on localhost
[localhost]$ kumo-manager -v -l localhost
[localhost]$ kumo-server -v -m localhost -l localhost:19801 -L 19901 -s ./kumodb1.tch
[localhost]$ kumo-server -v -m localhost -l localhost:19802 -L 19902 -s ./kumodb2.tch
[localhost]$ kumo-server -v -m localhost -l localhost:19803 -L 19903 -s ./kumodb3.tch
[localhost]$ kumo-gateway -v -m localhost -t 11211
[localhost]$ kumoctl localhost attach
kumofs doesn't use configuration files; all configuration is done through command-line arguments. You can use a memcached client to get(s), set, cas, and delete values in kumofs.
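Because kumo-gateway speaks the memcached text protocol, the localhost setup above can be smoke-tested without a client library. The following is a minimal sketch, not part of kumofs: the helper name memcached_set_request is hypothetical, it assumes a POSIX shell, and the usage line assumes nc (netcat) is installed. Flags and expiration time are both sent as 0.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with kumofs): print a memcached
# text-protocol "set" request for one key/value pair.
# Wire format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
memcached_set_request() {
    key=$1
    value=$2
    # Count bytes, not characters; the arithmetic expansion strips any
    # padding that some wc implementations print.
    bytes=$(( $(printf '%s' "$value" | wc -c) ))
    printf 'set %s 0 0 %d\r\n%s\r\n' "$key" "$bytes" "$value"
}
```

With the gateway from the example above listening on port 11211, something like `{ memcached_set_request greeting hello; printf 'get greeting\r\n'; } | nc localhost 11211` should store the value and read it back. How nc closes the connection after the input ends varies by implementation.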
Cluster configuration
[on mgr1]$ kumo-manager -l mgr1 -p mgr2
[on mgr2]$ kumo-manager -l mgr2 -p mgr1
[on svr1]$ kumo-server -m mgr1 -p mgr2 -l svr1 -s /var/kumodb.tch
[on svr2]$ kumo-server -m mgr1 -p mgr2 -l svr2 -s /var/kumodb.tch
[on svr3]$ kumo-server -m mgr1 -p mgr2 -l svr3 -s /var/kumodb.tch
[on app1]$ kumo-gateway -m mgr1 -p mgr2 -t 11211
[on app2]$ kumo-gateway -m mgr1 -p mgr2 -t 11211
[ ]$ kumoctl mgr1 attach
[ ]$ kumoctl mgr1 status
hash space timestamp:
Wed Dec 03 22:16:00 +0900 2008 clock 72
attached node:
192.168.0.101:19800 (active)
192.168.0.102:19800 (active)
192.168.0.103:19800 (active)
not attached node:
Now use a memcached client library to connect to localhost:11211 on the app1 or app2 host.
The kumo-gateway is always available at localhost, even if some servers crash or the number of kumo-servers changes.
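The status listing shown above is easy to check from a script, for example before pointing application traffic at the gateways. This is a sketch only: the helper name count_active_nodes is hypothetical, and it assumes kumoctl status keeps printing one line per node marked (active), as in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: read `kumoctl <manager> status` output on stdin
# and print how many nodes are marked "(active)".
count_active_nodes() {
    awk '/\(active\)/ { n++ } END { print n + 0 }'
}
```

For the listing above, `kumoctl mgr1 status | count_active_nodes` would print 3.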
Cluster Management
Monitoring
Kumofs bundles some management tools. kumotop lets you monitor the status of kumo-servers, like the UNIX 'top' command.
$ kumotop -m mgr
This screenshot shows that a server running on 192.168.10.255:19800 is processing 89,877 Get requests and 15,137 Set requests per second. It has processed 10,854,681 Get requests and 2,062,999 Set requests so far, and it stores 522,748 items.
Adding them up, the 6-node cluster is processing 646,067 requests per second and stores 3,145,344 items.
Adding, removing and recovering servers
To add servers to the cluster, start the new kumo-servers and run kumoctl mgr attach.
To recover crashed servers, restart the servers and run kumoctl mgr attach.
To remove crashed servers and restore the number of replicas, run kumoctl mgr detach.
kumo-manager performs attach/detach automatically when --auto-replace is specified on the command line.
Note that attach/detach starts rebalancing, which can cause heavy network traffic depending on the total amount of stored data and the number of kumo-servers.
If one of the two kumo-managers is down, just restart it on the same address and port, and then run kumoctl mgr replace.
Even if both kumo-managers are down, it is not a problem. Restart them on the same address and port, and run kumoctl mgr attach.
Tuning
Tuning the database is very important for performance. Use the tchmgr command to optimize I/O performance.
First, look into the current status.
$ tchmgr inform -nl /var/kumodb.tch
path: /var/kumodb.tch
database type: hash
additional flags:
bucket number: 131071
alignment: 16
free block pool: 1024
inode number: 372172662
modified time: 2010-02-18T13:57:42+09:00
options:
record number: 59528
file size: 109328752
The most important item is 'bucket number'. It should be set to 2-4 times the number of records.
Here the 'record number' is 59,528 while the bucket number is 131,071, only about 2.2 times larger, which is near the bottom of that range. Now set it to 200,000.
$ cp /var/kumodb.tch /var/kumodb.tch.backup
$ tchmgr optimize /var/kumodb.tch 200000
You have to stop the kumo-server before optimizing. It is recommended that you create a tuned database file before running kumo-server.
$ tchmgr create /var/kumodb.tch 200000
$ kumo-server -v -m mgr1 -p mgr2 -l svr1 -s /var/kumodb.tch
See the Tokyo Cabinet documentation for details.
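The sizing rule above (a bucket number 2-4 times the record count) can be turned into a small calculation. This is a sketch under assumptions: both helper names are hypothetical, the default factor of 3 is simply the midpoint of that range, and the parser assumes tchmgr inform prints a 'record number: N' line as in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: extract the record count from `tchmgr inform`
# output read on stdin (expects a line like "record number: 59528").
record_number_from_inform() {
    awk -F': *' '$1 == "record number" { print $2 }'
}

# Hypothetical helper: suggest a bucket number as records * factor.
# The factor defaults to 3, inside the recommended 2-4x range.
suggest_bucket_number() {
    records=$1
    factor=${2:-3}
    echo $(( records * factor ))
}
```

For the 59,528 records above, `suggest_bucket_number "$(tchmgr inform -nl /var/kumodb.tch | record_number_from_inform)"` gives 178584. The result can be passed to tchmgr optimize or tchmgr create in the same way as the 200000 used earlier.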
Learn More
Chat
There are kumofs developers and operators in the #kumofs channel on irc.freenode.net. If you are new to IRC and don't have a client, you can use a web-based client.
Article