Web Analytics

Monday, February 4, 2013

HOW TO BENCHMARK HBASE USING YCSB

YCSB (Yahoo! Cloud Serving Benchmark) is a popular tool for evaluating the performance of different key-value and cloud serving stores. You can use it to test the read/write performance of your HBase cluster, and in my experience it is very effective. In this post I'll show you how to build and use YCSB for your particular version of HBase. So, this is just about setting up and using YCSB and not about YCSB itself. For detailed information on YCSB, see the links below :

1- Github-YCSB page : https://github.com/brianfrankcooper/YCSB
2- The paper from ACM Symposium on Cloud Computing, "Benchmarking Cloud Serving Systems with YCSB" : http://research.yahoo.com/files/ycsb.pdf

So, let us get started...

Step1- Clone the YCSB git repository :

apache@hadoop:~$ git clone http://github.com/brianfrankcooper/YCSB.git

This will create a directory called YCSB inside your current directory. (It might take some time depending on your internet connection speed. So, be patient.)

Step2- Go inside this newly created YCSB directory and then into its hbase directory. You will find an XML file there named pom.xml. Open this pom.xml file and edit it so that it looks like this :

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.yahoo.ycsb</groupId>
    <artifactId>root</artifactId>
    <version>0.1.4</version>
  </parent>
  <artifactId>hbase-binding</artifactId>
  <name>HBase DB Binding</name>
  <dependencies>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <!--<version>${hbase.version}</version>-->
      <version>0.94.4</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <!--<version>1.0.0</version>-->
      <version>1.0.4</version>
    </dependency>
    <dependency>
      <groupId>com.yahoo.ycsb</groupId>
      <artifactId>core</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>${maven.assembly.version}</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <appendAssemblyId>false</appendAssemblyId>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

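Pay attention to the two <version> elements under the hbase and hadoop-core dependencies, each of which sits directly below the commented-out original line. These are the changes that you have to make in order to build YCSB without any problem for your specific version of HBase.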

NOTE : As of this writing I am using hadoop-1.0.4 and hbase-0.94.4, so I have specified these versions in the file shown above. You have to specify the versions which you are going to use.
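
If you want to double-check your edit before building, a small script can list the versions that Maven will actually see. This is only a minimal sketch in Python, not part of YCSB: the function name dependency_versions and the SAMPLE_POM string (a trimmed copy of the dependency section above) are my own illustrative choices.

```python
import xml.etree.ElementTree as ET

# Maven POM elements live in this default XML namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def dependency_versions(pom_xml):
    """Return {"groupId:artifactId": version} for every <dependency> in a POM."""
    root = ET.fromstring(pom_xml)
    versions = {}
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        group = dep.findtext("m:groupId", default="", namespaces=POM_NS)
        artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS)
        # XML comments are skipped by the parser, so the commented-out
        # original <version> lines are never picked up here.
        version = dep.findtext("m:version", default="", namespaces=POM_NS)
        versions[group + ":" + artifact] = version
    return versions


# Trimmed copy of the dependency section shown above, used as demo input.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <!--<version>${hbase.version}</version>-->
      <version>0.94.4</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <!--<version>1.0.0</version>-->
      <version>1.0.4</version>
    </dependency>
  </dependencies>
</project>"""

for name, version in dependency_versions(SAMPLE_POM).items():
    print(name, "->", version)
```

To check your real file instead of the sample, read hbase/pom.xml in binary mode and pass its content to dependency_versions.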

Step3- Now, go back to your terminal and move inside the YCSB directory :
apache@hadoop:~$ cd YCSB

Step4- It's time to do the build now :
apache@hadoop:~/YCSB$ mvn clean package
This will start the build process. You can see all the information as the build process continues. If everything goes fine then you will see something like this on your terminal :


NOTE: If multiple descriptors or descriptor-formats are provided for this project, the value of this file will be non-deterministic!
[WARNING] Replacing pre-existing project main-artifact file: /hadoop/projects/YCSB/voldemort/target/archive-tmp/voldemort-binding-0.1.4.jar
with assembly file: /hadoop/projects/YCSB/voldemort/target/voldemort-binding-0.1.4.jar
[INFO]                                                                      
[INFO] ------------------------------------------------------------------------
[INFO] Building YCSB Release Distribution Builder 0.1.4
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.3:clean (default-clean) @ ycsb ---
[INFO]
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (validate) @ ycsb ---
[INFO]
[INFO] --- maven-assembly-plugin:2.2.1:single (default) @ ycsb ---
[INFO] Reading assembly descriptor: src/main/assembly/distribution.xml
[INFO] Processing sources for module project: com.yahoo.ycsb:core:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:cassandra-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:hbase-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:hypertable-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:dynamodb-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:elasticsearch-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:infinispan-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:jdbc-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:mapkeeper-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:mongodb-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:orientdb-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:redis-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:voldemort-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:ycsb:pom:0.1.4
[INFO] Building tar : /hadoop/projects/YCSB/distribution/target/ycsb-0.1.4.tar.gz
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] YCSB Root ......................................... SUCCESS [1.940s]
[INFO] Core YCSB ......................................... SUCCESS [23.149s]
[INFO] Cassandra DB Binding .............................. SUCCESS [7.421s]
[INFO] HBase DB Binding .................................. SUCCESS [15.638s]
[INFO] Hypertable DB Binding ............................. SUCCESS [2.805s]
[INFO] DynamoDB DB Binding ............................... SUCCESS [3.451s]
[INFO] ElasticSearch Binding ............................. SUCCESS [8.123s]
[INFO] Infinispan DB Binding ............................. SUCCESS [2:27.468s]
[INFO] JDBC DB Binding ................................... SUCCESS [18.235s]
[INFO] Mapkeeper DB Binding .............................. SUCCESS [10.011s]
[INFO] Mongo DB Binding .................................. SUCCESS [4.874s]
[INFO] OrientDB Binding .................................. SUCCESS [19.702s]
[INFO] Redis DB Binding .................................. SUCCESS [3.960s]
[INFO] Voldemort DB Binding .............................. SUCCESS [14.181s]
[INFO] YCSB Release Distribution Builder ................. SUCCESS [7.076s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4:48.305s
[INFO] Finished at: Mon Feb 04 01:13:00 IST 2013
[INFO] Final Memory: 107M/737M
[INFO] ------------------------------------------------------------------------

This shows that the build has been completed successfully and you are all set to go. 
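
If you saved the build output to a file, you can check the Reactor Summary with a script rather than by eye. The following is a rough sketch in Python; the function name reactor_status is hypothetical, and SAMPLE_LOG holds a few lines copied from the output above. Saving the log to a file in the first place (for example by piping mvn through tee) is my assumption, not something the build does for you.

```python
import re

# Matches Reactor Summary rows such as
# "[INFO] HBase DB Binding ........ SUCCESS [15.638s]".
ROW = re.compile(r"^\[INFO\] (.+?) \.{2,} (SUCCESS|FAILURE|SKIPPED)\b")


def reactor_status(log_lines):
    """Map each module name in the Reactor Summary to its build status."""
    status = {}
    for line in log_lines:
        match = ROW.match(line.strip())
        if match:
            status[match.group(1)] = match.group(2)
    return status


# A few lines copied from the build output shown above.
SAMPLE_LOG = [
    "[INFO] Reactor Summary:",
    "[INFO]",
    "[INFO] YCSB Root ......................................... SUCCESS [1.940s]",
    "[INFO] Core YCSB ......................................... SUCCESS [23.149s]",
    "[INFO] HBase DB Binding .................................. SUCCESS [15.638s]",
    "[INFO] Infinispan DB Binding ............................. SUCCESS [2:27.468s]",
    "[INFO] BUILD SUCCESS",
]

status = reactor_status(SAMPLE_LOG)
not_built = [name for name, state in status.items() if state != "SUCCESS"]
print("HBase DB Binding:", status.get("HBase DB Binding"))
print("Modules not built:", not_built)
```

For benchmarking HBase, the rows that matter are YCSB Root, Core YCSB, HBase DB Binding and the distribution builder that produces the tar file.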

Step5- Step4 will create a directory named target inside your YCSB/distribution/ directory. You will find the YCSB tar file there, ycsb-0.1.4.tar.gz in my case. Copy this file to a location of your choice and extract it. This will give you the ycsb-0.1.4 directory, which contains everything you need to run the benchmark.

Step6- Move inside the ycsb-0.1.4 directory, where you will find a directory called hbase-binding. Go inside hbase-binding and open the lib directory located there. Copy the following jars from your $HBASE_HOME/lib into this lib directory :
     1- slf4j-api-*.jar
     2- slf4j-log4j12-*.jar
     3- zookeeper-*.jar

Step7- You will find another directory named conf inside hbase-binding. It contains an XML file named hbase-site.xml. Replace this hbase-site.xml file with the hbase-site.xml present in your $HBASE_HOME/conf directory.
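
If you rebuild YCSB often, the copying in Step6 and Step7 can be scripted. Here is a minimal sketch in Python, assuming the directory layout described above. The function name copy_hbase_client_files and both of its arguments are hypothetical, so pass in whatever paths apply on your machine.

```python
import glob
import os
import shutil

# Jar name patterns listed in Step6.
JAR_PATTERNS = ["slf4j-api-*.jar", "slf4j-log4j12-*.jar", "zookeeper-*.jar"]


def copy_hbase_client_files(hbase_home, binding_dir):
    """Copy the Step6 jars and the Step7 hbase-site.xml into hbase-binding.

    hbase_home  -- your HBase installation directory (assumed layout: lib/, conf/)
    binding_dir -- the hbase-binding directory inside the extracted YCSB tree
    Returns the names of the files that were copied.
    """
    lib_dir = os.path.join(binding_dir, "lib")
    conf_dir = os.path.join(binding_dir, "conf")
    os.makedirs(lib_dir, exist_ok=True)
    os.makedirs(conf_dir, exist_ok=True)

    copied = []
    for pattern in JAR_PATTERNS:
        for jar in sorted(glob.glob(os.path.join(hbase_home, "lib", pattern))):
            shutil.copy2(jar, lib_dir)
            copied.append(os.path.basename(jar))

    # Overwrite the bundled hbase-site.xml with the one from your cluster.
    site = os.path.join(hbase_home, "conf", "hbase-site.xml")
    shutil.copy2(site, os.path.join(conf_dir, "hbase-site.xml"))
    copied.append("hbase-site.xml")
    return copied
```

You would call it as copy_hbase_client_files(your HBase home, the path to ycsb-0.1.4/hbase-binding) and then check that the returned list contains all three jars plus hbase-site.xml.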

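Step8- You are all set for testing your HBase now. Start the Hadoop and HBase processes and go inside ycsb-0.1.4. The HBase binding expects the target table (usertable by default) to already exist with the column family you pass on the command line, f1 here, so create it first from the HBase shell with create 'usertable', 'f1'. Now, issue the following command to load test your HBase deployment :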
apache@hadoop:/ycsb-0.1.4$ bin/ycsb load hbase -P workloads/workloada -p columnfamily=f1 -p recordcount=1000000 -p threadcount=4 -s | tee -a workloada.dat

This will start the load test and after some time it will give you the result summary. Do not get overwhelmed by the amount of information displayed on your terminal after this operation. For convenience, we have piped the ycsb command into the Linux tee command, which writes the entire output both to the terminal and to workloada.dat (the -a flag appends to the file if it already exists). You will find this file inside your ycsb-0.1.4 directory, with the same content as your terminal. You can extract useful insights from this file (or from your terminal), such as :
The overall runtime in milliseconds
Throughput, i.e. operations per second
Number of operations
Average latency, and so on

Here are some of the lines from my terminal :
[OVERALL], RunTime(ms), 73258.0
[OVERALL], Throughput(ops/sec), 13650.386305932458
[UPDATE], Operations, 4
[UPDATE], AverageLatency(us), 530564.25
[UPDATE], MinLatency(us), 65895
[UPDATE], MaxLatency(us), 1642179
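
Since every summary line has the form "[SECTION], Metric, value", the numbers are easy to pull out of workloada.dat with a short script. Below is a minimal sketch in Python; the function name parse_ycsb_summary is my own, and the sample lines are the ones shown above. It also checks that throughput multiplied by runtime gives back roughly the recordcount of 1000000 used in the load command.

```python
def parse_ycsb_summary(lines):
    """Collect "[SECTION], Metric, value" lines into {section: {metric: float}}."""
    summary = {}
    for line in lines:
        parts = [p.strip() for p in line.split(",")]
        # Skip anything that is not a three-field "[SECTION], name, value" line.
        if len(parts) != 3:
            continue
        section, metric, raw = parts
        if not (section.startswith("[") and section.endswith("]")):
            continue
        try:
            value = float(raw)
        except ValueError:
            continue
        summary.setdefault(section[1:-1], {})[metric] = value
    return summary


# The summary lines from my terminal, as shown above.
SAMPLE = """[OVERALL], RunTime(ms), 73258.0
[OVERALL], Throughput(ops/sec), 13650.386305932458
[UPDATE], Operations, 4
[UPDATE], AverageLatency(us), 530564.25
[UPDATE], MinLatency(us), 65895
[UPDATE], MaxLatency(us), 1642179""".splitlines()

summary = parse_ycsb_summary(SAMPLE)
runtime_ms = summary["OVERALL"]["RunTime(ms)"]
throughput = summary["OVERALL"]["Throughput(ops/sec)"]

# Throughput is operations per second, so ops = throughput * seconds.
estimated_ops = throughput * runtime_ms / 1000.0
print("Runtime (s):", runtime_ms / 1000.0)
print("Estimated operations:", round(estimated_ops))
```

To parse your own run, open workloada.dat and pass the file object to parse_ycsb_summary. Keep in mind that tee -a appends, so after several runs the file holds several summaries and later values overwrite earlier ones in the returned dictionary.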

I hope you found this post helpful. Stay connected for more :)