#onenote# hbase

HBase commands



  1. List tables whose names start with sit (note the .)

list 'sit.*'

  1. Here is some help for this command:

List all tables in hbase. Optional regular expression parameter could be used to filter the output. Examples:


hbase> list

hbase> list 'abc.*'

hbase> list 'ns:abc.*'

hbase> list 'ns:.*'

  1. scan 'uat_idx_trstimesheet_sector', {LIMIT => 5}
  2. create 't1', {NAME => 'f1', VERSIONS => 5}

put 't1', '1', 'f1:name', 'linzhijia'


  1. Add column families

alter 't1', 'f1', {NAME => 'f2', IN_MEMORY => true}, {NAME => 'f3', VERSIONS => 5}

put 't1', '1', 'f2:name', 'linzhijia'



  1. Convert a timestamp in milliseconds to a readable date

hbase(main):021:0> import java.util.Date

hbase(main):022:0> Date.new(1218920189000).toString()

=> "Sat Aug 16 20:56:29 UTC 2008"



See also http://hi.baidu.com/yizhizaitaobi/blog/item/cc1290a0a0cd69974610646f.html








a) ROW: filter store files by the row in each KeyValue


sf1 contains kv1 (r1 cf:q1 v) and kv2 (r2 cf:q1 v)

sf2 contains kv3 (r3 cf:q1 v) and kv4 (r4 cf:q1 v)




sf1 contains kv1 (r1 cf:q1 v) and kv2 (r2 cf:q1 v)

sf2 contains kv3 (r1 cf:q2 v) and kv4 (r2 cf:q2 v)
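The two store file layouts above can be mimicked in plain Java to see which files a read can skip. This is only a sketch: exact HashSet lookups stand in for real, probabilistic Bloom filters, and StoreFileSketch and its methods are hypothetical names, not HBase API. The row-plus-qualifier lookup is an assumption based on HBase's ROWCOL Bloom type; the notes above only name ROW.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a store file; exact sets stand in for Bloom filters.
final class StoreFileSketch {
    final String name;
    private final Set<String> rows = new HashSet<>();
    private final Set<String> rowCols = new HashSet<>();

    StoreFileSketch(String name) {
        this.name = name;
    }

    // Record one KeyValue by its row and its column ("family:qualifier").
    StoreFileSketch add(String row, String column) {
        rows.add(row);
        rowCols.add(row + "/" + column);
        return this;
    }

    // Files a read of `row` still has to open when filtering by row only.
    static List<String> filesForRow(List<StoreFileSketch> files, String row) {
        List<String> hits = new ArrayList<>();
        for (StoreFileSketch f : files) {
            if (f.rows.contains(row)) {
                hits.add(f.name);
            }
        }
        return hits;
    }

    // Files a read of `row` + `column` has to open when filtering by both.
    static List<String> filesForRowCol(List<StoreFileSketch> files, String row, String column) {
        List<String> hits = new ArrayList<>();
        for (StoreFileSketch f : files) {
            if (f.rowCols.contains(row + "/" + column)) {
                hits.add(f.name);
            }
        }
        return hits;
    }

    // First layout: each row lives in exactly one file.
    static List<StoreFileSketch> firstExample() {
        return Arrays.asList(
                new StoreFileSketch("sf1").add("r1", "cf:q1").add("r2", "cf:q1"),
                new StoreFileSketch("sf2").add("r3", "cf:q1").add("r4", "cf:q1"));
    }

    // Second layout: both files hold r1 and r2, but under different qualifiers.
    static List<StoreFileSketch> secondExample() {
        return Arrays.asList(
                new StoreFileSketch("sf1").add("r1", "cf:q1").add("r2", "cf:q1"),
                new StoreFileSketch("sf2").add("r1", "cf:q2").add("r2", "cf:q2"));
    }

    public static void main(String[] args) {
        System.out.println(filesForRow(firstExample(), "r1"));
        System.out.println(filesForRow(secondExample(), "r1"));
        System.out.println(filesForRowCol(secondExample(), "r1", "cf:q2"));
    }
}
```

In the first layout a row-only check is enough to skip one file. In the second layout every row appears in both files, so only a check on row plus qualifier can rule a file out.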










From <http://zjushch.iteye.com/blog/1530143>


Coprocessors act like RDBMS triggers/stored procedures


Coprocessors can be loaded globally on all tables and regions hosted by the region server, these are known as system coprocessors; or the administrator can specify which coprocessors should be loaded on all regions for a table on a per-table basis, these are known as table coprocessors.


In order to support sufficient flexibility for potential coprocessor behaviors, two different aspects of extension are provided by the framework. One is the observer, which is like a trigger in conventional databases, and the other is the endpoint, a dynamic RPC endpoint that resembles a stored procedure.


Endpoints, on the other hand, are more powerful, resembling stored procedures. One can invoke an endpoint at any time from the client. The endpoint implementation will then be executed remotely at the target region or regions, and results from those executions will be returned to the client.

Endpoint is an interface for dynamic RPC extension. The endpoint implementation is installed on the server side and can then be invoked with HBase RPC. The client library provides convenience methods for invoking such dynamic interfaces.



Coprocessor Deployment

Currently we provide two options for deploying coprocessor extensions: load from configuration, which happens when the master or region servers start up; or load from table attribute, dynamic loading when the table is (re)opened. Because most users will set table attributes by way of the ‘alter’ command of the HBase shell, let’s call this load from shell.


Load from Configuration

When a region is opened, the framework tries to read coprocessor class names supplied as the configuration entries:

  • hbase.coprocessor.region.classes: for RegionObservers and Endpoints
  • hbase.coprocessor.master.classes: for MasterObservers
  • hbase.coprocessor.wal.classes: for WALObservers

Here is an example of the hbase-site.xml where one RegionObserver is configured for all the HBase tables:



Pasted from <https://blogs.apache.org/hbase/entry/coprocessor_introduction>







Below is a snippet showing how coprocessors can be used in HBase to maintain a secondary index.

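// Written against the HBase 0.92/0.94-era API (HTablePool, writeToWAL flag).
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.CoprocessorEnvironment;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

public class TestCoprocessor extends BaseRegionObserver {

    private HTablePool pool = null;

    private final static String INDEX_TABLE = "INDEX_TBL";
    private final static String SOURCE_TABLE = "SOURCE_TBL";

    @Override
    public void start(CoprocessorEnvironment env) throws IOException {
        pool = new HTablePool(env.getConfiguration(), 10);
    }

    @Override
    public void postPut(
            final ObserverContext<RegionCoprocessorEnvironment> observerContext,
            final Put put,
            final WALEdit edit,
            final boolean writeToWAL)
            throws IOException {

        byte[] tableName = observerContext.getEnvironment()
                .getRegion().getRegionInfo().getTableName();

        // Not necessary though if you register the coprocessor
        // for the specific table, SOURCE_TBL
        if (!Bytes.equals(tableName, Bytes.toBytes(SOURCE_TABLE))) {
            return;
        }

        try {
            // get the values
            final List<KeyValue> filteredList = put.get(
                    Bytes.toBytes("colfam1"), Bytes.toBytes("qual"));
            filteredList.get(0); // get the column value

            // get the index table
            HTableInterface indexTable = pool.getTable(Bytes.toBytes(INDEX_TABLE));

            // create row key; mkRowKey() is not shown in the original snippet
            byte[] rowkey = mkRowKey();
            Put indexput = new Put(rowkey);
            indexput.add(
                    Bytes.toBytes("colfam1"),
                    Bytes.toBytes("qual"),
                    Bytes.toBytes("value.."));

            // write the index row and return the table to the pool
            indexTable.put(indexput);
            indexTable.close();

        } catch (IllegalArgumentException ex) {
            // handle exception.
        }
    }

    @Override
    public void stop(CoprocessorEnvironment env) throws IOException {
        pool.close();
    }
}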


To register the above coprocessor on SOURCE_TBL, go to the HBase shell and follow the steps below:

  1. disable 'SOURCE_TBL'
  2. alter 'SOURCE_TBL', METHOD => 'table_att', 'coprocessor' => 'file:///path/to/coprocessor.jar|TestCoprocessor|1001'
  3. enable 'SOURCE_TBL'


Pasted from <http://stackoverflow.com/questions/14540167/create-secondary-index-using-coprocesor-hbase>



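      1. HBase row keys are kept sorted in ascending order
      2. HBase uses a data model very similar to Bigtable's. Users store data rows in a table. Each row has a sortable key and an arbitrary number of columns. Tables are stored sparsely, so rows in the same table can have widely varying columns
      1. It's best to think of HBase as a huge hash table. Just like a hash table, HBase allows you to associate values with keys and perform fast lookup of the values based on a given key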

Before we focus on the operations that can be done in HBase, let’s recap the operations supported by hash tables. We can:

      • Put things in a hash table
      • Get things from a hash table
      • Iterate through a hash table (note that HBase gives us even more power here with range scans, which allow specifying a start and end row when scanning)
      • Increment the value of a hash table entry
      • Delete values from a hash table
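The operations listed above can be tried out on an ordinary sorted map. The sketch below is an analogy only: it uses java.util.TreeMap, not the HBase client, and SortedTableSketch is a hypothetical name. It assumes the scan range includes the start row and excludes the stop row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table: row key -> counter value.
final class SortedTableSketch {
    private final TreeMap<String, Long> rows = new TreeMap<>();

    // Put: associate a value with a key (overwrites any previous value).
    void put(String key, long value) {
        rows.put(key, value);
    }

    // Get: look up the value for a key, or null if the key is absent.
    Long get(String key) {
        return rows.get(key);
    }

    // Increment: add delta to the current value (a missing key starts at 0).
    long increment(String key, long delta) {
        return rows.merge(key, delta, Long::sum);
    }

    // Delete: remove the key and its value.
    void delete(String key) {
        rows.remove(key);
    }

    // Range scan: keys in sorted order, start inclusive, stop exclusive.
    List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        SortedTableSketch table = new SortedTableSketch();
        table.put("row3", 3L);
        table.put("row1", 1L);
        table.put("row2", 2L);
        System.out.println(table.scan("row1", "row3"));
        System.out.println(table.increment("row1", 5L));
        table.delete("row2");
        System.out.println(table.get("row2"));
    }
}
```

The range scan is the one operation a plain hash table cannot offer cheaply, because it depends on the keys being kept in sorted order.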
      1. Timestamp

The second most important consideration for good HBase schema design is understanding and using the timestamp correctly. In HBase, timestamps serve a few important purposes:

      • Timestamps determine which records are newer in case of a put request to modify the record.
      • Timestamps determine the order in which records are returned when multiple versions of a single record are requested.
      • Timestamps are also used to decide if a major compaction is going to remove the out-of-date record in question because the time-to-live (TTL) when compared to the timestamp has elapsed. “Out-of-date” means that the record value has either been overwritten by another put or deleted.
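The three roles of the timestamp can be sketched with a small in-memory model of one cell. This is an illustration under assumptions, not HBase code: CellVersionsSketch is a hypothetical name, versions are kept newest first, and a version is treated as expired once its age exceeds the TTL.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of the versions of a single cell.
final class CellVersionsSketch {
    // Versions keyed by timestamp in milliseconds, newest first.
    private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());

    // A put with a given timestamp adds (or replaces) that version.
    void put(long timestampMs, String value) {
        versions.put(timestampMs, value);
    }

    // The version with the highest timestamp wins, whatever the insert order.
    String latest() {
        return versions.isEmpty() ? null : versions.firstEntry().getValue();
    }

    // Up to maxVersions values, ordered from newest to oldest.
    List<String> newest(int maxVersions) {
        List<String> out = new ArrayList<>();
        for (String value : versions.values()) {
            if (out.size() == maxVersions) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    // Assumed TTL rule: expired when the version is older than ttlSeconds.
    static boolean isExpired(long timestampMs, long ttlSeconds, long nowMs) {
        return nowMs - timestampMs > ttlSeconds * 1000L;
    }

    public static void main(String[] args) {
        CellVersionsSketch cell = new CellVersionsSketch();
        cell.put(100L, "a");
        cell.put(300L, "c");
        cell.put(200L, "b");
        System.out.println(cell.latest());
        System.out.println(cell.newest(2));
        System.out.println(isExpired(0L, 10L, 10001L));
    }
}
```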

Rowkey & Schema design

  • Try to minimize row and column sizes
1.1 Try to keep the ColumnFamily names as small as possible, preferably one character (e.g. “d” for data/default).
1.2 Attributes

Although verbose attribute names (e.g., “myVeryImportantAttribute”) are easier to read, prefer shorter attribute names (e.g., “via”) to store in HBase.

1.3 Rowkey Length

Keep them as short as is reasonable such that they can still be useful for required data access (e.g., Get vs. Scan). A short key that is useless for data access is not better than a longer key with better get/scan properties. Expect tradeoffs when designing rowkeys.

 1.4 Byte Patterns


A long is 8 bytes. You can store an unsigned number up to 18,446,744,073,709,551,615 in those eight bytes. If you stored this number as a String — presuming a byte per character — you need nearly 3x the bytes.
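The size difference is easy to check with the standard library. A minimal sketch, assuming one byte per character (ASCII digits); LongVsStringSketch is a hypothetical name, and since Java's long is signed, the unsigned maximum is parsed and printed with the unsigned helper methods.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class LongVsStringSketch {
    // Fixed-width binary encoding: always 8 bytes, big-endian.
    static byte[] asBinary(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Decimal text encoding of the same bits, read as an unsigned number.
    static byte[] asText(long value) {
        return Long.toUnsignedString(value).getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The largest unsigned 64-bit value; it has 20 decimal digits.
        long max = Long.parseUnsignedLong("18446744073709551615");
        System.out.println("binary bytes: " + asBinary(max).length);
        System.out.println("text bytes:   " + asText(max).length);
    }
}
```

For this value the text form takes 20 bytes against 8 for the binary form, which is the "nearly 3x" overhead mentioned above.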

1.5 Reverse Timestamps

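A common problem in database processing is quickly finding the most recent version of a value.

The technique involves appending (Long.MAX_VALUE - timestamp) to the end of any key, e.g., [key][reverse_timestamp]. Because HBase keys are sorted, the newest entry for [key] then sorts first and can be found by scanning for [key] and taking the first record.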
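A minimal sketch of the [key][reverse_timestamp] layout in plain Java. ReverseTimestampSketch and its methods are hypothetical names. It assumes non-negative timestamps and compares keys byte by byte as unsigned values, which is how HBase orders row keys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ReverseTimestampSketch {
    // Build [key][Long.MAX_VALUE - timestamp] as a byte array.
    static byte[] rowKey(String key, long timestampMs) {
        byte[] prefix = key.getBytes(StandardCharsets.UTF_8);
        long reversed = Long.MAX_VALUE - timestampMs;
        return ByteBuffer.allocate(prefix.length + Long.BYTES)
                .put(prefix)
                .putLong(reversed)
                .array();
    }

    // Lexicographic comparison of unsigned bytes.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] older = rowKey("sensor1", 1000L);
        byte[] newer = rowKey("sensor1", 2000L);
        // A negative result means the newer entry sorts before the older one.
        System.out.println(compare(newer, older));
    }
}
```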



Time To Live (TTL)

ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row – even the current one. The TTL time encoded in the HBase for the row is specified in UTC.





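How to partition and store the index alongside the main data is the first problem the engine has to solve. To get the best performance, we did not store the main data and the index in separate tables; instead we keep them in the same table. By adding a specially designed hash prefix to the RowKeys of both the index and the main data, the index follows its main data into the same Region when a Region splits, so for any Region the index of its main data always resides on that same Region. This minimizes the cost of fetching the target main data from an index entry. The hash prefix also separates the index from the main data logically: when all data is sorted by RowKey, the index entries come first (we call this the index area) and the main data comes after (the main data area). Finally, assigning different Column Families to the index and the main data isolates them in physical storage. This double isolation, logical and physical, avoids the side effects of keeping two kinds of data in one table, prevents them from interfering with each other, and reduces the complexity of data maintenance. It arguably strikes the best balance between performance and maintainability.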


Pasted from <http://www.infoq.com/cn/articles/hbase-second-index-engine>


  • Schema design
    • HBase tables are flexible, and you can store anything in the form of byte[].
    • Store everything with similar access patterns in the same column family.
    • Indexing is done on the Key portion of the KeyValue objects, consisting of the rowkey, qualifier, and timestamp in that order. Use it to your advantage.
    • Tall tables can potentially allow you to move toward O(1) operations, but you trade atomicity.
    • De-normalizing is the way to go when designing HBase schemas.
    • Think how you can accomplish your access patterns in single API calls rather than multiple API calls. HBase doesn’t have cross-row transactions, and you want to avoid building that logic in your client code.
    • Hashing allows for fixed-length keys and better distribution but takes away ordering.
    • Column qualifiers can be used to store data, just like cells.
    • The length of column qualifiers impacts the storage footprint because you can put data in them. It also affects the disk and network I/O cost when the data is accessed. Be concise.
    • The length of the column family name impacts the size of data sent over the wire to the client (in KeyValue objects). Be concise



  • Salting

Sequential writes in HBase may suffer from region server hotspotting if your row key is monotonically increasing. Salting the row key provides a way to mitigate the problem.

Salting is another trick you can have in your tool belt when thinking about rowkeys. Hashing the timestamp and making the hash value the rowkey requires full table scans, which is highly inefficient, especially if you have the ability to limit the scan. Making the hash value the rowkey isn't your solution here. You can instead prefix the timestamp with a random number.

For example, you can generate a random salt number by taking the hash code of the timestamp and taking its modulus with some multiple of the number of RegionServers:


int salt = new Integer(new Long(timestamp).hashCode()).shortValue() % <number of region servers>

This involves taking the salt number and putting it in front of the timestamp to generate your rowkey:

byte[] rowkey = Bytes.add(Bytes.toBytes(salt),
    Bytes.toBytes("|"), Bytes.toBytes(timestamp));

Now your rowkeys are something like the following:
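A minimal sketch of how such salted keys could be produced with plain Java, without the HBase Bytes helper. SaltedKeySketch and the bucket count are hypothetical. Unlike the snippet above, it uses Math.floorMod so the salt can never be negative; that is a choice made for this sketch, not something the source prescribes.

```java
import java.nio.ByteBuffer;

final class SaltedKeySketch {
    // Salt in [0, buckets), derived from the timestamp's hash code.
    static int salt(long timestamp, int buckets) {
        return Math.floorMod(Long.hashCode(timestamp), buckets);
    }

    // Binary rowkey: 4-byte salt, '|' separator, 8-byte timestamp.
    static byte[] rowKey(long timestamp, int buckets) {
        return ByteBuffer.allocate(Integer.BYTES + 1 + Long.BYTES)
                .putInt(salt(timestamp, buckets))
                .put((byte) '|')
                .putLong(timestamp)
                .array();
    }

    // Human-readable form of the same key, for inspection only.
    static String describe(long timestamp, int buckets) {
        return salt(timestamp, buckets) + "|" + timestamp;
    }

    public static void main(String[] args) {
        int buckets = 4; // hypothetical: some multiple of the RegionServer count
        long now = System.currentTimeMillis();
        for (long ts = now; ts < now + 5; ts++) {
            System.out.println(describe(ts, buckets));
        }
    }
}
```

Because the salt is computed from the timestamp itself, a reader that knows the timestamp can recompute the full rowkey. A range of timestamps still needs one scan per salt value.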



Region Collocation


As we know, HBase splits the data in a table into a number of regions. Each region has a start and an end key (rowkey), which determine the rows it can hold. Every region is assigned to a region server (RS).

The Master process decides which region goes where. Regions are balanced across the cluster so that the load is distributed evenly over all region servers. When an RS goes down, its regions are moved to another RS to keep the data highly available. All of these responsibilities belong to the Master process.

Suppose we have two tables: a user table and a corresponding index table. Both have regions distributed across the RSs in the cluster. When a row is inserted (Put) into the user table, the rowkey decides which region the data goes to, so the HBase client contacts the RS serving that region and the data is inserted there. When the table is indexed, the write must also put some data into the index table. We do this in the coprocessor (CP) hooks. Which region of the index table this data goes to depends on the number of regions in the index table and their start and end keys. If that index region is on another RS, writing to the index table requires an RPC call, which is costly: it can badly hurt write throughput and increases network usage. The goal is therefore to collocate the two regions (user table region and index table region) on the same RS. Then no RPC is needed, and the CP hook can get a reference to the index table region and write the data there directly.




Pasted from <http://www.open-open.com/lib/view/open1417703877261.html>





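  1. RowFilter: selects all matching rows. Its use is straightforward: with a BinaryComparator you can select the row with a given row key, or you can change the comparison operator (CompareFilter.CompareOp.EQUAL in the example below) to select multiple rows that satisfy a condition. The following selects the single row whose row key is row1:


Filter rf = new RowFilter(CompareFilter.CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("row1"))); // OK: selects all matching rows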


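  1. PrefixFilter: selects rows whose row key has a given prefix. The same result can be achieved with a RowFilter combined with a RegexStringComparator, but this filter is a convenient shortcut. The following filter selects all rows whose row key starts with row:

Filter pf = new PrefixFilter(Bytes.toBytes("row")); // OK: selects rows whose row key matches the prefix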


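  1. KeyOnlyFilter: its only function is to return just the keys, with all values empty. This suits use cases that only care about row keys; dropping the values reduces the amount of data sent to the client, which is a modest optimization:

Filter kof = new KeyOnlyFilter(); // OK: returns all rows, but every value is empty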


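  1. RandomRowFilter: as the name suggests, this filter returns a random subset of rows according to a given probability (a value <= 0 filters out every row; a value >= 1 includes every row). For the same data set, repeated scans with the same RandomRowFilter return different results. Use it when you need to randomly sample part of the data:

Filter rrf = new RandomRowFilter((float) 0.8); // OK: randomly selects a portion of the rows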


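  1. InclusiveStopFilter: a scan can be given a start row key and a stop row key. By default the range is half-open, that is, it includes the start row but not the stop row. To include both the start row and the stop row, use this filter:

Filter isf = new InclusiveStopFilter(Bytes.toBytes("row1")); // OK: includes the scan's upper bound in the result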


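  1. FirstKeyOnlyFilter: use this filter if you want the result to contain only the first column of each row. It stops scanning a row once it has found the first column, which also improves scan performance somewhat:

Filter fkof = new FirstKeyOnlyFilter(); // OK: selects the first cell of each row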


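  1. ColumnPrefixFilter: as the name implies, it selects cells by column name prefix. Use it to restrict the returned columns to those with a given prefix:

Filter cpf = new ColumnPrefixFilter(Bytes.toBytes("qual1")); // OK: selects columns whose name matches the prefix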


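  1. ValueFilter: selects cells by value, filtering out the cells in a row whose value does not satisfy the condition. With the constructor below, any column whose value does not contain ROW2_QUAL1 is not returned to the client:

Filter vf = new ValueFilter(CompareFilter.CompareOp.EQUAL, new SubstringComparator("ROW2_QUAL1")); // OK: selects the cells whose value satisfies the condition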


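  1. ColumnCountGetFilter: limits the number of columns returned per row, and ends the scan as soon as a row has more columns than the configured limit:

Filter ccf = new ColumnCountGetFilter(2); // OK: the whole scan stops as soon as a row exceeds the configured maximum number of columns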


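  1. SingleColumnValueFilter: uses the value of one column to decide whether a row is filtered out. On the filter object you can call setFilterIfMissing(true) or setFilterIfMissing(false); the default is false. This controls what happens to rows that do not contain the column used as the condition: with true such rows are filtered out; with false they are included in the result.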
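The effect of setFilterIfMissing can be simulated on plain Java maps. This is a sketch of the semantics described in the last item, not the HBase filter itself: SingleColumnValueSketch, the sample rows and the column names are all hypothetical, and an equality comparison is assumed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SingleColumnValueSketch {
    // Returns the keys of rows that pass. Rows lacking `column` are kept
    // unless filterIfMissing is true.
    static List<String> filter(Map<String, Map<String, String>> rows,
                               String column, String expected, boolean filterIfMissing) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            String actual = row.getValue().get(column);
            boolean keep = (actual == null) ? !filterIfMissing : actual.equals(expected);
            if (keep) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    // Three sample rows; row3 has no cf:status column at all.
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        rows.put("row1", Collections.singletonMap("cf:status", "active"));
        rows.put("row2", Collections.singletonMap("cf:status", "inactive"));
        rows.put("row3", Collections.singletonMap("cf:name", "no status column"));
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(filter(sampleRows(), "cf:status", "active", false));
        System.out.println(filter(sampleRows(), "cf:status", "active", true));
    }
}
```

With the default (false), row3 slips through even though it has no status at all, which is often surprising; passing true keeps only rows that actually match.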


Answering the question “What’s a fast operation for HBase?” involves many considerations. Let’s first define some variables:

  • n = Number of KeyValue entries (both the result of Puts and the tombstone markers left by Deletes) in the table.  <— dominating factor
  • b = Number of blocks in an HFile.
  • e = Number of KeyValue entries in an average HFile. You can calculate this if you know the row size.
  • c = Average number of columns per row.



Hash rowkeys

Using hashed keys also buys you a more uniform distribution of data across regions.


If you MD5 the timestamp and use that as the rowkey, you achieve an even distribution across all regions, but you lose the ordering in the data. In other words, you can no longer scan a small time range. You need either to read specific timestamps or scan the entire table. You haven’t lost getting access to records, though, because your clients can MD5 the timestamp before making the request.


You used MD5 as the rowkey. This gave you a fixed-length rowkey. Using hashed keys has other benefits, too. Instead of having userid1+userid2 as the rowkey in the follows table, you can have MD5(userid1)MD5(userid2) and do away with the + delimiter. This gives you two benefits. First, your rowkeys are all of consistent length, giving you more predictability in terms of read and write performance. That's probably not a huge win if you put a limit on the user ID length. The other benefit is that you don't need the delimiter any more; it becomes simpler to calculate start and stop rows for the scans.


Using MD5s as part of rowkeys to achieve fixed lengths and better distribution
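A minimal sketch of the MD5(userid1)MD5(userid2) rowkey using only the Java standard library. Md5RowKeySketch and the user IDs are hypothetical; each MD5 digest is 16 bytes, so the combined key is always 32 bytes regardless of how long the user IDs are.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5RowKeySketch {
    // 16-byte MD5 digest of a string.
    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Rowkey = MD5(userid1) followed by MD5(userid2); no delimiter needed.
    static byte[] followsRowKey(String userId1, String userId2) {
        byte[] key = new byte[32];
        System.arraycopy(md5(userId1), 0, key, 0, 16);
        System.arraycopy(md5(userId2), 0, key, 16, 16);
        return key;
    }

    public static void main(String[] args) {
        System.out.println(followsRowKey("alice", "bob").length);
        System.out.println(followsRowKey("a-much-longer-user-id", "b").length);
    }
}
```

Since every key for one user starts with the same 16-byte MD5(userid1), those 16 bytes can serve as the start row of a scan over that user's rows.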



HBase region split policies

HBase has the following region split policies:

  • IncreasingToUpperBoundRegionSplitPolicy
  • The split size is computed as regioncount^3 * 128M * 2; a region is split when it reaches that size
  • ConstantSizeRegionSplitPolicy
  • Before 0.94.0 this was the default region split policy; from 0.94.0 on the default is IncreasingToUpperBoundRegionSplitPolicy. A region is split once its size reaches the value configured in hbase.hregion.max.filesize (10G by default)


  • DisabledRegionSplitPolicy


  • KeyPrefixRegionSplitPolicy


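Groups data by rowKey prefix, where the prefix is a fixed number of leading characters of the rowKey. For example, if every rowKey is 16 characters long and the first 5 are designated as the prefix, rowKeys that share the same first 5 characters end up in the same region when a region splits.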


  • DelimitedKeyPrefixRegionSplitPolicy

Keeps data with the same prefix in the same region. For example, with rowKeys of the form userid_eventtype_eventid and the delimiter _, a split ensures that data with the same userid stays in the same region.
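The numbers behind these policies can be worked through in a few lines of Java. This is a sketch under stated assumptions, not HBase code: SplitPolicySketch is a hypothetical name, the formula is the one quoted in the list (regioncount^3 * 128M * 2), and capping the result at the 10G hbase.hregion.max.filesize default (also for region counts outside 1 to 100, to avoid overflow) is an assumption of the sketch. The two prefix helpers only show which part of a rowKey is used for grouping.

```java
final class SplitPolicySketch {
    static final long MB = 1024L * 1024L;
    static final long FLUSH_SIZE = 128 * MB;           // the "128M" in the formula
    static final long MAX_FILE_SIZE = 10 * 1024 * MB;  // assumed cap: 10G default

    // Split size for the increasing policy: regionCount^3 * 128M * 2, capped.
    static long increasingSplitSize(int regionCount) {
        if (regionCount <= 0 || regionCount > 100) {
            return MAX_FILE_SIZE; // guard chosen for this sketch
        }
        long size = (long) regionCount * regionCount * regionCount * FLUSH_SIZE * 2;
        return Math.min(size, MAX_FILE_SIZE);
    }

    // Grouping key for a fixed-length prefix (first prefixLength characters).
    static String fixedPrefix(String rowKey, int prefixLength) {
        return rowKey.length() <= prefixLength ? rowKey : rowKey.substring(0, prefixLength);
    }

    // Grouping key for a delimited prefix (everything before the first delimiter).
    static String delimitedPrefix(String rowKey, char delimiter) {
        int idx = rowKey.indexOf(delimiter);
        return idx < 0 ? rowKey : rowKey.substring(0, idx);
    }

    public static void main(String[] args) {
        for (int regions = 1; regions <= 4; regions++) {
            System.out.println(regions + " region(s): " + increasingSplitSize(regions) / MB + " MB");
        }
        System.out.println(fixedPrefix("abcde12345678901", 5));
        System.out.println(delimitedPrefix("user42_click_0001", '_'));
    }
}
```

Under these assumptions the threshold grows from 256 MB with one region to 2 GB with two, and reaches the 10G cap by the fourth region.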




Apache Phoenix enables OLTP and operational analytics in Hadoop for low latency applications by combining the best of both worlds:

      • the power of standard SQL and JDBC APIs with full ACID transaction capabilities and
      • the flexibility of late-bound, schema-on-read capabilities from the NoSQL world by leveraging HBase as its backing store

Apache Phoenix is fully integrated with other Hadoop products such as Spark, Hive, Pig, Flume, and Map Reduce.


