HBase Complete Guide

Basic Concepts

Coprocessor

A Coprocessor is essentially an analysis component similar to MapReduce, but it greatly simplifies the MapReduce model. Requests run independently and in parallel on each Region, and a framework is provided so that users can flexibly implement their own custom Coprocessors.
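Coprocessors come in two flavors: Endpoints (the per-Region, MapReduce-like computation described above) and Observers (hooks around region operations). A full Endpoint needs a protobuf service definition, so below is only a minimal Observer sketch against the 0.98-era API; the class and column names are made up for illustration and this is not the aggregation coprocessor deployed later in this post.

import java.io.IOException;

import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical example: runs inside each Region of the table it is attached to
public class AuditObserver extends BaseRegionObserver {

    @Override
    public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                       Put put, WALEdit edit, Durability durability) throws IOException {
        // Called on the hosting RegionServer before each Put is applied; here we simply tag the row
        put.add(Bytes.toBytes("info"), Bytes.toBytes("audited"), Bytes.toBytes("true"));
    }
}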

Programming Tips

Make Full Use of CellUtil
// Matching directly on byte[] is more efficient
// Bad: cf.equals(Bytes.toString(CellUtil.cloneFamily(cell)))
CellUtil.matchingFamily(cell, cf) && CellUtil.matchingQualifier(cell, col)
// Likewise, prefer `Bytes.equals` over `String#equals`
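As a concrete (hypothetical) usage, the same byte[]-level matching can pull a single column out of a Result without creating an intermediate String for every cell; the column family and qualifier below are placeholders.

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class CellUtilUsage {

    private static final byte[] CF  = Bytes.toBytes("info");
    private static final byte[] COL = Bytes.toBytes("name");

    // Returns the value of info:name from a Result, or null if absent
    static String extractName(Result result) {
        for (Cell cell : result.rawCells()) {
            // byte[]-level comparison via CellUtil, no per-cell String conversion
            if (CellUtil.matchingFamily(cell, CF) && CellUtil.matchingQualifier(cell, COL)) {
                return Bytes.toString(CellUtil.cloneValue(cell));
            }
        }
        return null;
    }
}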
Leverage the Parallel Computing Power of Coprocessors
// In scenarios where it is hard to distribute table data evenly, you can pre-split the table into [00, 01, 02, ..., 99]
// and disable automatic splitting (see: Common Commands - Splitting), so that each Region holds only a single xx prefix.
// Then, when loading data, prepend an xx prefix to the rowkey in round-robin fashion, and no Region becomes a hotspot.
// Inside the coprocessor, first obtain the xx prefix, then prepend it to startKey/endKey when building the Scan.
static String getStartKeyPrefix(HRegion region) {
    if (region == null) throw new RuntimeException("Region is null!");
    byte[] startKey = region.getStartKey();
    if (startKey == null || startKey.length == 0) return "00";
    String startKeyStr = Bytes.toString(startKey);
    return isEmpty(startKeyStr) ? "00" : startKeyStr.substring(0, 2);
}

private static boolean isEmpty(final String s) {
    return s == null || s.length() == 0;
}
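A minimal sketch of the second comment above, i.e. prepending the Region's two-character prefix when building the Scan inside the coprocessor. It assumes the getStartKeyPrefix helper from the snippet above lives in the same class, and that the key layout follows the [00..99] scheme.

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixScanBuilder {

    // Builds a Scan limited to this Region's prefix, e.g. "07" + logical startKey/endKey
    static Scan buildScan(HRegion region, String startKey, String endKey) {
        String prefix = getStartKeyPrefix(region); // helper shown in the snippet above
        return new Scan(Bytes.toBytes(prefix + startKey), Bytes.toBytes(prefix + endKey));
    }
}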
Handle Exceptions in Coprocessor Code Properly

If an exception is thrown inside a coprocessor and the hbase.coprocessor.abortonerror parameter is not enabled, the coprocessor is simply removed from the environment it was loaded into. Otherwise, the behavior depends on the exception type: an IOException is rethrown directly; a DoNotRetryIOException is thrown without any retries; anything else is retried 10 times by default (hard-coded in AsyncConnectionImpl#RETRY_TIMER). You therefore need to handle exceptions carefully according to your own business scenario.
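One way to act on this (a sketch, not something prescribed by HBase): wrap the risky part of the coprocessor logic and convert unexpected errors into DoNotRetryIOException, so the client fails fast instead of retrying. The helper and interface names below are made up.

import java.io.IOException;

import org.apache.hadoop.hbase.DoNotRetryIOException;

public class SafeCall {

    interface RegionWork<T> {
        T run() throws IOException;
    }

    // Runs coprocessor work; IOExceptions propagate as-is, anything else becomes non-retriable
    static <T> T callSafely(RegionWork<T> work) throws IOException {
        try {
            return work.run();
        } catch (IOException ioe) {
            throw ioe;
        } catch (Exception e) {
            throw new DoNotRetryIOException("coprocessor failed", e);
        }
    }
}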

Logging
// Only the Apache Commons Log class works here; otherwise nothing will be printed
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
private static final Log log = LogFactory.getLog(CoprocessorImpl.class.getName());

Deployment

# First upload the coprocessor jar
$ hadoop fs -copyFromLocal /home/hbase/script/coprocessor-0.0.1.jar hdfs://yuzhouwan/hbase/coprocessor/
$ hadoop fs -ls hdfs://yuzhouwan/hbase/coprocessor/
# Unload the old coprocessor
$ alter 'yuzhouwan', METHOD => 'table_att_unset', NAME =>'coprocessor$1'
# Attach the new coprocessor
$ alter 'yuzhouwan', METHOD => 'table_att', 'coprocessor' => 'hdfs://yuzhouwan/hbase/coprocessor/coprocessor-0.0.1.jar|com.yuzhouwan.hbase.coprocessor.Aggregation|111|'
# Check the RegionServer logs to observe how the coprocessor is running
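The same attachment can also be done from Java through the admin API instead of the shell. Below is a sketch against the 0.98-era client; the jar path, class name, and table are taken from the shell example above, and whether the table must be disabled first can depend on whether online schema changes are enabled on the cluster.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Coprocessor;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class DeployCoprocessor {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (HBaseAdmin admin = new HBaseAdmin(conf)) {
            TableName table = TableName.valueOf("yuzhouwan");
            HTableDescriptor desc = admin.getTableDescriptor(table);
            // Same effect as: alter 'yuzhouwan', METHOD => 'table_att', 'coprocessor' => '<jar>|<class>|<priority>|'
            desc.addCoprocessor("com.yuzhouwan.hbase.coprocessor.Aggregation",
                    new Path("hdfs://yuzhouwan/hbase/coprocessor/coprocessor-0.0.1.jar"),
                    Coprocessor.PRIORITY_USER, null);
            admin.disableTable(table);
            admin.modifyTable(table, desc);
            admin.enableTable(table);
        }
    }
}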

Common Commands

Cluster-Related

$ su - hbase
$ start-hbase.sh
# HMaster ThriftServer
$ jps | grep -v Jps
32538 ThriftServer
9383 HMaster
8423 HRegionServer
# BackUp HMaster ThriftServer
$ jps | grep -v Jps
24450 jar
21882 HMaster
2296 HRegionServer
14598 ThriftServer
5998 Jstat
# BackUp HMaster ThriftServer
$ jps | grep -v Jps
31119 Bootstrap
8775 HMaster
25289 Bootstrap
14823 Bootstrap
12671 Jstat
9052 ThriftServer
26921 HRegionServer
# HRegionServer
$ jps | grep -v Jps
29356 hbase-monitor-process-0.0.3-jar-with-dependencies.jar # monitor
11023 Jstat
26135 HRegionServer
$ export -p | egrep -i "(hadoop|hbase)"
declare -x HADOOP_HOME="/home/bigdata/software/hadoop"
declare -x HBASE_HOME="/home/bigdata/software/hbase"
declare -x PATH="/usr/local/anaconda/bin:/usr/local/R-3.2.1/bin:/home/bigdata/software/java/bin:/home/bigdata/software/hadoop/bin:/home/bigdata/software/hive/bin:/home/bigdata/software/sqoop/bin:/home/bigdata/software/hbase/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin"
$ java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
uintx MaxHeapSize := 32126271488 {product} # 29.919921875 GB
java version "1.7.0_60-ea"
Java(TM) SE Runtime Environment (build 1.7.0_60-ea-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.60-b09, mixed mode)
$ top
top - 11:37:03 up 545 days, 18:45, 5 users, load average: 8.74, 10.39, 10.96
Tasks: 653 total, 1 running, 652 sleeping, 0 stopped, 0 zombie
Cpu(s): 32.9%us, 0.7%sy, 0.0%ni, 66.3%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 264484056k total, 260853032k used, 3631024k free, 2235248k buffers
Swap: 10485756k total, 10485756k used, 0k free, 94307776k cached
# Memory: 252 GB
# `hbase classpath` returns all of the HBase-related dependencies
$ java -classpath ~/opt/hbase/soft/yuzhouwan.jar:`hbase classpath` com.yuzhouwan.hbase.MainApp
# Usage
Usage: hbase [<options>] <command> [<args>]
Options:
--config DIR Configuration direction to use. Default: ./conf
--hosts HOSTS Override the list in 'regionservers' file
Commands:
Some commands take arguments. Pass no args or -h for usage.
shell Run the HBase shell
hbck Run the hbase 'fsck' tool
hlog Write-ahead-log analyzer
hfile Store file analyzer
zkcli Run the ZooKeeper shell
upgrade Upgrade hbase
master Run an HBase HMaster node
regionserver Run an HBase HRegionServer node
zookeeper Run a Zookeeper server
rest Run an HBase REST server
thrift Run the HBase Thrift server
thrift2 Run the HBase Thrift2 server
clean Run the HBase clean up script
classpath Dump hbase CLASSPATH
mapredcp Dump CLASSPATH entries required by mapreduce
pe Run PerformanceEvaluation
ltt Run LoadTestTool
version Print the version
CLASSNAME Run the class named CLASSNAME
# HBase version info
$ hbase version
2017-01-13 11:05:07,580 INFO [main] util.VersionInfo: HBase 0.98.8-hadoop2
2017-01-13 11:05:07,580 INFO [main] util.VersionInfo: Subversion file:///e/hbase_compile/hbase-0.98.8 -r Unknown
2017-01-13 11:05:07,581 INFO [main] util.VersionInfo: Compiled by 14074019 on Mon Dec 26 20:17:32 2016
$ hadoop fs -ls /hbase
drwxr-xr-x - hbase hbase 0 2017-03-01 00:05 /hbase/.hbase-snapshot
drwxr-xr-x - hbase hbase 0 2016-10-26 16:42 /hbase/.hbck
drwxr-xr-x - hbase hbase 0 2016-12-19 13:02 /hbase/.tmp
drwxr-xr-x - hbase hbase 0 2017-01-22 20:18 /hbase/WALs
drwxr-xr-x - hbase hbase 0 2015-09-18 09:34 /hbase/archive
drwxr-xr-x - hbase hbase 0 2016-10-18 09:44 /hbase/coprocessor
drwxr-xr-x - hbase hbase 0 2015-09-15 17:21 /hbase/corrupt
drwxr-xr-x - hbase hbase 0 2017-02-20 14:34 /hbase/data
-rw-r--r-- 2 hbase hbase 42 2015-09-14 12:10 /hbase/hbase.id
-rw-r--r-- 2 hbase hbase 7 2015-09-14 12:10 /hbase/hbase.version
drwxr-xr-x - hbase hbase 0 2016-06-28 12:14 /hbase/inputdir
drwxr-xr-x - hbase hbase 0 2017-03-01 10:40 /hbase/oldWALs
-rw-r--r-- 2 hbase hbase 345610 2015-12-08 16:54 /hbase/test_bulkload.txt
$ hadoop fs -ls /hbase/WALs
drwxr-xr-x - hbase hbase 0 2016-12-27 16:08 /hbase/WALs/yuzhouwan03,60020,1482741120018-splitting
drwxr-xr-x - hbase hbase 0 2017-03-01 10:36 /hbase/WALs/yuzhouwan03,60020,1483442645857
drwxr-xr-x - hbase hbase 0 2017-03-01 10:37 /hbase/WALs/yuzhouwan02,60020,1483491016710
drwxr-xr-x - hbase hbase 0 2017-03-01 10:37 /hbase/WALs/yuzhouwan01,60020,1483443835926
drwxr-xr-x - hbase hbase 0 2017-03-01 10:36 /hbase/WALs/yuzhouwan03,60020,1483444682422
drwxr-xr-x - hbase hbase 0 2017-03-01 10:16 /hbase/WALs/yuzhouwan04,60020,1485087488577
drwxr-xr-x - hbase hbase 0 2017-03-01 10:37 /hbase/WALs/yuzhouwan05,60020,1484790306754
drwxr-xr-x - hbase hbase 0 2017-03-01 10:37 /hbase/WALs/yuzhouwan06,60020,1484931966988
$ hadoop fs -ls /hbase/WALs/yuzhouwan01,60020,1483443835926
-rw-r--r-- 3 hbase hbase 127540109 2017-03-01 09:49 /hbase/WALs/yuzhouwan01,60020,1483443835926/yuzhouwan01%2C60020%2C1483443835926.1488330961720
# ...
-rw-r--r-- 3 hbase hbase 83 2017-03-01 10:37 /hbase/WALs/yuzhouwan01,60020,1483443835926/yuzhouwan01%2C60020%2C1483443835926.1488335822133
# log
$ vim /home/hbase/logs/hbase-hbase-regionserver-yuzhouwan03.log
# Run HBase commands in batch
$ echo "<command>" | hbase shell
$ hbase shell ../script/batch.hbase
# HBase command line
$ hbase shell
$ status
1 servers, 0 dead, 41.0000 average load
$ zk_dump
HBase is rooted at /hbase
Active master address: yuzhouwan03,60000,1481009498847
Backup master addresses:
yuzhouwan02,60000,1481009591957
yuzhouwan01,60000,1481009567346
Region server holding hbase:meta: yuzhouwan03,60020,1483442645857
Region servers:
yuzhouwan02,60020,1483491016710
# ...
/hbase/replication:
/hbase/replication/peers:
/hbase/replication/peers/1: yuzhouwan03,yuzhouwan02,yuzhouwan01:2016:/hbase
/hbase/replication/peers/1/peer-state: ENABLED
/hbase/replication/rs:
/hbase/replication/rs/yuzhouwan03,60020,1483442645857:
/hbase/replication/rs/yuzhouwan03,60020,1483442645857/1:
/hbase/replication/rs/yuzhouwan03,60020,1483442645857/1/yuzhouwan03%2C60020%2C1483442645857.1488334114131: 116838271
/hbase/replication/rs/1485152902048.SyncUpTool.replication.org,1234,1:
/hbase/replication/rs/yuzhouwan06,60020,1484931966988:
/hbase/replication/rs/yuzhouwan06,60020,1484931966988/1:
# ...
Quorum Server Statistics:
yuzhouwan02:2015
Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT
Clients:
/yuzhouwan:62003[1](queued=0,recved=625845,sent=625845)
# ...
/yuzhouwan:11151[1](queued=0,recved=8828,sent=8828)
Latency min/avg/max: 0/0/1
Received: 161
Sent: 162
Connections: 168
Outstanding: 0
Zxid: 0xc062e91c6
Mode: follower
Node count: 25428
yuzhouwan03:2015
Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT
Clients:
/yuzhouwan:39582[1](queued=0,recved=399812,sent=399812)
# ...
/yuzhouwan:58770[1](queued=0,recved=3234,sent=3234)
$ stop-hbase.sh

CRUD Operations

$ list
TABLE
mytable
yuzhouwan
# ...
20 row(s) in 1.4080 seconds
$ create 'yuzhouwan', {NAME => 'info', VERSIONS => 3}, {NAME => 'data', VERSIONS => 1}
0 row(s) in 0.2650 seconds
=> Hbase::Table - yuzhouwan
$ put 'yuzhouwan', 'rk0001', 'info:name', 'Benedict Jin'
$ put 'yuzhouwan', 'rk0001', 'info:gender', 'Man'
$ put 'yuzhouwan', 'rk0001', 'data:pic', '[picture]'
$ get 'yuzhouwan', 'rk0001', {FILTER => "ValueFilter(=, 'binary:[picture]')"}
COLUMN CELL
data:pic timestamp=1479092170498, value=[picture]
1 row(s) in 0.0200 seconds
$ get 'yuzhouwan', 'rk0001', {FILTER => "QualifierFilter(=, 'substring:a')"}
COLUMN CELL
info:name timestamp=1479092160236, value=Benedict Jin
1 row(s) in 0.0050 seconds
$ scan 'yuzhouwan', {FILTER => "QualifierFilter(=, 'substring:a')"}
ROW COLUMN+CELL
rk0001 column=info:name, timestamp=1479092160236, value=Benedict Jin
1 row(s) in 0.0140 seconds
# [rk0001, rk0003)
$ put 'yuzhouwan', 'rk0003', 'info:name', 'asdf2014'
$ scan 'yuzhouwan', {COLUMNS => 'info', STARTROW => 'rk0001', ENDROW => 'rk0003'}
# row keys starting with 'rk'
$ put 'yuzhouwan', 'aha_rk0003', 'info:name', 'Jin'
$ scan 'yuzhouwan', {FILTER => "PrefixFilter('rk')"}
ROW COLUMN+CELL
rk0001 column=data:pic, timestamp=1479092170498, value=[picture]
rk0001 column=info:gender, timestamp=1479092166019, value=Man
rk0001 column=info:name, timestamp=1479092160236, value=Benedict Jin
rk0003 column=info:name, timestamp=1479092728688, value=asdf2014
2 row(s) in 0.0150 seconds
$ delete 'yuzhouwan', 'rk0001', 'info:gender'
$ get 'yuzhouwan', 'rk0001'
COLUMN CELL
data:pic timestamp=1479092170498, value=[picture]
info:name timestamp=1479092160236, value=Benedict Jin
2 row(s) in 0.0100 seconds
$ disable 'yuzhouwan'
$ drop 'yuzhouwan'
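For completeness, the same put/get/scan/delete flow through the 0.98-era Java client looks roughly like this (a sketch with simplified connection handling; the table and rowkeys are the ones from the shell example above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class CrudExample {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (HConnection conn = HConnectionManager.createConnection(conf);
             HTableInterface table = conn.getTable(TableName.valueOf("yuzhouwan"))) {

            // put 'yuzhouwan', 'rk0001', 'info:name', 'Benedict Jin'
            Put put = new Put(Bytes.toBytes("rk0001"));
            put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Benedict Jin"));
            table.put(put);

            // get 'yuzhouwan', 'rk0001'
            Result result = table.get(new Get(Bytes.toBytes("rk0001")));
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));

            // scan 'yuzhouwan', {STARTROW => 'rk0001', ENDROW => 'rk0003'}
            Scan scan = new Scan(Bytes.toBytes("rk0001"), Bytes.toBytes("rk0003"));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }

            // delete 'yuzhouwan', 'rk0001'
            table.delete(new Delete(Bytes.toBytes("rk0001")));
        }
    }
}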

Modifying Rows and Columns

# Alter a table
$ disable 'yuzhouwan'
# Add column families
$ alter 'yuzhouwan', NAME => 'f1'
$ alter 'yuzhouwan', NAME => 'f2'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 1.3020 seconds
# Modify a column qualifier (CQ)
$ create 'yuzhouwan', {NAME => 'info'}
$ put 'yuzhouwan', 'rk00001', 'info:name', 'China'
$ get 'yuzhouwan', 'rk00001', {COLUMN => 'info:name'}, 'value'
$ put 'yuzhouwan', 'rk00001', 'info:address', 'value'
$ scan 'yuzhouwan'
ROW COLUMN+CELL
rk00001 column=info:address, timestamp=1480556328381, value=value
1 row(s) in 0.0220 seconds
# Delete column families
$ alter 'yuzhouwan', {NAME => 'f3'}, {NAME => 'f4'}
$ alter 'yuzhouwan', {NAME => 'f5'}, {NAME => 'f1', METHOD => 'delete'}, {NAME => 'f2', METHOD => 'delete'}, {NAME => 'f3', METHOD => 'delete'}, {NAME => 'f4', METHOD => 'delete'}
# Cannot go down to CQ granularity; e.g. alter 'ns_rec:tb_mem_tag', {NAME => 'cf_tag:partyIdType', METHOD => 'delete'} does not work
# Delete a row
$ deleteall <table>, <rowkey>

Truncating Table Data

# Truncate table data
$ describe 'yuzhouwan'
Table yuzhouwan is ENABLED
COLUMN FAMILIES DESCRIPTION
{NAME => 'data', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
{NAME => 'f5', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => 'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE'
, BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
3 row(s) in 0.0230 seconds
# Introduced in 0.98: clears the table data while keeping the region boundaries
$ truncate_preserve 'yuzhouwan'
# truncate performs a drop table followed by a create table
$ truncate 'yuzhouwan'
$ scan 'yuzhouwan'
ROW COLUMN+CELL
0 row(s) in 0.3170 seconds

Renaming a Table

# Note: a snapshot name cannot contain characters such as ':', i.e. there is no need to carry the namespace in it
$ disable 'yuzhouwan'
$ snapshot 'yuzhouwan', 'yuzhouwan_snapshot'
$ clone_snapshot 'yuzhouwan_snapshot', 'ns_site:yuzhouwan'
$ delete_snapshot 'yuzhouwan_snapshot'
$ drop 'yuzhouwan'
$ grant 'site', 'CXWR', 'ns_site:yuzhouwan'
$ user_permission 'yuzhouwan'
User Table,Family,Qualifier:Permission
site default,yuzhouwan,,: [Permission: actions=CREATE,EXEC,WRITE,READ]
hbase default,yuzhouwan,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]
$ disable 'ns_site:yuzhouwan'
$ drop 'ns_site:yuzhouwan'
$ exists 'ns_site:yuzhouwan'
Table ns_site:yuzhouwan does not exist
0 row(s) in 0.0200 seconds

Modifying Table Properties

$ disable 'yuzhouwan'
# versions
$ alter 'yuzhouwan', NAME => 'f', VERSIONS => 5
# ttl (note: the TTL property applies to a CF, not to the table, and its unit is seconds)
$ alter 'yuzhouwan', NAME => 'f', TTL => 20
$ enable 'yuzhouwan'
$ describe 'yuzhouwan'

Compression Algorithms

# With the compression algorithm set to 'SNAPPY', an error occurred: ERROR: java.io.IOException: Compression algorithm 'snappy' previously failed test.
# Try LZ4 instead (lower compression ratio, high speed; already the default codec in Spark 2.x)
$ create 'yuzhouwan', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}, {NAME => 'v', COMPRESSION => 'LZ4', BLOOMFILTER => 'NONE', DATA_BLOCK_ENCODING => 'FAST_DIFF'}
$ describe 'yuzhouwan'
Table yuzhouwan is ENABLED
COLUMN FAMILIES DESCRIPTION
{NAME => 'v', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZ4', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0280 seconds

Access Control

# ACL
# R - read
# W - write
# X - execute
# C - create
# A - admin
$ grant 'benedict', 'WRXC', 'yuzhouwan'
$ echo "scan 'hbase:acl'" | hbase shell > acl.txt
yuzhouwan column=l:benedict, timestamp=1496216745249, value=WRXC
yuzhouwan column=l:hbase, timestamp=1496216737326, value=RWXCA
$ user_permission # without a <table_name>, everything is fetched from the 'hbase:acl' table
$ user_permission 'yuzhouwan'
User Table,Family,Qualifier:Permission
hbase default,yuzhouwan,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]
benedict default,yuzhouwan,,: [Permission: actions=WRITE,READ,EXEC,CREATE]
2 row(s) in 0.0510 seconds
$ revoke 'benedict', 'yuzhouwan'

Splitting

# splits
$ create 'yuzhouwan', {NAME => 'f'}, SPLITS => ['1', '2', '3'] # 5 regions
$ alter 'yuzhouwan', SPLITS => ['1', '2', '3', '4', '5', '6', '7', '8', '9'] # does not work
# Disable automatic splitting
$ alter 'yuzhouwan', {METHOD => 'table_att', SPLIT_POLICY => 'org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy'}
# Configures whether the master balances the number of regions across RegionServers
# When a RegionServer is being maintained or restarted, the balancer is switched off, which can leave regions unevenly distributed; in that case, turn the balancer back on manually
$ balance_switch true
$ balance_switch false

Namespaces

# namespace
$ list_namespace_tables 'hbase'
TABLE
acl
meta
namespace
3 row(s) in 0.0050 seconds
$ list_namespace
NAMESPACE
default
hbase
# ...
50 row(s) in 0.3710 seconds
$ create_namespace 'www'
$ exists 'www:yuzhouwan.site'
$ create 'www:yuzhouwan.site', {NAME => 'info', VERSIONS=> 9}, SPLITS => ['1','2','3','4','5','6','7','8','9']
$ alter_namespace 'www', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
$ drop_namespace 'www'

Manual Split

$ create 'yuzhouwan', {NAME => 'info', VERSIONS => 3}, {NAME => 'data', VERSIONS => 1}
$ put 'yuzhouwan', 'rk0001', 'info:name', 'Benedict Jin'
$ put 'yuzhouwan', 'rk0001', 'info:gender', 'Man'
$ put 'yuzhouwan', 'rk0001', 'data:pic', '[picture]'
$ put 'yuzhouwan', 'rk0002', 'info:name', 'Yuzhouwan'
# Usage:
# split 'tableName'
# split 'namespace:tableName'
# split 'regionName' # format: 'tableName,startKey,id'
# split 'tableName', 'splitKey'
# split 'regionName', 'splitKey'
$ split 'yuzhouwan', 'rk0002'
# Name Region Server Start Key End Key Locality Requests
yuzhouwan,,1500964657548.bd21cdf7ae9e2d8e5b2ed3730eb8b738. yuzhouwan01:60020 rk0002 1.0 0
yuzhouwan,rk0002,1500964657548.76f95590aed5d39291a087c5e8e83833. yuzhouwan02:60020 rk0002 1.0 2

Phoenix Commands

# Execute an external SQL script
$ sqlline.py <hbase.zookeeper.quorum host without port>:/phoenix sql.txt

Practical Tips

Importing Hive Data (Bulkload)

Bulkload means parsing the RCFile according to the Hive table's schema, generating HBase HFiles with a MapReduce job, and finally importing those HFiles into HBase through the bulkload mechanism, i.e. placing them directly on HDFS. This is far more efficient than importing rows one by one through the API (as a rule, loading Hive data into HBase is done via bulkload). A sketch of the final load step is shown below.
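A minimal sketch of that last step with the 0.98-era API; the HFile output directory is a placeholder, and producing the HFiles themselves is the job of the preceding MapReduce program.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadStep {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (HTable table = new HTable(conf, "yuzhouwan")) {
            // /tmp/hfile_output is a placeholder for the directory of HFiles produced by the MapReduce job
            new LoadIncrementalHFiles(conf).doBulkLoad(new Path("/tmp/hfile_output"), table);
        }
    }
}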

Cross-Cluster Replication (CopyTable + Replication)

Related Commands

add_peer: Adds a replication peer; the ID is the peer's identifier, and the CLUSTER_KEY has the format hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent
list_peers: Lists all replication peers
enable_peer: Sets a replication peer to the enabled state; a peer added with add_peer is enabled by default, and after it has been disabled with disable_peer, enable_peer makes it usable again
disable_peer: Sets a replication peer to the disabled state
remove_peer: Removes a replication peer
set_peer_tableCFs: Sets which tables a replication peer replicates. By default, a peer added with add_peer replicates every table in the cluster; to replicate only certain tables, use set_peer_tableCFs, whose granularity goes down to column families. Tables are separated by ';' and column families by ','. e.g. set_peer_tableCFs '2', "table1; table2:cf1,cf2; table3:cfA,cfB". The set_peer_tableCFs command configures all of the tables for a replication peer
append_peer_tableCFs: Appends tables that need to be replicated to a replication peer
remove_peer_tableCFs: Removes tables that no longer need to be replicated from a replication peer
show_peer_tableCFs: Shows which tables a replication peer replicates; an empty result means every table is replicated
list_replicated_tables: Lists all replicated tables

Monitoring Replication

HBase Shell
$ status 'replication'
Metrics

Source side

sizeOfLogQueue: how many WAL files are still waiting to be processed
ageOfLastShippedOp: delay of the last shipped (replicated) operation
shippedBatches: number of batches shipped
shippedKBs: amount of data shipped, in KB
shippedOps: number of entries shipped
logEditsRead: number of logEdits read
logReadInBytes: amount of log data read
logEditsFiltered: number of logEdits actually filtered out

Sink side

sink.ageOfLastAppliedOp: delay of the last applied operation
sink.appliedBatches: number of batches applied
sink.appliedOps: number of entries applied

Complete Procedure

CopyTable
# Decide on the migration time window
2017-01-01 00:00:00(1483200000000) 2017-05-01 00:00:00(1493568000000)
# The times need to be converted to 13-digit, millisecond-precision unix timestamps
# Online converter: http://tool.chinaz.com/Tools/unixtime.aspx
# Or use the shell
$ echo "`date -d "2017-01-01 00:00:00" +%s`000"
$ echo "`date -d "2017-05-01 00:00:00" +%s`000"
# No need to worry about boundary issues here: the range is [starttime, endtime)
# Run on the source cluster (to leave starttime unbounded, add --starttime=0)
$ hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1483200000000 --endtime=1493568000000 --peer.adr=<aim zk address>,<aim zk address>,...:<aim zk port>:/<hbase parent path> <table name>
# Check data consistency (run on both clusters and compare whether the RowCount matches)
$ hbase org.apache.hadoop.hbase.mapreduce.RowCounter <table name> --endtime=1493568000000
# Further check data consistency (run on both clusters and compare whether the byte counts match)
$ hadoop fs -du hdfs://<base path>/hbase/data/<namespace>/<table name>
Replication
# Run on the small cluster
# Run list_peers first to avoid peer id conflicts
$ list_peers
$ add_peer '<peer id>', "<big cluster zk address>,<big cluster zk address>,...:<big cluster zk port>:/<hbase parent path>"
# Enable REPLICATION_SCOPE on the table
$ disable '<table name>'
$ alter '<table name>', {NAME => '<column family>', REPLICATION_SCOPE => '1'} # 1: open; 0: close(default)
$ enable '<table name>'
Troubleshooting
# Run on the source cluster
$ hbase hbck
# If problems are found, run: hbase hbck --repair
# Once everything is clean, run the following in `hbase shell`
$ balance_switch true

Disabling Automatic Splitting

$ alter 'yuzhouwan', {METHOD => 'table_att', SPLIT_POLICY => 'org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy'}

Fetching Specific Metrics via JMX

# Syntax
http://namenode:50070/jmx?qry=<metric name>
# For example, return only the NameNodeInfo metrics
http://namenode:50070/jmx?qry=hadoop:service=NameNode,name=NameNodeInfo
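If the same metric needs to be pulled programmatically, e.g. by a monitoring job, a plain HTTP GET against the JMX servlet is enough. A minimal sketch follows; the host, port, and query are the ones from the example above, and the class name is made up.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class JmxFetcher {

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://namenode:50070/jmx?qry=hadoop:service=NameNode,name=NameNodeInfo");
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // JSON payload containing the requested metric bean
            }
        }
    }
}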

Pitfalls Encountered

Table is neither in disabled nor in enabled state

Description

After running a perfectly normal create-table statement, the process got stuck at the enable table step.

Solution

# A check showed the table was neither in the `enabled` state nor in the `disabled` state
$ is_enabled 'yuzhouwan'
false
$ is_disabled 'yuzhouwan'
false
$ hbase zkcli
$ delete /hbase/table/yuzhouwan
$ hbase hbck -fixMeta -fixAssignments
# Restart the active HMaster
$ is_enabled 'yuzhouwan'
true
$ disable 'yuzhouwan'

No GCs detected

Solution

# Print safepoint statistics to diagnose the pauses
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=0
# Biased locking has been enabled by default since JDK 6, but it is not suitable for highly concurrent
# scenarios; Cassandra disables it by default (https://github.com/apache/cassandra/blob/trunk/conf/jvm.options#L112)
-XX:-UseBiasedLocking

Hexadecimal Values Not Recognized in the Shell

Solution

# Just wrap the value in double quotes
$ put 'yuzhouwan', 'rowkey01', 'cf:age', "\xFF" #255

Performance Optimization

Community Follow-up

For details, see the separate post: Open Source Community

Resources

Doc

Blog

Put

Read

Replication

BulkLoad

Flush

Code Resource

For more resources, you are welcome to join and learn together.

QQ group: (Artificial Intelligence 1020982 (advanced) & 1217710 (intermediate) | BigData 1670647)