Druid: An Efficient OLAP Engine

Basic Concepts

Overview

 Druid is a popular, high-performance, distributed, column-oriented OLAP framework (more specifically, a MOLAP system).

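Features

  • Sub-second queries

  Druid offers fast aggregation and sub-second OLAP queries. Together with its multi-tenant design, this makes it a good fit for user-facing analytics applications.

  • Real-time data ingestion

  Druid supports streaming ingestion and is event-driven, which keeps events timely and consistent across real-time and offline environments.

  • Scalable, petabyte-scale storage

  A Druid cluster can be scaled out to petabytes of data and millions of ingested events per second.
  It remains efficient even at very large data volumes.

  • Flexible deployment

  Druid runs on commodity hardware as well as in the cloud.
  It can ingest data from many systems, including Hadoop, Spark, Storm, Kafka, and Samza.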

Core Components

 Mnemonic: ITHRCBO, "IT HR Combo" :-)

Indexing Service

 A Druid cluster usually also runs Indexing Services for data ingestion. Both batch data and streaming data can be ingested by sending requests to the Indexing Services.

Tranquility

 A Tranquility task receives real-time data and submits it to the Overlord Nodes, which run the real-time indexing.

Historical Nodes

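 Historical Nodes load segments from Deep Storage and respond to requests from Broker Nodes.
 Historical Nodes use a shared-nothing architecture: each node knows how to load a segment, drop a segment, and serve queries on a segment. As a result, even if Deep Storage becomes unavailable, Historical Nodes can still serve queries on the segments they have already synced.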

Realtime Nodes

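 Realtime Nodes store and serve hot data, and periodically build it into segments that are handed off to Historical Nodes.
 An external dependency such as Kafka is typically used to improve the availability of real-time data ingestion.
 If real-time ingestion into the cluster is not needed, Realtime Nodes can be left out and data can be batch-ingested into Deep Storage on a schedule.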

Coordinator Nodes

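 Coordinator Nodes can be thought of as the master of a Druid cluster. They manage Historical and Realtime Nodes through Zookeeper, and manage segments through the metadata stored in MySQL.
 Coordinator Nodes tell Historical Nodes where to load new segments from, when to drop old segments, and how to balance segments across nodes.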

Broker Nodes

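 Broker Nodes answer external query requests. Using the cluster state in Zookeeper, a Broker knows where every segment lives: it splits a query by segment, forwards the pieces to Historical and Realtime Nodes, then gathers and merges the partial results and returns the final result to the caller.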
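The gather-and-merge step can be illustrated with a toy sketch. This is not Druid's code: the `merge_partial_sums` helper and the per-node results below are hypothetical, and a real Broker merges many more aggregator types than a plain sum.

```python
from collections import defaultdict


def merge_partial_sums(partials):
    """Merge per-node partial aggregates (dimension value -> sum) into one result."""
    merged = defaultdict(float)
    for partial in partials:
        for key, value in partial.items():
            merged[key] += value
    return dict(merged)


# Hypothetical partial results from two Historical nodes and one Realtime node.
historical_1 = {"level1": 10.0, "level2": 4.0}
historical_2 = {"level1": 5.0}
realtime = {"level2": 1.0, "level3": 2.0}

result = merge_partial_sums([historical_1, historical_2, realtime])
print(result)
```

Each node only sees its own segments, so the final answer exists only after the partial results have been combined.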

Overlord Nodes

 The master node of the batch indexing service.

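External Dependencies

MySQL

 Stores Druid's metadata (all of the data in it is created and written by Druid itself).

 It contains the following three tables:

 Table            Purpose
 druid_config     Usually empty
 druid_rules      Rules used by the Coordinator Nodes, for example which node a segment should be downloaded from
 druid_segments   Metadata for each segment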

Deep Storage

 Stores segments. Druid currently supports local disk, NFS-mounted disk, HDFS, S3, and others.
 Data in Deep Storage comes from two sources: batch ingestion and Realtime Nodes.

Zookeeper

 Used by Druid to manage the current state of the cluster, for example to record which segments have been moved from Realtime Nodes to Historical Nodes.

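Configuration

selectStrategy

 This parameter defaults to fillCapacity, which means that tasks are assigned to one MiddleManager until it is full before any task is assigned to another MiddleManager. Consider using the equalDistribution strategy instead, which spreads tasks evenly across the MiddleManagers.
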
$ cd $DRUID_HOME
$ vim conf/druid/overlord/runtime.properties
druid.indexer.selectStrategy=equalDistribution
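
The difference between the two strategies can be sketched with a small simulation. This is a simplified model of the behaviour described above, not the Overlord's actual worker-selection code. The `assign_tasks` function and the slot counts are hypothetical.

```python
def assign_tasks(num_tasks, capacities, strategy):
    """Return how many tasks end up on each MiddleManager.

    capacities: task slots per MiddleManager (hypothetical values).
    strategy: "fillCapacity" or "equalDistribution".
    """
    used = [0] * len(capacities)
    for _ in range(num_tasks):
        free = [i for i, cap in enumerate(capacities) if used[i] < cap]
        if not free:
            raise RuntimeError("no free task slots left")
        if strategy == "fillCapacity":
            # Prefer the worker that is already the most loaded.
            target = max(free, key=lambda i: used[i])
        elif strategy == "equalDistribution":
            # Prefer the worker with the most free slots.
            target = max(free, key=lambda i: capacities[i] - used[i])
        else:
            raise ValueError("unknown strategy: %s" % strategy)
        used[target] += 1
    return used


print(assign_tasks(4, [3, 3, 3], "fillCapacity"))       # [3, 1, 0]
print(assign_tasks(4, [3, 3, 3], "equalDistribution"))  # [2, 1, 1]
```

With four tasks and three workers of three slots each, fillCapacity saturates the first worker before touching the second, while equalDistribution keeps the load spread out.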

Druid Queries

PlyQL

Basic Usage

 Use --host and -q to specify the Broker address and the query to run, respectively.

$ cd /home/druid/software/imply-1.3.0
$ bin/plyql --host <broker host>:8082 -q "show tables" # --host <broker>:<port>
┌─────────────────────────┐
│ Tables_in_database │
├─────────────────────────┤
│ COLUMNS │
│ SCHEMATA │
│ TABLES │
│ yuzhouwan_metrics │
└─────────────────────────┘

Table Schema Queries

$ bin/plyql --host <broker host>:8082 -q "describe yuzhouwan_metrics"
┌────────────┬────────┬──────┬─────┬─────────┬───────┐
│ Field │ Type │ Null │ Key │ Default │ Extra │
├────────────┼────────┼──────┼─────┼─────────┼───────┤
│ __time │ TIME │ YES │ │ │ │
│ metric01 │ NUMBER │ YES │ │ │ │
│ metric02 │ NUMBER │ YES │ │ │ │
│ // ... │ │ │ │ │ │
└────────────┴────────┴──────┴─────┴─────────┴───────┘

Aggregation Queries

Simple Aggregation

 A simple max/min/count query:

$ bin/plyql --host <broker host>:8082 -q "select max(gcCount_max) from yuzhouwan_metrics where serverName='druid01'"
┌──────────────────┐
│ max(gcCount_max) │
├──────────────────┤
│ 39710 │
└──────────────────┘

Aggregation by Time

 Use TIME_PART to aggregate along the time dimension:

$ bin/plyql --host <broker host>:8082 -q "select TIME_PART(__time, MINUTE_OF_DAY, 'Asia/Shanghai'), max(gcCount_max) from yuzhouwan_metrics where serverName='druid01' and __time>='2017-04-04' and __time<'2017-04-05' group by 1" -Z Asia/Shanghai
# Metrics that are not part of the group by must be aggregated with sum/min/max or a similar function
$ bin/plyql --host <broker host>:8082 -q "select TIME_PART(__time, MINUTE_OF_DAY, 'Asia/Shanghai'), metric, sum(sum) as sum_value from yuzhouwan_metrics where level='level1' and metric='metric1' and __time>='2017-04-04' and __time<'2017-04-05' group by 1, 2 order by sum_value desc limit 10" -Z Asia/Shanghai -v
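
To make the grouping key concrete, the sketch below computes a minute-of-day value for a UTC timestamp as seen in Asia/Shanghai. It assumes that MINUTE_OF_DAY means "minutes since local midnight" (0 to 1439) and models Asia/Shanghai as a fixed UTC+8 offset. The `minute_of_day` helper is hypothetical and not part of PlyQL.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai is a fixed UTC+8 offset, which holds for 2017 dates.
SHANGHAI = timezone(timedelta(hours=8))


def minute_of_day(utc_iso, tz=SHANGHAI):
    """Return the minute of the day (0-1439) of a UTC timestamp, seen in tz."""
    utc_time = datetime.strptime(utc_iso, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    local = utc_time.astimezone(tz)
    return local.hour * 60 + local.minute


print(minute_of_day("2017-04-04T00:30:00Z"))  # 08:30 local time -> 510
```

All events that share the same local hour and minute fall into the same group, which is why the time zone has to be given explicitly.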

Showing the JSON Behind a Query

 Add the -v flag to print the JSON generated for a query, which helps to check that a PlyQL statement does what you expect.

$ bin/plyql --host <broker host>:8082 -q "select distinct level from yuzhouwan_metrics where __time>='2017-01-16 03:00'" -Z Asia/Shanghai -v
plyql version 0.9.6 (plywood version 0.15.4)
Received query:
select distinct level from yuzhouwan_metrics where __time>='2017-01-16 03:00'
---------------------------
Parsed query as the following plywood expression (as JSON):
{
"op": "split",
"operand": {
"op": "filter",
"operand": {
"op": "ref",
"name": "yuzhouwan_metrics"
},
"expression": {
"op": "greaterThanOrEqual",
"operand": {
"op": "ref",
"name": "__time",
"ignoreCase": true
}, // ...
{
"version": "v1",
"timestamp": "2017-01-16T03:00:00.000Z",
"event": {
"level": "level1",
"!DUMMY": 1608
}
}
}
]
^^^^^^^^^^^^^^^^^^^^^^^^^^
┌────────────────┐
│ level │
├────────────────┤
│ level1 │
│ level2 │
└────────────────┘

Measuring Query Time

 Use the time command to measure how long a query takes.

$ time bin/plyql -h <broker host>:8082 -q "select * from yuzhouwan_metrics where __time>='2017-03-18' and __time<'2017-03-19' and level='level01' limit 100 " -Z Asia/Shanghai
real 0m0.886s
user 0m0.684s
sys 0m0.062s

RESTful API

Querying with curl

$ vim query.body
# write the query JSON into query.body
$ curl -X POST "http://<broker host>:8082/druid/v2/?pretty" -H 'content-type: application/json' -d @query.body

JSON Query Body

# The JSON can also be passed inline
$ curl -X POST "http://<broker host>:8082/druid/v2/?pretty" -H 'content-type: application/json' -d '{
"dimensions": [
"dimensions1",
"dimensions2"
],
"aggregations": [
{
"filter": {
"type": "selector",
"dimension": "metric",
"value": "metrics01"
},
"aggregator": {
"type": "doubleSum",
"fieldName": "sum",
"name": "metric01"
},
"type": "filtered"
}
],
"filter": {
"type": "selector",
"dimension": "level",
"value": "day"
},
"intervals": "2017-02-09T15:03:12+08:00/2017-02-09T16:03:12+08:00",
"limitSpec": {
"limit": 10,
"type": "default",
"columns": [
{
"direction": "descending",
"dimension": "metric01"
}
]
},
"granularity": "all",
"postAggregations": [],
"queryType": "groupBy",
"dataSource": "yuzhouwan_metrics"
}'
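
The same kind of request can be assembled from a script. The sketch below only builds the HTTP request and does not send it. The `build_groupby_request` helper and the `broker.example.com` address are hypothetical, and the query is a cut-down variant of the one above, not a replacement for it.

```python
import json
import urllib.request


def build_groupby_request(broker, datasource, dimensions, intervals):
    """Build (but do not send) a POST request carrying a minimal groupBy query."""
    query = {
        "queryType": "groupBy",
        "dataSource": datasource,
        "granularity": "all",
        "dimensions": dimensions,
        "aggregations": [
            {"type": "doubleSum", "fieldName": "sum", "name": "sum_value"}
        ],
        "intervals": intervals,
    }
    return urllib.request.Request(
        url="http://%s/druid/v2/?pretty" % broker,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_groupby_request(
    broker="broker.example.com:8082",  # hypothetical Broker address
    datasource="yuzhouwan_metrics",
    dimensions=["level"],
    intervals="2017-02-09T15:03:12+08:00/2017-02-09T16:03:12+08:00",
)
print(request.get_method(), request.full_url)
# To run the query against a reachable Broker:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to print or log the JSON body before it reaches the Broker.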

Visualization

Pivot

Configuration and Startup

$ whereis node
node: /usr/local/bin/node
$ /usr/local/bin/node -v
v4.2.2
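# Use "su - druid" rather than "sudo su druid" so that the druid user's environment variables are loaded
$ su - druid
# The `--with-comments` flag can be left out to avoid malformed generated comments (some lines came out without the leading #)
$ /usr/local/bin/node /home/druid/software/druid/dist/pivot/bin/pivot --druid <druid.broker.host>:8082 --print-config > /home/druid/software/druid/dist/pivot/bin/config_yuzhouwan_metrics.yaml
# A relative path must be used here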
$ cd /home/druid/software/imply-2.0.0
$ nohup dist/pivot/bin/pivot -c /home/druid/software/druid/dist/pivot/bin/config_yuzhouwan_metrics.yaml >> /home/druid/software/druid/dist/pivot/bin/nohup.log 2>&1 &
$ vim /home/druid/software/druid/dist/pivot/bin/config_yuzhouwan_metrics.yaml
# In the Pivot config file, simple expressions can be used for calculations, e.g. dividing by the collection time window to obtain `OPS`
- name: metrics02_OPS
  title: metrics02 ops
  expression: $main.sum($metrics02_Sum) / $main.sum($period_Sum)
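
As a worked example of that expression, the sketch below divides a summed metric by the summed collection windows. It assumes that `period_Sum` holds the window length in seconds. The `ops` helper and the sample numbers are hypothetical.

```python
def ops(metric_sums, period_sums):
    """Operations per second: total metric value divided by total window length."""
    total_period = sum(period_sums)
    if total_period == 0:
        return 0.0  # avoid dividing by zero when no window was collected
    return sum(metric_sums) / total_period


# Three hypothetical rows, each collected over a 60-second window.
print(ops([1200, 1800, 600], [60, 60, 60]))  # 3600 / 180 = 20.0
```

Summing both sides before dividing gives a rate weighted by window length, which differs from averaging per-row rates when the windows are not all the same size.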

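Pitfalls

Too Many Metrics Make the Configuration Hard to Maintain

 One option is to pivot columns into rows: add a metric dimension to dimensions and use it to manage the metric items. This avoids maintaining a large number of metrics in metricsSpec and also makes it easy to add new metrics dynamically.

 However, pivoting columns into rows also inflates the data volume. When resources are limited, you may still have to maintain the metrics in metricsSpec. In that case, the DruidUtils tool I wrote can generate the configuration file quickly, so it does not have to be maintained by hand.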
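
The column-to-row idea can be sketched as follows. The `wide_to_long` helper and the sample row are hypothetical. The `metric` and `sum` field names follow the queries shown earlier, but the real ingestion schema may differ.

```python
def wide_to_long(row, dimension_names):
    """Turn one wide row (one column per metric) into one row per metric."""
    dimensions = {name: row[name] for name in dimension_names}
    long_rows = []
    for column, value in row.items():
        if column in dimension_names:
            continue
        long_row = dict(dimensions)
        long_row["metric"] = column  # the metric name becomes a dimension value
        long_row["sum"] = value      # a single generic value column
        long_rows.append(long_row)
    return long_rows


wide = {"timestamp": "2017-04-04T00:00:00Z", "serverName": "druid01",
        "metric01": 3, "metric02": 7}
rows = wide_to_long(wide, ["timestamp", "serverName"])
for long_row in rows:
    print(long_row)
```

One input row with N metrics becomes N output rows, which is the data inflation mentioned above. In exchange, a new metric needs no schema change.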


Graphite

Base Environment

OS
$ uname -a
Linux olap03-sit.yuzhouwan.com 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ cat /proc/version
Linux version 2.6.32-431.el6.x86_64 (mockbuild@c6b8.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Nov 22 03:15:09 UTC 2013
# For Fedora and RHEL-derivatives
# [Doc]: Other System http://airbnb.io/superset/installation.html#os-dependencies
$ sudo yum upgrade python-setuptools -y
$ sudo yum install openssl openssl-devel zlib zlib-devel readline readline-devel sqlite-devel libffi-devel -y
# Machines
druid.yuzhouwan.com 10.10.10.1 Druid
graphite.yuzhouwan.com 192.168.1.101 Graphite
Python Environment
Python
$ python --version
Python 2.7.8
[Note]: Superset is tested using Python 2.7 and Python 3.4+. Python 3 is the recommended version, Python 2.6 won't be supported.
## Upgrade Python (stable: Python 2.7.12 | 3.4.5, latest: Python 3.5.2 [2016/12/15])
https://www.python.org/downloads/
# Download the required Python version from the Python FTP server
$ wget http://python.org/ftp/python/2.7.12/Python-2.7.12.tgz
# Build
$ tar -zxvf Python-2.7.12.tgz
$ cd /root/software/Python-2.7.12
$ ./configure --prefix=/usr/local/python27
$ make && make install
$ ls /usr/local/python27/ -al
drwxr-xr-x. 6 root root 4096 12月 15 14:22 .
drwxr-xr-x. 13 root root 4096 12月 15 14:20 ..
drwxr-xr-x. 2 root root 4096 12月 15 14:22 bin
drwxr-xr-x. 3 root root 4096 12月 15 14:21 include
drwxr-xr-x. 4 root root 4096 12月 15 14:22 lib
drwxr-xr-x. 3 root root 4096 12月 15 14:22 share
# Replace the original Python 2.6
$ which python
/usr/local/bin/python
$ mv /usr/local/bin/python /usr/local/bin/python_old
$ ln -s /usr/local/python27/bin/python /usr/local/bin/
$ python -V
Python 2.7.12
# Keep yum on the old Python 2.6
$ vim /usr/bin/yum
# Change the first line to python2.6
#!/usr/bin/python2.6
$ yum --version | sed '2,$d'
3.2.29
Pip
$ pip --version
pip 9.0.1 from /usr/local/lib/python2.7/site-packages (python 2.7)
# upgrade setup tools and pip
$ pip install --upgrade setuptools pip
## Installing pip in an offline environment
# Download setuptools-32.0.0.tar.gz from https://pypi.python.org/pypi/setuptools#code-of-conduct
$ tar zxvf setuptools-32.0.0.tar.gz
$ cd setuptools-32.0.0
$ python setup.py install
# Download pip-9.0.1.tar.gz from https://pypi.python.org/pypi/pip
$ wget --no-check-certificate https://pypi.python.org/packages/11/b6/abcb525026a4be042b486df43905d6893fb04f05aac21c32c638e939e447/pip-9.0.1.tar.gz#md5=35f01da33009719497f01a4ba69d63c9
$ tar zxvf pip-9.0.1.tar.gz
$ cd pip-9.0.1
$ python setup.py install
Installed /usr/local/python27/lib/python2.7/site-packages/pip-9.0.1-py2.7.egg
Processing dependencies for pip==9.0.1
Finished processing dependencies for pip==9.0.1
$ pip --version
pip 9.0.1 from /root/software/pip-9.0.1 (python 2.7)
virtualenv
$ pip install virtualenv
# virtualenv is shipped in Python 3 as pyvenv
$ virtualenv venv
$ . ./venv/bin/activate
## Installing virtualenv in an offline environment
# Download virtualenv-15.1.0.tar.gz from https://pypi.python.org/pypi/virtualenv#downloads
$ tar zxvf virtualenv-15.1.0.tar.gz
$ cd virtualenv-15.1.0
$ python setup.py install
$ virtualenv --version
15.1.0

Graphite Setup

Installing into a virtualenv
# root@graphite-sit.yuzhouwan.com (192.168.1.102)
# cd ~
$ cd /opt
$ virtualenv -p /usr/local/bin/python --system-site-packages graphite
$ cd graphite
$ source bin/activate
$ pip install https://github.com/graphite-project/ceres/tarball/master  # installs ceres-0.10.0rc1
$ pip install whisper  # installs whisper-0.9.15
# troubleshooting
$ which python
/root/graphite/bin/python (in virtualenv, otherwise "/usr/local/bin/python")
$ ll /root/graphite/bin/whisper*py
-rwxr-xr-x 1 root root 2847 Jan 3 17:06 /root/graphite/bin/whisper-create.py
-rwxr-xr-x 1 root root 2208 Jan 3 17:06 /root/graphite/bin/whisper-diff.py
-rwxr-xr-x 1 root root 2912 Jan 3 17:06 /root/graphite/bin/whisper-dump.py
-rwxr-xr-x 1 root root 1790 Jan 3 17:06 /root/graphite/bin/whisper-fetch.py
-rwxr-xr-x 1 root root 4309 Jan 3 17:06 /root/graphite/bin/whisper-fill.py
-rwxr-xr-x 1 root root 1081 Jan 3 17:06 /root/graphite/bin/whisper-info.py
-rwxr-xr-x 1 root root 685 Jan 3 17:06 /root/graphite/bin/whisper-merge.py
-rwxr-xr-x 1 root root 5994 Jan 3 17:06 /root/graphite/bin/whisper-resize.py
-rwxr-xr-x 1 root root 929 Jan 3 17:06 /root/graphite/bin/whisper-set-aggregation-method.py
-rwxr-xr-x 1 root root 980 Jan 3 17:06 /root/graphite/bin/whisper-update.py
$ pip install carbon  # installs carbon-0.9.15 constantly-15.1.0 incremental-16.10.1 twisted-16.6.0 txamqp-0.6.2 zope.interface-4.3.3
# troubleshooting
$ ll /root/graphite/bin/carbon*py
-rwxr-xr-x 1 root root 1095 Jan 3 17:12 /root/graphite/bin/carbon-aggregator.py
-rwxr-xr-x 1 root root 1095 Jan 3 17:12 /root/graphite/bin/carbon-cache.py
-rwxr-xr-x 1 root root 4498 Jan 3 17:12 /root/graphite/bin/carbon-client.py
-rwxr-xr-x 1 root root 1095 Jan 3 17:12 /root/graphite/bin/carbon-relay.py
$ pip install graphite-web
$ pip install cairocffi
# pip freeze | grep graphite-web
# graphite-web==0.9.15
Graphite Configuration
$ cd /root/graphite/conf  # otherwise /opt/graphite/conf
$ ls -sail
total 72
-rw-r--r-- 1 root root 1798 Jan 3 17:54 aggregation-rules.conf.example
-rw-r--r-- 1 root root 274 Jan 3 17:54 blacklist.conf.example
-rw-r--r-- 1 root root 2594 Jan 3 17:54 carbon.amqp.conf.example
-rw-r--r-- 1 root root 17809 Jan 3 17:54 carbon.conf.example
-rw-r--r-- 1 root root 888 Jan 3 17:54 relay-rules.conf.example
-rw-r--r-- 1 root root 558 Jan 3 17:54 rewrite-rules.conf.example
-rw-r--r-- 1 root root 827 Jan 3 17:54 storage-aggregation.conf.example
-rw-r--r-- 1 root root 489 Jan 3 17:54 storage-schemas.conf.example
-rw-r--r-- 1 root root 315 Jan 3 17:54 whitelist.conf.example
$ cp aggregation-rules.conf.example aggregation-rules.conf
$ cp blacklist.conf.example blacklist.conf
$ cp carbon.amqp.conf.example carbon.amqp.conf
$ cp carbon.conf.example carbon.conf
# the following 3 conf files require graphite-web to be installed first
$ cp dashboard.conf.example dashboard.conf
$ cp graphite.wsgi.example graphite.wsgi
$ cp graphTemplates.conf.example graphTemplates.conf
#
$ cp relay-rules.conf.example relay-rules.conf
$ cp rewrite-rules.conf.example rewrite-rules.conf
$ cp storage-aggregation.conf.example storage-aggregation.conf
$ cp storage-schemas.conf.example storage-schemas.conf
$ cp whitelist.conf.example whitelist.conf
$ /root/graphite/bin/carbon-cache.py start
Starting carbon-cache (instance a)
# trouble shooting
$ ps -ef | grep carbon
root 12074 1 0 18:58 ? 00:00:00 /root/graphite/bin/python /root/graphite/bin/carbon-cache.py start
$ vim /root/graphite/conf/carbon.conf
# In carbon.conf, the [cache] section sets the default receiver port that accepts incoming metrics over the plaintext protocol
[cache]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
$ yum install nc -y
# echo "<metric path> <metric value> <metric timestamp>" | nc -q0 ${SERVER} ${PORT}
$ echo "carbon.agents.graphite-tutorial.metricsReceived 28198 `date +%s`" | nc -c localhost 2003
# Carbon works with Whisper to store this time-series data on the file system. The whisper-info script prints the metadata of the Whisper file created for a metric
$ /root/graphite/bin/whisper-info.py /root/graphite/storage/whisper/carbon/agents/graphite-tutorial/metricsReceived.wsp
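
The plaintext protocol used by the nc command above is simple enough to reproduce in a few lines. The sketch below formats a line and optionally sends it. The `format_metric` and `send_metric` helpers are hypothetical names, and the host and port are assumed to match the carbon.conf shown above.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Format one metric as a Carbon plaintext line: path, value, timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)


def send_metric(line, host="localhost", port=2003, timeout=5.0):
    """Send one plaintext line to carbon-cache's LINE_RECEIVER_PORT."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line.encode("ascii"))


line = format_metric("carbon.agents.graphite-tutorial.metricsReceived", 28198, 1483690449)
print(line, end="")
# send_metric(line)  # uncomment when carbon-cache is listening on localhost:2003
```

Each line must end with a newline. Without it, carbon-cache keeps waiting for the rest of the line.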
Graphite Web Application
# dependency
# pip install Django==1.9.12 causes the exception "'WSGIRequest' object has no attribute 'REQUEST'"
$ pip install django==1.8.17
$ pip install django-tagging
# configure
$ cd /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite
$ cp local_settings.py.example local_settings.py
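# Create the sqlite3 database, grant read/write permissions, and edit local_settings.py
# See the pitfall "django.db.utils.OperationalError: unable to open database file" for details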
$ cd /root/graphite/conf
$ cp dashboard.conf.example dashboard.conf
$ cp graphTemplates.conf.example graphTemplates.conf
# init database
$ cd /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/
$ python /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/manage.py syncdb
Would you like to create one now? (yes/no): yes
Username (leave blank to use 'root'): graphite
Email address: bj@yuzhouwan.com
Password:
Password (again):
Superuser created successfully.
# start
$ mkdir -p /root/graphite/storage/log/webapp/
$ echo '' > /root/graphite/storage/log/webapp/process.log
$ cd /root/graphite
$ PYTHONPATH=/root/graphite/storage/whisper /root/graphite/bin/run-graphite-devel-server.py --port=8085 --libs=/root/graphite/lib/python2.7/site-packages/opt/graphite/webapp /root/graphite 1>/root/graphite/storage/log/webapp/process.log 2>&1 &
# Alternatively: python /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/manage.py runserver 0.0.0.0:8085
# troubleshooting
$ tail -f /root/graphite/storage/log/webapp/process.log
http://192.168.1.102:8085/
Graphite Events
# PYTHONPATH=$GRAPHITE_ROOT/webapp django-admin.py migrate --settings=graphite.settings --run-syncdb
$ PYTHONPATH=/root/graphite/lib/python2.7/site-packages/opt/graphite/webapp django-admin.py migrate --settings=graphite.settings --run-syncdb
Operations to perform:
Synchronize unmigrated apps: account, cli, render, whitelist, metrics, url_shortener, dashboard, composer, events, browser
Apply all migrations: admin, contenttypes, tagging, auth, sessions
Synchronizing apps without migrations:
Creating tables...
Running deferred SQL...
Running migrations:
Rendering model states... DONE
Applying admin.0002_logentry_remove_auto_add... OK
Applying auth.0007_alter_validators_add_error_messages... OK
$ curl -X POST "http://10.10.10.2:8085/events/" -d '{ "what": "Event - deploy", "tags": ["deploy"], "when": 1467844481, "data": "deploy of master branch happened at Wed Jul 6 22:34:41 UTC 2016" }'
# trouble shooting
http://10.10.10.2:8085/events/ graphite events when what tags
22:34:41 Wed 06 Jul 2016 Event - deploy [u'deploy']
$ curl -s "http://10.10.10.2:8085/render/?target=events('exception')&format=json" | json_pp
[
{
"target" : "events(exception)",
"datapoints" : [
[
1, 1388966651
],
[
3, 1388966652
]
]
}
]
References
graphite-index
# douban new UI for graphite
$ git clone https://github.com/douban/graph-index.git
$ cd graph-index
$ vim config.py
graphite_url = 'http://192.168.1.101:9097'
$ crontab -e
*/5 * * * * python /root/software/graphite-index
$ python /graph-index.py

Integrating with Druid

Migrating to an Intranet Environment
# 192.168.1.102 to 10.10.10.2 (sit)
# ps -ef | grep graphite  # stop all Graphite processes first
# Use rsync instead of scp so that symlinks are copied as well (note: packing with tar zcvf does not solve this either)
$ rsync -avuz -e ssh /root/graphite root@10.10.10.2:/root
# 192.168.1.102 to 192.168.2.101 to 192.168.1.101 (product)
$ rsync -avuz -e ssh /root/graphite jinjy@192.168.2.101:/home/jinjy
$ rsync -avuz -e ssh /home/jinjy/graphite root@192.168.1.101:/root
# default: --port=8085
$ /root/graphite/bin/carbon-cache.py start
$ PYTHONPATH=/root/graphite/storage/whisper /root/graphite/bin/run-graphite-devel-server.py --port=9097 --libs=/root/graphite/lib/python2.7/site-packages/opt/graphite/webapp /root/graphite 1>/root/graphite/storage/log/webapp/process.log 2>&1 &
# trouble shooting
$ ps -ef | grep graphite
root 30754 1 0 15:42 ? 00:00:00 /root/graphite/bin/python /root/graphite/bin/carbon-cache.py start
root 30825 28048 3 15:43 pts/1 00:00:00 /root/graphite/bin/python /root/graphite/bin/django-admin runserver --pythonpath /root/graphite/webapp --settings graphite.settings 0.0.0.0:9097
root 30829 30825 5 15:43 pts/1 00:00:00 /root/graphite/bin/python /root/graphite/bin/django-admin runserver --pythonpath /root/graphite/webapp --settings graphite.settings 0.0.0.0:9097
$ cd /root/graphite/storage/log/carbon-cache/carbon-cache-a
$ tail -f console.log creates.log listener.log # carbon logs related to received events
$ tail -f /root/graphite/storage/log/webapp/process.log
http://192.168.1.101:9097/
# virtualenv
$ rsync -avuz -e ssh /root/software jinjy@192.168.1.101:/home/jinjy
$ rsync -avuz -e ssh /home/jinjy/software/Python-2.7.12.tgz root@192.168.1.101:/root/software
$ cd /root/software
$ tar zxvf Python-2.7.12.tgz
$ cd Python-2.7.12
$ ./configure --prefix=/usr --enable-shared CFLAGS=-fPIC
$ make -j4 && make -j4 install
$ /sbin/ldconfig -v | grep /
$ python -V
Python 2.7.12
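# The symlinks were rsynced over, but the Python shared libraries they point to do not exist on the target machine
$ file /root/graphite/lib/python2.7/lib-dynload
/root/graphite/lib/python2.7/lib-dynload: broken symbolic link to `/usr/local/python27/lib/python2.7/lib-dynload'
# Python must be installed under the same prefix as the global Python used to create the virtualenv on the networked machine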
$ ./configure --prefix=/usr/local/python27 --enable-shared CFLAGS=-fPIC
$ make -j4 && make -j4 install
$ /sbin/ldconfig -v | grep /
$ ls /usr/local/python27/lib/python2.7/lib-dynload -sail
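
Broken links like the one reported by `file` above can be found in bulk before the services are started. The sketch below walks a directory tree and lists every symlink whose target is missing. The `find_broken_symlinks` helper is hypothetical, and /root/graphite is the virtualenv path used throughout this post.

```python
import os


def find_broken_symlinks(root):
    """Return paths under root that are symlinks whose target does not exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it to the target.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


if __name__ == "__main__":
    for path in find_broken_symlinks("/root/graphite"):
        print(path, "->", os.readlink(path))
```

An empty result after rebuilding Python under the original prefix suggests that the copied virtualenv can resolve its libraries again.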
Modifying the Druid Configuration
$ sudo su druid
$ cd /home/druid/software/druid
$ find | grep common.runtime.properties | grep -v quickstart | grep -v dist
./conf/druid/_common/common.runtime.properties
$ cp /home/druid/software/druid/conf/druid/_common/common.runtime.properties /home/druid/software/druid/conf/druid/_common/common.runtime.properties.bak
$ vim /home/druid/software/druid/conf/druid/_common/common.runtime.properties
# module
druid.extensions.loadList=[..., "graphite-emitter"]
#
# Monitoring
#
druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.emitter=http
#druid.emitter=logging
druid.emitter.logging.logLevel=info
druid.emitter.http.recipientBaseUrl=http://10.37.2.142:9999/metrics
# monitor
druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
druid.emitter=composing
druid.emitter.composing.emitters=["graphite", "logging"]
druid.emitter.graphite.hostname=localhost
# Mind the port: it is not 2003 (i.e. not LINE_RECEIVER_PORT in /root/graphite/conf/carbon.conf) but PICKLE_RECEIVER_PORT
druid.emitter.graphite.port=2004
# druid.emitter.graphite.eventConverter={"type":"whiteList", "namespacePrefix": "cluster_x", "ignoreHostname":true, "ignoreServiceName":false, "mapFile":"/a/b/c"}
druid.emitter.graphite.eventConverter={"ingest/events/thrownAway":["dataSource"],"ingest/events/unparseable":["dataSource"],"ingest/events/processed":["dataSource"],"ingest/handoff/failed":["dataSource"],"ingest/persists":[],"ingest/rows/output":[],"jvm/gc":[],"jvm/mem":[],"query/cpu/time":["dataSource","type"],"query/node/time":["dataSource","type"],"query/node/ttfb":["dataSource","type"],"query/partial/time":["dataSource","type"],"query/segment/time":["dataSource","type"],"query/segmentAndCache/time":["dataSource","type"],"query/time":["dataSource","type"],"query/wait/time":["dataSource","type"],"segment/count":[],"segment/dropQueue/count":[],"segment/loadQueue/count":[],"segment/loadQueue/failed":[],"segment/loadQueue/size":[],"segment/scan/pending":[],"segment/size":[],"segment/usedPercent":[]}
druid.emitter.logging.logLevel=info
druid.emitter.graphite.eventConverter={"type":"all", "namespacePrefix": "druid", "ignoreHostname": false, "ignoreServiceName": false}
## pretty format start ##
{
"ingest/events/thrownAway": ["dataSource"],
"ingest/events/unparseable": ["dataSource"],
"ingest/events/processed": ["dataSource"],
"ingest/handoff/failed": ["dataSource"],
"ingest/persists": [],
"ingest/rows/output": [],
"jvm/gc": [],
"jvm/mem": [],
"query/cpu/time": [
"dataSource",
"type"
],
"query/node/time": [
"dataSource",
"type"
],
"query/node/ttfb": [
"dataSource",
"type"
],
"query/partial/time": [
"dataSource",
"type"
],
"query/segment/time": [
"dataSource",
"type"
],
"query/segmentAndCache/time": [
"dataSource",
"type"
],
"query/time": [
"dataSource",
"type"
],
"query/wait/time": [
"dataSource",
"type"
],
"segment/count": [],
"segment/dropQueue/count": [],
"segment/loadQueue/count": [],
"segment/loadQueue/failed": [],
"segment/loadQueue/size": [],
"segment/scan/pending": [],
"segment/size": [],
"segment/usedPercent": []
}
## pretty format end ##
# kill the historical process so that it restarts with the new configuration
$ jps -m
1867 Main server historical
26339 Main server middleManager
$ kill 1867
# trouble shooting
$ tail -f var/sv/supervise.log
# You should see:
[Thu Jan 5 11:18:17 2017] Running command[historical], logging to[/home/druid/software/imply-2.0.0/var/sv/historical.log]: bin/run-druid historical conf
[Thu Jan 5 11:18:21 2017] Command[historical] exited (pid = 1752, exited = 1)
[Thu Jan 5 11:18:21 2017] Command[historical] failed, see logfile for more details: /home/druid/software/imply-2.0.0/var/sv/historical.log
$ tail -f /home/druid/software/imply-2.0.0/var/sv/historical.log
2017-01-05T11:34:29,203 INFO [GraphiteEmitter-1] io.druid.emitter.graphite.GraphiteEmitter - trying to connect to graphite server
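# If the connection fails, an error is logged: ERROR [GraphiteEmitter-1] io.druid.emitter.graphite.GraphiteEmitter - Connection refused
# In that case, check whether the Graphite processes are running normally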
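
The reason for port 2004 is that the emitter is configured here to talk to Carbon's pickle receiver, not the plaintext one. As a rough sketch of what a pickle-protocol message looks like (based on Graphite's documented framing, not on the emitter's source), the hypothetical `build_pickle_message` helper below frames a list of datapoints.

```python
import pickle
import struct


def build_pickle_message(metrics):
    """Frame metrics for Carbon's pickle receiver.

    metrics: list of (path, (timestamp, value)) tuples.
    The message is a 4-byte big-endian length header followed by the pickled list.
    """
    payload = pickle.dumps(metrics, protocol=2)
    header = struct.pack("!L", len(payload))
    return header + payload


message = build_pickle_message([
    ("druid.historical.jvm.mem.used", (1483690449, 1024.0)),  # hypothetical metric path
])
size = struct.unpack("!L", message[:4])[0]
print(size == len(message) - 4)
```

Sending plaintext lines to the pickle port, or pickled frames to port 2003, fails to parse on the Carbon side, which is why the port and protocol have to match.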

Tuning

Dependencies
Django
$ pip freeze | grep Django
Django==1.8
$ pip install --upgrade Django
Successfully installed Django-1.10.5
$ pip uninstall Django
$ pip install Django==1.8.17 # newer versions cause the exception "'WSGIRequest' object has no attribute 'REQUEST'"
Graphite-Web
$ vim requirements.txt
python-memcached==1.47
txAMQP==0.4
simplejson==2.1.6
django-tagging==0.4.3
gunicorn
pytz
pyparsing==1.5.7
cairocffi
whitenoise
$ pip install -r requirements.txt
Collection
$ vim /root/graphite/conf/storage-schemas.conf
[carbon]
pattern = ^carbon\.
retentions = 60:90d
[default_1min_for_1day]
pattern = .*
# retentions = 60s:1d
# changed to three time granularities
retentions = 10s:6h,1m:7d,10m:1y
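
To see what this retention setting costs on disk, the sketch below estimates the number of points per archive and the resulting Whisper file size. It assumes the usual Whisper layout (a 16-byte header, 12 bytes of metadata per archive, 12 bytes per datapoint) and treats one year as 365 days. The helper names are hypothetical.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 86400 * 365}


def to_seconds(spec):
    """Convert a value such as '10s' or '7d' to seconds."""
    return int(spec[:-1]) * UNIT_SECONDS[spec[-1]]


def parse_retentions(retentions):
    """Return (seconds_per_point, points) for each 'precision:duration' archive."""
    archives = []
    for archive in retentions.split(","):
        precision, duration = archive.strip().split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


def whisper_file_size(archives):
    """Estimated .wsp size in bytes under the layout assumed above."""
    return 16 + 12 * len(archives) + 12 * sum(points for _, points in archives)


archives = parse_retentions("10s:6h,1m:7d,10m:1y")
print(archives)
print(whisper_file_size(archives))
```

The estimate applies per metric, so multiplying it by the number of metric paths gives a rough disk budget. Changing storage-schemas.conf only affects newly created Whisper files. Existing files keep their old layout unless they are resized with whisper-resize.py.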
Monitoring
$ python /root/graphite/examples/example-client.py
sending message
--------------------------------------------------------------------------------
system.loadavg_1min 0.26 1483690449
system.loadavg_5min 0.30 1483690449
system.loadavg_15min 0.35 1483690449
Alerting
Startup Command Summary
$ python /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/manage.py syncdb
$ PYTHONPATH=/root/graphite/lib/python2.7/site-packages/opt/graphite/webapp django-admin.py migrate --settings=graphite.settings --run-syncdb
$ /root/graphite/bin/carbon-cache.py start
$ PYTHONPATH=/root/graphite/storage/whisper /root/graphite/bin/run-graphite-devel-server.py --port=9097 --libs=/root/graphite/webapp /root/graphite 1>/root/graphite/storage/log/webapp/process.log 2>&1 &

Pitfalls

ImportError: No module named carbon.util
Description
(graphite) [root@graphite-sit.yuzhouwan.com conf]# /root/graphite/bin/carbon-cache.py start
Traceback (most recent call last):
File "/root/graphite/bin/carbon-cache.py", line 28, in <module>
from carbon.util import run_twistd_plugin
ImportError: No module named carbon.util
Solution
  • Check whether carbon was only partially installed –not ok
$ pip freeze
carbon==0.9.15
ceres==0.10.0rc1
constantly==15.1.0
incremental==16.10.1
Twisted==16.6.0
txAMQP==0.6.2
whisper==0.9.15
zope.interface==4.3.3
  • graphite's default prefix (/opt/graphite) –ok
$ mv /root/graphite/lib/python2.7/site-packages/opt/graphite/lib/carbon /root/graphite/lib/python2.7/site-packages/
$ mv /root/graphite/lib/python2.7/site-packages/opt/graphite/lib/twisted/plugins/carbon_* /root/graphite/lib/python2.7/site-packages/twisted/plugins/
django.db.utils.OperationalError: unable to open database file
Description
$ python manage.py syncdb
/root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/settings.py:246: UserWarning: SECRET_KEY is set to an unsafe default. This should be set in local_settings.py for better security
warn('SECRET_KEY is set to an unsafe default. This should be set in local_settings.py for better security')
Traceback (most recent call last):
File "manage.py", line 13, in <module>
execute_from_command_line(sys.argv)
File "/root/graphite/lib/python2.7/site-packages/django/core/management/__init__.py", line 338, in execute_from_command_line
utility.execute()
// ...
django.db.utils.OperationalError: unable to open database file
Solution
  • change default SECRET_KEY in settings.py –ok
$ vim /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/settings.py
# Django 1.5 requires this so we set a default but warn the user
# SECRET_KEY = 'UNSAFE_DEFAULT'
SECRET_KEY = 'graphite'
  • change DATABASE_NAME in sqlite
$ mkdir /root/graphite/sqlite
$ cd /root/graphite/sqlite
# create database
$ sqlite3 graphite.db
$ sqlite3
sqlite>.help
sqlite>.databases
seq name file
--- --------------- ----------------------------------------------------------
0 main /root/graphite/sqlite/graphite.db
Ctrl + D (exit like python)
# change DATABASE_NAME --not ok
DATABASE_NAME='/root/graphite/sqlite/graphite.db'
echo $DATABASE_NAME
# run 'python manage.py syncdb' again, then the graphite database disappeared
  • modify settings.py for sqlite database –ok
$ cd /root/graphite/storage
$ mkdir db
$ cd db
$ sqlite3 graphite.db
$ vim /root/graphite/lib/python2.7/site-packages/django/conf/project_template/project_name/settings.py
# GRAPHITE_STORAGE_DIR = '/root/graphite/sqlite/graphite.db' --not ok
#DATABASES = {
# 'default': {
# 'ENGINE': 'django.db.backends.sqlite3',
# 'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
# }
#}
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.sqlite3', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
'NAME': '/root/graphite/storage/db/graphite.db', # Or path to database file if using sqlite3.
'USER': '', # Not used with sqlite3.
'PASSWORD': '', # Not used with sqlite3.
'HOST': '', # Set to empty string for localhost. Not used with sqlite3.
'PORT': '', # Set to empty string for default. Not used with sqlite3.
}
}
# troubleshooting
$ sqlite3 /root/graphite/storage/db/graphite.db
SQLite version 3.6.20
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .databases
seq name file
--- --------------- ----------------------------------------------------------
0 main /root/graphite/storage/db/graphite.db
$ cd /root/graphite/
$ find | grep /settings.py | grep -v pyc
./lib/python2.7/site-packages/opt/graphite/webapp/graphite/settings.py
./lib/python2.7/site-packages/tagging/tests/settings.py
./lib/python2.7/site-packages/tagging/settings.py
./lib/python2.7/site-packages/django/conf/project_template/project_name/settings.py
# Once all of these files have been modified, the problem is fixed
  • django version too low (<= v1.4) –no
$ django-admin version
1.8
  • Access permissions –not ok
$ cut -d: -f1 /etc/passwd | grep graphite
$ echo $USER
root
$ cd /root/graphite/storage/db
$ sudo chown root:root graphite.db
$ sudo chmod o+rw graphite.db
$ sudo chmod o+rwx db/
$ sudo chmod o+rwx ../webapp/
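
Since "unable to open database file" can come from a missing directory or a permission problem as well as a wrong path, it can help to test the configured path directly. The sketch below is a hypothetical diagnostic (`check_sqlite_path` is not part of Django or Graphite). Note that `sqlite3.connect` creates an empty database file if none exists yet.

```python
import os
import sqlite3


def check_sqlite_path(db_path):
    """Return a list of problems that would prevent SQLite from opening db_path."""
    problems = []
    parent = os.path.dirname(os.path.abspath(db_path))
    if not os.path.isdir(parent):
        problems.append("parent directory does not exist: %s" % parent)
    elif not os.access(parent, os.W_OK | os.X_OK):
        problems.append("parent directory is not writable: %s" % parent)
    if not problems:
        try:
            # Creates the file when it is missing, just like the sqlite3 CLI.
            sqlite3.connect(db_path).close()
        except sqlite3.OperationalError as error:
            problems.append(str(error))
    return problems


print(check_sqlite_path("/root/graphite/storage/db/graphite.db"))
```

Run it as the same user that runs the web application. An empty list means SQLite itself can open the file, so the remaining suspect is the NAME value in the settings file that Django actually loads.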
ImportError: No module named graphite.settings
Description
$ ./bin/run-graphite-devel-server.py --port=8085 --libs=/root/graphite/webapp /root/graphite 1>/root/graphite/storage/log/webapp/process.log 2>&1 &
tail: /root/graphite/storage/log/webapp/process.log: file truncated
Traceback (most recent call last):
File "/root/graphite/bin/django-admin", line 11, in <module>
sys.exit(execute_from_command_line())
File "/root/graphite/lib/python2.7/site-packages/django/core/management/__init__.py", line 338, in execute_from_command_line
utility.execute()
File "/root/graphite/lib/python2.7/site-packages/django/core/management/__init__.py", line 303, in execute
settings.INSTALLED_APPS
File "/root/graphite/lib/python2.7/site-packages/django/conf/__init__.py", line 48, in __getattr__
self._setup(name)
File "/root/graphite/lib/python2.7/site-packages/django/conf/__init__.py", line 44, in _setup
self._wrapped = Settings(settings_module)
File "/root/graphite/lib/python2.7/site-packages/django/conf/__init__.py", line 92, in __init__
mod = importlib.import_module(self.SETTINGS_MODULE)
File "/usr/local/python27/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
ImportError: No module named graphite.settings
Solution
  • Specify PYTHONPATH –ok
$ PYTHONPATH=/root/graphite/storage/whisper /root/graphite/bin/run-graphite-devel-server.py --port=8085 --libs=/root/graphite/webapp /root/graphite 1>/root/graphite/storage/log/webapp/process.log 2>&1 &
# new problem
ImportError: Cannot import either sping or piddle.
$ PYTHONPATH=/root/graphite/storage/whisper /root/graphite/bin/run-graphite-devel-server.py --port=8085 --libs=/root/graphite/lib/python2.7/site-packages/opt/graphite/webapp /root/graphite 1>/root/graphite/storage/log/webapp/process.log 2>&1 &
  • Modify local_settings.py –not ok
$ vim /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/local_settings.py
##########################
# Database Configuration #
##########################
# By default sqlite is used. If you cluster multiple webapps you will need
# to setup an external database (such as MySQL) and configure all of the webapp
# instances to use the same database. Note that this database is only used to store
# Django models such as saved graphs, dashboards, user preferences, etc.
# Metric data is not stored here.
#
# DO NOT FORGET TO RUN 'manage.py syncdb' AFTER SETTING UP A NEW DATABASE
#
# The following built-in database engines are available:
# django.db.backends.postgresql # Removed in Django 1.4
# django.db.backends.postgresql_psycopg2
# django.db.backends.mysql
# django.db.backends.sqlite3
# django.db.backends.oracle
#
# The default is 'django.db.backends.sqlite3' with file 'graphite.db'
# located in STORAGE_DIR
#
#DATABASES = {
# 'default': {
# 'NAME': '/opt/graphite/storage/graphite.db',
# 'ENGINE': 'django.db.backends.sqlite3',
# 'USER': '',
# 'PASSWORD': '',
# 'HOST': '',
# 'PORT': ''
# }
#}
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.sqlite3', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
'NAME': '/root/graphite/storage/db/graphite.db', # Or path to database file if using sqlite3.
'USER': '', # Not used with sqlite3.
'PASSWORD': '', # Not used with sqlite3.
'HOST': '', # Set to empty string for localhost. Not used with sqlite3.
'PORT': '', # Set to empty string for default. Not used with sqlite3.
}
}
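Regarding the `--libs` fix above: the import only succeeds when the directory passed to `--libs` actually contains a `graphite` package with a `settings.py`. A rough sketch of that check follows. The function `dirs_providing` is a hypothetical helper, and it models the import lookup in a simplified way (plain `.py` modules and packages with `__init__.py` only, as in Python 2.7).

```python
import os


def dirs_providing(module_name, search_dirs):
    """Return the entries of search_dirs from which module_name could be imported.

    Simplified model of the import lookup, for illustration only.
    """
    parts = module_name.split(".")
    hits = []
    for base in search_dirs:
        current = base
        packages_ok = True
        # Every parent package must be a directory with an __init__.py.
        for package in parts[:-1]:
            current = os.path.join(current, package)
            if not os.path.isfile(os.path.join(current, "__init__.py")):
                packages_ok = False
                break
        if not packages_ok:
            continue
        leaf = os.path.join(current, parts[-1])
        # The last component is either a module file or a package.
        if os.path.isfile(leaf + ".py") or os.path.isfile(os.path.join(leaf, "__init__.py")):
            hits.append(base)
    return hits

# Example with the two --libs values tried in this post:
# dirs_providing("graphite.settings", [
#     "/root/graphite/webapp",
#     "/root/graphite/lib/python2.7/site-packages/opt/graphite/webapp",
# ])
```

An empty result for a given directory suggests that using it as `--libs` would reproduce the `ImportError`.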
References
IOError: [Errno 2] No such file or directory: ‘/root/graphite/lib/python2.7/site-packages/opt/graphite/storage/log/webapp/info.log’
Description
http://192.168.1.102:8085/
Traceback (most recent call last):
File "/root/graphite/lib/python2.7/site-packages/django/core/handlers/base.py", line 119, in get_response
resolver_match = resolver.resolve(request.path_info)
// ...
File "/usr/local/python27/lib/python2.7/logging/__init__.py", line 943, in _open
stream = open(self.baseFilename, self.mode)
IOError: [Errno 2] No such file or directory: '/root/graphite/lib/python2.7/site-packages/opt/graphite/storage/log/webapp/info.log'
Solution
  • Add the info.log file –ok
$ mkdir -p /root/graphite/lib/python2.7/site-packages/opt/graphite/storage/log/webapp/
$ echo '' > /root/graphite/lib/python2.7/site-packages/opt/graphite/storage/log/webapp/info.log
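The same fix can be expressed in Python, for example inside a deployment script. This is only a sketch and `ensure_log_file` is a hypothetical name. Unlike `echo '' > file`, it opens the file in append mode, so an existing log is not truncated.

```python
import os


def ensure_log_file(path):
    """Create the parent directories and an empty file at path if missing."""
    directory = os.path.dirname(path)
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    # Append mode creates the file without truncating an existing log.
    with open(path, "a"):
        pass
    return path

# Example (path taken from the error message above):
# ensure_log_file("/root/graphite/lib/python2.7/site-packages/opt/graphite/storage/log/webapp/info.log")
```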
No event data on the Graphite Web page
Description
http://192.168.1.101:9097/events/
# On the Druid side, the historical process is indeed producing data and has connected to graphite successfully
$ tail -f /home/druid/software/imply-2.0.0/var/sv/historical.log
2017-01-05T11:34:29,203 INFO [GraphiteEmitter-1] io.druid.emitter.graphite.GraphiteEmitter - trying to connect to graphite server
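# If the connection fails, it reports: ERROR [GraphiteEmitter-1] io.druid.emitter.graphite.GraphiteEmitter - 拒绝连接 (Connection refused)
# In that case, check whether the graphite process is running normally
# On the Graphite side, the data can also be seen arriving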
$ cd /root/graphite/storage/log/carbon-cache/carbon-cache-a
$ tail -f console.log creates.log listener.log
05/01/2017 20:05:18 :: Sorted 75 cache queues in 0.000208 seconds
# If the data is malformed, it reports: 05/01/2017 20:32:32 :: invalid line ((L1483619493L) received from client 10.10.10.1:41752, ignoring
# In that case, check whether the druid emitter configuration is correct
Solution
  • Is the SQLite database failing to store the data?
# The graphite configuration is fine
$ vim /root/graphite/lib/python2.7/site-packages/django/conf/project_template/project_name/settings.py
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.sqlite3', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
'NAME': '/root/graphite/storage/db/graphite.db', # Or path to database file if using sqlite3.
'USER': '', # Not used with sqlite3.
'PASSWORD': '', # Not used with sqlite3.
'HOST': '', # Set to empty string for localhost. Not used with sqlite3.
'PORT': '', # Set to empty string for default. Not used with sqlite3.
}
}
# It turns out that no events were recorded in sqlite
(graphite) [root@kylin03-pre db]# sqlite3 /root/graphite/storage/db/graphite.db
SQLite version 3.6.20
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .databases
seq name file
--- --------------- ----------------------------------------------------------
0 main /root/graphite/storage/db/graphite.db
sqlite> .tables
account_mygraph dashboard_dashboard
account_profile dashboard_dashboard_owners
account_variable django_admin_log
account_view django_content_type
account_window django_migrations
auth_group django_session
auth_group_permissions events_event
auth_permission tagging_tag
auth_user tagging_taggeditem
auth_user_groups url_shortener_link
auth_user_user_permissions
sqlite> select * from auth_user;
1|pbkdf2_sha256$20000$oEgzveEmcg9B$8xbilUymXlwVBAaB48xpUQwsfIucmeP/4C4YF3U6SlI=|1|graphite|||bj@yuzhouwan.com|1|1|2017-01-04 05:59:10.615950|2017-01-05 08:24:54.957631
2|pbkdf2_sha256$20000$gG1lK6FNg0h7$dXH47Wqc+Gj/qTyI6EKOajd+Pj1kKN+U5CtnmDo0K/0=|0|default|||default@localhost.localdomain|0|1|2017-01-04 06:53:34.687401|
3|pbkdf2_sha256$20000$fcQ5sYbw0cjk$anjZc4J0eRE51HGJ6D50c0c9+d08iY7lhWseke9RmEY=|0|druid||||0|1|2017-01-05 09:03:48.696161|
sqlite> select * from events_event; # no data!
# Try replacing SQLite with MySQL
# 192.168.1.102
$ mkdir -p /root/software/mysql
$ yum install -y --downloadonly --downloaddir=/root/software/mysql mysql
$ yum install -y --downloadonly --downloaddir=/root/software/mysql mysql-server
$ yum install -y --downloadonly --downloaddir=/root/software/mysql MySQL-python
# 192.168.1.101
$ yum install -y mysql mysql-server MySQL-python
$ cd /root/software/mysql
$ wget http://dev.mysql.com/get/mysql57-community-release-el5-7.noarch.rpm
$ yum localinstall mysql57-community-release-el5-7.noarch.rpm
$ yum repolist enabled | grep "mysql.*-community.*"
$ yum install mysql-community-server
$ vim /usr/bin/yum-config-manager
#!/usr/bin/python2.6 -tt
$ yum-config-manager --enable mysql57-community
$ service mysqld start
$ mysql -uroot -p -S /home/mysql/data/mysql.sock
# For a standardized deployment later on, a dedicated graphite user can be created and granted privileges
CREATE DATABASE graphite;
# GRANT ALL PRIVILEGES ON graphite.* TO 'graphite'@'localhost' IDENTIFIED BY 'sysadmin';
GRANT ALL PRIVILEGES ON graphite.* TO 'root'@'localhost' IDENTIFIED BY 'sysadmin';
FLUSH PRIVILEGES;
$ vim /root/graphite/lib/python2.7/site-packages/django/conf/project_template/project_name/settings.py
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
# 'NAME': 'jdbc:mysql://192.168.1.101:3306/graphite',
'NAME': 'graphite',
'USER': 'root',
# 'HOST': 'localhost',
'PASSWORD': 'root'
}
}
# TIME_ZONE = 'UTC'
TIME_ZONE = 'Asia/Shanghai'
# DEBUG = False
DEBUG = True
$ cd /root/graphite/
$ find | grep /settings.py | grep -v pyc
$ vim /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/settings.py
$ vim /root/graphite/lib/python2.7/site-packages/tagging/tests/settings.py
$ vim /root/graphite/lib/python2.7/site-packages/tagging/settings.py
# ./lib/python2.7/site-packages/django/conf/project_template/project_name/settings.py
# Once all of these files have been modified, the problem is fixed
$ python /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/manage.py syncdb
# To add another superuser, use the following command (admin/admin)
# echo "from django.contrib.auth.models import User; User.objects.create_superuser('admin', 'admin@hihuron.com', 'sysadmin')" | python /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp/graphite/manage.py shell
$ /root/graphite/bin/carbon-cache.py start
$ PYTHONPATH=/root/graphite/storage/whisper /root/graphite/bin/run-graphite-devel-server.py --port=9097 --libs=/root/graphite/lib/python2.7/site-packages/opt/graphite/webapp /root/graphite 1>/root/graphite/storage/log/webapp/process.log 2>&1 &
$ cd /root/graphite/webapp
$ cp -r content/ /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp
$ cd /root/graphite/lib/python2.7/site-packages/opt/graphite/webapp
$ cp -r graphite/ /root/graphite/webapp
$ PYTHONPATH=/root/graphite/storage/whisper /root/graphite/bin/run-graphite-devel-server.py --port=9097 --libs=/root/graphite/webapp /root/graphite 1>/root/graphite/storage/log/webapp/process.log 2>&1 &
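The manual `select * from events_event` check shown earlier can be repeated after each configuration change with a few lines of Python. The sketch below uses only the standard `sqlite3` module. `count_rows` is a hypothetical helper, and the table name `events_event` is the one listed by `.tables` above.

```python
import sqlite3


def count_rows(db_path, table):
    """Return the number of rows in table, or None if the table does not exist."""
    conn = sqlite3.connect(db_path)
    try:
        found = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if found is None:
            return None
        # Identifiers cannot be bound as parameters; the name was just
        # confirmed to exist in sqlite_master, so it is quoted and inlined.
        return conn.execute('SELECT COUNT(*) FROM "%s"' % table).fetchone()[0]
    finally:
        conn.close()

# Example (path taken from this post):
# count_rows("/root/graphite/storage/db/graphite.db", "events_event")
```

A result of `0` corresponds to the "no data!" situation described above, while `None` would mean the table was never created (for example because `syncdb` was not run against this database).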
References
ImportError: No module named twisted.python.util
Description
$ python carbon-cache.py start
Traceback (most recent call last):
File "carbon-cache.py", line 28, in <module>
from carbon.util import run_twistd_plugin
File "/opt/graphite/lib/carbon/util.py", line 20, in <module>
from twisted.python.util import initgroups
ImportError: No module named twisted.python.util
Solution
# pip freeze | grep zope.interface # install it if it is missing
# pip install zope.interface==3.6.0
$ wget https://pypi.python.org/packages/source/T/Twisted/Twisted-14.0.0.tar.bz2#md5=9625c094e0a18da77faa4627b98c9815 --no-check-certificate
$ tar -jxf Twisted-14.0.0.tar.bz2
$ cd Twisted-14.0.0;
$ python setup.py install
‘WSGIRequest’ object has no attribute ‘REQUEST’
Description
http://192.168.1.102:9097/
AttributeError at /render/
'WSGIRequest' object has no attribute 'REQUEST'
Request Method: GET
Request URL: http://192.168.1.102:9097/render/?width=586&height=308&_salt=1483685265.903
Django Version: 1.9.12
Exception Type: AttributeError
Exception Value:
'WSGIRequest' object has no attribute 'REQUEST'
Exception Location: /root/graphite/webapp/graphite/render/views.py in parseOptions, line 236
Python Executable: /root/graphite/bin/python
Python Version: 2.7.12
Python Path:
['/root/graphite/webapp',
'/root/graphite/webapp',
'/root/graphite/webapp',
'/root/graphite/bin',
'/root/graphite/webapp',
'/root/graphite/storage/whisper',
'/root/graphite/lib/python27.zip',
'/root/graphite/lib/python2.7',
'/root/graphite/lib/python2.7/plat-linux2',
'/root/graphite/lib/python2.7/lib-tk',
'/root/graphite/lib/python2.7/lib-old',
'/root/graphite/lib/python2.7/lib-dynload',
'/usr/local/python27/lib/python2.7',
'/usr/local/python27/lib/python2.7/plat-linux2',
'/usr/local/python27/lib/python2.7/lib-tk',
'/root/graphite/lib/python2.7/site-packages',
'/root/graphite/lib/python2.7/site-packages/graphite-0.71-py2.7.egg',
'/root/graphite/lib/python2.7/site-packages/spring-5.8.7-py2.7-linux-x86_64.egg',
'/root/graphite/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-linux-x86_64.egg',
'/root/graphite/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg',
'/root/graphite/lib/python2.7/site-packages/numpy-1.12.0rc2-py2.7-linux-x86_64.egg',
'/root/graphite/lib/python2.7/site-packages/logger-1.4-py2.7.egg',
'/root/graphite/lib/python2.7/site-packages/decorator-4.0.10-py2.7.egg',
'/root/graphite/lib/python2.7/site-packages/sping-1.1.15-py2.5.egg',
'/usr/local/python27/lib/python2.7/site-packages',
'/root/graphite/webapp/graphite/thirdparty']
Server time: Fri, 6 Jan 2017 14:47:46 +0800
Solution
Possibly caused by a mismatched Django version?
django==1.10.5 --no
django==1.9.12 --no
django==1.8.17 --ok
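The results above are consistent with Django's own history: `HttpRequest.REQUEST` was deprecated in Django 1.7 and removed in Django 1.9, so code that still reads `request.REQUEST` only works on releases before 1.9. A small sketch of that version check follows. Both function names are hypothetical, and the parsing is deliberately simple.

```python
def parse_version(version):
    """Turn '1.8.17' into (1, 8, 17); non-numeric suffixes are ignored."""
    numbers = []
    for part in version.split("."):
        digits = ""
        for ch in part:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        numbers.append(int(digits))
    return tuple(numbers)


def has_request_attribute(django_version):
    """True if this Django release still provides HttpRequest.REQUEST."""
    return parse_version(django_version) < (1, 9)

# Example with the versions tried in this post:
# for v in ("1.10.5", "1.9.12", "1.8.17"):
#     print(v, has_request_attribute(v))
```

In a running environment the version string can be obtained from `django.get_version()`.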

References

Doc
Blog
Describe
graphite directory
(/root/graphite/lib/python2.7/site-packages/opt/graphite in virtualenv, otherwise /opt/graphite)
build
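bin executable directory, including carbon-cache.py, carbon-relay.py, validate-storage-schemas.py, carbon-aggregator.py, carbon-client.py, and other programs
conf configuration file directory
lib library directory
storage data directory, including logs, the whisper database, indexes, rrd data, etc.
webapp webapp file directory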
Github
Resource

Superset

 For reasons of length, this is covered in a separate blog post; see: Apache Superset

Druid Client

 It can provide flow control, permission management, a unified SQL layer, and so on, which makes it easier to offer Druid as a service

References

Architecture

Lambda streaming architecture

References

Data structures

R-tree

References

HyperLogLog

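 Cardinality counting yields an approximate result. For example, when running counting queries such as Count / Distinct Count, a floating-point number is returned as the estimate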
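To illustrate why the result is a floating-point estimate rather than an exact integer, here is a minimal HyperLogLog sketch. It is not Druid's implementation: the hash function (SHA-1 truncated to 64 bits), the register count, and the class name are assumptions chosen for readability.

```python
import hashlib
import math


class HyperLogLog(object):
    """Minimal HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1.0 + 1.079 / self.m)

    def add(self, value):
        # 64-bit hash taken from the first 16 hex digits of SHA-1.
        x = int(hashlib.sha1(str(value).encode("utf-8")).hexdigest()[:16], 16)
        index = x >> (64 - self.p)               # first p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        # Harmonic mean of 2**register over all registers, scaled by alpha * m**2.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / float(zeros))
        return estimate

# Example:
# hll = HyperLogLog()
# for i in range(50000):
#     hll.add("user-%d" % i)
# print(hll.count())  # a float close to, but usually not exactly, 50000
```

With `p=12` the sketch keeps 4096 small registers regardless of how many values are added, and the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%.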

Overall knowledge tree

Druid

Pitfalls

“true” / “false” strings become NULL after being stored as a dimension

Solution

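 Druid itself cannot use boolean values such as true / false as a dimension, so consider storing the strings "true" / "false" as the dimension instead
 However, if a custom Bean object has a property such as String isTimeout = "false", it cannot be converted directly with JSON.toJSONString, because the toJSONString method recognizes the "true" / "false" strings and automatically converts them to the boolean type. Instead, put all the fields into a Map<String, Object> first, and then call JSON.toJSONString on it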

$ bin/plyql --host localhost:8082 -q "select * from log"
┌─────────────────────────────────────────┬───────┬───────────┬─────┬─────┬──────┬──────────────────────────────────────┐
│ __time │ count │ isTimeout │ max │ min │ sum │ uuid │
├─────────────────────────────────────────┼───────┼───────────┼─────┼─────┼──────┼──────────────────────────────────────┤
│ Wed Aug 02 2017 17:35:00 GMT+0800 (CST) │ 4 │ NULL │ 860 │ 860 │ 3440 │ 4621a23d-8270-4bc3-948a-f577b460d72b │
│ Wed Aug 02 2017 17:42:00 GMT+0800 (CST) │ 1 │ NULL │ 860 │ 860 │ 860 │ 4621a23d-8270-4bc3-948a-f577b460d72b │
│ Wed Aug 02 2017 17:44:00 GMT+0800 (CST) │ 1 │ NULL │ 860 │ 860 │ 860 │ 4621a23d-8270-4bc3-948a-f577b460d72b │
│ Wed Aug 02 2017 18:03:00 GMT+0800 (CST) │ 3 │ NULL │ 0 │ 0 │ 0 │ 85f030bd-d737-4863-9af1-e6fd8bd3b15c │
│ Wed Aug 02 2017 19:01:24 GMT+0800 (CST) │ 2 │ NULL │ 0 │ 0 │ 0 │ 85f030bd-d737-4863-9af1-e6fd8bd3b15c │
│ Wed Aug 02 2017 19:09:49 GMT+0800 (CST) │ 1 │ false │ 0 │ 0 │ 0 │ ba11de00-7faf-4eaf-a8ea-1cf3c5033de5 │
└─────────────────────────────────────────┴───────┴───────────┴─────┴─────┴──────┴──────────────────────────────────────┘

Pool was initialized with limit = 0

Description

# The following error is reported when executing a RESTful query
{
"error" : "Unknown exception",
"errorMessage" : "Pool was initialized with limit = 0, there are no objects to take.",
"errorClass" : "java.lang.IllegalStateException",
"host" : "druid01:8101"
}

Solution

# Check whether broker, historical, and middleManager are all configured with `druid.processing.numMergeBuffers`
$ cd /home/druid/software/druid/conf/druid
$ cat broker/runtime.properties historical/runtime.properties middleManager/runtime.properties | grep numMergeBuffers
druid.processing.numMergeBuffers=4
druid.processing.numMergeBuffers=4
druid.processing.numMergeBuffers=4
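The `cat ... | grep` check above does not show which file a line came from, so a missing setting in one service is easy to overlook. The sketch below reports the value per service. The function names are hypothetical, and the parser is a simplification that only handles plain `key=value` lines (no `:` separators or line continuations).

```python
import os


def read_property(path, key):
    """Return the value of key in a Java-style .properties file, or None."""
    value = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or line.startswith("!"):
                continue  # skip blank lines and comments
            name, sep, raw = line.partition("=")
            if sep and name.strip() == key:
                value = raw.strip()  # the last definition wins
    return value


def check_merge_buffers(conf_dir, services=("broker", "historical", "middleManager")):
    """Map each service to its numMergeBuffers value (None if unset or file missing)."""
    key = "druid.processing.numMergeBuffers"
    result = {}
    for service in services:
        path = os.path.join(conf_dir, service, "runtime.properties")
        result[service] = read_property(path, key) if os.path.isfile(path) else None
    return result

# Example (directory taken from this post):
# print(check_merge_buffers("/home/druid/software/druid/conf/druid"))
```

Any service mapped to `None` is a candidate for the "Pool was initialized with limit = 0" error.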

References

Community follow-up

 See: "Open Source Community"

Resources

Doc

Blog

Druid

Kylin

Group

Wiki

Book
