What is Apache Drill?
Apache Drill™ is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google’s Dremel.
(Image source: the Pexels™ website; confirmed free of copyright restrictions)
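Pros and Cons
Advantages
Supports user-defined nested data structures
Compatible with Hive (including Hive UDFs, with support for custom UDFs)
High-performance, low-latency SQL queries
Supports multiple data sources through plugins (including Apache Kafka, Apache HBase, Apache Hive, OpenTSDB, S3, and others)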
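UDF (User Defined Function): a user-defined scalar function that operates on a single row at a time
UDAF (User Defined Aggregation Function): a user-defined aggregate function that operates on multiple rows and returns one aggregated value
UDTF (User Defined Table Generating Function): a user-defined table-generating function that takes one row as input and can emit multiple rows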
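The three function kinds differ only in how many rows go in and how many come out. The sketch below illustrates those shapes in plain Python. It is not Drill or Hive UDF code (real UDFs for those engines are written in Java against their own interfaces), and the function names `udf_upper`, `udaf_sum`, and `udtf_split` are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Iterator


def udf_upper(value: str) -> str:
    """UDF shape: one input row produces exactly one output value."""
    return value.upper()


def udaf_sum(values: Iterable[float]) -> float:
    """UDAF shape: many input rows collapse into one aggregated value."""
    total = 0.0
    for v in values:
        total += v
    return total


def udtf_split(value: str, sep: str = ",") -> Iterator[str]:
    """UDTF shape: one input row expands into zero or more output rows."""
    for part in value.split(sep):
        yield part


# Hypothetical sample rows, used only to show the row counts in and out.
rows = ["a,b", "c"]
scalar = [udf_upper(r) for r in rows]                # 2 rows in, 2 rows out
aggregated = udaf_sum([1.0, 2.0, 3.5])               # 3 rows in, 1 value out
exploded = [p for r in rows for p in udtf_split(r)]  # 2 rows in, 3 rows out
```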
Disadvantages
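Comparison
Apache Drill vs Presto
+------------------+--------------------------+-----------------------+
| Aspect           | Apache Drill             | Presto                |
+------------------+--------------------------+-----------------------+
| Target domain    | Non-relational databases | Distributed databases |
| Enterprise-grade | ✔                        | ✔                     |
| Success stories  | MapR™                    | Teradata™             |
| Deployment       | More cumbersome          | Quicker               |
+------------------+--------------------------+-----------------------+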
Hands-on
    $ wget https://mirror.bit.edu.cn/apache/drill/drill-1.18.0/apache-drill-1.18.0.tar.gz
    $ tar zxvf apache-drill-1.18.0.tar.gz
    $ ln -s apache-drill-1.18.0 drill
    $ cd drill
Start
    Apache Drill 1.18.0
    "In Drill We Trust."
List all tables
    +-----------+--------------------+----------------------+--------------+---------+----------+------------+-----------+---------------------------+----------------+
    | TABLE_CAT | TABLE_SCHEM        | TABLE_NAME           | TABLE_TYPE   | REMARKS | TYPE_CAT | TYPE_SCHEM | TYPE_NAME | SELF_REFERENCING_COL_NAME | REF_GENERATION |
    +-----------+--------------------+----------------------+--------------+---------+----------+------------+-----------+---------------------------+----------------+
    | DRILL     | information_schema | CATALOGS             | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | information_schema | COLUMNS              | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | information_schema | FILES                | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | information_schema | PARTITIONS           | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | information_schema | SCHEMATA             | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | information_schema | TABLES               | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | information_schema | VIEWS                | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | boot                 | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | connections          | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | drillbits            | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | functions            | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | internal_options     | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | internal_options_old | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | memory               | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | options              | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | options_old          | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | profiles             | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | profiles_json        | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | threads              | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    | DRILL     | sys                | version              | SYSTEM TABLE | null    | null     | null       | null      | null                      | null           |
    +-----------+--------------------+----------------------+--------------+---------+----------+------------+-----------+---------------------------+----------------+
Query
    apache drill> select * from cp.`employee.json` limit 2;
    +-------------+-----------------+------------+-----------+-------------+--------------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
    | employee_id | full_name       | first_name | last_name | position_id | position_title     | store_id | department_id | birth_date | hire_date             | salary  | supervisor_id | education_level | marital_status | gender | management_role   |
    +-------------+-----------------+------------+-----------+-------------+--------------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
    | 1           | Sheri Nowmer    | Sheri      | Nowmer    | 1           | President          | 0        | 1             | 1961-08-26 | 1994-12-01 00:00:00.0 | 80000.0 | 0             | Graduate Degree | S              | F      | Senior Management |
    | 2           | Derrick Whelply | Derrick    | Whelply   | 2           | VP Country Manager | 0        | 1             | 1915-07-03 | 1994-12-01 00:00:00.0 | 40000.0 | 1             | Graduate Degree | M              | M      | Senior Management |
    +-------------+-----------------+------------+-----------+-------------+--------------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
    2 rows selected (0.281 seconds)
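The same query can also be submitted from a script through Drill's REST API instead of the interactive shell. The sketch below assumes a Drillbit whose web endpoint listens on the default `http://localhost:8047`. The helper names `build_drill_request` and `run_query` are hypothetical, and the post itself only shows the shell.

```python
import json
import urllib.request


def build_drill_request(sql: str,
                        base_url: str = "http://localhost:8047") -> urllib.request.Request:
    """Build the POST request for Drill's /query.json endpoint (assumed default port)."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str, base_url: str = "http://localhost:8047") -> dict:
    """Send the query and return the decoded JSON response (needs a running Drillbit)."""
    with urllib.request.urlopen(build_drill_request(sql, base_url)) as resp:
        return json.load(resp)


# Building the request does not touch the network; only run_query() does.
request = build_drill_request("SELECT * FROM cp.`employee.json` LIMIT 2")
body = json.loads(request.data)
```

With a Drillbit running, `run_query("SELECT * FROM cp.`employee.json` LIMIT 2")` should return a dictionary describing the result set. Authentication and error handling are left out of this sketch.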
Exit
Pitfalls
Solution
    SELECT to_timestamp(CONCAT(`timestamp`, ' +0800'), 'YYYY-MM-dd HH:mm:ss.SSS Z') AS tm
    FROM `yuzhouwan`.`blog`
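The query above appears to work by appending an explicit `+0800` offset to the raw string, so that the parser reads it as UTC+8 local time instead of falling back to a default zone. The same idea can be sketched in Python. The helper name `parse_with_offset` and the sample value are hypothetical, and Python's `strptime` directives differ from the Joda-style pattern used in the SQL.

```python
from datetime import datetime, timezone


def parse_with_offset(raw: str, offset: str = "+0800") -> datetime:
    """Append a fixed UTC offset to a zone-less timestamp string, then parse it."""
    # %f accepts the millisecond digits; %z accepts an offset such as +0800.
    return datetime.strptime(f"{raw} {offset}", "%Y-%m-%d %H:%M:%S.%f %z")


# Hypothetical sample value in the 'YYYY-MM-dd HH:mm:ss.SSS' layout.
local = parse_with_offset("2020-11-11 08:00:00.000")
as_utc = local.astimezone(timezone.utc)
```

Because the offset is part of the parsed value, converting to UTC shifts the clock time back by eight hours rather than treating the original string as if it were already UTC.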
Resources
Doc