How to Become an Apache PMC Member
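About This Article
This article mainly records my process of contributing code to open-source projects such as Apache Druid / Apache Eagle / Apache Flink / Apache HBase / Apache Kafka / Apache Superset / Apache ZooKeeper & Apache Curator / TensorFlow / Alibaba DataX, doing what little I can.
At the end, I summarize some lessons learned, in the hope of helping others who also love open source and want to become PMC members.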
Open Source Contribution Record
Apache / Druid
Pull Request
Title | Status | Create Date | Merge Date |
---|---|---|---|
Some code refactor for better performance of Avro-Extension #4092 | Merged | 2017-03-22 | 2017-04-25 |
Explain Avro's unnecessary EOFException (#4098) #4100 | Merged | 2017-03-23 | 2017-03-24 |
Improve collection related things that reusing a immutable object instead of creating a new object #4135 | Merged | 2017-03-30 | 2017-05-17 |
Increment the resource of JVM and the number of threads in Travis instead of default #4139 | Closed | 2017-03-31 | |
Update outdated RLE paper and improve some code refactoring #4286 | Merged | 2017-05-17 | 2017-05-19 |
Fix bug in SegmentAnalyzer.analyzeComplexColumn() #5939 #5954 | Merged | 2018-07-07 | 2018-07-10 |
Remove redundant type parameters and enforce some other style and inspection rules #5980 | Merged | 2018-07-09 | 2018-07-28 |
Add IRC#druid-dev shields.io into README #6002 | Merged | 2018-07-13 | 2018-07-22 |
Add the ‘--fail-at-end’ option to maven command for ‘strictly compiled’ part #6078 | Merged | 2018-07-31 | 2018-08-01 |
Fix missing exception handling as part of io.druid.java.util.http.client.netty.HttpClientPipelineFactory #6090 | Merged | 2018-08-02 | 2018-08-11 |
Make time-related variables more readable #6158 | Merged | 2018-08-12 | 2018-08-22 |
Add maven.exec.xxx.skip option for exec-maven-plugin #6162 | Merged | 2018-08-13 | 2018-09-25 |
Fix assertionError at testCheckpointForInactiveTaskGroup in KafkaSupervisorTest #6192 | Merged | 2018-08-19 | 2018-08-22 |
Fix wrong counter getFailedSendingTimeCounter method #6793 | Merged | 2019-01-02 | 2019-01-02 |
In addition to special cases such as avoiding deadlock, make sure that the current thread has got the connectionLock object lock when accessing the statements object #6903 | Merged | 2019-01-23 | 2019-01-28 |
For performance reasons, use java.util.Base64 instead of Base64 in Apache Commons Codec and Guava #6913 | Merged | 2019-01-25 | 2019-01-26 |
Add “REVERSE” / “REPEAT” / “RIGHT” / “LEFT” functions #7334 | Merged | 2019-03-23 | 2019-04-10 |
Fix broken links in api-reference.md #7670 | Merged | 2019-04-16 | 2019-04-16 |
Bump httpcore from 4.4.4 to 4.4.11 #7870 | Merged | 2019-06-12 | 2019-08-10 |
Optimize images by ImgBot #7873 | Merged | 2019-06-12 | 2019-06-25 |
Bump jmh from 1.19 to 1.21 #7876 | Merged | 2019-06-13 | 2019-06-17 |
Bump RoaringBitmap from 0.8.0 to 0.8.6 #7906 | Merged | 2019-06-17 | 2019-06-17 |
Bump commons-cli from 1.2 to 1.3.1 #7966 | Merged | 2019-06-26 | 2019-06-26 |
Bump jaxb-api from 2.3.0 to 2.3.1 #7978 | Merged | 2019-06-27 | 2019-06-27 |
Bump commons-validator from 1.4.0 to 1.5.1 #7987 | Merged | 2019-06-28 | 2019-06-28 |
Bump commons-codec from 1.7 to 1.12 #7995 | Merged | 2019-06-29 | 2019-06-29 |
Bump rhino from 1.7R5 to 1.7.11 #8008 | Merged | 2019-07-02 | 2019-08-10 |
Bump JUnitParams from 1.0.4 to 1.1.1 #8017 | Merged | 2019-07-03 | 2019-08-21 |
Hide descriptive comments in pull_request_template.md #8313 | Merged | 2019-08-15 | 2019-08-16 |
Fix missing format argument #8331 | Merged | 2019-08-19 | 2019-08-19 |
Fix resource leak #8337 | Merged | 2019-08-19 | 2019-08-20 |
Add grade shield #8344 | Merged | 2019-08-20 | 2019-08-21 |
Fix unused format argument #8345 | Merged | 2019-08-20 | 2019-08-21 |
Fix result of division may be truncated #8355 | Merged | 2019-08-21 | 2019-09-04 |
Reduce the size of images with lossless compression #8358 | Merged | 2019-08-21 | 2019-08-22 |
Suppress index-out-of-bounds warning from LGTM about loop unrolling #8380 | Merged | 2019-08-23 | 2019-09-04 |
Fix inconsistent equals and hashCode #8381 | Merged | 2019-08-23 | 2019-09-04 |
Fix alerts from LGTM about python files #8383 | Merged | 2019-08-23 | 2019-09-07 |
Add docker shield #8403 | Merged | 2019-08-26 | 2019-08-27 |
Fix missing space in string literal and spurious Javadoc @param tags from LGTM #8491 | Merged | 2019-09-09 | 2019-09-16 |
Fix resource leaks and suppress an incorrect LGTM alert #8589 | Merged | 2019-09-25 | 2019-10-11 |
Fix NPE for subquery with limit #8775 | Merged | 2019-10-29 | 2019-12-18 |
Exclude .asf.yaml from the configuration of the rat plugin #9088 | Merged | 2019-12-21 | 2019-12-24 |
Issues
All issues in Druid
Open / Close / Total : 0 / 5 / 5
Apache / Eagle
Pull Request
Title | Status | Create Date | Merge Date |
---|---|---|---|
[EAGLE-981] GC overhead limit exceeded #896 | Merged | 2017-03-30 | 2017-04-18 |
[EAGLE-982] The log length has exceeded the limit of 4 MB in Travis #897 | Merged | 2017-03-30 | 2017-04-18 |
[EAGLE-992] HBase Naming that unify Hbase and HBase into HBase #905 | Merged | 2017-04-06 | 2017-04-18 |
[EAGLE-1009] Fix return inside finally block may result in losing exception #920 | Merged | 2017-04-18 | 2017-04-19 |
[MINOR] Fix some project construction problems about test sources #922 | Closed | 2017-04-19 | |
[EAGLE-1012] Some language level problems #925 | Closed | 2017-04-19 | |
[MINOR] Fxi sprk/egle issues in JPM module #943 | Closed | 2017-06-07 | |
Issues
All issues in Eagle
Open / Close / Total : 4 / 5 / 9
Apache / Flink
Pull Request
Title | Status | Create Date | Merge Date |
---|---|---|---|
FLINK-6868[build] Using scala.binary.version for flink-streaming-scala in Cassandra Connector #4087 | Merged | 2017-06-08 | 2017-06-25 |
FLINK-7369: Add more information for Key group index out of range of key group range exception #4474 | Merged | 2017-08-04 | 2017-08-04 |
Issues
All issues in Flink
Open / Close / Total : 0 / 3 / 3
Tips: How to Contribute to Flink
Apache / HBase
Pull Request
Title | Status | Create Date | Merge Date |
---|---|---|---|
HBASE-18470: Fix a bug in RetriesExhaustedWithDetailsException#getDesc describe #56 | Merged | 2017-07-28 | 2017-08-03 |
Apache / Kafka
Pull Request
Title | Status | Create Date | Merge Date |
---|---|---|---|
[MINOR] Improve runtime / storage / metrics / config parts #4525 | Merged | 2018-02-05 | 2018-02-14 |
MINOR: Catch null pointer exception for empty leader URL when assignment is null #4798 | Merged | 2018-03-30 | 2018-11-17 |
MINOR: Remove magic number and extract Pattern instance from method as class field #4799 | Merged | 2018-03-30 | 2018-04-09 |
Apache / Superset
Pull Request
Title | Status | Create Date | Merge Date |
---|---|---|---|
little code refactor in models.py #2124 | Merged | 2017-02-07 | 2017-02-07 |
Fix werkzeug instance was created twice in Debug Mode (#2135) #2136 | Merged | 2017-02-08 | 2017-02-14 |
Fix ExtDeprecationWarning (#2137) #2138 | Merged | 2017-02-08 | 2017-02-09 |
Some code refactoring #2139 | Merged | 2017-02-08 | 2017-02-09 |
Using the time zone with specific name for querying Druid #2143 | Merged | 2017-02-09 | 2017-02-10 |
Aggregate data outside of topN into a single category #2176 | Closed | 2017-02-15 | |
Fix UNKNOWN option in setup.py #2199 | Closed | 2017-02-17 | |
fix timezone issues in slices (#2354) #2370 | Closed | 2017-03-08 | |
Fix unicode issues (#2308 #2282) #2401 | Merged | 2017-03-14 | 2017-03-15 |
Fix rst grammar problems #4116 | Merged | 2017-12-26 | 2017-12-26 |
Fix invaild gitter url #4125 | Merged | 2017-12-27 | 2018-01-05 |
Hanization #4126 | Merged | 2017-12-27 | 2018-01-13 |
Issues
All issues in Superset
Open / Close / Total : 0 / 21 / 21
Apache / ZooKeeper
Pull Request
Title | Status | Create Date | Merge Date |
---|---|---|---|
ZOOKEEPER-2784: Add same sid config problem check #257 | Closed | 2017-05-18 | |
ZOOKEEPER-2789: Reassign ZXID for solving 32bit overflow problem #262 | Open | 2017-05-23 | |
ZOOKEEPER-2815: 1. Using try clause to close resource; 2. Others code refactoring for PERSISTENCE module #283 | Merged | 2017-06-16 | 2017-06-26 |
ZOOKEEPER-2816: Code refactoring for ZK_SERVER module #288 | Merged | 2017-06-20 | 2017-06-26 |
ZOOKEEPER-2817: Using Collections.singletonList instead of Arrays.asList(oneElement) #290 | Closed | 2017-06-22 | |
ZOOKEEPER-2821: 1. Fix spell issues; 2. Remove unnecessary boxing / unboxing; 3. Simplify return clause; 4. … #293 | Closed | 2017-06-27 | |
ZOOKEEPER-2822: Wrong ObjectName about MBeanServer in JMX module #294 | Merged | 2017-06-27 | 2018-11-27 |
ZOOKEEPER-2823: 1. Fix spell issues; 2. Standardize StringBuilder#append usage; 3. Using try clause for … #295 | Closed | 2017-06-28 | |
ZOOKEEPER-2824: FileChannel#size info should be added to FileTxnLog#commit to solve the confuse that reason is too large log or too busy disk I/O #296 | Merged | 2017-06-28 | 2018-02-02 |
ZOOKEEPER-2825: 1. Remove unnecessary import; 2. contains instead of indexOf > -1 for more readable … #297 | Merged | 2017-06-29 | 2018-01-31 |
ZOOKEEPER-2826: Code refactoring for CLI module #298 | Merged | 2017-06-29 | 2018-01-31 |
ZOOKEEPER-2835: Run server with -XX:+AlwaysPreTouch jvm flag #301 | Closed | 2017-07-04 | |
ZOOKEEPER-2837: Add a special START_SERVER_JVMFLAGS option only for start command to distinguish … #302 | Closed | 2017-07-04 | |
ZOOKEEPER-2840: Should using System.nanoTime() ^ this.hashCode() for StaticHostProvider #303 | Open | 2017-07-05 | |
ZOOKEEPER-2892: Improve lazy initialize and close stream for PrepRequestProcessor #361 | Merged | 2017-09-06 | 2018-02-07 |
Issues
All issues in ZooKeeper
Open / Close / Total : 10 / 8 / 18
Apache / Curator
Pull Request
Title | Status | Create Date | Merge Date |
---|---|---|---|
CURATOR-523: Fix ByteBuffer’s compatibility issues #321 | Merged | 2019-08-01 | 2019-08-23 |
Others
- Alibaba DataX
- Aliyun TSDB Java SDK
- Apache Atlas
- Apache IoTDB
- Apache Pinot
- Helm Charts
- CrateDB
- Elasticsearch
- Grpc-Java
- Hexo Theme Next
- HttpRunner
- ID-CNN-CWS
- Kafka Connect UI
- Lucene
- OpenTSDB
- Snappy-Java
- TensorFlow
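Motivation
One thing should be said up front: do not become a PMC member just for the sake of being one. The starting point should be that you believe in open source, love it, and enjoy working with developers around the world to build something meaningful. In other words, hunting for guides or shortcuts before you have contributed a single line of code is not the way to go. Having the goal is fine, but under normal circumstances becoming a PMC member does not happen overnight. Only after you have put in enough effort will the invitation from the ASF arrive as a natural result.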
Tips
That said, it would be wrong to claim there are no techniques at all. Here are some points to keep in mind so that you take as few detours as possible:
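Choosing a Project
The first step is of course to find a project that suits you; one related to your interests or your work is ideal. Beyond that, judge whether the project is worth a large investment of your energy from angles such as: whether a commercial company is driving it behind the scenes, whether it has PMC members who have been active over the long term, whether it has a core competitive edge over similar products, whether its architecture has major flaws, and whether it is welcoming to new contributors.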
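Contributing Code
In principle, every code submission should start with an issue that opens a discussion. On the one hand, this lets other contributors in the community know what you intend to do; on the other, you can hear other people's views, including whether the change is necessary and whether there is a better way to implement it. If you are just starting to contribute to a project, picking up existing issues is a good way to begin. Well-run open-source projects also tag issues with labels such as Contributions Welcome.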
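Code Review
Many people assume that only committers can review other people's code. That is not the case: any valuable comment you can think of is worth raising. Reviewing others' code also teaches you how other people solve problems. In contested areas especially, you can see different contributors approach the same problem from different angles, which does a lot to broaden your own thinking.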
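Things to Watch When Submitting Code
General Tips
- Keep the diff minimal: a PR should not include unrelated code formatting (imports, indentation, etc.).
- Keep the PR focused: do not make unrelated code optimizations, especially to classes that have nothing to do with the change (open a separate PR for those).
- Keep the code style consistent: turn off the IDE feature that automatically collapses imports into a wildcard `*` (for example, `import java.io.*`).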
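As a rough illustration of the advice about wildcard imports, a small pre-submit check could flag them in Java sources before they reach a PR. The sketch below is hypothetical: `find_wildcard_imports` is an illustrative helper, not part of any project's tooling, and it relies on a simple line-based regular expression that does not understand comments or string literals.

```python
import re

# Matches Java import lines that end in ".*", for example
# "import java.io.*;" or "import static org.junit.Assert.*;".
# This is a line-based heuristic, not a Java parser.
WILDCARD_IMPORT = re.compile(r"^\s*import\s+(static\s+)?[\w.]+\.\*\s*;")


def find_wildcard_imports(java_source):
    """Return (line_number, stripped_line) for each wildcard import found."""
    hits = []
    for number, line in enumerate(java_source.splitlines(), start=1):
        if WILDCARD_IMPORT.match(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample source used only for demonstration.
sample = """package demo;

import java.io.*;
import java.util.List;
import static org.junit.Assert.*;
"""

wildcards = find_wildcard_imports(sample)
for number, line in wildcards:
    print(f"line {number}: {line}")
```

Running a check like this over the files touched by a PR is one way to catch an IDE that silently rewrote imports; the project's own style checker, where one exists, remains the authority.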
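Project-Specific Conventions
Apache Druid
Follow the Apache Druid coding rules, for example splitting parameter lists across lines.
Apache Eagle
Apache Eagle requires the commits in a PR to be squashed, among other things.
Apache HBase
Apache HBase maintains its code mainly through patches plus Jira. The GitHub repository is only a code mirror, so submitting code does not require creating a PR.
Apache Superset
Python projects such as Apache Superset and TensorFlow require attention to the PEP conventions.
Apache ZooKeeper
Just follow the official contribution guide: How to Contribute to ZooKeeper
Elasticsearch
Beyond what the CONTRIBUTING.md document covers, there are also some unwritten rules. For example, Elasticsearch committers favor seemingly redundant forms such as `varA == false`, on the grounds that they reduce the risk of a `!` being overlooked.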
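Good Habits
- Before submitting, update the master branch first and use `git rebase -i master` to place your own commits on top (the main branch may not be called master; Kafka's main branch, for example, is trunk).
- Make sure your code is covered by unit tests. If the existing test cases do not cover it, write the corresponding unit tests yourself.
- When submitting a performance-improvement PR, write your own benchmark and post the results.
- When submitting a PR, prefix the title with a marker such as `[JIRA]` (the corresponding Jira issue number), `[MINOR]` (a small change), `[WIP]` (work in progress), or `[Benchmarking]` (performance testing in progress) to help committers process PRs more efficiently.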
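To make the title-prefix convention concrete, here is a minimal sketch of how such prefixes could be recognized. Everything in it is an assumption for illustration: `classify_title`, the tag table, and the Jira-key pattern are hypothetical, and each project defines its own accepted prefixes.

```python
import re

# Illustrative mapping only; real projects define their own conventions.
FIXED_TAGS = {
    "MINOR": "small change",
    "WIP": "work in progress",
    "BENCHMARKING": "performance testing in progress",
}

# Assumed shape of a Jira key, such as EAGLE-981 or ZOOKEEPER-2784.
JIRA_KEY = re.compile(r"^[A-Z][A-Z0-9]*-\d+$")

# A leading "[TAG]" followed by the rest of the title.
LEADING_TAG = re.compile(r"^\[([^\]]+)\]\s*(.*)$")


def classify_title(title):
    """Return (kind, remainder) for a PR title with an optional [TAG] prefix."""
    cleaned = title.strip()
    match = LEADING_TAG.match(cleaned)
    if not match:
        return ("untagged", cleaned)
    tag, rest = match.group(1).strip(), match.group(2)
    if tag.upper() in FIXED_TAGS:
        return (FIXED_TAGS[tag.upper()], rest)
    if JIRA_KEY.match(tag):
        return ("jira issue " + tag, rest)
    return ("unknown tag", rest)


for example in (
    "[EAGLE-981] GC overhead limit exceeded",
    "[MINOR] Improve runtime / storage / metrics / config parts",
    "Fix resource leak",
):
    print(classify_title(example))
```

The first two sample titles come from the tables earlier in this article; a committer skimming a long PR queue gets the same signal from the prefix that this function extracts.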
Tips: For Git-related operations, see my other blog post, 《Git 高级玩法》 (Advanced Git Techniques)
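So What Are the Benefits of Becoming an Apache PMC Member?
You get an @apache.org email address
Your comments in any Apache project are marked with a Member badge
You can use every product in the IDEA suite for free
(Screenshot of IntelliJ IDEA™)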