The following is the Roadmap for the Doris community in 2022. The plan includes all aspects of code features, documentation, community building, etc. that are to be developed, have already been developed, and have been completed but require ongoing optimization. The plan is currently under discussion, so if you have comments or suggestions on any aspect of the plan or beyond, please feel free to leave a comment or send an email to dev@doris.apache.org.
We will gradually create issues or Jira tickets for each direction of the plan to describe and track progress in detail. Developers who wish to contribute are also welcome to create issues directly and link them to this one (just leave a comment).
The directions marked (Good First Issue) in the plan are relatively independent modules, which makes them well suited as starter tasks for developers who are new to Doris. If you are interested in one of these directions, please contact us at dev@doris.apache.org or under this issue, and we will provide detailed guidance, help and discussion.
The directions marked with (Q1) are the current work to be completed in the first quarter of 2022. We will update the schedule and progress of other directions gradually.
The directions marked (Done & Optimizing) are already completed but need continuous optimization, such as ease-of-use improvements, feature additions, and documentation additions.
We encourage developers to discuss anything on the dev mailing list. To subscribe to the mailing list, please refer to How to subscribe.
Doris' current "materialized view" is more of a "materialized index" concept. Doris will later implement a true Materialized View to support full and incremental construction of single and multi-table views.
Support for the new UDF framework has solved the problems of high writing difficulty, poor isolation, and poor compatibility with existing C++ frameworks.
Optimize the performance of compaction tasks and try to refactor the compaction logic. For example, have only one replica perform the compaction and then sync the result to the other replicas.
Provides a visual interface for Doris deployment, monitoring, and operations maintenance. Simplifies Doris deployment, scaling, upgrades, task management, status checking, and other operations.
Refactor the official Doris website to provide best practices, community progress, blog posts, FAQ, and more.
Doris Documentation (Good First Issue)
Non-code contributions are as important as code contributions, and the community is very open to developers improving and proofreading the project documentation.
Reorganization of the official Doris documentation to improve readability, operability, and guidance.
Translation and proofreading of the English documentation.
GitHub Action (Good First Issue)
Introduce more GitHub Actions to help improve the management of the code base. This includes but is not limited to PR autoresponders, tagging, etc. If you have a good Action to recommend, please leave a comment.
27 answers
n1bvdmb61#
Features
#7571
Extensible new query optimizer framework
#6370
Standard test set support and performance enhancements
#7572
Pipeline execution engine
Algorithm Concurrency Control and Resource Control
#7573
#7570
Map
Struct
#7574
Provides Schemaless semantics for fast analysis of semi-structured data.
Supports cold data storage to object storage at partition granularity with remote access capabilities and local Cache acceleration.
Doris' current "materialized view" is more of a "materialized index" concept. Doris will later implement a true Materialized View to support full and incremental construction of single and multi-table views.
Provide Kudu-like data update support.
#7577
WindowFunnel #8299
#7578
Support for the new UDF framework has solved the problems of high writing difficulty, poor isolation, and poor compatibility with existing C++ frameworks.
UDF #7519
UDAF #8312
UDTF
Java UDF #8389
#7579 (Good First Issue)
#7552
#7650
Add more resource limits
#7129
More builtin function support
#7678
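As a rough illustration of what the WindowFunnel item (#8299) in the list above computes, here is a minimal Python sketch. It assumes ClickHouse-style funnel semantics: the result is the longest ordered chain of steps that fits inside a time window starting at the first step. The function name `window_funnel`, its parameters, and the sample events are hypothetical; this is not Doris's implementation or SQL signature.

```python
def window_funnel(window, events, steps):
    """Return how many funnel steps were reached, in order, within `window`.

    events: list of (timestamp, event_name) tuples (hypothetical input shape).
    steps:  non-empty ordered list of event names that make up the funnel.
    """
    events = sorted(events)
    best = 0
    for i, (t0, name) in enumerate(events):
        if name != steps[0]:
            continue  # a chain can only start at the first funnel step
        level = 1
        for t, n in events[i + 1:]:
            if t - t0 > window or level == len(steps):
                break  # outside the window, or the funnel is already complete
            if n == steps[level]:
                level += 1
        best = max(best, level)
    return best


events = [(0, "login"), (10, "view"), (50, "buy")]
steps = ["login", "view", "buy"]
print(window_funnel(30, events, steps))  # "buy" falls outside the 30-unit window
print(window_funnel(60, events, steps))  # all three steps fit
```

The sketch restarts the chain at every occurrence of the first step and keeps the best result, which is simple but quadratic in the number of events; a real engine would likely do this in a single pass per group.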
Performance Optimization
#7580 (Q1)
Query layer vectorization
Storage level vectorization
Vectorization function supplementation
Query layer storage layer arithmetic unification
Import Vectorization
Json Parsing Optimization (Good First Issue)
#7551
#7743
Optimize the performance of compaction tasks and try to refactor the compaction logic. For example, have only one replica perform the compaction and then sync the result to the other replicas.
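The single-replica compaction idea mentioned above can be sketched as a toy model in Python. It assumes each replica holds the same set of sorted rowsets and that "sync" simply means copying the merged result; the `Replica` class and both function names are hypothetical and do not reflect the Doris storage engine.

```python
import heapq


class Replica:
    """Toy replica holding a list of sorted rowsets (hypothetical model)."""

    def __init__(self, rowsets):
        self.rowsets = [list(r) for r in rowsets]


def compact(rowsets):
    """Merge several sorted rowsets into one sorted rowset."""
    return list(heapq.merge(*rowsets))


def compact_on_one_replica(replicas):
    """Run the merge once on the first replica, then copy the result to all."""
    merged = compact(replicas[0].rowsets)
    for replica in replicas:
        replica.rowsets = [list(merged)]  # the other replicas skip the merge work
    return merged


replicas = [Replica([[1, 4], [2, 3]]) for _ in range(3)]
print(compact_on_one_replica(replicas))
print([r.rowsets for r in replicas])
```

The point of the design is that the merge cost is paid once instead of once per replica, at the price of transferring the merged data to the other replicas.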
Stability and Observability
Solve the problems of inaccurate memory prediction and OOM, and improve memory observability by global + thread + task level memory management.
Provides fine-grained IO speed limit, priority scheduling, etc. through global IO management.
Introduces OpenTelemetry to enhance system internal state observability and unify monitoring data format.
Testing
#7583
FE
Refine the FE single test framework to support multi-node simulation testing of features.
Provide a testing framework to reduce the difficulty of writing complex unit tests (e.g. data builds) for BE.
Provide Case collection or submission framework for refining and accumulating regression test sets.
Provide a Benchmark testing framework to ensure that adding new code does not impact performance.
Implement ChaosMesh chaos testing to improve the correctness and stability of the system in case of anomalies.
Functional Optimization
#7149
#7149
Support vectorization engine Z-Order
Agg/Uniq Key model support for Z-Order
Schema Change
Lateral View
Support bitmap, string, json_array expansion (Done & Optimizing)
Array type expansion support
Table Function
Other features
#7680
CreateTableAsStmt support decimal
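For readers unfamiliar with the Z-Order items in this section: Z-ordering maps several column values to a single sort key so that rows close together in every dimension tend to be stored near each other. A minimal sketch for two non-negative integer columns, assuming plain bit interleaving (a Morton code); `z_order_key` and `bits` are illustrative names, not Doris's implementation.

```python
def z_order_key(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton code."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return key


points = [(1, 1), (0, 1), (3, 5), (1, 0), (0, 0)]
# Sorting by the interleaved key gives the Z-shaped traversal order.
print(sorted(points, key=lambda p: z_order_key(*p)))
```

Sorting on such a key lets range filters on either column skip more data than sorting on one column alone, which is the usual motivation for Z-order layouts.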
Deployment and Maintenance
Provides a visual interface for Doris deployment, monitoring, and operations maintenance. Simplifies Doris deployment, scaling, upgrades, task management, status checking, and other operations.
Provides a Helm Charts-based K8S deployment solution.
Peripheral Ecology
#7588
#6568 (Done & Optimizing)
#7389 (Q1)
Hudi
Parquet File Format Support
Generate and read Parquet files directly so that Doris data can be read directly by external systems.
Data Integration
Routine Load support for Canal format parsing (Q1)
Flink Connector (Done & Optimizing)
Spark Connector (Done & Optimizing)
SeaTunnel Sink (Done & Optimizing)
SeaTunnel Source
DataX (Addax) (Done & Optimizing)
#7781
Compilation Tools
#7590 (Q1)
Community
Refactor the official Doris website to provide best practices, community progress, blog posts, FAQ, and more.
Non-code contributions are as important as code contributions, and the community is very open to developers improving and proofreading the project documentation.
Introduce more GitHub Actions to help improve the management of the code base. This includes but is not limited to PR autoresponders, tagging, etc. If you have a good Action to recommend, please leave a comment.
ymzxtsji2#
For regression and performance testing, we could follow ClickHouse's testing approach. If that is acceptable, I could work on this.
tvz2xvvm3#
Clang compilation is already in progress, see #7451
x8goxv8g4#
Could you please start an email thread to discuss the Doris 2022 Roadmap?
dfuffjeb5#
Support for the Parquet file storage format should be added as well, shouldn't it?
dauxcl2d6#
Please consider supporting cross-version upgrades.
oxalkeyp7#
What about supporting AVRO format in LOAD function?
fivyi3re8#
Looking forward to the push-based pipeline engine @morningman-cmy @yiguolei
8gsdolmq9#
Doris Manager:
1. Follow-up Doris Manager upgrade
2. User UI interaction improvement
3. Doris Manager supports Doris automated upgrade
h7wcgrx310#
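Our company already has a regression testing framework. It mainly uses a Groovy DSL to test SQL, run stream load, install TPC-H, and so on. The rough usage is shown in the figure below.
We can contribute it to the community later.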
iovurdzv11#
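Since there is so much planned work, it would be worth creating an RFC directory under the community section to hold the design docs for large PRs. This would help newcomers get up to speed quickly and would also reduce the pressure of PR review.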
pbpqsu0x12#
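Since there is so much planned work, it would be worth creating an RFC directory under the community section to hold the design docs for large PRs. This would help newcomers get up to speed quickly and would also reduce the pressure of PR review.
Good idea. Do you have any RFC templates we could use as a reference?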
30byixjq13#
This is cockroach's practice.
vsnjm48y14#
What about supporting AVRO format in LOAD function?
#7650
c8ib6hqw15#
What about supporting AVRO format in LOAD function?
#7650
Thx for opening an issue.