incubator-doris Doris Roadmap 2022

Posted by pxy2qtax on 2022-04-22 in Java
n1bvdmb6

n1bvdmb61#

The following is the Roadmap for the Doris community in 2022.
The plan covers code features, documentation, community building, and more. It includes work that is yet to be developed, work already under development, and work that is complete but requires ongoing optimization.
The plan is currently under discussion, so if you have comments or suggestions on any aspect of the plan or beyond, please feel free to leave a comment or send an email to dev@doris.apache.org.

We will gradually create issues or Jira tickets for each direction of the plan to describe and track progress in detail. Developers who wish to contribute are also welcome to create issues directly and link them to this one (just leave a comment).

The directions marked (Good First Issue) are relatively independent modules, which makes them suitable as starter tasks for developers who are new to Doris. If you are interested in one of these directions, please contact us at dev@doris.apache.org or comment under this issue, and we will provide detailed guidance, help, and discussion.

The directions marked with (Q1) are the current work to be completed in the first quarter of 2022. We will update the schedule and progress of other directions gradually.

The directions marked (Done & Optimizing) are complete but need continuous optimization, such as improvements to ease of use, additional features, and additional documentation.

We encourage developers to discuss anything on the dev mailing list. To subscribe to the mailing list, please refer to How to subscribe.

Features

  • #7571

  • Extensible new query optimizer framework

  • #6370

  • Standard test set support and performance enhancements

  • TPC-DS feature pass rate 100%
  • TPC-H performance enhancements
  • #7572

  • Pipeline execution engine

  • Algorithm Concurrency Control and Resource Control

  • #7573

  • #7570

  • Map

  • Struct

  • #7574

Provides Schemaless semantics for fast analysis of semi-structured data.

Supports cold data storage to object storage at partition granularity with remote access capabilities and local Cache acceleration.

Doris' current "materialized view" is more of a "materialized index" concept. Doris will later implement a true Materialized View to support full and incremental construction of single and multi-table views.

Provide Kudu-like data update support.

Support a new UDF framework to solve the problems of the existing C++ framework: UDFs are difficult to write, poorly isolated, and poorly compatible.

Performance Optimization

  • #7580 (Q1)

  • Query layer vectorization

  • Storage level vectorization

  • Vectorization function supplementation

  • Query layer storage layer arithmetic unification

  • Import Vectorization

  • Json Parsing Optimization (Good First Issue)

  • #7551

  • #7743

Optimize the performance of compaction tasks and try to refactor the compaction logic. For example, have only one replica perform the compaction and then sync the result to the other replicas.
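The single-replica idea can be illustrated with a toy model. This is only a sketch under assumptions: the `Replica` class, its `rowsets` field, and the `sync_from` method are hypothetical names made up for illustration and do not describe how Doris stores or ships data. It shows one replica paying the merge cost while the others copy the merged result.

```python
import heapq


class Replica:
    """Hypothetical replica holding several sorted rowsets (lists of keys)."""

    def __init__(self, name, rowsets):
        self.name = name
        self.rowsets = [list(r) for r in rowsets]
        self.merges_done = 0  # how many times this replica ran a merge

    def compact(self):
        # Merge all sorted rowsets into one sorted rowset (the expensive step).
        merged = list(heapq.merge(*self.rowsets))
        self.rowsets = [merged]
        self.merges_done += 1
        return merged

    def sync_from(self, leader):
        # Copy the already-compacted rowsets instead of merging again.
        self.rowsets = [list(r) for r in leader.rowsets]


rowsets = [[1, 4, 9], [2, 3, 10], [5, 6]]
replicas = [Replica(f"replica-{i}", rowsets) for i in range(3)]
leader, followers = replicas[0], replicas[1:]

leader.compact()
for follower in followers:
    follower.sync_from(leader)

for r in replicas:
    print(r.name, r.rowsets, "merges:", r.merges_done)
```

In this model the merge runs once instead of three times; the trade-off is the extra transfer of the compacted data to the followers.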

Stability and Observability

Solve the problems of inaccurate memory prediction and OOM, and improve memory observability through memory management at the global, thread, and task levels.
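One common way to get accounting at several levels is a tree of trackers in which every reservation is charged to a node and all of its ancestors. The following is a minimal sketch of that idea, not the Doris implementation: `MemTracker`, `MemLimitExceeded`, and the labels are hypothetical, and the code is not thread-safe.

```python
from typing import Optional


class MemLimitExceeded(Exception):
    """Raised when a reservation would push some tracker over its limit."""


class MemTracker:
    """Hypothetical hierarchical tracker: usage propagates to all ancestors."""

    def __init__(self, label: str, limit: Optional[int] = None,
                 parent: Optional["MemTracker"] = None):
        self.label = label
        self.limit = limit      # None means "no limit at this level"
        self.parent = parent
        self.used = 0
        self.peak = 0

    def _chain(self):
        # Yield this tracker and then every ancestor up to the root.
        node = self
        while node is not None:
            yield node
            node = node.parent

    def consume(self, nbytes: int) -> None:
        # Check every level first so a failure leaves all counters unchanged.
        for node in self._chain():
            if node.limit is not None and node.used + nbytes > node.limit:
                raise MemLimitExceeded(
                    f"{node.label}: {node.used} + {nbytes} > {node.limit}")
        for node in self._chain():
            node.used += nbytes
            node.peak = max(node.peak, node.used)

    def release(self, nbytes: int) -> None:
        for node in self._chain():
            node.used -= nbytes


process = MemTracker("process", limit=100)
task = MemTracker("task:query-1", limit=60, parent=process)
thread = MemTracker("thread:scan-0", parent=task)

thread.consume(50)
print(process.used, task.used, thread.used)
```

Because limits are checked before any counter changes, a rejected reservation fails early at the task level instead of surfacing later as a process-wide OOM, and the per-level `used` and `peak` values give the observability the item asks for.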

Provides fine-grained IO speed limit, priority scheduling, etc. through global IO management.
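A speed limit combined with priority scheduling is often built from a token bucket plus a priority queue. The sketch below assumes that design purely for illustration; `TokenBucket`, `IoScheduler`, and the request names are hypothetical and are not taken from Doris.

```python
import heapq
import itertools
import time


class TokenBucket:
    """Allows `rate` bytes per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def try_acquire(self, nbytes):
        # Refill according to elapsed time, capped at the burst capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


class IoScheduler:
    """Dispatches queued requests in priority order (lower number first)."""

    def __init__(self, bucket):
        self.bucket = bucket
        self._queue = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order per priority

    def submit(self, priority, nbytes, name):
        heapq.heappush(self._queue, (priority, next(self._seq), nbytes, name))

    def dispatch(self):
        # Start requests until the head of the queue no longer fits the budget.
        # Note: a request larger than the burst capacity would never start.
        started = []
        while self._queue:
            _, _, nbytes, name = self._queue[0]
            if not self.bucket.try_acquire(nbytes):
                break
            heapq.heappop(self._queue)
            started.append(name)
        return started


scheduler = IoScheduler(TokenBucket(rate=100, burst=100))
scheduler.submit(priority=1, nbytes=80, name="compaction")
scheduler.submit(priority=0, nbytes=60, name="query")
print(scheduler.dispatch())
```

Stopping at the first request that does not fit keeps a large high-priority request from being starved by smaller low-priority ones behind it.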

Introduces OpenTelemetry to enhance system internal state observability and unify monitoring data format.

Testing

Refine the FE unit test framework to support multi-node simulation testing of features.

  • BE

Provide a testing framework that makes it easier to write complex unit tests (e.g. ones that require building data) for BE.

Provide a framework for collecting or submitting test cases, to refine and accumulate regression test sets.

Provide a Benchmark testing framework to ensure that adding new code does not impact performance.

Implement ChaosMesh chaos testing to improve the correctness and stability of the system in case of anomalies.

Functional Optimization

  • #7149

  • #7149

  • Support vectorization engine Z-Order

  • Agg/Uniq Key model support for Z-Order

  • Schema Change

  • Lateral View

  • Support bitmap, string, json_array expansion (Done & Optimizing)

  • Array type expansion support

  • Table Function

  • Other features

  • #7680

  • CreateTableAsStmt support decimal
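
For readers unfamiliar with the Z-Order items in the list above: Z-Order sorting generally relies on a Morton code, which interleaves the bits of several column values so that rows close in every dimension tend to sort near each other. The function below is a generic two-column sketch for illustration; `z_value` is a hypothetical name and this does not show how Doris encodes its sort keys.

```python
def z_value(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton code.

    Bit i of x goes to position 2*i, bit i of y goes to position 2*i + 1.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


points = [(1, 1), (0, 1), (1, 0), (0, 0)]
print(sorted(points, key=lambda p: z_value(*p)))
```

Sorting by the interleaved value instead of by `(x, y)` lexicographically lets range filters on either column skip data, which is the usual motivation for Z-Order layouts.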

Deployment and Maintenance

Provides a visual interface for Doris deployment, monitoring, and operations maintenance. Simplifies Doris deployment, scaling, upgrades, task management, status checking, and other operations.

Provides a Helm Charts-based K8S deployment solution.

Peripheral Ecology

Generate and read Parquet files directly, so that Doris data can be read directly by external systems.

  • Data Integration

  • Routine Load support for Canal format parsing (Q1)

  • Flink Connector (Done & Optimizing)

  • Spark Connector (Done & Optimizing)

  • SeaTunnel Sink (Done & Optimizing)

  • SeaTunnel Source

  • DataX (Addax) (Done & Optimizing)

  • #7781

  • Compilation Tools

  • #7590 (Q1)

Community

Refactor the official Doris website to provide best practices, community progress, blog posts, FAQ, and more.

  • Doris Documentation (Good First Issue)

Non-code contributions are as important as code contributions, and the community is very open to developers improving and proofreading the project documentation.

  • #6336
  • Reorganization of the official Doris documentation to improve readability, operability, and guidance.
  • Translation and proofreading of the English documentation.
  • GitHub Actions (Good First Issue)

Introduce more GitHub Actions to help improve the management of the code base. This includes but is not limited to PR autoresponders, tagging, etc. If you have a good Action to recommend, please leave a comment.

ymzxtsji

ymzxtsji2#

For regression and performance testing, we could follow ClickHouse's testing approach. If that is acceptable, I could work on this.

tvz2xvvm

tvz2xvvm3#

Clang compilation is already in progress, see #7451

x8goxv8g

x8goxv8g4#

Could you please start an email thread to discuss the Doris Roadmap 2022?

dfuffjeb

dfuffjeb5#

Support for the Parquet file storage format should be added as well, right?

dauxcl2d

dauxcl2d6#

Please consider supporting cross-version upgrades.

oxalkeyp

oxalkeyp7#

What about supporting AVRO format in LOAD function?

fivyi3re

fivyi3re8#

Looking forward to the push-based pipeline engine @morningman-cmy @yiguolei

8gsdolmq

8gsdolmq9#

Doris Manager:
1. Follow-up Doris Manager upgrade
2. User UI interaction improvement
3. Doris Manager supports Doris automated upgrade

h7wcgrx3

h7wcgrx310#

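Our company already has a regression testing framework. It mainly uses a Groovy DSL to test SQL, run stream load, install TPC-H, and so on. The general usage was shown in a screenshot (image not preserved).
We can contribute it to the community later.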

iovurdzv

iovurdzv11#

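Since there is so much planned, it would be worthwhile to create an RFC directory under the community section and put the design docs for large PRs there. This would help newcomers get up to speed quickly and would also reduce the pressure of PR review.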

pbpqsu0x

pbpqsu0x12#

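Since there is so much planned, it would be worthwhile to create an RFC directory under the community section and put the design docs for large PRs there. This would help newcomers get up to speed quickly and would also reduce the pressure of PR review.

Good idea. Do you have any RFC templates we could use as a reference?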

vsnjm48y

vsnjm48y14#

What about supporting AVRO format in LOAD function?

#7650

c8ib6hqw

c8ib6hqw15#

What about supporting AVRO format in LOAD function?

#7650

Thx for opening an issue.
