Skip to main content
Version: 1.0.x

Data Model

Columnar Storage Design Principles

Typically, transactional databases adopt row-based storage for supporting transactions and high-concurrency read-write operations, while analytical databases use column-based storage to reduce IO and facilitate compression. ByConity employs a columnar storage approach to ensure read-write performance, support transactional consistency, and accommodate large-scale data computation.

Data Layout

Table data is physically divided into multiple parts based on the partition key and stored in a unified distributed file system or cloud storage logical storage path. Each part has limits on data size and the number of rows. Computing groups obtain their corresponding parts based on service node allocation strategies (pre-allocation and real-time allocation).

Part Delta

Initially, part data is constructed as a hybrid row-column storage part data file. As DML, data dictionaries, bitmap indexes, and other construction tasks progress, there may be incremental data for the part. This incremental data can be stored in two ways:

  1. Rewriting the entire part data during each construction.
  2. Generating incremental data and asynchronously merging it into a larger part file in the background.

The first approach can have a significant impact on the overall cluster's availability. Therefore, we primarily adopt the second approach, considering the following two points:

  • Each DML, data dictionary, or other construction task may involve full IO operations on all parts of the table, which can be costly.
  • Construction times can be relatively long, causing DML and other operations to take a considerable amount of time to complete, which is not user-friendly.

Part File

Part data typically consists of two components: metadata and actual data.

  • Metadata: Contains important information about the location of data within the data file (e.g., offset), the number of rows it contains, the data schema, and column data information. This metadata needs to be persistently stored and is often cached in compute nodes for fast data location and access.
  • Actual Data: This includes the actual column data (Column Bin Data), column marker data (Column Mrk Data), Map key binary data (Map Key Bin), Map key index (Map Key Index), data dictionary data (Data Dictionary Data), bitmap index data (Bitmap Index Data), etc. This data is stored sequentially in the part's data file according to the offset information in the metadata.

Compaction

ByConity supports splitting a part file into multiple smaller files. The compacted parts need to satisfy the maximum size and row limits configured for parts.

Compaction in ByConity is performed globally, maintaining consistency with the globally unique block IDs mentioned earlier.