Skip to main content

Fuse Engine

Databend utilizes the Fuse engine as its default engine, offering a data management system with a user-friendly interface reminiscent of Git. Users have the ability to effortlessly query data at any given moment and effortlessly restore data to any desired point in time.

Related topic: Find Peter Parker in Databend

Syntax

CREATE TABLE table_name (
column_name1 column_type1,
column_name2 column_type2,
...
) [ENGINE = Fuse] [CLUSTER BY(<expr> [, <expr>, ...] )] [Options];

For more information about the CREATE TABLE command, see CREATE TABLE.

ENGINE

If an engine is not explicitly specified, Databend will automatically default to using the Fuse engine to create tables, which is equivalent to Engine = Fuse.

CLUSTER BY

The CLUSTER BY parameter specifies the sorting method for data that consists of multiple expressions, which is useful during compaction or recluster. A suitable CLUSTER BY parameter can significantly accelerate queries.

Options

The Fuse engine offers options(case-insensitive) that allow you to configure various settings such as bloom index columns, compression method, storage format, snapshot location, block size threshold, block per segment, and row per block. To modify the options of an existing table, use ALTER TABLE OPTION.

OptionSyntaxDescription
bloom_index_columnsbloom_index_columns = '<column> [, <column> ...]'Specifies the columns to be used for the bloom index. The data type of these columns can be Map, Number, String, Date, or Timestamp. If no specific columns are specified, the bloom index is created by default on all supported columns. bloom_index_columns='' disables the bloom indexing.
compressioncompression = '<compression>'Specifies the compression method for the engine. Compression options include lz4, zstd, snappy, or none. The compression method defaults to zstd in object storage and lz4 in file system (fs) storage.
storage_formatstorage_format = '<storage_format>'Specifies how data is stored. By default, the storage_format is set to Parquet, which offers high compression and is ideal for cloud-native object storage. Additionally, the experimental Native format is supported, optimizing memory copy overhead for storage devices like file systems.
snapshot_locsnapshot_loc = '<snapshot_loc>'Specifies a location parameter in string format, allowing easy sharing of a table without data copy.
block_size_thresholdblock_size_threshold = '<block_size_threshold>'Specifies the maximum data size for a file.
block_per_segmentblock_per_segment = '<block_per_segment>'Specifies the maximum number of files that can be stored in a segment.
row_per_blockrow_per_block = '<row_per_block>'Specifies the maximum number of rows that can be stored in a file.