Writer
Write to Delta Tables
deltalake.write_deltalake
write_deltalake(table_or_uri: Union[str, Path, DeltaTable], data: Union[pd.DataFrame, ds.Dataset, pa.Table, pa.RecordBatch, Iterable[pa.RecordBatch], RecordBatchReader, ArrowStreamExportable], *, schema: Optional[Union[pa.Schema, DeltaSchema]] = None, partition_by: Optional[Union[List[str], str]] = None, mode: Literal['error', 'append', 'overwrite', 'ignore'] = 'error', file_options: Optional[ds.ParquetFileWriteOptions] = None, max_partitions: Optional[int] = None, max_open_files: int = 1024, max_rows_per_file: int = 10 * 1024 * 1024, min_rows_per_group: int = 64 * 1024, max_rows_per_group: int = 128 * 1024, name: Optional[str] = None, description: Optional[str] = None, configuration: Optional[Mapping[str, Optional[str]]] = None, schema_mode: Optional[Literal['merge', 'overwrite']] = None, storage_options: Optional[Dict[str, str]] = None, partition_filters: Optional[List[Tuple[str, str, Any]]] = None, predicate: Optional[str] = None, target_file_size: Optional[int] = None, large_dtypes: bool = False, engine: Literal['pyarrow', 'rust'] = 'rust', writer_properties: Optional[WriterProperties] = None, custom_metadata: Optional[Dict[str, str]] = None, post_commithook_properties: Optional[PostCommitHookProperties] = None, commit_properties: Optional[CommitProperties] = None) -> None
Write to a Delta Lake table
If the table does not already exist, it will be created.
The pyarrow writer supports protocol version 2 only and will not be updated. For higher protocol support, use engine='rust', which is now the default.
To enable safe concurrent writes when writing to S3, an additional locking mechanism must be supplied. For more information on enabling concurrent writing to S3, follow this guide.
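A minimal usage sketch (the local path and column names below are illustrative, not taken from this page):

```python
import pandas as pd
from deltalake import write_deltalake

df = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# Creates the table if it does not exist; with the default mode='error',
# a second call against the existing table would raise.
write_deltalake("tmp/my_table", df)

# Append additional rows to the existing table.
write_deltalake("tmp/my_table", pd.DataFrame({"id": [4], "value": ["d"]}), mode="append")
```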
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table_or_uri | Union[str, Path, DeltaTable] | URI of a table or a DeltaTable object. | required |
data | Union[DataFrame, Dataset, Table, RecordBatch, Iterable[RecordBatch], RecordBatchReader, ArrowStreamExportable] | Data to write. If passing an iterable, the schema must also be given. | required |
schema | Optional[Union[pa.Schema, DeltaSchema]] | Optional schema to write. | None |
partition_by | Optional[Union[List[str], str]] | List of columns to partition the table by. Only required when creating a new table. | None |
mode | Literal['error', 'append', 'overwrite', 'ignore'] | How to handle existing data. Default is to error if the table already exists. If 'append', will add new data. If 'overwrite', will replace the table with new data. If 'ignore', will not write anything if the table already exists. | 'error' |
file_options | Optional[ParquetFileWriteOptions] | Optional write options for Parquet (ParquetFileWriteOptions). Can be provided with defaults using ParquetFileWriteOptions().make_write_options(). Please refer to https://github.com/apache/arrow/blob/master/python/pyarrow/_dataset_parquet.pyx#L492-L533 for the list of available options. Only used in the pyarrow engine. | None |
max_partitions | Optional[int] | The maximum number of partitions that will be used. Only used in the pyarrow engine. | None |
max_open_files | int | Limits the maximum number of files that can be left open while writing. If an attempt is made to open too many files, the least recently used file will be closed. If this setting is set too low you may end up fragmenting your data into many small files. Only used in the pyarrow engine. | 1024 |
max_rows_per_file | int | Maximum number of rows per file. If greater than 0, this limits how many rows are placed in any single file. Otherwise there is no limit and one file is created in each output directory unless files need to be closed to respect max_open_files. Only used in the pyarrow engine. | 10 * 1024 * 1024 |
min_rows_per_group | int | Minimum number of rows per group. When the value is set, the dataset writer will batch incoming data and only write the row groups to disk when sufficient rows have accumulated. Only used in the pyarrow engine. | 64 * 1024 |
max_rows_per_group | int | Maximum number of rows per group. If the value is set, then the dataset writer may split up large incoming batches into multiple row groups. If this value is set, then min_rows_per_group should also be set. | 128 * 1024 |
name | Optional[str] | User-provided identifier for this table. | None |
description | Optional[str] | User-provided description for this table. | None |
configuration | Optional[Mapping[str, Optional[str]]] | A map containing configuration options for the metadata action. | None |
schema_mode | Optional[Literal['merge', 'overwrite']] | If set to "overwrite", allows replacing the schema of the table. Set to "merge" to merge with the existing schema. | None |
storage_options | Optional[Dict[str, str]] | Options passed to the native delta filesystem. | None |
predicate | Optional[str] | When using `overwrite` mode, replaces only the data that matches the predicate. Only used in the rust engine. | None |
target_file_size | Optional[int] | Override for the target file size for data files written to the delta table. If not passed, it's taken from the `delta.targetFileSize` table configuration. | None |
partition_filters | Optional[List[Tuple[str, str, Any]]] | The partition filters that will be used for partition overwrite. Only used in the pyarrow engine. | None |
large_dtypes | bool | Only used in the pyarrow engine. | False |
engine | Literal['pyarrow', 'rust'] | Writer engine to write the delta table. The pyarrow engine is deprecated and will be removed in v1.0. | 'rust' |
writer_properties | Optional[WriterProperties] | Pass writer properties to the Rust parquet writer. | None |
custom_metadata | Optional[Dict[str, str]] | Deprecated and will be removed in future versions. Use commit_properties instead. | None |
post_commithook_properties | Optional[PostCommitHookProperties] | Properties for the post commit hook. If None, default values are used. | None |
commit_properties | Optional[CommitProperties] | Properties of the transaction commit. If None, default values are used. | None |
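As a sketch of how several of these parameters combine (the table path, column names, and predicate below are illustrative), a partitioned table can be created and then selectively overwritten with the rust engine:

```python
import pandas as pd
from deltalake import write_deltalake

df = pd.DataFrame({"country": ["US", "DE"], "value": [1, 2]})

# Create (or replace) a table partitioned by 'country'.
write_deltalake("tmp/partitioned_table", df, partition_by=["country"], mode="overwrite")

# Overwrite only the rows matching the predicate; the new data is assumed
# to satisfy the predicate itself.
write_deltalake(
    "tmp/partitioned_table",
    pd.DataFrame({"country": ["US"], "value": [3]}),
    mode="overwrite",
    predicate="country = 'US'",
)
```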
deltalake.BloomFilterProperties
dataclass
BloomFilterProperties(set_bloom_filter_enabled: Optional[bool], fpp: Optional[float] = None, ndv: Optional[int] = None)
The Bloom Filter Properties instance for the Rust parquet writer.
Create a Bloom Filter Properties instance for the Rust parquet writer:
Parameters:
Name | Type | Description | Default |
---|---|---|---|
set_bloom_filter_enabled | Optional[bool] | If True and no fpp or ndv are provided, the default values will be used. | required |
fpp | Optional[float] | The false positive probability for the bloom filter. Must be between 0 and 1 exclusive. | None |
ndv | Optional[int] | The number of distinct values for the bloom filter. | None |
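For example (a sketch assuming BloomFilterProperties is importable from the top-level deltalake package, as its qualified name above suggests):

```python
from deltalake import BloomFilterProperties

# Enable the bloom filter and fall back to the default fpp and ndv.
default_bloom = BloomFilterProperties(set_bloom_filter_enabled=True)

# Tune the false positive probability and expected number of distinct values.
tuned_bloom = BloomFilterProperties(set_bloom_filter_enabled=True, fpp=0.01, ndv=100_000)
```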
deltalake.ColumnProperties
dataclass
ColumnProperties(dictionary_enabled: Optional[bool] = None, max_statistics_size: Optional[int] = None, bloom_filter_properties: Optional[BloomFilterProperties] = None)
The Column Properties instance for the Rust parquet writer.
Create a Column Properties instance for the Rust parquet writer:
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dictionary_enabled | Optional[bool] | Enable dictionary encoding for the column. | None |
max_statistics_size | Optional[int] | Maximum size of statistics for the column. | None |
bloom_filter_properties | Optional[BloomFilterProperties] | Bloom Filter Properties for the column. | None |
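A short sketch combining the two dataclasses above (import paths assumed to be the top-level deltalake package):

```python
from deltalake import BloomFilterProperties, ColumnProperties

# Dictionary-encode the column and attach a bloom filter to it.
col_props = ColumnProperties(
    dictionary_enabled=True,
    bloom_filter_properties=BloomFilterProperties(
        set_bloom_filter_enabled=True, fpp=0.01, ndv=50_000
    ),
)
```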
deltalake.WriterProperties
dataclass
WriterProperties(data_page_size_limit: Optional[int] = None, dictionary_page_size_limit: Optional[int] = None, data_page_row_count_limit: Optional[int] = None, write_batch_size: Optional[int] = None, max_row_group_size: Optional[int] = None, compression: Optional[Literal['UNCOMPRESSED', 'SNAPPY', 'GZIP', 'BROTLI', 'LZ4', 'ZSTD', 'LZ4_RAW']] = None, compression_level: Optional[int] = None, statistics_truncate_length: Optional[int] = None, default_column_properties: Optional[ColumnProperties] = None, column_properties: Optional[Dict[str, ColumnProperties]] = None)
A Writer Properties instance for the Rust parquet writer.
Create a Writer Properties instance for the Rust parquet writer:
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_page_size_limit | Optional[int] | Limit DataPage size to this in bytes. | None |
dictionary_page_size_limit | Optional[int] | Limit the size of each DataPage used to store dictionaries to this amount in bytes. | None |
data_page_row_count_limit | Optional[int] | Limit the number of rows in each DataPage. | None |
write_batch_size | Optional[int] | Splits internally to smaller batch size. | None |
max_row_group_size | Optional[int] | Max number of rows in a row group. | None |
compression | Optional[Literal['UNCOMPRESSED', 'SNAPPY', 'GZIP', 'BROTLI', 'LZ4', 'ZSTD', 'LZ4_RAW']] | Compression type. | None |
compression_level | Optional[int] | If None and the compression type supports a level, the default level will be used. Only relevant for GZIP (levels 1-9), BROTLI (levels 1-11), and ZSTD (levels 1-22). | None |
statistics_truncate_length | Optional[int] | Maximum length of truncated min/max values in statistics. | None |
default_column_properties | Optional[ColumnProperties] | Default Column Properties for the Rust parquet writer. | None |
column_properties | Optional[Dict[str, ColumnProperties]] | Column Properties for the Rust parquet writer. | None |
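Putting it together, a sketch of passing writer properties to write_deltalake (the path and column names are illustrative):

```python
import pandas as pd
from deltalake import ColumnProperties, WriterProperties, write_deltalake

writer_props = WriterProperties(
    compression="ZSTD",
    compression_level=3,
    max_row_group_size=128 * 1024,
    # Dictionary-encode all columns by default...
    default_column_properties=ColumnProperties(dictionary_enabled=True),
    # ...but override the behaviour for a single column.
    column_properties={"id": ColumnProperties(dictionary_enabled=False)},
)

df = pd.DataFrame({"id": range(10), "value": ["x"] * 10})
write_deltalake("tmp/zstd_table", df, mode="overwrite", writer_properties=writer_props)
```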
Convert to Delta Tables
deltalake.convert_to_deltalake
convert_to_deltalake(uri: Union[str, Path], mode: Literal['error', 'ignore'] = 'error', partition_by: Optional[pa.Schema] = None, partition_strategy: Optional[Literal['hive']] = None, name: Optional[str] = None, description: Optional[str] = None, configuration: Optional[Mapping[str, Optional[str]]] = None, storage_options: Optional[Dict[str, str]] = None, custom_metadata: Optional[Dict[str, str]] = None) -> None
Convert Parquet tables to Delta tables.
Currently only HIVE-partitioned tables are supported. Converting to Delta creates a transaction log commit with add actions, along with any additional properties provided, such as configuration, name, and description.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
uri | Union[str, Path] | URI of a table. | required |
partition_by | Optional[pa.Schema] | Optional partitioning schema if the table is partitioned. | None |
partition_strategy | Optional[Literal['hive']] | Optional partition strategy to read and convert. | None |
mode | Literal['error', 'ignore'] | How to handle existing data. Default is to error if the table already exists. If 'ignore', will not convert anything if the table already exists. | 'error' |
name | Optional[str] | User-provided identifier for this table. | None |
description | Optional[str] | User-provided description for this table. | None |
configuration | Optional[Mapping[str, Optional[str]]] | A map containing configuration options for the metadata action. | None |
storage_options | Optional[Dict[str, str]] | Options passed to the native delta filesystem. Unused if 'filesystem' is defined. | None |
custom_metadata | Optional[Dict[str, str]] | Custom metadata that will be added to the transaction commit. | None |
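A minimal conversion sketch (the directory layout and partition column are illustrative; per the signature above, partition_by takes a pyarrow schema describing the partition columns):

```python
import pyarrow as pa
from deltalake import convert_to_deltalake

# Convert an existing HIVE-partitioned Parquet directory in place.
convert_to_deltalake(
    "path/to/parquet_table",
    partition_by=pa.schema([pa.field("country", pa.string())]),
    partition_strategy="hive",
    name="converted_table",
)
```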