ducklake.create

ducklake.create(
catalog_url: str | sa.URL,
*,
data_path: str,
storage_options: dict[str, str] | None = None,
) -> Ducklake

Create a new DuckLake by initializing a new catalog database.

The currently supported catalog databases are PostgreSQL, MySQL, and SQLite.

Parameters:
  • catalog_url – The URL of the catalog database. This may be either a string or a SQLAlchemy URL object (sa.URL). If a string is provided, it must be a SQLAlchemy-compatible URL.

  • data_path – The root path where data files should be stored. This may be a local path (including NFS paths) or a cloud storage path for S3, GCS, or Azure Blob Storage. See also: https://ducklake.select/docs/stable/duckdb/usage/choosing_storage.

  • storage_options – Optional dictionary of storage options. These may be provided to connect to cloud storage services. If not provided, storage options will be inferred from environment variables.
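The parameters above can be sketched in use as follows. This is a minimal illustration, not a definitive recipe: the helper names postgres_catalog_url and create_lake, the credentials, the host, and the bucket path are all hypothetical. The URL helper relies only on the standard library and percent-encodes the credentials, which SQLAlchemy-style URLs require when a password contains characters such as "@" or "/".

```python
from __future__ import annotations

from urllib.parse import quote_plus


def postgres_catalog_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Build a SQLAlchemy-style PostgreSQL URL, percent-encoding the credentials."""
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"


def create_lake(catalog_url: str, data_path: str, storage_options: dict[str, str] | None = None):
    """Hypothetical wrapper that forwards to ducklake.create with keyword arguments."""
    # Imported lazily so the URL helper above works without the package installed.
    import ducklake

    return ducklake.create(catalog_url, data_path=data_path, storage_options=storage_options)


# Placeholder credentials for illustration only.
catalog_url = postgres_catalog_url("lake_admin", "p@ss/word", "localhost", 5432, "catalog")
print(catalog_url)

# Not executed here, since it needs a reachable database and storage location:
# lake = create_lake(catalog_url, data_path="s3://example-bucket/lake")
```

Leaving storage_options as None matches the documented default, in which the options are inferred from environment variables.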

Returns:

A Ducklake instance that can be used to interact with the DuckLake.

Raises:

AlreadyInitializedError – If the catalog database is already initialized. In this case, call connect() instead.
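
A common pattern suggested by the error description is to try create() first and fall back to connect(). The sketch below shows that pattern under stated assumptions: create_or_connect is a hypothetical helper, the import location of AlreadyInitializedError is not given on this page, and connect() is assumed to accept the same arguments as create(). The functions and the exception type are passed in explicitly so the helper does not depend on those details.

```python
from typing import Any, Callable


def create_or_connect(
    create: Callable[..., Any],
    connect: Callable[..., Any],
    already_initialized: type[BaseException],
    catalog_url: str,
    **kwargs: Any,
) -> Any:
    """Try to create the lake; fall back to connecting if the catalog already exists."""
    try:
        return create(catalog_url, **kwargs)
    except already_initialized:
        # Assumption: connect() accepts the same arguments as create().
        return connect(catalog_url, **kwargs)


# Intended wiring (not executed here; the exception's location is assumed):
# import ducklake
# lake = create_or_connect(
#     ducklake.create,
#     ducklake.connect,
#     ducklake.AlreadyInitializedError,
#     "sqlite:///catalog.db",
#     data_path="./lake-data",
# )
```

If connect() turns out to take different arguments, the fallback branch would need to be adjusted accordingly.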