ducklake.connect

ducklake.connect(
    catalog_url: str | sa.URL,
    *,
    at: int | dt.datetime | None = None,
    migrate: bool = False,
    storage_options: dict[str, str] | None = None,
) → Ducklake

Connect to an existing DuckLake by connecting to its catalog database.

The currently supported catalog databases are PostgreSQL, MySQL, and SQLite.

Parameters:
  • catalog_url – The URL of the catalog database. This may be either a string or a sqlalchemy URL object. A string must be a sqlalchemy-compatible URL.

  • at – Optional historical snapshot to connect to, given either as a snapshot ID (int) or as a snapshot timestamp (datetime). If omitted, the connection is made to the latest snapshot. If provided, the connection is read-only, and any attempted modification raises an exception.

  • migrate – Whether to automatically migrate the catalog database to the latest supported catalog version if it is still on an older version.

  • storage_options – Optional dictionary of storage options, used to connect to cloud storage services. If not provided, storage options are inferred from environment variables.
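
The parameters above can be illustrated with a minimal sketch. The helper names (postgres_catalog_url, sqlite_catalog_url, open_snapshot) are hypothetical and not part of ducklake; they only show one way to build a sqlalchemy-compatible URL string using the standard library and to pass a snapshot via at. The sketch assumes the plain postgresql:// and sqlite:/// URL schemes and omits storage_options, because the accepted keys are not listed here.

```python
import datetime as dt
from typing import Optional, Union
from urllib.parse import quote_plus


def postgres_catalog_url(
    user: str, password: str, host: str, database: str, port: int = 5432
) -> str:
    """Build a sqlalchemy-style PostgreSQL URL (hypothetical helper).

    Credentials are percent-encoded so that characters such as '@' or '/'
    in a password do not break URL parsing.
    """
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def sqlite_catalog_url(path: str) -> str:
    """Build a sqlalchemy-style SQLite URL (hypothetical helper).

    A relative path yields three slashes in total; an absolute path
    starting with '/' yields four.
    """
    return f"sqlite:///{path}"


def open_snapshot(catalog_url: str, at: Optional[Union[int, dt.datetime]] = None):
    """Connect to the latest snapshot, or read-only to a historical one.

    `at` is either a snapshot ID (int) or a snapshot timestamp (datetime),
    as described for the `at` parameter of ducklake.connect.
    """
    # Imported lazily so the URL helpers above work without ducklake installed.
    import ducklake

    return ducklake.connect(catalog_url, at=at)
```

With these helpers, open_snapshot(sqlite_catalog_url("catalog.db")) would connect to the latest snapshot, while open_snapshot(sqlite_catalog_url("catalog.db"), at=42) would open snapshot 42 read-only. The snapshot ID 42 and the file name catalog.db are placeholders.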

Returns:

A Ducklake instance that can be used to interact with the DuckLake.

Raises:

NotInitializedError – If the catalog database is not yet initialized. In this case, call create() first.