The Databricks connector supports three connection modes. The connection form adapts to the mode you pick: the common fields are always present, and a mode-specific section appears below them.
| Mode | When to use it |
|---|---|
| SQL Warehouse | Query through a Databricks SQL Warehouse. Recommended for most users. See SQL warehouse types. |
| Spark Connect | Query through an interactive cluster with Spark Connect. See Compute. |
| Delta Lake | Read Delta tables directly from object storage (S3, Azure Blob, or GCS) without going through a Databricks cluster. See What is Delta Lake?. |
For all modes, find your endpoint, warehouse ID, and cluster ID in the Databricks UI under Connection details for compute resources.
Common fields
These appear in every mode:
| Field | Required | Stored as | Notes |
|---|---|---|---|
| Mode | Yes | Config | One of SQL Warehouse, Spark Connect, Delta Lake. |
| Endpoint | Yes | Config | Workspace hostname, e.g. dbc-a1b2345c-d6e7.cloud.databricks.com. Don’t include https://. See Connection details. |
| Use SSL | Yes | Config | true or false. Almost always true. |
| Authentication | Yes | Config | Personal Access Token or Service Principal (see below). |
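The rules in the table above can be checked mechanically. Below is a minimal sketch of such a validator; the field names (`mode`, `endpoint`, `use_ssl`) are illustrative, not Summation's actual configuration schema.

```python
# Hypothetical validator for the common connector fields.
# Field names are illustrative, not Summation's actual schema.

VALID_MODES = {"SQL Warehouse", "Spark Connect", "Delta Lake"}

def validate_common_fields(config: dict) -> list[str]:
    """Return a list of problems found in the common fields."""
    problems = []
    if config.get("mode") not in VALID_MODES:
        problems.append(f"mode must be one of {sorted(VALID_MODES)}")
    endpoint = config.get("endpoint", "")
    if endpoint.startswith(("http://", "https://")):
        problems.append("endpoint must be a bare hostname, without https://")
    if not isinstance(config.get("use_ssl"), bool):
        problems.append("use_ssl must be true or false")
    return problems
```

For example, passing an endpoint that still carries the `https://` prefix produces a problem entry, while a bare hostname passes cleanly.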
Authentication
Choose one of the two:
| Auth type | Fields | Vendor docs |
|---|---|---|
| Personal Access Token | Personal Access Token (starts with dapi, stored as a secret) | Personal access token authentication |
| Service Principal | Client ID (config), Client Secret (secret) | Service principals for Databricks automation |
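The two auth types split their fields differently between plain config and secrets. A sketch of that split, with hypothetical field names (the `dapi` prefix check reflects the note in the table above):

```python
def auth_config(auth: dict) -> dict:
    """Split either auth shape into config vs. secret parts (illustrative)."""
    if auth["type"] == "pat":
        token = auth["token"]
        # Databricks personal access tokens start with "dapi".
        if not token.startswith("dapi"):
            raise ValueError("personal access tokens start with 'dapi'")
        return {"secrets": {"token": token}, "config": {}}
    if auth["type"] == "service_principal":
        # Client ID is ordinary config; only the client secret is sensitive.
        return {"secrets": {"client_secret": auth["client_secret"]},
                "config": {"client_id": auth["client_id"]}}
    raise ValueError("unknown auth type")
```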
Mode-specific fields
SQL Warehouse
| Field | Required | Notes |
|---|---|---|
| SQL Warehouse ID | Yes | The 16-character ID of the warehouse, e.g. 2b4e24cff378fb24. Find it under SQL Warehouses → your warehouse → Connection details, documented at Get connection details for a Databricks SQL warehouse. |
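Databricks SQL clients typically address a warehouse through an HTTP path of the form `/sql/1.0/warehouses/<warehouse-id>`. A small helper illustrating this, with the assumption (based on the example ID above) that warehouse IDs are 16 hexadecimal characters:

```python
import re

def warehouse_http_path(warehouse_id: str) -> str:
    """Build the HTTP path a SQL client uses for a given warehouse.

    Assumes warehouse IDs are 16 hex characters, as in the example
    2b4e24cff378fb24 above.
    """
    if not re.fullmatch(r"[0-9a-f]{16}", warehouse_id):
        raise ValueError("warehouse ID should be 16 hex characters")
    return f"/sql/1.0/warehouses/{warehouse_id}"
```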
Spark Connect
| Field | Required | Notes |
|---|---|---|
| Cluster ID | Yes | e.g. 1234-567890-abcde123. Find it under Compute → your cluster → ⋮ More → View JSON, or in the cluster URL. See Get connection details for a Databricks compute resource. |
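Under the hood, Spark Connect clients are pointed at a cluster with a remote connection string. The sketch below assembles one in the format documented for Databricks Connect (`sc://<host>:443/;token=...;x-databricks-cluster-id=...`); verify the exact format against your client version before relying on it.

```python
def spark_connect_remote(endpoint: str, cluster_id: str, token: str) -> str:
    """Assemble a Spark Connect remote string for a Databricks cluster.

    Format follows the Databricks Connect convention; treat as a sketch
    and confirm against your client library's documentation.
    """
    return (f"sc://{endpoint}:443/;token={token}"
            f";x-databricks-cluster-id={cluster_id}")
```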
Delta Lake
In Delta Lake mode, Summation reads files directly from your object store. You’ll see one extra field plus an Object Store picker.
| Field | Required | Notes |
|---|---|---|
| Client Timeout | Optional | e.g. 30s. |
| Object Store | Yes | AWS S3, Azure Blob, or Google Cloud Storage. |
The fields shown after that depend on which object store you pick:
AWS S3
| Field | Required | Stored as | Notes |
|---|---|---|---|
| AWS Region | Optional | Config | e.g. us-west-2. See AWS Regions and Zones. |
| AWS Endpoint | Optional | Config | Custom S3 endpoint, e.g. s3.us-west-2.amazonaws.com. |
| AWS Access Key ID | Yes | Secret | See Managing access keys for IAM users. |
| AWS Secret Access Key | Yes | Secret | |
| Allow HTTP | Optional | Config | true or false. Default false. |
Azure Blob
| Field | Required | Stored as | Notes |
|---|---|---|---|
| Azure Storage Account Name | Yes | Config | e.g. myaccount. See Storage account overview. |
| Azure Storage Endpoint | Optional | Config | e.g. blob.core.windows.net. |
| Azure Authentication | Yes | Config | Account Key, Service Principal, or SAS Key. See Authorize access to Azure Blob Storage. |
| Azure Storage Account Key | If Account Key | Secret | See Manage account access keys. |
| Azure Storage Client ID | If Service Principal | Secret | |
| Azure Storage Client Secret | If Service Principal | Secret | |
| Azure Storage SAS Key | If SAS Key | Secret | See Grant limited access with SAS. |
Google Cloud Storage
| Field | Required | Stored as | Notes |
|---|---|---|---|
| Google Service Account Path | Yes | Config | Path to a service account JSON file, e.g. /path/to/service-account.json. See Create and delete service account keys. |
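The three object-store field sets above map naturally onto the environment-variable-style option keys used by the delta-rs/`deltalake` library. The helper below shows that mapping for the simplest auth variant of each store; the key names are that library's convention, shown for illustration only, and this sketch omits the Azure service-principal and SAS variants.

```python
def storage_options(store: str, fields: dict) -> dict:
    """Map the connector's object-store fields onto delta-rs-style option
    keys. Illustrative only; covers one auth variant per store.
    """
    if store == "s3":
        opts = {"AWS_ACCESS_KEY_ID": fields["access_key_id"],
                "AWS_SECRET_ACCESS_KEY": fields["secret_access_key"]}
        if "region" in fields:          # optional, e.g. "us-west-2"
            opts["AWS_REGION"] = fields["region"]
        return opts
    if store == "azure":                # Account Key variant only
        return {"AZURE_STORAGE_ACCOUNT_NAME": fields["account_name"],
                "AZURE_STORAGE_ACCOUNT_KEY": fields["account_key"]}
    if store == "gcs":
        return {"GOOGLE_SERVICE_ACCOUNT": fields["service_account_path"]}
    raise ValueError("unknown object store")
```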
Adding datasets
For SQL Warehouse and Spark Connect, browse Unity Catalog catalogs / schemas / tables. Source references use the form:
databricks:catalog.schema.table
For Delta Lake, point at s3://, abfss://, or gs:// Delta table paths directly.
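A source reference in the `databricks:catalog.schema.table` form splits into its three Unity Catalog parts. A hypothetical parser (not Summation's internal one) makes the structure explicit:

```python
def parse_source_ref(ref: str) -> tuple[str, str, str]:
    """Split a 'databricks:catalog.schema.table' reference into its parts.

    Illustrative parser; assumes unquoted identifiers with no embedded dots.
    """
    prefix, _, rest = ref.partition(":")
    if prefix != "databricks":
        raise ValueError("expected a databricks: reference")
    parts = rest.split(".")
    if len(parts) != 3:
        raise ValueError("expected catalog.schema.table")
    catalog, schema, table = parts
    return catalog, schema, table
```

For example, `parse_source_ref("databricks:main.sales.orders")` returns `("main", "sales", "orders")`.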
Common problems
| Error or symptom | Likely cause |
|---|---|
| 401 Unauthorized | PAT is expired, or the service principal doesn’t have workspace access. Regenerate or re-grant. |
| Cluster ... is not running | Spark Connect requires the cluster to be running. Use a SQL Warehouse for serverless / on-demand workloads. |
| Delta Lake mode can’t read files | Storage credentials or IAM are wrong. A Databricks PAT alone isn’t enough — the storage policy must permit reads of the path. |