# GCS

> Connect Summation to Google Cloud Storage to read Parquet, CSV, JSON, JSONL, and TSV files.

The GCS connector exposes objects in a Google Cloud Storage bucket as datasets. Summation reads Parquet, CSV, JSON, JSONL, and TSV files directly from the bucket.

## What you'll need

* A **Cloud Storage bucket** and optional prefix.
* A Google Cloud **service account** with read access to the bucket. The `roles/storage.objectViewer` role includes object read and list permissions. See [IAM roles for Cloud Storage](https://docs.cloud.google.com/storage/docs/access-control/iam-roles).
* A **service account JSON key**. See [Create and delete service account keys](https://cloud.google.com/iam/docs/keys-create-delete).
* Optional **HMAC keys** if your setup requires S3-compatible Cloud Storage credentials. See [Manage HMAC keys for service accounts](https://docs.cloud.google.com/storage/docs/authentication/managing-hmackeys).

<Tip>
  Grant the service account access only to the bucket that Summation should read.
</Tip>

## Form fields

| Field                   | Required | Stored as | Notes                                                                                                                                         |
| ----------------------- | -------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
| **Base Path**           | Yes      | Config    | A `gcs://` prefix that scopes the connection, e.g. `gcs://my-bucket/path`. `gs://` paths are also accepted. Browsing starts from this prefix. |
| **Service Account Key** | Yes      | Secret    | The full service account JSON key, including the surrounding `{` and `}`.                                                                     |
| **HMAC Access ID**      | Optional | Secret    | Cloud Storage HMAC access ID. If set, **HMAC Secret** must also be set.                                                                       |
| **HMAC Secret**         | Optional | Secret    | Cloud Storage HMAC secret. If set, **HMAC Access ID** must also be set.                                                                       |
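
The rules in the table can be checked locally before the form is filled in. The following is a minimal sketch, not Summation's own validation: the function name `validate_gcs_form` and its messages are hypothetical, and it encodes only what the table states (a `gcs://` or `gs://` prefix that names a bucket, and HMAC fields set together or not at all).

```bash
# Hypothetical pre-flight check that mirrors the form rules above.
# Arguments: base path, HMAC access ID (optional), HMAC secret (optional).
validate_gcs_form() {
  local base_path="$1" hmac_id="${2:-}" hmac_secret="${3:-}"
  local rest bucket

  # Base Path must use the gcs:// or gs:// scheme.
  case "$base_path" in
    gcs://*|gs://*) ;;
    *) echo "Base Path must start with gcs:// or gs://" >&2; return 1 ;;
  esac

  # The first path segment after the scheme is the bucket name.
  rest="${base_path#*://}"
  bucket="${rest%%/*}"
  if [ -z "$bucket" ]; then
    echo "Base Path must include a bucket" >&2
    return 1
  fi

  # HMAC Access ID and HMAC Secret must be provided together or not at all.
  if { [ -n "$hmac_id" ] && [ -z "$hmac_secret" ]; } ||
     { [ -z "$hmac_id" ] && [ -n "$hmac_secret" ]; }; then
    echo "Provide both HMAC fields, or leave both blank" >&2
    return 1
  fi
}
```

For example, `validate_gcs_form "gcs://my-bucket/path"` succeeds, while `validate_gcs_form "gcs://my-bucket" "some-access-id"` fails because the secret is missing.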

## Grant read access

Grant the service account the `roles/storage.objectViewer` role on the bucket that Summation should read. The command below assumes the service account already exists.

```bash
# Example values; replace with your own project, bucket, and service account.
PROJECT_ID=my-gcp-project-123
BUCKET_NAME=my-bucket
SERVICE_ACCOUNT="summation@${PROJECT_ID}.iam.gserviceaccount.com"

gcloud storage buckets add-iam-policy-binding "gs://$BUCKET_NAME" \
  --member="serviceAccount:$SERVICE_ACCOUNT" \
  --role="roles/storage.objectViewer"
```
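
To confirm that the binding took effect, the bucket's IAM policy can be queried for the roles held by the service account. This is a sketch under assumptions: the function names `list_bucket_roles` and `has_object_viewer` are hypothetical, and the query relies on gcloud's general `--flatten`, `--filter`, and `--format` output flags. It only sees bindings set on the bucket itself, so a role inherited from the project will not appear.

```bash
# Print the roles bound to a service account on a bucket, one per line.
# Arguments: bucket name, service account email.
list_bucket_roles() {
  local bucket="$1" email="$2"
  gcloud storage buckets get-iam-policy "gs://$bucket" \
    --flatten="bindings[].members" \
    --filter="bindings.members:$email" \
    --format="value(bindings.role)"
}

# Succeed only if roles/storage.objectViewer is among those roles.
has_object_viewer() {
  list_bucket_roles "$1" "$2" | grep -qx "roles/storage.objectViewer"
}
```

With the variables from the previous command, `has_object_viewer "$BUCKET_NAME" "$SERVICE_ACCOUNT" && echo "binding present"` should print the message once the policy has been updated.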

Generate a JSON key for the service account and paste the full file contents into **Service Account Key**.

```bash
# Reuses SERVICE_ACCOUNT from the previous step; writes the key to the current directory.
gcloud iam service-accounts keys create summation-gcs-key.json \
  --iam-account="$SERVICE_ACCOUNT"
```

<Warning>
  The downloaded `summation-gcs-key.json` is a credential. Paste it into Summation, then delete the local copy. See [Best practices for managing service account keys](https://cloud.google.com/iam/docs/best-practices-for-managing-service-account-keys).
</Warning>
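
A quick local check before pasting can catch a truncated or wrong file. The sketch below is illustrative only: `looks_like_sa_key` is a hypothetical name, and the check relies on service account key files carrying `"type": "service_account"`. It does not fully parse the JSON or verify that the key is still enabled.

```bash
# Hypothetical sanity check for a downloaded service account key file.
# Argument: path to the JSON key file.
looks_like_sa_key() {
  local file="$1" compact
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }

  # Drop all whitespace so the first and last characters are the outer braces.
  compact="$(tr -d '[:space:]' < "$file")"

  # The pasted value must include the surrounding { and }.
  case "$compact" in
    '{'*'}') ;;
    *) echo "missing surrounding { or }" >&2; return 1 ;;
  esac

  # Service account keys declare their type explicitly.
  case "$compact" in
    *'"type":"service_account"'*) ;;
    *) echo "not a service account key" >&2; return 1 ;;
  esac
}
```

Running `looks_like_sa_key summation-gcs-key.json` returns success for a complete key file. Once the contents are pasted into Summation, `rm summation-gcs-key.json` removes the local copy.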

## Adding datasets

Each dataset is a single file or folder in the bucket. Supported file formats are `parquet`, `csv`, `json`, `jsonl`, and `tsv`. Source references use the form:

```text
gcs://bucket/path/to/data/
```
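
Because `gcloud storage ls` prints objects with the `gs://` scheme, a small helper can rewrite them into the `gcs://` form shown above and flag files outside the supported formats. This is a sketch: `to_source_ref` and `is_supported_file` are hypothetical names, and matching on the file extension is an assumption, since this page does not say how Summation detects the format.

```bash
# Rewrite a gs:// URI into the gcs:// source reference form.
to_source_ref() {
  local uri="$1"
  case "$uri" in
    gs://*)  printf 'gcs://%s\n' "${uri#gs://}" ;;
    gcs://*) printf '%s\n' "$uri" ;;
    *) echo "not a Cloud Storage URI: $uri" >&2; return 1 ;;
  esac
}

# Succeed if the object name ends in one of the supported extensions.
# Assumption: format is inferred from the extension, compared case-insensitively.
is_supported_file() {
  local name="${1##*/}" ext
  # Names without a dot (including folder paths ending in /) have no extension.
  case "$name" in
    *.*) ;;
    *) return 1 ;;
  esac
  ext="$(printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    parquet|csv|json|jsonl|tsv) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `to_source_ref gs://bucket/path/to/data/` prints `gcs://bucket/path/to/data/`, and `is_supported_file gcs://bucket/report.avro` fails.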

## Common problems

| Error or symptom                  | Likely cause                                                                                             |
| --------------------------------- | -------------------------------------------------------------------------------------------------------- |
| `GCS base path is required`       | **Base Path** is blank or doesn't include a bucket.                                                      |
| `Invalid GCS service account key` | The pasted JSON is incomplete or malformed. Paste the full key file contents.                            |
| `GCS authentication failed`       | The service account key is invalid, disabled, or lacks bucket access.                                    |
| `GCS bucket does not exist`       | The bucket name in **Base Path** is wrong, or the service account cannot see it.                         |
| HMAC validation error             | Only one HMAC field is filled. Provide both **HMAC Access ID** and **HMAC Secret**, or leave both blank. |
