Data Sources & Snapshots
Data sources
Section titled “Data sources”A data source is a named dataset attached to a project. Data can come from:
- CSV/Excel upload — via the
upload_datatool (max 100 MB) - Google Sheets — via the
connect_google_sheettool (OAuth2 linked)
All uploaded data is converted to Parquet format and stored in S3. Each upload creates an immutable snapshot.
Snapshots
Section titled “Snapshots”A snapshot is a point-in-time copy of your data. When a model runs, it mounts the most recent snapshot for its data sources.
Why snapshots?
Section titled “Why snapshots?”- Reproducibility — re-running a model against the same snapshot produces the same results
- Audit trail — you can see exactly which data a model run used
- Safety — uploading new data doesn’t retroactively change past runs
Snapshot tools
Section titled “Snapshot tools”| Tool | Description |
|---|---|
list_snapshots | List snapshots with dates and sizes |
delete_snapshot | Delete an unreferenced snapshot (cannot delete if used by a model run) |
Discovering data sources
Section titled “Discovering data sources”Before writing a query_data SQL statement, use list_data_sources to see all data sources attached to a project, their DuckDB table names, and column schemas. This avoids guessing table or column names:
list_data_sources → inspect tables/columns → query_datalist_data_sources returns one entry per source with tables (exact DuckDB table names to use in SQL) and schema (column names and types per table). For Google Sheet sources, snapshot_taken_at shows when the data was last synced — call connect_google_sheet to refresh stale snapshots.
Querying data
Section titled “Querying data”The query_data tool lets you run read-only SQL (SELECT/WITH) against your data sources via DuckDB. It supports both uploaded Parquet files and Google Sheet snapshots:
- Uploaded files (
upload_data) — table name is thesource_namegiven at upload time. - Google Sheet snapshots (
connect_google_sheet) — each tab becomes a table named{source_name}_{tab_name}(non-alphanumeric characters in the tab name are replaced with underscores).
-- Query an uploaded Parquet file (source_name = "revenue_data"):SELECT product_line, SUM(revenue) AS totalFROM revenue_dataGROUP BY product_lineORDER BY total DESCLIMIT 10
-- Query a linked Google Sheet (source = "budget", tab = "Q1 Revenue"):SELECT * FROM budget_Q1_Revenue LIMIT 10Results are capped at 10,000 rows. The sources field in the response lists each loaded source, its DuckDB table names, and snapshot_taken_at (for Google Sheet sources) so you can verify freshness.
Google Sheets integration
Section titled “Google Sheets integration”The connect_google_sheet tool links a Google Sheet to a project:
- Bridge Town reads the specified tabs (or all tabs)
- Converts each tab to CSV, uploads to S3 as Parquet
- Stores an encrypted OAuth refresh token for future refreshes
After running a model, write outputs back to the sheet using write_gsheet:
spec.mode="replace"— writes to a targeted A1 notation range (e.g."Forecast!A1"or"Sheet1!B2:F20"). Only cells within the range are overwritten; content outside is preserved.spec.mode="append"— adds rows after existing content in the named tab without overwriting.