How it Works

Workspace

Your isolated, persistent environment where AI works with your data.

Your workspace is where your AI writes queries, runs scripts, caches results, and builds artifacts. It's a dedicated container in the cloud, provisioned for you alone, with a Linux shell, DuckDB, Python, and a persistent filesystem at /workspace.

Every MarcoPolo user gets their own workspace, fully isolated from other users and persistent across sessions.

Treat it like a repo

The workspace is meant to be used like a checked-out repo. Your AI runs git status and git diff as part of normal work. Files accumulate, get committed, and get edited just as they would in any project. Two consequences worth knowing up front:

  • Files persist across sessions and across clients. A query your AI wrote in Claude is there when you open the workspace from Codex tomorrow. The web app shows the same tree.
  • The workspace is the surface, not the MCP. Almost all real work happens through the connection and cron CLIs inside the workspace. The four MCP tools get the AI into the workspace and into a browser-setup flow; everything else flows through the workspace itself.
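
Because the workspace behaves like a checked-out repo, a script can inspect uncommitted changes much as the AI does with git status. The following is a minimal sketch, assuming /workspace is a Git repository and git is on the PATH. The helper names parse_porcelain and pending_changes are illustrative and not part of MarcoPolo.

```python
import subprocess


def parse_porcelain(output: str) -> list[tuple[str, str]]:
    """Parse `git status --porcelain` (v1) text into (status, path) pairs."""
    changes = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        status, path = line[:2], line[3:]
        # Renames and copies are reported as "old -> new"; keep the new path.
        if status[0] in "RC" and " -> " in path:
            path = path.split(" -> ", 1)[1]
        changes.append((status.strip(), path))
    return changes


def pending_changes(repo: str = "/workspace") -> list[tuple[str, str]]:
    """Run git in `repo` (assumed to be a Git repository) and list its changes."""
    result = subprocess.run(
        ["git", "-C", repo, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

Splitting parsing from execution keeps the parser usable on captured output. Paths that Git quotes (for example, names containing spaces) are not unquoted in this sketch.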

Layout

/workspace/
  README.md                       workspace overview
  RULES.md                        workspace-wide rules and conventions
  workflows/                      curated guides for recurring tasks
    README.md
    setup-connection.md
    query-and-analyze-data.md
    build-dashboard.md
    setup-automation.md
  connections/                    one subdirectory per visible connection
    <name>/
      README.md                   summary + authoritative capabilities
      RULES.md                    runtime-managed working guidance
      SYNTAX.md                   query syntax reference
      queries/                    saved query files
      metadata/                   snapshots from `connection describe`
      profile/                    profiling snapshots
      scratch/                    temporary work
    DUCKDB/                       in-workspace analytical connection
  scripts/                        reusable programs (Python, shell, Node)
  artifacts/                      user-facing outputs
    dashboards/                   .dashboard manifests + view.tsx
  data/
    uploads/                      user-provided files
    downloads/                    files fetched via `connection download`
    databases/                    database files (SQLite, etc.)
  schedules/                      cron job definitions
  .dv/                            runtime-managed state (DuckDB, etc.)

A few things to notice:

  • Connections are first-class directories. Each one has a README.md listing its authoritative capabilities, a SYNTAX.md for the query dialect, a RULES.md for working guidance, and queries/, metadata/, profile/, scratch/ for the AI's working files.
  • DUCKDB is a connection. The in-workspace analytical engine appears at connections/DUCKDB/ and is queried exactly like any other connection: `connection query DUCKDB --file connections/DUCKDB/queries/<file>.sql --json`. Use it for joins across upstream connections, intermediate tables, and ad-hoc analysis on uploaded files.
  • workflows/ is the canonical reference. When the AI doesn't know how to approach a task, it reads from here. These are short, opinionated guides — kept in sync with the platform.
  • .dv/ is runtime-managed. Don't write to it; the runtime owns DuckDB state, schedule history, and other ephemeral data there.
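
The per-connection layout above can be checked mechanically. This is a small sketch, assuming every directory under connections/ is expected to carry the files and subdirectories shown in the tree. Whether connections/DUCKDB/ contains all of them is not stated here, so treat its results as informational. The function names are illustrative.

```python
from pathlib import Path

# Names taken from the layout above.
EXPECTED_FILES = ("README.md", "RULES.md", "SYNTAX.md")
EXPECTED_DIRS = ("queries", "metadata", "profile", "scratch")


def missing_entries(connection_dir: Path) -> list[str]:
    """Return the expected files and directories absent from one connection."""
    missing = [f for f in EXPECTED_FILES if not (connection_dir / f).is_file()]
    missing += [d + "/" for d in EXPECTED_DIRS if not (connection_dir / d).is_dir()]
    return missing


def audit_connections(workspace: Path = Path("/workspace")) -> dict[str, list[str]]:
    """Map each connection name to what it lacks (an empty list means complete)."""
    root = workspace / "connections"
    if not root.is_dir():
        return {}
    return {p.name: missing_entries(p) for p in sorted(root.iterdir()) if p.is_dir()}
```

The audit only reads the tree, so it never touches .dv/ or any other runtime-managed state.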

Where things go

What                    Path
Query files             connections/<name>/queries/
Metadata snapshots      connections/<name>/metadata/ (written by connection describe)
Reusable programs       scripts/
User-facing outputs     artifacts/
Dashboards              artifacts/dashboards/<name>.dashboard + view.tsx
Schedule definitions    schedules/
User-uploaded data      data/uploads/
Downloaded data         data/downloads/ (default destination of connection download)
Local database files    data/databases/ (SQLite, etc.)
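
The table above can be encoded as a small lookup so that scripts place files consistently. This is a sketch only: the kind labels ("query", "upload", and so on), the destination helper, and the connection name pg used below are hypothetical, while the directory names come from the table.

```python
from pathlib import PurePosixPath
from typing import Optional

WORKSPACE = PurePosixPath("/workspace")

# Kinds stored under connections/<name>/.
PER_CONNECTION = {"query": "queries", "metadata": "metadata"}

# Kinds stored at fixed workspace-level locations.
SHARED = {
    "script": "scripts",
    "artifact": "artifacts",
    "dashboard": "artifacts/dashboards",
    "schedule": "schedules",
    "upload": "data/uploads",
    "download": "data/downloads",
    "database": "data/databases",
}


def destination(kind: str, filename: str, connection: Optional[str] = None) -> PurePosixPath:
    """Return the workspace path where a file of the given kind belongs."""
    if kind in PER_CONNECTION:
        if not connection:
            raise ValueError(f"{kind!r} files need a connection name")
        return WORKSPACE / "connections" / connection / PER_CONNECTION[kind] / filename
    if kind in SHARED:
        return WORKSPACE / SHARED[kind] / filename
    raise ValueError(f"unknown kind: {kind!r}")
```

For example, destination("query", "top.sql", connection="pg") resolves to /workspace/connections/pg/queries/top.sql, and destination("upload", "sales.csv") resolves to /workspace/data/uploads/sales.csv.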

How your AI uses it

A typical question chains a few workspace_shell calls:

1. connection list --json                                   discover what's connected
2. cat connections/<name>/{README,SYNTAX,RULES}.md          read the seeded docs
3. connection describe <name> --json                        refresh metadata if stale
4. cat > connections/<name>/queries/<file>.sql <<'SQL' …    author the query
5. connection query <name> --file ... --json                execute, materialize into DuckDB
6. connection query DUCKDB --file ...                       follow-up analysis

The full result of every connection query is materialized into DuckDB as a named relation. Follow-up filtering, aggregation, and joins query DuckDB directly, which is fast and free and does not re-hit the upstream system.
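
Steps 4 and 5 above (author the query file, then execute it) can also be driven from a script under scripts/. This is a hedged sketch: it assumes the connection CLI is on the PATH inside the workspace and uses only the flags shown in the steps. The names run_saved_query and shell_runner are illustrative, and the output is returned as raw text because the JSON shape is not described here.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# A runner takes an argv list and a working directory and returns stdout.
Runner = Callable[[Sequence[str], Path], str]


def shell_runner(argv: Sequence[str], cwd: Path) -> str:
    """Run one CLI command in `cwd`; raises if the command exits non-zero."""
    done = subprocess.run(list(argv), cwd=cwd, capture_output=True, text=True, check=True)
    return done.stdout


def run_saved_query(workspace: Path, connection: str, filename: str, sql: str,
                    run: Runner = shell_runner) -> str:
    """Save `sql` under the connection's queries/ directory, then execute it."""
    relative = Path("connections") / connection / "queries" / filename
    target = workspace / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(sql)  # step 4: author the query
    # Step 5: execute the saved file, using a workspace-relative path as in the steps.
    argv = ["connection", "query", connection, "--file", relative.as_posix(), "--json"]
    return run(argv, workspace)
```

Passing the runner as a parameter lets the function be exercised without the CLI installed. Saving the query before running it keeps the file in the tree for later sessions, in line with treating the workspace like a repo.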

Persistence

The filesystem is permanent. Compute (CPU, memory) scales down when you're idle and spins back up on demand:

  • Files, scripts, cached data, and artifacts are always there.
  • No cold starts on context — your AI picks up where it left off.
  • You don't pay for compute when you're not using it.

Accessing your workspace

  • Through your AI. Your AI reads and writes files as part of normal conversations through the workspace_shell MCP tool. This is the primary interface.
  • Through the web app. Browse files, view dashboards, upload data, download exports.
  • Through any AI client. Claude, ChatGPT, Cursor, Codex, Replit — they all connect to the same workspace. The context you build carries across every tool.

A growing knowledge base

An empty workspace is a blank slate. Over time it accumulates queries that worked, RULES.md files documenting your business logic, scripts for recurring transformations, cached results from previous analyses, and dashboards and exports under artifacts/.

This is the compounding effect. Each conversation is more productive than the last because your AI doesn't start from scratch — it has your data's history, conventions, and previous work to build on.

Your first conversation starts from zero. Your tenth picks up mid-stride.
