Workspace
Your isolated, persistent environment where AI works with your data.
Your workspace is where your AI writes queries, runs scripts, caches results, and builds artifacts. It's a dedicated container in the cloud, provisioned for you alone, with a Linux shell, DuckDB, Python, and a persistent filesystem at /workspace.
Every MarcoPolo user gets their own workspace. Fully isolated from other users. Persistent across sessions.
Treat it like a repo
The workspace is meant to be used like a checked-out repo. Your AI runs git status and git diff as part of normal work. Files accumulate, get committed, get edited the same way you'd edit any project. Two consequences worth knowing up front:
- Files persist across sessions and across clients. A query your AI wrote in Claude is there when you open the workspace from Codex tomorrow. The web app shows the same tree.
- The workspace is the surface, not the MCP. Almost all real work happens through the `connection` and `cron` CLIs inside the workspace. The four MCP tools get the AI into the workspace and into a browser-setup flow; everything else flows through the workspace itself.
Layout
    /workspace/
      README.md                    workspace overview
      RULES.md                     workspace-wide rules and conventions
      workflows/                   curated guides for recurring tasks
        README.md
        setup-connection.md
        query-and-analyze-data.md
        build-dashboard.md
        setup-automation.md
      connections/                 one subdirectory per visible connection
        <name>/
          README.md                summary + authoritative capabilities
          RULES.md                 runtime-managed working guidance
          SYNTAX.md                query syntax reference
          queries/                 saved query files
          metadata/                snapshots from `connection describe`
          profile/                 profiling snapshots
          scratch/                 temporary work
        DUCKDB/                    in-workspace analytical connection
      scripts/                     reusable programs (Python, shell, Node)
      artifacts/                   user-facing outputs
        dashboards/                .dashboard manifests + view.tsx
      data/
        uploads/                   user-provided files
        downloads/                 files fetched via `connection download`
        databases/                 database files (SQLite, etc.)
      schedules/                   cron job definitions
      .dv/                         runtime-managed state (DuckDB, etc.)

A few things to notice:
- Connections are first-class directories. Each one has a `README.md` listing its authoritative capabilities, a `SYNTAX.md` for the query dialect, a `RULES.md` for working guidance, and `queries/`, `metadata/`, `profile/`, and `scratch/` for the AI's working files.
- DUCKDB is a connection. The in-workspace analytical engine appears at `connections/DUCKDB/` and is queried exactly like any other connection: `connection query DUCKDB --file connections/DUCKDB/queries/<file>.sql --json`. Use it for joins across upstream connections, intermediate tables, and ad-hoc analysis on uploaded files.
- `workflows/` is the canonical reference. When the AI doesn't know how to approach a task, it reads from here. These are short, opinionated guides, kept in sync with the platform.
- `.dv/` is runtime-managed. Don't write to it; the runtime owns DuckDB state, schedule history, and other ephemeral data there.
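Because each connection is an ordinary directory, the layout can be inspected with plain filesystem code. The sketch below is illustrative only: `summarize_connections` is a hypothetical helper, not part of the platform, and it assumes the directory and file names shown in the layout above.

```python
from pathlib import Path

# File and directory names taken from the layout above.
SEEDED_DOCS = ("README.md", "RULES.md", "SYNTAX.md")
WORK_DIRS = ("queries", "metadata", "profile", "scratch")


def summarize_connections(root="/workspace"):
    """Map each connection directory to the seeded docs and working
    directories it actually contains. Hypothetical helper."""
    connections_dir = Path(root) / "connections"
    summary = {}
    if not connections_dir.is_dir():
        return summary  # nothing to report outside a workspace
    for entry in sorted(connections_dir.iterdir()):
        if not entry.is_dir():
            continue
        summary[entry.name] = {
            "docs": [d for d in SEEDED_DOCS if (entry / d).is_file()],
            "dirs": [d for d in WORK_DIRS if (entry / d).is_dir()],
        }
    return summary


if __name__ == "__main__":
    for name, info in summarize_connections().items():
        print(name, info)
```

The helper only reads the tree, so it never touches `.dv/` or any other runtime-managed state.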
Where things go
| What | Path |
|---|---|
| Query files | connections/<name>/queries/ |
| Metadata snapshots | connections/<name>/metadata/ (written by connection describe) |
| Reusable programs | scripts/ |
| User-facing outputs | artifacts/ |
| Dashboards | artifacts/dashboards/<name>.dashboard + view.tsx |
| Schedule definitions | schedules/ |
| User-uploaded data | data/uploads/ |
| Downloaded data | data/downloads/ (default destination of connection download) |
| Local database files | data/databases/ (SQLite, etc.) |
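The table above can be read as a lookup from kind of file to destination. A minimal sketch of that lookup follows; the kind labels (`"query"`, `"upload"`, and so on) and the `destination` function are hypothetical names chosen for illustration, while the paths come from the table.

```python
from pathlib import PurePosixPath

# Destination per kind of file, relative to the workspace root.
# "{name}" stands for a connection name. The keys are illustrative labels.
DESTINATIONS = {
    "query": "connections/{name}/queries",
    "metadata": "connections/{name}/metadata",
    "script": "scripts",
    "artifact": "artifacts",
    "dashboard": "artifacts/dashboards",
    "schedule": "schedules",
    "upload": "data/uploads",
    "download": "data/downloads",
    "database": "data/databases",
}


def destination(kind, filename, connection=None, root="/workspace"):
    """Return the path where a file of the given kind belongs."""
    template = DESTINATIONS[kind]  # raises KeyError for unknown kinds
    if "{name}" in template:
        if not connection:
            raise ValueError(f"{kind!r} files need a connection name")
        template = template.format(name=connection)
    return PurePosixPath(root) / template / filename


if __name__ == "__main__":
    print(destination("query", "daily_orders.sql", connection="DUCKDB"))
    print(destination("upload", "sales.csv"))
```

The file names `daily_orders.sql` and `sales.csv` are made up for the example.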
How your AI uses it
A typical question chains a few workspace_shell calls:
1. `connection list --json`: discover what's connected
2. `cat connections/<name>/README.md SYNTAX.md RULES.md`: read the seeded docs
3. `connection describe <name> --json`: refresh metadata if stale
4. `cat > connections/<name>/queries/<file>.sql <<'SQL' …`: author the query
5. `connection query <name> --file ... --json`: execute, materialize into DuckDB
6. `connection query DUCKDB --file ...`: follow-up analysis

The full result of every `connection query` is materialized into DuckDB as a named relation. Follow-up filtering, aggregation, and joins then query DuckDB directly, which is fast, free, and avoids re-hitting the upstream system.
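The chain above can be sketched as a list of commands. The code only builds the command lines and does not run them, since the `connection` CLI exists only inside a workspace. `plan_question` is a hypothetical helper, the connection name `sales` and the file names are invented, and expanding step 2 into three full paths is an assumption about what the shorthand means. Step 4 (authoring the query file) is a file write and is left out.

```python
import shlex


def plan_question(connection, query_file, followup_file):
    """Return the shell commands for one question as argv lists.

    Only subcommands and flags that appear in the steps above are used.
    """
    docs = f"connections/{connection}"
    return [
        ["connection", "list", "--json"],
        ["cat", f"{docs}/README.md", f"{docs}/SYNTAX.md", f"{docs}/RULES.md"],
        ["connection", "describe", connection, "--json"],
        # The query file is assumed to have been written already (step 4).
        ["connection", "query", connection, "--file", query_file, "--json"],
        # The follow-up runs against the materialized result in DuckDB.
        ["connection", "query", "DUCKDB", "--file", followup_file],
    ]


if __name__ == "__main__":
    plan = plan_question(
        "sales",
        "connections/sales/queries/orders.sql",
        "connections/DUCKDB/queries/orders_by_month.sql",
    )
    for argv in plan:
        print(shlex.join(argv))
```

Keeping each command as an argv list rather than a single string avoids quoting mistakes if a connection or file name ever contains spaces.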
Persistence
The filesystem is permanent. Compute (CPU, memory) scales down when you're idle and spins back up on demand:
- Files, scripts, cached data, and artifacts are always there.
- No cold starts on context — your AI picks up where it left off.
- You don't pay for compute when you're not using it.
Accessing your workspace
- Through your AI. Your AI reads and writes files as part of normal conversations through the `workspace_shell` MCP tool. This is the primary interface.
- Through the web app. Browse files, view dashboards, upload data, download exports.
- Through any AI client. Claude, ChatGPT, Cursor, Codex, Replit — they all connect to the same workspace. The context you build carries across every tool.
A growing knowledge base
An empty workspace is a blank slate. Over time it accumulates queries that worked, RULES.md files documenting your business logic, scripts for recurring transformations, cached results from previous analyses, and dashboards and exports under artifacts/.
This is the compounding effect. Each conversation is more productive than the last because your AI doesn't start from scratch — it has your data's history, conventions, and previous work to build on.
Your first conversation starts from zero. Your tenth picks up mid-stride.