Context
The artifacts that teach your AI to write correct queries: RULES.md, SYNTAX.md, and example queries.
MarcoPolo executes queries in your workspace and returns only summaries to the AI. A 4MB query result becomes a 1KB summary in the conversation. This keeps the context window focused on reasoning, not raw data.
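To make the size difference concrete, here is a minimal sketch of reducing a large result set to a small summary. It is not MarcoPolo's actual summarizer: the `summarize_result` function and the summary fields (`row_count`, `columns`, `sample`) are assumptions chosen for illustration.

```python
import json


def summarize_result(rows, sample_size=3):
    """Reduce a full result set to a small summary (hypothetical shape)."""
    columns = list(rows[0].keys()) if rows else []
    return {
        "row_count": len(rows),
        "columns": columns,
        "sample": rows[:sample_size],  # only a few rows reach the conversation
    }


# Synthetic rows standing in for a large query result.
rows = [{"user_id": i, "country": "DE", "revenue": i * 10} for i in range(50_000)]
summary = summarize_result(rows)

full_size = len(json.dumps(rows))
summary_size = len(json.dumps(summary))
print(f"full result: {full_size} bytes, summary: {summary_size} bytes")
```

The full rows stay in the workspace; only the small dictionary would need to be passed to the AI.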
But summaries alone aren't enough. Schemas don't tell your AI that "active users" means users with a session_start event in the last 30 days, or that the revenue column in invoices_eu is in EUR while orders.revenue is in USD. And a schema alone won't teach it the right query syntax for Salesforce SOQL vs. PostgreSQL vs. ClickHouse.
MarcoPolo uses three types of context artifacts, stored in your workspace's docs/ directory, that your AI reads before writing any query.
RULES.md: your business logic
RULES.md files document the knowledge that makes queries correct. Your AI reads them before every query so it arrives with your definitions, conventions, and gotchas already loaded.
A good RULES.md turns your AI from a generic SQL generator into an analyst who understands your data. This is the single highest-leverage thing you can do in MarcoPolo.
Global RULES.md
Lives at docs/RULES.md. Company-wide context that applies across all data sources:
- Descriptions of each data source and what lives where
- Universal metric definitions (DAU, MAU, MRR, churn, LTV)
- Naming conventions and terminology
- Cross-source relationships and canonical sources of truth
- Anything a new analyst would need on their first day
Per-data-source RULES.md
Lives at docs/{datasource}/RULES.md. Context specific to a single data source:
- Which tables to use (and which to avoid)
- How metrics are calculated in this system
- Known quirks: NULLs, data gaps, encoding issues
- Join keys between tables
- Query patterns that work well (and ones that are slow)
SYNTAX.md: how to query each system
Each data source gets a docs/{datasource}/SYNTAX.md file, auto-generated by MarcoPolo. These are query syntax guides tailored to the specific system your AI is querying.
For a Salesforce data source, SYNTAX.md covers SOQL queries, REST API passthrough, and metadata operations. For PostgreSQL, it covers SQL dialect specifics. For ClickHouse, it explains FINAL, PREWHERE, and materialized views.
Your AI reads the relevant SYNTAX.md before writing a query, so it uses the right syntax for the right system. You don't need to write these yourself. MarcoPolo generates them.
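The file layout described so far can be sketched as a small lookup. This is an illustration only: the `context_files` helper, the reading order, and the `salesforce` data source name are assumptions, and only the three paths come from the text above.

```python
import tempfile
from pathlib import Path


def context_files(workspace: Path, datasource: str) -> list[Path]:
    """Return the context files that exist for one data source.

    The order (global rules, per-source rules, syntax guide) is an assumption.
    """
    docs = workspace / "docs"
    candidates = [
        docs / "RULES.md",               # global business logic
        docs / datasource / "RULES.md",  # per-data-source rules
        docs / datasource / "SYNTAX.md", # auto-generated syntax guide
    ]
    return [p for p in candidates if p.is_file()]


# Demo against a throwaway workspace with no per-source RULES.md yet.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    (workspace / "docs" / "salesforce").mkdir(parents=True)
    (workspace / "docs" / "RULES.md").write_text("# Global rules\n")
    (workspace / "docs" / "salesforce" / "SYNTAX.md").write_text("# SOQL\n")

    found = [
        p.relative_to(workspace).as_posix()
        for p in context_files(workspace, "salesforce")
    ]

print(found)
```

Missing files are skipped rather than treated as errors, since a per-data-source RULES.md only exists once someone has written it.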
Example queries: verified patterns
The examples/ directory in your workspace contains read-only, verified working query patterns. These are pre-populated references your AI can check before writing a new query.
You: "Before querying, check if there's an example for this kind of query."
Your AI runs ls examples/ and reviews relevant patterns. This is especially useful for complex query types (Salesforce metadata retrieval, ClickHouse materialized views, parameterized queries).
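The lookup step could be approximated as a filename filter over the examples directory. The sketch below is hypothetical: the `find_examples` helper and the file names are invented for illustration, and the real contents of examples/ depend on your workspace.

```python
import tempfile
from pathlib import Path


def find_examples(examples_dir: Path, keyword: str) -> list[str]:
    """List example files whose name contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return sorted(
        p.name
        for p in examples_dir.iterdir()
        if p.is_file() and keyword in p.name.lower()
    )


# Demo with made-up file names; real example names will differ.
with tempfile.TemporaryDirectory() as tmp:
    examples = Path(tmp) / "examples"
    examples.mkdir()
    for name in (
        "clickhouse_materialized_view.sql",
        "salesforce_metadata.soql",
        "postgres_parameterized.sql",
    ):
        (examples / name).write_text("-- placeholder\n")

    matches = find_examples(examples, "ClickHouse")

print(matches)
```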
How to edit RULES.md
SYNTAX.md and examples are managed by MarcoPolo. RULES.md is yours to write and maintain.
In the web app
Global rules: From the MarcoPolo home screen, click the RULES.md button.

Per-data-source rules: Open any data source's detail view and click the RULES.md tab.

Through your AI
RULES.md files live in your workspace, so your AI can write and update them directly. After a session where your AI corrected a mistake or discovered a quirk:
You: "Update RULES.md with anything you learned that would have helped you answer more accurately."
You: "Add what you discovered about the join between Salesforce accounts and our Postgres companies table to the RULES.md."
The AI writes only what it can support from the session: corrections it made, schema details it confirmed, or logic it had to work through.
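An update like this amounts to adding a bullet under the right heading without duplicating it. The sketch below shows one way to do that on the file's text. The `add_rule` helper and the sample note are hypothetical and do not describe how MarcoPolo or your AI actually edits the file.

```python
def add_rule(text: str, heading: str, note: str) -> str:
    """Add `note` as a bullet under `heading`, creating the section if needed."""
    bullet = f"- {note}"
    lines = text.splitlines()
    if bullet in lines:
        return text  # already recorded, keep the file unchanged

    if heading not in lines:
        # No such section yet: append a new one at the end of the file.
        prefix = text.rstrip("\n")
        separator = "\n\n" if prefix else ""
        return f"{prefix}{separator}{heading}\n{bullet}\n"

    # Find where the section ends: the next "## " heading or end of file.
    start = lines.index(heading)
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("## "):
            end = i
            break
    # Step back over blank lines so the bullet joins the existing list.
    while end > start + 1 and lines[end - 1].strip() == "":
        end -= 1
    lines.insert(end, bullet)
    return "\n".join(lines) + "\n"


rules = "## Known Issues\n- `orders.amount` is in cents, not dollars.\n"
# Hypothetical finding from a session, used only as sample input.
note = "`companies.crm_id` can be NULL; filter those rows before joining."
updated = add_rule(rules, "## Joining Across Data Sources", note)
print(updated)
```

Calling `add_rule` again with the same note returns the text unchanged, so repeated sessions do not pile up duplicate bullets.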
What to write in RULES.md
Start with things that have caused wrong results. Here are the highest-value patterns:
Metric definitions
## Key Metrics
- **MAU**: Users with at least one `session_start` event in the last 30 calendar days.
Do NOT use `users.last_active_at`. It is not reliably updated.
- **MRR**: Sum of `subscriptions.monthly_amount` where `status = 'active'`
and `plan_type != 'trial'`. Exclude one-time charges.
Canonical tables
## Where to find things
- User identity: `core.users`. Do not use `legacy.user_profiles`.
- Revenue: `billing.subscriptions` for recurring, `billing.charges` for one-time.
- Events: `analytics.events` (partitioned by `event_date`).
Gotchas
## Known Issues
- `orders.amount` is in cents, not dollars. Divide by 100.
- The `region` column is NULL for all records before 2022-01-01.
- `users_v2` and `users` are separate tables with different data.
Always use `users_v2` for anything after the 2023 migration.
Cross-source joins
## Joining Across Data Sources
- Salesforce → Postgres: `salesforce.accounts.external_id = postgres.companies.crm_id`
- HubSpot → Postgres: `hubspot.contacts.email = postgres.users.email`
Why this matters
Every data tool requires human judgment to use correctly. A SQL client doesn't stop you from querying a deprecated table. A BI tool doesn't know that two columns with the same name mean different things.
These context artifacts transfer that judgment to your AI. RULES.md encodes your business logic. SYNTAX.md teaches the right query language. Examples show proven patterns. Together, they turn a generic AI into one that understands your data.
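As a small illustration of the difference a rule makes, the sketch below compares a naive revenue sum with a query that follows the MRR definition from the Metric definitions example. It runs against an in-memory SQLite table. The rows and the `pro` and `basic` plan names are made up, and only the column names and filter conditions come from that definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriptions ("
    "id INTEGER PRIMARY KEY, monthly_amount REAL, status TEXT, plan_type TEXT)"
)
# Made-up sample data: one trial and one canceled subscription.
conn.executemany(
    "INSERT INTO subscriptions (monthly_amount, status, plan_type) VALUES (?, ?, ?)",
    [
        (100.0, "active", "pro"),
        (50.0, "active", "trial"),
        (200.0, "canceled", "pro"),
        (30.0, "active", "basic"),
    ],
)

# What a schema-only guess might produce: every row counted.
naive = conn.execute("SELECT SUM(monthly_amount) FROM subscriptions").fetchone()[0]

# What the MRR rule asks for: active subscriptions, trials excluded.
mrr = conn.execute(
    "SELECT SUM(monthly_amount) FROM subscriptions "
    "WHERE status = 'active' AND plan_type != 'trial'"
).fetchone()[0]

print(f"naive sum: {naive}, MRR per RULES.md: {mrr}")
conn.close()
```

Both queries are valid SQL and both run without errors. Only the written definition tells the AI which one answers the question.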
The 15 minutes you spend writing RULES.md will save hours of corrections.