Databases

Working with PostgreSQL, MySQL, ClickHouse, MongoDB, and Microsoft Fabric in MarcoPolo.

MarcoPolo connects to relational and document databases including PostgreSQL, MySQL, ClickHouse, MongoDB, and Microsoft Fabric SQL Analytics.

Connecting

Ask your AI to generate a connection link, or go directly to mcp.marcopolo.dev/app/connections. You'll need your host and port, database name, and credentials (stored encrypted, never exposed to the AI).

If your database sits behind a firewall, allowlist MarcoPolo's IP (34.208.3.240/32) before connecting.
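To sanity-check a firewall configuration before connecting, a small script along these lines can confirm that at least one ingress rule covers the address above. This is a minimal sketch using Python's standard `ipaddress` module. The function name and the sample rules are illustrative assumptions, not part of MarcoPolo.

```python
import ipaddress

# The address published on this page; everything else here is illustrative.
MARCOPOLO_EGRESS = ipaddress.ip_network("34.208.3.240/32")


def allowlist_covers(rules: list[str], target=MARCOPOLO_EGRESS) -> bool:
    """Return True if any CIDR rule in `rules` contains the whole target network."""
    for rule in rules:
        network = ipaddress.ip_network(rule, strict=False)
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so compare versions first.
        if network.version == target.version and target.subnet_of(network):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical ingress rules; replace with the CIDRs from your own firewall.
    rules = ["10.0.0.0/8", "34.208.3.240/32"]
    print(allowlist_covers(rules))
```

A broader rule that happens to contain the address also passes, while a neighbouring /32 does not.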

How querying works

MarcoPolo seeds each connection with a SYNTAX.md so your AI writes SQL tailored to the specific system. The full workflow:

1. Read connections/<name>/RULES.md and connections/<name>/SYNTAX.md
2. Refresh metadata with `connection describe <name> ...`
3. Write the SQL and save it to a file under connections/<name>/queries/, for example top_customers.sql
4. Execute it with `connection query <name> --file ... --json`
5. The full result is materialized into DuckDB for follow-up analysis
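The file layout behind steps 1 and 3 can be pictured with a short sketch. Your AI normally manages these files itself, so this only illustrates where things live. The helper names, the `warehouse` connection name, and the temporary root directory are assumptions made for the example.

```python
from pathlib import Path
import tempfile


def connection_dir(root: Path, name: str) -> Path:
    """Directory holding one connection's context files and saved queries."""
    return root / "connections" / name


def context_files(root: Path, name: str) -> list[Path]:
    """The two files read in step 1 of the workflow."""
    base = connection_dir(root, name)
    return [base / "RULES.md", base / "SYNTAX.md"]


def save_query(root: Path, name: str, filename: str, sql: str) -> Path:
    """Write a query under connections/<name>/queries/ as in step 3."""
    queries = connection_dir(root, name) / "queries"
    queries.mkdir(parents=True, exist_ok=True)
    path = queries / filename
    path.write_text(sql.strip() + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # "warehouse" is a hypothetical connection name.
        saved = save_query(root, "warehouse", "top_customers.sql", "SELECT 1;")
        print(saved.relative_to(root).as_posix())
```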

Cached results in DuckDB mean your AI can iterate (filtering, aggregating, or correlating with other connections) without re-hitting your database.
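The idea of iterating on a cached result can be shown with a small local example. To stay dependency-free, this sketch uses Python's built-in `sqlite3` as a stand-in for DuckDB. The table name, columns, and rows are hypothetical sample data, not output from a real connection.

```python
import sqlite3

# Hypothetical rows standing in for a query result that was materialized once.
ROWS = [
    ("acme", "US", 1200.0),
    ("globex", "US", 800.0),
    ("initech", "DE", 450.0),
    ("umbrella", "DE", 950.0),
]


def load_cache(rows) -> sqlite3.Connection:
    """Load the result into a local in-memory table (sqlite3 stands in for DuckDB)."""
    cache = sqlite3.connect(":memory:")
    cache.execute(
        "CREATE TABLE top_customers (customer TEXT, country TEXT, revenue REAL)"
    )
    cache.executemany("INSERT INTO top_customers VALUES (?, ?, ?)", rows)
    return cache


def revenue_by_country(cache):
    """Follow-up aggregation that runs only against the local cache."""
    query = (
        "SELECT country, SUM(revenue) FROM top_customers "
        "GROUP BY country ORDER BY country"
    )
    return cache.execute(query).fetchall()


def customers_above(cache, threshold):
    """Follow-up filter that also never touches the source database."""
    query = (
        "SELECT customer FROM top_customers "
        "WHERE revenue > ? ORDER BY revenue DESC"
    )
    return [row[0] for row in cache.execute(query, (threshold,))]


if __name__ == "__main__":
    cache = load_cache(ROWS)
    print(revenue_by_country(cache))
    print(customers_above(cache, 900))
```

Both follow-up questions are answered from the local table, so the source database is queried only once, when the result is first loaded.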

Platform notes

PostgreSQL

The most commonly used database in MarcoPolo. MarcoPolo handles PostgreSQL-specific syntax (CTEs, window functions, ILIKE, array operations) natively. Works for transactional data, application databases, and analytics. If you're running Aurora or RDS, connect with the standard PostgreSQL endpoint.

MySQL

Full MySQL dialect support. Works with Aurora MySQL, RDS, and PlanetScale. Connect with the standard MySQL endpoint.

ClickHouse

Popular for event data and high-volume analytics. Handles ClickHouse-specific syntax like FINAL, PREWHERE, and materialized view queries. Teams use it for everything from session analysis to order flow tracking.

MongoDB

Queries use the MongoDB query language. Schema inspection surfaces collections, document structure, and field types. Works well for exploratory analysis of document-oriented data.

Microsoft Fabric SQL Analytics

Connect using the SQL endpoint for your Fabric workspace. Teams use this for manufacturing analytics, quality reporting, and warehouse-style queries against Fabric data.

Best practices

Write connections/<name>/RULES.md for your database. Document which tables to use, which to avoid, known quirks (NULL handling, deprecated columns), and metric definitions. This is the difference between your AI guessing and your AI knowing. See Context (RULES.md).
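One way to get started is to generate a skeleton with a section for each topic listed above and then fill it in by hand. The sketch below is only a suggestion: MarcoPolo does not prescribe these headings here, and the table names, quirks, and metric in the example are hypothetical.

```python
def render_rules(use, avoid, quirks, metrics) -> str:
    """Render a RULES.md skeleton with one section per topic."""
    sections = [
        ("Tables to use", use),
        ("Tables to avoid", avoid),
        ("Known quirks", quirks),
        ("Metric definitions", metrics),
    ]
    lines = ["# RULES"]
    for title, items in sections:
        lines.append("")
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    text = render_rules(
        use=["orders", "customers"],
        avoid=["orders_legacy"],
        quirks=["customers.region is NULL for accounts created before migration"],
        metrics=["active customer: at least one order in the last 30 days"],
    )
    print(text)
```

Save the output as connections/<name>/RULES.md and refine it as you learn more about the data.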

Start with schema exploration. Before diving into analysis, ask your AI to summarize the schema and generate an ER diagram. Five minutes of orientation saves hours of confusion.

Let results accumulate in DuckDB. Don't re-query when you can iterate on cached results. Ask follow-up questions that build on what's already loaded.
