Data Sources

Databases

Working with PostgreSQL, MySQL, ClickHouse, MongoDB, and Microsoft Fabric in MarcoPolo.

MarcoPolo connects to relational and document databases including PostgreSQL, MySQL, ClickHouse, MongoDB, and Microsoft Fabric SQL Analytics.

Connecting

Ask your AI to generate a connection link, or go directly to mcp.marcopolo.dev/app/datasources. You'll need your host and port, database name, and credentials (stored encrypted, never exposed to the AI).

Make sure MarcoPolo's IP (34.208.3.240/32) is allowlisted if your database has a firewall.

How querying works

MarcoPolo provides syntax guides for each database type, so your AI writes queries tailored to your specific system. The full workflow:

1. Read RULES.md for this data source
2. Inspect schema with get_schema
3. Write SQL → saved to workspace (e.g. queries/postgres_prod/top_customers.sql)
4. Execute via query
5. Results load into DuckDB for follow-up analysis
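As a sketch of step 3, the saved file might look like this (the table and column names here are invented for illustration, not a real schema):

```sql
-- queries/postgres_prod/top_customers.sql
-- Illustrative only: "orders", "customer_id", and "total" are assumed names.
SELECT customer_id, SUM(total) AS lifetime_value
FROM orders
GROUP BY customer_id
ORDER BY lifetime_value DESC
LIMIT 10;
```

Saving queries to the workspace keeps a reviewable record of what was run against your database.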

Because results are cached in DuckDB, your AI can iterate (filtering, aggregating, or correlating with other sources) without hitting your database again.
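For example, a follow-up question can run against the cached result rather than the source database. This assumes the previous result landed in DuckDB as a table named top_customers; the actual table name depends on your workspace:

```sql
-- Hypothetical: queries the cached DuckDB copy, not PostgreSQL.
SELECT COUNT(*) AS big_accounts
FROM top_customers
WHERE lifetime_value > 10000;
```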

Platform notes

PostgreSQL

The most common database in MarcoPolo. Handles PostgreSQL-specific syntax (CTEs, window functions, ILIKE, array operations) natively. Works for transactional data, application databases, and analytics. If you're running Aurora or RDS, connect with the standard PostgreSQL endpoint.
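A sketch of the kind of PostgreSQL-specific SQL this enables, combining a CTE, ILIKE, and a window function (the invoices table and its columns are hypothetical):

```sql
-- Month-over-month revenue change per plan; schema is invented for illustration.
WITH monthly AS (
  SELECT date_trunc('month', created_at) AS month,
         plan,
         SUM(amount) AS revenue
  FROM invoices
  WHERE plan ILIKE '%pro%'
  GROUP BY 1, 2
)
SELECT month, plan, revenue,
       revenue - LAG(revenue) OVER (PARTITION BY plan ORDER BY month) AS delta
FROM monthly
ORDER BY plan, month;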

MySQL

Full MySQL dialect support. Works with Aurora MySQL, RDS, and PlanetScale. Connect with the standard MySQL endpoint.

ClickHouse

Popular for event data and high-volume analytics. Handles ClickHouse-specific syntax like FINAL, PREWHERE, and materialized view queries. Teams use it for everything from session analysis to order flow tracking.
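A sketch of the ClickHouse-specific syntax mentioned above, against a hypothetical events table. PREWHERE filters rows before the remaining columns are read, and FINAL collapses duplicates from a ReplacingMergeTree at query time:

```sql
-- Hypothetical events table; columns are assumed for illustration.
SELECT session_id, count() AS events
FROM events FINAL
PREWHERE event_date >= today() - 7
WHERE event_type = 'click'
GROUP BY session_id
ORDER BY events DESC
LIMIT 20;
```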

MongoDB

MarcoPolo queries MongoDB using the MongoDB query language rather than SQL. Schema inspection surfaces collections, document structure, and field types. Works well for exploratory analysis of document-oriented data.
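A sketch of what such a query looks like in standard MongoDB query language, against a hypothetical orders collection:

```javascript
// Hypothetical collection and fields; find shipped orders over 100,
// projecting only customerId and total, sorted by total.
db.orders.find(
  { status: "shipped", total: { $gt: 100 } },
  { _id: 0, customerId: 1, total: 1 }
).sort({ total: -1 }).limit(10)
```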

Microsoft Fabric SQL Analytics

Connect using the SQL endpoint for your Fabric workspace. Teams use this for manufacturing analytics, quality reporting, and warehouse-style queries against Fabric data.
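A warehouse-style query through the Fabric SQL endpoint uses standard T-SQL. The tables below are invented for illustration:

```sql
-- Hypothetical star-schema tables; T-SQL as accepted by the Fabric SQL endpoint.
SELECT TOP 10 p.product_name, SUM(f.defect_count) AS defects
FROM fact_inspections f
JOIN dim_products p ON p.product_key = f.product_key
GROUP BY p.product_name
ORDER BY defects DESC;
```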

Best practices

Write RULES.md for your database. Document which tables to use, which to avoid, known quirks (NULL handling, deprecated columns), and metric definitions. This is the difference between your AI guessing and your AI knowing. See Context (RULES.md).
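A minimal sketch of what a RULES.md might contain; every table name and rule below is invented, so substitute your own:

```markdown
# RULES.md (example; adapt to your schema)
- Use `orders_v2` for order data; `orders` is deprecated and stops at 2023-06.
- `revenue` is net of refunds. Gross revenue = `revenue + refund_amount`.
- Active users: `users.deleted_at IS NULL`.
- "Monthly active customer" means at least one order in the calendar month.
```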

Start with schema exploration. Before diving into analysis, ask your AI to summarize the schema and generate an ER diagram. Five minutes of orientation saves hours of confusion.

Let results accumulate in DuckDB. Don't re-query when you can iterate on cached results. Ask follow-up questions that build on what's already loaded.
