
docs: update SQL-to-FalkorDB migration guide for latest feature set #402

Merged
shahar-biron merged 1 commit into main from docs/pricing-updates-max-graph-dataset
Mar 29, 2026

Conversation


shahar-biron (Contributor) commented Mar 26, 2026

Summary

  • refresh operations/migration/sql-to-falkordb.md to match current DM-SQL-to-FalkorDB capabilities
  • add BigQuery and Spark SQL (Livy) source coverage
  • align connector capability notes (daemon/purge/scaffold support)
  • update control-plane section with current API endpoints and persisted state/log/metrics behavior
  • refresh metrics defaults and operational guidance

Validation

  • reviewed content against current loader manifests, README files, and CLI flags in source

Artifacts

Co-Authored-By: Oz <oz-agent@warp.dev>

Summary by CodeRabbit

  • Documentation
    • Added support for BigQuery and Spark SQL as migration sources.
    • Introduced schema introspection and template scaffolding CLI commands (--introspect-schema, --generate-template).
    • Expanded control plane capabilities including schema preview, graph topology visualization, state inspection, and persisted logs.
    • Enhanced metrics and observability features for all supported data sources.
    • Added loader-specific run mode guidance and operational best practices.
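The only details below taken from this PR are the two flags themselves (`--introspect-schema`, `--generate-template`); the binary name, `--dsn` flag, and output file are hypothetical stand-ins for whatever the actual loader CLI uses. A sketch of assembling such a scaffolding invocation:

```python
# Sketch: assemble a loader command that introspects the source schema and
# scaffolds a mapping template. Binary name and --dsn are assumptions; only
# --introspect-schema and --generate-template come from the PR description.
def build_scaffold_cmd(binary: str, dsn: str, out_path: str) -> list[str]:
    return [
        binary,
        "--introspect-schema",            # dump the source schema (per the PR)
        "--generate-template", out_path,  # scaffold a mapping template file
        "--dsn", dsn,                     # hypothetical connection flag
    ]

cmd = build_scaffold_cmd("mysql-to-falkordb", "mysql://user@host/db", "mapping.yaml")
print(" ".join(cmd))
```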


coderabbitai bot commented Mar 26, 2026

📝 Walkthrough

Walkthrough

Updated SQL-to-FalkorDB migration guide to add BigQuery and Spark SQL support, introduce schema introspection and template scaffolding capabilities, expand control-plane features, and refine loader configuration documentation with execution modes and state backend details.

Changes

SQL-to-FalkorDB Migration Documentation (operations/migration/sql-to-falkordb.md):
Added BigQuery and Spark SQL as supported sources with new CLI sections and credentials. Introduced schema introspection (--introspect-schema) and template scaffolding (--generate-template) capabilities. Refined loader configuration with per-mapping execution modes (full/incremental) and state backend framing. Expanded control-plane features to include schema preview, template generation, topology visualization, and state inspection. Updated metrics section defaults and port override guidance. Added loader-specific operational notes and durability clarifications.
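The per-mapping execution modes mentioned in the walkthrough could take a shape like the following; the key names are illustrative only (the PR documents the full/incremental distinction, not this exact schema):

```python
# Illustrative shape of per-mapping loader configuration with full and
# incremental run modes. Field names ("table", "cursor_column") are
# assumptions; only the full/incremental split comes from the PR.
MAPPINGS = [
    {"table": "customers", "mode": "full"},
    {"table": "orders", "mode": "incremental", "cursor_column": "updated_at"},
]

def validate(mapping: dict) -> None:
    assert mapping["mode"] in ("full", "incremental")
    if mapping["mode"] == "incremental":
        # incremental runs need a cursor to persist progress between runs
        assert "cursor_column" in mapping, "incremental mapping needs a cursor"

for m in MAPPINGS:
    validate(m)
```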

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • FalkorDB/docs#384 — Modifies the same SQL-to-FalkorDB migration guide with overlapping updates to supported sources and loader documentation.
  • FalkorDB/docs#381 — Earlier version extending the same migration document; this PR builds on foundational SQL migration content with additional BigQuery/Spark support and control-plane features.

Suggested reviewers

  • gkorland
  • barakb

Poem

🐰 Hopping through SQL realms so wide,
BigQuery and Spark now by our side,
Schemas introspected, templates in hand,
FalkorDB's migration—wonderfully planned!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title clearly and accurately summarizes the main change: updating the SQL-to-FalkorDB migration documentation to reflect the latest feature set, matching the substantial documentation updates described in the changeset.
  • Docstring Coverage (✅ Passed): No functions found in the changed files to evaluate docstring coverage; check skipped.



coderabbitai bot (Contributor) left a comment


🧹 Nitpick comments (2)
operations/migration/sql-to-falkordb.md (2)

316-327: Consider explicitly calling out default-port collisions.

Since defaults are reused by multiple loaders, add a brief sentence that concurrent mixed-loader runs will often require explicit per-loader port overrides.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@operations/migration/sql-to-falkordb.md` around lines 316 - 327, The docs
list default metrics ports for each loader but don't warn that several defaults
duplicate across loaders, causing port collisions when running mixed loaders
concurrently; update the paragraph after the port list to explicitly call out
that defaults are reused and that concurrent mixed-loader runs will usually
require per-loader overrides via the CLI flag `--metrics-port` or each loader's
`*_TO_FALKORDB_METRICS_PORT` environment variable so users know to set unique
ports for each running loader.

217-217: Normalize source naming for consistency.

This line says Spark, while other sections consistently say Spark SQL (via Livy sessions) / Spark SQL (Livy). Consider using one canonical label throughout to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@operations/migration/sql-to-falkordb.md` at line 217, Replace the
inconsistent "Spark" label in the supported list string "Supported by: BigQuery,
ClickHouse, Databricks, MariaDB, MySQL, PostgreSQL, Snowflake, Spark, SQL
Server." with the canonical form used elsewhere (e.g., "Spark SQL (via Livy
sessions)" or "Spark SQL (Livy)") so the document matches other sections; update
that exact phrase in operations/migration/sql-to-falkordb.md to use the chosen
canonical label.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fe7bd30f-4387-4ae3-87c3-fa387ff08612

📥 Commits

Reviewing files that changed from the base of the PR and between df21552 and ff5d39d.

📒 Files selected for processing (1)
  • operations/migration/sql-to-falkordb.md


Copilot AI left a comment


Pull request overview

Updates the SQL-to-FalkorDB online migration documentation to reflect the current DM-SQL-to-FalkorDB loader/control-plane feature set, including new sources and CLI/control-plane capabilities.

Changes:

  • Added BigQuery and Spark SQL (Livy) as documented migration sources, including CLI run examples and capability notes.
  • Documented schema introspection and template scaffolding flags (--introspect-schema, --generate-template) and clarified purge-mode support.
  • Refreshed control-plane feature list, API endpoints, and metrics/operational guidance.


- Snowflake: `9992`
- Spark: `9997`
- SQL Server: `9996`


Copilot AI Mar 27, 2026


The default metrics port list includes duplicates (BigQuery/MySQL both 9995; MariaDB/Spark both 9997). If two loaders are started with defaults on the same host, the metrics HTTP server will fail to bind or one loader’s metrics will be unavailable. Consider assigning unique defaults (if the loaders support it) or explicitly calling out these collisions in the doc and recommending setting --metrics-port when running those pairs together.

Suggested change
Note: BigQuery and MySQL share the same default metrics port (`9995`), and MariaDB and Spark share `9997`. When running any of these pairs concurrently on the same host, configure distinct metrics ports for each loader.
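The collision this comment describes can be illustrated with the documented defaults. The resolution strategy here (bump to the next free port, then pass it via `--metrics-port`) is one possible approach for a launch script, not the loaders' built-in behavior:

```python
# Default metrics ports per the doc: BigQuery/MySQL share 9995 and
# MariaDB/Spark share 9997, so concurrent mixed-loader runs on one host
# collide unless overridden (e.g. with --metrics-port).
DEFAULT_METRICS_PORTS = {
    "bigquery": 9995, "mysql": 9995,
    "mariadb": 9997, "spark": 9997,
    "snowflake": 9992, "sqlserver": 9996,
}

def assign_ports(loaders: list[str]) -> dict[str, int]:
    """Give each loader in this run a unique port, bumping past taken ones."""
    taken: set[int] = set()
    assigned: dict[str, int] = {}
    for name in loaders:
        port = DEFAULT_METRICS_PORTS[name]
        while port in taken:
            port += 1  # this value would be passed as a --metrics-port override
        taken.add(port)
        assigned[name] = port
    return assigned

print(assign_ports(["bigquery", "mysql"]))  # → {'bigquery': 9995, 'mysql': 9996}
```

Note the sketch only deduplicates among the loaders it is given; a real script would also avoid defaults of loaders started elsewhere on the host.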

- The Metrics UI uses persisted snapshots and does not expose internal scrape endpoint/port settings.
- Runs execute locally on the host machine running the control plane server.
- Runtime artifacts are persisted under `CONTROL_PLANE_DATA_DIR`, including a SQLite DB and per-run files.
- SSE auth with API key may use query-string token fallback because browser `EventSource` does not support custom auth headers.
Copy link

Copilot AI Mar 27, 2026


This note recommends using an API key via query-string for SSE auth. Query-string tokens are easy to leak via browser history, reverse-proxy/access logs, and the Referer header on subsequent requests. Please add a short warning and mitigation guidance (e.g., prefer HTTPS, short-lived tokens, disable/refine access logging, or use a cookie-based/session approach / SSE polyfill when possible).

Suggested change
- SSE auth with API key may use query-string token fallback because browser `EventSource` does not support custom auth headers.
- SSE auth with API key may use query-string token fallback because browser `EventSource` does not support custom auth headers. Be aware that query-string tokens can leak via browser history, logs, and `Referer` headers; mitigate by using HTTPS, short-lived tokens, minimized/filtered access logging, or a cookie/session-based approach or SSE polyfill where possible.
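The query-string fallback discussed in this note can be sketched as follows. The endpoint path and the `token` parameter name are assumptions; only the limitation that browser `EventSource` cannot set custom auth headers comes from the doc:

```python
# Sketch: build an SSE URL carrying the API key as a query parameter,
# since browser EventSource cannot send an Authorization header.
from urllib.parse import urlencode, urlparse, parse_qs

def sse_url(base: str, api_key: str) -> str:
    # CAUTION (per the review comment): tokens in URLs can leak via browser
    # history, access logs, and Referer headers; prefer HTTPS and
    # short-lived tokens when using this fallback.
    return f"{base}?{urlencode({'token': api_key})}"

url = sse_url("https://cp.example.com/api/runs/42/logs/stream", "s3cret")
assert parse_qs(urlparse(url).query)["token"] == ["s3cret"]
```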

shahar-biron merged commit 43125d0 into main Mar 29, 2026
8 checks passed
shahar-biron deleted the docs/pricing-updates-max-graph-dataset branch March 29, 2026 05:33
