diff --git a/local-antora-playbook.yml b/local-antora-playbook.yml
index fe2fe8c63b..984ae65331 100644
--- a/local-antora-playbook.yml
+++ b/local-antora-playbook.yml
@@ -18,7 +18,7 @@ content:
   - url: https://github.com/redpanda-data/docs
     branches: [v/*, shared, site-search,'!v-end-of-life/*']
   - url: https://github.com/redpanda-data/cloud-docs
-    branches: main
+    branches: 'DOC-2128-migrate-iceberg-catalog'
   - url: https://github.com/redpanda-data/redpanda-labs
     branches: main
     start_paths: [docs,'*/docs']
diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc
index ee8e828c6c..50fdc21b86 100644
--- a/modules/ROOT/nav.adoc
+++ b/modules/ROOT/nav.adoc
@@ -193,6 +193,7 @@
 *** xref:manage:mountable-topics.adoc[]
 ** xref:manage:iceberg/index.adoc[Iceberg]
 *** xref:manage:iceberg/about-iceberg-topics.adoc[About Iceberg Topics]
+*** xref:manage:iceberg/migrate-to-iceberg-topics.adoc[Migrate to Iceberg Topics]
 *** xref:manage:iceberg/specify-iceberg-schema.adoc[Specify Iceberg Schema]
 *** xref:manage:iceberg/use-iceberg-catalogs.adoc[Use Iceberg Catalogs]
 *** xref:manage:iceberg/rest-catalog/index.adoc[Integrate with REST Catalogs]
@@ -201,7 +202,7 @@
 **** xref:manage:iceberg/iceberg-topics-gcp-biglake.adoc[GCP BigLake]
 **** xref:manage:iceberg/redpanda-topics-iceberg-snowflake-catalog.adoc[Snowflake and Open Catalog]
 *** xref:manage:iceberg/query-iceberg-topics.adoc[Query Iceberg Topics]
-*** xref:manage:iceberg/migrate-to-iceberg-topics.adoc[Migrate to Iceberg Topics]
+*** xref:manage:iceberg/migrate-iceberg-catalog.adoc[Migrate Iceberg Catalogs]
 *** xref:manage:iceberg/iceberg-performance-tuning.adoc[Tune Iceberg Performance]
 *** xref:manage:iceberg/iceberg-troubleshooting.adoc[Troubleshoot Iceberg Topics]
 ** xref:manage:schema-reg/index.adoc[Schema Registry]
diff --git a/modules/manage/pages/iceberg/migrate-iceberg-catalog.adoc b/modules/manage/pages/iceberg/migrate-iceberg-catalog.adoc
new file mode 100644
index 0000000000..0d0f687678
--- /dev/null
+++ b/modules/manage/pages/iceberg/migrate-iceberg-catalog.adoc
@@ -0,0 +1,126 @@
+= Migrate Iceberg Catalogs
+:description: Switch the Iceberg catalog backend for an existing Redpanda cluster without losing untranslated topic data.
+
+// tag::single-source[]
+:page-topic-type: how-to
+:page-categories: Iceberg, Migration
+:personas: ops_admin, streaming_developer
+:learning-objective-1: Verify that a target Iceberg catalog supports your existing schemas and partition specs
+:learning-objective-2: Pause Iceberg translation and drain pending commits without losing untranslated data
+:learning-objective-3: Apply new catalog configuration and resume translation safely
+
+Switch your cluster from one Iceberg catalog backend to another without losing untranslated topic data. Use this procedure when moving from the filesystem-based `object_storage` catalog to a managed REST catalog, or when changing between REST catalogs.
+
+The procedure pauses Iceberg translation per topic, lets pending commits drain to the old catalog, applies the new catalog configuration, and restarts the cluster. While translation is paused, retention is temporarily set to infinite to prevent untranslated data from being deleted.
+
+After reading this page, you will be able to:
+
+* [ ] {learning-objective-1}
+* [ ] {learning-objective-2}
+* [ ] {learning-objective-3}
+
+[IMPORTANT]
+====
+Do not change config_ref:iceberg_catalog_type,true,properties/cluster-properties[`iceberg_catalog_type`] or any other catalog cluster property in place without following this procedure. In-flight commits and untranslated data can be lost or stuck if the catalog changes mid-translation.
+====
+
+== Prerequisites
+
+* Iceberg topics enabled and running on your Redpanda cluster.
+* Network connectivity from all brokers to the new catalog endpoint.
+* Credentials configured for the new catalog (REST endpoint, authentication mode, secret or token). For configuration guidance for each catalog type, see xref:manage:iceberg/use-iceberg-catalogs.adoc[].
+* The new catalog must support the current schema and partition spec of every Iceberg topic. See <>.
+
+== Verify catalog compatibility
+
+Before starting the migration, verify that the new catalog can host every Iceberg topic's table with its existing schema and partition spec. Mismatches discovered after migration cause already-translated Parquet files to fail to commit, blocking translation in a state that is difficult to recover from.
+
+The simplest validation is to manually create a test table in the new catalog with the same schema and partition spec as one of your Iceberg topics. If the create call fails, fix the partition spec or schema before migrating. Delete the test table after validation.
+
+[CAUTION]
+====
+AWS Glue does not support partitioning on a nested field, which is Redpanda's default partition spec for Iceberg topics. If you migrate to AWS Glue, you must change the partition spec to a Glue-compatible form before starting the migration procedure.
+====
+
+== Run the migration
+
+. Save the current `retention.ms` and `retention.bytes` values for every Iceberg topic, then set both to `-1` (infinite retention):
++
+[,bash]
+----
+rpk topic alter-config --set retention.ms=-1 --set retention.bytes=-1
+----
++
+While Iceberg translation is paused in the next step, the topic's retention anchor on the log is released. Without infinite retention, the cluster could delete untranslated data before the migration completes.
+
+. Pause Iceberg translation on every Iceberg topic by setting `redpanda.iceberg.mode` to `disabled`. Save each topic's previous mode value so you can restore it later.
++
+[,bash]
+----
+rpk topic alter-config --set redpanda.iceberg.mode=disabled
+----
++
+Setting the mode to `disabled` stops new translation while letting already-translated data finish committing to the old catalog. For more about Iceberg modes, see xref:manage:iceberg/specify-iceberg-schema.adoc[].
++
+NOTE: Do not change config_ref:iceberg_enabled,true,properties/cluster-properties[`iceberg_enabled`] at the cluster level. The Iceberg integration must remain enabled at the cluster level so that pending commits can drain to the old catalog.
+
+. Wait for pending commits to drain. Monitor the `redpanda_iceberg_pending_commit_lag` metric until it reaches `0` for every Iceberg topic-partition.
++
+This metric reports the number of offsets pending a commit to the Iceberg catalog. While it is non-zero, Redpanda is still flushing translated data to the old catalog. The translation lag metric, `redpanda_iceberg_pending_translation_lag`, can remain non-zero. That value reflects new records the cluster has not yet translated, which is expected while translation is paused.
++
+[TIP]
+====
+If you scrape Prometheus, the following expression returns `0` only when every Iceberg-topic partition has fully drained:
+
+[,promql]
+----
+sum(redpanda_iceberg_pending_commit_lag)
+----
+====
+
+. Apply the new catalog configuration. For example, to switch from `object_storage` to a REST catalog, update the catalog cluster properties:
++
+[,bash]
+----
+rpk cluster config set iceberg_catalog_type rest
+rpk cluster config set iceberg_rest_catalog_endpoint
+rpk cluster config set iceberg_rest_catalog_authentication_mode oauth2
+# Set additional credential properties for your chosen authentication mode.
+----
++
+For full guidance on setting catalog cluster properties, see xref:manage:iceberg/use-iceberg-catalogs.adoc#rest[Connect to a REST catalog] and the individual xref:manage:iceberg/rest-catalog/index.adoc[REST catalog integration pages].
+
+. Restart Redpanda. The catalog cluster properties require a restart to take effect.
++
+ifndef::env-cloud[]
+For instructions, see xref:manage:cluster-maintenance/rolling-restart.adoc[].
+endif::[]
+ifdef::env-cloud[]
+Coordinate the restart with Redpanda Support. The restart must occur after `redpanda_iceberg_pending_commit_lag` has reached `0` (step 3) and before you resume translation in the next step.
+endif::[]
+
+. After the cluster comes up, check broker logs for successful catalog requests and the absence of authentication errors to verify the new catalog connection.
+
+. Resume Iceberg translation by restoring `redpanda.iceberg.mode` on every Iceberg topic to its previous value:
++
+[,bash]
+----
+rpk topic alter-config --set redpanda.iceberg.mode=
+----
+
+. Restore `retention.ms` and `retention.bytes` on every Iceberg topic to the values you saved in step 1.
+
+== Verify the migration
+
+After the migration completes, confirm that new data is reaching the new catalog:
+
+* Query an Iceberg table in your query engine using the new catalog and confirm that row counts continue to increase as your topic produces new records.
+* Check broker logs for any commit failures referencing the new catalog. Repeated failures often indicate a schema or partition spec mismatch. See <> for details.
+
+== Troubleshooting
+
+* Pending commits stuck after restart: A schema or partition spec mismatch between the original tables and the new catalog is the most common cause. See <>. If you cannot resolve the mismatch, contact https://support.redpanda.com/hc/en-us/requests/new[Redpanda Support^].
+* Authentication errors against the new REST catalog: Verify that the credential cluster properties (for example, `iceberg_rest_catalog_client_id`, `iceberg_rest_catalog_client_secret`, `iceberg_rest_catalog_token`) match what the new catalog expects. For OAuth, also check `iceberg_rest_catalog_oauth2_server_uri`.
+* Translation does not resume after restoring `redpanda.iceberg.mode`: Check that `redpanda_iceberg_pending_translation_lag` is increasing as new records are produced. If it remains `0`, the cluster is not translating new records. Verify that your producer is still writing to the topic and that the topic's mode value is one of `key_value`, `value_schema_id_prefix`, or `value_schema_latest`.
+
+// end::single-source[]
diff --git a/modules/manage/pages/iceberg/use-iceberg-catalogs.adoc b/modules/manage/pages/iceberg/use-iceberg-catalogs.adoc
index dfb9b44a6c..aed76b102c 100644
--- a/modules/manage/pages/iceberg/use-iceberg-catalogs.adoc
+++ b/modules/manage/pages/iceberg/use-iceberg-catalogs.adoc
@@ -20,7 +20,7 @@ For production deployments, Redpanda recommends <