17 changes: 17 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -72,3 +72,20 @@ This change added tests and can be verified as follows:

- Does this pull request introduce a new feature? (yes / no)
- If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

---

##### Was generative AI tooling used to co-author this PR?
**Reviewer comment (Contributor):**
What's the goal with this section? 

It makes the template longer, and I personally don't see much benefit from it. Eventually 99% of people will probably be using AI, so the answer will always be yes. Also, if people want to state which tool they're using, they can already add "Generated-by:" to their commits; most code assistants fill that in automatically when used to commit, and you also added it below.


<!--
If generative AI tooling has been used in the process of authoring this PR, please
change the checkbox below to `[X]` followed by the name of the tool, and uncomment the
"Generated-by" line. See the ASF Generative Tooling Guidance for details:
https://www.apache.org/legal/generative-tooling.html
-->

- [ ] Yes (please specify the tool below)

<!--
Generated-by: [Tool Name and Version]
-->
329 changes: 329 additions & 0 deletions AGENTS.md
@@ -0,0 +1,329 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Flink AI Agent Instructions

This file provides guidance for AI coding agents working with the Apache Flink codebase.

## Prerequisites

- Java 11, 17 (default), or 21. Java 11 syntax must be used in all modules. Java 17 syntax (records, sealed classes, pattern matching) is only permitted in the `flink-tests-java17` module.
- Maven 3.8.6 (Maven wrapper `./mvnw` included; prefer it)
- Git
- Unix-like environment (Linux, macOS, WSL, Cygwin)

## Commands

### Build

- Full build (Java 17 default): `./mvnw clean package -DskipTests -Djdk17 -Pjava17-target`
**Reviewer comment (Contributor):**
Two issues with the build commands:

1. The `-Djdk17`, `-Djdk11`, `-Djdk21` flags don't exist as properties in the root pom.xml. Only the `-P` profiles exist (`java17-target`, `java11-target`, `java21-target`). An agent following these commands would pass a useless flag silently.

2. Consider adding a fast build command for development. The project has a `-Pfast` profile (root pom.xml line 1271) that skips checkstyle, rat, spotless, enforcer, javadoc, japicmp, and cyclonedx all at once. Combined with `-Pskip-webui-build` (which skips the Node.js frontend build), this gives agents a much faster iteration cycle.

Suggested build section:

- Fast dev build: `./mvnw clean install -T4 -DskipTests -DskipITs -Pfast -Pskip-webui-build -fn`
- Full build: `./mvnw clean package -DskipTests`
- Java 11 target: `./mvnw clean package -DskipTests -Pjava11-target`
- Java 21 target: `./mvnw clean package -DskipTests -Pjava21-target`
- Single module: `./mvnw clean package -DskipTests -pl flink-core-api`
- Full build with tests: `./mvnw clean verify`

**Reviewer comment (Contributor):**
Agree with you @gustavodemorais. I would replace `-T4` with `-T1C` because it scales with the machine.

**Reviewer comment (Contributor):**
nit: I would rather go with `-Dfast`, which matches how the POM declares it:

```xml
<id>fast</id>
<activation>
    <property>
        <name>fast</name>
    </property>
</activation>
```

**Reviewer comment (@snuyanzin, Mar 23, 2026):**
> The `-Djdk17`, `-Djdk11`, `-Djdk21` flags don't exist as properties in the root pom.xml. Only the `-P` profiles exist (`java17-target`, `java11-target`, `java21-target`). An agent following these commands would pass a useless flag silently.

@gustavodemorais such a property doesn't need to be present in the POM. As @raminqaf mentioned, it appears in the profile's `<activation>` block, which means passing it to Maven is enough to activate the profile. See https://maven.apache.org/guides/introduction/introduction-to-profiles.html#properties for details.

So I would keep all the `-Djdk17`, `-Djdk11`, and `-Djdk21` flags.

- Java 11: `./mvnw clean package -DskipTests -Djdk11 -Pjava11-target`
- Java 21: `./mvnw clean package -DskipTests -Djdk21 -Pjava21-target`
- Full build with tests: `./mvnw clean verify`
- Single module: `./mvnw clean package -DskipTests -pl flink-core-api`
- Single module with tests: `./mvnw clean verify -pl flink-core-api`

### Testing

- Single test class: `./mvnw -pl flink-core-api -Dtest=MemorySizeTest test`
- Single test method: `./mvnw -pl flink-core-api -Dtest=MemorySizeTest#testParseBytes test`

### Code Quality

- Format code (Java + Scala): `./mvnw spotless:apply`
- Check formatting: `./mvnw spotless:check`
- Checkstyle config: `tools/maven/checkstyle.xml` (v10.18.2)
**Reviewer comment (Contributor):**
Suggested change:

```diff
- - Checkstyle config: `tools/maven/checkstyle.xml` (v10.18.2)
+ - Checkstyle config: `tools/maven/checkstyle.xml`
```

**Reviewer comment (Contributor):**

Or maybe point to the checkstyle version in pom.xml.

**Reviewer comment (Contributor):**

We should probably also add a checkstyle command:

`./mvnw checkstyle:check -T1C`


## Repository Structure

Every module from the root pom.xml, organized by function. Flink provides three main user-facing APIs (recommended in this order: SQL, Table API, DataStream API) plus a newer DataStream v2 API.

### Core Infrastructure

- `flink-annotations` — Stability annotations (`@Public`, `@PublicEvolving`, `@Internal`, `@Experimental`) and `@VisibleForTesting`
- `flink-core-api` — Core API interfaces (functions, state, types) shared by all APIs
- `flink-core` — Core implementation (type system, serialization, memory management, configuration)
- `flink-runtime` — Distributed runtime (JobManager, TaskManager, scheduling, network, state)
- `flink-clients` — CLI and client-side job submission
- `flink-rpc/` — RPC framework
- `flink-rpc-core` — RPC interfaces
- `flink-rpc-akka`, `flink-rpc-akka-loader` — Pekko-based RPC implementation

### SQL / Table API (recommended API for most users)

- `flink-table/`
- `flink-sql-parser` — SQL parser (extends Calcite SQL parser)
- `flink-table-common` — Shared types, descriptors, catalog interfaces
- `flink-table-api-java` — Table API for Java
- `flink-table-api-scala` — Table API for Scala
- `flink-table-api-bridge-base`, `flink-table-api-java-bridge`, `flink-table-api-scala-bridge` — Bridges between Table and DataStream APIs
- `flink-table-api-java-uber` — Uber JAR for Table API
- `flink-table-planner` — SQL/Table query planning and optimization (Calcite-based)
- `flink-table-planner-loader`, `flink-table-planner-loader-bundle` — Classloader isolation for planner
- `flink-table-runtime` — Runtime operators for Table/SQL queries
- `flink-table-calcite-bridge` — Bridge to Apache Calcite
- `flink-sql-gateway-api`, `flink-sql-gateway` — SQL Gateway for remote SQL execution
- `flink-sql-client` — Interactive SQL CLI
- `flink-sql-jdbc-driver`, `flink-sql-jdbc-driver-bundle` — JDBC driver for SQL Gateway
- `flink-table-code-splitter` — Code generation utilities
- `flink-table-test-utils` — Test utilities for Table/SQL
**Reviewer comment (Contributor, on lines +68 to +85):**

This is what I usually use to decrease build time when I make changes in the parser and want to apply them to the planner. I clean/build only the flink-table modules:

`./mvnw clean install -T1C -DskipTests -Pskip-webui-build -pl flink-table/flink-table-common,flink-table/flink-sql-parser,flink-table/flink-table-planner-loader,flink-table/flink-table-planner,flink-table/flink-table-api-java -am`

The `-am` (also-make) flag builds required upstream dependencies, so you don't have to list every transitive module. After the first full build, you can drop `-am` for faster rebuilds when you're only changing code within those table modules.


### DataStream API (original streaming API)

- `flink-streaming-java` — DataStream API and stream processing operator implementations

### DataStream API v2 (newer event-driven API)

- `flink-datastream-api` — DataStream v2 API definitions
- `flink-datastream` — DataStream v2 API implementation

### Connectors (in-tree)

- `flink-connectors/`
- `flink-connector-base` — Base classes for source/sink connectors
- `flink-connector-files` — Unified file system source and sink
- `flink-connector-datagen` — DataGen source for testing
- `flink-connector-datagen-test` — Tests for DataGen connector
- `flink-hadoop-compatibility` — Hadoop InputFormat/OutputFormat compatibility
- `flink-file-sink-common` — Common file sink utilities
- Most connectors (Kafka, JDBC, Elasticsearch, etc.) live in separate repos under [github.com/apache](https://github.com/apache); see README.md for the full list

### Formats

- `flink-formats/`
- `flink-json`, `flink-csv`, `flink-avro`, `flink-parquet`, `flink-orc`, `flink-protobuf` — Serialization formats
- `flink-avro-confluent-registry` — Avro with Confluent Schema Registry
- `flink-sequence-file`, `flink-compress`, `flink-hadoop-bulk`, `flink-orc-nohive` — Hadoop-related formats
- `flink-format-common` — Shared format utilities
- `flink-sql-json`, `flink-sql-csv`, `flink-sql-avro`, `flink-sql-parquet`, `flink-sql-orc`, `flink-sql-protobuf` — SQL-layer format integrations
- `flink-sql-avro-confluent-registry` — SQL-layer Avro with Confluent Schema Registry

### State Backends

- `flink-state-backends/`
- `flink-statebackend-rocksdb` — RocksDB state backend
- `flink-statebackend-forst` — ForSt state backend (experimental, RocksDB-based with JNI bindings)
**Reviewer comment (Contributor):**
ForSt is not "RocksDB-based"; it's a fork of RocksDB (the name literally stands for "Fork of RocksDB"). The dependency is `forstjni`, not `rocksdbjni`. Saying "RocksDB-based" implies it wraps RocksDB.

Suggestion: ForSt state backend (experimental; ForSt is a fork of RocksDB)

- `flink-statebackend-heap-spillable` — Heap-based spillable state backend
- `flink-statebackend-changelog` — Changelog state backend
- `flink-statebackend-common` — Shared state backend utilities
- `flink-dstl/flink-dstl-dfs` — State changelog storage (DFS-based persistent changelog for incremental checkpointing)

### File Systems

- `flink-filesystems/`
- `flink-hadoop-fs` — Hadoop FileSystem abstraction
- `flink-s3-fs-hadoop`, `flink-s3-fs-presto`, `flink-s3-fs-base` — S3 file systems
- `flink-oss-fs-hadoop` — Alibaba OSS
- `flink-azure-fs-hadoop` — Azure Blob Storage
- `flink-gs-fs-hadoop` — Google Cloud Storage
- `flink-fs-hadoop-shaded` — Shaded Hadoop dependencies

### Queryable State

- `flink-queryable-state/`
- `flink-queryable-state-runtime` — Server-side queryable state service
- `flink-queryable-state-client-java` — Client for querying operator state from running jobs

### Deployment

- `flink-kubernetes` — Kubernetes integration
- `flink-yarn` — YARN integration
- `flink-dist`, `flink-dist-scala` — Distribution packaging
- `flink-container` — Container entry-point and utilities for containerized deployments

### Metrics

- `flink-metrics/`
- `flink-metrics-core` — Metrics API and core implementation
- Reporter implementations: `flink-metrics-jmx`, `flink-metrics-prometheus`, `flink-metrics-datadog`, `flink-metrics-statsd`, `flink-metrics-graphite`, `flink-metrics-influxdb`, `flink-metrics-slf4j`, `flink-metrics-dropwizard`, `flink-metrics-otel`

### Libraries

- `flink-libraries/`
- `flink-cep` — Complex Event Processing
- `flink-state-processing-api` — Offline state access (savepoint reading/writing)

### Other

- `flink-models` — AI model integration (sub-modules: `flink-model-openai`, `flink-model-triton`)
- `flink-python` — PyFlink (Python API)
- `flink-runtime-web` — Web UI for JobManager dashboard
- `flink-external-resources` — External resource management (e.g., GPU)
- `flink-docs` — Documentation sources
- `flink-examples` — Example programs
- `flink-quickstart` — Maven archetype for new projects
- `flink-walkthroughs` — Tutorial walkthrough projects

### Testing

- `flink-tests` — Integration tests
- `flink-end-to-end-tests` — End-to-end tests
- `flink-test-utils-parent` — Test utility classes
- `flink-yarn-tests` — YARN-specific tests
- `flink-fs-tests` — FileSystem tests
- `flink-architecture-tests` — ArchUnit architectural boundary tests
- `tools/ci/flink-ci-tools` — CI tooling

## Architecture Boundaries

1. **Client** submits jobs to the cluster. Submission paths include the CLI (`bin/flink run` via `flink-clients`), the SQL Client (`bin/sql-client.sh` via `flink-sql-client`), the SQL Gateway (`flink-sql-gateway`, also accessible via JDBC driver), the REST API (direct HTTP to JobManager), programmatic execution (`StreamExecutionEnvironment.execute()` or `TableEnvironment.executeSql()`), and PyFlink (`flink-python`, wraps the Java APIs).
2. **JobManager** (`flink-runtime`) orchestrates execution: receives jobs, creates the execution graph, manages scheduling, coordinates checkpoints, and handles failover. Never runs user code directly.
3. **TaskManager** (`flink-runtime`) executes the user's operators in task slots. Manages network buffers, state backends, and I/O.
4. **Table Planner** (`flink-table-planner`) translates SQL/Table API programs into DataStream programs. The planner is loaded in a separate classloader (`flink-table-planner-loader`) to isolate Calcite dependencies.
5. **Connectors** communicate with external systems. Source connectors implement the `Source` API (FLIP-27); sinks implement the `SinkV2` API. Most connectors are externalized to separate repositories.
**Reviewer comment (Contributor):**
There is no class named `SinkV2` in the codebase. The actual interface is `Sink` in package `org.apache.flink.api.connector.sink2`. An agent searching for a `SinkV2` class would find nothing. This is referenced in multiple places in this file (here, line 229 in Common Change Patterns, and line 317 in Boundaries).

Suggestion: use `Sink` API (package `sink2`) throughout.

6. **State Backends** persist keyed state and operator state. RocksDB is the primary backend for production use.
7. **Checkpointing** provides exactly-once guarantees. The JobManager coordinates barriers through the data stream; TaskManagers snapshot local state to a distributed file system.

Key separations:

- **Planner vs Runtime:** The table planner generates code and execution plans; the runtime executes them. Changes to planning logic live in `flink-table-planner`; changes to runtime operators live in `flink-table-runtime` or `flink-streaming-java`.
- **API vs Implementation:** Public API surfaces (`flink-core-api`, `flink-datastream-api`, `flink-table-api-java`) are separate from implementation modules. API stability annotations control what users can depend on.
- **ArchUnit enforcement:** `flink-architecture-tests/` contains ArchUnit tests that enforce module boundaries. New violations should be avoided; if unavoidable, follow the freeze procedure in `flink-architecture-tests/README.md`.

## Common Change Patterns

This section maps common types of Flink changes to the modules they touch and the verification they require.

### Adding a new SQL built-in function
**Reviewer comment (Contributor):**
I've recently added `BuiltInProcessTableFunction` as well.


1. Register in `flink-table-common` in `BuiltInFunctionDefinitions.java` (definition, input/output type strategies, runtime class reference)
2. Implement in `flink-table-runtime` under `functions/scalar/` (extend `BuiltInScalarFunction`, implement `eval()` method)
3. Add tests in `flink-table-planner` (planner integration) and `flink-table-runtime` (unit tests)
4. Document in `flink-docs`
5. Verify: planner tests + runtime tests + SQL ITCase
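As a rough illustration of step 2, a scalar function implementation might take the following shape. This is a hedged sketch only: the function name, the `MY_UPPER` definition, and the constructor wiring are hypothetical, and the real `BuiltInScalarFunction` constructor goes through the specialization machinery, so check an existing function in `functions/scalar/` before copying this.

```java
import javax.annotation.Nullable;

import org.apache.flink.table.data.StringData;
import org.apache.flink.table.functions.BuiltInFunctionDefinitions;
import org.apache.flink.table.functions.SpecializedFunction;
import org.apache.flink.table.runtime.functions.scalar.BuiltInScalarFunction;

/** Hypothetical sketch of a built-in scalar function; not an actual Flink function. */
public class MyUpperFunction extends BuiltInScalarFunction {

    public MyUpperFunction(SpecializedFunction.SpecializedContext context) {
        // MY_UPPER is a made-up definition; real ones live in BuiltInFunctionDefinitions.
        super(BuiltInFunctionDefinitions.MY_UPPER, context);
    }

    public @Nullable StringData eval(@Nullable StringData input) {
        if (input == null) {
            return null; // built-in functions typically propagate NULL
        }
        return StringData.fromString(input.toString().toUpperCase());
    }
}
```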
**Reviewer comment (Contributor):**
Suggested change:

```diff
- 5. Verify: planner tests + runtime tests + SQL ITCase
+ 5. Extend support for the Table API as well
+ 6. Verify: planner tests + semantic tests (ITCase should be avoided) + restore tests if you require state
```


### Adding a new configuration option

1. Define `ConfigOption<T>` in the relevant config class (e.g., `ExecutionConfigOptions.java` in `flink-table-api-java`)
2. Use `ConfigOptions.key("table.exec....")` builder with type, default value, and description
3. Add `@Documentation.TableOption` annotation for auto-generated docs
4. Document in `flink-docs` if user-facing
5. Verify: unit test for default value, ITCase for behavior change
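Steps 1 to 3 above might look like the following sketch. The option key, type, default value, and description are made up for illustration; only the builder pattern itself reflects the real `ConfigOptions` API.

```java
import org.apache.flink.annotation.Documentation;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

public class ExecutionConfigOptions {

    // Hypothetical option; key, default, and description are illustrative only.
    @Documentation.TableOption(execMode = Documentation.ExecMode.BATCH_STREAMING)
    public static final ConfigOption<Integer> TABLE_EXEC_MY_FEATURE_THRESHOLD =
            ConfigOptions.key("table.exec.my-feature.threshold")
                    .intType()
                    .defaultValue(100)
                    .withDescription(
                            "Hypothetical threshold controlling my-feature behavior.");
}
```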

### Adding a new table operator (e.g., join type, aggregate)
**Reviewer comment (Contributor):**
This is quite simplified... The natural order, still simplified, would actually be the opposite:

1. Runtime operator
2. Add harness test for runtime operator
3. Add ExecNode
4. Add Physical Node + Physical Rules
5. Add Logical Node + Planner rule
6. Add semantic tests, plan tests, restore tests


1. **Planner rule** in `flink-table-planner` under `plan/rules/physical/stream/` (match conditions, transform to physical node)
**Reviewer comment (Contributor):**
The paths here only mention `stream/` subdirectories, but there are equivalent `batch/` directories for both planner rules (`plan/rules/physical/batch/`) and ExecNodes (`plan/nodes/exec/batch/`). An agent adding a batch operator would be left without guidance.

Consider mentioning both `stream/` and `batch/` paths.

2. **ExecNode** in `flink-table-planner` under `plan/nodes/exec/stream/` (bridge between planner and runtime; Jackson-serializable with `@ExecNodeMetadata`)
3. **Runtime operator** in `flink-table-runtime` under `operators/` (extends `TableStreamOperator`, implements `OneInputStreamOperator` or `TwoInputStreamOperator`)
4. Verify: planner rule tests + ExecNode JSON plan tests + runtime operator tests + SQL ITCase

### Adding a new connector (Source or Sink)

1. Implement the `Source` API (`flink-connector-base`): `SplitEnumerator`, `SourceReader`, `SourceSplit`, serializers (`SimpleVersionedSerializer`)
2. Or implement the `SinkV2` API for sinks
3. Most new connectors go in separate repos under `github.com/apache`, not in the main Flink repo
4. Verify: unit tests + ITCase with real or embedded external system

### Modifying state serializers

1. Changes to `TypeSerializer` require a corresponding `TypeSerializerSnapshot` for migration
2. Bump version in `getCurrentVersion()`, handle old versions in `readSnapshot()`
3. Snapshot must have no-arg constructor for reflection-based deserialization
4. Implement `resolveSchemaCompatibility()` for upgrade paths
5. Verify: serializer snapshot migration tests, checkpoint restore tests across versions
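The versioning steps above correspond roughly to this shape. Class names and the version constant are illustrative, and the exact `resolveSchemaCompatibility` signature has changed between Flink releases, so consult the current `TypeSerializerSnapshot` javadoc rather than this sketch.

```java
import java.io.IOException;

import org.apache.flink.api.common.typeutils.TypeSerializerSnapshot;
import org.apache.flink.core.memory.DataInputView;

/** Hypothetical sketch of a snapshot that reads an old version during restore. */
public class MySerializerSnapshot implements TypeSerializerSnapshot<MyType> {

    private static final int CURRENT_VERSION = 2; // bumped from 1 after a format change

    /** A public no-arg constructor is required for reflection-based instantiation. */
    public MySerializerSnapshot() {}

    @Override
    public int getCurrentVersion() {
        return CURRENT_VERSION;
    }

    @Override
    public void readSnapshot(int readVersion, DataInputView in, ClassLoader loader)
            throws IOException {
        switch (readVersion) {
            case 1:
                // read the legacy layout written by version 1
                break;
            case 2:
                // read the current layout
                break;
            default:
                throw new IOException("Unknown snapshot version: " + readVersion);
        }
    }

    // writeSnapshot, restoreSerializer, and resolveSchemaCompatibility omitted for brevity.
}
```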

### Introducing or changing user-facing APIs (`@Public`, `@PublicEvolving`, `@Experimental`)

1. New user-facing API requires a voted FLIP (Flink Improvement Proposal); this applies to `@Public`, `@PublicEvolving`, and `@Experimental` since users build against all three
2. Every user-facing API class and method must carry a stability annotation
3. Changes to existing `@Public` or `@PublicEvolving` API must maintain backward compatibility
4. `@Internal` APIs can be changed freely; users should not depend on them
5. Update JavaDoc on the changed class/method
6. Add to release notes
7. Verify: ArchUnit tests pass, no new architecture violations

## Coding Standards

- **Format Java files with Spotless immediately after editing:** `./mvnw spotless:apply`. Uses google-java-format with AOSP style.
- **Scala formatting:** Spotless + scalafmt (config at `.scalafmt.conf`, maxColumn 100).
- **Checkstyle:** `tools/maven/checkstyle.xml` (version defined in root `pom.xml` as `checkstyle.version`). Some modules (flink-core, flink-optimizer, flink-runtime) are not covered by checkstyle enforcement, but conventions should still be followed.
- **No new Scala code.** All Flink Scala APIs are deprecated per FLIP-265. Write all new code in Java.
- **Apache License 2.0 header** required on all new files (enforced by Apache Rat). Use an HTML comment for markdown files.
- **API stability annotations:** Every user-facing API class and method must have a stability annotation. `@Public` (stable across minor releases), `@PublicEvolving` (may change in minor releases), `@Experimental` (may change at any time). These are all part of the public API surface that users build against. `@Internal` marks APIs with no stability guarantees that users should not depend on.
- **Logging:** Use parameterized log statements (SLF4J `{}` placeholders), never string concatenation.
- **No Java serialization** for new features (except internal RPC message transport).
- Full code style guide: https://flink.apache.org/how-to-contribute/code-style-and-quality-preamble/
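Taken together, the annotation and logging rules above look like this minimal sketch; the class, method, and message are illustrative, not existing Flink code.

```java
import org.apache.flink.annotation.PublicEvolving;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Hypothetical class illustrating the stability-annotation and logging conventions. */
@PublicEvolving
public class CheckpointPathResolver {

    private static final Logger LOG = LoggerFactory.getLogger(CheckpointPathResolver.class);

    public String resolve(String basePath, long checkpointId) {
        // Parameterized SLF4J logging: placeholders, never string concatenation.
        LOG.debug("Resolving checkpoint path for id {} under {}.", checkpointId, basePath);
        return basePath + "/chk-" + checkpointId;
    }
}
```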
**Reviewer comment (Contributor):**
- Always use final variables where applicable.
- Do not add unnecessary comments that simply state what one can understand from the code. Add comments that explain "the why" where relevant.


## Testing Standards

- Add tests for new behavior, covering success, failure, and edge cases.
- Use **JUnit 5** + **AssertJ** assertions. Do not use JUnit 4 or Hamcrest in new test code.
- Prefer real test implementations over Mockito mocks where possible.
- **Integration tests:** Name classes with `ITCase` suffix (e.g., `MyFeatureITCase.java`).
- **Red-green verification:** For bug fixes, verify that new tests actually fail without the fix before confirming they pass with it.
- **Test location** mirrors source structure within each module.
- Follow the testing conventions at https://flink.apache.org/how-to-contribute/code-style-and-quality-common/#7-testing
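A test following these conventions might look like the sketch below. `MemorySize` is a real `flink-core` class used elsewhere in this file's examples, but the specific expectations here are illustrative rather than verified values.

```java
import org.apache.flink.configuration.MemorySize;
import org.junit.jupiter.api.Test;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

/** Illustrative JUnit 5 + AssertJ test (no JUnit 4, no Hamcrest). */
class MemorySizeStyleTest {

    @Test
    void testParseKilobytes() {
        // Success case: a unit suffix is parsed into bytes.
        assertThat(MemorySize.parse("32 kb").getBytes()).isEqualTo(32 * 1024L);
    }

    @Test
    void testParseInvalidInputFails() {
        // Failure case: malformed input should be rejected.
        assertThatThrownBy(() -> MemorySize.parse("not-a-size"))
                .isInstanceOf(IllegalArgumentException.class);
    }
}
```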

## Commits and PRs

### Commit message format

- `[FLINK-XXXX][component] Description` where FLINK-XXXX is the JIRA issue number
- `[hotfix][component] Description` for typo fixes without JIRA
- Each commit must have a meaningful message including the JIRA ID
**Reviewer comment (Contributor):**
Suggested change:

```diff
  - Each commit must have a meaningful message including the JIRA ID
+ - If you don't know what the ticket number is, ask me.
```

**Reviewer comment (Contributor):**
That's how my local rule is written, and it works for me 🙂

- Separate cleanup/refactoring from functional changes into distinct commits
- When AI tools were used: add `Generated-by: <Tool Name and Version>` trailer per [ASF generative tooling guidance](https://www.apache.org/legal/generative-tooling.html)

### Pull request conventions

- Title format: `[FLINK-XXXX][component] Title of the pull request`
- A corresponding JIRA issue is required (except hotfixes for typos)
- Fill out the PR template completely but concisely: describe purpose, change log, testing approach, impact assessment
- Each PR should address exactly one issue
- Ensure `./mvnw clean verify` passes before opening a PR
- Always push to your fork, not directly to `apache/flink`
- Rebase onto the latest target branch before submitting

### AI-assisted contributions

- Disclose AI usage by checking the AI disclosure checkbox and uncommenting the `Generated-by` line in the PR template
- Add `Generated-by: <Tool Name and Version>` to commit messages
- Never add `Co-Authored-By` with an AI agent as co-author; agents are assistants, not authors
- You must be able to explain every change and respond to review feedback substantively

## Boundaries

### Ask first

- Adding or changing `@Public`, `@PublicEvolving`, or `@Experimental` annotations (these are user-facing API commitments requiring a FLIP)
- Large cross-module refactors
- New dependencies
- Changes to serialization formats (affects state compatibility)
- Changes to checkpoint/savepoint behavior

### Never

- Commit secrets, credentials, or tokens
- Push directly to `apache/flink`; always work from your fork
- Mix unrelated changes into one PR
- Use Java serialization for new features
- Edit generated files by hand when a generation workflow exists
- Use the legacy `SourceFunction` or `SinkFunction` interfaces for connectors; use the `Source` API (FLIP-27) and `SinkV2` API instead
**Reviewer comment (Contributor):**
@gustavodemorais mentioned this above: there is no existing `SinkV2` interface.

- Add `Co-Authored-By` with an AI agent as co-author in commit messages; AI agents are assistants, not authors. Use `Generated-by: <Tool Name and Version>` instead.
- Suppress or bypass checkstyle rules (no `CHECKSTYLE:ON`/`CHECKSTYLE:OFF` comments, no adding entries to `tools/maven/suppressions.xml`, no `@SuppressWarnings`). Fix the code to satisfy checkstyle instead.
- Use destructive git operations unless explicitly requested

## References

- [README.md](README.md) — Build instructions and project overview
- [DEVELOPMENT.md](DEVELOPMENT.md) — IDE setup and development environment
- [.github/CONTRIBUTING.md](.github/CONTRIBUTING.md) — Contribution process
- [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md) — PR checklist
- [Code Style Guide](https://flink.apache.org/how-to-contribute/code-style-and-quality-preamble/) — Detailed coding guidelines
- [ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) — AI tooling policy