[FLINK-XXXXX][docs] Add AGENTS.md and AI disclosure to PR template #27776
MartijnVisser wants to merge 2 commits into apache:master
Conversation
Add AGENTS.md with guidelines for AI coding agents: prerequisites, build/test commands, repository structure, architecture boundaries, common change patterns, coding standards, testing standards, commit conventions, and boundaries. Add AI-assisted contribution disclosure checkbox to the PR template, linking to ASF Generative Tooling Guidance. Generated-by: Claude Code 1.0.33 (Claude Opus 4.6, 1M context)
Force-pushed from 9d4bd36 to 61e65e7
- Clarify Java 11 syntax requirement (Java 17 syntax only in flink-tests-java17) - Remove hardcoded checkstyle version, reference pom.xml property instead - Add no new Scala code rule (FLIP-265) - Add red-green test verification requirement for bug fixes - Add "concisely" to PR template guidance - Add CHECKSTYLE:ON/OFF and @SuppressWarnings to Never section Generated-by: Claude Code 1.0.33 (Claude Opus 4.6, 1M context)
gustavodemorais
left a comment
Hey @MartijnVisser, thanks for driving this. In general I'm +1 for the initiative; it will be continuous work to keep this up-to-date and correct. I've added several suggestions for the first version.
One more piece of feedback for the whole PR: for the Table/SQL stack, semantic tests are the recommended tests instead of ITCases. Semantic tests can also perform validation checks and can test most of what we test with plan tests, so they should be preferred. Also, RestoreTests should be used for cases where state is used and the restore mechanism should be tested.
> ##### Was generative AI tooling used to co-author this PR?
What's the goal with this section?
It makes the template longer and I personally don't see much benefit from it. Eventually 99% of people will probably be using AI and the answer will always be yes. Also, if people want to provide the information of which tool they're using, they can already add "Generated-by: " to the commits, which most code assistants fill in automatically if you use them to commit, and which you also added below.
> - Format code (Java + Scala): `./mvnw spotless:apply`
> - Check formatting: `./mvnw spotless:check`
> - Checkstyle config: `tools/maven/checkstyle.xml` (v10.18.2)
Suggested change (drop the hardcoded version):
> - Checkstyle config: `tools/maven/checkstyle.xml`
> This section maps common types of Flink changes to the modules they touch and the verification they require.

> ### Adding a new SQL built-in function
I've recently added `BuiltInProcessTableFunction` as well.
> 2. Implement in `flink-table-runtime` under `functions/scalar/` (extend `BuiltInScalarFunction`, implement `eval()` method)
> 3. Add tests in `flink-table-planner` (planner integration) and `flink-table-runtime` (unit tests)
> 4. Document in `flink-docs`
> 5. Verify: planner tests + runtime tests + SQL ITCase
Suggested change:
> 5. Extend support in the Table API
> 6. Verify: planner tests + semantic tests (ITCase should be avoided) + restore tests if you require state
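To make step 2 of the list above concrete, here is a minimal sketch of the `eval()` shape such a function exposes. This is a stand-in, not Flink code: it does not extend Flink's real `BuiltInScalarFunction` base class, and the function name and signature are hypothetical.

```java
// Hypothetical stand-in for a built-in scalar function's runtime class.
// In Flink the class would extend BuiltInScalarFunction and the planner
// would drive type inference; here eval() is shown in isolation.
class StartsWithFunction {
    // eval() is resolved reflectively by the framework in the real
    // implementation; this signature mirrors STARTSWITH(str, prefix).
    public Boolean eval(String str, String prefix) {
        if (str == null || prefix == null) {
            return null; // SQL three-valued logic: NULL in, NULL out
        }
        return str.startsWith(prefix);
    }
}
```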
> 4. Document in `flink-docs` if user-facing
> 5. Verify: unit test for default value, ITCase for behavior change
> ### Adding a new table operator (e.g., join type, aggregate)
This is quite simplified... The natural order, still simplified, would actually be the opposite:
- Runtime operator
- Add harness test for runtime operator
- Add ExecNode
- Add Physical Node + Physical Rules
- Add Logical Node + Planner rule
- Add semantic tests, plan tests, restore tests
> - **API stability annotations:** Every user-facing API class and method must have a stability annotation. `@Public` (stable across minor releases), `@PublicEvolving` (may change in minor releases), `@Experimental` (may change at any time). These are all part of the public API surface that users build against. `@Internal` marks APIs with no stability guarantees that users should not depend on.
> - **Logging:** Use parameterized log statements (SLF4J `{}` placeholders), never string concatenation.
> - **No Java serialization** for new features (except internal RPC message transport).
> - Full code style guide: https://flink.apache.org/how-to-contribute/code-style-and-quality-preamble/
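The logging bullet above can be illustrated with a toy stand-in for SLF4J's `{}` substitution. The `slf4jFormat` helper below exists only to keep the example dependency-free; real Flink code would call `LOG.debug(...)` on an SLF4J logger, which performs this substitution itself and skips it entirely when the level is disabled.

```java
// Minimal illustration of the parameterized-logging guideline.
// slf4jFormat is a toy stand-in for SLF4J's {} substitution.
class LoggingStyle {
    static String slf4jFormat(String pattern, Object... args) {
        StringBuilder sb = new StringBuilder();
        int argIdx = 0, from = 0, at;
        while ((at = pattern.indexOf("{}", from)) >= 0 && argIdx < args.length) {
            sb.append(pattern, from, at).append(args[argIdx++]);
            from = at + 2;
        }
        return sb.append(pattern.substring(from)).toString();
    }

    public static void main(String[] args) {
        // Preferred: placeholders, e.g. LOG.debug("Restored {} entries for {}", n, id).
        // SLF4J defers the string assembly until the level is enabled.
        System.out.println(slf4jFormat("Restored {} entries for {}", 42, "op-1"));
        // Avoid: "Restored " + n + " entries for " + id builds the string
        // eagerly even when the log level is disabled.
    }
}
```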
- Always use `final` variables where applicable.
- Do not add unnecessary comments that simply state what one can understand from the code. Add comments that explain "the why" in places where it is relevant.
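A tiny sketch of both points from this comment; the class, method, and numbers are illustrative, not from the Flink codebase:

```java
// Illustrative only: final locals plus a "why" comment instead of a
// comment that restates what the code already says.
class StyleExample {
    static long backoffMillis(final int attempt) {
        // Cap the exponent so a long outage cannot overflow the shift
        // or produce absurdly long sleeps (the "why", not the "what").
        final int capped = Math.min(attempt, 16);
        final long baseMillis = 100L;
        return baseMillis << capped;
    }
}
```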
> - `[FLINK-XXXX][component] Description` where FLINK-XXXX is the JIRA issue number
> - `[hotfix][component] Description` for typo fixes without JIRA
> - Each commit must have a meaningful message including the JIRA ID
Suggested change:
> - Each commit must have a meaningful message including the JIRA ID
> - If you don't know what the ticket number is, ask me.
That's how my local rule is written, and it works for me 🙂
> ### Build

> - Full build (Java 17 default): `./mvnw clean package -DskipTests -Djdk17 -Pjava17-target`
Two issues with the build commands:

- The `-Djdk17`, `-Djdk11`, `-Djdk21` flags don't exist as properties in the root `pom.xml`. Only the `-P` profiles exist (`java17-target`, `java11-target`, `java21-target`). An agent following these commands would pass a useless flag silently.
- Consider adding a fast build command for development. The project has a `-Pfast` profile (root `pom.xml` line 1271) that skips checkstyle, rat, spotless, enforcer, javadoc, japicmp, and cyclonedx all at once. Combined with `-Pskip-webui-build` (which skips the Node.js frontend build), this gives agents a much faster iteration cycle.
Suggested build section:

- Fast dev build: `./mvnw clean install -T4 -DskipTests -DskipITs -Pfast -Pskip-webui-build -fn`
- Full build: `./mvnw clean package -DskipTests`
- Java 11 target: `./mvnw clean package -DskipTests -Pjava11-target`
- Java 21 target: `./mvnw clean package -DskipTests -Pjava21-target`
- Single module: `./mvnw clean package -DskipTests -pl flink-core-api`
- Full build with tests: `./mvnw clean verify`
> - `flink-state-backends/`
>   - `flink-statebackend-rocksdb` — RocksDB state backend
>   - `flink-statebackend-forst` — ForSt state backend (experimental, RocksDB-based with JNI bindings)
ForSt is not "RocksDB-based" - it's a fork of RocksDB (the name literally stands for "Fork of RocksDB"). The dependency is `forstjni`, not `rocksdbjni`. Saying "RocksDB-based" implies it wraps RocksDB.
Suggestion: ForSt state backend (experimental; ForSt is a fork of RocksDB)
> 2. **JobManager** (`flink-runtime`) orchestrates execution: receives jobs, creates the execution graph, manages scheduling, coordinates checkpoints, and handles failover. Never runs user code directly.
> 3. **TaskManager** (`flink-runtime`) executes the user's operators in task slots. Manages network buffers, state backends, and I/O.
> 4. **Table Planner** (`flink-table-planner`) translates SQL/Table API programs into DataStream programs. The planner is loaded in a separate classloader (`flink-table-planner-loader`) to isolate Calcite dependencies.
> 5. **Connectors** communicate with external systems. Source connectors implement the `Source` API (FLIP-27); sinks implement the `SinkV2` API. Most connectors are externalized to separate repositories.
There is no class named `SinkV2` in the codebase. The actual interface is `Sink` in package `org.apache.flink.api.connector.sink2`. An agent searching for a `SinkV2` class would find nothing. This is referenced in multiple places in this file (here, line 229 in Common Change Patterns, and line 317 in Boundaries).
Suggestion: use `Sink` API (package `sink2`) throughout.
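For orientation, the `sink2` design splits a serializable factory from a per-subtask writer. The stand-in below mirrors only that shape; all names here are illustrative, and the real `Sink`/`SinkWriter` interfaces carry init-context and fault-tolerance parameters omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for the sink2-style factory/writer split.
// NOT Flink's real interfaces.
interface MiniSinkWriter<T> {
    void write(T element);
    void flush();
}

interface MiniSink<T> {
    MiniSinkWriter<T> createWriter(); // one writer per parallel subtask
}

class CollectingSink implements MiniSink<String> {
    final List<String> committed = new ArrayList<>();

    @Override
    public MiniSinkWriter<String> createWriter() {
        return new MiniSinkWriter<String>() {
            private final List<String> buffer = new ArrayList<>();

            @Override
            public void write(String element) {
                buffer.add(element); // buffered until flush
            }

            @Override
            public void flush() {
                committed.addAll(buffer);
                buffer.clear();
            }
        };
    }
}
```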
> ### Adding a new table operator (e.g., join type, aggregate)

> 1. **Planner rule** in `flink-table-planner` under `plan/rules/physical/stream/` (match conditions, transform to physical node)
The paths here only mention `stream/` subdirectories, but there are equivalent `batch/` directories for both planner rules (`plan/rules/physical/batch/`) and ExecNodes (`plan/nodes/exec/batch/`). An agent adding a batch operator would be left without guidance.
Consider mentioning both `stream/` and `batch/` paths.
What is the purpose of the change
Make the Flink codebase accessible to AI coding agents and establish lightweight guidelines for AI-assisted contributions.
Brief change log
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes / no)
Documentation
Was generative AI tooling used to co-author this PR?