feat: multi-language orchestration loop with per-language config discovery #1918

Open
mashraf-222 wants to merge 7 commits into main from cf-1075-multi-language-orchestration

@mashraf-222
Contributor

@mashraf-222 mashraf-222 commented Mar 27, 2026

Summary

Adds multi-language support to the CLI. A single codeflash invocation now discovers and processes all configured languages (Python, Java, JS/TS) sequentially.

CF-1075

How it works

Config discovery (find_all_config_files in config_parser.py):

  • Walks upward from CWD — finds pyproject.toml (Python), package.json (JS/TS), pom.xml/build.gradle (Java)
  • Scans immediate subdirectories for monorepo support
  • Closest config wins per language, one project per language
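The "closest config wins" rule can be sketched as follows. This is a minimal illustration, not the code in config_parser.py: the `MARKERS` mapping and the `discover_configs` name are hypothetical, and the sketch leaves out the subdirectory scan and any check of the file contents.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical marker files per language (assumption for illustration only).
MARKERS = {
    "python": ("pyproject.toml",),
    "javascript": ("package.json",),
    "java": ("pom.xml", "build.gradle"),
}


def discover_configs(start_dir: Path) -> dict[str, Path]:
    """Walk upward from start_dir; the first (closest) config per language wins."""
    found: dict[str, Path] = {}
    current = start_dir.resolve()
    while True:
        for language, filenames in MARKERS.items():
            if language in found:
                continue  # a closer directory already supplied this language
            for name in filenames:
                candidate = current / name
                if candidate.is_file():
                    found[language] = candidate
                    break
        if current.parent == current:  # reached the filesystem root
            break
        current = current.parent
    return found
```

Because a language is recorded the first time it is seen, a pyproject.toml in the starting directory shadows one higher up, while a pom.xml that exists only in a parent is still picked up.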

Orchestration loop (main.py):

  • For each discovered LanguageConfig, deep-copies args, applies language-specific config, runs optimizer
  • Error in one language does not block others
  • Git/GitHub checks run once before the loop
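A rough sketch of the loop described above, assuming each language config is a plain dict with `language` and `module_root` keys and that the optimizer is passed in as a callable. `run_per_language` is a hypothetical name; the real loop in main.py works with LanguageConfig objects and does more setup.

```python
import copy
from argparse import Namespace
from typing import Callable


def run_per_language(
    args: Namespace,
    language_configs: list[dict],
    run_optimizer: Callable[[Namespace], None],
) -> dict[str, str]:
    """Run the optimizer once per language and record a status for each."""
    results: dict[str, str] = {}
    for config in language_configs:
        lang_args = copy.deepcopy(args)  # mutations stay local to this language
        lang_args.module_root = config["module_root"]
        try:
            run_optimizer(lang_args)
            results[config["language"]] = "ok"
        except Exception as exc:  # a failure in one language must not stop the rest
            results[config["language"]] = f"failed: {exc}"
    return results
```

The deep copy is what keeps one language's resolved paths from leaking into the next iteration, and the per-iteration `try` is what gives the summary something to report when one language fails.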

Key behaviors:

  • --file filters to the matching language automatically
  • --all re-resolves per language (each gets its own module_root)
  • Backward compatible: single-language projects use existing code path unchanged

Files changed

| File | What |
| --- | --- |
| config_parser.py | LanguageConfig dataclass, find_all_config_files(), normalize_toml_config() extraction |
| cli.py | apply_language_config(): maps LanguageConfig onto CLI args |
| main.py | Orchestration loop with per-language error isolation and summary logging |
| test_git_utils.py | Language-agnostic test updates |
| test_registry.py | Java registration in clear-functions test |
| test_multi_config_discovery.py | 16 tests for config discovery |
| test_multi_language_orchestration.py | 36 tests for orchestration, CLI routing, error handling |

Test plan

  • 52 unit tests pass (pytest tests/test_multi_language_orchestration.py tests/test_multi_config_discovery.py -v)
  • prek run --from-ref origin/main passes
  • E2E: single-language Python project still works unchanged
  • E2E: multi-language project discovers all configs and processes each

@claude
Contributor

claude bot commented Mar 27, 2026

Claude finished @mashraf-222's task in 7m 57s


PR Review Summary

  • Triage PR size — LARGE (350+ production lines changed)
  • Run prek lint/typecheck
  • Resolve stale threads — no unresolved threads
  • Code review
  • Duplicate detection
  • Test coverage analysis
  • Post final summary

Prek Checks

ruff check and ruff format: PASS

Mypy shows 14 errors across config_parser.py and git_utils.py. Spot-checked: the config_parser.py errors (list_str_keys annotation, bool/str incompatibility in the normalization loops) are pre-existing patterns carried over from the inline code in the original parse_config_file. The git_utils.py errors are pre-existing (only line 152 changed in this PR). No new mypy regressions introduced.


Code Review

Fixed since last review cycle

The four issues flagged in the previous review cycle are addressed:

  • add_help=False removed from optimize subparser (cli.py)
  • package.json false positives: _check_dir_for_configs now requires a "codeflash" key (config_parser.py:212)
  • Filesystem walk stops at git root (config_parser.py:263)
  • filter_configs_for_file correctly excludes spurious configs when --file is specified (main.py:41-53)

Previous note corrected: "benchmark": False in normalize_toml_config's bool_keys is not new — the same key was already in the inline normalization code of parse_config_file on main. No issue here.

Still open — medium severity

init_sentry and posthog_cf.initialize_posthog are still called once per language inside the orchestration loop (main.py:181-182), not once before it. In every other code path (single-lang, verify-setup, command dispatch), they're called once. If telemetry init is idempotent the runtime impact is zero, but it's architecturally inconsistent and could cause issues if disable_telemetry differs between language configs (last-language setting wins for whichever call actually sticks).

Minor

  • filter_configs_for_file fallback (main.py:53): when --file is given but no config's project root contains it, the function returns all configs unchanged. This is acknowledged design, but means the "spurious config" scenario still triggers if neither config covers the file. Not a bug given the comment, but worth knowing.
  • check_running_in_git_repo (git_utils.py:149) signature is (module_root: str) but tests document it accepting None. The mypy error is pre-existing; just noting the annotation is slightly loose.

Duplicate Detection

No duplicates detected. normalize_toml_config is correctly shared between both the new multi-lang discovery path and the existing parse_config_file. The resolve_config_onto_args extraction properly eliminates the duplication that would have existed between process_pyproject_config and apply_language_config.


Test Coverage

59 new tests across three test files — all pass. Coverage on the new orchestration path:

| File | Coverage |
| --- | --- |
| main.py | 62% |
| config_parser.py | 47% |
| cli_cmds/cli.py | 31% |

The lower numbers reflect untouched legacy paths (init, auth, verify-setup, etc.) that are exercised in other test suites. The new paths added by this PR are well-covered by test_multi_language_orchestration.py and test_multi_config_discovery.py.

Missing test noted from previous review (still missing): a test verifying init_sentry/posthog are called exactly once in multi-language mode. Once the call is moved before the loop, this would be straightforward to add.


Last updated: 2026-04-01

mashraf-222 and others added 3 commits March 31, 2026 08:14
…discovery

Adds the ability for codeflash CLI to discover and optimize multiple
languages in a single run. When configs for Python, Java, and/or JS/TS
are found, the optimizer runs once per language with isolated args.

- Add LanguageConfig dataclass and find_all_config_files() for upward
  walk + monorepo subdirectory scanning (Python/JS/Java)
- Extract normalize_toml_config() from inline parse_config_file logic
- Add apply_language_config() for per-language arg setup in cli.py
- Set language singleton early in process_pyproject_config()
- Add orchestration loop in main.py with --file language filtering,
  per-language error isolation, and summary logging
- Use build_config_strategy.parse_java_project_config() (from #1906)
- Make get_git_diff tests language-agnostic (no singleton dependency)

Intentionally omits auto_configure_language and
detect_unconfigured_languages per Linear CF-1075 review.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

TypeScript projects discovered via package.json were incorrectly
assigned Language.JAVASCRIPT. Now derives language from the config
dict returned by parse_package_json_config() and deduplicates both
JS and TS together.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extract resolve_config_onto_args() from duplicated logic between
process_pyproject_config and apply_language_config (~100 lines deduped).
This fixes:
- Missing benchmark validation in apply_language_config (HIGH)
- Missing LSP guard in apply_language_config (MEDIUM)
- --all /specific/path silently overridden in orchestration loop (MEDIUM)
- Silent except Exception: pass in config discovery (MEDIUM)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@mashraf-222 mashraf-222 force-pushed the cf-1075-multi-language-orchestration branch from 03d890d to bc6444a Compare March 31, 2026 08:16
mashraf-222 and others added 2 commits March 31, 2026 08:42
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

In the multi-language orchestration path, handle_optimize_all_arg_parsing
runs before config is loaded, so args.git_remote is unset. This caused
repo.remote(name=None) to raise ValueError, crashing every --file/--all
run without --no-pr. Default to "origin" when git_remote is not set.

Also catch git.NoSuchPathError in check_running_in_git_repo and
handle_optimize_all_arg_parsing so non-existent module_root paths
return graceful errors instead of uncaught exceptions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aseembits93
Contributor

@claude review

@claude
Contributor

claude bot commented Mar 31, 2026

Claude finished @aseembits93's task in 2m 58s


PR Review

  • Gather context
  • Read changed source files
  • Analyze diff for bugs/issues
  • Review test coverage
  • Post review feedback

Summary

Solid feature work overall — the orchestration loop, config discovery, and test coverage are well-structured. Found several bugs and a few design issues that should be addressed before merging.


Bugs

1. benchmark added to normalize_toml_config bool_keys — wasn't there before

config_parser.py:123-127:

bool_keys = {
    "override-fixtures": False,
    "disable-telemetry": False,
    "disable-imports-sorting": False,
    "benchmark": False,  # ← NEW, wasn't in original
}

benchmark is a CLI flag, not a pyproject.toml config key. Adding it here means every normalized config dict now gets benchmark: False injected, which is misleading. It doesn't currently cause a bug (it's not in supported_keys so it won't be applied to args), but it's dead weight in the config dict and could cause confusion.

2. add_help=False on optimize subparser — likely unintentional

cli.py:413:

trace_optimize = subparsers.add_parser("optimize", help="...", add_help=False)

This removes -h/--help from the optimize subcommand and is not mentioned in the PR description. This is a regression: codeflash optimize --help will no longer work.

3. init_sentry and posthog called per language in the orchestration loop

main.py:162-163:

for lang_config in language_configs:
    ...
    init_sentry(enabled=not pass_args.disable_telemetry, exclude_errors=True)   # ← N times
    posthog_cf.initialize_posthog(enabled=not pass_args.disable_telemetry)       # ← N times

These are initialization calls that are called once in all other code paths. In the multi-language loop, they're called once per language. Even if idempotent, this is inconsistent and could cause issues if disable_telemetry differs between language configs (the last language's setting wins). Move them before the loop.
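One way the hoisting could look, as a sketch only. `run_all` and the injected `init_telemetry` callable are hypothetical stand-ins for the real init_sentry / initialize_posthog calls, and reading `disable_telemetry` from the top-level args is an assumption; the PR may prefer a different rule (for example, disabling telemetry if any language config asks for it).

```python
from argparse import Namespace
from typing import Callable


def run_all(
    args: Namespace,
    language_args: list[Namespace],
    init_telemetry: Callable[[bool], None],
    run_optimizer: Callable[[Namespace], None],
) -> None:
    """Initialise telemetry exactly once, then run each language."""
    # Decided once, from the top-level args, before any language config is applied.
    init_telemetry(not getattr(args, "disable_telemetry", False))
    for lang_args in language_args:
        run_optimizer(lang_args)
```

With the callables injected like this, a test asserting "called exactly once in multi-language mode" only needs to count the invocations of `init_telemetry`.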


Design Issues

4. set_current_language singleton overwritten in each loop iteration

cli.py:144-145:

if language is not None:
    set_current_language(language)

resolve_config_onto_args is called per language, each time overwriting the global singleton. After the loop completes, the singleton holds the last processed language. If any code reads current_language() outside the optimizer (e.g., in cleanup, logging, or telemetry), it sees a stale/wrong value. This is a pre-existing concern but the multi-language loop amplifies it.

5. find_all_config_files walks to filesystem root with no depth limit

config_parser.py:242-248:

dir_path = start_dir.resolve()
while True:
    _check_dir_for_configs(dir_path, configs, seen_languages)
    parent = dir_path.parent
    if parent == dir_path:
        break
    dir_path = parent

On developer machines this will routinely scan /home/user, /home, /, etc. A pyproject.toml at /home/user/pyproject.toml (not uncommon for global tools) would be picked up. Consider stopping at the git repo root (git rev-parse --show-toplevel) or at a configurable max depth.
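A sketch of the suggested boundary, assuming the repository root can be recognised by a `.git` entry rather than by shelling out to git. `dirs_up_to_repo_root` is a hypothetical helper, not a function in the PR.

```python
from pathlib import Path


def dirs_up_to_repo_root(start_dir: Path) -> list[Path]:
    """Return start_dir and its ancestors, stopping at the first one containing .git."""
    visited: list[Path] = []
    current = start_dir.resolve()
    while True:
        visited.append(current)
        # .git is a directory in a normal clone and a file in worktrees/submodules.
        if (current / ".git").exists():
            break
        if current.parent == current:  # filesystem root: no repository found
            break
        current = current.parent
    return visited
```

Config checks would then run only on the returned directories, so a stray pyproject.toml above the repository is never inspected.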

6. No first-run experience in the multi-config path

main.py:99-119:

language_configs = find_all_config_files()
if not language_configs:
    loaded_args = _handle_config_loading(args)  # ← first-run only in this branch
    ...

If find_all_config_files() returns any config (even one with a bare pom.xml or package.json), the first-run experience via _handle_config_loading is completely bypassed. A user with only a Java or JS project who hasn't done codeflash init will silently skip setup.

7. _parse_java_config_for_dir is a redundant wrapper

config_parser.py:163-166:

def _parse_java_config_for_dir(dir_path: Path) -> dict[str, Any] | None:
    from codeflash.languages.java.build_config_strategy import parse_java_project_config
    return parse_java_project_config(dir_path)

This wrapper only exists to defer the import, but the same pattern is already used inline elsewhere. It's also the only reason _parse_java_config_for_dir needs to be mockable in tests (the tests patch codeflash.code_utils.config_parser._parse_java_config_for_dir). The wrapper is fine for testability, but it's worth documenting that purpose.

8. original_all save/restore logic is fragile

main.py:135-154: The manual capture of original_all before handle_optimize_all_arg_parsing mutates args.all (to None when module_root is absent) and then per-language restoration is correct but subtle. A reader unfamiliar with the mutation would miss why original_all is needed. The comment helps, but consider whether handle_optimize_all_arg_parsing can be split into a "git checks" part and an "args mutation" part to make the intent clearer.


Test Coverage

  • 52 unit tests are good coverage for the new code paths
  • The test for test_root_config_wins_over_subdir correctly tests the "closest config wins" invariant
  • Missing: test for the --all "" (no path) vs --all /path distinction in the orchestration loop's original_all handling
  • Missing: test that init_sentry/posthog are called exactly once in multi-language mode
  • The test_git_utils.py cleanup (removing set_current_language singleton setup from tests) is a good improvement

Minor

  • config_parser.py:194: pyproject.open("rb") followed by tomlkit.parse(f.read()) — this pattern is borrowed from parse_config_file and works because tomlkit handles bytes, but it's undocumented. Not a bug.
  • git_utils.py:152: (git.InvalidGitRepositoryError, git.NoSuchPathError) — correct fix.
  • cli.py:263: getattr(args, "git_remote", None) or "origin" — clean fix for the git_remote=None case.
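The `getattr(...) or "origin"` idiom from the last bullet covers both a missing attribute and an attribute explicitly set to None. A small sketch, with `remote_name` as a hypothetical wrapper:

```python
from argparse import Namespace


def remote_name(args: Namespace) -> str:
    """Fall back to "origin" when git_remote is absent, None, or empty."""
    return getattr(args, "git_remote", None) or "origin"
```

Note that `or` also replaces an empty string, which is usually what is wanted for a remote name.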

@HeshamHM28
Contributor

Code Review: Bug Report

Ran a thorough static analysis and runtime verification of this PR. Found 7 issues (2 critical, 2 medium, 3 low).


Bug 1 (Critical): Any package.json triggers JS config discovery — even without codeflash section

Files: config_js.py:262, config_parser.py:209-214

parse_package_json_config() auto-detects and returns a full config for any package.json, even without a "codeflash" key. In _check_dir_for_configs, this means a Python project with a package.json for dev tools (prettier, eslint, etc.) will falsely trigger JS optimization.

Contrast with Python, which correctly requires [tool.codeflash] in pyproject.toml.

Repro:

# package.json: {"name": "my-project", "devDependencies": {"prettier": "^3.0"}}
# No "codeflash" section — but parse_package_json_config still returns a full config
result = parse_package_json_config(path)  # result is NOT None

Fix: _check_dir_for_configs should check for "codeflash" key in package.json before treating it as a JS config, or parse_package_json_config should return None when no codeflash section exists.
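The opt-in check could be as small as the sketch below. `has_codeflash_section` is a hypothetical helper; it only tests for a top-level "codeflash" key and treats unreadable or malformed files as not opted in.

```python
import json
from pathlib import Path


def has_codeflash_section(package_json: Path) -> bool:
    """True only when package.json parses to an object with a "codeflash" key."""
    try:
        data = json.loads(package_json.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return False  # missing or malformed file: not a codeflash project
    return isinstance(data, dict) and "codeflash" in data
```

For the repro above, `{"name": "my-project", "devDependencies": {...}}` has no such key, so the directory would be skipped for JS discovery.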


Bug 2 (Critical): testsRoot path resolution is wrong in config_js.py:279

# Line 279 (BUGGY):
config["tests_root"] = str(project_root / Path(codeflash_config["testsRoot"]).resolve())
#                                          ^^^ resolves relative to CWD, THEN joins with project_root

# Line 273 (CORRECT pattern for moduleRoot):
config["module_root"] = str((project_root / Path(codeflash_config["moduleRoot"])).resolve())
#                            ^^^ joins first, THEN resolves

Verified: With CWD=/home/user, project_root=/home/user/project, testsRoot="tests":

  • Buggy: /home/user/tests (resolve() returns an absolute path, and joining an absolute path onto project_root discards project_root)
  • Correct: /home/user/project/tests

Fix: str((project_root / Path(codeflash_config["testsRoot"])).resolve())
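The difference between the two orderings can be reproduced with pathlib alone. The function names below are hypothetical and exist only to label the two patterns. In pathlib, an absolute right-hand operand replaces the left-hand path, which is why the resolve-first form ends up anchored to the current working directory.

```python
from pathlib import Path


def resolve_then_join(project_root: Path, relative: str) -> Path:
    # resolve() anchors the relative path to the CWD and makes it absolute;
    # joining an absolute path then discards project_root entirely.
    return project_root / Path(relative).resolve()


def join_then_resolve(project_root: Path, relative: str) -> Path:
    # Joining first anchors the relative path to project_root before normalising.
    return (project_root / relative).resolve()
```

Only the second form depends on `project_root`; the first returns the same value for every project root passed in.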


Bug 3 (Medium): Java tests_root auto-detection uses wrong parent level

File: cli.py:152-153

test_path = Path(args.module_root).parent / test_dir if "/" in test_dir else Path(test_dir)

For module_root=/project/src/main/java, test_dir="src/test/java":

  • .parent → /project/src/main
  • Result: /project/src/main/src/test/java (wrong)
  • Expected: /project/src/test/java

Fix: Resolve relative to the project root or config_path, not module_root.parent.
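A sketch of that fix, assuming the project root (the directory holding pom.xml or build.gradle) is available at the call site. `resolve_test_dir` is a hypothetical name.

```python
from pathlib import Path


def resolve_test_dir(project_root: Path, test_dir: str) -> Path:
    """Anchor a build-file-relative test directory at the project root."""
    candidate = Path(test_dir)
    # Absolute paths are taken as given; relative ones hang off the project root.
    return candidate if candidate.is_absolute() else project_root / candidate
```

With `project_root=/project` and `test_dir="src/test/java"`, this yields `/project/src/test/java`, independent of how deep `module_root` sits.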


Bug 4 (Medium): JS/TS tests_root auto-detection is CWD-relative, not project-relative

File: cli.py:162-164

for test_dir in ["test", "tests", "__tests__"]:
    if Path(test_dir).is_dir():  # checks relative to CWD, not project root
        args.tests_root = test_dir  # stores bare relative string

In a monorepo where CWD != project root, this finds the wrong directory. Also, the bare string "test" stored in args.tests_root will fail the Path(args.tests_root).is_dir() assertion at line 178 if CWD changes between detection and assertion.
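A project-relative variant might look like the sketch below. `detect_tests_root` is a hypothetical helper; it probes the same three directory names but under an explicit project root, and returns an absolute path so that later checks do not depend on the CWD.

```python
from __future__ import annotations

from pathlib import Path

CANDIDATE_TEST_DIRS = ("test", "tests", "__tests__")


def detect_tests_root(project_root: Path) -> Path | None:
    """Return the first conventional test directory found under project_root."""
    for name in CANDIDATE_TEST_DIRS:
        candidate = project_root / name
        if candidate.is_dir():
            return candidate.resolve()  # absolute, so a later CWD change is harmless
    return None
```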


Bug 5 (Low): handle_optimize_all_arg_parsing sets args.all = None wastefully in multi-lang path

Files: cli.py:272, main.py:139

In the multi-language path, handle_optimize_all_arg_parsing runs before any language config is applied. args.module_root is None at that point, so args.all = args.module_root sets it to None. This is mitigated by main.py:148-154 which re-resolves per language, but the mutation is wasted work and the comment at main.py:132-134 acknowledges this oddity.


Bug 6 (Low): Type annotation mismatch for check_running_in_git_repo

File: git_utils.py:149

Signature is (module_root: str) -> bool but called with None at cli.py:73. Works because git.Repo(None, ...) accepts None, but annotation should be str | None.


Bug 7 (Low/Design): JS/TS configs skip normalize_toml_config normalization

File: config_parser.py:209-214

Python configs go through normalize_toml_config() (defaults, key conversion, validation). JS/TS configs come from parse_package_json_config() with separate normalization logic. While both produce underscored keys, the divergent paths could cause subtle inconsistencies (e.g., language key present in JS but absent in Python configs).


Summary: Bugs 1-2 are the most impactful — Bug 1 causes false JS discovery in Python-only projects, and Bug 2 produces wrong test paths for any JS/TS project with explicit testsRoot. Bugs 3-4 affect auto-detection correctness for Java and JS/TS projects.

- Stop upward config walk at git repo root to prevent picking up
  configs from parent directories outside the project
- Require "codeflash" key in package.json for JS/TS multi-language
  discovery, matching Python's [tool.codeflash] opt-in model
- Remove accidental add_help=False from optimize subparser

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@mashraf-222 mashraf-222 force-pushed the cf-1075-multi-language-orchestration branch from d21a088 to ad661a7 Compare April 1, 2026 09:18
@mashraf-222
Contributor Author

mashraf-222 commented Apr 1, 2026

Addressed in ad661a7:

  • Bug 1 (package.json false positives): Fixed — _check_dir_for_configs now requires a "codeflash" key in package.json, matching Python's [tool.codeflash] opt-in model.
  • add_help=False: Fixed — removed, was accidental.
  • Filesystem root walk: Fixed — upward walk now stops at git repo root.

Not addressed (pre-existing, not introduced by this PR):

  • Bug 2 (testsRoot resolution): Real bug in config_js.py:279 but file not touched by this PR. Separate fix needed.
  • Bugs 3-4 (CWD-relative tests_root): Inherited from existing single-language code path. Not a regression.
  • Bugs 5-7: Low/design — args.all=None is handled by optimize_all_requested, type annotation is cosmetic, separate normalization paths are by design.

@HeshamHM28

…vered project root

When running `--file` from a repo containing subdirectories with build files
(e.g. codeflash-java-runtime/ with pom.xml), config discovery picks up those
subdirs as language configs. This causes a ValueError crash because the target
file isn't within the spurious project root.

Two fixes:
- main.py: filter_configs_for_file() excludes configs whose project root
  doesn't contain the target file
- cli.py: only override project_root for Java when module_root is within
  config_path (defense-in-depth)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mashraf-222
Contributor Author

Fixed in b320ef9. Two changes:

  1. main.py: Added filter_configs_for_file() — when --file is specified, excludes discovered configs whose project root doesn't contain the target file. Falls back to all configs if none match.

  2. cli.py: Java project_root override now only applies when module_root is actually within config_path. Previously it was unconditional, which broke when the user's --module-root pointed elsewhere.
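The filtering rule in the first item can be sketched as below, working on bare project-root paths rather than LanguageConfig objects. `filter_roots_for_file` is a hypothetical name, and `Path.is_relative_to` requires Python 3.9 or newer.

```python
from pathlib import Path


def filter_roots_for_file(project_roots: list[Path], target: Path) -> list[Path]:
    """Keep only roots that contain target; fall back to all roots if none do."""
    target = target.resolve()
    matching = [root for root in project_roots if target.is_relative_to(root.resolve())]
    return matching or project_roots
```

The `or project_roots` fallback mirrors the "falls back to all configs if none match" behaviour, which also means a file outside every discovered root still reaches every language.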

Added 3 tests covering the spurious config scenario. E2E validated all 3 reviewer repro commands (Python, Java fixture, JS fixture path).
