Add QRS-Tune PUCT hyperparameter tuning and match statistics #1178
Draft
ChinChangYang wants to merge 15 commits into lightvector:master
Introduce tune-params subcommand for sequential optimization of KataGo PUCT parameters (cpuctExploration, cpuctExplorationLog, cpuctUtilityStdevPrior) using QRS-Tune, a quadratic response surface optimizer with logistic regression and confidence-based pruning. Add match statistics output with Bradley-Terry Elo ratings, Wilson confidence intervals, and pairwise win/loss/draw summaries.
New files:
- cpp/qrstune/QRSOptimizer.h: header-only QRS-Tune optimizer library
- cpp/command/tuneparams.cpp: tune-params subcommand implementation
Modified files:
- cpp/CMakeLists.txt: add tuneparams.cpp to build
- cpp/main.h, cpp/main.cpp: register tune-params subcommand
- cpp/command/match.cpp: add Elo/CI/p-value statistics after matches
https://claude.ai/code/session_01396bbJUdHCsiWRVPM58895
The root-level build/ directory is used for out-of-source CMake builds. https://claude.ai/code/session_01396bbJUdHCsiWRVPM58895
- Add missing NeuralNet::globalCleanup() call before ScoreValue::freeTables() to properly clean up neural net backend state on exit
- Hoist bestWinRate computation out of per-dimension loop in printRegressionCurves() (value is invariant across dimensions)
- Remove unnecessary step-number comments that restated the code
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename ALL_CAPS constants to camelCase (nDims, paramNames, plotW, plotH, qrsDefaultMins/Maxs, rangeMinKeys/MaxKeys, eloPerStrength)
- Change nullptr to NULL to match KataGo's dominant convention
- Change "// Comment" to "//Comment" (no space after //)
- Change "// --- Section ---" separators to "//Section" style
- Leave QRSOptimizer.h unchanged (standalone library, own namespace)
https://claude.ai/code/session_01396bbJUdHCsiWRVPM58895
…stency
Add graceful SIGINT/SIGTERM shutdown to tuneparams, matching the pattern used by match.cpp and other long-running commands. Fix QRSBuffer::prune to retain the highest-quality samples rather than the oldest insertion-order ones when applying min_keep. Add missing inline on gaussianSolve in header.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
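The pruning fix described in this commit can be illustrated with a minimal sketch. It is not the PR's `QRSBuffer::prune`: the `Sample` struct, the `pruneSamples` function, and its parameters are hypothetical names chosen for illustration, and the sketch assumes each sample already carries the model's predicted win rate. Samples predicted to fall more than a margin below the current best are dropped, but at least `minKeep` samples survive, chosen by predicted quality rather than by insertion order.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sample record; the real QRSBuffer layout may differ.
struct Sample {
  std::vector<double> x;  // parameter vector that was played
  double outcome;         // observed game result
  double predicted;       // model's predicted win rate at x
};

// Keep samples whose prediction is within pruneMargin of bestPredicted.
// If fewer than minKeep pass, keep the minKeep highest-predicted samples instead.
// Note: the returned vector is ordered by predicted quality, not insertion order.
inline std::vector<Sample> pruneSamples(std::vector<Sample> samples,
                                        double bestPredicted,
                                        double pruneMargin,
                                        std::size_t minKeep) {
  // Highest predicted quality first; stable_sort keeps insertion order among ties.
  std::stable_sort(samples.begin(), samples.end(),
                   [](const Sample& a, const Sample& b) { return a.predicted > b.predicted; });

  const double cutoff = bestPredicted - pruneMargin;
  std::size_t keep = 0;
  while(keep < samples.size() && samples[keep].predicted >= cutoff)
    keep++;

  // Enforce the floor, but never ask for more samples than exist.
  keep = std::max(keep, std::min(minKeep, samples.size()));
  samples.erase(samples.begin() + static_cast<std::ptrdiff_t>(keep), samples.end());
  return samples;
}
```

Sorting before truncating is what makes the `minKeep` floor select the best samples; truncating an unsorted buffer would keep whichever samples happened to be inserted first.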
Add convergence detection to computeBradleyTerryElo in match.cpp so that a warning is logged when the Newton-Raphson solver hits the 200 iteration limit without converging. Change QRSOptimizer.h free functions from static inline to inline for correct weak external linkage in the namespaced header-only library.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
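To illustrate what convergence detection for a Bradley-Terry fit looks like, here is a small sketch. It deliberately swaps in the simpler minorization-maximization (MM) update instead of the Newton-Raphson solver the commit refers to, so it is not the PR's `computeBradleyTerryElo`. The names `fitBradleyTerry` and `BradleyTerryResult`, the tolerance, and the choice to report Elo relative to player 0 are assumptions; only the 200-iteration cap comes from the commit message. The sketch also assumes every player has scored at least some points, since otherwise that player's strength collapses to zero.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct BradleyTerryResult {
  std::vector<double> elo;  // Elo relative to player 0
  bool converged;           // false if the iteration cap was hit first
  int iterations;           // number of iterations actually run
};

// wins[i][j] = points scored by player i against player j (a draw counts 0.5 each).
inline BradleyTerryResult fitBradleyTerry(const std::vector<std::vector<double>>& wins,
                                          int maxIters = 200,
                                          double tolerance = 1e-9) {
  const std::size_t n = wins.size();
  std::vector<double> gamma(n, 1.0);
  std::vector<double> updated(n, 1.0);
  BradleyTerryResult result{std::vector<double>(n, 0.0), false, 0};

  for(int iter = 0; iter < maxIters; iter++) {
    double total = 0.0;
    for(std::size_t i = 0; i < n; i++) {
      double totalWins = 0.0;
      double denom = 0.0;
      for(std::size_t j = 0; j < n; j++) {
        if(i == j) continue;
        totalWins += wins[i][j];
        denom += (wins[i][j] + wins[j][i]) / (gamma[i] + gamma[j]);
      }
      // MM update; a player with no games keeps its current strength.
      updated[i] = denom > 0.0 ? totalWins / denom : gamma[i];
      total += updated[i];
    }

    double maxChange = 0.0;
    for(std::size_t i = 0; i < n; i++) {
      updated[i] *= static_cast<double>(n) / total;  // fix the scale: mean strength 1
      maxChange = std::max(maxChange, std::fabs(std::log(updated[i] / gamma[i])));
    }
    gamma = updated;
    result.iterations = iter + 1;
    if(maxChange < tolerance) {
      result.converged = true;
      break;
    }
  }

  for(std::size_t i = 0; i < n; i++)
    result.elo[i] = 400.0 * std::log10(gamma[i] / gamma[0]);
  return result;
}
```

A caller would check `result.converged` after the fit and log a warning when it is false, which is the behavior the commit adds for the real solver.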
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The header-only design violated KataGo's convention of separating declarations (.h) from implementations (.cpp). Move all non-trivial function bodies to QRSOptimizer.cpp, replace #pragma once with #ifndef guard, trim header includes, and have predict() delegate to score() to eliminate duplicated logic.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build directory moved under cpp/build which is already gitignored. https://claude.ai/code/session_01396bbJUdHCsiWRVPM58895
Move wilsonCI95 and oneTailedPValue from static functions in match.cpp to FancyMath namespace in core/fancymath.h/.cpp, following KataGo's pattern of placing reusable math utilities in core namespaces.
Move computeBradleyTerryElo from static function in match.cpp to ComputeElos namespace in core/elo.h/.cpp, alongside the existing Elo computation utilities.
match.cpp now calls FancyMath::wilsonCI95(), FancyMath::oneTailedPValue(), and ComputeElos::computeBradleyTerryElo() instead of file-local statics.
tuneparams.cpp static functions (qrsDimToReal, qrsToPUCT, printRegressionCurves) are kept as file-local statics since they are command-specific helpers, matching KataGo's pattern for command files.
https://claude.ai/code/session_01396bbJUdHCsiWRVPM58895
- Use existing ELO_PER_LOG_GAMMA constant instead of recomputing
- Hoist Newton-loop allocations in computeBradleyTerryElo (grad, H, aug, delta)
- Hoist Newton-loop allocations in QRSModel::fit (grad, negH)
- Remove dead sigReceived state in tuneparams.cpp
- Add n <= 0 guard to FancyMath::wilsonCI95 to prevent division by zero
- Use std::move in QRSBuffer::prune for kept sample vectors
- Fix cpp/README.md: qrstune is no longer header-only, fix algorithm name
https://claude.ai/code/session_01396bbJUdHCsiWRVPM58895
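For readers unfamiliar with the statistics involved, the following is a minimal sketch of a 95% Wilson score interval with an `n <= 0` guard, plus a one-tailed p-value. The function names, signatures, and the values returned by the guards are assumptions and do not reflect the actual `FancyMath` API. The p-value here uses a normal approximation against a null win rate of 0.5, which may differ from the method used in the PR.

```cpp
#include <cmath>
#include <utility>

// 95% Wilson score interval for `wins` successes out of n trials.
// Assumption: with no trials, return the uninformative interval [0, 1]
// instead of dividing by zero.
inline std::pair<double, double> wilsonInterval95(double wins, double n) {
  if(n <= 0.0)
    return {0.0, 1.0};
  const double z = 1.959963984540054;  // two-sided 95% normal quantile
  const double z2 = z * z;
  const double p = wins / n;
  const double denom = 1.0 + z2 / n;
  const double center = (p + z2 / (2.0 * n)) / denom;
  const double half = z * std::sqrt(p * (1.0 - p) / n + z2 / (4.0 * n * n)) / denom;
  return {center - half, center + half};
}

// One-tailed p-value for "the true win rate exceeds 0.5", using a normal
// approximation to the binomial. Assumption: return 0.5 when there is no data.
inline double oneTailedPValueNormal(double wins, double n) {
  if(n <= 0.0)
    return 0.5;
  const double z = (wins - 0.5 * n) / std::sqrt(0.25 * n);
  return 0.5 * std::erfc(z / std::sqrt(2.0));
}
```

Unlike the simpler normal (Wald) interval, the Wilson interval stays inside [0, 1] and behaves sensibly for small samples and lopsided results, which is the usual reason to prefer it for match statistics.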
Readability:
- Add file-level comment explaining the QRS-Tune algorithm
- Document feature layout with example (D=2: [1, a, b, a^2, b^2, a*b])
- Name magic numbers: SINGULAR_THRESHOLD, CONVERGENCE_THRESHOLD, SIGMOID_CLAMP
- Rename shadow variable 'f' to 'mult' in gaussianSolve
- Rename terse variables: z->logit, w->hessianWeight, resid->residual, maxd->maxStep, b_lin->linearCoeffs, b_quad->quadCoeffs, b_cross->crossCoeffs, p_best->bestPrediction, kv->entry, nx/ny->newXs/newYs
- Add phase comments in fit() documenting Newton-Raphson steps
- Add algorithm-level comment above fit() explaining the objective function
Tests (8 test cases):
- numFeatures: verify D=0,1,2,3
- computeFeatures: verify feature vector for D=2
- sigmoid: boundary, midpoint, and clamp behavior
- gaussianSolve: 2x2 system, 3x3 identity, singular detection
- QRSModel fit+predict: 1D and 2D separable data
- QRSModel mapOptimum: optimum better than anti-optimum
- QRSTuner end-to-end: 100 trials with deterministic seed
- QRSBuffer prune: verify buffer size reduction
https://claude.ai/code/session_01396bbJUdHCsiWRVPM58895
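The feature layout mentioned in this commit (for D=2: [1, a, b, a^2, b^2, a*b]) can be sketched as follows. This is an independent illustration, not the code in `QRSOptimizer.cpp`: the function bodies are written from the layout description alone, `predictWinRate` is a hypothetical helper, and the clamp value 40.0 is an assumption because the commit names a `SIGMOID_CLAMP` constant without showing its value.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Feature count for D dimensions:
// 1 intercept + D linear + D squared + D*(D-1)/2 cross terms = (D+1)(D+2)/2.
inline std::size_t numFeatures(std::size_t dims) {
  return (dims + 1) * (dims + 2) / 2;
}

// Layout: [1, x_0..x_{D-1}, x_0^2..x_{D-1}^2, x_i*x_j for i < j].
inline std::vector<double> computeFeatures(const std::vector<double>& x) {
  std::vector<double> features;
  features.reserve(numFeatures(x.size()));
  features.push_back(1.0);
  for(double v : x)
    features.push_back(v);
  for(double v : x)
    features.push_back(v * v);
  for(std::size_t i = 0; i < x.size(); i++)
    for(std::size_t j = i + 1; j < x.size(); j++)
      features.push_back(x[i] * x[j]);
  return features;
}

// Assumed clamp value; it bounds the logit so exp() cannot overflow.
constexpr double kSigmoidClamp = 40.0;

inline double sigmoid(double logit) {
  const double clamped = std::max(-kSigmoidClamp, std::min(kSigmoidClamp, logit));
  return 1.0 / (1.0 + std::exp(-clamped));
}

// Hypothetical helper: predicted win rate is the sigmoid of the quadratic form
// beta . features(x).
inline double predictWinRate(const std::vector<double>& beta, const std::vector<double>& x) {
  const std::vector<double> features = computeFeatures(x);
  double logit = 0.0;
  for(std::size_t k = 0; k < features.size() && k < beta.size(); k++)
    logit += beta[k] * features[k];
  return sigmoid(logit);
}
```

For the three PUCT parameters tuned here, D=3 gives 10 coefficients, so the model stays small enough to refit after every game.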
Summary
Adds a tune-params subcommand for automated PUCT hyperparameter tuning using QRS (Quadratic Response Surface) optimization. The optimizer runs sequential head-to-head matches between a base bot and an experiment bot, fitting a quadratic logistic regression model to propose better parameter combinations over time.
Three PUCT parameters are tuned: cpuctExploration, cpuctExplorationLog, and cpuctUtilityStdevPrior.
Changes
New: tune-params subcommand (cpp/command/tuneparams.cpp)
- … QRSTuner::addResult()
- … QRSTuner::nextSample() (MAP optimum + decaying Gaussian noise)
- … match.cpp)
New: cpp/qrstune/QRSOptimizer.h + QRSOptimizer.cpp
- … [1, x_i, x_i², x_i·x_j]
- … prune_margin below the current MAP best, keeping at least min_keep highest-quality samples
- … sigma_init to sigma_fin
Enhanced: match command statistics (cpp/command/match.cpp)
New: cpp/configs/tune_params_example.cfg
Example config with 500 trials, numGameThreads = 8, maxVisits = 500, with comments explaining all tuning-specific keys and search range defaults.
Documentation
- README.md: Added command example for tune-params
- cpp/README.md: Added qrstune/ to source folder summary and tuneparams.cpp to command list
Files changed
- cpp/command/tuneparams.cpp: New subcommand
- cpp/qrstune/QRSOptimizer.h + QRSOptimizer.cpp: QRS optimizer (model, buffer, tuner)
- cpp/command/match.cpp: Bradley-Terry Elo and match statistics
- cpp/configs/tune_params_example.cfg: Example config
- cpp/CMakeLists.txt: Build integration
- README.md, cpp/README.md: Documentation updates
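The sampling step named in the description (MAP optimum plus decaying Gaussian noise, with the noise moving from sigma_init to sigma_fin) can be sketched as below. This is a guess at the shape of the idea, not the PR's `QRSTuner::nextSample()`: the geometric decay schedule, the normalized search box of [-1, 1] per dimension, and the helper names `noiseSigma`, `proposeSample`, and `toRealRange` are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Assumed schedule: geometric interpolation from sigmaInit at the first trial
// to sigmaFin at the last trial. Both sigmas are assumed to be positive.
inline double noiseSigma(int trial, int numTrials, double sigmaInit, double sigmaFin) {
  if(numTrials <= 1)
    return sigmaFin;
  double t = static_cast<double>(trial) / (numTrials - 1);
  t = std::max(0.0, std::min(1.0, t));
  return sigmaInit * std::pow(sigmaFin / sigmaInit, t);
}

// Perturb the model's current best point with Gaussian noise, then clamp each
// coordinate to the assumed normalized range [-1, 1].
inline std::vector<double> proposeSample(const std::vector<double>& mapOptimum,
                                         double sigma,
                                         std::mt19937_64& rng) {
  std::normal_distribution<double> noise(0.0, 1.0);
  std::vector<double> sample(mapOptimum.size());
  for(std::size_t d = 0; d < mapOptimum.size(); d++) {
    const double v = mapOptimum[d] + sigma * noise(rng);
    sample[d] = std::max(-1.0, std::min(1.0, v));
  }
  return sample;
}

// Map a normalized coordinate in [-1, 1] to a real parameter range [lo, hi].
inline double toRealRange(double x, double lo, double hi) {
  return lo + 0.5 * (x + 1.0) * (hi - lo);
}
```

Large early noise spreads samples across the search range so the quadratic model is well determined, while small late noise concentrates games near the estimated optimum where the remaining uncertainty matters most.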