[AIMIGRAPHX-828] Cross Compile Pt 2: add env variable to enable cross compilation#4795
[AIMIGRAPHX-828] Cross Compile Pt 2: add env variable to enable cross compilation#4795
Conversation
…vice_name_change
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #4795 +/- ##
===========================================
- Coverage 92.49% 92.49% -0.00%
===========================================
Files 583 583
Lines 29562 29565 +3
===========================================
+ Hits 27343 27345 +2
- Misses 2219 2220 +1
🚀 New features to boost your workflow:
|
|
@pfultz2 @CharlieL7 this is the implementation using environment variables. I'm thinking of instead adding it to compile options with an additional map that can be used to tweak number of CUs, chiplets, etc. Also let me know what you think of the rest of the changes. |
There was a problem hiding this comment.
Pull request overview
Adds a GPU cross-compilation mode driven by environment variables so MIGraphX can compile for a specified gfx* target without depending on a physical device, primarily by synthesizing GPU device properties and disabling tuning/finalization steps that require execution.
Changes:
- Introduces
MIGRAPHX_GPU_ARCH(plus CU/chiplet overrides) and a synthetic HIP device-props path to enable GPU cross-compilation. - Disables kernel benchmarking/tuning paths when cross-compiling (problem cache load + compile_ops benchmarking).
- Updates GPU passes and data-type elimination logic to query device capabilities via
contextwhere available; documents the new env vars and adds a changelog entry.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/targets/gpu/target.cpp | Adds env vars, creates cross-compile GPU context, and avoids problem-cache load in cross-compile mode. |
| src/targets/gpu/include/migraphx/gpu/eliminate_data_type_for_gpu.hpp | Adds context* plumbing to support context-based device capability queries. |
| src/targets/gpu/include/migraphx/gpu/cross_compile_device.hpp | Declares helper to build synthetic hipDeviceProp_t for cross compilation. |
| src/targets/gpu/include/migraphx/gpu/context.hpp | Adds cross-compile constructors/mode and blocks runtime execution paths in that mode. |
| src/targets/gpu/eliminate_data_type_for_gpu.cpp | Uses context-aware queries (when provided) for fp8/bf16 capability checks. |
| src/targets/gpu/device_name.cpp | Refactors hipBLASLt support helpers and retains context/no-context query variants. |
| src/targets/gpu/cross_compile_device.cpp | Implements synthetic HIP device properties for cross compilation. |
| src/targets/gpu/compile_ops.cpp | Skips benchmarking/tuning when cross-compiling. |
| src/targets/gpu/CMakeLists.txt | Adds new cross-compilation device source to the GPU library. |
| src/program.cpp | Skips finalize/module-finalize when MIGRAPHX_GPU_ARCH is set. |
| src/include/migraphx/program.hpp | Declares MIGRAPHX_GPU_ARCH env var for core program compilation logic. |
| docs/reference/MIGraphX-dev-env-vars.rst | Documents MIGRAPHX_GPU_ARCH, CU/chiplet overrides, and stated limitations. |
| CHANGELOG.md | Adds a changelog entry for cross-compilation support. |
| @@ -342,7 +344,8 @@ void program::compile(const target& t, compile_options options) | |||
| MIGRAPHX_THROW("Dangling reference in module " + mod->name() + " from instruction " + | |||
| std::to_string(index)); | |||
| } | |||
| mod->finalize(this->impl->contexts); | |||
| if(not cross_compiling) | |||
| mod->finalize(this->impl->contexts); | |||
| if(not arch.empty()) | ||
| { | ||
| auto num_cu = value_of(MIGRAPHX_GPU_NUM_CU{}, 120); | ||
| auto num_chiplets = value_of(MIGRAPHX_GPU_NUM_CHIPLETS{}, 1); |
| #include <migraphx/gpu/export.h> | ||
| #include <migraphx/config.hpp> | ||
| #include <hip/hip_runtime_api.h> | ||
| #include <string> | ||
|
|
||
| namespace migraphx { | ||
| inline namespace MIGRAPHX_INLINE_NS { | ||
| namespace gpu { | ||
|
|
||
| /// Populate a hipDeviceProp_t with synthetic values for cross-compilation. | ||
| /// Used when no physical GPU is present and the target architecture | ||
| /// is specified via environment variables. | ||
| MIGRAPHX_GPU_EXPORT hipDeviceProp_t make_cross_compile_device_props(const std::string& arch_name, | ||
| std::size_t cu_count); |
| struct MIGRAPHX_GPU_EXPORT eliminate_data_type_for_gpu | ||
| { | ||
| bool disable_64bit = false; | ||
| const context* ctx = nullptr; | ||
| std::string name() const { return "gpu::eliminate_data_type_for_gpu"; } | ||
| void apply(module_pass_manager& mpm) const; |
| | Enables cross-compilation mode by specifying a target GPU architecture without requiring a physical GPU. | ||
| | When set, kernel benchmarking and finalization are skipped. MIOpen, hipBLASLt, and CK operations are currently not supported in this mode. | ||
|
|
||
| - | Takes a valid GPU architecture string (e.g. ``gfx942``, ``gfx1100``). | ||
|
|
||
| | Default: Not set. A physical GPU is used. |
| if(string_value_of(MIGRAPHX_GPU_ARCH{}).empty()) | ||
| this->finalize(); |
| ctx.set_exhaustive_tune_flag(options.exhaustive_tune); | ||
| ctx.load_problem_cache(); | ||
| if(not ctx.is_cross_compile()) | ||
| ctx.load_problem_cache(); |
There was a problem hiding this comment.
The problem_cache should still be loaded when cross compiling. Otherwise compile_ops will not pick the fastest config.
| } | ||
| } | ||
| this->finalize(); | ||
| if(string_value_of(MIGRAPHX_GPU_ARCH{}).empty()) |
There was a problem hiding this comment.
I dont think we should use env variable here. Instead the target or context should tell us if we can run finalize.
| auto&& passes = t.get_passes(this->impl->contexts.front(), options); | ||
| run_passes(*this, passes, options.trace); | ||
| auto mods = this->get_modules(); | ||
| bool cross_compiling = not string_value_of(MIGRAPHX_GPU_ARCH{}).empty(); |
There was a problem hiding this comment.
I dont think we should use env variable here. Instead the target or context should tell us if we can run finalize.
| if(ctx != nullptr) | ||
| return f(*ctx); | ||
| return f(); | ||
| } |
There was a problem hiding this comment.
Remove this. Everything should just use the function that takes the context. The other overloads should be removed, in a later PR.
Motivation
builds on top of part 1: #4771
Technical Details
MIGRAPHX_GPU_ARCHwhich can be set to whatever is supported by the compiler.Changelog Category
Add a
CHANGELOG.mdentry for any option other thanNot Applicable