Add symbolic expression by pfultz2 · Pull Request #4782 · ROCm/AMDMIGraphX

pfultz2 · 2026-04-13T19:03:21Z

Motivation

This PR introduces a revamped symbolic expression subsystem (migraphx::sym) with richer scalar/interval semantics, parsing, rewriting/simplification, and broad new test coverage. It also adds a reusable lightweight parser utility used by the symbolic expression parser.

Technical Details

Changelog Category

Add a CHANGELOG.md entry for any option other than Not Applicable

- Added: New functionality.
- Changed: Changes to existing functionality.
- Removed: Functionality or support that has been removed. (Compared to a previous release)
- Optimized: Component performance that has been optimized or improved.
- Resolved Issues: Known issues from a previous version that have been resolved.
- Not Applicable: This PR is not to be included in the changelog.

shivadbhavsar

I did a quick initial scan, focusing mostly on the interface, and it should be relatively straightforward to integrate (I think a few explicit conversions from scalar to size_t need to be added in some places). I'd prefer to get #4702 in first and make sure that all the shape integration tests also pass here.

I'll do another round soon to actually go through the implementation details.

shivadbhavsar · 2026-04-16T00:44:04Z

+        : pimpl(make_impl(std::move(node), std::move(children)))
+    {
+    }
+


Can we add an is_literal somewhere in here? It's needed for is_fixed in the shape class.

You can check the name of the node: expr.name() == "literal".

shivadbhavsar · 2026-04-16T19:47:45Z

+            scalar_max(scalar_max(p1, p2), scalar_max(p3, p4))};
+}
+
+interval operator/(interval a, interval b)


I think this should probably throw if interval b crosses 0 (e.g. [-2, 4]). It won't really affect our current use case, but it would be good for correctness.

So [-2, 4] doesn't mean we will divide by zero. These are ranges of max and min, and that doesn't mean every value in between is always used.

Well, for floats especially, the true resulting interval would be unbounded. And for integers I'm not sure this gives the correct result either; the bounds would have to be computed at -1 and 1.

I see, the formula is derived by taking the reciprocal. The reciprocal formula is [1/max, 1/min], so for [-2, 4] it produces [1/4, -1/2], which is not a valid interval because the lower bound exceeds the upper bound. The correct answer is the union of two intervals, (-inf, -1/2] U [1/4, inf), which as a single interval will have to be (-inf, inf).
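The widening discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: the `interval` struct uses plain `double` endpoints instead of the PR's `scalar`, and `divide` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical stand-in for the PR's interval type (double endpoints only).
struct interval
{
    double min;
    double max;
};

// Divide a by b, widening to (-inf, inf) when the divisor range contains zero.
inline interval divide(interval a, interval b)
{
    assert(a.min <= a.max and b.min <= b.max);
    const double inf = std::numeric_limits<double>::infinity();
    if(b.min <= 0.0 and b.max >= 0.0)
        return {-inf, inf};
    // b is strictly positive or strictly negative here, so the quotient is
    // monotonic in each argument and the extremes occur at endpoint pairs.
    const double p1 = a.min / b.min;
    const double p2 = a.min / b.max;
    const double p3 = a.max / b.min;
    const double p4 = a.max / b.max;
    return {std::min({p1, p2, p3, p4}), std::max({p1, p2, p3, p4})};
}
```

With this sketch, `divide({1, 2}, {-2, 4})` yields the unbounded interval, while `divide({2, 8}, {2, 4})` yields [0.5, 4]. Throwing instead of widening would be an equally defensible policy.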

shivadbhavsar · 2026-04-16T19:57:37Z

+
+    if(var_optimals.empty())
+        return {eval({})};


why do we want to add optimals at all in this case?

shivadbhavsar · 2026-04-16T20:25:06Z

-expr operator+(const expr& a, const expr& b)
+bool expr::is_raw() const { return pimpl and pimpl->raw_flag; }
+
+const std::vector<expr>& expr::children() const { return pimpl->children; }


missing null check?

shivadbhavsar · 2026-04-16T21:21:21Z

+    friend bool operator==(const literal_node& a, const literal_node& b)
+    {
+        return scalar_invoke_common<bool>(
+            [](auto a, auto b) { return float_equal(a, b); }, a.val, b.val);


this makes lit(1) == lit(1.0). But they will hash differently when using expr in maps and sets

Ok let me fix the hashing.
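One way to keep equality and hashing consistent is to hash the same normalized value that equality compares. The sketch below assumes a two-alternative `scalar` variant and hypothetical helper names (`as_double`, `scalar_equal`, `scalar_hash`); it is not the fix that landed in the PR. Note that normalizing through `double` makes very large `int64_t` values that differ only in their low bits compare equal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <variant>

// Hypothetical stand-in for the PR's scalar type.
using scalar = std::variant<std::int64_t, double>;

// Convert either alternative to double, the common type used for comparison.
inline double as_double(const scalar& s)
{
    return std::visit([](auto x) { return static_cast<double>(x); }, s);
}

// Equality that treats the integer 1 and the double 1.0 as the same value.
inline bool scalar_equal(const scalar& a, const scalar& b)
{
    return as_double(a) == as_double(b);
}

// Hash the normalized value so that equal scalars always hash equally.
inline std::size_t scalar_hash(const scalar& s)
{
    double d = as_double(s);
    if(d == 0.0)
        d = 0.0; // fold -0.0 into +0.0, since the two compare equal
    return std::hash<double>{}(d);
}
```

The requirement for unordered containers is only one-directional: equal keys must produce equal hashes, while unequal keys may still collide.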

shivadbhavsar · 2026-04-16T21:29:47Z

 }

-static expr_ptr parse_unary(const char*& p)
+std::size_t expr::eval_uint(const std::unordered_map<expr, std::size_t>& symbol_map) const


might be good to explicitly throw if eval result is negative

Copilot

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 12 comments.

Copilot · 2026-04-24T03:58:26Z

+    bool empty() const { return start == last; }
+
+    std::size_t size() const { return std::distance(start, last); }


iterator_range::size() uses std::distance, but ranges.hpp doesn’t include <iterator>. Please include it explicitly rather than relying on transitive includes.

Copilot · 2026-04-24T03:58:27Z

+    auto compute = [&](const scalar& x, const scalar& y) -> scalar {
+        if(std::holds_alternative<int64_t>(x) and std::holds_alternative<int64_t>(y))
+            return f(std::get<int64_t>(x), std::get<int64_t>(y));
+        return fd(to<double>(x), to<double>(y));
+    };


interval operator%: the integer branch uses % directly and will hit UB if any endpoint of b is 0 (or if b spans 0). This should be detected and handled explicitly.
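A possible shape for the explicit handling, sketched for integer endpoints only. The `int_interval` type and `mod` function are hypothetical names, throwing is one assumed policy among several, and the bound is conservative rather than tight. The sketch also assumes no endpoint equals the minimum `int64_t` value, since `std::abs` is undefined there.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>

// Hypothetical integer-only interval used for illustration.
struct int_interval
{
    std::int64_t min;
    std::int64_t max;
};

// Conservative bounds for a % b with C++ semantics (sign follows the dividend).
inline int_interval mod(int_interval a, int_interval b)
{
    assert(a.min <= a.max and b.min <= b.max);
    if(b.min <= 0 and b.max >= 0)
        throw std::domain_error("interval %: divisor range contains zero");
    // |a % b| < |b| and |a % b| <= |a|, so bound the magnitude by both.
    const std::int64_t m  = std::max(std::abs(b.min), std::abs(b.max)) - 1;
    const std::int64_t lo = a.min < 0 ? std::max(a.min, -m) : 0;
    const std::int64_t hi = a.max > 0 ? std::min(a.max, m) : 0;
    return {lo, hi};
}
```

For example, `mod({0, 10}, {3, 3})` gives [0, 2], and any divisor range such as [-2, 4] or [0, 3] is rejected before `%` is ever evaluated.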

Copilot · 2026-04-24T03:58:27Z

+    if((std::isdigit(p.peek_char()) == 0) and p.peek_char() != '.')
+        return {};


std::isdigit(p.peek_char()) passes a char directly into ctype; if char is signed this is UB for negative values. Cast to unsigned char before calling ctype functions (consistent with src/driver/main.cpp:357-359).
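A small set of wrappers is one way to apply the cast once instead of at every call site. The names `is_digit`, `is_alpha`, and `is_ident_start` are hypothetical and do not come from the PR.

```cpp
#include <cassert>
#include <cctype>

// The <cctype> functions require an argument representable as unsigned char
// (or EOF), so cast before calling to stay defined for bytes >= 0x80.
inline bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

inline bool is_alpha(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; }

// Identifier start as used by the parser under review: a letter or '_'.
inline bool is_ident_start(char c) { return is_alpha(c) or c == '_'; }
```

A byte such as `static_cast<char>(0xE9)` is then classified safely instead of being passed to `std::isdigit` as a negative `int`.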

Copilot · 2026-04-24T03:58:27Z

+            overloaded{[](const literal_node& n) -> std::optional<scalar> { return n.val; },
+                       [](const auto&) -> std::optional<scalar> { return std::nullopt; }},


In eval_uint, the replacer returns nullopt for variable_node when the variable isn’t in the map. For a leaf variable this means generic_eval will eventually try std::get<op_node>(...) on a variable_node, throwing std::bad_variant_access instead of a controlled MIGRAPHX_THROW (e.g. “unbound symbol”). Handle unbound variables explicitly in the replacer (and consider validating that map keys are variables).

Suggested change

overloaded{[](const literal_node& n) -> std::optional<scalar> { return n.val; },

[](const auto&) -> std::optional<scalar> { return std::nullopt; }},

overloaded{

[](const literal_node& n) -> std::optional<scalar> { return n.val; },

[](const variable_node&) -> std::optional<scalar> {

MIGRAPHX_THROW("Unbound symbol in eval_uint");

},

[](const auto&) -> std::optional<scalar> { return std::nullopt; }},

Copilot · 2026-04-24T03:58:27Z

+    if constexpr(std::is_integral<T>{})
+        return lit(static_cast<int64_t>(x));
+    else
+        return lit(static_cast<double>(x));


arg(T) casts integral values to int64_t directly. For large unsigned values this is implementation-defined / lossy and can wrap. Prefer lit(make_scalar(x)) here to match the clipping/typing rules used elsewhere.

Suggested change

if constexpr(std::is_integral<T>{})

return lit(static_cast<int64_t>(x));

else

return lit(static_cast<double>(x));

return lit(make_scalar(x));
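To illustrate the wrap that the comment warns about, here is a sketch of a clipping conversion. `clamp_to_int64` is a hypothetical helper, and the behavior of the PR's `make_scalar` is not shown in this thread, so saturating at the maximum `int64_t` value is an assumption. The sketch also assumes `T` is no wider than 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Convert any integral value to int64_t, saturating instead of wrapping
// when an unsigned value exceeds the int64_t range.
template <class T>
std::int64_t clamp_to_int64(T x)
{
    static_assert(std::is_integral<T>{}, "integral types only");
    if constexpr(std::is_unsigned<T>{})
    {
        constexpr auto limit =
            static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
        if(static_cast<std::uint64_t>(x) > limit)
            return std::numeric_limits<std::int64_t>::max();
    }
    return static_cast<std::int64_t>(x);
}
```

A plain `static_cast<int64_t>` of the largest `uint64_t` value would produce -1 on common platforms, whereas this version returns the largest representable `int64_t`.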

Copilot · 2026-04-24T03:58:28Z

+            MIGRAPHX_CALL_FUNC(ceil),
+        };
+#undef MIGRAPHX_CALL_FUNC
+    return functions.at(name)(args);


call_function uses functions.at(name), so unknown names throw std::out_of_range and lose useful context. Prefer a find() + MIGRAPHX_THROW that includes the unknown function/operator name (and parse position when called from parsing).

Suggested change

return functions.at(name)(args);

auto it = functions.find(name);

if(it == functions.end())

MIGRAPHX_THROW("Unknown function/operator: " + name);

return it->second(args);

Copilot · 2026-04-24T03:58:28Z

+    char c = p.peek_char();
+    if((std::isalpha(c) == 0) and c != '_')
+        return {};
+    auto name = p.parse_while([](unsigned char ch) { return std::isalnum(ch) or ch == '_'; });


Same issue here: std::isalpha(c) is called with a char. Cast to unsigned char (and similarly for std::isalnum) to avoid UB on non-ASCII bytes.

Copilot · 2026-04-24T03:58:28Z

+    return to<std::size_t>(generic_eval<scalar>(*this, [&](const expr& e) -> std::optional<scalar> {
+        auto it = symbol_map.find(e);


Also, eval_uint converts the final scalar to std::size_t without checking for negative results. A negative expression value will silently wrap to a huge size_t. Consider preserving the previous behavior of throwing if the evaluated value is negative.

Copilot · 2026-04-24T03:58:29Z

+    auto letters = p.parse_while([](char c) { return std::isalpha(c); });
+    EXPECT(letters == "abc");
+    auto digits = p.parse_while([](char c) { return std::isdigit(c); });


std::isalpha / std::isdigit are called with char values here. If char is signed, passing negative values to ctype is UB. Please cast to unsigned char in these predicates (consistent with src/driver/main.cpp:357-359).

Suggested change

auto letters = p.parse_while([](char c) { return std::isalpha(c); });

EXPECT(letters == "abc");

auto digits = p.parse_while([](char c) { return std::isdigit(c); });

auto letters =

p.parse_while([](char c) { return std::isalpha(static_cast<unsigned char>(c)); });

EXPECT(letters == "abc");

auto digits =

p.parse_while([](char c) { return std::isdigit(static_cast<unsigned char>(c)); });

Copilot · 2026-04-24T03:58:29Z

+template <class Iterator>
+void hash_range(std::size_t& seed, Iterator first, Iterator last)
+{
+    std::for_each(first, last, [&](const auto& x) { hash_combine(seed, x); });
+}


hash_range uses std::for_each, but this header doesn’t include <algorithm>, which can break compilation depending on include order. Please add <algorithm> (or avoid std::for_each).

Copilot

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 14 comments.

Copilot · 2026-04-24T15:51:05Z

+    char c = p.peek_char();
+    if((std::isalpha(c) == 0) and c != '_')
+        return {};
+    auto name = p.parse_while([](unsigned char ch) { return std::isalnum(ch) or ch == '_'; });


parse_func_or_var uses std::isalpha(c) on a char from peek_char(). In this repo we typically cast to unsigned char before ctype functions to avoid UB when char is signed (e.g. src/driver/main.cpp:357-359). Please use std::isalpha(static_cast<unsigned char>(c)) here (and similarly for any other new ctype calls on char).

Copilot · 2026-04-24T15:51:06Z

+interval log(interval x) { return {std::log(to<double>(x.min)), std::log(to<double>(x.max))}; }

-static add_parts extract_add(const expr_ptr& e)
+interval sqrt(interval x)
 {
-    return std::visit(
-        overloaded{[](const integer_data& d) -> add_parts { return {d.value, {}}; },
-                   [](const add_data& d) -> add_parts { return {d.constant, d.terms}; },
-                   [&](const mul_data& d) -> add_parts {
-                       auto base = build_mul(1, d.factors);
-                       return {0, {{base, d.coefficient}}};
-                   },
-                   [&](const auto&) -> add_parts { return {0, {{e, 1}}}; }},
-        e->data);
+    auto lo = std::sqrt(std::max(0.0, to<double>(x.min)));
+    auto hi = std::sqrt(std::max(0.0, to<double>(x.max)));
+    return {lo, hi};


interval log(interval) and interval sqrt(interval) don't handle out-of-domain inputs robustly. log on an interval with min <= 0 will produce NaNs via std::log, and sqrt currently clamps negative endpoints to 0 which can yield a seemingly-valid interval for an invalid domain (e.g. sqrt([-4,-1]) becomes [0,0]). Consider returning a conservative extended-real bound (e.g. log([-1, 5]) -> [-inf, log(5)]) or throwing when the interval is entirely out of domain (max <= 0 for log, max < 0 for sqrt).
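The policy suggested above (an extended-real bound for partially valid inputs, an error for entirely invalid ones) might look like the following sketch. The `interval` struct with `double` endpoints and the names `log_interval` and `sqrt_interval` are assumptions for illustration, and `std::domain_error` stands in for `MIGRAPHX_THROW`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for the PR's interval type (double endpoints only).
struct interval
{
    double min;
    double max;
};

// log over an interval: reject inputs with no valid point, and use -inf as
// the lower bound when the interval reaches zero or below.
inline interval log_interval(interval x)
{
    assert(x.min <= x.max);
    if(x.max <= 0.0)
        throw std::domain_error("log: interval is entirely out of domain");
    const double lo = x.min > 0.0 ? std::log(x.min)
                                  : -std::numeric_limits<double>::infinity();
    return {lo, std::log(x.max)};
}

// sqrt over an interval: reject inputs that are entirely negative, and clamp
// only the lower endpoint of a partially valid interval.
inline interval sqrt_interval(interval x)
{
    assert(x.min <= x.max);
    if(x.max < 0.0)
        throw std::domain_error("sqrt: interval is entirely out of domain");
    return {std::sqrt(std::max(0.0, x.min)), std::sqrt(x.max)};
}
```

Under this sketch, `log_interval({-1, 5})` returns [-inf, log(5)], and `sqrt_interval({-4, -1})` throws instead of returning [0, 0].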

Copilot · 2026-04-24T15:51:06Z

 std::size_t expr::eval_uint(const std::unordered_map<expr, std::size_t>& symbol_map) const
 {
-    if(empty())
-        return 0;
-    binding_map bindings;
-    for(const auto& [k, v] : symbol_map)
-    {
-        if(k.empty() or not holds<symbol_data>(k.p->node))
-            MIGRAPHX_THROW("sym::expr::eval_uint: map key '" + k.to_string() + "' is not a symbol");
-        bindings[k.p->node] = v;
-    }
-    auto v = eval_direct(p->node, bindings);
-    if(v < 0)
-        MIGRAPHX_THROW("sym::expr::eval_uint: expression evaluated to negative value");
-    return v;
+    return to<std::size_t>(generic_eval<scalar>(*this, [&](const expr& e) -> std::optional<scalar> {
+        auto it = symbol_map.find(e);
+        if(it != symbol_map.end())
+            return make_scalar(it->second);
+        return std::visit(
+            overloaded{[](const literal_node& n) -> std::optional<scalar> { return n.val; },
+                       [](const auto&) -> std::optional<scalar> { return std::nullopt; }},
+            get_node(e));
+    }));
 }


expr::eval_uint no longer handles empty expressions and can hit get_node(e) on an empty expr (assert/deref). It also converts the evaluated scalar to std::size_t without checking for negative or non-integer results, so lit(-1).eval_uint({}) would silently wrap to a huge value. Please add an explicit empty guard and validate the evaluated scalar is a non-negative integer before converting/returning.
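The validation step requested here can be isolated into a checked conversion. The sketch below assumes a two-alternative `scalar` variant; `to_size_checked` is a hypothetical name, and `std::domain_error` stands in for `MIGRAPHX_THROW`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <variant>

// Hypothetical stand-in for the PR's scalar type.
using scalar = std::variant<std::int64_t, double>;

// Convert to std::size_t, rejecting negative, fractional, non-finite,
// or out-of-range values instead of letting them wrap.
inline std::size_t to_size_checked(const scalar& s)
{
    constexpr auto size_max = std::numeric_limits<std::size_t>::max();
    if(const auto* i = std::get_if<std::int64_t>(&s))
    {
        if(*i < 0)
            throw std::domain_error("eval_uint: expression evaluated to negative value");
        if(static_cast<std::uint64_t>(*i) > size_max)
            throw std::domain_error("eval_uint: result does not fit in size_t");
        return static_cast<std::size_t>(*i);
    }
    const double d = std::get<double>(s);
    if(not std::isfinite(d) or d < 0.0 or d != std::floor(d))
        throw std::domain_error("eval_uint: result is not a non-negative integer");
    // 2^digits is the first value that no longer fits in size_t.
    if(d >= std::ldexp(1.0, std::numeric_limits<std::size_t>::digits))
        throw std::domain_error("eval_uint: result does not fit in size_t");
    return static_cast<std::size_t>(d);
}
```

With such a helper, a value of -1 raises an error rather than becoming the largest `size_t`, and 2.5 is rejected rather than truncated to 2.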

Copilot · 2026-04-24T15:51:06Z

+scalar expr::eval(const std::unordered_map<expr, scalar>& vars) const
 {
-    if(a.empty() or b.empty())
-        return {};
-    return {std::make_shared<expr::impl>(make_add(a.p->node, b.p->node))};
+    return generic_eval<scalar>(*this, [&](const expr& e) -> std::optional<scalar> {
+        auto it = vars.find(e);
+        if(it != vars.end())
+            return it->second;
+        return std::visit(
+            overloaded{[](const literal_node& n) -> std::optional<scalar> { return n.val; },
+                       [](const auto&) -> std::optional<scalar> { return std::nullopt; }},
+            get_node(e));
+    });


expr::eval calls get_node(e) via the replace lambda even when *this is empty; get_node asserts pimpl != nullptr, so evaluating an empty expression will crash. Please add an explicit if(empty()) behavior (e.g., throw MIGRAPHX_THROW with a clear message) or handle empty in the replace lambda before touching get_node.

Copilot · 2026-04-24T15:51:06Z

+    if((std::isdigit(p.peek_char()) == 0) and p.peek_char() != '.')
+        return {};
+    auto token    = p.parse_while([](unsigned char c) { return std::isdigit(c) or c == '.'; });
+    bool is_float = token.find('.') != std::string_view::npos;
+    if(is_float)
+        return lit(std::stod(std::string(token)));
+    return lit(std::stoll(std::string(token)));


parse_number calls std::isdigit(p.peek_char()) where peek_char() is a char; if char is signed and the input contains non-ASCII bytes this is undefined behavior. Also, tokenization accepts any mix of digits and '.' (e.g. "." or "1..2"), which can make std::stod/std::stoll throw a standard exception rather than a MIGRAPHX_THROW with location context. Consider casting to unsigned char for ctype calls, validating the numeric token (single '.' / at least one digit), and converting parse failures into MIGRAPHX_THROW(p.error_message(...)).
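A sketch of the token validation and error translation described above. It assumes a two-alternative `scalar` variant, uses the hypothetical names `parse_error` and `parse_number_token`, and throws `std::runtime_error` where the real code would presumably use `MIGRAPHX_THROW(p.error_message(...))`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <variant>

// Hypothetical stand-in for the PR's scalar type.
using scalar = std::variant<std::int64_t, double>;

// Report a parse failure with its position (stand-in for MIGRAPHX_THROW).
[[noreturn]] inline void parse_error(std::string_view token, std::size_t pos, const char* why)
{
    throw std::runtime_error("parse error at position " + std::to_string(pos) + ": " + why +
                             " '" + std::string(token) + "'");
}

// Validate a digits-and-dot token, then convert it. Standard-library
// conversion failures are translated into the same controlled error.
inline scalar parse_number_token(std::string_view token, std::size_t pos)
{
    std::size_t dots   = 0;
    std::size_t digits = 0;
    for(char c : token)
    {
        if(c == '.')
            ++dots;
        else if(std::isdigit(static_cast<unsigned char>(c)) != 0)
            ++digits;
        else
            parse_error(token, pos, "unexpected character in number");
    }
    if(digits == 0 or dots > 1)
        parse_error(token, pos, "malformed number");
    try
    {
        if(dots == 1)
            return std::stod(std::string(token));
        return static_cast<std::int64_t>(std::stoll(std::string(token)));
    }
    catch(const std::exception&)
    {
        parse_error(token, pos, "number out of range");
    }
}
```

Inputs such as "." and "1..2" are rejected before any conversion runs, and an overlong integer surfaces as a positioned parse error rather than a bare `std::out_of_range`.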

Copilot · 2026-04-24T15:51:08Z

+interval expr::eval_interval(const std::unordered_map<expr, interval>& vars) const
 {
-    if(a.empty() or b.empty())
-        return {};
-    return {std::make_shared<expr::impl>(make_sub(a.p->node, b.p->node))};
+    return generic_eval<interval>(*this, [&](const expr& e) -> std::optional<interval> {
+        auto it = vars.find(e);
+        if(it != vars.end())
+            return it->second;
+        return std::visit(
+            overloaded{[](const literal_node& n) -> std::optional<interval> {
+                           return interval{n.val, n.val};
+                       },
+                       [](const variable_node& n) -> std::optional<interval> {
+                           if(not n.constraints.empty())
+                               return n.constraints.front();
+                           MIGRAPHX_THROW("Variable '" + n.name + "' not found in interval map");
+                       },
+                       [](const op_node&) -> std::optional<interval> { return std::nullopt; }},
+            get_node(e));
+    });


expr::eval_interval has the same empty-expression issue as eval: if *this is empty, the replace lambda falls through to get_node(e) and will assert/crash. Consider handling empty() up front (or in the replace lambda) with a consistent policy (throw or return a sentinel interval).

Copilot · 2026-04-24T15:51:08Z

+static std::string scalar_to_string(const scalar& v)
 {
-    if(a.empty() and b.empty())
-        return true;
-    if(a.empty() != b.empty())
-        return false;
-    return expr_equal(a.p->node, b.p->node);
+    return std::visit(
+        [](auto x) -> std::string {
+            std::ostringstream ss;
+            ss << x;
+            return ss.str();
+        },
+        v);


scalar_to_string uses default std::ostringstream formatting for doubles, which may emit scientific notation (e.g. 1e-06). However parse_number only accepts digits and '.' and will reject/throw on that output, breaking parse(to_string(expr)) for many valid doubles. Please either (a) extend parse_number to accept exponent forms, or (b) force a non-scientific, round-trippable formatting in scalar_to_string (and add coverage for such cases).
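Option (a) could be approached with a scanner that recognizes an optional exponent. The sketch below is an assumption-laden illustration: `scan_number` is a hypothetical function that works on a `std::string_view` rather than the PR's parser object, and it returns the length of the numeric prefix so the caller can hand that slice to `std::stod`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Return the length of the longest prefix of s matching
//   digits [ '.' digits ] [ ('e' | 'E') [ '+' | '-' ] digits ]
// with at least one mantissa digit, or 0 if there is no such prefix.
inline std::size_t scan_number(std::string_view s)
{
    auto is_digit = [](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; };
    std::size_t i      = 0;
    std::size_t digits = 0;
    for(; i < s.size() and is_digit(s[i]); ++i)
        ++digits;
    if(i < s.size() and s[i] == '.')
    {
        ++i;
        for(; i < s.size() and is_digit(s[i]); ++i)
            ++digits;
    }
    if(digits == 0)
        return 0;
    // Consume the exponent only when at least one exponent digit follows,
    // so "2e" scans as "2" and leaves the 'e' for the caller.
    if(i < s.size() and (s[i] == 'e' or s[i] == 'E'))
    {
        std::size_t j = i + 1;
        if(j < s.size() and (s[j] == '+' or s[j] == '-'))
            ++j;
        if(j < s.size() and is_digit(s[j]))
        {
            while(j < s.size() and is_digit(s[j]))
                ++j;
            i = j;
        }
    }
    return i;
}
```

With this scanner, the text "1e-06" is consumed as a single five-character token, which covers the scientific notation that default stream formatting can emit. One trade-off is that an identifier beginning with `e` directly after a number would need whitespace or an operator to separate it.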

Copilot · 2026-04-24T15:51:09Z

+    auto inner = lit(7) * lit(2);
+    EXPECT(inner.eval_uint({}) == 14);


eval_uint_symbol_map_partial currently doesn't use e or the symbol map at all; it just evaluates lit(7) * lit(2). This doesn't test partial evaluation/mapping behavior as the name/comment suggest. Consider changing it to actually call e.eval_uint({{x, 7}}) (or removing/renaming the test if partial mapping isn't supported).

Suggested change

auto inner = lit(7) * lit(2);

EXPECT(inner.eval_uint({}) == 14);

EXPECT(e.eval_uint({{x, 7}}) == 14);

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

pfultz2 added 30 commits March 20, 2026 17:11

Add sym::expr

45fa74f

Format

3150f5a

Move constructor inline

e1f4252

Format

3e0d84b

Move .cpp

8752a2f

Format

31ca343

Add comparison ops

a6552c1

Format

e84dd47

Add assign operators

c2f3514

Add more functions

0ba12f2

Format

684d536

Add to_string method

90d30ac

Format

cecb90a

Add call associative

c399c54

Foramt

d2fc809

Add implicit const folding

92c716e

Format

7cb021a

Normalize subtract

169c34a

Format

2b01317

Normalize subtraction

cfdd0b0

Format

ed58ec1

Some cleanup

aeba6af

Format

f9364dd

Add normalize_expr

03f6064

Add normalization

063645e

Format

db9720d

Add rewrite rules

8b19cfc

Format

c55d529

Add parser

fd2c803

Format

cfa99b6

pfultz2 added 3 commits April 14, 2026 08:57

Add windows export

87f463f

Call make_scalar

794c532

Format

3e0c1ec

shivadbhavsar mentioned this pull request Apr 15, 2026

[AIMIGRAPHX-835] integrate symbolic expression in dynamic_dimension #4702

Merged

shivadbhavsar reviewed Apr 16, 2026

pfultz2 and others added 12 commits April 23, 2026 18:52

Use hash method

93b46fb

Format

177a0cb

Use the provided hash functions

a2ad982

Format

6f97ea3

Add test for hash function

ebcf6ea

Format

08220cd

Fix hashing

aacb571

Format

32abd4b

Ignore constraints for equality

6f8f0dc

Refactor eval_optimals

6caa980

Format

dbe8f16

Merge branch 'develop' into sym-expr2

5835500

pfultz2 requested a review from Copilot April 24, 2026 03:48

Copilot started reviewing on behalf of pfultz2 April 24, 2026 03:50

Copilot AI reviewed Apr 24, 2026

pfultz2 and others added 4 commits April 24, 2026 10:24

Handle division by zero

8a04ffb

Format

fd08c6d

Handl mod intervals with 0

a040784

Merge branch 'develop' into sym-expr2

98050ff

pfultz2 requested a review from Copilot April 24, 2026 15:41

Copilot started reviewing on behalf of pfultz2 April 24, 2026 15:43

Copilot AI reviewed Apr 24, 2026

pfultz2 and others added 2 commits April 24, 2026 13:18

Update src/sym.cpp

158cba8

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update test/sym.cpp

dc84048

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

