
Add SimplifiedLayerNormToRMSNorm surgery#2348

Open
unnim-qti wants to merge 4 commits into microsoft:main from CodeLinaro:dev/unnim-qti/qnn-gpu-simplifiedlayernorm-to-rmsnorm-surgery

Conversation


@unnim-qti unnim-qti commented Mar 5, 2026

Describe your changes

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

Comment thread olive/passes/onnx/graph_surgeries.py
@xiaoyu-work
Collaborator

@unnim-qti can you please update this PR? We are going to release a new Olive version this Friday, and this PR will be included.

@unnim-qti unnim-qti force-pushed the dev/unnim-qti/qnn-gpu-simplifiedlayernorm-to-rmsnorm-surgery branch from 63ebd29 to bd06441 Compare April 12, 2026 20:55
@justinchuby
Contributor

Why not make this a rewrite rule, out of curiosity?

@unnim-qti
Author

Why not make this a rewrite rule, out of curiosity?

I considered a rewrite rule, but this transform isn't purely structural.

  • It has opset‑dependent ReduceMean semantics, handles two operator variants, and requires conditional rewiring of an optional second output with ValueInfo preservation, all of which is cumbersome to express as a pattern rewrite.
  • A ProtoSurgeon was therefore a cleaner and safer fit.

@justinchuby
Contributor

Handling value infos and optional output handling should be straightforward with the onnxscript rewriter. I wonder if you saw any blockers?

@unnim-qti
Author

unnim-qti commented Apr 23, 2026

I didn’t see any hard blockers in the rewriter, but I followed the existing pass SimplifiedLayerNormToL2Norm, which uses a ProtoSurgeon to handle variant‑specific wiring and optional outputs explicitly. Given that precedent and the comparable complexity here, I kept this transform in the ProtoSurgeon style for consistency.

@unnim-qti
Author

Handling value infos and optional output handling should be straightforward with the onnxscript rewriter. I wonder if you saw any blockers?

Can you review the changes and initiate the pending checks?

Comment thread olive/passes/onnx/graph_surgeries.py Outdated
add_eps_name = self.create_new_name(node_name, op_type, "AddEps")
add_eps_out = f"{add_eps_name}_out"

eps_const = numpy_helper.from_array(np.array([eps_value], dtype=np.float32), name=f"{add_eps_name}_const")


@unnim-qti, this causes an error with fp16 activations, since eps_const is always fp32 in this surgery:

Type Error: Type parameter (T) of Optype (Add) bound to different types (tensor(float16) and tensor(float) in node (/model/layers.0/input_layernorm/LayerNorm_AddEps).

We should be able to get the datatype for epsilon from the type of ln_input.

Similar situation with pow_const, although Pow is allowed to have inputs of different types, so it's not a hard error unlike for Add.

Author


@qti-mattsinc, thanks for pointing this out. I’ve updated the surgery to create epsilon (and pow_const) with the ln_input datatype, which resolves the FP16 activation type mismatch in the Add node.

@unnim-qti unnim-qti force-pushed the dev/unnim-qti/qnn-gpu-simplifiedlayernorm-to-rmsnorm-surgery branch from 7a1d902 to 4f05344 Compare April 28, 2026 19:29
@unnim-qti
Author

@jambayk , Could you please re‑initiate the rebuild? The docs build failed due to a 404 Client Error.
