Fix activation scale inf issue for const_weight and const_scale #2448
base: master
```diff
@@ -243,6 +243,18 @@ def post_quantization_cleanup(self):
             None: Cleans up observers and sets quantized call.
         """
         self._tracker.unlock()
+        if not self._is_quantized:
+            # Clean up observer only if it exists
+            if hasattr(self, "input_observer"):
+                if hasattr(self, "_layers") and self.input_observer in self._layers:
+                    self._layers.remove(self.input_observer)
+            # Set call to pass-through/original
+            if hasattr(self, "call"):
+                # pass through
+                pass
+            self._const_variables = []
+            self._tracker.lock()
+            return
         if hasattr(self, "_layers") and hasattr(self, "input_observer"):
             if self.input_observer in self._layers:
                 self._layers.remove(self.input_observer)
```

Comment on lines +246 to +257
```diff
@@ -447,6 +459,16 @@ def post_quantization_cleanup(self):
             None: Cleans up observers and original weights.
         """
         self._tracker.unlock()
+        if not self._is_quantized:
+            if hasattr(self, "input_observer"):
+                if hasattr(self, "_layers") and self.input_observer in self._layers:
+                    self._layers.remove(self.input_observer)
+            # Set call to pass-through/original
+            if hasattr(self, "call"):
+                pass
+            self._const_variables = []
+            self._tracker.lock()
+            return
         if hasattr(self, "_kernel") and self._kernel in self._trainable_variables:
             self._trainable_variables.remove(self._kernel)
             del self._kernel
```

Comment on lines +462 to +471
The new `if not self._is_quantized: ... return` short-circuits `post_quantization_cleanup()` in the normal successful quantization path, because `_is_quantized` is only set to `True` at the end of this method. This prevents switching `self.call` to `call_symmetric`/`call_asymmetric` and prevents converting const vars, so static quantization will effectively never activate (and deserialized models will still have `call()` pointing at `input_observer`, which may not exist). Consider removing this early return and instead gating the skip path on a separate flag set by `convert()` when calibration fails (e.g., `_skip_quantization=True`), where you also explicitly set `call` to a pass-through implementation and delete/clear `input_observer` consistently.
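For reference, a minimal sketch of the flag-based approach the reviewer describes. It assumes the surrounding layer API (`input_observer`, `_tracker`, `call_symmetric`/`call_asymmetric`); the names `_skip_quantization`, `compute_scale`, and `_passthrough_call` are hypothetical, introduced only for illustration:

```python
import math

class QuantizedLayerSketch:
    """Illustrative sketch only, not the actual layer implementation."""

    def convert(self, calibration_data):
        # `compute_scale` is an assumed observer method, not a confirmed API.
        scale = self.input_observer.compute_scale(calibration_data)
        # Gate the skip path on calibration failure (e.g. an inf/nan scale)
        # rather than on `_is_quantized`, which is still False at this point.
        self._skip_quantization = not math.isfinite(scale)

    def post_quantization_cleanup(self):
        self._tracker.unlock()
        if getattr(self, "_skip_quantization", False):
            # Calibration failed: pin `call` to a float pass-through and
            # drop the observer so deserialized models never reference it.
            self.call = self._passthrough_call  # hypothetical float-path method
            if hasattr(self, "input_observer"):
                if hasattr(self, "_layers") and self.input_observer in self._layers:
                    self._layers.remove(self.input_observer)
                del self.input_observer
            self._const_variables = []
            self._tracker.lock()
            return
        # Normal path (unchanged): switch self.call to call_symmetric /
        # call_asymmetric, convert const vars, then set _is_quantized = True.
        self._tracker.lock()
```

This keeps the successful-quantization path untouched while making the failure path explicit and serialization-safe, rather than inferring it from `_is_quantized`.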