Skip to content

Experiment: conserve in TrampolinedRewrite#1382

Closed
jiribenes wants to merge 2 commits intomainfrom
experiment/rewrite-conserve
Closed

Experiment: conserve in TrampolinedRewrite#1382
jiribenes wants to merge 2 commits intomainfrom
experiment/rewrite-conserve

Conversation

@jiribenes
Copy link
Copy Markdown
Contributor

I tried to put together a rewrite of TrampolinedRewrite that (tries to?) rebuild.

It currently always makes a very short lived allocation, this could be alleviated by using a proper macro.
Nevertheless, this is likely worth checking out. :)

@jiribenes jiribenes requested a review from b-studios April 15, 2026 21:44
@jiribenes jiribenes added area:compiler quality-of-life experiment Experimental branch, do not merge! labels Apr 15, 2026
@b-studios
Copy link
Copy Markdown
Collaborator

b-studios commented Apr 15, 2026

Looks great and that gives me an idea: can't we even have a combinator like

def rewrite[T <: AnyRef](t: T)(run: T => T): T 

that does what you already implemented?

Then we can even write phases manually like:

def rewrite(s: Stmt): Stmt = rewrite(s) {
  case If(c, t, e) =>
    If(c, rewrite(t), rewrite(e))
  ...
}

and it would automagically work as long as one does not pattern match more than one level.

This would strike the balance of making even @phischu happy, because he can still pattern match and reconstruct.

We could even implement it like:

def rewrite[T <: AnyRef](t: T)(run: T => T): T = 
  val res = run(t) 
  if res == t then t else res

which would even work when pattern matching more than one level! It seems crazy at first since one might think that we structurally compare at every recursive level. However, if we are lucky case classes have a fast path with eq and thus this would end up pretty similar to your code.

@jiribenes
Copy link
Copy Markdown
Contributor Author

I also now have a version where rebuild is a macro: I'd first like to benchmark main vs rebuild-naive vs rebuild-macro and see what happens 👀

One thing I like about it a lot is that it's opt-in: if you forget it, you "only" get worse perf. (And YIL/TIL that List.mapConserve exists which might help a lot!)

But I agree that the experience could be better, I just wanted to quickly put smth together

@b-studios
Copy link
Copy Markdown
Collaborator

b-studios commented Apr 15, 2026

This very simple thing seems to work https://scastie.scala-lang.org/ZhwS41kgRa2PYtCIYLILkA

One could try to look at the generated equals instance with something like the "-Vprint:typer" option.

@jiribenes
Copy link
Copy Markdown
Contributor Author

jiribenes commented Apr 16, 2026

Here are the results of my benchmarks:
I tested on the other/unify benchmark since it pulls in a lot of modules, so there's a lot to DCE.
I always ran ≥20 runs of each of the three variants (NONE = main, RUNTIME, MACRO), and checked both the deadcode phase times and the whole build time. Everything was only build times + -no-optimize.

  1. RUNTIME is slower than NONE by 5ms out of ~40-50ms
  2. MACRO is faster than NONE by 5ms out of ~40-50ms
  3. Only MACRO vs RUNTIME is super statistically significant, everything else is borderline -- everything is super noisy -- there's only one pass using the TrampolinedRewrite...

The total runtimes were: NONE 3.41s+-0.55, RUNTIME 3.55s+-0.51, MACRO 3.20s+-0.42.

So if at all, it should be a macro.

@b-studios
Copy link
Copy Markdown
Collaborator

b-studios commented Apr 16, 2026

Thanks for measuring! I am wondering:

  • did you check that the identity transformation preserves reference equality?
  • Where is the trampolining rewrite used again (just checked: it is only Deadcode)?
  • Are the analyses used before and after that? (typecheck is called the Aggregate phase, but only on debug).
    So maybe it would be interesting to run typecheck before and after deadcode and mesure the time again. If (and this is a big if) the reference equality is preserved by the phase, it should reuse the cached results.

I investigated the generated equals implementation for the Scastie above. It looks like:

override def equals(x$0: Object): Boolean =
[info]         this.eq(x$0:Object).||(
[info]           matchResult4[Boolean]:
[info]             {
[info]               case val x5: Object = x$0
[info]               if x5.isInstanceOf[Exp.Add] then
[info]                 {
[info]                   case val x$0: Exp.Add = x5.asInstanceOf[Exp.Add]
[info]                   return[matchResult4]
[info]                     this.l().==(x$0.l()).&&(this.r().==(x$0.r()))
[info]                 }
[info]                else ()
[info]               return[matchResult4] false
[info]             }
[info]         )

which does look good to me. First check eq then proceed on the children (which in turn again will first check eq).

@jiribenes
Copy link
Copy Markdown
Contributor Author

jiribenes commented Apr 16, 2026

I investigated the generated equals implementation for the Scastie above. It looks like:

override def equals(x$0: Object): Boolean =
[info]         this.eq(x$0:Object).||(
[info]           matchResult4[Boolean]:
[info]             {
[info]               case val x5: Object = x$0
[info]               if x5.isInstanceOf[Exp.Add] then
[info]                 {
[info]                   case val x$0: Exp.Add = x5.asInstanceOf[Exp.Add]
[info]                   return[matchResult4]
[info]                     this.l().==(x$0.l()).&&(this.r().==(x$0.r()))
[info]                 }
[info]                else ()
[info]               return[matchResult4] false
[info]             }
[info]         )

which does look good to me. First check eq then proceed on the children (which in turn again will first check eq).

If my benchmarks are to be believed, constructing the RHS of the rebuild is noticeably expensive -- the macro version never builds the "new"/"updated" version of the tree unless it must do so.
Therefore I'd wager that even this equals is too slow.

@jiribenes
Copy link
Copy Markdown
Contributor Author

jiribenes commented Apr 16, 2026

  • did you check that the identity transformation preserves reference equality?

Yep, I hope I did it correctly:

$ effekt --no-optimize --build examples/benchmarks/other/unify.effekt
[warning] before vs after: true, true
class Identity extends core.Tree.TrampolinedRewrite {
  override def rewrite(m: ModuleDecl): ModuleDecl = m match {
    case m @ ModuleDecl(path, includes, declarations, externs, definitions, exports) =>
      m.rebuild(ModuleDecl(path, includes, declarations, externs, definitions, exports))
  }
}

object Identity extends Phase[CoreTransformed, CoreTransformed] {
  val phaseName: String = "identity"
  def run(input: CoreTransformed)(using Context): Option[CoreTransformed] =
    input match {
      case CoreTransformed(source, tree, mod, core) =>
        val identical = Context.timed("identity", source.name) {
          new Identity().rewrite(core)
        }
        Context.warning(pretty"before vs after: ${core eq identical}, ${core == identical}")
        Some(CoreTransformed(source, tree, mod, identical))
    }
}

Where is the trampolining rewrite used again (just checked: it is only Deadcode)?

yep, plus #1381 where I want to benchmark it -- o/w that PR feels really slow

@b-studios
Copy link
Copy Markdown
Collaborator

Given your numbers, we talk about 10% slowdown, right? I wonder whether this starts to amortize only after using this pattern extensively over the code base. Other developers (personal communication) that also maintain object id reported that they needed to carefully inspect reconstruction in rewrites to actually observe benefits.

@jiribenes
Copy link
Copy Markdown
Contributor Author

I can't write the macro version correctly and can't prove that the rebuild does anything, so I'm closing this PR.
In case someone wants to work on it, feel free to reopen.

I also considered a different variant (runtime):

extension [T <: Product](original: T) {
  def rebuild(trampolined: Trampoline[T]): Trampoline[T] =
    trampolined.map { rebuilt =>
      val arity = original.productArity
      var i = 0
      var changed = false
      while (i < arity && !changed) {
        if !(original.productElement(i).asInstanceOf[AnyRef] eq rebuilt.productElement(i).asInstanceOf[AnyRef]) then
          changed = true
        i += 1
      }
      if changed then rebuilt else original
    }
}

used as

case s @ Stmt.If(cond, thn, els) => s.rebuild {
  for {
    cond2 <- rewrite(cond)
    thn2  <- rewrite(thn)
    els2  <- rewrite(els)
  } yield Stmt.If(cond2, thn2, els2)
}

but I don't think it's better than the one in this PR.

@jiribenes jiribenes closed this Apr 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:compiler experiment Experimental branch, do not merge! quality-of-life

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants