Layer cleanup to prevent runtime layer invalidation#1274
Conversation
During automation runs for new Node.js runtimes, occasionally we'd see a PR opened where no new runtimes were added but one of the test snapshots had changed. The specific snapshot was always [the pnpm test for rebuilding a native module](https://github.com/heroku/buildpacks-nodejs/blob/cf9cb431fea999e30ac7528a93d3fe841b853f9c/tests/snapshots/test_pnpm_native_modules_are_recompiled_even_on_cache_restore.snap) and it would toggle back and forth between the following lines during a rebuild: - `Adding layer 'heroku/nodejs:virtual'` - `Reusing layer 'heroku/nodejs:virtual'` This indicated that files were present in this layer with content that was changing between runs in a way that could trigger the runtime layer to be invalidated and it was trigger by the `node-gyp` compilation. Both npm and Yarn have equivalent tests for native module recompilation, so I was curious if they were also affected. This led me to notice that the npm test snapshot reported `Adding layer 'heroku/nodejs:dist'` during a rebuild instead of `Reusing layer 'heroku/nodejs:dist'`. Yarn did not seem to be affected. So I spent some time analyzing the contents of these layers to see if I could track down what exactly was changing between builds. This involved: - adding debug output to the local code to print out the `sha256sum` of every file in the `/layers` directory - creating a test harness script to execute the native module rebuild tests for each package manager and collect the logs - creating an analysis script to extract the `(hash filename)` pairs from the raw logs and put them into a map of `(filename, [hash, hash, ...])` values where any entry that didn't contain hashes that were all the same would be reported The results of this analysis were, when a native module would be compiled, the layer containing the `node-gyp` binary (`dist` for bundled npm, `npm_engine` for installed version, `pnpm` for installed version) would contain `__pycache__` artifacts. E.g.; - `/layers/heroku_nodejs/dist/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__pycache__/__init__.cpython-312.pyc` - ``/layers/heroku_nodejs/dist/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__pycache__/common.cpython-312.pyc` - (and several others consistent across all these layers) And, for pnpm's `virtual` layer containing the modules installed to the virtual store, `node-gyp` will execute in the module's directory and is hardcoded to create a `build` folder. The file consistently triggering invalidation would be a `Makefile` like the following: - `/layers/heroku_nodejs/virtual/store/dtrace-provider@0.8.8/node_modules/dtrace-provider/build/Makefile` > [!NOTE] > Neither npm nor Yarn are impacted because those tools install modules directly into the workspace directory, not into a layer. This PR attempts to remove these specific build artifacts by collecting layers known to contain these files during the build process and then traversing those layer folders at the end of the build to remove the target files. This will be reported in buildpack output as: ``` - Cleaning up Node.js installation layer - Cleaning up npm installation layer - Removed 3 __pycache__ directories ``` Where the layer to be cleaned will always be reported. If any of the target files are detected, there will be extra output indicating how many of these files were removed.
|
I'd like to get to bottom of the root issue before we decide that this cleanup step is the only solution since it adds complexity to the buildpack that might not be needed at all. Knowing why we have this issue would potentially offer a simpler solution. In general, I'm not comfortable stopping an investigation why an issue exists because we can "nuke it away" by fixing the symptom. Maybe there is more to it than meets the eye? Happy to assist if you'd like a pairing session about this. |
Removed commented out code Signed-off-by: Colin Casey <casey.colin@gmail.com>
Signed-off-by: Colin Casey <casey.colin@gmail.com>
## heroku/nodejs ### Added - Cleanup of non-deterministic files in Node.js, npm, and pnpm layers to prevent unnecessary invalidation of runtime layers for export. ([#1274](#1274)) ### Changed - Automated sync of `@yarnpkg/plugin-prune-dev-dependencies.js` from heroku/heroku-buildpack-nodejs@ab3aa2a
## heroku/nodejs ### Added - Cleanup of non-deterministic files in Node.js, npm, and pnpm layers to prevent unnecessary invalidation of runtime layers for export. ([#1274](heroku/buildpacks-nodejs#1274)) ### Changed - Automated sync of `@yarnpkg/plugin-prune-dev-dependencies.js` from heroku/heroku-buildpack-nodejs@ab3aa2a
## heroku/nodejs ### Added - Cleanup of non-deterministic files in Node.js, npm, and pnpm layers to prevent unnecessary invalidation of runtime layers for export. ([#1274](heroku/buildpacks-nodejs#1274)) ### Changed - Automated sync of `@yarnpkg/plugin-prune-dev-dependencies.js` from heroku/heroku-buildpack-nodejs@ab3aa2a Co-authored-by: heroku-linguist[bot] <136119646+heroku-linguist[bot]@users.noreply.github.com>
@colincasey IMO this is something that longer term should be fixed in In this case, landing a short-term workaround here is absolutely fine, however, I perhaps would have first looked upstream in the Looking very briefly now, I found these:
I don't suppose you could:
|
During automation runs for new Node.js runtimes, occasionally we'd see a PR opened where no new runtimes were added but one of the test snapshots had changed. The specific snapshot was always the pnpm test for rebuilding a native module and it would toggle back and forth between the following lines during a rebuild:
Adding layer 'heroku/nodejs:virtual'Reusing layer 'heroku/nodejs:virtual'This indicated that files were present in this layer with content that was changing between runs in a way that would cause the runtime layer to be invalidated and it was triggered by the
node-gypcompilation. Both npm and Yarn have equivalent tests for native module recompilation, so I was curious if they were also affected. This led me to notice that the npm test snapshot reportedAdding layer 'heroku/nodejs:dist'during a rebuild instead ofReusing layer 'heroku/nodejs:dist'. Yarn did not seem to be affected.So I spent some time analyzing the contents of these layers to see if I could track down what exactly was changing between builds. This involved:
sha256sumof every file in the/layersdirectory(hash filename)pairs from the raw logs and put them into a map of(filename, [hash, hash, ...])values where any entry that didn't contain hashes that were all the same would be reportedThe results of this analysis were, when a native module would be compiled, the layer containing the
node-gypbinary (distfor bundled npm,npm_enginefor installed version,pnpmfor installed version) would contain__pycache__artifacts. E.g.;/layers/heroku_nodejs/dist/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__pycache__/__init__.cpython-312.pyc/layers/heroku_nodejs/dist/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__pycache__/common.cpython-312.pycAnd, for pnpm's
virtuallayer containing the modules installed to the virtual store,node-gypwill execute in the module's directory and is hardcoded to create abuildfolder. The file consistently triggering invalidation would be aMakefilelike the following:/layers/heroku_nodejs/virtual/store/dtrace-provider@0.8.8/node_modules/dtrace-provider/build/MakefileNote
Neither npm nor Yarn are impacted because those tools install modules directly into the workspace directory, not into a layer.
This PR attempts to remove these specific build artifacts by collecting layers known to contain these files during the build process and then traversing those layer folders at the end of the build to remove the target files.
This will be reported in buildpack output as:
Where the layer to be cleaned will always be reported. If any of the target files are detected, there will be extra output indicating how many of these files were removed.
W-19793309