Strip HTTP response bodies after all modules finish processing by liquidsec · Pull Request #3002 · blacklanternsecurity/bbot

liquidsec · 2026-03-31T14:17:38Z

Summary

BBOT is essentially one big memory leak. This is unavoidable: it is truly recursive, and the whole scan is essentially a giant tree that grows throughout the scan. That tree includes HTTP_RESPONSE events, which carry the full response body, which can be very large. Prior to this fix, every one of those bodies sat in memory for the entire scan. This change aims to alleviate that by removing the response body from HTTP_RESPONSE events once every module that wants to use it has finished with it.

How it works

_module_consumers counter on BaseEvent — incremented when an event is queued to a module, decremented when the module finishes processing (via _release())
_minimize() — strips body and raw_header from the event's data dict when the counter reaches zero. The event object itself stays alive (parent chains, tags, metadata all preserved)
All decrement paths covered: handle_event, handle_batch, events rejected by postcheck, FINISHED events, and a fallback in ScanEgress.forward_event() for events no module accepts
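
The counting scheme described above can be illustrated with a minimal sketch. This is not BBOT's code: `SketchEvent` and `_acquire()` are hypothetical names introduced for illustration, and only `_module_consumers`, `_release()`, `_minimize()`, `body`, and `raw_header` are taken from the description. It assumes the event's data is a plain dict.

```python
class SketchEvent:
    """Hypothetical stand-in for an event; not BBOT's BaseEvent."""

    def __init__(self, event_type, data):
        self.type = event_type
        self.data = data
        self._module_consumers = 0  # modules that still hold this event

    def _acquire(self):
        # Hypothetical helper: called when the event is queued to a module.
        self._module_consumers += 1

    def _release(self):
        # Called when a module finishes with (or rejects) the event.
        self._module_consumers -= 1
        if self._module_consumers <= 0:
            self._minimize()

    def _minimize(self):
        # Drop only the bulky fields; the event object and its other data stay alive.
        if self.type == "HTTP_RESPONSE" and isinstance(self.data, dict):
            self.data.pop("body", None)
            self.data.pop("raw_header", None)


event = SketchEvent(
    "HTTP_RESPONSE",
    {
        "url": "http://example.com/",
        "status_code": 200,
        "body": "<html>...</html>",
        "raw_header": "HTTP/1.1 200 OK\r\n",
    },
)

# Two modules receive the event.
event._acquire()
event._acquire()

# The first module finishes; the body must survive for the second.
event._release()
body_kept_while_in_use = "body" in event.data

# The second module finishes; the counter hits zero and the body is stripped.
event._release()
body_stripped_at_zero = "body" not in event.data
```

The point of counting consumers rather than stripping on a timer is that a slow module never sees a body disappear underneath it; the cost is that every path that queues an event needs a matching decrement.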

github-actions · 2026-03-31T14:52:15Z

📊 Performance Benchmark Report

Comparing 3.0 (baseline) vs memory-optimize-refcount-httpresponse (current)

📈 Detailed Results (All Benchmarks)

📋 Complete results for all benchmarks - includes both significant and insignificant changes

🧪 Test Name	📏 Base	📏 Current	📈 Change	🎯 Status
Bloom Filter Dns Mutation Tracking Performance	`4.14ms`	`5.07ms`	+22.4% 🔴🔴🔴	⚠️
Bloom Filter Large Scale Dns Brute Force	`17.02ms`	`17.01ms`	-0.1% ⚪	✅
Large Closest Match Lookup	`353.83ms`	`350.76ms`	-0.9% ⚪	✅
Realistic Closest Match Workload	`187.02ms`	`185.98ms`	-0.6% ⚪	✅
Event Memory Medium Scan	`1776 B/event`	`1784 B/event`	+0.5% ⚪	✅
Event Memory Large Scan	`1759 B/event`	`1768 B/event`	+0.5% ⚪	✅
Event Validation Full Scan Startup Small Batch	`397.25ms`	`401.65ms`	+1.1% ⚪	✅
Event Validation Full Scan Startup Large Batch	`568.66ms`	`576.17ms`	+1.3% ⚪	✅
Make Event Autodetection Small	`30.73ms`	`30.76ms`	+0.1% ⚪	✅
Make Event Autodetection Large	`315.33ms`	`316.80ms`	+0.5% ⚪	✅
Make Event Explicit Types	`13.80ms`	`13.74ms`	-0.5% ⚪	✅
Excavate Single Thread Small	`3.947s`	`3.962s`	+0.4% ⚪	✅
Excavate Single Thread Large	`9.582s`	`9.559s`	-0.2% ⚪	✅
Excavate Parallel Tasks Small	`4.075s`	`4.061s`	-0.3% ⚪	✅
Excavate Parallel Tasks Large	`7.182s`	`7.162s`	-0.3% ⚪	✅
Is Ip Performance	`3.19ms`	`3.17ms`	-0.6% ⚪	✅
Make Ip Type Performance	`11.41ms`	`11.41ms`	+0.1% ⚪	✅
Mixed Ip Operations	`4.50ms`	`4.50ms`	-0.0% ⚪	✅
Memory Use Web Crawl	`259.7 MB`	`43.6 MB`	-83.2% 🟢🟢🟢	🚀
Memory Use Subdomain Enum	`19.3 MB`	`19.3 MB`	+0.2% ⚪	✅
Scan Throughput 100	`7.807s`	`8.045s`	+3.1% ⚪	✅
Scan Throughput 1000	`39.868s`	`40.817s`	+2.4% ⚪	✅
Typical Queue Shuffle	`65.61µs`	`63.78µs`	-2.8% ⚪	✅
Priority Queue Shuffle	`735.02µs`	`732.40µs`	-0.4% ⚪	✅

🎯 Performance Summary

+ 1 improvement 🚀
! 1 regression ⚠️
  22 unchanged ✅

🔍 Significant Changes (>10%)

Bloom Filter Dns Mutation Tracking Performance: 22.4% 🐌 slower
Memory Use Web Crawl: 83.2% 🚀 less memory

🐍 Python Version 3.11.15

codecov · 2026-03-31T15:10:01Z

Codecov Report

❌ Patch coverage is 92.30769% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 91%. Comparing base (6b359ac) to head (db8d16c).
⚠️ Report is 12 commits behind head on 3.0.

Files with missing lines	Patch %	Lines
bbot/modules/base.py	90%	3 Missing ⚠️

Additional details and impacted files

@@          Coverage Diff           @@
##             3.0   #3002    +/-   ##
======================================
+ Coverage     91%     91%    +1%     
======================================
  Files        440     440            
  Lines      37230   37330   +100     
======================================
+ Hits       33711   33809    +98     
- Misses      3519    3521     +2

pytest's own allocations (~200 MB) contaminate tracemalloc peak measurements when scans run in-process, masking real differences between branches. Run each benchmark scan as a subprocess instead so measurements reflect only the scan's own memory use. Also rename tests to test_memory_use_* for clarity.
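
As a rough sketch of the subprocess approach, the snippet below runs a workload in a fresh interpreter and reads back the `tracemalloc` peak, so the parent process's allocations do not enter the measurement. The workload in `CHILD_CODE` and the helper `peak_bytes_in_subprocess()` are hypothetical placeholders, not the benchmark code from this branch.

```python
import subprocess
import sys

# Hypothetical workload standing in for a benchmark scan.
CHILD_CODE = """
import tracemalloc

tracemalloc.start()
payload = [bytes(1024) for _ in range(2000)]  # roughly 2 MB of allocations
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(peak)
"""


def peak_bytes_in_subprocess(code):
    """Run `code` in a fresh interpreter and return the peak it prints last."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip().splitlines()[-1])


peak = peak_bytes_in_subprocess(CHILD_CODE)
```

Because tracing starts inside the child, the reported peak covers only what the child allocated after `tracemalloc.start()`, which is what makes comparisons between branches meaningful.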

liquidsec · 2026-04-01T16:49:54Z

@TheTechromancer I looked into what would happen if a module errored. If a module actually hits an unhandled exception and errors out that way, then yes, we'd miss some decrements. But all that failure does is revert to the old behavior, with the bodies hanging around until GC. So I don't think it's a serious concern.
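
A toy illustration of that failure mode, under the assumption that the decrement simply runs after the handler returns. `SketchEvent`, `dispatch()`, and `broken_handler()` are hypothetical names and do not reflect how BBOT's module loop is actually written.

```python
class SketchEvent:
    """Hypothetical event with a consumer counter; not BBOT's BaseEvent."""

    def __init__(self, data):
        self.data = data
        self.consumers = 0


def dispatch(event, handler):
    event.consumers += 1
    handler(event)  # an unhandled exception here skips the lines below
    event.consumers -= 1
    if event.consumers == 0:
        event.data.pop("body", None)


def broken_handler(event):
    raise RuntimeError("module crashed")


event = SketchEvent({"url": "http://example.com/", "body": "x" * 1000})
try:
    dispatch(event, broken_handler)
except RuntimeError:
    pass

# The counter never returned to zero, so the body was never stripped.
counter_after_crash = event.consumers
body_still_present = "body" in event.data
```

In this sketch the worst case is that the body stays in memory as it did before the change; nothing is stripped early and no data is lost.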

liquidsec · 2026-04-01T18:17:31Z

@TheTechromancer I simplified some of the changes on this branch.

bbot/core/event/base.py

liquidsec force-pushed the memory-optimize-refcount-httpresponse branch 3 times, most recently from 3c3862a to db6b2b0 Compare March 31, 2026 19:47

liquidsec force-pushed the additional-memory-benchmarks branch from aa7e2bc to 590e979 Compare March 31, 2026 19:48

liquidsec added 2 commits March 31, 2026 15:49

Strip HTTP response bodies after all modules finish processing (refco…

38db27e

…unt-based)

liquidsec force-pushed the memory-optimize-refcount-httpresponse branch from db6b2b0 to 38db27e Compare March 31, 2026 19:49

Add comment to _minimize() docstring

1929e24

Base automatically changed from additional-memory-benchmarks to 3.0 April 1, 2026 16:22

liquidsec added 2 commits April 1, 2026 12:49

Merge branch '3.0' into memory-optimize-refcount-httpresponse

d268448

Merge _release() into _minimize() for clarity

c879fc6

TheTechromancer reviewed Apr 2, 2026

bbot/core/event/base.py

Move _minimize() stripping logic to HTTP_RESPONSE override

db8d16c

TheTechromancer approved these changes Apr 2, 2026

liquidsec merged commit edefefc into 3.0 Apr 2, 2026
18 checks passed

liquidsec merged 6 commits into 3.0 from memory-optimize-refcount-httpresponse
