
Add Go decoded stackstring extraction via runtime.slicebytetostring callsites#1232

Open
kunalsz wants to merge 1 commit into mandiant:master from kunalsz:go-enhancement

Contributor

@kunalsz kunalsz commented Mar 15, 2026

Resolves #828

This PR implements:

  • A Go-specific decoded stackstring pass that locates runtime.slicebytetostring (and common aliases) in the vivisect workspace and collects its call sites.
  • This follows the idea discussed in "Update Stackstrings Algorithm in Go Extraction Code" (#828): decoding such strings generally involves converting a byte array back into a string, so stackstrings are typically followed by a call to runtime_slicebytetostring.
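
To make the call site collection step concrete, here is a minimal sketch that does not use vivisect at all. It assumes x86/x86-64 code, a known VA for runtime.slicebytetostring, and a raw byte scan for `call rel32` (opcode E8). The function name `find_call_sites` and its parameters are hypothetical and are not taken from the PR, which collects call sites through the vivisect workspace instead.

```python
import struct
from typing import List


def find_call_sites(code: bytes, base_va: int, target_va: int) -> List[int]:
    """Return VAs of `call rel32` (opcode E8) instructions that land on target_va.

    Assumption: `code` is the raw content of an executable section mapped at
    `base_va`. A linear byte scan can misread data as code, so every hit is
    only a candidate.
    """
    sites = []
    for off in range(len(code) - 4):
        if code[off] != 0xE8:
            continue
        (rel,) = struct.unpack_from("<i", code, off + 1)
        # The displacement is relative to the address of the next instruction,
        # which starts 5 bytes after the E8 opcode.
        if base_va + off + 5 + rel == target_va:
            sites.append(base_va + off)
    return sites
```

A disassembler-backed cross-reference lookup avoids the false positives a byte scan can produce, which is presumably why the PR works on the analyzed workspace rather than on raw bytes.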

So far, a module (pclntab.py) has been added to parse Go metadata (pclntab/moduledata) for function names and addresses, since the repo doesn't have Go runtime symbol resolution helpers.
It is used to locate runtime.slicebytetostring, yielding the function's VA and the list of its call sites.
[Image: pclntab_floss]

The next step is to recover the decoded bytes at each identified call site, and from them the runtime stackstrings.

References/Inspirations for the work:

Note: Copilot was used to help build the version-specific minimal pclntab blobs for test_language_go_pclntab.py.

…allsites

Signed-off-by: kunalsz <kunalavengers@gmail.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request lays the groundwork for extracting decoded stackstrings from Go binaries. It locates calls to runtime.slicebytetostring using a pclntab parser that handles several Go versions as well as stripped or obfuscated binaries. Recovering the runtime stackstrings at the identified call sites is left for follow-up work.

Highlights

  • Go Decoded Stackstring Extraction: Implemented a Go-specific pass that identifies calls to runtime.slicebytetostring and its common aliases within the vivisect workspace, as the first step toward extracting decoded stackstrings.
  • PCLNTAB Parsing for Go Metadata: Introduced a new module (pclntab.py) to parse Go's pclntab metadata, enabling the resolution of function names and addresses. This is crucial for locating runtime.slicebytetostring even in stripped, garbled, or obfuscated Go binaries.
  • Multi-Version PCLNTAB Support: The pclntab parser supports various Go pclntab versions, including 1.2 (covering 1.2-1.15), 1.16, 1.18, and 1.20, ensuring broad compatibility.
  • Robust PCLNTAB Location: The pclntab locator can find the pclntab data via named sections (e.g., .gopclntab) or through magic byte scanning across all sections, making it resilient to binary stripping or obfuscation.
  • Debug Script and Comprehensive Testing: Added a dedicated debug script to test and visualize runtime.slicebytetostring callsites, along with extensive unit tests for the pclntab parsing logic using minimal in-memory blobs and real Go binary fixtures.
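
The magic byte scanning mentioned in the highlights can be sketched as follows. The magic values and header checks match Go's debug/gosym package as far as I know. The function name `scan_for_pclntab`, the little-endian-only assumption, and the flat-buffer input are illustrative choices, not the PR's actual `_locate_pclntab`.

```python
import struct
from typing import Optional, Tuple

# pclntab magic values (read as little-endian uint32), keyed to the Go
# release that introduced each header layout.
PCLNTAB_MAGICS = {
    0xFFFFFFFB: "1.2",
    0xFFFFFFFA: "1.16",
    0xFFFFFFF0: "1.18",
    0xFFFFFFF1: "1.20",
}


def scan_for_pclntab(buf: bytes) -> Optional[Tuple[int, str, int]]:
    """Return (offset, version, ptrsize) of the first plausible pclntab header.

    Assumption: a little-endian target; big-endian binaries are not handled.
    """
    for off in range(0, len(buf) - 7):
        (magic,) = struct.unpack_from("<I", buf, off)
        version = PCLNTAB_MAGICS.get(magic)
        if version is None:
            continue
        pad1, pad2, min_lc, ptrsize = buf[off + 4 : off + 8]
        # Header sanity checks: two zero pad bytes, an instruction size
        # quantum of 1, 2, or 4, and a pointer size of 4 or 8.
        if pad1 == 0 and pad2 == 0 and min_lc in (1, 2, 4) and ptrsize in (4, 8):
            return off, version, ptrsize
    return None
```

Validating the four bytes after the magic keeps the scan from accepting the many accidental `FF FF FF` runs found in ordinary data. A real locator would likely try a named section such as .gopclntab first and fall back to scanning only when that fails.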


Changelog
  • floss/language/go/extract.py
    • Updated imports to include vivisect.const and find_runtime_functions from the new pclntab module.
    • Added find_slicebytetostring_callsites function to identify calls to runtime.slicebytetostring using both vivisect's name table and pclntab parsing.
  • floss/language/go/pclntab.py
    • Introduced pclntab.py to parse Go's pclntab data structure, extracting function name to virtual address mappings.
    • Implemented _locate_pclntab for finding pclntab data via section names or magic byte scanning, supporting stripped/obfuscated binaries.
    • Provided version-specific parsers (_parse_v12, _parse_v116, _parse_v118) for different Go pclntab formats.
    • Added find_runtime_functions to search for specific runtime functions within the parsed pclntab data.
  • scripts/debug_find_slicebytetostring.py
    • Created debug_find_slicebytetostring.py to demonstrate and test the new Go stackstring extraction capabilities.
    • Included options to display pclntab symbol table information and control the number of callsites printed.
  • tests/test_language_go_pclntab.py
    • Implemented unit tests for floss.language.go.pclntab, covering pclntab parsing for Go versions 1.2, 1.16, 1.18, and 1.20.
    • Included tests for _locate_pclntab and find_runtime_functions using mock PE structures.
    • Added integration tests for parsing real Go binaries.
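
As a rough picture of the lookup that find_runtime_functions performs once pclntab has been parsed, the sketch below filters a name-to-VA map by a list of wanted names. Everything here is hypothetical: the real function takes a PE object, and the name `match_runtime_functions` and the rule treating "runtime.x" and "runtime_x" as the same symbol are assumptions made for illustration.

```python
from typing import Dict, Iterable


def match_runtime_functions(symbols: Dict[str, int], tokens: Iterable[str]) -> Dict[str, int]:
    """Return the subset of `symbols` whose name matches any wanted token."""

    def normalize(name: str) -> str:
        # Assumption: "runtime.x" and "runtime_x" are spellings of the same
        # function, so compare them case-insensitively with "_" mapped to ".".
        return name.lower().replace("_", ".")

    wanted = {normalize(token) for token in tokens}
    return {name: va for name, va in symbols.items() if normalize(name) in wanted}
```

For example, with a symbol map that contains both spellings, a single token selects both entries and leaves unrelated functions such as main.main out.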
Activity
  • A new module, pclntab.py, has been added to parse Go metadata (pclntab/moduledata) for function names and addresses, addressing the lack of Go runtime symbol resolution helpers.
  • The find_slicebytetostring_callsites function was implemented to locate runtime.slicebytetostring and its aliases, along with their call sites.
  • Version-specific minimal pclntab blobs were built with the assistance of Copilot for test_language_go_pclntab.py.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature                 Command                 Description
Code Review             /gemini review          Performs a code review for the current pull request in its current state.
Pull Request Summary    /gemini summary         Provides a summary of the current pull request in its current state.
Comment                 @gemini-code-assist     Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                    /gemini help            Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a robust mechanism for identifying runtime.slicebytetostring callsites in Go binaries, which is a crucial first step for extracting decoded stackstrings. A new module, pclntab.py, is added to parse Go's pclntab data structure, enabling function symbol resolution even in stripped or obfuscated binaries. This new capability is integrated into extract.py. The changes are well-structured, include comprehensive unit tests for the new parser, and a helpful debug script. My feedback includes a couple of suggestions to improve maintainability and code clarity by replacing a magic number with a constant and avoiding variable shadowing.

Comment on lines +485 to +495
_pe = pe
if _pe is None:
    # Derive PE path from the vivisect workspace file list
    files = vw.getFiles()
    if not files:
        logger.debug("vw.getFiles() returned nothing, cannot open PE for pclntab scan")
        return out
    sample_path = files[0]
    _pe = pefile.PE(sample_path, fast_load=True)

hits = find_runtime_functions(_pe, _SLICEBYTETOSTRING_TOKENS)
Contributor


medium

The local variable _pe shadows the function parameter pe. This can be confusing and is generally discouraged. It's better to use a different name for the local variable to make the code clearer.

Suggested change
- _pe = pe
- if _pe is None:
-     # Derive PE path from the vivisect workspace file list
-     files = vw.getFiles()
-     if not files:
-         logger.debug("vw.getFiles() returned nothing, cannot open PE for pclntab scan")
-         return out
-     sample_path = files[0]
-     _pe = pefile.PE(sample_path, fast_load=True)
- hits = find_runtime_functions(_pe, _SLICEBYTETOSTRING_TOKENS)
+ pe_to_use = pe
+ if pe_to_use is None:
+     # Derive PE path from the vivisect workspace file list
+     files = vw.getFiles()
+     if not files:
+         logger.debug("vw.getFiles() returned nothing, cannot open PE for pclntab scan")
+         return out
+     sample_path = files[0]
+     pe_to_use = pefile.PE(sample_path, fast_load=True)
+ hits = find_runtime_functions(pe_to_use, _SLICEBYTETOSTRING_TOKENS)

return {}

nfunctab = _uptr(data, 8, ptrsize)
if nfunctab == 0 or nfunctab > 1_000_000:
Contributor


medium

The magic number 1_000_000 is used as a heuristic limit for the number of functions. It would be better to define this as a constant at the module level (e.g., _MAX_FUNCTIONS_HEURISTIC = 1_000_000) to improve readability and maintainability. This number is also used in _parse_v116 (line 264) and _parse_v118 (line 340).
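
A minimal sketch of what this suggestion could look like, assuming `_uptr` reads a little-endian unsigned pointer-sized integer at the given offset. Its real implementation is not shown in this thread, so `read_uptr` and `plausible_nfunctab` below are hypothetical stand-ins. Only the constant name and value come from the review comment.

```python
import struct

# Upper bound used to reject implausible function counts. Defined once at
# module level so every version-specific parser shares the same limit.
_MAX_FUNCTIONS_HEURISTIC = 1_000_000


def read_uptr(data: bytes, offset: int, ptrsize: int) -> int:
    """Read a little-endian unsigned integer of pointer size (4 or 8 bytes)."""
    fmt = "<I" if ptrsize == 4 else "<Q"
    return struct.unpack_from(fmt, data, offset)[0]


def plausible_nfunctab(data: bytes, ptrsize: int) -> bool:
    """Check the function count stored right after the 8-byte pclntab header."""
    nfunctab = read_uptr(data, 8, ptrsize)
    return 0 < nfunctab <= _MAX_FUNCTIONS_HEURISTIC
```

With the limit named once, the parsers mentioned in the comment could all call the same check, and changing the heuristic later would be a one-line edit.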

Contributor Author

kunalsz commented Mar 15, 2026

@mr-tz A quick, surface-level review of the PR's direction would be helpful! 🙃

