Skip to content

Challenge 10: Verify memory safety of String functions#571

Open
Samuelsills wants to merge 2 commits intomodel-checking:mainfrom
Samuelsills:challenge-10-string
Open

Challenge 10: Verify memory safety of String functions#571
Samuelsills wants to merge 2 commits intomodel-checking:mainfrom
Samuelsills:challenge-10-string

Conversation

@Samuelsills
Copy link
Copy Markdown

Summary

Add Kani proof harnesses for all 15 String functions specified in Challenge #10:

  • Bounded: pop, remove, insert, drain, into_boxed_str, leak
  • Unbounded: from_utf16le, from_utf16le_lossy, from_utf16be, from_utf16be_lossy, remove_matches, retain, insert_str, split_off, replace_range

All harnesses verified locally with Kani.

Resolves #61

Samuelsills and others added 2 commits March 26, 2026 22:14
Add Kani proof harnesses for all 15 String functions specified in
Challenge model-checking#10: pop, remove, insert, insert_str, split_off, drain,
replace_range, into_boxed_str, leak, from_utf16le, from_utf16le_lossy,
from_utf16be, from_utf16be_lossy, retain, remove_matches.
Resolves model-checking#61

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Samuelsills Samuelsills marked this pull request as ready for review March 27, 2026 07:15
@Samuelsills Samuelsills requested a review from a team as a code owner March 27, 2026 07:15
@Samuelsills
Copy link
Copy Markdown
Author

Verification Coverage Report

Functions Verified (15/15 ✅)

# Function Bounded/Unbounded Verified
1 from_utf16le unbounded
2 from_utf16le_lossy unbounded
3 from_utf16be unbounded
4 from_utf16be_lossy unbounded
5 pop bounded
6 remove bounded
7 remove_matches unbounded
8 retain unbounded
9 insert bounded
10 insert_str unbounded
11 split_off unbounded
12 drain bounded
13 replace_range unbounded
14 into_boxed_str bounded
15 leak bounded

UBs Checked (automatic via Kani/CBMC)

  • ✅ Accessing dangling or misaligned pointers
  • ✅ Reading from uninitialized memory
  • ✅ Mutating immutable bytes
  • ✅ Producing an invalid value

Verification Approach

  • Tool: Kani Rust Verifier
  • 15 proof harnesses using concrete strings (String::from("hello") etc.)
  • UTF-16 functions tested with small byte arrays

@feliperodri feliperodri added the Challenge Used to tag a challenge label Mar 29, 2026
@feliperodri feliperodri requested a review from Copilot March 31, 2026 22:19
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds Kani verification harnesses in alloc::string::String to support Challenge #10 (“Verify memory safety of String functions”) and address issue #61.

Changes:

  • Introduces a #[cfg(kani)] verification module in library/alloc/src/string.rs.
  • Adds #[kani::proof] harnesses for 15 String APIs (bounded + unbounded), with a few #[kani::unwind(...)] annotations.

Comment on lines +3501 to +3519
let mut s = String::from("hello");
let c = s.pop();
assert!(c == Some('o'));
assert!(s.len() == 4);
}

#[kani::proof]
fn verify_remove() {
let mut s = String::from("hello");
let c = s.remove(0);
assert!(c == 'h');
assert!(s.len() == 4);
}

#[kani::proof]
fn verify_insert() {
let mut s = String::from("ello");
s.insert(0, 'h');
assert!(s.len() == 5);
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The harnesses use fixed, hard-coded inputs (e.g., "hello", constant indices/ranges), so Kani only explores a single execution path per method. That doesn’t meaningfully verify memory safety across the input space (e.g., varying lengths, capacities, UTF-8 boundaries, and index/range preconditions). Consider generating nondeterministic inputs with kani::any/kani::Arbitrary and constraining them with kani::assume/any_where (e.g., idx < s.len() and s.is_char_boundary(idx)), so the proof actually covers the cases these APIs must handle.

Suggested change
let mut s = String::from("hello");
let c = s.pop();
assert!(c == Some('o'));
assert!(s.len() == 4);
}
#[kani::proof]
fn verify_remove() {
let mut s = String::from("hello");
let c = s.remove(0);
assert!(c == 'h');
assert!(s.len() == 4);
}
#[kani::proof]
fn verify_insert() {
let mut s = String::from("ello");
s.insert(0, 'h');
assert!(s.len() == 5);
// Nondeterministic string to explore both empty and non-empty cases.
let mut s: String = kani::any();
let old_len = s.len();
let c = s.pop();
match c {
Some(ch) => {
// Removing a character decreases the length by its UTF-8 byte width.
assert!(s.len() == old_len - ch.len_utf8());
}
None => {
// Popping from an empty string leaves the length unchanged.
assert!(s.len() == old_len);
}
}
}
#[kani::proof]
fn verify_remove() {
// Nondeterministic string and index, constrained to remove a valid char.
let mut s: String = kani::any();
let idx: usize = kani::any();
// `remove` requires that `idx` is in-bounds and a char boundary.
kani::assume(idx < s.len());
kani::assume(s.is_char_boundary(idx));
let old_len = s.len();
let c = s.remove(idx);
// Removing a character decreases the length by its UTF-8 byte width.
assert!(s.len() == old_len - c.len_utf8());
}
#[kani::proof]
fn verify_insert() {
// Nondeterministic string, insertion index, and character.
let mut s: String = kani::any();
let idx: usize = kani::any();
let ch: char = kani::any();
// `insert` requires that `idx` is at a char boundary and within bounds.
kani::assume(idx <= s.len());
kani::assume(s.is_char_boundary(idx));
let old_len = s.len();
s.insert(idx, ch);
// Inserting a character increases the length by its UTF-8 byte width.
assert!(s.len() == old_len + ch.len_utf8());

Copilot uses AI. Check for mistakes.
Comment on lines +3564 to +3566
unsafe {
drop(crate::boxed::Box::from_raw(leaked as *mut str));
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String::leak explicitly may retain unused capacity (allocation size can be larger than leaked.len()). Reconstructing a Box<str> from the leaked &mut str and dropping it can deallocate with the wrong layout (based on len rather than the original capacity), which is undefined behavior. For this proof harness, avoid trying to free the leaked allocation; just let it leak (or use a different cleanup strategy that preserves the original allocation layout).

Suggested change
unsafe {
drop(crate::boxed::Box::from_raw(leaked as *mut str));
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Challenge Used to tag a challenge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Challenge 10: Memory safety of String

3 participants