PoC: switch to rav1d for AVIF decoding #2849
|
We'll probably want to get a sense of the crate with fuzzing/Miri before switching over. From what I understand, zlib-rs was created by the same organization, and my experience with zlib-rs was that the marketing got a bit ahead of the implementation: when I saw it declared ready to try for production use, I ran it through some fuzzing and discovered that it instantly crashed because a few decode code paths were still marked |
|
FWIW I don't see the two projects sharing any key developers. But I don't see a fuzzing harness in-tree, which is a potential red flag. |
|
Oh, https://github.com/imazen/rav1d-safe also exists where people ported rav1d's handwritten assembly to Rust's safe SIMD intrinsics plus However, the license is AGPL-3.0, which is a very aggressive copyleft variant. But the README includes the following:
Should I talk to the memorysafety.org folks (with you cc'd) about getting the safe Rust codepaths upstreamed? Or perhaps we could reach out to rav1d-safe ourselves and discuss inclusion into |
|
In any case it brought |
|
The whole of rav1d-safe seems to me to have been made by AI. Don't get me wrong, I'm fine with AI, but there are ~100K lines of fairly complex AI-generated code that likely no one even partially understands. If there are logical or safety errors, which is almost inevitable, they will be quite difficult to track down and fix, especially since no one really knows what the code is actually supposed to do. |
|
That's not true: there are scalar Rust functions implementing the same behavior. So we always have known-good reference code and can automatically generate tests that compare the output of any optimized implementation to the scalar reference. This is precisely the property that could make an AI-driven conversion manageable; without this ground truth it would be a lost cause. Still, the fact that such a faithful conversion is potentially feasible doesn't mean that rav1d-safe did it with an appropriate level of rigor. Their claims still need to be verified. |
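To make the "compare against the scalar reference" idea concrete, here is a minimal sketch of such a differential test. Everything in it is hypothetical and for illustration only: the averaging kernel, the function names, and the hand-rolled PRNG are assumptions, not code from rav1d or rav1d-safe. The "optimized" variant is plain Rust that mimics a SIMD-style bit trick rather than using real intrinsics.

```rust
/// Signature shared by the reference and every candidate kernel (hypothetical).
type Kernel = fn(&[u8], &[u8], &mut [u8]);

/// Scalar reference: rounded-up average of two byte rows, widened to u16.
fn avg_scalar(a: &[u8], b: &[u8], out: &mut [u8]) {
    for i in 0..out.len() {
        out[i] = ((a[i] as u16 + b[i] as u16 + 1) >> 1) as u8;
    }
}

/// Stand-in for an optimized port: 8 bytes per step using the
/// non-widening identity ceil((x + y) / 2) == (x | y) - ((x ^ y) >> 1).
fn avg_chunked(a: &[u8], b: &[u8], out: &mut [u8]) {
    let n = out.len();
    let mut i = 0;
    while i + 8 <= n {
        for k in i..i + 8 {
            out[k] = (a[k] | b[k]) - ((a[k] ^ b[k]) >> 1);
        }
        i += 8;
    }
    // Scalar tail for the remaining 0..7 bytes.
    while i < n {
        out[i] = (a[i] | b[i]) - ((a[i] ^ b[i]) >> 1);
        i += 1;
    }
}

/// Deliberately wrong candidate: rounds down instead of up.
fn avg_truncating(a: &[u8], b: &[u8], out: &mut [u8]) {
    for i in 0..out.len() {
        out[i] = (a[i] & b[i]) + ((a[i] ^ b[i]) >> 1);
    }
}

/// Tiny deterministic xorshift64 generator so failures are reproducible.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    fn fill(&mut self, buf: &mut [u8]) {
        for byte in buf.iter_mut() {
            *byte = (self.next_u64() >> 24) as u8;
        }
    }
}

/// Runs both kernels on the same random inputs and reports the first mismatch.
fn differential_test(
    reference: Kernel,
    candidate: Kernel,
    seed: u64,
    cases: usize,
) -> Result<(), String> {
    let mut rng = XorShift64(seed | 1); // xorshift state must be non-zero
    for case in 0..cases {
        // Lengths 0..=64 exercise both the chunked body and the tail.
        let len = (rng.next_u64() % 65) as usize;
        let mut a = vec![0u8; len];
        let mut b = vec![0u8; len];
        rng.fill(&mut a);
        rng.fill(&mut b);
        let mut expected = vec![0u8; len];
        let mut actual = vec![0u8; len];
        reference(&a, &b, &mut expected);
        candidate(&a, &b, &mut actual);
        if expected != actual {
            return Err(format!("case {} (len {}): output mismatch", case, len));
        }
    }
    Ok(())
}
```

Calling `differential_test(avg_scalar, avg_chunked, 42, 1000)` should return `Ok(())`, while swapping in `avg_truncating` should return an `Err` naming the first failing case. A check like this only shows agreement on the inputs it tried, so it complements fuzzing and Miri rather than replacing them.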
I meant that if something goes wrong in the decoding path without an obvious panic, and the chatbot can't guess where the mistake is, one would have to debug the entire decoding path from scratch, because no one would be able to point out even where to start looking. |
|
I have to do that in other repositories too, both ones I wrote and ones I did not write myself. The argument is not exactly a quantitative methodology for getting rid of faults in committed code. Using an LLM to translate code is as close to a qualified use case for it as it gets, and intrinsic instructions rarely have panic side effects, so besides weird behavior I'm not too concerned. |
|
The upstream |
Depends on memorysafety/rav1d#1439
We don't have any in-tree AVIF tests, so I tested manually with Shnatsel/wondermagick#92 and it seems to work great!