
[Feature Request] exact floating-point mantissa overflow round to byte #347

@jopadan

Description

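#include <float.h>

        // Adding 1.5 * 2^(FLT_MANT_DIG - 1) moves the sum into a range where one ulp is 1
        // (for |_a| < 2^(FLT_MANT_DIG - 2)), so the addition itself rounds _a to an integer
        // in the current rounding mode and leaves it in the low mantissa bits.
        // Inputs outside the byte range wrap modulo 256 rather than saturate.
        inline BX_CONSTEXPR_FUNC uint8_t ubround(float _a)
        {
                _a += (1u << (FLT_MANT_DIG - 1)) + (1u << (FLT_MANT_DIG - 2));
                // Reads the byte at the lowest address, which is the low mantissa byte
                // only on little-endian targets.
                return *(uint8_t*)&_a;
        }

        // Signed variant: the same low byte, read as a two's-complement value.
        inline BX_CONSTEXPR_FUNC int8_t bround(float _a)
        {
                _a += (1u << (FLT_MANT_DIG - 1)) + (1u << (FLT_MANT_DIG - 2));
                return *(int8_t*)&_a;
        }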

The pointer cast causes a constexpr warning, because reinterpret-style casts are not allowed in constant expressions.
The same trick works for C23 _Float16 types (HALF_MANT_DIG = 11) too, but falls back to software emulation if no hardware support is present.
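
One possible way around the constexpr warning, sketched under assumptions: read the float's bits by value instead of through a pointer cast, then mask the low 8 mantissa bits. The names `sketch::bitCast`, `lowMantissaBits`, `roundToU8` and `roundToI8` are hypothetical and not part of bx. The sketch assumes IEEE-754 binary32 `float` and either C++20 `std::bit_cast` or the `__builtin_bit_cast` builtin found in recent GCC and Clang. Masking the integer value also removes the dependence on byte order.

```cpp
#include <bit>      // std::bit_cast when compiled as C++20
#include <cfloat>   // FLT_MANT_DIG
#include <cstdint>

namespace sketch {  // hypothetical namespace, for illustration only

// Value-based bit reinterpretation that is usable in constant expressions.
template <class To, class From>
constexpr To bitCast(const From& from)
{
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<To>(from);
#else
    return __builtin_bit_cast(To, from);  // assumption: GCC/Clang builtin is available
#endif
}

// 2^23 + 2^22 = 1.5 * 2^23 for a 24-bit significand.
constexpr float kMagic =
    static_cast<float>((1u << (FLT_MANT_DIG - 1)) + (1u << (FLT_MANT_DIG - 2)));

constexpr std::uint32_t lowMantissaBits(float a)
{
    // For |a| < 2^22 the sum lies in [2^23, 2^24), where one ulp is 1,
    // so the addition has already rounded a to an integer.
    return bitCast<std::uint32_t>(a + kMagic);
}

// Round to nearest and wrap modulo 256 (no saturation).
constexpr std::uint8_t roundToU8(float a)
{
    return static_cast<std::uint8_t>(lowMantissaBits(a) & 0xFFu);
}

// Same low byte, interpreted as a two's-complement signed value.
constexpr std::int8_t roundToI8(float a)
{
    return static_cast<std::int8_t>(lowMantissaBits(a) & 0xFFu);
}

// Evaluated at compile time, so no reinterpret cast is involved.
static_assert(roundToU8(3.4f) == 3, "rounds to nearest");
static_assert(roundToI8(-1.2f) == -1, "negative values use the signed byte");
static_assert(roundToU8(256.0f) == 0, "out-of-range input wraps");

}  // namespace sketch
```

Ties follow the rounding mode of the addition, so under the default round-to-nearest-even mode `2.5f` maps to 2 and `3.5f` maps to 4. Callers that need saturation would have to clamp the input first.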
