Header lexy/dsl/byte.hpp

Rules that match specific bits.

The inputs all work on byte boundaries, so it is not possible to match and consume individual bits. However, the lexy::dsl::bits rule allows consuming 8 (specific) bits at a time. To specify the bit values, there are special bit rules in namespace lexy::dsl::bit.

Bit rules lexy::dsl::bit::_0 and lexy::dsl::bit::_1

lexy/dsl/byte.hpp
namespace lexy::dsl::bit
{
    constexpr bit-rule auto _0;
    constexpr bit-rule auto _1;
}

_0 and _1 are bit rules that match a zero/one bit.

They specify one bit each.

Bit rule lexy::dsl::bit::nibble

lexy/dsl/byte.hpp
namespace lexy::dsl::bit
{
    template <unsigned Value>
    constexpr bit-rule auto nibble;
}

nibble is a bit rule that matches a specific nibble, i.e. four bits.

It requires that 0 <= Value <= 0xF. It then matches the specified four bits of Value.
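
As an illustration of which four bits a given Value stands for, the following plain C++ sketch expands a nibble into individual bits. The helper nibble_bits is hypothetical and not part of lexy, and it assumes the bits are taken most significant first, consistent with the left-to-right order described for lexy::dsl::bits.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper, not part of lexy: expands a nibble value into its
// four bits, most significant bit first (an assumption of this sketch).
constexpr std::array<bool, 4> nibble_bits(unsigned value)
{
    assert(value <= 0xF); // mirrors the documented requirement on Value
    return {(value & 0x8) != 0, (value & 0x4) != 0,
            (value & 0x2) != 0, (value & 0x1) != 0};
}
```

Under that assumption, nibble<0xA> would specify the same four bits as writing _1, _0, _1, _0.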

Bit rules lexy::dsl::bit::any and lexy::dsl::bit::_

lexy/dsl/byte.hpp
namespace lexy::dsl::bit
{
    template <unsigned N>
    constexpr bit-rule auto any;

    constexpr bit-rule auto _ = any<1>;
}

any is a bit rule that matches N > 0 bits of arbitrary value; _ is shorthand for any<1>, i.e. a single arbitrary bit.

It specifies N bits.

Token rule lexy::dsl::bits

lexy/dsl/byte.hpp
namespace lexy::dsl
{
    constexpr token-rule auto bits(bit-rule auto ... bits);
}

bits is a token rule that matches a byte with the specified bit values.

Requires

In total, bits specify exactly eight bit values. The encoding is lexy::byte_encoding.

Matching

Matches and consumes one byte. It then checks that the byte has the specified bits, by applying each bit rule in turn. From left to right, bit rules specify the most significant bit (e.g. the sign bit) down to the least significant bit (e.g. the even/odd bit).

Errors

lexy::expected_char_class ("bits"): if the input was at EOF or the byte value does not match the pattern specified by the bits. The rule then fails.
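
One way to picture the requirement and the matching step is as a mask/value pair built up from the most significant bit downwards. The sketch below is plain C++ and only a model of the described behavior: bit_pattern, fixed, any and matches are hypothetical names, and nothing here is claimed to be how lexy implements the rule.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a bits(...) pattern; not lexy's implementation.
struct bit_pattern
{
    std::uint8_t mask  = 0; // 1 where the bit value is fixed
    std::uint8_t value = 0; // required value of the fixed bits
    unsigned     count = 0; // number of bits specified so far

    // Appends one fixed bit (like bit::_0 / bit::_1), most significant first.
    constexpr bit_pattern fixed(bool bit) const
    {
        bit_pattern r = *this;
        r.mask  = static_cast<std::uint8_t>((r.mask << 1) | 1u);
        r.value = static_cast<std::uint8_t>((r.value << 1) | (bit ? 1u : 0u));
        r.count += 1;
        return r;
    }

    // Appends n bits of arbitrary value (like bit::any<N>).
    constexpr bit_pattern any(unsigned n) const
    {
        assert(n > 0); // mirrors the documented N > 0
        bit_pattern r = *this;
        r.mask  = static_cast<std::uint8_t>(r.mask << n);
        r.value = static_cast<std::uint8_t>(r.value << n);
        r.count += n;
        return r;
    }

    // True if exactly eight bits were specified and the byte has them.
    constexpr bool matches(std::uint8_t byte) const
    {
        return count == 8 && (byte & mask) == value;
    }
};

// 10xxxxxx: one bit, zero bit, then six arbitrary bits.
constexpr bit_pattern continuation = bit_pattern{}.fixed(true).fixed(false).any(6);
```

In this model, a byte that fails the matches check corresponds to the error case above, and a pattern whose count is not eight corresponds to a violated requirement.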

Example 1. Manually match a UTF-8 code point
struct code_point
{
    static constexpr auto rule = [] {
        // 10xxxxxx
        auto continuation = dsl::bits(dsl::bit::_1, dsl::bit::_0, dsl::bit::any<6>);

        // 0xxxxxxx
        auto ascii = dsl::bits(dsl::bit::_0, dsl::bit::any<7>);
        // 110xxxxx
        auto lead_two = dsl::bits(dsl::bit::_1, dsl::bit::_1, dsl::bit::_0, dsl::bit::any<5>);
        // 1110xxxx
        auto lead_three
            = dsl::bits(dsl::bit::_1, dsl::bit::_1, dsl::bit::_1, dsl::bit::_0, dsl::bit::any<4>);
        // 11110xxx
        auto lead_four = dsl::bits(dsl::bit::_1, dsl::bit::_1, dsl::bit::_1, dsl::bit::_1,
                                   dsl::bit::_0, dsl::bit::any<3>);

        return ascii | lead_two >> continuation | lead_three >> dsl::twice(continuation)
               | lead_four >> dsl::times<3>(continuation);
    }();
};
Tip
Combine it with e.g. lexy::dsl::bint8 to match an integer with specified bits set.
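
For comparison, the four lead-byte patterns from Example 1 can be written as conventional mask tests. This is a minimal sketch in plain C++ with a hypothetical helper, utf8_sequence_length, that is not part of lexy. Like the example, it only checks the bit layout of the lead byte and does not reject overlong or otherwise invalid encodings.

```cpp
#include <cstdint>

// Hypothetical helper, not part of lexy: returns the total number of bytes
// in the UTF-8 sequence started by `lead`, or 0 if `lead` fits none of the
// four lead patterns from the example (e.g. a stray continuation byte).
constexpr int utf8_sequence_length(std::uint8_t lead)
{
    if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0;
}
```

Each mask covers the fixed leading bits of the corresponding bits(...) call, and the compared value holds the bit values those positions must have.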

See also