Header lexy/dsl/byte.hpp
Rules that match one or more bytes.
Token rule lexy::dsl::byte
and lexy::dsl::bytes
lexy/dsl/byte.hpp
namespace lexy::dsl
{
class bytes-dsl // models token-rule
{
public:
template <typename Predicate>
constexpr token-rule auto if_() const;
template <unsigned char ... Bytes>
constexpr token-rule auto set() const;
template <unsigned char Low, unsigned char High>
constexpr token-rule auto range() const;
constexpr token-rule auto ascii() const;
};
template <std::size_t N>
constexpr bytes-dsl auto bytes;
constexpr _bytes-dsl auto byte = bytes<1>;
}
bytes
is a token rule that matches N
bytes from a specified set.
- Requires
The encoding of the input is
lexy::byte_encoding
.- Matching
Matches and consumes
N
bytes:By default, it matches arbitrary bytes in the range
[0x00, 0xFF]
(both sides inclusive). Its name isbyte
..if_()
: matches bytes where the predicate returns true. It must have aconstexpr bool operator()(unsigned char)
. Its name is the type name ofPredicate
..set()
: matches the specified bytes. Its name isbyte.set
..range()
: matches the bytes in the range[Low, High]
(both sides inclusive). Its name isbyte.range
..ascii()
: matches bytes that are also ASCII characters; i.e. in the range[0x00, 0x7F]
(both sides inclusive). Its name isbyte.ASCII
.
- Errors
lexy::expected_char_class
(with the name as above): if a mismatched byte or EOF was encountered early; at the position of the last byte it could consume. The rule then fails.- Parse tree
Single token node with the
lexy::predefined_token_kind
lexy::any_token_kind
(default) orlexy::unknown_token_kind
(with predicate/range).
Caution | Combining lexy::dsl::capture with bytes does not do any endianness conversion. |
Branch rule lexy::dsl::padding_bytes
lexy/dsl/byte.hpp
namespace lexy::dsl
{
template <std::size_t N, unsigned char Value = 0>
constexpr branch-rule auto padding_bytes;
}
padding_bytes
is a branch rule that matches N
padding bytes.
Padding bytes are bytes whose value doesn’t matter: they’re only expected to have a certain value, but it’s okay, if they don’t.
- Requires
The encoding of the input is
lexy::byte_encoding
.- (Branch) parsing
(Branch) parses
lexy::dsl::bytes
. If that succeeds, the branch is taken. It then checks if theN
consumed bytes have the givenValue
.- Errors
All errors raised by parsing
bytes
, if not branch parsing. The rule then fails.lexy::expected_literal
if a byte did not haveValue
. The rule then recovers.
Note | Branch parsing will not backtrack if the padding bytes do not have the given Value . |
Binary integer branch rules
lexy/dsl/byte.hpp
namespace lexy
{
struct mismatched_byte_count {};
}
namespace lexy::dsl
{
struct binary-integer-dsl_ // models branch-rule
{
constexpr branch-rule operator()(token-rule bytes) const;
};
constexpr binary-integer-dsl bint8;
constexpr binary-integer-dsl bint16;
constexpr binary-integer-dsl little_bint16;
constexpr binary-integer-dsl big_bint16;
constexpr binary-integer-dsl bint32;
constexpr binary-integer-dsl little_bint32;
constexpr binary-integer-dsl big_bint32;
constexpr binary-integer-dsl bint64;
constexpr binary-integer-dsl little_bint64;
constexpr binary-integer-dsl big_bint64;
}
[little/big_]bintN
is a branch rule that converts N / 8
bytes into an N
bit integer.
Each rule has the associated bit width N
and endianness:
for the prefixed ones it is little/big endian, for the unprefixed ones it is the native endianness.
The endianness doesn’t matter for bint8
.
They delegate to the associated bytes
rule, which can be specified using the function call operator.
If none has been specified, they default to lexy::dsl::bytes
.
- Requires
The encoding of the input is
lexy::byte_encoding
.- (Branch) parsing
(Branch) parses
bytes
. If that succeeds, the branch is taken. It then checks thatN / 8
bytes have been consumed.- Errors
All errors raised by parsing
bytes
. The rule then fails.lexy::mismatched_byte_count
: ifbytes
succeeded but did not consumeN / 8
bytes; its range covers everything consumed bybytes
. The rule then fails.
- Values
A
std::uint_leastN_t
which results in readingN / 8
bytes in the specified endianness. This conversion can never fail.