Header lexy/dsl/encode.hpp

Rule lexy::dsl::encode

namespace lexy::dsl
{
    template <typename Encoding,
              lexy::encoding_endianness Endianness = lexy::encoding_endianness::bom>
    constexpr rule auto encode(rule auto rule);
}

encode is a rule that temporarily changes the encoding of the input while parsing rule.

Requirements

  • The input encoding has single byte code units (e.g. lexy::byte_encoding).

  • If Endianness == lexy::encoding_endianness::bom, Encoding must be ASCII, UTF-8, UTF-16, or UTF-32.

Parsing

  • If Endianness != lexy::encoding_endianness::bom: Parses rule using a new input whose encoding is Encoding. The new input forms each code unit of Encoding by reading as many bytes from the original input as needed and combining them in the specified Endianness.

  • If Endianness == lexy::encoding_endianness::bom: Tries to match and consume lexy::dsl::bom<Encoding, lexy::encoding_endianness::little> or lexy::dsl::bom<Encoding, lexy::encoding_endianness::big>. If a BOM was matched, parses encode<Encoding, ...>(rule) with the endianness of that BOM as above. If no BOM could be matched, parses encode<Encoding, lexy::encoding_endianness::big>(rule) as above.
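The two parsing cases can be illustrated with a small standalone sketch for UTF-16. It does not use lexy and is not how lexy implements encode. The names endianness, bom_result, detect_utf16_bom, and read_utf16_unit are hypothetical and exist only for this illustration. The sketch shows how a byte order mark selects the byte order, how the fallback to big endian applies when there is no BOM, and how two bytes are combined into one code unit.

```cpp
#include <cstddef>

// Hypothetical names for illustration only; none of this is lexy's API.
enum class endianness { little, big };

struct bom_result
{
    endianness  order; // byte order to use for the rest of the input
    std::size_t skip;  // number of BOM bytes consumed (0 if none)
};

// Detects a UTF-16 BOM (U+FEFF) at the start of `bytes`.
// Without a BOM it falls back to big endian, mirroring the rule described above.
constexpr bom_result detect_utf16_bom(const unsigned char* bytes, std::size_t size)
{
    if (size >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
        return {endianness::little, 2};
    if (size >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
        return {endianness::big, 2};
    return {endianness::big, 0};
}

// Combines two bytes into one UTF-16 code unit in the given byte order.
constexpr char16_t read_utf16_unit(const unsigned char* bytes, endianness order)
{
    return order == endianness::little
               ? static_cast<char16_t>(bytes[0] | (bytes[1] << 8))
               : static_cast<char16_t>((bytes[0] << 8) | bytes[1]);
}

// U+00E4 preceded by a little endian BOM, and the same code point without a BOM.
constexpr unsigned char with_bom[]    = {0xFF, 0xFE, 0xE4, 0x00};
constexpr unsigned char without_bom[] = {0x00, 0xE4};

static_assert(detect_utf16_bom(with_bom, 4).order == endianness::little, "BOM selects little endian");
static_assert(read_utf16_unit(with_bom + 2, endianness::little) == 0x00E4, "unit after the BOM");
static_assert(detect_utf16_bom(without_bom, 2).skip == 0, "no BOM consumed");
static_assert(read_utf16_unit(without_bom, endianness::big) == 0x00E4, "big endian fallback");
```

In both inputs the rule passed to encode would see the single code unit 0x00E4; only the byte layout of the underlying input differs.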


Errors

All errors raised by parsing rule. encode then fails if rule has failed. Note that the error type is templated on the reader used, which is an unspecified type here.


Values

All values produced by rule. Note that types that are templated on the reader (e.g. lexy::lexeme) are instantiated with an unspecified reader type. Types that are only templated on the iterator do not change, as the unspecified reader has the same iterator type.

Example 1. Match umlauts in different encodings
struct production
{
    // Each literal is matched in its own encoding; the newlines and EOF
    // in between are matched in the original single byte encoding.
    static constexpr auto rule = [] {
        auto utf8  = dsl::encode<lexy::utf8_encoding>(LEXY_LIT(u8"ä"));
        auto utf16 = dsl::encode<lexy::utf16_encoding>(LEXY_LIT(u"ä"));
        auto utf32 = dsl::encode<lexy::utf32_encoding>(LEXY_LIT(U"ä"));

        return utf8 + dsl::newline + utf16 + dsl::newline + utf32 + dsl::eof;
    }();
};
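To make the byte layout concrete, the following sketch spells out one input that this production should accept under the rules above. The array sample_input is a hypothetical name and is not taken from the lexy documentation. It assumes that no BOM is present, so each encode falls back to big endian, and that the source file is compiled as UTF-8.

```cpp
#include <cstddef>

// Hypothetical sample input for the production above.
// No BOM is present, so every encode() uses the big endian fallback.
constexpr unsigned char sample_input[] = {
    0xC3, 0xA4,             // U+00E4 as two UTF-8 code units
    0x0A,                   // newline, matched in the byte encoding
    0x00, 0xE4,             // U+00E4 as one big endian UTF-16 code unit
    0x0A,                   // newline, matched in the byte encoding
    0x00, 0x00, 0x00, 0xE4, // U+00E4 as one big endian UTF-32 code unit
};

static_assert(sizeof(sample_input) == 10, "2 + 1 + 2 + 1 + 4 bytes");
```

A little endian variant would need a BOM in front of the UTF-16 and UTF-32 parts, since the default template argument only detects byte order from a BOM.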

See also