Header lexy/dsl/digit.hpp

Token rules that match individual digits or sequences of digits.

Bases

lexy/dsl/digit.hpp
namespace lexy::dsl
{
    struct binary {};
    struct octal {};
    struct decimal {};

    struct hex_lower {};
    struct hex_upper {};
    struct hex {};
}

The tag types binary, octal, decimal, hex_lower, hex_upper, and hex indicate the base of a number and what digits are allowed.

The bases

Tag        Radix  Digits
binary     2      01
octal      8      01234567
decimal    10     0123456789
hex_lower  16     0123456789, abcdef
hex_upper  16     0123456789, ABCDEF
hex        16     0123456789, abcdef, ABCDEF
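
The digit sets of the bases can be modelled outside of lexy as a plain predicate. The following is a minimal sketch, not lexy's implementation: the enum base and the functions radix and is_digit are hypothetical names introduced only for illustration, and the character comparisons assume an ASCII-compatible execution character set.

```cpp
// Hypothetical stand-ins for the tag types; these are not lexy's definitions.
enum class base { binary, octal, decimal, hex_lower, hex_upper, hex };

// Radix of each base.
constexpr unsigned radix(base b)
{
    switch (b)
    {
    case base::binary:  return 2;
    case base::octal:   return 8;
    case base::decimal: return 10;
    default:            return 16; // all three hex variants
    }
}

// Whether c is one of the digits permitted by base b.
constexpr bool is_digit(base b, char c)
{
    const bool dec   = c >= '0' && c <= '9';
    const bool lower = c >= 'a' && c <= 'f';
    const bool upper = c >= 'A' && c <= 'F';
    switch (b)
    {
    case base::binary:    return c == '0' || c == '1';
    case base::octal:     return c >= '0' && c <= '7';
    case base::decimal:   return dec;
    case base::hex_lower: return dec || lower;
    case base::hex_upper: return dec || upper;
    case base::hex:       return dec || lower || upper;
    }
    return false;
}

// hex_upper rejects lower-case letters, hex accepts both cases.
static_assert(is_digit(base::hex_upper, 'F'));
static_assert(!is_digit(base::hex_upper, 'f'));
static_assert(is_digit(base::hex, 'f'));
static_assert(!is_digit(base::octal, '8'));
```

The three hexadecimal tags share a radix and differ only in which letter case they accept, which is why the sketch keeps the radix and the digit set as separate functions.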

Token rule lexy::dsl::zero

lexy/dsl/digit.hpp
namespace lexy::dsl
{
    constexpr token-rule auto zero;
}

zero is a token rule that matches 0.

Matching

Matches and consumes lexy::dsl::lit_c<'0'>, i.e. 0.

Errors

lexy::expected_char_class ("digit.zero"): raised at the starting reader position, in place of the lexy::expected_literal that lit_c would otherwise report. The rule then fails.

Parse tree

Single token node with the lexy::predefined_token_kind lexy::digit_token_kind.

Example 1. The number zero
struct production
{
    static constexpr auto rule = dsl::zero + dsl::eof;
};

Token rule lexy::dsl::digit

lexy/dsl/digit.hpp
namespace lexy::dsl
{
    template <typename Base = decimal>
    constexpr token-rule auto digit;
}

digit is a token rule that matches one digit of the specified Base.

Requires

Base must be one of the Bases.

Matching

Matches and consumes one of the digits of Base specified in the table above. Any transcoding from ASCII to the input encoding for the comparison is done by a simple static_cast.

Errors

lexy::expected_char_class ("digit.<base>"): at the starting reader position. The rule then fails.

Parse tree

Single token node with the lexy::predefined_token_kind lexy::digit_token_kind.

Example 2. A hexadecimal digit, upper-case letters only
struct production
{
    static constexpr auto rule = dsl::digit<dsl::hex_upper> + dsl::eof;
};
Note
The only difference between lexy::dsl::digit<decimal> and lexy::dsl::ascii::digit is the name of the character class in the error.

Token rule lexy::dsl::digits

lexy/dsl/digit.hpp
namespace lexy
{
    struct forbidden_leading_zero {};
}

namespace lexy::dsl
{
    class digits-dsl // models token-rule
    {
    public:
        constexpr digits-dsl sep(token-rule auto s) const;

        constexpr digits-dsl no_leading_zero() const;
    };

    template <typename Base = decimal>
    constexpr digits-dsl auto digits;
}

digits is a token rule that matches one or more digits of the specified Base, with an optional separator between them.

Matching

Matches and consumes digit<Base> once, then tries to match and consume it again until it no longer matches. If a separator has been specified using .sep(), it tries to match it after every digit. If it succeeds, matches digit<Base> immediately afterwards.

Errors
  • lexy::forbidden_leading_zero: if .no_leading_zero() was called and it matches zero as its first digit, which is then followed by a digit or separator. The error is raised at the position of the zero. The rule then fails without consuming additional input.

  • All errors raised by matching the initial digit. The rule then fails.

  • All errors raised by matching a digit after a separator to prevent trailing separators. The rule then fails.

Parse tree

Single token node with the lexy::predefined_token_kind lexy::digit_token_kind.

Example 3. Decimal digits with digit separator but without leading zeroes
struct production
{
    static constexpr auto rule
        = dsl::digits<>.sep(dsl::digit_sep_tick).no_leading_zero() + dsl::eof;
};
Note
If a digit separator is specified, it can be between any two digits. lexy does not enforce the use of thousands separators only, or other conventions.
Note
digits does not produce the value of the number it parsed. Use lexy::dsl::integer for that.
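
The matching and error rules of digits can be summarised by a small standalone model. This is a sketch under stated assumptions, not lexy's implementation: it handles decimal digits only, represents the separator as a single char ('\0' meaning "none"), and the names digits_result, digits_match and match_digits are hypothetical.

```cpp
#include <cstddef>
#include <string_view>

enum class digits_result { ok, expected_digit, forbidden_leading_zero };

struct digits_match
{
    digits_result result;
    std::size_t   pos; // on success: characters consumed; on error: error position
};

constexpr bool is_decimal_digit(char c) { return c >= '0' && c <= '9'; }

// Models digits<decimal>; sep == '\0' means no separator was configured.
constexpr digits_match match_digits(std::string_view in, char sep = '\0',
                                    bool no_leading_zero = false)
{
    // The initial digit is mandatory.
    if (in.empty() || !is_decimal_digit(in[0]))
        return {digits_result::expected_digit, 0};

    // A leading zero is an error only if a digit or separator follows it.
    if (no_leading_zero && in[0] == '0' && in.size() > 1
        && (is_decimal_digit(in[1]) || (sep != '\0' && in[1] == sep)))
        return {digits_result::forbidden_leading_zero, 0};

    std::size_t pos = 1;
    while (pos < in.size())
    {
        if (sep != '\0' && in[pos] == sep)
        {
            // A separator must be followed by a digit: no trailing separators.
            if (pos + 1 >= in.size() || !is_decimal_digit(in[pos + 1]))
                return {digits_result::expected_digit, pos + 1};
            pos += 2;
        }
        else if (is_decimal_digit(in[pos]))
            ++pos;
        else
            break; // anything else ends the token
    }
    return {digits_result::ok, pos};
}

static_assert(match_digits("1'234", '\'', true).pos == 5);
static_assert(match_digits("0", '\0', true).result == digits_result::ok);
static_assert(match_digits("007", '\0', true).result
              == digits_result::forbidden_leading_zero);
static_assert(match_digits("12'", '\'').result == digits_result::expected_digit);
```

A lone 0 is accepted even with no_leading_zero set, because the error only applies when the zero is followed by another digit or a separator.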

Token rule lexy::dsl::n_digits

lexy/dsl/digit.hpp
namespace lexy::dsl
{
    class n_digits-dsl // models token-rule
    {
    public:
        constexpr n_digits-dsl sep(token-rule auto s) const;
    };

    template <std::size_t N, typename Base = decimal>
    constexpr n_digits-dsl auto n_digits;
}

n_digits is a token rule that matches exactly N digits of the specified Base, with an optional separator between them.

Matching

Matches and consumes digit<Base> exactly N times; any digits that follow are not consumed and remain in the input. If a digit separator has been specified using .sep(), it tries to match it after every digit except the last one. Separators do not count towards the number of digits.

Errors

All errors raised by matching digit<Base>. The rule then fails.

Parse tree

Single token node with the lexy::predefined_token_kind lexy::digit_token_kind.

Example 4. A \x escape sequence
struct production
{
    static constexpr auto rule = LEXY_LIT("\\x") + dsl::n_digits<2, dsl::hex>;
};
Note
If a digit separator is specified, it can be between any two digits. lexy does not enforce the use of thousands separators only, or other conventions.
Note
n_digits does not produce the value of the number it parsed. Use lexy::dsl::integer or lexy::dsl::code_point_id for that.
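
The counting behaviour of n_digits, including the optional separator, can be modelled in the same standalone way. This is again a sketch rather than lexy's implementation: it is fixed to the hex base, uses a single char as separator ('\0' meaning "none"), and is_hex_digit and match_n_hex_digits are hypothetical names.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

constexpr bool is_hex_digit(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')
           || (c >= 'A' && c <= 'F');
}

// Models n_digits<N, hex>: returns the number of characters consumed,
// or std::nullopt if fewer than N digits could be matched.
template <std::size_t N>
constexpr std::optional<std::size_t> match_n_hex_digits(std::string_view in,
                                                        char sep = '\0')
{
    std::size_t pos = 0;
    for (std::size_t i = 0; i != N; ++i)
    {
        if (pos >= in.size() || !is_hex_digit(in[pos]))
            return std::nullopt;
        ++pos;

        // Optional separator after every digit except the last one;
        // it does not count towards N.
        if (i + 1 != N && sep != '\0' && pos < in.size() && in[pos] == sep)
            ++pos;
    }
    return pos;
}

// Exactly two digits are consumed; further digits stay in the input.
static_assert(match_n_hex_digits<2>("4F00") == 2u);
static_assert(match_n_hex_digits<2>("4") == std::nullopt);
static_assert(match_n_hex_digits<4>("12_34", '_') == 5u);
```

In the last assertion the separator is consumed (five characters in total) but only the four digits count towards N.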

Pre-defined digit separators

lexy/dsl/digit.hpp
namespace lexy::dsl
{
    constexpr token-rule auto digit_sep_underscore = lit_c<'_'>;
    constexpr token-rule auto digit_sep_tick       = lit_c<'\''>; // note: single character
}

The token rules digit_sep_underscore and digit_sep_tick are two convenience aliases for lexy::dsl::lit_c that match common digit separators.

See also