Header `lexy/dsl/ascii.hpp`

Char class rules for matching the ASCII char classes.

ASCII char classes

lexy/dsl/ascii.hpp

namespace lexy::dsl
{
    namespace ascii
    {
        constexpr char-class-rule auto control;

        constexpr char-class-rule auto blank;
        constexpr char-class-rule auto newline;
        constexpr char-class-rule auto other_space;
        constexpr char-class-rule auto space;

        constexpr char-class-rule auto digit;

        constexpr char-class-rule auto lower;
        constexpr char-class-rule auto upper;
        constexpr char-class-rule auto alpha;
        constexpr char-class-rule auto alpha_underscore;

        constexpr char-class-rule auto alpha_digit;
        constexpr char-class-rule auto alnum = alpha_digit;

        constexpr char-class-rule auto word;
        constexpr char-class-rule auto alpha_digit_underscore = word;

        constexpr char-class-rule auto punct;

        constexpr char-class-rule auto graph;
        constexpr char-class-rule auto print;

        constexpr char-class-rule auto character;
    }
}

These char class rules match one ASCII character from a char class, as specified in the table below.

The char classes

Token Rule	Char Class	`<cctype>` function (C locale)
`control`	`0x00-0x1F`, `\x7F`	`std::iscntrl()`
`blank`	`' '` (space) or `'\t'`	`std::isblank()`
`newline`	`'\n'` or `'\r'`	n/a
`other_space`	`'\f'` or `'\v'`	n/a
`space`	`blank`, `newline`, `other_space`	`std::isspace()`
`digit`	`0123456789`	`std::isdigit()`
`lower`	`abcdefghijklmnopqrstuvwxyz`	`std::islower()`
`upper`	`ABCDEFGHIJKLMNOPQRSTUVWXYZ`	`std::isupper()`
`alpha`	`lower`, `upper`	`std::isalpha()`
`alpha_underscore`	`lower`, `upper`, `'_'`	n/a
`alpha_digit`	`lower`, `upper`, `digit`	`std::isalnum()`
`word`	`lower`, `upper`, `digit`, `'_'`	n/a
`punct`	`` !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ ``	`std::ispunct()`
`graph`	`alpha_digit`, `punct`	`std::isgraph()`
`print`	`alpha_digit`, `punct`, `' '` (space)	`std::isprint()`
`character`	any ASCII character	n/a
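
The table can be cross-checked against `<cctype>` with a small standalone sketch. It does not use lexy and is not how lexy implements these rules. The `ascii_model` namespace and every function name in it are hypothetical, and the code assumes an ASCII execution character set. Each predicate restates one table row, and `agrees_with_cctype()` compares the rows that have a `<cctype>` counterpart over all 128 ASCII code points in the C locale.

```cpp
#include <cctype>
#include <clocale>

// Plain C++ model of the char class table. NOT lexy's implementation;
// all names are illustrative and an ASCII execution charset is assumed.
namespace ascii_model
{
constexpr bool is_control(int c) { return (0x00 <= c && c <= 0x1F) || c == 0x7F; }
constexpr bool is_blank(int c) { return c == ' ' || c == '\t'; }
constexpr bool is_newline(int c) { return c == '\n' || c == '\r'; }
constexpr bool is_other_space(int c) { return c == '\f' || c == '\v'; }
constexpr bool is_space(int c) { return is_blank(c) || is_newline(c) || is_other_space(c); }
constexpr bool is_digit(int c) { return '0' <= c && c <= '9'; }
constexpr bool is_lower(int c) { return 'a' <= c && c <= 'z'; }
constexpr bool is_upper(int c) { return 'A' <= c && c <= 'Z'; }
constexpr bool is_alpha(int c) { return is_lower(c) || is_upper(c); }
constexpr bool is_alpha_underscore(int c) { return is_alpha(c) || c == '_'; }
constexpr bool is_alpha_digit(int c) { return is_alpha(c) || is_digit(c); }
constexpr bool is_word(int c) { return is_alpha_digit(c) || c == '_'; }

// Membership in the literal punctuation list from the table.
constexpr bool is_punct(int c)
{
    for (const char* p = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"; *p != '\0'; ++p)
        if (c == *p)
            return true;
    return false;
}

constexpr bool is_graph(int c) { return is_alpha_digit(c) || is_punct(c); }
constexpr bool is_print(int c) { return is_graph(c) || c == ' '; }
constexpr bool is_character(int c) { return 0x00 <= c && c <= 0x7F; }

// True if every modeled class agrees with its <cctype> counterpart
// for all ASCII code points in the "C" locale.
inline bool agrees_with_cctype()
{
    std::setlocale(LC_ALL, "C");
    for (int c = 0x00; c <= 0x7F; ++c)
    {
        const bool ok = is_control(c) == (std::iscntrl(c) != 0)
                        && is_blank(c) == (std::isblank(c) != 0)
                        && is_space(c) == (std::isspace(c) != 0)
                        && is_digit(c) == (std::isdigit(c) != 0)
                        && is_lower(c) == (std::islower(c) != 0)
                        && is_upper(c) == (std::isupper(c) != 0)
                        && is_alpha(c) == (std::isalpha(c) != 0)
                        && is_alpha_digit(c) == (std::isalnum(c) != 0)
                        && is_punct(c) == (std::ispunct(c) != 0)
                        && is_graph(c) == (std::isgraph(c) != 0)
                        && is_print(c) == (std::isprint(c) != 0);
        if (!ok)
            return false;
    }
    return true;
}
} // namespace ascii_model
```

The rows marked n/a (`newline`, `other_space`, `alpha_underscore`, `word`, `character`) have no `<cctype>` counterpart, so the comparison leaves them out.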

Example 1. A C-like identifier

// Assumes `namespace dsl = lexy::dsl;` is in scope.
struct production
{
    static constexpr auto rule = [] {
        auto head = dsl::ascii::alpha_underscore;
        auto tail = dsl::ascii::alpha_digit_underscore;
        return dsl::identifier(head, tail);
    }();
};

Example 2. Allow ASCII whitespace between tokens

// Assumes `namespace dsl = lexy::dsl;` is in scope.
struct production
{
    static constexpr auto whitespace = dsl::ascii::space;
    static constexpr auto rule                  //
        = LEXY_LIT("Hello") + LEXY_LIT("World") //
          + dsl::exclamation_mark + dsl::eof;
};

Note	The only difference between `lexy::dsl::ascii::digit` and `lexy::dsl::digit<lexy::dsl::decimal>` is the name of the character class in the error.

Caution

Differentiate between lexy::dsl::ascii::newline, which matches \r or \n, and lexy::dsl::newline, which matches \r\n or \n!
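
The difference can be modeled in plain C++ without lexy. The two helpers below are hypothetical and only restate the matching behavior described in this caution: each returns how many characters it would consume at the start of the input, or 0 if it does not match.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical model of lexy::dsl::ascii::newline:
// exactly one character, either '\n' or '\r'.
inline std::size_t match_ascii_newline(std::string_view input)
{
    return (!input.empty() && (input[0] == '\n' || input[0] == '\r')) ? 1 : 0;
}

// Hypothetical model of lexy::dsl::newline:
// the sequence "\r\n" or the single character '\n'; a lone '\r' does not match.
inline std::size_t match_newline(std::string_view input)
{
    if (input.substr(0, 2) == "\r\n")
        return 2;
    if (input.substr(0, 1) == "\n")
        return 1;
    return 0;
}
```

On the input `"\r\n"` the first model consumes only the `\r`, while the second consumes both characters. On a lone `"\r"` the first matches and the second does not.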

Caution

As token rules, they also skip whitespace immediately following the matched character when automatic whitespace skipping is enabled. As such, these rules are best used in contexts where automatic whitespace skipping is disabled. They can safely be used as part of the whitespace definition.

Tip	The equivalent of `std::isxdigit()` is `lexy::dsl::digit<lexy::dsl::hex>`.

Tip	Use `lexy::dsl::unicode` for the equivalent Unicode character classes.

Char class rule `lexy::dsl::ascii::one_of`

lexy/dsl/ascii.hpp

namespace lexy::dsl::ascii
{
    template <auto Str>
    constexpr char-class-rule auto one_of;
}

#define LEXY_ASCII_ONE_OF(Str) lexy::dsl::ascii::one_of<Str>

one_of is a char class rule that matches one of the specified ASCII characters. Its name is Str.

The macro LEXY_ASCII_ONE_OF(Str) is equivalent to one_of<Str>, except that it also works on older compilers that do not support C++20’s extended NTTPs. Use this instead of one_of<Str> if you need to support them.

Example 3. Match the name of a musical note.

struct production
{
    static constexpr auto rule = LEXY_ASCII_ONE_OF("CDEFGAB");
};

Note	It is impossible to match the null character using `one_of`.
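
As a rough mental model, `one_of` behaves like a membership test against a fixed set of ASCII characters. The sketch below is plain C++, not lexy's implementation. `matches_one_of` and `note_names` are hypothetical names chosen for illustration.

```cpp
#include <string_view>

// Hypothetical model of "match one of the specified ASCII characters".
// Not lexy's implementation; it only illustrates the membership test.
inline bool matches_one_of(std::string_view set, char c)
{
    const bool is_ascii = static_cast<unsigned char>(c) <= 0x7F;
    return is_ascii && set.find(c) != std::string_view::npos;
}

// Illustrative character set: the seven note letters.
inline constexpr std::string_view note_names = "CDEFGAB";
```

Under this model, `matches_one_of(note_names, 'A')` is true, while `'H'` and lowercase `'c'` are rejected because they are not in the set.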

See also

Char class rule `lexy::dsl::ascii::one_of`