Header lexy/dsl/code_point.hpp
Char class rule lexy::dsl::code_point
lexy/dsl/code_point.hpp
namespace lexy::dsl
{
    class code-point-dsl // models char-class-rule
    {
    public:
        template <typename Predicate>
        constexpr token-rule auto if_() const;

        template <char32_t ... CPs>
        constexpr token-rule auto set() const;
        template <char32_t Low, char32_t High>
        constexpr token-rule auto range() const;

        constexpr token-rule auto ascii() const;
        constexpr token-rule auto bmp() const;
        constexpr token-rule auto noncharacter() const;

        template <lexy::code_point::general_category_t Category>
        constexpr token-rule auto general_category() const;
        template <lexy::code_point::gc-group CategoryGroup>
        constexpr token-rule auto general_category() const;
    };

    constexpr code-point-dsl auto code_point;
}
code_point is a char class rule that matches a specified set of Unicode code points.
* code_point matches an arbitrary scalar code point. Its char class name is code-point.
* code_point.if_() matches all code points for which the predicate returns true. The predicate must have a constexpr bool operator()(lexy::code_point). Its char class name is the type name of Predicate.
* code_point.set() matches the specified code points. Its name is code-point.set.
* code_point.range() matches the code points in the range [Low, High] (both sides inclusive). Its name is code-point.range.
* code_point.ascii() matches all ASCII code points. Its name is code-point.ASCII.
* code_point.bmp() matches all code points in the Basic Multilingual Plane (BMP). Its name is code-point.BMP.
* code_point.noncharacter() matches all non-character code points. Its name is code-point.noncharacter.
* code_point.general_category() matches all code points with the specified category or category group. Its name is the name of the category. This requires the Unicode database, except for Cc, Cs, and Co, which are fixed.
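To make the membership tests concrete, the following is a minimal sketch of what the set, range, ASCII, BMP, and noncharacter rules accept, written as plain constexpr functions. It does not use lexy and is not lexy's implementation; the function names (in_range, in_set, is_ascii, is_bmp, is_noncharacter) are hypothetical stand-ins, the boundaries come from the Unicode standard, and the input is assumed to already be a scalar code point. It should compile as C++14 or later.

```cpp
// Hypothetical stand-ins for the membership tests; not part of lexy.

// Same idea as code_point.range<Low, High>(): both bounds are inclusive.
template <char32_t Low, char32_t High>
constexpr bool in_range(char32_t cp)
{
    return Low <= cp && cp <= High;
}

// Same idea as code_point.set<CPs...>(): only the listed code points match.
template <char32_t... CPs>
constexpr bool in_set(char32_t cp)
{
    static_assert(sizeof...(CPs) > 0, "this sketch expects at least one code point");
    const char32_t members[] = {CPs...};
    for (char32_t member : members)
        if (member == cp)
            return true;
    return false;
}

// ASCII is U+0000 to U+007F.
constexpr bool is_ascii(char32_t cp)
{
    return cp <= 0x7F;
}

// The Basic Multilingual Plane is U+0000 to U+FFFF.
constexpr bool is_bmp(char32_t cp)
{
    return cp <= 0xFFFF;
}

// Noncharacters are U+FDD0 to U+FDEF, plus the last two code points
// of every plane (U+xxFFFE and U+xxFFFF).
constexpr bool is_noncharacter(char32_t cp)
{
    return (0xFDD0 <= cp && cp <= 0xFDEF)
        || (cp <= 0x10FFFF && (cp & 0xFFFE) == 0xFFFE);
}
```

For instance, in_range<U'a', U'z'>(U'z') holds because the upper bound is inclusive, and is_noncharacter(0x1FFFF) holds because U+1FFFF is the last code point of plane 1.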
Caution | As a token rule, it matches whitespace immediately following the code point. As such, the rule is best used in contexts where automatic whitespace skipping is disabled. |
Note | See lexy::dsl::unicode for common predefined predicates. |
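When no predefined predicate fits, a custom one can be passed to code_point.if_(). The sketch below shows only the shape such a predicate needs, a constexpr bool operator() taking the code point. So that it compiles without lexy, it uses a hypothetical stand-in type code_point_like in place of lexy::code_point; the predicate name in_greek_block is likewise an illustration, and the range is the Greek and Coptic block (U+0370 to U+03FF).

```cpp
// Hypothetical stand-in for lexy::code_point, so the sketch needs no lexy headers.
// Only value() is used by the predicate.
struct code_point_like
{
    char32_t cp;

    constexpr char32_t value() const
    {
        return cp;
    }
};

// Predicate in the shape if_() asks for: a constexpr bool operator()
// that receives the code point and decides whether it matches.
struct in_greek_block
{
    constexpr bool operator()(code_point_like cp) const
    {
        // Greek and Coptic block: U+0370 to U+03FF.
        return 0x0370 <= cp.value() && cp.value() <= 0x03FF;
    }
};

// With lexy, the parameter type would be lexy::code_point and the rule
// would be spelled lexy::dsl::code_point.if_<in_greek_block>().
```

Because the char class name is taken from the type name of the predicate, giving the struct a descriptive name is worthwhile.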
Note | .ascii(), .bmp(), and .noncharacter() correspond to the member functions of the same name of lexy::code_point.
The other classification functions don’t have rules:
* cp.is_valid() and cp.is_scalar() are always true for a matched code point; cp.is_surrogate() is never true.
* cp.is_control() is general category Cc.
* cp.is_private_use() is general category Co. |
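The relationship between these classifications can be sketched with plain constexpr functions. This is not lexy's code: the free functions below are hypothetical stand-ins named after the lexy::code_point member functions, and the ranges are the fixed ones from the Unicode standard. They show why a rule that only matches scalar code points never sees a surrogate, and why Cc, Cs, and Co need no Unicode database.

```cpp
// Hypothetical free-function stand-ins; not part of lexy.

// Valid code points are U+0000 to U+10FFFF.
constexpr bool is_valid(char32_t cp)
{
    return cp <= 0x10FFFF;
}

// General category Cs: the surrogates U+D800 to U+DFFF.
constexpr bool is_surrogate(char32_t cp)
{
    return 0xD800 <= cp && cp <= 0xDFFF;
}

// A scalar value is a valid code point that is not a surrogate,
// so is_scalar implies is_valid and excludes is_surrogate.
constexpr bool is_scalar(char32_t cp)
{
    return is_valid(cp) && !is_surrogate(cp);
}

// General category Cc: the C0 controls, DEL, and the C1 controls.
constexpr bool is_control(char32_t cp)
{
    return cp <= 0x1F || (0x7F <= cp && cp <= 0x9F);
}

// General category Co: the BMP private use area and planes 15 and 16,
// without the two noncharacters at the end of each plane.
constexpr bool is_private_use(char32_t cp)
{
    return (0xE000 <= cp && cp <= 0xF8FF)
        || (0xF0000 <= cp && cp <= 0xFFFFD)
        || (0x100000 <= cp && cp <= 0x10FFFD);
}
```

All five ranges are fixed by the standard rather than revised between Unicode versions, which is consistent with these categories not needing database lookups.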