JSON Validation Benchmark
This benchmark measures the time it takes to validate JSON, i.e. to check whether it is well-formed. Validation was chosen over parsing because parsing speed also depends on the data structure used to store the result. Implementing an efficient JSON container is out of scope for lexy, so it would be at a disadvantage compared to the specialized JSON libraries.
The average validation times for each input are shown in the boxplots below. Lower values are better.
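To make "well-formed" concrete, here is a vastly simplified, hand-written sketch of such a check. It only verifies that braces and brackets nest correctly and that strings are terminated; the benchmarked libraries validate the full JSON grammar. This code is purely illustrative and is not taken from any of the implementations below.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Rough well-formedness check: bracket nesting and string termination only.
// A real JSON validator also checks numbers, literals, commas, colons, etc.
bool roughly_well_formed(std::string_view json)
{
    std::vector<char> stack;
    for (std::size_t i = 0; i < json.size(); ++i)
    {
        char c = json[i];
        if (c == '"')
        {
            // Skip over the string literal, honoring escape sequences.
            for (++i; i < json.size() && json[i] != '"'; ++i)
                if (json[i] == '\\')
                    ++i;
            if (i >= json.size())
                return false; // unterminated string
        }
        else if (c == '{' || c == '[')
            stack.push_back(c);
        else if (c == '}' || c == ']')
        {
            // A closer must match the most recently opened bracket.
            if (stack.empty() || (c == '}') != (stack.back() == '{'))
                return false;
            stack.pop_back();
        }
    }
    return stack.empty();
}
```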
The implementations
baseline
This simply sums all input characters of the JSON document without performing any actual validation.
lexy
A JSON validator using the lexy grammar from the example. It uses the regular
lexy::buffer
as input, enabling SWAR and other optimizations. For maximum performance, this is the recommended input.
lexy (no SWAR)
Same as above, but it uses a special input where SWAR optimization has been manually disabled.
lexy (no buffer)
Same as above, but it uses
lexy::string_input
as the input. This is a stand-in for a generic non-buffer input where no input-specific optimizations are possible.
pegtl
A JSON validator using the PEGTL JSON grammar.
nlohmann/json
A JSON validator using JSON for Modern C++ implemented by
nlohmann::json::accept()
.
rapidjson
A JSON validator using rapidjson implemented using a SAX parser with the
rapidjson::BaseReaderHandler
.
Boost.JSON
A JSON validator using Boost.JSON implemented using a custom parse handler.
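The baseline listed above can be sketched roughly as follows. This is a hand-written stand-in, not the benchmark's actual code: it only touches every byte of the document, so it bounds how fast any validator could possibly run over the same input.

```cpp
#include <cstddef>
#include <string_view>

// Baseline sketch: sum every input character without validating anything.
// The sum is returned so the compiler cannot optimize the loop away.
std::size_t baseline(std::string_view json)
{
    std::size_t sum = 0;
    for (char c : json)
        sum += static_cast<unsigned char>(c);
    return sum;
}
```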
The inputs
canada.json
Contains lots of 2-element arrays holding floating-point coordinate pairs. Taken from https://github.com/miloyip/nativejson-benchmark.
citm_catalog.json
Big JSON file with some variety. Taken from https://github.com/miloyip/nativejson-benchmark.
twitter.json
Some data from Twitter’s API. Taken from https://github.com/miloyip/nativejson-benchmark.
The methodology
The input data is read using lexy::read_file().
The resulting buffer is then passed to the various implementations using their memory inputs.
Benchmarking is done by nanobench on a 2020 Mac Mini with an M1 processor.
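The measurement idea can be approximated like this (a simplified sketch only; nanobench itself does warm-up runs, clock-overhead compensation, and more careful statistics):

```cpp
#include <chrono>
#include <string_view>

// Sketch of the measurement loop: run the validator many times over the
// same in-memory input and report the average wall-clock time per run.
template <typename Validator>
double average_ns(Validator validate, std::string_view input, int runs)
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i)
        validate(input);
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(end - start).count() / runs;
}
```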