On Mon, Jul 10, 2017 at 2:06 PM, Artyom Beilis via Boost
> It looks for me you are running into premature optimization.
Hmm, no, I don't think so. Correct UTF-8 validation is a bottleneck for every WebSocket program. I have used a profiler, so this comes from measurement, not opinion.

It looks like you are using source inputs that contain high-ASCII characters. In this case, Beast switches to the "slow" algorithm, which is similar to what Locale does. Try using an input file that consists only of low-ASCII characters.

Your results are quite different from mine, even with std::memcpy:

beast: 1,124,515,969 char/s
beast: 1,336,074,093 char/s
beast: 1,494,183,562 char/s
beast: 1,506,365,044 char/s
beast: 1,533,419,187 char/s
locale: 75,457,683 char/s
locale: 81,358,140 char/s
locale: 80,413,657 char/s
locale: 81,635,114 char/s
locale: 67,234,619 char/s

Ubuntu VM:

beast.benchmarks.utf8_checker
beast: 2894806032 char/s
beast: 2874126708 char/s
beast: 2890616214 char/s
beast: 2017890885 char/s
beast: 2785087614 char/s
locale: 574731777 char/s
locale: 571439694 char/s
locale: 242245477 char/s
locale: 511534158 char/s
locale: 574121386 char/s

Travis:
https://travis-ci.org/vinniefalco/Beast/jobs/252155928#L1334

beast: 1155900653 char/s
beast: 1146058480 char/s
beast: 1162309551 char/s
beast: 1151093660 char/s
beast: 1159334387 char/s
locale: 218684840 char/s
locale: 220357048 char/s
locale: 208476005 char/s
locale: 224853783 char/s
locale: 209990002 char/s

On every machine I try, Locale is slower than Beast on all-low-ASCII inputs by roughly a factor of 5 or more.

Code:
https://github.com/vinniefalco/Beast/blob/da7946b6e5f8bda225ff122984e945b9e0...
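
To illustrate why all-low-ASCII input can be validated so much faster, here is a minimal sketch of the general fast-path/slow-path idea. It is not the Beast implementation linked above. The names is_valid_utf8 and validate_utf8_slow are hypothetical, and the sketch only uses the fast path for the leading ASCII run of the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Slow path (hypothetical helper): validate one UTF-8 sequence at a time.
// Strict: rejects overlong forms, surrogates, and code points above U+10FFFF.
inline bool validate_utf8_slow(const unsigned char* p, const unsigned char* end)
{
    while (p < end) {
        const unsigned char c = *p;
        if (c < 0x80) { ++p; continue; }        // plain ASCII byte

        std::size_t need;                        // continuation bytes expected
        unsigned char lo = 0x80, hi = 0xBF;      // allowed range of the first one
        if (c >= 0xC2 && c <= 0xDF)      { need = 1; }
        else if (c == 0xE0)              { need = 2; lo = 0xA0; } // no overlongs
        else if (c == 0xED)              { need = 2; hi = 0x9F; } // no surrogates
        else if (c >= 0xE1 && c <= 0xEF) { need = 2; }
        else if (c == 0xF0)              { need = 3; lo = 0x90; } // no overlongs
        else if (c == 0xF4)              { need = 3; hi = 0x8F; } // <= U+10FFFF
        else if (c >= 0xF1 && c <= 0xF3) { need = 3; }
        else return false;                       // 0x80..0xC1 or 0xF5..0xFF

        if (static_cast<std::size_t>(end - p) <= need) return false; // truncated
        if (p[1] < lo || p[1] > hi) return false;
        for (std::size_t i = 2; i <= need; ++i)
            if ((p[i] & 0xC0) != 0x80) return false;
        p += need + 1;
    }
    return true;
}

// Fast path: while no byte in an 8-byte word has its high bit set, the whole
// word is low-ASCII and needs no further checks.
inline bool is_valid_utf8(const unsigned char* p, std::size_t n)
{
    const unsigned char* end = p + n;
    while (static_cast<std::size_t>(end - p) >= 8) {
        std::uint64_t word;
        std::memcpy(&word, p, sizeof word);      // avoids unaligned access
        if (word & 0x8080808080808080ULL) break; // found a high-ASCII byte
        p += 8;
    }
    return validate_utf8_slow(p, end);           // remainder, byte by byte
}

inline bool is_valid_utf8(const std::string& s)
{
    return is_valid_utf8(reinterpret_cast<const unsigned char*>(s.data()),
                         s.size());
}
```

With an input that is entirely low-ASCII, the loop in is_valid_utf8 does one load and one mask test per 8 bytes, which is the case the benchmarks above exercise. As soon as a byte with the high bit set appears, this sketch falls back to the per-sequence checks for the rest of the buffer.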