On May 10, 2013, at 1:45 PM, "Vicente J. Botet Escriba"
When I add validation on the source date format I get
clang 3.2
* empty field->serial ~6.3ns.
* field->serial ~13.4ns.
* empty serial->field ~1ns.
* serial->field ~17.9ns.
gcc-4.8.0
* empty field->serial ~7.5ns.
* field->serial ~15.7ns.
* empty serial->field ~1ns.
* serial->field ~21.7ns.
I've been experimenting with adding validation today. I'm guessing that all of your validation is in a translation unit hidden from the testing loop. Is that correct?
I've been putting my validation in a header because I want to make it constexpr, and constexpr stuff has weak linkage. The motivation for making it constexpr is that for any part of the validation that involves compile-time information, the validation happens at compile time.
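The compile-time versus run-time behaviour described above can be sketched in isolation. This is a minimal illustration, not the code under test: `check_day` and `rejects` are hypothetical names, and `bad_date` is assumed to be an ordinary exception type since its definition is not shown here.

```cpp
#include <stdexcept>

// Hypothetical exception type; bad_date is used but not defined in this message.
struct bad_date : std::runtime_error
{
    bad_date() : std::runtime_error("bad date") {}
};

// C++11-style constexpr: a single return statement, with the failure
// path written as a throw inside the conditional operator.
constexpr int check_day(int d)
{
    return 1 <= d ? d : throw bad_date{};
}

// Argument known at compile time: the check is evaluated by the compiler.
static_assert(check_day(5) == 5, "5 is a valid day");

// constexpr int bad = check_day(0);  // would not compile: the throw is
//                                    // reached during constant evaluation

// Argument known only at run time: the same function throws instead.
// (Hypothetical helper, used only to observe the exception.)
inline bool rejects(int d)
{
    try { check_day(d); return false; }
    catch (bad_date const&) { return true; }
}
```

With a constant argument the invalid case becomes a compile error, and with a run-time argument it becomes an exception, so one function serves both uses.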
And my first experiments today involve putting some of the validation back into the unit specifiers, in contrast to the direction I was heading earlier.
Specifically:
// invariants:
// 1 <= d_
class day
{
    int d_;
    static
    constexpr
    int
    __attribute__((__always_inline__))
    check_invariants(int d)
    {
        return 1 <= d ? d : throw bad_date{};
    }
public:
    constexpr
    explicit
    __attribute__((__always_inline__))
    day(int d)
        : d_(check_invariants(d))
        {}
    constexpr
    __attribute__((__always_inline__))
    operator int() const
        {return d_;}
};
// invariants:
// 1 <= m_ && m_ <= 12
class month
{
    int m_;
    static
    constexpr
    int
    __attribute__((__always_inline__))
    check_invariants(int m)
    {
        return 1 <= m && m <= 12 ? m : throw bad_date{};
    }
public:
    constexpr
    explicit
    __attribute__((__always_inline__))
    month(int m)
        : m_(check_invariants(m))
        {}
    constexpr
    __attribute__((__always_inline__))
    operator int() const
        {return m_;}
};
// invariants:
// none
class year
{
    int y_;
public:
    constexpr
    explicit
    __attribute__((__always_inline__))
    year(int y)
        : y_(y)
        {}
    constexpr
    __attribute__((__always_inline__))
    operator int() const
        {return y_;}
    constexpr
    bool
    __attribute__((__always_inline__))
    is_leap() const
        {return y_ % 4 == 0 && (y_ % 100 != 0 || y_ % 400 == 0);}
};
Because of a bug in clang (http://llvm.org/bugs/show_bug.cgi?id=12848) I've had to mark everything with always_inline to get the compiler to optimize it properly. But once done, it does the optimizations nicely.
Now the ymd_date (or whatever name) constructors can be carefully crafted to not re-validate information that is already known. For example, if the ymd_date constructor takes a month (not an int), then there is no need for it to re-validate the month at that point. The month is known to be valid.
I've removed what I call "range checking", which means there is no validation on year.
Here is a partial implementation of what I'm testing for ymd_date:
class ymd_date
{
    year y_;
    month m_;
    day d_;
    static
    constexpr
    day
    __attribute__((__always_inline__))
    check_invariants(year y, month m, day d)
    {
        return m != 2 ?
               (
                   d <= limit[m-1] ? d : throw bad_date{}
               ) :
               (
                   y.is_leap() ? (d <= 29 ? d : throw bad_date{}) :
                                 (d <= 28 ? d : throw bad_date{})
               );
    }
    static
    constexpr
    day
    __attribute__((__always_inline__))
    check_invariants(year y, month_day md)
    {
        return md.month() != 2 || md.day() <= 28 || y.is_leap() ?
               md.day() : throw bad_date{};
    }
public:
    constexpr
    __attribute__((__always_inline__))
    ymd_date(year y, month m, day d)
        : y_(y),
          m_(m),
          d_(check_invariants(y_, m_, d))
        {}
The class holds objects of type year, month and day instead of 3 ints (or whatever) so that the invariants of the individual components are not compromised when storing into, or returning from, the ymd_date (i.e. they don't have to undergo unnecessary re-validation).
The ymd_date validator taking year, month and day doesn't have to validate the month, it is known to be valid. It doesn't have to validate the year, there is nothing to validate. It only has to validate the day. And it doesn't need to check that the day >= 1, the day constructor already took care of that.
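The layering described above can be condensed into a self-contained sketch. It is an illustration under stated assumptions rather than the code being benchmarked: `max_day` is a hypothetical stand-in for the `limit` table, `get_day` and `is_rejected` are hypothetical names, `bad_date` is assumed to be a plain exception type, and the always_inline attributes are left out for brevity.

```cpp
#include <stdexcept>

struct bad_date : std::runtime_error  // hypothetical; not defined in this message
{
    bad_date() : std::runtime_error("bad date") {}
};

class day
{
    int d_;
public:
    constexpr explicit day(int d) : d_(1 <= d ? d : throw bad_date{}) {}
    constexpr operator int() const { return d_; }
};

class month
{
    int m_;
public:
    constexpr explicit month(int m)
        : m_(1 <= m && m <= 12 ? m : throw bad_date{}) {}
    constexpr operator int() const { return m_; }
};

class year
{
    int y_;
public:
    constexpr explicit year(int y) : y_(y) {}
    constexpr operator int() const { return y_; }
    constexpr bool is_leap() const
        { return y_ % 4 == 0 && (y_ % 100 != 0 || y_ % 400 == 0); }
};

// Hypothetical stand-in for the limit[] table: length of every month
// except February, which the validator handles separately.
constexpr int max_day(int m)
{
    return m == 4 || m == 6 || m == 9 || m == 11 ? 30 : 31;
}

class ymd_date
{
    year y_;
    month m_;
    day d_;

    // Only the upper bound on d is checked here. The month range and
    // d >= 1 were already enforced by the month and day constructors.
    static constexpr day check_invariants(year y, month m, day d)
    {
        return d <= (m != 2 ? max_day(m) : (y.is_leap() ? 29 : 28))
                   ? d : throw bad_date{};
    }
public:
    constexpr ymd_date(year y, month m, day d)
        : y_(y), m_(m), d_(check_invariants(y, m, d)) {}
    constexpr day get_day() const { return d_; }  // hypothetical accessor name
};

// All three components are constants: validated entirely at compile time.
constexpr ymd_date leap_day{year(2012), month(2), day(29)};
static_assert(leap_day.get_day() == 29, "2012-02-29 is a valid date");

// Run-time values: an invalid combination throws bad_date.
// (Hypothetical helper, used only to observe the exception.)
inline bool is_rejected(int y, int m, int d)
{
    try { ymd_date date{year(y), month(m), day(d)}; (void)date; return false; }
    catch (bad_date const&) { return true; }
}
```

In this sketch `is_rejected(2013, 2, 29)` reports a rejection from the ymd_date validator, while `is_rejected(2013, 13, 1)` is rejected earlier, in the month constructor, before ymd_date is ever involved.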
My experiments looking at the assembly generated at -O3 show that if either the month or the day is a compile-time constant, the validation code is reduced. For example, it is common for the day to be the first of the month, or perhaps the 5th, or any other fixed number <= 28. When this happens, and I construct a:
ymd_date ymd(year(y), month(m), day(1));
I can see in the generated assembly that everything disappears except ensuring that 1 <= m <= 12. Similarly, when only the month is compile-time information, I'm seeing that the constraint checking on d is simplified, especially when the month is not February.
But even when all three unit specifiers are run time information, when I run this through a field->serial conversion:
const int Ymin = 1900;
const int Ymax = 2100;
volatile int k;
int count = 0;
auto t0 = std::chrono::high_resolution_clock::now();
for (int y = Ymin; y <= Ymax; ++y)
{
    for (int m = 1; m <= 12; ++m)
    {
        int last = days_in_month(y, m);
        for (int d = 1; d <= last; ++d)
        {
            ymd_date ymd{year(y), month(m), day(d)};
            k = days_from(ymd.year(), ymd.month(), ymd.day());
            ++count;
        }
    }
}
auto t1 = std::chrono::high_resolution_clock::now();
typedef std::chrono::duration