[boost.numeric] Poor Performance of numeric_cast

Tang Jiang Jun

15 Oct 2012

8:16 a.m.

Hi,

I recently ran a performance test of numeric_cast and found the result unexpectedly bad, although the documentation says there is no overhead when no overflow happens. Could somebody please help me verify this test? If the result is correct, I doubt whether I should use numeric_cast in production code.

Here is my test code and result.

    #include <boost/numeric/conversion/cast.hpp>
    #include <boost/format.hpp>
    #include <boost/cstdint.hpp>
    #include <boost/chrono.hpp>
    #include <iostream>

    using namespace std;
    using namespace boost;
    using namespace boost::numeric;
    using namespace boost::chrono;

    int main()
    {
        const static int32_t COUNT = 1000000;
        high_resolution_clock::time_point start;

        // Native int32_t -> int16_t conversion
        start = high_resolution_clock::now();
        for( int32_t n = 0; n < COUNT; ++n )
        {
            int32_t i32 = 123;
            int16_t i16 = i32;
        }
        cout << format("Native Integer Cast: %1%\n") % ( ( high_resolution_clock::now() - start ) / COUNT );

        // numeric_cast int32_t -> int16_t conversion
        start = high_resolution_clock::now();
        for( int32_t n = 0; n < COUNT; ++n )
        {
            try
            {
                int32_t i32 = 100;
                int16_t i16 = numeric_cast< int16_t >( i32 );
            }
            catch( const bad_numeric_cast& e )
            {
                cout << e.what() << endl;
            }
        }
        cout << format("Boost Integer Cast: %1%\n") % ( ( high_resolution_clock::now() - start ) / COUNT );

        // Native float -> int32_t conversion
        start = high_resolution_clock::now();
        for( int32_t n = 0; n < COUNT; ++n )
        {
            float f = 100.0f;
            int32_t i = static_cast< int32_t >( f );
        }
        cout << format("Native Floating-Integer Cast: %1%\n") % ( ( high_resolution_clock::now() - start ) / COUNT );

        // numeric_cast float -> int32_t conversion
        start = high_resolution_clock::now();
        for( int32_t n = 0; n < COUNT; ++n )
        {
            try
            {
                float f = 123.0f;
                int32_t i = numeric_cast< int32_t >( f );
            }
            catch( const bad_numeric_cast& e )
            {
                cout << e.what() << endl;
            }
        }
        cout << format("Boost Floating-Integer Cast: %1%\n") % ( ( high_resolution_clock::now() - start ) / COUNT );

        // Native int32_t -> float conversion
        start = high_resolution_clock::now();
        for( int32_t n = 0; n < COUNT; ++n )
        {
            int32_t i = 132;
            float f = static_cast< float >( i );
        }
        cout << format("Native Integer-Floating Cast: %1%\n") % ( ( high_resolution_clock::now() - start ) / COUNT );

        // numeric_cast int32_t -> float conversion
        start = high_resolution_clock::now();
        for( int32_t n = 0; n < COUNT; ++n )
        {
            try
            {
                int32_t i = 128;
                float f = numeric_cast< float >( i );
            }
            catch( const bad_numeric_cast& e )
            {
                cout << e.what() << endl;
            }
        }
        cout << format("Boost Integer-Floating Cast: %1%\n") % ( ( high_resolution_clock::now() - start ) / COUNT );

        return 0;
    };

Result:

    Native Integer Cast: 3 nanoseconds
    Boost Integer Cast: 311 nanoseconds
    Native Floating-Integer Cast: 4 nanoseconds
    Boost Floating-Integer Cast: 430 nanoseconds
    Native Integer-Floating Cast: 2 nanoseconds
    Boost Integer-Floating Cast: 106 nanoseconds


Oswin Krause

15 Oct 2012

8:30 a.m.

Hi,

Your complete loop got optimized away in the native test cases. Because of the try/catch block, the compiler couldn't do this in the other cases. So you are benchmarking nothing vs. something.

Greetings,
Oswin
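A minimal sketch of the point made in this reply, not code from the thread: a benchmark loop only measures the cast if the compiler cannot prove that the result is unused. The helper name `checksum_of_narrowing_casts` and the volatile input are assumptions chosen for illustration; the sketch uses only standard C++ and no Boost.

```cpp
#include <cstdint>

// Hypothetical helper: converts every value in [0, count) and folds the
// results into a checksum, so that the conversions have an observable effect.
std::uint32_t checksum_of_narrowing_casts(std::uint32_t count) {
    // Routing the loop input through a volatile object keeps the compiler
    // from treating it as a compile-time constant.
    volatile std::int32_t input = 0;
    std::uint32_t checksum = 0;
    for (std::uint32_t n = 0; n < count; ++n) {
        input = static_cast<std::int32_t>(n);
        std::int32_t value = input;  // volatile read, cannot be folded away
        std::int16_t narrowed = static_cast<std::int16_t>(value);
        checksum += static_cast<std::uint32_t>(narrowed);
    }
    return checksum;
}
```

Timing a call to such a function and then printing the returned checksum keeps the conversions alive in an optimized build, whereas the native loops in the original test have no observable effect at all.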

Tang Jiang Jun

9:29 a.m.

Hi Oswin,

Sorry, I forgot to mention that I compiled it in the debug configuration in order to prevent unintended optimization. Anyway, many thanks for the reminder!

Tang

Oswin Krause

10:43 a.m.

Hi,

Never benchmark in debug mode. Moreover, never ever benchmark boost code in debug mode.

Tang Jiang Jun

16 Oct 2012

3:50 a.m.

Hi,

I modified my code so that it can run in release mode without unintended optimization, and now the performance is acceptable. However, there is definitely some runtime overhead even when no overflow happens, and the overhead takes as much extra time as the plain cast itself takes. I think this should perhaps be mentioned in the numeric_cast documentation, because if the cast is the core step of an algorithm and is executed heavily, this overhead will impact performance significantly.

The following is the modified benchmark code and the result on my computer.

CODE

    #include <boost/numeric/conversion/cast.hpp>
    #include <boost/format.hpp>
    #include <boost/cstdint.hpp>
    #include <boost/chrono.hpp>
    #include <iostream>

    using namespace std;
    using namespace boost;
    using namespace boost::numeric;
    using namespace boost::chrono;

    typedef void (*PROFILE_FUNC)( uint32_t, uint32_t& );

    // Runs one benchmark function and returns the average time per iteration.
    // Printing the sum keeps the compiler from discarding the loop.
    nanoseconds profile( PROFILE_FUNC _profileFunc, uint32_t _count )
    {
        high_resolution_clock::time_point start = high_resolution_clock::now();

        uint32_t sum = 0;
        _profileFunc( _count, sum );

        nanoseconds ns = ( high_resolution_clock::now() - start ) / _count;

        cout << sum << endl;

        return ns;
    }

    void native_integer_cast( uint32_t _count, uint32_t& _sum )
    {
        for( uint64_t n = 0; n < _count; ++n )
        {
            _sum += static_cast< uint32_t >( n );
        }
    }

    void boost_integer_cast( uint32_t _count, uint32_t& _sum )
    {
        for( uint64_t n = 0; n < _count; ++n )
        {
            try
            {
                _sum += numeric_cast< uint32_t >( n );
            }
            catch( const bad_numeric_cast& e )
            {
                cout << e.what() << endl;
            }
        }
    }

    void native_itof_cast( uint32_t _count, uint32_t& _sum )
    {
        float fsum = 0.0f;

        for( uint32_t n = 0; n < _count; ++n )
        {
            fsum += static_cast< float >( n );
        }

        _sum = static_cast< uint32_t >( fsum );
    }

    void boost_itof_cast( uint32_t _count, uint32_t& _sum )
    {
        float fsum = 0.0f;

        for( uint32_t n = 0; n < _count; ++n )
        {
            try
            {
                fsum += numeric_cast< float >( n );
            }
            catch( const bad_numeric_cast& e )
            {
                cout << e.what() << endl;
            }
        }

        _sum = numeric_cast< uint32_t >( fsum );
    }

    void native_ftoi_cast( uint32_t _count, uint32_t& _sum )
    {
        for( float f = 0.0f; f < _count; f += 1.0f )
        {
            _sum += static_cast< uint32_t >( f );
        }
    }

    void boost_ftoi_cast( uint32_t _count, uint32_t& _sum )
    {
        for( float f = 0.0f; f < _count; f += 1.0f )
        {
            try
            {
                _sum += numeric_cast< uint32_t >( f );
            }
            catch( const bad_numeric_cast& e )
            {
                cout << e.what() << endl;
            }
        }
    }

    int main()
    {
        const static int32_t COUNT = 10000;

        nanoseconds nsNativeIntegerCast = profile( native_integer_cast, COUNT );
        nanoseconds nsBoostIntegerCast = profile( boost_integer_cast, COUNT );
        nanoseconds nsNativeItofCast = profile( native_itof_cast, COUNT );
        nanoseconds nsBoostItofCast = profile( boost_itof_cast, COUNT );
        nanoseconds nsNativeFtoiCast = profile( native_ftoi_cast, COUNT );
        nanoseconds nsBoostFtoiCast = profile( boost_ftoi_cast, COUNT );

        cout << "Native Integer Cast: " << nsNativeIntegerCast << endl;
        cout << "Boost Integer Cast: " << nsBoostIntegerCast << endl;
        cout << "Native Integer-Floating Cast: " << nsNativeItofCast << endl;
        cout << "Boost Integer-Floating Cast: " << nsBoostItofCast << endl;
        cout << "Native Floating-Integer Cast: " << nsNativeFtoiCast << endl;
        cout << "Boost Floating-Integer Cast: " << nsBoostFtoiCast << endl;

        return 0;
    };

RESULT:

    Native Integer Cast: 1 nanosecond
    Boost Integer Cast: 4 nanoseconds
    Native Integer-Floating Cast: 3 nanoseconds
    Boost Integer-Floating Cast: 3 nanoseconds
    Native Floating-Integer Cast: 5 nanoseconds
    Boost Floating-Integer Cast: 14 nanoseconds

Regards,
Tang
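The overhead reported in this message is consistent with a range check that has to run on every narrowing conversion. The following is a rough sketch of that idea in standard C++, not Boost's actual implementation: `checked_narrow` is a hypothetical stand-in for `numeric_cast< uint32_t >` applied to a `uint64_t`, and it throws `std::range_error` instead of a Boost exception type.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for a checked uint64_t -> uint32_t conversion.
// The comparison and branch are extra work that a plain static_cast
// does not perform, even when the value is in range.
std::uint32_t checked_narrow(std::uint64_t value) {
    if (value > std::numeric_limits<std::uint32_t>::max()) {
        throw std::range_error("checked_narrow: positive overflow");
    }
    return static_cast<std::uint32_t>(value);
}
```

For a conversion such as uint32_t to float, every source value lies inside the target range, so no comparison is needed; that would be consistent with the identical 3 nanosecond figures for the two integer-to-floating cases in the result.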

Brian Budge

4:08 a.m.

Unsure, but maybe you should put the try/catch outside of the inner loop?
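Brian's suggestion can be sketched as follows, again in standard C++ with a hypothetical `checked_narrow` standing in for `numeric_cast`. One try block wraps the whole loop instead of one per iteration. The function name `sum_with_hoisted_try` and its bool return value are assumptions made for this illustration.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for numeric_cast< uint32_t >( uint64_t ).
inline std::uint32_t checked_narrow(std::uint64_t value) {
    if (value > std::numeric_limits<std::uint32_t>::max()) {
        throw std::range_error("checked_narrow: positive overflow");
    }
    return static_cast<std::uint32_t>(value);
}

// Adds the converted values first, first + 1, ..., first + count - 1 to sum.
// The try block is entered once, outside the loop. Returns false if any
// conversion failed; the loop stops at the first failure.
bool sum_with_hoisted_try(std::uint64_t first, std::uint64_t count,
                          std::uint32_t& sum) {
    try {
        for (std::uint64_t i = 0; i < count; ++i) {
            sum += checked_narrow(first + i);
        }
    } catch (const std::range_error&) {
        return false;
    }
    return true;
}
```

Note that hoisting changes behavior: the per-iteration handler in the benchmark reports a failure and continues with the next value, while this version abandons the remaining iterations.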

Tang Jiang Jun

4:45 a.m.

I've already tried to remove all the try-catch blocks, but the overhead is still there. On the other side, when casting a number, it is general to use a dedicated try-catch block to protect it. On Tue, Oct 16, 2012 at 12:08 PM, Brian Budge <brian.budge@gmail.com> wrote:

...

Unsure, but maybe you should put the try/catch outside of the inner loop?

...
Hi,

I modify my code to make it can run in release mode without unintended optimization, and now the performance is acceptable. However there definitely has some runtime overhead even no overflow happens, and the overhead takes extra time as much as the plain cast itself takes. I think this maybe should be mentioned in the numeric_cast document, because if

On Mon, Oct 15, 2012 at 8:50 PM, Tang Jiang Jun <tangjiangjun@gmail.com> wrote: the

...
cast is the core step in an algorithms and is executed heavily, this overhead will impact the performance significantly .

The following is the benchmark code after modification and the result run in my computer.

CODE

#include <boost/numeric/conversion/cast.hpp>
#include <boost/format.hpp>
#include <boost/cstdint.hpp>
#include <boost/chrono.hpp>
#include <iostream>

using namespace std;
using namespace boost;
using namespace boost::numeric;
using namespace boost::chrono;

typedef void (*PROFILE_FUNC)( uint32_t, uint32_t& );

nanoseconds profile( PROFILE_FUNC _profileFunc, uint32_t _count )
{
    high_resolution_clock::time_point start = high_resolution_clock::now();

    uint32_t sum = 0;
    _profileFunc( _count, sum );

    nanoseconds ns = ( high_resolution_clock::now() - start ) / _count;

    cout << sum << endl;

    return ns;
}

void native_integer_cast( uint32_t _count, uint32_t& _sum )
{
    for( uint64_t n = 0; n < _count; ++n )
    {
        _sum += static_cast< uint32_t >( n );
    }
}

void boost_integer_cast( uint32_t _count, uint32_t& _sum )
{
    for( uint64_t n = 0; n < _count; ++n )
    {
        try
        {
            _sum += numeric_cast< uint32_t >( n );
        }
        catch( const bad_numeric_cast& e )
        {
            cout << e.what() << endl;
        }
    }
}

void native_itof_cast( uint32_t _count, uint32_t& _sum )
{
    float fsum = 0.0f;

    for( uint32_t n = 0; n < _count; ++n )
    {
        fsum += static_cast< float >( n );
    }

    _sum = static_cast< uint32_t >( fsum );
}

void boost_itof_cast( uint32_t _count, uint32_t& _sum )
{
    float fsum = 0.0f;

    for( uint32_t n = 0; n < _count; ++n )
    {
        try
        {
            fsum += numeric_cast< float >( n );
        }
        catch( const bad_numeric_cast& e )
        {
            cout << e.what() << endl;
        }
    }

    _sum = numeric_cast< uint32_t >( fsum );
}

void native_ftoi_cast( uint32_t _count, uint32_t& _sum )
{
    for( float f = 0.0f; f < _count; f += 1.0f )
    {
        _sum += static_cast< uint32_t >( f );
    }
}

void boost_ftoi_cast( uint32_t _count, uint32_t& _sum )
{
    for( float f = 0.0f; f < _count; f += 1.0f )
    {
        try
        {
            _sum += numeric_cast< uint32_t >( f );
        }
        catch( const bad_numeric_cast& e )
        {
            cout << e.what() << endl;
        }
    }
}

int main()
{
    const static int32_t COUNT = 10000;

    nanoseconds nsNativeIntegerCast = profile( native_integer_cast, COUNT );
    nanoseconds nsBoostIntegerCast = profile( boost_integer_cast, COUNT );
    nanoseconds nsNativeItofCast = profile( native_itof_cast, COUNT );
    nanoseconds nsBoostItofCast = profile( boost_itof_cast, COUNT );
    nanoseconds nsNativeFtoiCast = profile( native_ftoi_cast, COUNT );
    nanoseconds nsBoostFtoiCast = profile( boost_ftoi_cast, COUNT );

    cout << "Native Integer Cast: " << nsNativeIntegerCast << endl;
    cout << "Boost Integer Cast: " << nsBoostIntegerCast << endl;
    cout << "Native Integer-Floating Cast: " << nsNativeItofCast << endl;
    cout << "Boost Integer-Floating Cast: " << nsBoostItofCast << endl;
    cout << "Native Floating-Integer Cast: " << nsNativeFtoiCast << endl;
    cout << "Boost Floating-Integer Cast: " << nsBoostFtoiCast << endl;

    return 0;
}

RESULT:

Native Integer Cast: 1 nanosecond
Boost Integer Cast: 4 nanoseconds
Native Integer-Floating Cast: 3 nanoseconds
Boost Integer-Floating Cast: 3 nanoseconds
Native Floating-Integer Cast: 5 nanoseconds
Boost Floating-Integer Cast: 14 nanoseconds

Regards, Tang
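To make the source of such overhead concrete, here is a minimal sketch of what any range-checked conversion has to do before the plain cast. It is an illustration only and not Boost's implementation: `checked_ftoi` and `checked_narrow` are hypothetical helper names, and `std::range_error` stands in for `bad_numeric_cast`. The float-to-integer case needs two floating-point comparisons and a branch per conversion, which is at least consistent with that case showing the largest gap in the results above.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper, NOT Boost's implementation: converts a float to
// uint32_t by truncation, throwing if the truncated value would not fit.
inline std::uint32_t checked_ftoi(float f)
{
    // 4294967296.0f is 2^32, which is exactly representable as a float.
    // The test is written so that NaN fails it and is rejected as well.
    if (!(f > -1.0f && f < 4294967296.0f))
        throw std::range_error("checked_ftoi: value out of range");
    return static_cast<std::uint32_t>(f);
}

// Hypothetical helper: narrows a 64-bit unsigned value to 32 bits,
// throwing if the value does not fit.
inline std::uint32_t checked_narrow(std::uint64_t n)
{
    if (n > 0xFFFFFFFFu)
        throw std::range_error("checked_narrow: value out of range");
    return static_cast<std::uint32_t>(n);
}
```

Even when no exception is ever thrown, the comparisons still execute on every call, so a checked cast cannot be entirely free unless the compiler can prove the value is in range.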

On Mon, Oct 15, 2012 at 6:43 PM, Oswin Krause <Oswin.Krause@ruhr-uni-bochum.de> wrote:

...
Hi,

Never benchmark in debug mode. Moreover, never ever benchmark boost code in debug mode.

On 2012-10-15 11:29, Tang Jiang Jun wrote:

...
Hi Oswin,

Sorry, I forgot to mention that I compiled it in the debug configuration in order to prevent unintended optimization. Anyway, many thanks for the reminder!

Tang

On Mon, Oct 15, 2012 at 4:30 PM, Oswin Krause <Oswin.Krause@ruhr-uni-bochum.de [3]> wrote:

...
Hi,

Your complete loop got optimized away in the native test cases. Because of the try/catch block, the compiler couldn't do this in the other cases. So you are benchmarking nothing vs. something.

Greetings, Oswin
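The effect Oswin describes can be sketched as follows. This is an illustrative example rather than code from the thread: `g_sink`, `cast_discarded` and `cast_consumed` are hypothetical names. In the first function the cast results are never used, so an optimizing compiler is allowed to delete the whole loop; in the second, each result feeds a sum that is stored to a volatile object, which counts as an observable side effect.

```cpp
#include <cstdint>

// Hypothetical sink: a store to a volatile object is an observable side
// effect, so the compiler has to keep the value that is written to it.
volatile std::uint32_t g_sink = 0;

// The results of the casts are never used, so an optimizer may remove
// the loop entirely and the benchmark would then time nothing.
inline void cast_discarded(std::uint32_t count)
{
    for (std::uint32_t n = 0; n < count; ++n) {
        float f = static_cast<float>(n);
        (void)f;
    }
}

// Each cast result is added to a running sum, and the sum is finally
// stored to the volatile sink and returned to the caller.
inline std::uint32_t cast_consumed(std::uint32_t count)
{
    std::uint32_t sum = 0;
    for (std::uint32_t n = 0; n < count; ++n)
        sum += static_cast<std::uint32_t>(static_cast<float>(n));
    g_sink = sum;
    return sum;
}
```

Consuming the result does not guarantee that every cast is executed (a compiler may still simplify the loop), so inspecting the generated assembly remains the reliable way to confirm what is being measured.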

On 2012-10-15 10:16, Tang Jiang Jun wrote:

...

_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org [1] http://lists.boost.org/mailman/listinfo.cgi/boost-users [2]

Links: ------ [1] mailto:Boost-users@lists.boost.org [2] http://lists.boost.org/mailman/listinfo.cgi/boost-users [3] mailto:Oswin.Krause@ruhr-uni-bochum.de


Oswin Krause

5:24 a.m.

Hi,

the results turned out to have a high variance due to the low time usage. Since just choosing higher count numbers already leads to an overflow, I hacked in the following loop:

nanoseconds profile( PROFILE_FUNC _profileFunc, uint32_t _count )
{
    high_resolution_clock::time_point start = high_resolution_clock::now();
    double summ = 0;
    for(std::size_t i = 0; i != 100000; ++i){
        uint32_t sum = i;
        _profileFunc( _count, sum );
        summ +=sum;
    }
    nanoseconds ns = ( high_resolution_clock::now() - start ) / _count;
    cout << summ << endl;
    return ns;
}

results:

Native Integer Cast: 26729 nanoseconds
Boost Integer Cast: 26449 nanoseconds
Native Integer-Floating Cast: 105479 nanoseconds
Boost Integer-Floating Cast: 105455 nanoseconds
Native Floating-Integer Cast: 168933 nanoseconds
Boost Floating-Integer Cast: 453505 nanoseconds

So there is no overhead in Integer-Integer or Integer-Floating, but Floating-Integer has bad performance.

On 2012-10-16 06:45, Tang Jiang Jun wrote:

...

I've already tried removing all the try-catch blocks, but the overhead is still there. On the other hand, when casting a number, it is common to use a dedicated try-catch block to protect it.

On Tue, Oct 16, 2012 at 12:08 PM, Brian Budge <brian.budge@gmail.com> wrote:

...
Unsure, but maybe you should put the try/catch outside of the inner loop?
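The two placements being discussed can be sketched like this. The example is self-contained and does not use Boost: `checked_narrow` is a hypothetical stand-in for a throwing checked cast, and `sum_try_inside` / `sum_try_outside` are illustrative names. Note that the two variants are not equivalent in behavior: with the handler inside the loop a failed conversion skips one element, while with the handler outside the first failure ends the whole loop.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a throwing checked cast (not numeric_cast).
inline std::uint32_t checked_narrow(std::uint64_t n)
{
    if (n > 0xFFFFFFFFu)
        throw std::range_error("value out of range");
    return static_cast<std::uint32_t>(n);
}

// try/catch inside the loop: a failed conversion skips that element and
// the loop continues with the next one.
inline std::uint32_t sum_try_inside(std::uint64_t first, std::uint64_t last)
{
    std::uint32_t sum = 0;
    for (std::uint64_t n = first; n < last; ++n) {
        try {
            sum += checked_narrow(n);
        } catch (const std::range_error&) {
            // skip this element
        }
    }
    return sum;
}

// try/catch outside the loop: the first failed conversion ends the loop.
inline std::uint32_t sum_try_outside(std::uint64_t first, std::uint64_t last)
{
    std::uint32_t sum = 0;
    try {
        for (std::uint64_t n = first; n < last; ++n)
            sum += checked_narrow(n);
    } catch (const std::range_error&) {
        // stop at the first failure
    }
    return sum;
}
```

On implementations with table-driven exception handling, entering a try block usually costs little when nothing is thrown, which would fit Tang's observation that removing the try-catch blocks did not make the overhead go away.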


Tang Jiang Jun

7:47 a.m.

Hi,

I adopted your suggestion to run the inner cast 10000 * 10000 times, and the result is the same on my computer. I guess the difference may be caused by the architecture of our CPUs; my CPU is an Intel i3.

Here is the result.

Native Integer Cast: 2 nanoseconds
Boost Integer Cast: 4 nanoseconds
Native Integer-Floating Cast: 3 nanoseconds
Boost Integer-Floating Cast: 3 nanoseconds
Native Floating-Integer Cast: 5 nanoseconds
Boost Floating-Integer Cast: 15 nanoseconds

On Tue, Oct 16, 2012 at 1:24 PM, Oswin Krause <Oswin.Krause@ruhr-uni-bochum.de> wrote:

...


Oswin Krause

7:55 a.m.

Hi,

is this the total run time? Can you show the code? Your run times are the same as before. You should expect a runtime factor of 1000+ in the end result; otherwise the compiler was too smart.

On 2012-10-16 09:47, Tang Jiang Jun wrote:

...

Hi,

I adopted your suggestion to run the inner cast for 10000 * 10000 times, and the result is same on my computer. I guess maybe the difference is caused by the architecture of our CPU, and my CPU is intel i3.

Here is the result.

Native Integer Cast: 2 nanoseconds Boost Integer Cast: 4 nanoseconds Native Integer-Floating Cast: 3 nanoseconds Boost Integer-Floating Cast: 3 nanoseconds Native Floating-Integer Cast: 5 nanoseconds Boost Floating-Integer Cast: 15 nanoseconds

On Tue, Oct 16, 2012 at 1:24 PM, Oswin Krause <Oswin.Krause@ruhr-uni-bochum.de [33]> wrote:

...
Hi,

the results turnd out to have a high variance due to the low time usage. SInce just choosing higher count numbers already lead to an overflow, i hacked in the following loop:

nanoseconds profile( PROFILE_FUNC _profileFunc, uint32_t _count ) { high_resolution_clock::time_point start = high_resolution_clock::now(); double summ = 0; for(std::size_t i = 0; i != 100000; ++i){ uint32_t sum = i; _profileFunc( _count, sum ); summ +=sum;

}

nanoseconds ns = ( high_resolution_clock::now() - start ) / _count;

cout << summ << endl;

return ns; }

results:

Native Integer Cast: 26729 nanoseconds Boost Integer Cast: 26449 nanoseconds Native Integer-Floating Cast: 105479 nanoseconds Boost Integer-Floating Cast: 105455 nanoseconds Native Floating-Integer Cast: 168933 nanoseconds Boost Floating-Integer Cast: 453505 nanoseconds

so no overhead in Integer-Integer or Integer-Floating. But Floating-Integer has bad performance.

On 2012-10-16 06:45, Tang Jiang Jun wrote:

...
I've already tried to remove all the try-catch blocks, but the overhead is still there. On the other side, when casting a number, it is general to use a dedicated try-catch block to protect it.

On Tue, Oct 16, 2012 at 12:08 PM, Brian Budge <brian.budge@gmail.com [15] [15]> wrote:

...
Unsure, but maybe you should put the try/catch outside of the inner loop?

On Mon, Oct 15, 2012 at 8:50 PM, Tang Jiang Jun

...
Hi,

I modify my code to make it can run in release mode without unintended optimization, and now the performance is acceptable. However

...
definitely has some runtime overhead even no overflow happens, and the overhead takes extra time as much as the plain cast itself takes. I think this maybe should be mentioned in the numeric_cast document, because if the cast is the core step in an algorithms and is executed heavily,

...
overhead will impact the performance significantly .

The following is the benchmark code after modification and

<tangjiangjun@gmail.com [1] [1]> wrote: there this the result run in

...
my computer.

CODE

#include <boost/numeric/conversion/cast.hpp>
#include <boost/format.hpp>
#include <boost/cstdint.hpp>
#include <boost/chrono.hpp>
#include <iostream>

using namespace std;
using namespace boost;
using namespace boost::numeric;
using namespace boost::chrono;

typedef void (*PROFILE_FUNC)( uint32_t, uint32_t& );

nanoseconds profile( PROFILE_FUNC _profileFunc, uint32_t _count )
{
    high_resolution_clock::time_point start = high_resolution_clock::now();

    uint32_t sum = 0;
    _profileFunc( _count, sum );

    nanoseconds ns = ( high_resolution_clock::now() - start ) / _count;

    cout << sum << endl;

    return ns;
}

void native_integer_cast( uint32_t _count, uint32_t& _sum )
{
    for( uint64_t n = 0; n < _count; ++n )
    {
        _sum += static_cast< uint32_t >( n );
    }
}

void boost_integer_cast( uint32_t _count, uint32_t& _sum )
{
    for( uint64_t n = 0; n < _count; ++n )
    {
        try
        {
            _sum += numeric_cast< uint32_t >( n );
        }
        catch( const bad_numeric_cast& e )
        {
            cout << e.what() << endl;
        }
    }
}

void native_itof_cast( uint32_t _count, uint32_t& _sum )
{
    float fsum = 0.0f;

    for( uint32_t n = 0; n < _count; ++n )
    {
        fsum += static_cast< float >( n );
    }

    _sum = static_cast< uint32_t >( fsum );
}

void boost_itof_cast( uint32_t _count, uint32_t& _sum )
{
    float fsum = 0.0f;

    for( uint32_t n = 0; n < _count; ++n )
    {
        try
        {
            fsum += numeric_cast< float >( n );
        }
        catch( const bad_numeric_cast& e )
        {
            cout << e.what() << endl;
        }
    }

    _sum = numeric_cast< uint32_t >( fsum );
}

void native_ftoi_cast( uint32_t _count, uint32_t& _sum )
{
    for( float f = 0.0f; f < _count; f += 1.0f )
    {
        _sum += static_cast< uint32_t >( f );
    }
}

void boost_ftoi_cast( uint32_t _count, uint32_t& _sum )
{
    for( float f = 0.0f; f < _count; f += 1.0f )
    {
        try
        {
            _sum += numeric_cast< uint32_t >( f );
        }
        catch( const bad_numeric_cast& e )
        {
            cout << e.what() << endl;
        }
    }
}

int main()
{
    const static int32_t COUNT = 10000;

    nanoseconds nsNativeIntegerCast = profile( native_integer_cast, COUNT );
    nanoseconds nsBoostIntegerCast = profile( boost_integer_cast, COUNT );
    nanoseconds nsNativeItofCast = profile( native_itof_cast, COUNT );
    nanoseconds nsBoostItofCast = profile( boost_itof_cast, COUNT );
    nanoseconds nsNativeFtoiCast = profile( native_ftoi_cast, COUNT );
    nanoseconds nsBoostFtoiCast = profile( boost_ftoi_cast, COUNT );

    cout << "Native Integer Cast: " << nsNativeIntegerCast << endl;
    cout << "Boost Integer Cast: " << nsBoostIntegerCast << endl;
    cout << "Native Integer-Floating Cast: " << nsNativeItofCast << endl;
    cout << "Boost Integer-Floating Cast: " << nsBoostItofCast << endl;
    cout << "Native Floating-Integer Cast: " << nsNativeFtoiCast << endl;
    cout << "Boost Floating-Integer Cast: " << nsBoostFtoiCast << endl;

    return 0;
}

RESULT:

Native Integer Cast: 1 nanosecond
Boost Integer Cast: 4 nanoseconds
Native Integer-Floating Cast: 3 nanoseconds
Boost Integer-Floating Cast: 3 nanoseconds
Native Floating-Integer Cast: 5 nanoseconds
Boost Floating-Integer Cast: 14 nanoseconds

Regards, Tang

On Mon, Oct 15, 2012 at 6:43 PM, Oswin Krause <Oswin.Krause@ruhr-uni-bochum.de> wrote:

Hi,

Never benchmark in debug mode. Moreover, never ever benchmark boost code in debug mode.

On 2012-10-15 11:29, Tang Jiang Jun wrote:
>
> Hi Oswin,
>
> Sorry, I forgot to mention that I compiled it as debug configuration
> in order to prevent unintended optimization.
> Anyway, many thanks for reminding!
>
> Tang
>
> On Mon, Oct 15, 2012 at 4:30 PM, Oswin Krause <Oswin.Krause@ruhr-uni-bochum.de> wrote:
>
>> Hi,
>>
>> Your complete loop got optimized away in the native test cases.
>> Because of the try/catch block the compiler couldn't do this in the
>> other cases. So you are benchmarking nothing vs something.
>>
>> Greetings,
>> Oswin

_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users


Tang Jiang Jun

8:13 a.m.

No, it's the unit time for each cast. The code is as follows.

nanoseconds profile( PROFILE_FUNC _profileFunc, uint32_t _count )
{
    high_resolution_clock::time_point start = high_resolution_clock::now();
    uint32_t c = 0;
    for( uint32_t n = 0; n < _count; ++n )
    {
        uint32_t sum = n;
        _profileFunc( _count, sum );
        if( sum > n )
        {
            c++;
        }
    }
    nanoseconds ns = ( ( high_resolution_clock::now() - start ) / _count ) / _count;
    cout << c << endl;
    return ns;
}

I added a trivial counter to prevent the compiler from optimizing out the profiling code.

On Tue, Oct 16, 2012 at 3:55 PM, Oswin Krause <Oswin.Krause@ruhr-uni-bochum.de> wrote:

Hi,

Is this the total run time? Can you show the code? Your run times are the same as before. You should expect a runtime factor of 1000+ in the end result - or the compiler was too smart.
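One common way to keep an optimizer from deleting a benchmark loop, similar in spirit to the counter Tang describes, is to write every result to a `volatile` object. This is a sketch only, not code from the thread; `g_sink` and `run_casts` are made-up names.

```cpp
#include <cstdint>

// Every write to a volatile object is an observable side effect, so the
// compiler has to perform each store and cannot drop the loop as dead code.
volatile std::uint32_t g_sink = 0;  // hypothetical name

// Performs `count` narrowing casts and publishes each result through g_sink.
// Returns the last value written so a caller can sanity-check the run.
std::uint32_t run_casts(std::uint32_t count) {
    std::uint32_t last = 0;
    for (std::uint64_t n = 0; n < count; ++n) {
        last = static_cast<std::uint32_t>(n);
        g_sink = last;
    }
    return last;
}
```

The store itself costs time, so it should be added to both the native and the checked variant so that the comparison stays fair.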

On 2012-10-16 09:47, Tang Jiang Jun wrote:

...
Hi,

I adopted your suggestion and ran the inner cast 10000 * 10000 times, and the result is the same on my computer. I guess the difference may be caused by the architecture of our CPUs; my CPU is an Intel i3.

Here is the result.

Native Integer Cast: 2 nanoseconds
Boost Integer Cast: 4 nanoseconds
Native Integer-Floating Cast: 3 nanoseconds
Boost Integer-Floating Cast: 3 nanoseconds
Native Floating-Integer Cast: 5 nanoseconds
Boost Floating-Integer Cast: 15 nanoseconds

On Tue, Oct 16, 2012 at 1:24 PM, Oswin Krause <Oswin.Krause@ruhr-uni-bochum.de> wrote:

Hi,

...
the results turned out to have a high variance due to the low time usage. Since just choosing higher count numbers already led to an overflow, I hacked in the following loop:

nanoseconds profile( PROFILE_FUNC _profileFunc, uint32_t _count )
{
    high_resolution_clock::time_point start = high_resolution_clock::now();
    double summ = 0;
    for( std::size_t i = 0; i != 100000; ++i )
    {
        uint32_t sum = i;
        _profileFunc( _count, sum );
        summ += sum;
    }

    nanoseconds ns = ( high_resolution_clock::now() - start ) / _count;

    cout << summ << endl;

    return ns;
}

results:

Native Integer Cast: 26729 nanoseconds
Boost Integer Cast: 26449 nanoseconds
Native Integer-Floating Cast: 105479 nanoseconds
Boost Integer-Floating Cast: 105455 nanoseconds
Native Floating-Integer Cast: 168933 nanoseconds
Boost Floating-Integer Cast: 453505 nanoseconds

so no overhead in Integer-Integer or Integer-Floating. But Floating-Integer has bad performance.
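A note on reading these numbers: the modified profile() loop calls `_profileFunc` 100000 times but divides the elapsed time by `_count` only. As long as each call performs `_count` casts, every printed value is therefore 100000 times the per-cast time. A small sketch of the conversion follows; the names `kOuterIterations`, `per_cast_ns` and `kFtoiSlowdown` are made up here.

```cpp
// Outer repetitions hard-coded in the modified profile() loop.
constexpr double kOuterIterations = 100000.0;

// Converts a value printed by that profile() into nanoseconds per single cast.
constexpr double per_cast_ns(double reported_ns) {
    return reported_ns / kOuterIterations;
}

// Ratio of the Boost float-to-integer figure to the native one (roughly 2.7).
constexpr double kFtoiSlowdown = 453505.0 / 168933.0;
```

With the posted figures this gives about 0.27 ns per integer cast, about 1.7 ns for the native float-to-integer cast and about 4.5 ns for the Boost one.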

On 2012-10-16 06:45, Tang Jiang Jun wrote:

I've already tried to remove all the try-catch blocks, but the

...
overhead is still there. On the other hand, when casting a number, it is common to use a dedicated try-catch block to protect it.

On Tue, Oct 16, 2012 at 12:08 PM, Brian Budge <brian.budge@gmail.com> wrote:

Unsure, but maybe you should put the try/catch outside of the inner loop?


John Maddock

8:12 a.m.

...

results:

Native Integer Cast: 26729 nanoseconds
Boost Integer Cast: 26449 nanoseconds
Native Integer-Floating Cast: 105479 nanoseconds
Boost Integer-Floating Cast: 105455 nanoseconds
Native Floating-Integer Cast: 168933 nanoseconds
Boost Floating-Integer Cast: 453505 nanoseconds

so no overhead in Integer-Integer or Integer-Floating. But Floating-Integer has bad performance.

That's sort of what I would expect - think about it - if the cast is to the same or a wider type, then there is no check, and numeric_cast and static_cast do the same thing.

However for a narrowing cast (float to integer), then at the very least there has to be an extra if statement to test whether the value being cast is in range - that would normally roughly double the runtime cost.

But there *may* be another hidden cost: depending on the loop the compiler may decide not to inline the numeric_cast in order to give a tighter loop, and the cost of the function call would add a big chunk of time. Plus the extra code associated with the error handling if the value is out of range adds a certain amount of code bloat to the loop, reducing code locality.

But the thing is you can't avoid this if you want the runtime check. There's no such thing as a free lunch sadly.

John.
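The point about the extra range test can be made concrete with a hand-written checked conversion. This is a sketch of the general idea only, not Boost's actual implementation; `unchecked_float_to_u32` and `checked_float_to_u32` are made-up names, and the sketch assumes truncation toward zero is the desired rounding.

```cpp
#include <cstdint>
#include <stdexcept>

// Unchecked conversion: if the truncated value does not fit in uint32_t,
// the behaviour is undefined.
inline std::uint32_t unchecked_float_to_u32(float f) {
    return static_cast<std::uint32_t>(f);
}

// Checked conversion (hypothetical helper): adds a comparison pair and a
// throw path that the unchecked version does not have.
inline std::uint32_t checked_float_to_u32(float f) {
    // After truncation toward zero the result fits iff -1 < f < 2^32.
    // 4294967296.0f is 2^32 and is exactly representable as a float.
    // NaN fails both comparisons, so it is rejected as well.
    if (!(f > -1.0f && f < 4294967296.0f))
        throw std::range_error("float value out of range for uint32_t");
    return static_cast<std::uint32_t>(f);
}
```

The comparisons and the throw path are exactly the kind of extra work a plain `static_cast` skips, which is consistent with the float-to-integer case being the one that slows down.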

Tang Jiang Jun

8:35 a.m.

Very incisive! I checked the assembly code and verified your hypothesis. The integer-to-floating cast of numeric_cast was indeed inlined by the compiler, while the other two used a function call to invoke numeric_cast. So maybe the better solution is to force inlining of numeric_cast in all cases.
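Forced inlining is usually requested through compiler-specific keywords or attributes. The macro name `MY_FORCEINLINE` and the function `checked_narrow_inline` below are made up for this sketch (Boost.Config provides a comparable `BOOST_FORCEINLINE` macro); whether this actually closes the gap seen in the benchmark would need to be measured.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical macro mapping to each compiler's "always inline" request.
#if defined(_MSC_VER)
#  define MY_FORCEINLINE __forceinline
#elif defined(__GNUC__)
#  define MY_FORCEINLINE inline __attribute__((always_inline))
#else
#  define MY_FORCEINLINE inline
#endif

// Hypothetical checked narrowing cast marked for forced inlining.
MY_FORCEINLINE std::uint32_t checked_narrow_inline(std::uint64_t v) {
    if (v > std::numeric_limits<std::uint32_t>::max())
        throw std::overflow_error("value does not fit in uint32_t");
    return static_cast<std::uint32_t>(v);
}
```

Forcing inlining removes the call overhead but not the range check itself, so some cost relative to a plain static_cast should still be expected for narrowing casts.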

Participants: Brian Budge, John Maddock, Oswin Krause, Tang Jiang Jun