On 18.08.2015 22:16, Joel FALCOU wrote:
On 18/08/2015 11:12, Andrey Semashev wrote:
For some data it is enough to return a meaningful pessimistic default in case the actual value cannot be obtained. E.g. for ISA extensions we could return 'not supported', for cache size return 0, for the OS version string return an empty string (or a fixed string based on the data available at compile time), and so on.
For other data this doesn't quite work though. We can't return 0 as the system RAM size, for instance - or rather we can, but then the user's application would have to check for this special value. I'm not sure what is best in this case.
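Just to contrast the two options being weighed here (the names and signatures below are hypothetical, not a proposed API), the choice is roughly between a sentinel value and making the "unknown" state explicit in the return type:

    #include <cstdint>
    #include <boost/optional.hpp>

    namespace sys_info {

    // Alternative 1: pessimistic sentinel; 0 means "could not be determined".
    // Cheap to use, but the caller must remember to check for the magic value.
    std::uint64_t ram_size_or_zero();

    // Alternative 2: the "unknown" state is explicit in the return type,
    // so a failed query cannot be silently mistaken for a real size.
    boost::optional< std::uint64_t > ram_size();

    } // namespace sys_info

    // Possible caller code for alternative 2:
    void configure(std::uint64_t& buffer_budget)
    {
        if (boost::optional< std::uint64_t > ram = sys_info::ram_size())
            buffer_budget = *ram / 4;  // size buffers from the real value
        // else: keep the compile-time default
    }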
One other thing to consider, as we had reports of this issue from our users, is that such a facility should have cached and non-cached retrieval functions.
Using CPUID to grab the SIMD features, for example, is slow enough to have a noticeable impact on computation in some cases, hence the need for caching. I think some of those values are static anyway (you won't remove a CPU feature mid-flight) and must be cached in a static value at start-up. Others, like the amount of available free RAM, must not be.
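A minimal sketch of what that split might look like (the function names are hypothetical, and the function-local static is just one possible caching policy):

    #include <cstdint>

    namespace sys_info {

    // Non-cached: goes to the OS on every call. Appropriate for
    // volatile quantities such as the amount of free RAM.
    std::uint64_t query_free_ram();  // platform-specific implementation omitted

    // Expensive probe of the CPU. Shown here with the GCC/Clang builtin;
    // a real implementation would use CPUID directly.
    inline bool query_has_sse2()
    {
        return __builtin_cpu_supports("sse2");
    }

    // Cached: static data such as ISA extensions is probed once, on first
    // use, and then served from a function-local static (initialization is
    // thread-safe since C++11).
    inline bool has_sse2()
    {
        static const bool value = query_has_sse2();
        return value;
    }

    } // namespace sys_info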
I believe most of the API should be non-caching, and where caching is reasonable we should probably think of a stateful approach. Let the user cache the state, if needed, and also deal with the inherent thread safety issues.

The CPU features are especially difficult because there are two usage patterns for this information that I have faced:

1. Collect all necessary CPU info at once and then use it to configure the user's application (e.g. set up function tables and constants). This is typically done relatively rarely, like at application startup or some internal context initialization.

2. Query for one or a few features, perhaps for use in a local condition to jump to a specialized code branch. The code that makes this query can be called often, so it must be fast.

Satisfying both these patterns in an effective way is not easy, but I think it should be possible if we represent the CPU feature collection as an object that the user can create and cache, if needed. The features can be obtained lazily and cached within this object. Something along these lines (pseudo-code):

namespace boost::sys_info::cpu {

enum class feature_tag
{
    // Arch-specific values
    sse,
    sse2,
    ...
    _count
};

template< typename... Features >
struct feature_list;

// This struct can be specialized for different feature tags
template< feature_tag Tag >
struct feature
{
    // In specializations, we can describe pre-requisites
    // for each feature, e.g. there must be OSXSAVE and
    // the OS must be saving/restoring YMM registers
    // in order to be able to use AVX.
    typedef feature_list< ... > prerequisites;
};

constexpr feature< feature_tag::sse > sse = {};
constexpr feature< feature_tag::sse2 > sse2 = {};
...

class features
{
    // The flags indicate which features have been queried
    std::bitset< feature_tag::_count > m_cached;
    // The flags indicate which features are supported
    std::bitset< feature_tag::_count > m_values;

public:
    // By default creates an empty object. The only thing it
    // may need to do is to obtain the max cpuid function number.
    // If do_init == true, calls init() automatically.
    explicit features(bool do_init = false);

    // Obtains all features at once
    void init();

    // If not cached already, tests for the feature and its
    // pre-requisites and returns the flag
    template< feature_tag Tag >
    bool operator[] (feature< Tag >);
};

} // namespace boost::sys_info::cpu

// Usage example
void foo()
{
    namespace cpu = boost::sys_info::cpu;

    cpu::features f;
    if (f[cpu::sse])
        // SSE-optimized code
        foo_sse();
    else
        // generic code
        foo_generic();
}

I know there are complications and possible ways of optimization. In particular, we actually discover multiple features with one cpuid call, so we might want to fill multiple flags per feature query. And for a /proc/cpuinfo backed solution we might want to always parse the whole file at once. But I like this interface and the design looks extensible.
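To make the lazy caching idea concrete, here is a rough sketch of how a single query could fill several flags from one cpuid call. This is a simplification of the interface above, not a proposed implementation; it assumes GCC/Clang's <cpuid.h> (MSVC would use <intrin.h> and __cpuid), and covers only two x86 feature bits:

    #include <bitset>
    #include <cstddef>
    #include <cpuid.h>

    namespace cpu {

    enum class feature_tag : std::size_t { sse, sse2, _count };

    class features
    {
        std::bitset< static_cast< std::size_t >(feature_tag::_count) > m_cached;
        std::bitset< static_cast< std::size_t >(feature_tag::_count) > m_values;

    public:
        bool test(feature_tag tag)
        {
            const std::size_t idx = static_cast< std::size_t >(tag);
            if (!m_cached[idx])
                query_basic_features();  // one cpuid call fills several flags
            return m_values[idx];
        }

    private:
        void query_basic_features()
        {
            unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
            if (__get_cpuid(1u, &eax, &ebx, &ecx, &edx))
            {
                // CPUID leaf 1: EDX bit 25 = SSE, EDX bit 26 = SSE2
                m_values[static_cast< std::size_t >(feature_tag::sse)] = (edx >> 25) & 1u;
                m_values[static_cast< std::size_t >(feature_tag::sse2)] = (edx >> 26) & 1u;
            }
            // Mark both as cached, whether or not the query succeeded
            m_cached[static_cast< std::size_t >(feature_tag::sse)] = true;
            m_cached[static_cast< std::size_t >(feature_tag::sse2)] = true;
        }
    };

    } // namespace cpu

Whether the flags are grouped per cpuid leaf like this, or the whole block is decoded by init(), is an implementation detail the interface above leaves open.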