GSoC 2015 Boost Compute Proposal
Hi, this is my proposal. Please suggest edits. Thank you!

*Personal Details:*

Name: Aditya Atluri
University: The George Washington University
Major: Computer Engineering
Degree: MS
Email: adityaavinash1@gmail.com
Homepage: adityaatluri.github.io
Availability:
1. 7 days a week
2. 27 April to 28 August
3. Nothing.

*Background Information:*

*Educational Background:*
I did my undergraduate degree at the Indian Institute of Technology, Dhanbad, India. Currently, I am pursuing my Masters at The George Washington University. The courses I have taken are: High Performance Computing, Advanced Computer Micro Architecture, Computer Graphics, Advanced Topics in Computer Graphics, VLSI Design, Low Power VLSI Design, Compilers, and Machine Learning. I have done many projects in these courses, which can be seen on my website and in my bio[1].

*Programming Background and Interests:*
I am a researcher. I program GPUs using IRs, writing workarounds and repurposing specific hardware for other applications (for example, implementing ray tracing on the rasterizer of a GPU, or using tessellators to generate procedural textures). I wrote highly productive parallel programming abstractions for GPUs[2] and presented that work at GTC 2014; NVIDIA supported it with a couple of Tesla K40s. As CUDA, OpenGL, OpenCL, Metal, and Thrust are C/C++ APIs, I use C++ extensively. I also build databases on GPUs, an area that is now of deep interest to NVIDIA, Google, and MapD. I worked with Mesa, the open-source graphics drivers for Linux, to implement ARB_shader_atomic_counters for the R600 backend. I also worked at IBM Software Labs, where I worked mostly in Java.

*Why I opted for Boost:*
I use Boost on a daily basis as a part of PyCUDA, and the workarounds I write involve several Boost libraries. The next thing I am interested in within Boost is its Compute library. I have used other GPU STL-style libraries, but none are as good as Compute (simple and elegant). I couldn't contribute to Compute via GSoC last year due to my CPT, and I don't want to lose that chance again. This time, I want to take Boost.Compute to a whole new level. I like to program anything related to GPUs: compilers, drivers, APIs.

*Interest in Boost.Compute:*
As I mentioned earlier, I am all-in for anything related to GPUs, whether graphics or compute libraries. I believe Boost.Compute can become a widely used API, and I want to make it more powerful and more useful.

*Previous Work:*
I work on graphics drivers on a regular basis and write a lot of OpenGL compute and shader code (as part of my TA job and Mesa testing on local systems). I am good at using APIs, making optimizations, and keeping a small memory footprint: for example, increasing the number of draw calls per second without stalling the CPU, writing shaders with more ALU operations than load/store operations, aligning data to the memory width, and minimizing compiler padding by reordering struct members to get an exact multiple of the word size (see the short sketch at the end of this section). These are a few of the optimizations I do every day. I am currently working on running the Metal Shading Language on Intel processors (using AVX)[3].

*Plans beyond SoC:*
I want to extend Compute to ARM and Qualcomm devices that support both OpenCL and ARM NEON. This brings all "compute-intense" APIs under one roof.

C++ 98/03 - 3
C++ 11/14 - 3
C++ STL - 4
Boost C++ Libraries - (Knowledge: 2, Usage: 2)
Git - 3

I program using Visual Studio and Xcode, and the terminal on Linux. I am comfortable with Doxygen.
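Referring back to the struct-reordering optimization mentioned under Previous Work, here is a minimal illustrative sketch (my own example, not project code) of how ordering members from largest to smallest alignment removes the padding the compiler would otherwise insert:

    #include <cstdint>

    struct Padded {            // 24 bytes on a typical 64-bit ABI
        std::uint8_t  flag;    // 1 byte + 7 bytes padding
        std::uint64_t offset;  // 8 bytes
        std::uint8_t  kind;    // 1 byte + 7 bytes tail padding
    };

    struct Packed {            // 16 bytes: an exact multiple of the word size
        std::uint64_t offset;  // 8 bytes
        std::uint8_t  flag;    // 1 byte
        std::uint8_t  kind;    // 1 byte + 6 bytes tail padding
    };

    static_assert(sizeof(Packed) < sizeof(Padded),
                  "reordering members removes interior padding");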
*Project Proposal:*

*Introduction:*
The Boost.Compute library is so far the best portable STL-like C++ library for accelerated computing that leverages the underlying parallel hardware. Boost.Compute works well on OpenCL-compatible desktop/notebook devices. In this project, we bring Boost.Compute to mobile devices. Current mobile devices do not provide good STL-style abstractions that are accelerated by SIMD and OpenCL, and with the introduction of the Metal graphics API for iOS devices, developing compute applications for both iOS and non-iOS devices requires multiple skill sets. With this project, we bring the compute abstractions of the different mobile devices under one roof: Boost.Compute.

*Project Goals:*
1. Make C++ extensions from Objective-C.
2. Integrate device containers into the Boost.Compute front end.
3. Make Metal API data types visible to the user through Boost.Compute.
4. Implement BOOST_COMPUTE_FUNCTION() to run custom kernels (a usage example of the existing OpenCL-based macro is given after the Implementation section).
5. Most importantly, implement the Boost.Compute algorithms for iOS devices in the Metal Shading Language.

*Implementation:*

Building Boost.Compute APIs for iOS devices.
- The Metal graphics API is used as the backend for the Boost.Compute APIs, in conjunction with the Metal Shading Language.

Make C++ extensions from Objective-C.
- Metal is an Objective-C graphics API, whereas Boost.Compute is a C++ compute API. Hence a bridge between the two APIs must be built to use Compute on iOS devices, so we build C++ extensions over the Objective-C APIs. As C++ is also used in building iOS apps, providing Boost.Compute for iOS devices makes it easy to leverage compute performance from an iOS app.

Build a resource-management Metal backend for Boost.Compute.
- Here we develop the infrastructure that makes the "algorithms" run on the hardware: automatically creating command queues, command buffers, and resources (buffers or textures).
- As the Metal API does not have a 'context' concept, a mapping can be made to the Metal command buffer API. Some OpenCL APIs that Boost.Compute uses natively are not present in the Metal API; appropriate mappings from Boost.Compute to Metal API commands will be worked out (see the interface sketch after this section).

Integrate device containers from the Metal backend into the Boost.Compute front end.
- Boost.Compute provides vector and array containers. These should be implemented on top of Metal and integrated into Boost. Device containers include device vectors (construction, initialization, moves), arrays, and their STL-style interface (.size(), etc.).
- Iterators for these containers should also be implemented.

Make Metal API data types visible to the user through Boost.Compute.
- Map Metal data types to the current Boost.Compute data types, while keeping them consistent with the existing Boost.Compute APIs. Metal is also adding new data types that improve performance and reduce memory footprint; making these visible to the programmer and building algorithms with them is done in this part.

Implement algorithms for Boost.Compute in the Metal Shading Language.
- This is the biggest challenge, as Metal compute shaders support two modes of execution (fast and precise). Algorithms written in each mode should be iterated over to judge the best performance without trading precision for speed.

Testing the existing Boost.Compute code on iOS devices and debugging for errors.
- As Metal emulators are not available for Xcode, testing will be done on hardware[4].
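To make the resource-management and container points above concrete, here is a rough interface sketch of how the Metal objects could be wrapped behind a Boost.Compute-style C++ front end. All names below are placeholders of my own, not an existing Boost.Compute or Metal API; the real implementation would live in Objective-C++ (.mm) files that call the actual Metal API mentioned in the comments.

    namespace boost { namespace compute { namespace metal {

    // Opaque handles standing in for id<MTLDevice>, id<MTLCommandQueue> and
    // id<MTLBuffer>; the .mm implementation files would own the Objective-C
    // objects and bridge them to these C++ types.
    struct device_handle;
    struct queue_handle;
    struct buffer_handle;

    class device {
    public:
        static device system_default();          // wraps MTLCreateSystemDefaultDevice()
    private:
        device_handle *handle_;
    };

    class command_queue {
    public:
        explicit command_queue(const device &d); // wraps [device newCommandQueue]
        // Each enqueue_* call records work into a fresh MTLCommandBuffer,
        // playing the role the OpenCL context/queue pair plays today.
        void enqueue_write_buffer(buffer_handle *dst, const void *src, std::size_t bytes);
        void enqueue_read_buffer(void *dst, const buffer_handle *src, std::size_t bytes);
        void finish();                           // commit + waitUntilCompleted
    private:
        queue_handle *handle_;
    };

    // STL-style device container backed by an MTLBuffer, mirroring
    // boost::compute::vector<T> on the OpenCL side.
    template<class T>
    class vector {
    public:
        vector(std::size_t count, command_queue &queue);
        std::size_t size() const;                // element count, not bytes
    private:
        buffer_handle *buffer_;
        std::size_t size_;
    };

    }}} // namespace boost::compute::metal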
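For the custom-kernel goal, the existing OpenCL-based front end already offers BOOST_COMPUTE_FUNCTION(); the Metal backend would need to reproduce the same user-facing behaviour. A minimal usage example against the current (OpenCL) Boost.Compute, shown only to illustrate the interface the Metal path has to match:

    #include <vector>
    #include <boost/compute.hpp>

    namespace compute = boost::compute;

    // User-defined function compiled into a device kernel behind the scenes.
    BOOST_COMPUTE_FUNCTION(int, add_four, (int x),
    {
        return x + 4;
    });

    int main()
    {
        compute::device gpu = compute::system::default_device();
        compute::context ctx(gpu);
        compute::command_queue queue(ctx, gpu);

        std::vector<int> host = {1, 2, 3, 4};
        compute::vector<int> dev(host.begin(), host.end(), queue);

        // transform() applies the custom function on the device.
        compute::transform(dev.begin(), dev.end(), dev.begin(), add_four, queue);

        compute::copy(dev.begin(), dev.end(), host.begin(), queue);
        return 0;
    }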
*Timeline:*
The timeline is straightforward; for details, check the Implementation section.

27 April – 25 May: Community bonding; test the Xcode and iOS infrastructure to find any possible build problems. Build a test-run environment (an iOS app) to check for any other hardware/software issues in deploying Metal API code to an iOS device.
25 May – 10 June: Wrap the Objective-C Metal APIs in C++ extensions; submit patches.
10 June – 25 June: Implement data types and containers for Boost on top of the C++ Metal APIs. Make appropriate changes to the code after getting feedback from the community; submit patches.
25 June – 3 July: Review community feedback and change the patches accordingly.
3 July – 3 August: Write BOOST_COMPUTE_FUNCTION() and the Boost.Compute algorithms in the Metal Shading Language and integrate them with the Boost.Compute Metal APIs; submit patches.
3 August – 17 August: Write tests and debug by running them in the iOS test app.
17 August – 28 August: Clean up code and review with the community.

Available Hardware: iPhone 4s, iPhone 6 Plus, Mac mini.

Mentor: Kyle Lutz (kyle.r.lutz@gmail.com)

*Competency:*
The project I started that implements the actual Metal API is iaMetal (Metal for Intel processors)[3]. Rather than running on Apple's Metal stack, it converts Metal SL to AVX extensions.

[1]. https://github.com/adityaatluri/adityaatluri.github.co/blob/master/CV.md
[2]. https://github.com/urutu/Urutu
[3]. https://github.com/iaMetal/iaMetal
[4]. https://devforums.apple.com/message/971605#971605

--
Regards,
Aditya Avinash Atluri,
Graduate Student,
Electrical and Computer Engineering,
The George Washington University,
Washington, DC.