• CUDA with LLVM and gpucc, Google's CUDA Compiler

    The SC Conference in Austin made me read up on compiler developments concerning CUDA. Two related things have gained traction in the last couple of weeks. One is compiling CUDA code with LLVM while still using the NVIDIA CUDA driver and runtime as a backend; the other is a full open-source nvcc replacement.

    CUDA with LLVM

    For a few weeks now, you have been able to use LLVM / Clang to compile CUDA code. How it’s done is described in a document in the LLVM code repository (permanent link; introduced with this commit). I haven’t tried it yet, but it looks quite straightforward. Work on further optimizations to better support CUDA in LLVM is still ongoing.
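
    Going by that document, an invocation might look roughly like the following sketch. It is untested guesswork on my side, not the documented recipe: the flags follow the pattern in LLVM's »Compiling CUDA with Clang« document and may differ between Clang releases, and the source file axpy.cu, the architecture sm_35, the install path /usr/local/cuda, and the helper name clang_cuda_cmd are all assumptions. By default the script only prints the command; set RUN=1 to actually run it.

    ```shell
    # Assumed values, not taken from the post: adjust them to your setup.
    SRC="${SRC:-axpy.cu}"                      # hypothetical CUDA source file
    GPU_ARCH="${GPU_ARCH:-sm_35}"              # assumed GPU architecture
    CUDA_PATH="${CUDA_PATH:-/usr/local/cuda}"  # assumed CUDA Toolkit location

    # Print a clang++ command line for compiling one .cu file.
    # Arguments: source file, GPU architecture, CUDA install path.
    # The library flags are Linux-style; other systems may need different ones.
    clang_cuda_cmd() {
        printf 'clang++ %s -o %s --cuda-gpu-arch=%s --cuda-path=%s -L%s/lib64 -lcudart_static -ldl -lrt -pthread\n' \
            "$1" "${1%.cu}" "$2" "$3" "$3"
    }

    cmd=$(clang_cuda_cmd "$SRC" "$GPU_ARCH" "$CUDA_PATH")
    echo "$cmd"

    # Only run the compiler when explicitly asked to.
    # (Simplification: paths containing spaces would need extra quoting.)
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$cmd"
    fi
    ```

    The NVIDIA driver and runtime are still needed at run time; only the compilation step moves from nvcc to Clang.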

    gpucc

    Apparently, the same people at Google who are sewing CUDA into LLVM are also developing gpucc, an open-source CUDA compiler.

    Naturally, the compiler is LLVM-based, and the only in-depth info on gpucc so far also comes from the last LLVM Developers’ Meeting: a talk by Jingyue Wu (video, slides). I like the optimizations done by the compiler, which are already included in the public LLVM part mentioned above (the whitepapers for reference: »Straight-line Scalar Optimizations« and »Memory Space Inference for NVPTX Backend«, both by Wu)!

    It looks quite interesting. Their timeline foresees a publication next year (»Q1 2016«).

    (Sidenote: AMD is working on a tool that converts CUDA to a C++ programming model, which can then be compiled either for CUDA or with AMD’s HCC compiler; it’s like CUDA support for AMD through a back door.)

  • No Firewall Warnings for OS X Apps with Self-Signed Certificates

    Ever gotten this annoying pop-up window from the OS X firewall asking you to allow incoming connections to a certain application?

    OS X Firewall Warning

    I’m currently fiddling around with MPI, where messages are constantly being sent, and OS X, of course, always prompts me to »allow« it.

    There’s a solution: Using Keychain Access.app, create a self-signed certificate for the app in question, trust it »always«, and then sign the application with your freshly made certificate.

    Read how it’s done in this Stack Exchange post.

    As an alternative from the same thread, you can use ad-hoc signing, e.g.

    sudo codesign --force --deep --sign - /path/to/application.app
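
    To check afterwards whether the signature is in place, something like the following sketch might help. It relies on codesign's --verify and --display options; the helper name check_signature and the APP variable are made up for illustration.

    ```shell
    # Hypothetical helper: verify and show the signature of an app bundle.
    check_signature() {
        if [ ! -e "$1" ]; then
            echo "no such bundle: $1"
            return 1
        fi
        # codesign is a macOS tool; bail out on other systems.
        if ! command -v codesign >/dev/null 2>&1; then
            echo "codesign not found"
            return 2
        fi
        # --verify checks the signature, --display prints its details.
        codesign --verify --verbose "$1" && codesign --display --verbose=2 "$1"
    }

    # Run only when APP is set, e.g. APP=/path/to/application.app
    if [ -n "${APP:-}" ]; then
        check_signature "$APP"
    fi
    ```
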
  • NVIDIA Nsight Eclipse Edition as an OS X App

    Edit 2016-02-19: By chance, I found out that what I wrote below is not true. NVIDIA supplies a bundled app for OS X for Nsight (and also for the Visual Profiler, nvvp). They are located at /Developer/NVIDIA/CUDA-7.5/libnsight/nsight.app and libnvvp/nvvp.app. Just create aliases from there to your /Applications/ folder and you’re done. Easy!
    I’m leaving the rest below for completeness.


    NVIDIA bundles a custom Eclipse IDE version in their CUDA Toolkit, the Nsight Eclipse Edition. A handy tool for local and remote GPU development.

    Usually, the program is started from the command line (»nsight«, resolved to /Developer/NVIDIA/CUDA-7.5/bin/nsight or the like via your $PATH).

    To start it as a proper OS X app, AppleScript can be used. Open Script Editor.app (in /Applications/Utilities/) and paste the following:

    run application "/Developer/NVIDIA/CUDA-7.5/bin/nsight"

    modifying the path to the executable accordingly.

    Save the file as an application in /Applications/ (or ~/Applications/) and, voilà, you can start it with Spotlight or Alfred.
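
    The Script Editor steps can presumably also be done from the terminal with osacompile, which ships with OS X. A minimal sketch, using the same one-line AppleScript as above; the output name Nsight.app and the helper names nsight_script and build_nsight_app are my own placeholders.

    ```shell
    NSIGHT_BIN="${NSIGHT_BIN:-/Developer/NVIDIA/CUDA-7.5/bin/nsight}"  # path from the post
    APP_OUT="${APP_OUT:-$HOME/Applications/Nsight.app}"                # assumed app name

    # Emit the one-line AppleScript for a given executable path.
    nsight_script() {
        printf 'run application "%s"\n' "$1"
    }

    # Compile the script into an application bundle (macOS only).
    # Arguments: path to the nsight executable, path of the .app to create.
    build_nsight_app() {
        if ! command -v osacompile >/dev/null 2>&1; then
            echo "osacompile not found"
            return 1
        fi
        mkdir -p "$(dirname "$2")"
        osacompile -o "$2" -e "$(nsight_script "$1")"
    }

    # Build only when the Nsight executable is actually present.
    if [ -x "$NSIGHT_BIN" ]; then
        build_nsight_app "$NSIGHT_BIN" "$APP_OUT"
    fi
    ```

    As with the Script Editor route, adjust the path to the executable to match your CUDA Toolkit version.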

    To change the icon of the app, select it in Finder, hit ⌘+i, select the icon on the upper left side and paste (⌘+v) an image from clipboard – e.g. a cutout from the logo on the official NVIDIA webpage for Nsight Eclipse edition.

  • GPU/CPU Comparison Schemes

    For my thesis, I made drawings comparing conceptual differences between GPUs and CPUs. I didn’t like the ones floating around the interwebs, since most of them were of bad quality or had horrible colors.

    The chosen font is Myriad Pro, the color scheme is blue / purple / green.

    CPU Die Structure (Simplified)

    GPU Die Structure (Simplified)

    GPU die

    GPU Die Structure with Multiprocessors

    GPU structure

    Others

    A few schemes I’m not particularly proud of (and did not use anywhere). But, for completeness:

    Multi GPU Scheme

    Multi GPU

    Grid, Block, Thread

    Does not really work, since the virtual entities of Grid, Block, and Thread do not map 1:1 to physical entities on the GPU… It was a try…

    Gridblockthread

  • Hello

    Hi there!

    My name is Andreas and on this blog I’d like to report from my work accelerating various scientific applications on and with GPUs. I’m attached to the NVIDIA Application Lab of the Supercomputing Centre Jülich of Forschungszentrum Jülich.

    This is my first job after getting my PhD in particle physics, for which I worked on algorithm development for a new hadron physics experiment called PANDA. In my thesis, I already studied the impact of running accelerated code on GPUs – it’s quite fascinating.

    On this blog, expect random thoughts on problems I solve and challenges I face. It’s also a directory of useful stuff I find and produce. There might also be some content on the actual application I’m working on. But let’s see.

    So far, thanks for tuning in.