The SC Conference in Austin made me read up on compiler developments concerning CUDA. Two related things gained traction in the last couple of weeks. One is CUDA code compilation using LLVM, while still relying on the NVIDIA CUDA driver and runtime as a backend; the other is a full Open-Source
CUDA with LLVM
For a few weeks now, you have been able to use LLVM / Clang to compile CUDA code. How it’s done is described in a document in the LLVM code repository (permanent link, introduced with this commit). I haven’t tried it yet, but it looks quite straightforward. More optimizations are still going into LLVM to better support CUDA.
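For orientation, here is a minimal sketch of what such an invocation can look like. The flags follow the Clang documentation of later releases and may differ from the procedure in the document linked above; the source file name axpy.cu, the architecture sm_35, the CUDA install path, and the helper function are all placeholders of mine, and a Linux library layout is assumed.

```shell
#!/bin/sh
# Placeholders, not taken from the post: adjust to your setup.
CUDA_PATH="${CUDA_PATH:-/usr/local/cuda}"   # assumed CUDA install location (Linux layout)
GPU_ARCH="${GPU_ARCH:-sm_35}"               # assumed GPU architecture
SRC="${SRC:-axpy.cu}"                       # hypothetical CUDA source file

# Print the clang++ command line for one CUDA source file.
# $1: source file, $2: GPU architecture, $3: CUDA install path
clang_cuda_cmd() {
    printf 'clang++ %s -o %s --cuda-gpu-arch=%s -L%s/lib64 -lcudart_static -ldl -lrt -pthread\n' \
        "$1" "${1%.cu}" "$2" "$3"
}

if command -v clang++ >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # The compiler is Clang/LLVM, but the binary still links
    # against NVIDIA's CUDA runtime (libcudart).
    clang++ "$SRC" -o "${SRC%.cu}" --cuda-gpu-arch="$GPU_ARCH" \
        -L"$CUDA_PATH/lib64" -lcudart_static -ldl -lrt -pthread
else
    # Without clang++ or the source file, only show what would be run.
    echo "dry run: $(clang_cuda_cmd "$SRC" "$GPU_ARCH" "$CUDA_PATH")"
fi
```

The linking step is where the NVIDIA runtime comes back in: only the compiler is replaced, not the driver and runtime.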
Apparently the same people from Google sewing CUDA into LLVM are also developing gpucc, an Open-Source CUDA compiler.
Naturally, the compiler is LLVM-based, and the only in-depth info on gpucc also comes from the last LLVM developers’ meeting: a talk by Jingyue Wu (video, slides). I like the optimizations done by the compiler, which are already included in the public LLVM part from above (the whitepapers for reference: »Straight-line Scalar Optimizations« and »Memory Space Inference for NVPTX Backend«, both by Wu)!
It looks quite interesting. Their timeline foresees a publication next year (»Q1 2016«).
(Sidenote: AMD is working on a tool converting CUDA to a C++ programming model, which can then be compiled either for CUDA or with AMD’s HCC compiler; it’s like CUDA support for AMD through a back door.)
Ever gotten this annoying pop-up window from OS X’s firewall asking you to allow incoming connections to a certain application?
I’m currently fiddling around with MPI, where messages are constantly being sent, and OS X prompts me to »allow« it every single time.
There’s a solution: Using Keychain Access.app, create a self-signed certificate for the app in question, trust it »always«, and then sign the application with your freshly made certificate.
Read how it’s done in this Stack Exchange post.
As an alternative from the same thread, you can use ad-hoc signing, e.g.
sudo codesign --force --deep --sign - /path/to/application.app
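For the certificate route, the signing step can be sketched roughly as follows. The certificate name my-codesign-cert is hypothetical (use whatever name you gave the certificate in Keychain Access), and the helper function is only there to make the command visible; I left out sudo on the assumption that the certificate lives in your own login keychain.

```shell
#!/bin/sh
# Hypothetical values, not from the post: adjust to your setup.
CERT_NAME="${CERT_NAME:-my-codesign-cert}"        # name given to the self-signed certificate
APP_PATH="${APP_PATH:-/path/to/application.app}"  # binary or bundle that triggers the prompt

# Print the codesign command line for a certificate and a target.
# $1: certificate name, $2: path to the application
sign_cmd() {
    printf 'codesign --force --deep --sign "%s" "%s"\n' "$1" "$2"
}

if command -v codesign >/dev/null 2>&1 && [ -e "$APP_PATH" ]; then
    # Sign with the named certificate, then check the signature.
    codesign --force --deep --sign "$CERT_NAME" "$APP_PATH"
    codesign --verify --verbose "$APP_PATH"
else
    # Not on OS X, or the target does not exist: only show the command.
    echo "dry run: $(sign_cmd "$CERT_NAME" "$APP_PATH")"
fi
```

The only difference to the ad-hoc variant above is the argument to --sign: a certificate name instead of the dash.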
Edit 2016-02-19: By chance, I found out that what I wrote below is not true. NVIDIA supplies a bundled app for OS X for Nsight (and also for the Visual Profiler, nvvp). They are located at libnvvp/nvvp.app. Just create aliases from there to your /Applications/ folder and you’re done. Easy!
I leave the rest below for completeness.
Usually, the program is started from the command line (»nsight«, which resolves to /Developer/NVIDIA/CUDA-7.5/bin/nsight or the like).
To start it as a more proper OS X App, AppleScript can be used. Open the script editor (in /Applications/Utilities/) and paste the following
run application "/Developer/NVIDIA/CUDA-7.5/bin/nsight"
modifying the path to the executable accordingly.
Save the file as an application (e.g. in ~/Applications/) and, voilà, you can start it with Spotlight or Alfred.
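If you prefer the terminal over the graphical editor, the same wrapper can probably be built with osacompile, which ships with OS X. This is a sketch I have not tied to a specific OS X version; the output name Nsight.app and the helper function are my own choices.

```shell
#!/bin/sh
# Path from the post; the output location and name are hypothetical.
NSIGHT_BIN="${NSIGHT_BIN:-/Developer/NVIDIA/CUDA-7.5/bin/nsight}"
APP_OUT="${APP_OUT:-$HOME/Applications/Nsight.app}"

# Print the one-line AppleScript that launches the given executable.
# $1: path to the executable
applescript_source() {
    printf 'run application "%s"\n' "$1"
}

if command -v osacompile >/dev/null 2>&1 && [ -x "$NSIGHT_BIN" ]; then
    mkdir -p "$(dirname "$APP_OUT")"
    # -e takes the script text, -o the output file; the .app suffix
    # makes osacompile write an application bundle.
    osacompile -e "$(applescript_source "$NSIGHT_BIN")" -o "$APP_OUT"
else
    # Not on OS X, or Nsight is not installed at that path.
    echo "dry run: osacompile -e '$(applescript_source "$NSIGHT_BIN")' -o $APP_OUT"
fi
```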
To change the icon of the app, select it in Finder, hit ⌘+i, select the icon on the upper left side and paste (⌘+v) an image from clipboard – e.g. a cutout from the logo on the official NVIDIA webpage for Nsight Eclipse edition.
For my thesis I made drawings comparing conceptual differences of GPUs and CPUs. I didn’t like the ones which were floating through the interwebs, since most of them had bad quality or horrible colors.
The chosen font is Myriad Pro, the color scheme is blue / purple / green.
CPU Die Structure (Simplified)
GPU Die Structure (Simplified)
GPU Die Structure with Multiprocessors
A few schemes I’m not particularly proud of (and I did not use them anywhere). But, for completeness:
Multi GPU Scheme
Grid, Block, Thread
Does not really work, since the virtual entities of Grid, Block, and Thread do not map 1:1 to physical entities on the GPU… It was a try…
My name is Andreas and on this blog I’d like to report from my work accelerating various scientific applications on and with GPUs. I’m attached to the NVIDIA Application Lab of the Supercomputing Centre Jülich of Forschungszentrum Jülich.
This is my first job after getting my PhD in particle physics working on algorithm development for a new hadron physics experiment called PANDA. In my thesis, I already studied the impact of running accelerated code on GPUs – it’s quite fascinating.
On this blog, expect random thoughts on problems I solve and challenges I face. Also, it’s a directory of useful stuff I find and produce. There might also be some content on the actual application I’m working on. But let’s see.
So far, thanks for tuning in.