It’s still not perfect, far from it in fact, but it’s progress nonetheless. I’ve been reading a lot lately about Metropolis Light Transport, Manifold Exploration and Multiple Importance Sampling (they do love their M names), and it’s high time I started implementing some of them myself.
So it’s with great sadness that I am retiring my PRT project, which began over a year ago, all the way back at the start of my dissertation. PRT is written in Java, for simplicity, and was designed in such a way that as I read new papers about more and more complex rendering techniques I could easily drop in a new class, add a call to the render loop, or even replace the main renderer altogether with an alternative algorithm which still called upon the original framework.
I added many features over time, from Ray Tracing, Photon Mapping, Phong and Blinn-Phong shading, DOF, Refraction, Glossy Surfaces, Texture Mapping, Spatial Trees, Meshes, Ambient Occlusion, Area Lighting, Anti-Aliasing, Jitter Sampling, Adaptive Super-Sampling, Parallelization via both multi-threading and the GPU with OpenCL, and Path Tracing, all the way up to Bi-Directional Path Tracing.
But time has taken its toll and too much has been added on top of what began as a very simple ray tracer. It’s time to start anew.
My plan for the new renderer is to build it entirely in C++ with the ability to easily add plugins over time, like the original. Working in C++ gives a nice benefit: as time goes by I can choose to dedicate some parts of the code to the GPU via CUDA or OpenCL without too much overhead or hassle. For now though the plan is to rebuild the optimized maths library and get a generic framework for a renderer in place. Functioning renderers will then be built on top of the framework, each implementing different feature sets and algorithms.
I joke. So far it’s just been a lot of reading papers on graphics, most of which I do not understand. :(
Anyway, as a fun little side project I’ve been working on a 3D Ray Caster using my old favourites, OpenCL, OpenGL, and C++. It’s quite similar in concept to the renderer for my dissertation project last year but with a simplified rendering method and faster performance.
Currently the program has seven hardcoded Axis Aligned Bounding Boxes (AABBs) which it renders as the camera orbits around them. I’m working on a method to organise AABBs into a flat-packed Oct-Tree which can be passed to the GPU. Once this is working it should be trivial to construct an AABB Oct-Tree of the CT-scanned skull I used before, or to construct a simple game.
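For anyone curious what casting a ray against an AABB involves, here is a minimal C++ sketch of the common "slab" test. This is an illustration only, not the actual code from the ray caster: the names Vec3, AABB and intersectAABB are made up for this example, and the real version runs in an OpenCL kernel rather than on the CPU.

```cpp
#include <algorithm>
#include <limits>
#include <utility>

// Hypothetical types for illustration; not taken from the actual project.
struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

// Slab test: clip the ray against the pair of planes on each axis and
// keep the overlapping parameter interval [tMin, tMax].
// Returns true on a hit and writes the entry distance to tNear.
bool intersectAABB(const Vec3& origin, const Vec3& dir, const AABB& box, float& tNear) {
    const float o[3]  = {origin.x, origin.y, origin.z};
    const float d[3]  = {dir.x, dir.y, dir.z};
    const float lo[3] = {box.min.x, box.min.y, box.min.z};
    const float hi[3] = {box.max.x, box.max.y, box.max.z};

    float tMin = 0.0f;  // only accept hits in front of the origin
    float tMax = std::numeric_limits<float>::infinity();

    for (int axis = 0; axis < 3; ++axis) {
        if (d[axis] == 0.0f) {
            // Ray is parallel to this slab: it misses unless it starts inside it.
            if (o[axis] < lo[axis] || o[axis] > hi[axis]) return false;
            continue;
        }
        const float inv = 1.0f / d[axis];
        float t0 = (lo[axis] - o[axis]) * inv;
        float t1 = (hi[axis] - o[axis]) * inv;
        if (t0 > t1) std::swap(t0, t1);
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMin > tMax) return false;  // intervals no longer overlap
    }
    tNear = tMin;
    return true;
}
```

A ray starting at (0, 0, -5) and pointing along +z enters a box spanning -1 to 1 on every axis at distance 4. An Oct-Tree traversal would run this same test on each node's bounds before descending into its children.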
Another thing I may look into is modifying the Aliens game to have textured floors and ceilings using floor-casting and possibly to have maps with multiple vertical levels.
A note about the growing trend towards GPGPU for the masses. This is my response to a Reddit post I saw about a new GPU language; I felt I should copy it here.
As someone who does a lot of development using the GPU, a new language is the last thing I want. Programming for the GPU is complicated, it just is, and it should be because what you are running your program on is a very complicated piece of hardware. You have to treat it right and you have to structure your programs and algorithms in a specific way which is not common to other architectures.
All these attempts lately to make GPU programming easy and doable for everyone (the LISP and Haskell libraries come to mind) completely miss the mark. They work under the premise that if you make it easier, everyone can make everything GPU accelerated, and that that will be better. It won’t.
Half the problem with current libraries that make CPU concurrency easier is that people start parallelizing too early. They don’t fully and completely optimize something they already have, they don’t go in and profile it, they don’t notice long system calls and work in a bit of assembly which will reduce the latency. No. Instead they will just chuck some threads in there, because threads make things faster… It’s just not true.
This problem is even more volatile on the GPU. It’s a delicate balance and not every job is suited to the GPU, just as not every job is suited to multi-threading on the CPU. Giving people (the uninformed people at least) the power of the GPU for every conceivable task is just daft.
When you structure programs for the GPU you need to have full and complete control over everything it does and when it does it. Languages like OpenCL and CUDA may be complicated but your kernels do what they say they do. It’s exactly why writing good C is complicated because you are right on the hardware level with very low abstraction. OpenCL and CUDA don’t try to optimize what you wrote (past a few compile time optimizations which are to be expected) they translate your commands onto the hardware nearly directly. The downside of that is it means you need to fully and completely understand your algorithm and how the hardware will react to each stage of it, the benefit is incredible performance and massively parallel execution.
TL;DR: GPU programming is hard for a reason and giving everyone an easy way to do it completely misses the point. It’s like trying to make everyone’s car into a supercar by handing out nitrous injectors.
I’ve been meaning to do it for a while but I have finally gotten around to making a Template for OpenCL & OpenGL Projects in Visual Studio 2012.
The host application is written in C++ using GLEW and freeGLUT to interface with OpenGL, and the AMD APP SDK to interface with OpenCL, as I am running an ATI 5850 GPU. Building upon the template is fast and easy and really comes down to just a few core functions which differ for each program. Below is a simplified version of the functions each program must implement.
Using this frame as a base to build off of I was able to quickly and easily get a demo running. The program in the video above and the images below generates a 1000 * 1000 (1 million) vertex triangular mesh and displays an exponential sine function over it. At each frame OpenCL computes the 3D location of all 1,000,000 points in the mesh, which OpenGL then renders. Due to the massively parallel nature of the OpenCL update code, frame rates exceeding 100 frames per second are achieved! (In the video Fraps capped the frame rate at 30fps for smooth capture.)
Below is a simplified version of the SineMesh Demo. Most of the work is performed in the initialization phase to create the 1000 * 1000 mesh and link the triangles together. The program could be made a lot more efficient if the triangle linking phase created a Triangle Strip as opposed to the current Triangle List. A strip chains touching triangles together so that each new triangle reuses the previous two vertices, which cuts the number of entries that need to be stored in the linking array.
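To put rough numbers on the list versus strip difference, here is a small C++ sketch that builds both index arrays for a grid of vertices. It is not the demo's actual linking code: the function names buildTriangleList and buildTriangleStrip are made up for illustration, the grid is assumed to be stored row by row, and the strip joins rows with degenerate (zero-area) triangles, which is only one of several ways to do it.

```cpp
#include <cstdint>
#include <vector>

// Index array for a width x height vertex grid drawn as a triangle list:
// two triangles per grid cell, three indices per triangle.
std::vector<std::uint32_t> buildTriangleList(std::uint32_t width, std::uint32_t height) {
    std::vector<std::uint32_t> idx;
    if (width < 2 || height < 2) return idx;
    for (std::uint32_t y = 0; y + 1 < height; ++y) {
        for (std::uint32_t x = 0; x + 1 < width; ++x) {
            const std::uint32_t i0 = y * width + x;  // top-left of the cell
            const std::uint32_t i1 = i0 + 1;         // top-right
            const std::uint32_t i2 = i0 + width;     // bottom-left
            const std::uint32_t i3 = i2 + 1;         // bottom-right
            idx.insert(idx.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return idx;
}

// Same grid as a single triangle strip. Each row of cells becomes one strip
// of 2 * width indices; consecutive rows are stitched together by repeating
// two indices, which produces degenerate triangles the GPU discards.
std::vector<std::uint32_t> buildTriangleStrip(std::uint32_t width, std::uint32_t height) {
    std::vector<std::uint32_t> idx;
    if (width < 2 || height < 2) return idx;
    for (std::uint32_t y = 0; y + 1 < height; ++y) {
        if (y > 0) idx.push_back(y * width);  // repeat first vertex of this row
        for (std::uint32_t x = 0; x < width; ++x) {
            idx.push_back(y * width + x);        // vertex on the upper row
            idx.push_back((y + 1) * width + x);  // vertex on the lower row
        }
        if (y + 2 < height) idx.push_back((y + 1) * width + (width - 1));  // repeat last vertex
    }
    return idx;
}
```

For the 1000 * 1000 mesh this gives 5,988,006 indices as a list against 1,999,996 as a strip, so the linking array shrinks to roughly a third of its size. With OpenGL the strip version would be drawn with GL_TRIANGLE_STRIP instead of GL_TRIANGLES.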
Rejoice! For after one hell of a long year the dissertation is finally done, dusted and thankfully handed in!
This was the final result, a real-time path tracer written in C using OpenCL to compute frames and OpenGL to render them. Here is a simple Cornell box style scene that was left to converge for a couple of minutes.
My project fair demo. My graders seemed to like it (I bagged 92% for the Viva!) and so did the PhD students and my coursemates… Not so much praise from the school kids; the phrase “Realistic? Looks nothing like Call of Duty” was used… First time I’ve ever wanted to smack a child (jokes), but what can you do. The tech industry people didn’t seem to care for it either, which was rather miserable because I had to stand there for 5 hours in a boiling hot room while no one wanted to know about my work.
But I can’t really complain about not getting any job offers because… I got offered a summer research position and an unconditional, fully funded PhD Studentship at the uni! So this summer I’ll be staying in Swansea to effectively continue this project with the goal of getting Path Tracing working much, much faster and on mobile devices (tablets most likely).
Here’s an album showing the progress from start to finish throughout the year.