More Bi-Directional Path Tracing

It’s still not perfect, far from it in fact, but it’s progress nonetheless. I’ve been reading a lot lately about Metropolis Light Transport, Manifold Exploration, and Multiple Importance Sampling (they do love their M names), and it’s high time I started implementing some of them myself.

So it’s with great sadness that I am retiring my PRT project, which began over a year ago, all the way back at the start of my dissertation. PRT is written in Java, for simplicity, and was designed so that as I read new papers about more and more complex rendering techniques I could easily drop in a new class, add a call to the render loop, or even replace the main renderer altogether with an alternative algorithm which still called upon the original framework.

I added many features over time, from Ray Tracing, Photon Mapping, Phong and Blinn-Phong shading, DOF, Refraction, Glossy Surfaces, Texture Mapping, Spatial Trees, Meshes, Ambient-Occlusion, Area Lighting, Anti-Aliasing, Jitter Sampling, Adaptive Super-Sampling, Parallelization via both multi-threading and the GPU with OpenCL, and Path Tracing, all the way up to Bi-Directional Path Tracing.

But time has taken its toll, and too much has been added on top of what began as a very simple ray tracer. It’s time to start anew.

My plan for the new renderer is to build it entirely in C++, with the ability to easily add plugins over time like the original. Working in C++ has the nice benefit that, as time goes by, I can choose to dedicate some parts of the code to the GPU via CUDA or OpenCL without too much overhead or hassle. For now though, the plan is to rebuild the optimized maths library and get a generic framework for a renderer in place. Functioning renderers will then be built on top of the framework, each implementing different feature sets and algorithms.

Tags Bi-Directional Path Tracing, Concurrency, CUDA, Depth of Field, Global Illumination, GPGPU, Monte Carlo Integration, OpenCL, Path Tracing, Photon Mapping, Ray Tracing

3D Ray Casting

October 12, 2013 - 6:37 pm by Joss Whittle C/C++ GPGPU Graphics

Yup, the PhD is going that well…

I joke. So far it’s just been a lot of reading papers on graphics, most of which I do not understand. :(

Anyway, as a fun little side project I’ve been working on a 3D Ray Caster using my old favourites, OpenCL, OpenGL, and C++. It’s quite similar in concept to the renderer for my dissertation project last year but with a simplified rendering method and faster performance.

The goal for this project is to revisit Voxel Rendering which I played around with over the summer, and possibly to revisit game development with a new version of my Aliens First-Person Pacman game.

Currently the program has seven hardcoded Axis Aligned Bounding Boxes (AABBs) which it renders as the camera orbits around them. I’m working on a method to organise AABBs into a flat-packed Oct-Tree which can be passed to the GPU. Once this is working it should be trivial to construct an AABB Oct-Tree of the CT-scanned skull I used before, or to construct a simple game.
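A ray caster like this spends most of its time testing rays against boxes. The post does not show how that test is done, so the following is only a minimal sketch of the common "slab" method, assuming hypothetical `Vec3` and `AABB` types and a ray direction passed in as its per-component reciprocal. None of these names come from the project.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical types for illustration; not taken from the project.
struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

// Slab test: intersect the ray origin + t * dir with the pair of
// axis-aligned planes on each axis and keep the overlap of the three
// t-intervals. invDir holds 1 / dir per component (it may be +/-infinity
// for rays parallel to an axis). On a hit in front of the origin, the
// entry distance is written to tHit.
bool intersectAABB(const Vec3& origin, const Vec3& invDir,
                   const AABB& box, float& tHit) {
    assert(box.min.x <= box.max.x && box.min.y <= box.max.y &&
           box.min.z <= box.max.z);

    const float o[3]   = {origin.x, origin.y, origin.z};
    const float inv[3] = {invDir.x, invDir.y, invDir.z};
    const float lo[3]  = {box.min.x, box.min.y, box.min.z};
    const float hi[3]  = {box.max.x, box.max.y, box.max.z};

    float tMin = 0.0f;                               // only hits in front of the origin
    float tMax = std::numeric_limits<float>::max();

    for (int axis = 0; axis < 3; ++axis) {
        float t0 = (lo[axis] - o[axis]) * inv[axis];
        float t1 = (hi[axis] - o[axis]) * inv[axis];
        if (t0 > t1) std::swap(t0, t1);              // order the slab entry/exit
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMin > tMax) return false;               // intervals no longer overlap
    }
    tHit = tMin;
    return true;
}
```

An Oct-Tree traversal would call a test like this once per visited node, descending only into children whose boxes are hit.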

Another thing I may look into is modifying the Aliens game to have textured floors and ceilings using floor-casting and possibly to have maps with multiple vertical levels.

Tags OpenCL, OpenGL, Ray Casting

Sine Mesh using OpenCL + OpenGL Interop

July 6, 2013 - 11:17 am by Joss Whittle C/C++ GPGPU Graphics

I’ve been meaning to do it for a while but I have finally gotten around to making a Template for OpenCL & OpenGL Projects in Visual Studio 2012.

The host application is written in C++ using GLEW and freeGLUT to interface with OpenGL, and the AMD APP SDK to interface with OpenCL, as I am running an ATI 5850 GPU. Building upon the template is fast and easy and really comes down to just a few core functions which differ for each program. Below is a simplified version of the functions each program must implement.

/**
 * Setup additional OpenGL resources
 * Initialise Arrays, Buffers, VBO's, Textures, etc
 * Set default values, and the Global/Local OpenCL sizes
 */
void initResources(void);

/**
 * Cleanup CL & GL Resources
 * Free up any allocated memory or buffers
 */
void cleanupResources(void);

/**
 * Stage the OpenCL Kernel
 * Run directly before the kernel is executed
 */
void stageExecuteCL(void);

/**
 * Collate the OpenCL Kernel
 * Run directly after the kernel is executed
 */
void collateExecuteCL(void);

/**
 * Draw additional OpenGL data to the frame
 * Use this to draw data to the screen after the kernel has run
 */
void drawAdditional(void);

/**
 * Draw strings with OpenGL (Runtime Stats)
 * The last step in rendering, draw strings to screen
 */
void drawString(void);
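The comments on these hooks imply a per-frame call order: stage, run the kernel, collate, draw, then draw strings. The template's actual frame loop is not shown in the post, so this is only a sketch of that order. The `Hooks` struct, `runKernel`, and `renderFrame` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <functional>

// Hypothetical bundle of the per-program hooks; the real template
// declares them as free functions.
struct Hooks {
    std::function<void()> stageExecuteCL;
    std::function<void()> runKernel;         // stand-in for enqueueing the OpenCL kernel
    std::function<void()> collateExecuteCL;
    std::function<void()> drawAdditional;
    std::function<void()> drawString;
};

// One frame, in the order implied by the hook comments.
void renderFrame(const Hooks& h) {
    assert(h.stageExecuteCL && h.runKernel && h.collateExecuteCL &&
           h.drawAdditional && h.drawString);

    h.stageExecuteCL();    // directly before the kernel is executed
    h.runKernel();
    h.collateExecuteCL();  // directly after the kernel is executed
    h.drawAdditional();    // draw data once the kernel has run
    h.drawString();        // last step in rendering
}
```

In this picture, `initResources` would run once before the first frame and `cleanupResources` once on shutdown.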

Using this framework as a base to build on, I was able to quickly and easily get a demo running. The program in the video above and the images below generates a 1000 * 1000 (1 million) vertex triangular mesh and displays an exponential sine function over it. At each frame OpenCL computes the 3D location of all 1,000,000 points in the mesh, which OpenGL then renders. Due to the massively parallel nature of the OpenCL update code, frame rates exceeding 100 frames per second are achieved! (In the video Fraps capped the frame rate at 30fps for smooth capture.)

Below is a simplified version of the SineMesh Demo. Most of the work is performed in the initialization phase, which creates the 1000 * 1000 mesh and links the triangles together. The program could be made a lot more efficient if the triangle linking phase created a Triangle Strip as opposed to the current Triangle List. A strip chains touching triangles together so that neighbouring triangles share indices, which reduces the number of vertex indices that need to be stored in the linking array.
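To make the strip idea concrete, here is a minimal sketch of one way to build strip indices for an N x N grid, joining consecutive rows with two repeated (degenerate) indices. It is not the demo's code and `buildStripIndices` is a hypothetical name. For N = 1000 a list needs 999 * 999 * 2 * 3 = 5,988,006 indices, while this strip needs 1,999,996.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build indices for a single triangle strip covering an n x n vertex grid
// stored row-major (vertex (x, y) has index y * n + x). Rows of quads are
// joined by repeating the last index of one row and the first index of the
// next, which produces zero-area triangles that are not rasterised.
std::vector<std::uint32_t> buildStripIndices(std::uint32_t n) {
    assert(n >= 2);

    std::vector<std::uint32_t> idx;
    idx.reserve(static_cast<std::size_t>(n - 1) * 2 * n +
                static_cast<std::size_t>(n - 2) * 2);

    for (std::uint32_t y = 0; y + 1 < n; ++y) {
        if (y > 0) idx.push_back(y * n);           // repeat first vertex of this row
        for (std::uint32_t x = 0; x < n; ++x) {
            idx.push_back(y * n + x);              // vertex on the current row
            idx.push_back((y + 1) * n + x);        // vertex on the row below
        }
        if (y + 2 < n) idx.push_back((y + 1) * n + (n - 1));  // repeat last vertex
    }
    return idx;
}
```

An index array like this would be uploaded in place of the triangle list and drawn with `GL_TRIANGLE_STRIP` and a matching index count.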

The OpenCL Kernel

// For code readability, a colour is still just a float4 vector
typedef float4 colour;

/**
 * Manipulate an OpenGL Mesh stored in a VBO to simulate a mathematical function
 *
 * @param pos  The VBO of vertex positions
 * @param col  The VBO of vertex colours
 * @param attr An array of attributes about the rendering
 */
__kernel void sineMesh(
    // read_only / write_only are image access qualifiers in OpenCL C,
    // so the buffers are plain __global pointers (attr is marked const)
    __global float *pos,
    __global float *col,
    __global const float *attr
) {
    // All declarations first
    int index, index3, index4;
    float frame, grid, x, y, r, a, b, result;

    // What node are we in the calculation?
    index = get_global_id(0);
    index3 = index * 3;
    index4 = index * 4;

    // Load the animation values into private variables
    frame = attr[0];
    grid = attr[1];
    a = attr[2];
    b = attr[3];

    // Where are we on the grid?
    x = pos[index3];
    y = pos[index3 + 1];

    // Compute Function relative to our location from (0,0) on the grid
    // Makes the function radial around the origin
    r = native_sqrt((x * x) + (y * y));
    result = native_exp(-r * a) * native_sin((b * r) - frame);

    // Update the Z height with the result
    pos[index3 + 2] = result * 2.0f; // Result Height

    // Fade from Green through Blue to Red based on the current height
    if (result < 0.0f) {
        result = clamp(-result, 0.0f, 1.0f);
        col[index4] = 0.0f;
        col[index4 + 1] = result;
    } else {
        result = clamp(result, 0.0f, 1.0f);
        col[index4] = result;
        col[index4 + 1] = 0.0f;
    }
    col[index4 + 2] = 1.0f - result;
}
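When debugging a kernel like this it can help to have the same maths on the CPU to compare against a few vertices. The following is a sketch of such a reference, assuming the hypothetical names `VertexResult` and `sineMeshReference`. It uses the standard library functions rather than the `native_` variants, so results may differ slightly from the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical CPU-side mirror of the per-vertex maths in the kernel.
struct VertexResult {
    float z;        // height written to pos[index3 + 2]
    float r, g, b;  // colour written to col[index4 .. index4 + 2]
};

VertexResult sineMeshReference(float x, float y, float a, float b, float frame) {
    assert(std::isfinite(x) && std::isfinite(y));

    // Radial distance from the origin, then a decaying sine wave over it
    const float radius = std::sqrt(x * x + y * y);
    const float result = std::exp(-radius * a) * std::sin(b * radius - frame);

    VertexResult out;
    out.z = result * 2.0f;

    // Magnitude of the wave, clamped to [0, 1], drives the colour fade
    const float m = std::min(std::max(std::fabs(result), 0.0f), 1.0f);
    if (result < 0.0f) {
        out.r = 0.0f;  // troughs fade towards green
        out.g = m;
    } else {
        out.r = m;     // peaks fade towards red
        out.g = 0.0f;
    }
    out.b = 1.0f - m;  // flat regions stay blue
    return out;
}
```

With the demo's defaults (a = 0.75, b = 20) and frame = 0, the vertex at the origin comes out flat and pure blue, which matches the constant start colour in the host program.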

The (simplified) Host Program

// How big a mesh do we want N x N?
const int GRID_SIZE = 1000;

// Animation Timing
long m_time = 0, p_time = 0;

// OpenCL/GL Render Texture Buffers
GLuint m_vboTri, m_vboMesh, m_vboColour;
vector<Memory> m_sharedCL;

// Local Arrays for Initializing
float *m_mesh, *m_colour;
int *m_tri;

// Interactive values for Animation and Function Speed
float m_rot = 0.0f, m_step = 0.001f, m_rotSpeed = 0.01f;

// Local Array for sending attributes to kernel along with data
CLArray<float> m_attr;

// How should OpenGL render the mesh?
int m_drawMode = GL_TRIANGLES;

/**
 * Setup additional OpenGL resources
 * Initialise Arrays, Buffers, VBO's, Textures, etc
 * Set default values, and the Global/Local OpenCL sizes
 */
void initResources(void) {
    // Set Kernel Params
    m_global = NDRange(GRID_SIZE * GRID_SIZE);
    m_local = NullRange;

    // Number of Vertices
    int GRID2 = GRID_SIZE * GRID_SIZE;

    // Number of Triangles in an N * N Grid
    int TRI_COUNT = ((GRID_SIZE - 1) * (GRID_SIZE - 1) * 2);

    // Build Local Arrays
    m_mesh = new float[GRID2 * 3];
    m_colour = new float[GRID2 * 4];
    m_tri = new int[TRI_COUNT * 3];

    // Fill Vertex and Colour Array with defaults
    for (int y = 0; y < GRID_SIZE; y++) {
        for (int x = 0; x < GRID_SIZE; x++) {
            // Array Indexes for Insertion
            int index3 = 3 * x + 3 * y * GRID_SIZE;
            int index4 = 4 * x + 4 * y * GRID_SIZE;

            m_mesh[index3]     = (((float)x / (float)GRID_SIZE) - 0.5f) * 4.0f; // Mesh X
            m_mesh[index3 + 1] = (((float)y / (float)GRID_SIZE) - 0.5f) * 4.0f; // Mesh Y
            m_mesh[index3 + 2] = 0.0f;                                          // Mesh Z

            m_colour[index4]     = 0.0f; // Colour R
            m_colour[index4 + 1] = 0.0f; // Colour G
            m_colour[index4 + 2] = 1.0f; // Colour B <- Constant Start Colour
            m_colour[index4 + 3] = 1.0f; // Colour A
        }
    }

    // Connect up the vertices into the triangle array
    int tri = 0;
    for (int y = 0; y < GRID_SIZE - 1; y++) {
        for (int x = 0; x < GRID_SIZE - 1; x++) {
            // Array Indexes for Insertion
            int index3   = (x)     + (y * GRID_SIZE);
            int index3E  = (x + 1) + (y * GRID_SIZE);
            int index3S  = (x)     + ((y + 1) * GRID_SIZE);
            int index3SE = (x + 1) + ((y + 1) * GRID_SIZE);

            // For each Vertex 0 -> N-1 Add in the two triangles along its SE diagonal
            m_tri[tri]     = index3;
            m_tri[tri + 1] = index3E;
            m_tri[tri + 2] = index3SE;
            tri += 3;

            m_tri[tri]     = index3SE;
            m_tri[tri + 1] = index3S;
            m_tri[tri + 2] = index3;
            tri += 3;
        }
    }

    // Create GL Buffers
    // Triangle Buffer
    glGenBuffers(1, &m_vboTri);
    glBindBuffer(GL_ARRAY_BUFFER, m_vboTri);
    glBufferData(GL_ARRAY_BUFFER, TRI_COUNT * 3 * sizeof(int), m_tri, GL_STATIC_DRAW);

    // Vertex Buffer (Hook with OpenCL)
    glGenBuffers(1, &m_vboMesh);
    glBindBuffer(GL_ARRAY_BUFFER, m_vboMesh);
    glBufferData(GL_ARRAY_BUFFER, GRID2 * 3 * sizeof(float), m_mesh, GL_STREAM_DRAW);
    m_sharedCL.push_back(cl::BufferGL(m_context, CL_MEM_READ_WRITE, m_vboMesh, &err));
    m_kernel.setArg(0, m_sharedCL[0]);

    // Colour Buffer (Hook with OpenCL)
    glGenBuffers(1, &m_vboColour);
    glBindBuffer(GL_ARRAY_BUFFER, m_vboColour);
    glBufferData(GL_ARRAY_BUFFER, GRID2 * 4 * sizeof(float), m_colour, GL_STREAM_DRAW);
    m_sharedCL.push_back(cl::BufferGL(m_context, CL_MEM_WRITE_ONLY, m_vboColour, &err));
    m_kernel.setArg(1, m_sharedCL[1]);

    // OpenGL Render Settings
    glColorMaterial(GL_FRONT_AND_BACK, GL_AMBIENT_AND_DIFFUSE);
    glEnable(GL_COLOR_MATERIAL);
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

    // Set up attribute array
    m_attr.init(4, m_context, CL_MEM_READ_ONLY);
    m_attr.setAsArg(m_kernel, 2);
    m_attr[0] = 0.0f;             // Frame
    m_attr[1] = (float)GRID_SIZE; // Grid Size
    m_attr[2] = 0.75f;            // A (Interactive Variable, modifies exponential function)
    m_attr[3] = 20.0f;            // B (Interactive Variable, modifies sine function)
}

/**
 * Cleanup CL & GL Resources
 * Free up any allocated memory or buffers
 */
void cleanupResources(void) {
    delete[] m_mesh;
    delete[] m_colour;
    delete[] m_tri;
    m_attr.release();
}

/**
 * Stage the OpenCL Kernel
 * Run directly before the kernel is executed
 */
void stageExecuteCL(void) {
    // Work out timing for animation
    p_time = m_time;
    m_time = glutGet(GLUT_ELAPSED_TIME);
    float time = (float)(m_time - p_time);

    // Acquire OpenGL VBO's
    m_queue.enqueueAcquireGLObjects(&m_sharedCL);

    // Update animation and camera rotation
    m_rot += time * m_rotSpeed;
    m_attr[0] += time * m_step;
    m_attr.writeBuffer(m_queue);
}

/**
 * Collate the OpenCL Kernel
 * Run directly after the kernel is executed
 */
void collateExecuteCL(void) {
    m_queue.enqueueReleaseGLObjects(&m_sharedCL);
}

/**
 * Draw additional OpenGL data to the frame
 * Use this to draw data to the screen after the kernel has run
 */
void drawAdditional(void) {
    glShadeModel(GL_SMOOTH);

    // Place the camera
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glRotatef(-40, 1.0f, 0, 0);
    glRotatef(m_rot, 0, 0, 1.0f);

    // Load the colour, mesh, and triangle buffers
    glBindBuffer(GL_ARRAY_BUFFER, m_vboColour);
    glColorPointer(4, GL_FLOAT, 0, 0);
    glBindBuffer(GL_ARRAY_BUFFER, m_vboMesh);
    glVertexPointer(3, GL_FLOAT, 0, 0);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, m_vboTri);

    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);

    // Render the Mesh
    glDrawElements(m_drawMode, ((GRID_SIZE - 1) * (GRID_SIZE - 1) * 2) * 3, GL_UNSIGNED_INT, 0);

    glDisableClientState(GL_VERTEX_ARRAY);
    glDisableClientState(GL_COLOR_ARRAY);
}

/**
 * Draw strings with OpenGL (Runtime Stats)
 * The last step in rendering, draw strings to screen
 */
void drawString(void) {
    glColor3f(1.f, 1.f, 1.f);
    glPrintString(5, HEIGHT-35, std::string("Joss Whittle"));
    // ... etc
    // Print stats and such
}

Tags OpenCL, OpenGL

Dissertation is done!

May 8, 2013 - 5:55 pm by Joss Whittle C/C++ Dissertation GPGPU Graphics Java University

Rejoice! For after one hell of a long year, the dissertation is finally done, dusted, and thankfully handed in!

This was the final result, a real-time path tracer written in C using OpenCL to compute frames and OpenGL to render them. Here is a simple Cornell box style scene that was left to converge for a couple of minutes.

My project fair demo. My graders seemed to like it (I bagged 92% for the Viva!) and so did the PhD students and my coursemates… Not so much praise from the school kids; the phrase “Realistic? Looks nothing like Call of Duty” was used…. First time I’ve ever wanted to smack a child (jokes), but what can you do. The tech industry people didn’t seem to care for it either, which was rather miserable because I had to stand there for 5 hours in a boiling hot room while no one wanted to know about my work.

But I can’t really complain about not getting any job offers because… I got offered a ~~summer research position~~ unconditional fully funded PhD Studentship at the uni! So this summer I’ll be staying in Swansea to effectively continue this project with the goal of getting Path Tracing working much much faster and on mobile devices (tablets most likely).

Here’s an album showing the progress from start to finish throughout the year.