Editable Grid.. yay!

Editable grid 16x16x16

Finally, my grid is editable now.

We can edit or remove existing blocks – adding new ones will be introduced after the optimisation stage.
The current version supports 4 standard blocks (Blue, Green, Yellow, Brown) and 1 point light (Blue with a circle in the middle).

In this iteration of the “boxel engine”, the regenerateBuffers method updates, within the three-dimensional worldGrid table, the nearest block that collides with a ray fired from the camera's point of view. After that, the buffers that contain shadow and light powers are recalculated. Finally, two OpenGL functions are called for each buffer:

 glBindBuffer(GL_ARRAY_BUFFER, vertexbuffer[sectorID]);
 glBufferSubData(GL_ARRAY_BUFFER, 0, frame_vertices[sectorID].size()*sizeof(glm::vec3),  &frame_vertices[sectorID][0][0]);

First the buffer is bound, and then its values are updated, starting from index 0 and covering the whole buffer.
The optimisation I will be introducing soon is to update only the changed values, which will reduce the amount of data that has to be sent to the graphics card.
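
A minimal sketch of how such a partial update could be prepared, assuming the engine keeps the previous CPU-side copy of each sector around for comparison. The names Vec3, DirtyRange and findDirtyRange are hypothetical and not part of the engine; Vec3 stands in for glm::vec3 so that the snippet compiles without GLM or an OpenGL context. Only the glBufferSubData call in the trailing comment is real OpenGL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for glm::vec3 (assumption: three tightly packed floats).
struct Vec3 { float x, y, z; };

// Byte range of a buffer that has to be re-uploaded.
struct DirtyRange {
    std::size_t offsetBytes;
    std::size_t sizeBytes;   // 0 means nothing changed
};

static bool differs(const Vec3& a, const Vec3& b)
{
    return a.x != b.x || a.y != b.y || a.z != b.z;
}

// Hypothetical helper: returns the smallest contiguous byte range that
// covers every element that differs between the old and new copies.
DirtyRange findDirtyRange(const std::vector<Vec3>& before,
                          const std::vector<Vec3>& after)
{
    assert(before.size() == after.size());
    const std::size_t n = after.size();

    std::size_t first = 0;
    while (first < n && !differs(before[first], after[first])) ++first;
    if (first == n) return {0, 0};

    std::size_t last = n - 1;
    while (last > first && !differs(before[last], after[last])) --last;

    return {first * sizeof(Vec3), (last - first + 1) * sizeof(Vec3)};
}

// With a live OpenGL context the range would be uploaded like this:
//   glBindBuffer(GL_ARRAY_BUFFER, vertexbuffer[sectorID]);
//   glBufferSubData(GL_ARRAY_BUFFER, range.offsetBytes, range.sizeBytes,
//                   &frame_vertices[sectorID][range.offsetBytes / sizeof(glm::vec3)]);
```

A single contiguous range is the simplest choice; if the changed values are scattered far apart, several smaller glBufferSubData calls may send less data.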

Note:

Grid 64 recalculation takes a while :S

Metrics will be added soon.

100 will be about: Shadows and some point Light(s)

Shadows added - calculated on GPU but still works fine!

The shadow value for each cube is stored in a separate grid; the values are sent to the shader, which alters its output depending on the value.
The value is calculated from the number of obstacles on the way between the light and the current cube.
Below you can find some metrics:

Metrics for shadow added.
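
The obstacle-counting rule described above could be sketched roughly as follows. This is an illustration under assumptions, not the engine's code: the post does not say how the path between light and cube is walked or how the count is turned into a shadow value. Here the segment is sampled once per cell along its longest axis, and each obstacle darkens the cube by a fixed 25%. Grid, countObstacles and shadowValue are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

// Cubic occupancy grid: true means a block is present in that cell.
struct Grid {
    int size;
    std::vector<unsigned char> solid;

    explicit Grid(int n) : size(n), solid(static_cast<std::size_t>(n) * n * n, 0) {}

    std::size_t index(int x, int y, int z) const {
        assert(x >= 0 && y >= 0 && z >= 0 && x < size && y < size && z < size);
        return (static_cast<std::size_t>(z) * size + y) * size + x;
    }
    bool at(int x, int y, int z) const { return solid[index(x, y, z)] != 0; }
    void set(int x, int y, int z, bool v) { solid[index(x, y, z)] = v ? 1 : 0; }
};

// Counts solid cells strictly between the light cell and the target cell.
// One sample per step along the longest axis, so no cell is counted twice.
int countObstacles(const Grid& g, int lx, int ly, int lz, int cx, int cy, int cz)
{
    const int dx = cx - lx, dy = cy - ly, dz = cz - lz;
    const int steps = std::max({std::abs(dx), std::abs(dy), std::abs(dz)});
    int count = 0;
    for (int i = 1; i < steps; ++i) {
        const double t = static_cast<double>(i) / steps;
        const int x = lx + static_cast<int>(std::lround(dx * t));
        const int y = ly + static_cast<int>(std::lround(dy * t));
        const int z = lz + static_cast<int>(std::lround(dz * t));
        if (g.at(x, y, z)) ++count;
    }
    return count;
}

// Assumed mapping: 1.0 is fully lit, each obstacle removes 25% of the light.
float shadowValue(int obstacles)
{
    return std::max(0.0f, 1.0f - 0.25f * obstacles);
}
```

The values produced this way would fill the separate shadow grid, one entry per cube, before being sent to the shader.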

Earlier I introduced a point light to the engine; an example with one point light is shown below.

Point lights introduced.. performance drops.. but that's good.

Below: 510 point lights on a grid of 64x64x64 cubes.

Point lights on 64 grid - 36 FPS.. pretty Ok.

And its metrics..

512 Point Lights took 2 Friends Episodes to load in

Metrics for grid of 32x32x32 cubes, below:

Wait.. What?!?.. straight Line?

Why are the results for point lights almost the same?
All the calculations are precomputed on the CPU, so we are dealing with a fixed buffer of light values, which is fair enough. However, I need to optimise the loading process so that lights can be added in real time.
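
To illustrate why a precomputed buffer gives a flat line, here is a minimal sketch, assuming one light value per cube and a simple inverse-square style falloff. The attenuation formula and the names PointLight and precomputeLight are assumptions for illustration, not taken from the engine. The point is that the buffer size depends only on the grid, while the loading cost grows with cubes times lights.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct PointLight { float x, y, z, power; };

// Builds one light value per grid cell by summing every light's contribution.
// Cost at load time: n*n*n cells times lights.size() lights.
// Cost per frame afterwards: the same fixed-size buffer, whatever the light count.
std::vector<float> precomputeLight(int n, const std::vector<PointLight>& lights)
{
    assert(n > 0);
    std::vector<float> buffer(static_cast<std::size_t>(n) * n * n, 0.0f);
    for (int z = 0; z < n; ++z)
        for (int y = 0; y < n; ++y)
            for (int x = 0; x < n; ++x) {
                float sum = 0.0f;
                for (const PointLight& l : lights) {
                    const float dx = x - l.x, dy = y - l.y, dz = z - l.z;
                    const float d2 = dx * dx + dy * dy + dz * dz;
                    sum += l.power / (1.0f + d2);   // assumed falloff
                }
                buffer[(static_cast<std::size_t>(z) * n + y) * n + x] = sum;
            }
    return buffer;
}
```

With this layout, adding a light in real time would mean touching only the cells within that light's reach instead of rebuilding the whole buffer.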

Next step, editable terrain..

Finally.. textures with metrics

Textured 262144 cubes

This evening the code hunger got me badly.. tomorrow is the big launch of SW:TOR and, let's be honest about it, I will do no more honours-project-related work over this Christmas.

Anyway, in order to evaluate the software I have collected FPS and Memory Usage of the application.

Metrics for above project

Base metrics are like this:

Base Metrics

Funnily enough, while integrating textures I made some optimisations: the frame rate went up at a slight cost in memory.. I am happy, I can deal with that.

Using OffloadCL to compile C++ AMP code for OpenCL

C++ AMP (C++ Accelerated Massive Parallelism) is a GPGPU API (an STL-like library plus a small language extension) created by Microsoft on top of C++11; Microsoft's implementation runs on DirectX.

Lately, I had the pleasure of using an alternative tool-kit that is not limited to DirectX.
It is a technology that can be used on any device that can run OpenCL, and it not only reaches the performance offered by C++ AMP but, in some cases, even goes beyond it.

More information about the OffloadCL tool-kit. Note that the OffloadCL tool-kit does not implement C++ AMP; its flexible design makes it possible to get C++ AMP code working with it (see the example below).

The showcase I want to present in this post is a Binomial Option Pricing Model (BOPM). My objective was to “port” the code from this blog post so that it uses the OffloadCL tool-kit.

More information about BOPM.

After a few hours of setting up OffloadCL, and a few “why is that not working?” moments later, I was ready to start my first application using the OffloadCL tool-kit.. Well, almost..

I had no idea what methods and approaches the tool-kit uses (there is no official documentation yet), but a couple of ready-made examples and a header file came to the rescue!

A few minutes later I was good to go.

When I started reading the C++ AMP source code, it looked pretty usual: small functions that do the job, a few #defines, etc.

//----------------------------------------------------------------------------
// Sequential(CPU) binomial option calculation
//----------------------------------------------------------------------------
void binomial_options_cpu()
{
    const unsigned data_size = MAX_OPTIONS;

    // this is like GPU kernel - where we have the meat
    for (unsigned i = 0; i < data_size; i++)
    {
        float call[NUM_STEPS + 1];
 
        // Compute values at expiration date:
        // call option value at period end is V(T) = S(T) - X
        // if S(T) is greater than X, or zero otherwise.
        // The computation is similar for put options.
        for(int j = 0; j <= NUM_STEPS; j++)
           call[j] = expiry_call_value(V_S[i], V_X[i], V_VDT[i], j);
 
        // Walk backwards up binomial tree
        for(int j = NUM_STEPS; j > 0; j--)
            for(int k = 0; k <= j - 1; k++)
                call[k] = V_PU_BY_DF[i] * call[k + 1] + V_PD_BY_DF[i] * call[k];
 
        CALL_VALUE_CPU[i] = call[0];
    }
}

Code from BinomialOptions.cpp

The binomial_options_cpu code above is the CPU version of the binomial calculation, and it is fairly simple.
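
The listing relies on a helper (expiry_call_value) and per-option arrays (V_VDT, V_PU_BY_DF, V_PD_BY_DF) that are not shown in this post. As a rough, self-contained illustration of the same two loops for a single option, here is a sketch assuming the standard Cox-Ross-Rubinstein parameterisation. The function name binomial_call, its parameters, and the mapping of vDt, puByDf and pdByDf onto the original arrays are my reading of the names, not something the source confirms. It uses double rather than float.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Prices one European call option with a binomial tree (CRR parameters).
// s: spot price, x: strike, years: time to expiry,
// riskFree: annual risk-free rate, volatility: annual volatility.
double binomial_call(double s, double x, double years,
                     double riskFree, double volatility, int numSteps)
{
    assert(numSteps > 0);
    const double dt       = years / numSteps;
    const double vDt      = volatility * std::sqrt(dt);  // assumed role of V_VDT
    const double rDt      = riskFree * dt;
    const double growth   = std::exp(rDt);
    const double discount = std::exp(-rDt);
    const double u        = std::exp(vDt);               // up factor
    const double d        = std::exp(-vDt);              // down factor
    const double pu       = (growth - d) / (u - d);      // risk-neutral up probability
    const double pd       = 1.0 - pu;
    const double puByDf   = pu * discount;               // assumed role of V_PU_BY_DF
    const double pdByDf   = pd * discount;               // assumed role of V_PD_BY_DF

    // Values at expiration: max(S(T) - X, 0) for each terminal node.
    std::vector<double> call(numSteps + 1);
    for (int j = 0; j <= numSteps; ++j) {
        const double price = s * std::exp(vDt * (2.0 * j - numSteps));
        call[j] = std::max(price - x, 0.0);
    }

    // Walk backwards up the binomial tree.
    for (int j = numSteps; j > 0; --j)
        for (int k = 0; k <= j - 1; ++k)
            call[k] = puByDf * call[k + 1] + pdByDf * call[k];

    return call[0];
}
```

For example, binomial_call(100.0, 100.0, 1.0, 0.05, 0.2, 512) should land close to the Black-Scholes value for the same inputs, which is a handy sanity check when porting the kernel.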

However, the GPU version of this function is divided in two.

One method prepares all the buffers, and then, inside the parallel_for_each call, the kernel does all the calculations. The source code can be downloaded from here.

This is where things got complicated.. I did not have much experience with parallel programming, so I thought I would need to spend a lot of time on research and learning, but the OffloadCL tool-kit made my work very easy..

What I had to do was integrate the Offload compiler into the solution for the latest VS11; there will be proper integration in the future.

The next step was to write some code in that file; a small modification of the C++ code was enough to make the example work.

Get the source files for this post.

The OffloadCL tool-kit is still being improved, but I like the way it works at the moment: the code is C-like, simple and efficient.
I hope it will stay this way.

It was very easy and surprisingly straightforward to port C++ AMP code to OpenCL devices using the OffloadCL compiler. I am looking forward to developing more with it!

Graphics Programming wee update

Pause screen + Animation

Some progress on the Graphics Programming coursework: a pause screen and rotation. At the moment the vertex buffer is regenerated each frame, so I need to switch to GL_DYNAMIC_DRAW.