This is the first part of a series of articles I’m planning to write about my current side project, which I’m calling “Kami Editor” (kami means god or deity in Japanese; I’m bad with names, don’t judge me). This is more of an intro article, so I won’t get too technical; I’ll just write about my goals and motivation, my current progress and my future plans for this project. I will go through the design of my engine and editor and throw some random thoughts here and there.

Before going any further, I need to talk a bit about how the idea for this project came to be. I did my first procedural content generation project about two years ago, during my postgraduate degree, while working on my thesis, where I decided that apart from engines and fancy graphics, I liked procedural generation too. The project was a small procedurally generated voxel terrain, polygonised with the marching cubes algorithm (after GPU Gems 3) and rendered with OpenGL. My goal was to procedurally model some naturally occurring structures like overhangs, rocks and arches using metaballs (blobs) and 3D noise. This way I could roughly define the shape of a structure using an implicit surface and add a natural, organic look using several layers of noise.

Once I finished my thesis, I wanted to spend some time turning the project into a “proper” editor, mainly for convenience. For the UI I was using AntTweakBar at the time, which is a nice little GUI library for OpenGL and DirectX, but it’s not what you want to be using when making an editor. I wanted transformation tools to easily move the metaballs around, widgets for easier control over the noise, and basically all the conveniences you can find in modern editors. I wanted something like Unity3D, but for procedural stuff.
Then I started thinking about how nice it would be to have a system where I could experiment with procedural plants or architecture, where I could start with a few building blocks and combine and tweak them to create a variety of worlds, hence the name Kami (it makes sense now, doesn’t it?). Before I go over all the different parts of the project, here is a screenshot of my editor rendering a volume of 3D Perlin noise (generated and polygonised entirely on the GPU with OpenCL) and another one rendering meshes imported from files through Assimp.
First of all, I needed an engine to do my rendering and manage my resources, scene objects, etc. As a programmer I like to torture myself a little, so I decided to write my own little engine; it is, after all, a nice way to challenge yourself and expand your knowledge. I could have used Ogre3D, which is one of my favourite rendering engines since I learned so much just by going through its code, but again I wanted the challenge and flexibility of making my own thing. For the rendering I decided to go with an OpenGL implementation first, since that is what I’ve used most recently, but I’m designing the engine to allow for a Direct3D 11/12 implementation in the future. I’ve worked with both OpenGL and Direct3D 10 in the past, in small projects, but this is the first time I’m trying to support both. The general idea behind the engine is to split it into specific modules, each designed to do one very specific job and communicating only with the parts it needs to do that job. Currently I’ve got a massive project which only gets bigger, and a terrible resource management system; however, I’m in the process of refactoring parts of it to fit a proper design. I’ll try to go over the ideas currently in my mind in a little more detail.
Here I will have everything related to the operating system: content and file system, high-precision timing, string utilities, and other systems whose role is to provide support and a layer of abstraction between the platform and the engine. For example, to create a window to render to, to load from or save to a file, or to load a mesh or an image, we will call functions from here. This is done so I can easily port my engine to another operating system if I ever decide to.
This is the engine’s namespace currently, but anything related to the engine itself will need to be in here. The engine will basically run in a loop until we stop it, managing the various subsystems. The entry point for the engine is a base KamiApp class which we need to override to implement our desired functionality. Initializing the engine is as simple as creating a new instance of our KamiApp and calling Initialise() and Run(), followed by Shutdown(). One of the things I’ve been over-thinking for the past months is resource management, since I consider it quite an important part and wanted to get it right from the beginning. The way I’m currently thinking about my resources is by categorizing them into “higher” (engine) resources and “lower” (render/GPGPU) resources. My goal is to be able to do most things by dealing with the higher-level resources in the editor, and for them to be agnostic of the underlying render and GPGPU implementations. I consider the mesh (an object’s overall geometry), the submesh (a part of that geometry) and the material (which describes how this geometry will be rendered) as “higher”-level resources, and I would like to deal mostly with these in the editor. So if, for example, I want to render a cube, I will only need a scene node with a mesh attached and at least one material instance assigned.
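To make the Initialise()/Run()/Shutdown() lifecycle concrete, here is a minimal plain C++ sketch of how such a KamiApp base class could look. The engine internals are stubbed out, and everything beyond the three lifecycle method names (OnInitialise, OnUpdate, RequestExit, etc.) is an assumption made for illustration, not the project’s actual API:

```cpp
// Hypothetical sketch of the KamiApp lifecycle described above.
class KamiApp {
public:
    virtual ~KamiApp() = default;
    bool Initialise() { m_running = OnInitialise(); return m_running; }
    void Run() {
        while (m_running) {
            OnUpdate();            // update subsystems, scene, rendering...
        }
    }
    void Shutdown() { OnShutdown(); }

protected:
    virtual bool OnInitialise() = 0;
    virtual void OnUpdate() = 0;   // expected to call RequestExit() eventually
    virtual void OnShutdown() {}
    void RequestExit() { m_running = false; }

private:
    bool m_running = false;
};

// A trivial derived app that exits after a fixed number of frames.
class DemoApp : public KamiApp {
public:
    int frames = 0;
protected:
    bool OnInitialise() override { return true; }
    void OnUpdate() override { if (++frames >= 3) RequestExit(); }
};
```

Usage is then exactly the three calls the article mentions: create a DemoApp, then Initialise(), Run(), Shutdown().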
If you take a look at the class diagram above you’ll see that the material uses an effect and texture resources as well. The effect is not an engine resource but a render system one. This is because an effect is basically a collection of vertex, pixel and geometry shaders, all bound together by techniques and passes. Direct3D gives us this functionality through the HLSL effects framework. The same goes for NVIDIA’s CgFX, though that is no longer actively developed. GLSL does not provide something similar as far as I know, so I have yet to decide whether I’ll go with a third-party solution or make my own shader-effect system. Anyway, the reason I consider the effect a render resource is that it should provide a way to set the shader uniform variables, for example, and in order to do this, whether by setting the variables one by one or by using constant or uniform buffers, you need to know about the underlying API.
The texture resource, on the other hand, I consider an engine resource, simply because it’s not the actual texture but more of a description of where to find it. Let’s look at this in more detail: it is possible to have one big texture buffer in GPU memory (VRAM) and store many smaller textures inside it. This is known as texture packing, or a texture atlas, and we do it to minimize state changes in the renderer in order to speed things up. So my texture resource will hold a handle to the lower-level, bigger texture resource that keeps the actual data, plus a description of how to get the part we want (e.g. start x, y and end x, y). The same idea applies to the submesh: it just holds handles to the render resources where our vertex and index data reside, along with a description of where exactly in the bigger buffer that data is (e.g. start index, number of elements, etc.). Additionally, the submesh contains a VertexLayout, which basically tells us how to pass the vertex data to the input stage of the graphics pipeline (interleaved or separate buffers, and the various vertex attributes such as positions, texture coordinates or normals).
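The “handle plus description” idea could look something like the following sketch. The struct and field names are hypothetical, chosen only to illustrate that the engine-level resource stores no pixel or vertex data itself, just a handle into a shared lower-level resource and the sub-range it occupies:

```cpp
#include <cstddef>
#include <cstdint>

using Handle = std::uint32_t;  // reference handle into a resource pool

// An engine-level texture resource: not the pixels themselves, just a
// handle to the big atlas texture plus the sub-rectangle we want.
struct TextureRegion {
    Handle atlas;              // lower-level hardware texture holding the data
    int x0, y0, x1, y1;        // sub-rectangle inside the atlas, in texels
    int Width()  const { return x1 - x0; }
    int Height() const { return y1 - y0; }
};

// Likewise, a submesh only records where its geometry lives inside the
// shared vertex/index buffers.
struct SubMeshRange {
    Handle vertexBuffer;       // shared hardware vertex buffer
    Handle indexBuffer;        // shared hardware index buffer
    std::size_t firstIndex;    // offset into the shared index buffer
    std::size_t indexCount;    // number of elements to draw
};
```

The VertexLayout mentioned above would sit alongside SubMeshRange, describing how those bytes map to pipeline inputs.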
Looking at the mesh you will see that it uses a list of submeshes and materials. The mesh contains the overall geometry of an object, along with the materials assigned to its submeshes, skeletons for animation, etc. For now I’m going to stick with a simple representation where the mesh is composed only of submeshes and materials, and where a submesh can only have one material assigned. When we want to render a mesh, we don’t send submeshes or materials to the renderer; instead we issue render commands informing the render subsystem that we want, for example, to render from the bigger buffer “A” the geometry that starts at point “X” and contains “N” elements. This is done because I want to try some speed-ups in the render subsystem, for instance by sorting the bind and draw calls. So if I want to render many parts of buffer “A” that use the same shaders, I would like to bind the buffer just once; if I want to use multiple parts of texture “A”, I would like to bind it only once as well.
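The sorting idea above is commonly done by packing the expensive state (shader, then buffer) into a sort key and ordering commands by it, so that draws sharing state become adjacent and each bind happens once. Here is a small sketch under that assumption; the field layout and key packing are illustrative, not the engine’s actual render command format:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// A render command referencing a range inside a big shared buffer.
struct RenderCommand {
    std::uint16_t shader;        // effect/program id
    std::uint16_t buffer;        // shared vertex buffer id
    std::uint32_t first, count;  // range to draw from that buffer

    // Pack shader then buffer into one sortable key: shader switches are
    // usually the most expensive state change, so they get the high bits.
    std::uint32_t SortKey() const {
        return (std::uint32_t(shader) << 16) | buffer;
    }
};

// After sorting, commands with the same shader and buffer are adjacent,
// so the renderer can bind each one only once per run.
inline void SortCommands(std::vector<RenderCommand>& cmds) {
    std::stable_sort(cmds.begin(), cmds.end(),
        [](const RenderCommand& a, const RenderCommand& b) {
            return a.SortKey() < b.SortKey();
        });
}
```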
Finally, I’m also trying to take into account the procedural generation part of the project, which is the reason I’m doing all this (in case you forgot). Ideally, I would like the procedural generation system not to deal with the render and GPGPU subsystems directly, but to return higher-level resources or even scene nodes. If, for example, I wanted to create a procedural mesh, I would like to attach a script to a scene node which would create an empty mesh and then, using the PCG module, fill it up with procedural submeshes. I would also like the ability to attach a 3D volume to a scene node and render it through a volume renderer (direct or indirect) without having to create or add any meshes. These are just ideas for now, since I haven’t yet reached the point where I can start working on the procedural generation stuff; hopefully, though, I will be writing a lot about that soon.
Let’s look at the render system now. This one is pretty much self-explanatory: everything related to the render API, and rendering in general, goes in here. I wrote earlier a bit about the higher-level resources and the lower-level ones. I want my render subsystem to provide a single interface where I can get all the hardware resources without having to care about the underlying API. This interface will use factory classes that construct the various resources, plus resource pools and managers to manage them. Let’s go quickly over these resources. Most render APIs give you buffers that let you store data in VRAM: vertex and index buffers (VBOs in OpenGL, ID3D11Buffer in Direct3D) which hold your geometry. I’m wrapping this functionality in my HardwareBuffers, providing the necessary calls to create a buffer, bind it, and read and write data. The same goes for texture buffers, which I’m calling HardwareTextures, wrapping OpenGL’s textures now and ID3D11Texture when I do the Direct3D implementation.
There are many more resources in render APIs which you have to wrap in order to provide the same functionality through interfaces, regardless of the API used. For now I will stick with simple objects encapsulating as much of the related functionality as possible. Apart from the HardwareBuffers and HardwareTextures, I have a RenderTarget which provides render-to-target support (framebuffer objects in OpenGL, ID3D11RenderTargetView in Direct3D) and an InputLayout which describes how we pass our vertex data to the graphics pipeline (interleaved or non-interleaved vertices, vertex attributes, etc.). Finally I’ve got, of course, Shaders and Effects, which currently consist of a vertex shader, an optional geometry shader and a fragment shader, bound and compiled into a single GL program. There are also uniform or constant buffers, which allow you to pass blocks of variables to shader programs, but I’m not providing any functionality for those yet.
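To show what “the same functionality through interfaces, regardless of the API” could mean in code, here is a hypothetical sketch of an API-agnostic buffer interface. A real OpenGL implementation would call glBufferData and friends behind the same interface; the in-memory version below only illustrates the contract and is not the project’s actual class:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Abstract, API-agnostic buffer interface the render subsystem could expose.
class HardwareBuffer {
public:
    virtual ~HardwareBuffer() = default;
    virtual void Create(std::size_t bytes) = 0;
    virtual void Write(std::size_t offset, const void* src, std::size_t bytes) = 0;
    virtual void Read(std::size_t offset, void* dst, std::size_t bytes) const = 0;
    virtual std::size_t Size() const = 0;
};

// Stand-in implementation backed by system memory, used here only to
// demonstrate that callers never see the underlying API.
class MemoryBuffer : public HardwareBuffer {
public:
    void Create(std::size_t bytes) override { m_data.assign(bytes, 0); }
    void Write(std::size_t off, const void* src, std::size_t n) override {
        std::memcpy(m_data.data() + off, src, n);
    }
    void Read(std::size_t off, void* dst, std::size_t n) const override {
        std::memcpy(dst, m_data.data() + off, n);
    }
    std::size_t Size() const override { return m_data.size(); }
private:
    std::vector<unsigned char> m_data;
};
```

The factory classes mentioned above would then hand out HardwareBuffer pointers whose concrete type depends on the active render implementation.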
This subsystem will also need to communicate with the GPGPU subsystem (OpenCL and CUDA implementations) to support interoperability between the two. Both OpenCL and CUDA provide interop with OpenGL and Direct3D, allowing you to share data between the render and compute sides. This is basically what I was doing in my thesis: I was generating the noise and terrain patch on the GPU through OpenCL kernels and then polygonising it using an OpenCL version of the marching cubes algorithm, writing the geometry directly to OpenGL VBOs. It’s quite simple to do: you create a buffer shared between GL and CL, lock it, execute the OpenCL kernels that write data to it, then unlock it and render it.
GPGPU stands for general-purpose GPU programming. Currently I’m using OpenCL to generate the 3D Perlin noise volumes and terrain patches and to polygonise the terrain volume through the marching cubes algorithm, writing vertex and index data to OpenGL-CL shared buffers. I definitely need to provide a CUDA implementation, since NVIDIA doesn’t play nice with OpenCL, making my OpenCL programs quite useless on NVIDIA GPUs at the moment. The good thing is that both OpenCL and CUDA provide interoperability with the rendering API, allowing you to lock vertex, index and texture buffers, write to them and then unlock them so they can be rendered by the API. Additionally, Direct3D 11 gives you compute shaders that can be used for GPGPU programming, so I will be looking into that as well.
This subsystem is focused on the objects that exist in our virtual world. Things like game objects, lights, cameras, frustums and scene managers go in here. One important job for this subsystem is to decide which parts of the scene to send to the renderer, reducing the renderer’s load as much as possible. If, for example, we have 100 objects in our scene but only see 10 of them from the current camera, there’s no need to render the other 90, is there? Another important job of the scene manager is to sort the objects before sending them to the renderer. If we have transparent objects in our scene, we need to sort them differently from the opaque ones (back-to-front) before passing them on. This will probably require multiple render queues or render groups.
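The cull-then-sort step could be sketched roughly as below. This is a toy model under stated assumptions: the “frustum test” is reduced to a precomputed visibility flag and the camera to a distance value, just to show the queue split and the opposite sort orders (opaque front-to-back, transparent back-to-front):

```cpp
#include <algorithm>
#include <vector>

struct SceneObject {
    float distance;     // distance from the camera
    bool  visible;      // result of a (hypothetical) frustum test
    bool  transparent;
};

// Only visible objects are queued; opaque ones are sorted front-to-back
// (cheap early-z rejection), transparent ones back-to-front (correct blending).
void BuildQueues(const std::vector<SceneObject>& scene,
                 std::vector<SceneObject>& opaque,
                 std::vector<SceneObject>& transparent) {
    for (const auto& obj : scene) {
        if (!obj.visible) continue;  // culled: never reaches the renderer
        (obj.transparent ? transparent : opaque).push_back(obj);
    }
    auto byDist = [](const SceneObject& a, const SceneObject& b) {
        return a.distance < b.distance;
    };
    std::sort(opaque.begin(), opaque.end(), byDist);              // near first
    std::sort(transparent.rbegin(), transparent.rend(), byDist);  // far first
}
```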
There are many types of scene managers, some suited to indoor scenes and others to outdoor scenes like terrains. At the moment I’m using extremely simple scene management, just a root node with all the other nodes as its children, since there is so much work to be done in getting the engine and renderer working properly. Each scene object (node) has a transformation matrix representing its position, rotation and scale relative to the rest of the world, and implements a simple component-based model where I can attach things like meshes, lights and cameras. Then, in the update function of the scene manager, I just send all the lights I’m using and issue render commands for each mesh attached to a node. When I get the chance, I would like to implement a proper scene management system that would allow me to support various types of managers and switch between them dynamically when needed. I’ll probably implement a simple, generic manager and another one better suited to complex terrain, probably using sparse voxel octrees.
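The root-plus-children model boils down to each node holding a local transform and composing it up the parent chain to get the world transform. A minimal sketch, with full matrices replaced by plain translations to keep it short (the names are illustrative, not the engine’s):

```cpp
struct Vec3 { float x, y, z; };

// A scene node with a local position; the world position is the sum of
// all local positions along the parent chain up to the root.
struct SceneNode {
    Vec3 localPosition{0.0f, 0.0f, 0.0f};
    SceneNode* parent = nullptr;

    Vec3 WorldPosition() const {
        Vec3 p = localPosition;
        for (const SceneNode* n = parent; n != nullptr; n = n->parent) {
            p.x += n->localPosition.x;
            p.y += n->localPosition.y;
            p.z += n->localPosition.z;
        }
        return p;
    }
};
```

With full transforms this addition becomes a matrix product, but the traversal is identical.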
Procedural generation is the use of various techniques and algorithms to generate anything from simple geometry or vegetation to whole cities, and even stories for our games, programmatically rather than manually. We can create infinite, realistic terrains, for instance, just by adding a few frequencies of pseudorandom noise together, and we only need to store the algorithm we used and the input variables that define that particular terrain. That means we don’t need to store any static geometry whatsoever, since all the geometry is generated on the fly. The great thing is that if we don’t change the input variables to our algorithm, the terrain will always look the same, since we’re using algorithms that produce pseudo-random noise. However, if we change some inputs even slightly, we might get a completely different terrain; that’s why we consider these systems chaotic: a tiny change can cause huge changes. I think procedural generation requires a lot of tweaking to get right, but you can also get some unexpected and quite interesting results. This is the main reason I’m making this editor: so I can experiment with various techniques and create a variety of worlds.
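“Adding a few frequencies of noise together” is usually called fractal Brownian motion. Here is a sketch of the octave summation; the base noise is a cheap hash-based stand-in, not real Perlin noise, but it makes the two key properties visible: the same seed and inputs always reproduce the same value, and each octave adds finer detail at lower amplitude:

```cpp
#include <cmath>
#include <cstdint>

// Cheap deterministic hash noise in [0, 1]; a stand-in for Perlin noise.
static float HashNoise(std::int32_t x, std::int32_t y, std::uint32_t seed) {
    std::uint32_t h = seed;
    h ^= std::uint32_t(x) * 374761393u;
    h ^= std::uint32_t(y) * 668265263u;
    h = (h ^ (h >> 13)) * 1274126177u;
    return (h & 0xFFFFFF) / float(0xFFFFFF);
}

// Sum several octaves: each contributes half the amplitude at twice the
// frequency of the previous one.
float Fbm(float x, float y, std::uint32_t seed, int octaves) {
    float sum = 0.0f, amplitude = 0.5f, frequency = 1.0f;
    for (int i = 0; i < octaves; ++i) {
        sum += amplitude * HashNoise(std::int32_t(std::floor(x * frequency)),
                                     std::int32_t(std::floor(y * frequency)),
                                     seed + i);
        amplitude *= 0.5f;
        frequency *= 2.0f;
    }
    return sum;  // stays in [0, 1) since the amplitudes sum to less than 1
}
```

Storing just the seed and octave count is enough to regenerate the exact same terrain, which is the point made above.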
I will keep this system as a separate module for now, but it will of course make heavy use of the engine, and it will be used a lot by the editor. Things like noise, terrain generation algorithms, procedural vegetation scattering, algorithms that create procedural plants, and even the generation of various procedural meshes (e.g. platonic solids) will go in here. This module will return meshes, textures or scene objects that can be easily used by the editor.

Currently I’ve got most of the work I did for my master’s thesis running in this new editor. I’ve got the Perlin noise, terrain generation and marching cubes CL programs running smoothly on the GPGPU subsystem, outputting their geometry to a mesh that I attach to a scene node and render. The mesh contains a submesh and a simple material, which in turn contain a hardware buffer, shared between OpenCL and OpenGL, and a simple diffuse lighting effect used to render it. The Perlin noise CL program gives us a float value at a requested (x, y, z) point. The terrain generation CL program samples Perlin noise at various frequencies at the desired point and adds the samples together, giving us a 3D texture which represents our complex terrain; things like a ground plane and some removal of floating geometry are applied at this stage. Each point of the resulting 3D terrain patch (a simple 3D texture) is a float value which we call a density value. To render this volume I’m using the marching cubes algorithm, which is basically an indirect volume renderer, if you want to call it that. Roughly, it splits the overall volume into smaller cubes and then, by combining the density values at each corner of a cube with some predefined tables, generates triangles we can actually render. The great thing is that marching cubes can be parallelized entirely on the GPU using OpenCL, making it fast enough to use in real time. This is a fascinating subject which I’ll talk about in more detail in future posts.
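The classification step of marching cubes, where the density values meet the predefined tables, fits in a few lines. This sketch shows only that first step (the edge/triangle tables and vertex interpolation are omitted), as a CPU illustration of what each OpenCL work item would do per cell:

```cpp
#include <cstdint>

// Classify one cell of the volume: compare the density at its 8 corners
// against the iso-level, producing an 8-bit index into the precomputed
// edge/triangle tables (not shown here).
std::uint8_t CubeIndex(const float density[8], float isoLevel) {
    std::uint8_t index = 0;
    for (int corner = 0; corner < 8; ++corner) {
        if (density[corner] < isoLevel)      // corner is "inside" the surface
            index |= std::uint8_t(1u << corner);
    }
    return index;  // 0 or 255 means the cell is entirely outside/inside,
                   // so no triangles are generated for it
}
```

Since every cell is classified independently, this maps directly onto one GPU thread per cell, which is what makes the algorithm fast enough for real time.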
Finally, the editor! Before I actually started working on this project, I wanted to use WPF (Windows Presentation Foundation) instead of Qt or wxWidgets for the editor part, because I wanted to try something new and, honestly, WPF looked quite nice. After using it for a while, I have to say I’m glad I did. In WPF you define most of your layout using XAML, an XML-like language, and use C#, for example, to implement the logic of the UI. It is quite simple and quick to create a nice UI in WPF with C# and connect it with the C++ part of your application by writing a simple C++/CLI layer in between. C++/CLI is a language Microsoft created to allow the mixed use of managed and unmanaged code in the same binary, meaning it lets you easily expose your C++ functionality to .NET languages.
For example, my engine is written in C++, but I want to see my scene rendered in the editor, which is written in C#. I would also like to be able to create new scene nodes or change values and textures on materials. Additionally, I might want to enable or disable anti-aliasing or switch the renderer from OpenGL to Direct3D. The way I’m doing it is this: I’ve got a C++/CLI project which contains a custom Win32 user control where I do the rendering of the scene, and various wrapper classes which expose the functionality of engine types and interfaces to the editor. I’m doing this using the pImpl idiom (pointer to implementation). So, for example, I have a C++/CLI class for the material instance which holds a C++ pointer to the material instance implementation. From the editor (in C#) I can call functions on the C++/CLI interface for the material instance in order to change some values, and the interface knows what type conversions it has to make (if any) and which functions to call on the C++ material instance to apply the changes. However, if I need to make changes to the material itself, I use a slightly different approach. A material instance is basically a copy of a material that is assigned to a submesh; you can change values and assign new textures to it without changing anything in the “base” material, which is a unique resource. To change the actual material, I store a reference handle to it (a long int) instead of a pointer, since I’m using reference handles for my resource management. When I need to make a change to the material, I get the actual material from the resource manager using the stored handle, make the changes and then release it.
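The handle-based access pattern can be sketched in plain C++ (in the real project the wrapper side would be a C++/CLI ref class; all names here are hypothetical). The wrapper stores only the handle, and every change goes through the manager that owns the actual material:

```cpp
#include <unordered_map>

using Handle = long;  // reference handle, as described above

struct Material { float shininess = 0.0f; };

// Owns the unique material resources and maps handles to them.
class MaterialManager {
public:
    Handle Create() { m_materials[m_next]; return m_next++; }
    Material& Acquire(Handle h) { return m_materials.at(h); }
private:
    Handle m_next = 1;
    std::unordered_map<Handle, Material> m_materials;
};

// Stands in for the C++/CLI interface class exposed to the C# editor:
// it keeps the handle, not a raw pointer, and resolves it on each access.
class MaterialWrapper {
public:
    MaterialWrapper(MaterialManager& mgr, Handle h) : m_mgr(mgr), m_handle(h) {}
    void  SetShininess(float v) { m_mgr.Acquire(m_handle).shininess = v; }
    float GetShininess() const  { return m_mgr.Acquire(m_handle).shininess; }
private:
    MaterialManager& m_mgr;
    Handle m_handle;
};
```

Storing the handle rather than a pointer means the manager stays free to move or reload the resource between accesses.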
A tricky part of the editor so far was the render user control I’m using to render my scene in the editor. C++/CLI allows you to create a custom Win32 user control and, through a WPF WindowsFormsHost element, add it to your UI. The user control has its own HWND, which I pass to the engine and use to create a RenderWindow and an OpenGL Win32 context to render to. I had some difficulty getting things to render with the custom control, and I found out that the best way to do it is to have a separate thread for the rendering. This way, the control’s main “UI” thread can do whatever it wants without interrupting the render thread. So the way I’m doing it at the moment is: once the control is initialized, I spawn a new thread, create an OpenGL render context and then start loading the resources. Ideally I would like one thread for resource loading and another dedicated entirely to rendering, but that will have to wait for now.
Anyhow, this marks the end of this article. This is the work I’ve been doing on and off for the past four months to get this lovely noise volume rendering in the editor (it’s more work than it seems, really). There are a few things I’m currently changing in the engine, since I now have a better design in mind, so for the next months I’ll be focusing on implementing a proper resource management system and providing a better interface for the render subsystem. I’m also working on a detailed article on the Win32 user control I’m using for the rendering, since I haven’t found many resources on doing this: getting a C++ renderer to render to a custom user control implemented in C++/CLI which can be placed in a WPF application.