The D3D Pipeline

Before we take a look at shaders, we need to understand how the input data is manipulated to create the illusion of 3D graphics. The application (the 3D engine) sends vertex data to the rendering API. Most Windows-based 3D engines use Microsoft's DirectX Graphics (D3D) API, and so does A7. D3D performs a series of operations on its input data that result in the final rendered image. The figure below shows these operations in the order they are performed. As you can see, executing the vertex and pixel shader programs is just another part of the rendering pipeline.


We will now discuss each step of the rendering pipeline separately.

Vertices

The application sends vertex data to the fixed function pipeline or the vertex shader. Typically, the vertex data consists of the vertex position, the normal, and several sets of UV coordinates.

The vertex position is a vector in object space, meaning that the x, y and z elements of the vector are relative to the object's centre and orientation (in contrast to world space, where the position is relative to the origin of the world and its axes).

The vertex normal is a vector describing the direction the surface faces at that vertex. It is often used for calculating the lighting of the vertex.

The texture coordinates (also known as UV coordinates) give the position of the vertex on the texture. They are used in the pixel shader to look up the pixel color from the texture. Typical 3D engines support several sets of UV coordinates for additional textures, such as lightmaps; the A7 engine supports 3 sets of UV coordinates. This is the content of a typical A7 vertex:

typedef struct {
    float x,y,z;        // position in DirectX coordinates
    float nx,ny,nz;     // normal
    float u1,v1;        // first UV coordinate set, used for shadow maps and nontiled textures
    float u2,v2;        // second coordinate set, used for tiled textures
    float x3,y3,z3,w3;  // third coordinate set, used for tangent vector and handedness
} D3DVERTEX;

Other data can also be passed to the vertex shader, such as vertex colors, tangents, tessellation factors and more. These will be explained when needed.

Transformation & Lighting

The next step is transformation & lighting (T&L). This is an important step that is performed by the fixed function pipeline or, if you need more freedom, programmed into the vertex shader.

The input vertex position is in object space, so the position is represented by a vector whose elements lie along the object's local axes. Because the position of the vertex is relative to the object's origin, it doesn't say anything about the position of the vertex on the screen. We must perform a series of transformations in order to get the screen coordinates of the vertex.

First, the vertex position is transformed to world space. In world space, the vertex position is relative to the world origin. It is then transformed to camera space. In camera space, the x coordinate goes from left to right on the screen, y goes up, and z goes into the screen. We are getting close now, but we don't have screen coordinates just yet. The view is still orthogonal: objects in the distance have the same size as objects that are close; there is no perspective.

The area of the world that is visible to the camera has the shape of a pyramid with the top cut off; this is called the view frustum. To create the illusion of perspective, the view frustum is 'squeezed' into a cube. This last transformation is called the perspective transformation, and the resulting coordinates are in the so-called clip space.

Because far-away vertices are moved more in order to fit into the cube, distant objects will appear smaller in clip space. This may be somewhat hard to grasp, but it helps to picture the view frustum with a few vertices inside it.

If you don't fully understand what is meant by transformations just yet, don't worry. If you use the fixed function pipeline, you don't have to do anything for it. If you write a vertex shader, all these transformations can be done in a single line of code which can easily be memorized.
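
In HLSL, the shader language used for A7 shaders, that single line is a vector-matrix multiplication. The sketch below assumes the engine provides the combined world-view-projection matrix as a global shader constant; A7 exposes it under the name matWorldViewProj.

float4x4 matWorldViewProj;  // combined object-to-clip-space matrix, set by the engine

float4 TransformVS(float4 inPos : POSITION) : POSITION
{
    // object space -> world space -> camera space -> clip space, all in one multiplication
    return mul(inPos, matWorldViewProj);
}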

Next is lighting. There are a number of lighting algorithms built into the graphics card. These algorithms assign a brightness value to every vertex, depending on the positions of the lights in the scene. The brightness value of a pixel is then calculated by linear interpolation of the brightness values of the vertices forming the triangle.
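
To give an idea of what happens here, the following sketch computes a simple diffuse brightness per vertex, roughly what the fixed function pipeline does in hardware for a single directional light. The light direction and color used here are made-up values for illustration; the fixed function pipeline takes them from the light setup of the scene.

float4x4 matWorldViewProj;
float3 vecLightDir = { 0.0, -1.0, 0.0 };        // assumed directional light, pointing straight down
float4 vecLightColor = { 1.0, 1.0, 1.0, 1.0 };  // assumed white light

struct LIT_VS_OUT
{
    float4 Pos   : POSITION;
    float4 Color : COLOR0;  // interpolated over the triangle during rasterization
};

LIT_VS_OUT LitVS(float4 inPos : POSITION, float3 inNormal : NORMAL)
{
    LIT_VS_OUT Out;
    Out.Pos = mul(inPos, matWorldViewProj);
    // brightness depends on the angle between the normal and the light direction
    // (for simplicity, the object space normal is used directly here)
    Out.Color = vecLightColor * saturate(dot(inNormal, -vecLightDir));
    return Out;
}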

Vertex Shader

When writing advanced graphics effects you will often find that the T&L provided by the fixed function pipeline is far too limiting. If this is the case, you can write a vertex shader program to replace the T&L stage for a certain object in your scene.

In the vertex shader, you are free to do whatever you like. You can move the vertex position or texture coordinates, perform calculations and pass the results to the following stages of the rendering pipeline.
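
As a small example of that freedom, the following sketch pushes every vertex outwards along its normal before transforming it, which inflates the model. The fInflate value is a made-up parameter for this example.

float4x4 matWorldViewProj;
float fInflate = 2.0;  // assumed inflation distance in object space units

struct INFLATE_VS_OUT
{
    float4 Pos : POSITION;
    float2 Tex : TEXCOORD0;
};

INFLATE_VS_OUT InflateVS(float4 inPos : POSITION, float3 inNormal : NORMAL, float2 inTex : TEXCOORD0)
{
    INFLATE_VS_OUT Out;
    float4 newPos = inPos;
    newPos.xyz += inNormal * fInflate;        // move the vertex along its normal
    Out.Pos = mul(newPos, matWorldViewProj);  // then transform to clip space as usual
    Out.Tex = inTex;                          // pass the UV coordinates on unchanged
    return Out;
}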

Culling, Rasterization, Depth Test

Backface culling is the process of removing all triangles that face away from the viewer; in D3D, whether a triangle faces towards or away from the viewer is determined by the winding order of its vertices on the screen. The back sides of triangles can't be seen anyway, so this step helps reduce rendering time. On average, half of the triangles face away from the viewer, so this step has a significant effect on performance.

At triangle setup the life of vertices ends and the life of pixels begins. Rasterization is the process of determining which screen pixels belong to a given triangle.

The depth test is used to determine the visibility of a pixel. This is done by comparing the depth of the pixel to the stored depth value at that pixel's position. The depth information is stored in a dedicated buffer called the z-buffer. If the new pixel is closer to the camera than the stored one, it is drawn and the depth value is updated. If the new pixel is behind the old pixel, it is not drawn and the old pixel's depth value remains stored.

If no pixel shader is used, the default multitexturing stage from the fixed function pipeline is used. This simply draws the pixel with the color of the texture(s) at the given texture coordinates and lights it.

Pixel Shader

For advanced effects such as per-pixel lighting you'll need a pixel shader. The pixel shader is a function that takes a number of parameters, such as texture coordinates and light values, and returns a red, green, blue, alpha (RGBA) color vector. The pixel shader function may contain any kind of logic; the only requirement is that it returns a color vector.
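
A minimal pixel shader might look like the sketch below: it looks up the texture color at the interpolated UV coordinates and tints it. A7 exposes the first skin of the rendered entity as entSkin1; the sampler and tint names are chosen for this example.

texture entSkin1;  // first skin of the rendered entity, set by the engine
sampler ColorSampler = sampler_state { Texture = <entSkin1>; };
float4 vecTint = { 1.0, 1.0, 1.0, 1.0 };  // assumed tint color (white = no change)

float4 TintPS(float2 inTex : TEXCOORD0) : COLOR0
{
    // the returned RGBA vector becomes the color of the pixel
    return tex2D(ColorSampler, inTex) * vecTint;
}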

Finally, the output is written into the render target. The render target is just a rectangular grid of pixels - a bitmap or the screen buffer. Usually, the render target is simply sent to the monitor so it can be displayed. However, it is also possible to reuse the rendered image, for example for a post-processing effect or for showing it on a surface in the level, like a mirror or a water surface.

Workshops 1..4: Lighting through Shaders

In the following four workshops we'll build a simple lighting algorithm from the ground up. We will start with the most basic lighting: ambient lighting. Then we will add a diffuse lighting term and a specular lighting term. Finally, we will add normal mapping to create the illusion of surface detail. For every term in the shading algorithm, we will first take a look at the theory behind it, followed by the code, and finally a step-by-step explanation of the code. Time for practice!

Workshop 1: Ambient Lighting