GPU Computing

Data Textures

In the past few weeks, we have seen the benefits (speed!) that shader programming affords us, and we have also had to work around some of its limitations. The main issue is that it is nearly impossible to read data back out of a shader. We use vertex attributes to pass point data into the shader, but these are read-only: they cannot be altered, and any calculations we make cannot be saved for later use.

Our workaround has been to perform our vertex attribute modifications on the CPU. This entails looping through all vertices, changing the values that need to be changed, marking the vertex attribute as needing an update, and then uploading the new data to the GPU.

  for (let i = 0; i < sphereGeo.attributes.extent.array.length; ++i) {
    sphereGeo.attributes.extent.array[i] = Math.max(
      0.0,
      sphereGeo.attributes.extent.array[i] - 0.25 * dt
    );
  }
  sphereGeo.attributes.extent.needsUpdate = true;

This introduces two potential bottlenecks:

  • Looping over the attributes happens sequentially (not in parallel).
  • Uploading new attribute values from the CPU to the GPU takes time.

We may not feel this impact in simpler programs, but we will definitely notice it when working with more complex calculations and higher resolution meshes.

How can we get around this? The only data structure that a shader can write to is a texture. This is what we’ve been doing all along when rendering graphics, either writing pixels to the canvas or to an offscreen texture (like we’ve done with the background subtraction examples).

What if instead of writing color information to a texture, we could write other arbitrary data into it? A color is just 4 numbers, so we could in theory store any 4 numbers into a pixel. This concept is the basis for GPGPU programming, or “general purpose GPU programming”.

three.js includes a DataTexture class which can be used for this purpose. A DataTexture is a texture that can be loaded with custom data; it does not need to represent an image, a video frame, or anything graphical.

Point Cloud Data

Let’s create a point cloud by generating an array of random points in the range [-2.0, 2.0]. Each point will have 3 values for the XYZ coordinates. We can create as many points as we’d like, but since we know we want to store this data into a texture, we will use the texture width * height as our number of points, one point per pixel.

const size = 500;

// Populate an array of positions.
const posData = new Float32Array(size * size * 3);
for (let i = 0; i < size * size; i++) {
  posData[i * 3 + 0] = Math.random() * 4.0 - 2.0;
  posData[i * 3 + 1] = Math.random() * 4.0 - 2.0;
  posData[i * 3 + 2] = Math.random() * 4.0 - 2.0;
}

We can then upload the array to a DataTexture. Note that we need to specify the texture dimensions, format, and type.

  • The dimensions should be large enough to fit all the points.
  • The format should match the number of components per pixel. In our case, we have 3 values per point, so we can use the RGB 3 channel format.
  • The type should match the data type. By default, textures use unsigned bytes, which are integers in the range [0, 255]. This will not give us the range or precision we need, so we will use floating-point textures.
  • The DataTexture.needsUpdate flag must be set to true whenever there is new data to upload to the GPU, even the first time around.
// Upload the positions to a DataTexture.
const texPositions = new THREE.DataTexture(
  posData,
  size,
  size,
  THREE.RGBFormat,
  THREE.FloatType
);
texPositions.needsUpdate = true;

Let’s modify these values a little for better GPU performance.

  • Power of 2 values (128, 256, 512, 1024, …) play nice with GPUs, so you will see them often in graphics code. Let’s change our size to take advantage of this.
  • The same goes for the pixel format. The GPU will be happier with a 4-channel texture than with a 3-channel texture. In fact, there is a good chance that it creates an RGBA texture under the hood anyway, even though you are requesting an RGB texture. We can switch to RGBA and leave the alpha channel empty (or use it for something else down the line).
const size = 512;

// Populate an array of positions.
const posData = new Float32Array(size * size * 4);
for (let i = 0; i < size * size; i++) {
  posData[i * 4 + 0] = Math.random() * 4.0 - 2.0;
  posData[i * 4 + 1] = Math.random() * 4.0 - 2.0;
  posData[i * 4 + 2] = Math.random() * 4.0 - 2.0;
}

// Upload the positions to a DataTexture.
const texPositions = new THREE.DataTexture(
  posData,
  size,
  size,
  THREE.RGBAFormat,
  THREE.FloatType
);
texPositions.needsUpdate = true;

The DataTexture can now be passed into the shader as a uniform just like any regular texture.

// Create points material.
import renderVert from "./shaders/render.vert";
import renderFrag from "./shaders/render.frag";
const pointsMat = new THREE.RawShaderMaterial({
  vertexShader: renderVert,
  fragmentShader: renderFrag,
  uniforms: {
    uTexPositions: { value: texPositions }
  }
});

Point Geometry

Next, we need to generate vertices for our mesh. We will be positioning each point using the DataTexture so we don’t need a position attribute. We do need a way to link each vertex to a pixel in the texture, so we will use a uv attribute to do so.

// Create points geometry.
const uvs = [];
for (let y = 0; y < size; y++) {
  for (let x = 0; x < size; x++) {
    uvs.push(x / size, y / size);
  }
}

const pointsGeo = new THREE.BufferGeometry();
pointsGeo.setAttribute("uv", new THREE.Float32BufferAttribute(uvs, 2));
pointsGeo.setDrawRange(0, size * size);

three.js uses the position attribute to determine how many vertices are in the buffer. Since we are not using that attribute, we have to call setDrawRange() to indicate how many vertices we want to draw.

We can draw with the gl.POINTS topology by using the THREE.Points mesh type.

// Create and add mesh to scene.
const points = new THREE.Points(pointsGeo, pointsMat);
scene.add(points);

In the vertex shader, we sample our texture at the passed uv coordinate to get the model position. We can then apply the MVP as we would with a position passed in as an attribute.

Also, since we are rendering points, we need to set the value of the built-in gl_PointSize variable. As the name suggests, this is the size of each point, in pixels.
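
As a reference, render.vert could look something like this minimal sketch. It assumes the uTexPositions uniform and uv attribute set up above, and hardcodes the point size for simplicity:

// render.vert (sketch): position each point from the data texture.
precision highp float;

uniform mat4 modelMatrix;
uniform mat4 viewMatrix;
uniform mat4 projectionMatrix;
uniform sampler2D uTexPositions;

attribute vec2 uv;

void main()
{
    // Look up this vertex's model-space position from the texture.
    vec4 modelPos = vec4(texture2D(uTexPositions, uv).xyz, 1.0);

    gl_PointSize = 2.0;
    gl_Position = projectionMatrix * viewMatrix * modelMatrix * modelPos;
}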

Offscreen Buffers

If we want to modify our input geometry, we need to create a pipeline that will read from our DataTexture and write updated data to a second texture. This all happens offscreen as it does not need to be drawn in our main canvas.

We’ve already explored offscreen buffers in p5.js with the p5.Graphics object, and in three.js with the EffectComposer and RenderPass. We are going to use a more generic version of the RenderPass concept called a WebGLRenderTarget.

One feature of the WebGLRenderTarget is that it can use custom formats and types. Let’s create an instance with the same parameters and dimensions as our DataTexture.

// Create offscreen render buffer.
const bufferPosition = new THREE.WebGLRenderTarget(size, size, {
  format: THREE.RGBAFormat,
  type: THREE.FloatType
});

This buffer is often called a Framebuffer Object or FBO in other frameworks. The buffer is a “context” that we are rendering in, and the result is stored in a texture. In three.js, this texture can be accessed with WebGLRenderTarget.texture.

Instead of using the points texture directly, we now want to use the output of our WebGLRenderTarget as the input to our main point cloud shader. So, we will replace the passed uniform texture accordingly.

const pointsMat = new THREE.RawShaderMaterial({
  vertexShader: renderVert,
  fragmentShader: renderFrag,
  uniforms: {
    uTexPositions: { value: bufferPosition.texture }
  }
});

All elements to be rendered in three.js must be added to a Scene, and this is also the case for our offscreen geometry. We will create a new Scene consisting of a single rectangle (a Plane), along with a new material and shader that will update our position texture.

// Create offscreen draw rectangle.
const planeGeo = new THREE.PlaneGeometry(1, 1, 1, 1);

// Create offscreen scene for updating positions.
const updatePosScene = new THREE.Scene();

// Create material and mesh for updating positions.
import posVert from "./shaders/update.vert";
import posFrag from "./shaders/update.frag";
const updatePosMat = new THREE.RawShaderMaterial({
  vertexShader: posVert,
  fragmentShader: posFrag,
  uniforms: {
    uTexPositions: { value: texPositions },
    uTime: { value: 0.0 }
  }
});
const updatePosMesh = new THREE.Mesh(planeGeo, updatePosMat);
updatePosScene.add(updatePosMesh);
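
The update shaders themselves are not shown above, so here is a minimal sketch of what they could look like. It assumes the plane is drawn directly in clip space (ignoring the camera), and the wobble in the fragment shader is just a hypothetical example update.

// update.vert (sketch): draw the unit plane so it covers the whole render target.
precision highp float;

attribute vec3 position;
attribute vec2 uv;

varying vec2 vUv;

void main()
{
    vUv = uv;
    // PlaneGeometry(1, 1) spans [-0.5, 0.5], so scaling by 2 fills clip space.
    gl_Position = vec4(position.xy * 2.0, 0.0, 1.0);
}

// update.frag (sketch): read each point, modify it, and write it back out.
precision mediump float;

uniform sampler2D uTexPositions;
uniform float uTime;

varying vec2 vUv;

void main()
{
    vec3 pos = texture2D(uTexPositions, vUv).xyz;

    // Hypothetical example update: wobble each point over time.
    pos.y += 0.2 * sin(uTime + pos.x * 3.0);

    gl_FragColor = vec4(pos, 1.0);
}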

This second offscreen Scene must also be rendered every frame. We will add the corresponding code to the tick() function. To render offscreen to a texture, we call WebGLRenderer.setRenderTarget() with the WebGLRenderTarget as its parameter. When we want to go back to rendering to the main canvas (the actual screen), we call WebGLRenderer.setRenderTarget() again with null as the parameter.

  // Update positions offscreen.
  renderer.setRenderTarget(bufferPosition);
  renderer.render(updatePosScene, camera);

  // Render points onscreen.
  renderer.setRenderTarget(null);
  renderer.render(scene, camera);

Ping Pong Buffering

The previous sketch still uses the values in the texPositions texture as the starting values for each frame. We are not updating the initial texture, so there is still no real advantage to using this technique over vertex attributes.

What we want to do is to use the calculated positions from the previous frame as the starting positions of the current frame. This will result in continuous motion from frame to frame. In order to do this, we will use two WebGLRenderTarget objects. One will be the source data we read from, and the other will be the destination data we write into. Every frame, we will swap the source and destination, so that the previous destination data becomes the current source data.

This is called ping pong buffering because the render targets get swapped back and forth over and over again. This concept was handled for us automatically in the EffectComposer, but we will need to implement it manually here.

We first create an array of two WebGLRenderTarget objects. We also create two variables to hold the source and destination indices in this array. These will each have a value of either 0 or 1.

// Create offscreen render buffers for positions.
const bufferPositions = [];
for (let i = 0; i < 2; i++) {
  bufferPositions.push(
    new THREE.WebGLRenderTarget(size, size, {
      format: THREE.RGBAFormat,
      type: THREE.FloatType
    })
  );
}

let srcIdx = 0;
let dstIdx = 1;

We will add a new helper function that takes a source texture and a destination buffer as parameters. It will set the correct render target, assign the source texture as the shader uniform, and render the offscreen scene into the destination buffer.

const updatePositions = (srcPosTex, dstPosBuffer) => {
  renderer.setRenderTarget(dstPosBuffer);
  updatePosMat.uniforms.uTexPositions.value = srcPosTex;
  renderer.render(updatePosScene, camera);
};

Finally, the rendering will be triggered in the tick() function:

  • For the first iteration, updatePositions() is called using our DataTexture as source data.
  • For all the following iterations, updatePositions() is called with the source WebGLRenderTarget texture as source data.
  • The destination WebGLRenderTarget texture is assigned to the point rendering shader so that the latest data is used to draw the point cloud.
  • The source and destination indices srcIdx and dstIdx are swapped.
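
Putting these steps together, tick() could look something like this sketch (the firstFrame flag is a hypothetical helper; the other names are the ones defined above):

// Animation loop with ping pong buffering (sketch).
let firstFrame = true;

const tick = () => {
  // 1. Update positions offscreen, reading from the previous frame's data.
  const srcTex = firstFrame ? texPositions : bufferPositions[srcIdx].texture;
  updatePositions(srcTex, bufferPositions[dstIdx]);
  firstFrame = false;

  // 2. Draw the point cloud using the freshly written destination texture.
  pointsMat.uniforms.uTexPositions.value = bufferPositions[dstIdx].texture;
  renderer.setRenderTarget(null);
  renderer.render(scene, camera);

  // 3. Swap the source and destination indices for the next frame.
  [srcIdx, dstIdx] = [dstIdx, srcIdx];

  window.requestAnimationFrame(tick);
};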

The same concept can be repeated for every vertex-level calculation we want to make. We just need to use the previous rendered WebGLRenderTarget texture as our input for the next pass.

Exercise

Build a rudimentary particle system where each particle is assigned a position and a velocity.

The points must stay within our bounding box, in the range [-2.0, 2.0] in all dimensions. If a particle hits one of the bounds, it should bounce back (by reversing the corresponding velocity).

We will need two offscreen passes to achieve this effect.

  • The first will test if the point goes out of bounds and if so, will change the velocity.
  • The second will add the velocity to the position.

We will therefore need two separate DataTexture objects to set the initial values, two separate shaders to modify the velocity and the position, and two separate ping pong render buffers to keep track of the values as the program loops.

Point Rendering

The sketch is working properly, but it’s a bit of a noisy mess and it’s hard to tell what is happening.

Let’s tweak the rendering to make it look a little better.

Delta Time

First, let’s take more control of the animation speed. Our current sketch moves the particles by a constant value every frame. This may seem correct at first, but it means that the animation will behave very differently between faster machines (that might run at 60 fps) and slower machines (that might run at 12 fps). Even on the same system, the animation might slow down and speed up erratically as the machine gets busier and hotter.

We can use the time delta (the number of seconds that passed between the last and current frame) as a scalar to our calculations to make sure that the animation always moves at the same speed.

The time delta can be read from the Clock and can be passed into the shaders as a uniform.

// Animation loop.
const clock = new THREE.Clock();
const tick = () => {
  const dt = clock.getDelta();

  updatePosMat.uniforms.uDeltaTime.value = dt;
  updateVelMat.uniforms.uDeltaTime.value = dt;
  ...

Instead of adding the velocity directly to the position in the shaders, we add the velocity scaled by the time delta.

pos += vel * uDeltaTime;

Point Attenuation

We have already seen that the point size can be set using the gl_PointSize built-in variable. But gl_PointSize is specified in 2D screen pixels, not in 3D world units, meaning that all points are rendered at the same size regardless of their distance from the camera.

A more realistic approach would be to attenuate the points so that the ones closer to the camera are bigger than the ones further away from the camera. This is something we will need to calculate manually, but can be pretty simple if we remember the different parts of our MVP transformation.

Multiplying a point by the model matrix then by the view matrix will put it in view space, or camera space. The z-value of a point in view space will therefore represent the point depth from the camera.

We can multiply the point depth by an attenuation factor, and divide the base point size by this number to get a point size that takes perspective into consideration.

vec4 viewPos = viewMatrix * modelMatrix * modelPos;

gl_PointSize = uPointSize / (uPointAttenuation * abs(viewPos.z));
gl_Position = projectionMatrix * viewPos;

Point Sprites

WebGL automatically provides a gl_PointCoord variable in the fragment shader of a gl.POINTS mesh, which can be used to apply a texture to the point sprite. This is similar to a texture coordinate, except that we do not need to generate it or pass it in as a varying; it is just ready to go!

If we pass a texture uniform to the shader, we can sample it using gl_PointCoord to draw nicer looking points.

precision mediump float;

uniform sampler2D uTexSprite;

varying vec4 vColor;

void main()
{
    vec4 texCol = texture2D(uTexSprite, gl_PointCoord);
    gl_FragColor = texCol;
}

  • If the sprite texture has an alpha channel, we should set the Material.transparent property to true.
  • If the sprite texture uses black for transparency, we should set the Material.blending property to a suitable value like THREE.AdditiveBlending.
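
On the JavaScript side, a sketch of loading a sprite texture and passing it to the material might look like this (the file path is hypothetical; do this during setup, before the first render):

// Load a sprite image and add it as a uniform on the points material.
const texSprite = new THREE.TextureLoader().load("./assets/sprite.png");
pointsMat.uniforms.uTexSprite = { value: texSprite };
pointsMat.transparent = true;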

In both cases, there is a good chance that we will see some weird issues when rendering overlapping point sprites. This is because the renderer uses the depth buffer to hide (or cull) points that are behind others, but it shouldn’t here, since the points are semi-transparent and we can see through them.

If this happens, we can tell the material to skip writing to the depth buffer by setting Material.depthWrite to false.

const pointsMat = new THREE.RawShaderMaterial({
  vertexShader: renderVert,
  fragmentShader: renderFrag,
  blending: THREE.AdditiveBlending,
  depthWrite: false,
  ...

Instanced Meshes

We won’t always want to render our particles as simple 2D point sprites. We may want to render a full 3D mesh at each point position. This is a bit more complex, because we would need two sets of vertex attributes, one that corresponds to the particle and another that corresponds to the 3D mesh.

Also, our mesh may have lots of detail and be made up of many triangles. We do not want to duplicate all this data for each particle, as that would eat up too much memory and slow down our app.

The trick here is to use geometry instancing. In graphics programming, instancing means rendering multiple copies of the same mesh with a single draw call. Each instance shares many common attributes, but also has its own unique attributes that differentiate each copy. For example, a set of instanced meshes can all share the same position attributes, but can each have their own model matrix. This would allow each one to be rendered at a different position, orientation, and scale in space.

We can update our point cloud to use 3D spheres for each particle instead of a 2D point sprite.

When creating the geometry, we can assign special attributes of type InstancedBufferAttribute. This will indicate that these attributes are unique for each instance of the mesh. Regular BufferAttribute objects will be shared across all instances.

For our point cloud, we will instance our uv coordinates as each instance needs to read a different pixel value from the velocity and position textures. We will rename the attribute to instanceUv to avoid a conflict with the uv attribute already set in the SphereGeometry.

const pointsGeo = new THREE.SphereGeometry(0.05);
pointsGeo.setAttribute(
  "instanceUv",
  new THREE.InstancedBufferAttribute(new Float32Array(uvs), 2)
);

Instead of creating a regular Mesh from our geometry and material, we will create an InstancedMesh. The InstancedMesh constructor takes the number of instances to create as its third parameter.

// Create and add mesh to scene.
const points = new THREE.InstancedMesh(pointsGeo, pointsMat, size * size);
scene.add(points);
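
In the instanced vertex shader, the sphere’s own position attribute (shared by all instances) can be offset by the particle position looked up with instanceUv. A minimal sketch, assuming the uniform names used earlier:

// render.vert for instanced spheres (sketch).
precision highp float;

uniform mat4 modelMatrix;
uniform mat4 viewMatrix;
uniform mat4 projectionMatrix;
uniform sampler2D uTexPositions;

attribute vec3 position;     // sphere vertex, shared by every instance
attribute vec2 instanceUv;   // unique per instance

void main()
{
    // Offset the sphere vertex by its particle's position from the data texture.
    vec3 particlePos = texture2D(uTexPositions, instanceUv).xyz;
    vec4 modelPos = vec4(position + particlePos, 1.0);

    gl_Position = projectionMatrix * viewMatrix * modelMatrix * modelPos;
}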

Exercise

Modify the velocity update shader by adding a force to it. For example, particles that get too close to the center can be pushed back out.