Apr 13 2008
Multithreaded Rendering
One of the big things for games programming these days is dealing with multiprocessing. I have been working on a multithreaded renderer in my spare time to do interactive brush work. The previous movies for peduncle are all non-interactive. They actually render faster than real time, but they just run from a command line. First, I had to create the basic infrastructure, which is not much different from a game's. This infrastructure means all of the usual subsystems like file I/O, memory, debug scaffolding, and rendering. I already had file I/O running in another thread. This evening, I got my multithreaded renderer working.
The basic architecture for the renderer is to build a render queue (a command list), then execute all of those commands. In fact, this is almost identical to a PS2 rendering DMA list: XGKICK the data and let the GS render.
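To make the "build a command list, then execute it" idea concrete, here is a minimal sketch. The command types (`Clear`, `DrawMesh`, `Present`), the `RenderQueue` class, and the string log standing in for real graphics calls are all assumptions for illustration; they are not my actual renderer's commands.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical command types; the real set would be much larger.
enum class CommandType : std::uint8_t { Clear, DrawMesh, Present };

struct RenderCommand {
    CommandType type;
    std::uint32_t arg;  // illustrative payload, e.g. a clear color or mesh id
};

class RenderQueue {
public:
    // Simulation side: record a command without executing it.
    void push(CommandType type, std::uint32_t arg = 0) {
        commands_.push_back({type, arg});
    }

    // Render side: walk the list in order, then empty it for the next frame.
    // Returns a log of what "ran" so the sketch works without a graphics API.
    std::vector<std::string> execute() {
        std::vector<std::string> log;
        for (const RenderCommand& cmd : commands_) {
            switch (cmd.type) {
                case CommandType::Clear:
                    log.push_back("clear " + std::to_string(cmd.arg));
                    break;
                case CommandType::DrawMesh:
                    log.push_back("draw " + std::to_string(cmd.arg));
                    break;
                case CommandType::Present:
                    log.push_back("present");
                    break;
            }
        }
        commands_.clear();
        return log;
    }

    std::size_t size() const { return commands_.size(); }

private:
    std::vector<RenderCommand> commands_;
};
```

The point of the design is that recording and executing are separate steps, so where and when `execute()` runs becomes a policy decision rather than something baked into the drawing code.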
My renderer has four modes:
- Immediate
- Single threaded, single buffer
- Single threaded, double buffer
- Multithreaded, double buffer
Immediate is useful for debugging, although it is extremely slow. Whenever a command is inserted into the render queue, it is immediately executed. This is great for stepping through code, since the rendering happens right where the simulation thread issues the command, but the performance is very bad.
Single threaded, single buffer puts all of the commands into a buffer. At the end of the frame, it goes through those commands and renders them. There is only one buffer, which is cleared at the end of the frame. This is good for making sure that your commands are executing properly, without having to worry about threading issues.
Single threaded, double buffer is almost the same as “single threaded, single buffer”, except that when the rendering happens at the end of the frame, it renders the data from the previous simulation frame. This is because there are two buffers. Again, it is easier to debug since it is a single thread, but the render queue is more difficult to follow because it holds the previous frame's data.
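The three single-threaded modes differ only in when a submitted command runs. The sketch below shows that difference under simplifying assumptions: a command is just an `int`, "executing" it appends it to a vector, and the names `Mode`, `Renderer`, `submit`, and `endFrame` are made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <vector>

enum class Mode { Immediate, SingleBuffer, DoubleBuffer };

class Renderer {
public:
    explicit Renderer(Mode mode) : mode_(mode) {}

    void submit(int cmd) {
        if (mode_ == Mode::Immediate) {
            executed.push_back(cmd);          // runs right at the call site
        } else {
            buffers_[write_].push_back(cmd);  // deferred until endFrame()
        }
    }

    void endFrame() {
        if (mode_ == Mode::Immediate) return;  // nothing was deferred
        // Single buffer: render what was just recorded.
        // Double buffer: render the other buffer, i.e. the previous frame.
        const std::size_t read =
            (mode_ == Mode::DoubleBuffer) ? 1 - write_ : write_;
        for (int cmd : buffers_[read]) executed.push_back(cmd);
        buffers_[read].clear();
        write_ = read;  // no-op for single buffer; swaps roles for double buffer
    }

    std::vector<int> executed;  // stand-in for "what reached the GPU"

private:
    Mode mode_;
    std::array<std::vector<int>, 2> buffers_;
    std::size_t write_ = 0;
};
```

In double buffer mode the first `endFrame()` renders nothing, and every later one renders the commands recorded one frame earlier, which is exactly the one-frame lag that makes the queue harder to read in a debugger.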
Multithreaded, double buffer is the normal rendering mode. At the start of the frame, the simulation thread begins to fill a render buffer with commands. At the same time, the render thread is processing the render buffer from the previous simulation frame. At the end of the frame, both threads synchronize and the buffers switch. This is the fastest mode, but very hard to debug.
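Here is a rough sketch of that frame loop using standard C++ threads. It is deliberately simplified: it spawns a fresh render thread each frame and uses `join()` as the end-of-frame synchronization, where a real renderer would more likely keep one persistent render thread and signal it. `CommandBuffer`, `runFrames`, and the two callbacks are hypothetical names for this example.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

using CommandBuffer = std::vector<int>;  // a command is just an int here

// Runs `frames` frames. `simulate(frame, out)` fills the write buffer while
// `render(in)` consumes the buffer recorded on the previous frame.
void runFrames(int frames,
               const std::function<void(int, CommandBuffer&)>& simulate,
               const std::function<void(const CommandBuffer&)>& render) {
    std::array<CommandBuffer, 2> buffers;
    std::size_t write = 0;
    for (int frame = 0; frame < frames; ++frame) {
        const std::size_t read = 1 - write;

        // Render thread: processes the previous frame's commands.
        std::thread renderThread([&] { render(buffers[read]); });

        // Simulation (this thread): records this frame's commands.
        // The two threads never touch the same buffer, so no lock is needed.
        buffers[write].clear();
        simulate(frame, buffers[write]);

        // End of frame: the only synchronization point, then switch buffers.
        renderThread.join();
        write = read;
    }
}
```

Note that in this sketch the commands recorded on the final frame are never rendered, since nothing runs after the last switch. On some toolchains this needs to be built with thread support enabled (for example `-pthread` with GCC or Clang).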
People have asked why I don’t have explicit code running in the render thread, such as object->draw(). In reality, one of the commands is a callback, which could do something like that. However, that tends to lead to thread synchronization and mutex hell. By explicitly dealing with commands, there is very little synchronization between the simulation and render threads. It also makes it much better for systems like the PS3, where the SPUs are doing a lot of the work and you have to DMA data to them anyway.
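As an illustration of how a callback can sit in the queue next to plain data commands, here is a small sketch. The `Command` struct, `CommandKind`, and the helper functions are invented for this example, and it uses `std::function` for brevity; it is not how my queue is actually laid out.

```cpp
#include <functional>
#include <utility>
#include <vector>

enum class CommandKind { DrawMesh, Callback };

struct Command {
    CommandKind kind = CommandKind::DrawMesh;
    int meshId = 0;                  // used by DrawMesh
    std::function<void()> callback;  // used by Callback
};

Command makeDraw(int meshId) {
    Command c;
    c.kind = CommandKind::DrawMesh;
    c.meshId = meshId;
    return c;
}

Command makeCallback(std::function<void()> fn) {
    Command c;
    c.kind = CommandKind::Callback;
    c.callback = std::move(fn);
    return c;
}

// Executes the list in order and returns the mesh ids that were "drawn".
std::vector<int> execute(const std::vector<Command>& commands) {
    std::vector<int> drawn;
    for (const Command& c : commands) {
        if (c.kind == CommandKind::DrawMesh) {
            drawn.push_back(c.meshId);  // pure data: no shared state touched
        } else if (c.callback) {
            c.callback();  // arbitrary code: may touch shared state
        }
    }
    return drawn;
}
```

A data command carries a copy of everything it needs, so the render thread can run it without looking at simulation state. A callback is only as safe as what it captures: capturing by value keeps it self-contained, while capturing a reference to live simulation data brings the locking problem right back.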
Hopefully, I will have some cool movies soon now that the basics are in place.