Writing a fast software renderer and audio engine in JS (Pt. 1: rendering)
My game is called Torchlight's Shadow, and it's an ASCII roguelike party-based dungeon-crawler with dynamic lighting and sound.
After spending about a month rewriting my audio engine and renderer pretty much from scratch, I wanted to document my process.
First, the renderer rewrite. When I originally started working on the game, I thought it would be fun to actually use text to render the display. That meant filling an HTML element with a big block of characters representing the screen, and including an outrageous number of <span>s to specify the foreground and background color of each tile. This was incredibly slow, but it didn't matter much while I was still just building out the proof of concept, and I also just thought it was fun to render my text display as actual text that you could copy and paste into a Microsoft Word document if you wanted to. Everything ran on the main thread though, so rebuilding the display's innerHTML every frame ate too much of the per-frame time budget and became unworkable. I switched to an HTML canvas element, drawing background colors as rectangles and the foreground as colored text.
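As a rough illustration of that canvas approach (not the game's actual code), here is a minimal sketch. The tile shape `{ x, y, ch, fg, bg }` and the function name `drawTiles` are assumptions made up for this example; `ctx` is an ordinary 2D canvas context.

```javascript
// Hypothetical tile shape: { x, y, ch, fg, bg }, with fg/bg as CSS color strings.
// Draws each tile's background as a rectangle, then its character on top.
function drawTiles(ctx, tiles, tileW, tileH) {
  ctx.textBaseline = "top"; // so fillText's y is the top edge of the tile
  for (const t of tiles) {
    const px = t.x * tileW;
    const py = t.y * tileH;
    ctx.fillStyle = t.bg;
    ctx.fillRect(px, py, tileW, tileH);
    ctx.fillStyle = t.fg;
    ctx.fillText(t.ch, px, py);
  }
}
```

Note that this version sets fillStyle twice per tile, which is exactly the kind of per-tile overhead that becomes worth removing later.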
After spending a frustratingly long time trying to figure out how to properly scale elements on the canvas and the canvas itself to the screen, I got it working and got a huge speedup. Rendering still ate up a lot of time on the main thread though. Then I discovered that there was such a thing as an OffscreenCanvas, which can be drawn on from a worker thread. That meant the actual rendering could be done on a separate thread from the game logic, and the main thread could just pass off the information needed to render the display. Previously I had been under the impression that a worker thread could only be so useful, because I thought that the main thread was needed to interact with the DOM. Once I discovered the offscreen canvas, moving my rendering logic to a worker thread was the obvious next move.
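For anyone who hasn't used it, handing a canvas to a worker looks roughly like the sketch below. The message shape `{ type, canvas }` and the function names are assumptions for illustration; `transferControlToOffscreen`, `postMessage` with a transfer list, and `getContext("2d")` are the standard browser calls.

```javascript
// Main thread: move control of a <canvas> to a worker.
// The { type: "init", canvas } message shape is a made-up convention.
function attachRenderer(canvas, worker) {
  const offscreen = canvas.transferControlToOffscreen();
  // The second argument is the transfer list: the OffscreenCanvas is moved
  // to the worker rather than copied.
  worker.postMessage({ type: "init", canvas: offscreen }, [offscreen]);
}

// Worker side: keep the 2D context around for later draw messages.
function handleMessage(state, msg) {
  if (msg.type === "init") {
    state.ctx = msg.canvas.getContext("2d");
  }
  return state;
}

// Inside the worker script this would be wired up as something like:
//   const state = {};
//   self.onmessage = (e) => handleMessage(state, e.data);
```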
I had never done much with workers before, other than briefly using them for a wave-function-collapse algorithm that was too slow to run on the main thread in a reasonable amount of time, but I quickly learned that I was going to need to repackage the information needed to draw the screen into simpler data types. I couldn't quickly pass a nested 2D or 3D array that encoded xy values and RGB values by array indices anymore unless I wanted to deal with the overhead of structured cloning (i.e. copying the data to the worker rather than just handing over the memory and marking it as belonging to the worker). This meant I wanted to pack RGB values into unsigned-integer typed arrays and transfer ownership of their buffers.
I sorted the background render instructions and the foreground render instructions by color so that I would only have to change fillStyle once for each unique color on the screen. For each color, I wrote the RGB value once into an array of colors (as three sequential values in a Uint8Array), wrote the number of tiles that needed to be that color into a second array, and wrote the x and y coordinates of those tiles into a final array (also stored in sequential elements). So the information would arrive at the worker as an array of colors, an array of lengths, and an array of coordinates, for both the foreground and the background. The worker just had to take the color at elements 0, 1, and 2 of the color array, paint it at the coordinates at elements 0 and 1 of the positions array, and keep painting that same color at the following coordinates until it had drawn as many tiles as the length in element 0 of the lengths array. Then it would move to the next color and length and do the same.
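The packing and unpacking described above can be sketched as follows. This is an illustration under assumptions, not the game's code: the input tile shape `{ x, y, r, g, b }`, the function names, and the choice of Uint32Array for lengths and Uint16Array for coordinates are all made up for the example.

```javascript
// Main thread: group tiles by color and pack them into three typed arrays.
function packByColor(tiles) {
  // 24-bit key so tiles with the same color sort next to each other.
  const key = (t) => (t.r << 16) | (t.g << 8) | t.b;
  const sorted = [...tiles].sort((a, b) => key(a) - key(b));

  const colors = [];  // r, g, b once per unique color
  const lengths = []; // how many tiles use each color
  const coords = new Uint16Array(sorted.length * 2); // x, y per tile
  let prev = -1;
  for (let i = 0; i < sorted.length; i++) {
    const t = sorted[i];
    const k = key(t);
    if (k !== prev) {
      colors.push(t.r, t.g, t.b);
      lengths.push(0);
      prev = k;
    }
    lengths[lengths.length - 1]++;
    coords[i * 2] = t.x;
    coords[i * 2 + 1] = t.y;
  }
  return {
    colors: Uint8Array.from(colors),
    lengths: Uint32Array.from(lengths),
    coords,
  };
}

// Worker: walk the runs, setting fillStyle once per unique color.
function drawPacked(ctx, packed, tileW, tileH) {
  const { colors, lengths, coords } = packed;
  let c = 0; // index of the next tile in coords
  for (let run = 0; run < lengths.length; run++) {
    const r = colors[run * 3];
    const g = colors[run * 3 + 1];
    const b = colors[run * 3 + 2];
    ctx.fillStyle = `rgb(${r},${g},${b})`;
    for (let n = 0; n < lengths[run]; n++, c++) {
      ctx.fillRect(coords[c * 2] * tileW, coords[c * 2 + 1] * tileH, tileW, tileH);
    }
  }
}

// Sending without a copy: list the underlying buffers as transferables.
// After this call the arrays are detached (unusable) on the sending side.
//   worker.postMessage(packed, [
//     packed.colors.buffer, packed.lengths.buffer, packed.coords.buffer,
//   ]);
```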
This meant my renderer would be fast enough that latency was never going to be an issue: the screen could render at 60 fps no problem, and the rendering would never get in the way of the main thread. Only after I was mostly done with the audio engine did I return to remove one final performance bottleneck, which was the use of the very slow fillStyle and fillText operations on the HTML canvas. After doing some digging I found that a much faster way to draw to a canvas was to write to the image data directly, and that image data is exposed as a Uint8ClampedArray of RGBA values. This was very nearly how I was already encoding my data. The differences were that the order is based on location (which mine was before I sorted by color) and that it includes alpha values, which I didn't bother with in my original implementation. I quickly discovered that if all I wanted was squares, I could simply draw one pixel per tile to a small canvas and then, once I had drawn them all, scale it up to the size of the screen. Writing integers into a buffer and then scaling it once turned out to be orders of magnitude faster than what I was doing before, and made background color filling nearly free.
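A minimal sketch of the one-pixel-per-tile idea, with assumed names throughout (`writeTilePixel`, `present`, and the two-canvas setup are illustrative, not taken from the game). The buffer layout is the standard ImageData one: four bytes per pixel in R, G, B, A order, row by row.

```javascript
// Write one RGBA pixel for the tile at (x, y).
// `data` is a Uint8ClampedArray of length cols * rows * 4.
function writeTilePixel(data, cols, x, y, r, g, b) {
  const i = (y * cols + x) * 4;
  data[i] = r;
  data[i + 1] = g;
  data[i + 2] = b;
  data[i + 3] = 255; // fully opaque
}

// Browser/worker only: push the small buffer to a cols x rows canvas, then
// stretch it over the full-size canvas in a single drawImage call.
// `smallCtx` and `screenCtx` are assumed to be 2D contexts of those canvases.
function present(smallCtx, screenCtx, data, cols, rows) {
  smallCtx.putImageData(new ImageData(data, cols, rows), 0, 0);
  screenCtx.imageSmoothingEnabled = false; // keep hard tile edges when scaling
  screenCtx.drawImage(
    smallCtx.canvas,
    0, 0, screenCtx.canvas.width, screenCtx.canvas.height
  );
}
```

Turning off image smoothing matters here: with it on, the browser would typically interpolate between neighboring pixels and blur the tile edges.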
To get the same kind of speed-up for foreground text, I ended up writing my own text rasterizer. Originally I thought I could just use caching to solve my problem. The issue was that I would only get a speed-up if I could draw each character with the slow fillText call once and then reuse the cached result, but with my dynamic lighting that meant caching an outrageous number of tiles. Naively it would be on the order of (256^3)*N, where N is the number of glyphs in my font, and 256^3 is already about 16.7 million colors per glyph. In actuality the number would probably be orders of magnitude smaller than that, because most colors and most characters never get displayed, but the idea of coming up with a data structure that could store and fetch millions or billions of values every frame without becoming a performance bottleneck was not appealing.
Luckily I didn't need to cache all of that. If I rasterized the text myself, by drawing each character once with the slow fillText call, scanning the resulting image, and storing only whether there was text at each pixel or not, I could keep that small buffer in the cache. Then when drawing text I didn't need to render an image over the tile where the character would go. I could just write pixels of the desired color, and only to the pixels that the cached raster of the glyph marked as text.
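Here is one way the glyph-mask idea could look in code. It is a sketch under assumptions: the function names, the alpha threshold of 127, and the full-resolution RGBA frame buffer (one glyph cell of `glyphW` x `glyphH` pixels per tile) are all choices made for the example rather than details from the game.

```javascript
// Turn RGBA pixels of a rendered glyph into a 1-byte-per-pixel mask:
// 1 where there is "ink" (alpha above the threshold), 0 elsewhere.
function maskFromRGBA(rgba, width, height, threshold = 127) {
  const mask = new Uint8Array(width * height);
  for (let p = 0; p < mask.length; p++) {
    mask[p] = rgba[p * 4 + 3] > threshold ? 1 : 0;
  }
  return mask;
}

// Write color (r, g, b) into the RGBA frame buffer, but only at pixels the
// mask marks as ink. The glyph lands in the tile at (tileX, tileY).
function blitGlyph(frame, frameW, mask, glyphW, glyphH, tileX, tileY, r, g, b) {
  const x0 = tileX * glyphW;
  const y0 = tileY * glyphH;
  for (let gy = 0; gy < glyphH; gy++) {
    for (let gx = 0; gx < glyphW; gx++) {
      if (!mask[gy * glyphW + gx]) continue;
      const i = ((y0 + gy) * frameW + (x0 + gx)) * 4;
      frame[i] = r;
      frame[i + 1] = g;
      frame[i + 2] = b;
      frame[i + 3] = 255;
    }
  }
}

// Browser/worker only: the one slow fillText call per glyph. `ctx` is assumed
// to be a 2D context with the font already set and at least glyphW x glyphH px.
function rasterizeGlyph(ctx, ch, glyphW, glyphH) {
  ctx.clearRect(0, 0, glyphW, glyphH);
  ctx.textBaseline = "top";
  ctx.fillStyle = "#fff";
  ctx.fillText(ch, 0, 0);
  const { data } = ctx.getImageData(0, 0, glyphW, glyphH);
  return maskFromRGBA(data, glyphW, glyphH);
}
```

With this layout the cache only needs one small mask per glyph, keyed by character (for example in a Map), no matter how many colors the lighting produces.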
After that final adjustment, my renderer could basically draw to the screen as fast as I could send data to it. At the normal game resolution, it can render a frame in about 0.5 ms. The frame budget is about 16.7 ms at 60 fps and about 8.3 ms at 120 fps, which is the refresh rate of my monitor. Roughly sixteen times faster than it needs to be, even at 120 fps, is great headroom, and it also means I won't have to worry, within reason, about performance on other machines, about changing resolution, or about adding more work per frame.
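The headroom figure is just the frame budget divided by the measured frame time. Spelled out, with helper names that are made up for this example:

```javascript
// Time available per frame at a given refresh rate, in milliseconds.
function frameBudgetMs(hz) {
  return 1000 / hz;
}

// How many times over the renderer fits into that budget.
function headroom(frameMs, hz) {
  return frameBudgetMs(hz) / frameMs;
}

headroom(0.5, 120); // about 16.7
headroom(0.5, 60);  // about 33.3
```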
I didn't start this project in JavaScript because I thought it was suited to making games, or to making this game, but just because it was what I remembered most from my days of hobbyist game dev when I was younger. What I've learned as I work with it is not that it is inherently slow, but rather that it gives you lots and lots of options for extremely slow ways to do things, and usually only one or two options for doing them quickly. For example, when optimizing my lighting engine I saw a huge speed-up when I replaced my forEach loops with for loops, and I basically stopped using forEach entirely. JS doesn't give you any kind of warning when you use tools that will slow your code down by 10x or 100x. You just have to profile enough to catch the places where you shot yourself in the foot. It also gives you lots of ways to not have to think about what is going on in your program. This became extremely obvious when I reworked my audio engine to be multi-threaded. I thought it would only take a weekend, or maybe a week at most, like moving my renderer to its own thread did. It turned out to be a weeks-long project, and I learned that I really didn't know how much was being done for me by the Web Audio API I had been using before.
Subtle foreshadowing for Pt.2 where I discuss the new audio engine.
Torchlight's Shadow
Party-based ASCII-art procedurally-generated dungeon crawler with full dynamic lighting and audio
| Status | In development |
| Author | UnspeakableEmptiness |
| Genre | Role Playing |
| Tags | ascii, Atmospheric, Dungeon Crawler, Roguelike, Tactical RPG |