Curses! – A Post That Ends Up Being About Capturing Ncurses Output And Converting It To HTML

So I had an idea the other day! I started to implement it, but it turned out it was already done way better elsewhere, so I dropped it. But let’s talk about it anyway, because it’s not the worst idea ever and I had a fun experience as a result.

For some reason, I was under the impression that a really good web-based Interactive Fiction interpreter wasn’t available. I have no idea why I held this idea, because I had been using Parchment to play Z-Machine games online for the past couple of days. Maybe I figured there wasn’t a good online Glulx interpreter, but I was wrong about that too. In any case, the idea was simple: Provide a reasonable method of playing Interactive Fiction online with near-perfect emulation of how it would feel when played on a native interpreter.

The method I chose was chosen because I am lazy. It was literally to be a bunch of different things I found online all cobbled together. The fundamental plan was to run a Linux Z-Machine interpreter for each user, take ‘screenshots’ of the terminal, convert this screenshot to HTML and serve it to the user via a web interface. Input would be sent down to the interpreter from the browser, ‘screenshots’ get sent back up. This would have meant the user experienced ‘lag’ when typing, but I was just pleased that I thought I struck upon a novel method of delivering an authentic native Z-Machine experience in the browser.

Rather than talk about why that was all a terrible, terrible idea, let’s focus on the journey and not the aborted destination. While the idea didn’t pan out, I did learn a thing or two about the terminal and ANSI codes. In fact, I even found and fixed a bug in tmux! The fix hasn’t been accepted as a patch yet, and it might not be because it’s not the cleanest solution to the problem… But I did discover a bug, so yay!

So how did I end up doing that? Well, it turns out that the very first step of the journey was a doozy. Capturing output from the terminal is not usually very difficult. I was using a Z-Machine interpreter called Frotz to do the heavy-lifting of running the IF games. Naturally, my first attempt was redirecting the output of Frotz to a file and finding a tool to convert the ANSI to something more web-friendly. That’s when I came across ansi2html.sh, which didn’t work at the time for a very particular reason. Frotz uses something called ncurses to draw its screen. Rather than scroll the terminal when new text has to be printed, it prints over the existing space by using special terminal control codes called cursor movement instructions. These instructions are used by ‘printing’ them to the output. When one of these characters is ‘printed’ by the terminal, the cursor is moved according to the instruction rather than a new character being printed. After being moved, new text is drawn over the text that was previously present in that spot.

This means that Frotz doesn’t output lines delimited by newline characters. Instead, it prints out one very long line that uses these cursor instructions to move the cursor around and redraw over already printed characters. This confuses ansi2html, which tries very hard but eventually can’t keep up with the complicated sequence of cursor movements. This explanation doesn’t do the process justice, so here’s a few pictures. First up, here’s a picture of what Frotz looks like when it’s running (or if the output file is printed using ‘cat’):
Frotz

That looks pretty standard, right? Here’s what the underlying ANSI looks like:
ANSI Horror

Wow. Interestingly, you can ‘cat’ this file and the output looks exactly the same as the first screenshot (Technically speaking. You would have to strip out a screen-clear instruction first). So ‘cat’ can handle it, but ansi2html.sh produces this:
Nice try, ansi2html

It’s a good effort, but it’s clear that ansi2html can’t do this. Other tools purported to be able to convert ncurses ANSI output to something more usable, but I tried them to no avail. I needed some sort of pre-processing step: To convert this mess of ANSI characters into a more easily manageable mess of ANSI characters. Enter tmux.

Tmux is a terminal multiplexer, just like ‘screen’. It has one advantage over ‘screen’ that I was interested in: It can capture coloured ‘screenshots’ of the terminal and write them to a hardcopy. A hardcopy is basically just a screen capture, like the output file from earlier but stripped of cursor control characters and the like. Tmux generates this by examining each individual character in its viewing pane and deducing which ANSI flags would have to be set to produce that character. In effect, tmux produces a perfect facsimile of terminal output, minus any crazy terminal control characters. Generating the hardcopy isn’t hard. I opened a terminal and booted up tmux. Once within tmux (which again, functions very similarly to ‘screen’), I started Frotz and used another terminal to send the commands to the tmux process, causing it to capture its current viewing pane to a buffer and then saving that buffer to a file called tmux.hardcopy:
tmux capture-pane -eJ ; tmux save-buffer tmux.hardcopy

So did it work? We should be able to ‘cat’ the hardcopy and get back the same image as the first one.
So close

Okay, but where did the colour go?! I can see it in the header, but it suddenly disappears. Well, ignoring that, how does ansi2html perform with the hardcopy?
Nearly there

That’s so much better! But the colour is still missing. What gives? Well, it turns out that tmux has a bug which incorrectly sets certain flags. If certain attributes change between characters (in this case, the ‘reverse colours’ attribute), a special character is printed which resets all attributes. The remaining attributes are then re-added via their own special character instructions. However, the colour instructions are not refreshed like the other attributes, so suddenly we’re missing all the colour from the output from that point on because tmux believes it only has to print the colour changing code once at the start of the hardcopy, rather than after every attribute reset. So I wrote a patch for tmux! It’s one of my first experiences with contributing to open source.

As I was writing this, one of the tmux maintainers got back to me regarding my patch: It’s good, but it’s not quite there. It has its own problems, but we’re working through it to get a proper fix together. Hopefully no one will ever actually need the patch above!

Anyway, shortly after finding/fixing the tmux bug, I realised the project as it was envisaged was fairly pointless. However, I did get to find a real-life bug in something open source! And I got a blog post out of the whole experience too. And of course, this little post wouldn’t be complete without a final screenshot of ansi2html finally taking on an ncurses program and winning:
Victory!

Strictly speaking, I had to modify ansi2html to get the reverse colours working correctly. You can find my modified version at this gist here. But yeah! There you go: How to capture output from programs that use the ncurses library and convert them to html! Looks like something good came out of the journey after all. Maybe I’ll talk about how I tracked down the tmux bug next time?