Remote Bulk Editing Using Regexp with Emacs

Dec 29th, 2021

A couple of days ago, I did maybe the weirdest and also most amazing thing on a remote machine thus far (with Emacs) and wanted to share the story.

So there’s this EmulationStation software that organizes ROMs with XML lists for metadata. I needed to change the <name> field of about 100 entries there to drop a numeric prefix. That’d be easy on my local computer with tools I know well, but I was accessing the device remotely via SSH and wanted to see what I could do.

I’m not very good at plain shell scripting, which would’ve been an option, so I fired up Emacs and connected to the device from there.

For all intents and purposes, working on a remote machine from within my local Emacs is indistinguishable from working on local files. This helps with my productivity, because I can use the tools I know well on remote machines, too. This is different from SSH’ing onto the machine and opening an editor there, because opening emacs/vi/nano on the remote machine would load that machine’s configuration. If you open your local Emacs and create a connection to the remote machine from there, your local configuration is used. It’s just like live editing web pages via FTP as we did in the ’90s.

The only exceptions are locally installed packages that interface with locally installed command-line tools (e.g. ag or ripgrep). The functions will be available, but executing these programs on the remote machine won’t work: the binary won’t be found there.

So the game metadata files I needed to adjust are grouped into directories, one per system. That meant I needed to find all gamelist.xml files in all directories first, then find the numerical prefixes within each and replace them. I’m able to perform both tasks individually from the shell, but the combination would’ve already required some research, especially since my local machine has non-POSIX tools like fd installed that I got used to, while the remote device is a bare-bones Linux. I’m also used to grep replacements, which made the search-and-replace operation a bit cumbersome. But that wouldn’t be a problem with Emacs, as you’ll see in a second.
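The search half of that task, a recursive scan that reports every matching line with its file and line number, can be sketched in a few lines of Python. This is only an illustration of what a recursive grep does, not what Emacs runs under the hood; the function name grep_recursive and the sample files are made up for the example.

```python
import re
import tempfile
from pathlib import Path


def grep_recursive(root, glob, pattern):
    """Return (path, line_number, line) for every line matching pattern."""
    regex = re.compile(pattern)
    hits = []
    # rglob walks all subdirectories below root.
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((path, number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Build a throwaway tree (hypothetical layout) so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        for system in ("nes", "snes"):
            folder = Path(tmp) / system
            folder.mkdir()
            (folder / "games.xml").write_text(
                "<game>\n  <name>001 Example Game</name>\n</game>\n",
                encoding="utf-8",
            )
        for path, number, line in grep_recursive(tmp, "*.xml", r"<name>"):
            print(f"{path.relative_to(tmp)}:{number}:{line}")
```

The file:line:text shape of each hit is the same shape a grep results buffer shows.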

A simple regex for the numerical prefixes is <name>\d\d\d\s, and the replacement string would be <name>. Emacs regular expressions are weird if you’re used to PCRE, because grouping parens and curly braces need a backslash to get their special meaning. There’s a snippet to help with that so you can write the syntax you know and have it translated to Emacs-escaped regex. I didn’t know about that, sadly, and experimented around until I found that <name>[0-9]\{0,3\} works (note the trailing space). TIL: The curly braces require escaping, \d isn’t supported at all, and \s only works as a syntax-class prefix (\s- matches whitespace), so neither behaves the way it does in PCRE.
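To make the pattern concrete outside of Emacs, here is a minimal Python sketch of the same substitution. Python writes the quantifier as {0,3} without backslashes, where Emacs needs \{0,3\}. The helper name strip_prefix and the sample lines are hypothetical.

```python
import re

# Same idea as the Emacs pattern: "<name>", up to three digits,
# then one literal space. Python does not escape the curly braces.
PREFIX = re.compile(r"<name>[0-9]{0,3} ")


def strip_prefix(line):
    """Drop a numeric prefix of up to three digits right after <name>."""
    return PREFIX.sub("<name>", line)


if __name__ == "__main__":
    print(strip_prefix("<name>042 Example Game</name>"))
    print(strip_prefix("<name>Example Game</name>"))
```

Because the quantifier allows fewer than three digits, a title that legitimately starts with a short number followed by a space would lose that number as well. Requiring exactly three digits ({3} here, \{3\} in Emacs) is the stricter choice if all prefixes have the same width.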

Now here are the steps I took:

1. Install the wgrep package if you don’t have it already. It makes grep buffers writable, which means you can edit search result buffers in place and save the changes in bulk.
2. Run M-x rgrep for a recursive search:
   - Search for <name>
   - Specify the file wildcard as *.xml
   - Specify the root folder from which to start the recursive search
3. In the results buffer that appears, hit w or run M-x wgrep-change-to-wgrep-mode to make the grep results buffer writable.
4. Run M-x replace-regexp and replace <name>[0-9]\{0,3\} (again, note the trailing space) with <name>.
5. To save all changes to all matched files, hit C-c C-c.

And that’s it!

The magic piece here is that Emacs will open, modify, and save all files according to the “diff” you produce in wgrep or “writable grep” mode. I didn’t have to figure out how to search recursively for the pattern in all XML files and afterwards perform the replacement automatically.
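Conceptually, writing an edited results buffer back to disk boils down to applying a list of (file, line number, new text) rows. The following Python sketch shows that idea under simplifying assumptions; it is not how wgrep is implemented, and the name apply_line_edits is made up for illustration.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def apply_line_edits(edits):
    """Write edited lines back to their files.

    `edits` is a list of (path, line_number, new_text) tuples with
    1-based line numbers, similar to rows of an edited grep buffer.
    Returns the number of files that were actually rewritten.
    """
    by_file = defaultdict(dict)
    for path, number, new_text in edits:
        by_file[Path(path)][number] = new_text

    changed = 0
    for path, replacements in by_file.items():
        lines = path.read_text(encoding="utf-8").split("\n")
        updated = list(lines)
        for number, new_text in replacements.items():
            updated[number - 1] = new_text
        # Only touch files whose content really differs.
        if updated != lines:
            path.write_text("\n".join(updated), encoding="utf-8")
            changed += 1
    return changed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "games.xml"
        target.write_text(
            "<game>\n  <name>001 Example Game</name>\n</game>\n",
            encoding="utf-8",
        )
        count = apply_line_edits([(target, 2, "  <name>Example Game</name>")])
        print(count, "file(s) rewritten")
        print(target.read_text(encoding="utf-8"), end="")
```

Grouping the rows by file first means each file is read and written once, no matter how many of its lines changed.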

On my Mac, I would’ve been able to achieve this 10+ years ago with TextMate (or BBEdit), too. Local changes were never much of a problem.

But with Emacs, I was able to do these changes remotely. The hardest part was figuring out how to escape what in the regex. (I used highlight-regexp to more or less interactively highlight matches.)