Bonus Drop #117 (2026-06-07): What The Shell?!

A tour of q3cpma’s “World Playground”


[Image: scattered seashells on a sandy beach at low tide. Photo by Stéphane Pruzsina on Pexels.com]

There’s a repository most folks likely haven’t stumbled across yet: 85 shell scripts at q3cpma’s SourceHut that are some of the most carefully engineered bits of shell code I’ve come across in 3+ decades of hacking. Pure /bin/sh throughout (i.e., no bashisms) running on Linux, all the BSDs, macOS, Illumos, Solaris 10/11, and even (ugh) AIX.

The README ships a portability matrix tracking realpath, xargs -0, and sed -E across a dozen platforms. Most scripts use local variables (accepted into POSIX but, TIL, not yet ratified). The README’s closing line is: “For your sanity, please use something like Tcl or a Scheme.”, which should give y’all a clue as to the type of mind we’re dealing with (Tcl…really?).

Here’s what’s actually in there, with a caveat that there is no way I could give a detailed explanation of every script and function, so we’re cherry picking the coolest, craziest, and most useful ones. I’ve tried to ensure there’s a justification for each pick. Drop a note in the comments if any others in the repo catch your eye!

You’ll need/want to clone the repo so you can pull up the sources as you go along.

We’ll start with fqueue


fqueue implements a concurrent file-based queue in 46 lines using flock(1) and ed (yes, the line editor, still shipping on nearly every Unix-alike system you’ll ever touch). The pop operation is a four-command ed script:

1p
d
w
q

A single printf feeds these to ed’s stdin, one command per line: print line 1, delete it, write the file, quit. It all happens in one ed session held under flock, so concurrent callers can’t interleave. The full invocation:

flock -- "$path" sh -c '[ -s "$1" ] && printf "%s\n" 1p d w q | ed -s -- "$1"' sh "$path"

So what? This matters for two reasons. First, ed is specified by POSIX, so it’s in the base system almost everywhere (a few minimal Linux installs leave it out), which means the editing half of this queue works without installing anything. Second, the printf "%s\n" 1p d w q | ed pattern generalizes: any time you need to make a surgical edit to a file from a script, ed in script mode is lighter than sed -i (which has different flags on BSD vs GNU) and more portable than trying to do it in pure shell. The one piece POSIX doesn’t specify is flock(1), which on Linux comes from util-linux. A concurrent, lock-protected queue built from little more than base-system tools is a demonstration of what’s possible when you know what’s actually in the base system.

This one pattern — flock + ed script — is worth cloning the repo for on its own. You will need this someday.
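Here’s a minimal sketch of how the pattern could be wrapped into push and pop helpers. The names fq_push and fq_pop are hypothetical (they are not taken from the repo), and the sketch assumes flock(1) and ed are both on your PATH:

```shell
# Hypothetical helpers illustrating the flock + ed pattern; not the repo's code.
# Assumes flock(1) and ed(1) are installed.

# fq_push QUEUE_FILE ITEM: append one line while holding the lock.
fq_push() {
    flock -- "$1" sh -c 'printf "%s\n" "$2" >>"$1"' sh "$1" "$2"
}

# fq_pop QUEUE_FILE: print and remove the first line.
# Returns non-zero when the queue file is empty.
fq_pop() {
    flock -- "$1" sh -c '[ -s "$1" ] && printf "%s\n" 1p d w q | ed -s -- "$1"' sh "$1"
}
```

The trailing sh "$1" arguments matter: the first becomes $0 inside sh -c and the rest become $1, $2, so the path and item never get spliced into the command string.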


attrfilter takes pairs of awk patterns and terminal attributes — bold, red, green, whatever tput supports — and colorizes matching lines on stdin by building an awk script at runtime from those pattern/action pairs. The awk script it constructs is built incrementally with each --pattern attribute pair, then executed as a single awk invocation. Each pattern gets a next to avoid double-matching. Unmatched lines pass through unchanged. It handles SGR reset correctly — something a lot of ad-hoc colorizers get wrong — by wrapping each match in tput sgr0 on both sides.

colordiff is the punchline:

diff -u "$@" | attrfilter \
'^[-]{3}|^[+]{3}' 'bold' \
'^\+' 'green' \
'^\-' 'red'

A diff colorizer that doesn’t know anything about diffs. The colorizer handles terminal attributes via tput; the caller knows diff syntax. That’s the Unix pipe model doing exactly what it’s supposed to do.

So what? This is a pattern you can reuse every time you need to highlight specific lines in terminal output. attrfilter itself is the reusable core (a generic awk-based pattern colorizer) and colordiff is just one application. You could build colordmesg, colorjournalctl, or colorgcc the same way. The runtime awk construction technique is also worth studying: it’s how you build dynamic filters in shell without resorting to eval or temp files. Each --pattern attr pair appends to the awk script, and everything runs in a single process. That’s the right way to do it.
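To make the runtime-construction idea concrete, here’s a rough sketch of the same technique. It is not attrfilter’s code: hl is a hypothetical name, it takes raw ANSI SGR numbers (31 for red, 1 for bold) instead of going through tput, and it hands the patterns to awk through ENVIRON so nothing needs escaping inside the generated program:

```shell
# hl PATTERN SGR [PATTERN SGR]...: hypothetical sketch, not the repo's attrfilter.
# SGR is a raw ANSI code such as 31 (red); attrfilter itself uses tput.
hl() (
    # The ( ) body runs in a subshell, so the exported variables stay local.
    prog='' n=0
    while [ "$#" -ge 2 ]; do
        n=$((n + 1))
        export "HL_PAT_$n=$1" "HL_SGR_$n=$2"
        # One awk rule per pair; "next" stops later rules from matching the same line.
        rule=$(printf '$0 ~ ENVIRON["HL_PAT_%d"] { printf "\\033[%%sm%%s\\033[0m\\n", ENVIRON["HL_SGR_%d"], $0; next }' "$n" "$n")
        prog="$prog$rule
"
        shift 2
    done
    # Final rule: pass unmatched lines through untouched.
    awk "$prog{ print }"
)
```

Something like dmesg | hl 'error' 31 'warn' 33 would then paint matching lines red or yellow and leave everything else alone.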


jsondiff diffs two JSON files while ignoring key ordering, and it does it without temp files:

mkfifo -m 600 "$fifo1" "$fifo2"
jq --sort-keys . "$1" >"$fifo1" &
jq --sort-keys . "$2" >"$fifo2" &
colordiff "$@" "$fifo1" "$fifo2"

Two jq --sort-keys processes run in parallel, each writing to a named FIFO, with colordiff reading from both. No temp files means no normalized copies written to disk and no sizable garbage if the script is killed: at worst, two empty FIFO nodes are left behind. The FIFOs are created, wired up, and removed when the script exits. This is extremely clever!

So what? This is the canonical named FIFO pattern — mkfifo, background producer processes, consumer reads both — and it solves a real problem. Normal diff on JSON is useless because key ordering differs between every tool that produces JSON. jq --sort-keys normalizes both files, and the FIFOs let the two normalization runs happen in parallel. The technique generalizes to any situation where you need to run N parallel transformations and diff the results. And it uses absolutely nothing beyond POSIX FIFOs and jq — no Python, no Node, no js-beautify.
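The same shape works for any pair of normalizers. Below is a small sketch that swaps jq --sort-keys for plain sort, so it runs without jq. The function name sorted_diff is made up for illustration, and it leans on mktemp -d, which is widely available but not in POSIX:

```shell
# sorted_diff FILE1 FILE2: hypothetical sketch of the two-FIFO pattern.
# "sort" stands in for "jq --sort-keys"; mktemp -d is assumed to exist.
sorted_diff() {
    _sd_dir=$(mktemp -d) || return 2
    mkfifo -m 600 "$_sd_dir/a" "$_sd_dir/b" || { rm -rf -- "$_sd_dir"; return 2; }
    # Both producers block on open until diff starts reading their FIFO.
    sort -- "$1" >"$_sd_dir/a" &
    sort -- "$2" >"$_sd_dir/b" &
    diff -- "$_sd_dir/a" "$_sd_dir/b"
    _sd_status=$?
    wait                      # reap the two background producers
    rm -rf -- "$_sd_dir"      # the FIFO nodes still need removing
    return "$_sd_status"
}
```

Like diff itself, it returns 0 when the normalized inputs match and non-zero when they don’t.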


The sponge implementation is 12 lines. If you’ve used moreutils, you know what it does: soaks up stdin before writing to a file so you can safely run command file.txt | sponge file.txt without truncating your input mid-read.

sponge=$1.sponge~
trap 'rm -- "$sponge"; exit 1' HUP INT QUIT TERM
if cat >"$sponge"
then mv -f -- "$sponge" "$1"
else rm -- "$sponge"; exit 1
fi

Cleanup trap, atomic rename on success, explicit failure handling. Nothing left to chance.

So what? First, there is a non-zero chance your system doesn’t have moreutils installed, but you’re reading this because you write shell scripts, which means you’ll need sponge eventually. Now you have the implementation memorized and can type it out in 30 seconds. Second, and more importantly, this is a template for any “write-to-file-after-processing” operation in shell. The pattern is: write to a temp file next to the target (same directory, so the same filesystem, which is what makes the mv atomic), set a cleanup trap, then mv on success. The trap covers signals that would leave the temp file behind. The if/else covers the case where cat itself fails (disk full, I/O error). Most people’s ad-hoc versions miss at least one of these. This is the correct version.
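If you’d rather carry it around as a function than as a separate script, one possible shape is below. sponge_to is a hypothetical name, and the only real change from the snippet above is the subshell body, which keeps the trap from clobbering traps in the calling script:

```shell
# sponge_to FILE: hypothetical function form of the sponge snippet.
# The ( ) body is a subshell, so the trap does not leak into the caller.
sponge_to() (
    tmp=$1.sponge~
    trap 'rm -f -- "$tmp"; exit 1' HUP INT QUIT TERM
    if cat >"$tmp"
    then mv -f -- "$tmp" "$1"      # same directory, so the rename is atomic
    else rm -f -- "$tmp"; exit 1
    fi
)
```

Usage is the classic in-place edit: sort -u hosts.txt | sponge_to hosts.txt.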


tabulate draws tables with Unicode box-drawing characters — ┌┬┐├┼┤└┴┘│─ — in pure awk. It handles custom field separators, record separators, and alignment. It calls textwidth via getline for proper CJK character width measurement (essential for aligned tables with mixed-width characters), falls back to awk’s built-in length() when textwidth isn’t available, and has an ASCII fallback mode (+-+-+) for dumb terminals or when you need plain output.

The formatting engine is an awk script passed as a variable to awk -v. Box-drawing characters are constructed with awareness of which edge they’re on — corner, tee, cross — building each row’s border character by character. Column widths are computed in a first pass over all data, then the table is rendered in a second pass, with proper padding and alignment applied per column.

So what? The standard column -t leaves you with boring whitespace-aligned output and no Unicode awareness or border characters. This is a drop-in column -t replacement when you want something that looks good in a terminal or a README. More importantly, it demonstrates a technique for writing substantial awk formatters: the awk script lives in a heredoc with shell variables interpolated at script-generation time (controlling ASCII vs Unicode mode, separator characters, etc.), keeping the awk code readable while letting the shell handle runtime configuration. That hybrid shell-plus-heredoc-awk pattern runs through a lot of the scripts in this repo, and it’s worth adopting.
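The two-pass idea (measure every column first, render second) is easy to see in miniature. This is a stripped-down sketch and not tabulate’s code: ascii_table is a hypothetical name, it only reads tab-separated input, only draws the ASCII borders, and measures with awk’s length() rather than a real display-width helper:

```shell
# ascii_table: hypothetical two-pass table sketch (tab-separated stdin, ASCII borders).
# Widths come from awk's length(), so wide CJK characters will not line up.
ascii_table() {
    awk -F '\t' '
        function rule(    i, j, s) {
            s = "+"
            for (i = 1; i <= ncol; i++) {
                for (j = 0; j < width[i] + 2; j++) s = s "-"
                s = s "+"
            }
            return s
        }
        {   # Pass 1: remember every cell and track the widest one per column.
            if (NF > ncol) ncol = NF
            for (i = 1; i <= NF; i++) {
                cell[NR, i] = $i
                if (length($i) > width[i]) width[i] = length($i)
            }
        }
        END {   # Pass 2: render with left-aligned padding.
            print rule()
            for (r = 1; r <= NR; r++) {
                line = "|"
                for (i = 1; i <= ncol; i++)
                    line = line sprintf(" %-" width[i] "s |", cell[r, i])
                print line
            }
            print rule()
        }'
}
```

Feeding it printf 'name\tqty\napple\t3\n' produces a four-line boxed table with both columns padded to their widest cell.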


One function from util.sh replaces itself after the first call:

rand() {
_rand_state=$(od -An -N4 -t u4 /dev/urandom | tr -d '[:blank:]')
rand() {
printf '%u\n' $((_rand_state = (_rand_state * 1103515245 + 12345) & 2147483647))
}
rand
}

First invocation seeds from /dev/urandom, overwrites its own function definition with a fast LCG (the classic glibc rand() constants: multiplier 1103515245, increment 12345, mask 2147483647), then calls the new version. Every subsequent call hits the LCG path directly. Runtime self-modification in shell — which sounds like it shouldn’t work until you read it twice and realize it’s completely obvious.

So what? This solves two problems at once. Problem one: od on /dev/urandom is a real entropy source but calling it every time you need a random number is expensive and uses up entropy. Problem two: you don’t want a predictable seed. The solution is to pay the entropy cost exactly once, then switch to a fast PRNG. The self-replacing function is the elegant delivery mechanism: the first call is the constructor, and the constructor replaces itself with the production implementation. This pattern — a function that “installs” itself — generalizes to any one-time initialization: loading config, checking for tool availability, computing derived values. Instead of a separate init step that callers must remember, the function ensures it’s ready on first use and never slows down subsequent calls.
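Here’s what that generalization might look like for a tool-availability check. Everything in this sketch is an assumption made for illustration: cpu_count is a made-up name, the _cpu_probes counter exists only to show the probe runs once, and getconf _NPROCESSORS_ONLN is a common but non-standard query:

```shell
# cpu_count: hypothetical self-replacing function; probes once, then serves a cached value.
cpu_count() {
    # Counter kept purely for demonstration: it should never go past 1.
    _cpu_probes=$(( ${_cpu_probes:-0} + 1 ))
    _cpu_count=$(getconf _NPROCESSORS_ONLN 2>/dev/null) ||
        _cpu_count=$(nproc 2>/dev/null) ||
        _cpu_count=1
    # Guard against empty or non-numeric answers from the probes.
    case $_cpu_count in ''|*[!0-9]*) _cpu_count=1 ;; esac
    # Replace this definition with the cheap version, then answer the first call.
    cpu_count() { printf '%s\n' "$_cpu_count"; }
    cpu_count
}
```

One caveat that applies to any function of this shape: the replacement only sticks if the first call happens in the current shell. A first call made inside $(...) runs in a subshell, so the redefinition (and any state it set up) is thrown away when the substitution returns.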


pararun wraps xargs -P with the things xargs leaves out. You specify how many jobs with -j (defaulting to nproc), and it runs them in parallel via xargs -P while adding:

  • A progress counter showing completed/total jobs
  • An ETA calculated by awk from actual elapsed times, not a flat estimate
  • Job numbering so you can correlate output to specific tasks
  • Nested invocation detection — running pararun inside pararun detects it and disables parallelism in the inner run so you don’t silently oversubscribe your CPU by N^2

The ETA is the spiffy part. Each job process writes its completion time to a FIFO. A background while read loop accumulates these timestamps and feeds them into an awk formula that accounts for parallel slot availability: (elapsed / finished_jobs) * ceil((total_jobs - finished_jobs) / slots). The display updates in place without spamming your terminal (it uses \r to overwrite the current line). Jobs are spawned through xargs -P using a numbered per-job wrapper that tags output and forwards exit codes.

So what? Running things in parallel with xargs -P is the right portable approach: it doesn’t require GNU Parallel (which usually has to be installed separately) and it works with the xargs that’s already on the system. But raw xargs -P gives you no feedback and no job tracking. pararun is the missing UI layer. If you’ve ever run find . -name '*.flac' | xargs -P4 flac -d and stared at a blank terminal wondering if anything is happening, this is the script you wanted. The nested-detection pattern is also worth noting: checking an environment variable ($PARARUN_ACTIVE) and falling back to sequential execution is a general technique for preventing recursive parallelism blowups in any tool.
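The ETA formula quoted above is small enough to try on its own. This sketch only implements that arithmetic; the function name eta_seconds and its argument order are hypothetical, and none of pararun’s FIFO plumbing is included:

```shell
# eta_seconds ELAPSED FINISHED TOTAL SLOTS: hypothetical helper for the formula
#   (elapsed / finished_jobs) * ceil((total_jobs - finished_jobs) / slots)
eta_seconds() {
    awk -v elapsed="$1" -v finished="$2" -v total="$3" -v slots="$4" 'BEGIN {
        if (finished <= 0) { print "?"; exit }   # nothing finished yet, no estimate
        rounds = (total - finished) / slots
        if (rounds > int(rounds)) rounds = int(rounds) + 1   # ceil()
        printf "%d\n", (elapsed / finished) * rounds
    }'
}
```

With 4 of 10 jobs done after 60 seconds on 2 slots, the average is 15 seconds per job and 3 more rounds are needed, so eta_seconds 60 4 10 2 prints 45.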


Special Callout: util.sh — The Foundation Layer

I’ve been sprinkling references to util.sh throughout, but it deserves its own spotlight. This is the shared library that most of the other scripts source, and at 592 lines it’s the biggest single file in the repo. It’s also the densest concentration of portable shell technique I’ve seen in one place.

Here’s a quick catalog of what’s in there, beyond the self-modifying rand() we already covered:

Portable reimplementations of missing tools. Shell functions that replace seq, tac, nproc, shuf, readlink -f, and mktemp on systems that lack them. Each one detects the native binary first (e.g., gseq on BSD, gnureadlink on macOS), defines a shell function as the fallback, and handles the edge cases the real tool is expected to handle. The readlinkf function alone is worth studying: it resolves symlinks through cd -P in a loop, handling the case where the last path component doesn’t exist (which tricks simpler implementations).
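The detect-then-fall-back shape looks roughly like this. It’s a deliberately tiny sketch, not util.sh’s version: seq_sh is a hypothetical name and only handles the two-argument, step-of-one integer case:

```shell
# seq_sh FIRST LAST: hypothetical pure-shell stand-in (integers, step 1 only).
seq_sh() {
    _seq_i=$1
    while [ "$_seq_i" -le "$2" ]; do
        printf '%d\n' "$_seq_i"
        _seq_i=$((_seq_i + 1))
    done
}

# Only shadow the name when no native seq is on PATH.
if ! command -v seq >/dev/null 2>&1; then
    seq() { seq_sh "$@"; }
fi
```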

Higher-order functions in shell. map, filter, any, every: actual functional programming primitives that operate on newline-delimited lists piped through stdin. map applies a command to each line and collects output. filter and any use a command as a predicate. These are implemented as shell functions that use while read loops and accumulate results in variables, and they work because the predicates are simple commands (exit 0 for true, exit 1 for false). It’s a surprisingly natural fit in shell.
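A bare-bones take on two of these, to show how little is needed. The names map_lines and filter_lines are made up so they can’t be mistaken for the util.sh functions, and unlike the description above this sketch streams results instead of accumulating them in variables:

```shell
# map_lines CMD [ARGS...]: run CMD once per stdin line, with the line as the last argument.
map_lines() {
    while IFS= read -r _hof_line; do
        # </dev/null keeps CMD from eating the rest of the input.
        "$@" "$_hof_line" </dev/null
    done
}

# filter_lines CMD [ARGS...]: keep only the lines for which CMD exits 0.
filter_lines() {
    while IFS= read -r _hof_line; do
        if "$@" "$_hof_line" </dev/null; then
            printf '%s\n' "$_hof_line"
        fi
    done
}
```

So ls -d /etc/* | filter_lines test -d keeps only the directories, with test -d acting as the predicate.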

The atexit system. push_atexit and pop_atexit maintain an EXIT trap with a stack of cleanup commands. You push cleanup actions as you allocate resources, and they run in reverse order when the script exits — even if it’s killed by a signal. This is the pattern you should use in every non-trivial script that creates temp files, locks, or other resources.
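For a feel of how such a stack can be built, here’s one possible sketch. It is not the util.sh implementation: the atexit_push, atexit_pop, and atexit_run names are hypothetical, each entry must be a single-line command string, and the signal traps are there because POSIX sh doesn’t promise the EXIT trap fires on a fatal signal unless the shell exits through exit:

```shell
# Hypothetical atexit stack; entries are single-line command strings.
_atexit_nl='
'
_atexit_stack=''

# atexit_push CMD: newest entry goes on top, so cleanup runs in reverse order.
atexit_push() {
    _atexit_stack="$1$_atexit_nl$_atexit_stack"
}

# atexit_pop: discard the newest entry without running it.
atexit_pop() {
    _atexit_stack=${_atexit_stack#*"$_atexit_nl"}
}

# atexit_run: run and remove entries, newest first.
atexit_run() {
    while [ -n "$_atexit_stack" ]; do
        _atexit_cmd=${_atexit_stack%%"$_atexit_nl"*}
        atexit_pop
        eval "$_atexit_cmd"
    done
}

trap atexit_run EXIT
# Turn fatal signals into a normal exit so the EXIT trap still runs.
trap 'exit 129' HUP
trap 'exit 130' INT
trap 'exit 143' TERM
```

A typical call site would be tmp=$(mktemp) followed by atexit_push "rm -f -- '$tmp'", with atexit_pop once the file has been handed off and no longer needs cleaning up.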

Utility functions. url_encode/url_decode in pure shell (I’ve needed that more often than I would have thought in the past six months). head_neg for “show everything except the last N lines.” sselect — an interactive terminal menu in pure shell. text_format for applying bold/underline to man page-style text via regex. fat_sanitize for making filenames VFAT-compatible. quote and dquote for proper shell quoting. timefmt for formatting timestamps. list_join for assembling comma-separated lists.

So what? util.sh is the Rosetta Stone for portable shell programming. Every function in it solves a problem that comes up in real scripts, and the implementations show you how to do it right across every Unix variant. If you’ve ever been burned by readlink -f not existing on a BSD system, or written a brittle seq replacement, or hacked together an atexit pattern with a single trap that broke when a second library tried to use it — every one of those fixes is in this file. Clone the repo just for util.sh. Read it like a textbook.


What holds the whole repo together isn’t any single script. Every script uses printf instead of echo. Every script handles the local var=$(command) portability footgun correctly: POSIX doesn’t guarantee sequential binding, so the pattern throughout is local var=; var=$(command). The .shellcheckrc is enforced, and the portability matrix is real data, not aspirational documentation. The consistency across all the tooling is admirable, but it’s also a bit shaming (at least to me…I truly wish I was this precise in more areas).
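One easily demonstrated problem with the one-line form is that the exit status of the command substitution gets replaced by the exit status of local itself. The two function names below are hypothetical, and the sketch assumes a shell that has local at all (dash, bash, busybox ash, and friends):

```shell
# Assumes a shell that provides "local".
status_masked() {
    local out=$(false)      # $? is now the status of "local", not of "false"
    printf '%s\n' "$?"
}

status_kept() {
    local out=
    out=$(false)            # a plain assignment reports the substitution's status
    printf '%s\n' "$?"
}
```

In dash and bash, status_masked prints 0 and status_kept prints 1, which is why splitting the declaration from the assignment is the only form that lets you write out=$(command) || return.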

The README states with brutal honesty: “POSIX sh scripting is powerful but insanely brain damaged if you want to stay portable.” After reading/scanning 85 scripts operating at this level of care, I think this is both a cautionary tale and a hard-won badge of honor.


FIN

Remember, you can follow and interact with the full text of The Daily Drop’s free posts on:

  • 🐘 Mastodon via @dailydrop.hrbrmstr.dev@dailydrop.hrbrmstr.dev
  • 🦋 Bluesky via https://bsky.app/profile/dailydrop.hrbrmstr.dev.web.brid.gy

☮️
