Bonus Drop #54 (2024-07-21): Rusty POSIX

posixutils-rs

Just one section in the weekend Bonus Drop, as I spent far too much time trying to fix the issues preventing the thing you’re about to read about from building on macOS. That also means no TL;DR!


posixutils-rs

  • NB 1: Perplexity, using Claude 2 Sonnet, was used to add the descriptions to the list of utilities.
  • NB 2: The Open Group has the HTML version of POSIX.1-2024 up! The “Shells and Utilities” volume has the full listing of POSIX utilities that you can explore.
  • NB 3: The project did not build under macOS 15 beta 3. All testing was done on Ubuntu 22.04.

POSIX (Portable Operating System Interface) is a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. It defines the application programming interfaces (APIs), command line shells, and utility interfaces that should be available on POSIX-compliant operating systems. We’ve talked about POSIX somewhat frequently in the Drop, and recently covered the new 2024 version of the standard.

We’ve also talked about a project that aims to port the GNU coreutils over to Rust. There’s another project underway, posixutils-rs, with a similar aim: porting the suite of core POSIX-compliant command line utilities to Rust. It has three core goals:

  • create clean, safe, race-free utilities that maximize compatibility with existing shell scripts while minimizing bloat
  • implement utilities using idiomatic Rust code and leveraging existing Rust community crates where possible
  • conform to the POSIX standard (specifically SUSv3) as the baseline, only adding popular non-POSIX options needed for compatibility

One reason to undertake this effort is that the suite of POSIX utilities is larger than GNU’s coreutils (and/also: GNU coreutils are kinda bloated). Porting them to Rust brings more of a safety guarantee (though you can still 100% write unsafe Rust programs). And, since the POSIX utilities themselves are more minimal than GNU coreutils, that, combined with the project’s promise to minimize dependencies,

Another difference is the size of the binaries (built with --release):

find -maxdepth 1 -type f -executable \
  | xargs stat -c '%s' \
  | Rscript -e 'scan("stdin") |> summary()'
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 407112  942456  956964 1151352 1044834 3805456

The same summary() for their counterparts on my main Ubuntu 22.04 box is:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   2346   35240   43420   58588   68104  282088
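I didn't show how the counterpart sizes were gathered, so here is a minimal sketch of one way to do it. The function name `counterpart_sizes` and the default `/usr/bin` lookup directory are assumptions for illustration, not the exact command behind the numbers above. It walks the executables in a build directory and prints the size of each same-named file in the system directory:

```shell
#!/bin/sh
# Hypothetical helper (not from the project): for every executable regular
# file in BUILD_DIR, print the size in bytes of the same-named file in
# SYS_DIR. Names with no counterpart in SYS_DIR are skipped.
counterpart_sizes() {
  build_dir=$1
  sys_dir=${2:-/usr/bin}   # assumed default location of the system utilities
  for f in "$build_dir"/*; do
    # keep only regular files that are executable
    [ -f "$f" ] && [ -x "$f" ] || continue
    name=${f##*/}          # strip the directory part
    [ -f "$sys_dir/$name" ] || continue
    # wc -c reads through symlinks; the arithmetic expansion trims any padding
    echo "$(( $(wc -c < "$sys_dir/$name") ))"
  done
}
```

Something like `counterpart_sizes target/release | Rscript -e 'scan("stdin") |> summary()'` would then feed the same R summary, assuming the release binaries live in `target/release`.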

These Rust versions had better be super safe for that tradeoff. I’ll re-run that comparison when the project finalizes utility coverage.
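To put a number on that tradeoff, here is a quick sketch that divides the figures from the two summaries. The helper name `size_ratio` is made up for this example; the inputs are the median and mean values printed above.

```shell
#!/bin/sh
# Hypothetical helper: print how many times larger A is than B, to one decimal.
size_ratio() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

size_ratio 956964 43420    # medians: prints 22.0
size_ratio 1151352 58588   # means:   prints 19.7
```

So, at least for this build, the typical Rust binary is roughly 20x the size of its Ubuntu counterpart.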

Speaking of utility coverage, here is that list of the current set of ported utilities that I mentioned at the top of the section:

  • ar: create, modify, and extract from archives
  • asa: interpret ASA carriage control characters
  • basename: strip directory and suffix from filenames
  • bc: arbitrary-precision arithmetic language
  • cal: display a calendar
  • cat: concatenate and print files
  • chgrp: change group ownership
  • chmod: change file modes or Access Control Lists
  • chown: change file owner and group
  • cksum: checksum and count the bytes in a file
  • cmp: compare two files
  • comm: select or reject lines common to two files
  • compress: compress data
  • cp: copy files and directories
  • csplit: split files based on context
  • cut: cut out selected fields of each line of a file
  • date: display or set the system date and time
  • dd: convert and copy a file
  • df: report file system disk space usage
  • diff: compare two files
  • dirname: strip non-directory suffix from file name
  • du: estimate file space usage
  • echo: display a line of text
  • env: set environment and execute command
  • expand: convert tabs to spaces
  • expr: evaluate expressions
  • false: do nothing, unsuccessfully
  • file: determine file type
  • find: search for files in a directory hierarchy
  • fold: wrap each input line to fit in specified width
  • getconf: get configuration values
  • grep: search a file for a pattern
  • head: output the first part of files
  • id: return user identity
  • ipcrm: remove IPC resources
  • ipcs: report IPC status
  • kill: terminate or signal processes
  • link: call the link function to create a link to a file
  • ln: make links between files
  • logger: enter messages into the system log
  • logname: return the user’s login name
  • ls: list directory contents
  • mesg: control write access to your terminal
  • mkdir: make directories
  • mkfifo: make FIFOs (named pipes)
  • mv: move (rename) files
  • nice: run a program with modified scheduling priority
  • nl: number lines of files
  • nm: list symbols from object files
  • nohup: run a command immune to hangups
  • od: dump files in octal and other formats
  • paste: merge lines of files
  • pathchk: check whether file names are valid or portable
  • pr: convert text files for printing
  • printf: format and print data
  • pwd: print name of current/working directory
  • readlink: display value of a symbolic link
  • renice: alter priority of running processes
  • rm: remove files or directories
  • rmdir: remove empty directories
  • sleep: delay for a specified amount of time
  • sort: sort lines of text files
  • split: split a file into pieces
  • strings: print the strings of printable characters in files
  • strip: remove unnecessary information from strippable files
  • stty: change and print terminal line settings
  • tabs: set tabs on a terminal
  • tail: output the last part of files
  • tee: read from standard input and write to standard output and files
  • test: evaluate expression
  • touch: change file timestamps
  • tput: change terminal characteristics
  • tr: translate or delete characters
  • true: do nothing, successfully
  • tsort: topological sort
  • tty: return user’s terminal name
  • uname: print system information
  • uncompress: expand compressed data
  • unexpand: convert spaces to tabs
  • uniq: report or omit repeated lines
  • unlink: call the unlink function to remove the specified file
  • uudecode: decode a binary file
  • uuencode: encode a binary file
  • wc: print newline, word, and byte counts for each file
  • who: display who is logged in
  • write: write to another user
  • xargs: build and execute command lines from standard input

FIN

Remember, you can follow and interact with the full text of The Daily Drop’s free posts on Mastodon via @dailydrop.hrbrmstr.dev@dailydrop.hrbrmstr.dev ☮️
