Drop #704 (2025-09-04): Toss Up Thursday

FreshRSS; Compromise; Clankers Die on Christmas

This edition turned out to have an unplanned/unintentional anti-“AI” theme across all three sections, which I’m, now, kind of proud of.

TL;DR

(This is an LLM/GPT-generated summary of today’s Drop using SmolLM3-3B-8bit via MLX and a custom prompt.)

FreshRSS is a self-hosted RSS aggregator capable of handling 1M+ articles and 50k+ feeds, with built-in web scraping for non-RSS sources, and a modular API supporting Google Reader and Fever. (https://github.com/FreshRSS/FreshRSS) (https://hub.docker.com/r/freshrss/freshrss/)
Compromise is a JavaScript NLP library with a modular, rule-based approach for text analysis, including part-of-speech tagging, sentence counting, and advanced transformations like verb conjugation and temporal parsing. (https://github.com/spencermountain/compromise)
“Clankers Die on Christmas” is a satirical science fiction piece involving “AI”. (https://remyhax.xyz/posts/clankers-die-on-christmas)

FreshRSS

First, Google took away Google Reaader, a much loved RSS aggregator/reader, because they’re evil, and because the value from the telemetry of what you subscribed to and interacted with was in direct conflict with selling you out to their adtech division.

Then, Feedly screwed over scores of cybersecurity professionals (including me) by restricting certain workflows we had set up, because they used those concepts to create a far-too-expensive (for individual use) specialized “Feedly Threat Intelligence” product.

And, now, Inoreader continues to follow down the en💩ification path by slowly adding needless “AI” features that encourages destruction of source content through various means of “summarization”/condensing.

I was a paid user of Feedly, and still am of Inoreader, but it’s pretty clear that virtually all commercial endeavours are — at some point — going to fall prey to the false promise of “AI” enhancement. I want none of that, at least when it comes to (a) sharing my content choices with a third-party I do not trust (most online aps like Inoreader are not running their own GPU farms), (b) creating dependencies on non-deterministic processes, and (c) trying to convince me to shrink content I deliberately chose to consume from selected sources into chunks of tokens that look convincingly like coherent abstracts, but will regularly miss critical points, and completely remove the author’s voice.

So, it looks like I’m going to eventually have to ditch Inoreader as well. Lucky[?] for you, that means occasional coverage of self-hosted RSS aggregation+reader servers until I land on one.

Since I was already familiar with FreshRSS (GH) (Docker), we’ll start there.

FreshRSS can manage 1M+ articles and 50k+ feeds without complaining, making it suitable whether you follow a dozen blogs or hundreds of news sources. It even works on something as tiny as a Raspberry Pi 1 with response time under a second. It handily imported my OPML file (from Inoreader) and quickly digested the feeds of well-over a thousand feed sources.

Like the commercial counterparts, you can read articles directly within the interface, search and save queries for quick access, and generate feeds by scraping websites that don’t provide RSS feeds natively. The system supports custom tags for organization and offers multiple reading modes tailored to different consumption styles.

It natively supports basic Web scraping, based on XPath, for Web sites not providing any RSS / Atom feed GitHub, solving the increasingly common problem of wanting to follow sites that haven’t embraced RSS, or have abandoned RSS completely. This web scraping capability extends to JSON documents, broadening the range of sources you can monitor.

Mobile access works through established APIs rather than proprietary apps. FreshRSS supports access from mobile / native apps for Linux, Android, iOS, Windows and macOS, via two distinct APIs: Google Reader API (best), and Fever API (limited features, less efficient, less safe). This means you can use existing RSS apps you already know while keeping your data on your own server.

The extension system allows customization beyond the core functionality. It has built-in core extensions like UserCSS and UserJS which let us customize the appearance and behavior of the interface, and scads of other ones (including, sadly, “AI” ones).

I went with the Docker approach and am settling for SQLite storage for the evaluation period, and — thanks to Tailscale — I don’t need to expose the app to the big, bad internets. This weekend, I’m going to attempt to re-create the scraping-based feeds I have in Inoreader. If that goes well, I suspect this is the aggregator/reader I will eventually land on.

Compromise

Last week, despite hopes being dashed that a patriotic Big Mac finally did what would be best for humanity, POTUS 47 is still POTUS, and continues to “be him”. While I’m more than a tad behind on the “47 Watch” project I started earlier this year (partly for mental health, and mostly because nobody seems to really care we’re becoming a second-world authoritarian state), the potential to use some proxy information as an indicator for whether we truly are rid of this menace or not brought forth another project. This one collects posts from POTUS 47’s social media site (yes, I’m deliberately not using proper names in this section), and provides a way to view and search them without giving the OG site any clicks. It also performs a bunch of analysis on the posts to help detect when patterns change to try to help folks determine if POTUS 47 himself is still crafting cruel content.

Most of the core data wrangling happens on the back-end (the site is a purely front-end app with no build step), but I shunted some experimental wrangling features client side, including the text analysis ones. While there are many JavsScript-based natural language processing (NLP) packages out there, I didn’t need something like Transformers.js (which relies on models).

After a quick poke, I deemed Compromise (GH) (Observable) fit for duty and used it to extract parts of speech, count sentences, and more to help come up with an “Authenticity Score”.

It’s a super cool JS library.

If I had to describe it in just a few words, they would be “practical”, “sensible”, and “thoughtfully designed”.

It makes focused decisions about how language works, which means you don’t need a PhD in computational linguistics to transform text. Spencer Kelly, the creator, built this after realizing that most NLP solutions were either too academic or too bloated for everyday web development needs.

Under the hood, it has a three-tier architecture. You can use just the tokenizer with compromise/one for lightning-fast text splitting, add the part-of-speech tagger with compromise/two for grammar understanding, or go full-featured with compromise/three for complete phrase manipulation. This modular approach means you only “pay” for what you use in terms of bundle size.

While I’m one of the oddfellows who actually likes regular expressions, Compromise is super helpful when you need to do things that sound impossible with those beasts. Want to convert “she sells seashells by the seashore” to past tense? doc.verbs().toPastTense() gives you “she sold seashells by the seashore.” Need to find numbers in text and do math on them? nlp('ninety five thousand and fifty two').numbers().add(20).text() returns “ninety five thousand and seventy two”.

The package works by maintaining a built-in lexicon of ~14K words with their various forms and grammatical roles already mapped out. Instead of using machine learning, it relies on linguistic rules and pattern matching, which makes it both predictable and surprisingly small (~250kb minified). It can process ~1MB of text a second (~10 wikipedia pages), making it fast enough to even run on keypress events.

The API is YUGE, and — as noted — is well designed; the plugin ecosystem extends Compromise’s capabilities even further. There’s compromise-dates for parsing temporal expressions, compromise-stats for text analysis metrics like tf-idf, and compromise-syllables for phonetic analysis. Each plugin follows the same chainable API pattern, so adding new functionality feels natural.

The linked Observable notebook showcases real examples of Compromise in action (hence why there are just a scant few in this section), demonstrating everything from basic tokenization to complex text transformations. These interactive examples make it easy to understand how the library thinks about language and what kinds of problems it’s designed to solve.

It’s been a literal joy to use and is something to keep in your toolkit if you work with JavaScript and text even just infrequently.

Clankers Die on Christmas

If you made it this far in today’s Drop, you deserve some dessert after consuming two unusually large resource sections.

One of my teammates penned a satirical science fiction piece that presents an elaborate fictional scenario involving clankers (“AI” systems).

It’s super clever, very cathartic (if you are as sick of “AI” hype as I am), and short enough to consume quickly.

If only our salvation from this AIpocalypse was as simple as the premise of the piece.

FIN

Remember, you can follow and interact with the full text of The Daily Drop’s free posts on:

🐘 Mastodon via @dailydrop.hrbrmstr.dev@dailydrop.hrbrmstr.dev
🦋 Bluesky via https://bsky.app/profile/dailydrop.hrbrmstr.dev.web.brid.gy

☮️

2 responses to “Drop #704 (2025-09-04): Toss Up Thursday”

jdeaton

September 4, 2025

I say “thanks to Tailscale” typically whenever I add another self-hosted life-improving application. I considered FreshRSS a few years ago before moving to Inoreader but for my types of use for an RSS reader, it sounds like FreshRSS could be great.

LikeLiked by 1 person

Bonus Drop #97 (2025-09-07): Bridging, Building & Browsing – hrbrmstr's Daily Drop

September 7, 2025

[…] talked about FreshRSS the other day, and I’ve been using it as my primary feed reader since setting it up, save for the handful of […]

LikeLike