Bonus Drop #71 (2025-01-05): A ‘Flare’ For The Dramatic

Cap’n Proto; workerd; Cloudflare Workers AI

Today’s inaugural Bonus Drop of 2025 is a tad on the hefty side. But we need to cover the resource in the first section before the one in the middle, since the tech in the middle section relies on the configuration file format defined in the first. We then take a look at how to quickly and easily deploy an AI model, and a REST API for it, on Cloudflare (for free).

Confused? Great! Let’s dig in!


TL;DR

(This is an AI-generated summary of today’s Drop using Ollama + llama 3.2 and a custom prompt.)

  • Cap’n Proto is a high-performance data interchange format and language with zero encoding/decoding overhead, featuring a robust schema system and command-line tools for configuration management (https://capnproto.org/)
  • workerd, Cloudflare’s open-source JavaScript/Wasm Worker Runtime, enables running serverless functions locally using Cap’n Proto configuration files (https://github.com/cloudflare/workerd)
  • Cloudflare Workers AI provides free access to various AI models with 10,000 daily inference tokens, allowing deployment of AI-powered APIs through their edge network (https://dash.cloudflare.com/ai/workers-ai)

Cap’n Proto

Cap’n Proto (GH) is a language and data interchange format for use in (at a minimum) configuration management and fast data interchange (think “RPC”). (I keep reading it in the ‘Captain CAAAVEEEMAAAAAAN!’ voice due to a misspent youth watching cartoons, despite the name of this language/framework being more cereal-related, since it’s all about [de]serialization.) It offers binary encoding with zero encoding/decoding overhead. Created by Kenton Varda, the original author of Protocol Buffers v2, it emerged from years of experience with protobuf limitations. You’ll see me use “capnp” as shorthand, since typing that single apostrophe is a pain (and the CLI work is all done with a capnp tool).

The format uses a binary representation that serves as both the serialization format and in-memory structure. Data is arranged with fixed widths, fixed offsets, and proper alignment, similar to compiler-arranged structs. The encoding remains platform-independent through offset-based pointers and little-endian byte ordering.
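
To make that concrete, here is a tiny JavaScript sketch of what fixed-offset, little-endian field access feels like. This is not real Cap’n Proto output (a real message also carries segment headers and pointers); it just illustrates why reading a field needs no decode pass, only a load at a known offset:

```javascript
// Hypothetical 16-byte layout: a 32-bit id at byte offset 0 and a
// 64-bit timestamp at offset 8, both little-endian, mimicking how
// Cap'n Proto arranges struct fields at fixed, aligned offsets.
const bytes = new Uint8Array(16);
const view = new DataView(bytes.buffer);
view.setUint32(0, 42, true);                 // true => little-endian
view.setBigUint64(8, 1736035200n, true);

// "Zero-copy" access: no parse step, just fixed-offset loads.
const id = view.getUint32(0, true);
const ts = view.getBigUint64(8, true);
```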

Despite its raw performance focus, Cap’n Proto maintains strong security guarantees (which is good since deserialization vulnerabilities abound). It generates accessor methods that validate pointers before traversal, with configurable handling of invalid pointers through exceptions or default values. The system has undergone security audits and fuzzing, and is actively used in security-critical environments like Cloudflare Workers.

The system employs arena allocation for better cache locality and reduced memory management overhead. While the format includes some padding and fixed-width integers, it offers a specialized “packing” compression scheme for bandwidth optimization.
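
The packing scheme is simple enough to sketch: each 8-byte word is preceded by a tag byte whose bits mark which of its bytes are nonzero, the nonzero bytes follow, and the zeros are simply dropped. This JavaScript toy implements just that core idea (the real spec adds run-length special cases for the all-zero and all-nonzero tags, which this sketch omits):

```javascript
// Toy version of Cap'n Proto "packing": per 8-byte word, emit a tag
// byte whose bit i says byte i is nonzero, then only the nonzero bytes.
function pack(words) {               // words.length must be a multiple of 8
  const out = [];
  for (let off = 0; off < words.length; off += 8) {
    let tag = 0;
    const nonzero = [];
    for (let i = 0; i < 8; i++) {
      if (words[off + i] !== 0) { tag |= 1 << i; nonzero.push(words[off + i]); }
    }
    out.push(tag, ...nonzero);
  }
  return Uint8Array.from(out);
}

function unpack(packed) {
  const out = [];
  let pos = 0;
  while (pos < packed.length) {
    const tag = packed[pos++];
    for (let i = 0; i < 8; i++) out.push(tag & (1 << i) ? packed[pos++] : 0);
  }
  return Uint8Array.from(out);
}

// A mostly-zero word shrinks from 8 bytes to 3.
const word = Uint8Array.from([8, 0, 0, 0, 2, 0, 0, 0]);
const packed = pack(word);           // [0x11, 8, 2]
```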

Cap’n Proto also introduces some pretty advanced features like incremental reading, random access capabilities, and memory-mapped file support. Its inter-process communication allows shared memory usage between processes on the same machine, eliminating kernel-mediated data transfer.

It has a very well-thought-out schema language that has all the usual suspects (and more):

  • comments
  • structs
  • unions
  • groups
  • dynamically-typed fields
  • enums
  • interfaces
  • generic types
  • generic methods
  • constants
  • nesting, scope, and aliases
  • imports
  • annotations
  • unique IDs
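
As a taste of a few of those features, here is a small, hypothetical schema fragment (the names and the file ID are invented for illustration) showing an enum, an unnamed union, and a constant:

```capnp
@0xbf5147cbbecf40c1;  # hypothetical ID; generate your own with `capnp id`

enum Intention {
  unknown @0;
  benign @1;
  malicious @2;
}

struct Indicator {
  value @0 :Text;
  intention @1 :Intention;
  union {                 # exactly one of these may be set
    ipAddress @2 :Text;
    cidrBlock @3 :Text;
  }
}

const defaultIntention :Intention = unknown;
```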

A practical example is in order. Say you monitor the internet for exploit activity and have a bunch of structured detection rules (dubbed ‘Tags’) that you need to use to classify that activity. We might define a very basic capnp schema for it as such:

# tag.capnp
@0xadf2a6e0acc73da9; # generated via `capnp id`

struct Tag {
  name @0 :Text;
  id @1 :Text;
  uuid @2 :Text; 
  description @3 :Text;
  category @4 :Text;
  subCategory @5 :Text;
  intention @6 :Text;
  confidence @7 :Text;
  references @8 :List(Text);
  cves @9 :List(Text);
  suricataRules @10 :List(Text);
  negates @11 :Bool;
  silent @12 :Bool;
  userSubmitted @13 :Bool;
  recommendBlock @14 :Bool;
  enabled @15 :Bool;
  created @16 :Text;  # YYYY-MM-DD format
}

Ideally, we’d add real/custom types for some of those fields, but I’m trying to keep the example short.

As noted, Cap’n Proto comes with a command-line tool called capnp, intended to aid development and debugging, and fit for interactive use. This tool can be used to:

  • compile Cap’n Proto schemas to produce source code in multiple languages
  • generate unique type IDs
  • decode Cap’n Proto messages to human-readable text
  • encode text representations of Cap’n Proto messages to binary
  • evaluate and extract constants defined in Cap’n Proto schemas

Let’s make an (almost real!) tag for a (legit) randomly chosen CVE of ours:

# example.capnp
@0x9c7d2c4fc8553c2b; # generated via `capnp id`
using import "./tag.capnp".Tag; # this tells capnp we rely on this schema definition

const exampleTag :Tag = ( # we tell it that this is a Tag; also COMMENTS IN STRUCTURED CONFIG FILES FTW!
  name = "Splunk Enterprise CVE-2024-36991 Path Traversal Attempt",
  id = "SPLUNK_ENTERPRISE_CVE_2024_36991_PATH_TRAVERSAL_ATTEMPT",
  uuid = "e06772dd-6038-4e55-a617-1662128cb846",
  description = "IP addresses with this tag have been observed attempting to exploit CVE-2024-36991…",
  category = "activity",
  subCategory = "",
  intention = "malicious",
  confidence = "high",
  references = [
    "https://nvd.nist.gov/vuln/detail/CVE-2024-36991",
    "https://advisory.splunk.com/advisories/SVD-2024-0711",
    "https://www.sonicwall.com/blog/critical-splunk-vulnerability-cve-2024-36991-patch-now-to-prevent-arbitrary-file-reads"
  ],
  cves = ["CVE-2024-36991"],
  suricataRules = ["alert http …"],
  negates = false,
  silent = false,
  userSubmitted = false,
  recommendBlock = true,
  enabled = true,
  created = "2024-12-16"
);

I could have added some extra bits to all of this so we could get Golang code (we use Go @ work, but tons of languages are supported) generated for us (similar to what QuickType might do for JSON), but let’s just use it directly, similar to how we might use jq against raw JSON:

$ capnp eval example.capnp exampleTag.name
"Splunk Enterprise CVE-2024-36991 Path Traversal Attempt"

A big “so what” for that example: say I accidentally added an invalid = true entry to the example tag (or, perhaps, misspelled a valid field name). This:

$ capnp eval example.capnp exampleTag.name
example.capnp:26:3-10: error: Struct has no field named 'invalid'.
"Splunk Enterprise CVE-2024-36991 Path Traversal Attempt"

does return the name, but also lets us know the file is borked, and exits with a non-zero value; SUPER USEFUL for not breaking pipelines.

Of course we can get the capnp-encoded tag in a more universally mungable format:

$ capnp eval example.capnp exampleTag -ojson     
{ "name": "Splunk Enterprise CVE-2024-36991 Path Traversal Attempt",
  "id": "SPLUNK_ENTERPRISE_CVE_2024_36991_PATH_TRAVERSAL_ATTEMPT",
  "uuid": "e06772dd-6038-4e55-a617-1662128cb846",
  "description": "IP addresses with this tag have been observed attempting to exploit CVE-2024-36991…",
  "category": "activity",
  "subCategory": "",
  "intention": "malicious",
  "confidence": "high",
  "references": [
    "https://nvd.nist.gov/vuln/detail/CVE-2024-36991",
    "https://advisory.splunk.com/advisories/SVD-2024-0711",
    "https://www.sonicwall.com/blog/critical-splunk-vulnerability-cve-2024-36991-patch-now-to-prevent-arbitrary-file-reads" ],
  "cves": ["CVE-2024-36991"],
  "suricataRules": ["alert http …"],
  "negates": false,
  "silent": false,
  "userSubmitted": false,
  "recommendBlock": true,
  "enabled": true,
  "created": "2024-12-16" }

While JSON Schema can get us similar safety guarantees, the venerable jq can’t perform the same validation-on-extraction (or validation in general, without building it manually).
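
For contrast, here is what the equivalent misspelling looks like with plain JSON in JavaScript: the typo’d key silently yields undefined, and nothing stops the bogus value from flowing downstream.

```javascript
const tag = JSON.parse(
  '{"name": "Splunk Enterprise CVE-2024-36991 Path Traversal Attempt"}'
);

// A misspelled key is not an error in JS; it just yields undefined,
// so a pipeline keeps chugging along with a bad value.
const oops = tag.nmae;   // undefined, no complaint, zero exit status
const ok = tag.name;
```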

The Cap’n Proto site has great documentation, but you can get started with the above examples after installing capnproto:

  • Debian / Ubuntu: sudo apt-get install capnproto
  • Arch Linux: sudo pacman -S capnproto
  • Homebrew (macOS): brew install capnp

workerd


Cloudflare Workers are serverless functions that traditionally run on Cloudflare’s “global network of over 300 cities, using V8 isolates instead of traditional containers or virtual machines”. These isolates are lightweight contexts that provide secure code execution environments, with significantly lower overhead and faster startup times compared to traditional serverless platforms.

Workers use an execution model where each function runs in its own isolate, allowing a single runtime instance to handle thousands of concurrent executions efficiently. The V8 engine overhead is paid once at container startup, rather than per-execution, making Workers substantially more efficient than container-based serverless platforms.

They support both ES modules and traditional ServiceWorker syntax. The primary entry point is the fetch() handler, which processes incoming HTTP requests and returns Response objects. While Workers implement most standard web APIs, they differ from browser JavaScript in that they run on Cloudflare’s edge network rather than on client devices.

We’ll talk just a tad more about CF’s hosted Workers in the last section, but we’re not here in this section to shill CF services. We’re here to talk about workerd, which is what powers Cloudflare’s JavaScript/Wasm Worker runtime and has recently-ish been released as open source, so we can run Workers on our own infrastructure.

There is an npm workflow one can use for CF Workers, but CF has simplified the development a tad, and we can use a Cap’n Proto config to define our worker (this is almost verbatim from a CF example since it didn’t seem worthwhile to reinvent the wheel):

# Imports the base schema for workerd configuration files.
# Refer to the comments in /src/workerd/server/workerd.capnp for more details.
using Workerd = import "/workerd/workerd.capnp";

# A constant of type Workerd.Config defines the top-level configuration for an
# instance of the workerd runtime. A single config file can contain multiple
# Workerd.Config definitions and must have at least one.
const helloWorldExample :Workerd.Config = (

  # Every workerd instance consists of a set of named services. A worker, for instance,
  # is a type of service. Other types of services can include external servers, the
  # ability to talk to a network, or accessing a disk directory. Here we create a single
  # worker service. The configuration details for the worker are defined below.
  services = [ (name = "main", worker = .helloWorld) ],

  # Each configuration defines the sockets on which the server will listen.
  # Here, we create a single socket that will listen on localhost port 8080, and will
  # dispatch to the "main" service that we defined above.
  sockets = [ ( name = "http", address = "*:8080", http = (), service = "main" ) ]
);

# The definition of the actual helloWorld worker exposed using the "main" service.
# In this example the worker is implemented as an ESM module (see worker.js).
# The compatibilityDate is required. For more details on compatibility dates see:
#   https://developers.cloudflare.com/workers/platform/compatibility-dates/

const helloWorld :Workerd.Worker = (
  modules = [
    (name = "worker", esModule = embed "worker.js")
  ],
  compatibilityDate = "2023-02-28",
);

You can peruse the workerd.capnp schema (referenced in the comments above) to see all the config options.

We told workerd that this will listen on port 8080. We can define a whole group of workers at once, but I’m trying to keep this example somewhat manageable.
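
For the curious, a multi-worker config is just more entries in the services and sockets lists. A hypothetical sketch (the api.js module and the service names are invented for illustration):

```capnp
using Workerd = import "/workerd/workerd.capnp";

const multiExample :Workerd.Config = (
  services = [
    (name = "main", worker = .helloWorld),
    (name = "api",  worker = .apiWorker)
  ],
  # Each socket routes a listen address to one of the named services.
  sockets = [
    (name = "http", address = "*:8080", http = (), service = "main"),
    (name = "api",  address = "*:8081", http = (), service = "api")
  ]
);

const helloWorld :Workerd.Worker = (
  modules = [ (name = "worker", esModule = embed "worker.js") ],
  compatibilityDate = "2023-02-28",
);

const apiWorker :Workerd.Worker = (
  modules = [ (name = "api", esModule = embed "api.js") ],
  compatibilityDate = "2023-02-28",
);
```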

We now need a worker.js to handle these web requests:

worker.js:

export default {
  async fetch(req, env) {
    return new Response("Hello Drop Readers!\n");
  }
};

So, literally after a mkdir somedir and the creation of two small files, we have a basic web/API endpoint up:

$ npx workerd serve config.capnp
$ curl --silent http://localhost:8080/
Hello Drop Readers!

And, we can even compile that to a standalone binary for lightweight deployment:

$ npx workerd compile config.capnp > drop-01

(You can now just run drop-01 vs. cart the whole dev environment around.)

We can extend this example to add a couple of endpoints:

function rot13(str) {
  return str.replace(/[a-zA-Z]/g, (char) => {
    const base = char <= 'Z' ? 'A'.charCodeAt(0) : 'a'.charCodeAt(0);
    return String.fromCharCode((char.charCodeAt(0) - base + 13) % 26 + base);
  });
}

export default {
  async fetch(req, env) {
    const url = new URL(req.url);
    
    // Route: /time
    if (url.pathname === "/time") {
      return new Response(new Date().toISOString());
    }
    
    // Route: /rot13?text=sometext
    if (url.pathname === "/rot13") {
      const text = url.searchParams.get("text");
      if (!text) {
        return new Response(JSON.stringify({ error: "text parameter required" }), {
          headers: { "Content-Type": "application/json" },
          status: 400
        });
      }
      
      return new Response(
        JSON.stringify({
          original: text,
          rot13: rot13(text)
        }), {
          headers: { "Content-Type": "application/json" }
        }
      );
    }
    
    // Default route
    return new Response("Hello Drop Readers!\n");
  }
};

Those have very predictable outputs:

$ curl --silent http://localhost:8080/time
2025-01-05T11:19:50.620Z
$ curl --silent http://localhost:8080/rot13?text="Hello+Drop+Readers\!" | jq
{
  "original": "Hello Drop Readers!",
  "rot13": "Uryyb Qebc Ernqref!"
}

I’d still be more likely to use Deno or Bun for similar functionality, but this setup is pretty sweet if you want to try running things on your own for a bit, then scaling up to CF (if needed) with almost no effort.


Cloudflare Workers AI


Way back in August of 2024 we looked at how to use a local Ollama model to generate CVE short names from long descriptions. I made a small REST API around that on my M1 Mini (now M4 Mini) and it served (heh) me pretty well.

I don’t remember where I learned about the free inference credits on Cloudflare, but you get 10,000 daily tokens for free, which is plenty for hobby tasks. I generate CISA KEV-like CVE short names for CVESky (and some other public experiments we’re running) and decided to transition that functionality to CF’s Workers AI.

Without adding too much color, I used the old-school CF Worker creation idiom:

$ npx wrangler init cve-shortener

and used the Wizard defaults:

╭ Create an application with Cloudflare Step 1 of 3
│
├ In which directory do you want to create your application?
│ dir ./cve-shortener
│
├ What would you like to start with?
│ category Hello World example
│
├ Which template would you like to use?
│ type Hello World Worker
│
├ Which language do you want to use?
│ lang TypeScript
│
├ Copying template files
│ files copied to project directory
│
├ Updating name in `package.json`
│ updated `package.json`
│
├ Installing dependencies
│ installed via `npm install`
│
╰ Application created

╭ Configuring your application for Cloudflare Step 2 of 3
│
├ Installing @cloudflare/workers-types
│ installed via npm
│
├ Adding latest types to `tsconfig.json`
│ added @cloudflare/workers-types/2023-07-01
│
├ Retrieving current workerd compatibility date
│ compatibility date 2024-12-30
│
├ Do you want to use git for version control?
│ yes git
│
├ Initializing git repo
│ initialized git
│
├ Committing new files
│ git commit
│
╰ Application configured

╭ Deploy with Cloudflare Step 3 of 3
│
├ Do you want to deploy your application?
│ no deploy via `npm run deploy`
│
╰ Done

Keen-eyed Drop readers may feel compelled to harangue me on social media (or in the comments) for using TypeScript, given my prolific anti-TypeScript stance, which has been mentioned in many-a-Drop. While I’m still not thrilled about Microsoft’s control over TS, TS makes autocompletion in Zed much more useful. It’s still a type safety Potemkin village, since you ultimately are running JS code, but I’ll sacrifice coding through inherent disdain for (ultimately) more efficient coding.

We need to tweak the generated wrangler.toml a bit since we’re using CF’s AI features:

name = "cve-shortener"
main = "src/index.ts"
compatibility_date = "2024-12-24"
compatibility_flags = ["nodejs_compat"]

[observability]
enabled = true

[ai]
binding = "AI"

Followed by an npm i @cloudflare/ai.

The index.ts handler is pretty straightforward. I chose POST over GET since the prompt is pretty big, and the CVE descriptions can get pretty big as well. If you do any work with LLM/GPTs in an API context, the following will look familiar. If not, you likely aren’t reading this section 🙃:

import { Ai } from '@cloudflare/ai';

interface RequestBody {
	description: string;
}

export interface Env {
	AI: any;
}

export default {
	async fetch(request: Request, env: Env): Promise<Response> {
		if (request.method !== 'POST') {
			return new Response('Method not allowed', { status: 405 });
		}

		try {
			const { description } = (await request.json()) as RequestBody;

			if (!description) {
				return new Response('Description is required', {
					status: 400,
				});
			}

			const ai = new Ai(env.AI);

			const systemPrompt = `You are an AI system specializing in generating concise short names for vulnerabilities described by CVEs. Your task is to convert verbose CVE descriptions into clear, descriptive titles that resemble entries in CISA's Known Exploited Vulnerabilities (KEV) catalog. Stop generating output immediately after providing the short name.`;

			const userPrompt = `Given a CVE description, create a short name similar to those used in CISA's KEV catalog. The short name should be concise, descriptive, and highlight the affected product or vulnerability type. Do not provide additional information or explanations. The response should be under 10 words, needs to be a grammatically correct phrase, always end with the word 'Vulnerability', and should end immediately after the short name.

Here is the CVE description to process:
${description}`;

			const response = (await ai.run('@cf/meta/llama-3-8b-instruct', {
				messages: [
					{ role: 'system', content: systemPrompt },
					{ role: 'user', content: userPrompt },
				],
				temperature: 0.0,
				max_tokens: 30,
				stream: false,
			})) as { response: string };

			return new Response(
				JSON.stringify({
					shortName: response.response.trim(),
				}),
				{
					headers: {
						'Content-Type': 'application/json',
					},
				},
			);
		} catch (error) {
			return new Response(
				JSON.stringify({
					error: 'Internal server error',
				}),
				{
					status: 500,
					headers: {
						'Content-Type': 'application/json',
					},
				},
			);
		}
	},
};

You will need to hit up AI => Workers AI (https://dash.cloudflare.com/YOUR-ACCOUNT-UUID/ai/workers-ai) to make sure you’ve got “AI” enabled, and you can check usage in “Workers & Pages” (https://dash.cloudflare.com/YOUR-ACCOUNT-UUID/workers-and-pages).

After an npm run deploy, you can test this out:

$ curl \
  --request POST \
  --url https://YOUR-WORKER-SUBDOMAIN.workers.dev \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer YOUR-TOKEN" \
  --data '
  { 
    "description": "systeminformation is a System and OS information library for node.js. In affected versions SSIDs are not sanitized when before they are passed as a parameter to cmd.exe in the getWindowsIEEE8021x function. This means that malicious content in the SSID can be executed as OS commands. This vulnerability may enable an attacker, depending on how the package is used, to perform remote code execution or local privilege escalation. This issue has been addressed in version 5.23.7 and all users are advised to upgrade. There are no known workarounds for this vulnerability."
  }'
{"shortName":"Node.js System Information SSID Command Injection Vulnerability"}                                                                  

While I chose '@cf/meta/llama-3-8b-instruct' (to almost match what I was using locally), you can choose from scads of models that CF has to offer.

I threw this behind a caching proxy, both to save on inference token use and because we don’t need to re-gen the name after getting one. A nice feature of the 10K free tokens in the Workers AI setup is that it just sends back REST API errors if you run out (vs. starting to charge you).
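
The proxy itself is off-the-shelf kit, but the core idea can be sketched in a few lines: memoize results keyed on the description so a repeat request never spends tokens. (The generate callback below is a stand-in for whatever actually calls Workers AI; it is not a real API.)

```javascript
// Memoize short names by CVE description so repeat lookups are free.
// `generate` stands in for the actual Workers AI inference call.
const cache = new Map();

async function cachedShortName(description, generate) {
  if (cache.has(description)) return cache.get(description);
  const shortName = await generate(description);
  cache.set(description, shortName);
  return shortName;
}
```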

If you don’t have access to a GPU, this is a great way to experiment/play. I know many readers and humans are “anti-AI”, but there are good uses for this tech, and more of us need to showcase those positive uses so we can drown out the daft ones.

Hit me up if you want more specifics or want me to go more in-depth in a solo Knowledge Drop.


FIN

Remember, you can follow and interact with the full text of The Daily Drop’s free posts on:

  • 🐘 Mastodon via @dailydrop.hrbrmstr.dev@dailydrop.hrbrmstr.dev
  • 🦋 Bluesky via https://bsky.app/profile/dailydrop.hrbrmstr.dev.web.brid.gy

Also, refer to:

to see how to access a regularly updated database of all the Drops with extracted links, and full-text search capability. ☮️

One response to “Bonus Drop #71 (2025-01-05): A ‘Flare’ For The Dramatic”

  1. LogFlux

    Cap'n Proto is one of those technologies that makes you feel like you've been doing serialization wrong your entire career.

    "Zero encoding/decoding overhead" sounds like marketing until you realize it literally just memory maps the data structure. Elegant and terrifying.

