Adding to the Syndicated Web

Sep 05, 2025

4 min

I was showing off this site to someone the other day, and they asked me a simple question which sent me down a rabbit hole: "does it have an RSS feed?".

Now, I have no illusions: it's extremely unlikely that anyone will use this feature on my site. Nonetheless, it's an interesting technical problem, so I stepped up to the plate.

I love the idea of RSS/Atom. As someone who recently nuked all their social media accounts, this friend's comment and my subsequent rediscovery of web syndication could not have come at a better time.

I had a vague idea of how it worked from back in uni, where they got us to code up an elaborate network of Atom content hosts, aggregation servers, and feed readers in Java without any libraries. Unfortunately, I gave the whole thing about zero thought after that.

I'm now knee-deep in self-hosted aggregation servers, FOSS Android apps and blogs about RSS best practices¹ ². I feel like a kid in a candy shop.

Let's get into how I got it working for this site:

Also sub here: https://blog.acarling.au/feed.atom

RSS Feed

On the face of it, it's a simple task: expose a list of posts as an XML file. I've got a function that collects each post and all of its metadata, then filters and sorts them into a feed. Now all I need to do is write the data to XML. Easy.

...

Turns out RSS readers can't render custom HTML and CSS, let alone MDX, AND Atom needs your post HTML embedded as an escaped string inside the XML content tag.

Back to the drawing board.

I use the next-mdx-remote package as my MDX renderer in Next.js. It works well and has a bunch of JSX-specific optimisations, but even the non-component-based function returns a JSX.Element object, which must be sent as server payload and rendered. There's no easy way to get it to spit out an HTML string.

This is both a problem and an opportunity.

I need to include each article entry's HTML as an escaped string in the feed XML file. This means I need a separate mdstring -> htmlstring conversion pipeline.
Since it needs to be separate, I can build it up to replace RSS-unfriendly components with fallbacks (e.g. MathJax SVG -> PNG) and to add reader-relevant HTML.
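To make the "escaped string inside the content tag" requirement concrete, here is a minimal sketch of building one Atom entry. The FeedPost shape and the function names are assumptions for illustration, not the code this site actually uses.

```typescript
// Hypothetical post shape; the field names are assumptions, not the site's real types.
interface FeedPost {
  title: string;
  url: string; // absolute URL of the post
  updated: Date;
  html: string; // rendered HTML body
}

// Escape the five characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build one Atom <entry>; type="html" tells readers the content is escaped HTML.
function atomEntry(post: FeedPost): string {
  return [
    "<entry>",
    `  <title>${escapeXml(post.title)}</title>`,
    `  <link href="${escapeXml(post.url)}"/>`,
    `  <id>${escapeXml(post.url)}</id>`,
    `  <updated>${post.updated.toISOString()}</updated>`,
    `  <content type="html">${escapeXml(post.html)}</content>`,
    "</entry>",
  ].join("\n");
}
```

The ampersand has to be replaced first, otherwise the entities produced by the later replacements would be escaped a second time.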

I started by writing an MDX renderer with remark and rehype (this is what next-mdx-remote uses under the hood). I added one plugin, MDX-to-plaintext, which stops my RSS output filling up with unusable JS bloat and non-functional or empty HTML.

typescript
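import matter from "gray-matter";
import { unified } from "unified";
import remarkParse from "remark-parse";
import remarkMdx from "remark-mdx";
import remarkMath from "remark-math";
import remarkGfm from "remark-gfm";
import remarkBreaks from "remark-breaks";
import remarkRehype from "remark-rehype";
import rehypeMathjax from "rehype-mathjax";
import rehypeStringify from "rehype-stringify";
// preProcessMd and stripMdx are this site's own helpers, defined elsewhere.

// Convert a Markdown/MDX string into an HTML string for the feed.
async function mdToHtml(mdText: string, isRss = false): Promise<string> {
	const processedMD = await preProcessMd(mdText, isRss)
	const { content } = matter(processedMD);
	const file = await unified()
		.use(remarkParse)
		.use(remarkMdx)
		.use(stripMdx)
		.use(remarkMath, { singleDollarTextMath: true })
		.use(remarkGfm)
		.use(remarkBreaks)
		.use(remarkRehype, { allowDangerousHtml: true })
		.use(rehypeMathjax)
		.use(rehypeStringify)
		.process(content);
	return String(file);
}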
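For a rough idea of what an MDX-stripping step like stripMdx could look like, here is a sketch written against a minimal tree type so it needs no imports. It assumes the node types that remark-mdx adds to the tree (mdxJsxFlowElement, mdxJsxTextElement, mdxjsEsm, mdxFlowExpression, mdxTextExpression). The name stripMdxSketch and the unwrap-or-drop behaviour are my assumptions, not the plugin used above.

```typescript
// Minimal mdast-like node type so the sketch needs no imports.
interface MdNode {
  type: string;
  value?: string;
  children?: MdNode[];
}

// Node types that remark-mdx adds to the syntax tree.
const MDX_JSX = new Set(["mdxJsxFlowElement", "mdxJsxTextElement"]);
const MDX_CODE = new Set(["mdxjsEsm", "mdxFlowExpression", "mdxTextExpression"]);

// Rewrite a node list: unwrap JSX elements (keep their children),
// and drop import/export statements and {expressions} entirely.
function stripNodes(nodes: MdNode[]): MdNode[] {
  const out: MdNode[] = [];
  for (const node of nodes) {
    if (MDX_CODE.has(node.type)) continue;
    const children = node.children ? stripNodes(node.children) : undefined;
    if (MDX_JSX.has(node.type)) {
      out.push(...(children ?? []));
    } else {
      out.push(children ? { ...node, children } : node);
    }
  }
  return out;
}

// unified plugin shape: a function that returns a tree transformer.
function stripMdxSketch() {
  return (tree: MdNode): void => {
    if (tree.children) tree.children = stripNodes(tree.children);
  };
}
```

Unwrapping keeps any plain text nested inside a component, so a callout or wrapper still contributes its words to the feed even though the component itself is gone.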

Another consideration is that since RSS feeds are normally read outside your site, all links need to be absolute. I started by adding an .env.develop file containing my home VPN's virtual static IP for dev; this is prepended to all links when they're statically rendered. In prod, the value is https://blog.acarling.au instead.
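The link rewriting can be sketched as a small post-processing step over the rendered HTML. This is a regex-based approximation using the standard URL constructor; absolutizeLinks is a hypothetical name, and a rehype plugin that walks the tree would be more robust than matching attributes in a string.

```typescript
// Resolve every href/src attribute in an HTML string against a base URL.
// Already-absolute URLs pass through unchanged.
function absolutizeLinks(html: string, baseUrl: string): string {
  return html.replace(
    /(\s)(href|src)=(["'])(.*?)\3/g,
    (match: string, space: string, attr: string, quote: string, value: string) => {
      try {
        const absolute = new URL(value, baseUrl).href;
        return `${space}${attr}=${quote}${absolute}${quote}`;
      } catch {
        return match; // leave values the URL parser rejects untouched
      }
    },
  );
}

// Example: the base would come from the environment file in practice.
const sample = absolutizeLinks('<a href="/posts/rss">RSS</a>', "https://blog.acarling.au");
```

Passing the post's own URL as the base, rather than the site root, would make fragment links such as footnote anchors resolve back to the post instead of the home page.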

I use Miniflux as my RSS reader. It's hosted on my server, which is connected to my VPN. This means I can test the feed directly in Miniflux via VPN without publishing or modifying the prod feed, which is very handy. (I just swap out the feed title based on the environment.)

Content filtering

Currently, I have a flag in my document's metadata to tell the feed whether to render the whole article or just a summary. I've got a couple thoughts rolling around in my head about the pros and cons of rendering the whole page to RSS vs a summary and link.

Considerations:

If I put the entire site in the RSS feed and people read it there, then I've kind of wasted time on my MDX components.
If it's all in the RSS feed, it's much easier to scrape and steal data from.

As someone who gets paid to think up and put to IDE unique and high-quality code, I'm generally pretty against the idea of people taking stuff I do without consent³, and then getting unfortunate souls to annotate it in ethically dubious working conditions⁴ ⁵. Suffice to say, ease of scraping is on my mind more than it would have been pre-2022. Realistically though, if it's on the web it'll likely make its way into something these days, so maybe I'm being overly pedantic.

I'm expecting that striking a balance between ease of reading, accessibility, and my own (probably unfounded) personal ethical qualms will take a bit of iteration.

Possible paths:

1. Dump the entire thing in RSS with fallback components.
2. Scan pages and evaluate if they're "RSS friendly", render these as full articles, others as previews with my disclaimer / link.
3. Like 2 but with a manual "RSS friendly" tag and fallbacks for MDX etc.
4. Just have previews for every page, no full articles.

I'm currently doing path 3, though I would like to have a fallback for MathJax.

I'm also trying to decide whether to host the entire feed or just the latest x entries. Maybe I can have path 3 for the latest 5 or so and then summaries for everything else.
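That last idea (path 3 for the newest few posts, summaries for the rest) could be sketched like this. The rssFriendly field name, the planFeed function, and the default of 5 are assumptions for illustration.

```typescript
interface PostMeta {
  slug: string;
  date: Date;
  rssFriendly: boolean; // hypothetical name for the manual metadata flag
}

type FeedMode = "full" | "summary";

interface FeedPlanEntry {
  slug: string;
  mode: FeedMode;
}

// Sort newest first; only flagged posts among the newest `fullCount`
// entries are rendered in full, everything else becomes a summary.
function planFeed(posts: PostMeta[], fullCount = 5): FeedPlanEntry[] {
  const sorted = [...posts].sort((a, b) => b.date.getTime() - a.date.getTime());
  return sorted.map((post, index): FeedPlanEntry => {
    const mode: FeedMode = post.rssFriendly && index < fullCount ? "full" : "summary";
    return { slug: post.slug, mode };
  });
}
```

Because the decision is a pure function of the metadata, switching between the paths listed above only means changing this one rule rather than touching the renderer.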

1. Kevin Cox (2022): RSS Feed Best Practises https://kevincox.ca/2022/05/06/rss-feed-best-practices/
2. Chris Hardie (2025): How far I’ll go to make an RSS feed of your website https://tech.chrishardie.com/2025/rss-feed-of-your-website/
3. Hayden Field (2025): Anthropic to pay $1.5 billion to authors in landmark AI settlement https://www.theverge.com/anthropic/773087/anthropic-to-pay-1-5-billion-to-authors-in-landmark-ai-settlement
4. Josh Dzieza (2023): AI Is a Lot of Work https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots
5. Chris Simon (2025): AI is a Hype-Fuelled Dumpster Fire https://www.youtube.com/watch?app=desktop&v=0bF_AQvHs1M&ab_channel=BanksProductions
