---
slug: feed-bloater
title: "Feed Bloater"
date: "2021-11-05T23:49:56-04:00"
draft: false
---
In addition to [fixing the RSS feed for this site][site-rss], I
created a simple command-line tool named [Feed Bloater][]
that expands truncated [RSS 2.0 feeds][rss] by doing the following:

1. Fetch the contents of the feed.
2. Fetch the [HTML][] from the `<link>` for each feed item.
3. Filter the [HTML][] from the previous step based on a [CSS selector][].
4. Replace the truncated item descriptions with the [HTML][] from the
   previous step.
5. Write a new [RSS][] feed to the given output path.
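
The steps above can be sketched in a few lines of Python. This is only
an illustration of the idea, not [Feed Bloater][]'s actual code: the
names `bloat_feed` and `SelectorExtractor` are made up for this example,
the "selector" support is limited to a single `tag.class` pattern, and
fetching (steps 1 and 2) is delegated to a caller-supplied `fetch`
function so the sketch stays self-contained.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SelectorExtractor(HTMLParser):
    """Collect the outer HTML of the first element matching `tag.class`.

    Hypothetical helper: a real tool would use a full CSS selector engine.
    """

    def __init__(self, selector):
        super().__init__(convert_charrefs=False)
        self.tag, _, self.cls = selector.partition(".")
        self.depth = 0     # nesting depth inside the matched element
        self.parts = []    # raw HTML fragments of the match
        self.done = False  # True once the first match is complete

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            classes = (dict(attrs).get("class") or "").split()
            if tag != self.tag or (self.cls and self.cls not in classes):
                return
        self.parts.append(self.get_starttag_text())
        if tag in VOID_TAGS:
            self.done = self.depth == 0  # a matched void element is complete
        else:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_TAGS:
            return
        self.parts.append(f"</{tag}>")
        self.depth -= 1
        self.done = self.depth == 0

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append(f"&#{name};")


def bloat_feed(feed_xml, selector, fetch):
    """Return feed_xml with each item's description replaced by the
    selector match from the page behind the item's <link>."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        link = item.findtext("link")
        if not link:
            continue
        extractor = SelectorExtractor(selector)
        extractor.feed(fetch(link.strip()))  # step 2: fetch the HTML
        extractor.close()
        if extractor.parts:                  # steps 3 and 4
            desc = item.find("description")
            if desc is None:
                desc = ET.SubElement(item, "description")
            desc.text = "".join(extractor.parts)
    return ET.tostring(root, encoding="unicode")
```

Writing the returned string to the output path (step 5) is left to the
caller. Items whose page has no matching element keep their original
description.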

[Feed Bloater][] maintains an internal cache and respects the
[`ETag`][etag] and [`Last-Modified`][last-modified] headers, and by
default it won't update the output file if the source feed has not
changed.
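
For readers unfamiliar with these headers, here is a rough sketch of an
HTTP conditional request in Python. It shows the general technique
only, not how [Feed Bloater][] implements its cache; the function names
and the shape of the `cached` dictionary are assumptions made for this
example.

```python
import urllib.error
import urllib.request


def conditional_headers(cached):
    """Build request headers from validators saved on a previous fetch."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def fetch_if_changed(url, cached, opener=urllib.request.urlopen):
    """Return (body, validators); body is None if the server says 304."""
    req = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with opener(req) as resp:
            validators = {
                "etag": resp.headers.get("ETag"),
                "last_modified": resp.headers.get("Last-Modified"),
            }
            return resp.read(), validators
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: the cached copy is still good
            return None, cached
        raise
```

A caller would store the returned validators next to the cached feed
and skip rewriting the output file whenever the body comes back as
`None`.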

Here is an example that uses [Feed Bloater][] to expand the truncated
[LLVM Weekly][] [RSS feed][rss]:

```sh
feedbloater https://llvmweekly.org/rss.xml div.post path/to/llvmweekly.xml
```
&nbsp;

Here's what the original [LLVM Weekly][] [RSS feed][rss] looks like in
[The Old Reader][]:

{{< figure
  src="/files/posts/feed-bloater/llvmweekly-old.png"
  class=image
  width=1013
  height=623
  caption="Truncated LLVM Weekly RSS feed, viewed in The Old Reader."
>}}

Here's what the expanded [RSS feed][rss] generated by the example above
looks like:

{{< figure
  src="/files/posts/feed-bloater/llvmweekly-new.png"
  class=image
  width=1046
  height=673
  caption="LLVM Weekly RSS feed, expanded by Feed Bloater, and viewed in The Old Reader."
>}}

Much better!  I've been happily using [Feed Bloater][] to expand several
truncated feeds for about a week.

If you're interested in trying [Feed Bloater][], you can find
installation, usage, and configuration instructions in the documentation
on the [Feed Bloater GitHub Repository][feed bloater].

[site-rss]: {{< ref "/posts/2021-10-26-rss-feed-no-longer-annoyingly-trunca.md" >}}
  "RSS Feed No Longer Annoyingly Trunca..."
[rss]: https://en.wikipedia.org/wiki/RSS
  "Really Simple Syndication"
[feed bloater]: https://github.com/pablotron/feedbloater
  "Expand truncated RSS feeds."
[html]: https://en.wikipedia.org/wiki/HTML
  "HyperText Markup Language"
[css selector]: https://en.wikipedia.org/wiki/CSS#Selector
  "Cascading Style Sheets selector."
[llvm weekly]: https://llvmweekly.org/
  "LLVM Weekly newsletter."
[the old reader]: https://theoldreader.com/
  "Web-based RSS reader."
[bundler]: https://bundler.io/
  "Ruby dependency manager."
[etag]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag
  "ETag HTTP header."
[last-modified]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Last-Modified
  "Last-Modified HTTP header."
[url]: https://en.wikipedia.org/wiki/URL
  "Uniform Resource Locator"