Using the tt-rss FeedIron plugin to clean up posts


I read a lot of RSS feeds with tt-rss (ever since Google Reader shut down a couple of years ago), but not every site provides an RSS feed for its articles. And even when one exists, it is sometimes a mess because the content is not really tailored to the feed. Luckily there is a tt-rss plugin called “FeedIron” that takes care of this and allows heavy customization of an article, for example:

  • cleaning up the HTML content (getting rid of unnecessary elements in the feed, such as comments, social sharing buttons, or extended author information)
  • replacing the feed's article with the original content from the page
  • modifying the content with a regex

I use this plugin to get full articles from Ars Technica in my feed. Unfortunately, after replacing the feed's article with the multi-page content from the original page (get the configuration for that here), the images are broken because Ars uses JavaScript to put images into the article. To fix this, I modified the configuration for the page:

"arstechnica.com": {
    "type": "xpath",
    "multipage": {
        "xpath": "span[@class='numbers']\/a",
        "append": true,
        "recursive": true
    },
    "xpath": [
        "section[@class='article-guts']"
    ],
    "cleanup": [
        "aside",
        "div[@class='article-expander']",
        "nav"
    ],
    "modify": [{
        "type": "regex",
        "pattern": "(data-thumb=\"(.+?)\".*?data-src=\"(.+?)\".*?>)",
        "replace": "><a href=\"$2\"><img src=\"$1\" \/><\/a>"
    }]
}
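
Before getting to the regex, here is a rough Python sketch of what the "xpath" and "cleanup" entries amount to: pick one element out of the page, then drop unwanted children from it. It is only an illustration under assumptions, not FeedIron's actual code: it assumes both kinds of expressions are matched anywhere below the starting element, it ignores the "multipage" part entirely, and the sample page and the `extract` helper are made up.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed sample page; the real Ars markup is more complex.
PAGE = """<html><body>
<section class="article-guts">
  <p>First paragraph.</p>
  <aside>related links</aside>
  <div class="article-expander">show more</div>
  <nav>page 1 2 3</nav>
  <p>Second paragraph.</p>
</section>
</body></html>"""

# The same expressions as in the configuration above.
ARTICLE_XPATH = "section[@class='article-guts']"
CLEANUP = ["aside", "div[@class='article-expander']", "nav"]


def extract(page: str) -> str:
    """Return the article element as a string, minus the cleanup matches."""
    root = ET.fromstring(page)
    # Assumption: the expression is searched anywhere in the document.
    article = root.find(".//" + ARTICLE_XPATH)
    # Snapshot the elements first so removing children is safe.
    for parent in list(article.iter()):
        for rule in CLEANUP:
            for child in parent.findall(rule):
                parent.remove(child)
    return ET.tostring(article, encoding="unicode")


print(extract(PAGE))
```

With the sample above, only the two paragraphs survive inside the section; the aside, the expander div, and the nav are gone.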

The interesting part here is the modify section:

"modify": [{
    "type": "regex",
    "pattern": "(data-thumb=\"(.+?)\".*?data-src=\"(.+?)\".*?>)",
    "replace": "><a href=\"$2\"><img src=\"$1\" \/><\/a>"
}]

The configuration has a regex pattern that matches the images, grabs the URLs of the thumbnail and the original image, and replaces the JavaScript-driven markup with a regular HTML link and image. This way, when I read the feed with TTRSS-Reader, my Android client of choice for tt-rss, I see the images even without JavaScript.
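
To try the rule outside of tt-rss, the same substitution can be sketched in Python. Two assumptions go into this: the outer parentheses of the pattern are treated as PCRE delimiters (PHP allows bracket-style delimiters), so `$1` is the thumbnail and `$2` the full image, and the sample markup with its example.com URLs is invented for illustration rather than copied from Ars. The `fix_images` name is made up as well.

```python
import re

# Same expression as the FeedIron pattern, minus the outer parentheses
# (assumed to be regex delimiters, not a capture group).
PATTERN = re.compile(r'data-thumb="(.+?)".*?data-src="(.+?)".*?>')

# Python writes backreferences as \1 and \2 where the config uses $1 and $2.
REPLACEMENT = r'><a href="\2"><img src="\1" /></a>'


def fix_images(html: str) -> str:
    """Turn data-thumb/data-src attributes into a plain link plus image."""
    return PATTERN.sub(REPLACEMENT, html)


# Invented sample markup; the real attributes on the page may differ.
sample = (
    '<div class="image" data-thumb="https://example.com/thumb.jpg" '
    'data-src="https://example.com/full.jpg" data-responsive="x"></div>'
)

print(fix_images(sample))
# <div class="image" ><a href="https://example.com/full.jpg"><img src="https://example.com/thumb.jpg" /></a></div>
```

The leading `>` in the replacement closes the tag that carried the data attributes, so the link and image end up as its content. The thumbnail is what gets displayed, and it links to the full-size image.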
