Audiobooks

Turn some audiobook resources into podcast RSS feeds.

Type in a book title, copy the RSS link, and paste it into a podcast player.

1. Context

This page is mostly autogenerated from the http://litteratureaudio.com website.

They do a fantastic job reading books and publishing MP3s … in WordPress articles. I like the way http://librivox.org publishes books as RSS feeds, for everyone to enjoy in their favorite podcast player.

2. Create RSS from the website

RSS is a must-have for audiobooks: it lets a podcast player keep track of progress and sync between devices.

litteratureaudio.com is a sad WordPress instance with little to no automation options. The content seems to be mostly organized by hand. There is no apparent full RSS feed, nor JSON or XML-RDF of any sort. To produce an RSS feed, we first need to scrape the website.
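
To make the target concrete, here is a minimal sketch of what producing a podcast-style RSS 2.0 feed from a list of MP3 URLs could look like. The function names `xml_escape` and `make_feed` are made up for this illustration and are not taken from the actual scripts; the `length="0"` value is a placeholder, since the real file sizes are not known at this point.

```shell
# Escape the characters that are special in XML text and attribute values.
xml_escape() {
    printf '%s' "$1" | sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g; s/"/\&quot;/g'
}

# make_feed TITLE LINK (hypothetical helper): read MP3 URLs from stdin,
# one per line, and print a minimal RSS 2.0 feed with one <item> per URL.
make_feed() {
    title=$(xml_escape "$1")
    link=$(xml_escape "$2")
    printf '<?xml version="1.0" encoding="UTF-8"?>\n'
    printf '<rss version="2.0">\n<channel>\n'
    printf '<title>%s</title>\n<link>%s</link>\n' "$title" "$link"
    printf '<description>%s</description>\n' "$title"
    n=0
    while IFS= read -r url; do
        [ -n "$url" ] || continue        # skip empty lines
        n=$((n + 1))
        url=$(xml_escape "$url")
        printf '<item>\n<title>%s (%d)</title>\n' "$title" "$n"
        printf '<guid>%s</guid>\n' "$url"
        # RSS 2.0 requires a length attribute; 0 is a placeholder here.
        printf '<enclosure url="%s" length="0" type="audio/mpeg"/>\n' "$url"
        printf '</item>\n'
    done
    printf '</channel>\n</rss>\n'
}
```

Something like `make_feed "Some book" "https://example.org/book" < urls.txt > book.rss` would then write one feed per book, with `urls.txt` holding the episode links in reading order.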

2.1. Get the full list of books

http://www.litteratureaudio.com/notre-bibliotheque-de-livres-audio-gratuits is all we need. There are 10000+ books.

2.2. Handle the different kinds of books

2.2.1. Books aren't always laid out and structured the same.

Most books embed the MP3 list in a .link-roman-mp3-file class container, but if there is only one episode, the MP3 link is just sitting there.
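
One way to cope with both layouts is to ignore the container altogether and collect every link that ends in .mp3. The sketch below assumes the links appear as plain double-quoted `href` attributes, which is a guess about the markup; `extract_mp3_links` is a name invented for this example.

```shell
# extract_mp3_links (hypothetical helper): read HTML on stdin and print
# every href ending in .mp3, one per line, keeping only the first
# occurrence of each URL.
extract_mp3_links() {
    grep -o 'href="[^"]*\.mp3"' \
        | sed 's/^href="//; s/"$//' \
        | awk '!seen[$0]++'
}
```

Piping a saved book page through it, as in `extract_mp3_links < book.html`, would list the episodes whether there is one or many. A regular expression is a blunt tool for HTML, so a real scraper may prefer a proper parser.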

2.2.2. [2023-04-15 Sat] New version

Of course, they had to change everything… The website got updated, the script broke, and the provided download links don't work with Apple Podcasts. The good news is that there is an API, which should make things easier.

- The entry point is https://www.litteratureaudio.com/wp-json/
- The "book" entry point is https://www.litteratureaudio.com/wp-json/wp/v2/posts. Items reference their "episodes", named stations. This endpoint doesn't bring much value. I will probably stick with the HTML thing, which makes it easier to get images, author names and such.
- The "episodes" entry point is https://www.litteratureaudio.com/wp-json/wp/v2/station. Stations don't seem to reference their parent directly, BUT they provide a proper link to the MP3 file, not the nonce-obfuscated one.
- The API seems to have search features though, so maybe I could batch requests. Nope, actually it's bad enough that it embeds stations in posts.
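
As an illustration of walking the "episodes" endpoint, here is a rough sketch using curl. It assumes the station collection accepts the usual WordPress REST pagination parameters (`per_page`, capped at 100, and `page`) and answers with an HTTP error once the page number runs past the end; I have not verified that for this custom post type. `station_page_url` and `fetch_stations` are names made up for the example.

```shell
API_ROOT='https://www.litteratureaudio.com/wp-json/wp/v2'

# station_page_url PAGE (hypothetical helper): URL of one page of 100 stations.
station_page_url() {
    printf '%s/station?per_page=100&page=%d\n' "$API_ROOT" "$1"
}

# fetch_stations OUTDIR (hypothetical helper): save each page as
# OUTDIR/stations-N.json and stop at the first failed request, which is
# assumed to mean the last page has been passed.
fetch_stations() {
    page=1
    while curl --fail --silent --show-error \
            --output "$1/stations-$page.json" "$(station_page_url "$page")"; do
        page=$((page + 1))
        sleep 1   # stay polite with the server
    done
    rm -f "$1/stations-$page.json"   # drop any leftover from the failed request
}
```

Running `fetch_stations ./dump` in an existing directory would leave one JSON file per page, ready to be matched against the books scraped from the HTML.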

Anyway. https://github.com/jeromenerf/ab2rss

It took a good hour to fetch everything.

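    # Build index.json, a JSON array listing every generated RSS feed.
    here=$(pwd)
    cd ~/public_html/audiobooks || exit 1
    printf '[' > index.json
    for file in *.rss; do
        [ -e "$file" ] || continue   # no feed yet: the glob stays unexpanded
        # Derive a readable title by turning separators into spaces.
        parts=$(printf '%s' "$file" | sed 's/[_,.-]/ /g')
        printf '{"title": "%s", "uri": "%s"},\n' "$parts" "$file" >> index.json
    done
    # The trailing empty object absorbs the last comma so the array stays valid JSON.
    printf '{}]' >> index.json
    cp index.json "$here/"
    date
    exit 0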
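
The trailing `{}` keeps the array valid but leaves an empty entry for consumers to skip, and a file name containing a double quote would break the JSON. A possible variant is sketched below: it writes the comma before each entry except the first and escapes quotes and backslashes. `json_escape` and `build_index` are names invented here, and unlike the script above this version strips the `.rss` suffix from the title.

```shell
# json_escape STRING (hypothetical helper): escape backslashes and
# double quotes so STRING can sit inside a JSON string literal.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# build_index DIR (hypothetical helper): print a JSON array describing
# every *.rss file found in DIR.
build_index() {
    sep=''
    printf '['
    for path in "$1"/*.rss; do
        [ -e "$path" ] || continue       # DIR holds no feed
        file=${path##*/}
        # Title: file name without .rss, separators turned into spaces.
        title=$(printf '%s' "${file%.rss}" | sed 's/[_,.-]/ /g')
        printf '%s{"title": "%s", "uri": "%s"}' "$sep" \
            "$(json_escape "$title")" "$(json_escape "$file")"
        sep=','
    done
    printf ']\n'
}
```

With this, `build_index ~/public_html/audiobooks > index.json` would replace the loop, and an empty directory simply yields `[]`.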