we just look for the bad string in the HTML. this has the downside
that we may consider songs that have those exact lyrics (you never
know, really) may trigger this warning as well and we would fail to
fetch those songs.
we also fail if lyrics contain another magic string that seems to come
up when you do fill in the CAPTCHA after being blocked.
this is necessary because otherwise artists with different string
representations but the same slug would overwrite one another
this outlines more clearly the code duplication between the rst code
and the slugify function, something which can be fixed later.
we write the artists files in a subdirectory, to avoid infinite
recursions or flooding the current directory needlessly.
this way, the user has a good base structure and can just chain the
command into sphinx to continue building the next format, after
possible tweaks.
skip was a misnomer: we actually skip "unfetched" lyrics. this means
it's somewhat of a double-negative and really confusing.
--local is clearer, although less in opposition with --force
this makes the ePUB easier to parse by e-readers, because they do not
need to load one giant HTML file, but one per author. it also makes
sphinx rendering more efficient and interactive
ReStructuredText has the advantage over HTML that it can be rendered
easily to multiple formats (HTML, ePUB, PDF) and it supports indexes.
the output needs to be fed into a file and integrated into an existing
Sphinx document, of course.
the idea here is to format the lyrics output a little better so that
it can (for example) be shown as a web page or an ebook.
the new skip option allows for faster generation of the output in the
(most common) case where not all lyrics are available.
I did a little audit using the `openssl` command-line tool to find the servers
that don't require SNI. Here's what I found:
icbrainz.org: SNI
images.weserv.nl: inconclusive, but docs say yes SNI
coverartarchive.org: SNI
webservice.fanart.tv: *no* SNI
dbpedia.org: *no* SNI
en.wikipedia.org: *no* SNI
ws.audioscrobbler.com: *no* SNI
api.microsofttranslator.com: *no* SNI
In summary, *only* MusicBrainz and CoverArtArchive were found to require SNI.
So I'm using SSL unconditionally on all the other sites.