Commit graph

199 commits

Author SHA1 Message Date
Antoine Beaupré
f667428758
write sphinx base files
we write the artists files in a subdirectory, to avoid infinite
recursions or flooding the current directory needlessly.

this way, the user has a good base structure and can just chain the
command into sphinx to continue building the next format, after
possible tweaks.
2017-07-15 15:43:16 -04:00
Antoine Beaupré
e6adb5e7da
cosmetic: do not use needless heredoc 2017-07-15 15:33:35 -04:00
Antoine Beaupré
91de8aac84
move rst writer to a different function
this simplifies and clarifies the code, although we need to call the writerst function twice to wrap up at the end of the loop
2017-07-15 15:33:35 -04:00
Antoine Beaupré
d330353e1c
rename the skip option to local
skip was a misnomer: we actually skip "unfetched" lyrics. this means
it's somewhat of a double-negative and really confusing.

--local is clearer, although less in opposition with --force
2017-07-15 14:19:25 -04:00
Antoine Beaupré
ac32ae574c
optimize: write only 3 times per file
this makes the code more readable and reduces the number of syscalls
to write files
2017-07-15 09:23:59 -04:00
Antoine Beaupré
469c03a7bf
deal properly with empty album titles 2017-07-15 09:23:59 -04:00
Antoine Beaupré
63aa3b3165
write to separate rst files
this makes the ePUB easier to parse by e-readers, because they do not
need to load one giant HTML file, but one per author. it also makes
sphinx rendering more efficient and interactive
2017-07-15 09:23:58 -04:00
Antoine Beaupré
0fbfa1feae
render RST instead of HTML
ReStructuredText has the advantage over HTML that it can be rendered
easily to multiple formats (HTML, ePUB, PDF) and it supports indexes.

the output needs to be fed into a file and integrated into an existing
Sphinx document, of course.
2017-07-14 17:34:55 -04:00
Antoine Beaupré
9f3e5b28b4
output lyrics in HTML, allow skipping
the idea here is to format the lyrics output a little better so that
it can (for example) be shown as a web page or an ebook.

the new skip option allows for faster generation of the output in the
(most common) case where not all lyrics are available.
2017-07-14 15:31:22 -04:00
Fabrice Laporte
409f070970 Remove lyrics.com source 2017-05-03 22:54:09 +02:00
Fabrice Laporte
7dab9f339e Restore beets module import 2017-05-02 23:48:20 +02:00
Fabrice Laporte
07af27e44b Lyrics are last paragraph with class 'mxm-lyrics__content'
Remove ‘data-reactid’ from marker.
2017-05-02 23:40:25 +02:00
Fabrice Laporte
3e38a33c4a Fix PEP8 2017-05-02 23:37:20 +02:00
Fabrice Laporte
11eb90c758 Fix PEP8 2017-05-02 07:46:51 +02:00
Fabrice Laporte
3e3ad6974c Fix PEP8 2017-05-02 07:30:40 +02:00
Fabrice Laporte
a165d6c00b Fix MusiXmatch text extraction markers 2017-05-01 23:40:09 +02:00
Fabrice Laporte
2bf58a61c3 Decode string with Unicode escape 2017-04-30 23:14:23 +02:00
Adrian Sampson
0a4709f7ef lyrics: Tolerate empty Google response (#2437) 2017-02-13 16:54:56 -05:00
Adrian Sampson
8087e82891 lyrics: Use Requests for Google backend (fix #2437) 2017-02-12 10:30:22 -05:00
Adrian Sampson
d389ac15e1 Use HTTPS for MS translator API (from #2247) 2017-01-02 21:00:01 -05:00
Adrian Sampson
fbc0f322f6 Merge branch 'tigranl-https_fix' 2017-01-02 20:54:17 -05:00
Adrian Sampson
f941fd42de Always use SSL on servers that don't require SNI
I did a little audit using the `openssl` command-line tool to find the servers
that don't require SNI. Here's what I found:

icbrainz.org: SNI
images.weserv.nl: inconclusive, but docs say yes SNI
coverartarchive.org: SNI
webservice.fanart.tv: *no* SNI
dbpedia.org: *no* SNI
en.wikipedia.org: *no* SNI
ws.audioscrobbler.com: *no* SNI
api.microsofttranslator.com: *no* SNI

In summary, *only* MusicBrainz and CoverArtArchive were found to require SNI.
So I'm using SSL unconditionally on all the other sites.
2017-01-02 20:39:10 -05:00
Adrian Sampson
8bb24e3134 lyrics: Set User-Agent header (fix #2357) 2016-12-30 10:55:24 -05:00
tigranl
dd115b1310 Add ui import 2016-12-11 00:35:51 +03:00
tigranl
5ca664e4aa Fix typos 2016-12-11 00:25:37 +03:00
tigranl
6ba5099034 Python version check for lyrics.py 2016-12-06 16:17:25 +03:00
Adrian Sampson
62e9a15f4d Fix a copy n' paste error found by flake8 2016-11-16 12:03:07 -05:00
Fabrice Laporte
7226624405 replace strip_part() by generate_alternatives()
Delegate the update of titles and artists lists to the helper
generate_alternatives() function.
2016-09-25 19:37:14 +02:00
Fabrice Laporte
e2703b9a7c always yield item artist and title first
Rather than using an unordered set for storing pairs, append to a list
and build an OrderedDict from it to filter duplicated strings while
keeping order.
2016-09-25 15:46:22 +02:00
Fabrice Laporte
8b4f39da42 lyrics: search for song title part preceding colon. fix #2205 2016-09-23 22:23:32 +02:00
Fabrice Laporte
4b702b338e lyrics: reduce code duplication in search_pairs() 2016-09-23 22:21:00 +02:00
Johnny Robeson
7a2bdf502f s/utf8/utf-8/ in all encoding/decoding contexts
This matches up with the python documentation.
2016-09-06 23:10:24 -04:00
Johnny Robeson
fcbfce3984 replace deprecated log.warn() with log.warning() 2016-08-09 00:33:38 -04:00
Johnny Robeson
be08d4b129 replace unichr with six.unichr in lyrics plugin 2016-07-02 02:36:05 -04:00
Adrian Sampson
5efd5b21c5 Use new as_str method
Instead of `get(six.text_type)`, which was a surprisingly large portion of our
uses of six.
2016-06-25 19:16:14 -07:00
Adrian Sampson
e16cc58cb9 Walk back some six.iter* uses
In places where it doesn't much matter whether we use an iterator or the old
Python 2 list way, using the six name just hurts legibility.
2016-06-25 18:29:55 -07:00
Johnny Robeson
78334876c3 treat HTMLParseError as a noop when missing
Strict mode no longer exists in html.parser on python >= 3.5, and no longer means anything on python >= 3.3
2016-06-24 05:53:56 -04:00
Johnny Robeson
edb1cbc5fc replace iter{items|values} with six.iter{items|values} 2016-06-24 05:53:55 -04:00
Johnny Robeson
e8afcbe7ec replace unicode with six.text_type 2016-06-24 05:53:49 -04:00
Johnny Robeson
4649226b9b use urllib from six.moves 2016-06-23 04:40:18 -04:00
Johnny Robeson
129e140015 use html_parser (really html.parser) from six.moves 2016-06-23 04:40:18 -04:00
Johnny Robeson
8fa71f78fe decode bytes from .encode() in lyrics plugin 2016-06-14 00:44:43 -04:00
Adrian Sampson
0051bdb506 lyrics: Avoid a spurious warning 2016-06-02 21:33:33 -07:00
Adrian Sampson
581fba6288 lyrics: Avoid crash when enabling google
If you *both* haven't set an API key *and* BeautifulSoup wasn't
installed, the list.remove() call would crash. (This came up when
running the tests on a fresh machine without many dependencies.)
2016-06-02 11:58:14 -07:00
wordofglass
1dd6739218 lyrics: fix a bug where the lyricswiki fetcher would try to unescape an empty (None) response and crash 2016-04-30 01:25:02 +02:00
wordofglass
c3c7da8061 lyrics: simplify source handling a little 2016-04-28 18:31:22 +02:00
wordofglass
2928a16bd5 lyrics: actually disable translation when there's no langdetect 2016-04-28 17:22:55 +02:00
wordofglass
c4b11f889f lyrics: clean up import handling and source removal 2016-04-28 17:15:25 +02:00
Jack Wilsdon
c5e2334fb5 Remove useless unescape
Remove useless unescape as _scrape_script_cruft does it for us.
2016-04-25 19:24:26 +01:00
Jack Wilsdon
1be9c3003e Use different method to remove junk from LyricsWiki
Use `_scrape_strip_cruft` instead of `scrape_lyrics_from_html` so that
LyricsWiki does not depend on Beautiful Soup.
2016-04-25 19:14:30 +01:00