FichteFoll
1a23eab8b6
Use https for lyrics.wikia.com, when supported
2019-06-05 23:00:52 +02:00
Carl Suster
e4c03fd63f
Fix deprecated placement of inline regex flags
...
https://bugs.python.org/issue22493
2019-03-31 19:44:49 +11:00
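Note on the commit above: the linked issue covers inline flags such as `(?i)` that appear somewhere other than the start of a pattern. Python 3.6 began warning about this, and as far as I know newer releases reject it outright. A minimal sketch of the two safe spellings follows; the pattern and the `credited_name` helper are made up for illustration and are not taken from the plugin.

```python
import re

# Deprecated form (flag placed mid-pattern), shown only as a comment:
#     r'lyrics by\s+(\w+)(?i)'

# Option 1: put the inline flag at the very start of the pattern.
pattern_inline = re.compile(r'(?i)lyrics by\s+(\w+)')

# Option 2: pass the flag as an argument instead of inlining it.
pattern_arg = re.compile(r'lyrics by\s+(\w+)', re.IGNORECASE)


def credited_name(text, pattern):
    """Return the first captured name, or None if nothing matches.

    Hypothetical helper, used here only to exercise the patterns.
    """
    match = pattern.search(text)
    return match.group(1) if match else None
```

Both patterns behave the same; the second form tends to be easier to spot in review.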
Adrian Sampson
bac8faad78
Resolve W605: invalid escape sequence
...
This came up in lots of regexes that weren't using "raw" literals.
2018-08-13 10:41:01 -04:00
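Note on the commit above: W605 is reported when a normal string literal contains a backslash sequence Python does not recognise, which is common in regexes written without the `r` prefix. A small sketch, assuming an invented pattern (`YEAR_RAW`) and helper (`find_year`) that do not come from the plugin:

```python
import re

# In a normal literal, "\d" and "\(" are invalid escape sequences (W605).
# A raw literal keeps the backslashes as written:
YEAR_RAW = re.compile(r'\((\d{4})\)')

# The equivalent non-raw spelling has to double every backslash:
YEAR_ESCAPED = re.compile('\\((\\d{4})\\)')


def find_year(text):
    """Return a four-digit year found in parentheses, or None."""
    match = YEAR_RAW.search(text)
    return int(match.group(1)) if match else None
```

The two compiled patterns are identical; the raw form is simply easier to read and to lint.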
Abra
3348a466f4
Make lyrics plugin group songs by 'albumartist' rather than 'artist'
...
when writing ReST
2018-05-14 11:18:34 +04:00
rachmadaniHaryono
e90a547642
chg: dev: fix list item remove error
2018-05-03 09:46:03 +08:00
Adrian Sampson
224d782c2c
Fix #2771 : handle errors in genius lyrics source
2018-01-30 22:37:44 -05:00
Adrian Sampson
277d81b4d6
lyrics: Don't write ReST by default!
2018-01-30 22:33:32 -05:00
Adrian Sampson
e7417e3683
lyrics: Don't crash when BeautifulSoup isn't found
2018-01-30 22:31:15 -05:00
Lucas Magno
fc2d379fb5
Comply with PEP8
2017-10-09 06:22:42 -03:00
Lucas Magno
1b35a5df0d
Fetch lyrics from Genius through scraper
2017-10-08 09:13:51 -03:00
Adrian Sampson
c06eca7e58
Merge pull request #2634 from anarcat/musixmatch-block-detect
...
lyrics: detect MusixMatch blocking
2017-07-18 17:13:31 -04:00
Antoine Beaupré
5ef68783a8
strip trailing and leading extra dashes
...
those are introduced if non-word characters are found, and are ugly
2017-07-18 16:33:22 -04:00
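Note on the commit above: replacing runs of non-word characters with a dash leaves stray dashes at either end whenever the input starts or ends with punctuation or whitespace. The sketch below shows the idea under stated assumptions: this `slug` function is a hypothetical stand-in written for this note, not the plugin's actual helper.

```python
import re
import unicodedata


def slug(text):
    """Hypothetical slug helper (an assumption, not the plugin's code).

    Folds to ASCII, lowercases, replaces each run of non-word characters
    with a single dash, then strips leading and trailing dashes.
    """
    ascii_text = (
        unicodedata.normalize('NFKD', text)
        .encode('ascii', 'ignore')
        .decode('ascii')
    )
    dashed = re.sub(r'\W+', '-', ascii_text.lower())
    return dashed.strip('-')
```

Without the final `strip('-')`, an input like `"  Hello, World! "` would come out as `-hello-world-`.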
Antoine Beaupré
b4b5473093
add pointer to slugify in slug
2017-07-18 16:14:10 -04:00
Antoine Beaupré
a8afabea80
move slug utility function to top-level
...
it's a generic utility function that can be reused, there's nothing
class-specific about it.
2017-07-18 16:12:48 -04:00
Antoine Beaupré
5e8d17a4fc
lyrics: detect MusixMatch blocking
...
we just look for the bad string in the HTML. this has the downside
that songs that have those exact lyrics (you never know, really) may
trigger this warning as well, and we would fail to fetch those
songs.
we also fail if lyrics contain another magic string that seems to come
up when you do fill in the CAPTCHA after being blocked.
2017-07-17 12:21:55 -04:00
Antoine Beaupré
458f3636f4
compare artists based on the slug
...
this is necessary because otherwise artists with different string
representations but the same slug would overwrite one another
this outlines more clearly the code duplication between the rst code
and the slugify function, something which can be fixed later.
2017-07-17 11:59:14 -04:00
Antoine Beaupré
9c36a41ea8
slight refactoring: strip album only once
2017-07-17 11:50:15 -04:00
Antoine Beaupré
9894e8752b
ignore trailing/leading whitespace when comparing artists
2017-07-17 11:49:35 -04:00
Antoine Beaupré
36f84bfedd
add missing trailing newline after lyrics block
...
this would yield a warning for every song
2017-07-17 11:44:06 -04:00
Adrian Sampson
b303d5beb0
Slightly more complete sentences in comments
2017-07-17 10:59:04 -04:00
Antoine Beaupré
93966ed4ee
strip whitespace in titles
...
this would cause problems with songs that had trailing spaces with the
index directive
2017-07-17 09:00:22 -04:00
Antoine Beaupré
b6e42ee2e8
fix another unicode error
...
the unicode strings are not binary - rely on Python to do the right
thing here instead of encoding a string we know is already properly
encoded
2017-07-17 08:55:09 -04:00
Antoine Beaupré
6d58110bd2
move heredocs to top-level globals
2017-07-17 08:50:19 -04:00
Antoine Beaupré
7e0a48a46d
s/rest/rest/
2017-07-17 08:49:40 -04:00
Adrian Sampson
9de94378b9
An even shorter metavariable
2017-07-16 10:14:49 -04:00
Adrian Sampson
813cf97686
Better metavariable for lyrics --help output
2017-07-16 10:10:41 -04:00
Antoine Beaupré
5d8c15980e
fix flake8 warning
2017-07-15 16:24:07 -04:00
Antoine Beaupré
0bcd16f1ab
deal with encoding issues in python3
...
when we encode explicitly, we return bytes, so open files as binary
2017-07-15 16:21:41 -04:00
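Note on the commit above: on Python 3, `str.encode()` returns `bytes`, and a file opened in text mode refuses bytes. A minimal sketch of the pattern the message describes; `write_encoded` is a hypothetical name chosen for this note.

```python
def write_encoded(path, text, encoding='utf-8'):
    """Encode text explicitly and write it through a binary file handle.

    Hypothetical helper: because we call .encode() ourselves, the file
    must be opened with 'wb' rather than 'w'.
    """
    data = text.encode(encoding)  # bytes on Python 3
    with open(path, 'wb') as handle:
        handle.write(data)
    return len(data)
```

The alternative is to skip the explicit encode and open the file in text mode with an `encoding` argument.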
Antoine Beaupré
f667428758
write sphinx base files
...
we write the artists files in a subdirectory, to avoid infinite
recursions or flooding the current directory needlessly.
this way, the user has a good base structure and can just chain the
command into sphinx to continue building the next format, after
possible tweaks.
2017-07-15 15:43:16 -04:00
Antoine Beaupré
e6adb5e7da
cosmetic: do not use needless heredoc
2017-07-15 15:33:35 -04:00
Antoine Beaupré
91de8aac84
move rst writer to a different function
...
this simplifies and clarifies the code, although we need to call the writerst function twice to wrap up at the end of the loop
2017-07-15 15:33:35 -04:00
Antoine Beaupré
d330353e1c
rename the skip option to local
...
skip was a misnomer: we actually skip "unfetched" lyrics. this means
it's somewhat of a double-negative and really confusing.
--local is clearer, although less in opposition with --force
2017-07-15 14:19:25 -04:00
Antoine Beaupré
ac32ae574c
optimize: write only 3 times per file
...
this makes the code more readable and reduces the number of syscalls
to write files
2017-07-15 09:23:59 -04:00
Antoine Beaupré
469c03a7bf
deal properly with empty album titles
2017-07-15 09:23:59 -04:00
Antoine Beaupré
63aa3b3165
write to separate rst files
...
this makes the ePUB easier to parse by e-readers, because they do not
need to load one giant HTML file, but one per author. it also makes
sphinx rendering more efficient and interactive
2017-07-15 09:23:58 -04:00
Antoine Beaupré
0fbfa1feae
render RST instead of HTML
...
ReStructuredText has the advantage over HTML that it can be rendered
easily to multiple formats (HTML, ePUB, PDF) and it supports indexes.
the output needs to be fed into a file and integrated into an existing
Sphinx document, of course.
2017-07-14 17:34:55 -04:00
Antoine Beaupré
9f3e5b28b4
output lyrics in HTML, allow skipping
...
the idea here is to format the lyrics output a little better so that
it can (for example) be shown as a web page or an ebook.
the new skip option allows for faster generation of the output in the
(most common) case where not all lyrics are available.
2017-07-14 15:31:22 -04:00
Fabrice Laporte
409f070970
Remove lyrics.com source
2017-05-03 22:54:09 +02:00
Fabrice Laporte
7dab9f339e
Restore beets module import
2017-05-02 23:48:20 +02:00
Fabrice Laporte
07af27e44b
Lyrics are last paragraph with class 'mxm-lyrics__content'
...
Remove ‘data-reactid’ from marker.
2017-05-02 23:40:25 +02:00
Fabrice Laporte
3e38a33c4a
Fix PEP8
2017-05-02 23:37:20 +02:00
Fabrice Laporte
11eb90c758
Fix PEP8
2017-05-02 07:46:51 +02:00
Fabrice Laporte
3e3ad6974c
Fix PEP8
2017-05-02 07:30:40 +02:00
Fabrice Laporte
a165d6c00b
Fix MusiXmatch text extraction markers
2017-05-01 23:40:09 +02:00
Fabrice Laporte
2bf58a61c3
Decode string with Unicode escape
2017-04-30 23:14:23 +02:00
Adrian Sampson
0a4709f7ef
lyrics: Tolerate empty Google response ( #2437 )
2017-02-13 16:54:56 -05:00
Adrian Sampson
8087e82891
lyrics: Use Requests for Google backend ( fix #2437 )
2017-02-12 10:30:22 -05:00
Adrian Sampson
d389ac15e1
Use HTTPS for MS translator API (from #2247 )
2017-01-02 21:00:01 -05:00
Adrian Sampson
fbc0f322f6
Merge branch 'tigranl-https_fix'
2017-01-02 20:54:17 -05:00
Adrian Sampson
f941fd42de
Always use SSL on servers that don't require SNI
...
I did a little audit using the `openssl` command-line tool to find the servers
that don't require SNI. Here's what I found:
musicbrainz.org: SNI
images.weserv.nl: inconclusive, but docs say yes SNI
coverartarchive.org: SNI
webservice.fanart.tv: *no* SNI
dbpedia.org: *no* SNI
en.wikipedia.org: *no* SNI
ws.audioscrobbler.com: *no* SNI
api.microsofttranslator.com: *no* SNI
In summary, *only* MusicBrainz and CoverArtArchive were found to require SNI.
So I'm using SSL unconditionally on all the other sites.
2017-01-02 20:39:10 -05:00
Adrian Sampson
8bb24e3134
lyrics: Set User-Agent header ( fix #2357 )
2016-12-30 10:55:24 -05:00
tigranl
dd115b1310
Add ui import
2016-12-11 00:35:51 +03:00
tigranl
5ca664e4aa
Fix typos
2016-12-11 00:25:37 +03:00
tigranl
6ba5099034
Python version check for lyrics.py
2016-12-06 16:17:25 +03:00
Adrian Sampson
62e9a15f4d
Fix a copy n' paste error found by flake8
2016-11-16 12:03:07 -05:00
Fabrice Laporte
7226624405
replace strip_part() by generate_alternatives()
...
Delegate the update of titles and artists lists to the helper
generate_alternatives() function.
2016-09-25 19:37:14 +02:00
Fabrice Laporte
e2703b9a7c
always yield item artist and title first
...
Rather than using an unordered set for storing pairs, append to a list
and build an OrderedDict from it to filter duplicated strings while
keeping order.
2016-09-25 15:46:22 +02:00
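Note on the commit above: building an `OrderedDict` from a list drops duplicate entries while keeping the first occurrence of each in its original position, which a plain `set` does not guarantee. A minimal sketch; `unique_in_order` is a hypothetical name, not a function from the plugin.

```python
from collections import OrderedDict


def unique_in_order(candidates):
    """Drop duplicates from a list while preserving first-seen order."""
    return list(OrderedDict.fromkeys(candidates))
```

With this approach, the item's own artist and title stay first as long as they are appended to the list first.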
Fabrice Laporte
8b4f39da42
lyrics: search for song title part preceding colon. fix #2205
2016-09-23 22:23:32 +02:00
Fabrice Laporte
4b702b338e
lyrics: reduce code duplication in search_pairs()
2016-09-23 22:21:00 +02:00
Johnny Robeson
7a2bdf502f
s/utf8/utf-8/ in all encoding/decoding contexts
...
This matches up with the python documentation.
2016-09-06 23:10:24 -04:00
Johnny Robeson
fcbfce3984
replace deprecated log.warn() with log.warning()
2016-08-09 00:33:38 -04:00
Johnny Robeson
be08d4b129
replace unichr with six.unichr in lyrics plugin
2016-07-02 02:36:05 -04:00
Adrian Sampson
5efd5b21c5
Use new as_str method
...
Instead of `get(six.text_type)`, which was a surprisingly large portion of our
uses of six.
2016-06-25 19:16:14 -07:00
Adrian Sampson
e16cc58cb9
Walk back some six.iter* uses
...
In places where it doesn't much matter whether we use an iterator or the old
Python 2 list way, using the six name just hurts legibility.
2016-06-25 18:29:55 -07:00
Johnny Robeson
78334876c3
treat HTMLParseError as a noop when missing
...
Strict mode no longer exists in html.parser on python >= 3.5, and no longer means anything on python >= 3.3
2016-06-24 05:53:56 -04:00
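Note on the commit above: `HTMLParseError` was removed from `html.parser` in Python 3.5, so code that catches it needs a fallback. One way to sketch the "noop when missing" idea (an assumption about the shape of the fix, not a copy of it):

```python
try:
    # Present on older interpreters only.
    from html.parser import HTMLParseError
except ImportError:
    class HTMLParseError(Exception):
        """Placeholder that is never raised by the parser.

        It exists only so that `except HTMLParseError:` clauses stay valid.
        """
```

Because nothing raises the placeholder, any `except HTMLParseError` clause simply never fires on newer interpreters.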
Johnny Robeson
edb1cbc5fc
replace iter{items|values} with six.iter{items|values}
2016-06-24 05:53:55 -04:00
Johnny Robeson
e8afcbe7ec
replace unicode with six.text_type
2016-06-24 05:53:49 -04:00
Johnny Robeson
4649226b9b
use urllib from six.moves
2016-06-23 04:40:18 -04:00
Johnny Robeson
129e140015
use html_parser (really html.parser) from six.moves
2016-06-23 04:40:18 -04:00
Johnny Robeson
8fa71f78fe
decode bytes from .encode() in lyrics plugin
2016-06-14 00:44:43 -04:00
Adrian Sampson
0051bdb506
lyrics: Avoid a spurious warning
2016-06-02 21:33:33 -07:00
Adrian Sampson
581fba6288
lyrics: Avoid crash when enabling google
...
If you *both* haven't set an API key *and* BeautifulSoup wasn't
installed, the list.remove() call would crash. (This came up when
running the tests on a fresh machine without many dependencies.)
2016-06-02 11:58:14 -07:00
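Note on the commit above: `list.remove()` raises `ValueError` when the value is absent, so removing the same source twice (once per failed precondition) crashes. A minimal guard, using a hypothetical `disable_source` helper and made-up source names:

```python
def disable_source(sources, name):
    """Remove `name` from `sources` if present.

    Returns True if something was removed, False otherwise, so calling
    it twice for the same source is harmless.
    """
    if name in sources:
        sources.remove(name)
        return True
    return False
```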
wordofglass
1dd6739218
lyrics: fix a bug where the lyricswiki fetcher would try to unescape an empty (None) response and crash
2016-04-30 01:25:02 +02:00
wordofglass
c3c7da8061
lyrics: simplify source handling a little
2016-04-28 18:31:22 +02:00
wordofglass
2928a16bd5
lyrics: actually disable translation when there's no langdetect
2016-04-28 17:22:55 +02:00
wordofglass
c4b11f889f
lyrics: clean up import handling and source removal
2016-04-28 17:15:25 +02:00
Jack Wilsdon
c5e2334fb5
Remove useless unescape
...
Remove useless unescape as _scrape_strip_cruft does it for us.
2016-04-25 19:24:26 +01:00
Jack Wilsdon
1be9c3003e
Use different method to remove junk from LyricsWiki
...
Use `_scrape_strip_cruft` instead of `scrape_lyrics_from_html` so that
LyricsWiki does not depend on Beautiful Soup.
2016-04-25 19:14:30 +01:00
wordofglass
607f41be43
Fix the previous fix...
2016-04-24 00:42:31 +02:00
wordofglass
4a5b886944
Fix two non-guarded import statements in the lyrics plugin
...
These could make the import process crash with a traceback.
2016-04-24 00:35:15 +02:00
Guilherme Danno
bf1b06f0c7
don't print entire lyrics during import
2016-04-22 17:30:06 -03:00
Fabrice Laporte
05970e8a93
re-query token when it has expired
2016-04-14 22:57:41 +02:00
Fabrice Laporte
56d7e5dfa0
send as little text as possible to bing api
...
Bing API has a limit of 2M chars/month. It’s common to have repeating
sentences in lyrics so to reduce number of chars sent per song, store
sentences in a set and send it, instead of sending the whole lyrics.
2016-04-14 22:57:17 +02:00
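Note on the commit above: lyrics often repeat whole lines, so sending each distinct line once reduces the character count billed by the API. A rough sketch of that idea; the function names are hypothetical and the splitting is done per line, which is an assumption about what "sentences" means here.

```python
def unique_lines(lyrics):
    """Return the set of distinct non-empty lines in the lyrics."""
    return {line.strip() for line in lyrics.splitlines() if line.strip()}


def chars_saved(lyrics):
    """Count how many characters deduplication avoids sending."""
    total = sum(len(line.strip()) for line in lyrics.splitlines())
    deduplicated = sum(len(line) for line in unique_lines(lyrics))
    return total - deduplicated
```

A set loses ordering, so the translated lines would have to be mapped back onto the original text afterwards.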
Fabrice Laporte
6cfc106b8a
better docs and debug msg
2016-04-14 08:31:55 +02:00
Fabrice Laporte
58df77e2cb
langdetect conditional import
2016-04-14 08:31:14 +02:00
Fabrice Laporte
e03c3af91f
don't translate lyrics already in the target language
2016-04-14 01:11:14 +02:00
Fabrice Laporte
66a627fed8
restore module docstring
2016-04-14 00:58:42 +02:00
Fabrice Laporte
3c2479ab49
translate lyrics using Bing API
...
By subscribing to Microsoft Translator API, one can now activate the
translation of lyrics from one set of source languages to a target
language.
Translations are appended to each original sentence using ‘/‘ as
separator.
2016-04-14 00:53:58 +02:00
Fabrice Laporte
d67950cdcc
pep8
2016-04-14 00:45:55 +02:00
Adrian Sampson
d1753b341e
lyrics: Some comments and better naming
2016-03-21 10:28:30 -07:00
Adrian Sampson
f684f29a25
lyrics: Tolerate pages without text ( fix #1914 )
2016-03-21 10:24:13 -07:00
Adrian Sampson
c9be5bc7d1
Merge pull request #1911 from jackwilsdon/fix-musixmatch-url
...
Fix MusixMatch issues
2016-03-18 11:51:27 -04:00
Jack Wilsdon
60148918d9
Fix LyricsWiki scraping code
...
LyricsWiki now escapes song lyrics using HTML entities (presumably to
prevent scraping), so we now unescape these before parsing.
LyricsWiki has also added a script tag inside the div we are scraping,
so we have to remove this using `scrape_lyrics_from_html`.
2016-03-17 17:49:41 +00:00
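Note on the commit above: text encoded as HTML entities has to be decoded before it can be read as plain lyrics. The sketch below uses the Python 3 standard library's `html.unescape`; the plugin at the time targeted Python 2 as well, so this is an illustration rather than the code that was committed, and `decode_entities` is a hypothetical name.

```python
import html


def decode_entities(fragment):
    """Turn numeric and named HTML entities back into characters."""
    return html.unescape(fragment)
```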
Jack Wilsdon
c417003184
Add missing newline
2016-03-16 21:07:28 +00:00
Jack Wilsdon
1ec06e14c5
Fix lyrics extraction from MusiXmatch
...
Remove "lyrics_" prefix from extract_text_between arguments to reflect
changes made to the MusiXmatch website.
2016-03-16 20:48:57 +00:00
Jack Wilsdon
44c799320f
Improve URL generation in lyrics plugin
...
Allow custom replacements to be defined in subclasses of
SymbolsReplaced.
Replace spaces with hyphens when the source is MusiXmatch, instead of
(incorrectly) using underscores. This fixes #1880 .
2016-03-16 20:46:36 +00:00
Adrian Sampson
e54c7eec3d
Standardize __future__ imports without parentheses
...
Since the list is short enough now, we don't need parentheses for the line
wrap. This is a little less ugly.
2016-02-28 15:03:51 -08:00
Adrian Sampson
d53019f9db
Further whitespace fiddling
...
Most commonly, this sticks with:
log.debug(
'some long message here'
)
instead of placing the closing ) at the end of the string literal.
2016-02-28 14:48:10 -08:00
Peter Kessen
f2fc1a78bf
Removed import of unicode_literals from plugins
...
* keyfinder
* lastimport
* lyrics
2016-02-20 13:59:58 +01:00
Adrian Sampson
60888274c4
lyrics: Re-disable Genius backend
...
As #1854 pointed out, the Genius API service is down *again*.
2016-02-02 08:14:22 -08:00