See https://realpython.com/python-namespace-package.
This setup is backwards-compatible, so plugins using the old
pkgutil-based setup will continue working fine.
This setup has an advantage where external plugins will now be able to
import modules from 'beetsplug' package for typing purposes. Previously,
mypy could not resolve these modules due to presence of `__init__.py`.
If we get caught by Cloudfare, it forwards our request somewhere else
and returns some validation text response. To make sure that this text
does not get assumed for lyrics, we can disable redirects for the Google
backend, check the response code and raise if there's a redirect
attempt. This source will then be skipped and the backend continues with
the next one.
Additionally, improve HTML pre-processing:
* Ensure a new line between blocks of lyrics text from letras.mus.br.
* Parse a missing last block of lyrics text from lacocinelle.net.
* Parse a missing last block of lyrics text from paroles.net.
* Fix encoding issues with AZLyrics by setting response encoding to
None, allowing `requests` to handle it.
* Type the response data that Google Custom Search API return.
* Exclude some 'letras.mus.br' pages that do not contain lyric.
* Exclude results from Musixmatch as we cannot access their pages.
* Improve parsing of the URL title:
- Handle long URL titles that get truncated (end with ellipsis) for
long searches
- Remove domains starting with 'www'
- Parse the title AND the artist. Previously this would only parse the
title, and fetch lyrics even when the artist did not match.
* Remove now redundant credits cleanup and checks for valid lyrics.
Tidy up 'Google.is_page_candidate' method and remove 'Google.sluggify'
method which was a duplicate of 'slug'.
Since 'GeniusFetchTest' only tested whether the artist name is cleaned
up (the rest of the functionality is patched), remove it and move its
test cases to the 'test_slug' test.
Having removed it I fuond that only the Genius lyrics changed: it had en
extra new line. Thus I defined a function 'collapse_newlines' which now
gets called for the Genius lyrics.
Improve requests performance with requests.Session which uses connection
pooling for repeated requests to the same host.
Additionally, this centralizes request configuration, making sure that
we use the same timeout and provide beets user agent for all requests.
This commit introduces a distance threshold mechanism for the Genius and
Google backends.
- Create a new `SearchBackend` base class with a method `check_match`
that performs checking.
- Start using undocumented `dist_thresh` configuration option for good,
and mention it in the docs. This controls the maximum allowable
distance for matching artist and title names.
These changes aim to improve the accuracy of lyrics matching, especially
when there are slight variations in artist or title names, see #4791.
- Rename method _dedup_genre, since it's only used for
finalizing/polishing existing genres.
- Return separator-delimited string already.
- Decide on not passing "separator" to methods, it's a config
setting available throughout the plugin. Assign to variable where
useful for readability though.
- In the force branch, remove re-assigning keep_genres to empty list.
- Fix a test. Existing genres are "polished" now, which means:
configured title_case is applied.
- Fix/add type hints on all touched and new methods