This commit introduces a distance threshold mechanism for the Genius and
Google backends.
- Create a new `SearchBackend` base class with a `check_match` method
that checks whether a search result matches the requested artist and title.
- Start using the previously undocumented `dist_thresh` configuration
option for good, and mention it in the docs. It controls the maximum
allowable distance when matching artist and title names.
These changes aim to improve the accuracy of lyrics matching, especially
when there are slight variations in artist or title names (see #4791).
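
The thresholding idea can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: the `string_distance` helper is hypothetical and uses `difflib` as a stand-in distance measure, the `check_match` signature is assumed, and the default of `0.1` is only an example value.

```python
from difflib import SequenceMatcher


def string_distance(a: str, b: str) -> float:
    """Hypothetical helper: 0.0 for identical strings, up to 1.0 for no overlap.

    difflib is a stand-in here; the plugin may measure distance differently.
    """
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def check_match(
    target_artist: str,
    target_title: str,
    found_artist: str,
    found_title: str,
    dist_thresh: float = 0.1,  # illustrative default, not taken from the docs
) -> bool:
    """Accept a result only if both artist and title are within the threshold."""
    max_dist = max(
        string_distance(target_artist, found_artist),
        string_distance(target_title, found_title),
    )
    return max_dist <= dist_thresh
```

With a sketch like this, raising `dist_thresh` tolerates larger spelling differences at the cost of more false positives, while `0.0` would require a case-insensitive exact match.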
## Description
Added a quick checkpoint to ensure the config file is set up correctly
before users import their music library. I discovered the need for this
after running into an issue with my own config file, and I hope it helps
new users avoid the same problem.
## New config option `keep_existing` (#4982)
- Fix the behaviour of the `force` option. Previously, disabling the option
had "incomplete" behaviour:
- If content was found, a whitelist check was issued and, if valid, the
plugin exited early and logged ("keep").
- This whitelist check was not aware of multiple genres (typically
separated by a string like `, `), so it failed, and all existing
genres were erased and overwritten with new ones.
_**This didn't feel like a typical behaviour of a `force` option, which
this PR tries to improve as follows...**_
- String-separated multi-genres are now compiled into a list and
depending on the `whitelist` option are kept and enriched with freshly
fetched last.fm genres.
- If force is off, pre-populated tags are not touched.
- A lot of refactoring was done, some absolutely required, some as a
preparation for future work on the plugin.
- The main processing function `_get_genre` was massively overhauled and
got a new `pytest.mark.parametrize` test which includes many more test
cases.
- Rename method _dedup_genre, since it's only used for
finalizing/polishing existing genres.
- Return separator-delimited string already.
- Decide against passing "separator" to methods; it's a config
setting available throughout the plugin. Assign it to a variable where
useful for readability though.
- In the force branch, remove re-assigning keep_genres to empty list.
- Fix a test. Existing genres are "polished" now, which means the
configured title_case is applied.
- Fix/add type hints on all touched and new methods
- If the keep_existing option is set, just remember everything for now.
- Dedup happening later on via _combine... _resolve_genres...
- Even knowing if whitelist or not is not important at this point.
Get rid of useless variables that were only introduced for temporary
debug logging while refactoring earlier.
Co-authored-by: Šarūnas Nejus <snejus@protonmail.com>
The best place to log what we actually fetched from last.fm seems to be
here in _combine_and_label_genres. Leave out the existing genres we also
receive in this function - less is more.
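
For readers following the `force` / `keep_existing` changes, the intended behaviour can be approximated by the sketch below. It is an assumption-laden illustration rather than the plugin's implementation: `split_genres` and `resolve_genres` are hypothetical names, the whitelist is modelled as a plain set of lowercase strings, and title-casing and logging are left out.

```python
from __future__ import annotations


def split_genres(raw: str, separator: str = ", ") -> list[str]:
    """Split a separator-delimited genre string into a clean list."""
    return [g.strip() for g in raw.split(separator) if g.strip()]


def resolve_genres(
    existing: str,
    fetched: list[str],
    force: bool,
    keep_existing: bool,
    whitelist: set[str] | None = None,  # assumed: lowercase genre names
    separator: str = ", ",
) -> str:
    """Hypothetical sketch of combining existing and freshly fetched genres."""
    current = split_genres(existing, separator)

    # With force off, pre-populated tags are not touched.
    if current and not force:
        return existing

    # With keep_existing on, remember everything for now; dedup happens below.
    candidates = (current if keep_existing else []) + fetched

    if whitelist is not None:
        candidates = [g for g in candidates if g.lower() in whitelist]

    # Deduplicate case-insensitively while preserving order.
    seen: set[str] = set()
    combined: list[str] = []
    for genre in candidates:
        key = genre.lower()
        if key not in seen:
            seen.add(key)
            combined.append(genre)

    # Return the separator-delimited string already.
    return separator.join(combined)
```

For example, under these assumptions `resolve_genres("Rock, Jazz", ["Blues", "rock"], force=True, keep_existing=True)` keeps both existing genres and appends only the new one.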