Fixes an issue where each Spotify query was converted to ASCII before sending. Adds a
new config option to enable the legacy behaviour.
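
To illustrate what the old behaviour could lose, here is a minimal sketch. It uses `unicodedata` from the standard library as a stand-in for whatever transliteration the plugin actually performed; `build_query` and `legacy_ascii` are hypothetical names, not the plugin's real function or option.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Lossy ASCII folding: decompose, then drop every non-ASCII code point."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def build_query(text: str, legacy_ascii: bool = False) -> str:
    """Return the query text, folded to ASCII only when the legacy flag is set.

    `legacy_ascii` is a hypothetical stand-in for the new config option.
    """
    return to_ascii(text) if legacy_ascii else text


# Accented Latin text survives folding in a degraded form ...
print(build_query("Beyoncé", legacy_ascii=True))
# ... while a Japanese title is wiped out entirely unless folding is disabled.
print(repr(build_query("夜に駆ける", legacy_ascii=True)))
print(build_query("夜に駆ける"))
```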
A file called japanese_track_request.json was made to mimic the Spotify
API response since I don't have the credentials. Its entries will
need to be replaced with data from an actual API response.
Co-authored-by: Sebastian Mohr <sebastian@mohrenclan.de>
Co-authored-by: Sebastian Mohr <39738318+semohr@users.noreply.github.com>
Co-authored-by: J0J0 Todos <2733783+JOJ0@users.noreply.github.com>
avoid linter error
avoid other linter error
fix format
changing deps (no lock!)
poetry lock?
lint & format
attempt 2 at poetry lock
crlf -> lf line endings
changelog!
Adds replace plugin. The plugin allows the user to replace the audio
file of a song, while keeping the tags and file name.
Some music servers keep track of favourite songs via paths and tags. Now
there won't be a need to 'refavourite'. Plus, this skips the
import/merge steps.
- Instead of checking for empty `artist` query, use `va_likely`
parameter to determine whether we should query for Various Artists or
not.
- `album` / `title` is always a truthy string - no need to handle empty
criteria case
- `tracks` list always has at least one track - no need to check for
`len(items)`
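
A minimal sketch of the simplified criteria building described above. The function name, the dictionary keys and the `VARIOUS_ARTISTS` placeholder are assumptions for illustration; the real backend may use different keys or an artist ID.

```python
# Placeholder value; the real backend may identify Various Artists differently.
VARIOUS_ARTISTS = "Various Artists"


def build_album_criteria(artist: str, album: str, va_likely: bool) -> dict[str, str]:
    """Build search criteria without any empty-value checks.

    `album` is assumed to always be a non-empty string, so only the
    artist depends on a flag rather than on the artist being empty.
    """
    return {
        "album": album,
        "artist": VARIOUS_ARTISTS if va_likely else artist,
    }
```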
Using the correct function signature for g_file_new_for_path fixes the
tests on s390x.
I do not have the full story on why this failed consistently only on
s390x, but I guess its big-endian byte order might have something to do
with this.
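
For context, a hedged sketch of what declaring the signatures looks like with `ctypes`. When `argtypes` and `restype` are not declared, `ctypes` assumes a C `int` return value and guesses argument conversions, which can behave differently across architectures; whether that is the exact mechanism behind the s390x failure is an assumption, not something the logs confirm. The helper name `gio_uri` is hypothetical and this is not the plugin's actual code.

```python
import ctypes
import ctypes.util
from typing import Optional


def gio_uri(path: str) -> Optional[str]:
    """Return the GIO URI for `path`, or None if libgio is unavailable."""
    name = ctypes.util.find_library("gio-2.0")
    if name is None:
        return None
    try:
        gio = ctypes.CDLL(name)
    except OSError:
        return None

    # Declare every signature explicitly so pointers are never
    # passed or returned as plain C ints.
    gio.g_file_new_for_path.argtypes = [ctypes.c_char_p]
    gio.g_file_new_for_path.restype = ctypes.c_void_p
    gio.g_file_get_uri.argtypes = [ctypes.c_void_p]
    gio.g_file_get_uri.restype = ctypes.c_void_p
    gio.g_object_unref.argtypes = [ctypes.c_void_p]

    g_file = gio.g_file_new_for_path(path.encode())
    if not g_file:
        return None
    try:
        uri_ptr = gio.g_file_get_uri(g_file)
        if not uri_ptr:
            return None
        # Copy the C string; the original buffer is not freed in this sketch.
        return ctypes.string_at(uri_ptr).decode()
    finally:
        gio.g_object_unref(g_file)
```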
Here is how the tests were failing:
```
169s ___________________________ ThumbnailsTest.test_uri ____________________________
169s
169s self = <test.plugins.test_thumbnails.ThumbnailsTest testMethod=test_uri>
169s
169s def test_uri(self):
169s gio = GioURI()
169s if not gio.available:
169s self.skipTest("GIO library not found")
169s
169s > assert gio.uri("/foo") == "file:///" # silent fail
169s E AssertionError: assert '' == 'file:///'
169s E
169s E - file:///
169s
169s test/plugins/test_thumbnails.py:268: AssertionError
```
You can see a full log here [1] and a history of consistent failure
here [2]. Both links are bound to expire at some point, sorry future
archeologist 🤷.
[1]: https://autopkgtest.ubuntu.com/results/autopkgtest-plucky/plucky/s390x/b/beets/20250403_162414_5d1da@/log.gz#S5
[2]: https://autopkgtest.ubuntu.com/packages/beets/plucky/s390x
This was not thought through clearly before. It now behaves as follows
which I suppose is least surprising to a user:
- force is on, keep_existing is on, but the whitelist is DISABLED
- no stage found anything on last.fm
- fall back to the original genre
If the whitelist were ENABLED in this example, the behaviour
changes: the original genre is kept only if it passes the whitelist
test.
If no album was found the next stage (artist) should be entered, the
original genre kicked out (not whitelisted) and artist genre accepted.
The log message is slightly misleading since it tried to keep existing
genres but they were not whitelisted, thus kicked out. This is expected.
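
The fallback rule can be sketched roughly as follows. This is an illustration only: the function name is hypothetical, and a whitelist of `None` is used here to mean "disabled", which is an assumption about representation rather than the plugin's actual code.

```python
from typing import Optional


def genres_after_failed_lookup(
    existing: list[str], whitelist: Optional[set[str]]
) -> list[str]:
    """Decide which existing genres survive when no stage found anything.

    With the whitelist disabled (None) the original genres are kept as-is.
    With it enabled, only whitelisted originals are kept, so the result
    may be empty.
    """
    if whitelist is None:
        return list(existing)
    return [genre for genre in existing if genre.lower() in whitelist]
```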
- Rename method from _combine_genres() to _combine_resolve_and_log() to
make clear that it not only combines new and old genres but also
resolves them (which in this plugin's wording means "do the magic" of
canonicalization, whitelist checking and reducing to a configured
genre count).
- Clarify in _resolve docstring that a possible outcome might be all
genres being removed.
- Add an additional log message telling which existing genres are taken
into account BEFORE "the magic happens".
- Rename _to_delimited_genre_string() to _format_and_stringify()
- Move count reduction logic to _resolve_genres()
- Fix and rename a test
I found that the translator would sometimes replace the pipe character
with another symbol (maybe it got confused thinking the character is
part of the text?).
Added spaces around the pipe to make it more clear that it's definitely
the separator.
The return type of the stage decorator should in theory be `T|None`
but the return of task types is not consistent in its usage. Would need
some bigger changes for which I'm not ready at the moment.
URL-encode additional item `fields` within generated EXTM3U playlists instead of JSON-encoding them.
This is because JSON-encoding additional fields/attributes made it difficult to parse the `EXTINF` line but using URL-encoding for these values makes parsing easy (because URL-encoded values cannot contain commas, quotation marks and spaces).
I introduced the generation of additional EXTM3U item fields earlier this year and I want to correct that now.
**Design/definition background:**
Unfortunately, I didn't find a clear definition of how additional playlist item attributes should be encoded - apparently there is none.
Given that item URIs within an M3U playlist can be URL-encoded already, defining the values of additional attributes to be URL-encoded is consistent design.
I didn't find examples of additional EXTM3U item attributes on the web where the attribute value contains a comma, space or quotation mark, only examples that specified numeric IDs and URLs as attribute values.
Because the URL attribute examples I found didn't contain URL-encoded characters and because it is more readable and unproblematic for parsing, I've let the attribute URL encoding treat `:` and `/` as safe characters.
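
A small sketch of the encoding rule, using `urllib.parse.quote` with `:` and `/` marked as safe. The helper names and the exact layout of the `EXTINF` line are assumptions for illustration, not necessarily what the plugin emits.

```python
from urllib.parse import quote


def encode_attr(value: object) -> str:
    """URL-encode an attribute value, leaving ':' and '/' readable."""
    return quote(str(value), safe="/:")


def extinf_line(duration: int, attrs: dict[str, object], label: str) -> str:
    """Build an EXTINF line with URL-encoded attribute values (assumed layout)."""
    encoded = "".join(f' {key}="{encode_attr(val)}"' for key, val in attrs.items())
    return f"#EXTINF:{duration}{encoded},{label}"


# Commas, spaces and quotation marks never appear in an encoded value,
# so the attribute section can be split without a JSON parser.
print(extinf_line(180, {"id": 42, "genre": 'Rock, "Live" set'}, "Artist - Title"))
```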
**Breaking change:**
While this is a breaking change in theory, in practice it is not since afaik all integrations of the smartplaylist plugin's additional EXTM3U item attribute generation feature (beets-webm3u) work with simple attribute values such as the item ID (numeric) whose formatting/encoding is not affected when changing from JSON to URL-encoding.
In other words, the change is backward-compatible with the beets-webm3u plugin (which I'll adjust correspondingly after this beets PR is merged).
Additionally, improve HTML pre-processing:
* Ensure a new line between blocks of lyrics text from letras.mus.br.
* Parse a missing last block of lyrics text from lacocinelle.net.
* Parse a missing last block of lyrics text from paroles.net.
* Fix encoding issues with AZLyrics by setting response encoding to
None, allowing `requests` to handle it.
* Type the response data that Google Custom Search API returns.
* Exclude some 'letras.mus.br' pages that do not contain lyrics.
* Exclude results from Musixmatch as we cannot access their pages.
* Improve parsing of the URL title:
- Handle long URL titles that get truncated (end with ellipsis) for
long searches
- Remove domains starting with 'www'
- Parse the title AND the artist. Previously this would only parse the
title, and fetch lyrics even when the artist did not match.
* Remove now redundant credits cleanup and checks for valid lyrics.
Tidy up 'Google.is_page_candidate' method and remove 'Google.sluggify'
method which was a duplicate of 'slug'.
Since 'GeniusFetchTest' only tested whether the artist name is cleaned
up (the rest of the functionality is patched), remove it and move its
test cases to the 'test_slug' test.
Having removed it I found that only the Genius lyrics changed: they had an
extra new line. Thus I defined a function 'collapse_newlines' which now
gets called for the Genius lyrics.
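
One plausible shape for such a helper, as a sketch: the commit does not spell out the exact rule, so reducing three or more consecutive newlines to a single blank line is an assumption.

```python
import re


def collapse_newlines(text: str) -> str:
    """Reduce runs of three or more newlines to exactly one blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)
```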
This commit introduces a distance threshold mechanism for the Genius and
Google backends.
- Create a new `SearchBackend` base class with a method `check_match`
that performs checking.
- Start using undocumented `dist_thresh` configuration option for good,
and mention it in the docs. This controls the maximum allowable
distance for matching artist and title names.
These changes aim to improve the accuracy of lyrics matching, especially
when there are slight variations in artist or title names, see #4791.
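
The thresholding idea can be sketched like this. `difflib.SequenceMatcher` is used as a stand-in for whatever string distance the plugin relies on, and the default threshold of `0.1` is an arbitrary illustrative value, not the documented default.

```python
from difflib import SequenceMatcher


def distance(a: str, b: str) -> float:
    """Case-insensitive distance in [0, 1]; 0 means identical."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def check_match(
    target_artist: str,
    target_title: str,
    artist: str,
    title: str,
    dist_thresh: float = 0.1,  # illustrative value only
) -> bool:
    """Accept a candidate only if both artist and title are close enough."""
    max_dist = max(distance(target_artist, artist), distance(target_title, title))
    return max_dist <= dist_thresh
```

Taking the larger of the two distances means a perfect title match cannot compensate for a wrong artist, which is the failure mode the threshold is meant to prevent.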
- Rename method _dedup_genre, since it's only used for
finalizing/polishing existing genres.
- Return separator-delimited string already.
- Decide on not passing "separator" to methods, it's a config
setting available throughout the plugin. Assign to variable where
useful for readability though.
- In the force branch, remove re-assigning keep_genres to empty list.
- Fix a test. Existing genres are "polished" now, which means:
configured title_case is applied.
- Fix/add type hints on all touched and new methods
- Adapt tests to _resolve_genres returning a list with not yet formatted genres.
- Rename and adapt test_count -> test_to_delimited_string. Note that the
new function does not apply whitelist, prefer anything. It just cuts
to count and formats!
- No idea where a missing separator (which is default) could
happen...just set it explicitly.
- Since we now refactored fetch_genre to return a list, we can
mock multiple fetched genres more easily.
I found that the `/get` endpoint often returns incorrect or unsynced
lyrics, while the `/search` endpoint returns more accurate options.
Thus I reversed the change in the previous commit to prioritize
searching first.
Adjust the base URL to perform a '/search' instead of attempting to
'/get' specific lyrics where we're unlikely to find lyrics for the
specific combination of album, artist, track names and the duration (see
https://lrclib.net/docs).
Since we receive an array of matching lyrics candidates, rank them by
their duration similarity to the item's duration, and whether they
contain synced lyrics.
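
A rough sketch of such a ranking. The `duration` and `syncedLyrics` keys are assumed to follow the LRCLib response format, and giving duration difference precedence over the synced-lyrics flag is an assumption about ordering, not a statement of what the plugin does.

```python
def rank_candidates(candidates: list[dict], target_duration: float) -> list[dict]:
    """Sort candidates: closest duration first, synced lyrics as tie-breaker."""

    def sort_key(candidate: dict) -> tuple[float, bool]:
        duration_diff = abs((candidate.get("duration") or 0) - target_duration)
        has_synced = bool(candidate.get("syncedLyrics"))
        # False sorts before True, so synced candidates win ties.
        return (duration_diff, not has_synced)

    return sorted(candidates, key=sort_key)
```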
Add explicit checks for lyrics texts fetched from the tested sources.
- Introduced `LyricsPage` class to represent lyrics pages for integrated
tests.
- Configured expected lyrics for each of the URLs that are being
fetched.
- Consolidated integrated tests in a new `TestLyricsSources` class.
- Mocked Google Search API to return the lyrics page under test.
Since at least one backend requires `album` and `duration` arguments
(`LRCLib`), the caller (`LyricsPlugin.fetch_item_lyrics`) must always
provide them.
Since they need to be provided, we enforce this by defining them as
positional arguments.
Why is this important? I found that integrated `LRCLib` tests have been
passing, but they called `LRCLib.fetch` with values for `artist` and
`title` fields only, while the actual functionality *always* provides
values for `album` and `duration` fields too.
When I adjusted the test to provide values for the missing fields,
I found that it failed. This makes sense: `album` and `duration`
filters are strict on LRCLib, so I was not surprised the lyrics could
not be found.
Thus I adjusted `LRCLib` backend implementation to only filter by each
of these fields when their values are truthy.
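
In sketch form, the adjusted filtering might look like the following. The function name is hypothetical and the parameter names are assumed to match the LRCLib query parameters; check them against the LRCLib docs before relying on them.

```python
def build_search_params(
    artist: str, title: str, album: str, duration: float
) -> dict[str, object]:
    """Always send artist and title; add strict filters only when truthy."""
    params: dict[str, object] = {"artist_name": artist, "track_name": title}
    if album:
        params["album_name"] = album
    if duration:
        params["duration"] = int(duration)
    return params
```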
Modified `search_pairs` function in `lyrics.py` to:
* Firstly strip each of `artist`, `artist_sort` and `title` fields
* Only generate alternatives if both `artist` and `title` are not empty
* Ensure that `artist_sort` is not empty and not equal to artist (ignoring
case) before appending it to the artists
Extended tests to cover the changes.
- Consolidated multiple test cases into parameterized tests for better
readability and maintainability.
- Simplified assertions by comparing lists of actual and expected
artists/titles.
- Added `unexpected_empty_artist` marker to handle cases which
unexpectedly return an empty artist. This seems to happen when the
`artist_sort` field is empty.
- Replaced unittest.mock with pytest fixtures for better test isolation and readability.
- Simplified test cases by using parameterized tests.
- Added `requests-mock` dependency to `pyproject.toml` and `poetry.lock`.
- Removed redundant helper functions and classes.
This utilises regex substitution in the substitute plugin. The previous
approach only used regex to match the pattern, then replaced it with a
static string. This change allows more complex substitutions, where the
output depends on the input.
### Example use case
Say we want to keep only the first artist of a multi-artist credit, as
in the following list:
```
Neil Young & Crazy Horse -> Neil Young
Michael Hurley, The Holy Modal Rounders, Jeffrey Frederick & The Clamtones -> Michael Hurley
James Yorkston and the Athletes -> James Yorkston
```
This would previously have required three separate rules, one for each
resulting artist. By using a regex substitution, we can get the desired
behaviour in a single rule:
```yaml
substitute:
  ^(.*?)(,| &| and).*: \1
```
(Capture the text until the first `,` ` &` or ` and`, then use that
capture group as the output)
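
The same idea expressed with Python's `re` module, as a sketch: `substitute` here is a hypothetical helper that applies only the first matching rule, mirroring the behaviour described in this PR rather than reproducing the plugin's code.

```python
import re

# One rule: (pattern, replacement). The replacement may reference capture groups.
RULES = [(re.compile(r"^(.*?)(,| &| and).*"), r"\1")]


def substitute(text: str, rules=RULES) -> str:
    """Apply the first rule whose pattern matches; leave other text unchanged."""
    for pattern, replacement in rules:
        if pattern.match(text):
            return pattern.sub(replacement, text)
    return text


for credit in (
    "Neil Young & Crazy Horse",
    "Michael Hurley, The Holy Modal Rounders, Jeffrey Frederick & The Clamtones",
    "James Yorkston and the Athletes",
):
    print(substitute(credit))
```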
### Notes
I've kept the previous behaviour of only applying the first matching
rule, but I'm not 100% sure it's the ideal approach.
I can imagine both cases where you want to apply several rules in
sequence and cases where you want to stop after the first match.
This PR refactors the test codebase by removing redundant functions and
simplifying item and album creation. Key changes include:
- Removed redundant `_item_ident` index tracker from `_common.py`.
- Removed `album` function from `_common.py` replacing it with direct
`library.Album` invocations.
- Removed `generate_album_info` and `generate_track_info` functions,
replacing them directly with `TrackInfo` and `AlbumInfo`.
- Updated `setup.cfg` to exclude test helper files from coverage
reports.
- Adjusted the tests regarding the changes, and simplified
`test_mbsync.py`.
These functions were used to generate mock data for tests but have been
replaced with direct instantiation of AlbumInfo and TrackInfo objects.
This change simplifies the test code and removes unnecessary helper
functions.
- Refactored Tekstowo backend to fetch lyrics directly from song pages.
- Added `encode` method to convert artist and title to their URL format,
where non-alphanumeric characters are replaced with underscores.
- Removed the now redundant search functionality and associated tests.
- Simplified `extract_lyrics` method to directly parse lyrics without
any checks.
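
A sketch of what such an `encode` step could look like. Replacing each non-word character one-for-one (rather than collapsing runs) is an assumption, and the real method may also handle case or diacritics differently.

```python
import re


def encode(text: str) -> str:
    """Replace every non-alphanumeric character with an underscore."""
    return re.sub(r"\W", "_", text)
```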
Since the `for_artist` keyword has been removed from
`ftintitle.contains_feat`, the unit tests need to be updated.
This includes the deletion of the test cases that test the
`for_artist=True` delimiters.
The previous version of the `plugins.feat_tokens` regular expression
only matched "feat. X" parts if preceded by a space. This caused missed
detections in the `ftintitle.contains_feat` function.
This commit adds unit tests for the updated regex that also matches
"feat. X" parts within parentheses and brackets
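
To show the difference in matching, here is an illustrative pattern. It is not the actual `plugins.feat_tokens` expression, and the token list is a small assumed subset.

```python
import re

# The token must follow whitespace or an opening parenthesis/bracket,
# and must itself be followed by whitespace.
FEAT = re.compile(r"(?<=[\s(\[])(?:feat\.?|ft\.?|featuring)(?=\s)", re.IGNORECASE)


def contains_feat(title: str) -> bool:
    """Return True if the title carries a 'feat.'-style credit."""
    return bool(FEAT.search(title))
```

Requiring a preceding delimiter keeps words such as "Defeat" from being mistaken for a featured-artist marker.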
* Replace `noqa` comments in `assert...` method definitions with
a configuration option to ignore these names.
* Use the `__all__` variable to specify importable items from the
module, replacing `*` imports and `noqa` comments for unused imports.
* Address issues with poorly named variables and methods by renaming
them appropriately.
- Fix imports
- Fix pytest issues
- Do not assign lambda as variable
- Use isinstance instead of type to check type
- Rename ambiguously named variables
- Name custom errors with Error suffix
A constant `preload_plugin` is used to disable loading the plugin in the
`setUp` initialisation, allowing the plugin to be loaded manually by the
tests.
Also added a cleanup instruction to remove listeners from configured
plugins, and removed this logic from several tests.