Adjust the base URL to perform a '/search' instead of attempting to
'/get' specific lyrics where we're unlikely to find lyrics for the
specific combination of album, artist, track names and the duration (see
https://lrclib.net/docs).
Since we receive an array of matching lyrics candidates, rank them by
their duration similarity to the item's duration, and whether they
contain synced lyrics.
## Description
Fixes#2635Fixes#5133
I realised that #5406 has gotten too big, thus I'm splitting it into
several smaller PRs.
This PR refactors lyrics plugin tests and fixes an empty metadata issue
in the lyrics logic.
#### CI
- Added `--extras=lyrics` to the Poetry install command to include the
lyrics plugin dependencies.
- In the main task which measures coverage, set `LYRICS_UPDATED`
environment variable based on changes detected in the lyrics files.
#### Test setup
- Introduced `ConfigMixin` to centralize configuration setup for tests,
reducing redundancy. This can be used by tests based on `pytest`.
#### Lyrics logic
- Trimmed whitespace from `item.title`, `item.artist`, and
`item.artist_sort` in `search_pairs` function.
- Added checks to avoid searching for lyrics if either the artist or
title is missing.
- Improved `_scrape_strip_cruft` function to remove Google Ads tags and
unnecessary HTML tags.
#### Lyrics tests overhaul
- Migrated lyrics tests to use `pytest` for better isolation and
configuration management.
- Deleted redundant lyrics text files and some unused utils.
- Marked tests that should only run when lyrics source code is updated
(`LYRICS_UPDATED` is set from the CI) using the `on_lyrics_update`
marker.
#### Documentation and Dependencies
- Added `requests-mock` version `1.12.1` to `pyproject.toml` and
`poetry.lock` for mocking HTTP requests in tests.
- Updated `setup.cfg` to include a new marker `on_lyrics_update`.
Add explicit checks for lyrics texts fetched from the tested sources.
- Introduced `LyricsPage` class to represent lyrics pages for integrated
tests.
- Configured expected lyrics for each of the URLs that are being
fetched.
- Consolidated integrated tests in a new `TestLyricsSources` class.
- Mocked Google Search API to return the lyrics page under test.
Since at least one Backend requires album` and `duration` arguments
(`LRCLib`), the caller (`LyricsPlugin.fetch_item_lyrics`) must always
provide them.
Since they need to provided, we need to enforce this by defining them as
positional arguments.
Why is this important? I found that integrated `LRCLib` tests have been
passing, but they called `LRCLib.fetch` with values for `artist` and
`title` fields only, while the actual functionality *always* provides
values for `album` and `duration` fields too.
When I adjusted the test to provide values for the missing fields,
I found that it failed. This makes sense: Lib `album` and `duration`
filters are strict on LRCLib, so I was not surprised the lyrics could
not be found.
Thus I adjusted `LRCLib` backend implementation to only filter by each
of these fields when their values are truthy.
Modified `search_pairs` function in `lyrics.py` to:
* Firstly strip each of `artist`, `artist_sort` and `title` fields
* Only generate alternatives if both `artist` and `title` are not empty
* Ensure that `artist_sort` is not empty and not equal to artist (ignoring
case) before appending it to the artists
Extended tests to cover the changes.
- Consolidated multiple test cases into parameterized tests for better
readability and maintainability.
- Simplified assertions by comparing lists of actual and expected
artists/titles.
- Added `unexpected_empty_artist` marker to handle cases which
unexpectedly return an empty artist. This seems to be happen when
`artist_sort` field is empty.
- Replaced unittest.mock with pytest fixtures for better test isolation and readability.
- Simplified test cases by using parameterized tests.
- Added `requests-mock` dependency to `pyproject.toml` and `poetry.lock`.
- Removed redundant helper functions and classes.
Two google sources failed to return the expected output. I looked into
each case why parsing failed:
- lyrics on musica.com contain <aside> Google Ads
- each lyrics line on lacoccinelle.net is wrapped within alternating
<em> and <strong> tags
Thus remove these tags as part of the HTML cleanup logic.
## Description
Fixes a `sqlite3.OperationalError` that occurs when querying *any* field
and a field that only exists in the relation table. For exaple, trying
to list albums that contain **keyword** and contain a track with **foo**
in its title:
```
beet list -a keyword title:foo
```
## Root Cause
SQLite fails when JOINs contain ambiguous column references. This
happened because:
- *any* Album field search looks at `album`, `albumartist` and `genre`
fields.
- The second part of the query `title:foo` queries a field in the
`items` table, which got
JOINed with `albums`
- Some fields (like `album`) exist in both `items` and `albums` tables,
thus SQLite couldn't resolve which table's column to use
## Changes
- Centralize query construction in `LibModel` with consistent table
qualification
- Add methods:
- `field_query()` - Creates table-qualified field queries
- `any_field_query()` - Creates multi-field OR queries
- `any_writable_media_field_query()` - Similar to the above but for BPD
/ media files
- `match_all_query()` - Creates multi-field AND queries
- Remove `AnyFieldQuery` in favor of composed `OrQuery`
- Add tests for shared field querying
In order to include the table name for fields in this query, use the
`field_query` method.
Since `AnyFieldQuery` is just an `OrQuery` under the hood, remove it and
construct `OrQuery` explicitly instead.
Unify query creation logic from
- queryparse.py:construct_query_part,
- Model.field_query,
- DefaultTemplateFunctions._tmpl_unique
to a single implementation under `LibModel.field_query` class method.
This method should be used for query resolution for model fields.
Adjust Sphinx configuration and release script
Changes:
- Update `source_suffix` in docs/conf.py to use dict format as required
by newer Sphinx
versions
- Fix typo in theme config: `pygment_light_style` ->
`pygments_light_style`
- Improve release script to handle Sphinx intersphinx output formatting:
- Replace tabs with spaces for consistent parsing
- Add reference to sphinx.ext.intersphinx command in docstring
- Update line parsing to match space-indented format
- Restructure CI workflow to ensure relevant dependencies are installed
and changelog formatting is tested.
These changes ensure compatibility with newer Sphinx versions and
improve the robustness
of the release script's reference parsing.
Edit:
While working on this PR GitHub decided to upgrade their linux/ubuntu
runners with `Ubuntu 24.04`. Ubuntu upgrades are the worst: imagine not
updating your system for 3 years and then suddenly upgrading
**everything** to the most recent versions. At which point you start
realizing that the final **S** in their **LTS** means __Suffering__
instead of __Support__.
We use just a few system dependencies in our builds and as expected,
things broke:
* `libcairo2-dev` now needs to be installed for pygobject.
* `pandoc` `rst` -> `markdown` conversion output has changed
* Completion tests are unhappy about `bash` / `bash-completion` upgrade,
and I could not figure out why so I'm just `xfail`ing that test in CI.
Ubuntu version in GitHub Actions has recently been upgraded to 24.04:
https://github.com/actions/runner-images/issues/10636)
This meant that pandoc was upgraded and it changed the way markdown is
formatted by default.
- Fix `lastgenre -A` in combination with config option `source: track`
(_Tracks inherited the album's genre even when this option was set_)
- Now, When an album-level genre is set already, single tracks don't
fall back to the album's genre and request their own last.fm genre.
- Fix log-level and message wording being slightly different for
`source:` track, album, artist genre
- Now log messages follow the same wording, level and structure
throughout.
- Printing out album/item in default format could lead to unreadable
clutter depending on the user's configured formats.
- The album's name and the individual tracks' title should be just
sufficient to provide context as well readability.
- Log like this while importing as well as in standalone runs.
It was rather confusing that the lastgenre plugin, when handling
singletons, sometimes showed that it applied genres from last.fm and
sometimes didn't (it did only in debug log). This streamlines the
behaviour:
- Change debug to info log.
- Streamline wording.
- Display details about the track.
When `lastgenre.source: track` is configured,
- `lastgenre -a` _should not_ fall back to the album level genre (by
making use of the with_album=False kwarg of the Libary's get method).
- `lastgenre -a`, when finally storing the genres of _an album_, should
_not_ also write the tracks genres (by making use of the inherit=False
kwarg of the Album's store method.
# Improve release notes formatting / changelog conversion from rst to md
During our last release, we discovered issues with changelog formatting.
This PR improves and fixes several aspects:
## Changes
- Rewrite the changelog conversion logic to be more robust and
maintainable
- Fix indentation issues with nested bullet points
- Improve handling of long section headers
- Order bullet points alphabetically within sections for better
readability
- Use Sphinx `objects.inv` to resolve references and include links to
the documentation in _Markdown_
- Add tests to prevent formatting regressions
- Add pandoc as a dependency for Ubuntu CI builds
- Ensure documentation is built before generating changelog
## Problem
A regression was introduced when adjusting the track matching logic to
use `lapjv` instead of `munkres`. The `lapjv` algorithm returns `-1` for
unmatched items, which wasn't being handled correctly in the matching
logic. This caused incorrect track assignments when importing new music.
## Solution
- Modified the mapping creation to filter out unmatched items (where
index is `-1`)
- Updated test case to properly catch this scenario
## Description
Fixes#5512. When sorting on a field, if the field is missing from some
items and it has a type, use the type's `null` value. Otherwise,
continue to fall back to an empty string, as I don't think there's much
to be done in that case.
The new test `test_int_field_present_in_some_items` fails without the
fix in `query.py`.
These tests depend on certain `track_length_grace` and
`track_length_max` configuration which was set by other tests in this
module.
I discovered this issue when I tried to run
`test_order_works_when_track_names_are_entirely_wrong` test only
- I found that my local configuration was read and the test failed.
I had previously tested the `munkres` -> `lapjv` replacement
extensively, so I was today surprised to find that nothing gets matched
correctly when I tried importing some new tracks.
On the other hand I now remember making a small adjustment in the logic
to make autotagging tests pass which is when I introduced a bug: I did
not realize that `lapjv` returns index '-1' for each unmatched item.
This issue did not get caught by tests because this 'unmatched' item
index '-1' anecdotally ended up pointing to the last (expected) item in
the test making it pass.
This commit adjusts the aforementioned test to catch this issue and
fixes the logic to correctly identify unmatched tracks.