bs4 scraping routine has been made more generic,
relying less on specific markup tags.
Better algorithm to detect which url titles match
song titles: domain names are now removed from url
titles.
Use regex to decimate \n in fetched lyrics.
This also brought to light the fact that the distance calculations for tracks
was incorrect because there's no source field on TrackInfo objects yet. This
needs to be fixed.
The new Distance object knows how to perform various types of distance
calculations (expression, equality, number, priority, string).
It will keep track of each individual penalty that has been applied so
that we can utilise that information in the UI and when making decisions
about the recommendation level.
We now display the top 3 penalties (sorted by weight) on the release
list (and "..." if there are more than 3), and we display all penalties
on the album info line and track change line.
The implementation of the `max_rec` setting has been simplified by
removing duplicate validation and instead looking at the penalties that
have been applied to a distance. As a result, we can now configure a
maximum recommendation for any penalty that might be applied.
We have a few new checks when calculating album distance:
`match: preferred: countries` and `match: preferred: media` can each be
set to a list of countries and media in order of your preference. These
are empty by default. A value that matches the first item will have no
penalty, and a value that doesn't match any item will have an unweighted
penalty of 1.0.
If `match: preferred: original_year` is set to "yes", beets will apply
an unweighted penalty of 1.0 for each year of difference between the
release year and the original year.
We now configure individual weights for `mediums` (disctotal), `label`,
`catalognum`, `country` and `albumdisambig` instead of a single generic
`minor` weight. This gives more control, but more importantly separates
and names the applied penalties so that the UI can convey exactly which
fields have contributed to the overall distance penalty.
Likewise, `missing tracks` and `unmatched tracks` are penalised and
displayed in the UI separately, instead of a combined `partial` penalty.
Display non-MusicBrainz source in the disambiguation string, and
"source" in the list of penalties if a release is penalised for being
a non-MusicBrainz.
As outlined in #299, we broke many places in the code that were expecting
_album_for_id and _track_for_id to return a single item rather than a list. I
took this opportunity to divide up the interface: there's now one function for
MBIDs (returning a single object or None) and one for generic IDs (returning a
list).
It's possible that these plugins might require a little more thought
into how they should work (or not work) with potentially multiple albums
matching an idea, and potentially those those albums being an incorrect
match (e.g. if two data source plugins share a simple numerical ID
format).
An earlier change (due to @pedros) added the ability for plugins to define
template fields that work with Albums as well as Items. This enables some
cool new use cases but required that every template field definition check the
type of its arguments. Instead, this iteration on the idea distinguishes
between fields meant for Items and those meant for Albums.
In addition to simplifying the implementation of these functions, this also
enables the creation of album fields with identical names to item fields.
(For example, a user contacted me recently about adding a $bitrate field for
albums, which would be the average bitrate of the items. They can do this now
using a plugin.)
I also changed the docs to stop using the decorator approach to registering
template fields. We're moving toward removing those.
Otherwise, this was penalizing all album matches.
This also applies the same treatment as the previous commit: only load the
config at run time (at the expense of code clutter).
Discogs (and other individual plugins) should return a penalty between
0.0 and 0.1, which will be multiplied by the general source weight. This
allows individual plugins to be weighted against relative to each other,
and amplified together against other penalty categories (track, title,
etc.)
Some albums have a single disc with positions like I, II, III, IV, etc.
Previously beets thought each track on these albums was a new medium.
Now we assume that if there is no explicit medium index and the ordinal
of an alpha medium does not appear to be sequential (e.g. A, B, C) that
the medium is actually the medium index.