A second base class, LibModel, maintains a reference to the Library and should
take care of database-related tasks like load and store. This is the beginning
of the end of the terrible incongruity between Item and Album objects (only
the latter had a library reference). More refactoring to come.
One large side effect: Album objects no longer automatically store
modifications. You have to call album.store(). Several places in the code
assume otherwise; they need cleaning up.
ResultIterator is now polymorphic (it takes a type parameter, which must be a
subclass of LibModel).
In preparation for enabling queries over flexattrs, this is a new path that
lets queries avoid generating SQLite expressions altogether. Any query that
can be completely evaluated in SQLite will be, but when it can't, we now fall
back to running the entire query in Python by selecting everything from the
database and running the `match` predicate.
To begin with, this mechanism replaces RegisteredFieldQueries, which
previously used Python callbacks for evaluation. Now they just indicate that
they're slow queries and the query system falls back automatically.
This has the great upside that it lets use implement arbitrarily complex
queries without shoehorning everything into SQLite when that (a) is way too
complicated and (b) doesn't buy us much performance anyway. The obvious
drawback is that any code dealing with queries now has to handle two cases
(slow and fast).
In the future, we could optimize this further by combing fast and slow query
styles. For example, if you want to match with a substring *and* a regular
expression, we can do a first pass in SQLite and apply the regex predicate on
the results. Avoided for now because premature optimization, etc., etc.
Next step: implement flexattr matches as slow queries.
bs4 scraping routine has been made more generic,
relying less on specific markup tags.
Better algorithm to detect which url titles match
song titles: domain names are now removed from url
titles.
Use regex to decimate \n in fetched lyrics.
This also brought to light the fact that the distance calculations for tracks
was incorrect because there's no source field on TrackInfo objects yet. This
needs to be fixed.
The new Distance object knows how to perform various types of distance
calculations (expression, equality, number, priority, string).
It will keep track of each individual penalty that has been applied so
that we can utilise that information in the UI and when making decisions
about the recommendation level.
We now display the top 3 penalties (sorted by weight) on the release
list (and "..." if there are more than 3), and we display all penalties
on the album info line and track change line.
The implementation of the `max_rec` setting has been simplified by
removing duplicate validation and instead looking at the penalties that
have been applied to a distance. As a result, we can now configure a
maximum recommendation for any penalty that might be applied.
We have a few new checks when calculating album distance:
`match: preferred: countries` and `match: preferred: media` can each be
set to a list of countries and media in order of your preference. These
are empty by default. A value that matches the first item will have no
penalty, and a value that doesn't match any item will have an unweighted
penalty of 1.0.
If `match: preferred: original_year` is set to "yes", beets will apply
an unweighted penalty of 1.0 for each year of difference between the
release year and the original year.
We now configure individual weights for `mediums` (disctotal), `label`,
`catalognum`, `country` and `albumdisambig` instead of a single generic
`minor` weight. This gives more control, but more importantly separates
and names the applied penalties so that the UI can convey exactly which
fields have contributed to the overall distance penalty.
Likewise, `missing tracks` and `unmatched tracks` are penalised and
displayed in the UI separately, instead of a combined `partial` penalty.
Display non-MusicBrainz source in the disambiguation string, and
"source" in the list of penalties if a release is penalised for being
a non-MusicBrainz.
As outlined in #299, we broke many places in the code that were expecting
_album_for_id and _track_for_id to return a single item rather than a list. I
took this opportunity to divide up the interface: there's now one function for
MBIDs (returning a single object or None) and one for generic IDs (returning a
list).
It's possible that these plugins might require a little more thought
into how they should work (or not work) with potentially multiple albums
matching an idea, and potentially those those albums being an incorrect
match (e.g. if two data source plugins share a simple numerical ID
format).
An earlier change (due to @pedros) added the ability for plugins to define
template fields that work with Albums as well as Items. This enables some
cool new use cases but required that every template field definition check the
type of its arguments. Instead, this iteration on the idea distinguishes
between fields meant for Items and those meant for Albums.
In addition to simplifying the implementation of these functions, this also
enables the creation of album fields with identical names to item fields.
(For example, a user contacted me recently about adding a $bitrate field for
albums, which would be the average bitrate of the items. They can do this now
using a plugin.)
I also changed the docs to stop using the decorator approach to registering
template fields. We're moving toward removing those.
Otherwise, this was penalizing all album matches.
This also applies the same treatment as the previous commit: only load the
config at run time (at the expense of code clutter).
Discogs (and other individual plugins) should return a penalty between
0.0 and 0.1, which will be multiplied by the general source weight. This
allows individual plugins to be weighted against relative to each other,
and amplified together against other penalty categories (track, title,
etc.)
Some albums have a single disc with positions like I, II, III, IV, etc.
Previously beets thought each track on these albums was a new medium.
Now we assume that if there is no explicit medium index and the ordinal
of an alpha medium does not appear to be sequential (e.g. A, B, C) that
the medium is actually the medium index.
This is a refactor of the plugin developed by `imenem`.
- Pass `artist`, `album` and `va_likely` to `candidates()` so that
plugins don't have to work this out from `items` all over again.
- Pass `artist` and `title` to `item_candidates()`.
- Silence spurious `urllib3` info log lines.
- Use a proper "beets" user agent with `discogs_client`.
- Remove `abstract_search` plugin. It seems unnecessary. How many
music databases are there? How many will beets support? How much
common code might there be between them? We can add some abstraction
if or when more databases are supported.
- Derive more AlbumInfo and TrackInfo properties from discogs Release
objects, especially album ID so that beets doesn't just use the first
release and think all subsequent releases are duplicates.
- Add basic documentation, doc strings and code comments.
- Sanitise search query. Remove non-word characters and medium info that
might filter out good search results.
- Use artist `join` strings from discogs Release object when an album
or track has multiple artists.
- Don't rely on discogs track position, which is unreliable. But tracks
are in order, so we can recalculate medium and medium_index as long as
we can extract a consistent medium across tracks from the position.
- Add "various" as a known signal to indicate various artists.
- Prevent `chroma` plugin from returning a a huge track distance for any
track that is missing an ID (e.g. all discog tracks).
- `TrackInfo.index` should be the release index (calculated by beets),
not the medium index (derived from discogs track position).
- Add `AlbumInfo.data_source`. It's "Unknown" by default which is shown
in red when displaying a suggested or selected match. The built in
auto tagger sets it to "MusicBrainz" which is shown in green. Anything
else (e.g. "Discogs") is shown in yellow.
- Remove double spaces from album titles (bad data from Discogs).
This slight modification to the selection algorithm avoids the situation in
which too many objects are chosen for a given artist and fewer than N objects
are eventually returned. We do this by implementing "selection without
replacement" literally: we choose objects one at a time and pop them from the
population when they are selected.
The main change here, aside from documentation/naming updates, is that we skip
"duplicates" that arise from albums/tracks that are missing their MBIDs.
Add a 'fallback' option to facilitate working around the 100 queries/day google
limit by marking files as 'visited' so they are not considered for lyrics search
on the next beet run.
I've put my own google_engine_ID as default value in the code but could be
reconsidered, this engine contains databases known to be scrappable by the
plugin algorithm though.
The initial idea for this refactor was motivated by the need to make
PluginQuery.match() have the same method signature as the match() methods on
other queries. That is, it needed to take an *item*, not the pattern and
value. (The pattern is supplied when the query is constructed.) So it made
sense to move the value-to-pattern code to a class method.
But then I realized that all the other FieldQuery subclasses needed to do
essentially the same thing. So I eliminated PluginQuery altogether and
refactored FieldQuery to subsume its functionality. I then changed all the
other FieldQuery subclasses to conform to the same pattern.
This has the side effect of allowing different kinds of queries (even
non-field queries) down the road.
This turns on metadata-writing based on the import.write config option, so
those with this option turned off will be spared any surprises. (Affects #217
and #143.)
This avoids naming conflicts in the source directory. In particular, when
encoding MP3 -> MP3, the previous scheme would overwrite the original file
(and hang ffmpeg waiting for input). This should also work in
situations where the source directory is read-only.
I introduced a regression a few commits ago when I started using
lib.destination with the basedir keyword argument as opposed to doing
os.path.join manually.
The major functional change here is how files move around when in keep_new
mode. Now, files are first moved to the destination directory and then
copied/transcoded back into the library.
This avoids problems where naming conflicts could occur when transcoding from
MP3 to MP3 (and thus not changing the filename).
As _print_and_apply_changes itself does for items, we now shortcut
modifications (metadata and filesystem) for albums when no changes are
required for a given album. This avoids effectively doing a "beet move" on an
album even when nothing has changed.
This change uses _album_for_id and _track_for_id instead of the full
autotag.match.* functions. This should be faster (requiring fewer calls to the
MusicBrainz API) while also being more predictable. It also won't, for
example, use acoustic fingerprinting even if the chroma plugin is installed.
Finally, this change catches the error case in which MBIDs are erroneous. This
can happen, for example, if the user has some track MBIDs left over from
before the NGS transition.
The main change here is to use shorter transactions -- one per matching entity
-- rather than one large one. This avoids very long transactions when the
network happens to move slowly.
this plugin provides a faster way to query new metadata from
musicbrainz. (instead of having to 're-import' the files)
Currently it lacks all forms of documentation and will only work for
album queries. not really tested so far so be careful
The lastgenre command should always log what it's doing so the user can see
the progress being made. If you really don't want any output, just pipe to
/dev/null.
- Remove "part", "volume", "vol." multi-disc markers. These are often
part of album titles, and not necessarily indicative of a multi-disc
album. Only look for "CD X" and "disc X" (case insensitive), ignoring
white space and other non-word characters.
- Don't only expect each disc to be in a subdirectory of a common parent
directory, with all siblings belonging to the same release. Also match
any consecutive siblings (even when the parent contains other albums)
that are named with the same prefix and multi-disc marker.
- The `albums_in_dir(path)` function now always yields a list of paths
along with each list of items. `ItemTask.path` is now always a list of
paths.
- The `displayable_path(path)` function now accepts a list of paths, and
will join them with "; " by default. This can be changed with the
`separator` argument.
- The `sorted_walk()` function now does a case insensitive sort on
directories, but still returns case sensitive results. This allows
better multi-disc album detection.
- The `art_for_album()` function now takes a list of paths as its second
argument, instead of a single path.
This provides a default for source, preventing a crash when not present in the
user's config.
It also refactors the source decision to a helper function, _lastfm_obj, to
avoid copypasta.
I'm transitioning to using exclusively instance-level fields instead of
class-level fields in plugin objects, but I neglected to bring inline and
rewrite into the future. This manifested as silent inaction on the part of
these plugins.
This change restores the old behavior (for compatibility) but also updates the
plugins to use the new behavior.
If config is set to:
lastgenre:
source: artist
The genre will be fetched for the artist, rather than the album. This
allows for filesystem org like:
genre/artist/album
Currently defaults to previous behaviour for anything other than
`artist`
* Adds support for importing/managing .wma and .asf files
* Adds support for all available ASF tag equivalents
* Adds two utility methods for (un)packing embedded ASF pictures
* Modifies scrub plugin to work around the lack of a delete method on
ASFTags object.