Setting the default preferred_media to null is more like previous versions.
This way, as digital becomes more popular, we aren't stuck with a default
configuration that prefers an outdated format.
Saves paranoid and interested users from having to either force all max
recommendations to none or constantly go back to candidate selection
from a recommendation to see if there is another slightly less similar
but more preferred (by the user) candidate.
The new Distance object knows how to perform various types of distance
calculations (expression, equality, number, priority, string).
It will keep track of each individual penalty that has been applied so
that we can utilise that information in the UI and when making decisions
about the recommendation level.
We now display the top 3 penalties (sorted by weight) on the release
list (and "..." if there are more than 3), and we display all penalties
on the album info line and track change line.
The implementation of the `max_rec` setting has been simplified by
removing duplicate validation and instead looking at the penalties that
have been applied to a distance. As a result, we can now configure a
maximum recommendation for any penalty that might be applied.
We have a few new checks when calculating album distance:
`match: preferred: countries` and `match: preferred: media` can each be
set to a list of countries and media in order of your preference. These
are empty by default. A value that matches the first item will have no
penalty, and a value that doesn't match any item will have an unweighted
penalty of 1.0.
If `match: preferred: original_year` is set to "yes", beets will apply
an unweighted penalty of 1.0 for each year of difference between the
release year and the original year.
We now configure individual weights for `mediums` (disctotal), `label`,
`catalognum`, `country` and `albumdisambig` instead of a single generic
`minor` weight. This gives more control, but more importantly separates
and names the applied penalties so that the UI can convey exactly which
fields have contributed to the overall distance penalty.
Likewise, `missing tracks` and `unmatched tracks` are penalised and
displayed in the UI separately, instead of a combined `partial` penalty.
Display non-MusicBrainz source in the disambiguation string, and
"source" in the list of penalties if a release is penalised for being
a non-MusicBrainz.
An earlier change (due to @pedros) added the ability for plugins to define
template fields that work with Albums as well as Items. This enables some
cool new use cases but required that every template field definition check the
type of its arguments. Instead, this iteration on the idea distinguishes
between fields meant for Items and those meant for Albums.
In addition to simplifying the implementation of these functions, this also
enables the creation of album fields with identical names to item fields.
(For example, a user contacted me recently about adding a $bitrate field for
albums, which would be the average bitrate of the items. They can do this now
using a plugin.)
I also changed the docs to stop using the decorator approach to registering
template fields. We're moving toward removing those.
That's 371cc72f2d09 in hg. This makes the patch slightly more general by
reusing our type conversion infrastructure. It also uses "bytes" as a synonym
for "str" that I find a little bit clearer.
This is a refactor of the plugin developed by `imenem`.
- Pass `artist`, `album` and `va_likely` to `candidates()` so that
plugins don't have to work this out from `items` all over again.
- Pass `artist` and `title` to `item_candidates()`.
- Silence spurious `urllib3` info log lines.
- Use a proper "beets" user agent with `discogs_client`.
- Remove `abstract_search` plugin. It seems unnecessary. How many
music databases are there? How many will beets support? How much
common code might there be between them? We can add some abstraction
if or when more databases are supported.
- Derive more AlbumInfo and TrackInfo properties from discogs Release
objects, especially album ID so that beets doesn't just use the first
release and think all subsequent releases are duplicates.
- Add basic documentation, doc strings and code comments.
- Sanitise search query. Remove non-word characters and medium info that
might filter out good search results.
- Use artist `join` strings from discogs Release object when an album
or track has multiple artists.
- Don't rely on discogs track position, which is unreliable. But tracks
are in order, so we can recalculate medium and medium_index as long as
we can extract a consistent medium across tracks from the position.
- Add "various" as a known signal to indicate various artists.
- Prevent `chroma` plugin from returning a a huge track distance for any
track that is missing an ID (e.g. all discog tracks).
- `TrackInfo.index` should be the release index (calculated by beets),
not the medium index (derived from discogs track position).
- Add `AlbumInfo.data_source`. It's "Unknown" by default which is shown
in red when displaying a suggested or selected match. The built in
auto tagger sets it to "MusicBrainz" which is shown in green. Anything
else (e.g. "Discogs") is shown in yellow.
- Remove double spaces from album titles (bad data from Discogs).
I've removed the -p option. The command now always shows plugin-provided
template fields if any are available. We also avoid printing out blank lines
for plugins that don't provide fields.
The major functional change here is how files move around when in keep_new
mode. Now, files are first moved to the destination directory and then
copied/transcoded back into the library.
This avoids problems where naming conflicts could occur when transcoding from
MP3 to MP3 (and thus not changing the filename).
This would trigger a warning in Unidecode when metadata was missing (which is
the only case when those empty-string literals are used). Closes#109, which
is a different fix for the same problem.
- Partial matches are always downgraded to a "medium" match.
- The config option, now called "default_action", lets you choose what to do
with "medium" matches.
- Expanded the "low" recommendation level to include cases with just one
match.
Due mostly to some improvements in Confit, we now have a reasonable way to
define the default filenames of auxiliary data files. These are relative to the
beets config directory (i.e., alongside config.yaml).
Renamed fuzzy_search to fuzzy and rdm to random. These names should be easier
to remember since they are the same as the commands they provide.
--HG--
rename : beetsplug/fuzzy_search.py => beetsplug/fuzzy.py
rename : beetsplug/rdm.py => beetsplug/random.py
rename : docs/plugins/fuzzy_search.rst => docs/plugins/fuzzy.rst
rename : docs/plugins/rdm.rst => docs/plugins/random.rst
The changes introduced in rc1 caused paths to be syspath-ified before they were
passed to os.path.abspath. The magic prefix caused them to be interpreted as
absolute paths even if they were relative. The fix is, in this *isolated*
case, to use Unicode but prefix-free paths in calls to the os.path.* functions.
Those functions need to act on Unicode objects but seem to be purely syntactic
-- nothing is tripped up by using long filenames without the magic prefix.
Also pertaining to #58: for most utility functions, paths should *not* be
`syspath`-ified. (This only occurs right before a path is sent to the OS.) In
fact, as @Wessie discovered, using the result of `syspath` with `ancestry` leads
to incorrect behavior. I checked and this should not currently happen anywhere,
but these docstring changes make that requirement explicit.
With the new centralized print_obj function, we can greatly simplify the code
for the list command. This necessitated a couple of additional tweaks:
- For performance reasons, print_obj can now take a compiled template. (There's
still an issue with using the default/configured template, but we can cross
that bridge later).
- When listing albums, $path now expands to the album's item dir. So the format
string '$path' now exactly corresponds to passing the -p switch.
As an added bonus, we can now also reduce copypasta in the random plugin (which
behaves almost exactly the same as list).
This version of the (renamed) _print_obj function uses introspection to
determine whether we're printing an Album or an Item. It's like function
overloading for Python! 😁
This is fixed by allowing MediaFiles to convert strings to integers on
assignment. An eventual complete fix will perform these type conversions in the
Item interface.
Eliminate the __subclasses__ trick for finding all plugins. Now we explicitly
look in each plugin module for a plugin class. This allows us to import plugin
modules with unintentionally loading them. This lets us reuse the image
embedding machinery without copypasta.
When we store paths in the database, we always use bytestrings for consistency.
But on Windows, these paths are converted back to Unicode before they reach the
FS API. This means that the codec used internally is immaterial.
However, we were naively using sys.getfilesystemencoding() for this internal
representation. On Windows, this is MBCS, a broken encoding that can't represent
all of Unicode. This change replaces that with UTF-8, a "real" codec.
The decoding bit now tries UTF-8 and falls back to MBCS for compatibility with
existing databases. The reality, however, is that existing databases may not
work with this change -- a byte string may represent something different in
UTF-8 from what it represents in MBCS. So users should recreated their DBs if
anything goes wrong.
This necessitated a slight refactoring in the plugin event handling mechanism.
Rather than loading all handlers up front and storing them in a module-scope
structure, we now scan for event handlers at every send(). This is probably
very slightly less efficient but allows for more flexible logic.
The shutil.move() function attempts to copy metadata (e.g., permissions and
mtime) when copying a file across filesystems. This always fails on Samba shares
because the utime() call is never permitted by normal users. We don't care about
preserving mtimes across moves, though, so this commit eschews shutil and
reimplements the move algorithm.
Previously, exceptions while communicating with the MusicBrainz API would bring
down the entire autotagging process. These errors are depressingly common, so we
now handle them and log errors to the console (much as we already do with any
exception raised by the Mutagen module). Fault isolation!
This has the added side-effect of giving better context for MB errors when they
do happen -- the logged errors now show the query that was running when MB
failed.
In an attempt to finally address the longstanding SQLite locking issues, I'm
introducing a way to explicitly, lexically scope transactions. The Transaction
class is a context manager that always fully fetches after SELECTs and
automatically commits on exit. No direct access to the library is allowed, so
all changes will eventually be committed and all queries will be completed. This
will also provide a debugging mechanism to show where concurrent transactions
are beginning and ending.
To support composition (transaction reentrancy), an internal, per-Library stack
of transactions is maintained. Commits only happen when the outermost
transaction exits. This means that, while it's possible to introduce atomicity
bugs by invoking Library methods outside of a transaction, you can conveniently
call them *without* a currently-active transaction to get a single atomic
action.
Note that this "transaction stack" concepts assumes a single Library object per
thread. Because we need to duplicate Library objects for concurrent access due
to sqlite3 limitation already, this is fine for now. Later, the interface should
provide one transaction stack per thread for shared Library objects.
Previously, all files would be moved/copied/deleted before the corresponding
Items and Albums were added to the database. Now, the in-place items are added
to the database; the files are moved; and then the new paths are saved to the
DB. The apply_choices coroutine now executes two database transactions per task.
This has a couple of benefits:
- %aunique{} requires album structures to be in place before the destination()
call, so this now works as expected.
- As an added bonus, the "in_album" parameter to move() and destination() --
along with its associated ugly hacks -- is no longer required.
- Copying and moving are mutually exclusive. Moving overrides copying so the
user only has to add one line ("import_move: true") to disable copying and
enable moving in its place.
- Deleting is only possible when copying.
- Deprecating the "delete" option (moving is almost always better).
- Removed command-line switch for moving. It's somewhat "unsafe", so this
removes some potential for accidental irreversible changes.
- Changelog & thanks.
- Update docs to refer to import_move instead of import_delete as the
correct solution for ending up with only one copy of the file.
The new fields are:
ALBUM: mb_releasegroupid asin catalognum script language country albumstatus
media albumdisambig
TRACK: disctitle encoder
These are not yet parsed from MusicBrainz responses (just added to MediaFile
and the database).
There's no longer a distinction between Unix and Windows substitutions. Enough
users reported problems with Windows-forbidden characters on Samba shares that
it seems appropriate to make all filenames Windows-safe, even on Unix. Users who
really want those additional characters (<>:"?*|\) can re-enable them via the
"replace" option. Nobody has complained about beets being *too* conservative.
This also adds sanitization of control characters, which is an all-around good
idea, and the substitution now runs in the Unicode (rather than byte) domain.
This is accomplished via a new event, "import_task_apply", which is called
right after metadata is applied to newly-imported items.
This change makes chroma REQUIRE a new version (0.6) of pyacoustid. Users with
older versions installed will see complaints about a missing method
"fingerprint_file".
Previously, there was just an "artist sort name" field -- now there's a
corresponding sort name for both track artists and album artists. I also made
the names shorter (artist_sort and albumartist_sort).
Apparently, info.bitrate can be zero for AAC files when the information is not
available in the header. MediaFile now falls back to estimation when this is
true.
Previously, the thinking went that if the MB database didn't have a length, we
should penalize the track maximally in order to avoid prioritizing a bad match
just because its information was incomplete. As it turns out, though, missing
track lengths are pretty common and this was causing more problems than it was
solving; this way, mysteriously high distances won't appear.
When a partial match is found, its first item (task.items[0]) may be None, and
_infer_album_fields would crash in this case. This solution walks through the
items list and finds the first non-None item.