The default path formats now include both a "default" format, which is the same
as before except that it uses $albumartist instead of $artist, and a "comp"
format, which uses a Compilations directory. Old paths are supported as-is by letting $artist
refer to either a track artist (when present, as it is in all old library
tracks) or album artist (when the track artist isn't present, as is the case
with most albums imported now).
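
A minimal sketch of the $artist fallback described above, using Python's
string.Template. The helper names (resolve_artist, render) and the sample
format string are hypothetical and only illustrate the idea; they are not the
project's actual code.

```python
from string import Template


def resolve_artist(fields):
    """Return the track artist if present, else fall back to the album artist."""
    return fields.get("artist") or fields.get("albumartist", "")


def render(fmt, fields):
    """Substitute fields into a path format, letting $artist fall back."""
    values = dict(fields)
    values["artist"] = resolve_artist(fields)
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return Template(fmt).safe_substitute(values)


# An old-style format string that still refers to $artist (assumed example).
old_format = "$artist/$album/$title"

# A legacy track carries its own artist; a newly imported one may not.
legacy = {"artist": "Air", "album": "Moon Safari", "title": "Sexy Boy"}
imported = {"artist": "", "albumartist": "Air",
            "album": "Moon Safari", "title": "Sexy Boy"}

print(render(old_format, legacy))
print(render(old_format, imported))
```

Both tracks render to the same path, which is why old format strings keep
working without modification.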
I've essentially loaded up the string distance function with heuristics that
apply different weights to different kinds of string cruft that one encounters
in music tags. For example, tracks ending with "feat. Somebody" shouldn't be
penalized for all those extra characters. Now the weight of that part of the
string is significantly reduced.
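
The "feat." heuristic can be sketched roughly as follows. This is an
illustration under stated assumptions, not the tagger's real distance function:
the regular expression, the FEAT_WEIGHT value, and the function names are all
made up for the example, and a plain Levenshtein distance stands in for
whatever base metric is actually used.

```python
import re

# Assumed pattern for a trailing "feat. Somebody" (optionally parenthesized).
FEAT_RE = re.compile(r"\s+\(?feat\.\s.*$", re.IGNORECASE)
# Assumed weight: characters in the "feat." part count a tenth as much.
FEAT_WEIGHT = 0.1


def levenshtein(a, b):
    """Classic edit distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def split_feat(s):
    """Split a string into (base, trailing 'feat.' part)."""
    match = FEAT_RE.search(s)
    if match:
        return s[:match.start()], s[match.start():]
    return s, ""


def string_distance(a, b):
    """Normalized distance in [0, 1] with the 'feat.' part down-weighted."""
    base_a, feat_a = split_feat(a.lower())
    base_b, feat_b = split_feat(b.lower())
    cost = (levenshtein(base_a, base_b)
            + FEAT_WEIGHT * levenshtein(feat_a, feat_b))
    size = (max(len(base_a), len(base_b))
            + FEAT_WEIGHT * max(len(feat_a), len(feat_b)))
    return cost / size if size else 0.0
```

With these assumed values, "Song" against "Song feat. Somebody" scores far
closer than an unweighted edit distance over the full strings would.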
This involves yet another new plugin method: album_distance. The last major
puzzle piece remaining for lastid is the ability to augment the initial search
into MB (i.e., to start a search using fingerprinted metadata).
(I'm not sure why, but the weight for track index mismatches was set to 0.0.
With a nonzero weight, the tagger will be slightly more reluctant to
frivolously reorder tracks.)
When computing track destination paths, we now look for album-level values when
they're available. This has the effect of making albums go into a single
directory even when their tracks have heterogeneous metadata. We will need to
revisit this once we start explicitly supporting non-album tracks.
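
As a rough sketch of the lookup order described above: album-level values,
when available, override the item's own values before the path format is
filled in. The ALBUM_KEYS tuple, the destination helper, and the dict-based
items are assumptions for illustration only.

```python
import os
from string import Template

# Assumed set of fields that should come from the album when one exists.
ALBUM_KEYS = ("albumartist", "album", "year")


def destination(item, album, fmt, root):
    """Compute a destination path, preferring album-level values."""
    values = dict(item)
    if album is not None:
        for key in ALBUM_KEYS:
            if album.get(key):
                values[key] = album[key]
    subpath = Template(fmt).safe_substitute(values)
    return os.path.join(root, *subpath.split("/"))


# Two tracks of one album whose own "album" tags disagree.
album = {"albumartist": "Air", "album": "Moon Safari"}
track_a = {"album": "Moon Safari", "title": "Sexy Boy"}
track_b = {"album": "Moon Safari (Remastered)", "title": "Kelly Watch the Stars"}

fmt = "$albumartist/$album/$title"
path_a = destination(track_a, album, fmt, "/music")
path_b = destination(track_b, album, fmt, "/music")
```

Passing album=None falls back to the item's own values, which is roughly the
case that will need revisiting for non-album tracks.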
In the end, after all of this, it turns out that we basically need to abandon
the temptation of dealing with unicode paths altogether. The POSIX filesystem
API has no notion of unicode and is very much a bytes-only interface. This
means that undecodable pathnames are a reality we must deal with. This new
approach stores all paths as buffers (blobs) in SQLite and -- as transparently
as possible -- presents them as str objects to the Python code. Legacy
databases will have their paths automatically encoded into str objects, and
the unicode values in the database will be lazily replaced with buffers.
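
The storage scheme can be sketched like this. Note the assumptions: the notes
above describe Python 2, where str is a byte string and buffer maps to a
SQLite blob; the sketch below uses Python 3, where bytes plays both roles.
The table layout and the helper names (bytestring_path, add_path, get_path)
are hypothetical.

```python
import os
import sqlite3


def bytestring_path(path):
    """Return the path as bytes, encoding text with the filesystem encoding."""
    if isinstance(path, bytes):
        return path
    return os.fsencode(path)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, path BLOB)")


def add_path(conn, path):
    """Store a path as a blob and return the new row id."""
    cur = conn.execute("INSERT INTO items (path) VALUES (?)",
                       (bytestring_path(path),))
    return cur.lastrowid


def get_path(conn, item_id):
    """Fetch a path as bytes, lazily upgrading legacy text rows to blobs."""
    row = conn.execute("SELECT path FROM items WHERE id = ?",
                       (item_id,)).fetchone()
    value = row[0]
    if isinstance(value, str):
        # Legacy row stored as text: encode it and rewrite it as a blob.
        value = bytestring_path(value)
        conn.execute("UPDATE items SET path = ? WHERE id = ?",
                     (value, item_id))
    return value
```

Because the column holds raw bytes, a filename that is not valid in any text
encoding round-trips unchanged, and old text rows are converted the first
time they are read.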
As part of this, the BaseLibrary class was also adapted to include a notion of
albums. This is reflected by the new BaseAlbum class, which the Album class
(formerly _AlbumInfo) completely replaces in the concrete Library. The BaseAlbum
class just fetches metadata from the underlying items.
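
One way an album object might fetch metadata from its underlying items is
shown below. This is only a guess at the shape of such a class: the
constructor signature, the dict-based items, and the choice to read from the
first item are assumptions made for the example, not the actual BaseAlbum
interface.

```python
class BaseAlbum:
    """Album view whose metadata comes from its items (hypothetical layout)."""

    def __init__(self, items):
        self._items = list(items)

    def items(self):
        """Return the tracks that make up this album."""
        return list(self._items)

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails, so real
        # attributes and methods are unaffected.
        if key.startswith("_") or not self._items:
            raise AttributeError(key)
        try:
            # Assumption: album-level fields agree across items, so the
            # first item is representative.
            return self._items[0][key]
        except KeyError:
            raise AttributeError(key) from None


album = BaseAlbum([
    {"albumartist": "Air", "album": "Moon Safari", "title": "Sexy Boy"},
    {"albumartist": "Air", "album": "Moon Safari", "title": "Remember"},
])
```

A concrete Album class backed by its own storage would replace this lookup
rather than extend it.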