Improve regex detection for the drive_sep_replace default (#6417)

I imported an album where a track had the name `1:00 AM - Clear` and
another track named `12:00 AM - Clear` (just two examples).

See: [Animal Crossing: New Horizons
OST](https://musicbrainz.org/release/263f7ed3-60c2-4251-ac7d-6da3f8691256)

After import, the former was renamed `1_00 AM - Clear`, and the latter
`12;00 AM - Clear`. Notice the inconsistency of how the `:` was
replaced.

I did not make use of the (hidden) `drive_sep_replace` setting. These
were my `replace` settings:
```
replace:                        # prevent file name incompatibiliy
  '[\s]'                        : ' '   # standardize whitespace
  '["`‘’“”]'                    : "'"   # standardize quotes
  '[\u002D\u2010-\u2015\u2E3A]' : '-'   # standardize dashes
  '[\u2E3B\uFE58\uFE63\uFF0D]'  : '-'   # standardize dashes
  '[\xAD]'                      : '-'   # standardize dashes
  '[\\\|\/]'                    : ' '   # slashes, pipe > space
  '[:]'                         : ';'   # colon > semicolon
  '[<>]'                        : '-'   # chevrons > dashes
  '[\?\*]'                      : ''    # remove restricted characters
  '[\x00-\x1F\x7F]'             : ''    # remove basic control characters
  '[\x80-\x9F]'                 : ''    # remove extra control characters
  '^\.'                         : ''    # remove leading period
  '\.$'                         : ''    # remove trailing period
  '^\s+'                        : ''    # remove leading space
  '\s+$'                        : ''    # remove trailing space
```

I found the issue to be too generic regex for drive separator detection.
I'm on macOS, so this is irrelevant to me anyway (and I got around it by
adding `drive_sep_replace: ';'` to my settings), but regardless, I think
this could be improved.

This PR improves the regex to detect drive separators. Instead of merely
looking for any first character followed by a colon (`^\w:`), we look
for a letter, followed by a colon, followed by a backslash instead
(`^[a-zA-Z]:\\`).

The regex logic is solid, but I am not able to test this on a real
Windows environment.

~Still have to add an entry to the changelog, will do so soon.~

# Update

Initially this commit failed the
`MoveTest.test_move_file_with_colon_alt_separator` test because it
checks the logic using a `C:DOS` path. So I had to make the logic less
restrictive again, not checking for a backslash (`^[a-zA-Z]:`). I would
argue the test itself should be amended (test with `C:\DOS` instead),
but that's not up to me.

As a result, my case of "1:00 AM" being replaced incorrectly is still
resolved, but other hypothetical cases like "a:b" would still not be
covered due to an arguably incorrect test limiting a more precise regex.
This commit is contained in:
Šarūnas Nejus 2026-03-10 00:16:56 +00:00 committed by GitHub
commit 44dc3cd4e9
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 5 additions and 2 deletions

View file

@ -159,8 +159,8 @@ class FormattedMapping(Mapping[str, str]):
sep_repl: str = beets.config["path_sep_replace"].as_str()
sep_drive: str = beets.config["drive_sep_replace"].as_str()
if re.match(r"^\w:", value):
value = re.sub(r"(?<=^\w):", sep_drive, value)
if re.match(r"^[a-zA-Z]:", value):
value = re.sub(r"(?<=[a-zA-Z]):", sep_drive, value)
for sep in (os.path.sep, os.path.altsep):
if sep:

View file

@ -16,6 +16,9 @@ Unreleased
Bug fixes
~~~~~~~~~
- :ref:`replace`: Made ``drive_sep_replace`` regex logic more precise to prevent
edge-case mismatches (e.g., a song titled "1:00 AM" would incorrectly be
considered a Windows drive path).
- :doc:`plugins/fish`: Fix AttributeError. :bug:`6340`
..