diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
index ee963ab46..d19a376b3 100644
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -286,31 +286,6 @@ according to the specifications required by the project.
 Similarly, run ``poe format-docs`` and ``poe lint-docs`` to ensure consistent
 documentation formatting and check for any issues.
 
-Handling Paths
-~~~~~~~~~~~~~~
-
-A great deal of convention deals with the handling of **paths**. Paths are
-stored internally—in the database, for instance—as byte strings (i.e., ``bytes``
-instead of ``str`` in Python 3). This is because POSIX operating systems’ path
-names are only reliably usable as byte strings—operating systems typically
-recommend but do not require that filenames use a given encoding, so violations
-of any reported encoding are inevitable. On Windows, the strings are always
-encoded with UTF-8; on Unix, the encoding is controlled by the filesystem. Here
-are some guidelines to follow:
-
-- If you have a Unicode path or you’re not sure whether something is Unicode or
-  not, pass it through ``bytestring_path`` function in the ``beets.util`` module
-  to convert it to bytes.
-- Pass every path name through the ``syspath`` function (also in ``beets.util``)
-  before sending it to any *operating system* file operation (``open``, for
-  example). This is necessary to use long filenames (which, maddeningly, must be
-  Unicode) on Windows. This allows us to consistently store bytes in the
-  database but use the native encoding rule on both POSIX and Windows.
-- Similarly, the ``displayable_path`` utility function converts bytestring paths
-  to a Unicode string for displaying to the user. Every time you want to print
-  out a string to the terminal or log it with the ``logging`` module, feed it
-  through this function.
-
 Editor Settings
 ~~~~~~~~~~~~~~~
 
diff --git a/docs/changelog.rst b/docs/changelog.rst
index 1dabbc58a..749ddf005 100644
--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@@ -26,6 +26,10 @@ For packagers:
 
 Other changes:
 
+- The documentation chapter :doc:`dev/paths` has been moved to the "For
+  Developers" section and revised to reflect current best practices (pathlib
+  usage).
+
 2.5.1 (October 14, 2025)
 ------------------------
 
diff --git a/docs/dev/index.rst b/docs/dev/index.rst
index 7bd0ba709..f22aa8c56 100644
--- a/docs/dev/index.rst
+++ b/docs/dev/index.rst
@@ -18,6 +18,7 @@ configuration files, respectively.
 
     plugins/index
     library
+    paths
     importer
     cli
     ../api/index
diff --git a/docs/dev/paths.rst b/docs/dev/paths.rst
new file mode 100644
index 000000000..a593580f6
--- /dev/null
+++ b/docs/dev/paths.rst
@@ -0,0 +1,64 @@
+Handling Paths
+==============
+
+``pathlib`` provides a clean, cross-platform API for working with filesystem
+paths.
+
+Use the ``.filepath`` property on ``Item`` and ``Album`` library objects to
+access paths as ``pathlib.Path`` objects. This produces a readable, native
+representation suitable for printing, logging, or further processing.
+
+Normalize paths using ``Path(...).expanduser().resolve()``, which expands ``~``
+and resolves symlinks.
+
+Cross-platform differences, such as path separators, Unicode handling, and
+long-path support (Windows), are automatically managed by ``pathlib``.
+
+When storing paths in the database, however, convert them to bytes with
+``bytestring_path()``. Paths in Beets are currently stored as bytes, although
+there are plans to eventually store ``pathlib.Path`` objects directly. To access
+media file paths in their stored form, use the ``.path`` property on ``Item``
+and ``Album``.
+
+Legacy utilities
+----------------
+
+Historically, Beets used custom utilities to ensure consistent behavior across
+Linux, macOS, and Windows before ``pathlib`` became reliable:
+
+- ``syspath()``: worked around Windows Unicode and long-path limitations by
+  converting to a system-safe string (adding the ``\\?\`` prefix where needed).
+- ``normpath()``: normalized slashes and removed ``./`` or ``..`` parts but did
+  not expand ``~``.
+- ``bytestring_path()``: converted paths to bytes for database storage (still
+  used for that purpose today).
+- ``displayable_path()``: converted byte paths to Unicode for display or
+  logging.
+
+These functions remain safe to use in legacy code, but new code should rely
+solely on ``pathlib.Path``.
+
+Examples
+--------
+
+Old style
+
+.. code-block:: python
+
+    displayable_path(item.path)
+    normpath("~/Music/../Artist")
+    syspath(path)
+
+New style
+
+.. code-block:: python
+
+    item.filepath
+    Path("~/Music/../Artist").expanduser().resolve()
+    Path(path)
+
+When storing paths in the database
+
+.. code-block:: python
+
+    path_bytes = bytestring_path(Path("/some/path/to/file.mp3"))
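
The "new style" normalization and the bytes conversion described in ``docs/dev/paths.rst`` can be tried outside of Beets with a minimal standalone sketch. It relies only on the standard library: ``normalize`` and ``to_stored_bytes`` are hypothetical helper names, and ``os.fsencode`` is used as an assumed stand-in for ``bytestring_path()``, whose actual behavior in ``beets.util`` may differ.

```python
import os
from pathlib import Path


def normalize(raw: str) -> Path:
    """Expand ``~`` and resolve ``..`` segments and symlinks."""
    return Path(raw).expanduser().resolve()


def to_stored_bytes(path: Path) -> bytes:
    """Hypothetical stand-in for beets' ``bytestring_path()``.

    Assumption: uses the standard library's ``os.fsencode``; the real
    helper may handle encodings differently.
    """
    return os.fsencode(path)


normalized = normalize("~/Music/../Artist")
stored = to_stored_bytes(normalized)

# Decoding with the matching stdlib call round-trips the path.
restored = Path(os.fsdecode(stored))

print(normalized)  # absolute path with "~" expanded and ".." removed
print(stored)      # bytes form, as it might be written to storage
print(restored == normalized)
```

Keeping the conversion to bytes in one helper means the rest of the code can work with ``pathlib.Path`` objects throughout and only touch bytes at the storage boundary.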