dbcore.Results: Avoid duplicate construction

Iterating over a result set multiple times should not cost the same amount of time on every pass. We now keep the materialized objects around and reuse them on later iterations. This solves a performance problem in the `play` plugin, which calls `len(results)` multiple times and was therefore taking an unnecessary performance hit when the query was slow.
2026-03-31 10:44:16 +02:00 · 2014-10-11 12:08:27 -07:00
commit ea94ce5eef
parent a2ce367c64
2 changed files with 30 additions and 9 deletions
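
The caching idea described in the commit message can be sketched in isolation as follows. This is a minimal, standalone illustration and not the beets implementation: `CachedResults`, `make_model`, and `predicate` are hypothetical names standing in for `Results`, `_make_model`, and the slow-query match, and `__len__` here simply counts by iterating, which is an assumption about how the length is obtained.

```python
class CachedResults:
    """Hypothetical stand-in for a lazily materialized result set."""

    def __init__(self, rows, make_model, predicate=None):
        self._make_model = make_model  # builds an object from a row
        self._predicate = predicate    # optional "slow query" filter
        self._objects = []             # objects materialized so far
        self._row_iter = iter(rows)    # rows not yet processed

    def __iter__(self):
        # Replay the objects that earlier passes already built.
        for obj in self._objects:
            yield obj

        # Materialize the remaining rows, caching those that pass
        # the filter so later passes do not rebuild them.
        for row in self._row_iter:
            obj = self._make_model(row)
            if self._predicate is None or self._predicate(obj):
                self._objects.append(obj)
                yield obj

    def __len__(self):
        # Assumed for illustration: count by iterating, so a second
        # call is served entirely from the cache.
        return sum(1 for _ in self)


if __name__ == "__main__":
    built = []

    def make_model(row):
        built.append(row)  # record every construction
        return {"id": row}

    results = CachedResults([1, 2, 3, 4], make_model,
                            predicate=lambda obj: obj["id"] % 2 == 0)
    print(len(results), len(results), len(built))
```

Because the row iterator is shared, a pass that stops early leaves the remaining rows for the next pass to pick up. One caveat of this sketch: two iterators advanced in an interleaved fashion each consume rows from the same shared iterator, so it is only intended for sequential passes.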
--- a/beets/dbcore/db.py
+++ b/beets/dbcore/db.py
@@ -459,14 +459,18 @@ class Results(object):
    """
    def __init__(self, model_class, rows, db, query=None, sort=None):
        """Create a result set that will construct objects of type
-        `model_class`, which should be a subclass of `LibModel`, out of
-        the query result mapping in `rows`. The new objects are
-        associated with the database `db`.
-        If `query` is provided, it is used as a predicate to filter the results
-        for a "slow query" that cannot be evaluated by the database directly.
-        If `sort` is provided, it is used to sort the full list of results
-        before returning. This means it is a "slow sort" and all objects must
-        be built before returning the first one.
+        `model_class`.
+
+        `model_class` is a subclass of `LibModel` that will be
+        constructed. `rows` is a query result: a list of mappings. The
+        new objects will be associated with the database `db`.
+
+        If `query` is provided, it is used as a predicate to filter the
+        results for a "slow query" that cannot be evaluated by the
+        database directly. If `sort` is provided, it is used to sort the
+        full list of results before returning. This means it is a "slow
+        sort" and all objects must be built before returning the first
+        one.
        """
        self.model_class = model_class
        self.rows = rows
@@ -474,16 +478,31 @@ class Results(object):
        self.query = query
        self.sort = sort

+        self._objects = []  # Model objects materialized *so far*.
+        self._row_iter = iter(self.rows)  # Indicate next row to materialize.
+
    def _get_objects(self):
        """Construct and generate Model objects for they query. The
        objects are returned in the order emitted from the database; no
        slow sort is applied.
+
+        For performance, this generator caches materialized objects to
+        avoid constructing them more than once. This way, iterating over
+        a `Results` object a second time should be much faster than the
+        first.
        """
-        for row in self.rows:
+        # Get the previously-materialized objects.
+        for object in self._objects:
+            yield object
+
+        # Now, for the rows that have not yet been processed, materialize
+        # objects and add them to the list.
+        for row in self._row_iter:
            obj = self._make_model(row)
            # If there is a slow-query predicate, ensurer that the
            # object passes it.
            if not self.query or self.query.match(obj):
+                self._objects.append(obj)
                yield obj

    def __iter__(self):
--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@@ -28,6 +28,8 @@ Fixes:
  quantities (track numbers and durations), which was often confusing.
 * Date-based queries that are malformed (not parse-able) no longer crash
  beets and instead fail silently.
+* Slow queries, such as those over flexible attributes, should now be much
+  faster when used with certain commands---notably, the :doc:`/plugins/play`.


 1.3.8 (September 17, 2014)