Restore uncapped Parallel.ForEach when env cap unset (TPL -1)

- Omit/invalid/<=0 LIDARR_MEDIA_IO_PARALLELISM: default ParallelOptions for
  folder scan + tag reads (matches original multithread fork).
- PLINQ candidate scoring: ProcessorCount when uncapped; same 1–64 cap when set.
- Fixes throughput loss from equating 16 cores with TPL default concurrency.
- Log line shows TPL default vs cap and PLINQ degree; update MULTITHREAD_README.

Made-with: Cursor
This commit is contained in:
Sean Parsons 2026-03-27 21:56:55 +00:00
parent c0aec91d50
commit 83298dabf3
5 changed files with 67 additions and 44 deletions


@@ -4,26 +4,20 @@ This branch adds a faster, **parallel** disk scan and import path. Upstream Lida
A **Dockerfile.multithread** in this repository builds a self-contained binary and overlays it on `ghcr.io/linuxserver/lidarr:nightly` (see CI or build from repo root per that file's comments). A wrapper layout that keeps this tree in a `lidarr-src/` subdirectory can use the parent `Dockerfile` instead.
## `LIDARR_MEDIA_IO_PARALLELISM`
## `LIDARR_MEDIA_IO_PARALLELISM` (optional IO cap)
Parallel import work is **not** limited by Lidarr's download bandwidth or rate settings (those apply to indexers/clients only). On **slow or remote storage** (especially NFS), too much concurrency can saturate IOPS and make the host feel stuck. This variable caps how many workers run at once for the fork's parallel paths.
Parallel import work is **not** limited by Lidarr's download bandwidth or rate settings (those apply to indexers/clients only). On **slow or remote storage** (especially NFS), the default **uncapped** parallelism can saturate IOPS. Set this variable only when you need to **limit** concurrency.
| | |
| --- | --- |
| **Name** | `LIDARR_MEDIA_IO_PARALLELISM` |
| **Default** | **Unset / empty / invalid / `0`:** same as **`Environment.ProcessorCount`** (original fork behavior). For **NFS or slow storage**, set **`1`** or **`2`** explicitly. |
| **Allowed** | **`1`–`64`:** use that cap. **`0`:** treat as unset (processor count). |
| **Scope** | Process environment (re-read on each scan/import parallel section) |
| **Omit / empty / invalid / ≤0** | **Original fork behavior:** `Parallel.ForEach` uses the **TPL default** (`MaxDegreeOfParallelism = -1`), which can use **more** concurrent workers than `ProcessorCount` on I/O-heavy work (this is why setting `16` on a 16-core box could feel *slower* than before). **PLINQ** still uses **`ProcessorCount`** (TagLib / candidate scoring cannot use `-1`). |
| **1–64** | Hard cap on **both** `Parallel.ForEach` loops **and** PLINQ degree (same number). Use **`1`**–**`2`** on NFS if the host stalls. |
| **Scope** | Environment is read when each parallel section runs |
It applies to:
**Docker:** set on the container like any other env variable.
- parallel **folder scans** when collecting audio files;
- parallel **tag / metadata reads** when building import decisions;
- parallel **candidate release scoring** during identification.
**Docker:** fully supported. Set the variable on the container like any other env; the .NET process reads the container environment.
On the first disk scan, Lidarr logs a line like `Media import parallelism: MaxDegreeOfParallelism=…` so you can confirm the value it sees (useful if compose/env typos leave the variable unset).
On the first disk scan, Lidarr logs `Media import parallelism:` with **TPL default (-1, uncapped)** or your numeric cap, plus PLINQ degree and host `ProcessorCount`.
### Docker Compose
@@ -35,24 +29,18 @@ services:
- PUID=1000
- PGID=1000
- TZ=Etc/UTC
# Gentle on NFS / network mounts (omit var for processor-count default)
- LIDARR_MEDIA_IO_PARALLELISM=1
# Omit LIDARR_MEDIA_IO_PARALLELISM on fast local storage (max throughput).
# - LIDARR_MEDIA_IO_PARALLELISM=2 # NFS / slow disk — cap concurrent work
```
### `docker run`
### When to set it
```bash
docker run -e LIDARR_MEDIA_IO_PARALLELISM=4 … your-image
```
### When to change it
- **NFS, SMB, or sluggish disks:** try `1` or leave default `2`.
- **Library and app on fast local storage (e.g. same NAS app dataset, local SSD):** try `4`–`8` or higher (up to 64) and watch CPU, I/O, and responsiveness.
- **Fast local RAID / SSD:** **omit** the variable (matches the first multithread fork).
- **NFS or network filesystem:** start with **`2`** (or **`1`**) if scans overwhelm the host.
### Implementation reference
Logic and constant name: `src/NzbDrone.Common/MediaImportParallelism.cs`.
`src/NzbDrone.Common/MediaImportParallelism.cs`.
## Relationship to upstream


@@ -1,9 +1,10 @@
using System;
using System.Threading.Tasks;
namespace NzbDrone.Common
{
/// <summary>
/// Caps parallel disk work during library scan/import (tag reads, folder scans, candidate scoring), optional via env.
/// Optional cap on parallel scan/import work via <c>LIDARR_MEDIA_IO_PARALLELISM</c>.
/// Unrelated to download bandwidth limits in Lidarr settings.
/// </summary>
public static class MediaImportParallelism
@@ -13,27 +14,60 @@ public static class MediaImportParallelism
private const int MaxDegreeCap = 64;
/// <summary>
/// Maximum concurrent workers for scan/import parallelism.
/// If <c>LIDARR_MEDIA_IO_PARALLELISM</c> is unset, empty, invalid, 0, or negative: uses <see cref="Environment.ProcessorCount"/> (matches pre-cap fork behavior).
/// Otherwise uses the set value clamped to 1–64.
/// Re-reads the environment each call so container/env changes are visible without restart (same process still needs a new read on next scan).
/// <b>Unset / empty / invalid / ≤0:</b> Original fork behavior — no explicit cap on
/// <see cref="ParallelOptions.MaxDegreeOfParallelism"/> (TPL default <c>-1</c>, scheduler chooses concurrency; often higher than core count for I/O).
/// <b>1–64:</b> Cap <see cref="Parallel.ForEach"/> loops to that many concurrent workers (use on NFS / slow storage).
/// </summary>
public static int MaxDegreeOfParallelism => ReadMaxDegree();
private static int ReadMaxDegree()
public static ParallelOptions GetParallelForEachOptions()
{
if (!TryParseUserCap(out var cap))
{
return new ParallelOptions();
}
return new ParallelOptions { MaxDegreeOfParallelism = cap };
}
/// <summary>
/// PLINQ <c>WithDegreeOfParallelism</c> must be ≥ 1.
/// <b>Uncapped:</b> <see cref="Environment.ProcessorCount"/> (same as pre-env ImportDecisionMaker / IdentificationService).
/// <b>Capped:</b> user value (1–64).
/// </summary>
public static int PlinqMaxDegreeOfParallelism
{
get
{
if (!TryParseUserCap(out var cap))
{
return Math.Max(1, Environment.ProcessorCount);
}
return cap;
}
}
/// <summary>
/// For logging: -1 means TPL default (uncapped loops); otherwise the explicit cap.
/// </summary>
public static int EffectiveParallelForEachDegreeForLog =>
TryParseUserCap(out var cap) ? cap : -1;
private static bool TryParseUserCap(out int cap)
{
cap = 0;
var raw = Environment.GetEnvironmentVariable(EnvironmentVariableName);
if (string.IsNullOrWhiteSpace(raw) || !int.TryParse(raw.Trim(), out var parsed))
{
return Math.Max(1, Environment.ProcessorCount);
return false;
}
if (parsed <= 0)
{
return Math.Max(1, Environment.ProcessorCount);
return false;
}
return Math.Min(parsed, MaxDegreeCap);
cap = Math.Min(parsed, MaxDegreeCap);
return true;
}
}
}
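
The parsing rules in `TryParseUserCap` above can be summarized outside C#. The following is a minimal Python model of the same semantics (function names are illustrative, not from the codebase): unset, empty, non-numeric, or ≤0 means "uncapped" (the TPL `-1` default), and any positive value is clamped to 1–64.

```python
MAX_DEGREE_CAP = 64  # mirrors MaxDegreeCap in MediaImportParallelism.cs


def try_parse_user_cap(raw):
    """Return the effective cap (1-64), or None for 'uncapped'.

    Models TryParseUserCap: unset / empty / non-numeric / <=0 -> uncapped,
    which the C# side expresses as a default ParallelOptions (TPL -1).
    """
    if raw is None or not raw.strip():
        return None
    try:
        parsed = int(raw.strip())
    except ValueError:
        return None
    if parsed <= 0:
        return None
    return min(parsed, MAX_DEGREE_CAP)


def plinq_degree(raw, processor_count):
    """PLINQ's WithDegreeOfParallelism must be >= 1, so the 'uncapped'
    case falls back to ProcessorCount instead of -1."""
    cap = try_parse_user_cap(raw)
    return cap if cap is not None else max(1, processor_count)
```

Note the asymmetry the commit relies on: `Parallel.ForEach` can take the scheduler's uncapped default, but PLINQ always needs a concrete degree, so only the latter falls back to the processor count.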


@@ -131,15 +131,18 @@ public void Scan(List<string> folders = null, FilterFilesType filter = FilterFil
if (Interlocked.CompareExchange(ref _mediaParallelismLogged, 1, 0) == 0)
{
var envRaw = Environment.GetEnvironmentVariable(MediaImportParallelism.EnvironmentVariableName);
var loopDeg = MediaImportParallelism.EffectiveParallelForEachDegreeForLog;
var loopDesc = loopDeg < 0 ? "TPL default (-1, uncapped)" : loopDeg.ToString();
_logger.Info(
"Media import parallelism: MaxDegreeOfParallelism={0} ({1}={2}). Set 1–64 to cap IO; unset or 0 uses processor count ({3}).",
MediaImportParallelism.MaxDegreeOfParallelism,
"Media import parallelism: Parallel.ForEach MaxDegreeOfParallelism={0} ({1}; PLINQ degree {2}). Set {3}=1–64 to cap; omit or ≤0 restores pre-cap fork (uncapped loops). Host ProcessorCount={4}.",
loopDesc,
string.IsNullOrEmpty(envRaw) ? $"{MediaImportParallelism.EnvironmentVariableName}=(unset)" : $"{MediaImportParallelism.EnvironmentVariableName}={envRaw}",
MediaImportParallelism.PlinqMaxDegreeOfParallelism,
MediaImportParallelism.EnvironmentVariableName,
string.IsNullOrEmpty(envRaw) ? "(unset)" : envRaw,
Environment.ProcessorCount);
}
Parallel.ForEach(foldersToScan, new ParallelOptions { MaxDegreeOfParallelism = MediaImportParallelism.MaxDegreeOfParallelism }, folder =>
Parallel.ForEach(foldersToScan, MediaImportParallelism.GetParallelForEachOptions(), folder =>
{
_logger.ProgressInfo("Scanning {0}", folder);
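
The `Interlocked.CompareExchange(ref _mediaParallelismLogged, 1, 0)` guard above emits the parallelism log line exactly once per process, even when several scans race. A rough Python sketch of that log-once shape (illustrative, not from the codebase; CPython has no `Interlocked`, so a lock-protected flag stands in for the compare-exchange):

```python
import threading


class LogOnce:
    """Invoke a callback at most once across threads.

    Plays the role of
    `if (Interlocked.CompareExchange(ref flag, 1, 0) == 0) { log(...); }`.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._logged = False

    def __call__(self, log_fn):
        with self._lock:  # atomically test-and-set the flag
            if self._logged:
                return False
            self._logged = True
        log_fn()  # run outside the lock, like the C# log call
        return True


messages = []
once = LogOnce()
for _ in range(3):
    once(lambda: messages.append("Media import parallelism: ..."))
```

After the loop, `messages` holds a single entry; repeated scans re-read the environment but do not repeat the log line.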


@@ -305,11 +305,10 @@ private void GetBestRelease(LocalAlbumRelease localAlbumRelease, List<CandidateA
_logger.Debug("Matching {0} track files against {1} candidates", localAlbumRelease.TrackCount, candidateReleases.Count);
_logger.Trace("Processing files:\n{0}", string.Join("\n", localAlbumRelease.LocalTracks.Select(x => x.Path)));
var maxParallelism = MediaImportParallelism.MaxDegreeOfParallelism;
var scoredCandidates = candidateReleases
.Select((candidateRelease, index) => new { candidateRelease, index })
.AsParallel()
.WithDegreeOfParallelism(maxParallelism)
.WithDegreeOfParallelism(MediaImportParallelism.PlinqMaxDegreeOfParallelism)
.Select(item =>
{
var release = item.candidateRelease.AlbumRelease;


@@ -107,10 +107,9 @@ public Tuple<List<LocalTrack>, List<ImportDecision<LocalTrack>>> GetLocalTracks(
var processedTracks = new ConcurrentBag<(int Index, LocalTrack Track)>();
var processedDecisions = new ConcurrentBag<(int Index, ImportDecision<LocalTrack> Decision)>();
var progress = 0;
var maxParallelism = MediaImportParallelism.MaxDegreeOfParallelism;
var filesWithIndex = files.Select((file, index) => new { file, index }).ToList();
Parallel.ForEach(filesWithIndex, new ParallelOptions { MaxDegreeOfParallelism = maxParallelism }, item =>
Parallel.ForEach(filesWithIndex, MediaImportParallelism.GetParallelForEachOptions(), item =>
{
var current = Interlocked.Increment(ref progress);
_logger.ProgressInfo($"Reading file {current}/{files.Count}");
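
The `filesWithIndex` + `ConcurrentBag` pattern above runs tag reads in parallel while keeping enough indexing to restore the original file order afterwards. A hedged Python sketch of the same shape (`ThreadPoolExecutor` stands in for `Parallel.ForEach`; the reader callable is a placeholder, not Lidarr's):

```python
from concurrent.futures import ThreadPoolExecutor


def read_tracks(files, read_file, max_workers=None):
    """Process files concurrently, preserving input order in the result.

    max_workers=None is the 'uncapped' analogue of a default
    ParallelOptions; pass an int to mimic LIDARR_MEDIA_IO_PARALLELISM.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in submission order, which plays the
        # role of sorting the (Index, Track) tuples collected in the
        # ConcurrentBag after the parallel loop completes.
        return list(pool.map(read_file, files))


tracks = read_tracks(["a.flac", "b.flac", "c.flac"],
                     lambda f: f.upper(), max_workers=2)
```

The design point carried over from the C# code: parallel completion order is nondeterministic, so either collect `(index, result)` pairs and sort (the `ConcurrentBag` route) or use an ordered map primitive as here.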