Commit graph

395 commits

Author SHA1 Message Date
notsafeforgit
34821a5d40 perf: eliminate O(N^2) bottlenecks in image and scene duplicate detection
This update resolves major performance regressions when processing large libraries:

1. Optimized FindMany in both Image and Scene stores to use map-based ID lookups. Previously, this function used slices.Index in a loop, resulting in O(N^2) complexity. On a library with 300k items, this was causing the server to hang indefinitely.

2. Refined the exact image duplicate SQL query to match the scene checker's level of optimization. It now joins the files table and orders results by total duplicate file size, ensuring that the most impactful duplicates are shown first.

3. Removed the temporary LIMIT 1000 from the image duplicate query now that the algorithmic bottlenecks have been resolved.
2026-03-29 23:47:36 -07:00
notsafeforgit
15c6dd5575 perf: further optimize image duplicate detection
This update provides additional performance improvements specifically targeted at large image libraries (e.g. 300k+ images):

1. Optimized the exact match SQL query for images:
   - Added filtering for zero/empty fingerprints to avoid massive false-positive groups.
   - Added a LIMIT of 1000 duplicate groups to prevent excessive memory consumption and serialization overhead.
   - Simplified the join structure to ensure better use of the database index.

2. Parallelized the Go comparison loop in pkg/utils/phash.go:
   - Utilizes all available CPU cores to perform Hamming distance calculations.
   - Uses a lock-free design to minimize synchronization overhead.
   - This makes non-zero distance searches significantly faster on multi-core systems.
2026-03-29 23:47:36 -07:00
notsafeforgit
b9752723b6 perf: massive optimization for image and scene duplicate detection
This update provides significant performance improvements for both image and scene duplicate searching:

1. Optimized the core Hamming distance algorithm in pkg/utils/phash.go:
   - Uses native CPU popcount instructions (math/bits) for bit counting.
   - Pre-calculates hash values to eliminate object allocations in the hot loop.
   - Halves the number of comparisons by leveraging the symmetry of the Hamming distance.
   - The loop is now several orders of magnitude faster and allocation-free.

2. Solved the N+1 database query bottleneck:
   - Replaced individual database lookups for each duplicate group with a single batched query for all duplicate IDs.
   - This optimization was applied to both Image and Scene repositories.

3. Simplified the SQL fast path for exact image matches to remove redundant table joins.
2026-03-29 23:47:36 -07:00
notsafeforgit
3444c21263 perf(sqlite): implement SQL-based fast path for exact image duplicate detection
This change adds a specialized SQL query to find exact image duplicate matches (distance 0) directly in the database.

Previously, the image duplicate checker always used an O(N^2) Go-based comparison loop, which caused indefinite loading and timeouts on libraries with a large number of images. The new SQL fast path reduces the time to find exact duplicates from minutes/hours to milliseconds.
2026-03-29 23:47:36 -07:00
notsafeforgit
fcc5b51bfd fix(sqlite): fix image duplicate detection by scanning phash as integer
This fixes a bug where identical image duplicates were not being detected.

The implementation was incorrectly scanning the phash BLOB into a string and then attempting to parse it as a hex string. Since phashes are stored as 64-bit integers, they were being converted to decimal strings. For phashes with the MSB set (negative when treated as int64), the resulting decimal string started with a '-', which caused the hex parser to fail and skip the image entirely.

Additionally, even for non-negative phashes, parsing a decimal string as hex yielded incorrect hash values.

Scanning directly into the utils.Phash struct (which uses int64) matches how Scene phashes are handled and ensures the hash values are correct.
2026-03-29 23:47:36 -07:00
notsafeforgit
1b093e244d fix: update image duplicate checker UI and API handling
- Fixes 400 error in ImageDuplicateChecker

- Updates UI and frontend types

- Fixes tools casing
2026-03-29 23:47:36 -07:00
notsafeforgit
9e54daef97 fix: resolve image duplicate finder issues
- Wrap FindDuplicateImages query in r.withReadTxn() to ensure a database transaction in context.
- Use queryFunc instead of queryStruct for fetching multiple hashes, preventing runtime errors.
- Fix N+1 query issue in duplicate grouping by using qb.FindMany() instead of qb.Find() for each duplicate image.
- Revert searchColumns array to exclude "images.details" which was from another PR and remove related failing test.
2026-03-29 23:47:36 -07:00
notsafeforgit
cc5be78489 fix: resolve unused import and undefined reference in sqlite image repository
- Removed unused `strconv` import from `pkg/sqlite/image.go`.
- Added missing `github.com/stashapp/stash/pkg/utils` import to resolve the undefined `utils` reference.
- Fixed pagination prop in ImageDuplicateChecker component.
- Formatted modified go files using gofmt.
- Ran prettier over the UI codebase to resolve the formatting check CI failure.
2026-03-29 23:47:36 -07:00
notsafeforgit
0d05dd3e2c feat: improve Image Duplicate Checker implementation
This change unifies the duplicate detection logic by leveraging the shared phash utility. It also enhances the UI with:
- Pagination for large result sets.
- Sorting duplicate groups by total file size.
- A more detailed table view with image thumbnails, paths, and dimensions.
- Consistency with the existing Scene Duplicate Checker tool.
2026-03-29 23:47:36 -07:00
notsafeforgit
2fb31cfff2 feat: Implement Image Duplicate Checker
This change introduces a new tool to identify duplicate images based on their perceptual hash (phash). It includes:
- Backend implementation for phash distance comparison and grouping.
- GraphQL schema updates and API resolvers.
- Frontend UI for the Image Duplicate Checker tool.
- Unit tests for the image search and duplicate detection logic.
2026-03-29 23:47:36 -07:00
(Moai Emoji)
c861d3991a
Fix 'not equals' custom field to include unset objects (#6742)
* Fix custom field 'not equals' to include unset objects
* also fix Excludes and NotBetween null handling
2026-03-26 09:01:43 +11:00
WithoutPants
c5034422cb
Expand folder select hierarchy based on initial selected folder (#6738)
* Add sub_folders field to Folder type
* Expand folder select for the initial value
2026-03-23 16:15:23 +11:00
WithoutPants
2bb1df8443
Fix incorrect where clause for gallery parent folder filter (#6737) 2026-03-23 13:45:31 +11:00
WithoutPants
8f3188ff74
Make gallery/scene association during scan more consistent (#6705) 2026-03-19 08:54:44 +11:00
WithoutPants
93fbb4be80
Add option to ignore zip contents during clean (#6700)
* Add option to ignore zip file contents while cleaning

Speeds up the clean process with the assumption that files within zip files are not deleted.

* Add UI for new option
2026-03-18 15:58:32 +11:00
dev-null-life
f7b04fba61
Sort performers and studios by scenes file size (#6642)
* feat: add scenes_size sort option for performers

Adds sorting performers by total file size of associated scenes.
Follows the existing scenes_duration pattern.

Ref: stashapp/stash#5530

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add scenes_size sort option for studios

Adds sorting studios by total file size of associated scenes.
Follows the existing scenes_duration pattern.

Ref: stashapp/stash#5530

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(ui): add Scenes Size sort option for performers and studios

Adds 'Scenes Size' to the sort dropdown for performer and studio
list views, with i18n label.

Ref: stashapp/stash#5530

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: extend scenes_size sort to tags

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 11:51:56 +11:00
WithoutPants
f3c8e7ac9c
Convert career length fields to dates (#6682)
* Convert career start/end to date
* Update UI to accept dates for career length fields
* Fix date filtering
---------
Co-authored-by: Gykes <24581046+Gykes@users.noreply.github.com>
2026-03-17 15:48:30 +11:00
notsafeforgit
c2e80d2676
feat: add image details to search filter (#6673)
* feat: add image details to search filter

This change adds the image details field to the image search filter, allowing for better discovery of images based on their description.
---------
Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>
2026-03-17 10:18:44 +11:00
dev-null-life
8d1aeede1c
fix: correct typos in GraphQL schema (#6679)
* fix: correct typos in GraphQL schema comments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: rename CircumisedEnum to CircumcisedEnum across codebase

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: gofmt performer model files after enum rename

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 17:48:59 +11:00
hyper440
300e7edb75
fix: support string-based fingerprints in hashes filter (#6654)
* fix: support string-based fingerprints in hashes filter
* Fix tests and add phash test

File fingerprints weren't using correct types. Filter test wasn't using correct types. Add phash to general files.
---------

Co-authored-by: hyper440 <hyper440@users.noreply.github.com>
Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>
2026-03-10 15:07:46 +11:00
WithoutPants
717f968a2c
Add folder criteria to scenes, images and galleries and sidebars (#6636)
* Add useDebouncedState hook
* Add basename to folder sort whitelist
* Add parent_folder criterion to gallery
* Add selection on enter if single result
2026-03-05 08:02:13 +11:00
Gykes
c874bd560e
Fix: Custom Field Filtering (#6614)
* add tests
* Refactor queryBuilder: split args into per-clause fields
2026-02-28 11:05:13 +11:00
Gykes
3b8f6bd94c
update logs and fix UNIQUE constraint failure (#6617) 2026-02-28 09:11:13 +11:00
WithoutPants
d8448ba37e
Add basename and parent_folders fields to Folder graphql interface (#6494)
* Add basename field to folder
* Add parent_folders field to folder
* Add basename column to folder table
* Add basename filter field
* Create missing folder hierarchies during migration
* Treat files/folders in zips where path can't be made relative as not found

Addresses an issue during clean where corrupt folder entries in zip files could not be removed due to an error during the call to Rel.
2026-02-27 10:58:11 +11:00
Gykes
b77abd64e2
FR: Add Missing is-missing Filter Options Across all Object Types (#6565)
Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>
2026-02-26 16:36:54 +11:00
WithoutPants
5734ee43ff
Add sidebar to scene markers list (#6603)
* Add tag markers filter
* Add marker count and markers filter to performer filter
* Add sidebar to marker list
2026-02-26 07:54:40 +11:00
Gykes
0103fe4751
FR: Tags Tagger (#6559)
* Refactor Tagger components
* condense localization
* add alias and description to model and schema
2026-02-25 11:39:14 +11:00
WithoutPants
86abe7b24c
Backend support for image custom fields (#6598)
* Initialise maps in bulk get custom fields to fix graphql validation error
2026-02-24 07:41:40 +11:00
WithoutPants
ca5178f05e
Backend support for Group custom fields (#6596) 2026-02-23 11:53:12 +11:00
WithoutPants
47dcdd439c
Backend support for gallery custom fields (#6592) 2026-02-23 07:39:28 +11:00
WithoutPants
843806247d
Add group scene count filter (#6593) 2026-02-20 09:14:25 +11:00
Gykes
3dc86239d2
Feature Request: Add organized flag to studios (#6303) 2026-02-19 09:05:17 +11:00
WithoutPants
e289199911
Scene custom field backend support (#6584)
* Add custom fields to scenes
* Generalise set custom fields tests to other object types
2026-02-18 16:50:32 +11:00
Gykes
adaadee368
FR: Change Career Length to Career Start and Career End (#6449)
Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>
2026-02-17 13:44:03 +11:00
WithoutPants
bede849fa6
Add sidebar to group list (#6573)
* Add group filter criteria to tag and studio
* Add sidebar to groups list
* Refactor ListOperations to accept buttons
* Move create new button back to navbar

Having the create new button with a plus icon conflicted with the add sub-group button in the sub-groups view.

* Simplify group-sub-groups view
2026-02-16 17:28:41 +11:00
Gykes
d1479ca4e5
Feature: Scene Duplicate Filter (#6344) 2026-02-11 11:52:44 +11:00
WithoutPants
d64b3b711c
Revamp studio list with sidebar (#6549)
* Add studios_filter to TagFilterType
* Convert studio list to use sidebar
2026-02-06 12:37:38 +11:00
WithoutPants
b278525647
Tag custom fields support for backend (#6546)
* Fix custom field import/export for studio
* Update studio unit tests
* Add tag create and update unit tests
* Add custom fields to tag filter graphql
* Add unit tests for tag filtering
* Add filter unit tests for studio
2026-02-06 12:35:05 +11:00
CJ
f629191b28
Future support for filtering tags list by current filter on Performers page (#6091) 2026-02-05 13:35:58 +11:00
WithoutPants
9eda7c2f60
Studio custom fields backend support (#6156) 2026-02-05 09:01:29 +11:00
Gykes
244d70e20e
Feature: Stash ID Count Filter (#6347)
Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>
2026-01-27 17:26:42 +11:00
WeedLordVegeta420
6f5a7d1f0a
Add latest scene sort for performers and studios. (#6501) 2026-01-27 17:24:14 +11:00
WithoutPants
a05500342a
Image phash generation (#6497)
* Add image phash generation
* Add phash image filter
* Add phash to image file info and phash image filtering in ui
* Add options to generate image phash for generate/scan tasks
* Add imageIDs input to generate task
* Add generate option to image menus
* Add ellipses to generate
2026-01-27 17:00:56 +11:00
WithoutPants
5b3785f164
Revert stashIDCriterionHandler changes. Reimplement stashIDsCriterionHandler to not use stringCriterionHandler (#6496) 2026-01-16 10:21:15 +11:00
sashapp
ed3a239366
Implement stash_ids_endpoint for the SceneFilterType (#6401)
* Implement stash_ids_endpoint for the SceneFilterType
* Reduce code duplication by calling the stashIDsCriterionHandler from the stashIDCriterionHandler
* Mark stash_id_endpoint in SceneFilterType, StudioFilterType, and PerformerFilterType as deprecated
2026-01-14 14:53:40 +11:00
Gykes
956af44a29
FR: Sort Scenes and Images by Resolution (#6441) 2026-01-05 16:36:21 +11:00
sezzim
65e82a0cf6
Performer merge (#5910)
* Implement merging of performers
* Make the tag merge UI consistent with other types of merges
* Add merge action in scene menu
---------
Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>
2026-01-05 15:54:19 +11:00
WithoutPants
1691280d1b
Fix excludes handling in performer studio filter (#6413) 2025-12-15 11:49:30 +11:00
WithoutPants
67b1dd8dd0
Add tag stash ids filter criterion (#6403)
* Add stash id filter to tag filter
* Add tag stash id criterion in UI
2025-12-12 08:54:57 +11:00
WithoutPants
7db394bbea
Date precision (#6359)
* Remove month/year only formats from ParseDateStringAsTime
* Add precision field to Date and handle parsing year/month-only dates
* Add date precision columns for date columns
* Adjust UI to account for fuzzy dates
2025-12-08 09:11:40 +11:00