Backup_Repos/stash

mirror of https://github.com/stashapp/stash.git synced 2025-12-06 16:34:02 +01:00

Author	SHA1	Message	Date
WithoutPants	7b5bd80515	Separate graphql API from rest of the system (#2503 ) * Move graphql generated files to api * Refactor identify options * Remove models.StashBoxes * Move ScraperSource to scraper package * Rename field strategy enums * Rename identify.TaskOptions to Options	2022-09-06 07:03:40 +00:00
SmallCoccinelle	401660e6a3	Hoist context, enable errchkjson (#2488 ) * Make the script scraper context-aware Connect the context to the command execution. This means command execution can be aborted if the context is canceled. The context is usually bound to user-interaction, i.e., a scraper operation issued by the user. Hence, it seems correct to abort a command if the user aborts. * Enable errchkjson Some json marshal calls are safe in that they can never fail. This is conditional on the types of the data being encoded. errchkjson finds those calls which are unsafe, and also not checked for errors. Add logging warnings to the place where unsafe encodings might happen. This can help uncover usage bugs early in stash if they are tripped, making debugging easier. While here, keep the checker enabled in the linter to capture future uses of json marshalling. * Pass the context for zip file scanning. * Pass the context in scanning * Pass context, replace context.TODO() Where applicable, pass the context down toward the lower functions in the call stack. Replace uses of context.TODO() with the passed context. This makes the code more context-aware, and you can rely on aborting contexts to clean up subsystems to a far greater extent now. I've left the cases where there is a context in a struct. My gut feeling is that they have solutions that are nice, but they require more deep thinking to unveil how to handle it. * Remove context from task-structs As a rule, contexts are better passed explicitly to functions than they are passed implicitly via structs. In the case of tasks, we already have a valid context in scope when creating the struct, so remove ctx from the struct and use the scoped context instead. With this change it is clear that the scanning functions are under a context, and the task-starting caller has jurisdiction over the context and its lifetime. A reader of the code doesn't have to figure out where the context is coming from anymore. 
While here, connect context.TODO() to the newly scoped context in most of the scan code. * Remove context from autotag struct too * Make more context-passing explicit In all of these cases, there is an applicable context which is close in the call-tree. Hook up to this context. * Simplify context passing in manager The managers context handling generally wants to use an outer context if applicable. However, the code doesn't pass it explicitly, but stores it in a struct. Pull out the context from the struct and use it to explicitly pass it. At a later point in time, we probably want to handle this by handing over the job to a different (program-lifetime) context for background jobs, but this will do for a start.	2022-04-15 11:34:53 +10:00
WithoutPants	0cd9a0a474	Python path setting (#2409 ) * Add python package * Add python path backend config * Add python path to system settings page * Apply python path to script scrapers and plugins	2022-03-24 09:22:41 +11:00
WithoutPants	f69bd8a94f	Restructure go project (#2356 ) * Move main to cmd * Move api to internal * Move logger and manager to internal * Move shell hiding code to separate package * Decouple job from desktop and utils * Decouple session from config * Move static into internal * Decouple config from dlna * Move desktop to internal * Move dlna to internal * Decouple remaining packages from config * Move config into internal * Move jsonschema and paths to models * Make ffmpeg functions private * Move file utility methods into fsutil package * Move symwalk into fsutil * Move single-use util functions into client package * Move slice functions to separate packages * Add env var to suppress windowsgui arg * Move hash functions into separate package * Move identify to internal * Move autotag to internal * Touch UI when generating backend	2022-03-17 11:33:59 +11:00
WithoutPants	9e3d56b22f	Fix identify and script scraper bugs (#2375 ) * Continue identify if source fails * Handle empty result set correctly * Parse null values from scraper script correctly * Omit warning when json selector value missing * Return nil when scraped item not found * Fix graphql validation errors	2022-03-15 09:42:22 +11:00
kermieisinthehouse	0e514183a7	Desktop integration (#2073 ) * Open stash in system tray on Windows/MacOS * Add desktop notifications * MacOS Bundling * Add binary icon Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>	2022-02-03 11:20:34 +11:00
SmallCoccinelle	4089fcf1e2	Scraper refactor middle (#2043 ) * Push scrapeByURL into scrapers Replace ScrapePerfomerByURL, ScrapeMovie..., ... with ScrapeByURL in the scraperActionImpl interface. This allows us to delete a lot of repeated code in the scrapers and replace the central part with a switch on the scraper type. * Fold name scraping into one call Follow up on scraper refactoring. Name scrapers use the same code path. This allows us to restructure some code and kill some functions, adding variance to the name scraping code. It allows us to remove some code repetition as well. * Do not export loop refs. * Simplify fragment scraping Generalize fragment scrapers into ScrapeByFragment. This simplifies fragment code flows into a simpler pathing which should be easier to handle in the future. * Eliminate more context.TODO() In a number of cases, we have a context now. Use the context rather than TODO() for those cases in order to make those operations cancellable. * Pass the context for the stashbox scraper This removes all context.TODO() in the path of the stashbox scraper, and replaces it with the context that's present on each of the paths. * Pass the context into subscrapers Mostly a mechanical update, where we pass in the context for subscraping. This removes the final context.TODO() in the scraper code. * Warn on unknown fields from scripts A common mistake for new script writers is that they return fields not known to stash. For instance the name "description" is used rather than "details". Decode disallowing unknown fields. If this fails, use a tee-reader to fall back to the old behavior, but print a warning for the user in this case. Thus, we retain the old behavior, but print warnings for scripts which fail the more strict unknown-fields detection. * Nil-check before running the postprocessing chain Fixes panics when scraping returns nil values. 
* Lift nil-ness in post-postprocessing If the struct we are trying to post-process is nil, we shouldn't enter the postprocessing flow at all. Pass the struct as a value rather than a pointer, eliminating nil-checks as we go. Use the top-level postProcess call to make the nil-check and then abort there if the object we are looking at is nil. * Allow conversion routines to handle values If we have a non-pointer type in the interface, we should also convert those into ScrapedContent. Otherwise we get errors on deprecated functions.	2021-11-26 11:20:06 +11:00
SmallCoccinelle	c1f89611e2	Refactor scraper top half (#1893 ) * Simplify scraper listing Introduce an enum, scraper.Kind, which explains what we are looking for. Make it possible to match this from a scraper struct. Use the enum to rewrite all the listing code to use the same code path. * Use a map, nitpick ScrapePerformerList Let the cache store a map from ID of a scraper to the scraper. This improves lookups when there are many scrapers, making it practically O(1) rather than O(n). If many scrapers are stored, this is faster. Since range expressions work unchanged, we don't have to change much, and things will still work. make Kind a Stringer Rename ScraperPerformerList -> ScraperPerformerQuery since that name is used in the other scrapers, and we value consistency. Tune ScraperPerformerQuery: * Return static errors * Use the new functionality * When loading scrapers, do so directly Rather than first walking the directory structure to obtain file paths, fold the load directly in the the filepath walk. This makes the code for more direct. * Use static ErrNotFound If a scraper isn't found, return one static error. This paves the way for eventually doing our own error-presenter in gqlgen. * Store the cache in the Resolver state Putting the scraperCache directly in the resolver avoids the need to call manager.GetInstance() all over the place to get access to the scraper cache. The cache is stored by pointer, so it should be safe, since the cache will just update its internal state rather than being overwritten. We can now utilize the resolver state to grab the cache where needed. While here, pass context.Context from the resolver down into a function, which removes a context.TODO() * Introduce ScrapedContent Create a union in the GraphQL schema for all scraped content. This simplifies the internal implementation because we get variance on the output content type. Introduce a new type ScrapedContentType which signifies the scraped content you want as a caller. 
Use these to generalize the List interface and the URL scraping interface. * Simplify the scraper API Introduce a new interface for scraping. This interface is then used in the upper half of the scraper code, to make the code use one code flow rather than multiple code flows. Variance is currently at the old scraper structure. Add extending interfaces for the different ways of invoking scrapes. Use interface conversions to convert a scraper from the cache to a scraper supporting the extra methods. The return path returns models.ScrapedContent. Write a general postProcess function in the scraper, handling all ScrapedContent via type switching. This consolidates all postprocessing code flows. Introduce marhsallers in the resolver code for converting ScrapedContent into the underlying concrete types. Use this to plug the existing fields in the Query resolver, so everything still works. * ScrapedContent: add more marshalling functions Handle all marshalling of ScrapedContent through marhsalling functions. Removes some hand-rolled early variants of it, and replaces it with a canonical code flow. * Support loadByName via scraper_s In order to temporarily plug a hole in the current implementation, we use the older implementation as a hook to get the newer implementation to run. Later on, this can serve as a guide for how to implement the lower level bits inside the scrapers themselves. For now, it just enables support. * Plug the remaining scraper functions for now Since we would like to have a scraper which works in between refactors, plug the lower level parts of the scraper for now. It avoids us having to tackle this part just yet. * Move postprocessing to its own file There's enough postprocessing to clutter the main scrapers.go file. Move all of this into a new file, postprocessing to make the API simpler. It now lives in scrapers.go. * Scraper: Invoke API consistency scraper.Cache.ScrapeByName -> ScrapeName * Fix scraping scenes by URL Simple typo. 
While here, also make a single marshaller nil-aware. * Introduce scraper groups, consolidate loadByURL Rename `scraper_s` into `group`. A group is a group of scrapers with the same identity. This corresponds to a single YAML file for a scraper configuration. It defines a group which supports different types of scraping contexts. Move config into the group, and lift txnManager and globalConfig to the group. Because we now return models.ScrapedContent we can use interfaces to get variance from the different underlying scrapers. Use a type switch for the URL matcher candidates. And then again for the scrapers. This consolidates all URL scraping paths into one. While here, remove the urlMatcher interface which isn't needed. Also clean up the remaining interfaces for url scraping and delete code which has no purpose anymore. * Consolidate fragment scraping in one code path While here, abide the linters checks. * Refactor loadByFragment Give it the same treatment as loadByURL: Step 1: find a scraperActionImpl which works for the data. Step 2: use that to scrape Most of this is simple analysis on the data at hand. It can be pushed down further in a later commit, but for now we leave it here. * Remove configScraper, autotag is a scraper Remove the remains of the configScraper struct. It now lives on in the group struct. Kill the remaining interfaces from the old implementation while here. Remove group.specification since it can now be handled by a simple func call to spec(). Work through the autotag scraper. It now implements the scraper interface, so it can be used as a scraper. This also simplifies the autotag scraper quite a bit since it doens't have to implement a number of unsupported func calls. * Simplify the fragment scraper flow * Pass the context Eliminate a round of context.TODO() in the scraper code by passing the calling context down into the subsystem. 
This will gracefully allow for termination of remote calls if the client goes away for some reason in GraphQL requests. * Improve listScrapers in the schema Support lists of types we accept. * Be graceful on nil values in conversion Supporting nil-values make the API more robust in the case of partial results in a multi-scrape situation. * Improve listScrapers: output at-most-once Use the ID of a scraper to reduce the output set. If a scraper has been included, don't include it again. * Consolidate all API level errors into resolver.go * Reorder files and functions: scrapers.go -> cache.go: It almost contains nothing but the cache code. Move errors into scraper.go from here because It is a better place to have them living right now group.go: All of the group structure. This can now go from scraper.go, making it more lean. Move group create from config_scraper to here. config.go: Move the `(c config) spec()` call to here. config_scraper.go: Empty file by now * Name-update the scraper interfaces Use 'via' rather than 'loadBy'. The scrape happens via a given scrape method, so I think this is a nice name for it. * Rename scrapers for consistency. While here, improve the error formatting, so different errors come back differently. * Nuke the freeones field from the GraphQL schema * Fix autotag interfacing, refactor The autotag scraper uses a pointer receiver, but the rest of the code we use for scraping doesn't expect a pointer-receiver. Hence, to fix the autotag scraper, we change it to be a value receiver, like the rest of the code. Fix: viaScene, and viaGallery. While here, remove a couple of pointer-receiver methods which can be trivially rewritten into plain functions. * Protect against pointer interfaces The underlying code can be a bit inconsistent in what it returns. Introduce pointer-types in the postprocessing layer and handle them accordingly for now. Once a better understanding of the lower levels are understood, we can lift this. 
* Move ErrConversion into the models package. The conversion error pertains to the logic of converting models. Because of this, it should move there, so it is centralized. * Be consistent in scraper resolver error handling If we have a static error Err = errors.New(..) Then use it wrapped at the start: fmt.Errorf("%w: ...context...", Err) This reads better. While here, avoid using the underlying Atoi errors: they are verbose, and like 99% of the time, the user know what is wrong from the input string, so just give that back. Also, remove the scraper id from the error contexts: it is implicit, and the error wouldn't change if we used a different scraper, which the error message would imply. * Mark the listScrapers() API as deprecated The same functionality is now present in listScrapers. Improve error formatting Think about how each error is going to be used and tweak them to be nicer. * Return a sorted list of scrapers This helps testing, it's closer to what we had, caches like stable data, and it is easier for humans. It also makes the output stable, because map iteration is randomized. * Fix listScrapers calls to return in ID-order Since we need the ordering to be by ID in all situations, it is easier to just generalize the cache listScrapers call to support multiple scraper types. This avoids a de-dupe map up the chain, since every scraper is only considered once. Sorting now happens in the cache listScrapers call. Use this generalized function in all resolvers, which are now simple passthroughs. * Remove UpdateConfig from the scraper cache. This isn't needed, so get rid of it. * Pull a context into identify Scraping scenes in the identify tasks now use a context from up the call chain. * Do not store the scraper cache in the resolver. Scraper caches are updated through manager.singleton•RefreshScraperCache, so we can't keep a pointer to it in the resolver. Instead, solve this by adding a fetcher method to the resolver type. 
This keeps it local to the resolver, while handling the problem of updating caches in the configuration.	2021-11-19 10:55:34 +11:00
SmallCoccinelle	a9e2a590b2	Lint checks phase 2 (#1747 ) * Log 3 unchecked errors Rather than ignore errors, log them at the WARNING log level. The server has been functioning without these, so assume they are not at the ERROR level. * Log errors in concurrency test If we can't initialize the configuration, treat the test as a failure. * Undo the errcheck on configurations for now. * Handle unchecked errors in pkg/manager * Resolve unchecked errors * Handle DLNA/DMS unchecked errors * Handle error checking in concurrency test Generalize config initialization, so we can initialize a configuration without writing it to disk. Use this in the test case, since otherwise the test fails to write. * Handle the remaining unchecked errors * Heed gosimple in update test * Use one-line if-initializer statements While here, fix a wrong variable capture error. * testing.T doesn't support %w use %v instead which is supported. * Remove unused query builder functions The Int/String criterion handler functions are now generalized. Thus, there's no need to keep these functions around anymore. * Mark filterBuilder.addRecursiveWith nolint The function is useful in the future and no other refactors are looking nice. Keep the function around, but tell the linter to ignore it. * Remove utils.Btoi There are no users of this utility function * Return error on scan failure If we fail to scan the row when looking for the unique checksum index, then report the error upwards. * Fix comments on exported functions * Fix typos * Fix startup error	2021-09-23 17:15:50 +10:00
gitgiggety	b83ce29ac4	Scraper log improvements (#1741 ) * Fix logs from scraper and plugins not being shown in UI Using `logger.` in the logger package to write logs is "incorrect". This is because the package contains a variable named `logger` which contains the logrus instance. So instead of the log line being handled by the custom log implementation / wrapper which makes sure the lines are shown in the UI as well, it's written to logrus directly meaning the wrapper is skipped. This "issue" is obviously triggered because in any other place `logger.X` can be used and it will use the custom logger package / wrapper which works fine. * Add plugin / scraper name to logging output Indicate which plugin / scraper wrote a log message by including its name in the `[Scrape]` prefix. * Add missing addLogItem call	2021-09-19 10:06:34 +10:00
stg-annon	d29699fa30	Support scraper logging to specific log levels (#1648 ) * init scraper log levels * Refactor plugin logging Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>	2021-09-17 09:09:44 +10:00
WithoutPants	1a3a2f1f83	Scrape scene by name (#1712 ) * Support scrape scene by name in configs * Initial scene querying * Add to manual	2021-09-14 14:54:53 +10:00
WithoutPants	4625e1f955	Unify scrape refactor (#1630 ) * Unify scraped types * Make name fields optional * Unify single scrape queries * Change UI to use new interfaces * Add multi scrape interfaces * Use images instead of image	2021-09-07 11:54:22 +10:00
SpedNSFW	bde5d07afb	Find correct python executable (#1156 ) * find correct python executable For script scrapers using python, both python and python3 are valid depending on the OS and running environment. To save users from having any issues, this change will find the correct executable for them. Co-authored-by: bnkai <bnkai@users.noreply.github.com>	2021-03-03 08:01:01 +11:00
bnkai	984a0c9247	Tweak scraper script error printing (#1107 )	2021-02-09 19:07:53 +11:00
SpedNSFW	714ae541d4	fix json unmarshal error return (#1109 )	2021-02-09 19:04:42 +11:00
SpedNSFW	147d0067f5	Add gallery scraping (#862 )	2020-10-21 09:24:32 +11:00
woodgen	4045ddf3e9	Implement scraping movies by URL (#709 ) * api/urlbuilders/movie: Auto format. * graphql+pkg+ui: Implement scraping movies by URL. This patch implements the missing required boilerplate for scraping movies by URL, using performers and scenes as a reference. Although this patch contains a big chunk of ground work for enabling scraping movies by fragment, the feature would require additional changes to be completely implemented and was not tested. * graphql+pkg+ui: Scrape movie studio. Extends and corrects the movie model for the ability to store and dereference studio IDs with received studio string from the scraper. This was done with Scenes as a reference. For simplicity the duplication of having `ScrapedMovieStudio` and `ScrapedSceneStudio` was kept, which should probably be refactored to be the same type in the model in the future. * ui/movies: Add movie scrape dialog. Adds possibility to update existing movie entries with the URL scraper. For this the MovieScrapeDialog.tsx was implemented with Performers and Scenes as a reference. In addition DurationUtils needs to be called one time for converting seconds from the model to the string that is displayed in the component. This seemed the least intrusive to me as it kept a ScrapeResult<string> type compatible with ScrapedInputGroupRow.	2020-08-10 15:34:15 +10:00
WithoutPants	2b9215702e	Refactor xpath scraper code. Add fixed and map (#616 ) * Refactor xpath scraper code * Make post-process a list * Add map post-process action * Add fixed xpath values * Refactor scrapers into cache * Refactor into mapped config * Trim test html	2020-07-21 14:06:25 +10:00
WithoutPants	92837fe1f7	Add scene metadata scraping functionality (#236 ) * Add scene scraping functionality * Adapt to changed scraper config	2019-12-15 20:35:34 -05:00
WithoutPants	50784025f2	Change scraper config to yaml (#256 )	2019-12-12 14:27:44 -05:00

21 commits