Backup_Repos/stash

mirror of https://github.com/stashapp/stash.git synced 2025-12-10 10:22:18 +01:00

Author	SHA1	Message	Date
SmallCoccinelle	e513b6ffa5	Cache and reuse the scraper HTTP client (#1855 ) * Add Cookies directly to the request Rather than maintaining a cookie jar on a one-shot HTTP client, maintain the jar ourselves: make a new jar, then use it to select the right cookies. The cookies are set on the request rather than on the client. This will retain the current behavior as we are always throwing the client away after each use. This patch enables the lifting of the http client as well over time. * Introduce a cached scraper HTTP client The scraper cache is augmented with an http.Client. These are safe for concurrent use, so the pointer can safely be passed around. Push this into scraper configurations where applicable, next to the txnManagers. When we issue a loadUrl request, do so on the cached http.Client, which will reuse existing idle connections in the client if any are present. * Set MaxIdleConnsPerHost. Closes #1850 We allow for up to 8 idle connections to a single host. This should make concurrent operation toward the same host reuse connections, even for sizeable concurrency. The number isn't bumped excessively high. We should probably limit concurrency toward a single site anyway, since we'll be able to overrun a site with queries quite easily if we have many concurrent goroutines issuing requests at the same time. * Reinstate driverOptions / useCDP check Use DeMorgan's laws to invert the logic and exit early. Fixes tests breaking. * Documentation fixup. * Use the scraper http.Client when fetching images Fold image fetchers onto the cached scraper http.Client as well. This makes the scraper have a single http.Client cache for all its operations. Thread the client upwards to the relevant attachment points: either the cache, or a stash_box instance, which is extended to include a pointer to the client. Style roughly follows that of txnManagers. 
* Use the same http Client as the GraphQL client uses Rather than using http.DefaultClient, use the same client as the GraphQL client uses in the stash_box subsystem. This localizes the client used in the subsystem into the constructing New.. call. * Hoist HTTP client construction Create a function for initializing the HTTP Client we use. While here hoist magic numbers into constants. Introduce a proper static redirect error and use it in the client code as well. * Reinstate printCookies This is a debugging function, and it might still come in handy in the future at some point. * Nitpick comment. * Minor tidy Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>	2021-10-20 16:12:24 +11:00
SmallCoccinelle	e14bb8432c	Enable gocritic (#1848 ) * Don't capitalize local variables ValidCodecs -> validCodecs * Capitalize deprecation markers A deprecated marker should be capitalized. * Use re.MustCompile for static regexes If the regex fails to compile, it's a programmer error, and should be treated as such. The regex is entirely static. * Simplify else-if constructions Rewrite else { if cond {}} to else if cond {} * Use a switch statement to analyze formats Break an if-else chain. While here, simplify code flow. Also introduce a proper static error for unsupported image formats, paving the way for being able to check against the error. * Rewrite ifElse chains into switch statements The "Effective Go" https://golang.org/doc/effective_go#switch document mentions it is more idiomatic to write if-else chains as switches when it is possible. Find all the plain rewrite occurrences in the code base and rewrite. In some cases, the if-else chains are replaced by a switch scrutinizer. That is, the code sequence if x == 1 { .. } else if x == 2 { .. } else if x == 3 { ... } can be rewritten into switch x { case 1: .. case 2: .. case 3: .. } which is clearer for the compiler: it can decide if the switch is better served by a jump-table than a branch-chain. * Rewrite switches, introduce static errors Introduce two new static errors: * `ErrNotImplmented` * `ErrNotSupported` And use these rather than forming new generative errors whenever the code is called. Code can now test on the errors (since they are static and the pointers to them won't change). Also rewrite ifElse chains into switches in this part of the code base. * Introduce a StashBoxError in configuration Since all stashbox errors are the same, treat them as such in the code base. While here, rewrite an ifElse chain. In the future, it might be beneficial to refactor configuration errors into one error which can handle missing fields, which context the error occurs in and so on. 
But for now, try to get an overview of the error categories by hoisting them into static errors. * Get rid of an else-block in transaction handling If we successfully `recover()`, we then always `panic()`. This means the rest of the code is not reachable, so we can avoid having an else-block here. It also solves an ifElse-chain style check in the code base. * Use strings.ReplaceAll Rewrite strings.Replace(s, o, n, -1) into strings.ReplaceAll(s, o, n) To make it consistent and clear that we are doing an all-replace in the string rather than replacing parts of it. It's more of a nitpick since there are no implementation differences: the stdlib implementation is just to supply -1. * Rewrite via gocritic's assignOp Statements of the form x = x + e are rewritten into x += e where applicable. * Formatting * Review comments handled Stash-box is a proper noun. Rewrite a switch into an if-chain which returns on the first error encountered. * Use context.TODO() over context.Background() Patch in the same vein as everything else: use the TODO() marker so we can search for it later and link it into the context tree/tentacle once it reaches down to this level in the code base. * Tell the linter to ignore a section in manager_tasks.go The section is less readable, so mark it with a nolint for now. Because the rewrite enables an ifElseChain, also mark that as nolint for now. * Use strings.ReplaceAll over strings.Replace * Apply an ifElse rewrite else { if .. { .. } } rewrite into else if { .. } * Use switch-statements over ifElseChains Rewrite chains of if-else into switch statements. Where applicable, add an early nil-guard to simplify case analysis. Also, in ScanTask's Start(..), invert the logic to outdent the whole block, and help the reader: if it's not a scene, the function flow is now far more local to the top of the function, and it's clear that the rest of the function has to do with scene management. * Enable gocritic on the code base. 
Disable appendAssign for now since we aren't passing that check yet. * Document the nolint additions * Document StashBoxBatchPerformerTagInput	2021-10-18 14:12:40 +11:00
SmallCoccinelle	655d3ae969	Toward better context handling (#1835 ) * Use the request context The code uses context.Background() in a flow where there is an http.Request. Use the request's context instead. * Use a true context in the plugin example Let AddTag/RemoveTag take a context and use that context throughout the example. * Avoid the use of context.Background Prefer context.TODO over context.Background deep in the call chain. This marks the site as something which we need to context-handle later, and also makes it clear to the reader that the context is sort-of temporary in the code base. While here, be consistent in handling the `act` variable in each branch of the if .. { .. } .. check. * Prefer context.TODO over context.Background For the different scraping operations here, there is a context higher up the call chain, which we ought to use. Mark the call-sites as TODO for now, so we can come back later on a sweep of which parts can be context-lifted. * Thread context upwards Initialization requires context for transactions. Thread the context upward the call chain. At the initialization call, add a context.TODO since we can't break this yet. The singleton assumption prevents us from pulling it up into main for now. * make tasks context-aware Change the task interface to understand contexts. Pass the context down in some of the branches where it is needed. * Make QueryStashBoxScene context-aware This call naturally sits inside the request-context. Use it. * Introduce a context in the JS plugin code This allows us to use a context for HTTP calls inside the system. Mark the context with a TODO at top level for now. * Nitpick error formatting Use %v rather than %s for error interfaces. Do not begin an error string with a capital letter. * Avoid the use of http.Get in FFMPEG download chain Since http.Get has no context, it isn't possible to break out or have policy induced. The call will block until the GET completes. 
Rewrite to use an http Request and provide a context. Thread the context through the call chain for now. Provide context.TODO() at the top level of the initialization chain. * Make getRemoteCDPWSAddress aware of contexts Eliminate a call to http.Get and replace it with a context-aware variant. Push the context upwards in the call chain, but plug it before the scraper interface so we don't have to rewrite said interface yet. Plugged with context.TODO() * Scraper: make the getImage function context-aware Use a context, and pass it upwards. Plug it with context.TODO() up the chain before the rewrite gets too much out of hand for now. Minor tweaks along the way, remove a call to context.Background() deep in the call chain. * Make NOTIFY request context-aware The call sits inside a Request-handler. So it's natural to use the request's context as the context for the outgoing HTTP request. * Use a context in the url scraper code We are sitting in code which has a context, so utilize it for the request as well. * Use a context when checking versions When we check the version of stash on Github, use a context. Thread the context up to the initialization routine of the HTTP/GraphQL server and plug it with a context.TODO() for now. This paves the way for providing a context to the HTTP server code in a future patch. * Make utils func ReadImage context-aware In almost all of the cases, there is a context in the call chain which is a natural use. This is true for all the GraphQL mutations. The exception is in task_stash_box_tag, so plug that task with context.TODO() for now. * Make stash-box get context-aware Thread a context through the call chain until we hit the Client API. Plug it with context.TODO() there for now. * Enable the noctx linter The code is now free of any uncontexted HTTP request. This means we pass the noctx linter, and we can enable it in the code base.	2021-10-14 15:32:41 +11:00
SmallCoccinelle	a9e2a590b2	Lint checks phase 2 (#1747 ) * Log 3 unchecked errors Rather than ignore errors, log them at the WARNING log level. The server has been functioning without these, so assume they are not at the ERROR level. * Log errors in concurrency test If we can't initialize the configuration, treat the test as a failure. * Undo the errcheck on configurations for now. * Handle unchecked errors in pkg/manager * Resolve unchecked errors * Handle DLNA/DMS unchecked errors * Handle error checking in concurrency test Generalize config initialization, so we can initialize a configuration without writing it to disk. Use this in the test case, since otherwise the test fails to write. * Handle the remaining unchecked errors * Heed gosimple in update test * Use one-line if-initializer statements While here, fix a wrong variable capture error. * testing.T doesn't support %w use %v instead which is supported. * Remove unused query builder functions The Int/String criterion handler functions are now generalized. Thus, there's no need to keep these functions around anymore. * Mark filterBuilder.addRecursiveWith nolint The function is useful in the future and no other refactors are looking nice. Keep the function around, but tell the linter to ignore it. * Remove utils.Btoi There are no users of this utility function * Return error on scan failure If we fail to scan the row when looking for the unique checksum index, then report the error upwards. * Fix comments on exported functions * Fix typos * Fix startup error	2021-09-23 17:15:50 +10:00
WithoutPants	1a3a2f1f83	Scrape scene by name (#1712 ) * Support scrape scene by name in configs * Initial scene querying * Add to manual	2021-09-14 14:54:53 +10:00
WithoutPants	4625e1f955	Unify scrape refactor (#1630 ) * Unify scraped types * Make name fields optional * Unify single scrape queries * Change UI to use new interfaces * Add multi scrape interfaces * Use images instead of image	2021-09-07 11:54:22 +10:00
bnkai	117e6326db	Expose url for URLReplace in JSON scrapeByURL and scrapeByFragment (#1150 ) * Expose url for URLReplace in JSON scrapeByURL and scrapeByFragment * Apply queryURLReplace to xpath scrapers Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>	2021-03-02 09:19:56 +11:00
WithoutPants	1e04deb3d4	Data layer restructuring (#997 ) * Move query builders to sqlite package * Add transaction system * Wrap model resolvers in transaction * Add error return value for StringSliceToIntSlice * Update/refactor mutation resolvers * Convert query builders * Remove unused join types * Add stash id unit tests * Use WAL journal mode	2021-01-18 12:23:20 +11:00
WithoutPants	109e55a25a	Query url parameters (#878 )	2020-10-22 11:56:04 +11:00
SpedNSFW	147d0067f5	Add gallery scraping (#862 )	2020-10-21 09:24:32 +11:00
WithoutPants	9a84726128	Fix xpath comment element parsing (#759 )	2020-08-23 17:39:15 +10:00
woodgen	4045ddf3e9	Implement scraping movies by URL (#709 ) * api/urlbuilders/movie: Auto format. * graphql+pkg+ui: Implement scraping movies by URL. This patch implements the missing required boilerplate for scraping movies by URL, using performers and scenes as a reference. Although this patch contains a big chunk of groundwork for enabling scraping movies by fragment, the feature would require additional changes to be completely implemented and was not tested. * graphql+pkg+ui: Scrape movie studio. Extends and corrects the movie model for the ability to store and dereference studio IDs with received studio string from the scraper. This was done with Scenes as a reference. For simplicity the duplication of having `ScrapedMovieStudio` and `ScrapedSceneStudio` was kept, which should probably be refactored to be the same type in the model in the future. * ui/movies: Add movie scrape dialog. Adds possibility to update existing movie entries with the URL scraper. For this the MovieScrapeDialog.tsx was implemented with Performers and Scenes as a reference. In addition DurationUtils needs to be called one time for converting seconds from the model to the string that is displayed in the component. This seemed the least intrusive to me as it kept a ScrapeResult<string> type compatible with ScrapedInputGroupRow.	2020-08-10 15:34:15 +10:00
WithoutPants	7158e83b75	Add JSON scrape support (#717 ) * Add support for scene fragment scrape in xpath	2020-08-10 14:21:50 +10:00
bnkai	4373f9bf01	Add cdp support for xpath scrapers (#625 ) Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>	2020-08-04 10:42:40 +10:00
WithoutPants	2b9215702e	Refactor xpath scraper code. Add fixed and map (#616 ) * Refactor xpath scraper code * Make post-process a list * Add map post-process action * Add fixed xpath values * Refactor scrapers into cache * Refactor into mapped config * Trim test html	2020-07-21 14:06:25 +10:00
bnkai	56210cf456	Use referer on xpath getImage, apply printHTML to subscraper also (#661 )	2020-07-10 08:42:06 +10:00
bnkai	f8048dc27c	Increase xpath redirects, use cookies (#624 )	2020-06-22 12:18:02 +10:00
bnkai	9d0522f62d	Add "split" xpath in post-processing , newlines in replace support (#579 )	2020-06-18 10:47:10 +10:00
bnkai	f40e234748	Apply xpath parseDate after subScraper (#606 )	2020-06-15 21:38:59 +10:00
WithoutPants	ec420df871	Add debug logging for xpath scraping (#555 ) * Add debug logging for xpath scraping * Add logging for processing scene members	2020-05-20 22:46:00 +10:00
bnkai	0fc57ce1e0	Fix xpath comments text (#550 )	2020-05-18 12:26:20 +10:00
WithoutPants	abf2b49803	Configurable scraper user agent string (#409 ) * Add debug scrape option. Co-authored-by: HiddenPants255 <>	2020-03-21 08:55:15 +11:00
WithoutPants	34d829338d	Add image scraping support (#370 ) * Add sub-scraper functionality * Add scraping of performer image * Add scene cover image scraping * Port UI changes to v2.5 * Fix v2.5 dialog suggest color * Don't convert eol of UI to support pretty	2020-03-11 11:41:55 +11:00
caustico	5fb8bbf768	Movies Section (#338 ) Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>	2020-03-10 14:28:15 +11:00
WithoutPants	03c07a429d	Add Xpath post processing and performer name query (#333 ) * Extend xpath configuration. Support concatenation * Add parseDate parsing option * Add regex replacements * Add xpath query performer by name * Fix loading spinner on scrape performer * Change ReplaceAll to Replace	2020-01-31 17:17:40 -05:00
WithoutPants	78eb527ec4	Scraper fixes (#332 ) * Fix panic on invalid xpath * Add missing attrs to scraped performer fragment	2020-01-24 22:36:24 -05:00
WithoutPants	7fdaccf669	Xpath scraping from URL (#285 ) * Add xpath performer and scene scraping * Add studio scraping * Refactor code * Fix compile error * Don't overwrite performer URL during a scrape	2020-01-04 11:39:33 -05:00

27 commits