mirror of https://github.com/kemayo/leech synced 2026-01-02 13:46:17 +01:00
Commit graph

162 commits

Author SHA1 Message Date
Idan Dor
422360de4e Fixed whitespacing for flake8. 2022-11-04 16:10:58 +02:00
Idan Dor
d3e603a028 Added image embedding support for epub
Specifically, added image_selector for arbitrary sites that allows
selecting img tags from chapters, downloading them
and embedding them within the resulting epub.

In the case of Pale, this means that the character banners and
extra materials do not require an internet connection to view.

Also made the two pale.json's more consistent (pale.json now correctly
includes the title of the chapters).
2022-11-04 16:04:18 +02:00
David Lynch
d81eefa7f3 AO3: use new form helper so this shouldn't break again if fields change 2022-05-13 11:04:25 -05:00
David Lynch
f57db3e1a8 Helper for extracting form data from a soup 2022-05-13 11:04:05 -05:00
David Lynch
e9f704716a Xenforo: change some of the style-removal
It was causing some formatting issues, particularly on Worm fics which
did forum-style sections. (Also, indented text done via margin-left on
divs, which entirely removed the div and ran lines together.)
2022-04-27 11:07:16 -05:00
David Lynch
56bc2b941c AO3: utf8 field no longer in login form 2022-04-16 18:26:26 -05:00
David Lynch
08abe54e79 Switch out use of :=, forgot I wasn't requiring 3.8 yet 2022-03-06 10:46:13 -06:00
David Lynch
172877410b Xenforo: if fetching a specific threadmark category, add it to the title
Unless it's 1, since that's always "threadmarks" and the main story.

Refs #79
2022-03-06 10:42:39 -06:00
David Lynch
29589a0886 RoyalRoad: don't error when covers are relative URLs
Only happens when the work has no set cover, because it gets a /dist/
URL rather than a CDN URL.

Fixes #77
2022-02-22 12:19:58 -06:00
David Lynch
f204dcd928 Add a class to generated spoiler divs 2022-02-13 11:44:36 -06:00
David Lynch
697e4c0bf9 Royalroad: don't crash on malformed spoiler tags
Fixes #74
2022-02-03 11:08:40 -06:00
David Lynch
dc9c9dbe57 Pull summary and tags for royalroad 2021-11-07 13:16:59 -06:00
David Lynch
4242aa6f63 Strip colors on all sites, not just xenforo 2021-11-07 11:16:26 -06:00
David Lynch
f05bfb51ef AO3: work if www is present in the URL 2021-08-10 17:15:18 -05:00
David Lynch
f1bd28e942 Fanfiction.net: experiment with falling back to the wayback machine 2021-07-19 15:17:39 -05:00
David Lynch
d1caf85883 Extract tags when present
Supported currently on Xenforo and AO3
2021-05-01 16:35:49 -05:00
David Lynch
37cb0332b7 AO3: fix issue that could occur if the work had gaps in chapter numbers 2021-04-05 19:55:46 -05:00
David Lynch
77cc334bcf Merge pull request #60 from ClaasJG/master
Stable seed generation for Sections
2021-03-27 19:16:11 -05:00
ClaasJG
5b39c73904 Add stable Section id based on URL
Remove Chapter id
2021-03-28 00:41:03 +01:00
David Lynch
bf315d06fe Grab the much more-pythonic CF email decode from #37 2021-03-27 11:20:01 -05:00
David Lynch
f25befc237 Decode cloudflare email address protection
Makes a generic _clean function on Site that can be called. Will
probably want to migrate some other generic bits into there after
analysis of what's *really* generic.
2021-03-27 10:46:39 -05:00
David Lynch
dfa298dd3b Better error message for restricted AO3 stories 2021-03-21 23:17:29 -05:00
claasjg
d4f3986515 Detect URL loop with next selector 2021-03-19 14:49:38 +01:00
David Lynch
ce998c84c3 Extract spoilers to footnotes on royalroad 2021-03-07 11:28:49 -06:00
David Lynch
d50f23d07b Special exception for hitting a cloudflare captcha page
Fanfiction.net is currently doing this, so let's at least acknowledge it

Refs #53
2021-02-12 16:02:55 -06:00
David Lynch
28cc1fbcc7 Arbitrary should store contents as a string, not a bs4 Tag
It coincidentally works by being string-like for previous uses, but it's
not string-like enough for the new unicode stuff.

Fixes #54
2021-02-05 19:58:47 -06:00
David Lynch
ae1b77da2f Wattpad: use API instead
Their on-page HTML sometimes uses JS to load parts of the story
2021-01-26 13:11:56 -06:00
David Lynch
23c7a1496c Quick take on wattpad 2021-01-26 01:56:41 -06:00
IdanDor
6d7b5ffcf0 Removed trailing whitespace. 2021-01-23 13:30:03 +02:00
IdanDor
1afac50437 Made arbitrary sites no longer leak memory and fixed worm epub.
Each `Chapter` object held a reference to the entire page tree, so the program's RAM usage grew considerably.

Switched Worm to next_selector so that the chapters are correctly ordered, E.2 is not skipped, and the download does not crash due to the `?share=twitter` URL that was matched before.

Fixed Worm titles.
2021-01-23 12:12:48 +02:00
David Lynch
c208e33752 Arbitrary: strip all namespaced elements
This is `fb:like` and similar, which break some epub readers.

Refs: #41, #43
2020-09-08 23:04:47 -05:00
David Lynch
988368bb66 Better xenforo blockquote chrome removal 2020-08-18 13:21:01 -05:00
David Lynch
2103f37cfb AO3: fallback for single-chapter works 2020-05-04 00:31:19 -05:00
David Lynch
6fbdc8843d Make arbitrary site chapter-title selectors more resilient 2020-04-29 17:55:20 -05:00
David Lynch
6631095726 Fiction.live: niche URLs
* occasional stories with "Sci-fi" in the URL instead of "stories"
* rare cases of `-` in the work id

Fixes #31
2019-11-14 14:45:19 -06:00
David Lynch
a856f9d0f8 Fiction.live: account for a weird rare bug/possibility in votes
Also, add a bunch of error handling / logging to the section-parsing to
avoid this in the future.

Fixes #30
2019-11-07 09:34:39 -06:00
David Lynch
f89f5163b5 Fiction.live: Fix choices array check
Fixes #29
2019-11-05 15:02:09 -06:00
David Lynch
4861ffbd7e Fiction.live can have votes for absent choices
Fixes #28.
2019-10-29 08:17:01 -05:00
David Lynch
dc10e4cf17 FFN: less-destructive attribute clearing 2019-10-17 22:29:01 -05:00
David Lynch
7208cfdaaf Minor readability improvement: use f-strings 2019-10-15 11:14:27 -05:00
David Lynch
c584988994 Update dependencies 2019-10-14 00:40:34 -05:00
David Lynch
9d0b5f1d3a Merge pull request #26 from thegrinner/no-vote-fictionlive
Fix FictionLive download failure on missing vote node
2019-10-14 00:07:34 -05:00
David Lynch
d782928e0e Spacebattles is now on XenForo2 2019-10-12 10:51:22 -05:00
thegrinner
4e4f16e7cc Appease flake8 2019-10-03 17:48:45 -04:00
thegrinner
d0402daa7b Add handling for votes that don't have a votes kvp 2019-10-03 17:36:43 -04:00
David Lynch
5e034a7d65 Xenforo let non-first-category threadmarks work
Currently this just requires passing a link to the reader view of a particular
category. In the future I might want to support more variants on this -- a
flag to pull down all the threadmark categories, for instance.
2019-08-06 17:29:53 -05:00
David Lynch
532a7c6682 Fix typo of title_element in arbitrary
Fixes #25
2019-07-30 09:37:03 -05:00
David Lynch
f002064352 Xenforo2 title labels 2019-07-24 23:29:12 -05:00
David Lynch
a148fa8c43 Flake8 errors 2019-07-13 13:17:54 -05:00
David Lynch
3443304ab1 XenForo: handle SV's XenForo2 changes 2019-07-13 11:42:22 -05:00
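
Note on commits f25befc237 and bf315d06fe (Cloudflare email address protection): the following is a minimal sketch of the commonly described decoding scheme, not the code from those commits. It assumes the obfuscated address is a hex string (as found in a `data-cfemail` attribute) whose first byte is an XOR key for the remaining bytes. The function names `decode_cfemail` and `encode_cfemail` are hypothetical and exist only for this illustration.

```python
def decode_cfemail(encoded: str) -> str:
    """Decode a Cloudflare-obfuscated email from its hex payload.

    Assumption: the first byte is the XOR key, the rest is the address.
    """
    data = bytes.fromhex(encoded)
    key, payload = data[0], data[1:]
    return bytes(b ^ key for b in payload).decode("utf-8")


def encode_cfemail(address: str, key: int = 0x5A) -> str:
    """Inverse of decode_cfemail; hypothetical helper to build a sample."""
    return bytes([key] + [b ^ key for b in address.encode("utf-8")]).hex()


# Round-trip a sample address through the sketch.
sample = encode_cfemail("user@example.com")
print(decode_cfemail(sample))  # prints "user@example.com"
```

Because XOR is its own inverse, the same key byte both hides and recovers the address, which is why the decoder stays this short.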