mirror of https://github.com/kemayo/leech synced 2025-12-06 08:22:56 +01:00
Commit graph

193 commits

Author SHA1 Message Date
David Lynch
9ed2d54db7 Make the _soup method able to cope with being given a html string 2025-03-04 23:14:51 -06:00
Max Isom
53bc2045f0 Use lxml (>40% faster) 2025-03-04 22:23:50 -06:00
David Lynch
9a2b574b4b Missed a call to _soup in ao3 2024-12-23 21:02:09 -06:00
David Lynch
acce8138a9 Also pass the base through to the super clean for royalroad 2024-12-02 11:01:34 -06:00
David Lynch
31154ed8d4 Fix a call to _clean for royalroad 2024-12-02 00:00:58 -06:00
David Lynch
ffb8e54e91 Better error for an Arbitrary story that fetches no content 2024-11-23 23:07:16 -06:00
David Lynch
d49d7891c3 Fix some images not having srcset and sizes removed 2024-11-23 22:34:46 -06:00
David Lynch
746ec1b994 Fix image enabling by default
Follow-up to 6ecb1d8942
2024-11-23 22:10:19 -06:00
David Lynch
bf248bbfc8 Remove unused register import in xenforo.py 2024-11-23 21:48:46 -06:00
David Lynch
ef43295c25 AlternateHistory is on XenForo2 now
...this was the last site I had in old XenForo, so I will probably want
to clean that up soon.
2024-11-23 21:41:36 -06:00
David Lynch
0cac7ff945 New spoilers behavior: --spoilers [include/inline/skip]
Fixes #75
2024-11-23 21:39:54 -06:00
David Lynch
a39e1e9f89 Use the newer syntax for attrs 2024-11-23 19:42:35 -06:00
David Lynch
b6310658e8 Command-line flag to enable/disable fetching images 2024-11-23 16:33:01 -06:00
David Lynch
9510a22cb0 Remove arbitrary's special-case image loading, since the default works 2024-11-23 16:33:01 -06:00
David Lynch
21834bb5ed _clean takes a base argument and reformats image srcs into absolute urls 2024-11-23 15:30:57 -06:00
David Lynch
a0a057c48c _soup always returns a base URL 2024-11-23 15:15:29 -06:00
Emmanuel Jemeni
4e9ad1ed7e feat: Leech can now download images in xenforo spoilers. The --include-spoilers tag has to be added for Leech to download images in spoilers. 2024-11-23 13:22:54 -06:00
Idan Dor
1edde92a9d Fixed whitespacing for flake8. 2024-11-23 13:22:53 -06:00
Idan Dor
31f663c6e0 Added image embedding support for epub
Specifically, added image_selector for arbitrary sites that allows
selecting img tags from chapters, downloading them
and embedding them within the resulting epub.

In the case of Pale, this means that the character banners and
extra materials do not require an internet connection to view.

Also made the two pale.json's more consistent (pale.json now correctly
includes the title of the chapters).
2024-11-23 13:22:53 -06:00
David Lynch
7967c59636 Support 2fa for xenforo logins 2024-10-13 00:52:50 -05:00
David Lynch
249221f5d7 Fix questionable questing, which has moved to xenforo2 2024-05-14 22:08:22 -05:00
David Lynch
1f57cd6f07 Basic success-testing on logins 2024-05-14 22:07:05 -05:00
David Lynch
ef9309eb66 Fix xenforo login 2024-05-14 22:06:09 -05:00
David Lynch
cc423f62bb Fix the royalroad stolen-content removal
They added speak:none to the CSS, and I was strictly checking for a rule
that only contained display:none.
2024-02-10 20:05:49 -06:00
David Lynch
64d77b62db Improve cloudflare email decoding
New format for the protected emails, wrapping a span in an a.
2024-01-28 13:26:34 -06:00
David Lynch
d30e56a518 Strip out the new stolen-content warnings on royalroad
They might make these harder to work out in the future, but for now...
2024-01-19 21:34:39 -06:00
David Lynch
6c692968a4 Use isinstance rather than direct type comparison 2023-08-06 17:56:13 -05:00
David Lynch
03e9d3844f Add the-sietch.com to xenforo sites 2023-08-06 17:43:51 -05:00
David Lynch
5ddbb310b3 Let xenforo sites cope with index.php URLs 2023-08-06 17:43:28 -05:00
David Lynch
7230f65a68 Add offset/limit options to royalroad 2023-05-04 09:42:13 -05:00
KeinNiemand
356bae9a7a Don't prettify royalroad soup, Fixes #92 2023-05-04 13:17:28 +02:00
David Lynch
6895a0eb61 AO3 single-chapter story bugs 2023-03-31 23:51:27 -05:00
David Lynch
fe5ca86d87 Royalroad's markup has changed slightly, fix so title and summary work 2023-03-17 16:06:52 -05:00
David Lynch
d81eefa7f3 AO3: use new form helper so this shouldn't break again if fields change 2022-05-13 11:04:25 -05:00
David Lynch
f57db3e1a8 Helper for extracting form data from a soup 2022-05-13 11:04:05 -05:00
David Lynch
e9f704716a Xenforo: change some of the style-removal
It was causing some formatting issues, particularly on Worm fics which
did forum-style sections. (Also, indented text done via margin-left on
divs, which entirely removed the div and ran lines together.)
2022-04-27 11:07:16 -05:00
David Lynch
56bc2b941c AO3: utf8 field no longer in login form 2022-04-16 18:26:26 -05:00
David Lynch
08abe54e79 Switch out use of :=, forgot I wasn't requiring 3.8 yet 2022-03-06 10:46:13 -06:00
David Lynch
172877410b Xenforo: if fetching a specific threadmark category, add it to the title
Unless it's 1, since that's always "threadmarks" and the main story.

Refs #79
2022-03-06 10:42:39 -06:00
David Lynch
29589a0886 RoyalRoad: don't error when covers are relative URLs
Only happens when the work has no set cover, because it gets a /dist/
URL rather than a CDN URL.

Fixes #77
2022-02-22 12:19:58 -06:00
David Lynch
f204dcd928 Add a class to generated spoiler divs 2022-02-13 11:44:36 -06:00
David Lynch
697e4c0bf9 Royalroad: don't crash on malformed spoiler tags
Fixes #74
2022-02-03 11:08:40 -06:00
David Lynch
dc9c9dbe57 Pull summary and tags for royalroad 2021-11-07 13:16:59 -06:00
David Lynch
4242aa6f63 Strip colors on all sites, not just xenforo 2021-11-07 11:16:26 -06:00
David Lynch
f05bfb51ef AO3: work if www is present in the URL 2021-08-10 17:15:18 -05:00
David Lynch
f1bd28e942 Fanfiction.net: experiment with falling back to the wayback machine 2021-07-19 15:17:39 -05:00
David Lynch
d1caf85883 Extract tags when present
Supported currently on Xenforo and AO3
2021-05-01 16:35:49 -05:00
David Lynch
37cb0332b7 AO3: fix issue that could occur if the work had gaps in chapter numbers 2021-04-05 19:55:46 -05:00
David Lynch
77cc334bcf Merge pull request #60 from ClaasJG/master
Stable seed generation for Sections
2021-03-27 19:16:11 -05:00
ClaasJG
5b39c73904 Add stable Section id based on URL
Remove Chapter id
2021-03-28 00:41:03 +01:00
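
Commit 21834bb5ed above describes `_clean` taking a base argument and reformatting image srcs into absolute URLs. The repository's actual implementation is not shown in this log, so the following is only a minimal sketch of that idea using the Python standard library. The function name `absolutize_img_srcs`, the regular expression, and the example URLs are all hypothetical and are not taken from Leech.

```python
import re
from urllib.parse import urljoin

# Hypothetical helper for illustration only; this is NOT Leech's _clean.
# It matches <img ... src="..."> with a double-quoted src attribute.
_IMG_SRC_RE = re.compile(r'(<img\b[^>]*?\ssrc=")([^"]*)(")', re.IGNORECASE)


def absolutize_img_srcs(html: str, base: str) -> str:
    """Resolve every double-quoted <img src> in html against base."""
    def _fix(match: re.Match) -> str:
        # urljoin leaves already-absolute URLs untouched and resolves
        # root-relative or path-relative ones against the base URL.
        return match.group(1) + urljoin(base, match.group(2)) + match.group(3)

    return _IMG_SRC_RE.sub(_fix, html)


if __name__ == "__main__":
    page = "https://example.com/fiction/1/chapter/2"
    print(absolutize_img_srcs('<p><img src="/dist/cover.png"></p>', page))
```

A real cleaner would more likely walk the parsed soup than use a regular expression, since the pattern above ignores single-quoted or unquoted attributes. The sketch only shows the base-URL resolution step.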