mirror of https://github.com/kemayo/leech synced 2025-12-06 08:22:56 +01:00
Commit graph

53 commits

Author SHA1 Message Date
David Lynch
9ed2d54db7 Make the _soup method able to cope with being given a html string 2025-03-04 23:14:51 -06:00
Max Isom
53bc2045f0 Use lxml (>40% faster) 2025-03-04 22:23:50 -06:00
David Lynch
d49d7891c3 Fix some images not having srcset and sizes removed 2024-11-23 22:34:46 -06:00
David Lynch
746ec1b994 Fix image enabling by default
Follow-up to 6ecb1d8942
2024-11-23 22:10:19 -06:00
David Lynch
0cac7ff945 New spoilers behavior: --spoilers [include/inline/skip]
Fixes #75
2024-11-23 21:39:54 -06:00
David Lynch
a39e1e9f89 Use the newer syntax for attrs 2024-11-23 19:42:35 -06:00
David Lynch
b6310658e8 Command-line flag to enable/disable fetching images 2024-11-23 16:33:01 -06:00
David Lynch
21834bb5ed _clean takes a base argument and reformats image srcs into absolute urls 2024-11-23 15:30:57 -06:00
David Lynch
a0a057c48c _soup always returns a base URL 2024-11-23 15:15:29 -06:00
Idan Dor
1edde92a9d Fixed whitespacing for flake8. 2024-11-23 13:22:53 -06:00
Idan Dor
31f663c6e0 Added image embedding support for epub
Specifically, added image_selector for arbitrary sites that allows
selecting img tags from chapters, downloading them
and embedding them within the resulting epub.

In the case of Pale, this means that the character banners and
extra materials do not require an internet connection to view.

Also made the two pale.json's more consistent (pale.json now correctly
includes the title of the chapters).
2024-11-23 13:22:53 -06:00
David Lynch
64d77b62db Improve cloudflare email decoding
New format for the protected emails, wrapping a span in an a.
2024-01-28 13:26:34 -06:00
David Lynch
f57db3e1a8 Helper for extracting form data from a soup 2022-05-13 11:04:05 -05:00
David Lynch
4242aa6f63 Strip colors on all sites, not just xenforo 2021-11-07 11:16:26 -06:00
David Lynch
f1bd28e942 Fanfiction.net: experiment with falling back to the wayback machine 2021-07-19 15:17:39 -05:00
David Lynch
d1caf85883 Extract tags when present
Supported currently on Xenforo and AO3
2021-05-01 16:35:49 -05:00
David Lynch
77cc334bcf Merge pull request #60 from ClaasJG/master
Stable seed generation for Sections
2021-03-27 19:16:11 -05:00
ClaasJG
5b39c73904 Add stable Section id based on URL
Remove Chapter id
2021-03-28 00:41:03 +01:00
David Lynch
bf315d06fe Grab the much more-pythonic CF email decode from #37 2021-03-27 11:20:01 -05:00
David Lynch
f25befc237 Decode cloudflare email address protection
Makes a generic _clean function on Site that can be called. Will
probably want to migrate some other generic bits into there after
analysis of what's *really* generic.
2021-03-27 10:46:39 -05:00
David Lynch
d50f23d07b Special exception for hitting a cloudflare captcha page
Fanfiction.net is currently doing this, so let's at least acknowledge it

Refs #53
2021-02-12 16:02:55 -06:00
David Lynch
7208cfdaaf Minor readability improvement: use f-strings 2019-10-15 11:14:27 -05:00
David Lynch
c584988994 Update dependencies 2019-10-14 00:40:34 -05:00
David Lynch
2bd5d77715 Helper for URL-joining 2019-05-29 01:55:35 -05:00
David Lynch
40b4856a14 Optimize AO3: use full_work URL 2019-05-25 15:31:39 -05:00
David Lynch
0a81069d24 Slightly more verbose logging of load failures 2018-12-29 20:46:55 -06:00
David Lynch
e78ffdb85b Method to get a site-key for config
Means that things like XenForoIndex and AO3Series don't require separate
config entries.
2018-10-11 15:42:59 -05:00
Alex Raubach
fe76b5427b Add cover_url attribute 2018-09-02 22:08:36 -04:00
Will Oursler
d1842e2bf1 Adds a system for site options to be included as click.options on commands. 2018-04-14 12:56:31 -04:00
Will Oursler
ecebf1de58 Merge branch 'master' into clickify 2018-04-13 17:52:37 -04:00
David Lynch
7d2c1647e2 Safer check on retry-after 2018-02-28 20:54:37 -06:00
David Lynch
6d52c72c99 Use logging instead of print
Fixes #10
2017-11-04 00:09:09 -05:00
David Lynch
43599aceb5 Merge branch 'master' into clickify 2017-11-03 15:21:44 -05:00
David Lynch
f1ac7c8bda Retry failed site-requests 2017-10-31 00:27:54 -05:00
Will Oursler
9b4d2a0998 Adds a more sensible default for options in the Site base class. 2017-10-13 19:43:38 -04:00
Will Oursler
c702337040 Reworks how site-specific options work. 2017-10-13 19:37:13 -04:00
Will Oursler
db48233cf4 Switch from using raw argparser to using click. Preserves the existing
interface, except leech --flush becomes leech flush
2017-10-12 13:00:24 -04:00
Will Oursler
5bd07a5b90 Splits out ebook generation logic into a separate module, in anticipation of maybe supporting multiple output formats. 2017-10-12 09:49:32 -04:00
David Lynch
5b4b9a0dc3 Canonicalize URLs 2017-02-23 15:03:23 -06:00
David Lynch
f066fc663d Use attrs 2017-02-02 23:18:21 -06:00
David Lynch
e6343cb1c9 Stories are now made of nested sections/chapters
This is prep-work for improving epub TOC generation a bit.
2017-01-10 00:23:24 -08:00
David Lynch
24fa9aa22d Use a namedtuple for chapters 2016-09-23 13:11:52 -05:00
David Lynch
574cea3fc8 Make the sites system not require editing __init__.py 2016-09-23 12:51:03 -05:00
David Lynch
86f02812d2 Use requests-cache 2016-08-29 10:59:20 -05:00
David Lynch
d9e65e5b6a Add a little documentation on the extract method 2016-04-04 09:58:47 -05:00
David Lynch
9eb5b270ab Ignore the linting on my sites import 2016-04-04 09:45:45 -05:00
David Lynch
008eb8e63d Support ArchiveOfOurOwn 2016-04-03 21:30:29 -05:00
David Lynch
aa4ba528b7 Let sites define their own custom arguments
Use this to let xenforo force the inclusion of the index-post
2015-12-05 01:34:20 -06:00
David Lynch
c69eb1e33e Footnotes off in their own file 2015-11-30 20:10:58 -06:00
David Lynch
95e25dabd3 First pass at turning spoilers into footnotes for Xenforo
This works as popup-footnotes in iBooks and on Kindle. It'd be a bit
better if I put the footnotes in their own file, so they won't be
dropped at the end of chapters on a Kindle. However, that requires
some flow restructuring, and this is an acceptable proof-of-concept
for now.
2015-11-30 16:46:29 -06:00
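
For reference, the Cloudflare email decoding touched by commits f25befc237, bf315d06fe and 64d77b62db can be sketched roughly as follows. This is a minimal sketch of the commonly documented obfuscation scheme (a hex string whose first byte is an XOR key for the remaining bytes), not the code from this repository; the function name decode_cfemail is hypothetical.

```python
def decode_cfemail(encoded: str) -> str:
    """Decode a Cloudflare-protected email from its hex-encoded form.

    Assumption: `encoded` is the hex string found in a `data-cfemail`
    attribute. The first byte is the XOR key; each following byte is one
    character of the address XORed with that key.
    """
    key = int(encoded[:2], 16)
    return "".join(
        chr(int(encoded[i:i + 2], 16) ^ key)
        for i in range(2, len(encoded), 2)
    )


# Example: key 0x01 followed by the XORed bytes of "a@b.c"
print(decode_cfemail("016041632f62"))
```

In a scraper, the decoded string would typically replace the protected element's contents before the chapter HTML is written out.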