mirror of https://github.com/kemayo/leech synced 2025-12-06 00:15:22 +01:00
Commit graph

393 commits

Author SHA1 Message Date
ClaasJG
a5ca274637 Change idiom 'python3 leech.py' -> 'poetry run leech'
See: https://github.com/kemayo/leech/issues/103
2025-10-29 14:21:19 -05:00
David Lynch
a71ac62f8b Stop xenforo from over-stripping styles 2025-08-04 13:31:57 -05:00
David Lynch
b3489d5016 Add a basic Patreon site definition
Works for getting *all* posts from an author, or (more usefully) getting
all posts within a tag from an author
2025-08-01 21:23:26 -05:00
David Lynch
5f72f23e72 Note to self about royalroad chapter URLs 2025-08-01 19:43:31 -05:00
dependabot[bot]
7a155a7b98 Bump urllib3 from 2.2.3 to 2.5.0
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.2.3 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.2.3...2.5.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.5.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-01 19:43:23 -05:00
David Lynch
2c57830ab9 Merge pull request #101 from kemayo/dependabot/pip/requests-2.32.4
Bump requests from 2.32.3 to 2.32.4
2025-06-11 06:36:23 +03:00
dependabot[bot]
f509c36c75 Bump requests from 2.32.3 to 2.32.4
Bumps [requests](https://github.com/psf/requests) from 2.32.3 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.3...v2.32.4)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.4
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-10 10:11:26 +00:00
David Lynch
6fddf628fb Add a bit more messaging around logging in to sites 2025-06-09 20:06:54 -05:00
David Lynch
5bfd1b40a0 Give image downloading a timeout 2025-03-26 20:55:19 -05:00
David Lynch
5cb887f767 Move image processing into sites
The epub-builder still downloads the image, but all the html-mangling
is done in the extraction process now.

Turns footnotes into a chapter-object, for easier processing later on.
2025-03-22 19:39:16 -05:00
David Lynch
81189f4e1d xenforo: minor fixes around images in spoilers 2025-03-22 00:16:11 -05:00
David Lynch
3c5a4bb75a Merge pull request #100 from kpedro88/multiple-next-items
Handle multiple entries in next_link
2025-03-18 20:07:16 -05:00
Kevin Pedro
de6913a9af simplify algorithm 2025-03-08 09:48:32 -06:00
Kevin Pedro
d4e1214be3 return to loop-based algorithm 2025-03-08 09:40:42 -06:00
David Lynch
cfd073fb5c Fix an error in _soup if parsed content doesn't have a <head> 2025-03-06 22:33:32 -06:00
Kevin Pedro
b2f15eb76c satisfy linter 2025-03-05 21:03:35 -06:00
Kevin Pedro
280b242a27 stop loop once a new link is found 2025-03-05 20:56:47 -06:00
Kevin Pedro
0066a148bb process all next_link items 2025-03-05 20:56:47 -06:00
David Lynch
5213ec2632 Update dependencies to latest versions, remove html5lib 2025-03-04 23:14:51 -06:00
David Lynch
4d9c31b6ac Make the parser used for BeautifulSoup configurable, still default lxml
Refs #98
2025-03-04 23:14:51 -06:00
David Lynch
9ed2d54db7 Make the _soup method able to cope with being given a html string 2025-03-04 23:14:51 -06:00
Max Isom
53bc2045f0 Use lxml (>40% faster) 2025-03-04 22:23:50 -06:00
Kevin Pedro
52213725c9 update docker recipe for python changes 2025-03-04 22:14:38 -06:00
David Lynch
9a2b574b4b Missed a call to _soup in ao3 2024-12-23 21:02:09 -06:00
David Lynch
3fdbae5851 Pass through some more headers in the session 2024-12-17 16:34:57 -06:00
David Lynch
204807add6 Don't hardcode a story ID into a path before it's needed 2024-12-04 17:20:11 -06:00
David Lynch
bb1fcc0e50 Always process images if they're included in the chapter object 2024-12-04 17:15:09 -06:00
David Lynch
5392593621 Image options in an options-object pattern, like cover options 2024-12-04 17:12:06 -06:00
David Lynch
bedaec9989 Avoid potential image overlaps with nested sections 2024-12-04 16:51:21 -06:00
David Lynch
1fe907bec2 Pass image arguments to nested sections 2024-12-04 16:42:47 -06:00
David Lynch
acce8138a9 Also pass the base through to the super clean for royalroad 2024-12-02 11:01:34 -06:00
David Lynch
31154ed8d4 Fix a call to _clean for royalroad 2024-12-02 00:00:58 -06:00
David Lynch
e3c63bce3c New config option: allow_spaces
Determines whether spaces in filenames will be replaced with underscores
2024-11-30 14:07:40 -06:00
David Lynch
2f21280d76 Adjust option loading so it's easier to override 2024-11-30 14:07:40 -06:00
David Lynch
91d2c4fd4b Fully cancel if the story extraction fails 2024-11-30 13:52:43 -06:00
David Lynch
ffb8e54e91 Better error for an Arbitrary story that fetches no content 2024-11-23 23:07:16 -06:00
acestronautical
85da618cb2 Fix selectors for the Dungeon Keeper Ami example 2024-11-23 22:57:50 -06:00
David Lynch
7f91f1cc43 Some general readme updates 2024-11-23 22:44:34 -06:00
David Lynch
59923e0f63 Add note about alt="" behavior 2024-11-23 22:35:55 -06:00
David Lynch
6988fc8ccc Add output mentioning when an image is cached 2024-11-23 22:34:46 -06:00
David Lynch
d49d7891c3 Fix some images not having srcset and sizes removed 2024-11-23 22:34:46 -06:00
David Lynch
746ec1b994 Fix image enabling by default
Follow-up to 6ecb1d8942
2024-11-23 22:10:19 -06:00
David Lynch
bf248bbfc8 Remove unused register import in xenforo.py 2024-11-23 21:48:46 -06:00
David Lynch
ef43295c25 AlternateHistory is on XenForo2 now
...this was the last site I had in old XenForo, so I will probably want
to clean that up soon.
2024-11-23 21:41:36 -06:00
David Lynch
0cac7ff945 New spoilers behavior: --spoilers [include/inline/skip]
Fixes #75
2024-11-23 21:39:54 -06:00
David Lynch
3f6fd401ad Update the readme with the current python version requirement 2024-11-23 19:43:56 -06:00
David Lynch
a39e1e9f89 Use the newer syntax for attrs 2024-11-23 19:42:35 -06:00
David Lynch
d6d23e4c60 Bump dependency versions and required python version 2024-11-23 17:38:52 -06:00
David Lynch
4f15e0517f Change how the build-a-cover test runs 2024-11-23 16:59:25 -06:00
David Lynch
3fbe181b12 In no-images case, replace with alt if present rather than decomposing
Putting a placeholder there for the altless, to avoid confusion.
2024-11-23 16:48:09 -06:00