mirror of https://github.com/kemayo/leech synced 2026-05-06 03:21:07 +02:00
Commit graph

414 commits

Author SHA1 Message Date
David Lynch
82cf246593 Update attrs to 26.1.0 2026-04-24 11:09:47 -05:00
David Lynch
16b935ce33 Remove hardcoded lxml in xenforo
Follow-up to 4d9c31b6ac
2026-04-24 11:06:57 -05:00
David Lynch
606f85b2da Update dependencies 2026-04-24 11:06:57 -05:00
David Lynch
335d8cf2e9 Update the gitignore
It's https://github.com/github/gitignore/blob/main/Python.gitignore
2026-04-24 10:54:19 -05:00
David Lynch
5bca057c42 Add DS_Store to gitignore 2026-04-06 14:30:35 -05:00
David Lynch
5506f568e3 Cope with output_dir containing a ~, and warn if it doesn't exist 2026-04-06 00:28:19 -05:00
David Lynch
34ec69af6f Use platformdirs to allow storing centralized config
Search order is: <current directory>, <directory of leech.py>,
<directory from platformdirs>
2026-04-06 00:27:42 -05:00
David Lynch
7ab331a3a4 Store cache in an OS temp directory 2026-04-05 23:38:17 -05:00
David Lynch
1eb877bd63 Don't manually vacuum after clearing the cache
requests-cache added this itself in 0.5.1, so this has been superfluous
for a while.
2026-04-05 23:37:43 -05:00
David Lynch
dec30e9639 Sync up poetry.lock 2026-03-30 23:21:41 -05:00
David Lynch
8216221eb5 Allow overriding the user-agent 2026-03-30 23:12:04 -05:00
David Lynch
9e2f2f4b7a Ignore an output directory 2026-03-30 23:12:04 -05:00
dependabot[bot]
0e229b1a1b Bump requests from 2.32.5 to 2.33.0
Bumps [requests](https://github.com/psf/requests) from 2.32.5 to 2.33.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.5...v2.33.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.33.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-26 12:16:26 -05:00
David Lynch
c1fb9c9d0c Update the readme 2026-03-06 13:51:16 -06:00
David Lynch
715dff041f Update Pillow to 12 2026-03-06 12:20:09 -06:00
David Lynch
a0c8a557f2 Drop python 3.9 2026-03-06 12:14:35 -06:00
David Lynch
4fb69375f4 Update the github action 2026-03-06 12:11:23 -06:00
David Lynch
ffc83e5b6c Switch to linting with ruff 2026-03-06 12:04:29 -06:00
David Lynch
eed66dcc48 Move pyproject.toml to be more standard
In recent years, fewer tool-specific sections have been needed. Poetry
only started supporting dependency-groups in 2.2.0, so the --dev linting
might need manual intervention for people who're not up to date.
2026-03-06 11:58:50 -06:00
David Lynch
14e7d5bf1b Don't break trying to de-cloudflare links without an href 2026-01-04 22:41:46 -06:00
dependabot[bot]
a570580f98 Bump urllib3 from 2.5.0 to 2.6.0
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.6.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-07 09:31:55 -06:00
ClaasJG
a5ca274637 Change idiom 'python3 leech.py' -> 'poetry run leech'
See: https://github.com/kemayo/leech/issues/103
2025-10-29 14:21:19 -05:00
David Lynch
a71ac62f8b Stop xenforo from over-stripping styles 2025-08-04 13:31:57 -05:00
David Lynch
b3489d5016 Add a basic Patreon site definition
Works for getting *all* posts from an author, or (more usefully) getting
all posts within a tag from an author
2025-08-01 21:23:26 -05:00
David Lynch
5f72f23e72 Note to self about royalroad chapter URLs 2025-08-01 19:43:31 -05:00
dependabot[bot]
7a155a7b98 Bump urllib3 from 2.2.3 to 2.5.0
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.2.3 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.2.3...2.5.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.5.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-01 19:43:23 -05:00
David Lynch
2c57830ab9
Merge pull request #101 from kemayo/dependabot/pip/requests-2.32.4
Bump requests from 2.32.3 to 2.32.4
2025-06-11 06:36:23 +03:00
dependabot[bot]
f509c36c75
Bump requests from 2.32.3 to 2.32.4
Bumps [requests](https://github.com/psf/requests) from 2.32.3 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.3...v2.32.4)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.4
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-10 10:11:26 +00:00
David Lynch
6fddf628fb Add a bit more messaging around logging in to sites 2025-06-09 20:06:54 -05:00
David Lynch
5bfd1b40a0 Give image downloading a timeout 2025-03-26 20:55:19 -05:00
David Lynch
5cb887f767 Move image processing into sites
The epub-builder still downloads the image, but all the html-mangling
is done in the extraction process now.

Turns footnotes into a chapter-object, for easier processing later on.
2025-03-22 19:39:16 -05:00
David Lynch
81189f4e1d xenforo: minor fixes around images in spoilers 2025-03-22 00:16:11 -05:00
David Lynch
3c5a4bb75a
Merge pull request #100 from kpedro88/multiple-next-items
Handle multiple entries in next_link
2025-03-18 20:07:16 -05:00
Kevin Pedro
de6913a9af simplify algorithm 2025-03-08 09:48:32 -06:00
Kevin Pedro
d4e1214be3 return to loop-based algorithm 2025-03-08 09:40:42 -06:00
David Lynch
cfd073fb5c Fix an error in _soup if parsed content doesn't have a <head> 2025-03-06 22:33:32 -06:00
Kevin Pedro
b2f15eb76c satisfy linter 2025-03-05 21:03:35 -06:00
Kevin Pedro
280b242a27 stop loop once a new link is found 2025-03-05 20:56:47 -06:00
Kevin Pedro
0066a148bb process all next_link items 2025-03-05 20:56:47 -06:00
David Lynch
5213ec2632 Update dependencies to latest versions, remove html5lib 2025-03-04 23:14:51 -06:00
David Lynch
4d9c31b6ac Make the parser used for BeautifulSoup configurable, still default lxml
Refs #98
2025-03-04 23:14:51 -06:00
David Lynch
9ed2d54db7 Make the _soup method able to cope with being given a html string 2025-03-04 23:14:51 -06:00
Max Isom
53bc2045f0 Use lxml (>40% faster) 2025-03-04 22:23:50 -06:00
Kevin Pedro
52213725c9 update docker recipe for python changes 2025-03-04 22:14:38 -06:00
David Lynch
9a2b574b4b Missed a call to _soup in ao3 2024-12-23 21:02:09 -06:00
David Lynch
3fdbae5851 Pass through some more headers in the session 2024-12-17 16:34:57 -06:00
David Lynch
204807add6 Don't hardcode a story ID into a path before it's needed 2024-12-04 17:20:11 -06:00
David Lynch
bb1fcc0e50 Always process images if they're included in the chapter object 2024-12-04 17:15:09 -06:00
David Lynch
5392593621 Image options in an options-object pattern, like cover options 2024-12-04 17:12:06 -06:00
David Lynch
bedaec9989 Avoid potential image overlaps with nested sections 2024-12-04 16:51:21 -06:00
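
Note on commit 34ec69af6f: its message gives a config search order of current directory, then the directory of leech.py, then a directory from platformdirs. The following is a minimal sketch of that kind of first-match lookup, not the project's actual code. The function name `find_config`, the file name `config.json`, and the app name passed to platformdirs in the comment are all hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_config(filename: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first existing file called `filename`, checking dirs in order.

    Hypothetical helper: earlier directories take priority over later ones,
    and None is returned when no directory contains the file.
    """
    for directory in search_dirs:
        # expanduser() lets a directory such as "~/configs" resolve correctly.
        candidate = Path(directory).expanduser() / filename
        if candidate.is_file():
            return candidate
    return None


# Assumed usage, mirroring the order in the commit message:
#
#   import platformdirs
#   search_dirs = [
#       Path.cwd(),                                   # <current directory>
#       Path(__file__).resolve().parent,              # <directory of the script>
#       Path(platformdirs.user_config_dir("leech")),  # app name is an assumption
#   ]
#   config_path = find_config("config.json", search_dirs)
```

Passing the directory list in as an argument keeps the lookup free of third-party imports, so the ordering logic can be checked without platformdirs installed.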