David Lynch
6fddf628fb
Add a bit more messaging around logging in to sites
2025-06-09 20:06:54 -05:00
David Lynch
5bfd1b40a0
Give image downloading a timeout
2025-03-26 20:55:19 -05:00
David Lynch
5cb887f767
Move image processing into sites
The epub-builder still downloads the image, but all the html-mangling
is done in the extraction process now.
Turns footnotes into a chapter-object, for easier processing later on.
2025-03-22 19:39:16 -05:00
David Lynch
81189f4e1d
xenforo: minor fixes around images in spoilers
2025-03-22 00:16:11 -05:00
David Lynch
3c5a4bb75a
Merge pull request #100 from kpedro88/multiple-next-items
Handle multiple entries in next_link
2025-03-18 20:07:16 -05:00
Kevin Pedro
de6913a9af
simplify algorithm
2025-03-08 09:48:32 -06:00
Kevin Pedro
d4e1214be3
return to loop-based algorithm
2025-03-08 09:40:42 -06:00
David Lynch
cfd073fb5c
Fix an error in _soup if parsed content doesn't have a <head>
2025-03-06 22:33:32 -06:00
Kevin Pedro
b2f15eb76c
satisfy linter
2025-03-05 21:03:35 -06:00
Kevin Pedro
280b242a27
stop loop once a new link is found
2025-03-05 20:56:47 -06:00
Kevin Pedro
0066a148bb
process all next_link items
2025-03-05 20:56:47 -06:00
David Lynch
5213ec2632
Update dependencies to latest versions, remove html5lib
2025-03-04 23:14:51 -06:00
David Lynch
4d9c31b6ac
Make the parser used for BeautifulSoup configurable, still default lxml
Refs #98
2025-03-04 23:14:51 -06:00
David Lynch
9ed2d54db7
Make the _soup method able to cope with being given a html string
2025-03-04 23:14:51 -06:00
Max Isom
53bc2045f0
Use lxml (>40% faster)
2025-03-04 22:23:50 -06:00
Kevin Pedro
52213725c9
update docker recipe for python changes
2025-03-04 22:14:38 -06:00
David Lynch
9a2b574b4b
Missed a call to _soup in ao3
2024-12-23 21:02:09 -06:00
David Lynch
3fdbae5851
Pass through some more headers in the session
2024-12-17 16:34:57 -06:00
David Lynch
204807add6
Don't hardcode a story ID into a path before it's needed
2024-12-04 17:20:11 -06:00
David Lynch
bb1fcc0e50
Always process images if they're included in the chapter object
2024-12-04 17:15:09 -06:00
David Lynch
5392593621
Image options in an options-object pattern, like cover options
2024-12-04 17:12:06 -06:00
David Lynch
bedaec9989
Avoid potential image overlaps with nested sections
2024-12-04 16:51:21 -06:00
David Lynch
1fe907bec2
Pass image arguments to nested sections
2024-12-04 16:42:47 -06:00
David Lynch
acce8138a9
Also pass the base through to the super clean for royalroad
2024-12-02 11:01:34 -06:00
David Lynch
31154ed8d4
Fix a call to _clean for royalroad
2024-12-02 00:00:58 -06:00
David Lynch
e3c63bce3c
New config option: allow_spaces
Determines whether spaces in filenames will be replaced with underscores
2024-11-30 14:07:40 -06:00
David Lynch
2f21280d76
Adjust option loading so it's easier to override
2024-11-30 14:07:40 -06:00
David Lynch
91d2c4fd4b
Fully cancel if the story extraction fails
2024-11-30 13:52:43 -06:00
David Lynch
ffb8e54e91
Better error for an Arbitrary story that fetches no content
2024-11-23 23:07:16 -06:00
acestronautical
85da618cb2
Fix selectors for the Dungeon Keeper Ami example
2024-11-23 22:57:50 -06:00
David Lynch
7f91f1cc43
Some general readme updates
2024-11-23 22:44:34 -06:00
David Lynch
59923e0f63
Add note about alt="" behavior
2024-11-23 22:35:55 -06:00
David Lynch
6988fc8ccc
Add output mentioning when an image is cached
2024-11-23 22:34:46 -06:00
David Lynch
d49d7891c3
Fix some images not having srcset and sizes removed
2024-11-23 22:34:46 -06:00
David Lynch
746ec1b994
Fix image enabling by default
Follow-up to 6ecb1d8942
2024-11-23 22:10:19 -06:00
David Lynch
bf248bbfc8
Remove unused register import in xenforo.py
2024-11-23 21:48:46 -06:00
David Lynch
ef43295c25
AlternateHistory is on XenForo2 now
...this was the last site I had in old XenForo, so I will probably want
to clean that up soon.
2024-11-23 21:41:36 -06:00
David Lynch
0cac7ff945
New spoilers behavior: --spoilers [include/inline/skip]
Fixes #75
2024-11-23 21:39:54 -06:00
David Lynch
3f6fd401ad
Update the readme with the current python version requirement
2024-11-23 19:43:56 -06:00
David Lynch
a39e1e9f89
Use the newer syntax for attrs
2024-11-23 19:42:35 -06:00
David Lynch
d6d23e4c60
Bump dependency versions and required python version
2024-11-23 17:38:52 -06:00
David Lynch
4f15e0517f
Change how the build a cover test runs
2024-11-23 16:59:25 -06:00
David Lynch
3fbe181b12
In no-images case, replace with alt if present rather than decomposing
Putting a placeholder there for the altless, to avoid confusion.
2024-11-23 16:48:09 -06:00
David Lynch
740a41f4ef
Avoid refetching images that're repeated across chapters
2024-11-23 16:33:01 -06:00
David Lynch
6ecb1d8942
Make downloading images the default behavior
2024-11-23 16:33:01 -06:00
David Lynch
400c5cc801
Configurable whether to always convert images
2024-11-23 16:33:01 -06:00
David Lynch
b6310658e8
Command-line flag to enable/disable fetching images
2024-11-23 16:33:01 -06:00
David Lynch
e2bc6eba1c
Change order of config loading so site-specific overrides of cover/image work
2024-11-23 16:33:01 -06:00
David Lynch
4856649424
Be less verbose when downloading images
2024-11-23 16:33:01 -06:00
David Lynch
9510a22cb0
Remove arbitrary's special-case image loading, since the default works
2024-11-23 16:33:01 -06:00