Compare commits


3693 commits
v2.8.0...main

Author SHA1 Message Date
Jim Miller
a172a7bd2b Bump Test Version 4.57.7 2026-05-07 13:54:08 -05:00
Jim Miller
ab103dce6e browsercache_sqldb: Better share_open and read-only. #1341 2026-05-07 13:54:02 -05:00
Jim Miller
892e9207f0 Bump Test Version 4.57.6 2026-05-06 19:53:58 -05:00
Jim Miller
b4e392fae1 browsercache_sqldb: Use share_open for windows file locking. #1341 2026-05-06 19:53:44 -05:00
Jim Miller
d9525d9726 Bump Test Version 4.57.5 2026-05-06 13:22:28 -05:00
Jim Miller
cb77b12754 Adding browsercache_sqldb for Yet Another caching scheme in Chrome. #1341 2026-05-06 13:22:22 -05:00
Jim Miller
b41a633821 Bump Test Version 4.57.4 2026-05-05 08:11:07 -05:00
Jim Miller
50c8db2992 browsercache_simple: Tweak index file size check. #1341 2026-05-05 08:10:59 -05:00
Jim Miller
ef6dd99bfe Bump Test Version 4.57.3 2026-05-04 15:05:25 -05:00
Jim Miller
59796ff537 Add debug out to Browser Cache cache dir checking #1341 2026-05-04 15:05:13 -05:00
Jim Miller
8ee0a6e898 Bump Test Version 4.57.2 2026-05-03 09:06:51 -05:00
Jim Miller
c53fc362bd Include genre/category in defaults.ini when include_in_X for extragenres/extracategories 2026-05-03 09:06:44 -05:00
Jim Miller
c87cfc1057 adapter_fanficauthorsnet: Domains changed from .nsns to -nsns 2026-05-01 10:10:37 -05:00
Jim Miller
6ee151c90a Bump Release Version 4.57.0 2026-05-01 09:38:27 -05:00
Jim Miller
db01c828a0 Update translations. 2026-05-01 09:37:13 -05:00
Jim Miller
4d03874f06 Fix a bad comment-out 2026-04-29 15:42:59 -05:00
Jim Miller
36f56483e6 Bump Test Version 4.56.10 2026-04-29 13:01:28 -05:00
Jim Miller
18e45a403b PI Anthology: Reuse epub cover if there is one. 2026-04-29 13:01:22 -05:00
Jim Miller
2e25172ba3 adapter_scribblehubcom: Update ajax call for chapters data. Didn't fix #1339 but change noted 3+ years ago 2026-04-29 10:15:26 -05:00
Jim Miller
65e3fd562b Update translations. 2026-04-27 16:53:06 -05:00
Jim Miller
7089bf6689 Bump Test Version 4.56.9 2026-04-21 15:02:05 -05:00
Jim Miller
061dc1333f PI: Correct Series field url link when setanthologyseries 2026-04-21 15:01:58 -05:00
Jim Miller
0a7fb5c090 Bump Test Version 4.56.8 2026-04-19 14:08:29 -05:00
Jim Miller
cf02f729ae adapter_literotica: Fix for numeric tag value from json. #1336 2026-04-19 14:08:21 -05:00
Jim Miller
730c4f77f9 Bump Test Version 4.56.7 2026-04-19 09:33:07 -05:00
Jim Miller
c02da29cbd Added strings for translation 2026-04-19 09:33:00 -05:00
Jim Miller
b87d796221 PI: Add Fix Series Case setting for #1338 2026-04-19 09:30:15 -05:00
Jim Miller
436370fe5b Done profiling for now 2026-04-19 09:03:10 -05:00
Jim Miller
ac77f31bc2 Move NotGoingToDownload to exceptions.py #1337 2026-04-19 09:02:32 -05:00
Jim Miller
16f2c74e4b Bump Test Version 4.56.6 2026-04-18 13:47:51 -05:00
praschke
af5c2aa0bc adapter_kakuyomujp: site update 2026-04-18 13:47:14 -05:00
Jim Miller
31dec5b62d Bump Test Version 4.56.5 2026-04-18 12:58:56 -05:00
Jim Miller
97d37fcfc1 fix_relative_text_links: Allow hrefs to name anchors as well as id. 2026-04-18 12:58:46 -05:00
Jim Miller
c730aa2f68 Bump Test Version 4.56.4 2026-04-17 10:22:20 -05:00
Jim Miller
4e2e359dee PI Anthologies: Only put status in tags if in include_subject_tags. Closes #1332 2026-04-17 10:22:13 -05:00
Jim Miller
bb96049934 Remove some debug 2026-04-16 14:27:48 -05:00
Jim Miller
84965ef25f Bump Test Version 4.56.3 2026-04-12 21:20:09 -05:00
Jim Miller
348d129a1e adapter_ficwadcom: Detect missing username as well as failed login #1330 2026-04-12 21:05:42 -05:00
Jim Miller
4794e9bc51 Bump Test Version 4.56.2 2026-04-10 21:56:43 -05:00
Jim Miller
d46dc76ae1 Somewhat better consolidated perf profiling 2026-04-10 21:56:43 -05:00
Jim Miller
08bae8d9be Imperfect, but working perf profiling 2026-04-10 16:49:17 -05:00
Jim Miller
405c37aeb5 Remove some dead code. 2026-04-10 16:43:49 -05:00
Jim Miller
270e01c3c7 Cache config values for performance improvement. 2026-04-10 16:24:37 -05:00
Jim Miller
12d57f5950 Bump Test Version 4.56.1 2026-04-06 12:07:14 -05:00
Jim Miller
562b3a4ecd Unnew Perf Improvement w/profiling 2026-04-06 12:07:05 -05:00
Jim Miller
e69045fd98 Bump Release Version 4.56.0 2026-04-02 10:03:42 -05:00
Jim Miller
747bde3394 Update (commented out) profiling code. 2026-04-02 10:02:58 -05:00
Jim Miller
aa00c7ae03 Bump Test Version 4.55.4 2026-03-27 11:54:50 -05:00
Jim Miller
0539f818f3 Add top menu items for Add/Edit Reject URLs. 2026-03-27 11:54:44 -05:00
Jim Miller
41a6f56f44 Remove fanficfare_macmenuhack. 2026-03-27 11:43:53 -05:00
Jim Miller
e3832245e6 Add Reject URLs: Accept story URLs drag/drop & paste like Add Stories by URL 2026-03-27 10:52:30 -05:00
Jim Miller
909b64c83c Remove some image processing debug output 2026-03-27 10:51:29 -05:00
Jim Miller
732f5e2571 Bump Test Version 4.55.3 2026-03-19 13:03:11 -05:00
Jim Miller
d9dd04396e Epub Update: Don't cache cover image with others, trips dedup. 2026-03-19 13:03:03 -05:00
Jim Miller
36e2183d45 Bump Test Version 4.55.2 2026-03-12 15:13:01 -05:00
Jim Miller
040b7205b8 adapter_literotica: Fix for site change (#1318) 2026-03-12 15:11:26 -05:00
Jim Miller
d8ed180eb1 Bump Test Version 4.55.1 2026-03-09 13:04:56 -05:00
Jim Miller
2a6c1e74db Make seriesUrl mutable again. 2026-03-09 13:04:50 -05:00
Jim Miller
b7c8c96153 Put download list at start of BG job too 2026-03-09 13:04:24 -05:00
Jim Miller
a16096592c Bump Release Version 4.55.0 2026-03-01 09:25:11 -06:00
Jim Miller
bb34eecc7c Remove a line of unused code. 2026-02-23 13:08:57 -06:00
Jim Miller
ceed7ef1a8 Bump Test Version 4.54.5 2026-02-10 08:45:34 -06:00
Jim Miller
1d2a887c2d Epub Update: Skip missing chapter, image and css files instead of failing. 2026-02-10 08:45:20 -06:00
Jim Miller
a3f3302312 Plugin only: In Skip mode, don't do initial metadata fetch if already matched in library. #1309 2026-02-10 08:30:02 -06:00
Jim Miller
ecf005b145 Bump Test Version 4.54.4 2026-02-05 16:09:00 -06:00
Jim Miller
3bd074fa2c Additional checks for svg images to reject--Calibre only. Related to #1298 2026-02-05 16:08:54 -06:00
Jim Miller
0fd95daa8e Bump Test Version 4.54.3 2026-02-05 13:46:42 -06:00
Jim Miller
1b57e49d98 Ignore CSS url() when ttf/otf/woff/woff2 font files 2026-02-05 13:46:24 -06:00
Jim Miller
db0d39c9cd Bump Test Version 4.54.2 2026-02-02 13:12:56 -06:00
Jim Miller
cbde66cf41 adapter_fimfictionnet/adapter_royalroadcom: Better handling of cover image size fall back #1306 2026-02-02 13:12:42 -06:00
Jim Miller
17331e9eb3 Bump Test Version 4.54.1 2026-02-01 13:51:23 -06:00
Jim Miller
9b96c151a5 adapter_adultfanfictionorg: Fixes for site changes #1305 2026-02-01 13:51:22 -06:00
Jim Miller
1b65a30798 Making some metadata entries immutable 2026-02-01 13:51:22 -06:00
Jim Miller
c9a47877f7 Allow for language getting changed by replace_metadata not breaking langcode 2026-02-01 09:15:31 -06:00
Jim Miller
bdc77ad0f6 Remove Site: swi.org.ru No DNS for site. 2026-02-01 09:15:31 -06:00
Jim Miller
719971c76c Don't set numChapters--it's done automatically. 2026-02-01 09:15:31 -06:00
Jim Miller
c74dba472a Fixes for mutable metadata entries used in code 2026-02-01 09:15:31 -06:00
Jim Miller
c1fb7f0fc5 Refactor metadata entry and settings name code a bit 2026-02-01 09:15:31 -06:00
Jim Miller
94c932cd2f Bump Release Version 4.54.0 2026-02-01 09:04:34 -06:00
Jim Miller
27fb765c0d Update translations. 2026-02-01 09:04:08 -06:00
Jim Miller
06ce46f64a Bump Test Version 4.53.15 2026-01-30 08:52:46 -06:00
Jim Miller
c04d85fa97 Plugin BG settings: Remove 'old' vs 'new' BG handling verbiage 2026-01-29 13:16:56 -06:00
Jim Miller
b6cdc30db5 Bump Test Version 4.53.14 2026-01-29 11:23:03 -06:00
Jim Miller
9bbb5e8b01 adapter_ficbooknet: Change how replace_text_formatting converts to text. 2026-01-29 11:22:40 -06:00
Jim Miller
18ce6e6fba BrowserCache: Add comment about py2 and gzip.decompress 2026-01-29 11:20:42 -06:00
Jim Miller
507910f5da Don't give format section warnings for fix_excess_space 2026-01-29 09:28:45 -06:00
Jim Miller
ccf7801a89 Bump Test Version 4.53.13 2026-01-27 11:24:25 -06:00
Jim Miller
9a52a10626 adapter_ficbooknet: Add replace_text_formatting option to replace CSS paragraphing with tags, for txt output. 2026-01-27 11:24:15 -06:00
Jim Miller
6963153aac adapter_storiesonlinenet: Site changed, get series number from series page now. 2026-01-27 10:10:52 -06:00
Jim Miller
ee357cd5b4 Bump Test Version 4.53.12 2026-01-24 09:32:26 -06:00
Jim Miller
b84e3d2858 adapter_royalroadcom: Fix login failure reporting #1302 2026-01-24 09:32:09 -06:00
Jim Miller
9377fc6671 Bump Test Version 4.53.11 2026-01-22 13:33:43 -06:00
Jim Miller
aaa0fa613a Image Handling: Fix tidy cover caching when no cover. 2026-01-22 13:33:36 -06:00
Jim Miller
eac5acfbfa Bump Test Version 4.53.10 2026-01-22 12:13:37 -06:00
Jim Miller
8dca1ef343 Image Handling: Remove unused images properly with dedup_img_files 2026-01-22 12:11:45 -06:00
Jim Miller
28e8f61cf8 Image Handling: Tidy cover caching 2026-01-22 11:29:20 -06:00
Jim Miller
78abf476ea Image Handling: Rename dedup'ed images on first pass, too. 2026-01-22 11:20:12 -06:00
Jim Miller
2b1f9446dd Bump Test Version 4.53.9 2026-01-20 10:09:19 -06:00
Jim Miller
9815736b4e Fix dedup_img_files - changes <img longdesc= to deduped URL. 2026-01-20 10:09:19 -06:00
Jim Miller
3f54cce9a1 Don't record longdesc on img fails. 2026-01-20 10:09:19 -06:00
Jim Miller
223138b8e5 Image Handling: Cache fails w/in download (but not between), keep full src URL with failedtodownload marker 2026-01-20 10:09:12 -06:00
Jim Miller
4aa47c8bab Bump Test Version 4.53.8 2026-01-15 18:06:47 -06:00
Jim Miller
a97a85f357 epub update: Read all images for oldimgs after reading chapters to keep longdesc=origurl 2026-01-15 18:03:54 -06:00
Jim Miller
ffc3696d84 Bump Test Version 4.53.7 2026-01-15 15:14:38 -06:00
Jim Miller
86c4e1974b Skip CSS url() handling on empty tags by content instead of stripHTML 2026-01-15 15:14:23 -06:00
Jim Miller
b6fd7c2ca4 Fix additional_images 2026-01-15 13:23:01 -06:00
Jim Miller
326300b40e Correct comment. 2026-01-15 13:22:40 -06:00
Jim Miller
282bafe514 Bump Test Version 4.53.6 2026-01-15 12:20:53 -06:00
Jim Miller
061a8feccf CSS url() processing only when include_images:true 2026-01-15 12:20:46 -06:00
Jim Miller
26c9b6d2ce Bump Test Version 4.53.5 2026-01-15 09:10:13 -06:00
Jim Miller
ed02d61953 epubutils: Load all images, not just referenced. uuid5 will still allow use. 2026-01-15 09:10:07 -06:00
Jim Miller
b58d54b8ea Bump Test Version 4.53.4 2026-01-14 16:53:53 -06:00
Jim Miller
1bc3ffc269 base_xenforo2forum_adapter: Add ytimg.com to default cover_exclusion_regexp 2026-01-14 16:53:46 -06:00
Jim Miller
cbd295f911 Bump Test Version 4.53.3 2026-01-14 13:55:33 -06:00
Jim Miller
35653f533f base_xenforo2forum_adapter: Add link_embedded_media option 2026-01-14 13:55:23 -06:00
Jim Miller
ea7afea8c2 Fix XF sites lists in configurable.py 2026-01-14 13:35:51 -06:00
Jim Miller
384a2fe8b7 CSS url() style attr--don't do when tag is empty. 2026-01-14 13:18:51 -06:00
Jim Miller
b278cac620 Bump Test Version 4.53.2 2026-01-13 16:45:35 -06:00
Jim Miller
e23de49fb5 uuid5 converts to bytes but gets unhappy about getting bytes to start on Calibre? 2026-01-13 16:45:00 -06:00
Jim Miller
f64f041546 Adding CSS url() image inclusion, name all images by uuid5 2026-01-13 14:20:11 -06:00
Jim Miller
1d53c506c9 writer_epub: Pretty print epub meta files 2026-01-13 13:47:56 -06:00
Jim Miller
c8d6ce8004 Add webp as a known image type. 2026-01-13 13:43:57 -06:00
Jim Miller
3f08417c04 writer_epub: Don't dup image ids in content.opf on update with old cover. 2026-01-10 15:16:00 -06:00
Jim Miller
79ebf6a02b Bump Test Version 4.53.1 2026-01-08 10:04:59 -06:00
Jim Miller
41dfb8eab8 base_xenforo2forum_adapter: Fix include_nonauthor_poster: Had left testing conditional 2026-01-08 09:10:40 -06:00
Jim Miller
590b663170 Bump Release Version 4.53.0 2026-01-01 09:18:34 -06:00
Jim Miller
9bb408c8b3 Bump Test Version 4.52.9 2025-12-31 10:01:20 -06:00
Jim Miller
5d6a63a8ca Fix for rare 'false' as INI list corner case 2025-12-31 09:59:53 -06:00
Jim Miller
4078ccfdb1 Bump Test Version 4.52.8 2025-12-29 12:49:57 -06:00
Jim Miller
79c29121c3 writer_epub: Add <spine page-progression-direction=rtl> option as page_progression_direction_rtl 2025-12-29 12:49:40 -06:00
Jim Miller
dea48d9e07 adapter_storiesonlinenet: Improve inject_chapter_title for #1294 2025-12-29 12:25:27 -06:00
Jim Miller
c165196a35 base_xenforo2forum_adapter: Add include_nonauthor_poster option 2025-12-29 12:10:26 -06:00
Jim Miller
c385013db9 adapter_literotica: Remove unused chapter_categories_use_all option, fix other site options for better defaults.ini #1292 2025-12-29 10:48:36 -06:00
Jim Miller
8780aa3105 Bump Test Version 4.52.7 2025-12-26 11:53:04 -06:00
Jim Miller
12c7bfe29c adapter_literotica: Remove unused chapter_categories_use_all option, fix other site options for better defaults.ini #1292 2025-12-26 11:52:51 -06:00
Jim Miller
08d0b8a4e0 Changes for #1292 for normalizing different series URL forms. 2025-12-26 11:45:26 -06:00
Jim Miller
1d401f8dba Bump Test Version 4.52.6 2025-12-20 19:41:22 -06:00
Jim Miller
193bb3ed61 AO3: Site changed 'don't have permission' string 2025-12-20 19:40:54 -06:00
Jim Miller
63fd8cd660 Calc words_added even if not in logpage_entries. 2025-12-14 19:49:45 -06:00
Jim Miller
26a1152390 Bump Test Version 4.52.5 2025-12-11 11:20:26 -06:00
WWeapn
e0907147f7 adapter_literotica: Get series ID from data object (#1290) 2025-12-11 11:20:02 -06:00
Jim Miller
99bba3ff12 Bump Test Version 4.52.4 2025-12-10 09:57:34 -06:00
Jim Miller
3fdb6630fb Remove dup of remove_class_chapter from get_valid_set_options() 2025-12-10 09:57:28 -06:00
dbhmw
0d6b789c9f adapter_literotica: Add chapter descriptions to summary (#1287) 2025-12-10 09:56:15 -06:00
Jim Miller
edaa03ef42 Bump Test Version 4.52.3 2025-12-07 11:06:43 -06:00
Jim Miller
4e17a10792 adapter_literotica: Don't require tags_from_chapters for old eroticatags collection. From #1280 2025-12-07 11:06:37 -06:00
Jim Miller
9fd48e0168 Bump Test Version 4.52.2 2025-12-04 14:04:35 -06:00
Jim Miller
818e990184 adapter_fictionlive: create self.chapter_id_to_api earlier for normalize_chapterurl 2025-12-04 14:04:24 -06:00
Jim Miller
9bb7b54023 Bump Test Version 4.52.1 2025-12-04 09:26:48 -06:00
Jim Miller
af6695e27f adapter_literotica: Fix for one-shot aver_rating #1285 2025-12-04 09:26:32 -06:00
Jim Miller
46293f2d02 Bump Release Version 4.52.0 2025-12-01 08:25:22 -06:00
Jim Miller
7f968ba102 Bump Test Version 4.51.7 2025-11-30 11:02:57 -06:00
Jim Miller
1e5cb9b184 Update translations. 2025-11-30 11:02:30 -06:00
Jim Miller
9627e6e62c Remove site: www.wuxiaworld.xyz - DN parked somewhere questionable for +2 years 2025-11-30 10:58:18 -06:00
Jim Miller
5e644098f9 Remove Site: sinful-dreams.com/whispered/muse - broken for 6+ years even though other two sites on same DN work 2025-11-30 10:37:42 -06:00
Jim Miller
fa3a56d096 adapter_fanfictionsfr: Site SSL requires www now 2025-11-30 10:33:40 -06:00
Jim Miller
ba18216ef8 Bump Test Version 4.51.6 2025-11-28 12:48:32 -06:00
Jim Miller
f207e31b3b Add standard metadata entry marked_new_chapters for epub updated '(new)' chapters count 2025-11-28 12:48:25 -06:00
Jim Miller
0e1ace18e4 Bump Test Version 4.51.5 2025-11-28 09:05:21 -06:00
Jim Miller
b17a632640 adapter_literotica: fix tags_from_chapters for #1283 2025-11-25 10:48:46 -06:00
Jim Miller
485d4631f9 adapter_literotica: Partial fix for #1283, chapters from JSON fetch 2025-11-24 13:20:38 -06:00
Jim Miller
30929bc38e Better handling for no chapters found (#1283) 2025-11-24 12:24:44 -06:00
Jim Miller
ae4311f4dd Bump Test Version 4.51.4 2025-11-19 09:56:07 -06:00
MacaroonRemarkable
3a3c35ea1f Made it possible to use human-readable URLs in addition to api urls for ignore_chapter_url_list 2025-11-19 09:54:57 -06:00
MacaroonRemarkable
19dd89fb4d Fixed missing setting in plugin defaults 2025-11-19 09:54:57 -06:00
MacaroonRemarkable
b247a7465b Added include_appendices config option for fiction.live 2025-11-19 09:54:57 -06:00
albyofdoom
d5c20db681 Implement Alternate Tagging and Date calculation for Literotica 2025-11-19 09:54:40 -06:00
MacaroonRemarkable
a599ff6ad2 Added missing line to plugin-defaults 2025-11-19 09:54:13 -06:00
MacaroonRemarkable
e21c6604a1 Update QQ reader_posts_per_page default 2025-11-19 09:54:13 -06:00
Jim Miller
273c1931f4 Bump Test Version 4.51.3 2025-11-13 08:27:08 -06:00
Jim Miller
fdf29eeade adapter_royalroadcom: New status Inactive 2025-11-13 08:26:54 -06:00
Jim Miller
06e55728d0 Bump Test Version 4.51.2 2025-11-11 20:09:20 -06:00
Jim Miller
0a3ab4bc9d Fix for add_chapter_numbers:toconly and unnew. Closes #1274 2025-11-11 20:08:57 -06:00
Jim Miller
a4a91b373f Bump Test Version 4.51.1 2025-11-10 08:50:28 -06:00
Jim Miller
a68e771026 Don't issue flaresolverr image warning unless include_images:true 2025-11-10 08:50:11 -06:00
Jim Miller
d7c79fcb3b Bump Release Version 4.51.0 2025-11-07 09:53:24 -06:00
Jim Miller
5cc05ed96d Update translations. 2025-11-07 09:33:20 -06:00
Jim Miller
e5b5768f11 Perf improvement for unnew 2025-11-04 12:20:39 -06:00
Jim Miller
6cf2519ef9 Bump Test Version 4.50.5 2025-11-02 20:09:20 -06:00
Jim Miller
f4f98e0877 Don't include default_cover_image with use_old_cover with a different name. 2025-11-02 20:08:16 -06:00
Jim Miller
bb8fb9efa5 writer_epub: More epub3 - prefix & prop cover-image 2025-11-02 18:38:29 -06:00
Jim Miller
be38778d72 Bump Test Version 4.50.4 2025-11-02 09:50:15 -06:00
Jim Miller
55d8efbdcd writer_epub: Only do svg check for epub3 2025-11-02 09:49:51 -06:00
Jim Miller
9df7822e32 Bump Test Version 4.50.3 2025-11-01 14:12:45 -05:00
Jim Miller
69e6a3d2cf writer_epub: Rearrange to detect and flag files containing svg tags for epub3. 2025-11-01 14:12:40 -05:00
Jim Miller
8ea03be5f3 epub3 - Flag the cover *page*--epub3 only flags cover *img* 2025-11-01 13:03:08 -05:00
Jim Miller
75a213beb9 Find and use epub3 cover on update--relies on Calibre's calibre:title-page property. 2025-11-01 12:48:03 -05:00
Jim Miller
ead830c60a adapter_storiesonlinenet: Set authorUrl to site homepage when (Hidden) author for #1272 2025-11-01 09:09:31 -05:00
Brian
20681315e7 Update adapter_storiesonlinenet.py 2025-10-31 22:50:56 -07:00
Removed extraneous parens on conditional 'if' statements
Brian
e2961eaadf adapter_storiesonlinenet.py - tolerate contest stories 2025-10-31 15:01:45 -07:00
Contest stories have author="(Hidden)", which breaks the code that gets story info from the author's page. Added checks for this, and also checks to verify soup actually found results before blindly using them.
Jim Miller
7f0d7f70be Bump Test Version 4.50.2 2025-10-29 13:48:06 -05:00
dbhmw
c5264c2147 adapter_ficbooknet: Collect numWords 2025-10-29 13:47:46 -05:00
MacaroonRemarkable
ff402c16ca Preserve original titles for Reader Post blocks from fiction.live (#1269) 2025-10-29 13:47:26 -05:00
* Preserve original titles for Reader Post blocks from fiction.live
* Update adapter_fictionlive.py: Changed for py2 backward compatibility
* Update adapter_fictionlive.py: Switched to concatenation rather than .format
* Update adapter_fictionlive.py: Missing space
Jim Miller
4a9da1c02e Bump Test Version 4.50.1 2025-10-19 22:14:16 -05:00
Jim Miller
c14f1014b8 OTW/AO3: Don't apply series page handling to non-series pages 2025-10-19 22:14:08 -05:00
Jim Miller
74bc398994 Bump Release Version 4.50.0 2025-10-19 19:00:10 -05:00
Jim Miller
6e8e74fc55 Bump Test Version 4.49.6 2025-10-18 09:29:20 -05:00
Jim Miller
68ad4c87aa OTW: Fix for site change breaking logged in detection. Closes #1263 2025-10-18 09:29:14 -05:00
Jim Miller
fe82aed91d Bump Test Version 4.49.5 2025-10-12 09:26:37 -05:00
Jim Miller
7d14bf6e90 base_otw_adapter: Fix for markedforlater site change 2025-10-12 09:26:20 -05:00
Brian
39500a9386 Update adapter_storiesonlinenet.py 2025-10-12 09:15:38 -05:00
Add check for SOL accounts in renewal warning period to verbosely explain to users why their downloads don't work
dbhmw
d5f8891e4f adapter_literotica: Site change, regex outdated. 2025-10-12 09:08:12 -05:00
Jim Miller
edce6949ae Bump Test Version 4.49.4 2025-10-10 11:12:09 -05:00
Jim Miller
bec6fac2ea base_otw_adapter: Use download link for chapter->work conversion #1258 2025-10-10 11:11:58 -05:00
Jim Miller
a9bd19a079 Bump Test Version 4.49.3 2025-10-07 10:35:46 -05:00
Jim Miller
7135ba5892 OTW(AO3): Accept /chapter/999 URLs without /works/999 for #1258 2025-10-07 10:35:38 -05:00
Jim Miller
9ba4c100ca Bump Test Version 4.49.2 2025-10-02 13:38:44 -05:00
Jim Miller
fe565149ba Fix tuple vs grouping vs list, closes #1254 2025-10-02 13:38:26 -05:00
Jim Miller
624f60a5c1 Bump Test Version 4.49.1 2025-10-01 11:55:08 -05:00
Jim Miller
5c79ac0b5c New site: althistory.com (NOT alternatehistory.com) for #1252 2025-10-01 11:55:08 -05:00
Jim Miller
615711f904 Comment some debugs 2025-10-01 11:55:08 -05:00
kilandra
2f77bd9e97 Spiritfanfiction login, closes #1247 2025-10-01 09:05:09 -05:00
Add login functionality to Spiritfanfiction.com
Jim Miller
abdc881812 Bump Release Version 4.49.0 2025-10-01 08:50:15 -05:00
Jim Miller
1ba73bf316 Update translations. 2025-09-30 09:22:34 -05:00
Jim Miller
a359c6b326 adapter_storiesonlinenet: Change page not found error reporting 2025-09-23 10:04:29 -05:00
Jim Miller
ff64356e85 Bump Test Version 4.48.7 2025-09-11 09:09:46 -05:00
Jim Miller
0271b14f6c adapter_literotica: Yet another site change, addresses #1245 2025-09-11 09:09:28 -05:00
Jim Miller
bf845e200f Bump Test Version 4.48.6 2025-09-10 13:47:45 -05:00
Jim Miller
e94ff6e1e8 base_otw: Add collectionsUrl and collectionsHTML metadata--keep in order 2025-09-10 13:47:39 -05:00
Jim Miller
07313d2744 Bump Test Version 4.48.5 2025-09-10 13:40:29 -05:00
Jim Miller
bd2026df7e base_otw: Add collectionsUrl and collectionsHTML metadata 2025-09-10 13:40:23 -05:00
Jim Miller
0fa177ff79 Bump Test Version 4.48.4 2025-09-10 08:40:01 -05:00
Jim Miller
d84c72a215 adapter_literotica: Site change 2025-09-10 08:39:55 -05:00
Jim Miller
c319857da0 Bump Test Version 4.48.3 2025-09-08 21:41:18 -05:00
Jim Miller
df586e9bb7 browsercache_simple: Code for 0 length stream in cache file, only seen in Mac 2025-09-08 21:41:11 -05:00
Jim Miller
354a5708ce Bump Test Version 4.48.2 2025-08-27 11:13:15 -05:00
Jim Miller
096face5d2 Add continue_on_chapter_error_try_limit setting 2025-08-27 11:13:07 -05:00
Jim Miller
02e3bddd5c Bump Test Version 4.48.1 2025-08-22 11:19:06 -05:00
Jim Miller
9dadef1905 adapter_fireflyfansnet: Allow for missing authorId. 2025-08-22 11:19:01 -05:00
Jim Miller
2e8a899d8c Bump Release Version 4.48.0 2025-08-07 11:42:37 -05:00
Jim Miller
623915f623 Update translations. 2025-08-07 11:42:36 -05:00
Jim Miller
57865ca53d scribblehub: slow_down_sleep_time:5 per user recommendation 2025-08-07 11:32:03 -05:00
Jim Miller
e9c4b9ef30 Bump Test Version 4.47.4 2025-08-05 08:41:54 -05:00
Jim Miller
0ad088b663 adapter_ficwadcom: Fix for site change. 2025-08-05 08:41:48 -05:00
Jim Miller
e37a7f72be Tweak a few defaults.ini settings. 2025-08-05 08:41:27 -05:00
Jim Miller
9befe122dd Bump Test Version 4.47.3 2025-07-20 12:17:29 -05:00
Jim Miller
e6d6227ff1 Improve error reporting for open_pages_in_browser_tries_limit #1231 2025-07-20 12:17:24 -05:00
Jim Miller
d854a6efe7 Bump Test Version 4.47.2 2025-07-09 10:50:08 -05:00
Jim Miller
a97af94f8a OTW/AO3 - change to 'need to login' text, accept both old and new and another string. #1229 2025-07-09 10:49:45 -05:00
Jim Miller
e2ea97e99a Bump Test Version 4.47.1 2025-07-05 08:41:20 -05:00
Jim Miller
215f6dd8ff OTW/AO3 - change to 'need to login' text 2025-07-05 08:41:09 -05:00
Jim Miller
687aa9c3ba Bump Release Version 4.47.0 2025-07-03 08:21:33 -05:00
Jim Miller
523cf78640 Update strings for translation. 2025-07-03 08:19:59 -05:00
Jim Miller
90e50964b6 Bump Test Version 4.46.11 2025-06-25 08:42:33 -05:00
Jim Miller
a83823ea13 adapter_ashwindersycophanthexcom: http to https 2025-06-25 08:41:47 -05:00
Jim Miller
727aa6f1bc Bump Test Version 4.46.10 2025-06-22 20:15:59 -05:00
Jim Miller
072d929298 adapter_fimfictionnet: New img attr and class. #1226 2025-06-22 20:15:19 -05:00
Jim Miller
992c5a1378 Bump Test Version 4.46.9 2025-06-22 11:59:56 -05:00
Jim Miller
f8937c1af3 Report BG job failed entirely as individual books failed instead of just exception. For #1225 2025-06-22 10:45:05 -05:00
Jim Miller
af5c78e2e9 Remove some unused imports 2025-06-22 09:38:40 -05:00
Jim Miller
4a26dfdfff Plugin BG Jobs: Remove old multi-process code 2025-06-16 19:24:46 -05:00
Jim Miller
a82ef5dbae Bump Test Version 4.46.8 2025-06-16 19:16:18 -05:00
snoonan
6adc995fa5 Update defaults.ini per PR 2025-06-16 19:11:43 -05:00
snoonan
f534efd3df Support for logging into royal road to keep chapter progress (and count as page views) 2025-06-16 19:11:43 -05:00
Jim Miller
f41e64141a Add SB favicons to cover_exclusion_regexp. 2025-06-15 17:30:47 -05:00
Jim Miller
94036e3fbb Send refresh_screen=True when updating Reading Lists in case of series column updates. 2025-06-13 21:07:42 -05:00
Jim Miller
9142609c61 Bump Test Version 4.46.7 2025-06-12 22:05:11 -05:00
Jim Miller
f9d7b893ee Fix images from existing epub being discarded during update. 2025-06-12 22:02:35 -05:00
Jim Miller
4e2ae7441d Bump Test Version 4.46.6 2025-06-11 15:29:12 -05:00
Jim Miller
87dbef980f Mildly kludgey fix for status bar notifications. 2025-06-11 10:47:09 -05:00
Jim Miller
921f8c287b Shutdown IMAP connection when done with it. 2025-06-10 17:42:07 -05:00
Jim Miller
637c6e3cc3 Change default base_xenforoforum minimum_threadmarks:1. See #1218 2025-06-10 16:36:21 -05:00
Jim Miller
ba90ff9f3a Bump Test Version 4.46.5 2025-06-10 12:56:26 -05:00
Jim Miller
34e84b2942 PI BG Jobs: Fix split without reconsolidate. 2025-06-10 12:56:16 -05:00
Jim Miller
31eb7f421a Bump Test Version 4.46.4 2025-06-08 09:45:01 -05:00
Jim Miller
85d4656005 alternatehistory needs at least cloudscraper now, it seems. 2025-06-08 09:45:01 -05:00
Jim Miller
006b8873a5 Fix xenforo2 prefixtags, some still using tags in title 2025-06-08 09:44:48 -05:00
Jim Miller
3246036f88 Bump Test Version 4.46.3 2025-06-08 08:39:04 -05:00
Jim Miller
6d114532e2 Py2 fix for split BG jobs, closes #1214 2025-06-08 08:38:24 -05:00
Jim Miller
2edb1d58d5 Bump Test Version 4.46.2 2025-06-07 13:42:29 -05:00
Jim Miller
8dc3c5d3d8 Skip OTW(AO3) login when open_pages_in_browser AND use_browser_cache AND use_browser_cache_only 2025-06-07 13:22:30 -05:00
Jim Miller
2ec8c97e28 Bump Test Version 4.46.1 2025-06-07 12:51:24 -05:00
Rae Knowler
c51161c3d1 Include Accept:image/* header when requesting an image url 2025-06-07 12:50:12 -05:00
Jim Miller
bd645a97c7 Add use_flaresolverr_session and flaresolverr_session settings for #1211 2025-06-07 12:49:08 -05:00
Jim Miller
f7cbfa56bb Bump Release Version 4.46.0 2025-06-06 20:02:47 -05:00
Jim Miller
07fd16813f Bump Test Version 4.45.15 2025-06-05 16:56:16 -05:00
Jim Miller
2fe971c79f OTW(AO3): Don't attempt login with use_archive_transformativeworks_org or open_pages_in_browser #1210 2025-06-05 16:56:10 -05:00
Jim Miller
e4082c6235 Bump Test Version 4.45.14 2025-06-05 08:59:03 -05:00
Jim Miller
960d5ba11a Ignore use_browser_cache_only when URL scheme is file 2025-06-05 08:57:39 -05:00
Jim Miller
066539793d Update translations. 2025-06-04 22:14:33 -05:00
Jim Miller
5b312494fb Bump Test Version 4.45.13 2025-05-27 19:16:33 -05:00
Jim Miller
e628b10247 adapter_literotica: Fix date parsing. See #1208 2025-05-27 19:16:23 -05:00
dbhmw
61c063ed72 adapter_ficbooknet: Site changes 2025-05-27 19:11:54 -05:00
Jim Miller
11d3f601c9 Add Ctrl-Enter to AddDialog, consolidating code with INIEdit 2025-05-24 13:05:05 -05:00
Jim Miller
3b8d0f63d4 Bump Test Version 4.45.12 2025-05-23 11:46:28 -05:00
Jim Miller
b8b30c6a78 adapter_literotica: Update for site change #1208 2025-05-23 11:46:17 -05:00
Jim Miller
b007f68a88 Bump Test Version 4.45.11 2025-05-23 10:19:17 -05:00
Jim Miller
6d8a67ef2e adapter_literotica: Update for site change #1208 2025-05-23 10:19:05 -05:00
Jim Miller
ab66e9e285 Bump Test Version 4.45.10 2025-05-23 10:02:15 -05:00
Jim Miller
b3f7add5a1 Split BG: Fixes for error column & showing meta collection errors 2025-05-23 10:02:09 -05:00
Jim Miller
800be43d24 Bump Test Version 4.45.9 2025-05-22 12:31:02 -05:00
Jim Miller
70f77e17e2 adapter_literotica: Update for site change 2025-05-22 12:07:16 -05:00
Jim Miller
caf46ba421 Bump Test Version 4.45.8 2025-05-19 15:38:40 -05:00
Jim Miller
686ed80230 Update BG Job changes settings verbiage and defaults 2025-05-19 15:38:27 -05:00
Jim Miller
56689a10c4 Bump Test Version 4.45.7 2025-05-18 10:13:45 -05:00
Jim Miller
065d077752 Improve job 'reconsolidate' for failed jobs and setting changing. 2025-05-18 10:10:02 -05:00
Jim Miller
c8f817e830 Bump Test Version 4.45.6 2025-05-17 13:53:49 -05:00
Jim Miller
1432241319 Single proc bg processing, optionally split by site & accumulate results -- experimental 2025-05-17 13:53:27 -05:00
Jim Miller
0e9f60f8a6 Bump Test Version 4.45.5 2025-05-12 17:02:59 -05:00
Jim Miller
74de62385f Fix remove_empty_p regexp to work with nested <br> tags and whitespace. 2025-05-12 17:02:51 -05:00
Jim Miller
d2f69eb5d5 Bump Test Version 4.45.4 2025-05-10 09:29:20 -05:00
Jim Miller
c3655d59ca AO3 make use_(domain) options not replace media.archiveofourown.org 2025-05-10 09:29:14 -05:00
Emmanuel Ferdman
aca07bbf59 Migrate to new bs4 API 2025-05-06 17:38:14 -05:00
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Jim Miller
3edd3c3e7b Bump Test Version 4.45.3 2025-05-06 16:17:58 -05:00
Jim Miller
61ba096c6e Fix 'Add New Book' dialog when multiple existing found on update. 2025-05-06 16:17:51 -05:00
Jim Miller
47fd71c4b9 XF2: Allow extra / before threads in story URL. 2025-05-05 12:59:38 -05:00
Jim Miller
e1d0bed52d Bump Test Version 4.45.2 2025-05-05 09:46:15 -05:00
Jim Miller
acb88cbefc Include 'Add New Book' dialog when multiple existing found on update. 2025-05-05 09:45:33 -05:00
Jim Miller
f1e7cabf6a Bump Test Version 4.45.1 2025-05-04 10:15:28 -05:00
kilandra
21ec27ffd4 Fix for adapter_spiritfanfictioncom.py 2025-05-04 10:14:27 -05:00
Commenters are being identified as authors since webpage change.
Matěj Cepl
5567e6417d fix(pyproject): replace license-by-file with SPDX keyword 2025-05-02 16:17:35 -05:00
As per https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license
Jim Miller
af352a480c Bump Release Version 4.45.0 2025-05-01 10:49:01 -05:00
Jim Miller
92069dc638 Add comment 2025-05-01 10:48:49 -05:00
Jim Miller
76e9421858 Bump Test Version 4.44.11 2025-04-30 09:17:23 -05:00
Jim Miller
70558bf444 Transition finestories.com to storyroom.com 2025-04-30 09:16:07 -05:00
Brian
b60dfdcc28 Update configurable.py 2025-04-29 19:51:02 -07:00
Add support for WPC/WLPC/SOL sub-site "storyroom.com" to replace "finestories.com"
Brian
b976439669 Create adapter_storyroomcom.py 2025-04-29 19:49:15 -07:00
Add support for WPC/WLPC/SOL sub-site "storyroom.com" which replaces "finestories.com"
Brian
6de50509ed Update __init__.py 2025-04-29 19:38:48 -07:00
Add support for new WPC/WLPC/SOL subsite "storyroom.com" to duplicate/replace "finestories.com"
Jim Miller
4d9c38d3c2 Bump Test Version 4.44.10 2025-04-28 20:27:42 -05:00
Jim Miller
90ecb63be4 Fix for alternatehistory.com changing threadmark date attr. 2025-04-28 20:26:07 -05:00
Jim Miller
bd49f8e8fa XF2: Add threadmarks_per_page setting 2025-04-28 19:40:53 -05:00
Jim Miller
21c0315e60 Bump Test Version 4.44.9 2025-04-28 09:22:39 -05:00
dbhmw
fc97fa6d5c adapter_literotica: get_urls_from_page - series have urls 2025-04-28 09:22:22 -05:00
Jim Miller
2c3bf3c642 Update translations. 2025-04-28 09:22:01 -05:00
Jim Miller
a9c725d32a Bump Test Version 4.44.8 2025-04-26 22:14:30 -05:00
Jim Miller
f936c5b0fb Remove base_xenforoforum_adapter, consolidate into base_xenforo2forum_adapter 2025-04-26 22:14:24 -05:00
Jim Miller
53344afa49 Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2025-04-25 12:25:18 -05:00
Jim Miller
d5addfa2fd Complete impl of use_archiveofourown_gay 2025-04-25 12:25:01 -05:00
Jim Miller
6d8375a9f3 Bump Test Version 4.44.7 2025-04-23 09:47:10 -05:00
Jim Miller
7bc03ac798 adapter_archiveofourownorg: Add use_archiveofourown_gay, allow archiveofourown.gay input for story URLs. 2025-04-23 09:44:18 -05:00
Jim Miller
05d62a5343
Update CLI test version link 2025-04-21 12:06:49 -05:00
Jim Miller
31115f9245 Bump Test Version 4.44.6 2025-04-19 16:40:51 -05:00
dbhmw
26ee692208 adapter_fanfictionnet: Make get_urls_from_page work. 2025-04-19 16:35:01 -05:00
Jim Miller
dd43d25f76 Bump Test Version 4.44.5 2025-04-12 12:32:54 -05:00
dbhmw
fffd15d7ea adapter_ficbooknet: Add series collection & fix downloads 2025-04-12 12:32:30 -05:00
Jim Miller
7c2700c8ea Bump Test Version 4.44.4 2025-04-11 16:00:13 -05:00
Jim Miller
94518c4f25 adapter_fictionmaniatv: Update for ancient stories 2025-04-11 16:00:07 -05:00
Jim Miller
531b965b22 Bump Test Version 4.44.3 2025-04-09 09:46:28 -05:00
Jim Miller
658b637716 adapter_fictionmaniatv: Updates for site change 2025-04-09 09:46:22 -05:00
Jim Miller
44f5feacfb Remove some debugs. 2025-04-09 09:45:55 -05:00
Jim Miller
52451a3eba Bump Test Version 4.44.2 2025-04-03 10:00:46 -05:00
dbhmw
7123f7dd6f Reject HTML sites in no_convert_image 2025-04-03 10:00:32 -05:00
Jim Miller
08a0f9b5fc Bump Test Version 4.44.1 2025-04-01 22:32:36 -05:00
Jim Miller
74ac96a67e base_xenforoforum: Add timeperiodtags and better handle unexpected typed tags 2025-04-01 22:32:26 -05:00
Jim Miller
9eed0340e9 Bump Release Version 4.44.0 2025-04-01 09:54:58 -05:00
Jim Miller
73b90c0291 Additional translation strings 2025-04-01 09:52:17 -05:00
Jim Miller
c33a6e6b05 Bump Test Version 4.43.12 2025-03-29 17:17:51 -05:00
Jim Miller
d77cc15586 adapter_storiesonlinenet(et al): Add always_login option. Closes #1185 2025-03-29 17:17:44 -05:00
Jim Miller
21483f7227 Bump Test Version 4.43.11 2025-03-24 13:26:48 -05:00
Jim Miller
6c0df42fe7 Implementing Timed One Time Password(TOTP) 2FA Exception and collection 2025-03-24 13:22:26 -05:00
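The TOTP commit above refers to RFC 6238 time-based one-time passwords. A minimal stdlib-only sketch of the algorithm such a 2FA flow collects (illustrative only, not FanFicFare's implementation):

```python
# Minimal RFC 6238 TOTP sketch using only the standard library.
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, for_time=None, step=30, digits=6):
    """Compute a TOTP code: HOTP over the current time window."""
    counter = int((time.time() if for_time is None else for_time) // step)
    msg = struct.pack(">Q", counter)                 # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % (10 ** digits)).zfill(digits)

# RFC 6238 test vector: key "12345678901234567890", T=59 -> "94287082"
print(totp(b"12345678901234567890", for_time=59, digits=8))  # 94287082
```

The printed value matches the SHA-1 test vector table in RFC 6238 Appendix B.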
Jim Miller
c3a90a8914 Improve logpage updating 2025-03-24 10:47:05 -05:00
Jim Miller
e7f66d293a Bump Test Version 4.43.10 2025-03-21 12:24:46 -05:00
Jim Miller
e49b3a6be0 adapter_asianfanficscom: Add inject_chapter_image option. Closes #1143 2025-03-21 12:20:42 -05:00
Jim Miller
ae72efdc00 Note on open_pages_in_browser for MacOS users linking to #1142 2025-03-21 11:10:44 -05:00
Jim Miller
bc935e213a Bump Test Version 4.43.9 2025-03-21 10:51:29 -05:00
dbhmw
a8e0eabbd8 adapter_literotica: Fixed incorrect parsing for get url from webpage option. 2025-03-21 10:51:06 -05:00
Jim Miller
81b84a8133 Bump Test Version 4.43.8 2025-03-18 20:51:16 -05:00
Jim Miller
a973b8c926 ffnet only: try_shortened_title_urls option #1166 2025-03-18 20:50:12 -05:00
Jim Miller
08ccc659ca open_pages_in_browser_tries_limit is an int 2025-03-17 13:03:04 -05:00
Jim Miller
fb610de27a Revert "adapter_fanfictionnet: Attempt chapter from m. (vs www) when chapter not found"
This reverts commit 370be379f0.
2025-03-17 12:17:56 -05:00
Jim Miller
29d2e3734b base_otw sites list for ini settings. 2025-03-16 21:37:31 -05:00
Jim Miller
48cf17c7b7 Bump Test Version 4.43.7 2025-03-16 13:52:52 -05:00
Jim Miller
ac61c2bb68 AO3 use_archive_transformativeworks_org option 2025-03-16 13:52:52 -05:00
Nicolas SAPA
a12d2a688b Document the new 'directimages' for BrowserCache feature
Explain that this feature is useful for images delivered by a website with
a no-cache attribute when `use_browser_cache_only` is true (currently AO3).

Signed-off-by: Nicolas SAPA <nico@ByMe.at>
2025-03-16 13:52:19 -05:00
Nicolas SAPA
52027eac46 Silence a spammy debug
Silence a debug in addImgUrl that was spammy.

Signed-off-by: Nicolas SAPA <nico@ByMe.at>
2025-03-16 13:52:19 -05:00
Nicolas SAPA
a1d4fba728 Add support for 'directimages' with use_browser_cache
Hook the configurable into the direct_fetcher logic already existing for flaresolverr

Signed-off-by: Nicolas SAPA <nico@ByMe.at>
2025-03-16 13:52:19 -05:00
Nicolas SAPA
69872b922c Convert 'use_browser_cache' to bool+
Permit the 'use_browser_cache' configurable to take 'directimage'
so we can later use the default fetcher for image (only).

Signed-off-by: Nicolas SAPA <nico@ByMe.at>
2025-03-16 13:52:19 -05:00
dbhmw
7bd1a1acfc adapter_ficbooknet: Fix additional metadata collection 2025-03-14 18:58:36 -05:00
Jim Miller
80e5a22f0d Bump Test Version 4.43.5 2025-03-10 20:17:19 -05:00
Jim Miller
3cd4188bd8 Add remove_empty_p option, usually for AO3/OTW. #1177 2025-03-10 20:17:04 -05:00
Jim Miller
21d16dbe90 Bump Test Version 4.43.4 2025-03-09 09:56:47 -05:00
Brian
5ce7875851
Update adapter_storiesonlinenet.py
Moved soup.find for article below chapter search code, as it breaks when the description/details contains extraneous /div tag.
2025-03-08 15:20:01 -08:00
Jim Miller
35be14a168 Bump Test Version 4.43.3 2025-03-06 16:04:46 -06:00
dbhmw
930940c7fd adapter_fimfictionnet: Correct the config 2025-03-06 15:57:04 -06:00
dbhmw
f001f19a47 adapter_fimfictionnet: Fetch only the stories in the bookshelf. 2025-03-06 15:57:04 -06:00
Jim Miller
fd7382fb56 Bump Test Version 4.43.2 2025-03-06 13:08:46 -06:00
praschke
c69e940d2a
adapter_syosetucom: remove warningtags from ini 2025-03-06 18:58:39 +00:00
praschke
31dcd8e6ff
adapter_syosetucom: site update 2025-03-06 18:58:26 +00:00
Jim Miller
0bd85c10a8 Bump Test Version 4.43.1 2025-03-05 11:09:54 -06:00
Jim Miller
b075c22261 BrowserCache Chrome Block: Treat entry missing headers same as not found. #1167 #1169 2025-03-05 11:03:32 -06:00
Jim Miller
87b3e04fa1 Bump Release Version 4.43.0 2025-03-01 15:27:40 -06:00
Jim Miller
630f09e644 Bump Test Version 4.42.14 2025-02-28 20:03:09 -06:00
Jim Miller
a0463fc85b base_xenforoforum: Add details_spoiler option for #1165 2025-02-28 20:00:54 -06:00
Jim Miller
de7d8079d9 Add [base_otw] with use_basic_cache:true to defaults.ini 2025-02-26 13:42:04 -06:00
Jim Miller
4aad0ec913 Bump Test Version 4.42.13 2025-02-24 21:24:55 -06:00
Jim Miller
c379b45cb9 BrowserCache: Better handle cache file changing/failing while reading. 2025-02-24 21:24:43 -06:00
Jim Miller
82825d1b16 Bump Test Version 4.42.12 2025-02-24 20:26:13 -06:00
Jim Miller
11b2d5643e Fix BrowserCache for image--cache partitioned by parent(story) page. 2025-02-24 20:26:05 -06:00
Jim Miller
06dc2add8f Bump Test Version 4.42.11 2025-02-24 11:46:50 -06:00
Jim Miller
ab7198bb8f base_otw_adapter: Detect & report 'This site is in beta' page 2025-02-24 11:05:38 -06:00
Jim Miller
d854733ffa AO3: Double default slow_down_sleep_time 2025-02-24 11:05:07 -06:00
Jim Miller
a2cc6bcdd3 Bump Test Version 4.42.10 2025-02-23 20:46:13 -06:00
Jim Miller
c9accda3f8 adapter_mcstoriescom: Suppress site URLs that look like stories but aren't. #1160 2025-02-23 20:46:03 -06:00
Jim Miller
8e55d1e6f4 More direct way for /../ in Get Story URLs from web page, previous broke other sites. #1160 2025-02-23 20:45:47 -06:00
Jim Miller
9b8eb547fc Use urljoin() to remove /../ and /./ from Get Story URLs from web page 2025-02-23 15:22:27 -06:00
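The urljoin() approach above can be sketched as follows: resolving a scraped href against its base URL collapses `/../` and `/./` segments as a side effect (a general illustration, not FanFicFare's exact code):

```python
# urljoin() resolves a relative href against a base URL, normalizing
# /../ and /./ path segments in the process.
from urllib.parse import urljoin

def normalize(base, href):
    """Resolve href against base, collapsing dot segments."""
    return urljoin(base, href)

print(normalize("https://example.com/stories/index.html", "../s/1234"))
# https://example.com/s/1234
print(normalize("https://example.com/a/b/", "./c"))
# https://example.com/a/b/c
```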
Jim Miller
62b3c9264e Bump Test Version 4.42.9 2025-02-22 10:00:41 -06:00
Jim Miller
370be379f0 adapter_fanfictionnet: Attempt chapter from m. (vs www) when chapter not found 2025-02-22 10:00:22 -06:00
Jim Miller
1addfe14fc Strip leading m./www. from domain for browser cache partition. 2025-02-22 10:00:18 -06:00
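Stripping the leading subdomain as in the commit above lets mobile (`m.`) and desktop (`www.`) URLs share one cache partition key. A hedged sketch (the function name is illustrative, not FanFicFare's helper):

```python
# Drop a leading "m." or "www." so both variants map to one partition key.
import re

def partition_domain(domain):
    """Return the domain with any leading m./www. subdomain removed."""
    return re.sub(r'^(?:m|www)\.', '', domain)

print(partition_domain("www.fanfiction.net"))  # fanfiction.net
print(partition_domain("m.fanfiction.net"))    # fanfiction.net
```

Other subdomains (e.g. `forums.`) pass through unchanged.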
Jim Miller
e510fb027e Bump Test Version 4.42.8 2025-02-20 19:27:48 -06:00
dbhmw
86b807805f adapter_literotica: Implements get_urls_from_page 2025-02-20 19:27:25 -06:00
Jim Miller
0ace02ee75 six fix for py2/Calibre2 2025-02-19 20:28:40 -06:00
Jim Miller
38ad74af68 Bump Test Version 4.42.7 2025-02-19 10:10:49 -06:00
Jim Miller
6c70a60cdb Add include_tocpage:always option. 2025-02-19 10:10:42 -06:00
Jim Miller
80ee0ca9b9 adapter_fimfictionnet: Further cover fix 2025-02-19 09:58:28 -06:00
Jim Miller
8b143a0c1b Bump Test Version 4.42.6 2025-02-17 20:23:03 -06:00
Jim Miller
9fb86da341 adapter_fimfictionnet: Fix cover images and use data-source attr for img src. 2025-02-17 20:22:55 -06:00
Jim Miller
5c703122ec Bump Test Version 4.42.5 2025-02-14 20:43:59 -06:00
Jim Miller
75f89beab1 adapter_storiesonlinenet: Remove some code that broke parsing when 'author' was in the title. 2025-02-14 20:43:51 -06:00
Jim Miller
fc9d184f20 Bump Test Version 4.42.4 2025-02-13 12:46:19 -06:00
Jim Miller
6c411e054a adapter_literotica: Site changes for non-www domains. 2025-02-13 12:46:12 -06:00
Jim Miller
dbef4719d9 adapter_literotica: http->https 2025-02-13 10:22:06 -06:00
Jim Miller
da6b4c25f2 Bump Test Version 4.42.3 2025-02-12 09:00:13 -06:00
Jim Miller
23004e3953 Make plugin use own copy of six only--including in Smarten Punc 2025-02-12 09:00:06 -06:00
Jim Miller
4a15c2a7d5 Bump Test Version 4.42.2 2025-02-09 08:39:01 -06:00
Hazel Shanks
84dad2ec43 fix bounds check in vote accumulation. resolves JimmXinu#1154 2025-02-09 08:37:52 -06:00
Jim Miller
5ac38fc327 Bump Test Version 4.42.1 2025-02-05 17:15:32 -06:00
Jim Miller
35e0ada643 Make plugin use own copy of six only. 2025-02-05 17:15:08 -06:00
Alexandre Detiste
a9533364ec make plugin work without system "six" 2025-02-05 21:48:17 +01:00
Jim Miller
4a03186ce6 Bump Release Version 4.42.0 2025-02-01 16:52:54 -06:00
Jim Miller
a0271e2957 Update translations. 2025-02-01 16:52:35 -06:00
Jim Miller
11491c6383 Bump Test Version 4.41.5 2025-01-23 13:50:36 -06:00
Jim Miller
24dccc73f0 Re-alphabetize defaults.ini 2025-01-23 13:50:06 -06:00
Jim Miller
8e3a88776a adapter_wwwaneroticstorycom: Update for site changes. 2025-01-23 13:47:43 -06:00
Jim Miller
28141ce9d1 Remove site: ponyfictionarchive.net - Moved to AO3 2025-01-23 13:22:11 -06:00
Jim Miller
ffaa3bf82a Remove site: www.novelupdates.cc - Domain parked somewhere sketchy 2025-01-23 13:19:35 -06:00
Jim Miller
d0d05d6c3b Remove site: fastnovels.net - Blog only now, no stories. 2025-01-23 13:14:48 -06:00
Jim Miller
6d74a58181 Remove site: starskyhutcharchive.net, moved to starskyandhutcharchive.net, not eFiction. Nobody's missed it. 2025-01-23 12:54:08 -06:00
Jim Miller
de85fd42f7 adapter_fanfictalkcom: Update domain name & match pattern 2025-01-23 12:48:02 -06:00
Jim Miller
c4aebd40df Bump Test Version 4.41.4 2025-01-22 17:21:47 -06:00
Jim Miller
81cb631491 Browser Simple Cache adding orig resp time field & removing browser_cache_simple_header_old option. 2025-01-22 17:21:41 -06:00
Jim Miller
35aa5d2143 Bump Test Version 4.41.3 2025-01-18 08:18:27 -06:00
Jim Miller
a8b1489233 Strip out unused parts of requests_toolbelt to avoid dependency issues. #1145 2025-01-18 08:18:13 -06:00
Jim Miller
ffb179c9a1 Bump Test Version 4.41.2 2025-01-13 09:38:38 -06:00
Jim Miller
6d8d7ab66f Cleanup some INI comments 2025-01-13 09:38:38 -06:00
Jim Miller
a128083ce8 Add no_image_processing_regexp option for #1144 2025-01-13 09:37:40 -06:00
Jim Miller
9f78ec0177 Bump Test Version 4.41.1 2025-01-03 09:46:22 -06:00
Jim Miller
d941810825 adapter_fictionmaniatv: Change to https 2025-01-03 09:46:16 -06:00
Jim Miller
ba1975342c Bump Release Version 4.41.0 2025-01-01 10:37:56 -06:00
Jim Miller
27cfac45e4 Bump Test Version 4.40.5 2024-12-30 20:11:47 -06:00
Jim Miller
64a4eb2bb2 Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2024-12-30 20:11:27 -06:00
dbhmw
371f995fda adapter_inkbunnynet: Implemented always_login 2024-12-30 20:11:21 -06:00
dbhmw
816bbdfd66
Small fixes for Wattpad. (#1137)
* adapter_wattpadcom: Various fixes and changes

* adapter_wattpadcom: Config update & category 0 not always present

---------

Co-authored-by: dbhmw <github.spherical376@passmail.net>
2024-12-30 20:10:56 -06:00
Jim Miller
cdd6df8a57 Bump Test Version 4.40.4 2024-12-30 19:06:07 -06:00
Jim Miller
5d4489bb28 Update translations. 2024-12-30 19:05:43 -06:00
kat
a9944cd255
add superlove & CFAA (otwarchive sites) support (#1136)
* add superlove otwarchive site support

* add cfaa otwarchive site support

* fixes slash changes for PR

* another fix sorry
2024-12-30 18:34:12 -06:00
Jim Miller
c284b2a6c6 XenForo lazyload: use data-src first if data-url also present. QQ proxy in data-src caches/bypasses some issues 2024-12-15 21:26:25 -06:00
Jim Miller
15dde72f14 Bump Test Version 4.40.3 2024-12-15 12:40:47 -06:00
Jim Miller
ff0f22565c adapter_fimfictionnet: Implement always_login. Remove unused fail_on_password & do_update_hook settings. #1135 2024-12-15 12:33:00 -06:00
Jim Miller
33813b4047 Bump Test Version 4.40.2 2024-11-17 09:23:18 -06:00
Jim Miller
ae3accca27 Call Calibre's safe_open_url for open_pages_in_browser 2024-11-17 09:20:59 -06:00
Jim Miller
d998467f7a Bump Test Version 4.40.1 2024-11-08 10:36:53 -06:00
Jim Miller
29fddbce8e Fix for double replace_metadata when non-list metadata called by getList(). 2024-11-08 10:36:44 -06:00
Jim Miller
a4e1db32e0 Add subject_tags to -m/j CLI output 2024-11-08 10:35:39 -06:00
Jim Miller
81aea65555 Update certifi to 2024.08.30 certs 2024-11-01 09:32:12 -05:00
Jim Miller
9005f9db4c Bump Release Version 4.40.0 2024-11-01 09:03:43 -05:00
Jim Miller
7de040d8db Update translations. 2024-10-30 20:18:54 -05:00
Jim Miller
9c53cf236e Bump Test Version 4.39.10 2024-10-29 16:15:13 -05:00
Jim Miller
2e6ac07020 Fix for D/L from URL Mode Menu actions not honoring changed update mode in dialog. 2024-10-29 16:15:06 -05:00
Jim Miller
3febac62a8 Bump Test Version 4.39.9 2024-10-22 13:37:07 -05:00
Jim Miller
c4ea6ca5fd Add translation strings. 2024-10-22 13:37:07 -05:00
Jim Miller
75f9fb2d38 Add error_dialog for email fetch failure about 2FA/outlook etc. 2024-10-22 13:37:03 -05:00
Jim Miller
e4f83c52ca Fix for translation string bug 2024-10-22 13:26:47 -05:00
Jim Miller
eb54731ae9 Bump Test Version 4.39.8 2024-10-16 16:20:47 -05:00
dbhmw
eb24bcb2ac
adapter_ficbook: Another site update (#1125)
Co-authored-by: dbhmw <github.spherical376@passmail.net>
2024-10-16 16:20:05 -05:00
Jim Miller
ffa533e5fd Bump Test Version 4.39.7 2024-10-15 17:37:28 -05:00
dbhmw
bd76066905 adapter_ficbooknet: Fixes for site changes 2024-10-15 21:42:24 +00:00
Jim Miller
eb17af9252 Bump Test Version 4.39.6 2024-10-14 22:25:26 -05:00
Jim Miller
4471b1f980 adapter_fimfictionnet: Skip group JSON collection on failure. #1122 2024-10-14 22:25:21 -05:00
Jim Miller
9cfd88c098 Backout flaresolverr_json_fix since it doesn't work for everyone. #1122 2024-10-14 22:25:10 -05:00
Jim Miller
c1cf8995ea Bump Test Version 4.39.5 2024-10-14 16:16:17 -05:00
Jim Miller
55995be7de Add flaresolverr_json_fix. See #1122 2024-10-14 16:16:10 -05:00
Jim Miller
869686f363 Bump Test Version 4.39.4 2024-10-05 20:02:46 -05:00
Jim Miller
f45a05ddb6 adapter_syosetucom 'is' isn't '=='--generated SyntaxWarning 2024-10-05 20:02:46 -05:00
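The `is` vs `==` fix above comes down to identity versus value equality; comparing to a literal with `is` relies on interpreter int-caching and emits a SyntaxWarning on Python 3.8+:

```python
# "is" tests object identity; "==" tests value equality.
x = 1000
y = 10 ** 3        # equal value, not necessarily the same object
assert x == y      # correct: value equality
# "x is 1000"      # wrong: identity check against a literal -> SyntaxWarning
```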
dbhmw
434ff0de74
adapter_inkbunnynet: adds before_get_urls_from_page (#1119)
Co-authored-by: dbhmw <github.spherical376@passmail.net>
2024-10-05 19:47:25 -05:00
Jim Miller
d0ece28197 Bump Test Version 4.39.3 2024-10-04 20:16:48 -05:00
Jim Miller
cd1db0a462 adapter_deviantartcom: Site changes, new chapter text tag. See #1118 2024-10-04 20:16:38 -05:00
Jim Miller
075c5cb7c2 Bump Test Version 4.39.2 2024-10-04 19:08:19 -05:00
praschke
b8740ca1c7
syosetu: fix chapter extraction (#1117) 2024-10-04 19:08:02 -05:00
Jim Miller
3db3e28595 Bump Test Version 4.39.1 2024-10-04 14:01:58 -05:00
Jim Miller
b610d49f6b Change decode_emails default to true. 2024-10-04 14:01:58 -05:00
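The decode_emails option above refers to de-obfuscating Cloudflare-style protected addresses, where the hex string's first byte is an XOR key for the rest. A hedged sketch of that scheme (assumption: this is the obfuscation meant; not FanFicFare's exact code):

```python
# Cloudflare-style email obfuscation: first byte of the hex blob is the
# XOR key, remaining bytes are the address XORed with that key.
def decode_cfemail(encoded: str) -> str:
    data = bytes.fromhex(encoded)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")

def encode_cfemail(email: str, key: int = 0x42) -> str:
    """Inverse of decode_cfemail, for demonstration."""
    return bytes([key] + [b ^ key for b in email.encode("utf-8")]).hex()

assert decode_cfemail(encode_cfemail("reader@example.com")) == "reader@example.com"
```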
praschke
35afca430a
syosetu: adjust div names on multi-chapter stories (#1116) 2024-10-04 14:01:14 -05:00
Jim Miller
1499037e19 Bump Release Version 4.39.0 2024-10-01 20:52:39 -05:00
Jim Miller
1aaa4102a5 Bump Test Version 4.38.3 2024-10-01 09:25:07 -05:00
Jim Miller
049c9af0e4 adapter_asianfanficscom: use_cloudscraper:true in defaults.ini 2024-10-01 09:25:07 -05:00
Jim Miller
482b6b67eb adapter_asianfanficscom: Add Is adult toggle call. 2024-10-01 09:22:41 -05:00
Jim Miller
cdb752df6a Better error when utf8FromSoup called with None. 2024-10-01 09:20:21 -05:00
Jim Miller
0412355001 Bump Test Version 4.38.2 2024-09-29 20:28:37 -05:00
Jim Miller
0dc049aedb Reduce debug output 2024-09-29 20:28:37 -05:00
Jim Miller
832387dea0 Add decode_emails option, defaults to false. 2024-09-29 20:28:28 -05:00
Jim Miller
94bd4bf236 Bump Test Version 4.38.1 2024-09-28 19:11:06 -05:00
Jim Miller
493e76df30 Fix(es) for get_url_search not found when seriesUrl doesn't match an adapter site. 2024-09-28 19:10:37 -05:00
Jim Miller
44b6e752f6 Reduce debug output 2024-09-28 18:16:12 -05:00
Jim Miller
5d6f2c91c1 Apply replace_chapter_text to chapter title to CLI metadata dump 2024-09-28 13:18:47 -05:00
Jim Miller
04ae49f944 adapter_adastrafanficcom: Fix class name. Doesn't *actually* matter. 2024-09-25 23:06:31 -05:00
Jim Miller
020606fea1 Fix for regression when browser_cache_simple_header_old added. Closes #1104 2024-09-08 14:49:42 -05:00
Jim Miller
711698620e Bump Release Version 4.38.0 2024-09-01 11:36:55 -05:00
Jim Miller
968687bb82 Update translations. 2024-08-31 19:43:14 -05:00
Jim Miller
07ab6d137b Update translations. 2024-08-30 11:04:43 -05:00
Jim Miller
d51ac5d6f5 Bump Test Version 4.37.4 2024-08-28 14:16:40 -05:00
Jim Miller
478d2e8f17 BrowserCache: pass getConfig, add browser_cache_simple_header_old 2024-08-28 14:16:30 -05:00
Jim Miller
67a1dcee90 Bump Test Version 4.37.3 2024-08-27 19:47:12 -05:00
Jim Miller
af834b1e40 Experimental: Chrome Simple Cache extra field 2024-08-27 19:47:07 -05:00
Jim Miller
ae535e2518 Add get_url_search() to base_xenforoforum_adapter. 2024-08-26 13:54:56 -05:00
Jim Miller
96d36ae71a Bump Test Version 4.37.2 2024-08-26 13:53:40 -05:00
Jim Miller
480b7239e5 Add adapter classmethod get_url_search to move site specific calibre search code to adapters 2024-08-26 11:21:14 -05:00
Jim Miller
2666164c5b Bump Test Version 4.37.1 2024-08-14 13:13:08 -05:00
Jim Miller
6ef8d1b215 Make CLI username prompt more visible 2024-08-14 13:13:00 -05:00
Jim Miller
654619e7e2 adapter_scribblehubcom: Allow for changing title in story URL. 2024-08-14 13:00:33 -05:00
dado330
4ea869a764
Update adapter_syosetucom.py (#1095)
Fix update retrieval for series not completed
2024-08-10 14:54:09 -05:00
Jim Miller
837df18cb0 Bump Release Version 4.37.0 2024-08-01 08:31:43 -05:00
Jim Miller
248f1c022b Update translations. 2024-07-31 19:40:47 -05:00
Jim Miller
4fabf9e65c Bump Test Version 4.36.5 2024-07-29 19:34:49 -05:00
Jim Miller
b7c318f520 Fix for paginated AO3 series, closes #1091 2024-07-29 19:34:42 -05:00
Jim Miller
89a15e1b16 Bump Test Version 4.36.4 2024-07-16 13:13:03 -05:00
Jim Miller
5b41097abc Use titlepage_entry for titlepage_wide_entry unless explicitly set. 2024-07-16 13:12:57 -05:00
Jim Miller
a672b6dbdf Bump Test Version 4.36.3 2024-07-16 12:03:39 -05:00
Jim Miller
e4d5d43efa Allow scribblehub.com story URLs w/o title and search calibre w/o title 2024-07-16 12:03:33 -05:00
Jim Miller
cc572857e0 Bump Test Version 4.36.2 2024-07-08 09:17:09 -05:00
Jim Miller
2f52ae31c0 adapter_storiesonlinenet: Fix for chapter select getting cover image link. 2024-07-08 09:17:02 -05:00
Jim Miller
3ddf801925 Bump Test Version 4.36.1 2024-07-07 09:56:12 -05:00
Jim Miller
182695b0af adapter_storiesonlinenet: Remove ''s Page' to '.s Page' 2024-07-07 09:21:51 -05:00
Jim Miller
656e67cc57 Full OTW settings for www.adastrafanfic.com in defaults.ini 2024-07-07 09:15:34 -05:00
Jim Miller
34215ce0ee Bump Release Version 4.36.0 2024-07-01 15:17:26 -05:00
Jim Miller
c706aed271 Bump Test Version 4.35.7 2024-07-01 15:14:43 -05:00
Jim Miller
e5f8e5bba4 Built-in Event For Action Chains plugin 2024-07-01 15:14:37 -05:00
Jim Miller
11d8fae876 Update defaults.ini comments about OTW 2024-06-30 19:12:12 -05:00
Jim Miller
4a14e5fc86 Bump Test Version 4.35.6 2024-06-23 09:54:36 -05:00
Jim Miller
7548ce6ae0 Catch bad href searches during internal link anchor search. 2024-06-23 09:53:36 -05:00
Jim Miller
e113bbfb1c base_xenforoforum: Remove [] from prefixtags ala [NSFW] on QQ 2024-06-18 19:41:12 -05:00
Jim Miller
d1ccdfd21f adapter_spiritfanfictioncom: Minor regex fix. 2024-06-09 14:51:07 -05:00
Jim Miller
68e8f49e9f Bump Test Version 4.35.5 2024-06-09 14:47:05 -05:00
Jim Miller
49a0328268 adapter_spiritfanfictioncom: Cheesy fix for py2 not knowing %z in dates. 2024-06-09 14:46:54 -05:00
Jim Miller
25ea3fcaad adapter_spiritfanfictioncom: use_basic_cache:true 2024-06-09 14:23:09 -05:00
Jim Miller
a5378ca419 Bump Test Version 4.35.4 2024-06-09 12:58:06 -05:00
Jim Miller
e0b733b60d Alphabet order INI sections 2024-06-09 12:57:35 -05:00
kilandra
33b2b10bf3
New Site: SpiritFanfiction.com (#1078)
Add support for spiritfanfiction.com
2024-06-09 12:54:58 -05:00
Jim Miller
c468c26208 Bump Test Version 4.35.3 2024-06-09 08:46:34 -05:00
Jim Miller
9d29f888b3 XF2: SB/SV changed the header for thread_status 2024-06-09 08:46:07 -05:00
Jim Miller
d1e8a77489 Bump Test Version 4.35.2 2024-06-04 10:31:21 -05:00
Jim Miller
ef66e73fa4 adapter_ficbooknet: Better fixes for py2 from dbhmw 2024-06-04 10:31:08 -05:00
Jim Miller
7f128587c0 Bump Test Version 4.35.1 2024-06-03 15:52:05 -05:00
Jim Miller
53a7a60dbc adapter_ficbooknet: Fixes for py2 in older Calibres 2024-06-03 15:51:55 -05:00
dbhmw
71a61ff166
adapter_ficbooknet: Fix breakage for proxies & add covers (#1077)
* Make ficbook work under nsapa

* Support adding covers

* More patches

* .

* Fix

* Fix num pages

* Add updated urls to getSiteExampleURLs
Update configs
Add logging

* New getSiteURLPattern

* Fixed scraping the collections

* Fixed follow count

* Fixed num awards count

* Adds ability to login

* A minor refactor

---------

Co-authored-by: dbhmw <github.spherical376@passmail.net>
2024-06-03 15:50:57 -05:00
Jim Miller
9c051e6c3b Bump Release Version 4.35.0 2024-06-01 09:03:25 -05:00
Jim Miller
f0d89498dc Update translations. 2024-06-01 09:01:39 -05:00
Jim Miller
abb370a852 Bump Test Version 4.34.9 2024-05-28 21:54:57 -05:00
Jim Miller
4b9054d1b4 Add download_finished_signal for Action Chains #1073 2024-05-28 21:54:42 -05:00
Jim Miller
2d0db171a8 Remove checks for pre-2.85.1 features--assumed present. 2024-05-28 14:09:10 -05:00
dbhmw
7f67465767
Support for touchfluffytail.org (#1071)
* An attempt is made

* Quick fix

* Update adapter_touchfluffytail.py

* Config fix

* Add num of comments & reviews

* Minor conf change;Add num of views

* Add section to plugin-defaults.ini

* Fixed config
Improved a bit views count
Changed getSiteURLPattern to stop grabbing navigation pages

* Repair plugin-defaults.ini

---------

Co-authored-by: dbhmw <github.spherical376@passmail.net>
2024-05-26 10:23:28 -05:00
Jim Miller
6801d5e01d Bump Test Version 4.34.8 2024-05-26 10:20:55 -05:00
Jim Miller
b01914c24e adapter_wattpadcom: Improve error reporting when story not found / connection refused. 2024-05-26 10:20:36 -05:00
Jim Miller
dd41f99288 Bump Test Version 4.34.7 2024-05-22 19:44:45 -05:00
Jim Miller
37db56e6b3 base_xenforo2 better detect whether logged in. 2024-05-22 19:39:31 -05:00
Jim Miller
f0a08f7647 Bump Test Version 4.34.6 2024-05-21 20:56:42 -05:00
Jim Miller
2593f742c9 adapter_deviantartcom: Streamline login vs watchers vs mature See #1070 2024-05-21 20:56:22 -05:00
Jim Miller
6ac299c198 adapter_deviantartcom: Watchers only stories need login #1070 2024-05-20 19:26:41 -05:00
dbhmw
3eda289349
adapter_inkbunny: Fix author & category (#1069)
* Fix author & category

* Remove redundant calls

---------

Co-authored-by: dbhmw <github.spherical376@passmail.net>
2024-05-17 16:57:07 -05:00
Jim Miller
95a7bdd3a9 Bump Test Version 4.34.5 2024-05-16 21:55:43 -05:00
Jim Miller
84257e7388 base_xenforo2forum: Prefix tag collecting too much. 2024-05-16 21:55:35 -05:00
Jim Miller
465bffd896 datetime.utcnow() deprecated in more recent py3 versions 2024-05-11 16:54:52 -05:00
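The deprecation noted above: `datetime.utcnow()` returns a naive datetime and is deprecated since Python 3.12; the replacement is a timezone-aware call:

```python
# datetime.now(timezone.utc) replaces the deprecated datetime.utcnow().
from datetime import datetime, timezone

now = datetime.now(timezone.utc)
assert now.tzinfo is timezone.utc   # aware, unlike utcnow()'s naive result
```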
dbhmw
eabfd1bef3
Skip invalid images, detect img types (#1068)
* Skip invalid images

* Exception handling

---------

Co-authored-by: dbhmw <github.spherical376@passmail.net>
2024-05-11 16:06:33 -05:00
Jim Miller
8d6676617c Bump Test Version 4.34.4 2024-05-09 11:25:05 -05:00
Jim Miller
c47b620f67 Fix for WebToEpub firefox cache key changing 2024-05-09 11:24:57 -05:00
Jim Miller
df94cc439e Bump Test Version 4.34.3 2024-05-08 11:51:43 -05:00
Jim Miller
08032778bd QQ: Doesn't need reader_posts_per_page:30 anymore 2024-05-08 11:51:26 -05:00
Jim Miller
52deec3fd8 Bump Test Version 4.34.2 2024-05-07 13:38:12 -05:00
Jim Miller
5b443d4363 adapter_forumquestionablequestingcom:Switch to BaseXenForo2ForumAdapter 2024-05-07 13:38:06 -05:00
Jim Miller
4170cfd9a6 Bump Test Version 4.34.1 2024-05-02 11:13:08 -05:00
Jim Miller
ae4735df04 adapter_ficbooknet: Remove py3 string handling that breaks on py2 2024-05-02 11:13:00 -05:00
Jim Miller
6041036787 Update bundled certifi cacert.pem and version, not core.py 2024-05-02 08:48:16 -05:00
Jim Miller
d451265621 Bump Release Version 4.34.0 2024-05-01 09:01:51 -05:00
Jim Miller
677f213337 adapter_literotica: Match prior formatting of averrating. 2024-04-30 16:43:10 -05:00
Jim Miller
8537702028 Bump Test Version 4.33.15 2024-04-29 09:44:35 -05:00
Jim Miller
6d3d4d1ae6 adapter_literotica: Fix category collection. #1058 2024-04-29 09:44:28 -05:00
Jim Miller
1f42c188fa Bump Test Version 4.33.14 2024-04-28 21:08:46 -05:00
Jim Miller
9346985718 adapter_literotica: Restore chapter descs as description when nothing else. #1058 2025-04-28 21:08:40 -05:00
Jim Miller
4585afde50 Bump Test Version 4.33.13 2024-04-28 10:57:54 -05:00
Jim Miller
bee6cb9ba6 adapter_literotica: Don't setDescription() if tag is empty #1058 2024-04-28 10:57:33 -05:00
Jim Miller
581b627a3e Bump Test Version 4.33.12 2024-04-27 14:41:58 -05:00
Jim Miller
4436001494 adapter_literotica: Improved description collection. #1058 2024-04-27 14:41:34 -05:00
Jim Miller
6116a19986 adapter_literotica: Collect averrating from hidden JSON. #1058 2024-04-27 14:41:19 -05:00
dbhmw
99fd4ea0e5
adapter_ficbooknet Fix update date not working (#1066)
* Fixes pub and updates v2

* Add 'part_text' for proper formatting

* Fix collections grabbing

* Add collection of numawards

* Add collection of categories

* Collect awards
2024-04-27 12:06:14 -05:00
Jim Miller
a613b842f2 Bump Test Version 4.33.11 2024-04-26 15:42:13 -05:00
Jim Miller
6462c5c366 adapter_literotica: Allow /series/se/alphanumeric instead of just numeric. 2024-04-26 15:31:06 -05:00
Jim Miller
8c4a8cd2da Bump Test Version 4.33.10 2024-04-26 12:26:17 -05:00
Jim Miller
7a0ea3ce96 adapter_literotica: Remove use_meta_keywords option. #1058 2024-04-26 12:25:24 -05:00
Jim Miller
f14fe9d3aa adapter_literotica: Rewrite(mostly) for site changes. #1058 2024-04-26 12:24:10 -05:00
Jim Miller
36add28269 adapter_literotica: Fix for chapter_categories_use_all:true causing Tag vs string error. 2024-04-26 10:34:19 -05:00
Jim Miller
87b4171dd4 Bump Test Version 4.33.9 2024-04-24 16:57:37 -05:00
dbhmw
951acf61b4
ficbook.net Add chapter dates for TOC (#1065)
* Fix the broken stuff

* Last correction

* Add chapter dates
2024-04-24 16:57:23 -05:00
Jim Miller
8674b54753 Bump Test Version 4.33.8 2024-04-24 12:09:04 -05:00
Jim Miller
b7e5bf0468 adapter_literotica: Not all chapters have Rating 2024-04-24 12:08:57 -05:00
Jim Miller
0f12c127b6 Bump Test Version 4.33.7 2024-04-24 10:37:06 -05:00
Jim Miller
50c51dc993 adapter_literotica: Beta site changes #1058 2024-04-24 10:36:36 -05:00
Jim Miller
65bf03a613
Merge pull request #1064 from dbhmw/main
Ficbook.net More metadata collection
2024-04-24 10:35:15 -05:00
dbhmw
0bb8421f98 Fixes 2024-04-24 13:53:43 +00:00
dbhmw
108e603e63 Collect collections v1 2024-04-24 12:54:58 +00:00
dbhmw
1868ed842e En 2024-04-23 22:38:39 +00:00
dbhmw
6c505a6170 The SPACE 2024-04-23 22:33:05 +00:00
dbhmw
72d508b0bf Add more metadata collection 2024-04-23 22:21:07 +00:00
Jim Miller
d6f2faf170 Bump Test Version 4.33.6 2024-04-23 08:39:18 -05:00
Jim Miller
92cbff7db9 Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2024-04-23 08:32:42 -05:00
Jim Miller
4bb2d50921
Merge pull request #1063 from dbhmw/dbhmw-patch-1
Ficbook.net Fix categories and add chapter notes
2024-04-23 08:32:33 -05:00
dbhmw
c3d8bc4fd0
Fix categories and add chapter notes 2024-04-23 10:14:52 +00:00
Jim Miller
37ae6cbdbb Bump Test Version 4.33.5 2024-04-20 18:22:28 -05:00
Jim Miller
b953daa3c2 adapter_storiesonlinenet: Fix for /library/ -> /s/ 2024-04-20 18:20:23 -05:00
Jim Miller
463910cd54 Bump Test Version 4.33.4 2024-04-14 08:57:48 -05:00
Jim Miller
95bfdf907f Alphabetize INIs 2024-04-14 08:57:40 -05:00
Yves
85550aeaf6
Add a fanfictions.fr connector (#1061)
* Add a fanfictions.fr connector

* PR fixes

* Move the cover image test outside of the generic tests

* Handle suspended fanfictions

* Allow downloading of fanfictions only available in zip files

* Add the date published element

* Add the basic cache

* Aggressive line breaks

* Fix description formatting

* Add more metadata

* Fix the description test
2024-04-14 08:56:13 -05:00
Jim Miller
5b20926f2c Bump Test Version 4.33.3 2024-04-09 10:58:01 -05:00
Jim Miller
c915aceb85 basexf: Fix for prefix tags, put in prefixtags included in genre 2024-04-09 10:57:34 -05:00
Jim Miller
36d56b867c Bump Test Version 4.33.2 2024-04-09 10:19:21 -05:00
Jim Miller
e1cec84075 basexf: Add XF categorized tags into: category, genre, characters, contenttags and formattags 2024-04-09 10:19:15 -05:00
Jim Miller
ba3676d73f Remove some debugs 2024-04-09 10:12:02 -05:00
Jim Miller
80f50b298f Add a warning output about minimum_threadmarks for XF. 2024-04-05 10:54:29 -05:00
Jim Miller
9120504249 Remove some debugs. 2024-04-04 16:47:31 -05:00
Jim Miller
55c7ca9c10 Bump Test Version 4.33.1 2024-04-02 10:31:36 -05:00
Jim Miller
704ea89d72 OTW(AO3) Support Paginated Series 2024-04-02 10:31:30 -05:00
Jim Miller
8eecd0aa7d Bump Release Version 4.33.0 2024-04-01 08:59:48 -05:00
Jim Miller
c53f99d01c Update translations. 2024-03-30 15:37:02 -05:00
Jim Miller
438a1265f2 Bump Test Version 4.32.13 2024-03-26 20:57:30 -05:00
hmonsta
86766223cb updated inkbunny adapter
fixed "keep_summary_html" config being ignored and always stripping formatting
cover images can now be extracted from more submissions
2024-03-26 20:56:38 -05:00
Jim Miller
1fa94de1d9
Update README.md 2024-03-26 17:41:33 -05:00
Jim Miller
56d1cf19ef Bump Test Version 4.32.12 2024-03-25 09:22:33 -05:00
Jim Miller
701c096ed4 adapter_deviantartcom: Add a 6th different message to indicate 'mature content'. #1052 2024-03-25 09:22:26 -05:00
Jim Miller
aab3e1c601 Bump Test Version 4.32.11 2024-03-24 15:53:15 -05:00
Jim Miller
8d040a4926 adapter_deviantartcom: Bad username fails separately than bad pass. #1052 2024-03-24 15:53:15 -05:00
Jim Miller
4453cbb143 Perform replace_chapter_text on chapter titles, too. 2024-03-24 13:26:10 -05:00
Jim Miller
0c173f8110 Bump Test Version 4.32.10 2024-03-23 17:52:06 -05:00
Jim Miller
a14b39eb4c Paste into ini edit as plain text only. 2024-03-23 17:51:59 -05:00
Jim Miller
c9cb51f8c4 Bump Test Version 4.32.9 2024-03-22 13:01:07 -05:00
Jim Miller
dbe6c6105c Ignore bs4 XMLParsedAsHTMLWarning as per #894 from mcepl 2024-03-22 13:01:00 -05:00
Jim Miller
04231eecfe Add note to tweak_fg_sleep settings. 2024-03-21 10:08:09 -05:00
Jim Miller
a55a4c93a5 Bump Test Version 4.32.8 2024-03-18 19:33:44 -05:00
praschke
dcd4f0f6a5 syosetu: make numeric metadata robust against wording changes 2024-03-18 19:33:31 -05:00
Jim Miller
792ab02195 Bump Test Version 4.32.7 2024-03-14 08:58:53 -05:00
praschke
7a87310403 syosetu: typos 2024-03-14 08:56:35 -05:00
praschke
7e070528a1 Add support for kakuyomu.jp 2024-03-14 08:56:35 -05:00
Jim Miller
4f3af1395f Bump Test Version 4.32.6 2024-03-10 20:55:43 -05:00
Jim Miller
1fc4f3d70b Don't try to set imap tags before checking for 'good' update. 2024-03-10 20:55:36 -05:00
Jim Miller
12ee3dae5e Bump Test Version 4.32.5 2024-03-10 16:14:20 -05:00
Jim Miller
cf28bc26f0 adapter_deviantartcom: Add another way to remove comments section. 2024-03-10 16:14:14 -05:00
Jim Miller
bd41796231 Bump Test Version 4.32.4 2024-03-05 08:04:29 -06:00
Jim Miller
f21f039b3a Move new exception catching for metadata errors 2024-03-05 08:04:22 -06:00
Jim Miller
7263f4120c Bump Test Version 4.32.3 2024-03-03 17:21:03 -06:00
Jim Miller
22e0e8da66 Report errors during library update loop better. 2024-03-03 17:20:57 -06:00
Jim Miller
7173bf0803 Fix setting book['tags'] for bgmeta for update AND overwrite. 2024-03-03 17:20:13 -06:00
Jim Miller
7246cdf853 Bump Test Version 4.32.2 2024-03-02 10:45:51 -06:00
Jim Miller
c60b296bc9 SV site change, '...' in paginated threadmarks list 2024-03-02 10:45:43 -06:00
Jim Miller
a8a86533ad Bump Test Version 4.32.1 2024-03-01 15:14:45 -06:00
Jim Miller
d1c5847a58 SV site change, paginated threadmarks list 2024-03-01 15:14:39 -06:00
Jim Miller
68e0d70fcb Bump Release Version 4.32.0 2024-03-01 08:45:36 -06:00
Jim Miller
74b28f7ead Update translations. 2024-03-01 08:44:07 -06:00
Jim Miller
acda805c3c Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2024-02-28 12:21:10 -06:00
Jim Miller
a37fbbbd51 Bump Test Version 4.31.8 2024-02-28 12:20:41 -06:00
Jim Miller
2cdb6036ea Add Edit personal.ini as a direct menu item--can keyboard shortcut 2024-02-28 12:20:31 -06:00
Jim Miller
77afdc0208 Update README.md 2024-02-27 12:13:31 -06:00

Jim Miller
7e0e68f66f Bump Test Version 4.31.7 2024-02-24 10:35:24 -06:00
Jim Miller
bbec6fcd5f adapter_deviantartcom: Fix for site change and detect no username. Closes #1042 2024-02-24 10:35:11 -06:00
Jim Miller
631fe6c9c9 Bump Test Version 4.31.6 2024-02-18 17:44:19 -06:00
Jim Miller
a86755ad98 Merge branch 'syosetu' of https://github.com/praschke/FanFicFare 2024-02-18 17:24:15 -06:00
praschke
42d2b00007 syosetu: python 2 and beautiful soup compatibility fixes 2024-02-18 21:40:11 +00:00
praschke
ad10cad0b0 syosetu: add all extra metadata to commented titlepage addition 2024-02-18 21:40:06 +00:00
praschke
71d3589ebc syosetu: remove timezone 2024-02-18 21:39:56 +00:00
praschke
84ed1827be syosetu: remove suggested japanese labels from ini files 2024-02-18 21:39:40 +00:00
Jim Miller
ce29a6923e Bump Test Version 4.31.5 2024-02-18 14:45:17 -06:00
Jim Miller
d96d194b2b Set book[tags] after writeStory for literotica. 2024-02-18 14:34:11 -06:00
praschke
5cb3bccf45 Add support for syosetu.com 2024-02-18 00:02:00 +00:00
Jim Miller
e6639323b7 Bump Test Version 4.31.4 2024-02-15 10:34:40 -06:00
Jim Miller
f94e0eaf32 Don't need \n after </span> looking for log entries. 2024-02-15 10:34:40 -06:00
Jim Miller
37bcb1284b Don't do random lang anymore in test1--changes series sort in Calibre. 2024-02-13 10:44:21 -06:00
Jim Miller
295bd2e1ab Bump Test Version 4.31.3 2024-02-06 10:21:22 -06:00
Jim Miller
45b4a8d8bf Add include_images:coveronly option for #1037 2024-02-06 10:21:14 -06:00
Jim Miller
cdb60423fe Bump Test Version 4.31.2 2024-02-05 10:47:07 -06:00
Jim Miller
50f913843b adapter_wwwutopiastoriescom: Remove author page get, add extracategories instead. Site static now. 2024-02-05 10:46:57 -06:00
Jim Miller
581d6f6657 adapter_literotica: Allow for empty div.aa_ht tags, remove extra None
from text
2024-02-05 10:36:58 -06:00
Jim Miller
e03f65332a Bump Test Version 4.31.1 2024-02-04 16:48:21 -06:00
Jim Miller
3e9abec817 adapter_wwwutopiastoriescom: Updates for site changes. 2024-02-04 16:48:13 -06:00
Jim Miller
0d8f84ba23 Bump Release Version 4.31.0 2024-02-01 11:26:10 -06:00
Jim Miller
c646419336 Update translations. 2024-02-01 11:25:51 -06:00
Jim Miller
622a4eb44b Change default flaresolverr_proxy_timeout:59000 so it happens before default connect_timeout:60.0 2024-01-30 19:46:49 -06:00
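In ini terms, the pairing described in the commit looks like this (values copied from the commit message; `[defaults]` placement is illustrative of where such settings normally live):

```ini
[defaults]
## flaresolverr_proxy_timeout is in milliseconds; keeping it at 59000 (59s)
## makes FlareSolverr give up just before the 60.0-second connect_timeout.
flaresolverr_proxy_timeout:59000
connect_timeout:60.0
```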
Jim Miller
d4fbc73b41 Bump Test Version 4.30.10 2024-01-30 10:36:48 -06:00
Jim Miller
391f469a99 adapter_deviantartcom: Changed to 2 post login (#1035) and finding story id 2024-01-30 10:35:41 -06:00
Jim Miller
a0ca55d7f6 Bump Test Version 4.30.9 2024-01-29 19:12:43 -06:00
Jim Miller
a4bbe27771 fetcher_flaresolverr: Report errors from Flaresolverr better and fail faster--no retries. 2024-01-29 19:12:32 -06:00
Jim Miller
a5e2d1eb45 Bump Test Version 4.30.8 2024-01-28 14:05:34 -06:00
Moxie
7a89d03339 Add a configuration option to normalize URLs returned from CLI --imap 2024-01-28 14:05:00 -06:00
Jim Miller
ae638fd0a1 Bump Test Version 4.30.7 2024-01-26 15:58:17 -06:00
grenskul
26a59b373a Update adapter_royalroadcom.py 2024-01-26 21:34:31 +00:00
grenskul
479c0b7d95 Update adapter_royalroadcom.py
owner recommended commits
2024-01-26 20:59:33 +00:00
grenskul
52a0bb6e0e Update adapter_royalroadcom.py
fix for including "speak: never"; in the style
2024-01-26 20:46:17 +00:00
Jim Miller
f2f333c807 Bump Test Version 4.30.6 2024-01-26 12:32:11 -06:00
grenskul
3f2f2a33d3 Update adapter_royalroadcom.py
forgot the spoilers the first time
2024-01-26 12:31:50 -06:00
grenskul
ba9272822b Update adapter_royalroadcom.py
Added a bypass for Royal Road introducing lines saying "A case of theft: this story is not rightfully on Amazon; if you spot it, report the violation" etc. This is done by finding elements with the "display: none;" style and extracting them.
2024-01-26 12:31:50 -06:00
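The technique the commit describes can be sketched with Beautiful Soup: find elements whose inline style hides them and drop them from the tree. This is a hedged illustration under assumed markup, not the adapter's actual selector logic:

```python
# Sketch: strip elements hidden with "display: none;", the style Royal Road
# uses for injected anti-theft sentences. Sample HTML is illustrative.
from bs4 import BeautifulSoup

html = ('<div><p>Real story text.</p>'
        '<p style="display: none;">A case of theft: report the violation.</p></div>')
soup = BeautifulSoup(html, "html.parser")
for hidden in soup.find_all(style=lambda s: s and "display: none" in s):
    hidden.decompose()
```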
Jim Miller
9575044262 Bump Test Version 4.30.5 2024-01-17 17:33:43 -06:00
Jim Miller
7306e81a30 Fix for site change, adapter_deviantartcom. #1030 2024-01-17 17:33:24 -06:00
Jim Miller
19f9132109 Bump Test Version 4.30.4 2024-01-15 12:20:01 -06:00
Jim Miller
f340ba50da Allow image/comic and poems on literotica 2024-01-15 12:19:51 -06:00
Jim Miller
6e90c7ed7b Bump Test Version 4.30.3 2024-01-14 16:22:07 -06:00
Moxie
0a81bc7c6b Fix selector for xenforo2 stats, needed to pull word count estimate 2024-01-14 16:21:40 -06:00
Jim Miller
f5dd6b90fc Bump Test Version 4.30.2 2024-01-10 13:19:10 -06:00
Jim Miller
e1a9438595 Swap out SuperFastHash implementation #1026 2024-01-10 13:19:02 -06:00
Jim Miller
97a72380e6 Bump Test Version 4.30.1 2023-12-16 09:44:53 -06:00
Jim Miller
a6a3a4e240 Another OTW(AO3) block/hidden story string 2023-12-16 09:41:08 -06:00
Jim Miller
b6b1e6ecdc Bump Release Version 4.30.0 2023-12-01 12:25:25 -06:00
Jim Miller
85cf21a32c Update translations. 2023-12-01 12:24:59 -06:00
Jim Miller
918ed4a23e Bump Test Version 4.29.7 2023-11-26 11:32:48 -06:00
Jim Miller
84d6106a30 Better handling of &<> entities with stripHTML() and chapter titles. #1019 2023-11-26 11:32:41 -06:00
Jim Miller
6761cae9c1 Bump Test Version 4.29.6 2023-11-21 18:52:55 -06:00
Jim Miller
e330ccbe94 SB(but not SV) removed RSS link from thread list title. Closes #1017 2023-11-21 18:52:46 -06:00
Jim Miller
da7059e978 Bump Test Version 4.29.5 2023-11-20 11:41:56 -06:00
Jim Miller
893345dc33 adapter_storiesonlinenet: Allow /n/ as well as /s/ paths 2023-11-20 11:41:50 -06:00
Jim Miller
9fcc6fe68a Bump Test Version 4.29.4 2023-11-20 09:34:49 -06:00
Brian
0c02f17d67 Update adapter_storiesonlinenet.py
Fix issue introduced in initial fix for issue #1013 for paid subscribers that include download link and URL link in same list
2023-11-20 09:34:37 -06:00
Jim Miller
11c8805f4c Bump Test Version 4.29.3 2023-11-19 12:46:22 -06:00
Jim Miller
cf065fa706 adapter_fanfictionnet: Only use data-original cover images. 2023-11-19 12:46:16 -06:00
Jim Miller
3c94c9d308 Bump Test Version 4.29.2 2023-11-19 09:54:43 -06:00
Jim Miller
831bea725f adapter_storiesonlinenet: Update for chapter URL change. See #1013 2023-11-19 09:49:46 -06:00
Jim Miller
b748283484 Comment about why we're keeping a typo. See #1011 2023-11-15 10:48:39 -06:00
Jim Miller
28af7e1722 Bump Test Version 4.29.1 2023-11-15 08:50:27 -06:00
Jim Miller
1673da5a4b BrowserCache should ignore usecache flag, that's for BasicCache. 2023-11-15 08:50:20 -06:00
Jim Miller
c97c0e822d Suppress a debug output except when it matters. 2023-11-15 08:49:40 -06:00
Jim Miller
ce24ac70d9 Bump Release Version 4.29.0 2023-11-01 11:41:05 -05:00
Jim Miller
9ab4739710 Update translations. 2023-10-24 20:59:23 -05:00
Jim Miller
685084e711 Add use_flaresolverr_proxy:directimages comment to defaults.ini #1007 2023-10-24 20:57:47 -05:00
Jim Miller
dd049ac297 Bump Test Version 4.28.8 2023-10-21 09:50:16 -05:00
Jim Miller
516f7464b7 Update messages for translations 2023-10-21 09:47:17 -05:00
Jim Miller
46be37e034 Retry Calibre metadata update when it fails due to conflicting program(Windows File Explorer) 2023-10-21 09:15:54 -05:00
Jim Miller
693f0aa774 Bump Test Version 4.28.7 2023-10-19 13:01:54 -05:00
Jim Miller
646693ca3e Change bs4.find(text=) to string= for deprecation change. 2023-10-19 13:01:47 -05:00
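The deprecation swap is mechanical; a minimal sketch with illustrative markup:

```python
# bs4 renamed the text= keyword to string= in Beautiful Soup 4.4; text=
# still works but emits a DeprecationWarning in recent releases.
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>Status: Complete</p>", "html.parser")
node = soup.find(string="Status: Complete")  # formerly: soup.find(text=...)
```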
Jim Miller
22534986d3 adapter_asianfanficscom: Fix for 'friend only' stories initially giving 404 2023-10-19 12:08:12 -05:00
Jim Miller
18b183585a Tweaks to use_flaresolverr_proxy:directimages 2023-10-19 12:06:32 -05:00
Jim Miller
5862ba627e Bump Test Version 4.28.6 2023-10-18 12:13:27 -05:00
Jim Miller
c38f4ab400 Add use_flaresolverr_proxy:directimages experimental for #1007 2023-10-18 12:13:27 -05:00
Jim Miller
f5c9fcf029 Comment out a debug 2023-10-18 12:13:22 -05:00
Jim Miller
9e206d2215 Comment out some debugs. 2023-10-16 11:05:14 -05:00
Jim Miller
b1b2451fa6 Fixes for poor '\' escapes that give SyntaxWarning 2023-10-13 17:51:52 -05:00
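The SyntaxWarning in question comes from invalid escape sequences like `\d` in plain string literals; raw strings are the usual fix. A small sketch (the pattern is illustrative):

```python
# "Chapter \d+" as a plain string is an invalid escape sequence and raises
# SyntaxWarning on Python 3.12+. A raw string passes the backslash through.
import re

pattern = r"Chapter \d+"  # raw string: no escape processing
m = re.search(pattern, "Chapter 12: The End")
```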
Jim Miller
91f2f84c10 Remove tests for removed site wuxiaworld.site 2023-10-13 17:43:59 -05:00
Jim Miller
16ba74c98e Bump Test Version 4.28.5 2023-10-13 16:02:03 -05:00
Jim Miller
0cc3b81580 ConfigParser.read_file added in py3.2, readfp removed in py3.12, only used in plugin #1006 2023-10-13 16:01:48 -05:00
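The rename is a drop-in replacement; a minimal sketch with illustrative ini content:

```python
# ConfigParser.readfp() was removed in Python 3.12; read_file(), available
# since 3.2, takes the same file-like object.
import configparser
import io

parser = configparser.ConfigParser()
parser.read_file(io.StringIO("[defaults]\nis_adult: true\n"))  # was: readfp()
```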
Jim Miller
c769900332 Bump Test Version 4.28.4 2023-10-12 20:43:42 -05:00
Jim Miller
a84e6ab385 Update missing image library message. 2023-10-12 20:41:58 -05:00
Jim Miller
af163c27e0 Update six.py to 1.16 for Python 3.12 compatibility. #1006 2023-10-12 20:41:48 -05:00
Jim Miller
016452ec89 Fix a spacing in ini 2023-10-12 11:30:05 -05:00
Jim Miller
b584779a13 Bump Test Version 4.28.3 2023-10-10 12:41:05 -05:00
Jim Miller
01d97ed770 Add base_adapter.img_url_trans() for adapter_fictionlive image URLs #1004 2023-10-10 12:40:49 -05:00
Jim Miller
607ef27fe1 Bump Test Version 4.28.2 2023-10-07 09:29:28 -05:00
Jim Miller
448a9cfaef Pillow minimum version for CLI 2023-10-07 09:29:20 -05:00
Jim Miller
88fb6069fc Bump Test Version 4.28.1 2023-10-07 09:23:23 -05:00
Jim Miller
cd5fd2cab4 Pillow change for CLI, closes #1002 2023-10-07 09:23:04 -05:00
Jim Miller
a21fcf7e77 Bump Release Version 4.28.0 2023-10-02 13:34:29 -05:00
Jim Miller
627a8dbff5 Update translations. 2023-10-02 13:33:49 -05:00
Jim Miller
dd1207f11e Bump Test Version 4.27.4 2023-09-21 20:01:15 -05:00
Jim Miller
49aec452ca adapter_fanficsme: More fixes for unusual cases, now using regexp #999 2023-09-21 20:01:08 -05:00
Jim Miller
e033f71ece Bump Test Version 4.27.3 2023-09-21 11:45:34 -05:00
Jim Miller
62b097f3d5 adapter_fanficsme: Fixes for some unusual cases Closes #999 2023-09-21 11:44:22 -05:00
Jim Miller
3098c1983f Bump Test Version 4.27.2 2023-09-20 11:29:07 -05:00
Jim Miller
37626680f9 Refactor adastrafanfic.com to use base_otw_adapter 2023-09-20 11:29:07 -05:00
Jim Miller
d99fe607da Refactor to make base_otw_adapter 2023-09-20 11:29:07 -05:00
Jim Miller
c80f22cdd3 Remove site: noveltrove.com - Site broken +1 years, owner unresponsive 3years Closes #998 2023-09-20 11:29:07 -05:00
Jim Miller
0b6402ca8a Remove site: hlfiction.net - Site broken ~3 years (sql errors) 2023-09-20 11:29:07 -05:00
Jim Miller
26a7633337 Remove site: worldofx.de - Changed ~2years ago incompatibly, not efiction 2023-09-20 11:29:07 -05:00
Jim Miller
3ee7614441 Remove site: archive.skyehawke.com - Domain parked, broken ~3years 2023-09-20 11:29:07 -05:00
Jim Miller
718ae6ac83 Remove site: www.destinysgateway.com - Domain parked +1year 2023-09-20 11:29:07 -05:00
Jim Miller
e0686eada2 Remove site: merengo.hu - Doesn't serve text in full print +1year 2023-09-20 11:29:07 -05:00
Jim Miller
9f1fd42889 Remove site: www.scarvesandcoffee.net - Changed +2years ago incompatibly 2023-09-20 11:29:07 -05:00
Jim Miller
a088a34c89 Remove site: www.silmarillionwritersguild.org - Changed +2years ago incompatibly 2023-09-20 11:29:07 -05:00
Jim Miller
14cdc10ee3 Remove site: www.lushstories.com - Changed +2years ago incompatibly
Closes #988
2023-09-20 11:28:45 -05:00
Jim Miller
8667643e7c Remove site: www.lotrgfic.com - DNS removed +1year ago 2023-09-20 10:49:57 -05:00
Jim Miller
e6d123a17d Switch from setup.py to pyproject.toml for CLI packaging. 2023-09-05 12:48:50 -05:00
Jim Miller
ae28b714b3 Bump Test Version 4.27.1 2023-09-05 10:30:22 -05:00
Jim Miller
33cd1642f8 Explicitly call set_image_allocation_limit() for larger image buffer and error on 0x0 image from image_and_format_from_data() 2023-09-05 10:29:34 -05:00
Jim Miller
63ec69f9f2 Log calibre, etc version data in FFF BG job. 2023-09-05 10:12:37 -05:00
Jim Miller
20ea9a00ed Bump Release Version 4.27.0 2023-09-01 08:30:19 -05:00
Jim Miller
779222b66d Bump Test Version 4.26.5 2023-08-24 08:10:53 -05:00
Jim Miller
afb2b9fe29 AO3: Real fix for adult string change, revert earlier attempted fixes. 2023-08-24 08:10:45 -05:00
Jim Miller
20052e1922 Bump Test Version 4.26.4 2023-08-23 14:59:03 -05:00
Jim Miller
e03f3f40da AO3: Don't assume div preface always present in chapters. 2023-08-23 14:58:55 -05:00
Jim Miller
00f6656d7d Bump Test Version 4.26.3 2023-08-23 13:24:32 -05:00
Jim Miller
dd2c1a48b5 AO3: Don't assume chapter userstuff module always present. 2023-08-23 13:24:27 -05:00
Jim Miller
a37588a8f7 Bump Test Version 4.26.2 2023-08-23 08:47:36 -05:00
Jim Miller
fc99805a85 AO3: Don't assume byline always present. 2023-08-23 08:47:29 -05:00
Jim Miller
d73b1732d3 Bump Test Version 4.26.1 2023-08-17 23:12:14 -05:00
Jim Miller
043fb289bf Fix for extratags not being picked up by include_in_subjects 2023-08-17 23:12:06 -05:00
Jim Miller
a0332f27be Bump Release Version 4.26.0 2023-08-17 12:06:13 -05:00
Jim Miller
99285763d3 adapter_royalroadcom: user found a story with no chapters 2023-08-16 12:06:00 -05:00
Jim Miller
26467d8f35 Bump Test Version 4.25.14 2023-08-05 16:38:04 -05:00
Jim Miller
930ba5bb19 base_efiction: .string -> stripHTML for nested tags Closes #984 2023-08-05 16:18:47 -05:00
Jim Miller
fb552c823a Bump Test Version 4.25.13 2023-07-27 08:29:51 -05:00
burny2
bfc0c4f3ef Fix fanfiktionde status parsing 2023-07-27 08:27:48 -05:00
Jim Miller
216cb27f03 Bump Test Version 4.25.12 2023-07-26 11:47:36 -05:00
Jim Miller
21a5ded593 AO3: Make subscribed, markedforlater True/False to match bookmarked/bookmarkprivate/bookmarkrec 2023-07-26 11:47:00 -05:00
Jim Miller
ff07987a02 Bump Test Version 4.25.11 2023-07-25 17:54:45 -05:00
Jim Miller
bd6afdafb8 AO3: Add subscribed, markedforlater 2023-07-25 17:54:37 -05:00
Jim Miller
fd7c5ac867 Add title replace_metadata(commented) for literotica Ch/Pt titles 2023-07-25 16:47:32 -05:00
Jim Miller
87eb84b5fa Bump Test Version 4.25.10 2023-07-23 16:29:36 -05:00
Jim Miller
784cb711d8 Update comments for include_subject_tags x_LIST 2023-07-23 16:29:30 -05:00
Jim Miller
54a00a934b Add _LIST option to include_subject_tags(/extra_subject_tags), refactor 2023-07-23 16:09:54 -05:00
Jim Miller
c638ac8457 Bump Test Version 4.25.9 2023-07-22 09:40:28 -05:00
Jim Miller
b710a4cdc7 Filter cookies for flaresolverr 2023-07-22 09:40:22 -05:00
Jim Miller
16c8c6b445 Bump Test Version 4.25.8 2023-07-22 08:48:08 -05:00
Jim Miller
5cee35149f AO3 fix for protected email addr in chapter name breaking chapter datetime 2023-07-22 08:43:36 -05:00
Jim Miller
de201c7263 Remove some test1.com default settings 2023-07-21 20:39:10 -05:00
Jim Miller
222a4f4828 Bump Test Version 4.25.7 2023-07-21 17:47:04 -05:00
Jim Miller
7d6af47f60 Fix for #979, AO3 Get URLs from Page 2023-07-21 17:46:56 -05:00
Jim Miller
1c05d58d1a Bump Test Version 4.25.6 2023-07-19 11:14:02 -05:00
Jim Miller
8152b51353 adapter_storiesofardacom: Fix for detecting adult question on indiv chapters. 2023-07-19 11:13:18 -05:00
Jim Miller
d387eafff2 Bump Test Version 4.25.5 2023-07-18 08:41:29 -05:00
Rose Davidson
fe5605ea50 Add support for www.sunnydaleafterdark.com
This is an EFiction style site, focusing on Buffy the Vampire Slayer fics with the Buffy/Spike ship.

There are a few quirks about how the site shows the infobox metadata.
2023-07-18 08:40:29 -05:00
Jim Miller
7f97decb8a Bump Test Version 4.25.4 2023-07-15 17:06:20 -05:00
Jim Miller
cfd28dd1ff Add anthology_merge_keepsingletocs option, requires new EpubMerge. 2023-07-15 17:06:20 -05:00
Jim Miller
2c43eab432 Use anthology url for site config section 2023-07-15 16:15:30 -05:00
Jim Miller
fda597ddae Bump Test Version 4.25.3 2023-07-12 12:48:24 -05:00
Jim Miller
7502c0f2fb Apply mark_new_chapters to new story chapters in Anthologies. #977 2023-07-12 12:48:13 -05:00
Jim Miller
eaeeda6911 Allow mark_new_chapters when 1 chapter in case it changes. 2023-07-12 10:26:15 -05:00
Jim Miller
8850c1a62b Increase sleep times between cache checks using open_pages_in_browser. 2023-07-08 13:32:02 -05:00
Jim Miller
0205ec4ccb Bump Test Version 4.25.2 2023-07-08 13:16:36 -05:00
Jim Miller
2600bf7be5 adapter_literotica: 'Fix' clean_chapter_titles for titles ending with Pt or Ch 2023-07-08 13:13:42 -05:00
Jim Miller
012ff40f0f Bump Test Version 4.25.1 2023-07-04 18:48:34 -05:00
Jim Miller
0df9e39931 Fix for ficbook.net date change. Closes #973 2023-07-04 18:48:24 -05:00
Jim Miller
97fcc3af33 Bump Release Version 4.25.0 2023-07-03 15:38:14 -05:00
Jim Miller
be40433377 Bump Test Version 4.24.9 2023-06-29 17:23:13 -05:00
Jim Miller
a1f29cb034 Fix for specific cover error. 2023-06-29 17:23:06 -05:00
Jim Miller
b2b584d832 Bump Test Version 4.24.8 2023-06-26 19:34:01 -05:00
Jim Miller
415cd6597e Fix for make_firstimage_cover causing embedded image to also use cover.jpg. 2023-06-26 19:33:47 -05:00
Jim Miller
d1d5d61b87 Bump Test Version 4.24.7 2023-06-24 14:45:14 -05:00
Jim Miller
2c11ecc5c8 adapter_wuxiaworldxyz: Paginated TOC 2023-06-24 14:45:00 -05:00
Jim Miller
0ac66425f8 Bump Test Version 4.24.6 2023-06-18 17:27:09 -05:00
Jim Miller
367d3e4435 Put output_css after workskin so it can override. See #967 2023-06-18 17:26:30 -05:00
Jim Miller
05b7147e64 Fix whitespace 2023-06-18 17:25:51 -05:00
Jim Miller
200c877418 Fix whitespace 2023-06-18 17:04:48 -05:00
Jim Miller
84323c1608 Use site lists for shared config entries in personal.ini 2023-06-18 16:53:42 -05:00
niacdoial
3ba2edef2d improved config mechanism to include workskin 2023-06-18 18:43:31 +02:00
niacdoial
e5cc1cccf2 add ability to download AO3 workskin if scraping styles is enabled 2023-06-18 17:15:05 +02:00
Jim Miller
c50ffc40dc Bump Test Version 4.24.5 2023-06-17 17:00:54 -05:00
Jim Miller
1f8106c1f3 fix_relative_text_links fix for #anchors doesn't handle '.' in id. Fixes #966 refer #952 2023-06-17 17:00:40 -05:00
Jim Miller
d9ca72571e Bump Test Version 4.24.4 2023-06-17 10:42:16 -05:00
Jim Miller
ecb0620929 Make Rejects List Note column orderable. 2023-06-17 10:41:19 -05:00
Jim Miller
c6b381e61a Bump Test Version 4.24.3 2023-06-14 13:24:06 -05:00
mvlcek
faf352bf80 Login now has a hidden token. 2023-06-14 11:54:26 -05:00
Jim Miller
269b7d5bd1 Bump Test Version 4.24.2 2023-06-14 09:51:49 -05:00
Jim Miller
439d617364 AO3 Check for hidden work after login. 2023-06-14 09:51:28 -05:00
Jim Miller
d0c85feda5 Bump Test Version 4.24.1 2023-06-12 16:48:00 -05:00
Jim Miller
25ebc603e7 Allow for href='' in fix_relative_text_links processing. 2023-06-12 16:47:54 -05:00
Jim Miller
1683d950c3 Cleanup changes from #958 2023-06-08 08:09:40 -05:00
chocolatechipcats
961bb28ecd Update plugin-defaults.ini
Updated the always-login comment to mention hide warnings/additional tags.
2023-06-08 08:05:24 -05:00
chocolatechipcats
bbb3db31a8 Update defaults.ini
Updated the always-login comment to mention hide warnings/additional tags.
2023-06-08 08:05:24 -05:00
chocolatechipcats
c917c5da3d Update plugin-defaults.ini
Added note about login-restricted stories in include_in_genre.
2023-06-08 08:05:24 -05:00
chocolatechipcats
edc2056e75 Update defaults.ini
Added note about login-restricted stories in include_in_genre.
2023-06-08 08:05:24 -05:00
chocolatechipcats
84b7cbcda2 Update defaults.ini
Added note about login-restricted stories in include_in_genre.
2023-06-08 08:05:24 -05:00
Jim Miller
44484670f2 Bump Release Version 4.24.0 2023-06-02 08:40:56 -05:00
Jim Miller
0b442422ab Bump Test Version 4.23.6 2023-05-27 20:21:25 -05:00
Jim Miller
d0448af52e AO3: Apparently minor change to chapter end note HTML. #956 2023-05-27 20:20:41 -05:00
Jim Miller
e82585ecc7 Bump Test Version 4.23.5 2023-05-27 16:36:41 -05:00
Jim Miller
ff36bd30c5 Fix force_cover_image when already in story 2023-05-27 16:36:33 -05:00
Jim Miller
12b2117c77 Fix wuxiaworld.xyz specific setting section 2023-05-25 10:04:49 -05:00
Jim Miller
34ec532eed Bump Test Version 4.23.4 2023-05-21 10:44:41 -05:00
Jim Miller
2fa23ce9fd wuxiaworld.co -> wuxiaworld.xyz changes most data collection #953 2023-05-21 10:44:22 -05:00
Jim Miller
8399061dc9 Bump Test Version 4.23.3 2023-05-08 21:53:45 -05:00
Jim Miller
86ab2806fa fix_relative_text_links: Keep #anchor links if target also in chapter. See #952 2023-05-08 21:51:42 -05:00
Jim Miller
6f77504ca9 Fix Update Always metadata in BG logic 2023-05-08 19:47:46 -05:00
Jim Miller
a259297092 Bump Test Version 4.23.2 2023-05-06 21:34:32 -05:00
Jim Miller
2c662b6f33 Add order_chapters_by_date option for literotica.com, used to be hard coded. 2023-05-06 21:34:17 -05:00
Jim Miller
548d6a5a58 Bump Test Version 4.23.1 2023-05-04 19:24:12 -05:00
Jim Miller
f3d2513d32 Add force_update_epub_always option to update when EPUB has more chapters than source. See #950 #949 #942. 2023-05-04 19:23:18 -05:00
Jim Miller
8b20756095 Bump Release Version 4.23.0 2023-05-02 08:37:33 -05:00
Jim Miller
8f093769ce Bump Test Version 4.22.7 2023-04-27 13:38:53 -05:00
Jim Miller
f6dafecfa1 Change force_img_referer to force_img_self_referer_regexp See #940 #941 2023-04-27 13:37:13 -05:00
Jim Miller
98f95a7da8 adapter_deviantartcom: Another detect login string *properly*. #947 2023-04-27 13:18:59 -05:00
Jim Miller
f3d373c8ca Change force_img_referer to force_img_referer See #940 #941 2023-04-27 09:06:54 -05:00
Jim Miller
536ff35d66 Bump Test Version 4.22.6 2023-04-27 08:37:17 -05:00
Jim Miller
6d31c5fb94 adapter_deviantartcom: Another detect login string. #947 2023-04-27 08:36:58 -05:00
Jim Miller
5730d3583a Bump Test Version 4.22.5 2023-04-18 10:07:20 -05:00
Jim Miller
da64336967 Show INI highlighting under [storyUrl] sections 2023-04-18 10:07:09 -05:00
Jim Miller
480311c442 Update translations. 2023-04-17 10:32:32 -05:00
Jim Miller
8b44e3d4b6 adapter_quotevcom: Some stories have no comments. 2023-04-17 09:58:34 -05:00
Jim Miller
9049625ec2 Bump Test Version 4.22.4 2023-04-08 21:34:29 -05:00
Jim Miller
d8c70ceae2 Don't try to set seriesUrl when no series (Anthologies) 2023-04-08 21:33:59 -05:00
Jim Miller
95bb8a0c7f Bump Test Version 4.22.3 2023-04-06 14:49:33 -05:00
Jim Miller
9b1a64616b Add force_img_referer optional feature. Closes #940 2023-04-06 14:48:52 -05:00
Jim Miller
8a6894fa28 Fix to allow update epub to get missing images. 2023-04-06 12:27:21 -05:00
Jim Miller
7c4e819c93 Add comment about why cover not read from epub on update 2023-04-06 12:03:21 -05:00
Jim Miller
9bedeb55a0 Add to AO3 authorUrl comments. 2023-04-05 10:20:48 -05:00
Jim Miller
6c92d45d97 Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2023-04-05 10:14:16 -05:00
chocolatechipcats
c7c029c706 missed a line... 2023-04-05 10:14:11 -05:00
chocolatechipcats
6fec02f79e missed a line... 2023-04-05 10:14:11 -05:00
chocolatechipcats
fc3e8bb8ff Replacing orphan_account regex 2023-04-05 10:14:11 -05:00
chocolatechipcats
3f52734da2 Replacing orphan_account regex 2023-04-05 10:14:11 -05:00
chocolatechipcats
cde8a739fb orphan_account authorUrl
This replaces the orphan_account authorUrl (which 404's) with a link to the AO3 homepage
2023-04-05 10:14:11 -05:00
chocolatechipcats
dc5837badb orphan_account authorUrl
This replaces the orphan_account authorUrl (which 404's) with a link to the AO3 homepage
2023-04-05 10:14:11 -05:00
Jim Miller
43a2d5cd67 Series name can also have [ in it. 2023-04-04 17:14:18 -05:00
Jim Miller
2c0a1d1046 Don't use Raw series with calibre_series_meta. 2023-04-04 16:56:14 -05:00
Jim Miller
64aaaf6daa Bump Test Version 4.22.2 2023-04-03 16:49:03 -05:00
Jim Miller
dd2a076b6f Add static include_in_* when double quoted. 2023-04-02 19:06:54 -05:00
Jim Miller
cf7f84c886 Set Calibre Series URL link. 2023-04-01 20:46:31 -05:00
Jim Miller
98a5a120c1 Bump Test Version 4.22.1 2023-04-01 20:46:31 -05:00
Jim Miller
77d35d88c7 Anthologies don't need per-story config(custom_columns_settings) 2023-04-01 20:46:31 -05:00
Jim Miller
f25ed9efbb Bump Release Version 4.22.0 2023-04-01 10:14:18 -05:00
Jim Miller
de7e4803a3 Bump Test Version 4.21.8 2023-03-29 22:38:59 -05:00
Jim Miller
1516b100d2 Remove site: merlinfic.dtwins.co.uk 'This site has been removed due to PHP compatibility issues.' 2023-03-29 22:34:57 -05:00
Jim Miller
7ff2976dfe Remove site: tasteofpoison.inkubation.net broken ~3 years 2023-03-29 22:32:02 -05:00
Jim Miller
f4426d0532 Remove site: www.andromeda-web.com broken ~3 years 2023-03-29 22:31:13 -05:00
Jim Miller
f4fbbf0d34 Remove site: sword.borderline-angel.com broken ~3 years 2023-03-29 22:29:14 -05:00
Jim Miller
57cf738df5 Remove site: www.qaf-fic.com broken ~3 years 2023-03-29 22:27:50 -05:00
Jim Miller
edb09d1a7e Remove site: buffygiles.velocitygrass.com broken ~3 years 2023-03-29 22:26:41 -05:00
Jim Miller
84c5e245e6 Remove site: trekiverse.org broken ~3 years 2023-03-29 22:24:49 -05:00
Jim Miller
95cece7e9c Remove site: archive.shriftweb.org broken ~4 years 2023-03-29 19:54:17 -05:00
Jim Miller
ea345b059d Remove site: csi-forensics.com - SSL expired +1yr, broken ~3 years 2023-03-29 19:50:49 -05:00
Jim Miller
6ca6d47066 Remove site: www.wraithbait.com - SSL expired +1yr, broken ~3 years 2023-03-29 19:49:29 -05:00
Jim Miller
fea04ed16c Remove site: www.ik-eternal.net - No DNS, broken ~18 months 2023-03-29 19:46:26 -05:00
Jim Miller
84b3b6d61e Remove site: themaplebookshelf.com - No DNS, broken ~18 months 2023-03-29 19:45:09 -05:00
Jim Miller
4f0be16f0b Remove site: www.looselugs.com - No DNS, broken ~3 years 2023-03-29 19:43:26 -05:00
Jim Miller
f8fc1a2881 Remove site: fanfic.potterheadsanonymous.com - No DNS, broken ~3 years 2023-03-29 19:42:02 -05:00
Jim Miller
f9471377bb Remove site: sugarquill.net - Site retired 2023-03-29 17:56:12 -05:00
Jim Miller
152088de87 adapter_thehookupzonenet: Fix changed date format 2023-03-29 17:49:01 -05:00
Jim Miller
82702ea958 Update translations. 2023-03-25 09:54:36 -05:00
Jim Miller
3432a786d5 Bump Test Version 4.21.7 2023-03-21 21:17:14 -05:00
Jim Miller
4fd8972f6a adapter_ficbooknet: Fix for site change. 2023-03-21 21:17:09 -05:00
Jim Miller
e4847653c6 Bump Test Version 4.21.6 2023-03-21 13:30:16 -05:00
Jim Miller
6e73c7400a adapter_wattpadcom: Fix accidentally hardcoding a story in. 2023-03-21 13:16:20 -05:00
Jim Miller
5c40f4073a Bump Test Version 4.21.5 2023-03-20 11:18:25 -05:00
Jim Miller
da3777a0ca Changes to wattpad API mapping chapter URLs to story URLs. 2023-03-20 11:18:20 -05:00
Jim Miller
dd636bb55f Bump Test Version 4.21.4 2023-03-15 10:24:59 -05:00
Jim Miller
6fcfdaabf3 Remove inline ads, only seen with flaresolverr so far 2023-03-15 10:24:51 -05:00
Jim Miller
e26eb9d9cc Bump Test Version 4.21.3 2023-03-14 16:47:25 -05:00
Jim Miller
732d40f5c8 Fix for custom columns [storyUrl] sections. 2023-03-14 16:46:58 -05:00
Jim Miller
814cf2931c Bump Test Version 4.21.2 2023-03-09 13:42:49 -06:00
Jim Miller
5e4f041509 Remove doubled doreplacements/removeallentities from author(etc)HTML processing. 2023-03-09 13:40:02 -06:00
Jim Miller
8862ec985f Bump Test Version 4.21.1 2023-03-07 10:44:17 -06:00
Jim Miller
c887697d61 AO3: Better fix for always_reload_first_chapter vs use_view_full_work
Fixes #932
This reverts commit a2e9d29cf6.
2023-03-07 10:39:30 -06:00
Jim Miller
30115980af adapter_fictionmaniatv: Site change for status, fixes #931 2023-03-07 09:55:13 -06:00
Jim Miller
be057e296f Bump Release Version 4.21.0 2023-03-01 11:22:40 -06:00
Jim Miller
a5d42e07c9 Bump Test Version 4.20.7 2023-02-26 16:49:02 -06:00
Hazel Shanks
6484f588e4 fix #922 -- call utf8fromSoup exactly once 2023-02-26 16:38:08 -06:00
chocolatechipcats
83a5c28d71 Update defaults.ini 2023-02-24 17:59:37 -06:00
chocolatechipcats
96a129a70f Update plugin-defaults.ini 2023-02-24 17:59:37 -06:00
Jim Miller
51e6892a5e Bump Test Version 4.20.6 2023-02-24 16:21:50 -06:00
Jim Miller
47ad5c1e1f adapter_royalroadcom: Fixes for site changes. #923 2023-02-24 16:19:52 -06:00
Jim Miller
bdb90941d3 Bump Test Version 4.20.5 2023-02-19 20:43:47 -06:00
Jim Miller
a2e9d29cf6 AO3: Re-soup full_work on every chapter to avoid problems with soup changes. Found with always_reload_first_chapter:true 2023-02-19 20:43:28 -06:00
Jim Miller
b43bec4126 Import brotlidecpy directly for A-shell on iOS users. 2023-02-17 14:23:46 -06:00
Jim Miller
5992f835fb defaults.ini: Add comment about setting int/float custom columns to None 2023-02-12 17:07:41 -06:00
Jim Miller
263c840f30 Minor fix in exception processing 2023-02-11 16:40:02 -06:00
Jim Miller
7786b1b5a9 Bump Test Version 4.20.4 2023-02-11 13:39:16 -06:00
Jim Miller
b1ce5f8956 adapter_thesietchcom: Fix for site more closely following XenForo2 2023-02-11 13:38:37 -06:00
Jim Miller
5e6ab494b9 Bump Test Version 4.20.3 2023-02-10 10:00:22 -06:00
Jim Miller
b99560acca FlareSolverr: novelfull.com sometimes w/o expires of any kind 2023-02-10 09:20:22 -06:00
Jim Miller
b146552e39 Bump Test Version 4.20.2 2023-02-05 09:48:18 -06:00
Jim Miller
8468a502bb Add style attr by default to fiction.live 2023-02-05 09:48:18 -06:00
Jim Miller
1b96617c78 adapter_fictionlive: Soup chapter text to fix up HTML 2023-02-05 09:48:18 -06:00
Jim Miller
7ac179e068 Bump Test Version 4.20.1 2023-02-03 12:20:13 -06:00
Jim Miller
f29f3f973a Comment out some debug 2023-02-03 09:00:06 -06:00
Jim Miller
e775bd451d Bump Release Version 4.20.0 2023-02-02 10:29:04 -06:00
Jim Miller
bef71a49b6 Bump Test Version 4.19.16 2023-01-26 09:46:42 -06:00
Jim Miller
e5ab3e1d0c Fixes for adapter_fictionlive story URLs-normalize & skip unsub URL 2023-01-26 09:46:35 -06:00
Jim Miller
bb06ffdaea Bump Test Version 4.19.15 2023-01-23 10:58:41 -06:00
Jim Miller
5ce7aa5c48 Merge branch 'bugmaschine-main' 2023-01-23 10:58:03 -06:00
Jim Miller
85450360de Fixes for #910 adapter_deviantartcom date changes. 2023-01-23 10:55:18 -06:00
Bugmaschine
ec6873f95f C
This should be better
2023-01-23 10:55:18 -06:00
Bugmaschine
e4d5b61ef6 Modified code to use existing parse_relative_date_string function 2023-01-23 10:55:18 -06:00
Bugmaschine
644bd369e4 Did some testing, Should work 2023-01-23 10:55:18 -06:00
Bugmaschine
dede2376c3 Looks cleaner 2023-01-23 10:55:18 -06:00
Bugmaschine
2bd727bec2 Should work, but it is not clean 2023-01-23 10:55:18 -06:00
Jim Miller
50c85d4835 Revamp retries for browser cache with open_pages_in_browser 2023-01-23 10:39:17 -06:00
Jim Miller
7103630e55 Bump Test Version 4.19.14 2023-01-20 09:58:22 -06:00
Jim Miller
a31d58bca3 Fix for &amp; in chapter title. 2023-01-20 09:58:16 -06:00
Jim Miller
6ae424d3ff Bump Test Version 4.19.13 2023-01-17 21:28:16 -06:00
Jim Miller
3b703da1f3 Add r_anthmax/n_anthmax options for custom_columns_settings 2023-01-17 21:28:10 -06:00
Bugmaschine
6695f23079 Fixed the Deviantart adapter not detecting that a Deviation is Marked as needing a login 2023-01-17 13:33:25 -06:00
Jim Miller
5d4d8e6239 Bump Test Version 4.19.12 2023-01-15 14:47:02 -06:00
Jim Miller
b14590c112 Skip day of week for localization in browsercache_firefox2 2023-01-15 14:46:51 -06:00
Jim Miller
e11e09f935 Bump Test Version 4.19.11 2023-01-15 13:17:55 -06:00
Jim Miller
4e0aa707b9 Move makeDate to dateutils to call from browsercache_firefox2 2023-01-15 13:17:50 -06:00
Jim Miller
0845deb095 Bump Test Version 4.19.10 2023-01-14 10:01:14 -06:00
Jim Miller
2719705a1a adapter_mediaminerorg: Updates for site changes 2023-01-14 10:00:57 -06:00
Jim Miller
346da2cdee Report site unavailable for AO3. Closes #908 2023-01-13 17:16:43 -06:00
Jim Miller
db39aaf4ff Bump Test Version 4.19.9 2023-01-13 10:37:09 -06:00
Jim Miller
22ea1d4a15 adapter_fastnovelsnet: Fixes for site changes -- tested with use_flaresolverr_proxy 2023-01-13 10:36:53 -06:00
Jim Miller
4365e852fe Remove a debug 2023-01-13 10:35:42 -06:00
Jim Miller
6a474eb0a0 Bump Test Version 4.19.8 2023-01-12 12:05:38 -06:00
Jim Miller
020d8d9e5b Update language->langcode mapping for updated AO3 list 2023-01-12 12:05:18 -06:00
Fedor Suchkov
220ca33cc9 Added fanficfare.fetchers to packages in setup.py. 2023-01-12 10:36:40 -06:00
Jim Miller
2cee4cca06
Merge pull request #905 from JimmXinu/browsercache
Browser Cache Refactor & open_pages_in_browser feature
2023-01-11 11:14:58 -06:00
Jim Miller
a31ace8032 Bump Test Version 4.19.7 2023-01-11 11:00:35 -06:00
Jim Miller
6d0495eab8 Add open_pages_in_browser setting to defaults.ini 2023-01-11 11:00:29 -06:00
Jim Miller
6d6457a32f Bump Test Version 4.19.6 2023-01-06 13:49:34 -06:00
Jim Miller
befe0e5254 Tweak (undoc) setting name open_pages_in_browser_tries_limit 2023-01-06 13:41:55 -06:00
Jim Miller
2c41230b74 Bump Test Version 4.19.5 2023-01-06 13:31:50 -06:00
Jim Miller
0e1e92750c Use scheme in cache keys for http vs https 2023-01-06 13:29:53 -06:00
Jim Miller
b27854b8a5 Newer default date due to timezones 2023-01-06 13:05:51 -06:00
Jim Miller
2c504ae67e Change header labels to all lowercase. 2023-01-06 13:04:56 -06:00
Jim Miller
24d02895ef Bump Test Version 4.19.4 2023-01-06 12:53:02 -06:00
Jim Miller
01887e37b4 Tweak browser cache timings 2023-01-06 12:33:59 -06:00
Jim Miller
628f76c20a Fix first time browser cache sleep 2023-01-06 12:23:17 -06:00
Jim Miller
f31e7b1860 Fix for fictionpress.com 2023-01-06 12:22:48 -06:00
Jim Miller
073d52a17c Bump Test Version 4.19.3 2023-01-06 12:03:18 -06:00
Jim Miller
eac3531f31 Tweak open_pages_in_browser timings. 2023-01-06 12:02:48 -06:00
Jim Miller
7873e25779 Make py2.7 compatible. 2023-01-06 11:29:43 -06:00
Jim Miller
f468611b01 Firefox2 cache dates, convert UTC to local 2023-01-06 11:18:08 -06:00
Jim Miller
d3aea54b6c Bump Test Version 4.19.2 2023-01-05 14:19:01 -06:00
Jim Miller
1d5afe8cd6 Wrap browser cache in thread lock just in case. 2023-01-05 14:07:34 -06:00
Jim Miller
91d6aacc74 open_browser_pages_tries_limit basic implementation 2023-01-05 13:12:39 -06:00
Jim Miller
0036ba94d9 Tweak debug output 2023-01-05 13:11:54 -06:00
Jim Miller
3711663a12 Take firefox cached time from response header. 2023-01-05 13:11:17 -06:00
Jim Miller
7e2eb531ba Comment some debugs, tweak browser cache to do normal sleep after open browser 2023-01-02 19:24:40 -06:00
Jim Miller
39cca07432 Bump Test Version 4.19.1 2023-01-01 13:06:22 -06:00
Jim Miller
001cdd34c7 Tweaks to browser cache 2023-01-01 13:06:03 -06:00
Jim Miller
4cb0201970 open_pages_in_browser setting 2023-01-01 13:06:03 -06:00
Jim Miller
56da4a2850 Fix EmailPassDialog 2023-01-01 13:06:03 -06:00
Jim Miller
f613fea791 Location: and location: headers both used... 2023-01-01 13:06:03 -06:00
Jim Miller
ccd25b0c93 Only apply open_page_in_browser when use_browser_cache_only:true 2023-01-01 13:06:03 -06:00
Jim Miller
60c14c2cef Add key list to browser cache to look for WebToEpub cache entries 2023-01-01 13:06:03 -06:00
Jim Miller
895274ad24 Scandir for cache troubleshooting 2023-01-01 13:06:03 -06:00
Jim Miller
bf13b81837 Move open_page_in_browser up into BrowserCacheDecorator 2023-01-01 13:06:03 -06:00
Jim Miller
adeb9f26c3 Move open_page_in_browser up into BrowserCacheDecorator 2023-01-01 13:06:03 -06:00
Jim Miller
c3631f6ac7 Change BrowserCache to on-demand, not scan 2023-01-01 13:06:03 -06:00
Jim Miller
1301fc3dc4 Rename browsercache files before changing contents to preserve history. 2023-01-01 13:06:03 -06:00
Jim Miller
d76fa989d1 Fix encoding auto/chardet 2023-01-01 13:06:03 -06:00
Jim Miller
53dd0073f1 Update defaults.ini to only quotev.com benefiting from use_cloudscraper 2023-01-01 13:06:03 -06:00
Jim Miller
b6b0b0a8c5 Make failed chapter URLs links with continue_on_chapter_error 2023-01-01 13:06:03 -06:00
Jim Miller
c0573d76fd Include (current) story-title in normalized ffnet chapter URLs. 2023-01-01 13:06:03 -06:00
Jim Miller
44b803a529 Tweak debug output 2023-01-01 13:06:03 -06:00
Jim Miller
c6705a82db Refactoring for browser cache v2/fetcher 2023-01-01 13:06:03 -06:00
Jim Miller
66813584f5 Bump Release Version 4.19.0 2023-01-01 13:00:04 -06:00
Jim Miller
e61829052e Bump Test Version 4.18.5 2022-12-30 08:38:16 -06:00
Jim Miller
701d358ea6 Fixes for config base_xenforo options, closes #902 2022-12-30 08:35:56 -06:00
Jim Miller
15d434fce2 Bump Test Version 4.18.4 2022-12-16 13:51:14 -06:00
Eleanor Davies
c801729215
scribblehub flaresolverr fix (#900)
* scribblehub flaresolverr fix
2022-12-16 13:46:34 -06:00
Jim Miller
2e192380f0 Bump Test Version 4.18.3 2022-12-16 12:29:37 -06:00
Jim Miller
4c4355a910 Equalize ok/cancel buttons on user/pass & email pass dialogs 2022-12-16 12:11:37 -06:00
Jim Miller
7c17a2dcd0 Fix for adapter_quotevcom status 2022-12-16 11:57:50 -06:00
Jim Miller
186a97042b Bump Test Version 4.18.2 2022-11-30 09:42:12 -06:00
Jim Miller
d2f6d2d6b8 adapter_ficbooknet: Site change for status + remove debug 2022-11-30 09:42:04 -06:00
Jim Miller
0c1bbd0c96 Bump Test Version 4.18.1 2022-11-27 09:14:20 -06:00
Jim Miller
f5f9a7d303 Tweak for adapter_storiesonlinenet description parsing 2022-11-27 09:14:04 -06:00
Jim Miller
224bd11821 Bump Release Version 4.18.0 2022-11-21 19:04:53 -06:00
Jim Miller
6d6cac850b Remove a somewhat misleading status message. #897 2022-11-21 19:03:51 -06:00
Jim Miller
d81cc0bd4a Bump Test Version 4.17.8 2022-11-15 08:43:10 -06:00
Jim Miller
73459f2b83 Still allow images with use_flaresolverr_proxy if use_browser_cache 2022-11-15 08:42:41 -06:00
Jim Miller
aa8c96de7b defaults.ini file name settings tweaks 2022-11-14 12:06:09 -06:00
Jim Miller
61a7701e78 Bump Test Version 4.17.7 2022-11-10 21:45:13 -06:00
Jim Miller
337086b90b Update metadata caching with dependency invalidating 2022-11-10 21:45:00 -06:00
Jim Miller
20003aa49d Bump Test Version 4.17.6 2022-11-06 14:16:12 -06:00
Jim Miller
e1d5a68a90 Adding replace_chapter_text feature. 2022-11-06 14:15:17 -06:00
Jim Miller
ac5f94a6ac Bump Test Version 4.17.5 2022-11-06 10:25:23 -06:00
mvlcek
d85e3b977e Support classic AND modern (and minimalist) theme for storiesonline, finestories and scifistories 2022-11-06 12:01:19 +01:00
Jim Miller
fead675aae Bump Test Version 4.17.4 2022-11-05 10:12:54 -05:00
Jim Miller
c33267750d remove_class_chapter missing from config lists 2022-11-05 10:10:38 -05:00
Jim Miller
9c5badc2bf Bump Test Version 4.17.3 2022-10-26 10:14:37 -05:00
Jim Miller
b65713f902 adapter_tenhawkpresents: Change site to t.evancurrie.ca 2022-10-25 18:16:54 -05:00
Jim Miller
8ad18383cc Bump Test Version 4.17.2 2022-10-22 12:04:18 -05:00
Jim Miller
6e1892dd4e adapter_adultfanfictionorg: Fixes for site changes, thanks cryosaur. 2022-10-22 11:48:48 -05:00
Jim Miller
f593295d06 Bump Test Version 4.17.1 2022-10-21 09:18:07 -05:00
Jim Miller
7eb142e598 adapter_adultfanfictionorg: Fixes for site changes, thanks cryosaur. 2022-10-21 09:17:51 -05:00
Jim Miller
4d322a8fae Remove Calibre Update Cover option entirely(was deprecated) #878 2022-10-20 09:45:27 -05:00
Jim Miller
ccea7827ce Bump Release Version 4.17.0 2022-10-18 11:47:27 -05:00
Jim Miller
ed2bb78657 Update translations. 2022-10-18 11:47:12 -05:00
Jim Miller
8871352b2c Bump Test Version 4.16.6 2022-10-14 10:48:33 -05:00
Jim Miller
04632728bc Flaresolverr v3 beta doesn't have 'headers'?? 2022-10-14 10:48:18 -05:00
Jim Miller
d92475b980 Bump Test Version 4.16.5 2022-10-14 09:29:20 -05:00
Jim Miller
89c4b68b9f Flaresolverr v3 beta using 'expiry' cookie key, was 'expires'. 2022-10-14 09:29:14 -05:00
Jim Miller
6e97d98118 Fix site name fanfiction.tenhawkpresents.ink 2022-10-08 10:19:14 -05:00
Jim Miller
e326b81b3f Bump Test Version 4.16.4 2022-09-27 10:01:42 -05:00
Jim Miller
a7ced3d78a adapter_adultfanfictionorg: Fixes for site changes. 2022-09-27 10:01:42 -05:00
Jim Miller
c78ff37f56 Bump Test Version 4.16.3 2022-09-25 08:56:47 -05:00
Jim Miller
560abad128 Disable Cancel during metadata update ProgBar. 2022-09-25 08:56:39 -05:00
Jim Miller
1adba9193a Bump Test Version 4.16.2 2022-09-22 12:27:17 -05:00
Jim Miller
a6d492d970 adapter_chosentwofanficcom: Site has several links to each story in a series page. 2022-09-22 12:27:09 -05:00
Jim Miller
56a7f271ff Bump Test Version 4.16.1 2022-09-20 10:49:10 -05:00
Jim Miller
3fffd22996 Fixes for add_category/genre_when_multi_category settings. #884 2022-09-20 10:46:40 -05:00
Jim Miller
d11d4c5263 Bump Release Version 4.16.0 2022-09-19 12:20:42 -05:00
Jim Miller
ed5260f035 Update translations. 2022-09-19 12:20:28 -05:00
Jim Miller
5df1608d74 Bump Test Version 4.15.18 2022-09-13 10:00:05 -05:00
Jim Miller
773b2600c5 Add use_ssl_default_seclevelone option for aneroticstory 2022-09-13 09:59:13 -05:00
Jim Miller
d0fddf2da6 Update embedded certifi to 2022.06.15.1 2022-09-13 09:54:51 -05:00
Jim Miller
8ccc3dc129 Bump Test Version 4.15.17 & Update translations 2022-09-12 17:25:17 -05:00
Jim Miller
2b001f003b adapter_storiesonlinenet: Fix for empty scores. #882 2022-09-12 08:50:37 -05:00
Jim Miller
dd88bef85a Bump Test Version 4.15.16 2022-09-11 20:58:04 -05:00
Jim Miller
2a6e92e586 Add flaresolverr_proxy_timeout (default 60000ms) #703 2022-09-11 20:57:44 -05:00
Jim Miller
102b23434b Bump Test Version 4.15.15 2022-09-11 20:40:31 -05:00
Jim Miller
7ea7c8497c adapter_storiesonlinenet: More tweaks to keep story-title in URL. #882 2022-09-11 20:40:24 -05:00
Jim Miller
2faafdd9f3 Bump Test Version 4.15.14 2022-09-11 16:53:30 -05:00
Jim Miller
a09c84258f adapter_storiesonlinenet: Also change index URL after login. #882 2022-09-11 16:53:23 -05:00
Jim Miller
8a3ce58d4e Bump Test Version 4.15.13 2022-09-11 14:59:25 -05:00
Jim Miller
599a89ee6a adapter_storiesonlinenet: Fix for premium accounts redirecting to directly chapter? #882 2022-09-11 14:59:15 -05:00
Jim Miller
5b0b91eb46 Bump Test Version 4.15.12 2022-09-10 09:30:40 -05:00
Jim Miller
cddfd8b835 SOL/etc: Change story URL scheme to keep story-title & use to detect story ID reuse. #882 2022-09-10 09:30:30 -05:00
Jim Miller
770c9fa167 AO3: Detect 'This work is part of an ongoing challenge and will be revealed soon!' 2022-09-09 10:24:06 -05:00
Jim Miller
ecf4b10238 Bump Test Version 4.15.11 2022-09-09 10:09:07 -05:00
Jim Miller
4c64b406df Fix for remove from update & rejects lists when lower selected first. 2022-09-09 10:08:28 -05:00
Jim Miller
031b9052d1 Update translations. 2022-09-08 12:25:05 -05:00
Jim Miller
f276b836c7 Use a chapter URL as referrer for default_cover_image/force_cover_image. 2022-09-07 13:22:41 -05:00
Jim Miller
e63b05ff16 Bump Test Version 4.15.10 2022-09-07 10:42:28 -05:00
Jim Miller
0113d07a63 adapter_wattpadcom: Add include_chapter_banner_images feature, defaults on 2022-09-07 10:42:22 -05:00
Jim Miller
c0b6e918ad Bump Test Version 4.15.9 2022-09-06 13:59:24 -05:00
Jim Miller
92d3c7c8f0 Update defaults.ini for use_old_cover and force_cover_image 2022-09-06 13:58:25 -05:00
Jim Miller
543c741502 Bump Test Version 4.15.8 2022-09-05 13:14:54 -05:00
Jim Miller
018f87767d Deprecate(rather than remove) updateepubcover feature. #878 2022-09-05 13:14:42 -05:00
Jim Miller
238884ad53 Restore updateepubcover feature. 2022-09-05 10:57:30 -05:00
Jim Miller
cd83136278 Rename always_use_existing_cover to use_old_cover 2022-09-04 16:39:42 -05:00
Jim Miller
6759803ccd Add force_cover_image setting. 2022-09-04 16:36:55 -05:00
Jim Miller
b5f6a447b9 Bump Test Version 4.15.7 2022-09-04 15:40:34 -05:00
Jim Miller
b26b124cfe Improve handling for default_cover_image failing to load. 2022-09-04 14:07:38 -05:00
Jim Miller
e58df9ac97 Fix for always_use_existing_cover when oldcover name collides with dl image. 2022-09-04 14:01:33 -05:00
Jim Miller
11f7c6f115 Link where Default Update EPUB Cover was. 2022-09-04 13:14:30 -05:00
Jim Miller
662b808ba9 Better coverage for do_updateepubcover_warning 2022-09-04 11:35:45 -05:00
Jim Miller
dbeba818f7 Bump Test Version 4.15.6 2022-09-03 19:35:59 -05:00
Jim Miller
666c3b4143 Correct a generate_cover_settings example line in plugin-defaults.ini 2022-09-03 18:11:32 -05:00
Jim Miller
e2dba246b2 Pare down and tweak initial personal.ini for new users. 2022-09-03 17:02:07 -05:00
Jim Miller
4e57d27a57 Tweak Calibre Cover options layouts, texts and defaults to be more rational for new users. 2022-09-03 16:32:34 -05:00
Jim Miller
4a58c43af9 Only apply covernewonly to setting cover from epub--not GC. 2022-09-03 16:24:29 -05:00
Jim Miller
1d2006761d Add always_use_existing_cover setting. 2022-09-03 14:42:05 -05:00
Jim Miller
23bc94451e Add test1.com sid=91 / 92 cover_image cases. 2022-09-03 14:25:00 -05:00
Jim Miller
a1f3349da0 Remove 'Update EPUB Cover?' download up, add Cover New Only instead. 2022-09-03 14:25:00 -05:00
Jim Miller
f99889d5e8 Remove a dead line of code 2022-09-03 14:25:00 -05:00
Jim Miller
137138a8ab adapter_fictionlive shouldn't set cover_image 2022-09-02 11:23:09 -05:00
Jim Miller
640b0eac0e Clear metadata cache on numWords set from plugin for derived values. 2022-09-02 11:22:42 -05:00
Jim Miller
73b78d6335 Bump Test Version 4.15.5 2022-08-25 08:18:14 -05:00
Jim Miller
7558c998df Fix for calibre_series_meta feature when series contains [ 2022-08-25 08:17:58 -05:00
Jim Miller
387aad83b6 Bump Test Version 4.15.4 2022-08-21 18:01:27 -05:00
Nicolas SAPA
43b07b6d6a nsapa_proxy: detect proxy protocol violation
Fix #865 by validating proxy response.

Signed-off-by: Nicolas SAPA <nico@ByMe.at>
2022-08-21 18:00:22 -05:00
Jim Miller
b6abcc41cf Bump Test Version 4.15.3 2022-08-21 12:20:45 -05:00
Jim Miller
a307c128fa Also include threadmarks_title in tagsfromtitle (XF) 2022-08-21 12:20:03 -05:00
Jim Miller
16b78523e5 Remove RSS link from threadmarks_title (XF2) 2022-08-21 12:20:03 -05:00
Jim Miller
8084761154 Bump Test Version 4.15.2 2022-08-18 21:42:12 -05:00
Jim Miller
d3dd5a86a8 Better layout stretching for Make series name/comment area scrollable 2022-08-18 21:41:57 -05:00
Jim Miller
69510094d3 Bump Test Version 4.15.1 2022-08-18 21:17:45 -05:00
Jim Miller
b0ca83f760 Make series name/comment area scrollable for when lengthy. 2022-08-18 21:17:37 -05:00
Khoyo
2c707a74dd README: update archlinux package information
The Archlinux fanficfare package has been dropped from the [community] repository on 2022-04-01, and is now an AUR package. See https://aur.archlinux.org/cgit/aur.git/log/PKGBUILD?h=fanficfare
2022-08-12 19:20:29 -05:00
Jim Miller
dfbbed0709 Bump Release Version 4.15.0 2022-08-11 16:12:23 -05:00
Jim Miller
842b2d2d55 Bump Test Version 4.14.9 2022-08-09 08:54:19 -05:00
Jim Miller
af22795cd5 Update translations. 2022-08-09 08:54:19 -05:00
Jim Miller
cd71351181 adapter_adultfanfictionorg: http->https Closes #870 2022-08-09 08:48:49 -05:00
Jim Miller
86b3f49e6b Fix for bug with cal6 icon theme change - doesn't immediately affect FFF. 2022-08-02 17:54:56 -05:00
Jim Miller
7e53863d15 Bump Test Version 4.14.8 2022-07-28 12:26:02 -05:00
Jim Miller
a5832e8d02 Fix for win10/qt6 progbar not displaying initially. 2022-07-28 12:24:35 -05:00
Jim Miller
fc68c4574a Bump Test Version 4.14.7 2022-07-26 10:40:08 -05:00
Jim Miller
f4a7a8657e adapter_storiesonlinenet: Single chapter stories slightly different. Also scifistories and finestories. Closes #867 2022-07-26 10:39:40 -05:00
Jim Miller
943bf1f36c Bump Test Version 4.14.6 2022-07-26 09:02:03 -05:00
Jim Miller
2482416ea5 Add get_section_url() for adapter_royalroadcom 2022-07-25 16:19:55 -05:00
Jim Miller
431369ed42 Bump Test Version 4.14.5 2022-07-20 13:54:43 -05:00
Jim Miller
314ff73280 Use Cal6 get_icons() so icon themes apply--print_tracebacks_for_missing_resources=False cal 6.2+. 2022-07-20 13:33:26 -05:00
Jim Miller
ce6df518a2 adapter_scifistoriescom: inherit from StoriesOnlineNetAdapter instead of FineStoriesComAdapter 2022-07-20 12:37:44 -05:00
Jim Miller
99049da5c6 Set use_basic_cache:true by default for finestories.com & scifistories.com 2022-07-20 12:36:16 -05:00
Jim Miller
1d73c51712 Bump Test Version 4.14.4 2022-07-19 09:01:04 -05:00
Jim Miller
dead6872d4 Use cal6 icon theme system to allow plugin icon customization. 2022-07-19 08:59:20 -05:00
Jim Miller
7a93a494ec Don't need old icon.png, still have xcf 2022-07-18 13:22:05 -05:00
Jim Miller
93cfc97d1d Bump Point Release Version 4.14.3 2022-07-15 10:26:07 -05:00
Jim Miller
bcd16b7840 Update translations. 2022-07-15 10:24:30 -05:00
Jim Miller
be9f626c85 Bump Test Version 4.14.2 2022-07-14 14:49:52 -05:00
Jim Miller
1133f5cc3a Remove site: webnovel.com See #843 2022-07-14 14:31:19 -05:00
Jim Miller
b37ae23af7 Bump Test Version 4.14.1 2022-07-12 18:09:06 -05:00
Jim Miller
e9574d66df Fix for qt6 vs qt5 in Cal6 and personal.ini search. 2022-07-12 18:08:58 -05:00
Jim Miller
55f6b882df Bump Release Version 4.14.0 2022-07-11 12:28:24 -05:00
Jim Miller
8692665724 Update some comments. 2022-07-11 12:28:06 -05:00
Jim Miller
93f483e42c Update translations. 2022-07-11 12:27:01 -05:00
Jim Miller
05e3415059 Bump Test Version 4.13.11 2022-07-09 11:20:55 -05:00
Jim Miller
e6b66636b9 adapter_fictionhuntcom: Fix for changes to chapter list. 2022-07-09 11:20:01 -05:00
Jim Miller
13c6a1fd77 Bump Test Version 4.13.10 2022-07-07 12:23:02 -05:00
Jim Miller
9b6c6da639 base XF needs chapter title as a string. Entities added back in base_adapter. 2022-07-07 12:23:02 -05:00
Jim Miller
7b596c1110 Bump Test Version 4.13.9 2022-07-07 12:01:15 -05:00
Jim Miller
23e0977218 Update translations. 2022-07-07 12:01:15 -05:00
Jim Miller
7fbcb054ad Restore & > < entities in chapter titles. Closes #863 2022-07-07 11:57:59 -05:00
Jim Miller
40a2af2b3d Bump Test Version 4.13.8 2022-07-06 11:21:14 -05:00
Jim Miller
0b8180a2cf adapter_fictionhuntcom: Update for site changes. 2022-07-06 11:20:32 -05:00
Jim Miller
33f3aa8dd2 Additional strings for translation for #860 2022-07-06 11:15:07 -05:00
Jim Miller
6682a3117b Better handling for fail of an existing anthology book on update. Closes #860 2022-07-06 11:13:57 -05:00
Jim Miller
38ea209a40 Bump Test Version 4.13.7 2022-06-28 16:20:09 -05:00
Jim Miller
295868b923 Fix for problem with remove_tags refactor. 2022-06-28 16:18:59 -05:00
Jim Miller
fc8e96cc9e Bump Test Version 4.13.6 2022-06-25 15:09:16 -05:00
Jim Miller
58387605e6 Collect rating for adapter_libraryofmoriacom, refactor rating from TOC in base_efiction. Closes #859 2022-06-25 15:09:16 -05:00
Jim Miller
1d5e5d3722 Refactor code to remove empty tags to also remove now-empty parents by making another pass. 2022-06-24 14:58:26 -05:00
Jim Miller
4aa9c1bf34 Output utf8FromSoup times to debug. Remove before release. 2022-06-24 14:58:04 -05:00
Jim Miller
d347523942 Bump Test Version 4.13.5 2022-06-17 09:45:43 -05:00
Jim Miller
a181c36ccb adapter_themasquenet: Switch to https, closes #854 2022-06-17 09:44:57 -05:00
Jim Miller
28a2b5e926 Bump Test Version 4.13.4 2022-06-10 09:02:43 -05:00
Jim Miller
65a7538452 PI: Ctrl-Return/Enter on personal.ini editbox equivalent to clicking OK button. 2022-06-10 08:59:49 -05:00
Jim Miller
bb3a86298e Bump Test Version 4.13.3 2022-06-09 13:11:05 -05:00
Jim Miller
d01ae7004a base_xenforoforum_adapter(QQ): Allow for guest/deleted author w/o a link. Closes #852 2022-06-09 13:10:58 -05:00
Jim Miller
31f3384c8e Bump Test Version 4.13.2 2022-06-08 09:10:59 -05:00
Faye
97823bc12b readonlymind: add option to include foreword/author's note
ROM chapters can include an author's note as a foreword on the chapter
page. It's an entirely separate section tag from the story content, so
when it is present, a div tag is used to wrap both.
2022-06-08 09:10:34 -05:00
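The wrapping described in that commit could look roughly like this — a hypothetical sketch, not ReadOnlyMind's actual adapter code; the function name and markup are illustrative:

```python
def wrap_chapter(story_html, foreword_html=None):
    # The foreword lives in its own section tag, separate from the
    # story content. When it is present, wrap both pieces in a single
    # div so downstream code still sees one chapter-body element.
    if foreword_html:
        return '<div>%s%s</div>' % (foreword_html, story_html)
    return story_html
```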
Jim Miller
7f2514c177 Bump Test Version 4.13.1 2022-06-05 14:06:38 -05:00
Jim Miller
e3b487205d adapter_mcstoriescom Allow multiple authors. Closes #847 2022-06-05 14:06:29 -05:00
Jim Miller
b5dd8d4565 Bump Release Version 4.13.0 2022-06-01 11:20:53 -05:00
Jim Miller
7341598cc3 Update translations. 2022-06-01 11:20:37 -05:00
Jim Miller
04dd608930 Bump Test Version 4.12.16 2022-05-31 09:33:09 -05:00
Jim Miller
8b64b415c4 adapter_chosentwofanficcom: http->https 2022-05-31 09:33:03 -05:00
Jim Miller
0da8d430d9 Bump Test Version 4.12.15 2022-05-26 08:29:32 -05:00
Jim Miller
38570c26c7 Update translations. 2022-05-26 08:29:07 -05:00
Jim Miller
78c6b3e5cd adapter_mediaminerorg: More story URL forms. #845 2022-05-26 08:27:55 -05:00
Jim Miller
7550554c3e Bump Test Version 4.12.14 2022-05-22 14:24:52 -05:00
Jim Miller
68bb6f6fcf Don't set marked in Calibre when book_id is None #833 2022-05-22 14:24:40 -05:00
Jim Miller
bf01b1a7de Bump Test Version 4.12.13 2022-05-22 10:32:12 -05:00
Jim Miller
c53cbfe156 adapter_quotevcom: Update collection of searchtags 2022-05-22 10:25:56 -05:00
Jim Miller
a1f839d732 Bump Test Version 4.12.12 2022-05-18 17:57:35 -05:00
Jim Miller
71de6900ee Add config check parsing for custom_columns_settings on personal.ini save. 2022-05-18 17:57:27 -05:00
Jim Miller
11665834b5 Bump Test Version 4.12.11 2022-05-15 13:40:17 -05:00
Jim Miller
36eed1bc43 Refactor img code in story.py to fix a problem when cover image also in story. 2022-05-15 13:39:56 -05:00
Jim Miller
b39d6a33b7 Bump Test Version 4.12.10 2022-05-10 21:23:04 -05:00
Jim Miller
9c554375aa adapter_webnovelcom: Not all paragraphs starting with '<' are HTML. #841 2022-05-10 21:22:44 -05:00
Jim Miller
7c6c82e0ac Bump Test Version 4.12.9 2022-05-10 16:09:55 -05:00
Martin Vlcek
ceccc5baab
fix storiesonline login (again) - parameter name was changed back to "email" (#840) 2022-05-10 16:08:14 -05:00
Jim Miller
379d6ac634 Bump Test Version 4.12.8 2022-05-10 10:16:23 -05:00
Jim Miller
53c75ce01c Rename adapter_fastnovelsnet - Fixes for site changes 2022-05-10 10:16:12 -05:00
Jim Miller
08044e5c0d Bump Test Version 4.12.7 2022-05-05 22:27:28 -05:00
Jim Miller
63b1d7ac72 Lighten color highlighting for storyUrls sections in dark mode. 2022-05-05 22:27:22 -05:00
Jim Miller
63450c65e1 Bump Test Version 4.12.6 2022-05-05 09:13:07 -05:00
mvlcek
e9d206bf9b fix storiesonline login 2022-05-05 09:12:44 -05:00
Jim Miller
3913028800 Bump Test Version 4.12.5 2022-05-04 16:43:46 -05:00
Jim Miller
b8879d6b75 adapter_ficbooknet: Fix for site change. 2022-05-04 16:39:05 -05:00
Jim Miller
7df74c2bbb adapter_wwwutopiastoriescom: Fixes for site changes. 2022-05-04 16:32:42 -05:00
Jim Miller
1782a32674 Remove site tomparisdorm.com - Moved to AO3 2022-05-04 16:17:21 -05:00
Jim Miller
20574c7e94 Remove site: bloodties-fans.com - Moved to AO3. 2022-05-04 16:04:52 -05:00
Jim Miller
a78eb07c77 adapter_fanficsme: Fix for changed 'words' metadata. 2022-05-04 15:54:09 -05:00
Jim Miller
a8bdcde4bf Remove site wuxiaworld.com Closes #796 2022-05-03 17:22:36 -05:00
Jim Miller
523aa75588 Remove site wuxiaworld.site Closes #758 2022-05-03 17:12:48 -05:00
Jim Miller
2b36871281 Bump Test Version 4.12.4 2022-05-01 12:12:36 -05:00
Jim Miller
0cff71b9d6 adapter_storiesonlinenet py2 fixes See #832 #829 #830 2022-05-01 12:12:28 -05:00
Jim Miller
e3d358e4e0 Bump Test Version 4.12.3 2022-04-30 17:02:35 -05:00
Jim Miller
afacc475b4 Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2022-04-30 14:24:05 -05:00
David Buckley
8051ef7c9f
add chapter date metadata to RoyalRoadAdapter (#831)
* add chapter date metadata to RoyalRoadAdapter

* string format chapter date metadata

* improve formatting
2022-04-30 14:23:56 -05:00
Jim Miller
d0a13b63ff Bump Test Version 4.12.2 2022-04-30 09:13:52 -05:00
Jim Miller
c5734f96b8 Add slow_down_sleep_time:1 for [storiesonline.net] 2022-04-30 09:13:42 -05:00
Martin Vlcek
adefbcfcf8
Address Storiesonline.net "Click to Load text..."
#756
2022-04-30 09:11:07 -05:00
Jim Miller
eb9e3ba9fe Bump Test Version 4.12.1 2022-04-29 10:31:26 -05:00
Jim Miller
6e3055e753 Fix for SB using an attr on noscript tags now. 2022-04-29 10:31:19 -05:00
Jim Miller
6c3a133ccd Bump Release Version 4.12.0 2022-04-27 11:01:38 -05:00
Jim Miller
75af89464d Update translations. 2022-04-27 10:59:44 -05:00
Jim Miller
b40676518c Bump Test Version 4.11.15 2022-04-26 16:54:43 -05:00
Jim Miller
86b86b50f9 Py2 vs py3 fix #828 2022-04-26 16:54:34 -05:00
Jim Miller
5fd455b981 Fix some indenting 2022-04-26 10:45:23 -05:00
Jim Miller
58a8ca411c Bump Test Version 4.11.14 2022-04-25 21:33:11 -05:00
Jim Miller
d2ff6ba5d2 adapter_phoenixsongnet: Remove login code and changes for static author links. 2022-04-25 21:33:00 -05:00
Jim Miller
cb3f7e1644 Remove some dead code from inherited common_utils.py 2022-04-25 12:53:26 -05:00
Jim Miller
e2c6d4be99 Bump Test Version 4.11.13 2022-04-25 11:55:36 -05:00
Jim Miller
20802c8a6b adapter_fictionhuntcom: Fixes for site changes. 2022-04-25 11:54:41 -05:00
Jim Miller
2243edb175 adapter_webnovelcom: Fixes for site changes. #828 2022-04-25 10:55:59 -05:00
Jim Miller
80c4f4cb56 Bump Test Version 4.11.12 2022-04-20 19:21:21 -05:00
Jim Miller
b43d0e4b79 Update translations. 2022-04-20 19:21:10 -05:00
Jim Miller
3c95a6a533 Xenforo2 Ongoing==In-Progress 2022-04-16 23:37:51 -05:00
Jim Miller
d3d0865a00 Bump Test Version 4.11.11 2022-04-16 17:52:31 -05:00
Jim Miller
41e2f5ed75 Apply connect_timeout setting to network requests. 2022-04-16 17:52:24 -05:00
Jim Miller
8653b1520f Bump Test Version 4.11.10 2022-04-13 13:53:40 -05:00
Jim Miller
a67dd3d7b0 adapter_webnovelcom: Unescape & too. See #825 2022-04-13 13:53:25 -05:00
Jim Miller
bdeb2a80f7 Bump Test Version 4.11.9 2022-04-13 10:03:36 -05:00
Jim Miller
0eb543a726 Update translations. 2022-04-13 09:59:13 -05:00
Jim Miller
9c9a2a22f5 Bump Test Version 4.11.8 2022-04-12 09:36:32 -05:00
Jim Miller
8aeb05a22d Detect and error on adjusted chapter list < 1. Closes #826 2022-04-12 09:36:24 -05:00
Jim Miller
11670b30ba Bump Test Version 4.11.7 2022-04-08 19:36:29 -05:00
Jim Miller
ff0a9a7335 adapter_webnovelcom: Read chapter text from json in <script> tag. 2022-04-08 19:36:23 -05:00
Jim Miller
33272aaa22 Update translations. 2022-04-08 10:03:05 -05:00
Jim Miller
75fc53f93a Bump Test Version 4.11.6 2022-04-08 09:46:50 -05:00
Sidney Markowitz
890f416eae
use large cover images on royalroad (#823) 2022-04-08 09:46:34 -05:00
Jim Miller
3a35e4d2d0 Bump Test Version 4.11.5 2022-04-06 10:52:17 -05:00
Jim Miller
81ef198d00 Add --mozilla-cookies CLI option. 2022-04-06 10:52:11 -05:00
Jim Miller
d7f149e990 Bump Test Version 4.11.4 2022-04-05 09:47:05 -05:00
Jim Miller
6e86f51164 PI: Update translation strings. 2022-04-05 09:47:05 -05:00
Jim Miller
a086de264c PI: Check for existing anthology ebook on new anthology from series URL. 2022-04-05 09:45:03 -05:00
Jim Miller
dc28197c7b Bump Test Version 4.11.3 2022-04-01 09:32:12 -05:00
Jim Miller
de8443298e ffnet: Add meta_from_last_chapter option. 2022-04-01 09:32:05 -05:00
Jim Miller
eee92b4ebb Bump Test Version 4.11.2 2022-03-27 09:55:48 -05:00
Jim Miller
10a07fe4bf adapter_literotica: add ...$ to story URL search for when /xyz-pt1 and /xyz are different stories. 2022-03-27 09:55:42 -05:00
Jim Miller
a8c10bb017 Bump Test Version 4.11.1 2022-03-25 12:08:28 -05:00
Jim Miller
ecfa75c235 Adding fandom/category parsing to adapter_fictionhuntcom--more than just HP now. 2022-03-25 12:08:20 -05:00
Jim Miller
21bd4b951d New ffnet fandom containing + 2022-03-25 10:52:36 -05:00
Jim Miller
ff6950b2e2 Alphabet order defaults.ini files. 2022-03-25 10:26:11 -05:00
Jim Miller
f9a39897a2 Merge branch 'Rikkitp-fix-wuxiaworldco' 2022-03-25 10:21:53 -05:00
Snegirev Dmitry
eeac5f2b9a fix wuxiaworldco: www => m 2022-03-25 16:24:25 +03:00
Jim Miller
98ea6ba721 Bump Release Version 4.11.0 2022-03-23 09:35:01 -05:00
Jim Miller
2ca954f048 Update translations. 2022-03-23 09:34:44 -05:00
Jim Miller
fa7cf95ee2 Bump Test Version 4.10.8 2022-03-11 10:27:33 -06:00
Jim Miller
5680027b72 adapter_quotevcom: Additional chapter image parsing. 2022-03-11 10:27:26 -06:00
Jim Miller
8c6c6991c2 Bump Test Version 4.10.7 2022-03-02 10:00:29 -06:00
Sidney Markowitz
addc024e49 Issue 813 recognize various royalroad chapter url formats 2022-03-02 20:31:42 +13:00
Jim Miller
335bfb02c2 Bump Test Version 4.10.6 2022-02-24 21:48:24 -06:00
Jim Miller
fb94a3f3f1 Change base_xenforoforum reveal_invisible_text feature to also add class=invisible_text. Closes #812 2022-02-24 21:48:06 -06:00
Jim Miller
9ea9cf4c68 Bump Test Version 4.10.5 2022-02-22 13:10:08 -06:00
Jim Miller
e977587fae adapter_fastnovelnet: Update to redirected Story URL. 2022-02-22 13:09:58 -06:00
Jim Miller
0c02cd98e0 Bump Test Version 4.10.4 2022-02-22 11:31:16 -06:00
Jim Miller
c67e19e0bf adapter_fastnovelnet: 'Normalize' chapter URLs to current storyId URL--site is changing it frequently 2022-02-22 11:31:10 -06:00
Jim Miller
4e4360ec62 Bump Test Version 4.10.3 2022-02-20 11:47:43 -06:00
Jim Miller
e786090aeb base_efiction - narrow chapter search even more. 2022-02-20 11:28:06 -06:00
Jim Miller
03f2657a6e base_efiction - narrow chapter search regexp a little more. 2022-02-20 11:18:28 -06:00
Jim Miller
16be4cbbe5 Abstract 'Back to index' in base_efiction for other languages. 2022-02-20 11:08:32 -06:00
Jim Miller
53c8b69f1e Stop looking for FFDL settings--it's been 7 years. 2022-02-18 10:16:14 -06:00
Jim Miller
28238b18ff Bump Test Version 4.10.2 2022-02-16 11:26:14 -06:00
Jim Miller
f4c06014dd Look for story URLs in pasted mime as well as dropped. See #809 2022-02-16 11:11:44 -06:00
Jim Miller
fb8ab400b7 Bump Test Version 4.10.1 2022-02-14 14:22:01 -06:00
Jim Miller
f2d74defca adapter_storiesonlinenet: fix for dateUpdated when 'Last Activity' #808 2022-02-14 14:21:54 -06:00
Jim Miller
c1c18a5a87 Bump Release Version 4.10.0 2022-02-14 09:39:42 -06:00
Jim Miller
54e952748f Bump Test Version 4.9.10 2022-02-02 11:14:13 -06:00
Jim Miller
30470c8f6a adapter_fanfiktionde: Update where description comes from. 2022-02-02 11:14:07 -06:00
Jim Miller
4da7db4305 Bump Test Version 4.9.9 2022-02-01 12:36:58 -06:00
Jim Miller
23a00fb15a Correct use_flaresolverr_proxy error checking. 2022-02-01 12:36:52 -06:00
Jim Miller
951cc73e46 Merge branch 'fswithimages' 2022-02-01 09:12:32 -06:00
Nothorse
53452ca410
ReadOnlyMindAdapter: Add series_tags feature to populate series metadata (#803) 2022-02-01 09:11:47 -06:00
Jim Miller
01ba441a63 Bump Test Version 4.9.8 2022-02-01 09:10:35 -06:00
Jim Miller
582c1a6e7f Add use_flaresolverr_proxy:withimages option for FlareSolverr v1 users. 2022-02-01 09:10:35 -06:00
Jim Miller
77d1037a90 adapter_fanfictionnet: don't do skip_author_cover check without include_images:true 2022-02-01 09:10:35 -06:00
Jim Miller
52587ef69b Bump Test Version 4.9.7 2022-01-30 19:32:56 -06:00
Jim Miller
ea66ae350b Use logger.warning() not .warn() consistently. 2022-01-30 19:32:42 -06:00
Jim Miller
ad3a16f423 Force include_images:false when use_flaresolverr_proxy:true -- FlareSolverr v2.2.0 crashes on image request. 2022-01-30 19:31:52 -06:00
Jim Miller
4cf37d449e Stop passing download:true to FlareSolverr, they aren't putting it back. 2022-01-30 19:30:57 -06:00
Jim Miller
2c00752e23 Bump Test Version 4.9.6 2022-01-30 12:06:06 -06:00
Jim Miller
05e15487e4 adapter_royalroadcom: Add status 'Dropped' 2022-01-30 12:05:51 -06:00
Jim Miller
99236e82ad Bump Test Version 4.9.5 2022-01-30 09:59:16 -06:00
Nothorse
b9f5686a3c
readonlymind adapter (#801)
New Site: readonlymind.com, thanks Nothorse
2022-01-30 09:57:48 -06:00
Jim Miller
b99a7fe494 Bump Test Version 4.9.4 2022-01-29 10:19:16 -06:00
Jim Miller
f028bc9b6c adapter_royalroadcom: Add status 'Hiatus' Closes #800 2022-01-29 10:19:03 -06:00
Jim Miller
bd1bfbfaf9 base_efiction: Add 'Igen' as equiv to 'Yes, Completed' in Hungarian 2022-01-29 10:17:40 -06:00
Jim Miller
f61696fb3f Remove defunct site: hpfanficarchive.com 2022-01-24 11:15:27 -06:00
Jim Miller
f47f859de0 Bump Test Version 4.9.3 2022-01-23 10:13:45 -06:00
Jim Miller
6a18f3509b Remove fanfic.hu, moved to merengo.hu, but don't know if old storyIds are valid. 2022-01-23 10:13:11 -06:00
Jim Miller
02734791cd Add merengo.hu as eFiction with added consent click through. 2022-01-23 10:09:46 -06:00
Jim Miller
6194f3d9e7 Bump Test Version 4.9.2 2022-01-20 11:24:00 -06:00
Jim Miller
197c6dde81 Extend base_xenforoforum tagsfromtitle for ')(' '][' 2022-01-18 10:54:03 -06:00
Jim Miller
ea87916f4b Fix for py2 for base_xenforoforum tagsfromtitle. 2022-01-18 10:53:12 -06:00
Jim Miller
b710bdaafd Add flaresolverr_proxy settings to defaults.ini 2022-01-14 11:01:27 -06:00
Jim Miller
7b2d6a91fb Bump Test Version 4.9.1 2022-01-11 16:32:04 -06:00
Jim Miller
c7a542fd17 qt6 QFont.Normal/Bold & QTextEdit.NoWrap 2022-01-11 16:30:46 -06:00
Jim Miller
fa2b3c9511 Remove setTabStopWidth from raw prefs viewer--changed in qt6 and not needed. 2022-01-11 16:20:59 -06:00
Jim Miller
d6258ab74d Remove unneeded QTableWidgetItem.UserType 2022-01-11 16:07:00 -06:00
Jim Miller
f633ef8137 Remove dead convert_qvariant() code. 2022-01-11 16:07:00 -06:00
Jim Miller
a4c6fd9ff7 Replace QTextEdit.setTabStopWidth with setTabStopDistance 2022-01-11 16:07:00 -06:00
Jim Miller
0812d13003 Fix for QTableWidgetItem.UserType 2022-01-11 16:07:00 -06:00
Jim Miller
c97407ae56 Remove all Qt4 imports. 2022-01-11 16:07:00 -06:00
Jim Miller
b2b56e6366 Bump Release Version 4.9.0 2022-01-11 15:58:22 -06:00
Jim Miller
78e3689062 Remove removed fictionalley site from plugin-example.ini 2022-01-10 11:59:13 -06:00
Jim Miller
9f77f3a60d Bump Test Version 4.8.8 2022-01-10 09:09:18 -06:00
Jim Miller
db85c2c4b3 Update translations. 2022-01-10 09:09:06 -06:00
Jim Miller
dc26cef572 Update defaults.inis for Chrome's new Cache_Data dir. 2022-01-10 09:07:39 -06:00
Jim Miller
bc149a2deb Bump Test Version 4.8.7 2022-01-04 13:06:16 -06:00
Jim Miller
1e46c97bbd Adding plugin feature to Mark anthologies when individual story skipped. See #786 2022-01-04 13:05:44 -06:00
Jim Miller
790744c9e1 fictionhunt.com isn't requiring login anymore. Closes #784 2022-01-04 09:58:23 -06:00
Jim Miller
033c38fc91 Bump Test Version 4.8.6 2021-12-21 12:58:28 -06:00
Jim Miller
825a2070c5 Re-sync defaults.ini & plugin-defaults.ini 2021-12-21 12:58:28 -06:00
Jim Miller
5128dc6743 Strip base_xenforoforum tagsfromtitle with commas. Addresses final issue in #782 2021-12-21 12:58:28 -06:00
gesh
8828e1fc28 Fix nested []/()
The problem was that the regexes treated `[`/`(` and `]`/`)` interchangeably, considering
e.g. `[..)` balanced.

Considering `()` is also used within titles without signifying tags, it
might be worth investigating only matching that kind of bracketed tag at
the end of the title.

Closes: #783
2021-12-21 12:58:28 -06:00
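The bug described in the commit above can be reproduced with a minimal sketch (these patterns are illustrative, not FanFicFare's actual tagsfromtitle regexes): a character class that mixes both bracket kinds never requires the open and close brackets to match, while alternating on matched pairs does.

```python
import re

# Naive pattern: '[' and '(' share a character class, so mismatched
# pairs like '[Tag)' are accepted as balanced.
naive = re.compile(r'[\[(]([^\])]+)[\])]')

# Paired pattern: each alternative requires a matching open/close pair.
paired = re.compile(r'\[([^\]]+)\]|\(([^)]+)\)')

assert naive.search('Title [Tag)') is not None   # false positive
assert paired.search('Title [Tag)') is None      # correctly rejected
assert paired.search('Title [Worm]').group(1) == 'Worm'
```

As the commit notes, `()` also appears in titles without marking tags, so anchoring such a pattern to the end of the title would further reduce false positives.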
hseg
a43949d123
Refactor main() in cli.py (#781)
* main: Replace return by explicit calls to exit()

In anticipation of breaking out these sections in their own functions

* Make doc-getting flags shortcut

This saves on pointless validation/setup work when only getting help
information. Moreover, these were the only actions that were in the
middle of the parse/validate/setup/run core logic of main(), moving them
out clears the way to cleanly breaking it up.

Removes -v alias for --version. If this is undesirable, a trick similar
to that for --sites-list can be used to shortcut it as well.

* Move up flag implication logic, var renaming

These are "virtual flags" and should be set up as soon as possible after
the actual flags are set up. Ideally, these would be set up during arg
parsing, but this is sometimes impossible/impractical.

Future improvement: use callbacks to say eg --updatealways sets
options.update, options.updatealways

* Move up validation

Fail fast if the arguments are invalid

* Internalize list_only into options

Helps keep related state together

* Pack up configs, printers for easier passing

* Break up main() into phases

* Remove unnecessary semicolon

* Unbundle configs, printers

This reverts commit 5dd44bbfc3.
Reversion reasons:
1) Initial commit was broken -- it reordered parameters in invocations
   to `get_config()`. This happened because python complained about
   invocations of the form `f(x,**d,z)` -- positional parameters may not
   appear after kwarg expansion. I mistakenly believed kwarg expansion
   would consume the slots of those parameters, and so this code would
   be equivalent to `f(x,z,**d)`. Instead, this passes `z` to the second
   positional parameter, which luckily enough had a key contained in `d`
   so it only caused a TypeError over the multiple values for that
   parameter.
2) To maintain the vision of the original commit would require multiple
   pessimizations *over* the previous state. Specifically:
   1) Using our running example of invocations of the form `f(x,**d,z)`,
      we'd need to turn `z` into a keyword argument. Since Python has no
      way of writing "`z` is a keyword argument whose value is taken
      from the current scope", that forces writing `f(x,**d,z=z)`.
      (Even if a proposal like <https://lwn.net/Articles/818129/> is
      accepted, we wouldn't be able to use it since we need to support
      Python 2)
   2) `dispatch()` uses `fail` internally. So we have one of two
      options:
      * Bundle `warn, fail` in `dispatch`'s arguments, and add a line
        `fail=printers['fail']` to the top of `dispatch`
      * Don't bundle `warn, fail` in `dispatch`'s arguments, and have
        `dispatch` bundle them instead
      Neither of these is palatable, especially over
      * Don't bundle `warn, fail` anywhere

* Restore -v alias for version

As 0847fc9 suggested might be desired
2021-12-21 12:21:02 -06:00
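The `f(x,**d,z)` point in the revert rationale above can be demonstrated directly (`get_config` here is a stand-in with an illustrative signature, not the real FanFicFare function):

```python
def get_config(path, section="defaults", verbose=False):
    # Stand-in for the real get_config(); signature is illustrative.
    return (path, section, verbose)

opts = {"section": "personal"}

# get_config("plugin.ini", **opts, True) is a SyntaxError in Python:
# a positional argument may not follow **-expansion. The only legal
# spelling names the argument, i.e. the f(x,**d,z=z) form the commit
# calls a pessimization (and which Python 2 rejects anyway):
result = get_config("plugin.ini", **opts, verbose=True)
assert result == ("plugin.ini", "personal", True)

# The reordering bug from reason 1: f(x,z,**d) binds z to the next
# positional slot, so a duplicate key in d raises TypeError rather
# than silently mis-binding.
try:
    get_config("plugin.ini", "personal", **{"section": "defaults"})
except TypeError:
    pass  # "got multiple values for argument 'section'"
```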
Jim Miller
61bc732810 Bump Test Version 4.8.5 2021-12-19 12:10:54 -06:00
Jim Miller
555872bdef
Merge pull request #780 from hseg/cleanup-cli-main-wip
cli.py: move out parseArgs
2021-12-19 12:10:36 -06:00
gesh
c0d776f64c cli.py: move out parseArgs
Resolves: #779
2021-12-19 03:48:35 +02:00
Jim Miller
a2dd11326f Bump Test Version 4.8.4 2021-12-14 19:36:34 -06:00
Jim Miller
0904101b7d adapter_archiveofourownorg: AO3 notification emails now sending http: instead of https: 2021-12-14 19:35:17 -06:00
Jim Miller
6fc9aa6dfc Bump Test Version 4.8.3 2021-12-11 19:32:16 -06:00
Jim Miller
3b72126f5f Add remove_class_chapter feature, true by default. 2021-12-11 19:32:06 -06:00
Jim Miller
80fb72928e Bump Test Version 4.8.2 2021-12-10 10:12:27 -06:00
Jim Miller
8ee9fc36ab adapter_scribblehubcom: Corner case removing spoilers. Closes #778 2021-12-10 10:12:18 -06:00
Jim Miller
89e731031c Bump Test Version 4.8.1 2021-12-03 10:56:46 -06:00
Jim Miller
619bc8a6f9 adapter_wwwnovelallcom: fixes for story w/o chapters & html desc. 2021-12-03 10:56:37 -06:00
Jim Miller
a2523f1a1e Bump Release Version 4.8.0 2021-12-02 09:33:07 -06:00
Jim Miller
3499548a2f Update translations. 2021-11-29 09:08:39 -06:00
Jim Miller
4460ee00cf Bump Test Version 4.7.9 2021-11-28 12:32:17 -06:00
Jim Miller
89290bf7a4 Add fix_relative_text_links feature, defaults to true. 2021-11-28 12:32:08 -06:00
Jim Miller
a07b36b61f Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2021-11-28 12:27:20 -06:00
Jim Miller
6f305d6254
Merge pull request #773 from Epicpkmn11/sh-spoilers-footnotes
Make Scribble Hub spoilers & footnotes look nicer
2021-11-28 12:27:14 -06:00
Pk11
7e356b733e Fix crash when news or spoiler notes excluded 2021-11-28 04:58:25 -06:00
Pk11
f2c8ae6a0a Make Scribble Hub spoilers & footnotes look nicer 2021-11-28 04:39:40 -06:00
Jim Miller
b1ab540c11 Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2021-11-20 15:50:25 -06:00
Jim Miller
9ca0bfc5d8
Merge pull request #771 from jcotton42/main
Remove Patreon support, discussion in #770
2021-11-20 15:50:10 -06:00
Josh Cotton
7011250353 Revert "Merge branch 'jcotton42-patreon'"
This removes support for Patreon, as discussed in #770.

This reverts commit df26e74145, reversing
changes made to 23e4f9468d.
2021-11-20 13:30:08 -08:00
Jim Miller
744400b161 Bump Test Version 4.7.8 2021-11-20 15:24:58 -06:00
Jim Miller
d0b81c1c7b Add averrating metadata to adapter_novelfull. 2021-11-20 15:24:36 -06:00
Jim Miller
adfaf141d3 Bump Test Version 4.7.7 2021-11-19 20:21:50 -06:00
Jim Miller
a8047ba0a9 Fix for cover_min_size causing failures with SVG images when no_image_processing:true. 2021-11-19 20:21:44 -06:00
Jim Miller
b142654dfc Bump Test Version 4.7.6 2021-11-19 10:38:35 -06:00
Jim Miller
56d4688f2c
Merge pull request #766 from rapjul/main
Get largest Webnovel cover image
2021-11-19 10:22:16 -06:00
Jim Miller
df26e74145 Merge branch 'jcotton42-patreon' 2021-11-19 10:18:35 -06:00
Jim Miller
8dd9154982 Merge branch 'patreon' of https://github.com/jcotton42/FanFicFare into jcotton42-patreon 2021-11-19 10:16:57 -06:00
Jim Miller
23e4f9468d Reorder defaults.ini sections. 2021-11-19 10:13:57 -06:00
Jim Miller
aa966de4bc Merge branch 'main' of https://github.com/JimmXinu/FanFicFare 2021-11-19 10:12:35 -06:00
Jim Miller
a711083e90 Set default slow_down_sleep_time:2 for [www.asianfanfics.com] 2021-11-19 10:12:30 -06:00
Jim Miller
99bafb052b
Merge pull request #769 from jcotton42/deviantart-dates-and-ids
Deviantart date fix (closes #768), also storyId and extratags changes
2021-11-19 10:11:06 -06:00
rapjul
61b5cd8e43
Update adapter_webnovelcom.py
Removed `map()` to better support Python 2; used an inline `for` loop instead.
2021-11-18 20:54:00 -06:00
Josh Cotton
1466ff2422 Patreon support using the browser cache. 2021-11-18 18:19:04 -08:00
Josh Cotton
af8a979984 FanFiction tag no longer added to deviantArt works by default. 2021-11-18 17:00:53 -08:00
Josh Cotton
1d562d1fe4 Use the unique deviation ID for the story ID. 2021-11-18 16:54:26 -08:00
Josh Cotton
d437654320 Fix date parsing to be Python 2-compatible. 2021-11-18 16:48:57 -08:00
rapjul
1eb5eb2d54
Update adapter_webnovelcom.py
Finds the largest image source – in case Webnovel changes their code.
2021-11-18 07:10:37 -06:00
rapjul
dbc90cfce5
Update adapter_webnovelcom.py
Gets the largest cover image.
2021-11-17 17:16:28 -06:00
Jim Miller
cf5c0fd68c Bump Test Version 4.7.5 2021-11-17 12:54:50 -06:00
Jim Miller
b02f40318c
Merge pull request #765 from jcotton42/deviantart
Support for deviantArt (closes #374)
2021-11-17 12:54:24 -06:00
Josh Cotton
bc6d65de26 Change site abbreviation to 'dac' 2021-11-17 10:37:22 -08:00
Josh Cotton
09f2fc4d4b Have deviantArt use the basic cache by default. 2021-11-16 23:33:09 -08:00
Josh Cotton
5c06b32a30 Fetch tags from dA. 2021-11-16 23:24:27 -08:00
Josh Cotton
125c55e1e3 Add error message if both story detections fail. 2021-11-16 22:42:56 -08:00
Josh Cotton
841fe6e396 Remove comments before scraping chapter to avoid false matches. 2021-11-16 22:40:54 -08:00
Josh Cotton
f245310927 Handle deviantArt mature content. 2021-11-16 21:42:27 -08:00
Josh Cotton
5e31182bc8 Deviantart login support. 2021-11-16 21:42:22 -08:00
Josh Cotton
0ca4d20720 Basic support for deviantArt. 2021-11-14 23:38:25 -08:00
Jim Miller
2ddce1acd5 Bump Test Version 4.7.4 2021-11-14 11:57:29 -06:00
Jim Miller
dc88a00ea4 New Site: psychfic.com (re-added), thanks HappyFaceSpider Closes #764 2021-11-14 11:57:16 -06:00
Jim Miller
df61e88714 Bump Test Version 4.7.3 2021-11-12 19:21:31 -06:00
Brian
36efc7366e
Update adapter_storiesonlinenet.py
Added age/rating field parsing for finestories and scifistories
2021-11-12 17:00:41 -08:00
Jim Miller
a829d01e7c Bump Test Version 4.7.2 2021-11-09 16:47:03 -06:00
Jim Miller
1459ad8611 Add --json-meta-file CLI option. #761 2021-11-09 16:46:56 -06:00
Jim Miller
2e78b153d5 Bump Test Version 4.7.1 2021-11-09 08:54:49 -06:00
Jim Miller
467d79120e adapter_ficbooknet: Fix for site change. 2021-11-09 08:54:42 -06:00
Jim Miller
9080349615 Bump Release Version 4.7.0 2021-11-04 12:36:56 -05:00
Jim Miller
2085dda0a3 Bump Test Version 4.6.11 2021-11-03 15:00:56 -05:00
Jim Miller
52e69abb88 Update translations. 2021-11-03 15:00:08 -05:00
Jim Miller
06fa73666f Code for FlareSolverr v2.0.X, can't handle images. 2021-11-03 14:56:13 -05:00
Jim Miller
d7940213ab Bump Test Version 4.6.10 2021-11-01 20:10:02 -05:00
Jim Miller
da5ec5b357 adapter_royalroadcom: fix for ancient bug reading unixtime attr that's come back after years. 2021-11-01 20:09:09 -05:00
Jim Miller
605fc0dbcf Bump Test Version 4.6.9 2021-10-27 13:35:04 -05:00
Jim Miller
9da07fd160 fictionalley-archive.org: Convert adapter_fictionalleyorg to adapter_fictionalleyarchiveorg. 2021-10-27 13:34:30 -05:00
Jim Miller
913f8dc256 Bump Test Version 4.6.8 2021-10-25 13:07:59 -05:00
Jim Miller
f8cb9e9364 adapter_storiesonlinenet: Fix for site updates, login and dates. 2021-10-25 13:05:23 -05:00
Jim Miller
7ec234a052 Bump Test Version 4.6.7 2021-10-24 09:44:55 -05:00
Jim Miller
bb12670ef3 adapter_wwwutopiastoriescom: Add siterating_votes,siterating,siterank_of,siterank,views #750 2021-10-24 09:41:05 -05:00
Jim Miller
120a82c82b Bump Test Version 4.6.6 2021-10-20 13:20:45 -05:00
Jim Miller
bd9128044a Fix for more arbitrary py3 incompatibility(MutableSet). Closes #748 2021-10-20 13:14:56 -05:00
Jim Miller
9e54b8d82b Bump Test Version 4.6.5 2021-10-16 15:30:53 -05:00
Jim Miller
1f3f09d713 Switching royalroad.com specific 'get from imap' code to use fetcher instead of urllib. For #746 2021-10-16 15:30:40 -05:00
Jim Miller
17c9a26c8a Switch config _filelist feature from using urllib.request.build_opener to fetcher.RequestsFetcher. 2021-10-16 14:50:06 -05:00
Jim Miller
5755d462cc Update bundled certifi to 2021.10.08 2021-10-16 14:41:11 -05:00
Jim Miller
f4de32550c Bump Test Version 4.6.4 2021-10-10 20:35:14 -05:00
Jim Miller
2ae9c679e1 Update for adapter_fictionhuntcom, 'next' link in author pages changed. 2021-10-10 20:35:08 -05:00
Jim Miller
9dc4de0f07 Bump Test Version 4.6.3 2021-10-10 19:41:16 -05:00
Jim Miller
a64a415f59 adapter_wwwutopiastoriescom: Split 'eroticatags' metadata entry rather than single string. Closes #744 2021-10-10 19:40:34 -05:00
Jim Miller
5e02fdc2ae Bump Test Version 4.6.2 2021-10-07 11:18:57 -05:00
Jim Miller
6d75c4b464 Handle errors in royalroad email links better. 2021-10-07 10:51:19 -05:00
Jim Miller
ff05648b04 Fix typo in warning string. 2021-10-03 12:34:18 -05:00
Jim Miller
0a114cd313 Bump Test Version 4.6.1 2021-10-02 17:13:32 -05:00
TheCakeIsNaOH
db3b17ed5f Flaresolverr proxy add option to specify protocol
Add an option to specify the protocol to use for flaresolverr.
This allows using FlareSolverr over HTTPS.
2021-10-02 17:10:58 -05:00
Jim Miller
14231fdd0a Fix for flaresolverr_proxy doing get urls from page 2021-10-02 13:57:02 -05:00
Jim Miller
76565e959a Bump Release Version 4.6.0 2021-09-30 11:53:56 -05:00
Jim Miller
70e67f7960 Bump Test Version 4.5.15 2021-09-28 12:57:31 -05:00
Jim Miller
9f244b9c01 Remove site www.squidge.org/peja now hosted on squidgeworld.org 2021-09-28 12:53:42 -05:00
Jim Miller
6a1dccd270 Remove site faerie-archive.com - redirects to some sketchy looking ad sites. 2021-09-28 12:33:49 -05:00
Jim Miller
b146954afd Remove site fanfiction-junkies.de - redirects to ad site. 2021-09-28 12:31:22 -05:00
Jim Miller
58cc24e9c4 Remove site it-could-happen.net - redirects to some sketchy looking ad sites. 2021-09-28 12:29:42 -05:00
Jim Miller
c97de461a8 Remove site sebklaine.net - redirects to some sketchy looking ad sites. 2021-09-28 12:27:47 -05:00
Jim Miller
e759240175 Remove site nha.magical-worlds.us - redirects to something unrelated. 2021-09-28 12:25:59 -05:00
Jim Miller
0eb3abd44a Bump Test Version 4.5.14 2021-09-27 11:25:46 -05:00
Jim Miller
e699910675 Fix for adapter_wuxiaworldsite site change - chapter list in page HTML now. 2021-09-27 11:25:37 -05:00
Jim Miller
c92acf2b3b Bump Test Version 4.5.13 2021-09-24 11:23:06 -05:00
Jim Miller
b439fa8bf0 Don't retry connection to flaresolverr proxy and report specifically on fail. Closes #737 2021-09-24 11:21:21 -05:00
Jim Miller
7ac8d1f1aa Bump Test Version 4.5.12 2021-09-24 10:49:35 -05:00
Jim Miller
991b928edb Accept wuxiaworldsite.com as alias for wuxiaworld.site. 2021-09-24 10:48:26 -05:00
Jim Miller
fb815c0453 Better error message for AO3 login-required series. Closes #736 2021-09-23 11:01:00 -05:00
Jim Miller
74ddae0fd9 Add section about test versions to home page. 2021-09-23 10:31:08 -05:00
Jim Miller
4dcc9ec510 Bump Test Version 4.5.11 2021-09-23 10:16:01 -05:00
Jim Miller
2245167580 Fix for adapter_asianfanficscom - site changed author URLs. 2021-09-23 10:15:54 -05:00
Jim Miller
97fe1bbcf6 Remove some debugs. 2021-09-22 21:57:02 -05:00
Jim Miller
93fc626332 Tweak XF tagsfromtitle in defaults.ini to not break &amp; 2021-09-22 21:56:31 -05:00
Jim Miller
16f19e6b4a Bump Test Version 4.5.10 2021-09-16 19:09:23 -05:00
Jim Miller
66ed3478cd Fix for FFF plugin not recognizing the same ffnet story URL when the title changes. 2021-09-16 19:09:10 -05:00
Jim Miller
982fd32a06 Bump Test Version 4.5.9 2021-09-16 12:45:30 -05:00
Jim Miller
3be15436a8 Add FS session code, but disabled currently. 2021-09-16 12:24:28 -05:00
Jim Miller
443a543bb5 Bump Test Version 4.5.8 2021-09-16 12:24:28 -05:00
Jim Miller
e28773850f flaresolverr_proxy watch for super future cookies, reduce proxy timeout. 2021-09-16 12:24:28 -05:00
Jim Miller
52e740cf58 Bump Test Version 4.5.7 2021-09-16 12:24:28 -05:00
Jim Miller
bdd8921328 Need a cookie version to read saved cookie file back. flaresolverr_proxy 2021-09-16 12:24:28 -05:00
Jim Miller
3f2596c247 Fixes for flaresolverr_proxy so POST works. 2021-09-16 12:24:28 -05:00
Jim Miller
73305fe0df Bump Test Version 4.5.6 2021-09-16 12:24:28 -05:00
Jim Miller
5ca13c71b3 Adding flaresolverr_proxy. 2021-09-16 12:24:28 -05:00
Jim Miller
06730f3f7b Add order_threadmarks_by_date_categories option, closes #733 2021-09-16 12:24:02 -05:00
Jim Miller
464a7a3ee3 Bump Test Version 4.5.5 2021-09-08 16:40:01 -05:00
oh45454545
bd52738e4c Update adapter_asianfanficscom.py 2021-09-08 16:39:39 -05:00
oh45454545
3b6a4b85a9 Update adapter_asianfanficscom.py 2021-09-08 16:39:39 -05:00
oh45454545
03d030feab Update adapter_asianfanficscom.py 2021-09-08 16:39:39 -05:00
Jim Miller
1082dc5417 Bump Test Version 4.5.4 2021-09-07 18:22:37 -05:00
Jim Miller
afb9f38ab4 Add 'min' for minutes to parse_relative_date_string() #731 2021-09-07 18:22:14 -05:00
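The kind of change described in the `'min'` commit can be sketched as follows (the function name, unit table, and regex are illustrative assumptions; FanFicFare's actual `parse_relative_date_string()` may differ):

```python
import re
from datetime import datetime, timedelta

# Illustrative unit map; 'min' is the newly supported abbreviation.
UNITS = {"min": "minutes", "hour": "hours", "day": "days", "week": "weeks"}

def parse_relative(text, now):
    """Turn strings like '5 mins ago' into an absolute datetime."""
    m = re.match(r"(\d+)\s*(min|hour|day|week)s?\b", text)
    if not m:
        raise ValueError("unrecognized relative date: %r" % text)
    count, unit = int(m.group(1)), UNITS[m.group(2)]
    return now - timedelta(**{unit: count})

now = datetime(2021, 9, 7, 18, 0)
assert parse_relative("5 mins ago", now) == datetime(2021, 9, 7, 17, 55)
```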
Jim Miller
9754747785 Add use_browser_cache comments to fictionpress.com sections. 2021-09-07 10:25:42 -05:00
Jim Miller
8ea2aca735 Bump Test Version 4.5.3 2021-09-04 15:11:58 -05:00
Jim Miller
f7dcce698b Fix for dateutils change breaking royalroad 2021-09-04 15:11:49 -05:00
Jim Miller
b94779f7d4 Bump Test Version 4.5.2 2021-09-03 11:26:55 -05:00
Jim Miller
b24db52b3d Fixes for site changes in adapter_webnovelcom. #731 2021-09-03 11:26:40 -05:00
Jim Miller
19571e3b2b Don't mark wuxiaworld.com and webnovel.com stories FanFiction by default. #730 2021-08-30 11:57:54 -05:00
Jim Miller
3ae3d6c677 Bump Test Version 4.5.1 2021-08-29 15:36:35 -05:00
Jim Miller
6924828c8d Add POST request for wuxiaworld.site site changes. Closes #729. 2021-08-29 15:36:25 -05:00
Jim Miller
5aa8f2b25c Bump Release Version 4.5.0 2021-08-25 09:09:46 -05:00
Jim Miller
f0b14e680e Update translations. 2021-08-24 18:18:24 -05:00
Jim Miller
fc3f1c6588 Update translations. 2021-08-24 17:57:40 -05:00
Jim Miller
e62c771a3f Remove removed site from example.ini 2021-08-21 13:16:22 -05:00
Jim Miller
72ada92aa4 Bump Test Version 4.4.6 2021-08-20 19:42:22 -05:00
Jim Miller
18aa2776b0 Fix for lazyload images in base_xenforoforum_adapter 2021-08-20 19:41:49 -05:00
Jim Miller
3a30d2c5ea Bump Test Version 4.4.5 2021-08-09 13:29:33 -05:00
Jim Miller
e859aa23bf More 'correct' fix for missing URL on anthology update. Reverses fcf8dc2cde 2021-08-09 13:29:23 -05:00
Jim Miller
ca3a453447 Bump Test Version 4.4.4 2021-08-09 12:57:12 -05:00
Jim Miller
7132d16053 Update translations. 2021-08-09 12:55:00 -05:00
Jim Miller
fcf8dc2cde Keep existing series/anthology URL during update for config purposes. 2021-08-09 12:54:12 -05:00
Jim Miller
3aebb20ec2 Bump Test Version 4.4.3 2021-08-07 08:35:05 -05:00
Jim Miller
db1d6d9e0c Allow chapter URLs for adapter_webnovelcom. 2021-08-07 08:34:52 -05:00
Jim Miller
0501e98b13 literotica.com: extratags:Erotica 2021-07-27 16:44:29 -05:00
Jim Miller
0609d8bfae Bump Test Version 4.4.2 2021-07-26 17:06:24 -05:00
Jim Miller
89c6d45786 Shift adapter_fictionmaniatv to http, problems with https server? 2021-07-26 17:06:16 -05:00
Jim Miller
48065e5d83 Add reveal_invisible_text option to base_xenforoforum_adapter. 2021-07-21 11:28:23 -05:00
Jim Miller
5c3a8931ed Add <s>strikethrough</s> example to adapter_test1 2021-07-21 11:27:53 -05:00
Jim Miller
f994c67cc5 Remove site harrypotterfanfiction.com, site closed. Closes #719 2021-07-21 10:02:54 -05:00
Jim Miller
466e706f1c Bump Test Version 4.4.1 2021-07-14 13:09:19 -05:00
Jim Miller
de01752a8b Allow fictionpress.com with use_browser_cache--user still needs to configure in personal.ini Closes #716 2021-07-14 13:09:12 -05:00
Jim Miller
e2a3b48481 Change blockfilecache to save uint32 addrs instead of original cache key. Hashing cache key proved unreliable in some cases. 2021-07-14 12:58:10 -05:00
Jim Miller
162dcf5fbd Bump Release Version 4.4.0 2021-07-13 11:18:19 -05:00
Jim Miller
7ebc993891 Update translations. 2021-07-13 11:14:46 -05:00
Jim Miller
c750ebc4d5 Conceal some debug output for proxies. 2021-07-13 11:13:23 -05:00
Jim Miller
4c56c27b3b Bump Test Version 4.3.16 2021-07-12 09:21:21 -05:00
Jim Miller
ab6c023903 Add http_proxy and https_proxy setting, remove fix_broken_https_proxy setting. 2021-07-12 09:21:18 -05:00
Jim Miller
7a30473ce2 Bump Test Version 4.3.15 2021-07-11 09:12:58 -05:00
Jim Miller
0b117007dc Adding fix_broken_https_proxy setting for broken Windows https proxy settings. 2021-07-11 09:12:49 -05:00
Jim Miller
dbba0d5cb2 Bump Test Version 4.3.14 2021-07-10 11:59:46 -05:00
Jim Miller
37bb0b8e45 Add estimatedWords for XF2 sites that provide it (SB/SV mainly). Closes #712 2021-07-10 11:59:39 -05:00
Jim Miller
3063baeb20 Bump Test Version 4.3.13 2021-07-10 11:05:49 -05:00
Jim Miller
d2d2584dc9 adapter_royalroadcom: Collect numWords. #712 2021-07-10 10:57:51 -05:00
Jim Miller
edad05c2d7 XF: data-url or data-src for lazyload images. Closes #713 2021-07-10 10:36:30 -05:00
Jim Miller
344824294d Fix for XF2 stories using author img as cover when absolute URL. 2021-07-10 10:36:30 -05:00
Jim Miller
7d3c1c1e2b Bump Test Version 4.3.12 2021-07-09 13:00:39 -05:00
Jim Miller
b1f65c9c4f Update certifi to 2021.05.30 while keeping our changes. 2021-07-09 12:56:11 -05:00
Jim Miller
b3126d3996 Update packaged urllib3 to version v1.26.6 2021-07-09 12:48:12 -05:00
Jim Miller
f26bc481d4 Bump Test Version 4.3.11 2021-07-02 10:46:56 -05:00
Jim Miller
c65ce60f71 Pre-v5 Calibre fix for nsapa_proxy. 2021-07-02 10:46:48 -05:00
Jim Miller
7077c85ada Bump Test Version 4.3.10 2021-06-27 11:25:13 -05:00
Jim Miller
1dbfed1be2 Also save error column on new books because chapter errors now making it meaningful. 2021-06-27 11:25:05 -05:00
Jim Miller
f74f1a3561 Bump Test Version 4.3.9 2021-06-27 10:24:47 -05:00
Jim Miller
bd3807f168 New Site: worldofx.de (German language X-files eFiction) 2021-06-27 10:23:06 -05:00
Jim Miller
d161e21940 adapter_wwwnovelallcom - Updates for site changes. 'translator' no longer available? 2021-06-27 09:19:44 -05:00
Jim Miller
da800759ca Bump Test Version 4.3.8 2021-06-23 20:08:44 -05:00
Nicolas SAPA
f52947446a nsapa_proxy: detect&log proxy connection error
On socket.error, re-raise a ConnectionError and skip the whole
processing.
2021-06-23 20:08:03 -05:00
Nicolas SAPA
da79260189 nsapa_proxy: fix truncated reply from proxy
Sometimes we get part of the payload in the first recv() call.
Detect this case and reinject the additional data into the payload loop.
2021-06-23 20:08:03 -05:00
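The truncated-reply fix above follows a standard short-read pattern; here is a minimal sketch (the `recv_payload` helper and the fixed-size header are assumptions for illustration, not nsapa_proxy's actual wire format):

```python
import socket

def recv_payload(sock, header_len, payload_len):
    """Read a fixed-size header plus payload, tolerating short reads.

    The first recv() may already contain part (or all) of the payload
    after the header; keep that remainder instead of discarding it,
    then loop until the payload is complete.
    """
    data = b""
    while len(data) < header_len:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed connection in header")
        data += chunk
    header, payload = data[:header_len], data[header_len:]
    while len(payload) < payload_len:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed connection in payload")
        payload += chunk
    return header, payload
```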
Jim Miller
0e4e3ab00a Bump Test Version 4.3.7 2021-06-21 12:18:35 -05:00
Jim Miller
5dd2d3297c Implement use_browser_cache for ficbook.net. 2021-06-21 12:18:24 -05:00
Jim Miller
2ee505706c ficbook.net doesn't use www. anymore. 2021-06-21 12:12:37 -05:00
Jim Miller
b938e15712 adapter_ficbooknet remove double / from author URL. 2021-06-21 12:04:06 -05:00
Jim Miller
93d11a4b8d Bump Test Version 4.3.6 2021-06-21 11:52:42 -05:00
Jim Miller
011e52dbb9 Treat img url contains '.svg?' same as endswith('.svg')--Calibre image processing chokes hard on SVG. #709 2021-06-21 11:44:13 -05:00
Jim Miller
ad63699c5b Add link to wiki/BrowserCacheFeature in Cloudflare error. #708 2021-06-21 10:36:11 -05:00
Jim Miller
e4d198b72b Bump Test Version 4.3.5 2021-06-16 08:56:14 -05:00
Jim Miller
63c7edcecc Forgot to import traceback for browser cache changes. 2021-06-16 08:56:06 -05:00
Jim Miller
125487003e Bump Test Version 4.3.4 2021-06-15 15:26:03 -05:00
Jim Miller
935a0b2413 Browser Cache Firefox Cache2 -- Skip with warning on bad file parse instead of error. #706 2021-06-15 15:25:34 -05:00
Jim Miller
cce2f18d0c Browser Cache Chrome Simple Cache -- Skip with warning on bad file parse instead of error. 2021-06-15 15:16:55 -05:00
Jim Miller
bccb7eed85 Bump Test Version 4.3.3 2021-06-10 11:05:22 -05:00
Jim Miller
48917b4234 XF: Take datePublished from first post and fix date reading. 2021-06-10 11:05:14 -05:00
Jim Miller
18226e2fe1 Change defaults.ini to correctly show default values for remove_spoilers, legend_spoilers 2021-06-08 11:14:52 -05:00
Jim Miller
b57094dc5d Bump Test Version 4.3.2 2021-06-03 11:20:34 -05:00
Jim Miller
91a7ce01a3 Collect 'fandoms' for adapter_scribblehubcom. 2021-06-03 11:20:23 -05:00
Jim Miller
c26c9be76f Bump Test Version 4.3.1 2021-06-01 10:34:02 -05:00
Jim Miller
a020de9f99 PI: Only update author link (AKA authorUrl) in Calibre if changed. Only
really affects older versions of Calibre.  Calibre does the same
starting around Sept 2020.
2021-06-01 10:33:00 -05:00
Jim Miller
9f270e2b91 Bump Release Version 4.3.0 2021-05-30 13:09:12 -05:00
Jim Miller
45ecbf8ede Bump Test Version 4.2.4 2021-05-23 13:18:18 -05:00
Jim Miller
d5d2bae774 Update translations. 2021-05-23 13:18:16 -05:00
Jim Miller
10e198c7ba quotev.com: use_cloudscraper:true by default. 2021-05-23 13:17:11 -05:00
Jim Miller
d3034dc8df Remove Python 2.7 for CLI. Reported to not work, I don't have a 2.7 to test. Accidentally included in 872644cbe6. 2021-05-13 09:39:43 -05:00
Jim Miller
872644cbe6 Bump Test Version 4.2.3 2021-05-13 09:24:14 -05:00
Jim Miller
1bd9c9667d adapter_bdsmlibrarycom: Set author Anonymous when author not found instead of sys.exit(). Closes #696 2021-05-13 09:24:01 -05:00
Jim Miller
64759be173 Bump Test Version 4.2.2 2021-05-06 10:44:37 -05:00
Jim Miller
213f790f0e Fix XF authorUrl and author_avatar_cover feature. Closes #695 2021-05-06 10:44:16 -05:00
Jim Miller
73a4d83eda Bump Test Version 4.2.1 2021-05-04 12:48:13 -05:00
Jim Miller
6533f1a3c6 Report browser cache load fail as such. 2021-05-04 12:48:01 -05:00
Jim Miller
2a4a09f562 Add CLI --color option for warns and fails. Closes #692 2021-05-04 12:36:11 -05:00
Jim Miller
b77b5ccc1b Fix a typo in comment. 2021-05-04 12:26:55 -05:00
Jim Miller
66fb5d7bab Catch exceptions in word count. 2021-05-04 09:03:56 -05:00
Jim Miller
6d6f273787 Bump Release Version 4.2.0 2021-04-30 08:42:27 -05:00
Jim Miller
9eee629c38 Bump Release Version 4.2.0 2021-04-30 08:42:01 -05:00
Jim Miller
c8a695c735 Bump Test Version 4.1.10 2021-04-20 19:15:52 -05:00
Jim Miller
acd86e3902 Tweak some comments for image settings in defaults.ini 2021-04-20 19:15:52 -05:00
Jim Miller
0d1dd7ab5c Need to set logger.setLevel() again with import changes to see debugs in plugin. 2021-04-20 19:15:52 -05:00
Jim Miller
ef7ba42f9a
Merge pull request #689 from AlexRiina/adapter-novelfull-2
add more story meta-data and fix more extra chapter headings
2021-04-20 19:15:26 -05:00
Alex Riina
536a759a7f default to https 2021-04-20 18:47:23 -04:00
Alex Riina
6207a2fdf7 add more story meta-data and fix more extra chapter headings 2021-04-20 18:43:49 -04:00
Alex Riina
9bc70b79e6
Add New Site: novelfull.com (#688)
* Add basic support for novelfull.com

* remove extra log line

* set status to in-progress when not completed

* leave description as html to rely on existing conversion

* force removal of paragraphs with chapter headers

The previous version sometimes found text elements which don't have a
decompose method; forcing Beautiful Soup to find paragraph tags
ensures this will not crash.

* add authors separately

* parse genre too
2021-04-20 09:13:35 -05:00
Jim Miller
2cd6f53f76 Bump Test Version 4.1.9 2021-04-16 11:17:09 -05:00
Jim Miller
4fb60c0a9c import changes for arch linux system plugins - As submitted by eli-schwartz 2021-04-16 11:16:29 -05:00
Jim Miller
1d3067dfec Bump Test Version 4.1.8 2021-04-15 13:28:13 -05:00
Jim Miller
92cc03cf6e Remove site: www.deepinmysoul.net, moved to deepinmysoul.nl and changed software--not eFiction anymore, <100 stories, all old 2021-04-15 13:16:02 -05:00
Jim Miller
3724695d23 Remove site: site there, efiction broken, last successful 2018-12-04 2021-04-15 13:14:27 -05:00
Jim Miller
6c0020fc4f Remove site: fanfic.castletv.net, DNS there, no server, last successful 2018-10-21 2021-04-15 13:12:10 -05:00
Jim Miller
583dd45610 Remove site: www.potterfics.com, 'Potterfics.com has closed its doors forever' 2021-04-15 13:10:21 -05:00
Jim Miller
87c30b3239 Remove site: www.thepetulantpoetess.com, no longer efiction, URLs like OTW but different 2021-04-15 13:07:51 -05:00
Jim Miller
0053a29c64 Remove site: www.thundercatsfans.org, no longer efiction, now static pages and PDFs 2021-04-15 13:06:01 -05:00
Jim Miller
c9a9e2e2d6 Remove site: fictionpad.com, completely different site, looks like generic parked? 2021-04-15 13:01:46 -05:00
Jim Miller
867f3fdb49 Bump Test Version 4.1.7 2021-04-14 19:25:34 -05:00
Jim Miller
72ed9fcb4a Missing import time 2021-04-14 14:15:16 -05:00
Jim Miller
fec9ec0a04 Fix some debug output. 2021-04-14 13:22:15 -05:00
Jim Miller
78ed49a45f Move some debug output. 2021-04-14 13:20:13 -05:00
Jim Miller
aa88aacfe9 Merge branch 'main' into proxy 2021-04-14 13:17:28 -05:00
Jim Miller
8e9a734299 PI: imports from calibre_plugins.fanficfare_plugin.fanficfare not .fanficfare. 2021-04-13 21:58:38 -05:00
Jim Miller
1bfa1bc62b Bump Test Version 4.1.6 2021-04-12 19:19:00 -05:00
Jim Miller
a757d97a40 Updates from cloudscraper 1.2.58 2021-04-09 20:54:39 -05:00
Jim Miller
711c9e3ad4 Merge branch 'main' into proxy 2021-04-04 10:13:14 -05:00
Jim Miller
f139e6ea94 Bump Test Version 4.1.5 2021-04-04 10:04:56 -05:00
Jim Miller
5ff77100c5 Add use_cloudscraper:true under [www.ficbook.net] to defaults.ini 2021-04-04 10:04:51 -05:00
Jim Miller
16478cdd5a Merge branch 'main' into proxy 2021-04-04 08:33:43 -05:00
Malloc Voidstar
192ca9b444
Add jpg_quality to example INIs
More easily discoverable for users who want to save space.
2021-04-03 21:55:30 -07:00
Malloc Voidstar
6a12c4d52f
Use newer Calibre image processing, add JPG quality setting 2021-04-03 15:54:29 -07:00
Jim Miller
3e73f6c708 Bump Test Version 4.1.4 2021-04-01 12:32:06 -05:00
Jim Miller
878aa0b1a6 adapter_fanfictionnet: Report removed category error. 2021-04-01 12:31:57 -05:00
Jim Miller
b9fc710a87 adapter_fanfictionnet: Report removed category error. 2021-04-01 12:31:22 -05:00
Jim Miller
617a76a3d1 Merge branch 'proxy' of https://github.com/JimmXinu/FanFicFare into proxy 2021-03-31 15:49:58 -05:00
Jim Miller
2c8e87e85b Add nsapa_proxy feature, from nsapa. 2021-03-31 15:49:36 -05:00
Jim Miller
e16694a798 Bump Test Version 4.1.3 2021-03-31 11:40:53 -05:00
Jim Miller
0971c3c76b Fix for adapter_literotica story URL oddities. 2021-03-31 11:40:46 -05:00
Jim Miller
f6ac55beb6 Add nsapa_proxy feature, from nsapa. 2021-03-30 12:41:21 -05:00
Jim Miller
d10c357036 Bump Test Version 4.1.2 2021-03-29 17:01:52 -05:00
Jim Miller
f2c061080f Fix for site change: adapter_fictionmaniatv 2021-03-29 16:59:43 -05:00
Jim Miller
dd75be8efe Bump Test Version 4.1.1 2021-03-28 17:22:56 -05:00
Jim Miller
09828fc9c8 Add [harrypotterfanfiction.com] slow_down_sleep_time:1 -- Site blocking fast downloads. 2021-03-28 17:22:44 -05:00
Jim Miller
b1a1d7c6bc Fix for adapter_literotica changing URLs on author page yet again. 2021-03-28 17:08:20 -05:00
Jim Miller
40835a969b Fix adapter_siyecouk url pattern for extra params. 2021-03-27 22:05:32 -05:00
Matěj Cepl
a1c1bbd2d5
Get storyId for sugarquillnet and siyecouk from the parsed query string.
Fixes #652
2021-03-27 13:28:50 +01:00
Jim Miller
41ba08d2f6 Bump Release Version 4.1.0 2021-03-26 12:13:57 -05:00
Jim Miller
03b93bb9d7 Bump Test Version 4.0.13 2021-03-21 10:13:53 -05:00
Jim Miller
14c4e30576 Merge branch 'main' of https://github.com/JimmXinu/FanFicFare into main 2021-03-21 10:01:27 -05:00
David
52ae3d1ec0 Update for recent site change and fix first chapter
Recent site changes mean the keywords meta attribute doesn't have spaces after the comma.
Also, sometimes the first chapter does not have a number. It was defaulting to "Chapter 1", but this now works it out based on later chapters.
2021-03-21 10:01:04 -05:00
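The no-space-after-comma change above reduces to splitting on the bare comma and stripping each piece, which handles both forms; a small sketch (`split_keywords` is an illustrative name, not the adapter's actual code):

```python
def split_keywords(raw):
    # Handles both "a, b, c" (old) and "a,b,c" (new) keyword forms,
    # dropping any empty pieces left by stray commas.
    return [k.strip() for k in raw.split(",") if k.strip()]

assert split_keywords("Harry Potter,Drama, Angst") == [
    "Harry Potter", "Drama", "Angst"]
```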
Jim Miller
0ca5326261 Update translations. 2021-03-20 09:13:45 -05:00
Jim Miller
69b6fcc17b Bump Test Version 4.0.12 2021-03-20 09:07:05 -05:00
Hazel Shanks
fb7abb7bee Fix off-by-one error in most_recent_chunk / add_chapter_url interaction, closes #672 2021-03-20 09:06:45 -05:00
Jim Miller
824e33abcd Bump Test Version 4.0.11 2021-03-19 12:52:48 -05:00
Jim Miller
d6c7064254 Fixes for literotica sites changes. Issue #671 2021-03-19 12:52:34 -05:00
Jim Miller
f324c284ff Bump Test Version 4.0.10 2021-03-13 12:42:48 -06:00
Jim Miller
35dbb1967b Issue with fiction.live setting in defaults[fiction.live] overriding personal[www.fiction.live]. Could use a more general solution if I can think of one. 2021-03-13 12:40:24 -06:00
Jim Miller
25427e17aa Check for 'failedtoload' *before* trying to fetch. 2021-03-13 12:23:30 -06:00
Jim Miller
7e18176ffc Fix for include_dice_rolls when multiple fieldsets. 2021-03-13 12:22:36 -06:00
Jim Miller
7bc0be1788 Bump Test Version 4.0.9 2021-03-11 13:52:08 -06:00
Jim Miller
16f9081a80 Apply user_agent when falling back to urllib2 (no matching site adapter) 2021-03-11 13:52:00 -06:00
Jim Miller
35e10c0a8b Bump Test Version 4.0.8 2021-03-11 11:21:31 -06:00
Jim Miller
237c004e20 Merge branch 'main' of https://github.com/JimmXinu/FanFicFare into main 2021-03-11 11:21:04 -06:00
Jim Miller
0e1d97915c
Merge pull request #668 from HazelSh/fictionlive
minor changes to track fictionlive website updates
2021-03-11 11:20:56 -06:00
Jim Miller
ba50bff441 Bump Test Version 4.0.7 2021-03-09 12:26:50 -06:00
Jim Miller
d97c4607a1 Add include_dice_rolls option 2021-03-09 12:26:08 -06:00
Jim Miller
c7e716eaa0 Fix for caching and /posts/ vs #post URLs. 2021-03-08 11:22:24 -06:00
Jim Miller
c8f51ac64b Fix show_timestamps option in adapter_fictionlive 2021-03-07 17:48:14 -06:00
Hazel Shanks
9693cc4c35 support vote titles 2021-03-08 10:16:57 +13:00
Hazel Shanks
9c53660299 December 2020 fiction.live updates: removed 'key tags'/top tags 2021-03-08 10:16:57 +13:00
Jim Miller
97dabcfceb Bump Test Version 4.0.6 2021-03-06 17:17:44 -06:00
Jim Miller
64666069d4 Include error for continue_on_chapter_error in log 2021-03-06 09:51:29 -06:00
Jim Miller
e1ff7e9739 Bump Test Version 4.0.5 2021-03-02 09:19:08 -06:00
Jim Miller
715173f513 Put 'Change theme to Classic' back in adapter_storiesonlinenet 2021-03-02 09:19:01 -06:00
Jim Miller
158b4b7553 Bump Test Version 4.0.4 2021-02-25 13:04:46 -06:00
Jim Miller
e2d6614fe7 use_ssl_unverified_context:true ignored when use_cloudscraper:true 2021-02-23 15:08:49 -06:00
Jim Miller
8a7423d27f Remove some dup imports/code, thanks akshgpt7. Closes #663 2021-02-23 15:08:42 -06:00
Jim Miller
494e3fbaaa Bump Test Version 4.0.3 2021-02-22 12:03:37 -06:00
Jim Miller
b9f8d4e314 Fixes for ancient 'import *' getting broken by removing unused imports in base_writer 2021-02-22 12:03:27 -06:00
Jim Miller
5ea0a3d140 Bump Test Version 4.0.2 2021-02-20 16:36:06 -06:00
Jim Miller
a518de60b5 Fix for BG job race conditions. 2021-02-20 16:33:27 -06:00
Jim Miller
d9d61a04a8 Bump Test Version 4.0.1 2021-02-20 15:50:36 -06:00
Jim Miller
fc4ff3d2de Update plugin about.html 2021-02-20 15:49:49 -06:00
Jim Miller
4559314798 Fix writer_txt import removeAllEntities 2021-02-20 15:38:23 -06:00
Jim Miller
2192b4fccd Fix reduce_zalgo not imported. 2021-02-20 15:15:22 -06:00
Jim Miller
8b8dce8ba9 defaults.ini tweaks 2021-02-19 13:09:55 -06:00
Jim Miller
b1333ad5c2 Bump Release Version 4.0.0 2021-02-18 19:23:50 -06:00
Jim Miller
e6e51fc9fd Bump Test Version 3.99.33 2021-02-17 09:23:05 -06:00
Jim Miller
3222b0cedd Don't require index file in Firefox cache2. 2021-02-17 09:22:58 -06:00
Jim Miller
bc2cea6187 Bump Test Version 3.99.32 2021-02-16 15:25:05 -06:00
Jim Miller
eba91eaf65 Fix for browser_cache with bgmeta. 2021-02-16 15:24:18 -06:00
Jim Miller
d8daf768a9 Bump Test Version 3.99.31 2021-02-15 17:09:28 -06:00
Jim Miller
670995ba3b Add additional_images setting for html & epub formats. Close #648 2021-02-15 17:09:02 -06:00
Jim Miller
591bcc663b Handle /post-# urls better in base_xenforoforum_adapter 2021-02-14 11:01:51 -06:00
Jim Miller
aed2c5743f Bump Test Version 3.99.30 2021-02-13 19:16:29 -06:00
Jim Miller
99aef392fe Update translations. 2021-02-13 19:16:28 -06:00
Jim Miller
9345f6b875 Fix for 'Updating' (anonymous?) author in adapter_wuxiaworldsite. For #657 2021-02-13 19:14:34 -06:00
Jim Miller
e9bf516bb9 Accept params in any order for adapter_sugarquillnet 2021-02-13 10:51:19 -06:00
Jim Miller
5faa05abf6 Clarify comment about ONE browser_cache_path setting 2021-02-13 09:58:45 -06:00
Jim Miller
6252203b85 Bump Test Version 3.99.29 2021-02-12 14:04:03 -06:00
Jim Miller
95dad358af Update notification percent by chapter range. 2021-02-12 14:01:28 -06:00
Jim Miller
3e719d7671 Update strings for translation. 2021-02-12 13:42:05 -06:00
Jim Miller
a62e02a6ad sugarquill.net can benefit from basic cache. 2021-02-12 13:27:16 -06:00
Jim Miller
094afe8819 Fix for race condition in BG processes finishing, clean up BG output some 2021-02-12 13:03:34 -06:00
Jim Miller
115cb44948 Update included urllib3 to 1.26.3 2021-02-11 19:41:58 -06:00
Jim Miller
becca6e157 Bump Test Version 3.99.28 2021-02-11 17:26:36 -06:00
Jim Miller
40bf0dab66 Update translations. 2021-02-11 17:26:23 -06:00
Jim Miller
3611ccc16c Rename 'email' proc flag, conflict with 'email' import. 2021-02-11 17:24:39 -06:00
Jim Miller
477c0562a2 Apply 'email' proc flag when processing dragged .eml files--which are emails. 2021-02-11 17:24:13 -06:00
Jim Miller
904385e502 Drag/drop of 'emails'--look for story URL in Content-Base header for Thunderbird RSS 'emails'. 2021-02-11 14:57:21 -06:00
Jim Miller
72f8da76e5
Merge pull request #654 from mcepl/testing
Make testing work again
2021-02-11 12:30:53 -06:00
Matěj Cepl
97e789846c Make testing work again 2021-02-11 01:09:19 +01:00
Jim Miller
672ff9038b Allow tweak_fg_sleep etc with any site, add defaults settings for basexenforo, fictionalley and fictionpress. 2021-02-10 12:40:58 -06:00
Jim Miller
dddf955fae Bump Test Version 3.99.27 2021-02-10 12:39:52 -06:00
Jim Miller
9562794e24 Allow tweak_fg_sleep etc with any site, add defaults settings for basexenforo, fictionalley and fictionpress. 2021-02-10 12:38:22 -06:00
Jim Miller
1e5f10888b Bump default user_agent to upcoming FFF/4.X. 2021-02-10 12:01:48 -06:00
Jim Miller
2c94c90748 Remove some dead code from adapter_webnovelcom 2021-02-10 12:01:48 -06:00
Jim Miller
bb925dda04 Fix adapter_webnovelcom for some site changes. Closes #644 2021-02-10 11:54:58 -06:00
Jim Miller
c387e708e1 Comment out some debugs 2021-02-10 11:54:28 -06:00
Jim Miller
9b6657edb3 Bump Test Version 3.99.26 2021-02-09 20:19:31 -06:00
Jim Miller
09b05199d1 Py2/py3 for Empty. 2021-02-09 20:18:46 -06:00
Jim Miller
61e35f0b53 Bump Test Version 3.99.25 2021-02-09 19:59:17 -06:00
Jim Miller
78c6831226 More test1 cases. 2021-02-09 19:59:17 -06:00
Jim Miller
43cf842721 Really fix 'Don't trust 100% to count jobs finished.' 2021-02-09 19:54:00 -06:00
Jim Miller
760a5cbc9c Bump Test Version 3.99.24 2021-02-09 17:55:38 -06:00
Jim Miller
b60c83bfd5 Don't trust 100% to count jobs finished. 2021-02-09 17:55:38 -06:00
Jim Miller
53fe026cfe Basic version of BG % done reporting by stories & chapters. 2021-02-09 14:51:22 -06:00
Jim Miller
f8bfc49ea8 Make it easy to add additional testX.com adapters for multi-site testing. 2021-02-09 14:51:12 -06:00
Jim Miller
3f43e5b929 Bump Test Version 3.99.23 2021-02-09 13:26:38 -06:00
Jim Miller
fc049f53e0 fanficauthors.net doesn't use login anymore. Closes #651 2021-02-09 13:26:28 -06:00
Jim Miller
34608575c7 Bump Test Version 3.99.22 2021-02-08 16:02:34 -06:00
Jim Miller
868742f9d9 Add scandir & stat checking for simplecache for performance. 2021-02-08 16:02:24 -06:00
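The scandir/stat change above can be sketched as follows. This is a minimal illustration of why `os.scandir()` helps: it returns directory entries whose `stat()` results are cheaply available, instead of a separate `os.stat()` call per `os.listdir()` name. The `newest_files` helper is hypothetical, not FanFicFare's actual simplecache code.

```python
import os

# Illustrative only: list the most recently modified files in a cache
# directory using os.scandir(), which exposes stat info per entry.
def newest_files(cache_dir, limit=5):
    entries = [(e.stat().st_mtime, e.path)
               for e in os.scandir(cache_dir) if e.is_file()]
    entries.sort(reverse=True)           # newest first
    return [path for _, path in entries[:limit]]
```

The same listing with `os.listdir()` would need one extra `stat` system call per file on most platforms.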
Jim Miller
bb5e5166f6 Comment out a debug. 2021-02-08 16:02:00 -06:00
Jim Miller
212d076a50 Bump Test Version 3.99.21 2021-02-08 14:29:52 -06:00
Jim Miller
1bc524db2d Fix notifications for BG proc per site, reinstate pool_size=Calibre cpus setting. 2021-02-08 14:29:40 -06:00
Jim Miller
48df9f2023 Incomplete BG proc per site--no notifications, pool size fixed. 2021-01-28 20:23:37 -06:00
Jim Miller
bb7a4f3ea4 Restore multi-process BG--network bound and threading had problems. 2021-02-07 20:23:24 -06:00
Jim Miller
c0a1996589 Also lock BasicCache on load/save 2021-02-07 20:09:19 -06:00
Jim Miller
555c675209 Add a base_efiction_adapter comment about why it's hitting a req so much. 2021-02-07 20:08:55 -06:00
Jim Miller
081bf75ba0 Update defaults.ini for Firefox cache reading 2021-02-07 15:49:25 -06:00
Jim Miller
797dc6e420 Bump Test Version 3.99.20 2021-02-07 15:27:32 -06:00
Jim Miller
adfc7494d1 First version of Firefox cache2 reader. 2021-02-07 15:02:36 -06:00
Jim Miller
d708e91725 Count stories downloading in BG better. 2021-02-06 19:45:13 -06:00
Jim Miller
c46d911cc4 Comment out some debugs in epubutil.py 2021-02-06 08:36:47 -06:00
Jim Miller
f33a5de8b3 More properly get msg payload from email drag and drop. Closes #645 2021-02-05 17:43:24 -06:00
Jim Miller
8428110d67 Don't count continue_on_chapter_error chapters when checking vs 'new chapters'. 2021-02-05 13:30:26 -06:00
Jim Miller
c294446082 Bump Test Version 3.99.19 2021-02-05 13:11:57 -06:00
Jim Miller
98feb81475 Remove commented out BG process code--shared caches/cookies wouldn't work with it. 2021-02-05 13:07:23 -06:00
Jim Miller
f917e3955c Change browser_cache to load on-demand and share cache/cookie instances in Calibre BG jobs. 2021-02-05 13:06:14 -06:00
Jim Miller
2e17e3bef4 Report next chapter in errors for check_next_chapter 2021-02-04 22:53:10 -06:00
Jim Miller
ecf22fea39 First cut single BG proc--same as before, just one proc. 2021-02-04 19:22:57 -06:00
Jim Miller
a78f8e94ee Add defaults.ini comment about background-color CSS on <body> not working in epubs all the time anymore. 2021-02-04 15:33:40 -06:00
Jim Miller
630945570d Don't need debugs around xenforo decodeEmail anymore. 2021-02-04 15:20:16 -06:00
Jim Miller
18b29a3f4e Include browser cache settings in defaults.ini. 2021-02-04 14:32:21 -06:00
Jim Miller
c1ae854548 Bump Test Version 3.99.18 2021-02-04 14:30:09 -06:00
Jim Miller
e9f899ab57 Include browser cache settings in defaults.ini. 2021-02-04 14:29:52 -06:00
Jim Miller
fc7ec6b89a Give adapter_archiveofourownorg a get_section_url() for rejects checks. 2021-02-04 14:18:54 -06:00
Jim Miller
2a93c9191a Implement browser_cache_age_limit setting (float hours to use cache) 2021-02-04 14:08:06 -06:00
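The age-limit setting above (float hours) amounts to a freshness check on each cache entry. A minimal sketch, assuming a hypothetical `entry_is_fresh` helper rather than FanFicFare's internals:

```python
import time

# Sketch of a browser_cache_age_limit-style check: only use a cached
# entry if it was written within the configured number of hours.
# entry_time is the epoch timestamp when the entry was stored.
def entry_is_fresh(entry_time, age_limit_hours, now=None):
    if now is None:
        now = time.time()
    return (now - entry_time) <= age_limit_hours * 3600.0
```

A float value allows sub-hour limits, e.g. `0.5` for thirty minutes.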
Jim Miller
312179570b Bump Test Version 3.99.17 2021-02-03 19:18:52 -06:00
Jim Miller
5b9d7b422f Update brotlidecpy. 2021-02-03 19:14:07 -06:00
Jim Miller
d3f073a630 Change http status code for use_browser_cache_only misses to not invoke StoryDoesNotExist 2021-02-03 16:11:26 -06:00
Jim Miller
8506ed9b5b Bump Test Version 3.99.16 2021-02-03 13:43:17 -06:00
Jim Miller
49bbd95880 Remember, browser cache is hard coded to only work with ffnet so far. 2021-02-03 13:19:49 -06:00
Jim Miller
4d1326c1bb Add bit more diag output. 2021-02-03 13:17:04 -06:00
Jim Miller
d6684663bb Fail more gracefully on skip_author_cover when author page not found. 2021-02-03 13:10:38 -06:00
Jim Miller
4c93dc7097 Apply what ini restrictions there are to new parameters. 2021-02-03 13:03:45 -06:00
Jim Miller
14f3d71f70 Browser cache sharing in plugin too. 2021-02-03 12:54:40 -06:00
Jim Miller
5ee4a2e572 Browser cache sharing in CLI, but not plugin. 2021-02-03 12:18:03 -06:00
Jim Miller
5f4504ccf2 Bump Test Version 3.99.15 2021-02-03 10:51:49 -06:00
Jim Miller
5ab1779b4c Need newer requests/urllib3 for Retry(other=) 2021-02-03 10:51:26 -06:00
Jim Miller
6eb7597d8b Add pywin32 as a windows only CLI dependency. 2021-02-03 10:21:45 -06:00
Jim Miller
0cc7aa54c7 Add share_open() for windows locked file nonsense. 2021-02-03 09:54:54 -06:00
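The `share_open()` helper above exists because on Windows a file opened by another process (here, the browser holding its cache files) can fail to open without share-mode flags. A hedged sketch of the wrapper's shape; the Windows branch in FanFicFare's real helper uses pywin32, which this portable fallback only describes in a comment:

```python
import os

# Illustrative wrapper: on Windows, the real implementation would open
# the file via win32file.CreateFile with
# FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE and wrap the
# handle; on other platforms a plain open() already permits sharing.
def share_open(path, mode="rb"):
    return open(path, mode)
```

This is why pywin32 becomes a Windows-only CLI dependency two commits later.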
Jim Miller
badfc73fcb Bump Test Version 3.99.12 2021-02-03 09:13:04 -06:00
Jim Miller
cbbdc20601 Include fanficfare.browsercache.chromagnon in CLI pip distro. 2021-02-03 09:12:56 -06:00
Jim Miller
0fa4dfda3f Bump Test Version 3.99.11 2021-02-03 09:09:09 -06:00
Jim Miller
88fe2477eb Include fanficfare.browsercache in CLI pip distro. 2021-02-03 09:08:57 -06:00
Jim Miller
c2afc72d6c Bump Test Version 3.99.10 2021-02-03 08:52:07 -06:00
Jim Miller
fc5123ff1f Add use_browser_cache_only, may rename later. 2021-02-03 08:49:46 -06:00
Jim Miller
cb27cb64b6 Update to brotlidecpy v1.0.3 2021-02-03 08:45:53 -06:00
Jim Miller
a3ff446a3d adapter_fanfictionnet: Don't hit server for crossover categories. Keep a list of categories with '+' in them. 2021-02-02 20:28:12 -06:00
Jim Miller
1a3a6ec1e0 Comment out a debug. 2021-02-02 20:17:11 -06:00
Jim Miller
886e37b168 Bump Test Version 3.99.9 2021-02-02 17:35:05 -06:00
Jim Miller
9f13145b2c Still only for ffnet, but browser cache now saves the newest entry and other improvements. 2021-02-02 17:34:58 -06:00
Jim Miller
af241ca42c Make CLI chapter errors report more noticeable. 2021-02-02 16:39:06 -06:00
Jim Miller
0b9b066c18 Revert brotlidecpy back v1.0.0. 2021-02-02 16:38:13 -06:00
Jim Miller
fd1a1e357b Another chapter normalize pattern for base_xenforoforum_adapter. 2021-02-02 16:37:48 -06:00
Jim Miller
0795225cc7 Fix SimpleCache.is_cache_dir 2021-02-02 11:15:01 -06:00
Jim Miller
738d520938 Strip ffnet urls of title too 2021-02-02 10:01:30 -06:00
Jim Miller
dd0571d4bd Update to brotlidecpy v1.0.2 2021-02-02 09:17:42 -06:00
Jim Miller
3e4193a6d5 Bump Test Version 3.99.8 2021-02-01 20:05:34 -06:00
Jim Miller
07dee591ab First (terrible ffnet only) version of scanning, remembering and using cache keys. 2021-02-01 20:04:47 -06:00
Jim Miller
e7e183b296 Add use_browser_cache, rename browser_cache_path 2021-02-01 17:55:50 -06:00
Jim Miller
eb04b3b7e4 Rename pagecache basic_cache. 2021-02-01 17:50:10 -06:00
Jim Miller
2078e5923f Add use_pagecache as a recognized INI keyword. 2021-02-01 17:40:25 -06:00
Jim Miller
b6751fddf4 Standardize hit/miss debugs a bit more. 2021-02-01 17:35:08 -06:00
Jim Miller
4dd1488fec Require brotli for CLI. 2021-02-01 17:34:42 -06:00
Jim Miller
c1ecaf668e Refactor browser cache stuff some. 2021-02-01 17:02:57 -06:00
Jim Miller
26f9ef0290 Incorporate changes from #642 (manually) 2021-02-01 14:12:05 -06:00
Jim Miller
2d6a67ff18 Found cache should return redirecturl=url not None. 2021-02-01 10:39:36 -06:00
Jim Miller
32db6e2036 Update to brotlidecpy v1.0.1 2021-02-01 10:39:12 -06:00
Jim Miller
d3cb8e6be5 Import BrowserCache earlier so brotli-dict loads correctly. 2021-01-31 17:52:35 -06:00
Jim Miller
6cfc27cb87 Slightly better than hard-coding the ffnet domain in. 2021-01-31 17:36:43 -06:00
Jim Miller
3ee144475c Add BrowserCache and decorate fetcher when chrome_cache_path set. 2021-01-31 17:23:21 -06:00
Jim Miller
c5b538c724 Comment out a 'not found' print. 2021-01-31 17:22:00 -06:00
Jim Miller
5b587a8608 Comment out a 'not found' print. 2021-01-31 17:21:41 -06:00
Jim Miller
5ce7c00ac3 Bring in dependencies from ffnet-chrome-cache-fetch 2021-01-31 16:32:34 -06:00
Jim Miller
c0d283b9c2 Bump Test Version 3.99.7 2021-01-31 15:38:24 -06:00
Jim Miller
0c100d5917 Tweak debug output 2021-01-31 15:38:22 -06:00
Jim Miller
1c0d7f93f7 Older pyqt versions don't have is_dark_theme 2021-01-31 15:29:11 -06:00
Jim Miller
f99810a1ca Remove a stray space from ini. 2021-01-31 15:28:16 -06:00
Jim Miller
2ee5d71821 Py2/py3 conflict with 'what is a class'. 2021-01-31 15:13:06 -06:00
Jim Miller
6bcf0f5499 Bump Test Version 3.99.6 2021-01-31 14:34:06 -06:00
Jim Miller
5d00f16003 Tweak fetcher debug output. 2021-01-31 14:34:06 -06:00
Jim Miller
86dee0081d Convert adapter_spikeluvercom to base_efiction after site changes. 2021-01-31 13:08:52 -06:00
Jim Miller
623158bb01 Enable pagecache for base_efiction. 2021-01-31 13:07:50 -06:00
Jim Miller
e3217dfed6 Bump Test Version 3.99.5 2021-01-31 11:45:30 -06:00
Jim Miller
7aa451a3c1 Clean up debug output a little. 2021-01-31 11:42:33 -06:00
Jim Miller
f02b854343 Remove 'extrasleep' feature--it was ugly--increase slow_down_sleep_time for the couple sites that used it. We can trust the users, right? 2021-01-31 11:34:38 -06:00
Jim Miller
fad21498d2 Bump Test Version 3.99.4 2021-01-31 10:48:35 -06:00
Jim Miller
e1c27f8841 Merge branch 'fetch_refactor' into master 2021-01-31 10:47:06 -06:00
Jim Miller
e380560cb3 'Remove' unsupported --unverified_ssl CLI option. Use use_ssl_unverified_context:true 2021-01-31 10:38:29 -06:00
Jim Miller
03ebc65f6b Refactor autosave cookie/cache in CLI, etc. 2021-01-31 10:32:42 -06:00
Jim Miller
b1c2fe6885 Bump Test Version 3.99.3 2021-01-31 09:55:07 -06:00
Jim Miller
3c20a4c247 BasicCache save/load working. 2021-01-31 09:53:42 -06:00
Jim Miller
e4d81f0dff Busy cursor while building RejectListDialog 2021-01-31 09:50:56 -06:00
Jim Miller
2191498ef6 Bump Test Version 3.29.6 2021-01-30 17:24:35 -06:00
Jim Miller
8ef5dcc1b3 Handle case of story URLs from removed sites in rejects more gracefully. 2021-01-30 17:24:12 -06:00
Jim Miller
9fb72efa15 Fix adapter_fictionpresscom url pattern properly. 2021-01-30 17:22:30 -06:00
Jim Miller
773f83bb06 Bump Test Version 3.29.5 2021-01-30 16:20:39 -06:00
Jim Miller
e2b632c99a Merge branch 'get_section_url' 2021-01-30 16:18:34 -06:00
Jim Miller
586ddce59f Make get_section_url classmethod for performance. 2021-01-30 16:06:20 -06:00
Jim Miller
6695b9a846 get_section_url for rejects 2021-01-30 15:01:46 -06:00
Jim Miller
6ffdf768bb Comment out debug output. 2021-01-30 15:00:31 -06:00
Jim Miller
1e38646026 Comment out debug output. 2021-01-30 14:57:15 -06:00
Jim Miller
ef6ceaf8b0 Fix so existing [.../1/] INI sections will work for ffnet, refactor _section_url to get_section_url 2021-01-30 14:27:51 -06:00
Jim Miller
0c0534ea74 Refactor set_sleep(_override) a bit better. 2021-01-29 18:06:09 -06:00
Jim Miller
8ba5d2c423 Refactor use_pagecache into an INI setting and a sharable, thread safe cache impl. 2021-01-29 17:31:30 -06:00
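A sharable, thread-safe page cache of the kind this refactor describes can be as simple as a dict guarded by a lock. A minimal sketch; the class name and methods here are illustrative, not FanFicFare's actual implementation:

```python
import threading

# Minimal thread-safe page cache: one lock serializes all access so the
# same instance can be shared across download threads.
class SharedPageCache:
    def __init__(self):
        self._lock = threading.Lock()
        self._cache = {}

    def get(self, key):
        with self._lock:
            return self._cache.get(key)

    def put(self, key, value):
        with self._lock:
            self._cache[key] = value
```

Making the cache an object (rather than per-adapter state) is what lets an INI setting decide whether to share it.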
Jim Miller
ddf82749af Bump Test Version 3.99.1 2021-01-29 13:41:11 -06:00
Jim Miller
3f6793b301 Refactor sleep and progressbar into FetcherDecorators 2021-01-29 13:30:49 -06:00
Jim Miller
67d4eb46ee Fix for intermediate fetcher/cache for plugin. 2021-01-29 12:07:47 -06:00
Jim Miller
ff9db222b3 Repackage brotlidecpy to not use _brotlidecpy 2021-01-29 11:06:40 -06:00
Jim Miller
166a7795d6 Switch bundled brotli to @sidney's py2/py3 brotlidecpy 2021-01-29 10:26:32 -06:00
Jim Miller
dd261dec96 Re-refactor cache code to not be a fetcher and instead dynamically 'decorate' fetcher do_request with caching code. 2021-01-28 22:10:54 -06:00
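Dynamically "decorating" a fetcher's `do_request` at runtime, instead of making the cache itself a fetcher subclass, can be sketched like this. `Fetcher` and `add_caching` are stand-ins for illustration, not the project's real classes:

```python
# Sketch of runtime decoration: wrap one instance's do_request with a
# caching closure, leaving the class itself untouched.
class Fetcher:
    def __init__(self):
        self.calls = 0

    def do_request(self, url):
        self.calls += 1          # stands in for a real network fetch
        return "body of %s" % url

def add_caching(fetcher, cache):
    inner = fetcher.do_request   # capture the original bound method
    def cached_do_request(url):
        if url not in cache:
            cache[url] = inner(url)
        return cache[url]
    fetcher.do_request = cached_do_request   # replace on this instance only
    return fetcher
```

The advantage over subclassing is that caching can be layered onto any fetcher implementation without a combinatorial class hierarchy.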
Jim Miller
018bd04305 Fetcher comments & tweaks 2021-01-28 21:48:49 -06:00
Jim Miller
7c6817bc4f Don't use _method names in new code. 2021-01-28 21:16:23 -06:00
Jim Miller
f8d976f42e Refactor fetcher cache into a dynamic subclass of Fetcher impls. 2021-01-28 21:14:15 -06:00
Jim Miller
d237ac849c Plugin: Only run one BG download at time. 2021-01-28 20:23:37 -06:00
Jim Miller
e1f9de264f Get cloudscraper 1.2.56 changes, but keep py2 compat. 2021-01-28 20:23:37 -06:00
Jim Miller
1d55f4778d Bump Test Version 3.29.4 2021-01-28 20:03:55 -06:00
Jim Miller
20b5b8fb95 Plugin: Only run one BG download at time. 2021-01-28 20:03:42 -06:00
Jim Miller
eb63b8bae5 Refactor cookiejar into Fetcher. 2021-01-28 17:41:33 -06:00
Jim Miller
5922d027b7 Bump Test Version 3.29.3 2021-01-28 09:40:28 -06:00
Jim Miller
a46edf092d Get cloudscraper 1.2.56 changes, but keep py2 compat. 2021-01-28 09:38:32 -06:00
Jim Miller
aa5706f372 Fix Referer header code. 2021-01-27 13:35:42 -06:00
Jim Miller
75999010f0 Refactor CloudScraperFetcher subclass 2021-01-27 12:29:47 -06:00
Jim Miller
a906d8f26b Partial refactoring of cache code. 2021-01-27 11:50:55 -06:00
Jim Miller
67d9eb92f4 More refactoring and consolidation of fetch code. 2021-01-26 18:44:36 -06:00
Jim Miller
59a19a7510 Refactor requests code into own fetcher subclass. 2021-01-26 16:28:40 -06:00
Jim Miller
8894b87212 Consolidate http header code 2021-01-26 12:33:39 -06:00
Jim Miller
682b3ba325 Tweak ffnet plugin sleep setting. 2021-01-26 11:17:02 -06:00
Jim Miller
3c67c4bf13 Remove httplib._MAXHEADERS workaround for royalroad.com 2021-01-26 10:42:02 -06:00
Jim Miller
c8aec09a0e Problem with lushstories.com cookies not happening anymore. 2021-01-26 10:29:11 -06:00
Jim Miller
ec13618224 Add requests-file to CLI dependencies. 2021-01-26 10:09:15 -06:00
Jim Miller
dfc68fd0ed HTTPError->HTTPErrorFFF all for trekfanfiction.net's server misconfiguration? 2021-01-26 10:08:28 -06:00
Jim Miller
12a7caa667 Bump Test Version 3.99.0 2021-01-25 21:29:28 -06:00
Jim Miller
80a131b555 Remove unused imports . 2021-01-25 21:27:21 -06:00
Jim Miller
10993a4fe2 Remove unused imports calibre-plugin 2021-01-25 21:09:24 -06:00
Jim Miller
00d9c42e57 Remove unused imports writers 2021-01-25 20:54:11 -06:00
Jim Miller
66520e236c Remove unused imports fanficfare 2021-01-25 20:53:31 -06:00
Jim Miller
edd089237e Remove unused imports adapters 2021-01-25 20:53:04 -06:00
Jim Miller
1d8c10b168 Re-impl use_ssl_unverified_context option. 2021-01-25 20:30:08 -06:00
Jim Miller
ca9ea7ef99 Need User-Agent for POST too. Still need to integrate get/post. 2021-01-25 19:28:19 -06:00
Jim Miller
ea29473239 Consolidate 404->StoryDoesNotExist checks in one place. 2021-01-25 18:13:17 -06:00
Jim Miller
8175361275 Update and keep kludge for 500 trekfanfiction.net server error. 2021-01-25 14:08:39 -06:00
Jim Miller
4959b6eb4f Refactor out some unneeded methods. 2021-01-25 13:20:57 -06:00
Jim Miller
bd3fb5dfe1 Fix file:// fetch mostly for default_cover_image 2021-01-25 12:54:15 -06:00
Jim Miller
88bf48ce44 Remove unused copy of six.py 2021-01-25 12:49:01 -06:00
Jim Miller
b9998abc48 Replace own retry system with urllib3.util.retry.Retry. 2021-01-25 12:34:49 -06:00
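Swapping a hand-rolled retry loop for `urllib3.util.retry.Retry` means the retry policy becomes a declarative object. A minimal sketch of such a policy (the specific counts and status codes here are illustrative, not the values FanFicFare uses):

```python
from urllib3.util.retry import Retry

# Declarative retry policy: up to 3 retries, exponential backoff between
# attempts, retrying only on transient server-side status codes.
retry = Retry(
    total=3,
    backoff_factor=1.0,
    status_forcelist=[500, 502, 503, 504],
)
```

The policy can then be handed to `urllib3.PoolManager(retries=retry)`, or to requests via `HTTPAdapter(max_retries=retry)`, replacing any manual sleep-and-loop code.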
Jim Miller
b0cbb7da0d Fix INI _filelist by not using same fetcher anymore. 2021-01-24 17:12:07 -06:00
Jim Miller
8e58e90e84 Refactor Requestable class from Configurable and move decode and zalgo there -- INI _filelist broken? 2021-01-24 15:55:21 -06:00
Jim Miller
75b1cc23b5 Refactor _fetchUrl() to get_request() 2021-01-24 14:12:41 -06:00
Jim Miller
3ba65f922b Refactor _fetchUrlRaw() to get_request_raw() 2021-01-24 14:07:04 -06:00
Jim Miller
38a9c7db05 Refactor _fetchUrlOpened() to get_request_redirected() and remove FakeOpened. 2021-01-24 13:44:35 -06:00
Jim Miller
2e905841e2 Always use requests, no parameters on GETs, still FakeOpened and now fake HTTPError 2021-01-24 13:15:32 -06:00
Jim Miller
7c262e71fa Refactor _fetchUrl()s implicit POST to explicit post_request()s 2021-01-23 15:02:48 -06:00
Jim Miller
b948591389 Refactor: rename _postUrl to post_request 2021-01-23 14:04:50 -06:00
Jim Miller
0822212bcb Remove overly complicated _customized_fetch_url() that was complicating refactoring. 2021-01-23 14:03:07 -06:00
Jim Miller
22b1bca6cd Working on refactor network fetch code. 2021-01-23 13:09:11 -06:00
Jim Miller
f9a1fef55d Bump Test Version 3.29.2 2021-01-23 10:24:56 -06:00
Jim Miller
df016a5e36 Fix for chapter error report. Closes #641 2021-01-23 10:24:38 -06:00
Jim Miller
df4aabc517 Partial py2 compat--stopped because brotlipython is also py3 only. 2021-01-22 13:04:43 -06:00
Jim Miller
2b12dc7054 Merge branch 'sidney-ffnet-chrome-cache-fetch' into ffnet-chrome-cache-fetch 2021-01-21 18:41:42 -06:00
Jim Miller
9819e0b214 Clean up chromagnon code to only what we're using--rest probably wasn't updated completely anyway. 2021-01-21 17:57:28 -06:00
Jim Miller
17cd3f3d04 Change open()/close() to with open(): 2021-01-21 17:53:27 -06:00
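The open()/close() change above is the standard context-manager conversion: `with open():` guarantees the file is closed even when reading raises. A before/after sketch with a hypothetical helper, not the actual chromagnon code:

```python
# Before: manual close, which needs try/finally to be exception-safe.
def read_index_old(path):
    f = open(path, "rb")
    try:
        return f.read()
    finally:
        f.close()

# After: the with-statement closes the file in all cases.
def read_index(path):
    with open(path, "rb") as f:
        return f.read()
```

The converted form is both shorter and immune to leaked file handles, which matters when scanning a browser cache directory with many entries.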
Jim Miller
1041fc44ec Not caching all in memory now. 2021-01-21 17:48:19 -06:00
Jim Miller
c555942bf4 Some debugs(commented out), don't is_cache_dir() twice, check all comment? 2021-01-21 17:46:15 -06:00
Jim Miller
34a4ad26da Fetch with domain key if not found without. 2021-01-21 17:30:06 -06:00
Jim Miller
c2b6082345 Correct debug output. 2021-01-21 14:41:16 -06:00
Jim Miller
f74e4bd252 Fix for py3 change. 2021-01-21 14:39:37 -06:00
Jim Miller
10a6554c81 Some chromagnon cache keys aren't bytes or text? 2021-01-21 12:41:56 -06:00
sidney
f82e534cb5 Issue #635 - Implement reading from browser cache using two Chrome cache formats 2021-01-21 20:04:50 +13:00
Jim Miller
9572c25c0b Fix rebase errors. 2021-01-20 19:39:12 -06:00
Jim Miller
ea4cf245ac check_next_chapter still useful. 2021-01-20 12:55:53 -06:00
Jim Miller
39d23c8c98 Comment out check_next_chapter and skip_author_cover features entirely. 2021-01-20 12:55:36 -06:00
Jim Miller
2b9d4b4ebd Rebasing onto master
CLI Only working with ffnet *only* reading from Chrome browser cache.
2021-01-20 12:55:33 -06:00
Jim Miller
eb51c671f5 Bump Test Version 3.29.1 2021-01-20 12:53:03 -06:00
Jim Miller
6965a04403 adapter_fanfictionnet: Start keeping story title part of storyUrl. 2021-01-20 12:53:00 -06:00
Jim Miller
48b8730571 Bump Release Version 3.29.0 Closes #622 2021-01-20 12:52:26 -06:00
Jim Miller
005ccaded7 Update translations. 2021-01-20 12:52:26 -06:00
Jim Miller
28c4c56806 Bump Test Version 3.28.8 2021-01-20 12:52:26 -06:00
Jim Miller
030a0e7134 New strings for translation. 2021-01-20 12:52:26 -06:00
Jim Miller
cb116af143 Add better chapter error reporting, refactor proceed_question code. 2021-01-20 12:52:26 -06:00
Jim Miller
d9ae30cfe3 Add plugin options to mark success/failed/chapter error individually. 2021-01-20 12:52:26 -06:00
Jim Miller
4b8392bb22 Bump Test Version 3.28.7 2021-01-20 12:52:26 -06:00
Jim Miller
0ed828ec3b Clear metadata cache after adapter metadata fetch. Cached metadata values may not be replace_metadata processed if fetched before their conditional dependencies. Revealed by AO3 one-shots using title for chapter name. 2021-01-20 12:52:26 -06:00
Jim Miller
3048148b2a Bump Test Version 3.28.6 2021-01-20 12:52:26 -06:00
Jim Miller
1930df68d1 Report cloudscraper exceptions as such, plus hide the potentially misleading 'opensource' part.
Closes #634
2021-01-20 12:52:26 -06:00
Jim Miller
c680f3bb64 Tweak ffnet/fpcom sleep times again. 2021-01-20 12:52:26 -06:00
Jim Miller
96ff0dec5f slow_down_sleep_time: randomize between 0.5 time and 1.5 time. 2021-01-20 12:52:26 -06:00
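Randomizing the sleep between 0.5x and 1.5x of the configured value keeps the average delay the same while making request timing less machine-like. A one-line sketch with an illustrative helper name:

```python
import random

# Sketch of the randomized slow_down_sleep_time: uniform jitter between
# half and one-and-a-half times the configured base duration.
def randomized_sleep_time(slow_down_sleep_time):
    return slow_down_sleep_time * random.uniform(0.5, 1.5)
```

For `slow_down_sleep_time: 2`, each sleep falls somewhere in [1.0, 3.0] seconds.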
Jim Miller
4fee9b3011 Increase times between retries on fetch error. 2021-01-20 12:52:26 -06:00
Jim Miller
865d1d9c69 Update to cloudscraper v1.2.52 2021-01-20 12:52:26 -06:00
Jim Miller
56172edf6e Bump Test Version 3.28.5 2021-01-20 12:52:26 -06:00
Jim Miller
e934417ba9 Report chapter_error in custom error column and marked (when configured). 2021-01-20 12:52:26 -06:00
Jim Miller
3dde6aff8f Bump Test Version 3.28.4 2021-01-20 12:52:25 -06:00
Jim Miller
ec75736717 First rough version of reporting continue_on_chapter_error chapters. 2021-01-20 12:52:25 -06:00
Jim Miller
f08b922a80 Tweak adapter_test1 chapter error case 2021-01-20 12:52:25 -06:00
Jim Miller
dff03364d7 ffnet fpcom, continue_on_chapter_error:true by default, increase sleep times. 2021-01-20 12:52:25 -06:00
Jim Miller
e3fb6d2a1c Re-enable ffnet and bump up sleep times for same. 2021-01-20 12:52:23 -06:00
Jim Miller
0e2885b6ca Bump Test Version 3.28.3 2021-01-20 12:52:05 -06:00
Jim Miller
5948fd1109 adapter_fanficsme: do an extra fetch before login for cookie(?) Closes #633 2021-01-20 12:52:05 -06:00
Jim Miller
ba8e7d7908 Bump Test Version 3.28.2 2021-01-20 12:52:05 -06:00
Jim Miller
f96e4af3d2 adapter_webnovelcom: Fixes for site changes. Closes #629 2021-01-20 12:52:05 -06:00
Jim Miller
96d0167538 Bump Test Version 3.29.1 2021-01-20 12:27:50 -06:00
Jim Miller
8b1da6f6ec adapter_fanfictionnet: Start keeping story title part of storyUrl. 2021-01-20 12:27:42 -06:00
Jim Miller
0479e418b2 Bump Release Version 3.29.0 Closes #622 2021-01-20 09:52:08 -06:00
Jim Miller
88f1b9c44d Update translations. 2021-01-20 09:46:40 -06:00
Jim Miller
ef98363abb Bump Test Version 3.28.8 2021-01-18 11:45:45 -06:00
Jim Miller
1342e87c14 New strings for translation. 2021-01-18 11:45:38 -06:00
Jim Miller
03c19c10a3 Add better chapter error reporting, refactor proceed_question code. 2021-01-18 11:42:33 -06:00
Jim Miller
eeedfdee87 Add plugin options to mark success/failed/chapter error individually. 2021-01-18 11:05:30 -06:00
Jim Miller
0e95125464 Bump Test Version 3.28.7 2021-01-17 13:47:38 -06:00
Jim Miller
fd46963301 Clear metadata cache after adapter metadata fetch. Cached metadata values may not be replace_metadata processed if fetched before their conditional dependencies. Revealed by AO3 one-shots using title for chapter name. 2021-01-17 13:46:56 -06:00
Jim Miller
210a6a5589 Bump Test Version 3.28.6 2021-01-16 15:25:46 -06:00
Jim Miller
b6b3c9425c Report cloudscraper exceptions as such, plus hide the potentially misleading 'opensource' part.
Closes #634
2021-01-16 15:25:10 -06:00
Jim Miller
7b35682ffd Tweak ffnet/fpcom sleep times again. 2021-01-16 13:10:41 -06:00
Jim Miller
48e042064d slow_down_sleep_time: randomize between 0.5 time and 1.5 time. 2021-01-16 13:08:26 -06:00
Jim Miller
e7a70a8301 Increase times between retries on fetch error. 2021-01-16 12:50:20 -06:00
Jim Miller
ffde5bfdb5 Update to cloudscraper v1.2.52 2021-01-16 12:45:08 -06:00
Jim Miller
967993cef2 Bump Test Version 3.28.5 2021-01-16 11:22:29 -06:00
Jim Miller
c84f9f2895 Report chapter_error in custom error column and marked (when configured). 2021-01-16 11:18:16 -06:00
Jim Miller
37cdec2f27 Bump Test Version 3.28.4 2021-01-16 10:23:17 -06:00
Jim Miller
6e68624f2a First rough version of reporting continue_on_chapter_error chapters. 2021-01-16 10:22:50 -06:00
Jim Miller
0cb2053be5 Tweak adapter_test1 chapter error case 2021-01-16 08:39:28 -06:00
Jim Miller
ea82a094f9 ffnet fpcom, continue_on_chapter_error:true by default, increase sleep times. 2021-01-16 08:33:26 -06:00
Jim Miller
69a436af98 Re-enable ffnet and bump up sleep times for same. 2021-01-15 20:06:53 -06:00
Jim Miller
6c0a6594ff Bump Test Version 3.28.3 2021-01-15 10:24:14 -06:00
Jim Miller
feeba370ed adapter_fanficsme: do an extra fetch before login for cookie(?) Closes #633 2021-01-15 10:24:04 -06:00
Jim Miller
e433339f6b Bump Test Version 3.28.2 2021-01-14 10:25:00 -06:00
Jim Miller
07f19f5f70 adapter_webnovelcom: Fixes for site changes. Closes #629 2021-01-14 10:24:50 -06:00
Jim Miller
3ea533f5e6 check_next_chapter still useful. 2021-01-13 10:34:53 -06:00
Jim Miller
56fe8dd657 chromagnon ONLY WORKS on WIN--different on Mac & Linux. 2021-01-12 17:09:24 -06:00
Jim Miller
04314d2b63 Use brotlipython with plugin--much slower, but pure Python. Not available on PyPI 2021-01-11 22:27:43 -06:00
Jim Miller
10fb77f00f Don't error out on ffnet. 2021-01-11 21:02:33 -06:00
Jim Miller
a33f39dfec Move logpage processing so words_added appears in calibre_fanficfare_metadata 2021-01-11 13:03:06 -06:00
Jim Miller
9d4f587e23 Tweak chromagnon/cacheParse.py 2021-01-11 13:03:06 -06:00
Jim Miller
0be96953af Initial version of calibre_fanficfare_metadata 2021-01-11 13:03:06 -06:00
Jim Miller
fb474c8c45 More py3 fix 2021-01-11 13:03:05 -06:00
Jim Miller
b99f8afbe9 Comment out check_next_chapter and skip_author_cover features entirely. 2021-01-11 13:03:05 -06:00
Jim Miller
95297b58e0 CLI Only working with ffnet *only* reading from Chrome browser cache. 2021-01-11 13:03:05 -06:00
Jim Miller
10a7cf8aa7 Reduce number of network hits for ffnet as much as possible. 2021-01-11 13:03:05 -06:00
Jim Miller
b2a7986b8f Update translations. 2021-01-11 12:53:10 -06:00
Jim Miller
2593044309 Bump Test Version 3.28.1 2021-01-11 11:49:12 -06:00
Jim Miller
c3ff444b30 quotev.com: site change in date parse, use utf8:ignore as first encoding choice. Closes #625 2021-01-11 11:48:41 -06:00
Jim Miller
a1ea9d0f11 Bump Release Version 3.28.0 2021-01-11 09:06:42 -06:00
Jim Miller
c74460bb56 Update translations 2021-01-11 09:05:08 -06:00
Jim Miller
50dff16eef Tweak disable adapter_fanfictionnet text. 2021-01-10 13:10:45 -06:00
Jim Miller
d85b4b73a6 Bump Test Version 3.27.6 2021-01-09 22:17:15 -06:00
Jim Miller
83dc85d801 Catch exception from emails not decoding, skip & logger.error(). 2021-01-09 22:17:05 -06:00
Jim Miller
18bf6445e0 Disable adapter_fanfictionnet with warning about site blocking. 2021-01-09 12:11:28 -06:00
Jim Miller
19af3ea7de Bump Test Version 3.27.5 2021-01-06 12:33:17 -06:00
Jim Miller
680bcc4280 Add a fake get_image_size() method for when no image processing available. Closes #621 2021-01-06 12:33:09 -06:00
Jim Miller
30e076def7 Bump Test Version 3.27.4 2021-01-04 17:40:55 -06:00
Jim Miller
1e0e2dde90 Change adapter_twilightednet to https 2021-01-04 17:40:40 -06:00
Jim Miller
d33533f536 Bump Test Version 3.27.3 2021-01-01 11:49:50 -06:00
Jim Miller
6d117363ed Change for adapter_fanfictionnet to make skip_author_cover work again. 2021-01-01 11:49:41 -06:00
Jim Miller
fb6d4eee01 Bump Test Version 3.27.2 2020-12-30 13:53:02 -06:00
Jim Miller
9f668e2653 Make included certifi and requests use same tmp file code and store under calibre tmp dir for cleanup. 2020-12-30 13:50:14 -06:00
Jim Miller
13fbf31f2c Bump Test Version 3.27.1 2020-12-24 15:39:24 -06:00
Jim Miller
a42dccd9bf Add append_datepublished_to_storyurl option for storiesonline.net, finestories.com, scifistories.com only. 2020-12-24 15:38:29 -06:00
Jim Miller
0453ecbc44 Bump Release Version 3.27.0 2020-12-24 09:15:50 -06:00
Jim Miller
c5ce9c4cea Update translations. 2020-12-24 09:14:58 -06:00
Jim Miller
fd11526da8 Bump Test Version 3.26.8 2020-12-23 14:03:39 -06:00
Jim Miller
28901d293f Changes to allow email chapter update URLs to work in adapter_wattpadcom 2020-12-23 14:01:09 -06:00
Jim Miller
a52949c2e6 Remove temp output & unneeded parse change from adapter_fanfictionnet 2020-12-23 13:50:02 -06:00
Jim Miller
e217a0b653 Bump Test Version 3.26.7 2020-12-23 10:43:30 -06:00
Jim Miller
dda8acb21b Document use_cloudscraper better in defaults.ini. 2020-12-23 10:43:21 -06:00
Jim Miller
e9f933a7f7 Bump Test Version 3.26.6 2020-12-22 14:29:48 -06:00
Jim Miller
a2607ffa54 Don't use mobile User-Agents with cloudscraper--adapter_fanfictionnet doesn't handle mobile pages. 2020-12-22 14:29:40 -06:00
Jim Miller
c6cafa87f2 Bump Test Version 3.26.5 2020-12-22 14:04:25 -06:00
Jim Miller
f6d086e0dd Need to semi-manually raise HTTPError for error codes with cloudscraper/requests. 2020-12-22 14:03:46 -06:00
Jim Miller
9112346f41 Roll included soupsieve back--newest isn't py2 compat. 2020-12-22 14:03:03 -06:00
Jim Miller
34dc2e14b2 Temp: dump ffnet page to debug when metadata parsing fails. 2020-12-22 13:35:01 -06:00
Jim Miller
7b951d7f4d Update old included_dependencies to current versions. 2020-12-22 13:29:20 -06:00
Jim Miller
d33decd8f5 Clean up cloudscraper import 2020-12-22 13:17:02 -06:00
Jim Miller
d652b4a9fe Update new included_dependencies to current versions. 2020-12-22 13:16:24 -06:00
Jim Miller
a160d28f27 Remove unneeded file. 2020-12-22 12:34:09 -06:00
Jim Miller
7cb67982dd Use requests.utils.extract_zipped_paths to get browsers.json instead of kludge. 2020-12-22 12:29:19 -06:00
Jim Miller
f772059654 Bump Test Version 3.26.4 2020-12-22 12:02:26 -06:00
Jim Miller
4d13e477a5 Don't do get_resources() workaround when not in plugin. 2020-12-22 12:02:19 -06:00
Jim Miller
5658967a8b Bump Test Version 3.26.3 2020-12-22 10:42:21 -06:00
Jim Miller
3dd46ceee3 Adding cloudscraper and dependencies 2020-12-22 10:41:57 -06:00
Jim Miller
414fafc1e5 Bump Test Version 3.26.1 2020-12-17 20:08:06 -06:00
Jim Miller
ae2b33ec7b Apply ensure_text() to INIs with plugin CLI. 2020-12-17 20:07:56 -06:00
Jim Miller
cf2ae9b126 Bump Release Version 3.26.0 2020-12-15 14:00:45 -06:00
Jim Miller
59ceff0af1 Bump Test Version 3.25.17 2020-12-15 09:42:22 -06:00
Jim Miller
425f372968 Clear user_agent for literotica.com 2020-12-15 09:41:36 -06:00
Jim Miller
fd8a7ce69b Use refresh_screen=False when calling Reading List add/remove, refresh
book_ids.
2020-12-14 15:55:54 -06:00
Jim Miller
78d68892f7 Bump Test Version 3.25.16 2020-12-13 11:08:32 -06:00
Jim Miller
2ee0ada0d1 Fix finding imgs in existing epubs. Closes #608 2020-12-13 11:00:02 -06:00
Jim Miller
9da3746b9c Bump Test Version 3.25.15 2020-12-11 09:27:16 -06:00
Hazel Shanks
0f895205f6 cast JSON api values to string even when expecting string -- fixes #604 2020-12-11 08:57:14 -06:00
Hazel Shanks
81ec048517 changes CSS pt values to em 2020-12-11 08:39:30 -06:00
Hazel Shanks
72024b2b8e added default CSS formatting for author's notes and for in-story tables/statblocks 2020-12-11 08:39:30 -06:00
Jim Miller
af38ed0878 Bump Test Version 3.25.14 2020-12-09 11:31:19 -06:00
Jim Miller
0af9c874b1 Fix for genre change in adapter_royalroadcom 2020-12-09 11:31:03 -06:00
Jim Miller
b36278a7c8 Bump Test Version 3.25.13 2020-12-07 19:31:21 -06:00
Jim Miller
ebb8608577 Fix for adapter_storiesonlinenet 'access' issue. 2020-12-07 19:31:11 -06:00
Jim Miller
5b4e0e041b Bump Test Version 3.25.12 2020-12-04 12:07:00 -06:00
Jim Miller
50bbb80633 adapter_scribblehubcom - use pagecache, remove some dead code 2020-12-04 12:06:33 -06:00
Jim Miller
0dcd5805fc Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2020-11-28 20:24:44 -06:00
Jim Miller
769a5b44b5
Merge pull request #601 from HazelSh/fictionlive
fix for #598
2020-11-28 20:24:35 -06:00
Hazel Shanks
3eb9755cce reworked / simplified chapter extraction loop -- fixes #598 2020-11-28 23:57:03 +13:00
Hazel Shanks
c53cbe8257 fix bug with chapter extraction when all chapters are appendices 2020-11-28 23:57:03 +13:00
Jim Miller
1af3b4ff92 Bump Test Version 3.25.11 2020-11-26 18:31:10 -06:00
Jim Miller
1d1fc33093 Revert(ish) 'fix' for ffnet covers--they fixed it. b9cf7e2a64 2020-11-26 18:30:56 -06:00
Jim Miller
6c9e84dc7f Bump Test Version 3.25.10 2020-11-26 14:21:58 -06:00
Jim Miller
e2e6f74d42 Don't error on <img> w/o class in replace_failed_smilies_with_alt_text 2020-11-26 14:21:37 -06:00
Jim Miller
8cc21d19ec Allow tab to leave edit boxes. 2020-11-26 14:21:01 -06:00
Jim Miller
738da1af0e Bump Test Version 3.25.9 2020-11-23 14:31:48 -06:00
Jim Miller
285459758d Allow <img> tags without src attr in epub to update 2020-11-23 14:31:48 -06:00
Jim Miller
b72ce6ecf3 Fix http/https matching in identifiers:url search. 2020-11-23 14:31:48 -06:00
Jim Miller
9d8027ab7c Change convert_inline_images default to false. 2020-11-23 14:31:48 -06:00
Jim Miller
3301b96390 Update version updater to be more automatic. 2020-11-23 14:31:40 -06:00
Jim Miller
9c43667b44 Bump Test Version 3.25.8 2020-11-19 11:26:28 -06:00
Jim Miller
b9cf7e2a64 ffnet cover images changed? Or broken? 2020-11-19 11:26:13 -06:00
Jim Miller
9a25c9d6f7 Warn, don't crash, when cover_min_size fails. 2020-11-19 11:01:31 -06:00
Jim Miller
16049cc09b Bump Test Version 3.25.7 2020-11-18 14:32:43 -06:00
Jim Miller
7c1a723a6d Fix adapter_thesietchcom for site change 2020-11-18 13:32:34 -06:00
Jim Miller
ee63036c6d Remove outdated comment 2020-11-18 10:22:37 -06:00
Jim Miller
a5a1322f28 Add 2nd dup-story URL check after fetching metadata for when story URL changes. 2020-11-18 10:19:14 -06:00
Jim Miller
07496ad0c3 Bump Test Version 3.25.6 2020-11-15 11:34:06 -06:00
Jim Miller
0080310062 Set default cover_min_size: 10,10 to avoid spacer images as covers. 2020-11-15 11:34:06 -06:00
Jim Miller
2beb2df77f Fixes for extra /s in normalized URLs. 2020-11-15 11:34:06 -06:00
Jim Miller
0fa697b418 Add (optional, default on) convert support for data:image in-line <img>s. 2020-11-15 11:20:45 -06:00
Jim Miller
12383b6342 Adding pass-through support for data:image in-line <img>s. 2020-11-15 10:02:45 -06:00
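The two commits above add pass-through and convert support for `data:` URI images embedded directly in `<img src=…>`. As background, a minimal stdlib sketch of splitting such a URI into its MIME type and raw bytes (hypothetical helper name; FanFicFare's real handling differs):

```python
import base64

def decode_data_image(src):
    """Split a data: URI into (mime_type, raw_bytes).

    Illustrative only -- not FanFicFare's actual code.
    """
    header, _, payload = src.partition(',')
    # header looks like 'data:image/png;base64'
    mime = header[len('data:'):].split(';')[0]
    if ';base64' in header:
        return mime, base64.b64decode(payload)
    return mime, payload.encode('utf-8')

# 1x1 transparent GIF, a common inline spacer image
src = ('data:image/gif;base64,'
       'R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7')
mime, data = decode_data_image(src)
assert mime == 'image/gif'
assert data.startswith(b'GIF89a')
```

Pass-through means writing those bytes into the epub unchanged; convert means handing them to the normal image-processing path first.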
Jim Miller
f91111de90 Bump Test Version 3.25.5 2020-11-14 12:33:57 -06:00
Jim Miller
3a1447abea Fix anonymous authorUrl in adapter_archiveofourownorg to use getSiteDomain() 2020-11-14 12:33:57 -06:00
Jim Miller
4108d5c1d1 New Site: squidgeworld.org - shares code with AO3. 2020-11-14 12:29:09 -06:00
Jim Miller
b24cbbc954 Remove duplicate getLogger. 2020-11-14 12:27:17 -06:00
Jim Miller
8b0e0c8de5 Bump Test Version 3.25.4 2020-11-13 11:08:04 -06:00
Jim Miller
b85c265fdd adapter_wwwnovelallcom: Accept chapter URLs, must change to true storyId/URL after. 2020-11-13 10:55:36 -06:00
Jim Miller
078950b2e3 adapter_scribblehubcom: Accept chapter URLs. 2020-11-13 10:16:42 -06:00
Jim Miller
34a12c48c1 Don't use polyglot, not included until Calibre3. Remove some debug output. 2020-11-13 09:55:42 -06:00
Jim Miller
84a7414981
Merge pull request #592 from Rikkitp/wuxiaworldco_status
adapter_wuxiaworldco: added status
2020-11-12 09:05:58 -06:00
Dmitry Snegirev
64ddc71886 adapter_wuxiaworldco: added status 2020-11-12 14:14:35 +03:00
Jim Miller
08cd7b0822 Bump Test Version 3.25.3 2020-11-11 10:30:44 -06:00
Jim Miller
f1990600da adapter_literotica: Keep language domains & use for language metadata. #588 2020-11-11 10:30:10 -06:00
Jim Miller
a98770a18b Make adapter_novelupdatescc share code with adapter_wuxiaworldco 2020-11-11 10:11:20 -06:00
Jim Miller
5afd8ca3e2 Bump Test Version 3.25.2 2020-11-10 09:37:38 -06:00
Jim Miller
91073658cc adapter_alternatehistorycom now uses same thread group HTML as XenForo2. Closes #590 2020-11-10 09:37:27 -06:00
Jim Miller
c27ffc52b2 Bump Test Version 3.25.1 2020-11-09 19:25:27 -06:00
Jim Miller
5f9369176c adapter_wuxiaworldco/adapter_novelupdatescc: Re-add Don't include grayed out 'In preparation' 2020-11-09 19:25:27 -06:00
Jim Miller
31b2c75bed adapter_literotica: Accept /beta/ URLs. 2020-11-09 11:00:57 -06:00
Jim Miller
e8f75249da Bump Release Version 3.25.0 2020-11-09 10:50:12 -06:00
Jim Miller
0a01ab7438 Bump Test Version 3.24.16 2020-11-08 13:13:22 -06:00
Jim Miller
2278110d32 On drag and drop, accept text/html and remove %0D at end of text/uri-list URLs. Closes #587 2020-11-08 13:13:08 -06:00
Jim Miller
067a5fd244 Bump Test Version 3.24.15 2020-11-08 09:40:00 -06:00
Jim Miller
5c52a1f43b Update translations 2020-11-08 09:39:39 -06:00
Dmitry Snegirev
cca3f362e6 add fastnovel.net adapter 2020-11-08 00:31:42 +03:00
Jim Miller
175b4728d6 Bump Test Version 3.24.14 2020-11-06 11:04:04 -06:00
Jim Miller
475efc8f04 Update translations. 2020-11-06 11:03:49 -06:00
Jim Miller
338b288b38 Don't error out on non-editable custom columns ValueError. 2020-11-06 11:02:12 -06:00
Jim Miller
0a2585808f Update strings-to-translate. 2020-11-06 08:21:33 -06:00
Jim Miller
d5ec157654 Bump Test Version 3.24.13 2020-11-02 15:07:41 -06:00
Jim Miller
2fb09e6a2b Add Yes/No to All question dialog boxes. 2020-11-02 15:06:25 -06:00
Jim Miller
793c04f262 Bump Test Version 3.24.12 2020-10-30 09:49:36 -05:00
Jim Miller
af89ede8b4 Update translations 2020-10-30 09:49:27 -05:00
Jim Miller
d729386685
Merge pull request #581 from HazelSh/fictionlive
fiction.live: fixed crash with stories with achievements (introduced in last set of commits)
2020-10-30 09:47:43 -05:00
Jim Miller
b7621c6555
Merge branch 'master' into fictionlive 2020-10-30 09:47:06 -05:00
Hazel Shanks
bb14697397 fixed bug with stories with achievements introduced in last set of commits 2020-10-30 18:08:09 +13:00
Jim Miller
cdf412660f Tweak defaults.ini 2020-10-29 10:19:55 -05:00
Jim Miller
50efa8f52d Update translations. 2020-10-28 22:28:46 -05:00
Jim Miller
eae93edc1f Bump Test Version 3.24.11 2020-10-27 14:16:38 -05:00
Jim Miller
0ee67a26ae Update translations 2020-10-27 14:16:15 -05:00
Jim Miller
e2901081f7 Update translation messages.pot 2020-10-27 14:14:32 -05:00
Jim Miller
6925559f5f Allow story URLs edit box to edit--remove *FromMimeData() methods. 2020-10-27 14:13:48 -05:00
Jim Miller
723489c230 Fix drag-n-drop of calibre rows onto FFF. 2020-10-27 13:22:48 -05:00
Jim Miller
0bf11d6ea3 Fix drag-n-drop of emls(Thunderbird). 2020-10-27 13:07:01 -05:00
Jim Miller
bda69750a3 Remove some debugs. 2020-10-26 14:33:52 -05:00
Jim Miller
6eb39b6d46 Bump Test Version 3.24.10 2020-10-25 17:43:27 -05:00
Jim Miller
d3b9dd0cde Change seriesHTML to not be filled without series. 2020-10-25 17:42:57 -05:00
Jim Miller
8a3b445241 Bump Test Version 3.24.9 2020-10-25 09:47:45 -05:00
Jim Miller
1ca7036594 Fix for Xenforo2 change (SB/SV) 2020-10-25 09:47:24 -05:00
Jim Miller
a819037b79 Bump Test Version 3.24.8 2020-10-24 09:15:22 -05:00
Jim Miller
78fa57a63c Update adapter_ficbooknet for site changes. 2020-10-24 09:15:07 -05:00
Jim Miller
7ad85b8beb Bump Test Version 3.24.7 2020-10-23 10:04:05 -05:00
Jim Miller
fc5b7cb3b7 Fix for latest storiesonline login change, thanks mrEd 2020-10-23 10:03:50 -05:00
Jim Miller
af63a3e770 Bump Test Version 3.24.6 2020-10-22 14:02:54 -05:00
Jim Miller
3699991d6d Add checks for CALIBREONLYSAVECOL when not config'ed and update w/o epub. 2020-10-22 12:46:36 -05:00
Jim Miller
915159a6d9 Don't disable Update Mode with By Action menus. 2020-10-22 12:12:00 -05:00
Jim Miller
7b2aaee4ea Remove some commented out dev code. 2020-10-22 12:10:05 -05:00
Jim Miller
2bd6435e72 Do 'not an anthology' dialog outside busy_cursor 2020-10-22 12:02:07 -05:00
Jim Miller
dae4acd884 Bump Test Version 3.24.5 2020-10-21 10:39:29 -05:00
Jim Miller
323fefa333 Bump Test Version 3.24.4 2020-10-21 10:38:25 -05:00
Jim Miller
9dbe543dc9 Fix regression in collision translation with email immediate 2020-10-21 10:38:25 -05:00
Jim Miller
e260b32014 Add icons to actions-by-mode menus. 2020-10-21 10:38:25 -05:00
Jim Miller
d0db30fa1d Bump Test Version 3.24.3 2020-10-21 10:38:25 -05:00
Jim Miller
c9297dd0c4 Working on many-mode-menu feature--also fixing collision translation. 2020-10-21 10:38:25 -05:00
Jim Miller
5264a15e68 Working on many-mode-menu feature--also fixing collision translation. 2020-10-21 10:38:25 -05:00
Jim Miller
7b4a3333e7 Working on many-mode-menu feature--also fixing 'checked' parameter from menu actions. 2020-10-21 10:38:25 -05:00
Jim Miller
22359b0f4d
Merge pull request #573 from HazelSh/fictionlive
Fiction.live fixes -- URL normalization, .ini changes, and a fix to story updates not being recognised
2020-10-21 10:37:02 -05:00
Hazel Shanks
6ab50b6eaa Merge branch 'master' into fictionlive 2020-10-21 13:28:57 +13:00
Hazel Shanks
c9ef6cbdfe fanficfare should now recognise when the most recent chapter has been updated (chunk timestamp in URL) 2020-10-21 13:22:10 +13:00
Hazel Shanks
d40ff43a2f normalize fiction.live story URLs -- should close #559 2020-10-21 13:15:23 +13:00
Hazel Shanks
b35ca970f3 changed .ini files -- set up subject tags, comment suggesting dedup_img_files when including images 2020-10-21 12:56:08 +13:00
Pk11
2ea7280f66 Fix brackets
Missed changing one...
2020-10-20 10:08:40 -05:00
Pk11
c9d7092f3a Add mapping formats to a few strings
So translations can change the order of the variables
2020-10-20 10:08:40 -05:00
Jim Miller
a353a54374 Experimentally adding new translation strings to see how transifex handles them. 2020-10-19 21:42:48 -05:00
Jim Miller
31c5696cd1 Improve menus setVisible(False)--also disables. 2020-10-18 17:05:13 -05:00
Jim Miller
ebfe57d410 Bump Test Version 3.24.2 2020-10-18 16:29:45 -05:00
Jim Miller
7c2fac6b7a Change from menus disable to setVisible(False). 2020-10-18 16:29:32 -05:00
Jim Miller
de226a7e67 Bump Test Version 3.24.1 2020-10-17 16:38:10 -05:00
Jim Miller
08cadaea10 Changes to always have all FFF shortcut/menu items present plus submenus. 2020-10-17 16:37:56 -05:00
Jim Miller
35c662a616 Bump Release Version 3.24.0 2020-10-15 11:35:55 -05:00
Jim Miller
4f5df9609e Update translations 2020-10-15 11:35:19 -05:00
Jim Miller
6cb97595b8 Add always_login setting info to [base_xenforoforum] in defaults.ini. Closes #569 2020-10-11 17:50:55 -05:00
Jim Miller
cd470000b0 Update translations 2020-10-11 09:58:50 -05:00
Jim Miller
6c8840eabe Bump Test Version 3.23.17 2020-10-09 11:09:02 -05:00
Jim Miller
c70ec078ea Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2020-10-07 09:00:41 -05:00
muchtea
4c27747416 fiction.live - handle api returning non-int values for votes
comment
2020-10-07 09:00:37 -05:00
Jim Miller
6910c84225 Hide series title/desc labels when not present. 2020-10-05 16:13:19 -05:00
Jim Miller
7b21c9fbdd Bump Test Version 3.23.16 2020-10-05 16:04:54 -05:00
Jim Miller
d0a2487674 Update translations. 2020-10-05 15:59:39 -05:00
hseg
13570e3f99 Pass None to bs4.find() instead of empty string
BeautifulSoup apparently treats an empty string like any other string,
i.e. as a search filter for tag names. Passing None seems to disable
this filtering. Couldn't find the relevant docs for this, so any
representation of find()'s behaviour here is based on my testing only.
(on BeautifulSoup 4.9.3)

A quick code search on the repo shows this is the only place find gets
explicitly passed an empty string. Indeed, all other usages seem to list
the tag to expect (in this case, both are <div>s), but I chose to follow
this adapter's choice of not filtering.
2020-10-05 15:58:24 -05:00
Jim Miller
f069ae7897 Add new translation strings. 2020-10-05 10:59:29 -05:00
Jim Miller
9184ba9e0f Bump Test Version 3.23.15 2020-10-03 12:39:57 -05:00
Pk11
a02108c95d Switch to for loops 2020-10-03 12:12:26 -05:00
Pk11
3976266f70 Fix crash if news box is missing title and/or body 2020-10-03 12:12:26 -05:00
Pk11
df963ca78c Fix exclude_notes
Accidentally deleted this line
2020-10-03 12:12:26 -05:00
Pk11
3e3a13c096 Add excluding author's notes and news boxes 2020-10-03 12:12:26 -05:00
Pk11
c21c3a6041 Nicely format Scribble Hub's author's notes & news 2020-10-03 12:12:26 -05:00
Jim Miller
5cae380257 Bump Test Version 3.23.14 2020-10-02 15:58:23 -05:00
Jim Miller
710aaa32e6 More 'fix' for crazy eFiction series collection. 2020-10-02 15:58:11 -05:00
Jim Miller
8c46f82c26 archive.hpfanfictalk.com -> fanfictalk.com plus site changes for same. 2020-10-02 15:57:30 -05:00
Jim Miller
123308bebd Bump Test Version 3.23.13 2020-10-02 14:05:42 -05:00
Jim Miller
cbdb7649e1 Apply CLI --json-meta to --list to see series name/desc. 2020-10-02 14:04:13 -05:00
Jim Miller
29c718d7bd Making eFiction series name/desc collection work with more sites. 2020-10-02 13:34:13 -05:00
Jim Miller
50af1fa781 Change ponyfictionarchive.net to https by default 2020-10-02 12:54:06 -05:00
Jim Miller
b0bb9402da Making eFiction series name/desc collection work with more sites. 2020-10-02 12:30:19 -05:00
Jim Miller
f2ed71a0a3 Bump Test Version 3.23.12 2020-10-01 13:37:07 -05:00
Jim Miller
7177285f99 Another py3 can't compare to None fix. 2020-10-01 13:36:48 -05:00
Jim Miller
b2bb03921c Add status to AO3 series collection. 2020-10-01 13:28:20 -05:00
Jim Miller
2932c9b436 Bump Test Version 3.23.11 2020-10-01 11:50:07 -05:00
Jim Miller
8021836a04 Bump Test Version 2.23.11 2020-10-01 11:47:15 -05:00
Jim Miller
c2bc561688 Fix adapter_storiesonlinenet login. Closes #562 2020-10-01 11:46:12 -05:00
Jim Miller
d487b265f6 'Fixed' login for adapter_storiesonlinenet, but doesn't work on chapters. Tons of debug output. 2020-09-30 16:39:52 -05:00
Jim Miller
971db85948 Bump Test Version 3.23.10 2020-09-29 20:24:39 -05:00
Jim Miller
7e29d4163d Bump Test Version 2.23.10 2020-09-29 20:21:15 -05:00
Jim Miller
76e986117b Don't save cover image size for dedup_img_files - Calibre might replace it. Also fixes #561 2020-09-29 20:20:37 -05:00
Jim Miller
8cb0b88c92 Comment out a debug. 2020-09-28 11:32:29 -05:00
Jim Miller
ad59e2cf45 Adding eFiction series parsing attempt to base_adapter. 2020-09-27 19:50:32 -05:00
Jim Miller
d9101f315a test1.com series 2020-09-27 19:48:01 -05:00
Jim Miller
59e24831b8 Better GUI for series anthology 2020-09-27 19:47:39 -05:00
Jim Miller
599a72d2fc Bump Test Version 3.23.9 2020-09-26 09:04:35 -05:00
Jim Miller
882966cc0c ensure_text plugin-example.ini for py3 2020-09-26 09:04:19 -05:00
Jim Miller
f326301f38 Bump Test Version 3.23.8 2020-09-25 12:31:40 -05:00
Jim Miller
f06ca006d0 New site: www.the-sietch.com XenForo2 site with weird URL scheme. Closes #430 2020-09-25 12:31:10 -05:00
Jim Miller
1d077bda3f Refactor xenforo code, '/' in getPathPrefix() 2020-09-25 12:19:08 -05:00
Jim Miller
c68712a577 potionsandsnitches.org shouldn't have add_to_include_subject_tags in defaults.ini. 2020-09-24 15:28:13 -05:00
Jim Miller
4d06041688 Bump Test Version 3.23.7 2020-09-24 13:45:24 -05:00
Jim Miller
a9c87f4ecf Add get_urls_from_page() and get_series_from_page() to adapters, add
support.
2020-09-24 13:43:02 -05:00
Jim Miller
ad529eb9ef Fix for deprecated regex escape warnings. 2020-09-23 12:36:34 -05:00
Jim Miller
afdf3e5205 Bump Test Version 3.23.6 2020-09-19 08:29:13 -05:00
Jim Miller
34535be0a7 Change adapter_siyecouk to default to https, Closes #558. 2020-09-19 08:28:13 -05:00
Jim Miller
a0c551c46e Bump Test Version 3.23.5 2020-09-17 08:37:10 -05:00
Jim Miller
54bee0ad0f Add include_author_notes option (defaults on) to adapter_royalroadcom Closes #556 2020-09-17 08:36:45 -05:00
Hazel Shanks
158e0d5a5a Handle stories with missing contentRating, storyStatus & support beta.fictionlive.com domain 2020-09-17 08:06:45 -05:00
Hazel Shanks
58f3d1c268 Handle stories with missing contentRating, storyStatus & support beta.fictionlive.com domain 2020-09-17 15:56:22 +12:00
Jim Miller
599c9051f0 Bump Test Version 3.23.4 2020-09-16 21:28:41 -05:00
Jim Miller
d2a6faa225 Remove authorUrl sets from adapter_novelupdatescc / adapter_wuxiaworldco -- they don't reliably appear to have author links after all. Closes #555 2020-09-16 21:23:21 -05:00
Jim Miller
c759a9e769 Bump Test Version 3.23.3 2020-09-16 14:53:09 -05:00
Jim Miller
b396a08828 Add dedup_img_files option(default:false). #550 2020-09-16 14:52:55 -05:00
Jim Miller
15c21e8a6c Bump Test Version 3.23.2 2020-09-16 13:40:11 -05:00
Jim Miller
cdf7db07b2 Add setting remove_tags:script,style replacing script remove hardcode, adding style remove. Closes #553. 2020-09-16 13:38:46 -05:00
ElminsterAU
0055978a57 Fix for adapter_novelupdatescc not setting authorUrl. 2020-09-16 12:43:46 -05:00
ElminsterAU
1578a9f724 adds support for novelupdates.cc
this is currently a 1:1 copy of the adapter for wuxiaworld.co with just the relevant strings replaced
2020-09-16 12:43:46 -05:00
Jim Miller
c18822294f Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2020-09-16 12:07:01 -05:00
Jim Miller
5bb53b83ae Add comment that chireads.com and wuxiaworld.com don't seem to have authorUrl links. 2020-09-16 12:06:33 -05:00
Jim Miller
d1fb0d0d3c Fix for adapter_wuxiaworldco not setting authorUrl. 2020-09-16 12:05:49 -05:00
Jim Miller
0c822bc0a0 Fix for adapter_wuxiaworldsite not setting authorUrl. 2020-09-16 11:32:31 -05:00
muchtea
4a4a9e0327 fiction.live - fix typo 2020-09-16 09:12:17 -05:00
Jim Miller
ade0458f1e Bump Test Version 3.23.1 2020-09-15 09:25:37 -05:00
Jim Miller
c57470e955 Fix -s site list for adapter_hpfanficarchivecom. 2020-09-14 18:52:31 -05:00
Jim Miller
f09a76fa61 Bump Release Version 3.23.0 2020-09-14 17:56:04 -05:00
Jim Miller
64658387a5 Update translations. 2020-09-13 12:47:13 -05:00
Jim Miller
69a1186978 Bump Test Version 3.22.16 2020-09-13 12:41:33 -05:00
Jim Miller
af2dd1d063 Add 'undocumented' --unverified_ssl CLI option. Closes #546 2020-09-13 12:41:16 -05:00
Jim Miller
6cd3aed4d6 Bump Test Version 3.22.15 2020-09-11 14:17:42 -05:00
Jim Miller
2a7e1d2c19 adapter_valentchambercom - requires SSL and only allows w/o www. 2020-09-11 14:10:02 -05:00
Jim Miller
2590ec564f adapter_hpfanficarchivecom - SSL cert SSL_ERROR_RX_RECORD_TOO_LONG, back to http 2020-09-11 14:09:35 -05:00
Jim Miller
fbebe48fec Update translations. 2020-09-11 09:33:52 -05:00
Jim Miller
b86f637cee Add new string for transifex. 2020-09-10 16:31:36 -05:00
Jim Miller
aba534a95d Bump Test Version 3.22.14 2020-09-10 10:11:12 -05:00
Jim Miller
c1465af849 Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2020-09-10 10:10:24 -05:00
muchtea
e7496f0e3a fiction.live - handling possibly optional route titles 2020-09-10 10:10:18 -05:00
muchtea
445e26158f fiction.live - add support for internal links from choices to route chapters 2020-09-10 10:10:18 -05:00
muchtea
cac74157b9 fiction.live - add support for multi route stories
Route chapters are simply added at the end, after the appendices
2020-09-10 10:10:18 -05:00
muchtea
88dc400edd Add .idea to .gitignore 2020-09-10 10:10:18 -05:00
Jim Miller
fd646e9924 Bump Test Version 3.22.13 2020-09-09 11:32:06 -05:00
Kolbo
c991f3cd3a add wuxiaworld.site adapter 2020-09-09 11:52:47 +02:00
Kolbo
12a5208ab2 refactor chiread tests with generic test class 2020-09-09 11:52:28 +02:00
Kolbo
008fdd6ea3 add generic class to easily add new test 2020-09-09 11:52:28 +02:00
Jim Miller
5175d40d6f Bump Test Version 3.22.12 2020-09-08 10:19:58 -05:00
Jim Miller
7cefc329bc adapter_fictionlive: getSiteExampleURLs(cls) needs to return a string. 2020-09-08 10:10:32 -05:00
Hazel Shanks
d55337e909 more permissive regex to handle story urls with genre in url, 20 char IDs & example urls -- fixes #541 2020-09-08 10:08:19 -05:00
Jim Miller
7585eafa77 Bump Test Version 3.22.11 2020-09-06 21:54:44 -05:00
Hazel Shanks
a42a349e3c changed time formats for livetime metadata, chunk timestamps to avoid localized names -- fixes #538 2020-09-06 21:54:10 -05:00
Jim Miller
b0faa2ce21 Cleanup some whitespace. 2020-09-06 19:28:01 -05:00
Hazel Shanks
28a23e1257 null handling in format_readerposts -- closes #539 2020-09-06 19:22:52 -05:00
Jim Miller
2861bb5f1e Bump Test Version 3.22.10 2020-09-06 17:30:43 -05:00
Jim Miller
3f73e28a48 adapter_occlumencysycophanthexcom needed another 'needs login' string. 2020-09-06 17:30:28 -05:00
Jim Miller
3459960c0f Bump Test Version 3.22.9 2020-09-04 20:55:47 -05:00
Jim Miller
12ac620d71 Fix anthology comments for changes in how Calibre handles them. 2020-09-04 20:55:37 -05:00
Jim Miller
402db7ff6e Bump Test Version 3.22.8 2020-09-02 11:46:25 -05:00
Jim Miller
33d79d503e Display number of URLs found for an anthology. 2020-09-02 11:46:14 -05:00
Jim Miller
30aedd3bd7 Bump Test Version 3.22.7 2020-08-31 12:43:01 -05:00
Jim Miller
b3ffa0767a Order INI files, clean up some whitespace. 2020-08-31 11:58:23 -05:00
Hazel Shanks
a486d62d20 added stripHTML to titles, summary 2020-08-31 11:55:46 -05:00
Hazel Shanks
2e2389390e re-added configuration validation for fiction.live adapter config options 2020-08-31 11:55:46 -05:00
Hazel Shanks
5e6b6a9b56 added support for very old fiction.live stories with UUID story IDs 2020-08-31 11:55:46 -05:00
Hazel Shanks
209a0a4a9e handling votes with no options, general 'everything-is-optional' sanity checking 2020-08-31 11:55:46 -05:00
Hazel Shanks
7cb05e38a3 fixed typo in default.ini / plugin-defaults.ini 2020-08-31 11:55:46 -05:00
Hazel Shanks
894d25a938 typos in config files 2020-08-31 11:55:46 -05:00
Hazel Shanks
79bd13f615 comment tidying & code-review changes: tags not sorted, metadata unset (not defaults) when not present 2020-08-31 11:55:46 -05:00
Hazel Shanks
9a66915b37 Add support for fiction.live (closes #201) 2020-08-31 11:55:46 -05:00
Jim Miller
d2fb987a1b Update translations. 2020-08-30 17:07:04 -05:00
Jim Miller
b369d6aa35 Added a string for translation. 2020-08-30 17:06:02 -05:00
Jim Miller
4a13a03dd4 Bump Test Version 3.22.6 2020-08-30 17:04:56 -05:00
Jim Miller
d4d3226803 Fix for Anthology bug when no story has a series. 2020-08-30 17:04:45 -05:00
Jim Miller
c27fd27f5b Bump Test Version 3.22.5 2020-08-27 21:14:41 -05:00
Jim Miller
a0b68ec7fe adapter_literotica: Fix for language domain name links on author page. 2020-08-27 21:14:23 -05:00
Jim Miller
f4b9431603 Add a 'shouldn't happen' error check for anthology merge. 2020-08-24 13:43:24 -05:00
Jim Miller
545d7c5e8d Add note about anthologies and epub3 to plugin-defaults.ini 2020-08-20 16:45:19 -05:00
Jim Miller
e5bd062c26 Show busy cursor during anthology merge & update. 2020-08-20 16:16:11 -05:00
Jim Miller
90cdcb3d9e Bump Test Version 3.22.4 2020-08-18 19:09:18 -05:00
Jim Miller
1df24b34af Fix chapter URLs in wuxiaworld.co. 2020-08-18 19:09:04 -05:00
Jim Miller
c0ee0cc702 Bump Test Version 3.22.3 2020-08-18 15:56:21 -05:00
Jim Miller
135c969a0d Make teffalump's chapter dedup & order code optional for wuxiaworld.co with dedup_order_chapter_list setting. 2020-08-18 15:53:50 -05:00
teffalump
b73093584f
Update adapter_wuxiaworldco.py
Strip non-numeric characters from the chapter number before parsing it as a number, but allow decimals and parse as floats, so 'chapter 10.5 - xxx' can still be sorted into the correct place.
2020-08-17 15:50:27 -07:00
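The sorting idea described in the commit above can be sketched like this (hypothetical helper name; not the adapter's actual code):

```python
import re

def chapter_sort_key(title):
    """Extract the first number from a chapter title, keeping decimals,
    so 'chapter 10.5 - xxx' sorts between chapters 10 and 11.

    Illustrative sketch only; titles without a number sort last.
    """
    m = re.search(r'\d+(?:\.\d+)?', title)
    return float(m.group(0)) if m else float('inf')

titles = ['Chapter 11', 'Chapter 10.5 - xxx', 'Chapter 10']
assert sorted(titles, key=chapter_sort_key) == [
    'Chapter 10', 'Chapter 10.5 - xxx', 'Chapter 11']
```

Parsing as float rather than int is what lets interlude chapters like 10.5 keep their place in the ordering.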
teffalump
a7b71b94fd
Update adapter_wuxiaworldco.py (#532)
Fix skip chapter error
2020-08-17 09:52:43 -05:00
Jim Miller
e315a11506 Bump Test Version 3.22.2 2020-08-14 17:58:22 -05:00
teffalump
5caf276ec7
Wuxiaworldco update (#530)
adapter_wuxiaworldco Site updated layout.
2020-08-14 17:57:41 -05:00
Jim Miller
697f51d5b7 Bump Test Version 3.22.1 2020-08-13 16:25:29 -05:00
Jim Miller
d5c1f484e6 Put empty default_prefs['rejecturls'] back until I can fix the code better. #529 2020-08-13 16:24:41 -05:00
Jim Miller
a91345d626 Change outdated verbiage in comment. 2020-08-13 16:22:16 -05:00
Jim Miller
8b0a4336de Bump Release Version 3.22.0 2020-08-09 21:10:24 -05:00
Jim Miller
36e26b0592 Remove site gravitytales.com - redirects to webnovel.com now. Closes #521 2020-08-02 09:27:40 -05:00
Jim Miller
0303b7eeb0 Clean up a few debug outputs. 2020-08-01 22:23:23 -05:00
Jim Miller
9f23d13c9c Bump Test Version 3.21.12 2020-08-01 17:21:36 -05:00
Jim Miller
e27908e70a Accept storyUrl with title in it for adapter_webnovelcom. Addresses #520 2020-08-01 17:21:19 -05:00
Jim Miller
693c3d07ce Bump Test Version 3.21.11 2020-07-26 22:26:58 -05:00
Jim Miller
986531d238 r'' warning 2020-07-26 22:26:42 -05:00
Jim Miller
09dabc37ff Fix adapter_mcstoriescom getSiteExampleURLs() 2020-07-26 22:22:27 -05:00
Jim Miller
0234c16117 Bump Test Version 3.21.10 2020-07-25 15:03:59 -05:00
Jim Miller
7ec4dce57b adapter_scribblehubcom: Default to datetime.now() instead of date.today() because plugin expects datetime not date. 2020-07-25 15:03:26 -05:00
Jim Miller
c93b5d96fe Remove outdated rejecturls from plugin prefs--replaced by rejecturls_date 18+ months back. 2020-07-22 14:01:15 -05:00
Jim Miller
a0948ff4f5 Bump Test Version 3.21.9 2020-07-22 09:11:25 -05:00
Jim Miller
4cb37739d7 Fixes for adapter_scribblehubcom dates. 2020-07-21 19:23:02 -05:00
Jim Miller
fbea3657ac Bump Test Version 3.21.8 2020-07-20 14:48:21 -05:00
Jim Miller
9252a9d31d Fix for adapter_quotevcom site changes. 2020-07-20 14:46:22 -05:00
Jim Miller
afd6b0cace Fix typo in adapter_ficbooknet 2020-07-20 14:39:38 -05:00
Jim Miller
23428e8f93 Fix adapter_harrypotterfanfictioncom desc parse for site change. 2020-07-20 14:20:53 -05:00
Eli Schwartz
d226f4791f fix deprecation warnings for logger.warn()
The correct function since 2003 has been .warning(), and .warn() is a
compat wrapper over it. It wasn't documented until
https://bugs.python.org/issue13235 added documentation stating it exists
but is obsolete. Nevertheless, the whole world is full of code that
mysteriously uses it anyway. Let's at least remove it here, though.
2020-07-20 13:55:12 -05:00
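For context on the commit above, a quick stdlib illustration of the preferred spelling (logger name and message are made up for the demo):

```python
import io
import logging

# logger.warn() is a long-deprecated alias; logger.warning() is the
# documented API and emits no DeprecationWarning.
logger = logging.getLogger("fanficfare.demo")
logger.setLevel(logging.WARNING)
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))

logger.warning("site %s appears to be blocking downloads", "example.com")
assert "example.com" in stream.getvalue()
```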
Jim Miller
dda3c591b6 adapter_masseffect2in needs __future__ unicode_literals because of real unicode r'' strings 2020-07-20 13:54:55 -05:00
Eli Schwartz
fc8efd457e fix deprecation warnings for invalid escapes
also fix ones that need to be switched from u'' to r'' +
unicode_literals
2020-07-20 13:51:13 -05:00
Eli Schwartz
49c9ea9837 fix deprecation warnings for invalid escapes
The two python strings

r'\,'
 '\,'

are identical, except the latter raises warnings, and future versions of
python will raise an error. There is no such C-style escape sequence, so
python falls back to treating it like a literal string. Mark this and
all other strings which aren't intended to actually interject C-style
escapes, as r'' strings to suppress this attempted interpretation and
avoid logging DeprecationWarnings.
2020-07-20 13:51:05 -05:00
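A small demonstration of the warning the commit above silences -- the value is identical either way, only the compile-time warning differs:

```python
import warnings

# '\,' is not a recognized escape sequence, so Python keeps the
# backslash literally -- but compiling it triggers a warning
# (DeprecationWarning, or SyntaxWarning on newer Pythons).
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    code = compile(r"s = '\,'", "<demo>", "exec")
assert any(issubclass(w.category, (DeprecationWarning, SyntaxWarning))
           for w in caught)

# The raw-string form compiles cleanly to the identical two-character value.
ns = {}
exec(code, ns)
assert ns["s"] == r"\," == "\\,"
```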
Jim Miller
befdeb193c Bump Test Version 3.21.7 2020-07-20 11:36:32 -05:00
Jim Miller
f4a3b6e18d Normalize literotica.com URLs to www.literotica.com 2020-07-20 11:35:58 -05:00
Jim Miller
2fe661fbb7 Bump Test Version 3.21.6 2020-07-19 11:49:04 -05:00
Jim Miller
b600199f64 Update translations 2020-07-19 11:48:50 -05:00
Jim Miller
48bbf4f2da Mirror defaults.ini changes to plugin-defaults.ini. 2020-07-19 11:48:50 -05:00
Eleanor Davies
6328c147d9
ScribbleHub - Metadata Fixes (#513)
* better fix for date published
* add numwords, views, avg words
* remove isAdult
* remove login
2020-07-19 11:46:17 -05:00
Jim Miller
1dc801a6e0 Bump Test Version 3.21.5 2020-07-17 18:54:42 -05:00
Eleanor Davies
632a551a08
scribblehubcom - fix attribute error (#511) 2020-07-17 18:49:25 -05:00
Jim Miller
1da9653ebb Bump Test Version 3.21.4 2020-07-17 16:43:05 -05:00
Eleanor Davies
868d4317d9
Scribblehub fixes (#510)
Scribblehub fixes (#510)
* date published + cover art
* ratings
* fix today times
2020-07-17 16:41:40 -05:00
Jim Miller
0c2fe0b487 Bump Test Version 3.21.3 2020-07-16 15:25:12 -05:00
Jim Miller
3fb3775ce8 Default ini configs for [www.scribblehub.com] 2020-07-16 15:24:58 -05:00
Eleanor Davies
b29d8d6b22
Added support for scribblehub (#508)
* 1st chapter

* chapters working

* bugfixes for metadata

* python and lists and indexes -_-

* tidying comments

* inheritance fix for python2.7

* inheritance config fix for python2.7

* strings and ints
2020-07-16 15:21:14 -05:00
Jim Miller
609cf53048 Bump Test Version 3.21.2 2020-07-16 14:32:21 -05:00
Jim Miller
0ff0526d86 Fixes for site changes: adapter_ficbooknet 2020-07-16 14:11:39 -05:00
Jim Miller
d0bcf638d7 Change INI error link blue color in dark mode. 2020-07-15 19:22:06 -05:00
Jim Miller
20cca252b2 Bump Test Version 3.21.1 2020-07-15 13:47:22 -05:00
Jim Miller
e1d087733f For anthologies, look for common val in numbered series00 if not all share 'series'. 2020-07-15 13:46:51 -05:00
Jim Miller
fe6e0263b8 Add 'Series [0]' option for new anthologies. 2020-07-15 12:12:51 -05:00
Jim Miller
71fff8511b Comment out a debug. 2020-07-12 15:23:12 -05:00
Eli Schwartz
74c38d2431 fix the addition of unittests in commit 1669c06703
Revert "Exclude tests from plugin zip."

This reverts commit 8cddb23c69.

Instead, move unittests to where they belong: not inside the module, but
next to it.
2020-07-09 15:02:09 -05:00
Jim Miller
188c92332f Address some py3 deprecation warnings. 2020-07-09 14:20:55 -05:00
Jim Miller
5363fccbfe Update packaged version of six.py to v1.15.0 2020-07-09 14:13:24 -05:00
Jim Miller
7e4e2d7844 Bump Release Version 3.21.0 2020-07-07 10:46:48 -05:00
Jim Miller
1d3598ed8a Bump Test Version 3.20.13 2020-07-06 16:27:12 -05:00
Jim Miller
dfd943f797 Move add_category_when_multi_category after dedup'ing so consolidated category values don't count. 2020-07-06 16:26:03 -05:00
Jim Miller
24c349c2e0 Bump Test Version 3.20.12 2020-07-06 09:56:15 -05:00
Jim Miller
8cddb23c69 Exclude tests from plugin zip. 2020-07-06 09:55:38 -05:00
Kolbo
d6d61cd04f split fixtures into dedicated files 2020-07-06 09:54:58 -05:00
Kolbo
df26d2752a remove unnecessary method 2020-07-06 09:54:58 -05:00
Kolbo
eb44773c87 Revert "refacto: fix deprecated warnings"
This reverts commit 1422dd9391.
2020-07-06 09:54:58 -05:00
Kolbo
1669c06703 feat: add chireads.com adapter and tests on this one 2020-07-06 09:54:58 -05:00
Kolbo
c92effa01b refacto: fix deprecated warnings 2020-07-06 09:54:58 -05:00
Kolbo
0118bea9e2 refacto: add venv to gitignore 2020-07-06 09:54:58 -05:00
Jim Miller
fac2536d2c Tweak test1.com 2020-07-05 19:16:24 -05:00
Jim Miller
ffbb28a8bd Bump Test Version 3.20.11 2020-07-05 19:14:12 -05:00
Jim Miller
1d73d2ffdb Add add_category_when_multi_category option. 2020-07-05 19:14:01 -05:00
Jim Miller
3ad076bd34 Bump Test Version 3.20.10 2020-07-05 16:21:23 -05:00
Jim Miller
df29f57374 Fix for recursion in add_genre_when_multi_category caching bad value. 2020-07-05 16:21:10 -05:00
Jim Miller
a0a831c2d4 Bump Test Version 3.20.9 2020-07-03 19:30:23 -05:00
Jim Miller
67d3bbdca4 Change INI edit highlight colors when dark theme. 2020-07-03 19:29:39 -05:00
Jim Miller
0bb83f62b7 Bump Test Version 3.20.8 2020-06-22 18:17:20 -05:00
Jim Miller
b7480a5d3f Fix add_genre_when_multi_category so it can include_in_ without breaking and move above doreplacements. 2020-06-22 18:17:04 -05:00
Jim Miller
9a0e244141 Change pretendercentre.com->www.pretendercentre.com for SSL cert. 2020-06-22 13:55:32 -05:00
Jim Miller
9f807bc9ca Bump Test Version 3.20.7 2020-06-22 13:24:41 -05:00
Jim Miller
1bff056a49 Add cover_min_size setting. 2020-06-22 13:13:23 -05:00
Jim Miller
e33f854402 Comment out some debug output. 2020-06-22 13:01:13 -05:00
Jim Miller
e25bb1e3a0 Bump Test Version 3.20.6 2020-06-18 12:13:03 -05:00
Jim Miller
9d226f9fe1 Add .SHOW_EMPTY feature for titlepage_entries. 2020-06-18 12:12:14 -05:00
Jim Miller
0aa3204dfb Update translations. 2020-06-17 17:21:06 -05:00
Jim Miller
85c4e638dd Bump Test Version 3.20.5 2020-06-17 17:15:35 -05:00
Jim Miller
3a23924721 Check for epub before polishing cover into it. 2020-06-17 17:15:33 -05:00
Jim Miller
7435d15fdb Bump Test Version 3.20.4 2020-06-10 10:49:30 -05:00
Jim Miller
93a9c4a6c6 Change site efiction.esteliel.de to faerie-archive.com, also changed siteabbrev eesd->fae 2020-06-10 10:49:30 -05:00
Jim Miller
d2e26165bc Bump Test Version 3.20.3 2020-06-07 09:29:56 -05:00
Jim Miller
fa784d58f9 Allow for no genre stories in adapter_fanficauthorsnet. 2020-06-07 09:24:54 -05:00
Jim Miller
3e61731df4 Bump Test Version 3.20.2 2020-06-06 15:58:58 -05:00
Jim Miller
ebd77e5c52 Fix adapter_webnovelcom sitetags for site changes. 2020-06-06 15:58:36 -05:00
Jim Miller
3c77a68e61 Bump Test Version 3.20.1 2020-06-04 17:27:08 -05:00
Jim Miller
54409dd083 Fix for collision issue with translations and email direct d/l. 2020-06-04 17:26:41 -05:00
Jim Miller
37bc54feb9 Update translations. 2020-06-03 10:12:59 -05:00
Jim Miller
c1929d60a1 Bump Release Version 3.20.0 2020-06-01 10:56:38 -05:00
Jim Miller
1c6eba12cf Bump Test Version 3.19.11 2020-06-01 10:55:58 -05:00
Jim Miller
c743d463c4 Add more URLs to cover_exclusion_regexp for base_xenforoforum. 2020-05-31 09:55:36 -05:00
Jim Miller
c73dfd9461 Bump Test Version 3.19.10 2020-05-30 09:41:20 -05:00
Jim Miller
a2f1817f30 Better auth page parsing due to own-favorite story in adapter_harrypotterfanfictioncom. 2020-05-30 09:39:29 -05:00
Jim Miller
7a5730c720 Fix for is_adult needing &showRestricted URL in adapter_harrypotterfanfictioncom. 2020-05-30 09:30:18 -05:00
Jim Miller
137685cef5 Bump Test Version 3.19.9 2020-05-28 17:58:08 -05:00
Jim Miller
c319d25fa3 Add characters and increased category collection for adapter_fanfiktionde. 2020-05-28 17:44:42 -05:00
Jim Miller
d2b860ceb6 Suppress output_css on CLI -z 2020-05-28 17:43:09 -05:00
Jim Miller
7d3c5445e2 Bump Test Version 3.19.8 2020-05-27 10:56:53 -05:00
Jim Miller
d79d5aec98 Update translations. 2020-05-27 10:56:53 -05:00
Jim Miller
681ddd0ad9 Fix numWords parsing in adapter_fanficsme. 2020-05-27 10:56:53 -05:00
Jim Miller
a14c97d335 Tweak series parsing to save fetches in adapter_silmarillionwritersguildorg 2020-05-27 10:37:55 -05:00
Jim Miller
23f93bde24 Allow for stories without series in adapter_silmarillionwritersguildorg, clean up whitespace. 2020-05-27 09:54:39 -05:00
Jim Miller
cbc7c4b64b Merge branch 'alistairporter-master' 2020-05-27 09:45:56 -05:00
Alistair Porter
cbf167d2a4 Rework series parsing to include index, remove adult check code & misc improvements 2020-05-27 10:28:05 +01:00
Alistair Porter
f12564b22f Remove some redundant code and clarify author parsing 2020-05-27 08:39:19 +01:00
Alistair Porter
1654915282 Stop removing hr tags from chapter 2020-05-26 02:13:10 +01:00
Alistair Porter
ac3dc698bb Fixed Series parsing for name and url 2020-05-26 02:13:10 +01:00
Alistair Porter
206e8c87da Fix the summary parsing to include all p tags in summary section 2020-05-26 02:13:10 +01:00
Alistair Porter
232f0b1b24 fix broken regexes for character, genre and warning parsing and reimplement try except for all metadata parsing 2020-05-26 02:13:10 +01:00
Jim Miller
ccdc926f22 Fix broken series parsing in adapter_adastrafanficcom & adapter_lotrgficcom. 2020-05-25 11:26:07 -05:00
alistairporter
406a9022ee
Fix copyright year 2020-05-25 06:21:30 +00:00
Alistair Porter
4b71c0b6e3 Add support for www.silmarillionwritersguild.org 2020-05-25 07:10:31 +01:00
Jim Miller
f169fb53c7 Add Russian language translation. 2020-05-20 10:42:10 -05:00
Jim Miller
46c67f87ca Update translations 2020-05-19 16:59:25 -05:00
Jim Miller
1328d3781b Bump Test Version 3.19.7 2020-05-19 16:57:50 -05:00
Jim Miller
eeeabff8f7 Allow /post/ story URLs with adapter_gravitytalescom. 2020-05-19 16:57:38 -05:00
Jim Miller
0035f3a0f3 Bump copyright date. 2020-05-19 16:56:54 -05:00
Jim Miller
3da4281093 Bump Test Version 3.19.6 2020-05-17 20:30:53 -05:00
Jim Miller
883938994b Change <td> to <div> in chapter text adapter_adultfanfictionorg 2020-05-17 20:30:37 -05:00
Jim Miller
a2e1a04409 Bump Test Version 3.19.5 2020-05-16 09:24:03 -05:00
Jim Miller
7feceb67ce Now actually *use* translated strings in prefs.py. 2020-05-16 09:22:48 -05:00
Jim Miller
164c63602b Bump Test Version 3.19.4 2020-05-15 11:19:54 -05:00
Jim Miller
bbf7e7ddbb Catch exception in exception handling for French user. 2020-05-15 11:19:36 -05:00
Jim Miller
bc9a71adf6 Update Translations. 2020-05-15 10:16:19 -05:00
Jim Miller
361bd6fd24 Need to include prefs.py in messages.pot for translation. 2020-05-14 19:04:23 -05:00
Jim Miller
aaaae4ddd8 Bump Test Version 3.19.3 2020-05-09 12:30:40 -05:00
Jim Miller
0ce1323df1 Remove '.' from numWords in adapter_fanfiktionde 2020-05-09 12:30:27 -05:00
Jim Miller
e6e1876f44 Bump Test Version 3.19.2 2020-05-08 14:35:30 -05:00
Jim Miller
1f17b9c9a0 Fix for adapter_bloodshedversecom site changes. 2020-05-06 11:45:13 -05:00
Jim Miller
ca07695481 Bump Test Version 3.19.1 2020-05-02 11:04:16 -05:00
Jim Miller
ec956c4115 Find both 'Translator' and 'TranslatorS', except in Russian. 2020-05-02 11:04:00 -05:00
Jim Miller
487b6067e9 Bump Release Version 3.19.0 2020-05-01 19:18:09 -05:00
Jim Miller
a4093f9621 Update translations(comments only) 2020-05-01 19:17:49 -05:00
Jim Miller
20197bacd9 Comment on __main__ vs main(). 2020-04-28 18:05:29 -05:00
Jim Miller
26d15c6f34 Bump Test Version 3.18.12 2020-04-27 12:29:03 -05:00
Jim Miller
2642a35009 Comment out a debug, user_agent for wuxiaworld.com 2020-04-27 12:26:21 -05:00
Jim Miller
328b36cb84 Bump Test Version 3.18.11 2020-04-25 20:25:05 -05:00
Smut Andrea
0107ce7d4d Fix adapter_fictionmaniatv image page parsing 2020-04-25 20:23:39 -05:00
Jim Miller
56e91c1f73 Bump Test Version 3.18.10 2020-04-22 10:35:24 -05:00
Jim Miller
1b6f46a5b7 Fix adapter_fanficsme for date tags change. 2020-04-22 10:31:41 -05:00
Jim Miller
d334a6fa93 Fix previous unicode fix. 2020-04-22 10:06:22 -05:00
Eli Schwartz
d8e7e345d0 fix newly introduced python3 compat error: 'unicode' doesn't exist
In commit 50921e0435, the unicode cast was
added to fix passed_defaultsini being native str() type on python2 only.
However it won't work at all on python3, since we need to import
six.text_type for that.

This could be more consistently spotted if we used a different name,
such as unicode_type or text_type itself, but instead we merely shadow
the python2 type, which means mistakes still work on python2. :(
2020-04-19 20:46:56 -05:00
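The distinction can be sketched with a local `text_type` alias standing in for `six.text_type` (which resolves to `str` on Python 3 and `unicode` on Python 2); names here are illustrative:

```python
import sys

# six.text_type is just an alias for the interpreter's text string
# type; defining it locally shows why a bare unicode(...) call cannot
# work on Python 3 -- the name `unicode` does not exist there.
if sys.version_info[0] >= 3:
    text_type = str
else:                      # Python 2 only
    text_type = unicode    # noqa: F821

# e.g. coercing native-str ini data passed in from Calibre:
value = text_type(b"defaults.ini data".decode("utf-8"))
assert isinstance(value, text_type)
```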
Jim Miller
bc6e7cafd8 Bump Test Version 3.18.9 2020-04-15 12:02:06 -05:00
Jim Miller
dbbdca7497 Allow for author without link & id in XF2. 2020-04-15 12:01:52 -05:00
Jim Miller
e141ddfb7c Bump Test Version 3.18.8 2020-04-15 09:24:50 -05:00
Jim Miller
47fab653c9 Change inject_chapter_title back to h3. 2020-04-15 09:24:37 -05:00
Jim Miller
557c2ea601 Bump Test Version 3.18.7 2020-04-14 10:25:54 -05:00
Jim Miller
69fc4b67a0 Change inject_chapter_title code in adapter_storiesonlinenet due to clean up weirdness. 2020-04-14 10:25:39 -05:00
Jim Miller
e985e15761 Bump Test Version 3.18.6 2020-04-13 12:34:58 -05:00
Jim Miller
7c7ab004d2 Change defaults.ini recommendations for inject_chapter_title and change injected titles to h4--had problems with epub update. 2020-04-13 12:34:47 -05:00
Jim Miller
3c4b8ff401 Bump Test Version 3.18.5 2020-04-12 10:37:53 -05:00
Jim Miller
9689747063 Fix for py2/3 differences in final strip-non-ASCII decode fall back. 2020-04-12 10:37:19 -05:00
Jim Miller
7ba1f58788 Remove unused func from mobi.py, possible py2/3 issues. 2020-04-12 10:36:31 -05:00
Jim Miller
6b3992e238 Forgot to commit complete adapter_webnovelcom watermarking solution. 2020-04-12 10:35:55 -05:00
Jim Miller
79688dc14d Bump Test Version 3.18.4 2020-04-11 18:14:04 -05:00
Jim Miller
0c314fd644 Remove some 'watermarking' tags from adapter_webnovelcom 2020-04-11 18:13:33 -05:00
Jim Miller
a8a6519f01 Bump Test Version 3.18.3 2020-04-11 18:12:29 -05:00
Jim Miller
ec88777733 Python3 removes cgi.escape() in favor of html.escape() which doesn't exist in Python2. 2020-04-11 18:01:54 -05:00
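The compatibility problem this commit names (`cgi.escape()` gone on py3, `html.escape()` absent on py2) is usually bridged with an import shim like this (editor's sketch):

```python
# html.escape() exists only on Python 3; cgi.escape() only on Python 2
# (and was removed entirely in 3.8).  A try/except import picks
# whichever one the running interpreter provides.
try:
    from html import escape
except ImportError:          # Python 2
    from cgi import escape   # noqa: F401

assert escape("<b>&</b>") == "&lt;b&gt;&amp;&lt;/b&gt;"
```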
Jim Miller
7fbf4130d3 Bump Test Version 3.18.2 2020-04-08 13:25:07 -05:00
Jim Miller
eacbd91d82 Add scifistories.com to valid site list for universe_as_series setting. 2020-04-06 22:33:27 -05:00
Jim Miller
57e1181c48 Change default setting for new users for 'Update Calibre Cover (from EPUB):' to 'Yes, if EPUB has a cover image' 2020-04-06 22:32:51 -05:00
Jim Miller
00d15bee59 De-obfuscate emails in XenForo posts. 2020-04-05 18:17:10 -05:00
Jim Miller
54f843ec06 Bump Test Version 3.18.1 2020-04-04 10:12:02 -05:00
Jim Miller
e9f82c3343 New Site: scifistories.com (extends finestories.com). 2020-04-04 10:11:17 -05:00
Jim Miller
020cef0b99 Bump Release Version 3.18.0 2020-04-01 12:37:05 -05:00
Jim Miller
f36e59e9c1 Update translations 2020-04-01 12:37:05 -05:00
Jim Miller
3d675e5a25 Bump Test Version 3.17.9 2020-04-01 10:08:51 -05:00
Jim Miller
594057738d Change fanfiction.tenhawkpresents.com to fanfic.tenhawkpresents.ink 2020-04-01 10:08:02 -05:00
Jim Miller
7c5ee9b44a Revert "Remove defunct site fanfiction.tenhawkpresents.com"
This reverts commit 397a181952.
2020-04-01 00:16:25 -05:00
Jim Miller
4d04fabe9c Bump Test Version 3.17.8 2020-03-31 19:27:41 -05:00
Jim Miller
a1a7ea4d40 Changes to adapter_archiveofourownorg for AO3 changes re: view_adult=true 2020-03-31 19:27:29 -05:00
Jim Miller
22d2ad4564 fictionalley.org needs a slow_down_sleep_time or it starts rejecting conns. 2020-03-31 14:49:51 -05:00
Jim Miller
39368ce2ac Bump Test Version 3.17.7 2020-03-31 14:49:51 -05:00
Jim Miller
3ee2597bc3 Fix Description parsing with series/universe/contest links in adapter_storiesonlinenet. 2020-03-31 14:49:51 -05:00
Jim Miller
ba5027ad4d
Merge pull request #479 from mcepl/adapter_ikeran
Revert PR#478
2020-03-31 14:49:15 -05:00
Matěj Cepl
b66dc2a928
Revert PR#478 2020-03-31 21:46:55 +02:00
Matěj Cepl
c1995507f1 Add support for fictionalley.ikeran.org
The site apparently uses the same software as www.fictionalley.org, so
this is just subclass of FictionAlleyOrgSiteAdapter.
2020-03-31 10:00:21 -05:00
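The "just a subclass" pattern this commit describes looks roughly like the following; the base class here is a stand-in, and `getSiteDomain` mirrors the adapter method FanFicFare adapters expose:

```python
class FictionAlleyOrgSiteAdapter:  # stand-in for the real base adapter
    @staticmethod
    def getSiteDomain():
        return "www.fictionalley.org"

class FictionAlleyIkeranOrgAdapter(FictionAlleyOrgSiteAdapter):
    # Same site software, different host: the subclass only needs to
    # override the domain; all parsing is inherited from the base.
    @staticmethod
    def getSiteDomain():
        return "fictionalley.ikeran.org"

assert FictionAlleyIkeranOrgAdapter.getSiteDomain() == "fictionalley.ikeran.org"
```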
Jim Miller
e094793c17 Remove extra spaces from replace_metadata examples 2020-03-30 12:30:40 -05:00
Jim Miller
8f21412374 Accept https URLs for adapter_wwwutopiastoriescom, but don't use https--doesn't work with python. 2020-03-29 12:08:38 -05:00
Jim Miller
869ed37137 Bump Test Version 3.17.6 2020-03-25 21:50:23 -05:00
Jim Miller
ab1a22cf22 Fix for adapter_ficbooknet when no ships/characters. 2020-03-25 21:50:08 -05:00
Jim Miller
c1409b85c6 Bump Test Version 3.17.5 2020-03-25 11:25:34 -05:00
Jim Miller
67c14a2ead Correct comments for universe_as_series setting in defaults.ini. 2020-03-25 11:22:39 -05:00
Jim Miller
c238cd5790 Fixes for ficbook.net site changes. 2020-03-25 11:11:04 -05:00
Jim Miller
20e1c4c98f Bump Test Version 3.17.4 2020-03-24 20:18:21 -05:00
Jim Miller
f59f1d0b37 Make sure all timestamps are using Calibre's local_tz. 2020-03-24 20:18:06 -05:00
Jim Miller
e0f933c357 Add inject_chapter_title feature to defaults.ini for finestories.com too. 2020-03-23 20:03:59 -05:00
Jim Miller
3a90dbaefb Bump Test Version 3.17.3 2020-03-23 13:07:04 -05:00
Jim Miller
a2ec61dc8d Fix for XF logins using manual user/pass instead of personal.ini. 2020-03-23 13:02:44 -05:00
Jim Miller
c74519c9ef Bump Test Version 3.17.2 2020-03-23 12:11:51 -05:00
Jim Miller
4878ce41c3 Add inject_chapter_title feature to adapter_storiesonlinenet for config checking too. 2020-03-23 12:09:43 -05:00
Jim Miller
59badd392c Bump Test Version 3.17.1 2020-03-23 10:33:34 -05:00
Jim Miller
f97b54328f Add inject_chapter_title feature to adapter_storiesonlinenet. 2020-03-23 10:32:50 -05:00
Jim Miller
50921e0435 Need to unicode() ini data passed in from Calibre. 2020-03-20 12:17:56 -05:00
Jim Miller
5f0a706f2c Bump Release Version 3.17.0 2020-03-14 11:19:43 -05:00
Jim Miller
aaf970d77c Update translations 2020-03-14 11:19:10 -05:00
Jim Miller
252d220caa Bump Test Version 3.16.3 2020-03-08 15:06:24 -05:00
Jim Miller
7e81930a56 Add base_xenforo2forum feature skip_sticky_first_posts(on by default). 2020-03-08 15:06:12 -05:00
Jim Miller
908da5744b Update adapter_test1 2020-03-06 14:22:28 -06:00
Jim Miller
f3fb857b89 Bump Test Version 3.16.2 2020-02-20 18:14:21 -06:00
Jim Miller
9494920eef Ignore AO3's chapter numbers for use_view_full_work and use chapter offset instead. Closes #470 2020-02-20 18:14:07 -06:00
Jim Miller
a37a14aa58 Bump Test Version 3.16.1 2020-02-19 10:57:50 -06:00
Jim Miller
9901796331 Add order_threadmarks_by_date to base_xenforoforum, improve defaults.ini. 2020-02-19 10:57:38 -06:00
Jim Miller
b1c55ced18 Bump Release Version 3.16.0 2020-02-13 11:50:38 -06:00
Jim Miller
f9395fd178 Bump Test Version 3.15.9 2020-02-07 12:22:32 -06:00
Jim Miller
fe1ab04627 Collect seriesXX for adapter_archivehpfanfictalkcom. 2020-02-07 12:22:16 -06:00
Jim Miller
0af184921d Reduce debug output in base_xenforoforum_adapter. 2020-02-06 21:53:14 -06:00
Jim Miller
977c07fa27 INI whitespace 2020-02-06 13:30:51 -06:00
Jim Miller
1e84465478 Bump Test Version 3.15.8 2020-02-06 13:28:51 -06:00
Jim Miller
2dd2f9ed81 Change adapter_archivehpfanfictalkcom to own code. 2020-02-06 13:28:31 -06:00
Jim Miller
bfccd4838e Not all listed 'lead' sections still exist. 2020-02-06 13:28:09 -06:00
Jim Miller
f519641e6e Bump Test Version 3.15.7 2020-02-04 17:59:57 -06:00
Jim Miller
b3d0918aab Adding eFiction base version of adapter_archivehpfanfictalkcom. 2020-02-04 10:32:06 -06:00
Jim Miller
84a9d753a4 Add replace_xbr_with_hr feature. 2020-02-02 14:42:04 -06:00
Jim Miller
555f8a7ae6 Bump Test Version 3.15.6 2020-02-01 09:33:57 -06:00
Jim Miller
6a6007441f Fix for adapter_storiesonlinenet requiring 'v' from login.php. 2020-02-01 09:33:57 -06:00
Jim Miller
8c982a6770 Update translations. 2020-02-01 09:27:13 -06:00
Jim Miller
672fc984a6 Bump Test Version 3.15.5 2020-01-24 22:25:42 -06:00
Jim Miller
031d9de356 Add more domains for AO3. 2020-01-24 22:25:24 -06:00
Jim Miller
f087d7dda9 Bump Test Version 3.15.4 2020-01-21 13:22:45 -06:00
Jim Miller
cac7410d7e Fix to #459 for Python2 compatibility. 2020-01-21 13:22:20 -06:00
Jim Miller
f779bad0de Add cover image page 'landmark' for epub3. 2020-01-20 14:00:45 -06:00
Jim Miller
0027730789 Bump Test Version 3.15.3 2020-01-20 13:00:45 -06:00
Jim Miller
cdb589966f Include <blockquote> as well as <p> in adapter_phoenixsongnet chapters. 2020-01-20 12:59:19 -06:00
Jim Miller
029e70aa0b Bump Test Version 3.15.2 2020-01-17 19:47:13 -06:00
Jim Miller
4b9bc818d7 Use storyUrl from metadata for checking library, for those sites that make canonical storyUrl difficult, like adapter_literotica. Closes #461 2020-01-17 19:46:49 -06:00
Jim Miller
54a35ca562 Bump Test Version 3.15.1 2020-01-15 13:26:48 -06:00
Jim Miller
933072df5c Add epub_version to defaults.ini 2020-01-15 13:26:48 -06:00
Jim Miller
11811226b4 Write epub 2.0 or epub 3.0 depending on epub_version setting. 2020-01-15 13:16:07 -06:00
Jim Miller
ca41ce4123 Make Unnew work on epub3 *and* epub2. 2020-01-15 13:16:07 -06:00
Jim Miller
311e4ad417 Fix version for epub3 support -- 3 only, not 2. 2020-01-15 10:21:38 -06:00
Jim Miller
0c479fa579 Bump Release Version 3.15.0 2020-01-15 09:24:33 -06:00
Jim Miller
62fff21f59 Update github project home page. 2020-01-14 18:11:34 -06:00
Jim Miller
b53a3741bd Fix adapter_test1 for multi-author test. 2020-01-14 10:50:53 -06:00
Jim Miller
f579ef6e08 trekfanfiction.net uses utf8 now 2020-01-13 12:43:23 -06:00
Jim Miller
9f557cc10a Bump Test Version 3.14.8 2020-01-13 12:38:58 -06:00
Jim Miller
a83563961c Kludge fix for incorrect HTTP response 500 code from trekfanfiction.net. 2020-01-13 12:38:38 -06:00
Jim Miller
10b9050e57 Remove site fannation.shades-of-moonlight.com, parked domain, last worked Dec2018. 2020-01-13 12:11:40 -06:00
Jim Miller
da4a09b3d5 Remove site nfacommunity.com, parked domain, last worked Dec2018. 2020-01-13 12:09:51 -06:00
Jim Miller
de9a79ede8 Bump Test Version 3.14.7 2020-01-09 13:48:24 -06:00
Jim Miller
bfc9c8b45c Make sure storyUrl has &->&amp; and then change it back for Calibre. Matches how authorUrl is handled(in all_metadata). Closes #460 2020-01-09 13:46:44 -06:00
Jim Miller
24abb202a7 adapter_sugarquillnet change chapter text td to div. Issue #460 2020-01-09 13:07:35 -06:00
Jim Miller
4c3c624cc2 Bump Test Version 3.14.6 2020-01-09 12:21:55 -06:00
Jim Miller
eba9f75c91 Order ini files in help message to reflect priority order. 2020-01-09 12:10:02 -06:00
hseg
fbfcb9a5f6
Support XDG Base Directory spec
Basically boils down to giving `$XDG_CONFIG_HOME/fanficfare` top priority among
search locations, defaulting to `~/.config/fanficfare` if `$XDG_CONFIG_HOME` is
unset.

This shouldn't change anything for anyone who doesn't want XDG support.
2020-01-08 22:56:48 +02:00
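The lookup rule described above boils down to a few lines; the helper name is hypothetical:

```python
import os

def fanficfare_config_dir(environ=os.environ):
    """Resolve the per-user config directory per the XDG Base Directory
    spec: $XDG_CONFIG_HOME/fanficfare, falling back to
    ~/.config/fanficfare when the variable is unset or empty."""
    base = environ.get("XDG_CONFIG_HOME") or os.path.expanduser("~/.config")
    return os.path.join(base, "fanficfare")

assert fanficfare_config_dir({"XDG_CONFIG_HOME": "/tmp/cfg"}) == \
    os.path.join("/tmp/cfg", "fanficfare")
```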
Jim Miller
836c02de53 Update translations. 2020-01-08 11:28:59 -06:00
Jim Miller
7a2ff524d7 Bump Test Version 3.14.5 2020-01-07 12:51:31 -06:00
Jim Miller
49a5536c51 Update adapter_alternatehistorycom for XenForo2. Issue #457 2020-01-07 12:51:10 -06:00
Jim Miller
1b543780a8 Order custom columns in config page. 2020-01-06 17:42:10 -06:00
Jim Miller
3405e0bda1 Bump Test Version 3.14.4 2020-01-02 13:01:49 -06:00
Jim Miller
10029b41b1 Add -U/--update-epub-always option to CLI. 2020-01-02 12:58:47 -06:00
Jim Miller
ac16bfaeb3 Don't do URL quoting on file: URLs. Fix for spaces->+ breaking default cover. 2020-01-02 12:40:31 -06:00
Jim Miller
b192fd0ad1 Catch ',' in front of 'Thread' in forum titles. 2019-12-22 17:43:58 -06:00
Jim Miller
561fa5e319 Bump Test Version 3.14.3 2019-12-20 15:06:50 -06:00
Jim Miller
e9fbb19d67 Reduce minimum_calibre_version to v2.85.1, last of 2 series. 2019-12-18 18:21:47 -06:00
Jim Miller
45a25394f5 More fixes for python3. 2019-12-18 18:07:35 -06:00
Jim Miller
cdce62f2aa Bump Test Version 3.14.2 2019-12-18 12:05:41 -06:00
Jim Miller
45bc88d9bf Remove web service code. Refer back to tag v3.13.0 if ever needed again. 2019-12-18 12:05:26 -06:00
Jim Miller
a0e2db3925 With Calibre min version v3.85, included_dependencies are not needed. 2019-12-18 11:16:06 -06:00
Jim Miller
4718a2c2de Bump minimum_calibre_version to v3.48, last of 3 series. 2019-12-17 13:06:32 -06:00
Jim Miller
7c7946cc51 Bump Test Version 3.14.1 2019-12-17 12:38:52 -06:00
Eli Schwartz
18804a52cf py3: add various python-six fixes 2019-12-17 10:44:06 -06:00
Eli Schwartz
31ac92a06d py3: read config as unicode all the time 2019-12-17 10:44:06 -06:00
Eli Schwartz
c054449328 py3: get rid of basestring 2019-12-17 10:44:06 -06:00
Eli Schwartz
93e93dee92 py3: fix unicode type 2019-12-17 10:44:06 -06:00
Eli Schwartz
b4feb0153d python3: get configparser bits from six.moves 2019-12-17 10:44:06 -06:00
Eli Schwartz
4d1be812e8 python3: use the io module everywhere
Make the calibre plugin usage work on python3. Since FanFicFare does not
support python 2.5, make it work by using the modern idiom.

Essentially, six.StringIO and six.BytesIO makes no sense to use
anywhere, since any code that works on python3 at all will also work
with the io module in python >= 2.6. The only caveat is that the
unadorned str type in python2 works with neither, but it is always best
to be explicit and use either unicode or bytes.
2019-12-17 10:44:06 -06:00
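The modern idiom the commit argues for, in miniature (editor's sketch):

```python
import io

# io.StringIO holds text (unicode); io.BytesIO holds bytes.  Both
# behave identically on Python >= 2.6 and Python 3, which is why
# six.StringIO / six.BytesIO buy nothing over using io directly.
text_buf = io.StringIO()
text_buf.write(u"chapter text")
assert text_buf.getvalue() == u"chapter text"

byte_buf = io.BytesIO()
byte_buf.write(b"\x50\x4b")  # e.g. the first bytes of an epub/zip
assert byte_buf.getvalue() == b"PK"
```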
Eli Schwartz
15a79ee0ca python3: decode the bytes received from get_resources()
We're combining it with a str type.
2019-12-17 10:44:06 -06:00
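The fix amounts to decoding before concatenation; the resource content below is a stand-in for what Calibre's `get_resources()` returns:

```python
# get_resources() returns bytes; on Python 3, bytes + str raises
# TypeError, so decode to text before combining with a str.
raw = b"<style>body{}</style>"       # stand-in for get_resources() output
css = raw.decode("utf-8")
header = "/* plugin css */\n" + css  # now a legal str + str concat
assert isinstance(header, str)
```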
Eli Schwartz
a2c558d864 python3: fix incorrect use of merging two dictionaries
In python2, this was inefficient, because it allocated *three* lists of
tuples, before finally generating a dict based on them. In python3, it
fails because you cannot combine the dict_items() type.

Moreover, retval was always a function-local dictionary used purely for
returning the value, so dict1.update(dict2) will always yield the
correct result, and we don't even need to create a copy to avoid
mutating the original dictionary.
2019-12-17 10:44:06 -06:00
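The before/after of this fix can be sketched as (illustrative names, not the project's):

```python
def merged(dict1, dict2):
    # dict(dict1.items() + dict2.items()) fails on Python 3 because
    # dict_items views do not support +, and on Python 2 it allocated
    # three intermediate lists.  update() does neither.  (In the actual
    # commit, retval was already a function-local dict built for the
    # return value, so even the defensive copy below was unnecessary.)
    retval = dict(dict1)   # copy so neither argument is mutated
    retval.update(dict2)
    return retval

a = {"title": "t", "author": "x"}
b = {"author": "y"}
assert merged(a, b) == {"title": "t", "author": "y"}
assert a["author"] == "x"  # original left untouched
```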
Eli Schwartz
dc77754c1a python3: use modern exception syntax 2019-12-17 10:44:06 -06:00
Eli Schwartz
4a3640cc33 python3: enforce use of absolute imports
Where relative imports are currently being relied upon, do this
explicitly.
2019-12-17 10:44:06 -06:00
Jim Miller
517082f4d1 Bump Release Version 3.14.0 2019-12-16 13:38:44 -06:00
Jim Miller
72e9054b1f Bump Test Version 3.13.10 2019-12-07 16:16:27 -06:00
Jim Miller
5012f64156 Different email URL for royalroad.com. Closes #452 2019-12-06 21:05:36 -06:00
Jim Miller
b945001851 Bump Test Version 3.13.9 2019-12-06 14:27:05 -06:00
Jim Miller
3b0b37920f Use image pages when available for adapter_fictionmaniatv 2019-12-06 14:26:52 -06:00
Jim Miller
6f3d4bc3af Bump Test Version 3.13.8 2019-12-02 13:58:48 -06:00
Jim Miller
ea013468e1 Update adapter_fictionmaniatv to use HTML versions vs text by default. 2019-12-02 13:58:47 -06:00
Jim Miller
816237116d Add to debug output to plugin. 2019-12-02 13:52:28 -06:00
Jim Miller
b8357c409c Update translations. 2019-12-01 10:39:53 -06:00
Jim Miller
b717b0b2a2 Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2019-11-30 17:26:08 -06:00
Jim Miller
2d4cda3ff9
Web Service Shutdown 2019-11-30 17:25:41 -06:00
Jim Miller
5f9ef422dd Closing Web Service Announcement. 2019-11-30 17:03:37 -06:00
Jim Miller
6ba2fdc776 Add 'button menu' option strings for translation. 2019-11-29 09:30:12 -06:00
Jim Miller
c732f52bb7 Bump Test Version 3.13.7 2019-11-26 12:47:38 -06:00
Jim Miller
a941dae620 New Site fanfics.me (Russian language site). 2019-11-26 12:47:21 -06:00
Jim Miller
879f4bc062 Bump Test Version 3.13.6 2019-11-23 19:18:33 -06:00
Jim Miller
e74af328cf Bump Test Version 3.12.6 2019-11-23 16:34:08 -06:00
Jim Miller
73a2708fa1 Base_eFiction - Get Rating/Rated from TOC page if not found on print page. Remove from 2 individual adapters. 2019-11-23 16:32:20 -06:00
Jim Miller
bb439de0e4 Bump Test Version 3.13.5 2019-11-23 10:38:04 -06:00
Jim Miller
f0e1ec7c41 Failsafe for max_zalgo setting. Closes #449. 2019-11-23 10:38:03 -06:00
Jim Miller
f0b139085e DON'T use pagecache in adapter_mediaminerorg. Causes SSL error in Calibre... 2019-11-23 10:38:03 -06:00
Jim Miller
5f8703059e Use pagecache in adapter_mediaminerorg. 2019-11-23 10:24:16 -06:00
Jim Miller
d5f67b1244 Bump Test Version 3.13.4 2019-11-22 21:41:51 -06:00
Jim Miller
45e336b61e Add plugin option for toolbar button to pop menu. 2019-11-22 21:41:29 -06:00
Jim Miller
666a9c958c Bump Test Version 3.13.3 2019-11-19 13:17:48 -06:00
Jim Miller
aaceecef99 Fix AO3 chapterstotal replace_metadata for multi digits. 2019-11-19 13:17:35 -06:00
Jim Miller
716679b012 Bump Test Version 3.13.2 2019-11-18 14:58:48 -06:00
Jim Miller
d389d2b12e Add <div class='fff_chapter_notes' around AO3 chapter head/foot notes. 2019-11-18 14:58:17 -06:00
Jim Miller
17b080c932 Bump Test Version 3.13.1 2019-11-17 11:36:27 -06:00
Jim Miller
31182e18fc Closes #466 - Remove feature to save cookiejar between downloads and BG jobs--causes some obscure problems with QQ at least. 2019-11-17 11:36:07 -06:00
Jim Miller
333444371c Bump Release Version 3.13.0 2019-11-15 11:51:04 -06:00
Jim Miller
6de5f07ffc Set Google Engine max_instances: 1 2019-11-15 11:50:23 -06:00
Jim Miller
7669cac324 Update translations. 2019-11-15 11:49:19 -06:00
Jim Miller
afadca7586 SB appears to have fixed OP Email notifications--and
fetch_last_page didn't always help anyway.

Revert "SB needs fetch_last_page again after all."

This reverts commit 9ce631c5c0.
2019-11-12 21:03:08 -06:00
Jim Miller
1897506613 Bump Test Version 3.12.14 2019-11-11 14:39:00 -06:00
Jim Miller
8d2f9198c2 Add some code to handle 24 hr clocks w/o changing adapters. 2019-11-11 14:38:30 -06:00
Jim Miller
f531a6293b Bump Test Version 3.12.13 2019-11-09 10:28:08 -06:00
Jim Miller
c8d25aa07c Remove site www.13hours.org. Moved to AO3. 2019-11-09 10:25:57 -06:00
Jim Miller
264c5473f0 Remove site lotrfanfiction.com - Closed as per web site. 2019-11-09 10:24:31 -06:00
Jim Miller
d7902a9c1e Remove site twilightarchives.com - Closed as per web site. 2019-11-09 10:20:57 -06:00
Jim Miller
96aeab1ddf Adding a little debug output. 2019-11-09 10:17:45 -06:00
Jim Miller
7db7ce7337 Bypass expired SSL cert by not using SSL. adapter_spikeluvercom 2019-11-09 10:14:27 -06:00
Jim Miller
8d95a8bab2 Bypass expired SSL cert by not using SSL. adapter_thehookupzonenet 2019-11-09 10:13:49 -06:00
Jim Miller
68b2a816fc Don't escape # in URLs, causes problems with ficbook.net. 2019-11-09 08:38:12 -06:00
Jim Miller
8998bb482b Bump Test Version 3.12.12 2019-11-08 22:19:03 -06:00
Jim Miller
aca86f43e3 Add site specific chapterslashtotal and chapterstotal for adapter_archiveofourownorg. 2019-11-08 22:18:41 -06:00
Jim Miller
7a03715ecc Bump Test Version 3.12.11 2019-11-07 14:04:59 -06:00
Jim Miller
94d09def21 Add parentforums site specific metadata for base_xenforo, include partial list in category. 2019-11-07 14:04:40 -06:00
Jim Miller
53143266c2 Bump Test Version 3.12.10 2019-11-07 14:04:39 -06:00
Jim Miller
f1676589fd Fixes for adapter_webnovelcom site changes to JSON. 2019-11-07 13:15:29 -06:00
Jim Miller
9944ffb7ac Bump Test Version 3.12.9 2019-11-04 21:07:28 -06:00
Jim Miller
720e6f088e Add user/pass to adapter_fictionhuntcom, required to see chapter text now. 2019-11-04 20:00:50 -06:00
Jim Miller
dff90a10df Discard cookie cache on library change--might use different users/settings. 2019-11-03 16:49:07 -06:00
Jim Miller
7b008f6b26 Bump Test Version 3.12.8 2019-11-02 10:17:48 -05:00
Jim Miller
ce68248694 At least one efiction site said Completed: Completed instead of Yes. 2019-11-02 10:17:26 -05:00
Jim Miller
a26a303325 Bump Test Version 3.12.7 2019-10-30 19:05:25 -05:00
Jim Miller
85713a3455 Adding some html class attrs to epub output for ease of CSS. 2019-10-30 19:05:05 -05:00
Jim Miller
d4e1012328 Bump Test Version 3.12.6 2019-10-26 11:17:02 -05:00
Jim Miller
e9d313f4ab Save cookiejar between downloads during same calibre session, including BG jobs. 2019-10-26 11:16:46 -05:00
Jim Miller
2c4524f2f8 Bump Test Version 3.12.5 2019-10-26 09:34:12 -05:00
Jim Miller
0ae8f31ae8 Comment out a debug. 2019-10-25 22:40:18 -05:00
Jim Miller
9ce631c5c0 SB needs fetch_last_page again after all.
Revert "Remove fetch_last_page base_xenforo feature--not needed after SB conversion to XF2."

This reverts commit f01e8d8354.
2019-10-25 22:37:51 -05:00
Jim Miller
3cbf9e5668 Add --no-output CLI option. Closes #443 2019-10-25 11:27:23 -05:00
Jim Miller
5c29744bb0 Bump Test Version 3.12.4 2019-10-23 16:03:41 -05:00
Jim Miller
851acb95e2 Avoid post URLs in XenForo notification emails for QQ & AH too. 2019-10-23 16:03:05 -05:00
Jim Miller
0e42d1bb1b Bump Test Version 3.12.3 2019-10-23 13:20:45 -05:00
Jim Miller
939a8b050a Fix for CLI update fail on one story causing all subsequent to also skip update. Issue #438 2019-10-23 12:42:34 -05:00
Jim Miller
a96c5aa800 Fix for 404 error on XF always_login. Closes #438 2019-10-23 12:18:45 -05:00
Jim Miller
c5e973cc0c Avoid post URLs in XenForo notification emails better w/o catching post in thread title. Probably. 2019-10-22 23:35:53 -05:00
Jim Miller
dbb3583045 Bump Test Version 3.12.2 2019-10-22 08:22:47 -05:00
Jim Miller
3c0276e63a Avoid post URLs in XenForo notification emails better. 2019-10-22 08:18:37 -05:00
Jim Miller
f01e8d8354 Remove fetch_last_page base_xenforo feature--not needed after SB conversion to XF2. 2019-10-18 10:04:39 -05:00
Jim Miller
4f11380296 Bump Test Version 3.12.1 2019-10-15 08:53:49 -05:00
Jim Miller
77eb5d41cb Treat spacebattles /post/ URLs in emails the same as sufficientvelocity to avoid extra URLs in thread notifications. 2019-10-15 08:53:29 -05:00
Jim Miller
400b81720f Bump Release Version 3.12.0 2019-10-14 10:23:36 -05:00
Jim Miller
d4cf0242e0 Change spacebattles to xenforo2 & Bump Test Version 3.11.9 2019-10-12 08:57:16 -05:00
Jim Miller
b5210fa2ba Adding xf2test.spacebattles.com -- seems to work fine. 2019-10-11 21:06:54 -05:00
Jim Miller
7ff6a8bbb7 Add defaults.ini CSS for ficbook.net to preserve line breaks. 2019-10-09 10:08:49 -05:00
Jim Miller
86895fa405 Update translations. 2019-10-08 13:17:01 -05:00
Jim Miller
20299506b8 Bump Test Version 3.11.8 2019-10-04 12:28:44 -05:00
Jim Miller
9647f98de5 Correct AO3 author parsing for high-byte characters. 2019-10-04 10:31:50 -05:00
Jim Miller
def9a64aa9 Added messages for translation--email tags feature. 2019-10-04 10:31:23 -05:00
Jim Miller
90edc483e2 Bump Test Version 3.11.7 2019-10-01 20:10:47 -05:00
Jim Miller
defa942a29 Additional warnings about imaptags added to manually added story URLs. Plus confirm(show_cancel_button=False) where appropriate. 2019-10-01 20:10:28 -05:00
Jim Miller
6f6513a1eb Bump Test Version 3.11.6 2019-09-28 21:49:00 -05:00
Jim Miller
c7b9d60500 Fix for adapter_trekfanfictionnet URL change. Closes #433. 2019-09-28 21:48:42 -05:00
Jim Miller
f88ec2b9aa Bump Test Version 3.11.5 2019-09-16 12:47:04 -05:00
Jim Miller
9cd8e20cd9 Add 'Restricted to Registered Users'(restricted) metadata to AO3. 2019-09-16 12:46:35 -05:00
Jim Miller
27c9bede73 Bump Test Version 3.11.4 2019-09-15 12:29:15 -05:00
Jim Miller
63d3d235f2 Fix for unicode chars in url params, fixes 427 2019-09-15 12:28:58 -05:00
Jim Miller
0066929558 Remove outdated xf2test testing code. 2019-09-15 11:41:58 -05:00
Jim Miller
c2fabce099 Bump Test Version 3.11.3 2019-09-12 10:37:33 -05:00
Jim Miller
fbf6a48020 Add feature to automatically add optional tags for stories downloaded from email URLs. 2019-09-12 10:37:12 -05:00
Jim Miller
38f4a4fd78 Change normalize_text_links and internalize_text_links to default on. Closes #426 2019-09-09 12:33:37 -05:00
Jim Miller
a925afb5d8 Bump Test Version 3.11.2 2019-09-09 09:02:48 -05:00
Jim Miller
9804aaf4d6 Merge branch 'gunmetal313-add_swi_org_ru_support' 2019-09-09 09:01:26 -05:00
Jim Miller
86960b2b10 Add use_pagecache() to adapter_swiorgru. 2019-09-09 09:01:06 -05:00
Ivan Kulikov
0d5099dbe4 adapter_swiorgru: issues was fixed. (metadata parsing was added, adult check was added) 2019-09-09 08:51:46 +03:00
Ivan Kulikov
48972cb5f2 New Site: http://www.swi.org.ru/ - Stories only. 2019-09-08 23:03:30 +03:00
Jim Miller
b5ed99cc1f Bump Test Version 3.11.1 2019-09-05 14:06:03 -05:00
Jim Miller
a2a3a01cde Fix for wordcount in adapter_fanfiktionde when max_zalgo:1 2019-09-05 14:05:44 -05:00
Jim Miller
2d9738f13d Bump Release Version 3.11.0 2019-09-04 10:44:56 -04:00
Jim Miller
03f930cf82 Update translations. 2019-09-04 10:43:58 -04:00
Jim Miller
282af12792 Bump Test Version 3.10.11 2019-08-23 22:40:02 -05:00
Jim Miller
0d71e3afe8 Add URL prefix for XF2 authorUrl if relative. 2019-08-23 22:38:52 -05:00
Jim Miller
abf3476eea Bump Test Version 3.10.10 2019-08-19 12:20:22 -05:00
Jim Miller
60763b8156 Add fetch_last_page for base_xenforo--SB doesn't send notice emails if user not up-to-date now. 2019-08-19 12:15:55 -05:00
Jim Miller
5c6ba3e62d Add dedup_chapter_list option for buggy chapter lists. Optional in case they're not buggy. 2019-08-15 14:38:14 -05:00
Jim Miller
95321588e4 Bump Test Version 3.10.9 2019-08-03 11:04:58 -05:00
Jim Miller
ef88db8c1a Add debug output for sucessful ebook write. As per #422. 2019-08-02 17:40:53 -05:00
Jim Miller
60e7ca6858 Bump Test Version 3.10.8 2019-08-01 15:23:47 -05:00
Jim Miller
109dd7f565 Fix for XF1 regression caused by XF2 threadmarks metadata code. 2019-08-01 14:49:49 -05:00
Jim Miller
726c836587 Fix for corner-case with deleting Rejects. 2019-08-01 09:21:01 -05:00
Jim Miller
4a75cc1526 Bump Test Version 3.10.7 2019-07-31 14:40:58 -05:00
Jim Miller
0e1ae124bf Fix for use_threadmarks_description ini typo. 2019-07-30 14:38:35 -05:00
Jim Miller
9532e4de5c Merge branch 'cli' 2019-07-30 14:35:55 -05:00
Jim Miller
5ac11e19fc Bump Test Version 3.10.6 2019-07-30 14:30:29 -05:00
Jim Miller
2397fdaaea Add XF2 threadmarks_cover/status/desc/title options. 2019-07-30 10:05:21 -05:00
Jim Miller
0e97cd7d99 Consolidate some XF html cleanup. 2019-07-30 09:13:42 -05:00
Jim Miller
b25769109a Bump Test Version 3.10.5 2019-07-30 09:06:56 -05:00
Jim Miller
000dc270ad Fix adapter_inkbunnynet incorrect example URL, caused problems with get URLs from page. 2019-07-30 08:59:19 -05:00
Jim Miller
47fdb9d83e Bump Test Version 3.10.4 2019-07-29 10:29:54 -05:00
Jim Miller
158d396734 Add a little more fail-safe for IMAP folder list. 2019-07-29 10:29:00 -05:00
Jim Miller
0f4b852a4b Bump Test Version 3.10.3 2019-07-28 15:06:05 -05:00
Jim Miller
7fdca5e7b6 Fix for IMAP folder parsing. 2019-07-28 15:05:52 -05:00
Jim Miller
337aadc1fe Bump Test Version 3.10.2 2019-07-28 09:59:45 -05:00
Jim Miller
5da0f20593 Add some debugging output to imap fetch. 2019-07-28 09:59:33 -05:00
Jim Miller
dcfcd112e1 Bump Test Version 3.10.1 2019-07-27 17:39:59 -05:00
Jim Miller
14f9b5b82b ensure_str on IMAP4 folder name for Python3. Fixes #419 2019-07-27 17:39:30 -05:00
Jim Miller
955eb1a2d3 Bump Release Version 3.10.0 2019-07-27 09:42:46 -05:00
Jim Miller
67f72497ea Update Translations comments 2019-07-27 09:41:51 -05:00
Jim Miller
667b784dca Bump Test Version 3.9.10 2019-07-25 12:17:29 -05:00
Jim Miller
e7e986f8d5 Additional error checking and output for IMAP4 fetch. 2019-07-25 12:17:05 -05:00
Jim Miller
105efcf395 Finish inclusion of always_include_first_post_chapters feature. 2019-07-24 18:36:56 -05:00
Jim Miller
4d6cc75b9b Bump Test Version 3.9.9 2019-07-22 16:50:38 -05:00
Jim Miller
cf6366dab4 Add always_include_first_post_chapters to base_xenforoforum_adapter 2019-07-22 16:49:33 -05:00
Jim Miller
69ecfb0837 XF posts can include a tags without href, look for href searching for links. Closes #417 2019-07-22 16:45:45 -05:00
Jim Miller
933c7b43a4 Add work around to fix XF2 issue with multiple '...' in threadmarks. 2019-07-22 16:39:04 -05:00
Jim Miller
b90bd847d2 Bump Test Version 3.9.8 2019-07-14 10:28:54 -05:00
Jim Miller
23400f9557 Tweak getting URLs from email for SV XF2 changes, don't get post URLs. 2019-07-14 10:28:04 -05:00
Jim Miller
1e247571ec Update Translations. 2019-07-14 00:56:51 -05:00
Jim Miller
777efc59fd Bump Test Version 3.9.7 2019-07-13 12:02:25 -05:00
Jim Miller
a7ebdb78ab Fix XF2 elided threadmarks fetch. 2019-07-13 10:20:17 -05:00
Jim Miller
ada3f5ed2a [base_xenforo2forum] should override [base_xenforoforum], not vice versa. 2019-07-13 09:31:23 -05:00
Jim Miller
4ec21b145f Add [base_xenforo2forum] to defaults.ini. 2019-07-13 09:24:49 -05:00
Jim Miller
9228a8c262 adapter_forumssufficientvelocitycom switched over to use base_xenforo2forum_adapter. 2019-07-13 09:24:49 -05:00
Jim Miller
23a1e6ef21 xf2test site 2019-07-13 09:24:48 -05:00
Jim Miller
cc6ccff2b4 Fix for reportorder results when not BG'ed. 2019-07-13 09:22:55 -05:00
Jim Miller
a4104d09b5 Bump Test Version 3.9.6 2019-07-12 21:54:48 -05:00
Jim Miller
ac72c19676 Plugin: Group reported results better. 2019-07-12 21:54:05 -05:00
Jim Miller
10ec0e7bb0 Bump Test Version 3.9.5 2019-07-11 20:35:22 -05:00
Jim Miller
c8c9520dd5 Adjust reader_posts_per_page for forum.questionablequesting.com. 2019-07-11 20:30:54 -05:00
Jim Miller
269e3d44b9 Bump Test Version 3.9.4 2019-07-11 19:41:28 -05:00
Jim Miller
420b5e3331 Add a de-dup check to XF threadmark collection due to at least one SB story having TM bug. 2019-07-11 19:41:08 -05:00
Jim Miller
2698a61231 Fix for XF post URLs with quotes in. 2019-07-07 19:28:57 -05:00
Jim Miller
59d6d121c4 Remove BS version debug so it isn't printed once per adapter. 2019-07-06 09:18:08 -05:00
Jim Miller
8025ba06e4 Bump Test Version 3.9.2 2019-07-05 08:21:44 -05:00
Jim Miller
b4e58a41c4 adapter_wuxiaworldco: Don't include grayed out 'In preparation' chapters 2019-07-05 08:21:08 -05:00
Eli Schwartz
a81ac5312c setup.py: declare an optional dependency on Pillow
Needed in order to advertise support for image processing, which will
gracefully fallback on doing nothing, but would ideally let you know via
pip that you should actually install the additional feature.
2019-07-03 13:56:02 -05:00
Eli Schwartz
ff1bd22193 README.md: make it a little more descriptive 2019-07-03 13:56:02 -05:00
Eli Schwartz
1a7883e56c Add Arch Linux installation instructions. 2019-07-03 13:56:02 -05:00
Jim Miller
35fa48da02 Bump Test Version 3.9.1 2019-06-26 15:22:18 -05:00
Jim Miller
6c3b243d12 Move BS version debug so it doesn't appear in CLI without -d. 2019-06-26 15:21:14 -05:00
Jim Miller
cd70d20b46 Bump Release Version 3.9.0 2019-06-25 13:21:19 -05:00
Jim Miller
e6f228c091 Bump Test Version 3.8.13 2019-06-22 15:13:21 -05:00
Jim Miller
60651cb15b Fix for a corner-case global_cache written by py2, then read by py3 bug. 2019-06-22 15:12:28 -05:00
Jim Miller
868e120ffc Fix for regression on XF not-first index post. 2019-06-22 15:08:33 -05:00
Jim Miller
33838ba887 Bump Test Version 3.8.12 2019-06-16 12:48:30 -05:00
Jim Miller
c8e7d4fbfa Include soupsieve & backports in PI zip. 2019-06-16 12:45:54 -05:00
Jim Miller
ed64da557e Include backports for soupsieve for web service/calibre 2.85.1. 2019-06-16 12:40:43 -05:00
Jim Miller
4992c952c9 Bump Test Version 3.8.11 2019-06-16 12:08:00 -05:00
Jim Miller
ed4e366922 Include soupsieve for bs4.7.1 on web service. 2019-06-16 12:08:00 -05:00
Jim Miller
44b4385847 Mark included bs4 version as fff. 2019-06-16 11:45:06 -05:00
Jim Miller
f234cd2e78 Update included bs4 to 4.7.1 2019-06-16 11:40:53 -05:00
Jim Miller
c38bfcf1da Update included six.py to 1.12. 2019-06-16 11:35:46 -05:00
Jim Miller
29ad115d6e Update translations. 2019-06-16 11:11:01 -05:00
Jim Miller
753f1f34ab Bump Test Version 3.8.10 2019-06-15 19:15:27 -05:00
oh45454545
54c0177b15 adapter_asianfanficscom foreword json fix 2019-06-15 19:14:55 -05:00
Jim Miller
e46942ccde Bump Test Version 3.8.9 2019-06-15 11:11:11 -05:00
Jim Miller
59ca64719e
Merge pull request #405 from oh45454545/master
adapter_asianfanficscom json fixes
2019-06-15 11:10:36 -05:00
oh45454545
67c1d1808a adapter_asianfanficscom json fixes 2019-06-15 17:32:28 +02:00
Jim Miller
a69100ba52 Bump Test Version 3.8.8 2019-06-13 13:57:05 -05:00
Jim Miller
46645c5b93 Change metadata separator to ' & ' when filling 'Contains names' custom columns. 2019-06-13 13:54:48 -05:00
Jim Miller
c7b7b393e3 Bump Test Version 3.8.7 2019-06-10 12:12:09 -05:00
Jim Miller
dcec57c09a Fix for adapter_asianfanficscom change for views metadata 2019-06-10 12:07:19 -05:00
Jim Miller
f8d5d9fb07 Bump Test Version 3.8.6 2019-06-08 08:30:39 -05:00
Jim Miller
cbcfd32464 Fix for yet more arbitrary py3 changes breaking code. Fixes #400 2019-06-08 08:26:25 -05:00
Jim Miller
f4d05d4f24 Bump Test Version 3.8.5 2019-06-06 14:52:03 -05:00
Jim Miller
dbe5f6a98d Fix for site change adapter_novelonlinefullcom 2019-06-06 14:51:44 -05:00
Jim Miller
cff4dca5e6 Bump Test Version 3.8.4 2019-06-05 12:18:46 -05:00
Jim Miller
adab8a6a59 Fix for some(older?) adapter_asianfanficscom stories not have json links. 2019-06-05 12:18:24 -05:00
Jim Miller
831c6778d8 More fixes for adapter_asianfanficscom site now giving different HTML to FFF. 2019-06-05 12:14:09 -05:00
Jim Miller
5f97e5b0e2 Bump Test Version 3.8.3 2019-06-04 20:10:32 -05:00
Jim Miller
5b3d8c4377 Fixes for site changes adapter_asianfanficscom 2019-06-04 20:10:18 -05:00
Jim Miller
8b99a042fc Bump Test Version 3.8.2 2019-06-03 14:30:35 -05:00
Jim Miller
739cc681d5 Comment out some debugs. 2019-06-03 14:30:10 -05:00
Jim Miller
b9a8a648d9 Fix for extended chars in text email imap fetchs on py3. 2019-06-03 14:27:45 -05:00
Jim Miller
05411ee451 Show debug output BeautifulSoup version. 2019-06-03 12:57:10 -05:00
Jim Miller
5e1ce3c45a adapter_mediaminerorg: Fix for not finding a block in chapter download. 2019-05-29 11:28:30 -05:00
Jim Miller
a4127aee18 Fix for newer BS4/soupsieve enforcing CSS selector rules about :/ chars. 2019-05-29 11:23:15 -05:00
Jim Miller
3b3e951100 Bump Test Version 3.8.1 2019-05-29 09:12:38 -05:00
Jim Miller
40e9d6014a Fix for site date change adapter_wuxiaworldcom 2019-05-29 08:58:58 -05:00
Jim Miller
e4a5adceb6 Bump Release Version 3.8.0 2019-05-27 14:22:10 -05:00
Jim Miller
a968f73db5 Update translations. 2019-05-27 14:21:24 -05:00
Jim Miller
a1a49effad Bump Test Version 3.7.18 2019-05-18 16:20:57 -05:00
Jim Miller
64f00bdb97 Update translations 2019-05-18 16:20:35 -05:00
Jim Miller
90a1375603 Fix for adapter_asianfanficscom site change. 2019-05-18 16:19:45 -05:00
Jim Miller
abda36cd92 Fix minor code copy/paste error. 2019-05-17 08:43:10 -05:00
Jim Miller
9d2025afed Bump Test Version 3.7.17 2019-05-16 15:13:35 -05:00
Jim Miller
741bc126d2 Fix title casing for base_xenforo feature capitalize_forumtags. 2019-05-16 15:12:41 -05:00
Jim Miller
dffb4d3168 Add 'publisher' metadata as a copy of 'site' and use to fill Publisher in Calibre. 2019-05-16 14:54:33 -05:00
Jim Miller
ab6684c1ad Bump Test Version 3.7.16 2019-05-14 15:39:23 -05:00
Jim Miller
0763d16d3e Fix for adapter_storiesonlinenet not detecting login failure. 2019-05-14 15:39:08 -05:00
Jim Miller
2978f9ece8 Bump Test Version 3.7.15 2019-05-14 13:15:57 -05:00
Jim Miller
7e80dacd92 Removing all filter()/map() calls--not consistent between Py2/Py3. 2019-05-14 13:12:54 -05:00
Jim Miller
881c55026f Fixes for adapter_quotevcom due to site changes. 2019-05-13 15:21:22 -05:00
Jim Miller
45d5728ae2 Some fixes for Russian language adapter_masseffect2in. 2019-05-13 14:15:40 -05:00
Eli Schwartz
c8362433e2 optimization: do not use filter just to get counts of a list
This breaks in python3, because filter returns an iterable filter
object, and it's anyways wrong on python2, because it does extra work to
create a list only to discard it. The best way to do this is to test
each member of the list and abort early as soon as we get the info we
need (that it is non-zero). Or alternatively use sum with a generator
expression.
2019-05-13 13:06:35 -05:00
Jim Miller
012d12fb52 Bump Test Version 3.7.14 2019-05-12 10:47:19 -05:00
Jim Miller
055c051807 adapter_finestoriescom shares code with adapter_storiesonlinenet, also do datetimes. 2019-05-12 10:46:13 -05:00
Jim Miller
26733f2651 Fixes for adapter_storiesonlinenet site changes--login and use datetime by default because one was missing date-only. 2019-05-12 10:39:27 -05:00
Jim Miller
c8b7c4412e Comment out some debugs in XF/XF2. 2019-05-09 22:17:29 -05:00
Jim Miller
469410cb94 Bump Test Version 3.7.13 2019-05-09 14:13:23 -05:00
Jim Miller
3375632f2c Yet another fix for tagsfromtitle on base_xenforo. Don't put whole title as a tag when no [( in title. 2019-05-09 14:13:05 -05:00
Jim Miller
66c1d8ffcf Bump Test Version 3.7.12 2019-05-08 14:20:39 -05:00
Jim Miller
b8e3f33646 Fix for base_xenforo's tagsfromtitle needing to be split. 2019-05-08 14:20:18 -05:00
Jim Miller
88c38d5d85 Bump Test Version 3.7.11 2019-05-07 11:15:23 -05:00
Jim Miller
8676649e0b Bump Test Version 3.7.10 2019-05-07 11:15:23 -05:00
Jim Miller
0b0703457c Disable xf2test before posting test version. 2019-05-07 11:15:12 -05:00
Jim Miller
a8bdc69cea Merge branch 'SV-xenforo2' 2019-05-07 11:07:41 -05:00
Jim Miller
38e1e33cb2 base_xenforoforum: include forumtags in genre and tagsfromtitle in category instead of including both in subject_tags. 2019-05-07 10:57:18 -05:00
Jim Miller
2099de432a Don't hardcode extratags into subject tags--it's in include_subject_tags in defaults.ini. 2019-05-07 09:40:21 -05:00
Jim Miller
0bbe27b9e9 Fix for anthology titles (and generate cover settings) not needing encode() anymore. 2019-05-06 14:15:58 -05:00
Jim Miller
08bed37eba Fix for series contain 'collection from' adapter_storiesonlinenet. 2019-05-06 14:08:57 -05:00
Jim Miller
448eeeee46 Add unicode series test cases to adapter_test1. 2019-05-06 14:06:52 -05:00
Jim Miller
eb9b8aebb7 Fix finding threadmarks for XF2. Developing to a moving target. 2019-05-02 14:42:59 -05:00
Jim Miller
4c20a843c7 Fix login for XF2 2019-05-02 14:40:12 -05:00
Jim Miller
e745962ff4 Fix ini sections for XF2. 2019-05-02 13:26:51 -05:00
Jim Miller
01e34ca0eb Fix forumtags for XF2 changes. 2019-05-01 09:06:24 -05:00
Jim Miller
2a95105837 Fixes for XF2 for Python3 being tetchy. 2019-04-30 20:45:03 -05:00
Jim Miller
7d35b642fd Fix replace_failed_smilies_with_alt_text for XF2. 2019-04-30 19:37:32 -05:00
Jim Miller
a6ec8fd1d8 Fix XF2 quotes and a new /post-999 form story URL. 2019-04-30 17:22:33 -05:00
Jim Miller
936a2409b4 Xenforo: Move lazyload img code to all be in one place. 2019-04-30 15:16:16 -05:00
Jim Miller
e288691d1c Fix for needing first post XF1/2. 2019-04-30 15:05:07 -05:00
Jim Miller
1a5c8b02d0 Fix spoiler tags for XF2, add base_xenforo2forum ini section. 2019-04-30 14:21:22 -05:00
Jim Miller
de66ca06ae XF2 fixes, put some debugs back in. 2019-04-24 14:39:57 -05:00
Jim Miller
a56f42982c Comment out some debugs. 2019-04-24 13:15:50 -05:00
Jim Miller
cf99d82e30 Refactor XF1 XF2 to consolidate logic. 2019-04-24 13:07:22 -05:00
Jim Miller
1c42040885 XF2 largely working, but still not perfect. 2019-04-24 11:13:57 -05:00
Jim Miller
c79cb4e450 Progressing on XF2, still incomplete. 2019-04-24 11:13:57 -05:00
Jim Miller
9a84b747e5 Fix XF2 author parse and feature-proof threadmark category ordering. 2019-04-24 11:13:57 -05:00
Jim Miller
e53e2bfbe5 Incomplete test version for xenforo2 on xf2test.sufficientvelocity.com. 2019-04-24 11:13:57 -05:00
Jim Miller
ee48decec5 Bump Test Version 3.7.6 2019-04-23 22:29:17 -05:00
Jim Miller
777a07a019 Fix for BS halping with string conversions on PI update from Saved Meta Column. 2019-04-23 22:29:00 -05:00
Jim Miller
5fe41d9b82 Bump Test Version 3.7.5 2019-04-23 19:48:46 -05:00
Jim Miller
9689a627d0 Merge branch 'AFF-fixes' 2019-04-23 19:47:58 -05:00
Jim Miller
87adf8f4e2 Require login for adapter_asianfanficscom, except when already logged in. Tweak not-sub'ed msg too. 2019-04-23 19:35:17 -05:00
oh45454545
2b0d053c54 login now required 2019-04-24 01:37:18 +02:00
oh45454545
4976611375 whitespace 2019-04-23 20:45:16 +02:00
oh45454545
66b683a6bc further fixes 2019-04-23 20:43:53 +02:00
Jim Miller
b249a05720 Remove hookFinalCleanup() from base_adapter--isn't going to do what I wanted. 2019-04-23 13:27:22 -05:00
Jim Miller
8038df7921 Add hookFinalCleanup() to base_adapter 2019-04-23 13:06:56 -05:00
oh45454545
25eee7d314 missed a login check 2019-04-23 02:58:00 +02:00
oh45454545
6403ca3bff adapter_asianfanficscom: fix missing HTML tags by automatically subscribing to story 2019-04-23 02:47:44 +02:00
Jim Miller
7a81dec11f Bump Test Version 3.7.4 2019-04-22 18:23:44 -05:00
Jim Miller
4c2bcb32da Merge branch 'oh45454545-AFF-fixes' 2019-04-22 18:22:40 -05:00
Jim Miller
ebf506a71e Remove a little extra whitespace. 2019-04-22 18:22:12 -05:00
oh45454545
8f260451bd that line should be there 2019-04-22 22:25:50 +02:00
oh45454545
203523a770 adapter_asianfanficscom fixes 2019-04-22 21:48:02 +02:00
Jim Miller
a6fb0fcc7f Bump Test Version 3.7.3 2019-04-21 21:58:05 -05:00
Jim Miller
5f5bde42d9 Fix use_archived_author in AO3. 2019-04-21 21:57:47 -05:00
Jim Miller
8a9ed58585 Bump Test Version 3.7.2 2019-04-21 16:30:26 -05:00
Jim Miller
b4c90dd02a Ad wall indicator has changed for adapter_webnovelcom. 2019-04-21 16:30:01 -05:00
Jim Miller
021ef30647 Bump Test Version 3.7.1 2019-04-21 15:09:20 -05:00
Jim Miller
7590dd8003 Update adapter_asianfanficscom to fetch chapter texts from JSON url. 2019-04-21 14:31:01 -05:00
Jim Miller
c7d85985a7 Bump Release Version 3.7.0 2019-04-19 14:08:56 -05:00
Jim Miller
42bd0b221d Update Translations. 2019-04-19 14:08:13 -05:00
Jim Miller
779d59885e Bump Test Version 3.6.9 2019-04-15 12:35:28 -05:00
Jim Miller
bbb417acc3 Add remove_authorfootnotes_on_update feature for AO3. 2019-04-15 12:34:23 -05:00
Jim Miller
65975828e3 Bump Test Version 3.6.8 2019-04-14 15:05:43 -05:00
Jim Miller
3e5de53984 More fixing for bool metadata values--convert to string when set. 2019-04-14 15:05:27 -05:00
Jim Miller
61db6d248d Bump Test Version 3.6.7 2019-04-13 21:24:09 -05:00
Jim Miller
4d823ec7e2 Fix for PI saved metadata not reading False & empty strings. 2019-04-13 21:23:47 -05:00
Jim Miller
52e3178300 Bump Test Version 3.6.6 2019-04-13 16:12:30 -05:00
Jim Miller
2dadedecec Add bookmarked site specific metadata for adapter_archiveofourownorg. 2019-04-13 16:02:22 -05:00
Jim Miller
9de8a5f7e2 Bump Test Version 3.6.5 2019-04-13 08:46:14 -05:00
Jim Miller
f165c8b0f9 Fix for saved custom column metadata and boolean values. 2019-04-12 21:30:27 -05:00
Jim Miller
0229fab8b4 Correct a comment in defaults.ini. 2019-04-12 13:39:33 -05:00
Jim Miller
4afdc269e7 Bump Test Version 3.6.4 2019-04-10 11:04:02 -05:00
Jim Miller
10fa02be11 Include status 'Hiatus' for adapter_royalroadcom. 2019-04-10 11:03:45 -05:00
Jim Miller
e0d83ce545 Update a comment link in setup.py. 2019-04-06 19:45:22 -05:00
Jim Miller
640b13e074 Update Translations. 2019-04-06 10:34:05 -05:00
Jim Miller
b8da3c9722 Bump Test Version 3.6.3 2019-04-06 10:32:08 -05:00
Jim Miller
befb6e0144 Closes #390 - RoyalRoad click link in emails. 2019-04-06 10:09:55 -05:00
Jim Miller
3975a37302 Add another StoryDoesNotExist string for adapter_fanfictionnet 2019-04-06 10:08:40 -05:00
Jim Miller
28bf3a35b8 Bump Test Version 3.6.2 2019-03-26 11:08:28 -05:00
Jim Miller
3c72dee8e8 Fix some comments. 2019-03-24 14:03:24 -05:00
Jim Miller
198c5d9ffc Add debug output for encoding used. 2019-03-24 13:27:09 -05:00
Jim Miller
7a3e99db9d Comment out some old debugs. 2019-03-24 13:22:23 -05:00
Jim Miller
e9fe09d545 Fix date format for adapter_gluttonyfictioncom 2019-03-22 18:39:57 -05:00
Jim Miller
2674aa2ee2 Remove ncisfic.com -- moved to AO3. 2019-03-22 18:33:35 -05:00
Jim Miller
4b938998a3 Recognize destinysgateway.com and www.destinysgateway.com 2019-03-18 12:22:30 -05:00
Jim Miller
202532cbd2 Bump Test Version 3.6.1 2019-03-18 12:11:39 -05:00
Jim Miller
720377d476 Revert "Remove defunct site www.destinysgateway.com"
This reverts commit 6e6055a77b.
2019-03-18 12:08:36 -05:00
Jim Miller
e167efa4d9 Bump Release Version 3.6.0 2019-03-12 10:28:29 -05:00
Jim Miller
f97025a15e Bump Test Version 3.5.7 2019-03-11 19:33:15 -05:00
Jim Miller
0f8b601db6
Merge pull request #384 from JimmXinu/jsonmeta
Make CLI -j/--json-meta option work on download/update.
2019-03-11 19:32:21 -05:00
Jim Miller
df25a2c768 Bump Test Version 3.5.6 2019-03-11 13:45:35 -05:00
Jim Miller
29ada606c5 Fix & improve universe_as_series for adapter_storiesonlinenet 2019-03-11 13:45:06 -05:00
Jim Miller
57c2e72b07 Bump copyright year. 2019-03-11 13:20:13 -05:00
Jim Miller
cd44d68a8d Update translations 2019-03-08 16:54:08 -06:00
Jim Miller
fe3f42a869 Bump Test Version 3.5.5 2019-03-08 16:47:42 -06:00
Jim Miller
3fb8cba23e Update adapter_fictionhuntcom for stories with no chapters. 2019-03-08 15:11:30 -06:00
Jim Miller
5a224ddb63 Bump Test Version 3.5.4 2019-03-08 15:06:26 -06:00
Jim Miller
ee172fa0af Update adapter_fictionhuntcom for significant site changes. 2019-03-08 15:01:03 -06:00
Jim Miller
d3a0626e02 Make CLI -j/--json-meta option work on download/update as well as -m/--meta-only. 2019-03-05 19:58:40 -06:00
Jim Miller
8188c0013b Bump Test Version 3.5.3 2019-03-03 13:07:03 -06:00
Jim Miller
4a294f35a6 Collect ships and description in adapter_harrypotterfanfictioncom. 2019-03-03 13:06:47 -06:00
Jim Miller
ea8a072fd2 Bump Test Version 3.5.2 2019-02-21 14:07:59 -06:00
Jim Miller
42d3ae121e Use https for adapter_wuxiaworldco. 2019-02-21 14:07:42 -06:00
Jim Miller
6105518324 Bump Test Version 3.5.1 2019-02-13 09:24:23 -06:00
Jim Miller
6093c2eb02 Some AO3 stories don't have fandom tags. 2019-02-13 09:24:02 -06:00
Jim Miller
25a380c218 Bump Release Version 3.5.0 2019-02-11 19:50:50 -06:00
Jim Miller
479bf1b50e Bump Test Version 3.4.7 2019-02-06 09:52:35 -06:00
Jim Miller
a2024be25f Remove firefly.populli.org -- now on AO3. 2019-02-06 09:45:01 -06:00
Jim Miller
b097545d91 Remove fhsarchive.com -- now on AO3. 2019-02-06 09:43:08 -06:00
Jim Miller
b7828d7c23 Update Translations. 2019-02-06 09:39:48 -06:00
Jim Miller
326ece76e4 Bump Test Version 3.4.6 2019-02-05 09:49:29 -06:00
Jim Miller
4df5027fda Add www.mugglenetfanfiction.com as replacement for removed fanfiction.mugglenet.com. 2019-02-05 09:46:50 -06:00
Jim Miller
cbb48d2406 Bump Test Version 3.4.5 2019-02-04 18:41:26 -06:00
Jim Miller
45abef86d7 Fix for int(pages) in adapter_adultfanfictionorg. Closes #375 2019-02-04 18:41:06 -06:00
Jim Miller
c4fc2afd14 Bump Test Version 3.4.4 2019-01-18 09:29:06 -06:00
Jim Miller
7ba0979f1e
Merge pull request #372 from Rikkitp/webnovel_author_fix
Fix author parsing at webnovelcom
2019-01-18 09:28:37 -06:00
Rikkitp
d01a3c1187 Fix author parsing at webnovelcom 2019-01-18 15:32:43 +03:00
Jim Miller
882bad50d7 Bump copyright years, comment out some debugs. 2019-01-17 13:02:43 -06:00
Jim Miller
3023dc035c Bump Test Version 3.4.3 2019-01-17 12:54:03 -06:00
Jim Miller
5e0a036814 Remember original href in data-orighref attr with epub internalize_text_links so inserted 'earlier' chapters don't break internal links. 2019-01-17 12:49:34 -06:00
Jim Miller
3488d35c1f Bump Test Version 3.4.2 2019-01-15 10:56:09 -06:00
Jim Miller
188d3420f6 Add always_login setting to base_xenforo for SV login-required story with 404 result. 2019-01-15 10:55:45 -06:00
Jim Miller
d423100e6b Update copyright. 2019-01-15 10:38:53 -06:00
Jim Miller
5ecf4a0ac3 Bump Test Version 3.4.1 2019-01-09 12:22:59 -06:00
Jim Miller
d1849d807c Ignore current Virtual Library when checking for existing story ID. 2019-01-09 12:21:15 -06:00
Jim Miller
5c324fc5b6 m.wuxiaworld.co == www.wuxiaworld.co 2019-01-08 19:02:21 -06:00
Jim Miller
37aab6a4f5 Bump Release Version 3.4.0 2019-01-07 16:07:06 -06:00
Jim Miller
fd43c98b96 Fix defaults.ini add_to_titlepage_entries for hentai-foundry.com 2019-01-04 11:20:54 -06:00
Jim Miller
cd697382c2 Bump Test Version 3.3.9 2019-01-02 11:52:17 -06:00
Jim Miller
1ad765875e Fix metadata parsing for adapter_siyecouk 2019-01-02 11:50:35 -06:00
Jim Miller
83255da41e Bump Test Version 3.3.8 2019-01-01 15:14:32 -06:00
Jim Miller
e211958904 Fix adapter_whoficcom for site changes. 2019-01-01 15:14:14 -06:00
Jim Miller
f2de6f6c02 Update translations. 2018-12-29 21:53:47 -06:00
Jim Miller
d2f6d11202 Bump Test Version 3.3.7 2018-12-29 20:54:32 -06:00
Jim Miller
32935e507d Fix AO3 logout url used to detect when already logged in. 2018-12-29 20:54:15 -06:00
Jim Miller
eea19a0f5d Bump Test Version 3.3.6 2018-12-29 13:32:39 -06:00
Jim Miller
0387bd9e7e Add conditionals_use_lists(default:true) setting for replace_metadata & Include/Exclude metadata conditionals. Might change existing behavior for some users. 2018-12-29 13:32:16 -06:00
Jim Miller
047eb9c37e Remove outdated comment. 2018-12-29 12:21:38 -06:00
Jim Miller
15eaaa7e11 Allow ==, !=, =~ and !~ in replace_metadata conditionals like in/exclude_metadata. 2018-12-29 12:18:09 -06:00
Jim Miller
0f113df946 Additional metadata edit recursion proofing. 2018-12-29 11:46:27 -06:00
Jim Miller
59e05c53de Bump Test Version 3.3.5 2018-12-29 10:51:59 -06:00
Jim Miller
84e8ca85f3 Allow animated gifs through Calibre Image proc -- need to remove gif.py later. 2018-12-29 10:51:39 -06:00
Jim Miller
7f689568ac Update some comments in adapter_archiveofourownorg 2018-12-28 21:51:19 -06:00
Jim Miller
b6512cef24 Bump Test Version 3.3.4 2018-12-28 20:53:04 -06:00
Jim Miller
e86e69c233 Additional fix for AO3 login site changes. 2018-12-28 20:40:25 -06:00
Jim Miller
f7bc9d01fb Bump Test Version 3.3.3 2018-12-27 12:23:03 -06:00
Jim Miller
f49f81a90d Fix for AO3 login site changes. 2018-12-27 12:22:48 -06:00
Jim Miller
426ce6ac8e Bump Test Version 3.3.2 2018-12-26 12:11:59 -06:00
Jim Miller
787ac24f78 Update translations 2018-12-26 12:10:36 -06:00
Jim Miller
949d1e73ea Force Translation Update 2018-12-25 23:36:36 -06:00
Jim Miller
2e6e807a3a Force Translation Update 2018-12-25 13:02:14 -06:00
Jim Miller
dec7d6cbdc Force Translation Update 2018-12-25 08:50:53 -06:00
Jim Miller
82935667da Force Translation Update 2018-12-23 15:04:48 -06:00
Jim Miller
df44b2c41d Bump Test Version 3.3.1 2018-12-22 19:57:20 -06:00
Jim Miller
75e09e20c6 Add bookmarkprivate and bookmarkrec site-specific metadata to adapter_archiveofourownorg 2018-12-22 19:57:02 -06:00
Jim Miller
ea794c39c2 Bump Release Version 3.3.0 2018-12-18 12:55:00 -06:00
Jim Miller
06997cd5d8 Bump Test Version 3.2.6 2018-12-13 17:21:04 -06:00
Jim Miller
3ed1730819 Accept archiveofourown.com for archiveofourown.org. archiveofourown.org remains the 'canonical' domain. 2018-12-13 17:20:46 -06:00
Jim Miller
ca742585e5 Update translations. 2018-12-10 14:37:04 -06:00
Jim Miller
e846a19ea7 Bump Test Version 3.2.5 2018-12-10 14:33:45 -06:00
Jim Miller
ded6f59c79 Workaround for adapter_hentaifoundrycom bad dateUpdated value. 2018-12-10 11:33:11 -06:00
Jim Miller
2ac2f8a1eb Bump Test Version 3.2.4 2018-12-05 18:13:01 -06:00
Jim Miller
737aedd12f New Site: www.hentai-foundry.com - Stories only. 2018-12-05 18:12:04 -06:00
Jim Miller
cc48cab5c6 Bump Test Version 3.2.3 2018-12-04 20:17:20 -06:00
Jim Miller
fd246c77c7 Improve recursion-proofing of replace_metadata for performance and duplicate lines. 2018-12-04 20:16:00 -06:00
Jim Miller
407ce79e2c Don't cache file: URLs. Option --save-cache can mask changes to _filelist files while debugging. 2018-12-04 19:13:16 -06:00
Jim Miller
723bdc39fa Don't sleep when fetching file: URLs. 2018-12-04 18:23:45 -06:00
Jim Miller
8df46811b5 Remove defunct site www.artemis-fowl.com 2018-12-04 15:11:01 -06:00
Jim Miller
f803564866 Remove defunct site asr3.slashzone.org 2018-12-04 15:09:42 -06:00
Jim Miller
4306286128 Remove defunct site tolkienfanfiction.com 2018-12-04 15:07:53 -06:00
Jim Miller
397a181952 Remove defunct site fanfiction.tenhawkpresents.com 2018-12-04 15:05:37 -06:00
Jim Miller
882948b07f Remove defunct site unknowableroom.org 2018-12-04 15:03:39 -06:00
Jim Miller
92d028e6b1 Remove defunct site mujaji.net 2018-12-04 15:01:48 -06:00
Jim Miller
ee32a4266e Update adapter_harrypotterfanfictioncom for site change. 2018-12-04 14:56:38 -06:00
Jim Miller
f346437bc1 Bump Test Version 3.2.2 2018-12-03 16:33:38 -06:00
Jim Miller
509609ef96 Change for adapter_storiesonlinenet for 'Full Access' stories. 2018-12-03 16:32:18 -06:00
Jim Miller
5eb1f1f3c0 Bump Test Version 3.2.1 2018-11-26 13:54:45 -06:00
Jim Miller
52570b5059 base_xenforoforum_adapter - Fix for buggy threadmarks on SV thread 46020. 2018-11-26 13:54:22 -06:00
Jim Miller
66b7de9668 Bump Release Version 3.2.0 2018-11-17 09:31:26 -06:00
Jim Miller
ae827a21bc Update example.ini. 2018-11-17 09:23:31 -06:00
Jim Miller
cdbee1e2b3 Bump Test Version 3.1.10 2018-11-16 11:18:27 -06:00
Jim Miller
139976c1d1 Add background_color ini verbiage and check before image conversion. 2018-11-16 11:15:09 -06:00
Jim Miller
be5202e225 Bump Test Version 3.1.9 2018-11-11 10:02:23 -06:00
Jim Miller
e1adbcf128 Refactor busy_cursor, busy cursor around saving rejects & settings, bump copyright years. 2018-11-11 09:58:07 -06:00
Jim Miller
b72c29db9a Move rejectlisturls_data out of settings in to own 'namedspaced' for efficiency. PI Only. 2018-11-11 09:27:34 -06:00
Jim Miller
1bb7c46e9c Use saved rejectlist_data. 2018-11-07 12:58:53 -06:00
Jim Miller
6083764346 Reject list changes: show numbers, edit title/author, incomplete save rejects as data. 2018-11-07 12:58:53 -06:00
Jim Miller
2f7ad102a9 Add feature for manually editing settings JSON. Only shown in debug mode. 2018-11-07 12:58:53 -06:00
Jim Miller
33d057606e Add verbiage to [www.asianfanfics.com] about site censoring words when not logged in. 2018-10-30 10:46:41 -05:00
Jim Miller
2b06e36276 Bump Test Version 3.1.8 2018-10-28 19:52:42 -05:00
Jim Miller
5be7b3e788 Detect 'fake' 404 page (with HTTP 200) for adapter_royalroadcom 2018-10-28 19:52:17 -05:00
Jim Miller
95f2dfc170 Fix defaults.ini order 2018-10-28 14:09:14 -05:00
Jim Miller
0216509724 Bump Test Version 3.1.7 2018-10-27 10:04:20 -05:00
Jim Miller
2c9b27f8e1 Fix for adapter_wuxiaworldco for Python3. 2018-10-27 09:51:10 -05:00
Jim Miller
06eaf89459
Merge pull request #354 from Rikkitp/wuxiaworldco_chapternames
Added volumes to chapter titles adapter_wuxiaworldco
2018-10-25 09:10:17 -05:00
Rikkitp
33171cfd25 Added volumes to chapter titles 2018-10-25 15:38:13 +03:00
Jim Miller
053b38603e Bump Test Version 3.1.6 2018-10-24 10:37:27 -05:00
Jim Miller
777ecee10f
Merge pull request #353 from Rikkitp/wuxiaworldco_description_fix
Fix adapter_wuxiaworldco description decomposition, remove outdated testing `raise` from `configurable.py`
2018-10-24 10:36:42 -05:00
Rikkitp
fd4f68d226 Fix adapter_wuxiaworldco description decomposition 2018-10-24 16:05:25 +03:00
Jim Miller
cf9ad72984 Bump Test Version 3.1.5 2018-10-23 10:03:14 -05:00
Jim Miller
934ad4a584 Fix for adapter_wuxiaworldcom site change. 2018-10-23 10:02:37 -05:00
Jim Miller
4e002d15e0 Minor improvement to adapter_asianfanficscom logging. 2018-10-23 10:02:16 -05:00
Jim Miller
c5019a1eea Bump Test Version 3.1.4 2018-10-22 13:26:19 -05:00
Jim Miller
9062806a33 adapter_fanfiktionde login doesn't use ssl.fanfiktion.de anymore. 2018-10-22 13:08:10 -05:00
Jim Miller
63792a0a97 Bump Test Version 3.1.3 2018-10-21 12:44:14 -05:00
Jim Miller
65623b2f52 Update for adapter_storiesonlinenet(& finestories) for login change. 2018-10-21 12:43:51 -05:00
Jim Miller
7a57de78c4 Use a default chapter name when none is given. Problem with adapter_novelonlinefullcom 2018-10-20 10:15:43 -05:00
Jim Miller
238a554aaa Bump Test Version 3.1.2 2018-10-19 14:03:22 -05:00
Jim Miller
1433576b9d Story URLs from email notice for royalroad contain clicktracker links that redirect to actual story URLs. Hit those to get story URL. Issue #349 2018-10-19 14:03:05 -05:00
Jim Miller
4007f82ce6 Bump Test Version 3.1.1 2018-10-18 13:27:56 -05:00
Jim Miller
f86d7d39b8 Fix for problem with new collision code and rejected URL. 2018-10-18 13:27:27 -05:00
Jim Miller
3c4e91c11d Bump Release Version 3.1.0 2018-10-17 10:25:51 -05:00
Jim Miller
46ce5e9b75 Bump Test Version 3.0.11 2018-10-15 15:08:53 -05:00
Jim Miller
8d39ad037e Fix for Calibre Metadata update called with URLs not in library. 2018-10-15 15:05:59 -05:00
Jim Miller
79c95cc346 Add a debug to see when adapter_test1 extractChapterUrlsAndMetadata called. 2018-10-15 15:04:37 -05:00
Jim Miller
bc24d615dc Update translations. 2018-10-15 13:29:07 -05:00
Jim Miller
2b549c5a4d Bump Test Version 3.0.10 2018-10-11 14:31:01 -05:00
Jim Miller
70f004182a Update strings to translate. 2018-10-11 14:30:43 -05:00
Jim Miller
93df1092fd Change adapter_lightnovelgatecom to adapter_novelonlinefullcom for site change. Closes #346 2018-10-11 14:27:24 -05:00
Jim Miller
fbbdd87d7f Bump Test Version 3.0.9 2018-10-11 08:50:43 -05:00
Jim Miller
025cdb1e37 Fix adapter_wuxiaworldco date format for 24 hour clock. Closes #345 2018-10-11 08:49:41 -05:00
Jim Miller
4a2d876351 Bump Test Version 3.0.8 2018-10-10 10:47:29 -05:00
Jim Miller
86113bc53a Add max_zalgo feature. 2018-10-10 10:47:12 -05:00
Jim Miller
35c3838220 Bump Test Version 3.0.7 2018-10-09 08:02:26 -05:00
Jim Miller
30b5e56bcb Handle stories without cover correctly adapter_wwwlushstoriescom. Closes #344 2018-10-09 07:58:40 -05:00
Jim Miller
ef059f375d Bump Test Version 3.0.6 2018-10-03 16:08:04 -05:00
Jim Miller
9100fd3bfc Add cover images for adapter_wwwlushstoriescom 2018-10-03 16:05:40 -05:00
Jim Miller
bedff7f97a Bump Test Version 3.0.5 2018-09-30 21:33:06 -05:00
Jim Miller
96b96a885f Fix for issues with single-chapter stories not getting correct title after chapter meta refactor Jul 2018. 2018-09-30 21:32:04 -05:00
Jim Miller
e1fdc4738b Put SV(base_xenforo) prefix spans from title into 'genre', comment out some debugs. 2018-09-28 23:39:33 -05:00
Jim Miller
0d06766201 Fix for royalroad.com warnings. 2018-09-28 19:21:43 -05:00
Jim Miller
02fd536873 Rename adapter_royalroadcom appropriately. 2018-09-28 19:12:00 -05:00
Jim Miller
e2a9529189 Bump Test Version 3.0.4 2018-09-27 14:07:35 -05:00
Jim Miller
a660b8e023 Add fix calibre title/author cases options, move force title/author sort values to Standard Columns config Tab. 2018-09-27 14:06:39 -05:00
Jim Miller
baea2e9469 Bump Test Version 3.0.3 2018-09-24 15:06:50 -05:00
Jim Miller
db731e5296 Fix for py3 CLI existing epub update encoding issue. Closes #339 2018-09-24 15:06:29 -05:00
Jim Miller
890fd39d33 Bump Test Version 3.0.2 2018-09-19 13:03:55 -05:00
Jim Miller
9c47698824 Add 'date' chapter metadata to AO3. Closes #336 2018-09-19 13:03:29 -05:00
Jim Miller
b678d70724 Bump Test Version 3.0.1 2018-09-16 13:38:57 -05:00
Jim Miller
e199ea6ca6 Fix version_update.py for https url in index.html 2018-09-16 13:38:54 -05:00
Jim Miller
0af88089c5 Move mobi TOC back to after title page. Requested by jxxtan. 2018-09-16 13:31:39 -05:00
Jim Miller
be37084d40 Fix config ini check for legend_spoilers/remove_spoilers for royalroad.com. 2018-09-13 17:51:42 -05:00
Jim Miller
a62a1eb914 Bump Release Version 3.0.0 2018-09-10 13:44:39 -05:00
Jim Miller
a03625b2dc adapter_chosentwofanficcom: use pagecache and extracategories:Buffy the Vampire Slayer 2018-09-09 12:49:57 -05:00
Jim Miller
09fd06cd52 Set pip install dev status: Production/Stable 2018-09-07 12:54:26 -05:00
Jim Miller
3f5b82fb3d Bump Test Version 2.37.18 2018-09-07 12:48:56 -05:00
Jim Miller
dca5a90682 Fix FimF login. 2018-09-07 11:03:23 -05:00
Jim Miller
9c2dbde065 Update a copyright. 2018-09-05 18:41:00 -05:00
Jim Miller
9a8aed291e Bump Test Version 2.37.17 2018-09-05 10:46:57 -05:00
Jim Miller
22c0834cd7 Fix adapter_fanficauthorsnet metadata parsing and genre splitting. 2018-09-05 10:42:27 -05:00
Jim Miller
cb05247a59 Fix adapter_harrypotterfanfictioncom date and characters/genre splitting. 2018-09-05 10:41:44 -05:00
Jim Miller
fee343ccb3 Add --no-meta-chapters/-z CLI option. 2018-09-04 18:22:16 -05:00
Jim Miller
c246c33c9a Bump Test Version 2.37.16 2018-09-04 12:51:31 -05:00
Jim Miller
552cdcff1d Fix genre parsing for adapter_fanficauthorsnet. 2018-09-04 12:50:03 -05:00
Jim Miller
30006698ad Fix empty Genre parsing for adapter_ficwadcom 2018-09-04 12:41:52 -05:00
Jim Miller
124eb4da85 Bump Test Version 2.37.15 2018-09-04 11:51:53 -05:00
Jim Miller
b220f4db1c Fix for base_xenforoforum (SB/SV specifically) change to 'hide' sections of threadmark lists behind '...'. Issue #332 2018-09-04 11:50:51 -05:00
Jim Miller
295728eafc Update Translations 2018-09-01 09:02:26 -05:00
Jim Miller
6717fbfd89 Restore cursor in finally: clauses in case of issues. 2018-08-30 12:56:17 -05:00
Jim Miller
e5cafa496d Bump Test Version 2.37.14 2018-08-27 13:33:51 -05:00
Jim Miller
ca4abf9692 Remove defunct site national-library.net 2018-08-27 13:29:42 -05:00
Jim Miller
c945b0d4fe Remove defunct site nocturnal-light.net 2018-08-27 13:28:30 -05:00
Jim Miller
019772b278 Remove defunct site imrightbehindyou.com 2018-08-27 13:27:17 -05:00
Jim Miller
6e6055a77b Remove defunct site www.destinysgateway.com 2018-08-27 13:26:14 -05:00
Jim Miller
832ebdc218 Remove defunct site writing.whimsicalwanderings.net 2018-08-27 13:24:59 -05:00
Jim Miller
358454be51 Remove defunct site dramione.org 2018-08-27 13:23:47 -05:00
Jim Miller
fcad70a350 Remove defunct site www.fiction.thebrokenworld.org 2018-08-27 13:22:22 -05:00
Jim Miller
4403181e38 Remove defunct site area52hkh.net 2018-08-27 13:21:13 -05:00
Jim Miller
279b3105ee Small fix for <> appearing in text format. 2018-08-26 15:59:37 -05:00
Jim Miller
769d5a77c2 Improve --save-cache, save on each fetch/post, fix a py2/py3 cross bug. 2018-08-26 14:37:40 -05:00
Jim Miller
2f62b2aa3f Remove some debug output (conflist). 2018-08-26 14:09:36 -05:00
Jim Miller
b1cea64b84 Tweak mobi output--move TOC to end. 2018-08-23 11:53:50 -05:00
Jim Miller
42879bdc34 Tweaks to test1.com adapter. 2018-08-23 11:36:34 -05:00
Jim Miller
2d8ae6238c Bump Test Version 2.37.12 2018-08-22 15:46:34 -05:00
Jim Miller
7a8c16847c Add latestonly option to mark_new_chapters feature to remove pre-existing (new) chapter marks on update and only mark chapters that are new in this update. Closes #330 2018-08-22 15:45:46 -05:00
Jim Miller
4d3579dc66 Bump Test Version 2.37.11 2018-08-21 14:56:20 -05:00
Jim Miller
316b9c15db Attempting to get password protected stories in FimF working again. 2018-08-21 14:55:54 -05:00
Jim Miller
08c45f44e4 Bump Test Version 2.37.10 2018-08-20 12:58:11 -05:00
Jim Miller
905ae4c299 Fix for ffnet metadata parsing with newer BS. 2018-08-20 12:55:37 -05:00
Jim Miller
6effcdcddc Web service - Remove some JS debug output. 2018-08-15 15:33:19 -05:00
Jim Miller
445b74bcb8 Bump Test Version 2.37.7 2018-08-15 12:27:49 -05:00
Jim Miller
29bdb0cf35 Include LICENSE, etc differently to not end up installed in /usr/local 2018-08-15 12:27:15 -05:00
Jim Miller
a021ac65ed Bump Test Version 2.37.6 2018-08-15 12:14:47 -05:00
Jim Miller
becba63ce4 Cleanup Web Service HTML & CSS a bit. Email ebooks disabled. 2018-08-15 12:13:35 -05:00
Jim Miller
2c3372c3b2 Bump Test Version 2.37.4 2018-08-15 11:05:57 -05:00
Jim Miller
398f965726 Add DESCRIPTION.rst, LICENSE, README.md to pip package. Closes #329 2018-08-15 10:55:41 -05:00
Jim Miller
9694cfa883 Use chapters collected, not all chapter count for determining TOC inclusion. Closes #328 2018-08-13 14:57:06 -05:00
Jim Miller
08e4942276 Web: stripHTML() on allrecent descriptions. 2018-08-10 13:49:50 -05:00
Jim Miller
aa88e96b76 Web: Save output format setting in cookie. 2018-08-10 13:49:50 -05:00
Jim Miller
85f28d1054
Merge pull request #325 from JimmXinu/python3
Python2/3 Dual Code
2018-08-10 13:45:47 -05:00
Jim Miller
d50e6d084b Put Nook STR Cover 'fix' back in. 2018-08-10 13:29:14 -05:00
Jim Miller
f7bf2f7d0a Bump Test Version 2.37.3 2018-08-10 11:34:19 -05:00
Jim Miller
ff5e27a89c MOBI Debug output 2018-08-10 11:33:08 -05:00
Jim Miller
a93eeec5eb Fix for mobi output--link to TOC works again--was broken by html5lib enforcing html5 rules. 2018-08-09 19:54:01 -05:00
Jim Miller
6fbf3bc282 Fix for mobi broken page breaks at 'file' boundaries and inline 'TOC' links. 2018-08-09 17:31:31 -05:00
Jim Miller
83d923300d Fix for mobi issue with 0 byte record markers being misplaced. 2018-08-09 14:36:08 -05:00
Jim Miller
9397c5e1f7 Fix a stray print to log in mobihtml.py 2018-08-08 22:17:34 -05:00
Jim Miller
c7cc2a3e0f Update ini copyrights. 2018-08-08 22:10:32 -05:00
Jim Miller
2cd4be0db0 Bump Test Version 2.37.2 2018-08-08 14:40:26 -05:00
Jim Miller
7b44ef106e Accept both [royalroad.com] and pre-existing [royalroadl.com] sections. 2018-08-08 14:23:46 -05:00
Jim Miller
39580268ac Change [royalroadl.com] to [www.royalroad.com] 2018-08-08 14:22:57 -05:00
Jim Miller
389eb8969c Make INI order tool py2/py3. 2018-08-08 14:16:41 -05:00
Jim Miller
32857a9dad Update version update code for py3. 2018-08-08 13:48:14 -05:00
Jim Miller
b5fa47838e Update code for Calibre Plugin create for py3. 2018-08-08 13:43:41 -05:00
Jim Miller
2974899ed5 Update included_dependencies to html5lib-1.0.1 2018-08-08 10:30:57 -05:00
Jim Miller
04737b3e85 Update included_dependencies to chardet-3.0.4 2018-08-08 10:30:57 -05:00
Jim Miller
67698baf11 Update included_dependencies to beautifulsoup4-4.6.1 2018-08-08 10:30:57 -05:00
Jim Miller
5be511916b Web service needs that UnicodeDecodeError exception handler 2018-08-08 10:29:39 -05:00
Jim Miller
a999544859 royalroadl.com now wants to be www.royalroad.com. 2018-08-07 17:02:29 -05:00
Jim Miller
2779e15961 Fix for &<> entities in chapter titles. 2018-08-07 14:46:36 -05:00
Jim Miller
c386df4e48 Correction for fanfiktion.de metadata parsing 2018-08-07 12:36:31 -05:00
Jim Miller
56fbe15dc9 Bump Test Version 2.37.1 2018-08-07 11:59:46 -05:00
Jim Miller
95124c0638 Require Python 2.7 or newer in CLI. Dependencies(html5lib,etc) don't work on 2.6 anymore. 2018-08-07 11:57:10 -05:00
Jim Miller
c27c24d1b2 Remove accidental file. 2018-08-06 11:57:41 -05:00
Jim Miller
b4844fe1fe Fixes for python2.6 compatibility. 2018-08-06 11:51:26 -05:00
Jim Miller
870fce58c4 CLI debug conf locations. 2018-08-05 21:19:37 -05:00
Jim Miller
1b4fc022ff Remove <img> tags when include_images:false, not just <img> attrs. 2018-08-05 20:22:56 -05:00
Jim Miller
d909ffb1f1 Remove some dead code from bs3 days. 2018-08-05 18:48:26 -05:00
Jim Miller
76c74006f0 Fix for nbsp removal in stripHTML for calibre. 2018-08-05 18:48:02 -05:00
Jim Miller
b5aa116349 adapter_gravitytalescom actually does need import time. 2018-08-05 18:25:26 -05:00
Jim Miller
c48c5dd35a Fix adapters that used getMetadata(title), which can be changed by various settings. 2018-08-05 18:21:09 -05:00
Jim Miller
5c49248700 Add default built-in values for image_max_size when defaults.ini is missing. 2018-08-04 16:40:23 -05:00
Jim Miller
913a0f6520 Remove a print 2018-08-04 16:40:13 -05:00
Jim Miller
dde61749d8 Fix import to relative. 2018-08-03 11:48:33 -05:00
Jim Miller
723298016d Merge fix? 2018-08-03 09:59:36 -05:00
Jim Miller
d0f8687520 Fixing mobi output for python2/3 dual version. 2018-08-03 09:55:02 -05:00
Jim Miller
556c1a677c More py3 deliberate incompatibilities. 2018-08-03 09:55:02 -05:00
Jim Miller
28ef1d2aa9 Encoding fixes for fanfic.hu, remove print from quotev.com 2018-08-03 09:55:02 -05:00
Jim Miller
ffb2d183e7 more py2/py3 fixes 2018-08-03 09:55:02 -05:00
Jim Miller
028c5e6ed2 more py2/py3 fixes 2018-08-03 09:55:02 -05:00
Jim Miller
101ef13956 more py2/py3 fixes 2018-08-03 09:55:02 -05:00
Jim Miller
3e98844d33 More stripping \xa0 in adapters. 2018-08-03 09:55:01 -05:00
Jim Miller
6fcd6199f3 Strip \xa0 for &nbsp; in stripHTML()--this may need better placement. 2018-08-03 09:55:01 -05:00
Jim Miller
c818eb30b5 Fixes for encoding/make unicode issues. 2018-08-03 09:55:01 -05:00
Jim Miller
3f06b86ef0 Fix for offsetting unicode in mediaminer.org title 2018-08-03 09:55:01 -05:00
Jim Miller
1435ce963c Document fromtimestamp(86400) 2018-08-03 09:55:01 -05:00
Jim Miller
78e100cb9a Fix Request for POST 2018-08-03 09:55:01 -05:00
Jim Miller
eb8a5b2c68 Fix translit. 2018-08-03 09:55:01 -05:00
Jim Miller
4335900aa5 More py2/py3 2018-08-03 09:55:01 -05:00
Jim Miller
bc79a9af38 Comment out pickle. 2018-08-03 09:55:01 -05:00
Jim Miller
b8aa3e9a48 py2/py3 transition code 2018-08-03 09:55:01 -05:00
Jim Miller
5aaf15c8b4 Ignore .bak files. 2018-08-03 09:55:00 -05:00
Jim Miller
8d3ff6d319 from .base_adapter 2018-08-03 09:55:00 -05:00
Jim Miller
b13fc666e4 Remove extra 'import time's 2018-08-03 09:54:59 -05:00
Jim Miller
00de815e65 Add absolute_import imports. 2018-08-03 09:54:59 -05:00
Jim Miller
5b1e8622d8 Bump ALPHA Version 2.37.0--Python 2/3 dual version. 2018-08-03 09:54:59 -05:00
Jim Miller
84f1f6f6d1 Added own copy of six.py as fanficfare.six for ensure_str etc. 2018-08-03 09:53:00 -05:00
Jim Miller
f099f9c1e2 Tweaks to imports for calibre plugin. 2018-08-03 09:53:00 -05:00
Jim Miller
5e66aabb97 Fixes for epub update. 2018-08-03 09:53:00 -05:00
Jim Miller
fa10cd36d1 p2/3 version of xenforoforum & SB 2018-08-03 09:53:00 -05:00
Jim Miller
a0d8f3dbc4 p2/3 version of htmlheuristics 2018-08-03 09:53:00 -05:00
Jim Miller
ae1d045674 Tweaking writers 2018-08-03 09:53:00 -05:00
Jim Miller
85b2c4b344 Fixes for including/manipulating images. 2018-08-03 09:53:00 -05:00
Jim Miller
968a5cca70 Little cleanup & name normalize. 2018-08-03 09:53:00 -05:00
Jim Miller
ce11390484 Fix p2/p3 unichr 2018-08-03 09:53:00 -05:00
Jim Miller
2565a719cc ffnet 2.7/3.7 with save-cache working. 2018-08-03 09:53:00 -05:00
Jim Miller
e05c7a0d90 Add internal python_version metadata. 2018-08-03 09:53:00 -05:00
Jim Miller
2dfa8b761b Because of course py3 uses an incompatible pickle format by default. 2018-08-03 09:52:59 -05:00
Jim Miller
5726d6a2d2 test1.com with epub/txt/html output working, mobi broken. 2018-08-03 09:52:59 -05:00
Jim Miller
a2a0ff0bfd Dual compatible cli.py. 2018-08-03 09:52:59 -05:00
Jim Miller
8627bee253 test1.com with epub/txt/html output working, mobi broken. 2018-08-03 09:52:59 -05:00
Jim Miller
33d2a77c07 Working towards python 2.7 & 3 cross compatibility. 2018-08-03 09:52:59 -05:00
Jim Miller
a7a08b44ce Working towards python 2.7 & 3 cross compatibility. 2018-08-03 09:52:59 -05:00
Jim Miller
64795c4921 Working towards python 2.7 & 3 cross compatibility. 2018-08-03 09:52:59 -05:00
Jim Miller
98b45e147d Working towards python 2.7 & 3 cross compatibility. 2018-08-03 09:52:59 -05:00
Jim Miller
3f196cd135 Working towards python 2.7 & 3 cross compatibility. 2018-08-03 09:52:59 -05:00
Jim Miller
ea6efdf8ff Working towards python 2.7 & 3 cross compatibility. 2018-08-03 09:52:59 -05:00
Jim Miller
7f038be6e3 Working towards python 2.7 & 3 cross compatibility. 2018-08-03 09:52:59 -05:00
Jim Miller
1194be8652 Bump Release Version 2.28.0 2018-08-02 14:43:51 -05:00
Jim Miller
eefad6628a Fixing mobi output for python2/3 dual version. 2018-08-02 14:03:34 -05:00
Jim Miller
f83193dd64 More py3 deliberate incompatibilities. 2018-08-02 13:27:15 -05:00
Jim Miller
2762c3353f Encoding fixes for fanfic.hu, remove print from quotev.com 2018-08-02 13:27:15 -05:00
Jim Miller
a893bdff92 more py2/py3 fixes 2018-08-02 13:27:15 -05:00
Jim Miller
a97b2d347e more py2/py3 fixes 2018-08-02 13:27:15 -05:00
Jim Miller
17aca1bb71 more py2/py3 fixes 2018-08-02 13:27:15 -05:00
Jim Miller
01c836f236 More stripping \xa0 in adapters. 2018-08-02 13:27:14 -05:00
Jim Miller
49f78457ee Strip \xa0 for &nbsp; in stripHTML()--this may need better placement. 2018-08-02 13:27:14 -05:00
Jim Miller
d43b90642f Fixes for encoding/make unicode issues. 2018-08-02 13:27:14 -05:00
Jim Miller
61c3af67e1 Fix for offsetting unicode in mediaminer.org title 2018-08-02 13:27:14 -05:00
Jim Miller
5179a2cd23 Document fromtimestamp(86400) 2018-08-02 13:27:14 -05:00
Jim Miller
f1d4f2f8bb Fix Request for POST 2018-08-02 13:27:14 -05:00
Jim Miller
be990a00a2 Fix translit. 2018-08-02 13:27:14 -05:00
Jim Miller
de9d49c0fc More py2/py3 2018-08-02 13:27:14 -05:00
Jim Miller
a38621d66f Comment out pickle. 2018-08-02 13:27:14 -05:00
Jim Miller
bfd1f8907e py2/py3 transition code 2018-08-02 13:27:14 -05:00
Jim Miller
b3ce28bc99 Ignore .bak files. 2018-08-02 13:27:13 -05:00
Jim Miller
eb29c0b78f from .base_adapter 2018-08-02 13:27:13 -05:00
Jim Miller
7d651a53d1 Remove extra 'import time's 2018-08-02 13:27:12 -05:00
Jim Miller
1ee5c36690 Add absolute_import imports. 2018-08-02 13:27:12 -05:00
Jim Miller
d283290cbf Bump ALPHA Version 2.37.0--Python 2/3 dual version. 2018-08-02 13:27:11 -05:00
Jim Miller
8f4cdfe24a Added own copy of six.py as fanficfare.six for ensure_str etc. 2018-08-02 13:24:11 -05:00
Jim Miller
0870e2056f Tweaks to imports for calibre plugin. 2018-08-02 13:22:42 -05:00
Jim Miller
308f9ffe6b Fixes for epub update. 2018-08-02 13:22:42 -05:00
Jim Miller
f2db0cbc01 p2/3 version of xenforoforum & SB 2018-08-02 13:22:42 -05:00
Jim Miller
5a88e7fcf4 p2/3 version of htmlheuristics 2018-08-02 13:22:42 -05:00
Jim Miller
c80fe74729 Tweaking writers 2018-08-02 13:22:42 -05:00
Jim Miller
7e9c337fb0 Fixes for including/manipulating images. 2018-08-02 13:22:42 -05:00
Jim Miller
a5f6770589 Little cleanup & name normalize. 2018-08-02 13:22:42 -05:00
Jim Miller
58402ea6e5 Fix p2/p3 unichr 2018-08-02 13:22:42 -05:00
Jim Miller
ad1ce3bbb0 ffnet 2.7/3.7 with save-cache working. 2018-08-02 13:22:42 -05:00
Jim Miller
615b2f54b4 Add internal python_version metadata. 2018-08-02 13:22:42 -05:00
Jim Miller
2d2805f1b8 Because of course py3 uses an incompatible pickle format by default. 2018-08-02 13:22:42 -05:00
Jim Miller
0783a74b59 test1.com with epub/txt/html output working, mobi broken. 2018-08-02 13:22:42 -05:00
Jim Miller
04d77dd214 Dual compatible cli.py. 2018-08-02 13:22:42 -05:00
Jim Miller
0b9ea4bebb test1.com with epub/txt/html output working, mobi broken. 2018-08-02 13:22:41 -05:00
Jim Miller
e3ab18589b Working towards python 2.7 & 3 cross compatibility. 2018-08-02 13:22:41 -05:00
Jim Miller
957ff3edf4 Working towards python 2.7 & 3 cross compatibility. 2018-08-02 13:22:41 -05:00
Jim Miller
cea3773e4f Working towards python 2.7 & 3 cross compatibility. 2018-08-02 13:22:41 -05:00
Jim Miller
1a2392a8c8 Working towards python 2.7 & 3 cross compatibility. 2018-08-02 13:22:41 -05:00
Jim Miller
ac3b288f3b Working towards python 2.7 & 3 cross compatibility. 2018-08-02 13:22:41 -05:00
Jim Miller
611e6cecf2 Working towards python 2.7 & 3 cross compatibility. 2018-08-02 13:22:41 -05:00
Jim Miller
7b97439fcd Working towards python 2.7 & 3 cross compatibility. 2018-08-02 13:22:41 -05:00
Jim Miller
409729e55a Bump Test Version 2.27.12 2018-08-01 20:46:21 -05:00
Jim Miller
8fb7f048b5 Update plugin & web service html2text included package to fix text <>& output. 2018-08-01 20:19:46 -05:00
Jim Miller
9186c2fae9 adapter_royalroadl site uses relative dates now, including months and years ago. 2018-08-01 19:26:36 -05:00
Jim Miller
b87a076c86 Remove in-story ad links from adapter_asexstoriescom. 2018-08-01 14:32:31 -05:00
Jim Miller
db0d8ae339 Fix html appearing in txt summary by default. 2018-08-01 14:23:13 -05:00
Jim Miller
d0f6b53fd5 Bump Test Version 2.27.11 2018-07-31 20:49:27 -05:00
Jim Miller
3a3d57add4 Fix for corner case screwing up chapter html(attr quotes). Closes #324 2018-07-31 20:48:57 -05:00
Jim Miller
e56cca5bd9 Add a style attr in adapter_test1. 2018-07-31 19:43:15 -05:00
Jim Miller
367f2a96ec Fix version metadata for CLI. 2018-07-30 12:17:11 -05:00
Jim Miller
34c580dbe8 Bump Test Version 2.27.10 2018-07-30 12:01:45 -05:00
Ea
231dd15abb adapter_webnovelcom: update title selection (#323)
The title style was changed on 30th July 2018 between 03:00 UTC and
10:00 UTC. Before, the tag was: `<h2 class="pt8 pb8 oh f_serif">`;
now it is `<h2 class="pt4 pb4 oh mb4">`.

To fix it we just need to select for `h2`, since there is only one `h2`
in the book details section and `pt8` is just a styling class
(for `padding-top: 8px`) that may change whenever the style changes.
2018-07-30 12:00:01 -05:00
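The reasoning in the commit body above — match the bare `h2` tag rather than its volatile styling classes — can be sketched with a stdlib parser. (FanFicFare itself uses BeautifulSoup; the HTML snippet and the `TitleExtractor` class here are made-up illustrations, not the adapter's actual code.)

```python
from html.parser import HTMLParser

class TitleExtractor(HTMLParser):
    """Grab the text of the first <h2>, ignoring its class attribute."""
    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Select by tag name only; "pt4 pb4 oh mb4" vs "pt8 pb8 oh f_serif"
        # are styling classes that the site may change at any time.
        if tag == "h2" and self.title is None:
            self.in_h2 = True

    def handle_data(self, data):
        if self.in_h2:
            self.title = data.strip()
            self.in_h2 = False

p = TitleExtractor()
p.feed('<div class="book-details"><h2 class="pt4 pb4 oh mb4">My Story</h2></div>')
print(p.title)
```

Because the selector depends only on the tag name and its position in the details section, the same code survives both the pre- and post-July-2018 markup.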
Jim Miller
05192b2c88 Bump Test Version 2.27.9 2018-07-26 16:47:38 -05:00
Jim Miller
cd1e94dbf7 Fix section links in html output(broken in def6b39) 2018-07-26 15:24:25 -05:00
Jim Miller
4043d7d301 Change AO3 description blockquote to a div tag. 2018-07-26 10:19:47 -05:00
Jim Miller
b231abc036 Remove accidental file. 2018-07-25 15:07:06 -05:00
Jim Miller
576016ccaf Remove Google Plus icon. 2018-07-25 12:49:08 -05:00
Jim Miller
254652c748 Bump Test Version 2.27.8 2018-07-25 12:43:31 -05:00
Jim Miller
6cfa71c3f0 Fix base_xenforoforum_adapter bug when no threadmarks. 2018-07-24 14:15:16 -05:00
Jim Miller
cc1359abf7 Bump Test Version 2.27.7 2018-07-24 11:07:56 -05:00
Jim Miller
f3ba815757 adapter_wuxiaworldco: Some older stories use a different date format. 2018-07-24 11:07:28 -05:00
Jim Miller
c3fd518817 Add note to use_threadmark_wordcounts setting--base_xenforo sites' wordcounts ignore words inside Spoiler tags. 2018-07-24 10:49:43 -05:00
Jim Miller
6e2bc4b8c4 Fix for site change: adapter_wuxiaworldcom 2018-07-24 10:46:04 -05:00
Jim Miller
0da6b59eff Get fanficauthors.net from existing epubs downloaded from the site. 2018-07-23 12:50:41 -05:00
Jim Miller
78f3b5e47a Bump Test Version 2.27.6 2018-07-19 16:44:22 -05:00
Jim Miller
b888b846e4 Fix origtitle/toctitle for mark_new_chapters. Broken in chapter metadata revamp. 2018-07-19 16:44:22 -05:00
Jim Miller
d3b9315e91 Add [epub] tocpage_entry example to INIs for base_xenforoforum. 2018-07-19 09:29:44 -05:00
Jim Miller
252b127c15 Bump Test Version 2.27.5 2018-07-17 10:58:42 -05:00
Jim Miller
def6b3991f Fix tocpage links and correct index04 vs index, issue #320. 2018-07-17 10:58:13 -05:00
Jim Miller
aac7b5912f Bump Test Version 2.27.4 2018-07-14 16:31:38 -05:00
Jim Miller
f1e66f247e base_xenforoforum: Adding date, words & kwords per chapter metadata. 2018-07-14 16:31:20 -05:00
Jim Miller
ffee4aa495 Refactor chapter internals for additional site-specific metadata per chapter. 2018-07-14 16:11:01 -05:00
Jim Miller
2943b0964f Fix base_xenforo_list for AH & QQ preferred domains. 2018-07-14 15:55:39 -05:00
Jim Miller
f03bb3fcf0 Refactor chapter internals. 2018-07-14 14:30:27 -05:00
Jim Miller
ffa62c8cfa Bump Test Version 2.27.3 2018-07-13 12:29:28 -05:00
Jim Miller
6c261914d8 Add ignore_chapter_url_list feature. 2018-07-13 12:19:02 -05:00
Jim Miller
f32f830004 adapter_trekfanfictionnet: don't set numWords to *character* count. 2018-07-13 12:11:28 -05:00
Jim Miller
8381dba465 Abstract adapter.chapterUrls into base_adapter. 2018-07-13 10:44:31 -05:00
Jim Miller
f29f86d4f4 base_xenforoforum: Sum threadmark word counts into numWords(when present). INI option use_threadmark_wordcounts defaults to true. 2018-07-12 12:15:31 -05:00
Jim Miller
f3a22d1d37 Add alternate domains for SB, SV & QQ xenforo adapters. 2018-07-11 11:10:17 -05:00
Matěj Cepl
e86874d124 Replace leading TABs with spaces (#316)
* Replace leading TABs with spaces

* Convert all files from CRLF to LF

* Fix incorrect indentation
2018-07-08 22:08:02 -05:00
Jim Miller
4bd3b09c4f Bump Test Version 2.27.2 2018-07-06 13:13:23 -05:00
Jim Miller
9e16cdf4ef Add adapter_harrypotterfanfictioncom for new version of returned harrypotterfanfiction.com. 2018-07-06 13:13:02 -05:00
Jim Miller
612442a674 Bump Test Version 2.27.1 2018-07-05 19:51:19 -05:00
Chris Braun
283e5c6f31 Add adapter for http://wuxiaworld.co/ (#315)
* Add adapter for http://wuxiaworld.co/ (this is not https://www.wuxiaworld.com/!)
* Fix date format, add cover image
* Update Copyright, and add empty sections to default configuration files
2018-07-05 19:49:46 -05:00
Jim Miller
0d041e1188 Bump Release Version 2.27.0 2018-07-03 10:11:35 -05:00
Jim Miller
01e6ce63b6 Bump Test Version 2.26.9 2018-07-01 09:32:42 -05:00
David
31187c9e13 Change StoriesOnline to HTTPS (#311)
Site changed to HTTPS for all pages on 30/06/2018. Keeping HTTP URLs as
valid for backward compatibility.
2018-07-01 09:29:28 -05:00
Jim Miller
b71f018e19 Bump Test Version 2.26.8 2018-06-28 15:40:59 -05:00
Jim Miller
7002f5806b adapter_webnovelcom: ignore 'ad-walled' chapters--the ad-wall bypass code stopped working, but isn't yet removed. 2018-06-28 15:40:38 -05:00
Jim Miller
6d3cae9e6a Add status states Paused & Cancelled to adapter_fanfiktionde as well as site specific native_status. 2018-06-28 15:32:56 -05:00
Jim Miller
13d19f927e Yet more site change for adapter_webnovelcom 2018-06-27 17:38:41 -05:00
Jim Miller
5048dfe2a5 Bump Test Version 2.26.7 2018-06-25 12:45:56 -05:00
Jim Miller
a172ffb106 Remove replace_br_with_p sentinels in desc HTML before giving to Calibre. 2018-06-25 12:39:27 -05:00
Jim Miller
de8f21aa92 Bump Test Version 2.26.6 2018-06-23 13:04:45 -05:00
Jim Miller
63f0b16177 Add adapter_classes metadata for developer testing. 2018-06-23 13:04:44 -05:00
Jim Miller
3e779975c3 Revert archive.skyehawke.com back https -> http. 2018-06-23 12:39:35 -05:00
Jim Miller
a2c67e8594 Add adapter_classes metadata for developer testing. 2018-06-23 11:38:25 -05:00
Jim Miller
3661966a9b Calibre Plugin: Remove ebook formats before update, overwrite or unnew so that the previous version ends up in trash instead of just copied over. 2018-06-22 18:02:35 -05:00
Jim Miller
109551d5bd Bump Test Version 2.26.5 2018-06-15 10:39:04 -05:00
Jim Miller
c0030a619b Merge branch 'http2https-works-only' 2018-06-15 10:38:50 -05:00
Jim Miller
7fd0ccba45 Fix for adapter_inkbunnynet author search. 2018-06-15 10:30:06 -05:00
Jim Miller
f1b3bc021e Change all sites that will work with https to use it all the time. 2018-06-15 10:29:16 -05:00
Jim Miller
99ea6d5064 Add force_https option for developer testing. 2018-06-14 14:56:03 -05:00
Jim Miller
fecdc3f114 base_efiction: use getProtocol for images too. 2018-06-14 14:15:32 -05:00
Jim Miller
586ddb6084 Fix for author in adapter_lcfanficcom 2018-06-14 13:15:23 -05:00
Jim Miller
66f7765c5c Bump Test Version 2.26.4 2018-06-13 18:16:40 -05:00
Jim Miller
f68e2a0752 Fix author for adapter_inkbunnynet. 2018-06-13 18:15:47 -05:00
Jim Miller
2be25aa2f2 Updates for site changes for adapter_gravitytalescom. 2018-06-13 16:32:40 -05:00
Jim Miller
173bbe8fa6 adapter_adultfanfictionorg used urllib2 exceptions without importing it. 2018-06-13 13:22:22 -05:00
Jim Miller
2bd644cd66 Remove fanfiction.mugglenet.com -- mugglenet.com is there, but no fanfic section anymore. 2018-06-13 13:08:41 -05:00
Jim Miller
5630357ec1 Bump Test Version 2.26.3 2018-06-11 23:25:38 -05:00
Jim Miller
9ef259c882 Site update fixes for adapter_inkbunnynet, thanks GComyn. 2018-06-11 23:25:15 -05:00
Jim Miller
ab7dd7523f Bump Test Version 2.26.2 2018-06-06 11:58:23 -05:00
Dmitry Snegirev
b58ca0f4f2 Add status to webnovelcom (#306) 2018-06-06 09:49:45 -05:00
Jim Miller
8ca3490c47 Bump Test Version 2.26.1 2018-06-05 12:00:12 -05:00
Jim Miller
4d54cf419d Fixes for adapter_webnovelcom for site changes. 2018-06-05 11:59:39 -05:00
Jim Miller
bccc32bd2a Bump Release Version 2.26.0 2018-05-30 12:58:22 -05:00
Jim Miller
e1529c21a6 CLI debug option --save-cache OR --save_cache 2018-05-26 22:47:32 -05:00
Jim Miller
85aec35ea9 Bump Test Version 2.25.7 2018-05-24 12:32:11 -05:00
Jim Miller
4828a63e22 Clarify some tooltip text and add 'FFF Frozen URL' feature. 2018-05-24 12:31:51 -05:00
Jim Miller
bd33668954 https for starslibrarynet, abstract protocol a bit for eFiction Base. 2018-05-24 12:30:25 -05:00
Jim Miller
54ab68c7ab Bump Test Version 2.25.6 2018-05-14 12:50:56 -05:00
Dmitry Snegirev
b34d01691f fixes dateUpdated and datePublished for lightnovelgate (#303)
* fixes encoding and space for lightnovelgate

* fixes dateUpdated and datePublished for lightnovelgate
2018-05-14 12:49:46 -05:00
Jim Miller
8b17846c4d Bump Test Version 2.25.5 2018-05-08 15:28:13 -05:00
Jim Miller
d521cfdcf0 Consolidate URL chapter range code and apply to CLI for #302. 2018-05-08 15:27:42 -05:00
Jim Miller
5dfb2242b3 Bump Test Version 2.25.4 2018-05-06 16:42:18 -05:00
Jim Miller
bac2092d5a Merge branch 'Rikkitp-novelall_chapname_fix' 2018-05-06 16:40:56 -05:00
Dmitry Snegirev
bd197a10c2 remove unnecessary br around advert in www.novelall.com 2018-05-07 00:06:12 +03:00
Dmitry Snegirev
a2a3efce52 fix chapter names in www.novelall.com 2018-05-07 00:04:26 +03:00
Jim Miller
7d3e1ccc95 Adding website_encodings:ignore feature for adapter_wwwnovelallcom. 2018-05-06 15:25:45 -05:00
Jim Miller
3e49134cf6 Remove defunct sites thealphagate.com and harrypotterfanfiction.com 2018-05-04 12:08:51 -05:00
Jim Miller
9a6bd88d00 Fix date for adapter_gluttonyfictioncom 2018-05-04 11:26:48 -05:00
Jim Miller
71b107fc61 Bump Test Version 2.25.3 2018-05-03 12:11:14 -05:00
Jim Miller
157b0555e4 Add rating and sitetags to adapter_webnovelcom. 2018-05-03 12:02:19 -05:00
Jim Miller
04eaab1acf Change adapter_whoficcom to https. 2018-05-03 11:42:43 -05:00
Jim Miller
12430fdbdc Bump Test Version 2.25.2 2018-05-02 18:03:22 -05:00
Jim Miller
771246c9de Fix some metadata collection in adapter_webnovelcom. 2018-05-02 18:02:43 -05:00
Jim Miller
ae68c7997b adapter_webnovelcom - use_pagecache - mostly for debugging. 2018-05-02 17:25:18 -05:00
Jim Miller
0a97a55467 Bump Test Version 2.25.1 2018-04-29 15:42:02 -05:00
Jim Miller
a04954294b base_xenforoforum_adapter: exclude threadmarks URL as chapter URL. 2018-04-29 15:40:54 -05:00
Jim Miller
5aa49e6744 Bump Release Version 2.25.0 2018-04-29 13:13:19 -05:00
Jim Miller
4ec0daa137 Bump Test Version 2.24.8 2018-04-18 14:05:45 -05:00
Jim Miller
84f9969a6c Fix adapter_hpfanficarchivecom to not take author from banner by mistake. 2018-04-18 14:05:20 -05:00
Jim Miller
f2ff2c1206 Bump Test Version 2.24.7 2018-04-17 12:39:57 -05:00
Jim Miller
e1fa2d698e Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2018-04-17 12:38:53 -05:00
Dmitry Snegirev
0f809f36a9 add logger import to dateutils.py (#297) 2018-04-17 12:38:47 -05:00
Jim Miller
27d06c9cba Update for web service token file. 2018-04-16 12:26:45 -05:00
Jim Miller
a6c551ee1b Add/Update some comments to rikkitp's changes. 2018-04-16 11:31:18 -05:00
Dmitry Snegirev
535451cfa0 Add adapter www.novelall.com 2018-04-15 23:58:28 +03:00
Jim Miller
f7d6c70eaf Bump Test Version 2.24.6 2018-04-15 10:09:00 -05:00
David
fb000ad24e Fix downloading illustrated stories from literotica. (#295)
Was doing an "utf8FromSoup" which destroyed the image links. Now I can't
remember why I thought this was needed.
2018-04-15 10:07:45 -05:00
Jim Miller
5e09e9d718 Bump Test Version 2.24.5 2018-04-14 09:31:56 -05:00
Jim Miller
2db8e070b4 Paranoia check for legend_spoilers feature in royalroadl 2018-04-14 09:29:59 -05:00
Jim Miller
04e9a27c06 Bump Test Version 2.24.4 2018-04-12 23:21:01 -05:00
Jim Miller
e77a6c6fdc Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2018-04-12 23:18:52 -05:00
Jim Miller
37156e8a6c Add remove_spoilers & legend_spoilers options to royalroadl.com for Issue #287 2018-04-12 23:17:36 -05:00
Jim Miller
45b07bdc8d Bump Test Version 2.24.3 2018-04-12 10:15:34 -05:00
Dmitry Snegirev
620f512d02 Add reformatting option fix_excess_space to lightnovelgate (#291) 2018-04-12 10:14:23 -05:00
Dmitry Snegirev
8010d50ab8 Added status metadata to wuxiaworld and royalroadl (#289)
Also category (FanFiction vs Original) for adapter_royalroadl
2018-04-11 14:04:55 -05:00
Dmitry Snegirev
83bceb8154 Fix www.webnovel.com adapter with volumes (#293) 2018-04-11 14:01:37 -05:00
Jim Miller
5d57fdbe6c Fix author URL/Id fetch for adapter_wwwlushstoriescom. 2018-04-07 14:06:47 -05:00
Jim Miller
2b03ad1b39 Bump Test Version 2.24.2 2018-04-05 09:49:21 -05:00
Dmitry Snegirev
f743185771 Fix for wuxiaworld adapter when image is null (#288) 2018-04-05 09:48:49 -05:00
Jim Miller
f6b51d2b92 Allow domain sufficientvelocity.com for forums.sufficientvelocity.com. 2018-04-02 18:35:25 -05:00
Jim Miller
02bc402792 Bump Test Version 2.24.1 2018-03-30 14:36:36 -05:00
Jim Miller
7b34b7e5c2 Fixes for tables, add keep_empty_tags:p,td,th and add to keep_html_attrs colspan,rowspan. 2018-03-30 14:36:11 -05:00
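For readers following along in their own personal.ini, the table options this commit names would be set roughly like this (a minimal sketch; only the option names and values come from the commit message, the section placement is an assumption):

```ini
[defaults]
## keep these tags even when they have no content, so empty table cells survive
keep_empty_tags:p,td,th
## colspan and rowspan are appended to the existing keep_html_attrs list
keep_html_attrs:colspan,rowspan
```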
Jim Miller
d8224d129e Change ficwad.com to https. 2018-03-28 20:28:29 -05:00
Jim Miller
05dca7d974 Bump Release Version 2.24.0 2018-03-27 10:36:13 -05:00
Jim Miller
e1796b2538 Update translations 2018-03-27 10:34:20 -05:00
Chris Braun
cc110de643 Fix erroneous HTML tags (webnovel.com) (#276) 2018-03-21 10:25:57 -05:00
Jim Miller
0739aa47db Remove some more defunct base_xenforoforum show_chapter_authors code. 2018-03-20 13:24:26 -05:00
Jim Miller
35fe2e983a Remove now defunct base_xenforoforum show_chapter_authors option. Sites changed threadmarks.rss both to be unusable by FFF *and* not include post authors. 2018-03-20 13:19:31 -05:00
Jim Miller
acc89951df Bump Test Version 2.23.6 2018-03-19 10:28:59 -05:00
Jim Miller
ef0815e916 Fix for SB/SV threadmarks.rss becoming incomplete list--move extract_threadmarks from QQ back to base. Incomplete--need to remove show_chapter_authors feature. 2018-03-19 10:28:39 -05:00
Jim Miller
f15d61e15a Bump Test Version 2.23.5 2018-03-17 15:23:47 -05:00
Chris Braun
9699e79f46 Strip arbitrary milliseconds precision from date (#277) 2018-03-17 15:23:04 -05:00
Jim Miller
892eb3a01f Bump Test Version 2.23.4 2018-03-13 14:02:16 -05:00
Jim Miller
c98ccd7fbe Merge branch 'cryzed-update-wuxiaworld.com-adapter' 2018-03-13 13:59:03 -05:00
cryzed
5cf59c2fcb Make stripHTML import relative 2018-03-13 19:56:22 +01:00
cryzed
bb38ecde61 Make quotes consistent 2018-03-13 19:26:26 +01:00
cryzed
ea588f60ca Update adapter for wuxiaworld.com 2018-03-13 19:23:08 +01:00
Jim Miller
7c5d100121 Remove extra CSS line from default ini files. 2018-03-12 23:34:27 -05:00
Jim Miller
12d7199cf6 Update translations. 2018-03-12 23:33:40 -05:00
Jim Miller
e82ae3e1f6 Bump Test Version 2.23.3 2018-03-12 23:26:49 -05:00
Jim Miller
33480c3ce4 adapter_storiesonline(FineStories.com) fix for missing author link in header tag. 2018-03-12 19:00:58 -05:00
Jim Miller
99880b7c97 Bump Test Version 2.23.2 2018-03-12 17:26:02 -05:00
Jim Miller
7b363dfa58 Fix fix_pseudo_html in configurable.py for plugin edit check. 2018-03-12 17:23:25 -05:00
Jim Miller
97b0468e8f Bump Test Version 2.23.1 2018-03-11 21:48:14 -05:00
Chris Braun
890bd99062 Fix pseudo HTML for webnovel.com corrections (#273)
* I might go to hell for this

* Add 2 TinyMCE-specific tags that somehow ended up in a story's content and make the feature optional

* Fix a typo for the annotations-tag

* Correct fix_pseudo_html option behavior and add to valid set options, adjust logging level

* Correct typo

* Add fix_pseudo_html as a valid keyword
2018-03-11 21:45:09 -05:00
Chris Braun
889bfb481f Fix pseudo HTML for webnovel.com (#269)
* I might go to hell for this  (Yes. Yes you might. --J)

* Add 2 TinyMCE-specific tags that somehow ended up in a story's content and make the feature optional

* Fix a typo for the annotations-tag
2018-03-10 13:48:58 -06:00
theit8514
9cf3da6383 Add output_filename to --meta-only (#271) 2018-03-10 13:45:14 -06:00
Jim Miller
2eb5e6401d Bump Release Version 2.23.0 2018-02-28 10:29:19 -06:00
Jim Miller
86d796e08a Bump Test Version 2.22.6 2018-02-26 19:17:28 -06:00
Jim Miller
bfb211e7cc Update translations. 2018-02-26 09:26:51 -06:00
Jim Miller
1dde324b51 Calculate number of chapters using start-end range when doing updates. 2018-02-26 09:25:38 -06:00
Jim Miller
51db9d2f48 Bump Test Version 2.22.5 2018-02-22 17:17:36 -06:00
Jim Miller
11e8126cfd Make Get URLs from Page work better with TtH is_adult. 2018-02-22 17:12:58 -06:00
Jim Miller
ab0d8279c7 Bump Test Version 2.22.4 2018-02-15 20:27:20 -06:00
Jim Miller
03845d08c0 Fix rating, warnings, add ships to adapter_harrypotterfanfictioncom 2018-02-15 20:25:47 -06:00
Jim Miller
f2a617ba4e Bump Test Version 2.22.3 2018-02-14 16:02:21 -06:00
Jim Miller
02bc486ff7 Update adapter_efpfanficnet to use https and remove www. by default. 2018-02-14 16:01:47 -06:00
Jim Miller
a6c79e4057 Bump Test Version 2.22.2 2018-02-04 16:18:50 -06:00
Jim Miller
e18f52086b *Don't* include fandoms in category for fimfiction.net by default. 2018-02-04 16:18:24 -06:00
Chris Braun
e4f35d883f Handle new VIP chapter types in adapter_webnovelcom (#263) 2018-02-04 14:58:12 -06:00
Jim Miller
c5105b1580 Tweak FimF settings so 'genre' also contains 'content' tags and 'category' contains 'fandom' tags. 2018-02-02 13:10:56 -06:00
Jim Miller
286da2be9d Bump Test Version 2.22.1 2018-01-26 13:09:31 -06:00
Jim Miller
f73e73e807 Tweak FimF settings so 'genre' also contains 'content' tags and 'category' contains 'fandom' tags. 2018-01-26 13:08:40 -06:00
Jim Miller
beb33d82df Bump Release Version 2.22.0 2018-01-23 10:22:15 -06:00
Jim Miller
65931d9785 Update translations. 2018-01-23 10:22:14 -06:00
Jim Miller
2e0412d2d2 Bump Test Version 2.21.3 2018-01-20 11:03:56 -06:00
Jim Miller
f6dcac829c Tweak 'Chapter not found...' check for ffnet for changed/new text. 2018-01-19 17:36:31 -06:00
Jim Miller
72f6a858f6 Bump Test Version 2.21.2 2018-01-04 20:23:52 -06:00
Jim Miller
271ed9a7f1 first_chapter/last_chapter experimental/unsupported feature 2018-01-03 11:41:45 -06:00
Jim Miller
4b89d917b4 Skip date from first chapter if not there (adapter_ficbooknet) 2018-01-03 11:37:46 -06:00
Jim Miller
efb6365b90 Set siteabbrev for adapter_wattpadcom 2017-12-28 11:26:03 -06:00
Jim Miller
abea5c17d7 Bump Test Version 2.21.1 2017-12-20 11:28:50 -06:00
FaceDeer
9a45ad3641 fimfiction adapter: add new tag types, fix groups (#259)
* missed the "st" ordinal case when parsing dates

* add option to collect Fimfiction author's notes

* fimfiction adapter: views/total_views being read incorrectly

* Add new tag types to fimfiction adapter and fix groups metadata
2017-12-20 11:27:52 -06:00
Jim Miller
fd2dddcd30 Additional CLI python version checking. 2017-12-18 14:15:50 -06:00
Jim Miller
56a60471d2 Handle QQ threads w/o threadmarks correctly. 2017-12-16 10:50:21 -06:00
Jim Miller
b90ddd29a9 Bump Release Version 2.20.0 2017-12-14 12:33:52 -06:00
Jim Miller
360acae48a Bump Test Version 2.19.5 2017-12-05 18:50:10 -06:00
Chris Braun
489ffa5449 Fix author parsing for webnovel.com (#254) 2017-12-05 18:38:39 -06:00
Jim Miller
d009816831 Bump Test Version 2.19.4 2017-12-05 10:02:23 -06:00
Jim Miller
ede19292d3 Update translations. 2017-12-04 12:30:08 -06:00
Jim Miller
62d41b429c Skip #post- URLs in xenforo emails even when on first page. 2017-11-30 16:23:24 -06:00
Jim Miller
912ed6ba90 Bump Test Version 2.19.3 2017-11-29 12:09:38 -06:00
Jim Miller
3545477f1f Don't 'fix' file:/// to //. 2017-11-29 12:09:19 -06:00
Jim Miller
8e9eeaddf9 Bump Test Version 2.19.2 2017-11-28 20:57:02 -06:00
Jim Miller
34f695bb60 Adding pre_process_cmd for CLI. 2017-11-28 20:56:40 -06:00
Jim Miller
c492b73e99 Apply is_adult&user/pass dialogs to CALIBREONLY update. 2017-11-27 14:47:29 -06:00
Jim Miller
e70af2aeca Fix to adapter_storiesonlinenet author from GComyn. 2017-11-26 21:12:15 -06:00
Jim Miller
6cf077afec Bump Test Version 2.19.1 2017-11-18 15:45:20 -06:00
Jim Miller
fb8c9c13be Strip commas from numChapters in CLI for urlchaptercount. 2017-11-18 15:44:35 -06:00
Jim Miller
f059007bbb Bump Release Version 2.19.0 2017-11-13 10:46:12 -06:00
Jim Miller
4d38a0e099 Remove site obidala.net, content moved to AO3. 2017-11-11 12:22:15 -06:00
Jim Miller
9641842678 Fix adapter_tolkienfanfiction for site changes, remove strip_chapter_numeral site specific option--use strip_chapter_numbers option. 2017-11-11 11:50:50 -06:00
Jim Miller
8c51c4ca04 Bump Test Version 2.18.5 2017-11-09 18:28:37 -06:00
Jim Miller
29617e878a Reorder default ini files. 2017-11-09 18:27:08 -06:00
Dmitry Snegirev
c82b20143d added adapter_lightnovelgatecom (#247) 2017-11-09 18:18:52 -06:00
Jim Miller
fce47d650e Correct CLI message for is_adult with --non-interactive. 2017-11-08 13:52:10 -06:00
Jim Miller
64376cf174 Bump Test Version 2.18.4 2017-11-08 12:52:25 -06:00
theit8514
012e0ea34f Add cli option --non-interactive (#192) 2017-11-08 12:49:25 -06:00
Jim Miller
2d3d334d9e Bump Test Version 2.18.3 2017-10-27 11:57:42 -05:00
Jim Miller
5989d1d06b adapter_harrypotterfanfictioncom: https default now. 2017-10-27 11:57:10 -05:00
Jim Miller
b47f7599e3 Add (personal.ini) note to User Config link - webservice. 2017-10-27 11:55:34 -05:00
Jim Miller
82ed9fb43f adapter_webnovelcom: normalize story URL. 2017-10-22 12:04:35 -05:00
Jim Miller
5c281f6ade Bump Test Version 2.18.2 2017-10-19 14:33:13 -05:00
Jim Miller
a19331e311 CLI: Fix problem with --download-list from plugin CLI, move StringIO for passed INIs down to point of use. 2017-10-19 14:26:25 -05:00
Jim Miller
cef419b574 Make plugin called as CLI defaults.ini behavior match help text--only use passed defaults.ini. 2017-10-19 13:08:38 -05:00
Jim Miller
fc61de611f cli.py wasn't logging correctly. Remove rather than add new output. 2017-10-19 13:08:38 -05:00
Chris Braun
9c939c039b adapter_webnovelcom: Fix escaping of stories with richly formatted (HTML) chapter content, the second (#239) 2017-10-19 11:48:10 -05:00
Jim Miller
98b630e8f8 Bump Test Version 2.18.1 2017-10-18 09:54:49 -05:00
Chris Braun
0b4787ef3a Fix escaping of stories with richly formatted (HTML) chapter content (#238)
Issue #237
2017-10-18 09:05:58 -05:00
Jim Miller
2cbdea1f8b Not all adapter_royalroadl stories have genre. 2017-10-16 13:02:10 -05:00
Jim Miller
b28f2e8577 Bump Release Version 2.18.0 2017-10-16 09:22:03 -05:00
Jim Miller
b046d16405 Update Translations. 2017-10-16 09:20:25 -05:00
Jim Miller
d56f5bf0eb Update translations 2017-10-13 11:40:13 -05:00
Jim Miller
11032d751b Bump Test Version 2.17.8 2017-10-12 13:17:30 -05:00
Jim Miller
7239e4c43e adapter_ponyfictionarchivenet: Fix for site change. 2017-10-12 13:14:25 -05:00
Jim Miller
3c96870db9 adapter_storiesonlinenet: Fix for premium stories author link. 2017-10-12 13:14:02 -05:00
Jim Miller
2ed5e5e6bb adapter_literotica: Treat 410(removed) same as 404: StoryDoesNotExist 2017-10-12 09:33:30 -05:00
Jim Miller
90490606e5 Bump Test Version 2.17.7 2017-10-02 13:03:21 -05:00
Jim Miller
3acf2ec358 Remove empty chars from ships when doing sort_ships / ships_CHARS 2017-10-02 13:02:58 -05:00
Jim Miller
0cd624a9da Bump Test Version 2.17.6 2017-09-22 13:13:41 -05:00
Jim Miller
6f118f0a1d Add AO3 login back with special call for auth token. 2017-09-22 13:11:49 -05:00
Jim Miller
b3dc95c66c Tweak 'no AO3 login allowed' exception message. 2017-09-21 16:13:20 -05:00
Jim Miller
945d3e16fe Bump Test Version 2.17.5 2017-09-21 11:27:18 -05:00
Jim Miller
31c26ac49d Thread URLs only for xenforo emails, plus cleanup a little. 2017-09-21 11:26:53 -05:00
Chris Braun
e643315532 Fix parsing of author name for webnovel.com (hopefully) (#232) 2017-09-20 13:15:29 -05:00
Jim Miller
20e96aa056 Bump Test Version 2.17.4 2017-09-19 21:02:14 -05:00
Jim Miller
4ad6d02e4d Change some network warning and non-halting error logs to debug due to QQ login showing them in CLI. 2017-09-19 20:59:10 -05:00
Chris Braun
2aa0b5160e Fix webnovel.com escaping again (#231) 2017-09-19 17:37:15 -05:00
Jim Miller
1344c895f3 Bump Test Version 2.17.3 2017-09-16 14:53:24 -05:00
Jim Miller
bf5e95f4c0 Remove AO3 login rather than try to update with extra .json call for token. 2017-09-16 14:52:32 -05:00
Jim Miller
993bd416a2 Bump Test Version 2.17.2 2017-09-15 11:07:16 -05:00
Jim Miller
85f3aaf4f0 Add --json-meta CLI option for --meta-only output, tweak --meta-only output. 2017-09-15 11:04:52 -05:00
Jim Miller
30e7c4aeaf Bump Release Version 2.17.1 2017-09-14 11:58:20 -05:00
Jim Miller
40dc033848 Update Translations. 2017-09-14 11:56:06 -05:00
Jim Miller
0322060c28 Bump Release Version 2.17.0 2017-09-14 11:44:27 -05:00
Jim Miller
18b99f1d36 Bump Test Version 2.16.7 2017-09-13 13:27:39 -05:00
Jim Miller
312a15029f Plugin: Save 'Show Download Options' check in gprefs like dialog geometries. 2017-09-13 13:26:46 -05:00
Jim Miller
8c6e1e00c0 Remove AO3 login from get_urls_from_page--login is failing and it isn't required anymore for 'adult'. 'Restricted' (user-only) won't work. 2017-09-13 12:43:51 -05:00
Jim Miller
2d8b3b2b26 Bump Test Version 2.16.6 2017-09-11 15:14:34 -05:00
Jim Miller
40739b87d2 Save commented out save-cache-on-fetch used for debugging. 2017-09-11 15:13:58 -05:00
Jim Miller
6980f0b990 CLI: Show chapter numbers with -m option dump. 2017-09-11 15:11:58 -05:00
Jim Miller
d9d776e8d2 base_xenforo: Fix for extended chars in threadmark chapter names. 2017-09-11 15:08:33 -05:00
Jim Miller
6beb0bf2d3 Fixes for QQ and AH changes. 2017-09-11 12:28:31 -05:00
Jim Miller
53b75670b4 Bump Test Version 2.16.5 2017-09-07 16:04:54 -05:00
Jim Miller
ae06a9b706 base_xenforo: Couple small fixes for corner cases. 2017-09-07 15:39:31 -05:00
Jim Miller
91b20b571b Bump Test Version 2.16.4 2017-09-06 14:33:18 -05:00
David
e3f027df84 adapter_literotica: Build the chapter a little better, especially if there are multiple pages (#225)
Had some leftover br handling that didn't work. And now combine
multiple pages into one before using the allow_replace_br_with_p if it
is enabled. And wanted to use the FFF version of this as it handles other
tags better than I could write.
2017-09-06 14:25:07 -05:00
Jim Miller
aa9d3066e9 Bump Test Version 2017-09-05 09:36:32 -05:00
Jim Miller
5cd9ea69c4 base_xenforo: Use '/posts/' not 'post' to find thread vs post URLs. 2017-09-03 11:02:53 -05:00
Jim Miller
d55b35acb6 Bump Test Version 2017-08-30 14:34:19 -05:00
Jim Miller
d10c32466b Add sectionUrl metadata and mechanism for internally normalizing story URLs *and* [story URL] sections. For base_xenforo when including thread titles in storyUrl. Doesn't affect base_xenforo *post* URLs. 2017-08-30 14:31:28 -05:00
Jim Miller
8b981aa938 base_xenforo: Move skip_threadmarks_categories to save a fetch if skipping anyway. Will also affect minimum_threadmarks. 2017-08-30 14:27:30 -05:00
Jim Miller
dfe896f4cd base_xenforo: Don't include thread title in *chapter* url,
performance impact is smaller and keeps from marking a bunch of stuff (new).
2017-08-30 14:26:16 -05:00
Jim Miller
07a32dc934 Bump Test Version 2017-08-30 10:50:50 -05:00
Jim Miller
1159fd0038 Fix for base_xenforo URL w/o title. 2017-08-30 10:32:34 -05:00
Jim Miller
6c22129327 Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2017-08-30 10:22:49 -05:00
Chris Braun
715de7549a Remove webnovel.com manual escaping of HTML entities, the website now seems to take care of it (#224) 2017-08-30 10:22:45 -05:00
Jim Miller
16679fa064 Comment out some debug output. 2017-08-29 21:27:09 -05:00
Jim Miller
d5e09a7dd7 Use thread-title URL for storyUrl with base_xenforo to save redirect fetches. 2017-08-29 21:17:33 -05:00
Jim Miller
15b3a02edb Add delays for base_xenforoforum_adapter. 2017-08-29 11:02:02 -05:00
Jim Miller
4d36f24d37 Bump Release Version 2017-08-18 12:25:29 -05:00
Jim Miller
4e2b14b566 Bump Test Version 2017-08-17 10:45:43 -05:00
Jim Miller
6c53f49e5e Restore sycophanthex.com sites. 2017-08-16 10:39:08 -05:00
Jim Miller
ee69ddfd74 Bump Test Version 2017-08-15 15:10:54 -05:00
Jim Miller
666cf0aee8 Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2017-08-15 15:10:37 -05:00
Chris Braun
a5ec069da4 Fix for gravitytales.com story (#219)
https://www.mobileread.com/forums/showpost.php?p=3568239&postcount=2395
2017-08-15 15:10:31 -05:00
Jim Miller
f233ab66f6 Update translations--metadata only. 2017-08-14 12:23:37 -05:00
Jim Miller
22005b454e Bump Test Version 2017-08-14 12:12:23 -05:00
Jim Miller
a3ab3e0133 AO3 - drop out of use_view_full_work when missing chapters. 2017-08-14 12:12:02 -05:00
Jim Miller
b48e12719e Bump Test Version 2017-08-10 16:50:32 -05:00
Jim Miller
39782362d0 Add AO3 inspiredlinks after last chapter. 2017-08-10 16:50:32 -05:00
Chris Braun
1a9f91cd04 Escape webnovel.com chapter texts for HTML (#216) 2017-08-10 16:45:57 -05:00
Jim Miller
2d47a0aff6 Bump Test Version 2017-08-08 22:35:08 -05:00
Jim Miller
a55dee9a98 Fix AO3 always_login and bookmarktags/bookmarksummary site metadata. 2017-08-08 22:34:44 -05:00
Jim Miller
1f5c3bbddd Save AO3 'associations' link 'inspired by' links. 2017-08-08 19:37:47 -05:00
Jim Miller
e3a680d592 Limit adapter_wuxiaworldcom removed links to prev/next chapter. 2017-08-08 18:59:09 -05:00
Jim Miller
e2a718be1a Don't cache metadata list while building from include_in_ -- Calibre version causes problems with removeallentities=True vs False cache. 2017-08-08 18:58:32 -05:00
Jim Miller
da8bd87700 Fix typo. 2017-08-08 18:55:52 -05:00
Jim Miller
0030aebc05 Bump Test Version 2017-08-07 20:25:20 -05:00
Chris Braun
a2ea915aa5 Fixes for webnovel.com site changes (PR #206)
* Fix scraping of last updated date for webnovel.com

* Remove outdated comment

* Use API endpoints to get regular- and VIP-chapter text
2017-08-07 20:23:50 -05:00
Jim Miller
b6df554777 Make default for fanfic.hu https. 2017-08-07 20:15:53 -05:00
botmtl
83c4416c46 allow https in getSiteURLPattern for FanficHuAdapter (ref:https://www.mobileread.com/forums/showpost.php?p=3563217&postcount=2375) (#209) 2017-08-07 20:15:11 -05:00
Jim Miller
1a5e1893ed Calibre image processing chokes on SVG images. 2017-08-07 11:14:32 -05:00
Jim Miller
ec100bcf85 Bump Test Version 2017-08-05 19:54:01 -05:00
Jim Miller
f39c9d49f8 Fix xenforo threadmarks change for SB/SV. 2017-08-05 19:52:00 -05:00
Jim Miller
86fcfc033f Bump Test Version 2017-07-30 09:13:49 -05:00
Jim Miller
e5d64f770f Comment out bookmarks for AO3 temporarily. 2017-07-30 09:13:29 -05:00
Jim Miller
35ea240ab3 Explicitly (instead of implicitly) set is_adult:false in defaults.ini. 2017-07-29 08:40:25 -05:00
Jim Miller
f6b637739f Calibre removed sanitize_html function. 2017-07-29 08:39:44 -05:00
Jim Miller
8018402a80 Bump Test Version 2017-07-27 18:03:09 -05:00
Jim Miller
d5a596fdce Fix for adapter_asianfanficscom caching vs login issue. 2017-07-27 18:02:33 -05:00
Jim Miller
12c887766a Bump Test Version 2017-07-26 16:50:54 -05:00
Jim Miller
183cc57f31 Remove adapter_gravitytalescom no feedparser warning--level isn't set before imports. 2017-07-26 16:50:41 -05:00
Jim Miller
9b4a5c95bd Bump Test Version 2017-07-26 16:46:48 -05:00
Jim Miller
a692157363 Reduce adapter_gravitytalescom no feedparser warning to debug. 2017-07-26 16:41:53 -05:00
Jim Miller
d2ec906381 Bump Release Version 2017-07-26 16:31:06 -05:00
Jim Miller
d8328985f7 Fix to adapter_asexstoriescom for site change. 2017-07-20 11:13:46 -05:00
Jim Miller
24c76ea808 adapter_finestoriescom and adapter_storiesonlinenet share code, but they need different Themes set now. 2017-07-20 10:58:51 -05:00
Jim Miller
e0691d729f adapter_fanficauthorsnet replace 'In progress' with 'In-Progress' to match standard. 2017-07-20 10:17:36 -05:00
Jim Miller
f2e63f3057 Bump Test Version 2017-07-20 10:03:44 -05:00
Chris Braun
cd2029912e Fix published and update date for gravitytales.com adapter - import feedparser (#204) 2017-07-19 17:22:26 -05:00
Jim Miller
7adc1b4d54 Consolidate and aggregate times for perf prof. 2017-07-19 17:19:13 -05:00
Jim Miller
09c684d744 Comment out some debugs. 2017-07-19 12:36:54 -05:00
Jim Miller
56668c08e7 br2p - don't show time if skipping. 2017-07-19 12:25:48 -05:00
Jim Miller
06ab7f9e06 Adding replace_failed_smilies_with_alt_text option for base_xenforo. 2017-07-19 12:19:32 -05:00
Jim Miller
a2fdb3b775 Adding replace_failed_smilies_with_alt_text option for base_xenforo. 2017-07-19 12:18:59 -05:00
Jim Miller
78c9f801a6 Adding replace_broken_smilies--needs completion yet. 2017-07-19 11:40:26 -05:00
Jim Miller
6da54ba6c4 Adding replace_broken_smilies--needs completion yet. 2017-07-19 11:39:41 -05:00
Jim Miller
b3a32ae240 Add markers and check to prevent replace_br_with_p running more than once on the same text. 2017-07-19 11:20:14 -05:00
Jim Miller
c243b1db3e Make utf8FromSoup() copy soup to avoid side effects. Plus time reporting. 2017-07-19 11:20:14 -05:00
Jim Miller
732d6603d8 First fix for replace_br_with_p, prevent more <div> nesting, is good. DON'T try to remove previously nested div tags. 2017-07-19 11:20:01 -05:00
Jim Miller
fc71ae0848 Fix replace_br_with_p creating nested div tags, remove ones added previously. Plus time reporting. 2017-07-19 11:20:01 -05:00
Jim Miller
cfc3e91b61 Fixes for // problem in images. 2017-07-19 11:20:01 -05:00
Jim Miller
c672c6f5f1 Bump Test Version 2017-07-17 20:46:56 -05:00
Jim Miller
c6a8f531df Fix for AO3 login change. 2017-07-17 20:44:27 -05:00
Jim Miller
64b4ea1444 Bump Test Version 2017-07-17 10:29:31 -05:00
Jim Miller
177d176ab4 Restore adapter_dramioneorg after site came back. 2017-07-17 10:28:39 -05:00
Jim Miller
aa39d72a8d Bump Release Version 2017-07-08 22:57:05 -05:00
Jim Miller
40e4a5824c Bump Test Version 2017-07-06 15:40:28 -05:00
Jim Miller
8223686f65 Fixes for fanfiktion.de site changes. 2017-07-06 14:00:23 -05:00
Jim Miller
23221bc141 Remove 9 defunct sites. 2017-07-05 20:06:02 -05:00
Jim Miller
115abbafc2 XenForo alternatehistory.com follows QQ more than SV/SB. 2017-07-05 19:55:28 -05:00
Jim Miller
e5a5cafde7 Fix typo in error catching. 2017-07-05 19:52:42 -05:00
Jim Miller
9dc9dadf9d Update translations. 2017-07-05 18:21:32 -05:00
Jim Miller
a40534b7be Bump Test Version 2017-07-04 10:39:02 -05:00
FaceDeer
4bb3ed111d fimfiction views/total_views (PR #200)
* missed the "st" ordinal case when parsing dates

* add option to collect Fimfiction author's notes

* fimfiction adapter: views/total_views being read incorrectly
2017-07-04 10:37:23 -05:00
Jim Miller
4850f4fd7c Bump Test Version 2017-07-02 15:39:42 -05:00
Jim Miller
1da7f6ec15 Fix for XenForo(SB/SV) threadmarks with non-ascii chars. 2017-07-01 22:47:20 -05:00
Jim Miller
e2f2b5ed8c Bump Test Version 2017-06-28 12:57:54 -05:00
Jim Miller
3fb308b051 Save optional dev cli cache saving feature. 2017-06-28 12:52:27 -05:00
Jim Miller
6fc5c791bb Remove dev version of cli with cache saving. 2017-06-28 12:45:01 -05:00
Jim Miller
8c93009b51 Fix plugin config check for _filelist. 2017-06-28 10:16:40 -05:00
Jim Miller
afdb1a7fb2 Bump Test Version 2017-06-27 18:12:48 -05:00
Jim Miller
d8c1ef93e0 Changes to base_xenforoforum_adapter to use threadmarks.rss for SV/SB, separate threadmark collection for QQ.
Also add show_chapter_authors option.
2017-06-27 16:14:24 -05:00
Jim Miller
18fd2682bd Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2017-06-27 10:35:12 -05:00
FaceDeer
317a5ac9a2 Add option to collect Fimfiction author notes (#197)
* Add `include_author_notes` option to collect Fimfiction author's notes
2017-06-27 10:35:05 -05:00
Jim Miller
21645fa911 Bump Test Version 2017-06-25 11:35:07 -05:00
Jim Miller
a5a7ddf5d4 Fix for XenForo change on SB & SV--not on QQ. 2017-06-25 11:34:35 -05:00
Jim Miller
f9d4b443b5 Bump Release Version 2017-06-22 15:26:05 -05:00
Jim Miller
d9e8fba7b0 Update translations. 2017-06-22 14:16:22 -05:00
Jim Miller
2c8b22c767 Bump Test Version 2017-06-22 10:13:30 -05:00
David
95b9dde621 finestories.com using Modern theme rather than Classic. (#195)
finestories.com seems to have stopped using the Classic theme and is only
using the Modern.

Also a lot of refactoring to continue sharing code between storiesonline
and finestories.
2017-06-22 09:53:31 -05:00
FaceDeer
01f3189aa7 adapter_fimfictionnet -- missed the "st" ordinal case when parsing dates (#194) 2017-06-22 09:51:35 -05:00
Jim Miller
24c0304aa5 Bump Test Version 2017-06-20 14:43:27 -05:00
Jim Miller
5972a4fdbb Merge pull request #193 from Etana/master
Fix for Webnovel.com broken get category
2017-06-20 14:38:37 -05:00
Ea
a4b151190f Webnovel.com broken get category
After site update, structure around category changed from:

```
<div>
  <h2>Title <small>ABBR</small></h2>
  <p class="_tags"><strong class="ttc">Category</strong>...</p>
  ...
</div>
```

to:

```
<div>
  <h2>Title <small>ABBR</small></h2>
  <p class="_boost">Tag1 · Tag2 · ...</p>
  <p class="_tags"><strong class="ttc">Category</strong>...</p>
  ...
</div>
```

so getting a story failed.
2017-06-20 21:19:45 +02:00
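The commit above shows why matching the category paragraph by position broke once the `_boost` line was inserted. A stdlib-only sketch of a position-independent lookup (the class names `_tags`/`ttc` come from the commit; the parser class itself is hypothetical, not FanFicFare's actual code):

```python
from html.parser import HTMLParser

class CategoryExtractor(HTMLParser):
    """Pull the text of <strong class="ttc"> inside <p class="_tags">,
    matching by class rather than by element position, so an extra
    <p class="_boost"> does not break the lookup."""
    def __init__(self):
        super().__init__()
        self.in_tags_p = False
        self.in_ttc = False
        self.category = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "p" and "_tags" in classes:
            self.in_tags_p = True
        elif tag == "strong" and self.in_tags_p and "ttc" in classes:
            self.in_ttc = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_tags_p = False
        elif tag == "strong":
            self.in_ttc = False

    def handle_data(self, data):
        if self.in_ttc and self.category is None:
            self.category = data.strip()

html_doc = """
<div>
  <h2>Title <small>ABBR</small></h2>
  <p class="_boost">Tag1 · Tag2</p>
  <p class="_tags"><strong class="ttc">Category</strong> more</p>
</div>
"""
parser = CategoryExtractor()
parser.feed(html_doc)
print(parser.category)  # -> Category
```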
Jim Miller
39ec2405c6 Bump Test Version 2017-06-18 10:08:32 -05:00
Jim Miller
e453ade9cb Allow multiple range URLs (url[1-5]) for same story in one download. 2017-06-18 10:08:27 -05:00
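The `url[1-5]` syntax in this commit appends a chapter range to a story URL. A hedged sketch of parsing that form (the helper name and return shape are illustrative assumptions, not FanFicFare's implementation):

```python
import re

def split_chapter_range(url):
    """Strip a trailing [first-last] chapter range, e.g. story_url[1-5],
    returning (bare_url, first, last); (url, None, None) when absent."""
    m = re.match(r"^(.*)\[(\d+)-(\d+)\]$", url)
    if not m:
        return url, None, None
    return m.group(1), int(m.group(2)), int(m.group(3))

print(split_chapter_range("https://example.com/story/123[1-5]"))
# -> ('https://example.com/story/123', 1, 5)
```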
Jim Miller
0bac261ecb Bump Test Version 2017-06-18 09:20:20 -05:00
Jim Miller
68fee12d33 Merge pull request #191 from davidfor/master
Small change in series page
2017-06-18 09:18:25 -05:00
David
ef109b070c Small change in series page 2017-06-18 14:07:11 +10:00
Jim Miller
4ca49070e3 Bump Test Version 2017-06-17 15:58:57 -05:00
Jim Miller
4382451fa6 Change adapter_fimfictionnet to use makeDate for non-USEnglish locales. 2017-06-17 15:50:14 -05:00
Jim Miller
3ae9b0f86b Bump Test Version 2017-06-15 15:30:32 -05:00
Jim Miller
99482f9bda Add keep_prequel_in_description option for fimfiction.net. 2017-06-15 15:30:11 -05:00
Jim Miller
4ce18ca831 http->https for adapter_midnightwhispers 2017-06-14 12:18:49 -05:00
Jim Miller
26ceb442f5 Bump Test Version 2017-06-14 12:11:02 -05:00
Jim Miller
9e378b7dfa Fix base_xenforoforum_adapter for QQ--it doesn't have threadmark categories or reader mode. 2017-06-14 00:52:45 -05:00
Jim Miller
b156fa4914 Fix fimfiction datePublished (Merge pull request #190 from FaceDeer/master)
fimfiction pubdate tag update
2017-06-13 23:59:07 -05:00
FaceDeer
2f65e2ccce pubdate tag's class changed 2017-06-13 22:13:33 -06:00
FaceDeer
362d3e7959 Merge remote-tracking branch 'refs/remotes/JimmXinu/master' 2017-06-13 22:12:54 -06:00
FaceDeer
3acb60251f Merge remote-tracking branch 'refs/remotes/JimmXinu/master' 2017-06-13 21:46:47 -06:00
Jim Miller
49fb2552e1 Bump Test Version 2017-06-13 10:19:52 -05:00
Jim Miller
18c0429868 Update adapter_webnovelcom for site changes. Thanks, Ser4nb2LUY6e 2017-06-13 10:10:05 -05:00
Jim Miller
74f8161e18 Bump Test Version 2017-06-12 19:28:52 -05:00
Jim Miller
239dc88d86 Correct an error log message. 2017-06-12 19:27:19 -05:00
Jim Miller
1b4a1241c8 Update translations. 2017-06-12 19:26:55 -05:00
Jim Miller
84dc04bb15 Extend base_xenforoforum_adapter Reader Mode to other Threadmark Categories. 2017-06-11 13:20:01 -05:00
Jim Miller
1078279c97 Change adapter_fanfiktionde to https. 2017-06-10 10:40:55 -05:00
Jim Miller
5066ac07e0 Special error msg for storiesonline.net about Listing Theme. 2017-06-09 17:54:08 -05:00
Jim Miller
9cda10e255 Tweak skip_threadmarks_categories comments in INI. 2017-06-09 17:53:15 -05:00
Jim Miller
7ed6a982b7 Bump Test Version 2017-06-08 10:19:11 -05:00
Jim Miller
ad5e46e024 Merge pull request #187 from FaceDeer/fimfiction-groups
Fimfiction groups and author login date
2017-06-08 09:36:05 -05:00
FaceDeer
15ac85a7fb fallback for broken lastlogin data
about 20% of the time the data-time attribute is simply absent. I haven't
figured out a pattern as to why (it seems to come and go for any given
story), but the date is still present in the title so it can be parsed
out in that case.
2017-06-07 21:10:17 -06:00
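FaceDeer's fallback above can be sketched as: prefer the machine-readable `data-time` epoch, else parse the date out of the `title` text. The attribute names come from the commit; the title date format and the helper itself are illustrative assumptions:

```python
from datetime import datetime, timezone

def parse_lastlogin(attrs):
    """Prefer the data-time epoch; fall back to parsing the title text.
    The "%d %b %Y" format is an assumed example, not Fimfiction's exact one."""
    if attrs.get("data-time"):
        return datetime.fromtimestamp(int(attrs["data-time"]), tz=timezone.utc)
    return datetime.strptime(attrs["title"], "%d %b %Y")

print(parse_lastlogin({"title": "7 Jun 2017"}))
# -> 2017-06-07 00:00:00
```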
FaceDeer
8912f4ee18 button for full list of groups when >8 groups has moved 2017-06-07 19:25:43 -06:00
Jim Miller
a3424f816b Bump Test Version 2017-06-06 09:29:11 -05:00
Jim Miller
d95cde81d9 Fix for get-story-urls from page, affected adapter_adultfanfictionorg. 2017-06-05 22:26:20 -05:00
Jim Miller
9275858d68 Update adapter_fimfictionnet for site changes. 2017-06-05 22:08:46 -05:00
Jim Miller
dd8beff697 Add skip_threadmarks_categories option to base_xenforoforum_adapter. 2017-06-05 10:27:35 -05:00
Jim Miller
1630f0a6f7 Bump Test Version 2017-06-04 21:55:27 -05:00
Jim Miller
1a9fcb137c Update Translations 2017-06-04 21:55:16 -05:00
Jim Miller
60110721c8 Merge pull request #185 from cryzed/fix-184
Fix Issue #184 (RoyalRoadL Adapter)
2017-06-04 17:00:32 -05:00
cryzed
0170ed5d0e Fix author parsing 2017-06-04 22:33:31 +02:00
cryzed
927e709d78 Fix #184 2017-06-04 22:11:36 +02:00
Jim Miller
e7e71cfd0a Update translations. 2017-05-31 22:14:31 -05:00
Jim Miller
55e29802c6 Bump Test Version 2017-05-31 22:13:41 -05:00
Jim Miller
06ae034a0c Change AO3 to https, normalize chapter URLs, remove view_adult from chapter URLs. 2017-05-31 22:11:43 -05:00
Jim Miller
06fc6e7992 Apply minimum_threadmarks in base_xenforoforum based on all threadmarks, not marks per category. 2017-05-31 10:30:36 -05:00
Jim Miller
b6fd122b22 Reduce debug output from replace_br_with_p. 2017-05-28 11:35:57 -05:00
Jim Miller
65388be613 Bump Test Version 2017-05-28 10:21:20 -05:00
Jim Miller
b27152e12b Add apocrypha_to_omake option for base_xenforoforum_adapter threadmark category. 2017-05-27 17:33:06 -05:00
Jim Miller
9c0dca6d35 Add rating for starslibrarynet, fix [www.starslibrary.net] as section name. 2017-05-27 17:32:23 -05:00
Jim Miller
641c98cb7c Add comma_entries to valid config list. 2017-05-26 18:23:53 -05:00
Jim Miller
d62e4469c0 Bump Test Version 2017-05-26 12:10:43 -05:00
Jim Miller
9694b1c919 Update translations. 2017-05-26 12:10:20 -05:00
Jim Miller
87da41ab84 wattpad - add defaults slow_down_sleep_time:2 and comma reads. 2017-05-26 11:53:35 -05:00
Jim Miller
5202e8ef1e wattpad - need to pass utf8FromSoup a soup... 2017-05-26 11:52:48 -05:00
Jim Miller
44af744fc8 Add comma_entries option to add commas. 2017-05-26 11:51:04 -05:00
Jim Miller
2c5335764e Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2017-05-25 23:44:59 -05:00
Jim Miller
dd75218db8 Merge pull request #183 from botmtl/wattpad
wattpad - Separate category from tags, include tags in genre by default, use larger cover image.
2017-05-25 23:44:47 -05:00
botmtl
556b8acdcd isolated category from tags
bigger looking cover
made the requested changes in default.ini and plugin-defaults.ini
2017-05-25 23:01:25 -04:00
Jim Miller
2a889d576d Bump Test Version 2017-05-25 15:47:22 -05:00
Jim Miller
161f0f2b54 adapter_storiesofardacom - td->div in description, catch parse exception after bad html in description. 2017-05-25 15:45:40 -05:00
Jim Miller
f7875f3619 Adapters' getSiteExampleURLs() value has to be able to pass their own getSiteURLPattern() for geturls.get_urls_from_page() to work correctly. 2017-05-25 15:08:42 -05:00
Jim Miller
c04e1e9ff5 Bump Test Version 2017-05-24 20:25:25 -05:00
Jim Miller
6f149521c3 Add xenforoforum Categorized threadmarks after regular threadmarks. 2017-05-24 20:24:11 -05:00
Jim Miller
622f4d7dd1 Take image .ext from end of whole URL if not found at end of path. Taking from content-type on response would be difficult. 2017-05-24 20:21:55 -05:00
Jim Miller
e6ca5da2c9 Fix processing for <center> <u> etc, soup.recursiveChildGenerator() not working anymore. Should be okay using findall with soup&re-soup in make_soup. 2017-05-24 20:20:14 -05:00
Jim Miller
b44be902d7 Bump Test Version 2017-05-21 10:41:27 -05:00
Jim Miller
635216e93e Add a comment & some space clean up. 2017-05-21 10:40:01 -05:00
Jim Miller
433317dad2 Tweaks to wattpadcom - utf8FromSoup() for image support. 2017-05-21 10:39:33 -05:00
Jim Miller
36bfea9cdb Merge pull request #181 from botmtl/wattpad
(wattpad) Styling corrections for pull request #180
2017-05-21 10:25:41 -05:00
Jim Miller
1f003ea591 Fix for asianfanficscom site change. 2017-05-21 10:20:40 -05:00
botmtl
bbdf2fb003 Removed most of the logging (kept 4, one being the big one to API_STORY_INFO)
All str.replace calls now use the same method.
2017-05-20 20:28:40 -04:00
Jim Miller
973cabd18d Merge pull request #180 from botmtl/wattpad
Wattpad adapter
2017-05-20 12:41:52 -05:00
botmtl
dd770ff29c Wattpad adapter 2017-05-20 12:57:57 -04:00
Jim Miller
c404f9dca8 Bump Release Version 2017-05-20 10:50:28 -05:00
Jim Miller
ce0b25c1bc Fix img cachedfetch for referer. 2017-05-20 10:39:07 -05:00
Jim Miller
826f0f6a6b Bump Release Version 2017-05-17 12:14:23 -05:00
Jim Miller
afd08fcd8b Bump Test Version 2017-05-15 19:39:27 -05:00
Jim Miller
9ac132f445 Update Translations. 2017-05-15 19:37:08 -05:00
Jim Miller
724796832e adapter_wwwaneroticstorycom: 'Complete'->'Completed' 2017-05-15 19:36:52 -05:00
Jim Miller
dddb771d79 Catch other common complete/in-progress statuses for calibre yes/no columns. 2017-05-15 19:32:47 -05:00
Jim Miller
ad328e8a9d Change fanficauthors.net to https. 2017-05-15 19:24:04 -05:00
Jim Miller
a299d6d44a Update translations. 2017-05-11 11:15:19 -05:00
Jim Miller
1c0c7cbbe3 Bump Test Version 2017-05-09 18:19:32 -05:00
Jim Miller
a2062c687f One-off normalize Reject List URLs to save doing it all the time. Also, automatically save plugin version in prefs. 2017-05-09 18:19:04 -05:00
Jim Miller
e00a280528 Remove a debug output from adapter_fanficauthorsnet 2017-05-09 14:13:17 -05:00
Jim Miller
354976e85a Reorder sections in defaults.ini files alphabetically, save script that does it. 2017-05-09 13:15:41 -05:00
Jim Miller
a490aa7c7a Bump Test Version 2017-05-09 11:46:08 -05:00
Jim Miller
ca5036792f Let PI search inside zip(html) and txt formats for story URLs. 2017-05-09 11:45:38 -05:00
Jim Miller
ce6e135dfa Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2017-05-07 18:23:12 -05:00
Jim Miller
526bb36dcc Merge pull request #179 from PlushBeaver/masseffect2in-redesign
Adapt MassEffect2.in adapter to new site layout
2017-05-07 18:23:18 -05:00
Dmitry Kozlyuk
d5080b1e3d Count chapter parts as separate chapters for MassEffect2.in
Conditional update relies on chapter count, so stick to the number
of actual posts.
2017-05-08 01:46:00 +03:00
Dmitry Kozlyuk
a35f861a19 Adapt to masseffect2.in redesign 2017-05-08 01:45:49 +03:00
Jim Miller
1d2a00feab Remove debug output of dependency versions. 2017-05-07 13:26:27 -05:00
Jim Miller
f4da821ab8 Fix error with std_cols_newonly if user has never saved config. 2017-05-05 18:53:06 -05:00
Jim Miller
d6c2b40ad7 Clear extratags: for tgstorytime.com and fictionmania.tv. 2017-05-05 09:54:33 -05:00
Jim Miller
611076bf6d Bump Test Version 2017-05-01 10:44:05 -05:00
Jim Miller
006821724e Fix AO3 use_view_full_work feature for 1 chapter works. 2017-05-01 10:43:36 -05:00
Jim Miller
808c0b7b49 Bump Test Version 2017-04-29 20:04:17 -05:00
Jim Miller
118360102d Debug output of depend package versions. 2017-04-29 20:01:17 -05:00
Jim Miller
0eed84d3ff Don't include html5lib and six in PI zip--uses calibre's versions anyway. 2017-04-29 20:01:17 -05:00
Jim Miller
15c2ab636f Add webencodings to included_dependencies. Needed by web service with newer html5lib. 2017-04-29 20:01:17 -05:00
Jim Miller
d2feac0c66 Add chardet_confidence_limit option for 'auto' encoding setting. 2017-04-29 20:01:17 -05:00
Jim Miller
7f4bc5c36e Update html2text to (2016, 9, 19). 2017-04-29 20:01:16 -05:00
Jim Miller
db7777b161 Update chardet to 3.0.2. 2017-04-29 20:01:16 -05:00
Jim Miller
c5da1af470 Update six.py to 1.10.0. 2017-04-29 20:01:16 -05:00
Jim Miller
ba4c0b99a5 Update to BeautifulSoup 4.5.3. 2017-04-29 20:01:16 -05:00
Jim Miller
e085077e82 Update html5lib from 0.9x7 to 0.9x9. 2017-04-29 20:01:16 -05:00
Jim Miller
5afac3ce5b Add AO3 feature use_view_full_work -- true by default. 2017-04-29 17:11:40 -05:00
Jim Miller
33c33b602f Fix CLI -f option help for text vs txt. 2017-04-29 16:43:58 -05:00
Jim Miller
273ca6bcd7 Fix for PI suppressauthorsort/suppresstitlesort interacting wrong with Author/Title New Only. 2017-04-28 17:57:40 -05:00
Jim Miller
20cea788b6 Add site specific extracategories to new sites. 2017-04-28 15:37:28 -05:00
Jim Miller
3e897e7758 Add alternatehistory.com as a base_xenforoforum_adapter. Plus base_xenforoforum_adapter fixes. 2017-04-28 14:28:51 -05:00
Jim Miller
3464134aa8 Bump Test Version 2017-04-26 10:27:24 -05:00
Jim Miller
8a44ac2a65 New site: inkbunny.net - thanks, GComyn 2017-04-26 10:16:27 -05:00
Jim Miller
178c207c19 Add calibre_series_meta optional feature to include series metadata like calibre in epubs. 2017-04-23 19:14:47 -05:00
Jim Miller
fec9241932 Bump Test Version 2017-04-23 13:02:50 -05:00
Jim Miller
569e3c2324 Add sites lcfanfic.com and noveltrove.com -- thanks GComyn. 2017-04-23 13:00:31 -05:00
Jim Miller
d93aff207c Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2017-04-23 12:58:04 -05:00
Jim Miller
2bf6335073 Merge pull request #176 from cryzed/fix-webnovel-adapter
Improve webnovel adapter
2017-04-23 12:57:27 -05:00
Jim Miller
a77cfe595b Add bookmarktags and bookmarksummary metadata for AO3, requires always_login:true. 2017-04-23 12:38:51 -05:00
cryzed
54f2e43363 Make code compatible with older versions of BeautifulSoup and make locating last updated string more robust 2017-04-23 19:10:07 +02:00
cryzed
e41bb51453 Use relative "ago" time to figure out correct last absolute updated date 2017-04-23 02:54:56 +02:00
cryzed
4b241e7d79 Improve formatting and code style slightly
Remove mixed tabs and spaces, superfluous comment characters and comparison to None using the equality operator (instead of "is")
2017-04-23 02:04:44 +02:00
Jim Miller
6983e6d49f Fixes for adapter_gravitytalescom from GComyn. 2017-04-22 10:44:47 -05:00
Jim Miller
0f2d44152a Update copyright dates on new files. 2017-04-22 10:30:01 -05:00
Jim Miller
977580c5fa Bump Test Version 2017-04-21 14:51:26 -05:00
Jim Miller
9397b4f064 Tweak to version updating script for test versions. 2017-04-21 14:51:07 -05:00
Jim Miller
6bdfb870cd New sites from GComyn. 2017-04-21 14:14:04 -05:00
Jim Miller
bfecebbc80 Don't default remove chapter for webnovel in chapter_strip. 2017-04-21 12:48:10 -05:00
Jim Miller
576bb310a1 Bump test version. 2017-04-21 12:44:50 -05:00
Jim Miller
3e6ae39419 Add Story Notes to base_efiction_adapter. 2017-04-21 12:40:53 -05:00
Jim Miller
3eaf5a60a8 Don't send Referer:None -- hpfanficarchive.com doesn't like it. 2017-04-21 12:24:17 -05:00
Jim Miller
0094e3be97 Fixes to webnovel.com, chapter->Chapter, add desc. Thanks GComyn. 2017-04-21 10:58:26 -05:00
Jim Miller
2c103e03cb Additional tags collected for wuxiaworld--thanks GComyn 2017-04-21 10:42:35 -05:00
Jim Miller
4d3054ad8e Add download.archiveofourown.org for AO3 due to old downloaded AO3 epubs. 2017-04-21 10:38:00 -05:00
Jim Miller
a452163152 Fix for author including <b>Updated</b> in desc in adapter_ashwindersycophanthexcom. 2017-04-20 12:12:44 -05:00
Jim Miller
97f7eb1564 Merge branch 'master' of https://github.com/JimmXinu/FanFicFare 2017-04-18 17:17:50 -05:00
Jim Miller
9c1a0d09a1 Merge pull request #174 from cryzed/fix-header-issue
Fix "got more than 100 headers" issue (adapter_royalroadl.py only).
2017-04-18 17:17:29 -05:00
cryzed
9a6ad62771 Define a dummy httplib_max_headers context manager if a httplib module without the _MAXHEADERS attribute is used 2017-04-18 22:50:11 +02:00
cryzed
387fbd2276 Merge remote-tracking branch 'origin/master' into fix-header-issue 2017-04-18 21:50:22 +02:00
Jim Miller
62ad41c4ed Default webnovel.com to remove leading 'chapter '. 2017-04-18 13:43:47 -05:00
cryzed
9c6395b759 Isolate change of httplib._MAXHEADERS to getChapterText() 2017-04-18 18:47:50 +02:00
Jim Miller
aebd8c89e5 New site www.webnovel.com - thanks GComyn 2017-04-17 14:15:28 -05:00
Jim Miller
b36f41335a Fix ffnet referer for cover images. 2017-04-17 14:13:43 -05:00
cryzed
f5a627e008 Fix "got more than 100 headers"-issue 2017-04-17 19:45:04 +02:00
Jim Miller
4976d3fd9f Bump test version. 2017-04-16 17:08:46 -05:00
Jim Miller
fe195f7808 Fix for a mistaken Norwegian Bokmål translation. 2017-04-16 17:07:35 -05:00
Jim Miller
dbd9e745e5 Bump release version. 2017-04-14 12:41:56 -05:00
Jim Miller
e8a5dc2be7 Add pipe separator to tagsfromtitle in base_xenforo. 2017-04-13 22:38:49 -05:00
Jim Miller
31502261f0 Bump test version. 2017-04-10 13:03:27 -05:00
Jim Miller
51efff5a39 Remove defunct sites: www.restrictedsection.org, lucifael.com, onedirectionfanfiction.com, samdean.archive.nu, hpfandom.net, ficsite.com, sinful-desire.org, thehexfiles.net. 2017-04-10 13:02:25 -05:00
Jim Miller
441bc58dfb Bump test version. 2017-04-06 13:35:13 -05:00
Jim Miller
22d9fc6eac Merge pull request #172 from oh444555/master
adapter_asianfanficscom fix
2017-04-06 13:33:49 -05:00
oh444555
3f93ec2b42 adapter_asianfanficscom fix 2017-04-06 20:04:38 +02:00
Jim Miller
44f457a740 Bump test version. 2017-04-06 10:49:06 -05:00
Jim Miller
cd0178030c Merge pull request #171 from davidfor/master
Storiesonline and Literotica updates
2017-04-06 10:46:30 -05:00
David
1d99fc11d7 Add option to use all chapter categories in Literotica stories
Currently only the first category for a multiple chapter story is used.
Option added to use all, but set to default for backwards compatibility.
2017-04-07 00:06:14 +10:00
David
a58d6cc530 For storiesonline, read text from index page
Extend the text put into the notice to all the text on the index page.
This can be used for a preface and might include a cover.
2017-04-07 00:03:58 +10:00
Jim Miller
a7d5f35565 Allow fractional series num in calibre injected series. 2017-04-05 15:35:45 -05:00
Jim Miller
76b1ab3dac Change a few outliers to use status: Completed instead of Complete. 2017-04-05 15:15:34 -05:00
Jim Miller
e8043dd9a1 Bump test version. 2017-04-05 12:47:13 -05:00
Jim Miller
10aadfb907 Add CLI option --progressbar. 2017-04-05 12:44:51 -05:00
Jim Miller
06aebc1707 Add use_archived_author option for archiveofourown.org. 2017-04-05 12:27:21 -05:00
Jim Miller
7b98e41c9b Add <meta charset="UTF-8"> to html output by default. 2017-04-03 10:29:03 -05:00
Jim Miller
2327ada979 Add titleHTML to mirror authorHTML for Tanjamuse. 2017-04-02 10:35:10 -05:00
Jim Miller
ac7ab2264d Change ncisfiction.net to ncisfiction.com 2017-03-31 13:13:19 -05:00
Jim Miller
24dc3b5fed Bump test version. 2017-03-30 16:03:36 -05:00
Jim Miller
042904676b Update translations. 2017-03-30 16:03:04 -05:00
Jim Miller
52687894ab Tweak _filelist--color coding. 2017-03-30 16:01:16 -05:00
Jim Miller
3c9d92d13d Actually adding _filelist feature. 2017-03-30 15:28:05 -05:00
Jim Miller
d9279b8142 Remove self.decode to defaults.ini, fix use_pagecache(). 2017-03-30 14:10:57 -05:00
Jim Miller
228b94592e Refactor to move fetches to Configuration class plus test version bump. 2017-03-29 18:28:49 -05:00
Jim Miller
537cf41403 Can't skip numChapters - adapter_trekfanfictionnet 2017-03-29 17:24:40 -05:00
Jim Miller
ee4544fdfc Remove defunct sites portkey.org and psychfic.com from defaults.ini. 2017-03-25 17:31:09 -05:00
Jim Miller
1670cdd02d Bump test version. 2017-03-22 10:33:27 -05:00
Jim Miller
b070359f1c Bump test version. 2017-03-22 10:23:38 -05:00
Jim Miller
bbde21ff7d Remove defunct sites portkey.org and psychfic.com 2017-03-22 10:23:38 -05:00
Jim Miller
d457e5f843 Update translations. 2017-03-22 10:18:50 -05:00
Jim Miller
15111c6e13 Change http to https for FimF - thanks baggins41 2017-03-22 10:18:10 -05:00
Jim Miller
32fd9dd892 Bump test version. 2017-03-19 12:27:57 -05:00
Jim Miller
e66bb6def7 Update translations. 2017-03-19 12:27:38 -05:00
Jim Miller
3896d0406e Improve metadata caching for performance. 2017-03-19 12:20:36 -05:00
Jim Miller
20feaef131 Test version bumped 2017-03-15 13:23:25 -05:00
Jim Miller
7f22c4a74a Update translations, Bump test version. 2017-03-14 17:36:11 -05:00
Jim Miller
f6831b07b5 Renamed midnightwhispers.ca domain to midnightwhispers.net. 2017-03-14 17:25:26 -05:00
Jim Miller
7a175ba9d5 Remove unnecessary line that can cause problems with abbreviated ffnet URL. 2017-03-14 10:38:54 -05:00
Jim Miller
7ca8a5eb3b Bump test version. 2017-03-11 11:29:16 -06:00
Jim Miller
cb585e84c1 Fix for base_xenforoforum cached post used more than once. 2017-03-11 11:26:10 -06:00
Jim Miller
b52d493e2a Fix for authors and New Only. 2017-03-11 11:25:45 -06:00
Jim Miller
97a65c90d6 Bump test version. 2017-02-26 11:05:58 -06:00
Jim Miller
d617f24eb6 Update translations. 2017-02-26 11:05:32 -06:00
Jim Miller
12b123d0d5 Allow https in adapter_hpfanficarchivecom 2017-02-25 09:40:49 -06:00
Jim Miller
5cd4f23011 Normalize anthology URLs both from page and from epub. 2017-02-14 13:46:57 -06:00
Jim Miller
8cfa65d01b Bump versions. 2017-02-13 13:26:25 -06:00
Jim Miller
ace3396de3 Update translations. 2017-02-13 13:24:43 -06:00
Jim Miller
82ca30f9eb Fix for storiesonlinenet Daily Limit msg change, from GComyn. 2017-02-05 15:47:43 -06:00
Jim Miller
ab1d3bb70c Bump test version 2017-02-03 21:35:29 -06:00
Jim Miller
de06b8831e Merge pull request #162 from oh444555/master
adapter_asianfanficscom chapter URL update fix
2017-02-03 21:33:31 -06:00
oh444555
55d2c77a0c adapter_asianfanficscom chapter URL update fix 2017-02-04 03:40:31 +01:00
Jim Miller
8ba295c526 Bump test version. 2017-02-01 09:39:42 -06:00
Jim Miller
908c15468e Update translations. 2017-02-01 09:39:18 -06:00
Jim Miller
3f8669f723 Fix for adapter_adultfanfictionorg for site change. 2017-02-01 09:28:11 -06:00
Jim Miller
036d2f3cdb Bump test version. 2017-01-30 19:20:08 -06:00
Jim Miller
46e4e2cbab Update translations. 2017-01-30 19:19:35 -06:00
Jim Miller
eb7612fb9f base_xenforoforum: Add more caching and page lookahead in reader mode. 2017-01-30 19:18:12 -06:00
Jim Miller
beb8fc6e09 Update for typo fixes for translations. 2017-01-27 09:13:59 -06:00
Jim Miller
05c744e676 Merge pull request #160 from yurchor/master
Fix minor typos
2017-01-27 09:12:08 -06:00
Yuri Chornoivan
8cf5f1b401 Fix minor typos 2017-01-27 09:27:27 +02:00
Jim Miller
21fa480594 Fix bad tertiary except clause in ffnet check_next_chapter. 2017-01-25 19:55:13 -06:00
Jim Miller
c4e10b6e89 Fix bad tertiary except clause in ffnet check_next_chapter. 2017-01-25 19:20:11 -06:00
Jim Miller
1b14537f45 Bump test versions--plugin only change. 2017-01-25 10:35:30 -06:00
Jim Miller
6868fd4e5f Fix for error column when not an error. 2017-01-25 10:29:47 -06:00
Jim Miller
0d1165259d Bump test version. 2017-01-21 17:10:06 -06:00
Jim Miller
3a1b40e503 Merge pull request #158 from oh444555/master
Further adapter_asianfanficscom fixes
2017-01-21 17:06:43 -06:00
oh444555
046a216df3 Further adult check fixes for adapter_asianfanficscom 2017-01-21 21:44:52 +01:00
oh444555
acce4b191e Further adapter_asianfanficscom fixes
Fix for single-chapter adult checks
2017-01-20 18:52:22 +01:00
Jim Miller
70d415b4d8 Bump test versions. 2017-01-19 12:13:46 -06:00
Jim Miller
48e480f175 Tweak to adapter_whoficcom from GComyn 2017-01-19 12:12:52 -06:00
Jim Miller
8b26e0e78b Change abbrev and add inject_chapter_title for adapter_asianfanficscom. 2017-01-19 11:55:34 -06:00
Jim Miller
9ccea3e19e Merge pull request #157 from oh444555/master
adapter_asianfanficscom fixes
2017-01-19 11:38:11 -06:00
oh444555
a455c31dc7 adapter_asianfanficscom fixes
Fix for adult check and chapter title workaround
2017-01-19 05:08:50 +01:00
Jim Miller
d4fdf736bf Bump test version. 2017-01-18 12:32:58 -06:00
Jim Miller
e8dba4e565 adapter_bloodshedversecom needs to be able to change storyId. 2017-01-18 12:32:07 -06:00
Jim Miller
28cef36ce8 Put asianfanfics.com chapter_start example under [www.asianfanfics.com:epub]. 2017-01-15 18:57:35 -06:00
Jim Miller
118371b819 Minor fixes to adapter_asianfanficscom. 2017-01-15 18:47:39 -06:00
Jim Miller
48a3955147 Merge pull request #156 from oh444555/master
Support for asianfanfics.com
2017-01-15 18:38:36 -06:00
oh444555
66aa4ed2c1 Support for asianfanfics.com 2017-01-16 00:09:44 +01:00
Jim Miller
6f4ab86abf Bump test version. 2017-01-14 15:46:50 -06:00
Jim Miller
69acd90c8b Add author_avatar_cover option for base_xenforoforum. 2017-01-14 15:45:04 -06:00
Jim Miller
fea4cef885 Edge case fixes for errorcol and lastcheckedcol. 2017-01-14 15:37:20 -06:00
Jim Miller
0ea0765557 Bump test version. 2017-01-12 20:51:25 -06:00
Jim Miller
d66c4c3cee Make base_efiction_adapter honor keep_summary_html:true option. 2017-01-12 15:12:51 -06:00
517 changed files with 161311 additions and 45516 deletions

.gitignore (vendored): 9 changes

@@ -15,6 +15,13 @@
# usually perl -pi.back -e edits.
*.back
*.bak
# pycharm project specific settings files
.idea
# vscode project specific settings file
.vscode
cleanup.sh
FanFictionDownLoader.zip
@@ -26,3 +33,5 @@ build
dist
FanFicFare.egg-info
personal.ini
appcfg_oauth2_tokens
venv/


@@ -1 +1,3 @@
include DESCRIPTION.rst
include README.md
include LICENSE


@@ -1,19 +1,71 @@
FanFicFare
[FanFicFare](https://github.com/JimmXinu/FanFicFare)
==========
This is the repository for the FanFicFare project.
FanFicFare makes reading stories from various websites much easier by helping
you download them to EBook files.
FanFicFare is the rename and move of the previous FanFictionDownLoader (AKA
FFDL, AKA fanficdownloader) project.
FanFicFare was previously known as FanFictionDownLoader (AKA
FFDL, AKA fanficdownloader).
This program is available as a [calibre
plugin](http://www.mobileread.com/forums/showthread.php?p=3084025), a
[command-line interface](https://pypi.python.org/pypi/FanFicFare) (via
pip), and a [web service](http://fanficfare.appspot.com/).
Main features:
There's additional info in the project
[wiki](https://github.com/JimmXinu/FanFicFare/wiki) pages.
- Download FanFiction stories from over [100 different sites](https://github.com/JimmXinu/FanFicFare/wiki/SupportedSites) into ebooks.
There's also a [FanFicFare
maillist](https://groups.google.com/group/fanfic-downloader) for
discussion and announcements.
- Update previously downloaded EPUB format ebooks, downloading only new chapters.
- Get Story URLs from Web Pages.
- Support for downloading images in the story text. (EPUB and HTML
only -- download EPUB and convert to AZW3 for Kindle) More details on
configuring images in stories and cover images can be found in the
[FAQs] or [this post in the old FFDL thread].
- Support for cover image. (EPUB only)
- Optionally keep an Update Log of past updates (EPUB only).
There's additional info in the project [wiki] pages.
There's also a [FanFicFare maillist] for discussion and announcements and a [discussion thread] for the Calibre plugin.
Getting FanFicFare
==================
### Official Releases
This program is available as:
- A Calibre plugin from within Calibre or directly from the plugin [discussion thread], or;
- A Command Line Interface (CLI) [Python
package](https://pypi.python.org/pypi/FanFicFare) that you can
install with:
```
pip install FanFicFare
```
- _As of late November 2019, the web service version is shut down. See the [Wiki Home](https://github.com/JimmXinu/FanFicFare/wiki#web-service-version) page for details._
### Test Versions
FanFicFare is released roughly every month, but new test versions are posted more frequently as changes are made.
Test versions are available at:
- The [test plugin] is posted at MobileRead.
- The test version of CLI for pip install is uploaded to the testpypi repository and can be installed with:
```
pip install --extra-index-url https://test.pypi.org/simple/ --upgrade FanFicFare
```
### Other Releases
Other versions may be available depending on your OS. I (JimmXinu) don't directly support these:
- **Arch Linux**: The latest CLI release can be obtained from the [fanficfare](https://aur.archlinux.org/packages/fanficfare) AUR package. It will install the calibre plugin, if calibre is installed.
[this post in the old FFDL thread]: https://www.mobileread.com/forums/showthread.php?p=1982785#post1982785
[FAQs]: https://github.com/JimmXinu/FanFicFare/wiki/FAQs#can-fanficfare-download-a-story-containing-images
[FanFicFare maillist]: https://groups.google.com/group/fanfic-downloader
[wiki]: https://github.com/JimmXinu/FanFicFare/wiki
[discussion thread]: https://www.mobileread.com/forums/showthread.php?t=259221
[test plugin]: https://www.mobileread.com/forums/showthread.php?p=3084025&postcount=2


@@ -1,8 +1,9 @@
[main]
host = https://www.transifex.com
[calibre-plugins.fanfictiondownloader]
[o:calibre:p:calibre-plugins:r:fanfictiondownloader]
file_filter = translations/<lang>.po
source_file = translations/en.po
source_lang = en
type = PO
type = PO


@@ -4,7 +4,7 @@ from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2016, Jim Miller'
__copyright__ = '2019, Jim Miller'
__docformat__ = 'restructuredtext en'
import sys, os
@@ -32,6 +32,9 @@ except NameError:
# The class that all Interface Action plugin wrappers must inherit from
from calibre.customize import InterfaceActionBase
# pulled out from FanFicFareBase for saving in prefs.py
__version__ = (4, 57, 7)
## Apparently the name for this class doesn't matter--it was still
## 'demo' for the first few versions.
class FanFicFareBase(InterfaceActionBase):
@@ -48,8 +51,8 @@ class FanFicFareBase(InterfaceActionBase):
description = _('UI plugin to download FanFiction stories from various sites.')
supported_platforms = ['windows', 'osx', 'linux']
author = 'Jim Miller'
version = (2, 8, 0)
minimum_calibre_version = (1, 48, 0)
version = __version__
minimum_calibre_version = (2, 85, 1)
#: This field defines the GUI plugin class that contains all the code
#: that actually does something. Its format is module_path:class_name
@@ -102,8 +105,19 @@
ac.apply_settings()
def load_actual_plugin(self, gui):
with self: # so the sys.path was modified while loading the
# plug impl.
# so the sys.path was modified while loading the plug impl.
with self:
# Make sure the fanficfare module is available globally
# under its simple name, -- This is the only reason other
# plugin files can import fanficfare instead of
# calibre_plugins.fanficfare_plugin.fanficfare.
#
# Added specifically for the benefit of
# eli-schwartz/eschwartz's Arch Linux distro that wants to
# package FFF plugin outside Calibre.
import fanficfare
return InterfaceActionBase.load_actual_plugin(self,gui)
def cli_main(self,argv):
@@ -111,11 +125,10 @@
with self: # so the sys.path was modified appropriately
# I believe there's no performance hit loading these here when
# CLI--it would load every time anyway.
from StringIO import StringIO
from calibre.library import db
from calibre_plugins.fanficfare_plugin.fanficfare.cli import main as fff_main
from fanficfare.cli import main as fff_main
from calibre_plugins.fanficfare_plugin.prefs import PrefsFacade
from calibre.utils.config import prefs as calibre_prefs
from fanficfare.six import ensure_text
from optparse import OptionParser
parser = OptionParser('%prog --run-plugin '+self.name+' -- [options] <storyurl>')
@@ -127,12 +140,11 @@
pargs = [x for x in argv if x.startswith('--with-library') or x.startswith('--library-path')
or not x.startswith('-')]
opts, args = parser.parse_args(pargs)
fff_prefs = PrefsFacade(db(path=opts.library_path,
read_only=True))
read_only=True))
fff_main(argv[1:],
parser=parser,
passed_defaultsini=StringIO(get_resources("fanficfare/defaults.ini")),
passed_personalini=StringIO(fff_prefs["personal.ini"]),
passed_defaultsini=ensure_text(get_resources("fanficfare/defaults.ini")),
passed_personalini=ensure_text(fff_prefs["personal.ini"]),
)
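The switch from `StringIO(...)` to `ensure_text(...)` in this hunk matters because calibre's `get_resources()` returns bytes, and Python 3's `io.StringIO` only accepts str. A minimal sketch of the behavior; the local `ensure_text` here is a hand-rolled stand-in for `fanficfare.six.ensure_text`, covering only the bytes/str cases used above:

```python
from io import StringIO

def ensure_text(s, encoding="utf-8"):
    # Hand-rolled stand-in for fanficfare.six.ensure_text: decode bytes,
    # pass str through unchanged.
    return s.decode(encoding) if isinstance(s, bytes) else s

raw = b"[defaults]\ninclude_images: true\n"  # get_resources() returns bytes

try:
    StringIO(raw)  # Python 3 rejects bytes with a TypeError
except TypeError:
    pass

cfg = StringIO(ensure_text(raw))
print(cfg.readline())  # → [defaults]
```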


@@ -1,6 +1,6 @@
<hr />
<p>Plugin created by Jim Miller, borrowing heavily from Grant Drake's
<p>Plugin created by Jim Miller, originally borrowing heavily from Grant Drake's
'<a href="http://www.mobileread.com/forums/showthread.php?t=134856">Reading List</a>',
'<a href="http://www.mobileread.com/forums/showthread.php?t=126727">Extract ISBN</a>' and
'<a href="http://www.mobileread.com/forums/showthread.php?t=134000">Count Pages</a>'
@@ -8,12 +8,12 @@
<p>
Calibre officially distributes plugins from the mobileread.com forum site.
The official distro channel for this plugin is there: <a href="http://www.mobileread.com/forums/showthread.php?t=259221">FanFicFare</a>
The official distro channel and discussion thread for this plugin is there: <a href="http://www.mobileread.com/forums/showthread.php?t=259221">FanFicFare</a>
</p>
<p> I also monitor the
<a href="http://groups.google.com/group/fanfic-downloader">general users
group</a> for the downloader. That covers the web application and CLI, too.
group</a> for the downloader CLI, too.
</p>
<p>


@@ -0,0 +1,20 @@
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2024, Jim Miller'
__docformat__ = 'restructuredtext en'
## References:
## https://www.mobileread.com/forums/showthread.php?p=4435205&postcount=65
## https://www.mobileread.com/forums/showthread.php?p=4102834&postcount=389
from calibre_plugins.action_chains.events import ChainEvent
class FanFicFareDownloadFinished(ChainEvent):
# replace with the name of your event
name = 'FanFicFare Download Finished'
def get_event_signal(self):
return self.gui.iactions['FanFicFare'].download_finished_signal


@@ -1,64 +1,62 @@
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2015, Jim Miller'
__docformat__ = 'restructuredtext en'
import re
try:
from PyQt5.Qt import (Qt, QSyntaxHighlighter, QTextCharFormat, QBrush)
except ImportError as e:
from PyQt4.Qt import (Qt, QSyntaxHighlighter, QTextCharFormat, QBrush)
class BasicIniHighlighter(QSyntaxHighlighter):
'''
QSyntaxHighlighter class for use with QTextEdit for highlighting
ini config files.
I looked high and low to find a highlighter for basic ini config
format, so I'm leaving this in the project even though I'm not
using it.
'''
def __init__( self, parent, theme ):
QSyntaxHighlighter.__init__( self, parent )
self.parent = parent
self.highlightingRules = []
# keyword
self.highlightingRules.append( HighlightingRule( r"^[^:=\s][^:=]*[:=]",
Qt.blue,
Qt.SolidPattern ) )
# section
self.highlightingRules.append( HighlightingRule( r"^\[[^\]]+\]",
Qt.darkBlue,
Qt.SolidPattern ) )
# comment
self.highlightingRules.append( HighlightingRule( r"#[^\n]*" ,
Qt.darkYellow,
Qt.SolidPattern ) )
def highlightBlock( self, text ):
for rule in self.highlightingRules:
for match in rule.pattern.finditer(text):
self.setFormat( match.start(), match.end()-match.start(), rule.highlight )
self.setCurrentBlockState( 0 )
class HighlightingRule():
def __init__( self, pattern, color, style ):
if isinstance(pattern,basestring):
self.pattern = re.compile(pattern)
else:
self.pattern=pattern
charfmt = QTextCharFormat()
brush = QBrush(color, style)
charfmt.setForeground(brush)
self.highlight = charfmt
# -*- coding: utf-8 -*-
from __future__ import (absolute_import, unicode_literals, division,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2015, Jim Miller'
__docformat__ = 'restructuredtext en'
import re
from PyQt5.Qt import (Qt, QSyntaxHighlighter, QTextCharFormat, QBrush)
from fanficfare.six import string_types
class BasicIniHighlighter(QSyntaxHighlighter):
'''
QSyntaxHighlighter class for use with QTextEdit for highlighting
ini config files.
I looked high and low to find a highlighter for basic ini config
format, so I'm leaving this in the project even though I'm not
using it.
'''
def __init__( self, parent, theme ):
QSyntaxHighlighter.__init__( self, parent )
self.parent = parent
self.highlightingRules = []
# keyword
self.highlightingRules.append( HighlightingRule( r"^[^:=\s][^:=]*[:=]",
Qt.blue,
Qt.SolidPattern ) )
# section
self.highlightingRules.append( HighlightingRule( r"^\[[^\]]+\]",
Qt.darkBlue,
Qt.SolidPattern ) )
# comment
self.highlightingRules.append( HighlightingRule( r"#[^\n]*" ,
Qt.darkYellow,
Qt.SolidPattern ) )
def highlightBlock( self, text ):
for rule in self.highlightingRules:
for match in rule.pattern.finditer(text):
self.setFormat( match.start(), match.end()-match.start(), rule.highlight )
self.setCurrentBlockState( 0 )
class HighlightingRule():
def __init__( self, pattern, color, style ):
if isinstance(pattern, string_types):
self.pattern = re.compile(pattern)
else:
self.pattern=pattern
charfmt = QTextCharFormat()
brush = QBrush(color, style)
charfmt.setForeground(brush)
self.highlight = charfmt
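The three highlighting rules above are plain regular expressions and can be sanity-checked without any Qt machinery. A minimal sketch (the rule names are mine; the QSyntaxHighlighter/QBrush plumbing is omitted):

```python
import re

# The three patterns from BasicIniHighlighter above, Qt-free.
RULES = {
    "keyword": re.compile(r"^[^:=\s][^:=]*[:=]"),  # key: or key=
    "section": re.compile(r"^\[[^\]]+\]"),         # [section]
    "comment": re.compile(r"#[^\n]*"),             # # comment (incl. inline)
}

def classify(line):
    """Return the names of the rules that match a single ini line."""
    return [name for name, pat in RULES.items() if pat.search(line)]

print(classify("[defaults]"))            # → ['section']
print(classify("include_images: true"))  # → ['keyword']
print(classify("# a comment"))           # → ['comment']
```

Note that an inline comment after a key matches both the keyword and comment rules, which is why the highlighter applies every matching rule rather than stopping at the first.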

File diff suppressed because it is too large.

File diff suppressed because it is too large.

File diff suppressed because it is too large.

File diff suppressed because it is too large.


@@ -1,49 +1,116 @@
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2015, Jim Miller'
__docformat__ = 'restructuredtext en'
from StringIO import StringIO
from ConfigParser import ParsingError
import logging
logger = logging.getLogger(__name__)
from calibre_plugins.fanficfare_plugin.fanficfare import adapters, exceptions
from calibre_plugins.fanficfare_plugin.fanficfare.configurable import Configuration
from calibre_plugins.fanficfare_plugin.prefs import prefs
def get_fff_personalini():
return prefs['personal.ini']
def get_fff_config(url,fileform="epub",personalini=None):
if not personalini:
personalini = get_fff_personalini()
sections=['unknown']
try:
sections = adapters.getConfigSectionsFor(url)
except Exception as e:
logger.debug("Failed trying to get ini config for url(%s): %s, using section %s instead"%(url,e,sections))
configuration = Configuration(sections,fileform)
configuration.readfp(StringIO(get_resources("plugin-defaults.ini")))
configuration.readfp(StringIO(personalini))
return configuration
def get_fff_adapter(url,fileform="epub",personalini=None):
return adapters.getAdapter(get_fff_config(url,fileform,personalini),url)
def test_config(initext):
try:
configini = get_fff_config("test1.com?sid=555",
personalini=initext)
errors = configini.test_config()
except ParsingError as pe:
errors = pe.errors
return errors
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2020, Jim Miller'
__docformat__ = 'restructuredtext en'
from functools import reduce
from io import StringIO
import logging
logger = logging.getLogger(__name__)
from fanficfare import adapters
from fanficfare.configurable import Configuration
from calibre_plugins.fanficfare_plugin.prefs import prefs
from fanficfare.six import ensure_text
from fanficfare.six.moves import configparser
from fanficfare.six.moves import collections_abc
def get_fff_personalini():
return prefs['personal.ini']
def get_fff_config(url,fileform="epub",personalini=None):
if not personalini:
personalini = get_fff_personalini()
sections=['unknown']
try:
sections = adapters.getConfigSectionsFor(url)
except Exception as e:
logger.debug("Failed trying to get ini config for url(%s): %s, using section %s instead"%(url,e,sections))
configuration = Configuration(sections,fileform)
configuration.read_file(StringIO(ensure_text(get_resources("plugin-defaults.ini"))))
configuration.read_file(StringIO(ensure_text(personalini)))
return configuration
def get_fff_adapter(url,fileform="epub",personalini=None):
return adapters.getAdapter(get_fff_config(url,fileform,personalini),url)
def test_config(initext):
try:
configini = get_fff_config("test1.com?sid=555",
personalini=initext)
errors = configini.test_config()
except configparser.ParsingError as pe:
errors = pe.errors
return errors
class OrderedSet(collections_abc.MutableSet):
def __init__(self, iterable=None):
self.end = end = []
end += [None, end, end] # sentinel node for doubly linked list
self.map = {} # key --> [key, prev, next]
if iterable is not None:
self |= iterable
def __len__(self):
return len(self.map)
def __contains__(self, key):
return key in self.map
def add(self, key):
if key not in self.map:
end = self.end
curr = end[1]
curr[2] = end[1] = self.map[key] = [key, curr, end]
def discard(self, key):
if key in self.map:
key, prev, next = self.map.pop(key)
prev[2] = next
next[1] = prev
def __iter__(self):
end = self.end
curr = end[2]
while curr is not end:
yield curr[0]
curr = curr[2]
def __reversed__(self):
end = self.end
curr = end[1]
while curr is not end:
yield curr[0]
curr = curr[1]
def pop(self, last=True):
if not self:
raise KeyError('set is empty')
key = self.end[1][0] if last else self.end[2][0]
self.discard(key)
return key
def __repr__(self):
if not self:
return '%s()' % (self.__class__.__name__,)
return '%s(%r)' % (self.__class__.__name__, list(self))
def __eq__(self, other):
if isinstance(other, OrderedSet):
return len(self) == len(other) and list(self) == list(other)
return set(self) == set(other)
def get_common_elements(ll):
## returns a list of elements common to all lists in ll
## https://www.tutorialspoint.com/find-common-elements-in-list-of-lists-in-python
return list(reduce(lambda i, j: i & j, (OrderedSet(n) for n in ll)))
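The `get_common_elements` helper above can be sketched without the `OrderedSet` class. A point worth noting: the `MutableSet` mixin's `&` iterates the right-hand operand, so the result follows the *last* list's order. This stand-in mirrors that with plain lists and sets (a sketch, not the plugin's actual code path):

```python
from functools import reduce

def ordered_common(ll):
    # Keep elements of b (in b's order) that also appear in a,
    # matching the MutableSet __and__ mixin's iteration order.
    def intersect(a, b):
        aset = set(a)
        return [x for x in b if x in aset]
    return list(reduce(intersect, ll))

print(ordered_common([[3, 1, 2, 4], [4, 2, 3], [2, 3, 9]]))  # [2, 3]
```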

# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2016, Jim Miller'
__docformat__ = 'restructuredtext en'
import re
try:
from PyQt5.Qt import (Qt, QSyntaxHighlighter, QTextCharFormat, QBrush, QFont)
except ImportError as e:
from PyQt4.Qt import (Qt, QSyntaxHighlighter, QTextCharFormat, QBrush, QFont)
# r'add_to_+key
class IniHighlighter(QSyntaxHighlighter):
'''
QSyntaxHighlighter class for use with QTextEdit for highlighting
ini config files.
'''
def __init__( self, parent, sections=[], keywords=[], entries=[], entry_keywords=[] ):
QSyntaxHighlighter.__init__( self, parent )
self.parent = parent
self.highlightingRules = []
if entries:
# *known* entries
reentries = r'('+(r'|'.join(entries))+r')'
self.highlightingRules.append( HighlightingRule( r"\b"+reentries+r"\b", Qt.darkGreen ) )
# true/false -- just to be nice.
self.highlightingRules.append( HighlightingRule( r"\b(true|false)\b", Qt.darkGreen ) )
# *all* keywords -- change known later.
self.errorRule = HighlightingRule( r"^[^:=\s][^:=]*[:=]", Qt.red )
self.highlightingRules.append( self.errorRule )
# *all* entry keywords -- change known later.
reentrykeywords = r'('+(r'|'.join([ e % r'[a-zA-Z0-9_]+' for e in entry_keywords ]))+r')'
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+reentrykeywords+r"\s*[:=]", Qt.darkMagenta ) )
if entries: # separate from known entries so entry named keyword won't be masked.
# *known* entry keywords
reentrykeywords = r'('+(r'|'.join([ e % reentries for e in entry_keywords ]))+r')'
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+reentrykeywords+r"\s*[:=]", Qt.blue ) )
# *known* keywords
rekeywords = r'('+(r'|'.join(keywords))+r')'
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+rekeywords+r"\s*[:=]", Qt.blue ) )
# *all* sections -- change known later.
self.highlightingRules.append( HighlightingRule( r"^\[[^\]]+\].*?$", Qt.red, QFont.Bold, blocknum=1 ) )
if sections:
# *known* sections
resections = r'('+(r'|'.join(sections))+r')'
resections = resections.replace('.','\.') #escape dots.
self.highlightingRules.append( HighlightingRule( r"^\["+resections+r"\]\s*$", Qt.darkBlue, QFont.Bold, blocknum=2 ) )
# test story sections
self.teststoryRule = HighlightingRule( r"^\[teststory:([0-9]+|defaults)\]", Qt.darkCyan, blocknum=3 )
self.highlightingRules.append( self.teststoryRule )
# storyUrl sections
self.storyUrlRule = HighlightingRule( r"^\[https?://.*\]", Qt.darkMagenta, blocknum=4 )
self.highlightingRules.append( self.storyUrlRule )
# NOT comments -- but can be custom columns, so don't flag.
#self.highlightingRules.append( HighlightingRule( r"(?<!^)#[^\n]*" , Qt.red ) )
# comments -- comments must start from column 0.
self.commentRule = HighlightingRule( r"^#[^\n]*" , Qt.darkYellow )
self.highlightingRules.append( self.commentRule )
def highlightBlock( self, text ):
is_comment = False
blocknum = self.previousBlockState()
for rule in self.highlightingRules:
for match in rule.pattern.finditer(text):
self.setFormat( match.start(), match.end()-match.start(), rule.highlight )
if rule == self.commentRule:
is_comment = True
if rule.blocknum > 0:
blocknum = rule.blocknum
if not is_comment:
# unknown section, error all:
if blocknum == 1 and blocknum == self.previousBlockState():
self.setFormat( 0, len(text), self.errorRule.highlight )
# teststory section rules:
if blocknum == 3:
self.setFormat( 0, len(text), self.teststoryRule.highlight )
# storyUrl section rules:
if blocknum == 4:
self.setFormat( 0, len(text), self.storyUrlRule.highlight )
self.setCurrentBlockState( blocknum )
class HighlightingRule():
def __init__( self, pattern, color,
weight=QFont.Normal,
style=Qt.SolidPattern,
blocknum=0):
if isinstance(pattern,basestring):
self.pattern = re.compile(pattern)
else:
self.pattern=pattern
charfmt = QTextCharFormat()
brush = QBrush(color, style)
charfmt.setForeground(brush)
charfmt.setFontWeight(weight)
self.highlight = charfmt
self.blocknum=blocknum
# -*- coding: utf-8 -*-
from __future__ import (absolute_import, unicode_literals, division,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2020, Jim Miller'
__docformat__ = 'restructuredtext en'
import re
import logging
logger = logging.getLogger(__name__)
from PyQt5.Qt import (QApplication, Qt, QColor, QSyntaxHighlighter,
QTextCharFormat, QBrush, QFont)
try:
# qt6 Calibre v6+
QFontNormal = QFont.Weight.Normal
QFontBold = QFont.Weight.Bold
except:
# qt5 Calibre v2-5
QFontNormal = QFont.Normal
QFontBold = QFont.Bold
from fanficfare.six import string_types
class IniHighlighter(QSyntaxHighlighter):
'''
QSyntaxHighlighter class for use with QTextEdit for highlighting
ini config files.
'''
def __init__( self, parent, sections=[], keywords=[], entries=[], entry_keywords=[] ):
QSyntaxHighlighter.__init__( self, parent )
self.parent = parent
self.highlightingRules = []
colors = {
'knownentries':Qt.darkGreen,
'errors':Qt.red,
'allkeywords':Qt.darkMagenta,
'knownkeywords':Qt.blue,
'knownsections':Qt.darkBlue,
'teststories':Qt.darkCyan,
'storyUrls':Qt.darkMagenta,
'comments':Qt.darkYellow
}
try:
if( hasattr(QApplication.instance(),'is_dark_theme')
and QApplication.instance().is_dark_theme ):
colors = {
'knownentries':Qt.green,
'errors':Qt.red,
'allkeywords':Qt.magenta,
'knownkeywords':QColor(Qt.blue).lighter(150),
'knownsections':Qt.darkCyan,
'teststories':Qt.cyan,
'storyUrls':QColor(Qt.magenta).lighter(150),
'comments':Qt.yellow
}
except Exception as e:
logger.error("Failed to set dark theme highlight colors: %s"%e)
if entries:
# *known* entries
reentries = r'('+(r'|'.join(entries))+r')'
self.highlightingRules.append( HighlightingRule( r"\b"+reentries+r"\b", colors['knownentries'] ) )
# true/false -- just to be nice.
self.highlightingRules.append( HighlightingRule( r"\b(true|false)\b", colors['knownentries'] ) )
# *all* keywords -- change known later.
self.errorRule = HighlightingRule( r"^[^:=\s][^:=]*[:=]", colors['errors'] )
self.highlightingRules.append( self.errorRule )
# *all* entry keywords -- change known later.
reentrykeywords = r'('+(r'|'.join([ e % r'[a-zA-Z0-9_]+' for e in entry_keywords ]))+r')'
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+reentrykeywords+r"(_filelist)?\s*[:=]", colors['allkeywords'] ) )
if entries: # separate from known entries so entry named keyword won't be masked.
# *known* entry keywords
reentrykeywords = r'('+(r'|'.join([ e % reentries for e in entry_keywords ]))+r')'
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+reentrykeywords+r"(_filelist)?\s*[:=]", colors['knownkeywords'] ) )
# *known* keywords
rekeywords = r'('+(r'|'.join(keywords))+r')'
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+rekeywords+r"(_filelist)?\s*[:=]", colors['knownkeywords'] ) )
# *all* sections -- change known later.
self.highlightingRules.append( HighlightingRule( r"^\[[^\]]+\].*?$", colors['errors'], QFontBold, blocknum=1 ) )
if sections:
# *known* sections
resections = r'('+(r'|'.join(sections))+r')'
resections = resections.replace('.',r'\.') #escape dots.
self.highlightingRules.append( HighlightingRule( r"^\["+resections+r"\]\s*$", colors['knownsections'], QFontBold, blocknum=2 ) )
# test story sections
self.teststoryRule = HighlightingRule( r"^\[teststory:([0-9]+|defaults)\]", colors['teststories'], blocknum=3 )
self.highlightingRules.append( self.teststoryRule )
# storyUrl sections
# StoryUrls are *not* checked beyond looking for https?://
self.storyUrlRule = HighlightingRule( r"^\[https?://.*\]", colors['storyUrls'], QFontBold, blocknum=2 )
self.highlightingRules.append( self.storyUrlRule )
# NOT comments -- but can be custom columns, so don't flag.
#self.highlightingRules.append( HighlightingRule( r"(?<!^)#[^\n]*" , colors['errors'] ) )
# comments -- comments must start from column 0.
self.commentRule = HighlightingRule( r"^#[^\n]*" , colors['comments'] )
self.highlightingRules.append( self.commentRule )
def highlightBlock( self, text ):
is_comment = False
blocknum = self.previousBlockState()
for rule in self.highlightingRules:
for match in rule.pattern.finditer(text):
self.setFormat( match.start(), match.end()-match.start(), rule.highlight )
if rule == self.commentRule:
is_comment = True
if rule.blocknum > 0:
blocknum = rule.blocknum
if not is_comment:
# unknown section, error all:
if blocknum == 1 and blocknum == self.previousBlockState():
self.setFormat( 0, len(text), self.errorRule.highlight )
# teststory section rules:
if blocknum == 3:
self.setFormat( 0, len(text), self.teststoryRule.highlight )
## changed storyUrl section to also be blocknum=1 April 2023
## storyUrl section rules:
# if blocknum == 4:
# self.setFormat( 0, len(text), self.storyUrlRule.highlight )
self.setCurrentBlockState( blocknum )
class HighlightingRule():
def __init__( self, pattern, color,
weight=QFontNormal,
style=Qt.SolidPattern,
blocknum=0):
if isinstance(pattern, string_types):
self.pattern = re.compile(pattern)
else:
self.pattern=pattern
charfmt = QTextCharFormat()
brush = QBrush(color, style)
charfmt.setForeground(brush)
charfmt.setFontWeight(weight)
self.highlight = charfmt
self.blocknum=blocknum
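The highlighting rules above are ordinary regexes applied per line; a small standalone sketch (with a hypothetical keyword list, not the plugin's real one) shows which lines each rule class would color:

```python
import re

# Hypothetical keywords for illustration only.
keywords = ['include_images', 'never_make_cover']
rekeywords = r'(' + r'|'.join(keywords) + r')'
# Known keywords: optional add_to_ prefix, optional _filelist suffix.
keyword_rule = re.compile(r"^(add_to_)?" + rekeywords + r"(_filelist)?\s*[:=]")
# Any [section] header (flagged as error until a known-section rule recolors it).
section_rule = re.compile(r"^\[[^\]]+\].*?$")
# Any key:value line (flagged as error until a known-keyword rule recolors it).
error_rule = re.compile(r"^[^:=\s][^:=]*[:=]")

assert keyword_rule.match("include_images:false")
assert keyword_rule.match("add_to_include_images=true")
assert section_rule.match("[defaults]")
assert error_rule.match("not_a_known_keyword: x")
```

This "color everything as error, then recolor known names" layering is why the known-keyword and known-section rules are appended after the catch-all rules.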

# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2016, Jim Miller, 2011, Grant Drake <grant.drake@gmail.com>'
__docformat__ = 'restructuredtext en'
import logging
logger = logging.getLogger(__name__)
import traceback
from datetime import datetime, time
from StringIO import StringIO
from calibre.utils.ipc.server import Server
from calibre.utils.ipc.job import ParallelJob
from calibre.constants import numeric_version as calibre_version
from calibre.utils.date import local_tz
from calibre_plugins.fanficfare_plugin.wordcount import get_word_count
from calibre_plugins.fanficfare_plugin.prefs import (SAVE_YES, SAVE_YES_UNLESS_SITE)
# pulls in translation files for _() strings
try:
load_translations()
except NameError:
pass # load_translations() added in calibre 1.9
# ------------------------------------------------------------------------------
#
# Functions to perform downloads using worker jobs
#
# ------------------------------------------------------------------------------
def do_download_worker(book_list,
options,
cpus,
merge=False,
notification=lambda x,y:x):
'''
Master job, to launch child jobs to extract ISBN for a set of books
This is run as a worker job in the background to keep the UI more
responsive and get around the memory leak issues as it will launch
a child job for each book as a worker process
'''
server = Server(pool_size=cpus)
logger.info(options['version'])
total = 0
alreadybad = []
# Queue all the jobs
logger.info("Adding jobs for URLs:")
for book in book_list:
logger.info("%s"%book['url'])
if book['good']:
total += 1
args = ['calibre_plugins.fanficfare_plugin.jobs',
'do_download_for_worker',
(book,options,merge)]
job = ParallelJob('arbitrary_n',
"url:(%s) id:(%s)"%(book['url'],book['calibre_id']),
done=None,
args=args)
job._book = book
server.add_job(job)
else:
# was already bad before the subprocess ever started.
alreadybad.append(book)
# This server is an arbitrary_n job, so there is a notifier available.
# Set the % complete to a small number to avoid the 'unavailable' indicator
notification(0.01, _('Downloading FanFiction Stories'))
# dequeue the job results as they arrive, saving the results
count = 0
while True:
job = server.changed_jobs_queue.get()
# A job can 'change' when it is not finished, for example if it
# produces a notification. Ignore these.
job.update()
if not job.is_finished:
continue
# A job really finished. Get the information.
book_list.remove(job._book)
book_list.append(job.result)
book_id = job._book['calibre_id']
count = count + 1
notification(float(count)/total, _('%d of %d stories finished downloading')%(count,total))
# Add this job's output to the current log
logger.info('Logfile for book ID %s (%s)'%(book_id, job._book['title']))
logger.info(job.details)
if count >= total:
## ordering first by good vs bad, then by listorder.
good_list = filter(lambda x : x['good'], book_list)
bad_list = filter(lambda x : not x['good'], book_list)
good_list = sorted(good_list,key=lambda x : x['listorder'])
bad_list = sorted(bad_list,key=lambda x : x['listorder'])
logger.info("\n"+_("Download Results:")+"\n%s\n"%("\n".join([ "%(url)s %(comment)s" % book for book in good_list+bad_list])))
logger.info("\n"+_("Successful:")+"\n%s\n"%("\n".join([book['url'] for book in good_list])))
logger.info("\n"+_("Unsuccessful:")+"\n%s\n"%("\n".join([book['url'] for book in bad_list])))
break
server.close()
# return the book list as the job result
return book_list
def do_download_for_worker(book,options,merge,notification=lambda x,y:x):
'''
Child job, to download story when run as a worker job
'''
from calibre_plugins.fanficfare_plugin import FanFicFareBase
fffbase = FanFicFareBase(options['plugin_path'])
with fffbase:
from calibre_plugins.fanficfare_plugin.dialogs import (NotGoingToDownload,
OVERWRITE, OVERWRITEALWAYS, UPDATE, UPDATEALWAYS, ADDNEW, SKIP, CALIBREONLY, CALIBREONLYSAVECOL)
from calibre_plugins.fanficfare_plugin.fanficfare import adapters, writers, exceptions
from calibre_plugins.fanficfare_plugin.fanficfare.epubutils import get_update_data
from calibre_plugins.fanficfare_plugin.fff_util import (get_fff_adapter, get_fff_config)
try:
book['comment'] = _('Download started...')
configuration = get_fff_config(book['url'],
options['fileform'],
options['personal.ini'])
if configuration.getConfig('use_ssl_unverified_context'):
## monkey patch to avoid SSL bug. duplicated from
## fff_plugin.py because bg jobs run in own process
## space.
import ssl
if hasattr(ssl, '_create_unverified_context'):
ssl._create_default_https_context = ssl._create_unverified_context
if not options['updateepubcover'] and 'epub_for_update' in book and options['collision'] in (UPDATE, UPDATEALWAYS):
configuration.set("overrides","never_make_cover","true")
# images only for epub, html, even if the user mistakenly
# turned it on elsewhere.
if options['fileform'] not in ("epub","html"):
configuration.set("overrides","include_images","false")
adapter = adapters.getAdapter(configuration,book['url'])
adapter.is_adult = book['is_adult']
adapter.username = book['username']
adapter.password = book['password']
adapter.setChaptersRange(book['begin'],book['end'])
adapter.load_cookiejar(options['cookiejarfile'])
#logger.debug("cookiejar:%s"%adapter.cookiejar)
adapter.set_pagecache(options['pagecache'])
story = adapter.getStoryMetadataOnly()
if not story.getMetadata("series") and 'calibre_series' in book:
adapter.setSeries(book['calibre_series'][0],book['calibre_series'][1])
# set PI version instead of default.
if 'version' in options:
story.setMetadata('version',options['version'])
book['title'] = story.getMetadata("title", removeallentities=True)
book['author_sort'] = book['author'] = story.getList("author", removeallentities=True)
book['publisher'] = story.getMetadata("site")
book['url'] = story.getMetadata("storyUrl")
book['tags'] = story.getSubjectTags(removeallentities=True)
book['comments'] = story.get_sanitized_description()
book['series'] = story.getMetadata("series", removeallentities=True)
if story.getMetadataRaw('datePublished'):
book['pubdate'] = story.getMetadataRaw('datePublished').replace(tzinfo=local_tz)
if story.getMetadataRaw('dateUpdated'):
book['updatedate'] = story.getMetadataRaw('dateUpdated').replace(tzinfo=local_tz)
if story.getMetadataRaw('dateCreated'):
book['timestamp'] = story.getMetadataRaw('dateCreated').replace(tzinfo=local_tz)
else:
book['timestamp'] = datetime.now() # need *something* there for calibre.
writer = writers.getWriter(options['fileform'],configuration,adapter)
outfile = book['outfile']
## No need to download at all. Shouldn't ever get down here.
if options['collision'] in (CALIBREONLY, CALIBREONLYSAVECOL):
logger.info("Skipping CALIBREONLY 'update' down inside worker--this shouldn't be happening...")
book['comment'] = _('Metadata collected.')
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
## checks were done earlier, it's new or not dup or newer--just write it.
elif options['collision'] in (ADDNEW, SKIP, OVERWRITE, OVERWRITEALWAYS) or \
('epub_for_update' not in book and options['collision'] in (UPDATE, UPDATEALWAYS)):
# preserve logfile even on overwrite.
if 'epub_for_update' in book:
adapter.logfile = get_update_data(book['epub_for_update'])[6]
# change the existing entries id to notid so
# write_epub writes a whole new set to indicate overwrite.
if adapter.logfile:
adapter.logfile = adapter.logfile.replace("span id","span notid")
if options['collision'] == OVERWRITE and 'fileupdated' in book:
lastupdated=story.getMetadataRaw('dateUpdated')
fileupdated=book['fileupdated']
# updated doesn't have time (or is midnight), use dates only.
# updated does have time, use full timestamps.
if (lastupdated.time() == time.min and fileupdated.date() > lastupdated.date()) or \
(lastupdated.time() != time.min and fileupdated > lastupdated):
raise NotGoingToDownload(_("Not Overwriting, web site is not newer."),'edit-undo.png',showerror=False)
logger.info("write to %s"%outfile)
inject_cal_cols(book,story,configuration)
writer.writeStory(outfilename=outfile, forceOverwrite=True)
book['comment'] = _('Download %s completed, %s chapters.')%(options['fileform'],story.getMetadata("numChapters"))
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
## checks were done earlier, just update it.
elif 'epub_for_update' in book and options['collision'] in (UPDATE, UPDATEALWAYS):
# update now handled by pre-populating the old images and
# chapters in the adapter rather than merging epubs.
urlchaptercount = int(story.getMetadata('numChapters').replace(',',''))
(url,
chaptercount,
adapter.oldchapters,
adapter.oldimgs,
adapter.oldcover,
adapter.calibrebookmark,
adapter.logfile,
adapter.oldchaptersmap,
adapter.oldchaptersdata) = get_update_data(book['epub_for_update'])[0:9]
# dup handling from fff_plugin needed for anthology updates.
if options['collision'] == UPDATE:
if chaptercount == urlchaptercount:
if merge:
book['comment']=_("Already contains %d chapters. Reuse as is.")%chaptercount
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
book['outfile'] = book['epub_for_update'] # for anthology merge ops.
return book
else: # not merge,
raise NotGoingToDownload(_("Already contains %d chapters.")%chaptercount,'edit-undo.png',showerror=False)
elif chaptercount > urlchaptercount:
raise NotGoingToDownload(_("Existing epub contains %d chapters, web site only has %d. Use Overwrite to force update.") % (chaptercount,urlchaptercount),'dialog_error.png')
elif chaptercount == 0:
raise NotGoingToDownload(_("FanFicFare doesn't recognize chapters in existing epub, epub is probably from a different source. Use Overwrite to force update."),'dialog_error.png')
if not (options['collision'] == UPDATEALWAYS and chaptercount == urlchaptercount) \
and adapter.getConfig("do_update_hook"):
chaptercount = adapter.hookForUpdates(chaptercount)
logger.info("Do update - epub(%d) vs url(%d)" % (chaptercount, urlchaptercount))
logger.info("write to %s"%outfile)
inject_cal_cols(book,story,configuration)
writer.writeStory(outfilename=outfile, forceOverwrite=True)
book['comment'] = _('Update %s completed, added %s chapters for %s total.')%\
(options['fileform'],(urlchaptercount-chaptercount),urlchaptercount)
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
if options['do_wordcount'] == SAVE_YES or (
options['do_wordcount'] == SAVE_YES_UNLESS_SITE and not story.getMetadataRaw('numWords') ):
wordcount = get_word_count(outfile)
logger.info("get_word_count:%s"%wordcount)
story.setMetadata('numWords',wordcount)
writer.writeStory(outfilename=outfile, forceOverwrite=True)
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
if options['smarten_punctuation'] and options['fileform'] == "epub" \
and calibre_version >= (0, 9, 39):
# for smarten punc
from calibre.ebooks.oeb.polish.main import polish, ALL_OPTS
from calibre.utils.logging import Log
from collections import namedtuple
# do smarten_punctuation from calibre's polish feature
data = {'smarten_punctuation':True}
opts = ALL_OPTS.copy()
opts.update(data)
O = namedtuple('Options', ' '.join(ALL_OPTS.iterkeys()))
opts = O(**opts)
log = Log(level=Log.DEBUG)
polish({outfile:outfile}, opts, log, logger.info)
except NotGoingToDownload as d:
book['good']=False
book['showerror']=d.showerror
book['comment']=unicode(d)
book['icon'] = d.icon
except Exception as e:
book['good']=False
book['comment']=unicode(e)
book['icon']='dialog_error.png'
book['status'] = _('Error')
logger.info("Exception: %s:%s"%(book,unicode(e)),exc_info=True)
#time.sleep(10)
return book
## calibre's columns for an existing book are passed in and injected
## into the story's metadata. For convenience, we also add labels and
## valid_entries for them in a special [injected] section that has
## even less precedence than [defaults]
def inject_cal_cols(book,story,configuration):
configuration.remove_section('injected')
if 'calibre_columns' in book:
injectini = ['[injected]']
extra_valid = []
for k, v in book['calibre_columns'].iteritems():
story.setMetadata(k,v['val'])
injectini.append('%s_label:%s'%(k,v['label']))
extra_valid.append(k)
if extra_valid: # if empty, there's nothing to add.
injectini.append("add_to_extra_valid_entries:,"+','.join(extra_valid))
configuration.readfp(StringIO('\n'.join(injectini)))
#print("added:\n%s\n"%('\n'.join(injectini)))
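The `[injected]` section trick in `inject_cal_cols` can be sketched with the stdlib `configparser` (the column name below is illustrative, not a real calibre column id):

```python
from io import StringIO
from configparser import ConfigParser

# Hypothetical custom-column data, shaped like book['calibre_columns'].
calibre_columns = {'fandom': {'val': 'SG-1', 'label': 'fandom'}}

injectini = ['[injected]']
extra_valid = []
for k, v in calibre_columns.items():
    injectini.append('%s_label:%s' % (k, v['label']))
    extra_valid.append(k)
if extra_valid:  # if empty, there's nothing to add.
    injectini.append('add_to_extra_valid_entries:,' + ','.join(extra_valid))

config = ConfigParser()
config.read_file(StringIO('\n'.join(injectini)))
print(config.get('injected', 'fandom_label'))  # fandom
```

The real plugin feeds the same string into its own `Configuration` parser, where `[injected]` has lower precedence than `[defaults]`.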
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2020, Jim Miller, 2011, Grant Drake <grant.drake@gmail.com>'
__docformat__ = 'restructuredtext en'
import logging
logger = logging.getLogger(__name__)
from time import sleep
from datetime import datetime, time
from io import StringIO
from collections import defaultdict
import sys
from calibre.utils.date import local_tz
# pulls in translation files for _() strings
try:
load_translations()
except NameError:
pass # load_translations() added in calibre 1.9
# ------------------------------------------------------------------------------
#
# Functions to perform downloads using worker jobs
#
# ------------------------------------------------------------------------------
def do_download_worker_single(site,
book_list,
options,
merge,
notification=lambda x,y:x):
logger.info(options['version'])
## same info debug calibre prints out at startup. For when users
## give me job output instead of debug log.
from calibre.debug import print_basic_debug_info
print_basic_debug_info(sys.stderr)
notification(0.01, _('Downloading FanFiction Stories'))
from calibre_plugins.fanficfare_plugin import FanFicFareBase
fffbase = FanFicFareBase(options['plugin_path'])
with fffbase: # so the sys.path was modified while loading the
# plug impl.
from fanficfare.fff_profile import do_cprofile
## extra function just so I can easily use the same
## @do_cprofile decorator
@do_cprofile
def profiled_func():
count = 0
totals = {}
# can't do direct assignment in list comprehension? I'm sure it
# makes sense to some pythonista.
# [ totals[x['url']]=0.0 for x in book_list if x['good'] ]
[ totals.update({x['url']:0.0}) for x in book_list if x['good'] ]
# logger.debug(sites_lists.keys())
def do_indiv_notif(percent,msg):
totals[msg] = percent/len(totals)
notification(max(0.01,sum(totals.values())), _('%(count)d of %(total)d stories finished downloading')%{'count':count,'total':len(totals)})
do_list = []
done_list = []
logger.info("\n\n"+_("Downloading FanFiction Stories")+"\n%s\n"%("\n".join([ "%(status)s %(url)s %(comment)s" % book for book in book_list])))
## pass failures from metadata through bg job so all results are
## together.
for book in book_list:
if book['good']:
do_list.append(book)
else:
done_list.append(book)
for book in do_list:
# logger.info("%s"%book['url'])
done_list.append(do_download_for_worker(book,options,merge,do_indiv_notif))
count += 1
return finish_download(done_list)
return profiled_func()
def finish_download(donelist):
book_list = sorted(donelist,key=lambda x : x['listorder'])
logger.info("\n"+_("Download Results:")+"\n%s\n"%("\n".join([ "%(status)s %(url)s %(comment)s" % book for book in book_list])))
good_lists = defaultdict(list)
bad_lists = defaultdict(list)
for book in book_list:
if book['good']:
good_lists[book['status']].append(book)
else:
bad_lists[book['status']].append(book)
order = [_('Add'),
_('Update'),
_('Meta'),
_('Different URL'),
_('Rejected'),
_('Skipped'),
_('Bad'),
_('Error'),
]
stnum = 0
for d in [ good_lists, bad_lists ]:
for status in order:
stnum += 1
if d[status]:
l = d[status]
logger.info("\n"+status+"\n%s\n"%("\n".join([book['url'] for book in l])))
for book in l:
# Add prior listorder to 10000 * status num for
# ordering of accumulated results with multiple bg
# jobs
book['reportorder'] = stnum*10000 + book['listorder']
del d[status]
# just in case a status is added but doesn't appear in order.
for status in d.keys():
logger.info("\n"+status+"\n%s\n"%("\n".join([book['url'] for book in d[status]])))
# return the book list as the job result
return book_list
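The `reportorder` scheme above (status bucket first, then original list position) can be sketched as a plain sort key; buckets are spaced 10000 apart so `listorder` never crosses a bucket boundary (assuming fewer than 10000 books per job):

```python
# (status_num, listorder) pairs standing in for books.
def reportorder(status_num, listorder):
    return status_num * 10000 + listorder

books = [(2, 5), (1, 7), (2, 1)]
ordered = sorted(books, key=lambda b: reportorder(*b))
print(ordered)  # [(1, 7), (2, 1), (2, 5)]
```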
def do_download_for_worker(book,options,merge,notification=lambda x,y:x):
'''
Child job, to download story when run as a worker job
'''
from calibre_plugins.fanficfare_plugin import FanFicFareBase
fffbase = FanFicFareBase(options['plugin_path'])
with fffbase: # so the sys.path was modified while loading the
# plug impl.
from calibre_plugins.fanficfare_plugin.prefs import (
SAVE_YES, SAVE_YES_UNLESS_SITE, OVERWRITE, OVERWRITEALWAYS, UPDATE,
UPDATEALWAYS, ADDNEW, SKIP, CALIBREONLY, CALIBREONLYSAVECOL)
from calibre_plugins.fanficfare_plugin.wordcount import get_word_count
from fanficfare import adapters, writers
from fanficfare.epubutils import get_update_data
from fanficfare.exceptions import NotGoingToDownload
from fanficfare.six import text_type as unicode
from calibre_plugins.fanficfare_plugin.fff_util import get_fff_config
try:
logger.info("\n\n" + ("-"*80) + " " + book['url'])
## No need to download at all. Can happen now due to
## collision moving into book for CALIBREONLY changing to
## ADDNEW when story URL not in library.
if book['collision'] in (CALIBREONLY, CALIBREONLYSAVECOL):
logger.info("Skipping CALIBREONLY 'update' down inside worker")
return book
book['comment'] = _('Download started...')
configuration = get_fff_config(book['url'],
options['fileform'],
options['personal.ini'])
# images only for epub, html, even if the user mistakenly
# turned it on elsewhere.
if options['fileform'] not in ("epub","html"):
configuration.set("overrides","include_images","false")
adapter = adapters.getAdapter(configuration,book['url'])
adapter.is_adult = book['is_adult']
adapter.username = book['username']
adapter.password = book['password']
adapter.totp = book['totp']
adapter.setChaptersRange(book['begin'],book['end'])
## each site download job starts with a new copy of the
## cookiejar and basic_cache from the FG process. They
## are not shared between different sites' BG downloads
if 'basic_cache' in options:
configuration.set_basic_cache(options['basic_cache'])
else:
options['basic_cache'] = configuration.get_basic_cache()
options['basic_cache'].load_cache(options['basic_cachefile'])
if 'cookiejar' in options:
configuration.set_cookiejar(options['cookiejar'])
else:
options['cookiejar'] = configuration.get_cookiejar()
options['cookiejar'].load_cookiejar(options['cookiejarfile'])
story = adapter.getStoryMetadataOnly()
if not story.getMetadata("series") and 'calibre_series' in book:
adapter.setSeries(book['calibre_series'][0],book['calibre_series'][1])
# logger.debug(merge)
# logger.debug(book.get('epub_for_update','(NONE)'))
# logger.debug(options.get('mergebook','(NOMERGEBOOK)'))
# is a merge, is a pre-existing anthology, and is not a pre-existing book in anthology.
if merge and 'mergebook' in options and 'epub_for_update' not in book:
# internal for plugin anthologies to mark chapters
# (new) in new stories
story.setMetadata("newforanthology","true")
logger.debug("metadata newforanthology:%s"%story.getMetadata("newforanthology"))
# set PI version instead of default.
if 'version' in options:
story.setMetadata('version',options['version'])
book['title'] = story.getMetadata("title", removeallentities=True)
book['author_sort'] = book['author'] = story.getList("author", removeallentities=True)
book['publisher'] = story.getMetadata("publisher")
book['url'] = story.getMetadata("storyUrl", removeallentities=True)
book['comments'] = story.get_sanitized_description()
book['series'] = story.getMetadata("series", removeallentities=True)
if story.getMetadataRaw('datePublished'):
book['pubdate'] = story.getMetadataRaw('datePublished').replace(tzinfo=local_tz)
if story.getMetadataRaw('dateUpdated'):
book['updatedate'] = story.getMetadataRaw('dateUpdated').replace(tzinfo=local_tz)
if story.getMetadataRaw('dateCreated'):
book['timestamp'] = story.getMetadataRaw('dateCreated').replace(tzinfo=local_tz)
else:
book['timestamp'] = datetime.now().replace(tzinfo=local_tz) # need *something* there for calibre.
writer = writers.getWriter(options['fileform'],configuration,adapter)
outfile = book['outfile']
## checks were done earlier, it's new or not dup or newer--just write it.
if book['collision'] in (ADDNEW, SKIP, OVERWRITE, OVERWRITEALWAYS) or \
('epub_for_update' not in book and book['collision'] in (UPDATE, UPDATEALWAYS)):
# preserve logfile even on overwrite.
if 'epub_for_update' in book:
adapter.logfile = get_update_data(book['epub_for_update'])[6]
# change the existing entries id to notid so
# write_epub writes a whole new set to indicate overwrite.
if adapter.logfile:
adapter.logfile = adapter.logfile.replace("span id","span notid")
if book['collision'] == OVERWRITE and 'fileupdated' in book:
lastupdated=story.getMetadataRaw('dateUpdated')
fileupdated=book['fileupdated']
# updated doesn't have time (or is midnight), use dates only.
# updated does have time, use full timestamps.
if (lastupdated.time() == time.min and fileupdated.date() > lastupdated.date()) or \
(lastupdated.time() != time.min and fileupdated > lastupdated):
raise NotGoingToDownload(_("Not Overwriting, web site is not newer."),'edit-undo.png',showerror=False)
logger.info("write to %s"%outfile)
inject_cal_cols(book,story,configuration)
writer.writeStory(outfilename=outfile,
forceOverwrite=True,
notification=notification)
if adapter.story.chapter_error_count > 0:
book['comment'] = _('Download %(fileform)s completed, %(failed)s failed chapters, %(total)s total chapters.')%\
{'fileform':options['fileform'],
'failed':adapter.story.chapter_error_count,
'total':story.getMetadata("numChapters")}
book['chapter_error_count'] = adapter.story.chapter_error_count
else:
book['comment'] = _('Download %(fileform)s completed, %(total)s chapters.')%\
{'fileform':options['fileform'],
'total':story.getMetadata("numChapters")}
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
## checks were done earlier, just update it.
elif 'epub_for_update' in book and book['collision'] in (UPDATE, UPDATEALWAYS):
# update now handled by pre-populating the old images and
# chapters in the adapter rather than merging epubs.
#urlchaptercount = int(story.getMetadata('numChapters').replace(',',''))
# returns int adjusted for start-end range.
urlchaptercount = story.getChapterCount()
(url,
chaptercount,
adapter.oldchapters,
adapter.oldimgs,
adapter.oldcover,
adapter.calibrebookmark,
adapter.logfile,
adapter.oldchaptersmap,
adapter.oldchaptersdata) = get_update_data(book['epub_for_update'])[0:9]
# dup handling from fff_plugin needed for anthology updates & BG metadata.
if book['collision'] in (UPDATE,UPDATEALWAYS):
if chaptercount == urlchaptercount and book['collision'] == UPDATE:
if merge:
## Deliberately pass for UPDATEALWAYS merge.
book['comment']=_("Already contains %d chapters. Reuse as is.")%chaptercount
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
book['outfile'] = book['epub_for_update'] # for anthology merge ops.
return book
else:
raise NotGoingToDownload(_("Already contains %d chapters.")%chaptercount,'edit-undo.png',showerror=False)
elif chaptercount > urlchaptercount and not (book['collision'] == UPDATEALWAYS and adapter.getConfig('force_update_epub_always')):
raise NotGoingToDownload(_("Existing epub contains %d chapters, web site only has %d. Use Overwrite or force_update_epub_always to force update.") % (chaptercount,urlchaptercount),'dialog_error.png')
elif chaptercount == 0:
raise NotGoingToDownload(_("FanFicFare doesn't recognize chapters in existing epub, epub is probably from a different source. Use Overwrite to force update."),'dialog_error.png')
if not (book['collision'] == UPDATEALWAYS and chaptercount == urlchaptercount) \
and adapter.getConfig("do_update_hook"):
chaptercount = adapter.hookForUpdates(chaptercount)
logger.info("Do update - epub(%d) vs url(%d)" % (chaptercount, urlchaptercount))
logger.info("write to %s"%outfile)
inject_cal_cols(book,story,configuration)
writer.writeStory(outfilename=outfile,
forceOverwrite=True,
notification=notification)
if adapter.story.chapter_error_count > 0:
book['comment'] = _('Update %(fileform)s completed, added %(added)s chapters, %(failed)s failed chapters, for %(total)s total.')%\
{'fileform':options['fileform'],
'failed':adapter.story.chapter_error_count,
'added':(urlchaptercount-chaptercount),
'total':urlchaptercount}
book['chapter_error_count'] = adapter.story.chapter_error_count
else:
book['comment'] = _('Update %(fileform)s completed, added %(added)s chapters for %(total)s total.')%\
{'fileform':options['fileform'],'added':(urlchaptercount-chaptercount),'total':urlchaptercount}
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
else:
## Shouldn't ever get here, but hey, it happened once
## before with prefs['collision']
raise Exception("Impossible state reached -- Book: %s:\nOptions:%s:"%(book,options))
if options['do_wordcount'] == SAVE_YES or (
options['do_wordcount'] == SAVE_YES_UNLESS_SITE and not story.getMetadataRaw('numWords') ):
try:
wordcount = get_word_count(outfile)
# logger.info("get_word_count:%s"%wordcount)
# clear cache for the rather unusual case of
# numWords affecting other previously cached
# entries.
story.clear_processed_metadata_cache()
story.setMetadata('numWords',wordcount)
writer.writeStory(outfilename=outfile, forceOverwrite=True)
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
if options['savemetacol'] != '':
book['savemetacol'] = story.dump_html_metadata()
except:
logger.error("WordCount failed")
if options['smarten_punctuation'] and options['fileform'] == "epub":
# for smarten punc
from calibre.ebooks.oeb.polish.main import polish, ALL_OPTS
from calibre.utils.logging import Log
from collections import namedtuple
# do smarten_punctuation from calibre's polish feature
data = {'smarten_punctuation':True}
opts = ALL_OPTS.copy()
opts.update(data)
O = namedtuple('Options', ' '.join(ALL_OPTS.keys()))
opts = O(**opts)
log = Log(level=Log.DEBUG)
polish({outfile:outfile}, opts, log, logger.info)
## here to catch tags set in chapters in literotica for
## both overwrites and updates.
book['tags'] = story.getSubjectTags(removeallentities=True)
except NotGoingToDownload as d:
book['good']=False
book['status']=_('Bad')
book['showerror']=d.showerror
book['comment']=unicode(d)
book['icon'] = d.icon
except Exception as e:
book['good']=False
book['status']=_('Error')
book['comment']=unicode(e)
book['icon']='dialog_error.png'
book['status'] = _('Error')
logger.info("Exception: %s:%s"%(book,book['comment']),exc_info=True)
return book
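The OVERWRITE freshness check above (use dates only when the site's dateUpdated has no time component, full timestamps otherwise) can be sketched in isolation. This is an illustrative stand-in, not a FanFicFare helper; the name `file_is_newer` is invented here.

```python
# Illustrative sketch of the "Overwrite if Newer" test: when the site's
# dateUpdated is at midnight (no time-of-day), compare dates only;
# otherwise compare full timestamps.  file_is_newer() is a hypothetical
# helper, not part of FanFicFare.
from datetime import datetime, time

def file_is_newer(lastupdated, fileupdated):
    """True when the local file is strictly newer than the web site,
    i.e. the overwrite would be refused."""
    if lastupdated.time() == time.min:
        # site only gives a date: a same-day local file is not "newer"
        return fileupdated.date() > lastupdated.date()
    return fileupdated > lastupdated

# Site reports a bare date; file written later the same day: overwrite proceeds.
print(file_is_newer(datetime(2024, 5, 1), datetime(2024, 5, 1, 9, 30)))   # -> False
# Site reports a full timestamp; local file is strictly newer: download skipped.
print(file_is_newer(datetime(2024, 5, 1, 8, 0), datetime(2024, 5, 1, 9, 30)))  # -> True
```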
## calibre's columns for an existing book are passed in and injected
## into the story's metadata. For convenience, we also add labels and
## valid_entries for them in a special [injected] section that has
## even less precedence than [defaults]
def inject_cal_cols(book,story,configuration):
configuration.remove_section('injected')
if 'calibre_columns' in book:
injectini = ['[injected]']
extra_valid = []
for k in book['calibre_columns'].keys():
v = book['calibre_columns'][k]
story.setMetadata(k,v['val'])
injectini.append('%s_label:%s'%(k,v['label']))
extra_valid.append(k)
if extra_valid: # if empty, there's nothing to add.
injectini.append("add_to_extra_valid_entries:,"+','.join(extra_valid))
configuration.read_file(StringIO('\n'.join(injectini)))
#print("added:\n%s\n"%('\n'.join(injectini)))
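inject_cal_cols builds an INI fragment in memory and feeds it to the configuration through a StringIO, so calibre column values get a low-precedence `[injected]` section without touching any file. A minimal sketch of the same technique, using stdlib `configparser` in place of FanFicFare's `Configuration` class (an assumption; the real class layers its own precedence rules on top):

```python
# Sketch of the [injected]-section technique: build INI text from
# calibre column data, then parse it from an in-memory stream.
# ConfigParser stands in for FanFicFare's Configuration (assumption).
from io import StringIO
from configparser import ConfigParser

def build_injected_section(calibre_columns):
    """Return INI text with a label per column plus the extra_valid list."""
    lines = ['[injected]']
    extra_valid = []
    for key, col in calibre_columns.items():
        lines.append('%s_label: %s' % (key, col['label']))
        extra_valid.append(key)
    if extra_valid:  # if empty, there's nothing to add
        lines.append('add_to_extra_valid_entries: ,' + ','.join(extra_valid))
    return '\n'.join(lines)

config = ConfigParser()
fragment = build_injected_section({'rating': {'val': 5, 'label': 'Rating'}})
config.read_file(StringIO(fragment))
print(config.get('injected', 'rating_label'))  # -> Rating
```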


@@ -3,22 +3,9 @@
[defaults]
## [defaults] section applies to all formats and sites but may be
## overridden at several levels. Example:
## [defaults]
## titlepage_entries: category,genre, status
## [www.whofic.com]
## # overrides defaults.
## titlepage_entries: category,genre, status,dateUpdated,rating
## [epub]
## # overrides defaults & site section
## titlepage_entries: category,genre, status,datePublished,dateUpdated,dateCreated
## [www.whofic.com:epub]
## # overrides defaults, site section & format section
## titlepage_entries: category,genre, status,datePublished
## [overrides]
## # overrides all other sections
## titlepage_entries: category
## overridden at several levels. See
## https://github.com/JimmXinu/FanFicFare/wiki/INI-File for more
## details.
## Some sites also require the user to confirm they are adult for
## adult content. Uncomment by removing '#' in front of is_adult.
@@ -29,38 +16,32 @@
## want to make them all look the same? Strip them off, then add them
## back on with add_chapter_numbers. Don't like the way it strips
## numbers or adds them back? See chapter_title_strip_pattern and
## chapter_title_add_pattern.
## chapter_title_add_pattern in defaults.ini.
#strip_chapter_numbers:true
#add_chapter_numbers:true
## Add this to genre if there's more than one category.
#add_genre_when_multi_category: Crossover
[epub]
## include images from img tags in the body and summary of stories.
## Include images from img tags in the body and summary of stories.
## Images will be converted to jpg for size if possible. Images work
## in epub format only. To get mobi or other format with images,
## download as epub and use Calibre to convert.
## true by default, uncomment and set false to not include images.
#include_images:true
## If not set, the summary will have all html stripped for safety.
## If set false, the summary will have all html stripped for safety.
## Both this and include_images must be true to get images in the
## summary.
## true by default, uncomment and set false to not keep summary html.
#keep_summary_html:true
## If set, the first image found will be made the cover image. If
## keep_summary_html is true, any images in summary will be before any
## If set true, and there isn't a specific cover image, the first
## image found in the story will be made the cover image. If
## keep_summary_html is true, images in the summary will be before any
## in chapters.
## true by default, uncomment and set false to turn off
#make_firstimage_cover:true
## Resize images down to width, height, preserving aspect ratio.
## Nook size, with margin.
#image_max_size: 580, 725
## Change image to grayscale, if graphics library allows, to save
## space.
#grayscale_images: false
## Most common, I expect will be using this to save username/passwords
## for different sites. Here are a few examples. See defaults.ini
@@ -72,28 +53,6 @@
## default is false
#collect_series: true
[ficwad.com]
#username:YourUsername
#password:YourPassword
[www.adastrafanfic.com]
## Some sites do not require a login, but do require the user to
## confirm they are adult for adult content.
#is_adult:true
[www.twcslibrary.net]
#username:YourName
#password:yourpassword
#is_adult:true
## default is false
#collect_series: true
[www.fictionalley.org]
#is_adult:true
[www.harrypotterfanfiction.com]
#is_adult:true
[www.fimfiction.net]
#is_adult:true
#fail_on_password: false
@@ -102,8 +61,9 @@
#is_adult:true
## tth is a little unusual--it doesn't require user/pass, but the site
## keeps track of which chapters you've read and won't send another
## update until it thinks you're up to date. This way, on download,
## it thinks you're up to date.
## update until it thinks you're up to date. If you set
## username/password, FFF will login to download. Then the site
## thinks you're up to date.
#username:YourName
#password:yourpassword
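The `[defaults]` comments above describe settings being overridden at several levels (defaults, site section, format section, site:format, overrides). A hedged sketch of that lookup order; the section names and the `get_setting` helper are illustrative, not FanFicFare's implementation:

```python
# Later sections win: defaults < site < format < site:format < overrides.
# get_setting() is a hypothetical helper illustrating the precedence
# described in the INI comments, not the real Configuration class.
def get_setting(sections, site, fileform, key):
    value = None
    for name in ('defaults', site, fileform,
                 '%s:%s' % (site, fileform), 'overrides'):
        value = sections.get(name, {}).get(key, value)
    return value

sections = {
    'defaults': {'titlepage_entries': 'category,genre,status'},
    'www.whofic.com': {'titlepage_entries': 'category,genre,status,rating'},
    'epub': {},
}
print(get_setting(sections, 'www.whofic.com', 'epub', 'titlepage_entries'))
# -> category,genre,status,rating   (site section overrides defaults)
```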


@@ -1,260 +1,282 @@
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2016, Jim Miller'
__docformat__ = 'restructuredtext en'
import logging
logger = logging.getLogger(__name__)
import copy
from calibre.utils.config import JSONConfig
from calibre.gui2.ui import get_gui
from calibre_plugins.fanficfare_plugin.common_utils import get_library_uuid
SKIP=_('Skip')
ADDNEW=_('Add New Book')
UPDATE=_('Update EPUB if New Chapters')
UPDATEALWAYS=_('Update EPUB Always')
OVERWRITE=_('Overwrite if Newer')
OVERWRITEALWAYS=_('Overwrite Always')
CALIBREONLY=_('Update Calibre Metadata from Web Site')
CALIBREONLYSAVECOL=_('Update Calibre Metadata from Saved Metadata Column')
collision_order=[SKIP,
ADDNEW,
UPDATE,
UPDATEALWAYS,
OVERWRITE,
OVERWRITEALWAYS,
CALIBREONLY,
CALIBREONLYSAVECOL,]
# best idea I've had for how to deal with config/pref saving the
# collision name in english.
SAVE_SKIP='Skip'
SAVE_ADDNEW='Add New Book'
SAVE_UPDATE='Update EPUB if New Chapters'
SAVE_UPDATEALWAYS='Update EPUB Always'
SAVE_OVERWRITE='Overwrite if Newer'
SAVE_OVERWRITEALWAYS='Overwrite Always'
SAVE_CALIBREONLY='Update Calibre Metadata Only'
SAVE_CALIBREONLYSAVECOL='Update Calibre Metadata Only(Saved Column)'
save_collisions={
SKIP:SAVE_SKIP,
ADDNEW:SAVE_ADDNEW,
UPDATE:SAVE_UPDATE,
UPDATEALWAYS:SAVE_UPDATEALWAYS,
OVERWRITE:SAVE_OVERWRITE,
OVERWRITEALWAYS:SAVE_OVERWRITEALWAYS,
CALIBREONLY:SAVE_CALIBREONLY,
CALIBREONLYSAVECOL:SAVE_CALIBREONLYSAVECOL,
SAVE_SKIP:SKIP,
SAVE_ADDNEW:ADDNEW,
SAVE_UPDATE:UPDATE,
SAVE_UPDATEALWAYS:UPDATEALWAYS,
SAVE_OVERWRITE:OVERWRITE,
SAVE_OVERWRITEALWAYS:OVERWRITEALWAYS,
SAVE_CALIBREONLY:CALIBREONLY,
SAVE_CALIBREONLYSAVECOL:CALIBREONLYSAVECOL,
}
anthology_collision_order=[UPDATE,
UPDATEALWAYS,
OVERWRITEALWAYS]
# Show translated strings, but save the same string in prefs so your
# prefs are the same in different languages.
YES=_('Yes, Always')
SAVE_YES='Yes'
YES_IF_IMG=_('Yes, if EPUB has a cover image')
SAVE_YES_IF_IMG='Yes, if img'
YES_UNLESS_IMG=_('Yes, unless FanFicFare found a cover image')
SAVE_YES_UNLESS_IMG='Yes, unless img'
YES_UNLESS_SITE=_('Yes, unless found on site')
SAVE_YES_UNLESS_SITE='Yes, unless site'
NO=_('No')
SAVE_NO='No'
prefs_save_options = {
YES:SAVE_YES,
SAVE_YES:YES,
YES_IF_IMG:SAVE_YES_IF_IMG,
SAVE_YES_IF_IMG:YES_IF_IMG,
YES_UNLESS_IMG:SAVE_YES_UNLESS_IMG,
SAVE_YES_UNLESS_IMG:YES_UNLESS_IMG,
NO:SAVE_NO,
SAVE_NO:NO,
YES_UNLESS_SITE:SAVE_YES_UNLESS_SITE,
SAVE_YES_UNLESS_SITE:YES_UNLESS_SITE,
}
updatecalcover_order=[YES,YES_IF_IMG,NO]
gencalcover_order=[YES,YES_UNLESS_IMG,NO]
do_wordcount_order=[YES,YES_UNLESS_SITE,NO]
# if don't have any settings for FanFicFarePlugin, copy from
# predecessor FanFictionDownLoaderPlugin.
FFDL_PREFS_NAMESPACE = 'FanFictionDownLoaderPlugin'
PREFS_NAMESPACE = 'FanFicFarePlugin'
PREFS_KEY_SETTINGS = 'settings'
# Set defaults used by all. Library specific settings continue to
# take from here.
default_prefs = {}
default_prefs['personal.ini'] = get_resources('plugin-example.ini')
default_prefs['cal_cols_pass_in'] = False
default_prefs['rejecturls'] = ''
default_prefs['rejectreasons'] = '''Sucked
Boring
Dup from another site'''
default_prefs['reject_always'] = False
default_prefs['reject_delete_default'] = True
default_prefs['updatemeta'] = True
default_prefs['bgmeta'] = False
default_prefs['updateepubcover'] = False
default_prefs['keeptags'] = False
default_prefs['suppressauthorsort'] = False
default_prefs['suppresstitlesort'] = False
default_prefs['mark'] = False
default_prefs['showmarked'] = False
default_prefs['autoconvert'] = False
default_prefs['urlsfromclip'] = True
default_prefs['updatedefault'] = True
default_prefs['fileform'] = 'epub'
default_prefs['collision'] = SAVE_UPDATE
default_prefs['deleteotherforms'] = False
default_prefs['adddialogstaysontop'] = False
default_prefs['lookforurlinhtml'] = False
default_prefs['checkforseriesurlid'] = True
default_prefs['auto_reject_seriesurlid'] = False
default_prefs['checkforurlchange'] = True
default_prefs['injectseries'] = False
default_prefs['matchtitleauth'] = True
default_prefs['do_wordcount'] = SAVE_YES_UNLESS_SITE
default_prefs['smarten_punctuation'] = False
default_prefs['show_est_time'] = False
default_prefs['send_lists'] = ''
default_prefs['read_lists'] = ''
default_prefs['addtolists'] = False
default_prefs['addtoreadlists'] = False
default_prefs['addtolistsonread'] = False
default_prefs['autounnew'] = False
default_prefs['updatecalcover'] = None
default_prefs['gencalcover'] = SAVE_YES
default_prefs['updatecover'] = False
default_prefs['calibre_gen_cover'] = False
default_prefs['plugin_gen_cover'] = True
default_prefs['gcnewonly'] = False
default_prefs['gc_site_settings'] = {}
default_prefs['allow_gc_from_ini'] = True
default_prefs['gc_polish_cover'] = False
default_prefs['countpagesstats'] = []
default_prefs['wordcountmissing'] = False
default_prefs['errorcol'] = ''
default_prefs['save_all_errors'] = True
default_prefs['savemetacol'] = ''
default_prefs['lastcheckedcol'] = ''
default_prefs['custom_cols'] = {}
default_prefs['custom_cols_newonly'] = {}
default_prefs['allow_custcol_from_ini'] = True
default_prefs['std_cols_newonly'] = {}
default_prefs['set_author_url'] = True
default_prefs['includecomments'] = False
default_prefs['anth_comments_newonly'] = True
default_prefs['imapserver'] = ''
default_prefs['imapuser'] = ''
default_prefs['imappass'] = ''
default_prefs['imapsessionpass'] = False
default_prefs['imapfolder'] = 'INBOX'
default_prefs['imapmarkread'] = True
default_prefs['auto_reject_from_email'] = False
default_prefs['update_existing_only_from_email'] = False
default_prefs['download_from_email_immediately'] = False
def set_library_config(library_config,db):
db.prefs.set_namespaced(PREFS_NAMESPACE,
PREFS_KEY_SETTINGS,
library_config)
def get_library_config(db):
library_id = get_library_uuid(db)
library_config = None
if library_config is None:
#print("get prefs from db")
library_config = db.prefs.get_namespaced(PREFS_NAMESPACE,
PREFS_KEY_SETTINGS)
# if don't have any settings for FanFicFarePlugin, copy from
# predecessor FanFictionDownLoaderPlugin.
if library_config is None:
logger.info("Attempting to read settings from predecessor--FFDL")
library_config = db.prefs.get_namespaced(FFDL_PREFS_NAMESPACE,
PREFS_KEY_SETTINGS)
if library_config is None:
# defaults.
logger.info("Using default settings")
library_config = copy.deepcopy(default_prefs)
return library_config
# fake out so I don't have to change the prefs calls anywhere. The
# Java programmer in me is offended by op-overloading, but it's very
# tidy.
class PrefsFacade():
def _get_db(self):
if self.passed_db:
return self.passed_db
else:
# In the GUI plugin we want current db so we detect when
# it's changed. CLI plugin calls need to pass db in.
return get_gui().current_db
def __init__(self,passed_db=None):
self.default_prefs = default_prefs
self.libraryid = None
self.current_prefs = None
self.passed_db=passed_db
def _get_prefs(self):
libraryid = get_library_uuid(self._get_db())
if self.current_prefs == None or self.libraryid != libraryid:
#print("self.current_prefs == None(%s) or self.libraryid != libraryid(%s)"%(self.current_prefs == None,self.libraryid != libraryid))
self.libraryid = libraryid
self.current_prefs = get_library_config(self._get_db())
return self.current_prefs
def __getitem__(self,k):
prefs = self._get_prefs()
if k not in prefs:
# pulls from default_prefs.defaults automatically if not set
# in default_prefs
return self.default_prefs[k]
return prefs[k]
def __setitem__(self,k,v):
prefs = self._get_prefs()
prefs[k]=v
# self._save_prefs(prefs)
def __delitem__(self,k):
prefs = self._get_prefs()
if k in prefs:
del prefs[k]
def save_to_db(self):
set_library_config(self._get_prefs(),self._get_db())
prefs = PrefsFacade()
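The save_collisions dict above maps in both directions: the UI shows a translated string, but prefs always store the fixed English token, so saved settings survive locale changes. A small sketch of that round trip; the fake `_()` below simulates a non-English locale (an assumption, the real code uses gettext via calibre):

```python
# Round trip behind save_collisions / prefs_save_options: display
# string -> stable English token on save, token -> current locale's
# display string on load.  _() here is a pretend-translator standing
# in for gettext (assumption).
_ = lambda s: '[fr] ' + s          # simulate a French locale

UPDATE = _('Update EPUB if New Chapters')       # what the UI shows
SAVE_UPDATE = 'Update EPUB if New Chapters'     # what prefs store
save_collisions = {UPDATE: SAVE_UPDATE, SAVE_UPDATE: UPDATE}

stored = save_collisions[UPDATE]   # saving: display -> token
shown = save_collisions[stored]    # loading: token -> display
print(stored)  # -> Update EPUB if New Chapters
print(shown)   # -> [fr] Update EPUB if New Chapters
```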
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2021, Jim Miller'
__docformat__ = 'restructuredtext en'
import logging
logger = logging.getLogger(__name__)
import copy
from calibre.gui2.ui import get_gui
# pulls in translation files for _() strings
try:
load_translations()
except NameError:
pass # load_translations() added in calibre 1.9
from calibre_plugins.fanficfare_plugin import __version__ as plugin_version
from calibre_plugins.fanficfare_plugin.common_utils import get_library_uuid
SKIP=_('Skip')
ADDNEW=_('Add New Book')
UPDATE=_('Update EPUB if New Chapters')
UPDATEALWAYS=_('Update EPUB Always')
OVERWRITE=_('Overwrite if Newer')
OVERWRITEALWAYS=_('Overwrite Always')
CALIBREONLY=_('Update Calibre Metadata from Web Site')
CALIBREONLYSAVECOL=_('Update Calibre Metadata from Saved Metadata Column')
collision_order=[SKIP,
ADDNEW,
UPDATE,
UPDATEALWAYS,
OVERWRITE,
OVERWRITEALWAYS,
CALIBREONLY,
CALIBREONLYSAVECOL,]
# best idea I've had for how to deal with config/pref saving the
# collision name in english.
SAVE_SKIP='Skip'
SAVE_ADDNEW='Add New Book'
SAVE_UPDATE='Update EPUB if New Chapters'
SAVE_UPDATEALWAYS='Update EPUB Always'
SAVE_OVERWRITE='Overwrite if Newer'
SAVE_OVERWRITEALWAYS='Overwrite Always'
SAVE_CALIBREONLY='Update Calibre Metadata Only'
SAVE_CALIBREONLYSAVECOL='Update Calibre Metadata Only(Saved Column)'
save_collisions={
SKIP:SAVE_SKIP,
ADDNEW:SAVE_ADDNEW,
UPDATE:SAVE_UPDATE,
UPDATEALWAYS:SAVE_UPDATEALWAYS,
OVERWRITE:SAVE_OVERWRITE,
OVERWRITEALWAYS:SAVE_OVERWRITEALWAYS,
CALIBREONLY:SAVE_CALIBREONLY,
CALIBREONLYSAVECOL:SAVE_CALIBREONLYSAVECOL,
SAVE_SKIP:SKIP,
SAVE_ADDNEW:ADDNEW,
SAVE_UPDATE:UPDATE,
SAVE_UPDATEALWAYS:UPDATEALWAYS,
SAVE_OVERWRITE:OVERWRITE,
SAVE_OVERWRITEALWAYS:OVERWRITEALWAYS,
SAVE_CALIBREONLY:CALIBREONLY,
SAVE_CALIBREONLYSAVECOL:CALIBREONLYSAVECOL,
}
anthology_collision_order=[UPDATE,
UPDATEALWAYS,
OVERWRITEALWAYS]
# Show translated strings, but save the same string in prefs so your
# prefs are the same in different languages.
YES=_('Yes, Always')
SAVE_YES='Yes'
YES_IF_IMG=_('Yes, if EPUB has a cover image')
SAVE_YES_IF_IMG='Yes, if img'
YES_UNLESS_IMG=_('Yes, unless FanFicFare found a cover image')
SAVE_YES_UNLESS_IMG='Yes, unless img'
YES_UNLESS_SITE=_('Yes, unless found on site')
SAVE_YES_UNLESS_SITE='Yes, unless site'
NO=_('No')
SAVE_NO='No'
prefs_save_options = {
YES:SAVE_YES,
SAVE_YES:YES,
YES_IF_IMG:SAVE_YES_IF_IMG,
SAVE_YES_IF_IMG:YES_IF_IMG,
YES_UNLESS_IMG:SAVE_YES_UNLESS_IMG,
SAVE_YES_UNLESS_IMG:YES_UNLESS_IMG,
NO:SAVE_NO,
SAVE_NO:NO,
YES_UNLESS_SITE:SAVE_YES_UNLESS_SITE,
SAVE_YES_UNLESS_SITE:YES_UNLESS_SITE,
}
updatecalcover_order=[YES,YES_IF_IMG,NO]
gencalcover_order=[YES,YES_UNLESS_IMG,NO]
do_wordcount_order=[YES,YES_UNLESS_SITE,NO]
PREFS_NAMESPACE = 'FanFicFarePlugin'
PREFS_KEY_SETTINGS = 'settings'
# Set defaults used by all. Library specific settings continue to
# take from here.
default_prefs = {}
default_prefs['last_saved_version'] = (0,0,0)
default_prefs['personal.ini'] = get_resources('plugin-example.ini')
default_prefs['cal_cols_pass_in'] = False
default_prefs['rejecturls'] = '' # removed, but need empty default for fallback
default_prefs['rejectreasons'] = '''Sucked
Boring
Dup from another site'''
default_prefs['reject_always'] = False
default_prefs['reject_delete_default'] = True
default_prefs['updatemeta'] = True
default_prefs['bgmeta'] = False
#default_prefs['updateepubcover'] = True # removed in favor of always True Oct 2022
default_prefs['keeptags'] = False
default_prefs['suppressauthorsort'] = False
default_prefs['suppresstitlesort'] = False
default_prefs['authorcase'] = False
default_prefs['titlecase'] = False
default_prefs['seriescase'] = False
default_prefs['setanthologyseries'] = False
default_prefs['mark'] = False
default_prefs['mark_success'] = True
default_prefs['mark_failed'] = True
default_prefs['mark_chapter_error'] = True
default_prefs['showmarked'] = False
default_prefs['autoconvert'] = False
default_prefs['urlsfromclip'] = True
default_prefs['button_instantpopup'] = False
default_prefs['updatedefault'] = True
default_prefs['fileform'] = 'epub'
default_prefs['collision'] = SAVE_UPDATE
default_prefs['deleteotherforms'] = False
default_prefs['adddialogstaysontop'] = False
default_prefs['lookforurlinhtml'] = False
default_prefs['checkforseriesurlid'] = True
default_prefs['auto_reject_seriesurlid'] = False
default_prefs['mark_series_anthologies'] = False
default_prefs['checkforurlchange'] = True
default_prefs['injectseries'] = False
default_prefs['matchtitleauth'] = True
default_prefs['do_wordcount'] = SAVE_YES_UNLESS_SITE
default_prefs['smarten_punctuation'] = False
default_prefs['show_est_time'] = False
default_prefs['send_lists'] = ''
default_prefs['read_lists'] = ''
default_prefs['addtolists'] = False
default_prefs['addtoreadlists'] = False
default_prefs['addtolistsonread'] = False
default_prefs['autounnew'] = False
default_prefs['updatecalcover'] = SAVE_YES_IF_IMG
default_prefs['covernewonly'] = False
default_prefs['gencalcover'] = SAVE_YES_UNLESS_IMG
default_prefs['updatecover'] = False
default_prefs['calibre_gen_cover'] = True
default_prefs['plugin_gen_cover'] = False
default_prefs['gcnewonly'] = True
default_prefs['gc_site_settings'] = {}
default_prefs['allow_gc_from_ini'] = True
default_prefs['gc_polish_cover'] = False
default_prefs['countpagesstats'] = []
default_prefs['wordcountmissing'] = False
default_prefs['errorcol'] = ''
default_prefs['save_all_errors'] = True
default_prefs['savemetacol'] = ''
default_prefs['lastcheckedcol'] = ''
default_prefs['custom_cols'] = {}
default_prefs['custom_cols_newonly'] = {}
default_prefs['allow_custcol_from_ini'] = True
default_prefs['std_cols_newonly'] = {}
default_prefs['set_author_url'] = True
default_prefs['set_series_url'] = True
default_prefs['includecomments'] = False
default_prefs['anth_comments_newonly'] = True
default_prefs['imapserver'] = ''
default_prefs['imapuser'] = ''
default_prefs['imappass'] = ''
default_prefs['imapsessionpass'] = False
default_prefs['imapfolder'] = 'INBOX'
default_prefs['imaptags'] = ''
default_prefs['imapmarkread'] = True
default_prefs['auto_reject_from_email'] = False
default_prefs['update_existing_only_from_email'] = False
default_prefs['download_from_email_immediately'] = False
#default_prefs['single_proc_jobs'] = True # setting and code removed
default_prefs['site_split_jobs'] = True
default_prefs['reconsolidate_jobs'] = True
def set_library_config(library_config,db,setting=PREFS_KEY_SETTINGS):
db.prefs.set_namespaced(PREFS_NAMESPACE,
setting,
library_config)
def get_library_config(db,setting=PREFS_KEY_SETTINGS,def_prefs=default_prefs):
library_id = get_library_uuid(db)
library_config = None
if library_config is None:
#print("get prefs from db")
library_config = db.prefs.get_namespaced(PREFS_NAMESPACE,
setting)
if library_config is None:
# defaults.
logger.info("Using default settings")
library_config = copy.deepcopy(def_prefs)
return library_config
# fake out so I don't have to change the prefs calls anywhere. The
# Java programmer in me is offended by op-overloading, but it's very
# tidy.
class PrefsFacade():
def _get_db(self):
if self.passed_db:
return self.passed_db
else:
# In the GUI plugin we want current db so we detect when
# it's changed. CLI plugin calls need to pass db in.
return get_gui().current_db
def __init__(self,passed_db=None,setting=PREFS_KEY_SETTINGS,def_prefs=default_prefs):
self.default_prefs = def_prefs
self.setting=setting
self.libraryid = None
self.current_prefs = None
self.passed_db=passed_db
def _get_prefs(self):
libraryid = get_library_uuid(self._get_db())
if self.current_prefs == None or self.libraryid != libraryid:
#print("self.current_prefs == None(%s) or self.libraryid != libraryid(%s)"%(self.current_prefs == None,self.libraryid != libraryid))
self.libraryid = libraryid
self.current_prefs = get_library_config(self._get_db(),
setting=self.setting,
def_prefs=self.default_prefs)
return self.current_prefs
def __getitem__(self,k):
prefs = self._get_prefs()
if k not in prefs:
# pulls from default_prefs.defaults automatically if not set
# in default_prefs
return self.default_prefs[k]
return prefs[k]
def __setitem__(self,k,v):
prefs = self._get_prefs()
prefs[k]=v
# self._save_prefs(prefs)
def __delitem__(self,k):
prefs = self._get_prefs()
if k in prefs:
del prefs[k]
def save_to_db(self):
self['last_saved_version'] = plugin_version
set_library_config(self._get_prefs(),self._get_db(),setting=self.setting)
prefs = PrefsFacade(setting=PREFS_KEY_SETTINGS,
def_prefs=default_prefs)
rejects_data = PrefsFacade(setting="rejects_data",
def_prefs={'rejecturls_data':[]})
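PrefsFacade's `__getitem__` falls back to `default_prefs` when a key was never set for the current library. A minimal stand-in for that behavior; the real class fetches per-library prefs from calibre's database, here a plain dict stands in (assumption), and `PrefsFacadeSketch` is an invented name:

```python
# Minimal sketch of PrefsFacade's lookup: per-library settings win,
# otherwise the shared defaults value is returned.  A dict stands in
# for calibre's per-library prefs store (assumption).
class PrefsFacadeSketch:
    def __init__(self, library_prefs, defaults):
        self.library_prefs = library_prefs
        self.defaults = defaults

    def __getitem__(self, k):
        if k in self.library_prefs:
            return self.library_prefs[k]
        # fall back to the shared defaults, as PrefsFacade does
        return self.defaults[k]

    def __setitem__(self, k, v):
        self.library_prefs[k] = v

prefs = PrefsFacadeSketch({'fileform': 'mobi'},
                          {'fileform': 'epub', 'mark': False})
print(prefs['fileform'])  # -> mobi   (library setting wins)
print(prefs['mark'])      # -> False  (falls back to defaults)
```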


@@ -18,6 +18,7 @@ logger = logging.getLogger(__name__)
import re
from calibre.ebooks.oeb.iterator import EbookIterator
from fanficfare.six import text_type as unicode
RE_HTML_BODY = re.compile(u'<body[^>]*>(.*)</body>', re.UNICODE | re.DOTALL | re.IGNORECASE)
RE_STRIP_MARKUP = re.compile(u'<[^>]+>', re.UNICODE)
@@ -28,7 +29,7 @@ def get_word_count(book_path):
Estimate a word count
'''
from calibre.utils.localization import get_lang
iterator = _open_epub_file(book_path)
lang = iterator.opf.language
@@ -52,7 +53,7 @@ def _get_epub_standard_word_count(iterator, lang='en'):
'''
book_text = _read_epub_contents(iterator, strip_html=True)
try:
from calibre.spell.break_iterator import count_words
wordcount = count_words(book_text, lang)
@@ -67,7 +68,7 @@ def _get_epub_standard_word_count(iterator, lang='en'):
wordcount = get_wordcount_obj(book_text)
wordcount = wordcount.words
logger.debug('\tWord count - old method:%s'%wordcount)
return wordcount
def _read_epub_contents(iterator, strip_html=False):
@@ -92,4 +93,3 @@ def _extract_body_text(data):
if body:
return RE_STRIP_MARKUP.sub('', body[0]).replace('.','. ')
return ''


@@ -1,7 +1,23 @@
# coding: utf-8
# -*- coding: utf-8 -*-
# Copyright 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
import re
import codecs
stack = []
@@ -54,4 +70,4 @@ def flush():
del stack[:]
def get_stack():
return stack
return stack


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2015 Fanficdownloader team, 2016 FanFicFare team
# Copyright 2015 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the 'License');
# you may not use this file except in compliance with the License.
@@ -14,20 +14,23 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
try:
# just a way to switch between web service and CLI/PI
import google.appengine.api
try: # just a way to switch between CLI and PI
from calibre.constants import DEBUG
if os.environ.get('CALIBRE_WORKER', None) is not None or DEBUG:
loghandler.setLevel(logging.DEBUG)
logger.setLevel(logging.DEBUG)
else:
loghandler.setLevel(logging.CRITICAL)
logger.setLevel(logging.CRITICAL)
except:
try: # just a way to switch between CLI and PI
import calibre.constants
except:
import sys
if sys.version_info >= (2, 7):
import logging
logger = logging.getLogger(__name__)
loghandler=logging.StreamHandler()
loghandler.setFormatter(logging.Formatter("FFF: %(levelname)s: %(asctime)s: %(filename)s(%(lineno)d): %(message)s"))
logger.addHandler(loghandler)
loghandler.setLevel(logging.DEBUG)
logger.setLevel(logging.DEBUG)
import sys
if sys.version_info >= (2, 7):
import logging
logger = logging.getLogger(__name__)
loghandler=logging.StreamHandler()
loghandler.setFormatter(logging.Formatter("FFF: %(levelname)s: %(asctime)s: %(filename)s(%(lineno)d): %(message)s"))
logger.addHandler(loghandler)
loghandler.setLevel(logging.DEBUG)
logger.setLevel(logging.DEBUG)


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2016 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,159 +15,132 @@
# limitations under the License.
#
import os, re, sys, glob, types
from os.path import dirname, basename, normpath
from __future__ import absolute_import
import os, re, sys, types
from contextlib import contextmanager
import logging
import urlparse as up
# py2 vs py3 transition
from ..six.moves.urllib.parse import urlparse
logger = logging.getLogger(__name__)
from .. import exceptions as exceptions
from ..configurable import Configuration
from .. import configurable as configurable
## must import each adapter here.
import adapter_test1
import adapter_fanfictionnet
import adapter_fanficcastletvnet
import adapter_fictionalleyorg
import adapter_fictionpresscom
import adapter_ficwadcom
import adapter_fimfictionnet
import adapter_harrypotterfanfictioncom
import adapter_mediaminerorg
import adapter_potionsandsnitches
import adapter_tenhawkpresentscom
import adapter_adastrafanficcom
import adapter_tthfanficorg
import adapter_twilightednet
import adapter_whoficcom
import adapter_siyecouk
import adapter_archiveofourownorg
import adapter_ficbooknet
import adapter_portkeyorg
import adapter_mugglenetcom
import adapter_hpfandomnet
import adapter_nfacommunitycom
import adapter_midnightwhispersca
import adapter_ksarchivecom
import adapter_archiveskyehawkecom
import adapter_squidgeorgpeja
import adapter_libraryofmoriacom
import adapter_wraithbaitcom
import adapter_chaossycophanthexcom
import adapter_dramioneorg
import adapter_erosnsapphosycophanthexcom
import adapter_lumossycophanthexcom
import adapter_occlumencysycophanthexcom
import adapter_phoenixsongnet
import adapter_walkingtheplankorg
import adapter_ashwindersycophanthexcom
import adapter_thehexfilesnet
import adapter_dokugacom
import adapter_iketernalnet
import adapter_onedirectionfanfictioncom
import adapter_storiesofardacom
import adapter_samdeanarchivenu
import adapter_destinysgatewaycom
import adapter_ncisfictionnet
import adapter_thealphagatecom
import adapter_fanfiktionde
import adapter_ponyfictionarchivenet
import adapter_ncisficcom
import adapter_nationallibrarynet
import adapter_themasquenet
import adapter_pretendercentrecom
import adapter_darksolaceorg
import adapter_finestoriescom
import adapter_hpfanficarchivecom
import adapter_twilightarchivescom
import adapter_nhamagicalworldsus
import adapter_hlfictionnet
import adapter_dracoandginnycom
import adapter_scarvesandcoffeenet
import adapter_thepetulantpoetesscom
import adapter_wolverineandroguecom
import adapter_sinfuldesireorg
import adapter_merlinficdtwinscouk
import adapter_thehookupzonenet
import adapter_bloodtiesfancom
import adapter_indeathnet
import adapter_qafficcom
import adapter_efpfanficnet
import adapter_potterficscom
import adapter_efictionestelielde
import adapter_pommedesangcom
import adapter_restrictedsectionorg
import adapter_imagineeficcom
import adapter_psychficcom
import adapter_asr3slashzoneorg
import adapter_potterheadsanonymouscom
import adapter_fictionpadcom
import adapter_storiesonlinenet
import adapter_trekiverseorg
import adapter_literotica
import adapter_voracity2eficcom
import adapter_spikeluvercom
import adapter_bloodshedversecom
import adapter_nocturnallightnet
import adapter_fanfichu
import adapter_fictionmaniatv
import adapter_tolkienfanfiction
import adapter_themaplebookshelf
import adapter_fannation
import adapter_sheppardweircom
import adapter_samandjacknet
import adapter_csiforensicscom
import adapter_lotrfanfictioncom
import adapter_fhsarchivecom
import adapter_fanfictionjunkiesde
import adapter_tgstorytimecom
import adapter_itcouldhappennet
import adapter_forumsspacebattlescom
import adapter_forumssufficientvelocitycom
import adapter_forumquestionablequestingcom
import adapter_ninelivesarchivecom
import adapter_masseffect2in
import adapter_quotevcom
import adapter_mcstoriescom
import adapter_buffygilescom
import adapter_andromedawebcom
import adapter_artemisfowlcom
import adapter_naiceanilmenet
import adapter_deepinmysoulnet
import adapter_haremlucifaelcom
import adapter_kiarepositorymujajinet
import adapter_fanfictionlucifaelcom
import adapter_adultfanfictionorg
import adapter_fictionhuntcom
import adapter_royalroadl
import adapter_chosentwofanficcom
import adapter_bdsmlibrarycom
import adapter_ficsitecom
import adapter_asexstoriescom
import adapter_gluttonyfictioncom
import adapter_valentchambercom
import adapter_looselugscom
import adapter_wwwgiantessworldnet
import adapter_lotrgficcom
import adapter_tomparisdormcom
import adapter_writingwhimsicalwanderingsnet
import adapter_sugarquillnet
import adapter_wwwarea52hkhnet
import adapter_starslibrarynet
import adapter_fanficauthorsnet
import adapter_fireflyfansnet
import adapter_fireflypopulliorg
import adapter_sebklainenet
import adapter_shriftweborgbfa
import adapter_trekfanfictionnet
import adapter_wuxiaworldcom
import adapter_wwwlushstoriescom
import adapter_wwwutopiastoriescom
import adapter_sinfuldreamscomunicornfic
import adapter_sinfuldreamscomwhisperedmuse
import adapter_sinfuldreamscomwickedtemptation
from . import base_adapter
from . import base_efiction_adapter
from . import adapter_test1
from . import adapter_test2
from . import adapter_test3
from . import adapter_test4
from . import adapter_fanfictionnet
from . import adapter_fictionalleyarchiveorg
from . import adapter_fictionpresscom
from . import adapter_ficwadcom
from . import adapter_fimfictionnet
from . import adapter_mediaminerorg
from . import adapter_potionsandsnitches
from . import adapter_tenhawkpresents
from . import adapter_adastrafanficcom
from . import adapter_tthfanficorg
from . import adapter_twilightednet
from . import adapter_whoficcom
from . import adapter_siyecouk
from . import adapter_archiveofourownorg
from . import adapter_ficbooknet
from . import adapter_midnightwhispers
from . import adapter_ksarchivecom
from . import adapter_libraryofmoriacom
from . import adapter_ashwindersycophanthexcom
from . import adapter_chaossycophanthexcom
from . import adapter_erosnsapphosycophanthexcom
from . import adapter_lumossycophanthexcom
from . import adapter_occlumencysycophanthexcom
from . import adapter_phoenixsongnet
from . import adapter_walkingtheplankorg
from . import adapter_dokugacom
from . import adapter_storiesofardacom
from . import adapter_ncisfictioncom
from . import adapter_fanfiktionde
from . import adapter_themasquenet
from . import adapter_pretendercentrecom
from . import adapter_darksolaceorg
from . import adapter_storyroomcom
from . import adapter_dracoandginnycom
from . import adapter_wolverineandroguecom
from . import adapter_thehookupzonenet
from . import adapter_efpfanficnet
from . import adapter_imagineeficcom
from . import adapter_storiesonlinenet
from . import adapter_literotica
from . import adapter_voracity2eficcom
from . import adapter_spikeluvercom
from . import adapter_bloodshedversecom
from . import adapter_fictionmaniatv
from . import adapter_sheppardweircom
from . import adapter_samandjacknet
from . import adapter_tgstorytimecom
from . import adapter_forumsspacebattlescom
from . import adapter_forumssufficientvelocitycom
from . import adapter_forumquestionablequestingcom
from . import adapter_ninelivesarchivecom
from . import adapter_masseffect2in
from . import adapter_quotevcom
from . import adapter_mcstoriescom
from . import adapter_naiceanilmenet
from . import adapter_adultfanfictionorg
from . import adapter_fictionhuntcom
from . import adapter_royalroadcom
from . import adapter_chosentwofanficcom
from . import adapter_bdsmlibrarycom
from . import adapter_asexstoriescom
from . import adapter_gluttonyfictioncom
from . import adapter_valentchambercom
from . import adapter_wwwgiantessworldnet
from . import adapter_starslibrarynet
from . import adapter_fanficauthorsnet
from . import adapter_fireflyfansnet
from . import adapter_trekfanfictionnet
from . import adapter_wwwutopiastoriescom
from . import adapter_sinfuldreamscomunicornfic
from . import adapter_sinfuldreamscomwickedtemptation
from . import adapter_asianfanficscom
from . import adapter_mttjustoncenet
from . import adapter_narutoficorg
from . import adapter_thedelphicexpansecom
from . import adapter_wwwaneroticstorycom
from . import adapter_lcfanficcom
from . import adapter_inkbunnynet
from . import adapter_alternatehistorycom
from . import adapter_wattpadcom
from . import adapter_novelonlinefullcom
from . import adapter_wwwnovelallcom
from . import adapter_hentaifoundrycom
from . import adapter_mugglenetfanfictioncom
from . import adapter_fanficsme
from . import adapter_fanfictalkcom
from . import adapter_scifistoriescom
from . import adapter_chireadscom
from . import adapter_scribblehubcom
from . import adapter_fictionlive
from . import adapter_thesietchcom
from . import adapter_squidgeworldorg
from . import adapter_novelfull
from . import adapter_psychficcom
from . import adapter_deviantartcom
from . import adapter_readonlymindcom
from . import adapter_wwwsunnydaleafterdarkcom
from . import adapter_syosetucom
from . import adapter_kakuyomujp
from . import adapter_fanfictionsfr
from . import adapter_touchfluffytail
from . import adapter_spiritfanfictioncom
from . import adapter_superlove
from . import adapter_cfaa
from . import adapter_althistorycom
## This bit of complexity allows adapters to be added by just adding
## importing. It eliminates the long if/else clauses we used to need
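The mechanism that comment describes: each `adapter_*` module imported above is discovered by scanning module globals, and its class is filed into `__domain_map` under every site section it claims. A toy sketch of the registration half, with hypothetical adapter classes standing in for the real modules:

```python
class AdapterA:
    @classmethod
    def getConfigSections(cls):
        return ["example-a.com"]

class AdapterB:
    @classmethod
    def getConfigSections(cls):
        return ["example-b.com", "www.example-b.com"]

def build_domain_map(adapter_classes):
    """File each adapter class under every site it accepts, mirroring
    how __domain_map is populated from the imports above."""
    domain_map = {}
    for cls in adapter_classes:
        for site in cls.getConfigSections():
            domain_map.setdefault(site, []).append(cls)
    return domain_map

dm = build_domain_map([AdapterA, AdapterB])
```

Adding a site then reduces to writing a new `adapter_*` module and importing it; nothing else in the dispatch code changes.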
@@ -178,9 +151,11 @@ __class_list = []
__domain_map = {}
def imports():
out = []
for name, val in globals().items():
if isinstance(val, types.ModuleType):
yield val.__name__
out.append(val.__name__)
return out
for x in imports():
if "fanficfare.adapters.adapter_" in x:
@@ -192,6 +167,32 @@ for x in imports():
l.append(cls)
__domain_map[site]=l
def get_url_chapter_range(url_in):
# Allow chapter range with URL.
# like test1.com?sid=5[4-6] or [4,6]
mc = re.match(r"^(?P<url>.*?)(?:\[(?P<begin>\d+)?(?P<comma>[,-])?(?P<end>\d+)?\])?$",url_in)
#print("url:(%s) begin:(%s) end:(%s)"%(mc.group('url'),mc.group('begin'),mc.group('end')))
url = mc.group('url')
ch_begin = mc.group('begin')
ch_end = mc.group('end')
if ch_begin and not mc.group('comma'):
ch_end = ch_begin
return url,ch_begin,ch_end
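The regexp above peels an optional chapter-range suffix off a story URL: `[4-6]` and `[4,6]` give a begin/end pair, and a bare `[4]` means just chapter 4. The same pattern, runnable on its own:

```python
import re

def get_url_chapter_range(url_in):
    # Same regexp as above: optional [begin-end], [begin,end] or [n] suffix.
    mc = re.match(r"^(?P<url>.*?)(?:\[(?P<begin>\d+)?(?P<comma>[,-])?(?P<end>\d+)?\])?$", url_in)
    url, ch_begin, ch_end = mc.group('url'), mc.group('begin'), mc.group('end')
    if ch_begin and not mc.group('comma'):
        ch_end = ch_begin  # [4] means chapters 4..4
    return url, ch_begin, ch_end

print(get_url_chapter_range("http://test1.com?sid=5[4-6]"))
# -> ('http://test1.com?sid=5', '4', '6')
```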
# Call as: with lightweight_adapter(url) as adapter:
@contextmanager
def lightweight_adapter(url):
adapter = None
try:
if not getNormalStoryURL.__dummyconfig:
getNormalStoryURL.__dummyconfig = configurable.Configuration(["test1.com"],"EPUB",lightweight=True)
adapter = getAdapter(getNormalStoryURL.__dummyconfig,url)
yield adapter
except:
yield None
finally:
del adapter
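`lightweight_adapter` hands callers an adapter for a good URL and `None` for a bad one, instead of raising. A self-contained sketch of the same guard, where `make_adapter` is a hypothetical stand-in for `getAdapter`:

```python
from contextlib import contextmanager

def make_adapter(url):
    """Hypothetical stand-in for getAdapter(); raises on unknown URLs."""
    if "test1.com" not in url:
        raise ValueError("no adapter for %s" % url)
    return {"url": url, "site": "test1.com"}

@contextmanager
def lightweight_adapter(url):
    adapter = None
    try:
        adapter = make_adapter(url)
        yield adapter      # caller gets the adapter...
    except Exception:
        yield None         # ...or None when the lookup fails
    finally:
        del adapter        # drop the reference either way

with lightweight_adapter("http://test1.com?sid=5") as a:
    print(a is not None)   # True
```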
def getNormalStoryURL(url):
r = getNormalStoryURLSite(url)
if r:
@@ -199,24 +200,45 @@ def getNormalStoryURL(url):
else:
return None
def getNormalStoryURLSite(url):
# print("getNormalStoryURLSite:%s"%url)
if not getNormalStoryURL.__dummyconfig:
getNormalStoryURL.__dummyconfig = Configuration(["test1.com"],"EPUB",lightweight=True)
# pulling up an adapter is pretty low overhead. If
# it fails, it's a bad url.
try:
adapter = getAdapter(getNormalStoryURL.__dummyconfig,url)
url = adapter.url
site = adapter.getSiteDomain()
del adapter
return (url,site)
except:
return None
# kludgey function static/singleton
# Note it's *not* set inside lightweight_adapter because a function
# can't reference itself in its own definition.
getNormalStoryURL.__dummyconfig = None
def getNormalStoryURLSite(url):
with lightweight_adapter(url) as adapter:
if adapter:
return (adapter.url,adapter.getSiteDomain())
else:
return None
## Originally defined for INI [storyUrl] sections where story URL
## contains a title that can change, now also used for reject list.
## waaaay faster with classmethod.
def get_section_url(url):
cls = _get_class_for(url)[0]
if cls:
return cls.get_section_url(url)
else:
## might be a url from a removed adapter.
## return unchanged in that case.
return url
def get_url_search(url):
'''
For adapters that have story URLs that can change. This is
used for searching the Calibre library by identifiers:url for
sites (generally) that contain author or title that can
change, but also have a unique identifier that doesn't.
returns a string containing a regexp, not a compiled re object.
'''
cls = _get_class_for(url)[0]
if not cls:
## still apply common processing.
cls = base_adapter.BaseSiteAdapter
return cls.get_url_search(url)
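As the docstring says, `get_url_search` returns a regexp (a string, not a compiled object) that pins the story's stable identifier while letting the changeable title part vary. A hypothetical example of such a pattern, for an imagined site whose URLs embed both an id and a slug:

```python
import re

# Hypothetical: https://site.example/story/<id>/<slug>, where only <id> is stable.
url_search = r"https?://site\.example/story/12345(/[^/]*)?/?$"

print(bool(re.search(url_search, "https://site.example/story/12345/old-title")))  # True
print(bool(re.search(url_search, "https://site.example/story/12345/new-title")))  # True
```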
def getAdapter(config,url,anyurl=False):
#logger.debug("trying url:"+url)
@@ -244,8 +266,7 @@ def getConfigSections():
def get_bulk_load_sites():
# for now, all eFiction Base adapters are assumed to allow bulk_load.
sections = set()
for cls in filter( lambda x : issubclass(x,base_efiction_adapter.BaseEfictionAdapter),
__class_list):
for cls in [x for x in __class_list if issubclass(x,base_efiction_adapter.BaseEfictionAdapter) ]:
sections.update( [ x.replace('www.','') for x in cls.getConfigSections() ] )
return sections
@@ -270,13 +291,13 @@ def _get_class_for(url):
fixedurl = "http:%s"%url
if not fixedurl.startswith("http"):
fixedurl = "http://%s"%url
## remove any trailing '#' locations, except for #post-12345 for
## XenForo
if not "#post-" in fixedurl:
fixedurl = re.sub(r"#.*$","",fixedurl)
parsedUrl = up.urlparse(fixedurl)
parsedUrl = urlparse(fixedurl)
domain = parsedUrl.netloc.lower()
if( domain != parsedUrl.netloc ):
fixedurl = fixedurl.replace(parsedUrl.netloc,domain)
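The normalization steps around this hunk (supply a default scheme, drop `#fragment`s except XenForo's `#post-` anchors, lowercase the host) can be sketched as:

```python
import re
from urllib.parse import urlparse  # the diff routes this through six.moves for py2/py3

def fix_url(url):
    """Mirror the URL normalization steps shown in _get_class_for."""
    if url.startswith("//"):
        url = "http:" + url                 # protocol-relative URL
    if not url.startswith("http"):
        url = "http://" + url               # bare domain
    if "#post-" not in url:                 # keep XenForo post anchors
        url = re.sub(r"#.*$", "", url)
    parsed = urlparse(url)
    domain = parsed.netloc.lower()
    if domain != parsed.netloc:
        url = url.replace(parsed.netloc, domain)
    return url

print(fix_url("WWW.Example.com/story?sid=5#top"))
# -> http://www.example.com/story?sid=5
```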
@@ -295,14 +316,15 @@ def _get_class_for(url):
fixedurl = re.sub(r"^http(s?)://",r"http\1://www.",fixedurl)
cls = None
if len(clslst) == 1:
cls = clslst[0]
elif len(clslst) > 1:
for c in clslst:
if c.getSiteURLFragment() in fixedurl:
cls = c
break
if clslst:
if len(clslst) == 1:
cls = clslst[0]
elif len(clslst) > 1:
for c in clslst:
if c.getSiteURLFragment() in fixedurl:
cls = c
break
if cls:
fixedurl = cls.stripURLParameters(fixedurl)


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,222 +15,24 @@
# limitations under the License.
#
# Software: eFiction
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
class AdAstraFanficComSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev','aaff')
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
@staticmethod
def getSiteDomain():
return 'www.adastrafanfic.com'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
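`getSiteURLPattern` anchors the accepted form of a story URL, so anything else is rejected before any network fetch. Checking a URL against the pattern constructed above:

```python
import re

site = "www.adastrafanfic.com"
# Same construction as getSiteURLPattern above: literal prefix + numeric sid.
pattern = re.escape("http://" + site + "/viewstory.php?sid=") + r"\d+$"

print(bool(re.match(pattern, "http://" + site + "/viewstory.php?sid=1234")))      # True
print(bool(re.match(pattern, "http://" + site + "/viewstory.php?sid=1234&c=2")))  # False
```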
def use_pagecache(self):
'''
adapters that will work with the page cache need to implement
this and change it to True.
'''
return True
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
addurl = "&warning=5"
else:
addurl=""
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if "Content is only suitable for mature adults. May contain explicit language and adult themes. Equivalent of NC-17." in data:
raise exceptions.AdultCheckRequired(self.url)
# problems with some stories, but only in calibre. I suspect
# issues with different SGML parsers in python. This is a
# nasty hack, but it works.
data = data[data.index("<body"):]
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
## Title
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"viewuser.php"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
## <meta name='description' content='&lt;p&gt;Description&lt;/p&gt; ...' >
## Summary, strangely, is in the content attr of a <meta name='description'> tag
## which is escaped HTML. Unfortunately, we can't use it because they don't
## escape (') chars in the desc, breaking the tag.
#meta_desc = soup.find('meta',{'name':'description'})
#metasoup = bs.BeautifulStoneSoup(meta_desc['content'])
#self.story.setMetadata('description',stripHTML(metasoup))
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next span class='label'
svalue = ''
while value and 'label' not in defaultGetattr(value,'class'):
svalue += unicode(value)
value = value.nextSibling
# sometimes poorly formatted desc (<p> w/o </p>) leads
# to all labels being included.
svalue=svalue[:svalue.find('<span class="label">')]
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rated' in label:
self.story.setMetadata('rating', value)
if 'Word count' in label:
self.story.setMetadata('numWords', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
catstext = [cat.string for cat in cats]
for cat in catstext:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
charstext = [char.string for char in chars]
for char in charstext:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
genrestext = [genre.string for genre in genres]
self.genre = ', '.join(genrestext)
for genre in genrestext:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
warningstext = [warning.string for warning in warnings]
self.warning = ', '.join(warningstext)
for warning in warningstext:
self.story.addToList('warnings',warning.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(value.strip(), "%d %b %Y"))
if 'Updated' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(value.strip(), "%d %b %Y"))
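The loop above pairs each `<span class="label">` with the sibling text that follows it, which is eFiction's usual metadata layout. A simplified stdlib-only stand-in (the real code walks siblings with BeautifulSoup via `make_soup`; this flat regex version only handles untagged values):

```python
import re

SAMPLE = ('<span class="label">Rated:</span> NC-17<br />'
          '<span class="label">Word count:</span> 4242<br />'
          '<span class="label">Completed:</span> Yes<br />')

def read_labels(html):
    """Pair each label with the text between it and the next tag,
    a flat approximation of the sibling walk above."""
    meta = {}
    for m in re.finditer(r'<span class="label">([^<]+):</span>\s*([^<]*)', html):
        meta[m.group(1)] = m.group(2).strip()
    return meta

print(read_labels(SAMPLE))
```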
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
data = self._fetchUrl(url)
# problems with some stories, but only in calibre. I suspect
# issues with different SGML parsers in python. This is a
# nasty hack, but it works.
data = data[data.index("<body"):]
soup = self.make_soup(data)
span = soup.find('div', {'id' : 'story'})
if None == span:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,span)
from .base_otw_adapter import BaseOTWAdapter
def getClass():
return AdAstraFanficComSiteAdapter
return AdastrafanficComAdapter
class AdastrafanficComAdapter(BaseOTWAdapter):
def __init__(self, config, url):
BaseOTWAdapter.__init__(self, config, url)
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','aaff')
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.adastrafanfic.com'


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2013 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2013 Fanficdownloader team, 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -17,18 +17,19 @@
################################################################################
### Written by GComyn
################################################################################
from __future__ import absolute_import
from __future__ import unicode_literals
import time
import logging
logger = logging.getLogger(__name__)
import re
import sys
from bs4 import UnicodeDammit
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from .base_adapter import BaseSiteAdapter, makeDate
################################################################################
@@ -41,13 +42,7 @@ class AdultFanFictionOrgAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
logger.debug("AdultFanFictionOrgAdapter.__init__ - url='{0}'".format(url))
self.decode = ["utf8",
"Windows-1252", "iso-8859-1"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
# logger.debug("AdultFanFictionOrgAdapter.__init__ - url='{0}'".format(url))
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
@@ -62,8 +57,8 @@ class AdultFanFictionOrgAdapter(BaseSiteAdapter):
# normalized story URL.(checking self.zone against list
# removed--it was redundant w/getAcceptDomains and
# getSiteURLPattern both)
self._setURL('http://{0}.{1}/story.php?no={2}'.format(self.zone, self.getBaseDomain(), self.story.getMetadata('storyId')))
#self._setURL('http://' + self.zone + '.' + self.getBaseDomain() + '/story.php?no='+self.story.getMetadata('storyId'))
self._setURL('https://{0}.{1}/story.php?no={2}'.format(self.zone, self.getBaseDomain(), self.story.getMetadata('storyId')))
#self._setURL('https://' + self.zone + '.' + self.getBaseDomain() + '/story.php?no='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
#self.story.setMetadata('siteabbrev',self.getSiteAbbrev())
@@ -73,9 +68,7 @@ class AdultFanFictionOrgAdapter(BaseSiteAdapter):
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%Y-%m-%d"
self.dateformat = "%B %d, %Y"
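Those two `dateformat` lines are the old and new strptime patterns; the diff moves the site from ISO-style dates to the long month form. Both parse to the same date:

```python
from datetime import datetime

# Old and new AFF date formats from the diff above.
old = datetime.strptime("2020-03-01", "%Y-%m-%d").date()
new = datetime.strptime("March 01, 2020", "%B %d, %Y").date()
print(old == new)   # True
```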
## Added because adult-fanfiction.org does send you to
## www.adult-fanfiction.org when you go to it and it also moves
@@ -118,79 +111,31 @@ class AdultFanFictionOrgAdapter(BaseSiteAdapter):
@classmethod
def getSiteExampleURLs(self):
return ("http://anime.adult-fanfiction.org/story.php?no=123456789 "
+ "http://anime2.adult-fanfiction.org/story.php?no=123456789 "
+ "http://bleach.adult-fanfiction.org/story.php?no=123456789 "
+ "http://books.adult-fanfiction.org/story.php?no=123456789 "
+ "http://buffy.adult-fanfiction.org/story.php?no=123456789 "
+ "http://cartoon.adult-fanfiction.org/story.php?no=123456789 "
+ "http://celeb.adult-fanfiction.org/story.php?no=123456789 "
+ "http://comics.adult-fanfiction.org/story.php?no=123456789 "
+ "http://ff.adult-fanfiction.org/story.php?no=123456789 "
+ "http://games.adult-fanfiction.org/story.php?no=123456789 "
+ "http://hp.adult-fanfiction.org/story.php?no=123456789 "
+ "http://inu.adult-fanfiction.org/story.php?no=123456789 "
+ "http://lotr.adult-fanfiction.org/story.php?no=123456789 "
+ "http://manga.adult-fanfiction.org/story.php?no=123456789 "
+ "http://movies.adult-fanfiction.org/story.php?no=123456789 "
+ "http://naruto.adult-fanfiction.org/story.php?no=123456789 "
+ "http://ne.adult-fanfiction.org/story.php?no=123456789 "
+ "http://original.adult-fanfiction.org/story.php?no=123456789 "
+ "http://tv.adult-fanfiction.org/story.php?no=123456789 "
+ "http://xmen.adult-fanfiction.org/story.php?no=123456789 "
+ "http://ygo.adult-fanfiction.org/story.php?no=123456789 "
+ "http://yuyu.adult-fanfiction.org/story.php?no=123456789")
return ("https://anime.adult-fanfiction.org/story.php?no=123456789 "
+ "https://anime2.adult-fanfiction.org/story.php?no=123456789 "
+ "https://bleach.adult-fanfiction.org/story.php?no=123456789 "
+ "https://books.adult-fanfiction.org/story.php?no=123456789 "
+ "https://buffy.adult-fanfiction.org/story.php?no=123456789 "
+ "https://cartoon.adult-fanfiction.org/story.php?no=123456789 "
+ "https://celeb.adult-fanfiction.org/story.php?no=123456789 "
+ "https://comics.adult-fanfiction.org/story.php?no=123456789 "
+ "https://ff.adult-fanfiction.org/story.php?no=123456789 "
+ "https://games.adult-fanfiction.org/story.php?no=123456789 "
+ "https://hp.adult-fanfiction.org/story.php?no=123456789 "
+ "https://inu.adult-fanfiction.org/story.php?no=123456789 "
+ "https://lotr.adult-fanfiction.org/story.php?no=123456789 "
+ "https://manga.adult-fanfiction.org/story.php?no=123456789 "
+ "https://movies.adult-fanfiction.org/story.php?no=123456789 "
+ "https://naruto.adult-fanfiction.org/story.php?no=123456789 "
+ "https://ne.adult-fanfiction.org/story.php?no=123456789 "
+ "https://original.adult-fanfiction.org/story.php?no=123456789 "
+ "https://tv.adult-fanfiction.org/story.php?no=123456789 "
+ "https://xmen.adult-fanfiction.org/story.php?no=123456789 "
+ "https://ygo.adult-fanfiction.org/story.php?no=123456789 "
+ "https://yuyu.adult-fanfiction.org/story.php?no=123456789")
def getSiteURLPattern(self):
return r'http?://(anime|anime2|bleach|books|buffy|cartoon|celeb|comics|ff|games|hp|inu|lotr|manga|movies|naruto|ne|original|tv|xmen|ygo|yuyu)\.adult-fanfiction\.org/story\.php\?no=\d+$'
##This is not working right now, so I'm commenting it out, but leaving it for future testing
## Login seems to be reasonably standard across eFiction sites.
#def needToLoginCheck(self, data):
##This adapter will always require a login
# return True
# <form name="login" method="post" action="">
# <div class="top">E-mail: <span id="sprytextfield1">
# <input name="email" type="text" id="email" size="20" maxlength="255" />
# <span class="textfieldRequiredMsg">Email is required.</span><span class="textfieldInvalidFormatMsg">Invalid E-mail.</span></span></div>
# <div class="top">Password: <span id="sprytextfield2">
# <input name="pass1" type="password" id="pass1" size="20" maxlength="32" />
# <span class="textfieldRequiredMsg">password is required.</span><span class="textfieldMinCharsMsg">Minimum 8 characters8.</span><span class="textfieldMaxCharsMsg">Exceeded 32 characters.</span></span></div>
# <div class="top"><br /> <input name="loginsubmittop" type="hidden" id="loginsubmit" value="TRUE" />
# <input type="submit" value="Login" />
# </div>
# </form>
##This is not working right now, so I'm commenting it out, but leaving it for future testing
#def performLogin(self, url, soup):
# params = {}
# if self.password:
# params['email'] = self.username
# params['pass1'] = self.password
# else:
# params['email'] = self.getConfig("username")
# params['pass1'] = self.getConfig("password")
# params['submit'] = 'Login'
# # copy all hidden input tags to pick up appropriate tokens.
# for tag in soup.findAll('input',{'type':'hidden'}):
# params[tag['name']] = tag['value']
# logger.debug("Will now login to URL {0} as {1} with password: {2}".format(url, params['email'],params['pass1']))
# d = self._postUrl(url, params, usecache=False)
# d = self._fetchUrl(url, params, usecache=False)
# soup = self.make_soup(d)
#if not (soup.find('form', {'name' : 'login'}) == None):
# logger.info("Failed to login to URL %s as %s" % (url, params['email']))
# raise exceptions.FailedToLogin(url,params['email'])
# return False
#else:
# return True
return r'https?://(anime|anime2|bleach|books|buffy|cartoon|celeb|comics|ff|games|hp|inu|lotr|manga|movies|naruto|ne|original|tv|xmen|ygo|yuyu)\.adult-fanfiction\.org/story\.php\?no=\d+$'
## Getting the chapter list and the meta data, plus 'is adult' checking.
def doExtractChapterUrlsAndMetadata(self, get_cover=True):
@@ -198,212 +143,109 @@ class AdultFanFictionOrgAdapter(BaseSiteAdapter):
## You need to have your is_adult set to true to get this story
if not (self.is_adult or self.getConfig("is_adult")):
raise exceptions.AdultCheckRequired(self.url)
else:
d = self.post_request('https://www.adult-fanfiction.org/globals/ajax/age-verify.php', {"verify":"1"})
if "Age verified successfully" not in d:
raise exceptions.FailedToDownload("Failed to Verify Age: {0}".format(d))
url = self.url
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist("Code: 404. {0}".format(url))
elif e.code == 410:
raise exceptions.StoryDoesNotExist("Code: 410. {0}".format(url))
elif e.code == 401:
self.needToLogin = True
data = ''
else:
raise e
data = self.get_request(url)
# logger.debug(data)
if "The dragons running the back end of the site can not seem to find the story you are looking for." in data:
raise exceptions.StoryDoesNotExist("{0}.{1} says: The dragons running the back end of the site can not seem to find the story you are looking for.".format(self.zone, self.getBaseDomain()))
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
##This is not working right now, so I'm commenting it out, but leaving it for future testing
#self.performLogin(url, soup)
# Now go hunting for all the meta data and the chapter list.
soup = self.make_soup(data)
## Title
## Some of the titles have a backslash on the story page, but not on the Author's page
## So I am removing it from the title, so it can be found on the Author's page further in the code.
## Also, some titles may have extra spaces ' ', and the search on the Author's page removes them,
## so I have to here as well. I used multiple replaces to make sure, since I did the same below.
a = soup.find('a', href=re.compile(r'story.php\?no='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a).replace('\\','').replace(' ',' ').replace(' ',' ').replace(' ',' ').strip())
h1 = soup.find('h1')
# logger.debug("Title:%s"%h1)
self.story.setMetadata('title',stripHTML(h1).replace('\\','').replace(' ',' ').replace(' ',' ').replace(' ',' ').strip())
# Find the chapters from first list only
chapters = soup.select_one('select.chapter-select').select('option')
for chapter in chapters:
self.add_chapter(chapter,self.url+'&chapter='+chapter['value'])
# Find the chapters:
chapters = soup.find('div',{'id':'snav'})
for i, chapter in enumerate(chapters.findAll('a')):
self.chapterUrls.append((stripHTML(chapter),self.url+'&chapter='+str(i+1)))
self.story.setMetadata('numChapters', len(self.chapterUrls))
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"profile.php\?no=\d+"))
a = soup.find('a', href=re.compile(r"profile.php\?id=\d+"))
if a is None:
# I know that the original author of fanficfare wants to always have metadata,
# but I posit that if the story is there, even if we can't get the metadata from the
# author page, the story should still be able to be downloaded, which is what I've done here.
self.story.setMetadata('authorId','000000000')
self.story.setMetadata('authorUrl','http://www.adult-fanfiction.org')
self.story.setMetadata('authorUrl','https://www.adult-fanfiction.org')
self.story.setMetadata('author','Unknown')
logger.warning('There was no author found for the story... Metadata will not be retrieved.')
self.setDescription(url,'>>>>>>>>>> No Summary Given <<<<<<<<<<')
self.setDescription(url,'>>>>>>>>>> No Summary Given, Unknown Author <<<<<<<<<<')
else:
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl',a['href'])
self.story.setMetadata('author',stripHTML(a))
##The story page does not give much Metadata, so we go to the Author's page
##Get the first Author page to see if there are multiple pages.
##AFF doesn't care if the page number is larger than the actual number of pages;
##it will simply keep showing the last page.
author_Url = '{0}&view=story&zone={1}&page=1'.format(self.story.getMetadata('authorUrl'), self.zone)
#author_Url = self.story.getMetadata('authorUrl')+'&view=story&zone='+self.zone+'&page=1'
##I'm resetting the author page to the zone for this story
self.story.setMetadata('authorUrl',author_Url)
logger.debug('Getting the author page: {0}'.format(author_Url))
try:
adata = self._fetchUrl(author_Url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist("Author Page: Code: 404. {0}".format(author_Url))
elif e.code == 410:
raise exceptions.StoryDoesNotExist("Author Page: Code: 410. {0}".format(author_Url))
else:
raise e
if "The member you are looking for does not exist." in adata:
raise exceptions.StoryDoesNotExist("{0}.{1} says: The member you are looking for does not exist.".format(self.zone, self.getBaseDomain()))
#raise exceptions.StoryDoesNotExist(self.zone+'.'+self.getBaseDomain() +" says: The member you are looking for does not exist.")
## The story page does not give much Metadata, so we go to
## the Author's page. Except it's actually a sub-req for
## list of author's stories for that subdomain
author_Url = 'https://members.{0}/load-user-stories.php?subdomain={1}&uid={2}'.format(
self.getBaseDomain(),
self.zone,
self.story.getMetadata('authorId'))
logger.debug('Getting the load-user-stories page: {0}'.format(author_Url))
adata = self.get_request(author_Url)
none_found = "No stories found in this category."
if none_found in adata:
raise exceptions.StoryDoesNotExist("{0}.{1} says: {2}".format(self.zone, self.getBaseDomain(), none_found))
asoup = self.make_soup(adata)
##Getting the number of pages
pages=asoup.find('div',{'class' : 'pagination'}).findAll('li')[-1].find('a')
if pages is not None:
pages = pages['href'].split('=')[-1]
else:
pages = 0
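The page count above comes from the last pagination link's `href` via `split('=')[-1]`, defaulting to 0 when no link exists. A self-contained sketch of that logic (hypothetical helper; assumes hrefs ending in `page=N`):

```python
import re

def page_count(pagination_hrefs):
    # Mirror the adapter: read the trailing page=N off each
    # pagination link and keep the highest; 0 means one page only.
    pages = []
    for href in pagination_hrefs:
        m = re.search(r'page=(\d+)$', href)
        if m:
            pages.append(int(m.group(1)))
    return max(pages) if pages else 0
```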
##If there is only 1 page of stories, check it to get the Metadata,
if pages == 0:
a = asoup.findAll('li')
for lc2 in a:
if lc2.find('a', href=re.compile(r'story.php\?no='+self.story.getMetadata('storyId')+"$")):
break
## otherwise go through the pages
else:
page=1
i=0
while i == 0:
##We already have the first page, so if this is the first time through, skip getting the page
if page != 1:
author_Url = '{0}&view=story&zone={1}&page={2}'.format(self.story.getMetadata('authorUrl'), self.zone, str(page))
logger.debug('Getting the author page: {0}'.format(author_Url))
try:
adata = self._fetchUrl(author_Url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist("Author Page: Code: 404. {0}".format(author_Url))
elif e.code == 410:
raise exceptions.StoryDoesNotExist("Author Page: Code: 410. {0}".format(author_Url))
else:
raise e
##This will probably never be needed, since AFF doesn't seem to care what number you put as
## the page number, it will default to the last page, even if you use 1000, for an author
## that only has 5 pages of stories, but I'm keeping it in to appease Saint Justin Case (just in case).
if "The member you are looking for does not exist." in adata:
raise exceptions.StoryDoesNotExist("{0}.{1} says: The member you are looking for does not exist.".format(self.zone, self.getBaseDomain()))
# we look for the li element that has the story here
asoup = self.make_soup(adata)
a = asoup.findAll('li')
for lc2 in a:
if lc2.find('a', href=re.compile(r'story.php\?no='+self.story.getMetadata('storyId')+"$")):
i=1
break
page = page + 1
if page > pages:
break
##Split the Metadata up into a list
##We have to convert the soup to a string, then remove the newlines and double spaces,
##then change the <br/> to '-:-', which separates the different elements.
##Then we strip the HTML elements from the string.
##There is also a double <br/>, so we have to fix that, then remove the leading and trailing '-:-'.
##They are always in the same order.
## EDIT 09/26/2016: Had some trouble with unicode errors... so I had to put in the decode/encode parts to fix it
liMetadata = str(lc2).decode('utf-8').replace('\n','').replace('\r','').replace('\t',' ').replace(' ',' ').replace(' ',' ').replace(' ',' ')
liMetadata = stripHTML(liMetadata.replace(r'<br/>','-:-').replace('<!-- <br /-->','-:-'))
liMetadata = liMetadata.strip('-:-').strip('-:-').encode('utf-8')
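The flattening above can be sketched end-to-end as follows (a stdlib-only stand-in; the real code uses `stripHTML` and Python 2 decode/encode steps, omitted here):

```python
import re

def split_li_metadata(li_html):
    # Collapse whitespace, turn <br/> separators into '-:-'
    # markers, crudely strip remaining tags, then split into
    # the ordered metadata fields.
    s = re.sub(r'\s+', ' ', li_html)
    s = s.replace('<br/>', '-:-').replace('<!-- <br /-->', '-:-')
    s = re.sub(r'<[^>]+>', '', s)
    return [f.strip() for f in s.split('-:-') if f.strip()]
```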
for i, value in enumerate(liMetadata.decode('utf-8').split('-:-')):
if i == 0:
# The value for the title has been manipulated, so may not be the same as gotten at the start.
# I'm going to use the href from the lc2 retrieved from the author's page to determine if it is correct.
if lc2.find('a', href=re.compile(r'story.php\?no='+self.story.getMetadata('storyId')+"$"))['href'] != url:
raise exceptions.StoryDoesNotExist('Did not find story in author story list: {0}'.format(author_Url))
elif i == 1:
##Get the description
self.setDescription(url,stripHTML(value.strip()))
else:
# the rest of the values can be missing, so instead of hardcoding the numbers, we search for them.
if 'Located :' in value:
self.story.setMetadata('category',value.replace(r'&gt;',r'>').replace(r'Located :',r'').strip())
elif 'Category :' in value:
# Get the Category
self.story.setMetadata('category',value.replace(r'&gt;',r'>').replace(r'Located :',r'').strip())
elif 'Content Tags :' in value:
# Get the Erotic Tags
value = stripHTML(value.replace(r'Content Tags :',r'')).strip()
for code in re.split(r'\s',value):
self.story.addToList('eroticatags',code)
elif 'Posted :' in value:
# Get the Posted Date
value = value.replace(r'Posted :',r'').strip()
if value.startswith('008'):
# It is unknown how the 200 became 008, but I'm going to change it back here
value = value.replace('008','200')
elif value.startswith('0000'):
# Since the date is showing as 0000,
# I'm going to put the memberdate here
value = asoup.find('div',{'id':'contentdata'}).find('p').get_text(strip=True).replace('Member Since','').strip()
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
elif 'Edited :' in value:
# Get the 'Updated' Edited date
# AFF has the time for the Updated date, and we only want the date,
# so we take the first 10 characters only
value = value.replace(r'Edited :',r'').strip()[0:10]
if value.startswith('008'):
# It is unknown how the 200 became 008, but I'm going to change it back here
value = value.replace('008','200')
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
elif value.startswith('0000') or '-00-' in value:
# Since the date is showing as 0000,
# or there is -00- in the date,
# I'm going to put the Published date here
self.story.setMetadata('dateUpdated', self.story.getMetadata('datePublished'))
else:
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
else:
# This catches the blank elements, and the Review and Dragon Prints.
# I am not interested in these, so do nothing
pass
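The `'008'` → `'200'` repair and `makeDate` call in the Posted/Edited branches above can be sketched with `datetime.strptime`; the format string here is illustrative only, not the adapter's actual `self.dateformat`:

```python
from datetime import datetime

def parse_posted_date(value, dateformat='%Y-%m-%d'):
    # Some AFF dates arrive with '008' where '200' belongs;
    # restore the century prefix before parsing (first
    # occurrence only, so the day/month digits are untouched).
    if value.startswith('008'):
        value = value.replace('008', '200', 1)
    return datetime.strptime(value, dateformat)
```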
# logger.debug(asoup)
story_card = asoup.select_one('div.story-card:has(a[href="{0}"])'.format(url))
# logger.debug(story_card)
## Category
## I've only seen one category per story so far, but just in case:
for cat in story_card.select('div.story-card-category'):
# remove Category:, old code suggests Located: is also
# possible, so removing by <strong>
cat.find("strong").decompose()
self.story.addToList('category',stripHTML(cat))
self.setDescription(url,story_card.select_one('div.story-card-description'))
for tag in story_card.select('span.story-tag'):
self.story.addToList('eroticatags',stripHTML(tag))
## created/updates share formatting
for meta in story_card.select('div.story-card-meta-item span:last-child'):
meta = stripHTML(meta)
if 'Created: ' in meta:
meta = meta.replace('Created: ','')
self.story.setMetadata('datePublished', makeDate(meta, self.dateformat))
if 'Updated: ' in meta:
meta = meta.replace('Updated: ','')
self.story.setMetadata('dateUpdated', makeDate(meta, self.dateformat))
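Since created and updated share one formatting, the loop above reduces to a prefix-to-key mapping; a sketch (helper name and date format are illustrative, not the adapter's):

```python
from datetime import datetime

def parse_card_meta(text, dateformat='%Y-%m-%d'):
    # Map a story-card meta span to the metadata key it feeds;
    # returns None for spans that carry neither date.
    for prefix, key in (('Created: ', 'datePublished'),
                        ('Updated: ', 'dateUpdated')):
        if prefix in text:
            return key, datetime.strptime(text.replace(prefix, ''), dateformat)
    return None
```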
# grab the text for an individual chapter.
def getChapterText(self, url):
#Since each chapter is on 1 page, we don't need to do anything special, just get the content of the page.
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
chaptertag = soup.find('div',{'class' : 'pagination'}).parent.findNext('td')
soup = self.make_soup(self.get_request(url))
chaptertag = soup.select_one('div.chapter-body')
if chaptertag is None:
raise exceptions.FailedToDownload("Error downloading Chapter: {0}! Missing required element!".format(url))
## chapter text includes a copy of story title, author,
## chapter title, & eroticatags specific to the chapter. It
## did before, too.
return self.utf8FromSoup(url,chaptertag)

View file

@@ -0,0 +1,46 @@
# -*- coding: utf-8 -*-
# Copyright 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
from .base_xenforo2forum_adapter import BaseXenForo2ForumAdapter
import logging
logger = logging.getLogger(__name__)
def getClass():
return WWWAlternatehistoryComAdapter
class WWWAlternatehistoryComAdapter(BaseXenForo2ForumAdapter):
def __init__(self, config, url):
BaseXenForo2ForumAdapter.__init__(self, config, url)
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ah')
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.alternatehistory.com'
@classmethod
def getPathPrefix(cls):
# in case it needs more than just site/
return '/forum/'
def get_post_created_date(self,souptag):
return self.make_date(souptag.find('div', {'class':'message-inner'}))

View file

@@ -0,0 +1,40 @@
# -*- coding: utf-8 -*-
# Copyright 2026 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
import re
from .base_xenforo2forum_adapter import BaseXenForo2ForumAdapter
def getClass():
return AltHistoryComAdapter
## NOTE: This is a different site than www.alternatehistory.com.
class AltHistoryComAdapter(BaseXenForo2ForumAdapter):
def __init__(self, config, url):
BaseXenForo2ForumAdapter.__init__(self, config, url)
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ahc')
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'althistory.com'

View file

@@ -1,302 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2016 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# ####### Not all labels are captured; they are not formatted correctly on the
# ####### webpage.
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return AndromedaWebComAdapter # XXX
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class AndromedaWebComAdapter(BaseSiteAdapter): # XXX
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
# XXX Most sites don't have the /fiction part. Replace all to remove it usually.
self._setURL('http://' + self.getSiteDomain() + '/fiction/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','awc') # XXX
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d %b %Y" # XXX
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.andromeda-web.com' # XXX
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/fiction/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/fiction/viewstory.php?sid=")+r"\d+$"
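`re.escape` matters here because the literal URL contains `?` and `.`, both regex metacharacters. A standalone check of the resulting pattern (hypothetical helper):

```python
import re

def is_story_url(domain, url):
    # Escape the literal prefix, then anchor a numeric story id.
    pattern = re.escape('http://' + domain + '/fiction/viewstory.php?sid=') + r'\d+$'
    return re.match(pattern, url) is not None
```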
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&warning=2"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
# Since the warning text can change by warning level, let's
# look for the warning pass url. ksarchive uses
# &amp;warning= -- actually, so do other sites. Must be an
# eFiction book.
# fiction/viewstory.php?sid=1882&amp;warning=4
# fiction/viewstory.php?sid=1654&amp;ageconsent=ok&amp;warning=2
#print data
m = re.search(r"'fiction/viewstory.php\?sid=10(&amp;warning=2)'",data)
m = re.search(r"'fiction/viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
if self.is_adult or self.getConfig("is_adult"):
# We tried the default and still got a warning, so
# let's pull the warning number from the 'continue'
# link and reload data.
addurl = m.group(1)
# correct stupid &amp; error in url.
addurl = addurl.replace("&amp;","&")
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
else:
raise exceptions.AdultCheckRequired(self.url)
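The generalized search above pulls the warning suffix from the 'click here to continue' link; isolated as a helper (hypothetical name), it behaves like this:

```python
import re

def warning_addurl(page_html):
    # Find the adult-warning continue link and return its query
    # suffix with the &amp; entities corrected, or None if absent.
    m = re.search(r"'fiction/viewstory\.php\?sid=\d+"
                  r"((?:&amp;ageconsent=ok)?&amp;warning=\d+)'", page_html)
    if m is None:
        return None
    return m.group(1).replace('&amp;', '&')
```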
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
pagetitle = soup.find('div',{'id':'content'})
## Title
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = pagetitle.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/fiction/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next span class='label'
svalue = ""
while 'label' not in defaultGetattr(value,'class'):
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rated' in label:
self.story.setMetadata('rating', value)
if 'Word count' in label:
self.story.setMetadata('numWords', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=3'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"fiction/viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^fiction/viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('fiction/viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'class' : 'story'})
if div is None:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)

View file

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2014 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2014 Fanficdownloader team, 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,383 +15,55 @@
# limitations under the License.
#
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
from .base_otw_adapter import BaseOTWAdapter
def getClass():
return ArchiveOfOurOwnOrgAdapter
logger = logging.getLogger(__name__)
class ArchiveOfOurOwnOrgAdapter(BaseSiteAdapter):
class ArchiveOfOurOwnOrgAdapter(BaseOTWAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["utf8",
"Windows-1252"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
# get storyId from url--url validation guarantees query correct
m = re.match(self.getSiteURLPattern(),url)
if m:
self.story.setMetadata('storyId',m.group('id'))
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/works/'+self.story.getMetadata('storyId'))
else:
raise exceptions.InvalidStoryURL(url,
self.getSiteDomain(),
self.getSiteExampleURLs())
BaseOTWAdapter.__init__(self, config, url)
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ao3')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%Y-%b-%d"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'archiveofourown.org'
# The certificate is only valid for the following names:
# ao3.org,
# archiveofourown.com,
# archiveofourown.net,
# archiveofourown.org,
# www.ao3.org,
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/works/123456 http://"+cls.getSiteDomain()+"/collections/Some_Archive/works/123456 http://"+cls.getSiteDomain()+"/works/123456/chapters/78901"
def getAcceptDomains(cls):
return ['archiveofourown.org',
'archiveofourown.com',
'archiveofourown.net',
'archiveofourown.gay',
'download.archiveofourown.org',
'download.archiveofourown.com',
'download.archiveofourown.net',
'ao3.org',
]
def getSiteURLPattern(self):
# http://archiveofourown.org/collections/Smallville_Slash_Archive/works/159770
# Discard leading zeros from story ID numbers--AO3 doesn't use them in its own chapter URLs.
return r"https?://"+re.escape(self.getSiteDomain())+r"(/collections/[^/]+)?/works/0*(?P<id>\d+)"
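The `0*` in the pattern is what discards the leading zeros; a standalone sketch against the same URL shapes:

```python
import re

AO3_PATTERN = (r"https?://archiveofourown\.org"
               r"(/collections/[^/]+)?/works/0*(?P<id>\d+)")

def story_id(url):
    # Leading zeros are consumed by '0*', so '00159770' and
    # '159770' normalize to the same story id.
    m = re.match(AO3_PATTERN, url)
    return m.group('id') if m else None
```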
def mod_url_request(self, url):
return url
## Login
def needToLoginCheck(self, data):
if 'This work is only available to registered users of the Archive.' in data \
or "The password or user name you entered doesn't match our records" in data:
return True
def mod_url_request(self, url):
## add / to *not* replace media.archiveofourown.org
if self.getConfig("use_archive_transformativeworks_org",False):
return url.replace("/archiveofourown.org","/archive.transformativeworks.org")
elif self.getConfig("use_archiveofourown_gay",False):
return url.replace("/archiveofourown.org","/archiveofourown.gay")
else:
return False
def performLogin(self, url, data):
params = {}
if self.password:
params['user_session[login]'] = self.username
params['user_session[password]'] = self.password
else:
params['user_session[login]'] = self.getConfig("username")
params['user_session[password]'] = self.getConfig("password")
params['user_session[remember_me]'] = '1'
params['commit'] = 'Log in'
#params['utf8'] = u'✓'#u'\x2713' # gets along without it, and it confuses the encoder.
params['authenticity_token'] = data.split('input name="authenticity_token" type="hidden" value="')[1].split('"')[0]
loginUrl = 'http://' + self.getSiteDomain() + '/user_sessions'
logger.info("Will now login to URL (%s) as (%s)" % (loginUrl,
params['user_session[login]']))
d = self._postUrl(loginUrl, params)
#logger.info(d)
if "Successfully logged in" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['user_session[login]']))
raise exceptions.FailedToLogin(url,params['user_session[login]'])
return False
else:
return True
def use_pagecache(self):
'''
adapters that will work with the page cache need to implement
this and change it to True.
'''
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
addurl = "?view_adult=true"
else:
addurl=""
metaurl = self.url+addurl
url = self.url+'/navigate'+addurl
logger.info("url: "+url)
logger.info("metaurl: "+metaurl)
try:
data = self._fetchUrl(url)
meta = self._fetchUrl(metaurl)
if "This work could have adult content. If you proceed you have agreed that you are willing to see such content." in meta:
raise exceptions.AdultCheckRequired(self.url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if "Sorry, we couldn&#x27;t find the work you were looking for." in data:
raise exceptions.StoryDoesNotExist(self.url)
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url,data)
data = self._fetchUrl(url,usecache=False)
meta = self._fetchUrl(metaurl,usecache=False)
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
for tag in soup.findAll('div',id='admin-banner'):
tag.extract()
metasoup = self.make_soup(meta)
for tag in metasoup.findAll('div',id='admin-banner'):
tag.extract()
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('a', href=re.compile(r"/works/\d+$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
alist = soup.findAll('a', href=re.compile(r"/users/\w+/pseuds/\w+"))
if len(alist) < 1: # ao3 allows for author 'Anonymous' with no author link.
self.story.setMetadata('author','Anonymous')
self.story.setMetadata('authorUrl','http://archiveofourown.org/')
self.story.setMetadata('authorId','0')
else:
for a in alist:
self.story.addToList('authorId',a['href'].split('/')[-1])
self.story.addToList('authorUrl','http://'+self.host+a['href'])
self.story.addToList('author',a.text)
byline = metasoup.find('h3',{'class':'byline'})
if byline:
self.story.setMetadata('byline',stripHTML(byline))
newestChapter = None
self.newestChapterNum = None # save for comparing during update.
# Scan all chapters to find the oldest and newest, on AO3 it's
# possible for authors to insert new chapters out-of-order or
# change the dates of earlier ones by editing them--That WILL
# break epub update.
# Find the chapters:
chapters=soup.findAll('a', href=re.compile(r'/works/'+self.story.getMetadata('storyId')+"/chapters/\d+$"))
self.story.setMetadata('numChapters',len(chapters))
logger.debug("numChapters: (%s)"%self.story.getMetadata('numChapters'))
if len(chapters)==1:
self.chapterUrls.append((self.story.getMetadata('title'),'http://'+self.host+chapters[0]['href']+addurl))
else:
for index, chapter in enumerate(chapters):
# strip just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+chapter['href']+addurl))
# (2013-09-21)
date = stripHTML(chapter.findNext('span'))[1:-1]
chapterDate = makeDate(date,self.dateformat)
if newestChapter == None or chapterDate > newestChapter:
newestChapter = chapterDate
self.newestChapterNum = index
a = metasoup.find('blockquote',{'class':'userstuff'})
if a != None:
self.setDescription(url,a)
#self.story.setMetadata('description',a.text)
a = metasoup.find('dd',{'class':"rating tags"})
if a != None:
self.story.setMetadata('rating',stripHTML(a.text))
d = metasoup.find('dd',{'class':"language"})
if d != None:
self.story.setMetadata('language',stripHTML(d.text))
a = metasoup.find('dd',{'class':"fandom tags"})
fandoms = a.findAll('a',{'class':"tag"})
for fandom in fandoms:
self.story.addToList('fandoms',fandom.string)
a = metasoup.find('dd',{'class':"warning tags"})
if a != None:
warnings = a.findAll('a',{'class':"tag"})
for warning in warnings:
self.story.addToList('warnings',warning.string)
a = metasoup.find('dd',{'class':"freeform tags"})
if a != None:
genres = a.findAll('a',{'class':"tag"})
for genre in genres:
self.story.addToList('freeformtags',genre.string)
a = metasoup.find('dd',{'class':"category tags"})
if a != None:
genres = a.findAll('a',{'class':"tag"})
for genre in genres:
if genre != "Gen":
self.story.addToList('ao3categories',genre.string)
a = metasoup.find('dd',{'class':"character tags"})
if a != None:
chars = a.findAll('a',{'class':"tag"})
for char in chars:
self.story.addToList('characters',char.string)
a = metasoup.find('dd',{'class':"relationship tags"})
if a != None:
ships = a.findAll('a',{'class':"tag"})
for ship in ships:
self.story.addToList('ships',ship.string)
a = metasoup.find('dd',{'class':"collections"})
if a != None:
collections = a.findAll('a')
for collection in collections:
self.story.addToList('collections',collection.string)
stats = metasoup.find('dl',{'class':'stats'})
dt = stats.findAll('dt')
dd = stats.findAll('dd')
for x in range(0,len(dt)):
label = dt[x].text
value = dd[x].text
if 'Words:' in label:
self.story.setMetadata('numWords', value)
if 'Comments:' in label:
self.story.setMetadata('comments', value)
if 'Kudos:' in label:
self.story.setMetadata('kudos', value)
if 'Hits:' in label:
self.story.setMetadata('hits', value)
if 'Bookmarks:' in label:
self.story.setMetadata('bookmarks', value)
if 'Chapters:' in label:
if value.split('/')[0] == value.split('/')[1]:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
if 'Completed' in label:
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
# Find Series name from series URL.
ddseries = metasoup.find('dd',{'class':"series"})
if ddseries:
for i, a in enumerate(ddseries.findAll('a', href=re.compile(r"/series/\d+"))):
series_name = stripHTML(a)
series_url = 'http://'+self.host+a['href']
series_index = int(stripHTML(a.previousSibling).replace(', ','').split(' ')[1]) # "Part # of" or ", Part #"
self.story.setMetadata('series%02d'%i,"%s [%s]"%(series_name,series_index))
self.story.setMetadata('series%02dUrl'%i,series_url)
if i == 0:
self.setSeries(series_name, series_index)
self.story.setMetadata('seriesUrl',series_url)
def hookForUpdates(self,chaptercount):
if self.oldchapters and len(self.oldchapters) > self.newestChapterNum:
logger.info("Existing epub has %s chapters\nNewest chapter is %s. Discarding old chapters from there on."%(len(self.oldchapters), self.newestChapterNum+1))
self.oldchapters = self.oldchapters[:self.newestChapterNum]
return len(self.oldchapters)
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
chapter=self.make_soup('<div class="story"></div>').find('div')
data = self._fetchUrl(url)
soup = self.make_soup(data)
exclude_notes=self.getConfigList('exclude_notes')
def append_tag(elem,tag,string):
'''bs4 requires tags be added separately.'''
new_tag = soup.new_tag(tag)
new_tag.string=string
elem.append(new_tag)
if 'authorheadnotes' not in exclude_notes:
headnotes = soup.find('div', {'class' : "preface group"}).find('div', {'class' : "notes module"})
if headnotes != None:
headnotes = headnotes.find('blockquote', {'class' : "userstuff"})
if headnotes != None:
append_tag(chapter,'b',"Author's Note:")
chapter.append(headnotes)
if 'chaptersummary' not in exclude_notes:
chapsumm = soup.find('div', {'id' : "summary"})
if chapsumm != None:
chapsumm = chapsumm.find('blockquote')
append_tag(chapter,'b',"Summary for the Chapter:")
chapter.append(chapsumm)
if 'chapterheadnotes' not in exclude_notes:
chapnotes = soup.find('div', {'id' : "notes"})
if chapnotes != None:
chapnotes = chapnotes.find('blockquote')
if chapnotes != None:
append_tag(chapter,'b',"Notes for the Chapter:")
chapter.append(chapnotes)
text = soup.find('div', {'class' : "userstuff module"})
chtext = text.find('h3', {'class' : "landmark heading"})
if chtext:
chtext.extract()
chapter.append(text)
if 'chapterfootnotes' not in exclude_notes:
chapfoot = soup.find('div', {'class' : "end notes module", 'role' : "complementary"})
if chapfoot != None:
chapfoot = chapfoot.find('blockquote')
append_tag(chapter,'b',"Notes for the Chapter:")
chapter.append(chapfoot)
if 'authorfootnotes' not in exclude_notes:
footnotes = soup.find('div', {'id' : "work_endnotes"})
if footnotes != None:
footnotes = footnotes.find('blockquote')
append_tag(chapter,'b',"Author's Note:")
chapter.append(footnotes)
if None == soup:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,chapter)
return url
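The stats loop above walks parallel `dt`/`dd` lists by index. A minimal standalone sketch of the same label/value pairing (HTML trimmed and values invented; the real adapter uses BeautifulSoup rather than the stdlib parser used here):

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for AO3's <dl class="stats"> block; values invented.
html = ('<dl class="stats">'
        '<dt>Words:</dt><dd>12,345</dd>'
        '<dt>Chapters:</dt><dd>3/3</dd>'
        '<dt>Kudos:</dt><dd>42</dd>'
        '</dl>')
root = ET.fromstring(html)

# Pair each <dt> label with the <dd> value at the same position,
# mirroring the parallel findAll('dt') / findAll('dd') loop.
metadata = {dt.text.rstrip(':'): dd.text
            for dt, dd in zip(root.findall('dt'), root.findall('dd'))}

# A 'Chapters: X/Y' value marks the story Completed when X == Y.
done, total = metadata['Chapters'].split('/')
status = 'Completed' if done == total else 'In-Progress'
```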


@ -1,190 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return ArchiveSkyeHawkeComAdapter
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class ArchiveSkyeHawkeComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/story.php?no='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ash')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%Y-%m-%d"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'archive.skyehawke.com'
@classmethod
def getAcceptDomains(cls):
return ['archive.skyehawke.com','www.skyehawke.com']
@classmethod
def getSiteExampleURLs(cls):
return "http://archive.skyehawke.com/story.php?no=1234 http://www.skyehawke.com/archive/story.php?no=1234 http://skyehawke.com/archive/story.php?no=1234"
def getSiteURLPattern(self):
return re.escape("http://")+r"(archive|www)\.skyehawke\.com/(archive/)?story\.php\?no=\d+$"
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
url = self.url
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('div', {'class':"story border"}).find('span',{'class':'left'})
title=stripHTML(a).split('"')[1]
self.story.setMetadata('title',title)
# Find authorid and URL from... author url.
author = a.find('a')
self.story.setMetadata('authorId',author['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+author['href'])
self.story.setMetadata('author',author.string)
authorSoup = self.make_soup(self._fetchUrl(self.story.getMetadata('authorUrl')))
chapter=soup.find('select',{'name':'chapter'}).findAll('option')
for i in range(1,len(chapter)):
ch=chapter[i]
self.chapterUrls.append((stripHTML(ch),ch['value']))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
box=soup.find('div', {'class': "container borderridge"})
sum=box.find('span').text
self.setDescription(url,sum)
boxes=soup.findAll('div', {'class': "container bordersolid"})
for box in boxes:
if box.find('b') != None and box.find('b').text == "History and Story Information":
for b in box.findAll('b'):
if "words" in b.nextSibling:
self.story.setMetadata('numWords', b.text)
if "archived" in b.previousSibling:
self.story.setMetadata('datePublished', makeDate(stripHTML(b.text), self.dateformat))
if "updated" in b.previousSibling:
self.story.setMetadata('dateUpdated', makeDate(stripHTML(b.text), self.dateformat))
if "fandom" in b.nextSibling:
self.story.addToList('category', b.text)
for br in box.findAll('br'):
br.replaceWith('split')
genre=box.text.split("Genre:")[1].split("split")[0]
if not "Unspecified" in genre:
self.story.addToList('genre',genre)
if box.find('span') != None and box.find('span').text == "WARNING":
rating=box.findAll('span')[1]
rating.find('br').replaceWith('split')
rating=rating.text.replace("This story is rated",'').split('split')[0]
self.story.setMetadata('rating',rating)
logger.debug(self.story.getMetadata('rating'))
warnings=box.find('ol')
if warnings != None:
warnings=warnings.text.replace(']', '').replace('[', '').split(' ')
for warning in warnings:
self.story.addToList('warnings',warning)
for asoup in authorSoup.findAll('div', {'class':"story bordersolid"}):
if asoup.find('a')['href'] == 'story.php?no='+self.story.getMetadata('storyId'):
if '[ Completed ]' in asoup.text:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
chars=asoup.findNext('div').text.split('Characters')[1].split(']')[0]
for char in chars.split(','):
if not "None" in char:
self.story.addToList('characters',char)
break
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div',{'class':"chapter bordersolid"}).findNext('div').findNext('div')
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
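The `self.decode` list these adapters set up drives an encoding-fallback loop elsewhere in the base class. A minimal sketch of that idea (function name and exact fallback behavior are assumptions, not code from the base class):

```python
def decode_with_fallback(raw, encodings=("Windows-1252", "utf8")):
    """Try each declared encoding in order until one decodes cleanly.

    Windows-1252 goes first because, as the adapter comments note, many
    sites that claim iso-8859-1 (or even utf8) really serve windows-1252.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: decode permissively instead of failing outright.
    return raw.decode(encodings[-1], errors="replace")

text = decode_with_fallback(b"caf\xe9")  # 0xE9 is 'é' in windows-1252
```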


@ -1,302 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2016 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# ####### Not all labels are captured; they are not formatted correctly on the
# ####### webpage.
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return ArtemisFowlComAdapter # XXX
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class ArtemisFowlComAdapter(BaseSiteAdapter): # XXX
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
# XXX Most sites don't have the /fiction part. Replace all to remove it usually.
self._setURL('http://' + self.getSiteDomain() + '/fanfiction/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','afcff') # XXX
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d/%m/%y" # XXX
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.artemis-fowl.com' # XXX
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/fanfiction/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/fanfiction/viewstory.php?sid=")+r"\d+$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&warning=5"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
# Since the warning text can change by warning level, let's
# look for the warning pass url. ksarchive uses
# &amp;warning= -- actually, so do other sites. Must be an
# eFiction book.
# fanfiction/viewstory.php?sid=1882&amp;warning=4
# fanfiction/viewstory.php?sid=1654&amp;ageconsent=ok&amp;warning=2
#print data
m = re.search(r"'fanfiction/viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
if self.is_adult or self.getConfig("is_adult"):
# We tried the default and still got a warning, so
# let's pull the warning number from the 'continue'
# link and reload data.
addurl = m.group(1)
# correct stupid &amp; error in url.
addurl = addurl.replace("&amp;","&")
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
else:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
pagetitle = soup.find('div',{'id':'pagetitle'})
## Title
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = pagetitle.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/fanfiction/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next span class='label'
svalue = ""
while 'label' not in defaultGetattr(value,'class'):
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rated' in label:
self.story.setMetadata('rating', value)
if 'Word count' in label:
self.story.setMetadata('numWords', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=3'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"fanfiction/viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^fanfiction/viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('fanfiction/viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
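The adult-warning retry above pulls the site-specific warning suffix out of the 'click here to continue' link and fixes the `&amp;` escaping before re-requesting. The extraction step in isolation (page snippet invented for illustration):

```python
import re

# Invented page snippet containing the eFiction warning pass-through link.
data = "'fanfiction/viewstory.php?sid=1654&amp;ageconsent=ok&amp;warning=2'"

m = re.search(r"'fanfiction/viewstory.php\?sid=\d+"
              r"((?:&amp;ageconsent=ok)?&amp;warning=\d+)'", data)
# Correct the &amp; HTML-escaping before appending it to the story URL.
addurl = m.group(1).replace("&amp;", "&") if m else ""
```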


@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2013 Fanficdownloader team, 2016 FanFicFare team
# Copyright 2013 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -15,21 +15,18 @@
# limitations under the License.
#
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
import urlparse
import time
import os
from bs4.element import Comment
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
import sys
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six.moves.urllib import parse as urlparse
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return ASexStoriesComAdapter
@ -39,14 +36,6 @@ class ASexStoriesComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["utf8",
"Windows-1252",
"iso-8859-1"]
# 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.story.setMetadata('siteabbrev','asscom')
# Extract story ID from base URL, http://www.asexstories.com/Halloween-party-with-the-phantom/
@ -87,16 +76,10 @@ class ASexStoriesComAdapter(BaseSiteAdapter):
if not (self.is_adult or self.getConfig("is_adult")):
raise exceptions.AdultCheckRequired(self.url)
try:
data1 = self._fetchUrl(self.url)
soup1 = self.make_soup(data1)
#strip comments from soup
[comment.extract() for comment in soup1.find_all(text=lambda text:isinstance(text, Comment))]
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data1 = self.get_request(self.url)
soup1 = self.make_soup(data1)
#strip comments from soup
[comment.extract() for comment in soup1.find_all(string=lambda text:isinstance(text, Comment))]
if 'Page Not Found.' in data1:
raise exceptions.StoryDoesNotExist(self.url)
@ -109,7 +92,7 @@ class ASexStoriesComAdapter(BaseSiteAdapter):
self.story.setMetadata('title', title.string)
# Author
author = soup1.find('div',{'class':'story-info'}).findAll('div',{'class':'story-info-bl'})[1].find('a')
author = soup1.find('div',{'class':'story-info'}).find_all('div',{'class':'story-info-bl'})[1].find('a')
authorurl = author['href']
self.story.setMetadata('author', author.string)
self.story.setMetadata('authorUrl', authorurl)
@ -125,14 +108,11 @@ class ASexStoriesComAdapter(BaseSiteAdapter):
description = description.encode('utf-8','ignore').strip()[0:150].decode('utf-8','ignore')
self.setDescription(url,'Excerpt from beginning of story: '+description+'...')
# Get chapter URLs
self.chapterUrls = []
### The first 'chapter' is not listed in the links, so we have to
### add it before the rest of the pages, if any
self.chapterUrls.append(('1', self.url))
self.add_chapter('1', self.url)
chapterTable = soup1.find('div',{'class':'pages'}).findAll('a')
chapterTable = soup1.find('div',{'class':'pages'}).find_all('a')
if chapterTable is not None:
# Multi-chapter story
@ -140,11 +120,11 @@ class ASexStoriesComAdapter(BaseSiteAdapter):
for page in chapterTable:
chapterTitle = page.string
chapterUrl = urlparse.urljoin(self.url, page['href'])
self.chapterUrls.append((chapterTitle, chapterUrl))
if chapterUrl.startswith(self.url): # there are other URLs in the pages block now.
self.add_chapter(chapterTitle, chapterUrl)
self.story.setMetadata('numChapters', len(self.chapterUrls))
rated = soup1.find('div',{'class':'story-info'}).findAll('div',{'story-info-bl5'})[0].find('img')['title'].replace('- Rate','').strip()
rated = soup1.find('div',{'class':'story-info'}).find_all('div',{'class':'story-info-bl5'})[0].find('img')['title'].replace('- Rate','').strip()
self.story.setMetadata('rating',rated)
self.story.setMetadata('dateUpdated', makeDate('01/01/2001', '%m/%d/%Y'))
@ -157,7 +137,7 @@ class ASexStoriesComAdapter(BaseSiteAdapter):
logger.debug('Getting chapter text from <%s>' % url)
#logger.info('Getting chapter text from <%s>' % url)
data1 = self._fetchUrl(url)
data1 = self.get_request(url)
soup1 = self.make_soup(data1)
# get story text
@ -170,5 +150,11 @@ class ASexStoriesComAdapter(BaseSiteAdapter):
if self.getConfig('strip_text_links'):
for anchor in story1('a', {'target': '_blank'}):
anchor.replaceWith(anchor.string)
## remove ad links in the story text and their following <br>
for anchor in story1('a', {'rel': 'nofollow'}):
br = anchor.find_next_sibling('br')
if br:
br.extract()
anchor.extract()
return self.utf8FromSoup(url, story1)
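The new `startswith` guard above keeps only pagination links that stay under the story URL, since the 'pages' block now also carries unrelated links. The same filtering in isolation (hrefs invented):

```python
from urllib.parse import urljoin

story_url = "http://www.asexstories.com/Halloween-party-with-the-phantom/"
# Invented hrefs: two real pagination links plus one unrelated story link.
hrefs = ["2/", "3/", "http://www.asexstories.com/other-story/1/"]

# The first page is never listed in the pagination block, so add it first.
chapters = [("1", story_url)]
for href in hrefs:
    url = urljoin(story_url, href)      # resolve relative links
    if url.startswith(story_url):       # drop links to other stories
        chapters.append((str(len(chapters) + 1), url))
```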


@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -16,17 +16,17 @@
#
# Software: eFiction
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return AshwinderSycophantHexComAdapter
@ -38,11 +38,6 @@ class AshwinderSycophantHexComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@ -50,10 +45,10 @@ class AshwinderSycophantHexComAdapter(BaseSiteAdapter):
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
self._setURL('https://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','asph')
@ -69,10 +64,10 @@ class AshwinderSycophantHexComAdapter(BaseSiteAdapter):
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
return "https://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
return r"https?://"+re.escape(self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
@ -97,11 +92,11 @@ class AshwinderSycophantHexComAdapter(BaseSiteAdapter):
params['intent'] = ''
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/user.php'
loginUrl = 'https://' + self.getSiteDomain() + '/user.php'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
d = self.post_request(loginUrl, params)
if "Logout" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
@ -118,61 +113,52 @@ class AshwinderSycophantHexComAdapter(BaseSiteAdapter):
url = self.url
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
data = self.get_request(url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('authorUrl','https://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
asoup = self.make_soup(self._fetchUrl(self.story.getMetadata('authorUrl')))
asoup = self.make_soup(self.get_request(self.story.getMetadata('authorUrl')))
try:
# in case link points somewhere other than the first chapter
a = soup.findAll('option')[1]['value']
a = soup.find_all('option')[1]['value']
self.story.setMetadata('storyId',a.split('=',)[1])
url = 'http://'+self.host+'/'+a
soup = self.make_soup(self._fetchUrl(url))
url = 'https://'+self.host+'/'+a
soup = self.make_soup(self.get_request(url))
except:
pass
for info in asoup.findAll('table', {'width' : '100%', 'bordercolor' : re.compile(r'#')}):
for info in asoup.find_all('table', {'width' : '100%', 'bordercolor' : re.compile(r'#')}):
a = info.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
if a != None:
self.story.setMetadata('title',stripHTML(a))
break
# Find the chapters:
chapters=soup.findAll('a', href=re.compile(r'viewstory.php\?sid=\d+&i=1$'))
chapters=soup.find_all('a', href=re.compile(r'viewstory.php\?sid=\d+&i=1$'))
if len(chapters) == 0:
self.chapterUrls.append((self.story.getMetadata('title'),url))
self.add_chapter(self.story.getMetadata('title'),url)
else:
for chapter in chapters:
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']))
self.add_chapter(chapter,'https://'+self.host+'/'+chapter['href'])
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
@ -183,11 +169,11 @@ class AshwinderSycophantHexComAdapter(BaseSiteAdapter):
return d.name
except:
return ""
cats = info.findAll('a',href=re.compile('categories.php'))
cats = info.find_all('a',href=re.compile('categories.php'))
for cat in cats:
self.story.addToList('category',cat.string)
a = info.find('a', href=re.compile(r'reviews.php\?sid='+self.story.getMetadata('storyId')))
val = a.nextSibling
svalue = ""
@ -199,8 +185,10 @@ class AshwinderSycophantHexComAdapter(BaseSiteAdapter):
val = val.nextSibling
self.setDescription(url,svalue)
# <span class="label">Rated:</span> NC-17<br /> etc
labels = info.findAll('b')
## <td><span class="sb"><b>Published:</b> 04/08/2007</td>
## one story had <b>Updated...</b> in the description. Restrict to sub-table
labels = info.find('table').find_all('b')
for labelspan in labels:
value = labelspan.nextSibling
label = stripHTML(labelspan)
@ -242,8 +230,8 @@ class AshwinderSycophantHexComAdapter(BaseSiteAdapter):
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
data = self._fetchUrl(url)
data = self.get_request(url)
soup = self.make_soup(data) # some chapters seem to be hanging up on those tags, so it is safer to close them
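Several of these hunks replace py2-only imports (`urllib2`, `urlparse`) with the bundled `six` shims. The underlying pattern, without `six`, is a try/except import fallback; a generic sketch (not code from the repo):

```python
# Resolve urlparse from its py3 location first, falling back to py2.
try:
    from urllib.parse import urlparse   # py3
except ImportError:
    from urlparse import urlparse       # py2

parts = urlparse("https://ashwinder.sycophanthex.com/viewstory.php?sid=1234")
```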


@ -0,0 +1,290 @@
# -*- coding: utf-8 -*-
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import json
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return AsianFanFicsComAdapter
class AsianFanFicsComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.username = ""
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query correct
m = re.match(self.getSiteURLPattern(),url)
if m:
self.story.setMetadata('storyId',m.group('id'))
# normalized story URL.
self._setURL('https://' + self.getSiteDomain() + '/story/view/'+self.story.getMetadata('storyId'))
else:
raise exceptions.InvalidStoryURL(url,
self.getSiteDomain(),
self.getSiteExampleURLs())
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','asnff')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%Y-%b-%d"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.asianfanfics.com'
@classmethod
def getSiteExampleURLs(cls):
return "https://"+cls.getSiteDomain()+"/story/view/123456 https://"+cls.getSiteDomain()+"/story/view/123456/story-title-here https://"+cls.getSiteDomain()+"/story/view/123456/1"
def getSiteURLPattern(self):
return r"https?://"+re.escape(self.getSiteDomain())+r"/story/view/0*(?P<id>\d+)"
def performLogin(self, url, data):
params = {}
if self.password:
params['username'] = self.username
params['password'] = self.password
else:
params['username'] = self.getConfig("username")
params['password'] = self.getConfig("password")
if not params['username']:
raise exceptions.FailedToLogin(url,params['username'])
params['from_url'] = url
# capture token from JS script, not appearing in form now.
csrf_token_search = 'csrfToken = "'
params['csrf_aff_token'] = data[data.index(csrf_token_search)+len(csrf_token_search):]
params['csrf_aff_token'] = params['csrf_aff_token'][:params['csrf_aff_token'].index('"')]
loginUrl = 'https://' + self.getSiteDomain() + '/login/index'
logger.info("Will now login to URL (%s) as (%s)" % (loginUrl, params['username']))
data = self.post_request(loginUrl, params)
soup = self.make_soup(data)
if self.loginNeededCheck(data):
logger.info('Failed to login to URL %s as %s' % (loginUrl, params['username']))
raise exceptions.FailedToLogin(url,params['username'])
def loginNeededCheck(self,data):
return "isLoggedIn = false" in data
def doStorySubscribe(self, url, soup):
subHref = soup.find('a',{'id':'subscribe'})
if subHref:
#does not work when using https - 403
subUrl = 'http://' + self.getSiteDomain() + subHref['href']
self.get_request(subUrl)
data = self.get_request(url,usecache=False)
soup = self.make_soup(data)
check = soup.find('div',{'class':'click-to-read-full'})
if check:
return False
else:
return soup
else:
return False
## Getting the chapter list and the meta data, plus 'is adult' checking.
def doExtractChapterUrlsAndMetadata(self,get_cover=True):
url = self.url
logger.info("url: "+url)
soup = None
try:
data = self.get_request(url)
soup = self.make_soup(data)
except exceptions.HTTPErrorFFF as e:
if e.status_code != 404:
raise
data = self.decode_data(e.data)
# logger.debug(data)
if not soup or self.loginNeededCheck(data):
# always login if not already to avoid lots of headaches
self.performLogin(url,data)
# refresh website after logging in
data = self.get_request(url,usecache=False)
soup = self.make_soup(data)
# subscription check
# logger.debug(soup)
subCheck = soup.find('div',{'class':'click-to-read-full'})
if subCheck and self.getConfig("auto_sub"):
subSoup = self.doStorySubscribe(url,soup)
if subSoup:
soup = subSoup
else:
raise exceptions.FailedToDownload("Error when subscribing to story. This usually means a change in the website code.")
elif subCheck and not self.getConfig("auto_sub"):
raise exceptions.FailedToDownload("This story is only available to subscribers. You can subscribe manually on the web site, or set auto_sub:true in personal.ini.")
## Title
a = soup.find('h1', {'id': 'story-title'})
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
mainmeta = soup.find('footer', {'class': 'main-meta'})
alist = mainmeta.find('span', string='Author(s)')
alist = alist.parent.find_all('a', href=re.compile(r"/profile/u/[^/]+"))
for a in alist:
self.story.addToList('authorId',a['href'].split('/')[-1])
self.story.addToList('authorUrl','https://'+self.host+a['href'])
self.story.addToList('author',a.text)
newestChapter = None
self.newestChapterNum = None
# Find the chapters:
chapters=soup.find('select',{'name':'chapter-nav'})
hrefattr=None
if chapters:
chapters=chapters.find_all('option')
hrefattr='value'
else: # didn't find <select name='chapter-nav', look for alternative
chapters=soup.find('div',{'class':'widget--chapters'}).find_all('a')
hrefattr='href'
for index, chapter in enumerate(chapters):
if chapter.text != 'Foreword' and 'Collapse chapters' not in chapter.text:
self.add_chapter(chapter.text,'https://' + self.getSiteDomain() + chapter[hrefattr])
# note: AFF cuts off chapter names in list. this gets kind of fixed later on
# find timestamp
a = soup.find('span', string='Updated')
if a == None:
a = soup.find('span', string='Published') # use published date if work was never updated
a = a.parent.find('time')
chapterDate = makeDate(a['datetime'],self.dateformat)
if newestChapter == None or chapterDate > newestChapter:
newestChapter = chapterDate
self.newestChapterNum = index
# story status
a = mainmeta.find('span', string='Completed')
if a:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
# story description
try:
jsonlink = soup.find('script',string=re.compile(r'/api/forewords/[0-9]+/foreword_[0-9a-z]+.json')).get_text().split('"')[1] # grabs url from quotation marks
fore_json = json.loads(self.get_request(jsonlink))
content = self.make_soup(fore_json['post']).find('body') # BS4 adds <html><body> if not present.
a = content.find('div', {'id':'story-description'})
except:
# not all stories have a foreword link.
a = soup.find('div', {'id':'story-description'})
if a:
self.setDescription(url,a)
# story tags
a = mainmeta.find('span',string='Tags')
if a:
tags = a.parent.find_all('a')
for tag in tags:
self.story.addToList('tags', tag.text)
# story tags
a = mainmeta.find('span',string='Characters')
if a:
self.story.addToList('characters', a.nextSibling)
# published on
a = soup.find('span', string='Published')
a = a.parent.find('time')
self.story.setMetadata('datePublished', makeDate(a['datetime'], self.dateformat))
# updated on
a = soup.find('span', string='Updated')
if a:
a = a.parent.find('time')
self.story.setMetadata('dateUpdated', makeDate(a['datetime'], self.dateformat))
# word count
a = soup.find('span', string='Total Word Count')
if a:
a = a.find_next('span')
self.story.setMetadata('numWords', int(a.text.split()[0]))
# upvote, subs, and views
a = soup.find('div',{'class':'title-meta'})
spans = a.find_all('span', recursive=False)
self.story.setMetadata('upvotes', re.search(r'\(([^)]+)', spans[0].find('span').text).group(1))
self.story.setMetadata('subscribers', re.search(r'\(([^)]+)', spans[1].find('span').text).group(1))
if len(spans) > 2: # views can be private
self.story.setMetadata('views', spans[2].text.split()[0])
# cover art in the form of a div before chapter content
if get_cover:
cover_url = ""
a = soup.find('div',{'id':'bodyText'})
if a:
a = a.find('div',{'class':'text-center'})
if a:
cover_url = a.find('img')['src']
self.setCoverImage(url,cover_url)
# grab the text for an individual chapter
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
data = self.get_request(url)
soup = self.make_soup(data)
# logger.debug(data)
ageform = soup.select_one('form[action="/account/toggle_age"]')
# logger.debug(ageform)
if ageform and (self.is_adult or self.getConfig("is_adult")):
params = {}
params['is_of_age']=ageform.select_one('input#is_of_age')['value']
params['current_url']=ageform.select_one('input#current_url')['value']
params['csrf_aff_token']=ageform.select_one('input[name="csrf_aff_token"]')['value']
loginUrl = 'https://' + self.getSiteDomain() + '/account/mark_over_18'
logger.info("Will now toggle age to URL (%s)" % (loginUrl))
# logger.debug(params)
data = self.post_request(loginUrl, params)
soup = self.make_soup(data)
# logger.debug(data)
content = soup.find('div', {'id': 'user-submitted-body'})
if self.getConfig('inject_chapter_image'):
logger.debug("Injecting chapter image")
imgdiv = soup.select_one('div#bodyText div.bot-spacer')
if imgdiv:
content.insert(0, "\n")
content.insert(0, imgdiv)
content.insert(0, "\n")
if self.getConfig('inject_chapter_title'):
logger.debug("Injecting full-length chapter title")
title = soup.find('h1', {'id' : 'chapter-title'}).text
newTitle = soup.new_tag('h3')
newTitle.string = title
content.insert(0, "\n")
content.insert(0, newTitle)
content.insert(0, "\n")
return self.utf8FromSoup(url,content)
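The `performLogin` added above pulls the CSRF token out of inline JavaScript by plain string slicing, since the token no longer appears as a form field. A minimal standalone sketch of that slicing (the page fragment below is hypothetical, for illustration only):

```python
def extract_js_token(page_text, marker='csrfToken = "'):
    # Find the marker, then take everything up to the closing quote,
    # mirroring the two-step slice in performLogin.
    start = page_text.index(marker) + len(marker)
    return page_text[start:page_text.index('"', start)]

# Hypothetical page fragment for illustration only:
page = '<script>var csrfToken = "abc123def";</script>'
print(extract_js_token(page))  # abc123def
```

Note `str.index` raises `ValueError` when the marker is missing, so a page-layout change fails loudly rather than returning garbage.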


@@ -1,227 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2013 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return Asr3SlashzoneOrgAdapter
class Asr3SlashzoneOrgAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/archive/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','asr3')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d/%m/%y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'asr3.slashzone.org'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/archive/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/archive/viewstory.php?sid=")+r"\d+$"
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&ageconsent=ok&warning=3"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
if self.is_adult or self.getConfig("is_adult"):
# We tried the default and still got a warning, so
# let's pull the warning number from the 'continue'
# link and reload data.
addurl = m.group(1)
# correct stupid &amp; error in url.
addurl = addurl.replace("&amp;","&")
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
else:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
#print data
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/archive/'+a['href'])
self.story.setMetadata('author',a.string)
# Rating
rate = stripHTML(soup.find('div',{'id':'pagetitle'}))
rate = rate[rate.rindex('[')+1:rate.rindex(']')]
self.story.setMetadata('rating', rate)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/archive/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
metadiv = soup.find('div',{'class':'content'})
smalldiv = metadiv.find('div',{'class':'small'})
categorys = smalldiv.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for category in categorys:
self.story.addToList('category',category.string)
chars = smalldiv.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
ships = smalldiv.parent.findAll('a',href=re.compile(r'browse\.php\?type=class&type_id=2&classid=1'))
for ship in ships:
self.story.addToList('ships',ship.string)
metatext = stripHTML(smalldiv)
if 'Completed: Yes' in metatext:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
wordstart=metatext.rindex('Word count:')+12
words = metatext[wordstart:metatext.index(' ',wordstart)]
self.story.setMetadata('numWords', words)
datesdiv = soup.find('div',{'class':'bottom'})
dates = stripHTML(datesdiv).split()
# Published: 04/26/2011 Updated: 03/06/2013
self.story.setMetadata('datePublished', makeDate(dates[1], self.dateformat))
self.story.setMetadata('dateUpdated', makeDate(dates[3], self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/archive/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
# can't use ^viewstory...$ in case of higher rated stories with javascript href.
storyas = seriessoup.findAll('a', href=re.compile(r'viewstory.php\?sid=\d+'))
i=1
for a in storyas:
# skip 'report this' and 'TOC' links
if 'contact.php' not in a['href'] and 'index' not in a['href']:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# remove 'small' leaving only summary.
smalldiv.extract()
self.setDescription(url,metadiv)
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
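The file deleted above was still py2-only (`urllib2`, implicit relative `from base_adapter import …`). The adapters kept in this diff migrate to the bundled `six` shims instead; outside the package, the same compatibility can be sketched with a plain try/except import using only the stdlib:

```python
try:
    # py3 names
    from urllib.parse import urljoin
    from urllib.error import HTTPError  # replaces urllib2.HTTPError
except ImportError:
    # py2 names
    from urlparse import urljoin
    from urllib2 import HTTPError

# Resolve a relative author link against a story URL, as the
# adapters do with urlparse.urljoin:
base = "http://asr3.slashzone.org/archive/viewstory.php?sid=1234"
print(urljoin(base, "viewuser.php?uid=42"))
# http://asr3.slashzone.org/archive/viewuser.php?uid=42
```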


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -23,6 +23,7 @@
### Fixed the removal of the extra tags from some of the stories and
### removed the attributes from the paragraph and span tags
###########################################################################
from __future__ import absolute_import
'''
This works, but some of the stories have abysmal formatting, so it would
probably need to be edited for reading.
@@ -49,16 +50,16 @@ import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib
import urllib2
import sys
import urlparse
from bs4 import Comment
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from ..six.moves.urllib import parse as urlparse
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return BDSMLibraryComSiteAdapter
@@ -68,13 +69,6 @@ class BDSMLibraryComSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["utf8",
"Windows-1252",
"iso-8859-1"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@@ -82,7 +76,7 @@ class BDSMLibraryComSiteAdapter(BaseSiteAdapter):
# get storyId from url--url validation guarantees query is only storyid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
self._setURL('http://{0}/stories/story.php?storyid={1}'.format(self.getSiteDomain(), self.story.getMetadata('storyId')))
self._setURL('https://{0}/stories/story.php?storyid={1}'.format(self.getSiteDomain(), self.story.getMetadata('storyId')))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','bdsmlib')
@@ -98,33 +92,19 @@
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/stories/story.php?storyid=1234"
return "https://"+cls.getSiteDomain()+"/stories/story.php?storyid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/stories/story.php?storyid=")+r"\d+$"
def use_pagecache(self):
'''
adapters that will work with the page cache need to implement
this and change it to True.
'''
return True
return r"https?://"+re.escape(self.getSiteDomain()+"/stories/story.php?storyid=")+r"\d+$"
def extractChapterUrlsAndMetadata(self):
if not (self.is_adult or self.getConfig("is_adult")):
raise exceptions.AdultCheckRequired(self.url)
try:
data = self._fetchUrl(self.url)
soup = self.make_soup(data)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(self.url)
if 'The story does not exist' in data:
raise exceptions.StoryDoesNotExist(self.url)
soup = self.make_soup(data)
# Extract metadata
title=soup.title.text.replace('BDSM Library - Story: ','').replace('\\','')
@@ -132,47 +112,33 @@ class BDSMLibraryComSiteAdapter(BaseSiteAdapter):
# Author
author = soup.find('a', href=re.compile(r"/stories/author.php\?authorid=\d+"))
i = 0
while author == None:
time.sleep(1)
logger.warning('A problem retrieving the author information. Trying Again')
try:
data = self._fetchUrl(self.url)
soup = self.make_soup(data)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
author = soup.find('a', href=re.compile(r"/stories/author.php\?authorid=\d+"))
i += 1
if i == 20:
logger.info('Too Many cycles... exiting')
sys.exit()
authorurl = urlparse.urljoin(self.url, author['href'])
self.story.setMetadata('author', author.text)
self.story.setMetadata('authorUrl', authorurl)
authorid = author['href'].split('=')[1]
self.story.setMetadata('authorId', authorid)
if author:
authorurl = urlparse.urljoin(self.url, author['href'])
self.story.setMetadata('author', author.text)
self.story.setMetadata('authorUrl', authorurl)
authorid = author['href'].split('=')[1]
self.story.setMetadata('authorId', authorid)
else:
logger.info("Failed to find Author, setting to Anonymous")
self.story.setMetadata('author','Anonymous')
self.story.setMetadata('authorUrl','https://' + self.getSiteDomain() + '/')
self.story.setMetadata('authorId','0')
# Find the chapters:
# The update date is with the chapter links... so we will update it here as well
for chapter in soup.findAll('a', href=re.compile(r'/stories/chapter.php\?storyid='+self.story.getMetadata('storyId')+"&chapterid=\d+$")):
for chapter in soup.find_all('a', href=re.compile(r'/stories/chapter.php\?storyid='+self.story.getMetadata('storyId')+r"&chapterid=\d+$")):
value = chapter.findNext('td').findNext('td').string.replace('(added on','').replace(')','').strip()
self.story.setMetadata('dateUpdated', makeDate(value, self.dateformat))
self.chapterUrls.append((stripHTML(chapter),'http://'+self.getSiteDomain()+chapter['href']))
self.add_chapter(chapter,'https://'+self.getSiteDomain()+chapter['href'])
self.story.setMetadata('numChapters',len(self.chapterUrls))
# Get the MetaData
# Erotia Tags
tags = soup.findAll('a',href=re.compile(r'/stories/search.php\?selectedcode'))
tags = soup.find_all('a',href=re.compile(r'/stories/search.php\?selectedcode'))
for tag in tags:
self.story.addToList('eroticatags',tag.text)
for td in soup.findAll('td'):
for td in soup.find_all('td'):
if len(td.text)>0:
if 'Added on:' in td.text and '<table' not in unicode(td):
value = td.text.replace('Added on:','').strip()
@@ -192,7 +158,7 @@ class BDSMLibraryComSiteAdapter(BaseSiteAdapter):
#Since each chapter is on 1 page, we don't need to do anything special, just get the content of the page.
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
chaptertag = soup.find('div',{'class' : 'storyblock'})
# Some of the stories have the chapters in <pre> sections, so have to check for that
@@ -203,20 +169,20 @@ class BDSMLibraryComSiteAdapter(BaseSiteAdapter):
raise exceptions.FailedToDownload("Error downloading Chapter: {0}! Missing required element!".format(url))
#strip comments from soup
[comment.extract() for comment in chaptertag.findAll(text=lambda text:isinstance(text, Comment))]
[comment.extract() for comment in chaptertag.find_all(string=lambda text:isinstance(text, Comment))]
# BDSM Library basically wraps its own html around the document,
# so we will be removing the script, title and meta content from the
# storyblock
for tag in chaptertag.findAll('head') + chaptertag.findAll('style') + chaptertag.findAll('title') + chaptertag.findAll('meta') + chaptertag.findAll('o:p') + chaptertag.findAll('link'):
for tag in chaptertag.find_all('head') + chaptertag.find_all('style') + chaptertag.find_all('title') + chaptertag.find_all('meta') + chaptertag.find_all('o:p') + chaptertag.find_all('link'):
tag.extract()
for tag in chaptertag.findAll('o:smarttagtype'):
for tag in chaptertag.find_all('o:smarttagtype'):
tag.name = 'span'
## I'm going to take the attributes off all of the tags
## because they usually refer to the style that we removed above.
for tag in chaptertag.findAll(True):
for tag in chaptertag.find_all(True):
tag.attrs = None
return self.utf8FromSoup(url,chaptertag)
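Several hunks in this file relax `getSiteURLPattern` from a fixed `http://` prefix to `https?://` while keeping the escaped remainder. A self-contained sketch of the relaxed pattern (the domain below is a placeholder, not the adapter's real one):

```python
import re

domain = "example.com"  # placeholder domain for illustration
# Same construction as the updated getSiteURLPattern: unescaped
# scheme alternation, escaped path and query, anchored numeric id.
pattern = r"https?://" + re.escape(domain + "/stories/story.php?storyid=") + r"\d+$"

for scheme in ("http", "https"):
    url = "%s://%s/stories/story.php?storyid=1234" % (scheme, domain)
    print(url, bool(re.match(pattern, url)))  # True for both schemes
```

`re.escape` keeps the literal `?` and `.` in the path from being treated as regex metacharacters, so only the scheme and the trailing id are variable.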


@@ -1,12 +1,15 @@
from datetime import timedelta
from __future__ import absolute_import
import re
import urllib2
import urlparse
from bs4 import BeautifulSoup
import logging
logger = logging.getLogger(__name__)
from ..htmlcleanup import stripHTML
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six.moves.urllib import parse as urlparse
from .base_adapter import BaseSiteAdapter, makeDate
from .. import exceptions
@@ -24,7 +27,7 @@ class BloodshedverseComAdapter(BaseSiteAdapter):
SITE_ABBREVIATION = 'bvc'
SITE_DOMAIN = 'bloodshedverse.com'
BASE_URL = 'http://' + SITE_DOMAIN + '/'
BASE_URL = 'https://' + SITE_DOMAIN + '/'
READ_URL_TEMPLATE = BASE_URL + 'stories.php?go=read&no=%s'
STARTED_DATETIME_FORMAT = '%m/%d/%Y'
@@ -40,19 +43,6 @@ class BloodshedverseComAdapter(BaseSiteAdapter):
self._setURL(self.READ_URL_TEMPLATE % story_no)
self.story.setMetadata('siteabbrev', self.SITE_ABBREVIATION)
def _customized_fetch_url(self, url, exception=None, parameters=None):
if exception:
try:
data = self._fetchUrl(url, parameters)
except urllib2.HTTPError:
raise exception(self.url)
# Just let self._fetchUrl throw the exception, don't catch and
# customize it.
else:
data = self._fetchUrl(url, parameters)
return self.make_soup(data)
@staticmethod
def getSiteDomain():
return BloodshedverseComAdapter.SITE_DOMAIN
@@ -62,7 +52,7 @@ class BloodshedverseComAdapter(BaseSiteAdapter):
return cls.READ_URL_TEMPLATE % 1234
def getSiteURLPattern(self):
return re.escape(self.BASE_URL + 'stories.php?go=') + r'(read|chapters)\&(amp;)?no=\d+$'
return r'https?://' + re.escape(self.SITE_DOMAIN + '/stories.php?go=') + r'(read|chapters)\&(amp;)?no=\d+$'
# Override stripURLParameters so the "no" parameter won't get stripped
@classmethod
@@ -70,7 +60,9 @@ class BloodshedverseComAdapter(BaseSiteAdapter):
return url
def extractChapterUrlsAndMetadata(self):
soup = self._customized_fetch_url(self.url)
logger.debug("URL: "+self.url)
soup = self.make_soup(self.get_request(self.url))
# Since no 404 error code we have to raise the exception ourselves.
# A title that is just 'by' indicates that there is no author name
@@ -81,14 +73,24 @@ class BloodshedverseComAdapter(BaseSiteAdapter):
for option in soup.find('select', {'name': 'chapter'}):
title = stripHTML(option)
url = self.READ_URL_TEMPLATE % option['value']
self.chapterUrls.append((title, url))
self.add_chapter(title, url)
# Reset the storyId to be the first chapter no. Needed
# because emails contain link to later chapters instead.
query_data = urlparse.parse_qs(self.get_chapter(0,'url'))
story_no = query_data['no'][0]
self.story.setMetadata('storyId', story_no)
self._setURL(self.READ_URL_TEMPLATE % story_no)
logger.info("updated storyId:%s"%story_no)
logger.info("updated storyUrl:%s"%self.url)
story_no = self.story.getMetadata('storyId')
# Get the URL to the author's page and find the correct story entry to
# scrape the metadata
author_url = urlparse.urljoin(self.url, soup.find('a', {'class': 'headline'})['href'])
soup = self._customized_fetch_url(author_url)
soup = self.make_soup(self.get_request(author_url))
story_no = self.story.getMetadata('storyId')
# Ignore first list_box div, it only contains the author information
for list_box in soup('div', {'class': 'list_box'})[1:]:
url = list_box.find('a', {'class': 'fictitle'})['href']
@@ -115,7 +117,7 @@ class BloodshedverseComAdapter(BaseSiteAdapter):
summary_div = list_box.find('div', {'class': 'list_summary'})
if not self.getConfig('keep_summary_html'):
summary = ''.join(summary_div(text=True))
summary = ''.join(summary_div(string=True))
else:
summary = self.utf8FromSoup(author_url, summary_div)
@@ -155,9 +157,6 @@ class BloodshedverseComAdapter(BaseSiteAdapter):
self.story.addToList('warnings', warning)
elif key == 'Chapters':
self.story.setMetadata('numChapters', int(value))
elif key == 'Words':
# Apparently only numChapters need to be an integer for
# some strange reason. Remove possible ',' characters as to
@@ -172,12 +171,13 @@ class BloodshedverseComAdapter(BaseSiteAdapter):
# ugly %p(am/pm) hack moved into makeDate so other sites can use it.
self.story.setMetadata('dateUpdated', date)
if self.story.getMetadata('rating') == 'NC-17' and not (self.is_adult or self.getConfig('is_adult')):
if self.story.getMetadataRaw('rating') == 'NC-17' and not (self.is_adult or self.getConfig('is_adult')):
raise exceptions.AdultCheckRequired(self.url)
def getChapterText(self, url):
soup = self._customized_fetch_url(url)
storytext_div = soup.find('div', {'class': 'storytext'})
soup = self.make_soup(self.get_request(url))
storytext_div = soup.find('div', {'class': 'tl'})
storytext_div = storytext_div.find('div', {'class': ''})
if self.getConfig('strip_text_links'):
for anchor in storytext_div('a', {'class': 'FAtxtL'}):


@@ -1,330 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from bs4.element import Tag
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# By virtue of being recent and requiring both is_adult and user/pass,
# adapter_fanficcastletvnet.py is the best choice for learning to
# write adapters--especially for sites that use the eFiction system.
# Most sites that have ".../viewstory.php?sid=123" in the story URL
# are eFiction.
# For non-eFiction sites, it can be considerably more complex, but
# this is still a good starting point.
# In general an 'adapter' needs to do these five things:
# - 'Register' correctly with the downloader
# - Site Login (if needed)
# - 'Are you adult?' check (if needed--some do one, some the other, some both)
# - Grab the chapter list
# - Grab the story meta-data (some (non-eFiction) adapters have to get it from the author page)
# - Grab the chapter texts
# Search for XXX comments--that's where things are most likely to need changing.
# This function is called by the downloader in all adapter_*.py files
# in this dir to register the adapter class. So it needs to be
# updated to reflect the class below it. That, plus getSiteDomain()
# take care of 'Registering'.
def getClass():
return BloodTiesFansComAdapter # XXX
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class BloodTiesFansComAdapter(BaseSiteAdapter): # XXX
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
# XXX Most sites don't have the /fanfic part. Replace all to remove it usually.
self._setURL('http://' + self.getSiteDomain() + '/fiction/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','btf') # XXX
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d %b %Y" # XXX
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'bloodties-fans.com' # XXX
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/fiction/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/fiction/viewstory.php?sid=")+r"\d+$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/fiction/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
# Furthermore, there's a couple sites now with more than
# one warning level for different ratings. And they're
# fussy about it. midnightwhispers has three: 4, 2 & 1.
# we'll try 1 first.
addurl = "&ageconsent=ok&warning=4" # XXX
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
# The actual text that is used to announce you need to be an
# adult varies from site to site. Again, print data before
# the title search to troubleshoot.
# Since the warning text can change by warning level, let's
# look for the warning pass url. nfacommunity uses
# &amp;warning= -- actually, so do other sites. Must be an
# eFiction book.
# viewstory.php?sid=561&amp;warning=4
# viewstory.php?sid=561&amp;warning=1
# viewstory.php?sid=561&amp;warning=2
#print data
#m = re.search(r"'viewstory.php\?sid=1882(&amp;warning=4)'",data)
m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
if self.is_adult or self.getConfig("is_adult"):
# We tried the default and still got a warning, so
# let's pull the warning number from the 'continue'
# link and reload data.
addurl = m.group(1)
# correct stupid &amp; error in url.
addurl = addurl.replace("&amp;","&")
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
else:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/fiction/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/fiction/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
listbox = soup.find('div',{'class':'listbox'})
# <strong>Rating:</strong> M<br /> etc
labels = listbox.findAll('strong')
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next strong tag.
svalue = ""
while not isinstance(value,Tag) or value.name != 'strong':
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rating' in label:
self.story.setMetadata('rating', value)
if 'Words' in label:
value=re.sub(r"\|",r"",value)
self.story.setMetadata('numWords', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
catstext = [cat.string for cat in cats]
for cat in catstext:
self.story.addToList('category',cat) # catstext already holds strings
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
charstext = [char.string for char in chars]
for char in charstext:
self.story.addToList('characters',char) # charstext already holds strings
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
value=re.sub(r"\|",r"",value)
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
value=re.sub(r"\|",r"",value)
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
# moved outside because they changed *most*, but not *all* labels to <strong>
ships = listbox.findAll('a',href=re.compile(r'browse.php.type=class&(amp;)?type_id=2')) # crappy html: & vs &amp; in url.
shipstext = [ship.string for ship in ships]
for ship in shipstext:
self.story.addToList('ships',ship) # shipstext already holds strings
genres = listbox.findAll('a',href=re.compile(r'browse.php\?type=class&(amp;)?type_id=1')) # crappy html: & vs &amp; in url.
genrestext = [genre.string for genre in genres]
for genre in genrestext:
self.story.addToList('genre',genre) # genrestext already holds strings
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/fiction/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
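The warning-bypass regex in extractChapterUrlsAndMetadata() above can be exercised on its own. In this sketch the surrounding markup and the sid value are invented for illustration; only the pattern and the &amp; correction come from the adapter:

```python
import re

# Invented sample of an eFiction 'click here to continue' link;
# only the search pattern and the &amp; fix-up below are the adapter's.
data = ("<a href=\"javascript:void(0)\" onclick=\"location = "
        "'viewstory.php?sid=561&amp;ageconsent=ok&amp;warning=4'\">"
        "click here to continue</a>")

m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'", data)
if m is not None:
    # correct stupid &amp; error in url, as the adapter does.
    addurl = m.group(1).replace("&amp;", "&")
    print(addurl)  # &ageconsent=ok&warning=4
```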


@@ -1,300 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2016 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return BuffyGilesComAdapter
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class BuffyGilesComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
# XXX Most sites don't have the /efiction part. Replace all to remove it usually.
self._setURL('http://' + self.getSiteDomain() + '/efiction/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','bufg')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d/%m/%y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'buffygiles.velocitygrass.com'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/efiction/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/efiction/viewstory.php?sid=")+r"\d+$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&warning=5"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
# Since the warning text can change by warning level, let's
# look for the warning pass url. ksarchive uses
# &amp;warning= -- actually, so do other sites. Must be an
# eFiction book.
# efiction/viewstory.php?sid=1882&amp;warning=4
# efiction/viewstory.php?sid=1654&amp;ageconsent=ok&amp;warning=5
#print data
m = re.search(r"'efiction/viewstory.php\?sid=542(&amp;warning=5)'",data)
m = re.search(r"'efiction/viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
if self.is_adult or self.getConfig("is_adult"):
# We tried the default and still got a warning, so
# let's pull the warning number from the 'continue'
# link and reload data.
addurl = m.group(1)
# correct stupid &amp; error in url.
addurl = addurl.replace("&amp;","&")
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
else:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
pagetitle = soup.find('div',{'id':'pagetitle'})
## Title
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = pagetitle.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/efiction/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next span class='label'
svalue = ""
while 'label' not in defaultGetattr(value,'class'):
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rated' in label:
self.story.setMetadata('rating', value)
if 'Word count' in label:
self.story.setMetadata('numWords', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=3'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"efiction/viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^efiction/viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('efiction/viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
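The per-site self.dateformat strings above use the standard strftime/strptime codes (the docs URL in each adapter points at that table), and makeDate() appears to boil down to a strptime call. A quick check of this adapter's "%d/%m/%y" format with a made-up date string:

```python
from datetime import datetime

# The date string here is made up; "%d/%m/%y" is this adapter's format.
d = datetime.strptime("05/11/09", "%d/%m/%y")
print(d.year, d.month, d.day)  # 2009 11 5
```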


@@ -0,0 +1,38 @@
# -*- coding: utf-8 -*-
# Copyright 2024 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
from .base_otw_adapter import BaseOTWAdapter
def getClass():
return CFAAAdapter
class CFAAAdapter(BaseOTWAdapter):
def __init__(self, config, url):
BaseOTWAdapter.__init__(self, config, url)
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','cfaa')
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.cfaarchive.org'
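Per the 'Registering' comment in the adapters above, registration amounts to the module exposing getClass() and the class exposing getSiteDomain(). The stub dispatcher below is hypothetical (FanFicFare's real lookup code is not shown in this diff) and only illustrates that contract:

```python
# Hypothetical stub, not FanFicFare's actual adapter or dispatcher.
class CFAAAdapterStub(object):
    @staticmethod  # must be @staticmethod, as the adapters note.
    def getSiteDomain():
        return 'www.cfaarchive.org'

def getClass():
    return CFAAAdapterStub

# A downloader can then map each site domain to its adapter class.
registry = {getClass().getSiteDomain(): getClass()}
print(registry['www.cfaarchive.org'].__name__)  # CFAAAdapterStub
```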


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -16,17 +16,17 @@
#
# Software: eFiction
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return ChaosSycophantHexComAdapter
@@ -38,11 +38,6 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@@ -50,7 +45,7 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
@@ -91,13 +86,7 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
# The actual text that is used to announce you need to be an
# adult varies from site to site. Again, print data before
@@ -108,11 +97,9 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
pt = soup.find('div', {'id' : 'pagetitle'})
@@ -129,11 +116,10 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
self.story.setMetadata('rating', rating)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
self.add_chapter(chapter,'http://'+self.host+'/'+chapter['href']+addurl)
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
@@ -144,12 +130,12 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
labels = soup.find_all('span',{'class':'label'})
value = labels[0].previousSibling
svalue = ""
while value != None:
@@ -159,7 +145,7 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
svalue += unicode(val)
val = val.nextSibling
self.setDescription(url,svalue)
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
@@ -168,22 +154,22 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
self.story.setMetadata('numWords', value.split(' -')[0])
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
warnings = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
@@ -207,9 +193,8 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
seriessoup = self.make_soup(self.get_request(series_url))
storyas = seriessoup.find_all('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
@@ -227,7 +212,7 @@ class ChaosSycophantHexComAdapter(BaseSiteAdapter):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
div = soup.find('div', {'id' : 'story'})


@@ -0,0 +1,107 @@
# -*- coding: utf-8 -*-
# Copyright 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
import logging
import re
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
from fanficfare.htmlcleanup import stripHTML
from .. import exceptions as exceptions
logger = logging.getLogger(__name__)
def getClass():
return ChireadsComSiteAdapter
class ChireadsComSiteAdapter(BaseSiteAdapter):
NEW_DATE_FORMAT = '%Y/%m/%d %H:%M:%S'
OLD_DATE_FORMAT = '%m/%d/%Y %I:%M:%S %p'
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev', 'chireads')
# get storyId from url--url validation guarantees query correct
match = re.match(self.getSiteURLPattern(), url)
if not match:
raise exceptions.InvalidStoryURL(url, self.getSiteDomain(), self.getSiteExampleURLs())
story_id = match.group('id')
self.story.setMetadata('storyId', story_id)
self._setURL('https://%s/category/translatedtales/%s/' % (self.getSiteDomain(), story_id))
@staticmethod
def getSiteDomain():
return 'chireads.com'
@classmethod
def getSiteExampleURLs(cls):
return 'https://%s/category/translatedtales/story-name' % cls.getSiteDomain()
def getSiteURLPattern(self):
return r'https?://chireads\.com/category/translatedtales/(?P<id>[^/]+)(/)?'
def extractChapterUrlsAndMetadata(self):
logger.debug('URL: %s', self.url)
data = self.get_request(self.url)
soup = self.make_soup(data)
info = soup.select_one('.inform-inform-data')
self.story.setMetadata('title', stripHTML(info.h3).split(' | ')[0])
self.setCoverImage(self.url, soup.select_one('.inform-product > img')['src'])
# Unicode strings because '：' isn't ':', but \xef\xbc\x9a
# author = stripHTML(info.h6).split(u' ')[0].replace(u'Auteur : ', '', 1)
author = stripHTML(info.h6).split('Babelcheck')[0].replace('Auteur : ', '').replace('\xc2\xa0', '')
# author = stripHTML(info.h6).split('\xa0')[0].replace(u'Auteur : ', '', 1)
self.story.setMetadata('author', author)
self.story.setMetadata('authorId', author)
## site doesn't have authorUrl links.
datestr = stripHTML(soup.select_one('.newestchapitre > div > a')['href'])[-11:-1]
date = makeDate(datestr, '%Y/%m/%d')
if date:
self.story.setMetadata('dateUpdated', date)
intro = stripHTML(info.select_one('.inform-inform-txt').span)
self.setDescription(self.url, intro)
for content in soup.find_all('div', {'id': 'content'}):
for a in content.find_all('a'):
self.add_chapter(a.get_text(), a['href'])
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
data = self.get_request(url)
soup = self.make_soup(data)
content = soup.select_one('#content')
if None == content:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,content)
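The named-group URL pattern in getSiteURLPattern() above drives the story-id extraction in __init__(). A standalone check with a made-up story slug:

```python
import re

# Pattern copied from getSiteURLPattern(); 'story-name' is a placeholder slug.
pattern = r'https?://chireads\.com/category/translatedtales/(?P<id>[^/]+)(/)?'
m = re.match(pattern, 'https://chireads.com/category/translatedtales/story-name/')
print(m.group('id'))  # story-name
```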


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2012 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2012 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -16,18 +16,19 @@
#
# Software: eFiction
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
import sys
from bs4.element import Comment
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return ChosenTwoFanFicArchiveAdapter
@@ -39,12 +40,6 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8",
"iso-8859-1"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@@ -54,7 +49,7 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
self._setURL('https://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','chosen2')
@@ -70,10 +65,10 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
return "https://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
return r"https?"+re.escape("://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
@@ -83,19 +78,13 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
addURL = "&ageconsent=ok&warning=3"
else:
addURL = ""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = '{0}&index=1{1}'.format(self.url,addURL)
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
if "Content is only suitable for mature adults. May contain explicit language and adult themes. Equivalent of NC-17." in data:
raise exceptions.AdultCheckRequired(self.url)
@@ -103,15 +92,13 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied("{0} says: Access denied. This story has not been validated by the adminstrators of this site.".format(self.getSiteDomain()))
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# Now go hunting for all the meta data and the chapter list.
## Title
## Some stories have a banner that has its own <a> tag before the actual text title...
## so I'm checking the pagetitle div for all a tags that match the criteria, then taking the last.
a = soup.find('div',{'id':'pagetitle'}).findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))[-1]
a = soup.find('div',{'id':'pagetitle'}).find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))[-1]
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
@@ -119,16 +106,15 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
# so I'm checking the pagetitle div for this as well
a = soup.find('div',{'id':'pagetitle'}).find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('authorUrl','https://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
#self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']))
self.chapterUrls.append((stripHTML(chapter),'http://{0}/{1}{2}'.format(self.host, chapter['href'],addURL)))
#self.add_chapter(chapter,'http://'+self.host+'/'+chapter['href'])
self.add_chapter(chapter,'https://{0}/{1}{2}'.format(self.host, chapter['href'],addURL))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
@@ -141,7 +127,7 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
labels = soup.find_all('span',{'class':'label'})
for labelspan in labels:
val = labelspan.nextSibling
value = unicode('')
@@ -163,27 +149,27 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
self.story.setMetadata('numWords', stripHTML(value))
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1')) # XXX
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=1')) # XXX
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Pairing' in label:
ships = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=4'))
ships = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=4'))
for ship in ships:
self.story.addToList('ships',ship.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
warnings = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
for warning in warnings:
self.story.addToList('warnings',warning.string)
@@ -206,17 +192,16 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
series_url = 'https://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
seriessoup = self.make_soup(self.get_request(series_url))
# can't use ^viewstory...$ in case of higher rated stories with javascript href.
storyas = seriessoup.findAll('a', href=re.compile(r'viewstory.php\?sid=\d+'))
storyas = seriessoup.find_all('a', href=re.compile(r'viewstory.php\?sid=\d+'))
i=1
for a in storyas:
# skip 'report this' and 'TOC' links
if 'contact.php' not in a['href'] and 'index' not in a['href']:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
# this site has several links to each story.
if a.text == 'Latest Chapter':
if ('viewstory.php?sid='+self.story.getMetadata('storyId')) in a['href']:
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
@@ -231,7 +216,7 @@ class ChosenTwoFanFicArchiveAdapter(BaseSiteAdapter):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
div = soup.find('div', {'id' : 'story'})
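One recurring change in the hunks above is adding the `r` prefix to regex fragments such as `"&chapter=\d+$"`. A minimal stdlib-only sketch of why (the story id `1234` is a made-up example):

```python
import re

story_id = '1234'  # hypothetical story id, for illustration only

# In a plain string, '\d' is an invalid escape sequence: Python 3
# warns about it and later versions tighten this further, so the
# diff adds r'...' to the chapter-link patterns.
pattern = re.compile(r'viewstory.php\?sid=' + story_id + r'&chapter=\d+$')

assert pattern.search('viewstory.php?sid=1234&chapter=7')
assert pattern.search('viewstory.php?sid=1234') is None
```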


@@ -1,237 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return CSIForensicsComAdapter
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class CSIForensicsComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','csiforensics')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d %b %Y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'csi-forensics.com'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&ageconsent=ok&warning=5&skin=elegantcsi"
else:
addurl="&skin=elegantcsi"
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
# The actual text that is used to announce you need to be an
# adult varies from site to site. Again, print data before
# the title search to troubleshoot.
if "This story is rated NC-17, and therefore is not suitable for minors. If you are below the age required to view such material in your locality, please return from whence you came." in data: # XXX
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
pt = soup.find('div', {'id' : 'pagetitle'})
a = pt.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',a.string)
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Rating
rate = stripHTML(soup.find('div',{'id':'pagetitle'}))
rate = rate[rate.rindex('[')+1:rate.rindex(']')]
self.story.setMetadata('rating', rate)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
smalldiv = soup.find('div', {'class' : 'small'})
chars = smalldiv.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
metatext = stripHTML(smalldiv)
if 'Completed: Yes' in metatext:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
word=soup.find(text=re.compile("Word count:")).split(':')
self.story.setMetadata('numWords', word[1])
cats = smalldiv.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
warnings = smalldiv.findAll('a',href=re.compile(r'browse.php\?type=class(&amp;)type_id=2(&amp;)classid=\d+'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
date=soup.find('div',{'class' : 'bottom'})
pd=date.find(text=re.compile("Published:")).string.split(': ')
self.story.setMetadata('datePublished', makeDate(stripHTML(pd[1].split(' U')[0]), self.dateformat))
self.story.setMetadata('dateUpdated', makeDate(stripHTML(pd[2]), self.dateformat))
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
pub=0
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Genres' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
smalldiv.extract()
# Summary
summary = soup.find('div', {'class' : 'content'})
self.setDescription(url,summary)
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
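The deleted adapter above still uses the Python 2 `except urllib2.HTTPError, e:` form. A sketch of the Python 3 equivalent of that 404 handling, with a stand-in `StoryDoesNotExist` and a fake fetch function (both invented here, no network involved):

```python
from urllib.error import HTTPError

class StoryDoesNotExist(Exception):
    """Stand-in for fanficfare.exceptions.StoryDoesNotExist."""

def fetch_or_raise(fetch, url):
    # Python 3 spelling of the removed py2 idiom:
    #   except urllib2.HTTPError, e: ...
    try:
        return fetch(url)
    except HTTPError as e:
        if e.code == 404:
            raise StoryDoesNotExist(url)
        raise

def fake_fetch(url):
    # Simulates a 404 response without touching the network.
    raise HTTPError(url, 404, 'Not Found', None, None)

try:
    fetch_or_raise(fake_fetch, 'http://example.com/viewstory.php?sid=1')
except StoryDoesNotExist as e:
    print('missing story:', e)
```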


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,13 +15,21 @@
# limitations under the License.
#
from __future__ import absolute_import
from ..htmlcleanup import stripHTML
# Software: eFiction
from base_efiction_adapter import BaseEfictionAdapter
from .base_efiction_adapter import BaseEfictionAdapter
class DarkSolaceOrgAdapter(BaseEfictionAdapter):
@classmethod
def getProtocol(cls):
"""
Some, but not all, sites now require https.
"""
return "https"
@staticmethod
def getSiteDomain():
return 'dark-solace.org'


@@ -1,300 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2013 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return DeepInMySoulNetAdapter ## XXX
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class DeepInMySoulNetAdapter(BaseSiteAdapter): # XXX
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
# XXX Most sites don't have the /fiction part. Replace all to remove it usually.
self._setURL('http://' + self.getSiteDomain() + '/fiction/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','dimsn') ## XXX
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%B %d, %Y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.deepinmysoul.net' # XXX
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/fiction/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/fiction/viewstory.php?sid=")+r"\d+$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/fiction/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&warning=4"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
# Since the warning text can change by warning level, let's
# look for the warning pass url. ksarchive uses
# &amp;warning= -- actually, so do other sites. Must be an
# eFiction book.
# fiction/viewstory.php?sid=1882&amp;warning=4
# fiction/viewstory.php?sid=1654&amp;ageconsent=ok&amp;warning=5
#print data
m = re.search(r"'fiction/viewstory.php\?sid=29(&amp;warning=4)'",data)
m = re.search(r"'fiction/viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
if self.is_adult or self.getConfig("is_adult"):
# We tried the default and still got a warning, so
# let's pull the warning number from the 'continue'
# link and reload data.
addurl = m.group(1)
# correct stupid &amp; error in url.
addurl = addurl.replace("&amp;","&")
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
else:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# Now go hunting for all the meta data and the chapter list.
pagetitle = soup.find('div',{'id':'pagecontent'})
## Title
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = pagetitle.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/fiction/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next span class='label'
svalue = ""
while 'label' not in defaultGetattr(value,'class'):
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rated' in label:
self.story.setMetadata('rating', value)
if 'Word count' in label:
self.story.setMetadata('numWords', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=3'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"fiction/viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^fiction/viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('fiction/viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
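Several of these eFiction adapters share the "continue past the adult warning" pattern shown above: scrape the warning number out of the continue link, un-escape `&amp;`, and refetch with that suffix. A self-contained sketch of just the extraction step (the HTML snippet is invented):

```python
import re

# Invented snippet shaped like an eFiction 'click here to continue' link.
data = ("<a href='fiction/viewstory.php?sid=29"
        "&amp;ageconsent=ok&amp;warning=5'>continue</a>")

m = re.search(
    r"'fiction/viewstory.php\?sid=\d+"
    r"((?:&amp;ageconsent=ok)?&amp;warning=\d+)'", data)

# Correct the &amp; HTML-escaping before reusing it as a query suffix.
addurl = m.group(1).replace('&amp;', '&')
assert addurl == '&ageconsent=ok&warning=5'
```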


@@ -1,243 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2012 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return DestinysGatewayComAdapter
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class DestinysGatewayComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','dgrfa')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%b %d %Y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.destinysgateway.com'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&warning=4"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
if self.is_adult or self.getConfig("is_adult"):
# We tried the default and still got a warning, so
# let's pull the warning number from the 'continue'
# link and reload data.
addurl = m.group(1)
# correct stupid &amp; error in url.
addurl = addurl.replace("&amp;","&")
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
else:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next span class='label'
svalue = ""
while value and 'label' not in defaultGetattr(value,'class'):
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rated' in label:
self.story.setMetadata('rating', value)
if 'Word count' in label:
self.story.setMetadata('numWords', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)


@@ -0,0 +1,256 @@
# -*- coding: utf-8 -*-
# Copyright 2021 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
import logging
import re
# py2 vs py3 transition
from ..six.moves.urllib.parse import urlparse
from .base_adapter import BaseSiteAdapter, makeDate
from fanficfare.htmlcleanup import stripHTML
from .. import exceptions as exceptions
from fanficfare.dateutils import parse_relative_date_string
logger = logging.getLogger(__name__)
def getClass():
return DeviantArtComSiteAdapter
class DeviantArtComSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev', 'dac')
self.username = 'NoneGiven'
self.password = ''
self.is_adult = False
match = re.match(self.getSiteURLPattern(), url)
if not match:
raise exceptions.InvalidStoryURL(url, self.getSiteDomain(), self.getSiteExampleURLs())
story_id = match.group('id')
author = match.group('author')
self.story.setMetadata('author', author)
self.story.setMetadata('authorId', author)
self.story.setMetadata('authorUrl', 'https://www.deviantart.com/' + author)
self._setURL(url)
@staticmethod
def getSiteDomain():
return 'www.deviantart.com'
@classmethod
def getAcceptDomains(cls):
return ['www.deviantart.com']
@classmethod
def getProtocol(cls):
return 'https'
@classmethod
def getSiteExampleURLs(cls):
return 'https://%s/<author>/art/<work-name>' % cls.getSiteDomain()
def getSiteURLPattern(self):
return r'https?://www\.deviantart\.com/(?P<author>[^/]+)/art/(?P<id>[^/]+)/?'
    def performLogin(self, url):
        if self.username and self.username != 'NoneGiven':
            username = self.username
        else:
            username = self.getConfig('username')
        # logger.debug("\n\nusername:(%s)\n\n"%username)
        if not username:
            logger.info("Login Required for URL %s" % url)
            raise exceptions.FailedToLogin(url, username)
        data = self.get_request_raw('https://www.deviantart.com/users/login', referer=url, usecache=False)
        data = self.decode_data(data)
        soup = self.make_soup(data)
        params = {
            'referer': 'https://www.deviantart.com/_sisu/do/signin', # soup.find('input', {'name': 'referer'})['value'],
            'referer_type': soup.find('input', {'name': 'referer_type'})['value'],
            'csrf_token': soup.find('input', {'name': 'csrf_token'})['value'],
            'challenge': soup.find('input', {'name': 'challenge'})['value'],
            'lu_token': soup.find('input', {'name': 'lu_token'})['value'],
            'remember': 'on',
            'username': username
        }
        loginUrl = 'https://' + self.getSiteDomain() + '/_sisu/do/step2'
        logger.debug('Will now login to deviantArt as (%s)' % username)
        result = self.post_request(loginUrl, params, usecache=False)
        soup = self.make_soup(result)
        if not soup.find('input', {'name': 'lu_token2'}):
            logger.info("Login Failed for URL %s (no lu_token2 found)" % url)
            raise exceptions.FailedToLogin(url, username)
        params = {
            'referer': 'https://www.deviantart.com/_sisu/do/signin', # soup.find('input', {'name': 'referer'})['value'],
            'referer_type': soup.find('input', {'name': 'referer_type'})['value'],
            'csrf_token': soup.find('input', {'name': 'csrf_token'})['value'],
            'challenge': soup.find('input', {'name': 'challenge'})['value'],
            'lu_token': soup.find('input', {'name': 'lu_token'})['value'],
            'lu_token2': soup.find('input', {'name': 'lu_token2'})['value'],
            'remember': 'on',
            'username': ''
        }
        if self.password:
            params['password'] = self.password
        else:
            params['password'] = self.getConfig('password')
        # logger.debug("\n\nparams['password']:(%s)\n\n"%params['password'])
        loginUrl = 'https://' + self.getSiteDomain() + '/_sisu/do/signin'
        logger.debug('Will now send password to deviantArt')
        result = self.post_request(loginUrl, params, usecache=False)
        if 'Log In | DeviantArt' in result:
            logger.error('Failed to login to deviantArt as %s' % username)
            raise exceptions.FailedToLogin('https://www.deviantart.com', username)
        else:
            return True
    def requiresLogin(self, data):
        return '</a> has limited the viewing of this artwork to members of the DeviantArt community only' in data

    def isLoggedIn(self, data):
        return '<form id="logout-form" action="https://www.deviantart.com/users/logout" method="POST">' in data

    def isWatchersOnly(self, data):
        return '>Watchers-Only Deviation<' in data

    def requiresMatureContentEnabled(self, data):
        return (
            '>This content is intended for mature audiences<' in data
            or '>This deviation is intended for mature audiences<' in data
            or '>This filter hides content that may be inappropriate for some viewers<' in data
            or '>May contain sensitive content<' in data
            or '>Log in to view<' in data
            or '>This deviation has been labeled as containing themes not suitable for all deviants.<' in data
        )
    def extractChapterUrlsAndMetadata(self):
        logger.debug('URL: %s', self.url)
        data = self.get_request(self.url)
        soup = self.make_soup(data)
        ## story can require login outright, or it can show up as
        ## watchers-only or mature-enabled without the same 'requires
        ## login' strings.
        if self.requiresLogin(data) or ( not self.isLoggedIn(data) and
                                         (self.isWatchersOnly(data) or
                                          self.requiresMatureContentEnabled(data)) ):
            if self.performLogin(self.url):
                data = self.get_request(self.url, usecache=False)
                soup = self.make_soup(data)
        ## Check watchers only and mature enabled again, separately,
        ## after login because they can still apply after login.
        if self.isWatchersOnly(data):
            raise exceptions.FailedToDownload(
                'Deviation is only available for watchers. ' +
                'You must watch this author before you can download it.'
            )
        if self.requiresMatureContentEnabled(data):
            raise exceptions.FailedToDownload(
                'Deviation is set as mature, you must go into your account ' +
                'and enable showing of mature content.'
            )
        appurl = soup.select_one('meta[property="og:url"]')['content']
        if appurl:
            story_id = urlparse(appurl).path.lstrip('/')
        else:
            logger.debug("Looking for JS story id")
            ## after login, this is only found in a JS block. Dunno why.
            ## F875A309-B0DB-860E-5079-790D0FBE5668
            match = re.search(r'\\"deviationUuid\\":\\"(?P<id>[A-Z0-9-]+)\\",', data)
            if match:
                story_id = match.group('id')
            else:
                raise exceptions.FailedToDownload('Failed to find Story ID.')
        self.story.setMetadata('storyId', story_id)
        title = soup.select_one('h1').get_text()
        self.story.setMetadata('title', stripHTML(title))
        ## dA has no concept of status
        # self.story.setMetadata('status', 'Completed')
        pubdate = soup.select_one('time').get_text()
        # Maybe do this better, but this works
        try:
            self.story.setMetadata('datePublished', makeDate(pubdate, '%b %d, %Y'))
        except:
            self.story.setMetadata('datePublished', parse_relative_date_string(pubdate))
        # do description here if appropriate
        story_tags = soup.select('a[href^="https://www.deviantart.com/tag"] span')
        if story_tags:
            for tag in story_tags:
                self.story.addToList('genre', tag.get_text())
        self.add_chapter(title, self.url)
    def getChapterText(self, url):
        logger.debug('Getting chapter text from: %s', url)
        data = self.get_request(url)
        # logger.debug(data)
        soup = self.make_soup(data)
        # remove comments section to avoid false matches
        comments = soup.select_one('[data-hook=comments_thread]')
        if comments:
            comments.decompose()
        # previous search not always found in some stories.
        # <div id="comments"></div> inside the real containing
        # div seems more common
        commentsdiv = soup.select_one('div#comments')
        if commentsdiv:
            commentsdiv.parent.decompose()
        # three different 'content' tags to look for.
        # This is the current one in Oct 2024
        content = soup.select_one('[data-editor-viewer="1"]')
        if content is None:
            # older story? I can't find any of this style in Oct 2024
            content = soup.select_one('[data-id="rich-content-viewer"]')
        if content is None:
            # even older style, still used by some older (2018) posts
            content = soup.select_one('.legacy-journal')
        if content is None:
            raise exceptions.FailedToDownload(
                'Could not find story text. Please open a bug with the URL %s' % self.url
            )
        return self.utf8FromSoup(url, content)
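The makeDate/parse_relative_date_string fallback in extractChapterUrlsAndMetadata handles both the absolute dates dA shows on older posts ("Oct 12, 2024") and the relative ones ("3 days ago") on recent posts. A minimal stdlib stand-in for that fallback path (the real fanficfare.dateutils helper presumably covers more spellings; this sketch only covers "<n> <unit>(s) ago"):

```python
import re
from datetime import datetime, timedelta

def parse_relative(text, now=None):
    # Hypothetical stand-in for parse_relative_date_string, handling
    # only the '<n> <unit>(s) ago' shape.
    now = now or datetime.now()
    m = re.match(r'(\d+)\s+(minute|hour|day|week)s?\s+ago', text.strip())
    if not m:
        raise ValueError('unrecognized relative date: %r' % text)
    n, unit = int(m.group(1)), m.group(2)
    return now - {'minute': timedelta(minutes=n),
                  'hour': timedelta(hours=n),
                  'day': timedelta(days=n),
                  'week': timedelta(weeks=n)}[unit]
```

Passing `now` explicitly keeps the function deterministic, which is why the adapter's try/except over makeDate first is the safer order: absolute dates parse exactly, relative ones are approximations.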


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,17 +15,16 @@
# limitations under the License.
#
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return DokugaComAdapter
@@ -37,11 +36,6 @@ class DokugaComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@@ -80,7 +74,7 @@ class DokugaComAdapter(BaseSiteAdapter):
return "http://"+cls.getSiteDomain()+"/fanfiction/story/1234/1 http://"+cls.getSiteDomain()+"/spark/story/1234/1"
def getSiteURLPattern(self):
return r"http://"+self.getSiteDomain()+"/(fanfiction|spark)?/story/\d+/?\d+?$"
return r"http://"+self.getSiteDomain()+r"/(fanfiction|spark)?/story/\d+/?\d+?$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
@@ -101,17 +95,17 @@ class DokugaComAdapter(BaseSiteAdapter):
params['Submit'] = 'Submit'
# copy all hidden input tags to pick up appropriate tokens.
for tag in soup.findAll('input',{'type':'hidden'}):
for tag in soup.find_all('input',{'type':'hidden'}):
params[tag['name']] = tag['value']
loginUrl = 'http://' + self.getSiteDomain() + '/fanfiction'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['username']))
d = self._postUrl(loginUrl, params)
d = self.post_request(loginUrl, params)
if "Your session has expired. Please log in again." in d:
d = self._postUrl(loginUrl, params)
d = self.post_request(loginUrl, params)
if "Logout" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
@@ -129,28 +123,20 @@ class DokugaComAdapter(BaseSiteAdapter):
url = self.url
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url,soup)
data = self._fetchUrl(url)
data = self.get_request(url)
soup = self.make_soup(data)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# print data
# Now go hunting for all the meta data and the chapter list.
## Title and author
a = soup.find('div', {'align' : 'center'}).find('h3')
@@ -167,23 +153,22 @@ class DokugaComAdapter(BaseSiteAdapter):
self.story.setMetadata('title',stripHTML(a))
# Find the chapters:
chapters = soup.find('select').findAll('option')
chapters = soup.find('select').find_all('option')
if len(chapters)==1:
self.chapterUrls.append((self.story.getMetadata('title'),'http://'+self.host+'/'+self.section+'/story/'+self.story.getMetadata('storyId')+'/1'))
self.add_chapter(self.story.getMetadata('title'),'http://'+self.host+'/'+self.section+'/story/'+self.story.getMetadata('storyId')+'/1')
else:
for chapter in chapters:
# just in case there's tags, like <i> in chapter titles. /fanfiction/story/7406/1
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+self.section+'/story/'+self.story.getMetadata('storyId')+'/'+chapter['value']))
self.add_chapter(chapter,'http://'+self.host+'/'+self.section+'/story/'+self.story.getMetadata('storyId')+'/'+chapter['value'])
self.story.setMetadata('numChapters',len(self.chapterUrls))
asoup = self.make_soup(self._fetchUrl(alink))
asoup = self.make_soup(self.get_request(alink))
if 'fanfiction' in self.section:
asoup=asoup.find('div', {'id' : 'cb_tabid_52'}).find('div')
#grab the rest of the metadata from the author's page
for div in asoup.findAll('div'):
for div in asoup.find_all('div'):
nav=div.find('a', href=re.compile(r'/fanfiction/story/'+self.story.getMetadata('storyId')+"/1$"))
if nav != None:
break
@@ -223,7 +208,7 @@ class DokugaComAdapter(BaseSiteAdapter):
else:
asoup=asoup.find('div', {'id' : 'maincol'}).find('div', {'class' : 'padding'})
for div in asoup.findAll('div'):
for div in asoup.find_all('div'):
nav=div.find('a', href=re.compile(r'/spark/story/'+self.story.getMetadata('storyId')+"/1$"))
if nav != None:
break
@@ -267,7 +252,7 @@ class DokugaComAdapter(BaseSiteAdapter):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
div = soup.find('div', {'id' : 'chtext'})
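One hunk in this diff only adds an `r` prefix to the URL pattern. That matters on Python 3, where `\d` inside a plain string literal is an invalid escape (a DeprecationWarning, later a SyntaxWarning); raw strings pass regex escapes through untouched. A small illustration, with example.com standing in for the real domain and the pattern shape borrowed from the hunk:

```python
import re

# Raw string (r"...") keeps regex escapes like \d intact; in a plain
# string literal, "\d" would trigger an invalid-escape warning on py3.
pattern = r"https?://example\.com/(fanfiction|spark)?/story/\d+/?\d+?$"

m = re.match(pattern, "http://example.com/fanfiction/story/1234/1")
```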


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2012 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2012 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -16,17 +16,17 @@
#
# Software: eFiction
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return DracoAndGinnyComAdapter
@@ -38,11 +38,6 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@@ -98,7 +93,7 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
d = self.post_request(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
@@ -125,18 +120,12 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
data = self.get_request(url)
m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
@@ -150,24 +139,16 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
else:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
@@ -180,11 +161,10 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
self.add_chapter(chapter,'http://'+self.host+'/'+chapter['href']+addurl)
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
@@ -201,13 +181,13 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
self.setDescription(url,content.find('blockquote'))
for genre in content.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1')):
for genre in content.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=1')):
self.story.addToList('genre',genre.string)
for warning in content.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')):
for warning in content.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2')):
self.story.addToList('warnings',warning.string)
labels = content.findAll('b')
labels = content.find_all('b')
for labelspan in labels:
value = labelspan.nextSibling
@@ -228,22 +208,22 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
self.story.setMetadata('rating', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
warnings = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
@@ -265,10 +245,9 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
seriessoup = self.make_soup(self.get_request(series_url))
# can't use ^viewstory...$ in case of higher rated stories with javascript href.
storyas = seriessoup.findAll('a', href=re.compile(r'viewstory.php\?sid=\d+'))
storyas = seriessoup.find_all('a', href=re.compile(r'viewstory.php\?sid=\d+'))
i=1
for a in storyas:
# skip 'report this' and 'TOC' links
@@ -288,7 +267,7 @@ class DracoAndGinnyComAdapter(BaseSiteAdapter):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
div = soup.find('div', {'class' : 'listbox'})


@@ -1,311 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from bs4.element import Tag
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return DramioneOrgAdapter
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class DramioneOrgAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["utf8",
"Windows-1252",]
# 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','drmn')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d %B %Y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'dramione.org'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&warning=5"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
# The actual text that is used to announce you need to be an
# adult varies from site to site. Again, print data before
# the title search to troubleshoot.
if "Stories that are suitable for ages 16 and older" in data:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Use banner as cover if found
coverurl = ''
img = soup.find('img',{'class':'banner'})
if img:
coverurl = img['src']
#print "Cover: "+coverurl
a = soup.find(text="This story has a banner; click to view.")
if a:
#print "A: "+ ', '.join("(%s, %s)" %tup for tup in a.parent.attrs)
coverurl = a.parent['href']
#print "Cover: "+coverurl
if coverurl:
self.setCoverImage(url,coverurl)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
genres=soup.findAll('a', {'class' : "tag-1"})
for genre in genres:
self.story.addToList('genre',genre.string)
warnings=soup.findAll('a', {'class' : "tag-2"})
for warning in warnings:
self.story.addToList('warnings',warning.string)
themes=soup.findAll('a', {'class' : "tag-3"})
for theme in themes:
self.story.addToList('themes',theme.string)
hermiones=soup.findAll('a', {'class' : "tag-4"})
for hermione in hermiones:
self.story.addToList('hermiones',hermione.string)
dracos=soup.findAll('a', {'class' : "tag-5"})
for draco in dracos:
self.story.addToList('dracos',draco.string)
timelines=soup.findAll('a', {'class' : "tag-6"})
for timeline in timelines:
self.story.addToList('timeline',timeline.string)
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
listbox = soup.find('div',{'class':'listbox'})
# <strong>Rated:</strong> M<br /> etc
labels = listbox.findAll('strong')
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next strong tag.
svalue = ""
while not isinstance(value,Tag) or value.name != 'strong':
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
if 'Rated' in label:
self.story.setMetadata('rating', value)
if 'Word count' in label:
self.story.setMetadata('numWords', value)
if 'Read' in label:
self.story.setMetadata('read', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
value=re.sub(r"(\d+)(st|nd|rd|th)",r"\1",value)
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
value=re.sub(r"(\d+)(st|nd|rd|th)",r"\1",value)
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
try:
self.story.setMetadata('reviews',
stripHTML(soup.find('h2',{'id':'pagetitle'}).
findAll('a', href=re.compile(r'^reviews.php'))[1]))
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
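The Published/Updated handling above strips English ordinal suffixes with `re.sub(r"(\d+)(st|nd|rd|th)", r"\1", value)` before handing the string to makeDate with the site's "%d %B %Y" format, because strptime cannot digest "12th June 2015" directly. A self-contained sketch of that cleanup, using stdlib strptime in place of makeDate:

```python
import re
from datetime import datetime

def parse_ordinal_date(text, fmt="%d %B %Y"):
    # '12th June 2015' -> '12 June 2015' -> datetime(2015, 6, 12).
    # The digit group (\d+) is kept; only the st/nd/rd/th suffix is dropped,
    # so month names like 'August' are untouched.
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", text)
    return datetime.strptime(cleaned, fmt)
```

(%B assumes an English locale for month names, which matches the site's output.)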


@@ -1,223 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2013 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return EfictionEstelielDeAdapter
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class EfictionEstelielDeAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','eesd')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%B %d, %Y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'efiction.esteliel.de'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# Now go hunting for all the meta data and the chapter list.
## Title and author
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
pagetitle = soup.find('div',{'id':'pagetitle'})
## Title
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = pagetitle.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
list = soup.find('div', {'class':'listbox'})
labelspan=list.find('span',{'class':'label'})
value = labelspan.nextSibling
label = labelspan.string
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
labels = list.findAll('b')
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next span class='label'
svalue = ""
while 'Rating' not in unicode(value):
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rating' in label:
self.story.setMetadata('rating', value)
if 'Words' in label:
self.story.setMetadata('numWords', value)
if 'Category' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
if list.find('a', href=re.compile(r"series.php")) != None:
for series in asoup.findAll('a', href=re.compile(r"series.php\?seriesid=\d+")):
# Find Series name from series URL.
series_url = 'http://'+self.host+'/'+series['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
name=seriessoup.find('div', {'id' : 'pagetitle'})
name.find('a').extract()
self.setSeries(name.text.split(' by[')[0], i)
self.story.setMetadata('seriesUrl',series_url)
i=0
break
i+=1
if i == 0:
break
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
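The metadata loop above walks `<span class="label">`/`<b>` labels and their sibling values with BeautifulSoup. A simplified, stdlib-only sketch of the same label/value scraping idea (the sample HTML, regex, and `parse_labels` helper are illustrative, not the adapter's code):

```python
import re

# Hypothetical sketch of scraping eFiction's
# '<span class="label">Rated:</span> NC-17<br />' metadata pairs
# with a regex instead of BeautifulSoup sibling-walking.
SAMPLE = ('<span class="label">Rated:</span> NC-17<br />'
          '<span class="label">Words:</span> 12345<br />'
          '<span class="label">Completed:</span> Yes<br />')

def parse_labels(html):
    # Capture each label (text before the colon) and the value that
    # follows it up to the next tag.
    pairs = re.findall(
        r'<span class="label">([^<]+):</span>\s*([^<]*)<br\s*/?>', html)
    return {label.strip(): value.strip() for label, value in pairs}

meta = parse_labels(SAMPLE)
print(meta["Rated"])  # NC-17
```

The real adapters cannot use a regex this simple because values may contain nested tags and links, which is why they walk `nextSibling` chains instead.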

View file

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2012 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2012 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -16,17 +16,16 @@
#
# Software: eFiction
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return EFPFanFicNet
@@ -38,11 +37,6 @@ class EFPFanFicNet(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@@ -52,7 +46,7 @@ class EFPFanFicNet(BaseSiteAdapter):
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
self._setURL('https://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','efp')
@@ -64,14 +58,14 @@ class EFPFanFicNet(BaseSiteAdapter):
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.efpfanfic.net'
return 'efpfanfic.net'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
return "https://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
return r"https?://(www\.)?"+re.escape(self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
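The widened URL pattern above now accepts http or https and an optional `www.` prefix. A minimal standalone sketch of how that pattern behaves (the `SITE` constant and `is_story_url` helper are illustrative, not part of the adapter):

```python
import re

# Reconstruction of the widened eFiction story-URL pattern: the old
# form matched only the literal http:// URL; the new one accepts
# http or https and an optional "www." prefix.
SITE = "efpfanfic.net"  # example domain from this adapter
PATTERN = r"https?://(www\.)?" + re.escape(SITE + "/viewstory.php?sid=") + r"\d+$"

def is_story_url(url):
    """Return True when url matches the normalized story-URL pattern."""
    return re.match(PATTERN, url) is not None

print(is_story_url("https://efpfanfic.net/viewstory.php?sid=1234"))      # True
print(is_story_url("http://www.efpfanfic.net/viewstory.php?sid=1234"))   # True
print(is_story_url("https://efpfanfic.net/viewstory.php?sid=1234&i=2"))  # False
```

Note the trailing `\d+$` anchor: chapter URLs carrying extra query parameters intentionally fail to match the normalized form.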
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
@@ -93,11 +87,11 @@ class EFPFanFicNet(BaseSiteAdapter):
params['cookiecheck'] = '1'
params['submit'] = 'Invia'
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?sid='+self.story.getMetadata('storyId')
loginUrl = 'https://' + self.getSiteDomain() + '/user.php?sid='+self.story.getMetadata('storyId')
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
d = self.post_request(loginUrl, params)
if '<a class="menu" href="newaccount.php">' in d : # register for new account link
logger.info("Failed to login to URL %s as %s" % (loginUrl,
@@ -113,27 +107,19 @@ class EFPFanFicNet(BaseSiteAdapter):
url = self.url
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
data = self.get_request(url)
# if "Access denied. This story has not been validated by the adminstrators of this site." in data:
# raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('a', href=re.compile(r'^viewstory\.php\?sid='+self.story.getMetadata('storyId')+"$"))
@@ -142,29 +128,28 @@ class EFPFanFicNet(BaseSiteAdapter):
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('authorUrl','https://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapter selector
select = soup.find('select', { 'name' : 'sid' } )
if select is None:
# no selector found, so it's a one-chapter story.
self.chapterUrls.append((self.story.getMetadata('title'),url))
# no selector found, so it's a one-chapter story.
self.add_chapter(self.story.getMetadata('title'),url)
else:
allOptions = select.findAll('option', {'value' : re.compile(r'viewstory')})
allOptions = select.find_all('option', {'value' : re.compile(r'viewstory')})
for o in allOptions:
url = u'http://%s/%s' % ( self.getSiteDomain(),
url = u'https://%s/%s' % ( self.getSiteDomain(),
o['value'])
# just in case there's tags, like <i> in chapter titles.
title = stripHTML(o)
self.chapterUrls.append((title,url))
self.add_chapter(title,url)
self.story.setMetadata('numChapters',len(self.chapterUrls))
self.story.setMetadata('language','Italian')
# normalize story URL to first chapter if later chapter URL was given:
url = self.chapterUrls[0][1].replace('&i=1','')
url = self.get_chapter(0,'url').replace('&i=1','')
logger.debug("Normalizing to URL: "+url)
self._setURL(url)
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
@@ -184,15 +169,15 @@ class EFPFanFicNet(BaseSiteAdapter):
# no storya, but do have authsoup--we're looping on author pages.
if authsoup != None:
# last author link with offset should be the 'next' link.
authurl = u'http://%s/%s' % ( self.getSiteDomain(),
authsoup.findAll('a',href=re.compile(r'viewuser\.php\?uid=\d+&catid=&offset='))[-1]['href'] )
authurl = u'https://%s/%s' % ( self.getSiteDomain(),
authsoup.find_all('a',href=re.compile(r'viewuser\.php\?uid=\d+&catid=&offset='))[-1]['href'] )
# Need author page for most of the metadata.
logger.debug("fetching author page: (%s)"%authurl)
authsoup = self.make_soup(self._fetchUrl(authurl))
authsoup = self.make_soup(self.get_request(authurl))
#print("authsoup:%s"%authsoup)
storyas = authsoup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r'&i=1$'))
storyas = authsoup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r'&i=1$'))
for storya in storyas:
#print("======storya:%s"%storya)
storyblock = storya.findParent('div',{'class':'storybloc'})
@@ -209,7 +194,7 @@ class EFPFanFicNet(BaseSiteAdapter):
# Tipo di coppia: Het | Personaggi: Akasuna no Sasori , Akatsuki, Nuovo Personaggio | Note: OOC | Avvertimenti: Tematiche delicate<br />
# Categoria: <a href="categories.php?catid=1&amp;parentcatid=1">Anime & Manga</a> > <a href="categories.php?catid=108&amp;parentcatid=108">Naruto</a> | Contesto: Naruto Shippuuden | Leggi le <a href="reviews.php?sid=1331275&amp;a=">3</a> recensioni</div>
cats = noteblock.findAll('a',href=re.compile(r'browse.php\?type=categories'))
cats = noteblock.find_all('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
@@ -273,12 +258,11 @@ class EFPFanFicNet(BaseSiteAdapter):
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?ssid=\d+&i=1"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
series_url = 'https://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
seriessoup = self.make_soup(self.get_request(series_url))
# can't use ^viewstory...$ in case of higher rated stories with javascript href.
storyas = seriessoup.findAll('a', href=re.compile(r'viewstory.php\?sid=\d+&i=1'))
storyas = seriessoup.find_all('a', href=re.compile(r'viewstory.php\?sid=\d+&i=1'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId'))+'&i=1':
@@ -296,7 +280,7 @@ class EFPFanFicNet(BaseSiteAdapter):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
div = soup.find('div', {'class' : 'storia'})
@@ -304,11 +288,11 @@ class EFPFanFicNet(BaseSiteAdapter):
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
# remove any header and 'o:p' tags.
for tag in div.findAll("head") + div.findAll("o:p"):
for tag in div.find_all("head") + div.find_all("o:p"):
tag.extract()
# change any html and body tags to div.
for tag in div.findAll("html") + div.findAll("body"):
for tag in div.find_all("html") + div.find_all("body"):
tag.name='div'
# remove extra bogus doctype.

View file

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -16,17 +16,17 @@
#
# Software: eFiction
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return ErosnSapphoSycophantHexComAdapter
@@ -38,11 +38,6 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@@ -50,7 +45,7 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
@@ -91,13 +86,7 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
@@ -111,24 +100,16 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
else:
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title
pt = soup.find('div', {'id' : 'pagetitle'})
@@ -145,11 +126,10 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
self.story.setMetadata('rating', rating)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
self.add_chapter(chapter,'http://'+self.host+'/'+chapter['href']+addurl)
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
@@ -160,12 +140,12 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
labels = soup.find_all('span',{'class':'label'})
value = labels[0].previousSibling
svalue = ""
while value != None:
@@ -175,7 +155,7 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
svalue += unicode(val)
val = val.nextSibling
self.setDescription(url,svalue)
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
@@ -184,22 +164,22 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
self.story.setMetadata('numWords', value.split(' -')[0])
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
warnings = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
for warning in warnings:
self.story.addToList('warnings',warning.string)
@@ -223,9 +203,8 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'viewstory.php\?sid=\d+'))
seriessoup = self.make_soup(self.get_request(series_url))
storyas = seriessoup.find_all('a', href=re.compile(r'viewstory.php\?sid=\d+'))
i=1
for a in storyas:
# skip 'report this' and 'TOC' links
@@ -245,7 +224,7 @@ class ErosnSapphoSycophantHexComAdapter(BaseSiteAdapter):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
div = soup.find('div', {'id' : 'story'})

View file

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2013 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2013 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -18,19 +18,19 @@
### Adapted by GComyn - November 26, 2016
###
####################################################################################################
from __future__ import absolute_import
from __future__ import unicode_literals
import time
import logging
logger = logging.getLogger(__name__)
import re
import sys
import urllib2
from bs4 import UnicodeDammit, Comment
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from .base_adapter import BaseSiteAdapter, makeDate
####################################################################################################
def getClass():
@@ -42,14 +42,6 @@ class FanficAuthorsNetAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
logger.debug("FanficAuthorsNetAdapter.__init__ - url='{0}'".format(url))
self.decode = ["utf8",
"Windows-1252",
"iso-8859-1"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
@@ -61,8 +53,11 @@ class FanficAuthorsNetAdapter(BaseSiteAdapter):
#Setting the 'Zone' for each "Site"
self.zone = self.parsedUrl.netloc.replace('.fanficauthors.net','')
# site change .nsns to -nsns
self.zone = self.zone.replace('.nsns','-nsns')
# normalized story URL.
self._setURL('http://{0}.{1}/{2}/'.format(
self._setURL('https://{0}.{1}/{2}/'.format(
self.zone, self.getBaseDomain(), self.story.getMetadata('storyId')))
# Each adapter needs to have a unique site abbreviation.
@@ -71,10 +66,10 @@ class FanficAuthorsNetAdapter(BaseSiteAdapter):
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d %b %y"
################################################################################################
def getBaseDomain(self):
''' Added because fanficauthors.net does send you to www.fanficauthors.net when
''' Added because fanficauthors.net does send you to www.fanficauthors.net when
you go to it '''
return 'fanficauthors.net'
@@ -87,7 +82,10 @@ class FanficAuthorsNetAdapter(BaseSiteAdapter):
@classmethod
def getAcceptDomains(cls):
# need both .nsns(old) and -nsns(new) because it's a domain
# change, not just URL change.
return ['aaran-st-vines.nsns.fanficauthors.net',
'aaran-st-vines-nsns.fanficauthors.net',
'abraxan.fanficauthors.net',
'bobmin.fanficauthors.net',
'canoncansodoff.fanficauthors.net',
@@ -103,9 +101,12 @@ class FanficAuthorsNetAdapter(BaseSiteAdapter):
'jeconais.fanficauthors.net',
'kinsfire.fanficauthors.net',
'kokopelli.nsns.fanficauthors.net',
'kokopelli-nsns.fanficauthors.net',
'ladya.nsns.fanficauthors.net',
'ladya-nsns.fanficauthors.net',
'lorddwar.fanficauthors.net',
'mrintel.nsns.fanficauthors.net',
'mrintel-nsns.fanficauthors.net',
'musings-of-apathy.fanficauthors.net',
'ruskbyte.fanficauthors.net',
'seelvor.fanficauthors.net',
@@ -116,35 +117,43 @@ class FanficAuthorsNetAdapter(BaseSiteAdapter):
################################################################################################
@classmethod
def getSiteExampleURLs(self):
return ("http://aaran-st-vines.nsns.fanficauthors.net/[StoryId]/\n"
+ "http://abraxan.fanficauthors.net/[StoryId]/\n"
+ "http://bobmin.fanficauthors.net/[StoryId]/\n"
+ "http://canoncansodoff.fanficauthors.net/[StoryId]/\n"
+ "http://chemprof.fanficauthors.net/[StoryId]/\n"
+ "http://copperbadge.fanficauthors.net/[StoryId]/\n"
+ "http://crys.fanficauthors.net/[StoryId]/\n"
+ "http://deluded-musings.fanficauthors.net/[StoryId]/\n"
+ "http://draco664.fanficauthors.net/[StoryId]/\n"
+ "http://fp.fanficauthors.net/[StoryId]/\n"
+ "http://frenchsession.fanficauthors.net/[StoryId]/\n"
+ "http://ishtar.fanficauthors.net/[StoryId]/\n"
+ "http://jbern.fanficauthors.net/[StoryId]/\n"
+ "http://jeconais.fanficauthors.net/[StoryId]/\n"
+ "http://kinsfire.fanficauthors.net/[StoryId]/\n"
+ "http://kokopelli.nsns.fanficauthors.net/[StoryId]/\n"
+ "http://ladya.nsns.fanficauthors.net/[StoryId]/\n"
+ "http://lorddwar.fanficauthors.net/[StoryId]/\n"
+ "http://mrintel.nsns.fanficauthors.net/[StoryId]/\n"
+ "http://musings-of-apathy.fanficauthors.net/[StoryId]/\n"
+ "http://ruskbyte.fanficauthors.net/[StoryId]/\n"
+ "http://seelvor.fanficauthors.net/[StoryId]/\n"
+ "http://tenhawk.fanficauthors.net/[StoryId]/\n"
+ "http://viridian.fanficauthors.net/[StoryId]/\n"
+ "http://whydoyouneedtoknow.fanficauthors.net/[StoryId]/\n")
return ("https://aaran-st-vines-nsns.fanficauthors.net/A_Story_Name/ "
+ "https://abraxan.fanficauthors.net/A_Story_Name/ "
+ "https://bobmin.fanficauthors.net/A_Story_Name/ "
+ "https://canoncansodoff.fanficauthors.net/A_Story_Name/ "
+ "https://chemprof.fanficauthors.net/A_Story_Name/ "
+ "https://copperbadge.fanficauthors.net/A_Story_Name/ "
+ "https://crys.fanficauthors.net/A_Story_Name/ "
+ "https://deluded-musings.fanficauthors.net/A_Story_Name/ "
+ "https://draco664.fanficauthors.net/A_Story_Name/ "
+ "https://fp.fanficauthors.net/A_Story_Name/ "
+ "https://frenchsession.fanficauthors.net/A_Story_Name/ "
+ "https://ishtar.fanficauthors.net/A_Story_Name/ "
+ "https://jbern.fanficauthors.net/A_Story_Name/ "
+ "https://jeconais.fanficauthors.net/A_Story_Name/ "
+ "https://kinsfire.fanficauthors.net/A_Story_Name/ "
+ "https://kokopelli-nsns.fanficauthors.net/A_Story_Name/ "
+ "https://ladya-nsns.fanficauthors.net/A_Story_Name/ "
+ "https://lorddwar.fanficauthors.net/A_Story_Name/ "
+ "https://mrintel-nsns.fanficauthors.net/A_Story_Name/ "
+ "https://musings-of-apathy.fanficauthors.net/A_Story_Name/ "
+ "https://ruskbyte.fanficauthors.net/A_Story_Name/ "
+ "https://seelvor.fanficauthors.net/A_Story_Name/ "
+ "https://tenhawk.fanficauthors.net/A_Story_Name/ "
+ "https://viridian.fanficauthors.net/A_Story_Name/ "
+ "https://whydoyouneedtoknow.fanficauthors.net/A_Story_Name/ ")
################################################################################################
def getSiteURLPattern(self):
return r'http?://(aaran-st-vines.nsns|abraxan|bobmin|canoncansodoff|chemprof|copperbadge|crys|deluded-musings|draco664|fp|frenchsession|ishtar|jbern|jeconais|kinsfire|kokopelli.nsns|ladya.nsns|lorddwar|mrintel.nsns|musings-of-apathy|ruskbyte|seelvor|tenhawk|viridian|whydoyouneedtoknow)\.fanficauthors\.net/([a-zA-Z0-9_]+)/'
## .nsns kept here to match both . and -
return r'https?://(aaran-st-vines.nsns|abraxan|bobmin|canoncansodoff|chemprof|copperbadge|crys|deluded-musings|draco664|fp|frenchsession|ishtar|jbern|jeconais|kinsfire|kokopelli.nsns|ladya.nsns|lorddwar|mrintel.nsns|musings-of-apathy|ruskbyte|seelvor|tenhawk|viridian|whydoyouneedtoknow)\.fanficauthors\.net/([a-zA-Z0-9_]+)/'
@classmethod
def get_section_url(cls,url):
## only changing .nsns to -nsns and only when part of the
## domain.
url = url.replace('.nsns.fanficauthors.net','-nsns.fanficauthors.net')
return url
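The `.nsns` to `-nsns` migration above is a host rename, not just a URL change, which is why both forms stay in `getAcceptDomains`. A minimal sketch of the rewrite `get_section_url` performs (the `section_url` helper name is illustrative):

```python
# Sketch of the .nsns -> -nsns domain migration: only the host part of
# the URL changes; the story path is untouched, and hosts without
# ".nsns" pass through unchanged.
OLD_SUFFIX = '.nsns.fanficauthors.net'
NEW_SUFFIX = '-nsns.fanficauthors.net'

def section_url(url):
    return url.replace(OLD_SUFFIX, NEW_SUFFIX)

print(section_url("https://mrintel.nsns.fanficauthors.net/A_Story_Name/"))
# https://mrintel-nsns.fanficauthors.net/A_Story_Name/
print(section_url("https://jeconais.fanficauthors.net/A_Story_Name/"))
# unchanged: no .nsns in the host
```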
################################################################################################
def doExtractChapterUrlsAndMetadata(self, get_cover=True):
@@ -152,139 +161,105 @@ class FanficAuthorsNetAdapter(BaseSiteAdapter):
url = self.url
logger.debug("URL: "+url)
params={}
if self.password:
params['username'] = self.username
params['password'] = self.password
else:
params['username'] = self.getConfig("username")
params['password'] = self.getConfig("password")
if not params['username']:
raise exceptions.FailedToLogin('You need to have your username and pasword set.',params['username'])
soup = self.make_soup(self.get_request(url+'index/'))
try:
data = self._fetchUrl(url+'index/', params, usecache=False)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist("Code: 404. {0}".format(url))
elif e.code == 410:
raise exceptions.StoryDoesNotExist("Code: 410. {0}".format(url))
elif e.code == 401:
self.needToLogin = True
data = ''
else:
raise e
if "The requested file has not been found" in data:
raise exceptions.StoryDoesNotExist(
"{0}.{1} says: The requested file has not been found".format(
self.zone, self.getBaseDomain()))
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# Find authorid and URL.
# There is no place where the author's name is listed,
# Find authorid and URL.
# There is no place where the author's name is listed,
# except for in the image at the top of the page. We have to
# work with the url entered to get the Author's Name
a = self.zone.split('.')[0]
self.story.setMetadata('authorId',a)
a = a.replace('-',' ').title()
self.story.setMetadata('author',a)
self.story.setMetadata('authorUrl','http://{0}/'.format(self.parsedUrl.netloc))
loginUrl = self.story.getMetadata('authorUrl')+'account/'
loginsoup = self.make_soup(self._fetchUrl(loginUrl))
if True:
# if self.performLogin(loginUrl, loginsoup):
# Now go hunting for all the meta data and the chapter list.
self.story.setMetadata('authorUrl','https://{0}/'.format(self.parsedUrl.netloc))
## Title
a = soup.find('h2')
self.story.setMetadata('title',stripHTML(a))
## Title
a = soup.find('h2')
self.story.setMetadata('title',stripHTML(a))
# Find the chapters:
# The published and update dates are with the chapter links...
# so we have to get them from there.
chapters = soup.findAll('a', href=re.compile('/'+self.story.getMetadata(
'storyId')+'/([a-zA-Z0-9_]+)/'))
# Find the chapters:
# The published and update dates are with the chapter links...
# so we have to get them from there.
chapters = soup.find_all('a', href=re.compile('/'+self.story.getMetadata(
'storyId')+'/([a-zA-Z0-9_]+)/'))
# Here we are getting the published date. It is the date the first chapter was "updated"
updatedate = stripHTML(unicode(chapters[0].parent)).split('Uploaded on:')[1].strip()
updatedate = updatedate.replace('st ',' ').replace('nd ',' ').replace(
'rd ',' ').replace('th ',' ')
self.story.setMetadata('datePublished', makeDate(updatedate, self.dateformat))
# Here we are getting the published date. It is the date the first chapter was "updated"
updatedate = stripHTML(unicode(chapters[0].parent)).split('Uploaded on:')[1].strip()
updatedate = updatedate.replace('st ',' ').replace('nd ',' ').replace(
'rd ',' ').replace('th ',' ')
self.story.setMetadata('datePublished', makeDate(updatedate, self.dateformat))
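The adapter strips English ordinal suffixes ("21st" to "21") before handing dates in the `"%d %b %y"` format to `makeDate`. A standalone sketch of the same idea (the `parse_upload_date` helper is illustrative; it substitutes a regex for the adapter's chained `str.replace()` calls, which could also clobber unrelated words ending in "st ", "nd ", etc.):

```python
import re
from datetime import datetime

DATEFORMAT = "%d %b %y"  # the adapter's self.dateformat

def parse_upload_date(text):
    # Strip ordinal suffixes ("21st Mar 16" -> "21 Mar 16") so strptime
    # can parse the day number.
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", text)
    return datetime.strptime(cleaned, DATEFORMAT)

print(parse_upload_date("21st Mar 16").date())  # 2016-03-21
```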
for i, chapter in enumerate(chapters):
if '/reviews/' not in chapter['href']:
# here we get the update date. We will update this for every chapter,
# so we get the last one.
updatedate = stripHTML(unicode(chapters[i].parent)).split(
'Uploaded on:')[1].strip()
updatedate = updatedate.replace('st ',' ').replace('nd ',' ').replace(
'rd ',' ').replace('th ',' ')
self.story.setMetadata('dateUpdated', makeDate(updatedate, self.dateformat))
if '::' in stripHTML(unicode(chapter)):
chapter_title = stripHTML(unicode(chapter).split('::')[1])
else:
chapter_title = stripHTML(unicode(chapter))
chapter_Url = self.story.getMetadata('authorUrl')+chapter['href'][1:]
self.chapterUrls.append((chapter_title, chapter_Url))
self.story.setMetadata('numChapters', len(self.chapterUrls))
genres = ("Drama","Romance")
gotgenre = False
## Getting the Metadata that is there
div = soup.find('div',{'class':'well'})
metads = div.findAll('p')[1].get_text().replace('\n','').split(' - ')
for metad in metads:
metad = metad.strip()
if ':' in metad:
heading = metad.split(':')[0].strip()
text = metad.split(':')[1].strip()
if heading == 'Status':
self.story.setMetadata('status',text)
elif heading == 'Rating':
self.story.setMetadata('rating',text)
elif heading == 'Word count':
self.story.setMetadata('numWords',text)
elif heading == 'Genre':
self.story.setMetadata('genre',text.replace(',',', ').replace(' ',' '))
gotgenre = True
# Status: Completed - Rating: Adult Only - Chapters: 19 - Word count: 323,805 - Genre: Post-OotP
# Status: In progress - Rating: Adult Only - Chapters: 42 - Word count: 395,991 - Genre: Action/Adventure, Angst, Drama, Romance, Tragedy
# Status: Completed - Rating: Everyone - Chapters: 1 - Word count: 876 - Genre: Sorrow
# Status: In progress - Rating: Mature - Chapters: 39 - Word count: 314,544 - Genre: Drama - Romance
div = soup.find('div',{'class':'well'})
# logger.debug(div.find_all('p')[1])
metaline = re.sub(r' +',' ',stripHTML(div.find_all('p')[1]).replace('\n',' '))
# logger.debug(metaline)
match = re.match(r"Status: (?P<status>.+?) - Rating: (?P<rating>.+?) - Chapters: [0-9,]+ - Word count: (?P<numWords>[0-9,]+?) - Genre: ?(?P<genre>.*?)$",metaline)
if match:
# logger.debug(match.group('status'))
# logger.debug(match.group('rating'))
# logger.debug(match.group('numWords'))
# logger.debug(match.group('genre'))
if "Completed" in match.group('status'):
self.story.setMetadata('status',"Completed")
else:
self.story.setMetadata('status',"In-Progress")
self.story.setMetadata('rating',match.group('rating'))
self.story.setMetadata('numWords',match.group('numWords'))
self.story.extendList('genre',re.split(r'[;,-]',match.group('genre')))
else:
raise exceptions.FailedToDownload("Error parsing metadata: '{0}'".format(url))
summary = div.find('blockquote').get_text()
self.setDescription(url,summary)
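The single `Status: ... - Rating: ... - Chapters: ... - Word count: ... - Genre: ...` regex above replaces the older per-token genre guessing. A sketch running that same pattern against one of the sample lines quoted in the adapter's comments:

```python
import re

# The metadata-line regex from the adapter, applied to one of the
# example lines from its comments.
METALINE_RE = re.compile(
    r"Status: (?P<status>.+?) - Rating: (?P<rating>.+?) - "
    r"Chapters: [0-9,]+ - Word count: (?P<numWords>[0-9,]+?) - "
    r"Genre: ?(?P<genre>.*?)$")

sample = ("Status: Completed - Rating: Adult Only - Chapters: 19 - "
          "Word count: 323,805 - Genre: Post-OotP")
m = METALINE_RE.match(sample)
print(m.group('status'), '|', m.group('numWords'), '|', m.group('genre'))
# Completed | 323,805 | Post-OotP
```

The lazy quantifiers matter: each named group stops at the next literal ` - ` delimiter, so multi-word values like "Adult Only" and comma-separated genre lists survive intact.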
## Raising AdultCheckRequired after collecting chapters gives
## a double chapter list. So does genre, but it de-dups
## automatically.
if( self.story.getMetadataRaw('rating') in ['Mature','Adult Only']
and not (self.is_adult or self.getConfig("is_adult")) ):
raise exceptions.AdultCheckRequired(self.url)
for i, chapter in enumerate(chapters):
if '/reviews/' not in chapter['href']:
# here we get the update date. We will update this for every chapter,
# so we get the last one.
updatedate = stripHTML(unicode(chapters[i].parent)).split(
'Uploaded on:')[1].strip()
updatedate = updatedate.replace('st ',' ').replace('nd ',' ').replace(
'rd ',' ').replace('th ',' ')
self.story.setMetadata('dateUpdated', makeDate(updatedate, self.dateformat))
if '::' in stripHTML(unicode(chapter)):
chapter_title = stripHTML(unicode(chapter).split('::')[1])
else:
if gotgenre == True:
if ',' in metad:
for gen in metad.split(','):
self.story.addToList('genre',gen.strip())
for gen in genres:
if metad == gen:
self.story.addToList('genre',metad.strip())
else:
for gen in genres:
if metad == gen:
self.story.addToList('genre',metad.strip())
chapter_title = stripHTML(unicode(chapter))
chapter_Url = self.story.getMetadata('authorUrl')+chapter['href'][1:]
self.add_chapter(chapter_title, chapter_Url)
summary = div.find('blockquote').get_text()
self.setDescription(url,summary)
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
if( self.story.getMetadataRaw('rating') in ['Mature','Adult Only'] and
(self.is_adult or self.getConfig("is_adult")) ):
addurl = "?bypass=1"
else:
addurl=""
soup = self.make_soup(self.get_request(url+addurl))
story = soup.find('div',{'class':'story'})
if story == None:
raise exceptions.FailedToDownload(
"Error downloading Chapter: '{0}'! Missing required element!".format(url))
#Now, there are a lot of extraneous tags within the story division, so we will remove them.
for tag in story.find_all('ul',{'class':'pager'}) + story.find_all(
'div',{'class':'alert'}) + story.find_all('div', {'class':'btn-group'}):
tag.extract()
return self.utf8FromSoup(url,story)


@@ -1,321 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2014 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# In general an 'adapter' needs to do these five things:
# - 'Register' correctly with the downloader
# - Site Login (if needed)
# - 'Are you adult?' check (if needed--some do one, some the other, some both)
# - Grab the chapter list
# - Grab the story meta-data (some (non-eFiction) adapters have to get it from the author page)
# - Grab the chapter texts
# Search for XXX comments--that's where things are most likely to need changing.
# This function is called by the downloader in all adapter_*.py files
# in this dir to register the adapter class. So it needs to be
# updated to reflect the class below it. That, plus getSiteDomain()
# take care of 'Registering'.
def getClass():
return FanficCastleTVNetAdapter # XXX
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class FanficCastleTVNetAdapter(BaseSiteAdapter): # XXX
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
# XXX Most sites don't have the /fanfic part. Replace all to remove it usually.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','csltv') # XXX
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%b %d, %Y" # XXX
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'fanfic.castletv.net' # XXX
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
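getSiteURLPattern() builds its regex from the literal URL prefix; re.escape is what keeps the ? and . in viewstory.php?sid= from acting as regex metacharacters. A standalone sketch of what that pattern accepts:

```python
import re

site = 'fanfic.castletv.net'
pattern = re.escape("http://" + site + "/viewstory.php?sid=") + r"\d+$"

# The escaped prefix matches literally; \d+$ requires the URL to end
# immediately after the numeric story id.
assert re.match(pattern, "http://fanfic.castletv.net/viewstory.php?sid=1234")
assert not re.match(pattern, "http://fanfic.castletv.net/viewstory.php?sid=1234&chapter=2")
```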
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&ageconsent=ok&warning=3"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
if m != None:
if self.is_adult or self.getConfig("is_adult"):
# We tried the default and still got a warning, so
# let's pull the warning number from the 'continue'
# link and reload data.
addurl = m.group(1)
# correct stupid &amp; error in url.
addurl = addurl.replace("&amp;","&")
url = self.url+'&index=1'+addurl
logger.debug("URL 2nd try: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
else:
raise exceptions.AdultCheckRequired(self.url)
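The retry above hinges on the re.search a few lines earlier: it lifts the site's own ageconsent/warning query fragment out of the 'click here to continue' link (where & is HTML-encoded as &amp;) and replays it. A standalone sketch, using a made-up snippet of page HTML:

```python
import re

# Hypothetical 'continue' link as it might appear in the fetched page source.
data = "<a href='viewstory.php?sid=1234&amp;ageconsent=ok&amp;warning=5'>Continue</a>"

m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'", data)
# Undo the HTML entity encoding before reusing the fragment in a URL.
addurl = m.group(1).replace("&amp;", "&")
assert addurl == "&ageconsent=ok&warning=5"
```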
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
pagetitle = soup.find('div',{'id':'pagetitle'})
## Title
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = pagetitle.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Reviews
reviewdata = soup.find('div', {'id' : 'sort'})
a = reviewdata.findAll('a', href=re.compile(r'reviews.php\?type=ST&(amp;)?item='+self.story.getMetadata('storyId')+"$"))[1] # second one.
self.story.setMetadata('reviews',stripHTML(a))
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formatting, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Summary' in label:
## Everything until the next span class='label'
svalue = ""
while value and 'label' not in defaultGetattr(value,'class'):
svalue += unicode(value)
value = value.nextSibling
self.setDescription(url,svalue)
#self.story.setMetadata('description',stripHTML(svalue))
if 'Rated' in label:
self.story.setMetadata('rating', value)
if 'Word count' in label:
self.story.setMetadata('numWords', value)
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
catstext = [cat.string for cat in cats]
for cat in catstext:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
charstext = [char.string for char in chars]
for char in charstext:
self.story.addToList('characters',char.string)
## Not all sites use Genre, but there's no harm to
## leaving it in. Check to make sure the type_id number
## is correct, though--it's site specific.
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
genrestext = [genre.string for genre in genres]
self.genre = ', '.join(genrestext)
for genre in genrestext:
self.story.addToList('genre',genre.string)
## Not all sites use Warnings, but there's no harm to
## leaving it in. Check to make sure the type_id number
## is correct, though--it's site specific.
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
warningstext = [warning.string for warning in warnings]
self.warning = ', '.join(warningstext)
for warning in warningstext:
self.story.addToList('warnings',warning.string)
if 'Completed' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)


@@ -1,183 +0,0 @@
# coding=utf-8
import re
import urllib2
import urlparse
from base_adapter import BaseSiteAdapter, makeDate
from .. import exceptions
_SOURCE_CODE_ENCODING = 'utf-8'
def getClass():
return FanficHuAdapter
def _get_query_data(url):
components = urlparse.urlparse(url)
query_data = urlparse.parse_qs(components.query)
return dict((key, data[0]) for key, data in query_data.items())
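_get_query_data flattens parse_qs's list-of-values result down to just the first value per key. The same helper in Python 3 syntax (the deleted file above is Python 2, where these functions live in the urlparse module):

```python
from urllib.parse import urlparse, parse_qs

def get_query_data(url):
    # parse_qs returns {key: [value, ...]}; keep only the first value.
    components = urlparse(url)
    return {key: values[0] for key, values in parse_qs(components.query).items()}

assert get_query_data("http://fanfic.hu/merengo/viewstory.php?sid=42&i=1") == {'sid': '42', 'i': '1'}
```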
class FanficHuAdapter(BaseSiteAdapter):
SITE_ABBREVIATION = 'ffh'
SITE_DOMAIN = 'fanfic.hu'
SITE_LANGUAGE = 'Hungarian'
BASE_URL = 'http://' + SITE_DOMAIN + '/merengo/'
VIEW_STORY_URL_TEMPLATE = BASE_URL + 'viewstory.php?sid=%s'
DATE_FORMAT = '%m/%d/%Y'
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
query_data = urlparse.parse_qs(self.parsedUrl.query)
story_id = query_data['sid'][0]
self.story.setMetadata('storyId', story_id)
self._setURL(self.VIEW_STORY_URL_TEMPLATE % story_id)
self.story.setMetadata('siteabbrev', self.SITE_ABBREVIATION)
self.story.setMetadata('language', self.SITE_LANGUAGE)
def _customized_fetch_url(self, url, exception=None, parameters=None):
if exception:
try:
data = self._fetchUrl(url, parameters)
except urllib2.HTTPError:
raise exception(self.url)
# Just let self._fetchUrl throw the exception, don't catch and
# customize it.
else:
data = self._fetchUrl(url, parameters)
return self.make_soup(data)
@staticmethod
def getSiteDomain():
return FanficHuAdapter.SITE_DOMAIN
@classmethod
def getSiteExampleURLs(cls):
return cls.VIEW_STORY_URL_TEMPLATE % 1234
def getSiteURLPattern(self):
return re.escape(self.VIEW_STORY_URL_TEMPLATE[:-2]) + r'\d+$'
def extractChapterUrlsAndMetadata(self):
soup = self._customized_fetch_url(self.url + '&i=1')
if soup.title.string.encode(_SOURCE_CODE_ENCODING).strip(' :') == 'írta':
raise exceptions.StoryDoesNotExist(self.url)
chapter_options = soup.find('form', action='viewstory.php').select('option')
# Remove redundant "Fejezetek" option
chapter_options.pop(0)
# If there is still more than one entry remove chapter overview entry
if len(chapter_options) > 1:
chapter_options.pop(0)
for option in chapter_options:
url = urlparse.urljoin(self.url, option['value'])
self.chapterUrls.append((option.string, url))
author_url = urlparse.urljoin(self.BASE_URL, soup.find('a', href=lambda href: href and href.startswith('viewuser.php?uid='))['href'])
soup = self._customized_fetch_url(author_url)
story_id = self.story.getMetadata('storyId')
for table in soup('table', {'class': 'mainnav'}):
title_anchor = table.find('span', {'class': 'storytitle'}).a
href = title_anchor['href']
if href.startswith('javascript:'):
href = href.rsplit(' ', 1)[1].strip("'")
query_data = _get_query_data(href)
if query_data['sid'] == story_id:
break
else:
# This should never happen, the story must be found on the author's
# page.
raise exceptions.FailedToDownload(self.url)
self.story.setMetadata('title', title_anchor.string)
rows = table('tr')
anchors = rows[0].div('a')
author_anchor = anchors[1]
query_data = _get_query_data(author_anchor['href'])
self.story.setMetadata('author', author_anchor.string)
self.story.setMetadata('authorId', query_data['uid'])
self.story.setMetadata('authorUrl', urlparse.urljoin(self.BASE_URL, author_anchor['href']))
self.story.setMetadata('reviews', anchors[3].string)
if self.getConfig('keep_summary_html'):
self.story.setMetadata('description', self.utf8FromSoup(author_url, rows[1].td))
else:
self.story.setMetadata('description', ''.join(rows[1].td(text=True)))
for row in rows[3:]:
index = 0
cells = row('td')
while index < len(cells):
cell = cells[index]
key = cell.b.string.encode(_SOURCE_CODE_ENCODING).strip(':')
try:
value = cells[index+1].string.encode(_SOURCE_CODE_ENCODING)
except AttributeError:
value = None
if key == 'Kategória':
for anchor in cells[index+1]('a'):
self.story.addToList('category', anchor.string)
elif key == 'Szereplõk':
if cells[index+1].string:
for name in cells[index+1].string.split(', '):
self.story.addToList('characters', name)
elif key == 'Korhatár':
if value != 'nem korhatáros':
self.story.setMetadata('rating', value)
elif key == 'Figyelmeztetések':
for b_tag in cells[index+1]('b'):
self.story.addToList('warnings', b_tag.string)
elif key == 'Jellemzõk':
for genre in cells[index+1].string.split(', '):
self.story.addToList('genre', genre)
elif key == 'Fejezetek':
self.story.setMetadata('numChapters', int(value))
elif key == 'Megjelenés':
self.story.setMetadata('datePublished', makeDate(value, self.DATE_FORMAT))
elif key == 'Frissítés':
self.story.setMetadata('dateUpdated', makeDate(value, self.DATE_FORMAT))
elif key == 'Szavak':
self.story.setMetadata('numWords', value)
elif key == 'Befejezett':
self.story.setMetadata('status', 'Completed' if value == 'Nem' else 'In-Progress')
index += 2
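The while loop above walks the row's cells two at a time, treating them as alternating label/value pairs keyed by Hungarian labels (Szavak = words, Befejezett = completed). A simplified sketch with plain strings standing in for the BeautifulSoup cell objects:

```python
# Cells alternate label, value, label, value ...
cells = ['Szavak:', '12345', 'Befejezett:', 'Nem']

metadata = {}
index = 0
while index < len(cells):
    key = cells[index].rstrip(':')   # 'Szavak:' -> 'Szavak'
    metadata[key] = cells[index + 1]
    index += 2

assert metadata == {'Szavak': '12345', 'Befejezett': 'Nem'}
```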
if self.story.getMetadata('rating') == '18':
if not (self.is_adult or self.getConfig('is_adult')):
raise exceptions.AdultCheckRequired(self.url)
def getChapterText(self, url):
soup = self._customized_fetch_url(url)
story_cell = soup.find('form', action='viewstory.php').parent.parent
for div in story_cell('div'):
div.extract()
return self.utf8FromSoup(url, story_cell)


@@ -0,0 +1,324 @@
# -*- coding: utf-8 -*-
# Copyright 2014 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return FanFicsMeAdapter
logger = logging.getLogger(__name__)
class FanFicsMeAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
self.full_work_soup = None
self.use_full_work_soup = True
## All Russian as far as I know.
self.story.setMetadata('language','Russian')
# get storyId from url--url validation guarantees query correct
m = re.match(self.getSiteURLPattern(),url)
if m:
self.story.setMetadata('storyId',m.group('id'))
# normalized story URL.
self._setURL('https://' + self.getSiteDomain() + '/fic'+self.story.getMetadata('storyId'))
else:
raise exceptions.InvalidStoryURL(url,
self.getSiteDomain(),
self.getSiteExampleURLs())
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ffme')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d.%m.%Y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'fanfics.me'
@classmethod
def getSiteExampleURLs(cls):
return "https://"+cls.getSiteDomain()+"/fic1234 https://"+cls.getSiteDomain()+"/read.php?id=1234 https://"+cls.getSiteDomain()+"/read.php?id=1234&chapter=2"
def getSiteURLPattern(self):
# https://fanfics.me/fic137282
# https://fanfics.me/read.php?id=137282
# https://fanfics.me/read.php?id=137282&chapter=2
# https://fanfics.me/download.php?fic=137282&format=epub
return r"https?://"+re.escape(self.getSiteDomain())+r"/(fic|read\.php\?id=|download\.php\?fic=)(?P<id>\d+)"
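This pattern accepts all three URL shapes listed in the comments above and captures the story id through the (?P<id>...) named group, which __init__ later reads as m.group('id'). A standalone sketch:

```python
import re

pattern = r"https?://fanfics\.me/(fic|read\.php\?id=|download\.php\?fic=)(?P<id>\d+)"

# All three URL forms yield the same story id via the named group.
for url in ("https://fanfics.me/fic137282",
            "https://fanfics.me/read.php?id=137282&chapter=2",
            "https://fanfics.me/download.php?fic=137282&format=epub"):
    m = re.match(pattern, url)
    assert m and m.group('id') == '137282'
```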
## Login
def needToLoginCheck(self, data):
return '<form name="autent" action="https://fanfics.me/autent.php" method="post">' in data
def performLogin(self, url):
'''
<form name="autent" action="https://fanfics.me/autent.php" method="post">
Имя:<br>
<input class="input_3" type="text" name="name" id="name"><br>
Пароль:<br>
<input class="input_3" type="password" name="pass" id="pass"><br>
<input type="checkbox" name="nocookie" id="nocookie" />&nbsp;<label for="nocookie">Чужой&nbsp;компьютер</label><br>
<input class="modern_button" type="submit" value="Войти">
<div class="lostpass center"><a href="/index.php?section=lostpass">Забыл пароль</a></div>
'''
params = {}
if self.password:
params['name'] = self.username
params['pass'] = self.password
else:
params['name'] = self.getConfig("username")
params['pass'] = self.getConfig("password")
loginUrl = 'https://' + self.getSiteDomain() + '/autent.php'
logger.info("Will now login to URL (%s) as (%s)" % (loginUrl,
params['name']))
## must need a cookie or something.
self.get_request(loginUrl, usecache=False)
d = self.post_request(loginUrl, params, usecache=False)
if self.needToLoginCheck(d):
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['name']))
raise exceptions.FailedToLogin(url,params['name'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
url = self.url
logger.info("url: "+url)
data = self.get_request(url)
soup = self.make_soup(data)
## restrict meta searches to header.
fichead = soup.find('div',class_='FicHead')
def get_meta_content(title):
val_label = fichead.find('div',string=re.compile(u'^'+title+u':'))
if val_label:
return val_label.find_next('div')
## fanfics.me doesn't have separate adult--you have to set
## your age to 18+ in your user account
## Rating
## R, NC-17, PG-13 require login
## doesn't: General
#('Рейтинг', 'rating', False, False)
# val_label = fichead.find('div',string=u'Рейтинг:')
# val = stripHTML(val_label.find_next('div'))
# logger.debug(val)
self.story.setMetadata('rating',stripHTML(get_meta_content(u'Рейтинг')))
## Need to login for any rating higher than General.
if self.story.getMetadataRaw('rating') != 'General' and self.needToLoginCheck(data):
self.performLogin(url)
# reload after login.
data = self.get_request(url,usecache=False)
soup = self.make_soup(data)
fichead = soup.find('div',class_='FicHead')
## Title
## <h1>Третья сторона&nbsp;<span class="small green">(гет)</span></h1>
h = fichead.find('h1')
span = h.find('span')
## I haven't found a term for what fanfics.me calls this, but
## it translates to Get Jen Slash Femslash
self.story.addToList('category',stripHTML(span)[1:-1])
span.extract()
self.story.setMetadata('title',stripHTML(h))
## author(s):
content = get_meta_content(u'Авторы?')
if content:
alist = content.find_all('a', class_='user')
for a in alist:
self.story.addToList('authorId',a['href'].split('/user')[-1])
self.story.addToList('authorUrl','https://'+self.host+a['href'])
self.story.addToList('author',stripHTML(a))
# can be deliberately anonymous.
if not alist:
self.story.setMetadata('author','Anonymous')
self.story.setMetadata('authorUrl','https://'+self.host)
self.story.setMetadata('authorId','0')
# translator(s) in different strings
content = get_meta_content(u'Переводчикк?и?')
if content:
for a in content.find_all('a', class_='user'):
self.story.addToList('translatorsId',a['href'].split('/user')[-1])
self.story.addToList('translatorsUrl','https://'+self.host+a['href'])
self.story.addToList('translators',stripHTML(a))
# If there are translators, but no authors, copy translators to authors.
if self.story.getList('translators') and not self.story.getList('author'):
self.story.extendList('authorId',self.story.getList('translatorsId'))
self.story.extendList('authorUrl',self.story.getList('translatorsUrl'))
self.story.extendList('author',self.story.getList('translators'))
# beta(s)
content = get_meta_content(u'Бета')
if content:
for a in content.find_all('a', class_='user'):
self.story.addToList('betasId',a['href'].split('/user')[-1])
self.story.addToList('betasUrl','https://'+self.host+a['href'])
self.story.addToList('betas',stripHTML(a))
content = get_meta_content(u'Фандом')
self.story.extendList('fandoms', [ stripHTML(a) for a in
fichead.find_all('a',href=re.compile(r'/fandom\d+$')) ] )
## 'Characters' header has both ships and chars lists
content = get_meta_content(u'Персонажи')
if content:
self.story.extendList('ships', [ stripHTML(a) for a in
content.find_all('a',href=re.compile(r'/paring\d+_\d+$')) ] )
for ship in self.story.getList('ships'):
self.story.extendList('characters', ship.split('/'))
self.story.extendList('characters', [ stripHTML(a) for a in
content.find_all('a',href=re.compile(r'/character\d+$')) ] )
self.story.extendList('genre',stripHTML(get_meta_content(u'Жанр')).split(', '))
## fanfics.me includes 'AU' and 'OOC' as warnings...
content = get_meta_content(u'Предупреждение')
if content:
self.story.extendList('warnings',stripHTML(content).split(', '))
content = get_meta_content(u'События')
if content:
self.story.extendList('events', [ stripHTML(a) for a in
content.find_all('a',href=re.compile(r'/find\?keyword=\d+$')) ] )
## Original work block
content = get_meta_content(u'Оригинал')
if content:
# only going to record URL.
titletd = content.find('td',string=u'Ссылка:')
self.story.setMetadata('originUrl',stripHTML(titletd.find_next('td')))
## size block, only saving word count.
content = get_meta_content(u'Размер')
words = stripHTML(content.find('a'))
words = re.sub(r'[^0-9]','',words) # only keep numbers
self.story.setMetadata('numWords',words)
## status by color code
statuscolors = {'red':'In-Progress',
'green':'Completed',
'blue':'Hiatus'}
content = get_meta_content(u'Статус')
self.story.setMetadata('status',statuscolors[content.span['class'][0]])
# desc
self.setDescription(url,soup.find('div',id='summary_'+self.story.getMetadata('storyId')))
# cover
div = fichead.find('div',class_='FicHead_cover')
if div:
# get the larger version.
self.setCoverImage(self.url,div.img['src'].replace('_200_300',''))
# dates
# <span class="DateUpdate" title="Опубликовано 22.04.2020, изменено 22.04.2020">22.04.2020 - 22.04.2020</span>
datespan = soup.find('span',class_='DateUpdate')
dates = stripHTML(datespan).split(" - ")
self.story.setMetadata('datePublished', makeDate(dates[0], self.dateformat))
self.story.setMetadata('dateUpdated', makeDate(dates[1], self.dateformat))
# series
seriesdiv = soup.find('div',id='fic_info_content_serie')
if seriesdiv:
seriesa = seriesdiv.find('a', href=re.compile(r'/serie\d+$'))
i=1
for a in seriesdiv.find_all('a', href=re.compile(r'/fic\d+$')):
if a['href'] == ('/fic'+self.story.getMetadata('storyId')):
self.setSeries(stripHTML(seriesa), i)
self.story.setMetadata('seriesUrl','https://'+self.host+seriesa['href'])
break
i+=1
chapteruls = soup.find_all('ul',class_='FicContents')
if chapteruls:
for ul in chapteruls:
# logger.debug(ul.prettify())
for chapter in ul.find_all('li'):
a = chapter.find('a')
# logger.debug(a.prettify())
if a and a.has_attr('href'):
# logger.debug(chapter.prettify())
self.add_chapter(stripHTML(a),'https://' + self.getSiteDomain() + a['href'])
else:
self.add_chapter(self.story.getMetadata('title'),
'https://' + self.getSiteDomain() +
'/read.php?id='+self.story.getMetadata('storyId')+'&chapter=0')
return
# grab the text for an individual chapter.
def getChapterTextNum(self, url, index):
logger.debug('Getting chapter text for: %s index: %s' % (url,index))
m = re.match(r'.*&chapter=(\d+).*',url)
if m:
index=m.group(1)
logger.debug("Using index(%s) from &chapter="%index)
chapter_div = None
if self.use_full_work_soup and self.getConfig("use_view_full_work",True) and self.num_chapters() > 1:
logger.debug("USE view_full_work")
## Assumed view_adult=true was cookied during metadata
if not self.full_work_soup:
self.full_work_soup = self.make_soup(self.get_request(
'https://' + self.getSiteDomain() + '/read.php?id='+self.story.getMetadata('storyId')))
whole_dl_soup = self.full_work_soup
chapter_div = whole_dl_soup.find('div',{'id':'c%s'%(index)})
if not chapter_div:
self.use_full_work_soup = False
logger.warning("c%s not found in view_full_work--ending use_view_full_work"%(index))
if chapter_div == None:
whole_dl_soup = self.make_soup(self.get_request(url))
chapter_div = whole_dl_soup.find('div',{'id':'c%s'%(index)})
if None == chapter_div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,chapter_div)


@@ -0,0 +1,224 @@
# -*- coding: utf-8 -*-
# Copyright 2013 Fanficdownloader team, 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return FanfictalkComAdapter
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class FanfictalkComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('https://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ahpfftc')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d %b %Y"
@classmethod
def getAcceptDomains(cls):
return [cls.getSiteDomain(),'archive.hpfanfictalk.com','fanfictalk.com']
@classmethod
def getConfigSections(cls):
"Only needs to be overriden if has additional ini sections."
return [cls.getConfigSection(),'archive.hpfanfictalk.com','fanfictalk.com']
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'archive.fanfictalk.com'
@classmethod
def getSiteExampleURLs(cls):
return "https://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return r"https?://("+r"|".join([x.replace('.',r'\.') for x in self.getAcceptDomains()])+r")(/archive)?/viewstory\.php\?sid=\d+$"
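getSiteURLPattern() here ORs together every accepted domain, escaping the dots, and allows an optional /archive path segment. A sketch of what the resulting regex accepts and rejects:

```python
import re

domains = ['archive.fanfictalk.com', 'archive.hpfanfictalk.com', 'fanfictalk.com']
pattern = (r"https?://(" + r"|".join(d.replace('.', r'\.') for d in domains)
           + r")(/archive)?/viewstory\.php\?sid=\d+$")

assert re.match(pattern, "https://archive.fanfictalk.com/viewstory.php?sid=1234")
assert re.match(pattern, "http://fanfictalk.com/archive/viewstory.php?sid=99")
# $ anchors the pattern: trailing query parameters are rejected.
assert not re.match(pattern, "https://fanfictalk.com/viewstory.php?sid=12&index=1")
```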
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&ageconsent=ok&warning=3"
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
data = self.get_request(url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
## Title and author
soup = self.make_soup(data)
# logger.debug(soup)
pagetitle = soup.select_one('div#pagetitle')
# logger.debug(pagetitle)
## Title
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
for a in pagetitle.find_all('a', href=re.compile(r"viewuser.php\?uid=\d+")):
self.story.addToList('authorId',a['href'].split('=')[1])
self.story.addToList('authorUrl','https://'+self.host+'/'+a['href'])
self.story.addToList('author',stripHTML(a))
# Find the chapters:
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.add_chapter(chapter,'https://'+self.host+'/'+chapter['href'])
# categories
for a in soup.select("div#sort a"):
self.story.addToList('category',stripHTML(a))
# this site has two divs with class=gb-50 and no immediate container.
gb50s = soup.find_all('div', {'class':'gb-50'})
def list_from_urls(parent, regex, metadata):
urls = parent.find_all('a',href=re.compile(regex))
for url in urls:
self.story.addToList(metadata,stripHTML(url))
list_from_urls(gb50s[0],r'browse.php\?type=characters','characters')
list_from_urls(gb50s[0],r'browse.php\?type=class&type_id=11','ships')
list_from_urls(gb50s[0],r'browse.php\?type=class&type_id=10','representation')
list_from_urls(gb50s[0],r'browse.php\?type=class&type_id=7','storytype')
list_from_urls(gb50s[0],r'browse.php\?type=class&type_id=14','house')
list_from_urls(gb50s[1],r'browse.php\?type=class&type_id=8','warnings')
list_from_urls(gb50s[1],r'browse.php\?type=class&type_id=15','contentwarnings')
list_from_urls(gb50s[1],r'browse.php\?type=class&type_id=4','genre')
list_from_urls(gb50s[1],r'browse.php\?type=class&type_id=13','tropes')
bq = soup.find('blockquote2')
if bq:
# blockquote2??? Whatever. But we're changing it to a real tag.
bq.name='div'
self.setDescription(url,bq)
# usually use something more precise for label search, but
# site doesn't group much.
labels = soup.find_all('b')
for labelspan in labels:
# logger.debug(labelspan)
value = labelspan.nextSibling
label = stripHTML(labelspan)
# logger.debug(value)
# logger.debug(label)
if 'Words:' in label:
self.story.setMetadata('numWords', stripHTML(value).replace('·',''))
if 'Published:' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value).replace('·',''), self.dateformat))
if 'Updated:' in label:
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value).replace('·',''), self.dateformat))
# Site allows stories to be in several series at once. FFF
# isn't thrilled with that, so we have series00, series01, etc.
# Example:
# https://archive.fanfictalk.com/viewstory.php?sid=483
if self.getConfig("collect_series"):
seriesspan = soup.find('span',label='Series')
for i, seriesa in enumerate(seriesspan.find_all('a', href=re.compile(r"viewseries\.php\?seriesid=\d+"))):
# logger.debug(seriesa)
series_name = stripHTML(seriesa)
series_url = 'https://'+self.host+'/'+seriesa['href']
seriessoup = self.make_soup(self.get_request(series_url))
storyas = seriessoup.find_all('a', href=re.compile(r'viewstory.php\?sid=\d+'))
# logger.debug(storyas)
j=1
found = False
for storya in storyas:
# logger.debug(storya)
## allow for JS links.
if ('viewstory.php?sid='+self.story.getMetadata('storyId')) in storya['href']:
found = True
break
j+=1
if found:
series_index = j
self.story.setMetadata('series%02d'%i,"%s [%s]"%(series_name,series_index))
self.story.setMetadata('series%02dUrl'%i,series_url)
if i == 0:
self.setSeries(series_name, series_index)
self.story.setMetadata('seriesUrl',series_url)
else:
logger.debug("Story URL not found in series (%s) page, not including."%series_url)
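The series handling above walks the story links on the series page and records the 1-based position of the current story. A minimal sketch of that index search (the function name and href list are illustrative, not FanFicFare API; like the adapter code, it uses substring matching so JS-wrapped links still count):

```python
# Hedged sketch of the series-index search above: return the 1-based
# position of this story's sid among the series page's story links,
# or None if the story isn't listed (the adapter then logs and skips).
def series_index(story_sid, hrefs):
    for j, href in enumerate(hrefs, start=1):
        # substring match, as in the adapter, to allow for JS links
        if ('viewstory.php?sid=' + story_sid) in href:
            return j
    return None
```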
# grab the text for an individual chapter.
def getChapterText(self, url):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&ageconsent=ok&warning=3"
else:
addurl=""
logger.debug('Getting chapter text from: %s' % (url+addurl))
soup = self.make_soup(self.get_request(url+addurl))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
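getSiteURLPattern() above builds one regex from all accepted domains by escaping the dots and joining the alternatives with `|`. A standalone sketch of that construction (the domain list is illustrative):

```python
import re

# Hedged sketch of getSiteURLPattern() above: escape dots in each
# accepted domain and join the alternatives into one anchored pattern.
def build_story_url_pattern(domains):
    alts = "|".join(d.replace('.', r'\.') for d in domains)
    return r"https?://(" + alts + r")(/archive)?/viewstory\.php\?sid=\d+$"

PATTERN = build_story_url_pattern(['archive.fanfictalk.com', 'fanfictalk.com'])
assert re.match(PATTERN, 'https://archive.fanfictalk.com/viewstory.php?sid=1234')
```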


@@ -1,290 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# By virtue of being recent and requiring both is_adult and user/pass,
# adapter_fanficcastletvnet.py is the best choice for learning to
# write adapters--especially for sites that use the eFiction system.
# Most sites that have ".../viewstory.php?sid=123" in the story URL
# are eFiction.
# For non-eFiction sites, it can be considerably more complex, but
# this is still a good starting point.
# In general an 'adapter' needs to do these five things:
# - 'Register' correctly with the downloader
# - Site Login (if needed)
# - 'Are you adult?' check (if needed--some do one, some the other, some both)
# - Grab the chapter list
# - Grab the story meta-data (some (non-eFiction) adapters have to get it from the author page)
# - Grab the chapter texts
# Search for XXX comments--that's where things are most likely to need changing.
# This function is called by the downloader in all adapter_*.py files
# in this dir to register the adapter class. So it needs to be
# updated to reflect the class below it. That, plus getSiteDomain()
# take care of 'Registering'.
def getClass():
return FanfictionJunkiesDeAdapter # XXX
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class FanfictionJunkiesDeAdapter(BaseSiteAdapter): # XXX
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
# XXX Most sites don't have the /fanfic part. Replace all to remove it usually.
self._setURL('http://' + self.getSiteDomain() + '/efiction/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ffjde') # XXX
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d/%m/%y" # XXX
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'fanfiction-junkies.de' # XXX
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/efiction/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/efiction/viewstory.php?sid=")+r"\d+$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/efiction/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if self.is_adult or self.getConfig("is_adult"):
# Weirdly, different sites use different warning numbers.
# If the title search below fails, there's a good chance
# you need a different number. print data at that point
# and see what the 'click here to continue' url says.
addurl = "&ageconsent=ok&warning=1" # XXX
else:
addurl=""
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+'&index=1'+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
# The actual text that is used to announce you need to be an
# adult varies from site to site. Again, print data before
# the title search to troubleshoot.
if "For adults only " in data: # XXX
raise exceptions.AdultCheckRequired(self.url)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
pagetitle = soup.find('h4')
## Title
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',a.string)
# Find authorid and URL from... author url.
a = pagetitle.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/efiction/'+a['href'])
self.story.setMetadata('author',a.string)
# Reviews
reviewdata = soup.find('div', {'id' : 'sort'})
a = reviewdata.findAll('a', href=re.compile(r'reviews.php\?type=ST&(amp;)?item='+self.story.getMetadata('storyId')+"$"))[1] # second one.
self.story.setMetadata('reviews',stripHTML(a))
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/efiction/'+chapter['href']+addurl))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their meta data
# formating, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
list = soup.find('div', {'class':'listbox'})
labels = list.findAll('b')
for labelspan in labels:
value = labelspan.nextSibling
label = labelspan.string
if 'Zusammenfassung' in label:
self.setDescription(url,value)
if 'Eingestuft' in label:
self.story.setMetadata('rating', value)
if u'Wörter' in label:
self.story.setMetadata('numWords', value)
if 'Kategorie' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Charaktere' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Abgeschlossen' in label:
if 'Yes' in value:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if u'Veröffentlicht' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Aktualisiert' in label:
# there's a stray [ at the end.
#value = value[0:-1]
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/efiction/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
i=1
for a in storyas:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2016 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,21 +15,31 @@
# limitations under the License.
#
from __future__ import absolute_import
from datetime import datetime
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from urllib import unquote_plus
# py2 vs py3 transition
from ..six import text_type as unicode
from ..six.moves.urllib.parse import urlparse
from .. import exceptions as exceptions
from ..htmlcleanup import stripHTML
from base_adapter import BaseSiteAdapter, makeDate
from .base_adapter import BaseSiteAdapter
ffnetgenres=["Adventure", "Angst", "Crime", "Drama", "Family", "Fantasy", "Friendship", "General",
"Horror", "Humor", "Hurt-Comfort", "Mystery", "Parody", "Poetry", "Romance", "Sci-Fi",
"Spiritual", "Supernatural", "Suspense", "Tragedy", "Western"]
ffnetgenres=["Adventure", "Angst", "Crime", "Drama", "Family", "Fantasy",
"Friendship", "General", "Horror", "Humor", "Hurt-Comfort",
"Mystery", "Parody", "Poetry", "Romance", "Sci-Fi", "Spiritual",
"Supernatural", "Suspense", "Tragedy", "Western"]
ffnetpluscategories=["+Anima", "Alex + Ada", "Rosario + Vampire", "Blood+",
"+C: Sword and Cornett", "Norn9 - ノルン+ノネット",
"Haré+Guu/ジャングルはいつもハレのちグゥ", "Lost+Brain",
"Wicked + The Divine", "Alex + Ada", "RE: Alistair++",
"Tristan + Isolde"]
class FanFictionNetSiteAdapter(BaseSiteAdapter):
@@ -37,27 +47,13 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev','ffnet')
# get storyId from url--url validation guarantees second part is storyId
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
self.set_story_idurl(url)
# normalized story URL.
self._setURL("https://"+self.getSiteDomain()\
+"/s/"+self.story.getMetadata('storyId')+"/1/")
# ffnet update emails have the latest chapter URL.
# Frequently, when they arrive, not all the servers have the
# latest chapter yet and going back to chapter 1 to pull the
# chapter list doesn't get the latest. So save and use the
# original URL given to pull chapter list & metadata.
# Not used by plugin because URL gets normalized first for
# eliminating duplicate story urls.
self.origurl = url
if "https://m." in self.origurl:
## accept m(mobile)url, but use www.
self.origurl = self.origurl.replace("https://m.","https://www.")
self.opener.addheaders.append(('Referer',self.origurl))
@staticmethod
def getSiteDomain():
return 'www.fanfiction.net'
@@ -70,24 +66,74 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
def getSiteExampleURLs(cls):
return "https://www.fanfiction.net/s/1234/1/ https://www.fanfiction.net/s/1234/12/ http://www.fanfiction.net/s/1234/1/Story_Title http://m.fanfiction.net/s/1234/1/"
def set_story_idurl(self,url):
parsedUrl = urlparse(url)
pathparts = parsedUrl.path.split('/',)
self.story.setMetadata('storyId',pathparts[2])
self.urltitle='' if len(pathparts)<5 else pathparts[4]
# normalized story URL.
self._setURL("https://"+self.getSiteDomain()\
+"/s/"+self.story.getMetadata('storyId')+"/1/"+self.urltitle)
## here so getSiteURLPattern and get_section_url(class method) can
## both use it. Note adapter_fictionpresscom has one too.
@classmethod
def _get_site_url_pattern(cls):
return r"https?://(www|m)?\.fanfiction\.net/s/(?P<id>\d+)(/\d+)?(/(?P<title>[^/]+))?/?$"
@classmethod
def get_section_url(cls,url):
## minimal URL used for section names in INI and reject list
## for comparison
# logger.debug("pre--url:%s"%url)
m = re.match(cls._get_site_url_pattern(),url)
if m:
url = "https://"+cls.getSiteDomain()\
+"/s/"+m.group('id')+"/1/"
# logger.debug("post-url:%s"%url)
return url
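get_section_url() above collapses any story URL variant down to a minimal /s/&lt;id&gt;/1/ form for use in INI section names and the reject list. A standalone sketch using the same pattern:

```python
import re

# Hedged sketch of get_section_url() above: drop the chapter number and
# title slug, keeping only the story id.
PATTERN = r"https?://(www|m)?\.fanfiction\.net/s/(?P<id>\d+)(/\d+)?(/(?P<title>[^/]+))?/?$"

def section_url(url):
    m = re.match(PATTERN, url)
    if m:
        return "https://www.fanfiction.net/s/" + m.group('id') + "/1/"
    return url  # non-matching URLs pass through unchanged
```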
@classmethod
def get_url_search(cls,url):
regexp = super(getClass(), cls).get_url_search(url)
regexp = re.sub(r"^(?P<keep>.*net/s/\d+/\d+/)(?P<urltitle>[^\$]*)?",
r"\g<keep>(.*)",regexp)
logger.debug(regexp)
return regexp
def getSiteURLPattern(self):
return r"https?://(www|m)?\.fanfiction\.net/s/\d+(/\d+)?(/|/[^/]+)?/?$"
return self._get_site_url_pattern()
def _fetchUrl(self,url,parameters=None,extrasleep=1.0,usecache=True):
## ffnet(and, I assume, fpcom) tends to fail more if hit too
## fast. This is in addition to whatever the
## slow_down_sleep_time setting is.
return BaseSiteAdapter._fetchUrl(self,url,
parameters=parameters,
extrasleep=extrasleep,
usecache=usecache)
## normalized chapter URLs DO contain the story title now, but
## normalized to current urltitle in case of title changes.
def normalize_chapterurl(self,url):
return re.sub(r"https?://(www|m)\.(?P<keep>fanfiction\.net/s/\d+/\d+/).*",
r"https://www.\g<keep>",url)+self.urltitle
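normalize_chapterurl() above forces the https://www. host and re-appends the current title slug, so chapter URLs stay stable even when the story title changes. A standalone sketch (the urltitle argument stands in for self.urltitle):

```python
import re

# Hedged sketch of normalize_chapterurl() above: keep the story/chapter
# path, force the https://www. host, and append the current title slug.
def normalize_chapterurl(url, urltitle):
    return re.sub(r"https?://(www|m)\.(?P<keep>fanfiction\.net/s/\d+/\d+/).*",
                  r"https://www.\g<keep>", url) + urltitle
```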
def use_pagecache(self):
'''
adapters that will work with the page cache need to implement
this and change it to True.
'''
return True
def get_request(self,url,usecache=True):
## use super version if not set or isn't a chapter URL with a
## title.
if( not self.getConfig("try_shortened_title_urls") or
not re.match(r"https?://www\.fanfiction\.net/s/\d+/\d+/(?P<title>[^/]+)$", url) ):
return super(getClass(), self).get_request(url,usecache)
## kludgey way to attempt more than one URL variant by
## removing title one letter at a time. Note that network and
## open_pages_in_browser retries still happen first.
titlelen = len(url.split('/')[-1])
maxcut = min([4,titlelen])
j = 0
while j < maxcut: # should actually leave loop either by
# return or exception raise.
try:
useurl = url
if j: # j==0, full URL, then remove letters.
useurl = url[:-j]
return super(getClass(), self).get_request(useurl,usecache)
except exceptions.HTTPErrorFFF as fffe:
if j >= maxcut or 'Page not found or expired' not in unicode(fffe):
raise
j = j+1
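The loop above retries a chapter fetch with a progressively shortened title slug to cope with stale title URLs. A generic sketch of the same idea (the `fetch` callable and error text are stand-ins for FanFicFare's get_request and HTTPErrorFFF):

```python
# Hedged sketch of the title-shortening retry above: try the full URL,
# then trim up to three trailing letters off the title slug, re-raising
# anything that isn't a "not found" style error. `fetch` and the error
# text stand in for FanFicFare's get_request and HTTPErrorFFF.
def fetch_with_shortened_title(url, fetch, maxtrim=3):
    maxtrim = max(0, min(maxtrim, len(url.split('/')[-1]) - 1))
    last_exc = None
    for j in range(maxtrim + 1):  # j == 0 tries the unmodified URL
        try:
            return fetch(url[:-j] if j else url)
        except IOError as exc:
            if 'Page not found or expired' not in str(exc):
                raise
            last_exc = exc
    raise last_exc
```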
def doExtractChapterUrlsAndMetadata(self,get_cover=True):
@@ -97,52 +143,60 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
url = self.origurl
logger.debug("URL: "+url)
# use BeautifulSoup HTML parser to make everything easier to find.
try:
data = self._fetchUrl(url)
#logger.debug("\n===================\n%s\n===================\n"%data)
soup = self.make_soup(data)
except urllib2.HTTPError as e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(url)
else:
raise e
data = self.get_request(url)
#logger.debug("\n===================\n%s\n===================\n"%data)
soup = self.make_soup(data)
if "Unable to locate story" in data:
if "Unable to locate story" in data or "Story Not Found" in data:
raise exceptions.StoryDoesNotExist(url)
# sometimes "Chapter not found...", sometimes "Chapter text not found..."
if "not found. Please check to see you are not using an outdated url." in data:
# sometimes "Chapter not found...", sometimes "Chapter text
# not found..." or "Story does not have any chapters"
if "Please check to see you are not using an outdated url." in data:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! 'Chapter not found. Please check to see you are not using an outdated url.'" % url)
if "Category for this story has been disabled" in data:
raise exceptions.FailedToDownload("FanFiction.Net has removed the category for this story and will no longer serve it.")
# <link rel="canonical" href="//www.fanfiction.net/s/13551154/100/Haze-Gray">
canonicalurl = soup.select_one('link[rel=canonical]')['href']
self.set_story_idurl(canonicalurl)
## ffnet used to have a tendency to send out update notices in
## email before all their servers were showing the update on
## the first chapter. It generates another server request and
## doesn't seem to be needed lately, so now default it to off.
try:
chapcount = len(soup.find('select', { 'name' : 'chapter' } ).find_all('option'))
# get chapter part of url.
except:
chapcount = 1
have_later_meta = False
if self.getConfig('check_next_chapter'):
try:
## ffnet used to have a tendency to send out update
## notices in email before all their servers were
## showing the update on the first chapter. It
## generates another server request and doesn't seem
## to be needed lately, so now default it to off.
try:
chapcount = len(soup.find('select', { 'name' : 'chapter' } ).findAll('option'))
# get chapter part of url.
except:
chapcount = 1
chapter = url.split('/',)[5]
tryurl = "https://%s/s/%s/%d/"%(self.getSiteDomain(),
self.story.getMetadata('storyId'),
chapcount+1)
tryurl = "https://%s/s/%s/%d/%s"%(self.getSiteDomain(),
self.story.getMetadata('storyId'),
chapcount+1,
self.urltitle)
logger.debug('=Trying newer chapter: %s' % tryurl)
newdata = self._fetchUrl(tryurl)
newdata = self.get_request(tryurl)
if "not found. Please check to see you are not using an outdated url." not in newdata \
and "This request takes too long to process, it is timed out by the server." not in newdata:
logger.debug('=======Found newer chapter: %s' % tryurl)
soup = self.make_soup(newdata)
except urllib2.HTTPError as e:
if e.code == 503:
raise e
except e:
logger.warn("Caught an exception reading URL: %s sleeptime(%s) Exception %s."%(unicode(url),sleeptime,unicode(e)))
pass
have_later_meta = True
except Exception as e:
logger.warning("Caught exception in check_next_chapter URL: %s Exception %s."%(unicode(tryurl),unicode(e)))
if self.getConfig('meta_from_last_chapter') and not have_later_meta and chapcount > 1:
tryurl = "https://%s/s/%s/%d/%s"%(self.getSiteDomain(),
self.story.getMetadata('storyId'),
chapcount,
self.urltitle)
logger.debug('=Trying last chapter for meta_from_last_chapter: %s' % tryurl)
newdata = self.get_request(tryurl)
soup = self.make_soup(newdata)
have_later_meta = True
# Find authorid and URL from... author url.
a = soup.find('a', href=re.compile(r"^/u/\d+"))
@@ -157,8 +211,8 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
## 2) cat1_cat2_Crossover
## For 1, use the second link.
## For 2, fetch the crossover page and pull the two categories from there.
categories = soup.find('div',{'id':'pre_story_links'}).findAll('a',{'class':'xcontrast_txt'})
pre_links = soup.find('div',{'id':'pre_story_links'})
categories = pre_links.find_all('a',{'class':'xcontrast_txt'})
#print("xcontrast_txt a:%s"%categories)
if len(categories) > 1:
# Strangely, the ones with *two* links are the
@@ -166,20 +220,17 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
# of Book, Movie, etc.
self.story.addToList('category',stripHTML(categories[1]))
elif 'Crossover' in categories[0]['href']:
caturl = "https://%s%s"%(self.getSiteDomain(),categories[0]['href'])
catsoup = self.make_soup(self._fetchUrl(caturl))
found = False
for a in catsoup.findAll('a',href=re.compile(r"^/crossovers/.+?/\d+/")):
self.story.addToList('category',stripHTML(a))
found = True
if not found:
# Fall back. I ran across a story with a Crossover
# category link to a broken page once.
# http://www.fanfiction.net/s/2622060/1/
# Naruto + Harry Potter Crossover
logger.info("Fall back category collection")
for c in stripHTML(categories[0]).replace(" Crossover","").split(' + '):
self.story.addToList('category',c)
## turns out there's only a handful of ffnet category's
## with '+' in. Keep a list and look for them
## specifically instead of looking up the crossover page.
crossover_cat = stripHTML(categories[0]).replace(" Crossover","")
for pluscat in ffnetpluscategories:
if pluscat in crossover_cat:
self.story.addToList('category',pluscat)
crossover_cat = crossover_cat.replace(pluscat,'')
for cat in crossover_cat.split(' + '):
if cat:
self.story.addToList('category',cat)
a = soup.find('a', href=re.compile(r'https?://www\.fictionratings\.com/'))
rating = a.string
@@ -200,7 +251,7 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
grayspan = gui_table1i.find('span', {'class':'xgray xcontrast_txt'})
# for b in grayspan.findAll('button'):
# for b in grayspan.find_all('button'):
# b.extract()
metatext = stripHTML(grayspan).replace('Hurt/Comfort','Hurt-Comfort')
#logger.debug("metatext:(%s)"%metatext)
@@ -210,7 +261,8 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
else:
self.story.setMetadata('status', 'In-Progress')
metalist = metatext.split(" - ")
## Newer BS libraries are discarding whitespace after tags now. :-/
metalist = re.split(" ?- ",metatext)
#logger.debug("metalist:(%s)"%metalist)
# Rated: Fiction K - English - Words: 158,078 - Published: 02-04-11
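The switch to re.split(" ?- ", metatext) above exists because newer BeautifulSoup releases drop the whitespace after tags, so the separator can arrive as either " - " or "- ". A quick illustration using the example metadata line from the comment:

```python
import re

# The " ?- " pattern tolerates both separator spellings without splitting
# on the hyphens inside dates like 02-04-11 (no space follows those).
metatext = "Rated: Fiction K - English - Words: 158,078 - Published: 02-04-11"
metalist = re.split(" ?- ", metatext)
# metalist == ['Rated: Fiction K', 'English', 'Words: 158,078', 'Published: 02-04-11']
```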
@@ -238,7 +290,7 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
# Updated: <span data-xutime='1368059198'>5/8</span> - Published: <span data-xutime='1278984264'>7/12/2010</span>
# Published: <span data-xutime='1384358726'>8m ago</span>
dates = soup.findAll('span',{'data-xutime':re.compile(r'^\d+$')})
dates = soup.find_all('span',{'data-xutime':re.compile(r'^\d+$')})
if len(dates) > 1 :
# updated get set to the same as published upstream if not found.
self.story.setMetadata('dateUpdated',datetime.fromtimestamp(float(dates[0]['data-xutime'])))
@@ -286,42 +338,51 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
# Try the larger image first.
cover_url = ""
try:
img = soup.select('img.lazy.cimage')
cover_url=img[0]['data-original']
img = soup.select_one('img.lazy.cimage')
cover_url=img['data-original']
except:
img = soup.select('img.cimage')
if img:
cover_url=img[0]['src']
## Nov 2023 - src is always "/static/images/d_60_90.jpg" now
## Only take cover if there's data-original
## Primary motivator is to prevent unneeded author page hits.
pass
logger.debug("cover_url:%s"%cover_url)
authimg_url = ""
if cover_url and self.getConfig('skip_author_cover'):
authsoup = self.make_soup(self._fetchUrl(self.story.getMetadata('authorUrl')))
if cover_url and self.getConfig('skip_author_cover') and self.getConfig('include_images'):
try:
img = authsoup.select('img.lazy.cimage')
authimg_url=img[0]['data-original']
except:
img = authsoup.select('img.cimage')
if img:
authimg_url=img[0]['src']
authsoup = self.make_soup(self.get_request(self.story.getMetadata('authorUrl')))
try:
img = authsoup.select_one('img.lazy.cimage')
authimg_url=img['data-original']
except:
img = authsoup.select_one('img.cimage')
if img:
authimg_url=img['src']
logger.debug("authimg_url:%s"%authimg_url)
logger.debug("authimg_url:%s"%authimg_url)
## ffnet uses different sizes on auth & story pages, but same id.
## //ffcdn2012t-fictionpressllc.netdna-ssl.com/image/1936929/150/
## //ffcdn2012t-fictionpressllc.netdna-ssl.com/image/1936929/180/
try:
cover_id = cover_url.split('/')[4]
except:
cover_id = None
try:
authimg_id = authimg_url.split('/')[4]
except:
authimg_id = None
## ffnet uses different sizes on auth & story pages, but same id.
## Old URLs:
## //ffcdn2012t-fictionpressllc.netdna-ssl.com/image/1936929/150/
## //ffcdn2012t-fictionpressllc.netdna-ssl.com/image/1936929/180/
## After Dec 2020 ffnet changes:
## /image/6472517/180/
## /image/6472517/150/
try:
cover_id = cover_url.split('/')[-3]
except:
cover_id = None
try:
authimg_id = authimg_url.split('/')[-3]
except:
authimg_id = None
## don't use cover if it matches the auth image.
if cover_id and authimg_id and cover_id == authimg_id:
cover_url = None
## don't use cover if it matches the auth image.
if cover_id and authimg_id and cover_id == authimg_id:
logger.debug("skip_author_cover: cover_url matches authimg_url: don't use")
cover_url = None
except Exception as e:
logger.warning("Caught exception in skip_author_cover: %s."%unicode(e))
if cover_url:
self.setCoverImage(url,cover_url)
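The cover/author-image comparison above keys on the image id, which is the third-from-last path segment in both the old CDN URLs and the post-Dec-2020 /image/&lt;id&gt;/&lt;size&gt;/ URLs. A standalone sketch of that extraction (the helper name is illustrative):

```python
# Hedged sketch of the image-id extraction above: both URL generations
# put the id third from the end once the URL is split on '/'.
def image_id(url):
    try:
        return url.split('/')[-3]
    except IndexError:
        return None  # too few path segments to contain an id
```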
@@ -331,35 +392,40 @@ class FanFictionNetSiteAdapter(BaseSiteAdapter):
select = soup.find('select', { 'name' : 'chapter' } )
if select is None:
# no selector found, so it's a one-chapter story.
self.chapterUrls.append((self.story.getMetadata('title'),url))
# no selector found, so it's a one-chapter story.
self.add_chapter(self.story.getMetadata('title'),url)
else:
allOptions = select.findAll('option')
allOptions = select.find_all('option')
for o in allOptions:
url = u'https://%s/s/%s/%s/' % ( self.getSiteDomain(),
self.story.getMetadata('storyId'),
o['value'])
## title URL will be put back on chapter URL during
## normalize_chapterurl() anyway, but also here for
## clarity
url = u'https://%s/s/%s/%s/%s' % ( self.getSiteDomain(),
self.story.getMetadata('storyId'),
o['value'],
self.urltitle)
# just in case there's tags, like <i> in chapter titles.
title = u"%s" % o
title = re.sub(r'<[^>]+>','',title)
self.chapterUrls.append((title,url))
self.story.setMetadata('numChapters',len(self.chapterUrls))
self.add_chapter(title,url)
return
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
## ffnet(and, I assume, fpcom) tends to fail more if hit too
## fast. This is in addition to whatever the
## slow_down_sleep_time setting is.
data = self._fetchUrl(url,extrasleep=4.0)
logger.debug('Getting chapter text from: %s' % (url))
if "Please email this error message in full to <a href='mailto:support@fanfiction.com'>support@fanfiction.com</a>" in data:
## title URL was put back on chapter URL during
## normalize_chapterurl()
data = self.get_request(url)
if "Please email this error message in full to <a href='mailto:" in data:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! FanFiction.net Site Error!" % url)
soup = self.make_soup(data)
## remove inline ads -- only seen with flaresolverr
for adtag in soup.select("div.google-auto-placed"):
adtag.decompose()
div = soup.find('div', {'id' : 'storytextp'})
if None == div:


@@ -0,0 +1,157 @@
# -*- coding: utf-8 -*-
# Copyright 2024 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
import io
import logging
import re
import zipfile
from bs4 import BeautifulSoup
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
from fanficfare.htmlcleanup import stripHTML
from .. import exceptions as exceptions
logger = logging.getLogger(__name__)
def getClass():
return FanfictionsFrSiteAdapter
class FanfictionsFrSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev', 'fanfictionsfr')
self.story.setMetadata('langcode','fr')
self.story.setMetadata('language','Français')
# get storyId from url--url validation guarantees query correct
match = re.match(self.getSiteURLPattern(), url)
if not match:
raise exceptions.InvalidStoryURL(url, self.getSiteDomain(), self.getSiteExampleURLs())
story_id = match.group('id')
self.story.setMetadata('storyId', story_id)
fandom_name = match.group('fandom')
self._setURL('https://%s/fanfictions/%s/%s/chapters.html' % (self.getSiteDomain(), fandom_name, story_id))
@staticmethod
def getSiteDomain():
return 'www.fanfictions.fr'
@classmethod
def getSiteExampleURLs(cls):
return 'https://%s/fanfictions/fandom/fanfiction-id/chapters.html' % cls.getSiteDomain()
def getSiteURLPattern(self):
return r'https?://(?:www\.)?fanfictions\.fr/fanfictions/(?P<fandom>[^/]+)/(?P<id>[^/]+)(/chapters.html)?'
def extractChapterUrlsAndMetadata(self):
logger.debug('URL: %s', self.url)
data = self.get_request(self.url)
soup = self.make_soup(data)
# detect if the fanfiction is 'suspended' (chapters unavailable)
alert_div = soup.find('div', id='alertInactiveFic')
if alert_div:
raise exceptions.FailedToDownload("Failed to download the fanfiction, most likely because it is suspended.")
title_element = soup.find('h1', itemprop='name')
self.story.setMetadata('title', stripHTML(title_element))
author_div = soup.find('div', itemprop='author')
author_name = stripHTML(author_div.a)
author_id = author_div.a['href'].split('/')[-1].replace('.html', '')
self.story.setMetadata('author', author_name)
self.story.setMetadata('authorId', author_id)
published_date_element = soup.find('span', class_='date-distance')
published_date_text = published_date_element['data-date']
published_date = makeDate(published_date_text, '%Y-%m-%d %H:%M:%S')
if published_date:
self.story.setMetadata('datePublished', published_date)
status_element = soup.find('p', title="Statut de la fanfiction").find('span', class_='badge')
french_status = stripHTML(status_element)
status_translation = {
"En cours": "In-Progress",
"Terminée": "Completed",
"One-shot": "Completed",
}
self.story.setMetadata('status', status_translation.get(french_status, french_status))
genre_elements = soup.find('div', title="Format et genres").find_all('span', class_="highlightable")
self.story.extendList('genre', [ stripHTML(genre) for genre in genre_elements[1:] ])
category_elements = soup.find_all('li', class_="breadcrumb-item")
self.story.extendList('category', [ stripHTML(category) for category in category_elements[-2].find_all('a') ])
first_description = soup.find('p', itemprop='abstract')
self.setDescription(self.url, first_description)
chapter_cards = soup.find_all(class_=['card', 'chapter'])
for chapter_card in chapter_cards:
chapter_title_tag = chapter_card.find('h2')
if chapter_title_tag:
chapter_title = stripHTML(chapter_title_tag)
chapter_link = 'https://'+self.getSiteDomain()+chapter_title_tag.find('a')['href']
# Clean up the chapter title by replacing multiple spaces and newline characters with a single space
chapter_title = re.sub(r'\s+', ' ', chapter_title)
self.add_chapter(chapter_title, chapter_link)
last_chapter_div = chapter_cards[-1]
updated_date_element = last_chapter_div.find('span', class_='date-distance')
last_chapter_update_date = updated_date_element['data-date']
date = makeDate(last_chapter_update_date, '%Y-%m-%d %H:%M:%S')
if date:
self.story.setMetadata('dateUpdated', date)
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
response, redirection_url = self.get_request_redirected(url)
if "telecharger_pdf.html" in redirection_url:
with zipfile.ZipFile(io.BytesIO(response.encode('latin1'))) as z:
# Assuming there's only one text file inside the zip
file_list = z.namelist()
if len(file_list) != 1:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Zip file should contain exactly one text file!" % url)
text_filename = file_list[0]
with z.open(text_filename) as text_file:
# Decode the text file with windows-1252 encoding
text = text_file.read().decode('windows-1252')
return text.replace("\r\n", "<br>\r\n")
else:
soup = self.make_soup(response)
div_content = soup.find('div', id='readarea')
if div_content is None:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url, div_content)
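The zip branch of `getChapterText` above can be exercised without touching the site; this sketch (the helper name and the sample archive are mine, not part of the adapter) reproduces the same `io.BytesIO` / `zipfile` / windows-1252 flow:

```python
import io
import zipfile

def read_single_text_from_zip(raw_bytes, encoding="windows-1252"):
    # The PDF-download redirect serves a zip expected to hold exactly
    # one text file; anything else is treated as a failed download.
    with zipfile.ZipFile(io.BytesIO(raw_bytes)) as z:
        names = z.namelist()
        if len(names) != 1:
            raise ValueError("zip should contain exactly one text file")
        with z.open(names[0]) as text_file:
            return text_file.read().decode(encoding)

# Build a tiny archive in memory to exercise the helper.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("chapitre.txt", "Bonjour\r\nMonde".encode("windows-1252"))
text = read_single_text_from_zip(buf.getvalue())
# Mirror the adapter's newline-to-<br> conversion.
html = text.replace("\r\n", "<br>\r\n")
```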
@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2012 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2012 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -15,19 +15,18 @@
# limitations under the License.
#
from __future__ import absolute_import
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return FanFiktionDeAdapter
@ -39,11 +38,6 @@ class FanFiktionDeAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["utf8",
"Windows-1252"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@ -53,7 +47,7 @@ class FanFiktionDeAdapter(BaseSiteAdapter):
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/s/'+self.story.getMetadata('storyId') + '/1')
self._setURL('https://' + self.getSiteDomain() + '/s/'+self.story.getMetadata('storyId') + '/1')
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ffde')
@ -69,17 +63,10 @@ class FanFiktionDeAdapter(BaseSiteAdapter):
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/s/46ccbef30000616306614050 http://"+cls.getSiteDomain()+"/s/46ccbef30000616306614050/1 http://"+cls.getSiteDomain()+"/s/46ccbef30000616306614050/1/story-name"
return "https://"+cls.getSiteDomain()+"/s/46ccbef30000616306614050 https://"+cls.getSiteDomain()+"/s/46ccbef30000616306614050/1 https://"+cls.getSiteDomain()+"/s/46ccbef30000616306614050/1/story-name"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/s/")+r"\w+(/\d+)?"
def use_pagecache(self):
'''
adapters that will work with the page cache need to implement
this and change it to True.
'''
return True
return r"https?"+re.escape("://"+self.getSiteDomain()+"/s/")+r"\w+(/\d+)?"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
@ -103,10 +90,10 @@ class FanFiktionDeAdapter(BaseSiteAdapter):
params['a'] = 'l'
params['submit'] = 'Login...'
loginUrl = 'https://ssl.fanfiktion.de/'
loginUrl = 'https://www.fanfiktion.de/'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['nickname']))
soup = self.make_soup(self._postUrl(loginUrl,params))
soup = self.make_soup(self.post_request(loginUrl,params))
if not soup.find('a', title='Logout'):
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['nickname']))
@ -121,27 +108,19 @@ class FanFiktionDeAdapter(BaseSiteAdapter):
url = self.url
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url,usecache=False)
data = self.get_request(url,usecache=False)
if "Uhr ist diese Geschichte nur nach einer" in data:
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Auserhalb der Zeit von 23:00 Uhr bis 04:00 Uhr ist diese Geschichte nur nach einer erfolgreichen Altersverifikation zuganglich.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# logger.debug(data)
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('a', href=re.compile(r'/s/'+self.story.getMetadata('storyId')+"/"))
@ -151,41 +130,69 @@ class FanFiktionDeAdapter(BaseSiteAdapter):
head = soup.find('div', {'class' : 'story-left'})
a = head.find('a')
self.story.setMetadata('authorId',a['href'].split('/')[2])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('authorUrl','https://'+self.host+'/'+a['href'])
self.story.setMetadata('author',stripHTML(a))
# Find the chapters:
for chapter in soup.find('select').findAll('option'):
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/s/'+self.story.getMetadata('storyId')+'/'+chapter['value']))
for chapter in soup.find('select').find_all('option'):
self.add_chapter(chapter,'https://'+self.host+'/s/'+self.story.getMetadata('storyId')+'/'+chapter['value'])
self.story.setMetadata('numChapters',len(self.chapterUrls))
## title="Wörter" failed with max_zalgo:1
self.story.setMetadata('numWords',stripHTML(soup.find("span",{'class':"fa-keyboard"}).parent).replace('.','')) # 1.234 = 1,234
self.story.setMetadata('language','German')
self.story.setMetadata('datePublished', makeDate(stripHTML(head.find('span',title='erstellt').parent), self.dateformat))
self.story.setMetadata('dateUpdated', makeDate(stripHTML(head.find('span',title='aktualisiert').parent), self.dateformat))
# second colspan=3 td in head.
## Genre now shares a line with rating.
genres=stripHTML(head.find('span',class_='fa-angle-right').next_sibling)
self.story.extendList('genre',genres[:genres.index('/')].split(', '))
self.story.extendList('genre',genres[:genres.index(' / ')].split(', '))
self.story.setMetadata('rating', genres[genres.index(' / ')+3:])
if head.find('span',title='Fertiggestellt'):
# self.story.addToList('category',stripHTML(soup.find('span',id='ffcbox-story-topic-1')).split('/')[2].strip())
for a in soup.find('span',id='ffcbox-story-topic-1').find_all('a',href=re.compile(r'/c/')):
cat = stripHTML(a)
if cat != 'Fanfiction':
self.story.addToList('category',cat)
for span in soup.find_all('span',class_='badge-character'):
self.story.addToList('characters',stripHTML(span))
try:
self.story.setMetadata('native_status', head.find_all('span',{'class':'titled-icon'})[3]['title'])
except Exception as e:
logger.debug("Failed to find native status:%s"%e)
if head.find('span',title='fertiggestellt'):
self.story.setMetadata('status', 'Completed')
elif head.find('span',title='pausiert'):
self.story.setMetadata('status', 'Paused')
elif head.find('span',title='abgebrochen'):
self.story.setMetadata('status', 'Cancelled')
else:
self.story.setMetadata('status', 'In-Progress')
#find metadata on the author's page
asoup = self.make_soup(self._fetchUrl("http://"+self.getSiteDomain()+"?a=q&a1=v&t=nickdetailsstories&lbi=stories&ar=0&nick="+self.story.getMetadata('authorId')))
tr=asoup.findAll('tr')
for i in range(1,len(tr)):
a = tr[i].find('a')
if '/s/'+self.story.getMetadata('storyId')+'/1/' in a['href']:
break
self.setDescription(url,a['onmouseover'].split("', '")[1])
## Get description
descdiv = soup.select_one('div#story-summary-inline div')
if descdiv:
if 'center' in descdiv['class']:
del descdiv['class']
self.setDescription(url,descdiv)
# #find metadata on the author's page
# asoup = self.make_soup(self.get_request("https://"+self.getSiteDomain()+"?a=q&a1=v&t=nickdetailsstories&lbi=stories&ar=0&nick="+self.story.getMetadata('authorId')))
# tr=asoup.find_all('tr')
# for i in range(1,len(tr)):
# a = tr[i].find('a')
# if '/s/'+self.story.getMetadata('storyId')+'/1/' in a['href']:
# break
# td = tr[i].find_all('td')
# self.story.addToList('category',stripHTML(td[2]))
# self.story.setMetadata('rating', stripHTML(td[5]))
# self.story.setMetadata('numWords', stripHTML(td[6]))
td = tr[i].findAll('td')
self.story.addToList('category',stripHTML(td[2]))
self.story.setMetadata('rating', stripHTML(td[5]))
self.story.setMetadata('numWords', stripHTML(td[6]))
# grab the text for an individual chapter.
@ -194,10 +201,10 @@ class FanFiktionDeAdapter(BaseSiteAdapter):
logger.debug('Getting chapter text from: %s' % url)
time.sleep(0.5) ## ffde has "floodlock" protection
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
div = soup.find('div', {'id' : 'storytext'})
for a in div.findAll('script'):
for a in div.find_all('script'):
a.extract()
if None == div:
@ -1,57 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2014 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import re
from base_efiction_adapter import BaseEfictionAdapter
class FHSArchiveComAdapter(BaseEfictionAdapter):
@staticmethod
def getSiteDomain():
return 'fhsarchive.com'
@classmethod
def getPathToArchive(self):
return '/autoarchive'
@classmethod
def getSiteAbbrev(self):
return 'fhsa'
@classmethod
def getDateFormat(self):
return "%m/%d/%y"
def handleMetadataPair(self, key, value):
if key == 'Warnings':
for val in re.split("\s*,\s*", value):
if value == 'None':
return
else:
# toss numbers only.
self.story.addToList('warnings', filter(lambda x : not x.isdigit() , val))
# elif 'Categories' in key:
# for val in re.split("\s*>\s*", value):
# self.story.addToList('category', val)
else:
super(FHSArchiveComAdapter, self).handleMetadataPair(key, value)
def getClass():
return FHSArchiveComAdapter
@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -15,19 +15,20 @@
# limitations under the License.
#
import time
import datetime
from __future__ import absolute_import,unicode_literals
# import datetime
import logging
logger = logging.getLogger(__name__)
import json
import re
import urllib2
from .. import translit
# from .. import translit
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from .. import exceptions# as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
@ -41,11 +42,6 @@ class FicBookNetAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["utf8",
"Windows-1252"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
@ -62,34 +58,42 @@ class FicBookNetAdapter(BaseSiteAdapter):
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d %m %Y"
self.dateformat = u"%d %m %Y г., %H:%M"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.ficbook.net'
return 'ficbook.net'
@classmethod
def getSiteExampleURLs(cls):
return "https://"+cls.getSiteDomain()+"/readfic/12345 https://"+cls.getSiteDomain()+"/readfic/93626/246417#part_content"
return "https://"+cls.getSiteDomain()+"/readfic/12345 https://"+cls.getSiteDomain()+"/readfic/93626/246417#part_content https://"+cls.getSiteDomain()+"/readfic/578de1cd-a8b4-7ff1-aa49-750426508b82 https://"+cls.getSiteDomain()+"/readfic/578de1cd-a8b4-7ff1-aa49-750426508b82/94793742#part_content"
def getSiteURLPattern(self):
return r"https?://"+re.escape(self.getSiteDomain()+"/readfic/")+r"\d+"
return r"https?://"+re.escape(self.getSiteDomain()+"/readfic/")+r"[\d\-a-zA-Z]+"
def performLogin(self,url,data):
params = {}
if self.password:
params['login'] = self.username
params['password'] = self.password
else:
params['login'] = self.getConfig("username")
params['password'] = self.getConfig("password")
logger.debug("Trying to log in as (%s)" % params['login'])
d = self.post_request('https://' + self.getSiteDomain() + '/login_check_static',params,usecache=False)
if 'Войти используя аккаунт на сайте' in d:
raise exceptions.FailedToLogin(url,params['login'])
return True
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
def extractChapterUrlsAndMetadata(self,get_cover=True):
url=self.url
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
# use BeautifulSoup HTML parser to make everything easier to find.
data = self.get_request(url)
soup = self.make_soup(data)
adult_div = soup.find('div',id='adultCoverWarning')
@ -98,11 +102,12 @@ class FicBookNetAdapter(BaseSiteAdapter):
adult_div.extract()
else:
raise exceptions.AdultCheckRequired(self.url)
# Now go hunting for all the meta data and the chapter list.
## Title
a = soup.find('section',{'class':'chapter-info'}).find('h1')
try:
a = soup.find('section',{'class':'chapter-info'}).find('h1')
except AttributeError:
raise exceptions.FailedToDownload("Error collecting meta: %s! Missing required element!" % url)
# kill '+' marks if present.
sup = a.find('sup')
if sup:
@ -112,42 +117,12 @@ class FicBookNetAdapter(BaseSiteAdapter):
# Find authorid and URL from... author url.
# assume first avatar-nickname -- there can be a second marked 'beta'.
a = soup.find('a',{'class':'avatar-nickname'})
a = soup.find('a',{'class':'creator-username'})
self.story.setMetadata('authorId',a.text) # Author's name is unique
self.story.setMetadata('authorUrl','https://'+self.host+'/'+a['href'])
self.story.setMetadata('authorUrl','https://'+self.host+a['href'])
self.story.setMetadata('author',a.text)
logger.debug("Author: (%s)"%self.story.getMetadata('author'))
# Find the chapters:
chapters = soup.find('ul', {'class' : 'table-of-contents'})
if chapters != None:
chapters=chapters.findAll('a', href=re.compile(r'/readfic/'+self.story.getMetadata('storyId')+"/\d+#part_content$"))
self.story.setMetadata('numChapters',len(chapters))
for x in range(0,len(chapters)):
chapter=chapters[x]
churl='https://'+self.host+chapter['href']
self.chapterUrls.append((stripHTML(chapter),churl))
if x == 0:
pubdate = translit.translit(stripHTML(chapter.parent.find('span')))
# pubdate = translit.translit(stripHTML(self.make_soup(self._fetchUrl(churl)).find('div', {'class' : 'part_added'}).find('span')))
if x == len(chapters)-1:
update = translit.translit(stripHTML(chapter.parent.find('span')))
# update = translit.translit(stripHTML(self.make_soup(self._fetchUrl(churl)).find('div', {'class' : 'part_added'}).find('span')))
else:
self.chapterUrls.append((self.story.getMetadata('title'),url))
self.story.setMetadata('numChapters',1)
pubdate=translit.translit(stripHTML(soup.find('div',{'class':'title-area'}).find('span')))
update=pubdate
logger.debug("numChapters: (%s)"%self.story.getMetadata('numChapters'))
if not ',' in pubdate:
pubdate=datetime.date.today().strftime(self.dateformat)
if not ',' in update:
update=datetime.date.today().strftime(self.dateformat)
pubdate=pubdate.split(',')[0]
update=update.split(',')[0]
fullmon = {"yanvarya":"01", u"января":"01",
"fievralya":"02", u"февраля":"02",
"marta":"03", u"марта":"03",
@ -161,44 +136,68 @@ class FicBookNetAdapter(BaseSiteAdapter):
"noyabrya":"11", u"ноября":"11",
"diekabrya":"12", u"декабря":"12" }
for (name,num) in fullmon.items():
if name in pubdate:
pubdate = pubdate.replace(name,num)
if name in update:
update = update.replace(name,num)
# Find the chapters:
pubdate = None
chapters = soup.find('ul', {'class' : 'list-of-fanfic-parts'})
if chapters is not None:
for chapdiv in chapters.find_all('li', {'class':'part'}):
chapter=chapdiv.find('a',href=re.compile(r'/readfic/'+self.story.getMetadata('storyId')+r"/\d+#part_content$"))
churl='https://'+self.host+chapter['href']
self.story.setMetadata('dateUpdated', makeDate(update, self.dateformat))
self.story.setMetadata('datePublished', makeDate(pubdate, self.dateformat))
# Find the chapter dates.
date_str = chapdiv.find('span', {'title': True})['title'].replace(u"\u202fг. в", "")
for month_name, month_num in fullmon.items():
date_str = date_str.replace(month_name, month_num)
chapterdate = makeDate(date_str,self.dateformat)
self.add_chapter(chapter,churl,
{'date':chapterdate.strftime(self.getConfig("datechapter_format",self.getConfig("datePublished_format",self.dateformat)))})
if pubdate is None and chapterdate:
pubdate = chapterdate
update = chapterdate
else:
self.add_chapter(self.story.getMetadata('title'),url)
date_str = soup.find('div', {'class' : 'part-date'}).find('span', {'title': True})['title'].replace(u"\u202fг. в", "")
for month_name, month_num in fullmon.items():
date_str = date_str.replace(month_name, month_num)
pubdate = update = makeDate(date_str,self.dateformat)
logger.debug("numChapters: (%s)"%self.story.getMetadata('numChapters'))
self.story.setMetadata('dateUpdated', update)
self.story.setMetadata('datePublished', pubdate)
self.story.setMetadata('language','Russian')
## after site change, I don't see word count anywhere.
# pr=soup.find('a', href=re.compile(r'/printfic/\w+'))
# pr='https://'+self.host+pr['href']
# pr = self.make_soup(self._fetchUrl(pr))
# pr=pr.findAll('div', {'class' : 'part_text'})
# i=0
# for part in pr:
# i=i+len(stripHTML(part).split(' '))
# self.story.setMetadata('numWords', unicode(i))
dlinfo = soup.select_one('header.d-flex.flex-column.gap-12.word-break')
series_label = dlinfo.select_one('div.description.word-break').find('strong', string='Серия:')
logger.debug('Series: %s'%str(series_label))
if series_label:
series_div = series_label.find_next_sibling("div")
# No accurate series number available; getting it would require an additional request
self.setSeries(stripHTML(series_div.a), 1)
self.story.setMetadata('seriesUrl','https://' + self.getSiteDomain() + series_div.a.get('href'))
dlinfo = soup.find('dl',{'class':'info'})
i=0
fandoms = dlinfo.find('dd').findAll('a', href=re.compile(r'/fanfiction/\w+'))
fandoms = dlinfo.select_one('div:not([class])').find_all('a', href=re.compile(r'/fanfiction/\w+'))
for fandom in fandoms:
self.story.addToList('category',fandom.string)
i=i+1
if i > 1:
self.story.addToList('genre', u'Кроссовер')
for genre in dlinfo.findAll('a',href=re.compile(r'/genres/')):
self.story.addToList('genre',stripHTML(genre))
tags = soup.find('div',{'class':'tags'})
if tags:
for genre in tags.find_all('a',href=re.compile(r'/tags/')):
self.story.addToList('genre',stripHTML(genre))
ratingdt = dlinfo.find('dt',text='Рейтинг:')
self.story.setMetadata('rating', stripHTML(ratingdt.next_sibling))
# meta=table.findAll('a', href=re.compile(r'/ratings/'))
logger.debug("category: (%s)"%self.story.getMetadata('category'))
logger.debug("genre: (%s)"%self.story.getMetadata('genre'))
ratingdt = dlinfo.find('div',{'class':re.compile(r'badge-rating-.*')})
self.story.setMetadata('rating', stripHTML(ratingdt.find('span')))
# meta=table.find_all('a', href=re.compile(r'/ratings/'))
# i=0
# for m in meta:
# if i == 0:
@ -209,39 +208,186 @@ class FicBookNetAdapter(BaseSiteAdapter):
# i=2
# self.story.addToList('genre', m.find('b').text)
# elif i == 2:
# self.story.addToList('warnings', m.find('b').text)
# self.story.addToList('warnings', m.find('b').text)
if dlinfo.find('span', {'style' : 'color: green'}):
if dlinfo.find('div', {'class':'badge-status-finished'}):
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
tags = dlinfo.findAll('dt')
for tag in tags:
label = translit.translit(tag.text)
if 'Piersonazhi:' in label or u'Персонажи:' in label:
chars=stripHTML(tag.next_sibling).split(', ')
for char in chars:
self.story.addToList('characters',char)
break
summary=soup.find('div', {'class' : 'urlize'})
self.setDescription(url,summary)
#self.story.setMetadata('description', summary.text)
try:
self.story.setMetadata('universe', stripHTML(dlinfo.find('a', href=re.compile('/fandom_universe/'))))
except AttributeError:
pass
paircharsdt = soup.find('strong',string='Пэйринг и персонажи:')
# site keeps both ships and indiv chars in /pairings/ links.
if paircharsdt:
for paira in paircharsdt.find_next('div').find_all('a', href=re.compile(r'/pairings/')):
if 'pairing-highlight' in paira['class']:
self.story.addToList('ships',stripHTML(paira))
chars=stripHTML(paira).split('/')
for char in chars:
self.story.addToList('characters',char)
else:
self.story.addToList('characters',stripHTML(paira))
summary=soup.find('div', itemprop='description')
if summary:
# Fix for the text not displaying properly
summary['class'].append('part_text')
self.setDescription(url,summary)
#self.story.setMetadata('description', summary.text)
stats = soup.find('div', {'class':'hat-actions-container'})
targetdata = stats.find_all('span', {'class' : 'main-info'})
for data in targetdata:
svg_class = data.find('svg')['class'][1] if data.find('svg') else None
value = int(stripHTML(data)) if stripHTML(data).isdigit() else 0
if svg_class == 'ic_thumbs-up' and value > 0:
self.story.setMetadata('likes', value)
#logger.debug("likes: (%s)"%self.story.getMetadata('likes'))
elif svg_class == 'ic_bubble-dark' and value > 0:
self.story.setMetadata('reviews', value)
#logger.debug("reviews: (%s)"%self.story.getMetadata('reviews'))
elif svg_class == 'ic_bookmark' and value > 0:
self.story.setMetadata('numCollections', value)
logger.debug("numCollections: (%s)"%self.story.getMetadata('numCollections'))
# Grab the amount of pages and words
targetpages = soup.find('strong',string='Размер:').find_next('div')
if targetpages:
targetpages_text = re.sub(r"(?<!\,)\s| ", "", targetpages.text, flags=re.UNICODE | re.MULTILINE)
pages_raw = re.search(r'(\d+)(?:страницы|страниц)', targetpages_text, re.UNICODE)
pages = int(pages_raw.group(1))
if pages > 0:
self.story.setMetadata('pages', pages)
logger.debug("pages: (%s)"%self.story.getMetadata('pages'))
numWords_raw = re.search(r"(\d+)(?:слова|слов)", targetpages_text, re.UNICODE)
numWords = int(numWords_raw.group(1))
if numWords > 0:
self.story.setMetadata('numWords', numWords)
logger.debug("numWords: (%s)"%self.story.getMetadata('numWords'))
# Grab FBN Category
class_tag = soup.select_one('div[class^="badge-with-icon direction"]').find('span', {'class' : 'badge-text'}).text
if class_tag:
self.story.setMetadata('classification',class_tag)
#logger.debug("classification: (%s)"%self.story.getMetadata('classification'))
# Find dedication.
ded = soup.find('div', {'class' : 'js-public-beta-dedication'})
if ded:
ded['class'].append('part_text')
self.story.setMetadata('dedication',ded)
# Find author comment
comm = soup.find('div', {'class' : 'js-public-beta-author-comment'})
if comm:
comm['class'].append('part_text')
self.story.setMetadata('authorcomment',comm)
follows = stats.find('fanfic-follow-button')[':follow-count']
if int(follows) > 0:
self.story.setMetadata('follows', int(follows))
logger.debug("follows: (%s)"%self.story.getMetadata('follows'))
# Grab the amount of awards
numAwards = 0
try:
awards = soup.find('fanfic-reward-list')[':initial-fic-rewards-list']
award_list = json.loads(awards)
numAwards = int(len(award_list))
# Grab the awards, but if multiple awards have the same name, only one will be kept; only an issue with hundreds of them.
self.story.extendList('awards', [str(award['user_text']) for award in award_list])
#logger.debug("awards (%s)"%self.story.getMetadata('awards'))
except (TypeError, KeyError):
logger.debug("Could not grab the awards")
if numAwards > 0:
self.story.setMetadata('numAwards', numAwards)
logger.debug("Num Awards (%s)"%self.story.getMetadata('numAwards'))
if get_cover:
cover = soup.find('fanfic-cover', {'class':"jsVueComponent"})
if cover is not None:
self.setCoverImage(url,cover['src-original'])
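Several of the newer metadata bits above (awards, follows, cover) live in JSON or numeric attributes on unrendered Vue component tags rather than in plain HTML; a self-contained sketch of that attribute-parsing technique, using sample markup of my own:

```python
import json
from bs4 import BeautifulSoup

# Hypothetical markup in the shape the adapter expects from the site.
html = """
<fanfic-reward-list :initial-fic-rewards-list='[{"user_text": "Great fic!"}, {"user_text": "More please"}]'>
</fanfic-reward-list>
<fanfic-follow-button :follow-count="42"></fanfic-follow-button>
"""
soup = BeautifulSoup(html, "html.parser")

# The :initial-fic-rewards-list attribute carries a JSON array.
award_list = json.loads(soup.find("fanfic-reward-list")[":initial-fic-rewards-list"])
awards = [award["user_text"] for award in award_list]

# The :follow-count attribute is a plain integer string.
follows = int(soup.find("fanfic-follow-button")[":follow-count"])
```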
def replace_formatting(self,tag):
tname = tag.name
## operating on plain text because BS4 is hard to work on
## text with.
## stripHTML() discards whitespace around other tags, like <i>
txt = tag.get_text()
txt = txt.replace("\n","<br/>")
soup = self.make_soup("<"+tname+">"+txt+"</"+tname+">")
return soup.find(tname)
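`replace_formatting` round-trips a tag through plain text so that literal newlines (which the site renders via `white-space: pre-wrap`) become real `<br/>` elements; a standalone sketch of the same move, with a function name of my own:

```python
from bs4 import BeautifulSoup

def newlines_to_br(tag):
    # Operate on plain text: get_text() flattens the tag, then the
    # rebuilt markup turns each newline into a <br/> element.
    txt = tag.get_text().replace("\n", "<br/>")
    rebuilt = BeautifulSoup("<%s>%s</%s>" % (tag.name, txt, tag.name), "html.parser")
    return rebuilt.find(tag.name)

soup = BeautifulSoup("<div>line one\nline two</div>", "html.parser")
div = newlines_to_br(soup.find("div"))
```

Note that, as in the adapter, any child markup inside the tag is discarded by `get_text()`; only the text and its line breaks survive.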
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
chapter = soup.find('div', {'class' : 'public_beta'})
if chapter == None:
chapter = soup.find('div', {'id' : 'content'})
if chapter is None: ## still needed?
chapter = soup.find('div', {'class' : 'public_beta_disabled'})
if None == chapter:
if chapter is None:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
## ficbook uses weird CSS white-space: pre-wrap; for
## paragraphing. Doesn't work with txt output
if 'part_text' in chapter['class'] and self.getConfig('replace_text_formatting'):
## copy classes, except part_text
divclasses = chapter['class']
divclasses.remove('part_text')
chapter = self.replace_formatting(chapter)
chapter['class'] = divclasses
exclude_notes=self.getConfigList('exclude_notes')
if 'headnotes' not in exclude_notes:
# Find the headnote
head_note = soup.select_one("div.part-comment-top div.js-public-beta-comment-before")
if head_note:
# Create the structure for the headnote
head_notes_div_tag = soup.new_tag('div', attrs={'class': 'fff_chapter_notes fff_head_notes'})
head_b_tag = soup.new_tag('b')
head_b_tag.string = 'Примечания:'
if 'text-preline' in head_note['class'] and self.getConfig('replace_text_formatting'):
head_blockquote_tag = self.replace_formatting(head_note)
head_blockquote_tag.name = 'blockquote'
else:
head_blockquote_tag = soup.new_tag('blockquote')
head_blockquote_tag.string = stripHTML(head_note)
head_notes_div_tag.append(head_b_tag)
head_notes_div_tag.append(head_blockquote_tag)
# Prepend the headnotes to the chapter, <hr> to mimic the site
chapter.insert(0, head_notes_div_tag)
chapter.insert(1, soup.new_tag('hr'))
if 'footnotes' not in exclude_notes:
# Find the endnote
end_note = soup.select_one("div.part-comment-bottom div.js-public-beta-comment-after")
if end_note:
# Create the structure for the footnote
end_notes_div_tag = soup.new_tag('div', attrs={'class': 'fff_chapter_notes fff_foot_notes'})
end_b_tag = soup.new_tag('b')
end_b_tag.string = 'Примечания:'
if 'text-preline' in end_note['class'] and self.getConfig('replace_text_formatting'):
end_blockquote_tag = self.replace_formatting(end_note)
end_blockquote_tag.name = 'blockquote'
else:
end_blockquote_tag = soup.new_tag('blockquote')
end_blockquote_tag.string = stripHTML(end_note)
end_notes_div_tag.append(end_b_tag)
end_notes_div_tag.append(end_blockquote_tag)
# Append the endnotes to the chapter, <hr> to mimic the site
chapter.append(soup.new_tag('hr'))
chapter.append(end_notes_div_tag)
return self.utf8FromSoup(url,chapter)
@ -1,292 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2012 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Software: eFiction
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
import sys
from bs4.element import Comment
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
def getClass():
return HPFanficArchiveComAdapter
# Class name has to be unique. Our convention is camel case the
# sitename with Adapter at the end. www is skipped.
class HPFanficArchiveComAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.decode = ["Windows-1252",
"utf8", "iso-8859-1"]
# 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
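The decode-list comment describes a fallback strategy: try each candidate encoding in order until one decodes cleanly, since windows-1252 is a superset of iso-8859-1 and many sites mislabel their pages. A minimal sketch of that fallback; the sample text is illustrative, not from the site:

```python
# Bytes that claim to be utf8 but are really windows-1252.
raw = 'café™ ok'.encode('windows-1252')

text = None
for enc in ('utf8', 'windows-1252'):
    try:
        text = raw.decode(enc)
        break  # first encoding that decodes cleanly wins
    except UnicodeDecodeError:
        continue
print(text)
```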
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
self.password = ""
self.is_adult=False
# get storyId from url--url validation guarantees query is only sid=1234
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','ficsite')
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%m/%d/%Y"
@staticmethod # must be @staticmethod, don't remove it.
def getSiteDomain():
# The site domain. Does have www here, if it uses it.
return 'www.ficsite.com'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
## Login seems to be reasonably standard across eFiction sites.
def needToLoginCheck(self, data):
if 'Registered Users Only' in data \
or 'There is no such account on our website' in data \
or "That password doesn't match the one in our database" in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['penname'] = self.username
params['password'] = self.password
else:
params['penname'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['cookiecheck'] = '1'
params['submit'] = 'Submit'
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['penname']))
d = self._fetchUrl(loginUrl, params)
if "Member Account" not in d : #Member Account
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['penname']))
raise exceptions.FailedToLogin(url,params['penname'])
return False
else:
return True
# I've added this because there are several warnings
# that are used by this site.
def getWarning(self, data):
if "This story contains adult subject matter that may include coarse language, violence, and mild sexual content of a graphical nature. Reader discretion is requested. Thank you." in data:
return '&ageconsent=ok&warning=5'
elif "This story contains graphical material of an adult nature and a same sex primary relationship. Please do not read if this is not to your taste. Thank you." in data:
return '&warning=7'
elif "This story contains graphical material of an adult nature. Reader discretion is requested. Thank you." in data:
return '&warning=6'
else:
return False
## Getting the chapter list and the meta data, plus 'is adult' checking.
def extractChapterUrlsAndMetadata(self):
if (self.is_adult or self.getConfig("is_adult")):
addurl = '&index=1&ageconsent=ok&warning=5'
else:
addurl='&index=1'
# index=1 makes sure we see the story chapter index. Some
# sites skip that for one-chapter stories.
url = self.url+addurl
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
if self.needToLoginCheck(data):
# need to log in for this one.
self.performLogin(url)
data = self._fetchUrl(url)
warning = self.getWarning(data)
if warning != False:
data = self._fetchUrl(url+warning)
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
elif "This story contains adult subject matter that may include coarse language, violence, and mild sexual content of a graphical nature. Reader discretion is requested. Thank you." in data:
raise exceptions.AccessDenied(self.getSiteDomain()+" says: This story contains adult subject matter that may include coarse language, violence, and mild sexual content of a graphical nature. Reader discretion is requested. Thank you.")
elif "This story contains graphical material of an adult nature and a same sex primary relationship. Please do not read if this is not to your taste. Thank you." in data:
raise exceptions.AccessDenied(self.getSiteDomain()+" says: This story contains graphical material of an adult nature and a same sex primary relationship. Please do not read if this is not to your taste. Thank you.")
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# print data
# Now go hunting for all the meta data and the chapter list.
## Title and Author Div
div = soup.find('div',{'id':'pagetitle'})
## Title
a = div.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
self.story.setMetadata('title',stripHTML(a))
# Find authorid and URL from... author url.
a = div.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
self.story.setMetadata('authorId',a['href'].split('=')[1])
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
self.story.setMetadata('author',a.string)
# Find the chapters:
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']))
self.story.setMetadata('numChapters',len(self.chapterUrls))
# eFiction sites don't help us out a lot with their metadata
# formatting, so it's a little ugly.
# utility method
def defaultGetattr(d,k):
try:
return d[k]
except:
return ""
# <span class="label">Rated:</span> NC-17<br /> etc
labels = soup.findAll('span',{'class':'label'})
for labelspan in labels:
val = labelspan.nextSibling
value = unicode('')
while val and not 'label' in defaultGetattr(val,'class'):
# print("val:%s"%val)
if not isinstance(val,Comment):
value += unicode(val)
val = val.nextSibling
label = labelspan.string
# print("label:%s\nvalue:%s"%(label,value))
if 'Summary' in label:
self.setDescription(url,value)
if 'Rated' in label:
self.story.setMetadata('rating', stripHTML(value))
if 'Word count' in label:
self.story.setMetadata('numWords', stripHTML(value))
if 'Categories' in label:
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
for cat in cats:
self.story.addToList('category',cat.string)
if 'Characters' in label:
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
for char in chars:
self.story.addToList('characters',char.string)
if 'Genre' in label:
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1')) # XXX
for genre in genres:
self.story.addToList('genre',genre.string)
if 'Pairing' in label:
ships = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=4'))
for ship in ships:
self.story.addToList('ships',ship.string)
if 'Warnings' in label:
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
for warning in warnings:
self.story.addToList('warnings',warning.string)
if 'Completed' in label:
if 'Yes' in stripHTML(value):
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
if 'Published' in label:
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
if 'Updated' in label:
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
try:
# Find Series name from series URL.
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
series_name = a.string
series_url = 'http://'+self.host+'/'+a['href']
# use BeautifulSoup HTML parser to make everything easier to find.
seriessoup = self.make_soup(self._fetchUrl(series_url))
# can't use ^viewstory...$ in case of higher rated stories with javascript href.
storyas = seriessoup.findAll('a', href=re.compile(r'viewstory.php\?sid=\d+'))
i=1
for a in storyas:
# skip 'report this' and 'TOC' links
if 'contact.php' not in a['href'] and 'index' not in a['href']:
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
self.setSeries(series_name, i)
self.story.setMetadata('seriesUrl',series_url)
break
i+=1
except:
# I find it hard to care if the series parsing fails
pass
# grab the text for an individual chapter.
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
div = soup.find('div', {'id' : 'story'})
if None == div:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,div)
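The metadata loop in `extractChapterUrlsAndMetadata` walks `nextSibling` nodes after each `<span class="label">`, accumulating text until the next label span. A self-contained sketch of that technique, assuming bs4; the sample HTML and the `default_getattr` helper name are illustrative:

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

html = ('<div><span class="label">Rated:</span> NC-17<br/>'
        '<span class="label">Word count:</span> 4,200<br/></div>')
soup = BeautifulSoup(html, 'html.parser')

def default_getattr(node, key):
    # Siblings may be bare strings or tags without the attribute;
    # tolerate both instead of special-casing node types.
    try:
        return node[key]
    except (TypeError, KeyError):
        return ""

meta = {}
for labelspan in soup.find_all('span', {'class': 'label'}):
    val = labelspan.next_sibling
    value = ''
    # Accumulate siblings until the next <span class="label">.
    while val is not None and 'label' not in default_getattr(val, 'class'):
        if not isinstance(val, Comment):
            value += str(val)
        val = val.next_sibling
    meta[labelspan.string.rstrip(':')] = value.replace('<br/>', '').strip()
print(meta)
```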

@@ -0,0 +1,225 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2021 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from .base_adapter import BaseSiteAdapter, makeDate
class FictionAlleyArchiveOrgSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev','fa')
self.is_adult=False
# get storyId from url--url validation guarantees query correct
m = re.match(self.getSiteURLPattern(),url)
if m:
# normalized story URL.
url = "https://"+self.getSiteDomain()+"/authors/"+m.group('auth')+"/"+m.group('id')+".html"
self._setURL(url)
else:
raise exceptions.InvalidStoryURL(url,
self.getSiteDomain(),
self.getSiteExampleURLs())
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%m/%d/%Y"
def _setURL(self,url):
# logger.debug("set URL:%s"%url)
super(FictionAlleyArchiveOrgSiteAdapter, self)._setURL(url)
m = re.match(self.getSiteURLPattern(),url)
if m:
self.story.setMetadata('authorId',m.group('auth'))
self.story.setMetadata('storyId',m.group('id'))
@staticmethod
def getSiteDomain():
return 'www.fictionalley-archive.org'
@classmethod
def getAcceptDomains(cls):
return ['www.fictionalley-archive.org',
'www.fictionalley.org']
@classmethod
def getSiteExampleURLs(cls):
return "https://"+cls.getSiteDomain()+"/authors/drt/DA.html https://"+cls.getSiteDomain()+"/authors/drt/JOTP01a.html"
@classmethod
def getURLDomain(cls):
return 'https://' + cls.getSiteDomain()
def getSiteURLPattern(self):
# http://www.fictionalley-archive.org/authors/drt/DA.html
# http://www.fictionalley-archive.org/authors/drt/JOTP01a.html
return r"https?://www.fictionalley(-archive)?.org/authors/(?P<auth>[a-zA-Z0-9_]+)/(?P<id>[a-zA-Z0-9_]+)\.html"
def extractChapterUrlsAndMetadata(self):
## could be either chapter list page or one-shot text page.
logger.debug("URL: "+self.url)
(data,rurl) = self.get_request_redirected(self.url)
if rurl != self.url:
self._setURL(rurl)
logger.debug("set to redirected url:%s"%self.url)
soup = self.make_soup(data)
# If chapter list page, get the first chapter to look for adult check
chapterlinklist = soup.select('h5.mb-1 > a')
# logger.debug(chapterlinklist)
if not chapterlinklist:
# no chapter list, it's either a chapter URL or a single chapter story
# <nav aria-label="Chapter Navigation">
# <a class="page-link" href="/authors/mz_xxo/HPATOTFI.html">Index</a>
storya = soup.select_one('nav[aria-label="Chapter Navigation"] a')
# logger.debug(storya)
if storya:
## multi chapter story
self._setURL(self.getURLDomain()+storya['href'])
logger.debug("Normalizing to URL: "+self.url)
# ## title's right there...
# self.story.setMetadata('title',stripHTML(storya))
data = self.get_request(self.url)
soup = self.make_soup(data)
chapterlinklist = soup.select('h5.mb-1 > a')
# logger.debug(chapterlinklist)
else:
## single chapter story.
# logger.debug("Single chapter story")
pass
self.story.setMetadata('title',stripHTML(soup.select_one('h1')))
## authorid already set.
## <h1 class="title" align="center">Just Off The Platform II by <a href="http://www.fictionalley.org/authors/drt/">DrT</a></h1>
authora=soup.select_one('h1 + h3 > a')
self.story.setMetadata('author',stripHTML(authora))
self.story.setMetadata('authorUrl',self.getURLDomain()+authora['href'])
if chapterlinklist:
# Find the chapters:
for chapter in chapterlinklist:
listitem = chapter.parent.parent.parent
# logger.debug(listitem)
# date
date = stripHTML(listitem.select_one('small.text-nowrap'))
chapterDate = makeDate(date,self.dateformat)
wordshits = listitem.select('span.font-weight-normal')
chap_data = {
'date':chapterDate.strftime(self.getConfig("datechapter_format",self.getConfig("datePublished_format","%Y-%m-%d"))),
'words':stripHTML(wordshits[0]),
'hits':stripHTML(wordshits[1]),
'summary':stripHTML(listitem.select_one('p.my-2')),
}
# logger.debug(chap_data)
self.add_chapter(chapter,self.getURLDomain()+chapter['href'], chap_data)
else:
self.add_chapter(self.story.getMetadata('title'),self.url)
cardbody = soup.select_one('div.card-body')
searchs_to_meta = (
# sitetype, ffftype, islist
('Rating', 'rating', False),
('House', 'house', True),
('Character', 'characters', True),
('Genre', 'genre', True),
('Era', 'era', True),
('Spoiler', 'spoilers', True),
('Ship', 'ships', True),
)
for (sitetype,ffftype, islist) in searchs_to_meta:
# logger.debug((sitetype,ffftype, islist))
tags = cardbody.select('a[href^="/stories?Include.%s"]'%sitetype)
# logger.debug(tags)
if tags:
if islist:
self.story.extendList(ffftype, [ stripHTML(a) for a in tags ])
else:
self.story.setMetadata(ffftype, stripHTML(tags[0]))
# Published: 09/26/2003 Updated: 04/13/2004 Words: 14,268 Chapters: 5 Hits: 743
badgeinfos = cardbody.select('div.badge-info')
# logger.debug(badgeinfos)
for badge in badgeinfos:
txt = stripHTML(badge)
(key,val)=txt.split(':')
# logger.debug((key,val))
if key in ( 'Published', 'Updated'):
date = makeDate(val,self.dateformat)
self.story.setMetadata('date'+key,date)
elif key in ('Hits'):
self.story.setMetadata(key.lower(),val)
elif key == 'Words':
self.story.setMetadata('numWords',val)
summary = soup.find('dt',string='Story Summary:')
if summary:
summary = summary.find_next_sibling('dd')
summary.name='div'
self.setDescription(self.url,summary)
return
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
data = self.get_request(url)
soup = self.make_soup(data)
# this may be a brittle way to get the chapter text.
# Site doesn't give a lot of hints.
chaptext = soup.select_one('main#content div:not([class])')
# not sure how, but we can get html, etc tags still in some
# stories. That breaks later updates because it confuses
# epubutils.py
# Yes, this still applies to fictionalley-archive.
for tag in chaptext.find_all('head') + chaptext.find_all('meta') + chaptext.find_all('script'):
tag.extract()
for tag in chaptext.find_all('body') + chaptext.find_all('html'):
tag.name = 'div'
if self.getConfig('include_author_notes'):
row = chaptext.find_previous_sibling('div',class_='row')
logger.debug(row)
andt = row.find('dt',string="Author's Note:")
logger.debug(andt)
if andt:
chaptext.insert(0,andt.parent.extract())
# post notes aren't as structured(?)
for div in chaptext.find_next_siblings('div',class_='row'):
chaptext.append(div.extract())
# logger.debug(chaptext)
return self.utf8FromSoup(url,chaptext)
def getClass():
return FictionAlleyArchiveOrgSiteAdapter

@@ -1,244 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib
import urllib2
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
class FictionAlleyOrgSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev','fa')
self.decode = ["Windows-1252",
"utf8"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.is_adult=False
# get storyId from url--url validation guarantees query correct
m = re.match(self.getSiteURLPattern(),url)
if m:
self.story.setMetadata('authorId',m.group('auth'))
self.story.setMetadata('storyId',m.group('id'))
# normalized story URL.
self._setURL(url)
else:
raise exceptions.InvalidStoryURL(url,
self.getSiteDomain(),
self.getSiteExampleURLs())
@staticmethod
def getSiteDomain():
return 'www.fictionalley.org'
@classmethod
def getSiteExampleURLs(cls):
return "http://"+cls.getSiteDomain()+"/authors/drt/DA.html http://"+cls.getSiteDomain()+"/authors/drt/JOTP01a.html"
def getSiteURLPattern(self):
# http://www.fictionalley.org/authors/drt/DA.html
# http://www.fictionalley.org/authors/drt/JOTP01a.html
return re.escape("http://"+self.getSiteDomain())+"/authors/(?P<auth>[a-zA-Z0-9_]+)/(?P<id>[a-zA-Z0-9_]+)\.html"
def _postFetchWithIAmOld(self,url):
if self.is_adult or self.getConfig("is_adult"):
params={'iamold':'Yes',
'action':'ageanswer'}
logger.info("Attempting to get cookie for %s" % url)
## posting on list doesn't work, but doesn't hurt, either.
data = self._postUrl(url,params)
else:
data = self._fetchUrl(url)
return data
def extractChapterUrlsAndMetadata(self):
## could be either chapter list page or one-shot text page.
url = self.url
logger.debug("URL: "+url)
try:
data = self._postFetchWithIAmOld(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
chapterdata = data
# If chapter list page, get the first chapter to look for adult check
chapterlinklist = soup.findAll('a',{'class':'chapterlink'})
if chapterlinklist:
chapterdata = self._postFetchWithIAmOld(chapterlinklist[0]['href'])
if "Are you over seventeen years old" in chapterdata:
raise exceptions.AdultCheckRequired(self.url)
if not chapterlinklist:
# no chapter list, chapter URL: change to list link.
# second a tag inside div breadcrumbs
storya = soup.find('div',{'class':'breadcrumbs'}).findAll('a')[1]
self._setURL(storya['href'])
url=self.url
logger.debug("Normalizing to URL: "+url)
## title's right there...
self.story.setMetadata('title',stripHTML(storya))
data = self._fetchUrl(url)
soup = self.make_soup(data)
chapterlinklist = soup.findAll('a',{'class':'chapterlink'})
else:
## still need title from somewhere. If chapterlinklist,
## then chapterdata contains a chapter, find title the
## same way.
chapsoup = self.make_soup(chapterdata)
storya = chapsoup.find('div',{'class':'breadcrumbs'}).findAll('a')[1]
self.story.setMetadata('title',stripHTML(storya))
del chapsoup
del chapterdata
## authorid already set.
## <h1 class="title" align="center">Just Off The Platform II by <a href="http://www.fictionalley.org/authors/drt/">DrT</a></h1>
authora=soup.find('h1',{'class':'title'}).find('a')
self.story.setMetadata('author',authora.string)
self.story.setMetadata('authorUrl',authora['href'])
if len(chapterlinklist) == 1:
self.chapterUrls.append((self.story.getMetadata('title'),chapterlinklist[0]['href']))
else:
# Find the chapters:
for chapter in chapterlinklist:
# just in case there's tags, like <i> in chapter titles.
self.chapterUrls.append((stripHTML(chapter),chapter['href']))
self.story.setMetadata('numChapters',len(self.chapterUrls))
## Go scrape the rest of the metadata from the author's page.
data = self._fetchUrl(self.story.getMetadata('authorUrl'))
soup = self.make_soup(data)
# <dl><dt><a class = "Rid story" href = "http://www.fictionalley.org/authors/aafro_man_ziegod/TMH.html">
# [Rid] The Magical Hottiez</a> by <a class = "pen_name" href = "http://www.fictionalley.org/authors/aafro_man_ziegod/">Aafro Man Ziegod</a> </small></dt>
# <dd><small class = "storyinfo"><a href = "http://www.fictionalley.org/ratings.html" target = "_new">Rating:</a> PG-13 - Spoilers: PS/SS, CoS, PoA, GoF, QTTA, FB - 4264 hits - 5060 words<br />
# Genre: Humor, Romance - Main character(s): None - Ships: None - Era: Multiple Eras<br /></small>
# Chaos ensues after Witch Weekly, seeking to increase readers, decides to create a boyband out of five seemingly talentless wizards: Harry Potter, Draco Malfoy, Ron Weasley, Neville Longbottom, and Oliver "Toss Your Knickers Here" Wood.<br />
# <small class = "storyinfo">Published: June 3, 2002 (between Goblet of Fire and Order of Phoenix) - Updated: June 3, 2002</small>
# </dd></dl>
storya = soup.find('a',{'href':self.story.getMetadata('storyUrl')})
storydd = storya.findNext('dd')
# Rating: PG - Spoilers: None - 2525 hits - 736 words
# Genre: Humor - Main character(s): H, R - Ships: None - Era: Multiple Eras
# Harry and Ron are back at it again! They reeeeeeally don't want to be back, because they know what's awaiting them. "VH1 Goes Inside..." is back! Why? 'Cos there are soooo many more couples left to pick on.
# Published: September 25, 2004 (between Order of Phoenix and Half-Blood Prince) - Updated: September 25, 2004
## change to text and regexp find.
metastr = stripHTML(storydd).replace('\n',' ').replace('\t',' ')
m = re.match(r".*?Rating: (.+?) -.*?",metastr)
if m:
self.story.setMetadata('rating', m.group(1))
m = re.match(r".*?Genre: (.+?) -.*?",metastr)
if m:
for g in m.group(1).split(','):
self.story.addToList('genre',g)
m = re.match(r".*?Published: ([a-zA-Z]+ \d\d?, \d\d\d\d).*?",metastr)
if m:
self.story.setMetadata('datePublished',makeDate(m.group(1), "%B %d, %Y"))
m = re.match(r".*?Updated: ([a-zA-Z]+ \d\d?, \d\d\d\d).*?",metastr)
if m:
self.story.setMetadata('dateUpdated',makeDate(m.group(1), "%B %d, %Y"))
m = re.match(r".*? (\d+) words Genre.*?",metastr)
if m:
self.story.setMetadata('numWords', m.group(1))
for small in storydd.findAll('small'):
small.extract() ## removes the <small> tags, leaving only the summary.
storydd.name = 'div' ## change tag name else Calibre treats it oddly.
self.setDescription(url,storydd)
#self.story.setMetadata('description',stripHTML(storydd))
return
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
data = self._fetchUrl(url)
# find <!-- headerend --> & <!-- footerstart --> and
# replaced with matching div pair for easier parsing.
# Yes, it's an evil kludge, but what can ya do? Using
# something other than div prevents soup from pairing
# our div with poor html inside the story text.
crazy = "crazytagstringnobodywouldstumbleonaccidently"
data = data.replace('<!-- headerend -->','<'+crazy+' id="storytext">').replace('<!-- footerstart -->','</'+crazy+'>')
# problems with some stories confusing Soup. This is a nasty
# hack, but it works.
data = data[data.index('<'+crazy+''):]
# ditto with extra crap at the end.
data = data[:data.index('</'+crazy+'>')+len('</'+crazy+'>')]
soup = self.make_soup(data)
body = soup.findAll('body') ## some stories use a nested body and body
## tag, in which case we don't
## need crazytagstringnobodywouldstumbleonaccidently
## and use the second one instead.
if len(body)>1:
text = body[1]
text.name='div' # force to be a div to avoid multiple body tags.
else:
text = soup.find(crazy, {'id' : 'storytext'})
text.name='div' # change to div tag.
if not data or not text:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
# not sure how, but we can get html, etc tags still in some
# stories. That breaks later updates because it confuses
# epubutils.py
for tag in text.findAll('head'):
tag.extract()
for tag in text.findAll('body') + text.findAll('html'):
tag.name = 'div'
return self.utf8FromSoup(url,text)
def getClass():
return FictionAlleyOrgSiteAdapter
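The comment-marker kludge in `getChapterText` above can be shown in isolation: the `<!-- headerend -->` / `<!-- footerstart -->` comments are swapped for a matching fake-tag pair so the parser treats everything between them as one element, and the surrounding junk is trimmed off before parsing. A minimal sketch with made-up page HTML, assuming bs4:

```python
from bs4 import BeautifulSoup

crazy = "crazytagstringnobodywouldstumbleonaccidently"
page = ('<html><body>banner<!-- headerend -->'
        '<p>Chapter text.</p><!-- footerstart -->footer</body></html>')

# Replace the marker comments with a matching fake-tag pair.
data = page.replace('<!-- headerend -->', '<%s id="storytext">' % crazy)
data = data.replace('<!-- footerstart -->', '</%s>' % crazy)
# Trim everything before the opening tag and after the closing tag
# so broken markup outside the story can't confuse the parser.
data = data[data.index('<' + crazy):]
data = data[:data.index('</' + crazy + '>') + len('</' + crazy + '>')]

soup = BeautifulSoup(data, 'html.parser')
text = soup.find(crazy, {'id': 'storytext'})
text.name = 'div'  # normalize to a real tag before output
print(text)
```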

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2016 FanFicFare team
# Copyright 2022 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,15 +15,103 @@
# limitations under the License.
#
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
from .. import exceptions as exceptions
from ..htmlcleanup import stripHTML
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from .base_adapter import BaseSiteAdapter, makeDate
ampfandoms = ["A Falcone & Driscoll Investigation",
"Alias Smith & Jones",
"Atelier Escha & Logy",
"Austin & Ally",
"Baby & Me/赤ちゃんと僕",
"Barney & Friends",
"Between Love & Goodbye",
"Beyond Good & Evil",
"Bill & Ted's Excellent Adventure/Bogus Journey",
"BLACK & WHITE",
"Bonnie & Clyde",
"Brandy & Mr. Whiskers",
"Brothers & Sisters",
"Bucket & Skinner's Epic Adventures",
"Calvin & Hobbes",
"Cats & Dogs",
"Command & Conquer",
"Devil & Devil",
"Dharma & Greg",
"Dicky & Dawn",
"Drake & Josh",
"Edgar & Ellen",
"Franklin & Bash",
"Gabby Duran & The Unsittables",
"Girls und Panzer/ガールズ&パンツァー",
"Gnomeo & Juliet",
"Grim Adventures of Billy & Mandy",
"Half & Half/ハーフ・アンド・ハーフ",
"Hansel & Gretel",
"Hatfields & McCoys",
"High & Low - The Story of S.W.O.R.D.",
"Home & Away",
"Hudson & Rex",
"Huntik: Secrets & Seekers",
"Imagine Me & You",
"Jekyll & Hyde",
"Jonathan Strange & Mr. Norrell",
"Knight's & Magic/ナイツ&マジック",
"Law & Order: Los Angeles",
"Law & Order: Organized Crime",
"Lilo & Stitch",
"Locke & Key",
"Lockwood & Co.",
"Lost & Found Music Studios",
"Lu & Og",
"Me & My Brothers",
"Melissa & Joey",
"Mickey Mouse & Friends",
"Mike & Molly",
"Mike, Lu & Og",
"Miraculous: Tales of Ladybug & Cat Noir",
"Mork & Mindy",
"Mount&Blade",
"Mr. & Mrs. Smith",
"Mr. Peabody & Sherman",
"Muhyo & Roji",
"Nicky, Ricky, Dicky & Dawn",
"Oliver & Company",
"Ozzy & Drix",
"Panty & Stocking with Garterbelt/パンティストッキングwithガーターベルト",
"Penryn & the End of Days",
"Prep & Landing",
"Prince & Hero/王子とヒーロー",
"Prince & Me",
"Puzzle & Dragons",
"Ren & Stimpy Show",
"Rizzoli & Isles",
"Romeo & Juliet",
"Rosemary & Thyme",
"Sam & Cat",
"Sam & Max",
"Sapphire & Steel",
"Scott & Bailey",
"Shakespeare & Hathaway: Private Investigators",
"Soul Nomad & the World Eaters",
"Superman & Lois",
"Tiger & Bunny/タイガー&バニー",
"Trains & Automobiles",
"Upin & Ipin",
"Wallace & Gromit",
"Witch & Wizard",
"Wolverine & the X-Men",
"Yotsuba&!/よつばと!",
"Young & Hungry",
]
class FictionHuntComSiteAdapter(BaseSiteAdapter):
@@ -31,16 +119,32 @@ class FictionHuntComSiteAdapter(BaseSiteAdapter):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev','fichunt')
# get storyId from url--url validation guarantees second part is storyId
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
# normalized story URL.
self._setURL("http://"+self.getSiteDomain()\
+"/read/"+self.story.getMetadata('storyId')+"/1")
## new types:
## https://fictionhunt.com/stories/7edm248/the-last-of-his-kind/chapters/1
## https://fictionhunt.com/stories/89kzg4z/the-last-of-his-kind-new
## old type:
## http://fictionhunt.com/read/12411643/1
# get storyId from url--url validation guarantees query correct
m = re.match(self.getSiteURLPattern(),url)
if m:
# logger.debug(m.groupdict())
self.story.setMetadata('storyId',m.group('id'))
if m.group('type') == "stories": # newer URL
# normalized story URL.
self._setURL("https://"+self.getSiteDomain()\
+"/stories/"+self.story.getMetadata('storyId')+"/"+ (m.group('title') or ""))
else:
self._setURL("https://"+self.getSiteDomain()\
+"/read/"+self.story.getMetadata('storyId')+"/1")
# logger.debug(self.url)
else:
raise exceptions.InvalidStoryURL(url,
self.getSiteDomain(),
self.getSiteExampleURLs())
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%d-%m-%Y"
self.dateformat = "%Y-%m-%d %H:%M:%S"
@staticmethod
def getSiteDomain():
@@ -48,17 +152,55 @@ class FictionHuntComSiteAdapter(BaseSiteAdapter):
@classmethod
def getSiteExampleURLs(cls):
return "http://fictionhunt.com/read/1234/1"
return "https://fictionhunt.com/stories/1a1a1a/story-title http://fictionhunt.com/read/1234/1"
def getSiteURLPattern(self):
return r"http://(www.)?fictionhunt.com/read/\d+(/\d+)?(/|/[^/]+)?/?$"
## https://fictionhunt.com/stories/7edm248/the-last-of-his-kind/chapters/1
## https://fictionhunt.com/stories/89kzg4z/the-last-of-his-kind-new
## http://fictionhunt.com/read/12411643/1
return r"https?://(www.)?fictionhunt.com/(?P<type>read|stories)/(?P<id>[0-9a-z]+)(/(?P<title>[^/]+))?(/|/[^/]+)*/?$"
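The combined pattern above accepts both URL generations, with named groups distinguishing them. A quick stdlib `re` check against made-up story IDs:

```python
import re

# Same combined pattern as the adapter's getSiteURLPattern, copied for illustration.
pattern = (r"https?://(www.)?fictionhunt.com/(?P<type>read|stories)"
           r"/(?P<id>[0-9a-z]+)(/(?P<title>[^/]+))?(/|/[^/]+)*/?$")

old = re.match(pattern, "http://fictionhunt.com/read/12411643/1")
new = re.match(pattern, "https://fictionhunt.com/stories/89kzg4z/the-last-of-his-kind-new")

print(old.group('type'), old.group('id'))
print(new.group('type'), new.group('id'), new.group('title'))
```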
def use_pagecache(self):
'''
adapters that will work with the page cache need to implement
this and change it to True.
'''
return True
def needToLoginCheck(self, data):
## FH is apparently reporting "Story has been removed" for all
## chapters when not logged in now.
if 'https://fictionhunt.com/login' in data:
return True
else:
return False
def performLogin(self, url):
params = {}
if self.password:
params['identifier'] = self.username
params['password'] = self.password
else:
params['identifier'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['remember'] = 'on'
loginUrl = 'https://' + self.getSiteDomain() + '/login'
if not params['identifier']:
logger.info("This site requires login.")
raise exceptions.FailedToLogin(url,params['identifier'])
## need to pull empty login page first to get authenticity_token
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['identifier']))
soup = self.make_soup(self.get_request(loginUrl,usecache=False))
params['_token']=soup.find('input', {'name':'_token'})['value']
d = self.post_request(loginUrl, params, usecache=False)
# logger.debug(d)
if self.needToLoginCheck(d):
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['identifier']))
raise exceptions.FailedToLogin(url,params['identifier'])
return False
else:
return True
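The login flow above pulls the empty login page first to scrape the hidden `_token` input before POSTing credentials. A sketch of just that token-scraping step, assuming bs4; the form snippet and token value are hypothetical:

```python
from bs4 import BeautifulSoup

login_page = ('<form action="/login" method="post">'
              '<input type="hidden" name="_token" value="abc123"/>'
              '<input name="identifier"/><input name="password"/></form>')
soup = BeautifulSoup(login_page, 'html.parser')

params = {'identifier': 'user', 'password': 'secret', 'remember': 'on'}
# The hidden CSRF token must accompany the POST or the login fails.
params['_token'] = soup.find('input', {'name': '_token'})['value']
print(params['_token'])
```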
def doExtractChapterUrlsAndMetadata(self,get_cover=True):
@@ -66,80 +208,132 @@ class FictionHuntComSiteAdapter(BaseSiteAdapter):
# metadata and chapter list
url = self.url
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.meta)
else:
raise e
data = self.get_request(url)
## As per #784, site isn't requiring login anymore.
## Login check commented since we've seen it toggle before.
# if self.needToLoginCheck(data):
# self.performLogin(url)
# data = self.get_request(url,usecache=False)
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
## detect old storyUrl, switch to new storyUrl:
canonlink = soup.find('link',rel='canonical')
if canonlink:
# logger.debug(canonlink)
canonlink = re.sub(r"/chapters/\d+","",canonlink['href'])
# logger.debug(canonlink)
self._setURL(canonlink)
url = self.url
data = self.get_request(url)
soup = self.make_soup(data)
else:
# in case title changed
self._setURL(soup.select_one("div.Story__details a")['href'])
url = self.url
self.story.setMetadata('title',stripHTML(soup.find('div',{'class':'title'})).strip())
# logger.debug(data)
self.story.setMetadata('title',stripHTML(soup.find('h1',{'class':'Story__title'})))
self.setDescription(url,'<i>(Story descriptions not available on fictionhunt.com)</i>')
summhead = soup.find('h5',string='Summary')
self.setDescription(url,summhead.find_next('div'))
# Find authorid and URL from... author url.
# fictionhunt doesn't have author pages, use ffnet original author link.
a = soup.find('a', href=re.compile(r"fanfiction.net/u/\d+"))
self.story.setMetadata('authorId',a['href'].split('/')[-1])
self.story.setMetadata('authorUrl','https://www.fanfiction.net/u/'+self.story.getMetadata('authorId'))
self.story.setMetadata('author',a.string)
## author:
autha = soup.find('div',{'class':'StoryContents__meta'}).find('a') # first a in StoryContents__meta
self.story.setMetadata('authorId',autha['href'].split('/')[4])
self.story.setMetadata('authorUrl',autha['href'])
self.story.setMetadata('author',autha.string)
updlab = soup.find('label',string='Last Updated:')
if updlab:
update = updlab.find_next('time')['datetime']
self.story.setMetadata('dateUpdated', makeDate(update, self.dateformat))
publab = soup.find('label',string='Published:')
if publab:
pubdate = publab.find_next('time')['datetime']
self.story.setMetadata('datePublished', makeDate(pubdate, self.dateformat))
## need author page for some metadata.
authsoup = None
authpagea = autha
authstorya = None
## Rating and exact word count don't appear on the summary
## page; try to get them from the author page.
## find story url, might need to spin through author's pages.
while authpagea and not authstorya:
authsoup = self.make_soup(self.get_request(authpagea['href']))
authpagea = authsoup.find('a',{'rel':'next'})
# CSS selectors don't allow : or / unquoted, which
# BS4 (and its dependencies) didn't previously enforce.
authstorya = authsoup.select_one('h4.Story__item-title a[href="%s"]'%self.url)
if not authstorya:
raise exceptions.FailedToDownload("Error finding %s on author page(s)" % self.url)
meta = authstorya.find_parent('li').find('div',class_='Story__meta-info')
meta=meta.text.split()
self.story.setMetadata('numWords',meta[meta.index('words')-1])
self.story.setMetadata('rating',meta[meta.index('Rating:')+1])
# logger.debug(meta)
# Find original ffnet URL
a = soup.find('a', href=re.compile(r"fanfiction.net/s/\d+"))
a = soup.find('a', string="Source")
self.story.setMetadata('origin',stripHTML(a))
self.story.setMetadata('originUrl',a['href'])
# Fleur D. & Harry P. & Hermione G. & Susan B. - Words: 42,848 - Rated: M - English - None - Chapters: 9 - Reviews: 248 - Updated: 21-09-2016 - Published: 16-05-2015 - by Elven Sorcerer (FFN)
# None - Words: 13,087 - Rated: M - English - Romance & Supernatural - Chapters: 3 - Reviews: 5 - Updated: 21-09-2016 - Published: 20-09-2016
# Harry P. & OC - Words: 10,910 - Rated: M - English - None - Chapters: 5 - Reviews: 6 - Updated: 21-09-2016 - Published: 11-09-2016
# Dudley D. & Harry P. & Nagini & Vernon D. - Words: 4,328 - Rated: K+ - English - None - Chapters: 2 - Updated: 21-09-2016 - Published: 20-09-2016 -
details = soup.find('div',{'class':'details'})
detail_re = \
r'(?P<characters>.+) - Words: (?P<numWords>[0-9,]+) - Rated: (?P<rating>[a-zA-Z\\+]+) - (?P<language>.+) - (?P<genre>.+)'+ \
r' - Chapters: (?P<numChapters>[0-9,]+)( - Reviews: (?P<reviews>[0-9,]+))? - Updated: (?P<dateUpdated>[0-9-]+)'+ \
r' - Published: (?P<datePublished>[0-9-]+)(?P<completed> - Complete)?'
details_dict = re.match(detail_re,stripHTML(details)).groupdict()
# lists
for meta in ('characters','genre'):
if details_dict[meta] != 'None':
self.story.extendList(meta,details_dict[meta].split(' & '))
# scalars
for meta in ('numWords','numChapters','rating','language','reviews'):
self.story.setMetadata(meta,details_dict[meta])
# dates
for meta in ('datePublished','dateUpdated'):
self.story.setMetadata(meta, makeDate(details_dict[meta], self.dateformat))
# status
if details_dict['completed']:
datesdiv = soup.find('div',{'class':'dates'})
if stripHTML(datesdiv.find('label')) == 'Completed' : # first label is status.
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
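The sample detail lines in the comments above can be checked against `detail_re` with a standalone sketch (regex copied from the adapter, sample line from the first comment):

```python
import re

detail_re = \
    r'(?P<characters>.+) - Words: (?P<numWords>[0-9,]+) - Rated: (?P<rating>[a-zA-Z\\+]+) - (?P<language>.+) - (?P<genre>.+)'+ \
    r' - Chapters: (?P<numChapters>[0-9,]+)( - Reviews: (?P<reviews>[0-9,]+))? - Updated: (?P<dateUpdated>[0-9-]+)'+ \
    r' - Published: (?P<datePublished>[0-9-]+)(?P<completed> - Complete)?'

sample = ("Fleur D. & Harry P. & Hermione G. & Susan B. - Words: 42,848 - Rated: M"
          " - English - None - Chapters: 9 - Reviews: 248 - Updated: 21-09-2016 - Published: 16-05-2015")
d = re.match(detail_re, sample).groupdict()
print(d['numWords'], d['rating'], d['numChapters'])  # 42,848 M 9
```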
# It's assumed that the number of chapters is correct.
# There's no complete list of chapters, so the only
# alternative is to generate the list from the indicated
# number of chapters instead.
for i in range(1,1+int(self.story.getMetadata('numChapters'))):
self.chapterUrls.append(("Chapter "+unicode(i),"http://"+self.getSiteDomain()\
+"/read/"+self.story.getMetadata('storyId')+"/%s"%i))
for a in soup.select("div.genres a"):
self.story.addToList('genre',stripHTML(a))
for a in soup.select("section.characters li.Tags__item a"):
self.story.addToList('characters',stripHTML(a))
for a in soup.select('a[href*="pairings="]'):
self.story.addToList('ships',stripHTML(a).replace("+","/"))
for a in soup.select('div.Story__type a[href*="fandoms="]'):
# logger.debug(a)
fandomstr=stripHTML(a).replace(' Fanfiction','').strip()
# logger.debug("'%s'"%fandomstr)
## haven't thought of a better way to detect and *not*
## split on fandoms with a '&' in them.
for ampfandom in ampfandoms:
if ampfandom in fandomstr:
self.story.addToList('category',ampfandom)
fandomstr = fandomstr.replace(ampfandom,'')
for fandom in fandomstr.split('&'):
if fandom:
self.story.addToList('category',fandom)
## Currently no 'Original' stories on the site, but does list
## it as a search type. Set extratags: and uncomment this if
## and when.
# if self.story.getList('category'):
# self.story.addToList('category', 'FanFiction')
# else:
# self.story.addToList('category', 'Original')
for chapli in soup.select('ul.StoryContents__chapters li'):
self.add_chapter(stripHTML(chapli.select_one('span.chapter-title')),chapli.select_one('a')['href'])
if self.num_chapters() == 0:
raise exceptions.FailedToDownload("Story at %s has no chapters." % self.url)
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
data = self._fetchUrl(url)
data = self.get_request(url)
soup = self.make_soup(data)
div = soup.find('div', {'class' : 'text'})
div = soup.find('div', {'class' : 'StoryChapter__text'})
return self.utf8FromSoup(url,div)


@@ -0,0 +1,594 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#### Hazel's fiction.live fanficfare adapter
# what an *adventure* this was. fiction.live is an angular web3.0 app that does async background stuff everywhere.
# they're not kidding about it being live.
# can I wrangle its stories into books for offline reading? yes I 98% can!
### won't support, because they aren't part of the text
# chat, threads, chat replies on vote options
### can't support because wtf this is a book
# music / audio embeds
# per-user achievement tracking with fancy achievement-get animations
# story scripting (shows script tags visible in the text, not computed values or input fields)
import re
import json
from datetime import datetime
import itertools
import logging
logger = logging.getLogger(__name__)
# __package__ = 'fanficfare.adapters' # fixes dev issues with unknown package base
from .base_adapter import BaseSiteAdapter
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from ..six import ensure_text
def getClass():
return FictionLiveAdapter
class FictionLiveAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev','flive')
self.story_id = self.parsedUrl.path.split('/')[3]
self.story.setMetadata('storyId', self.story_id)
self.chapter_id_to_api = {}
# normalize URL. omits title in the url
self._setURL("https://fiction.live/stories//{s_id}".format(s_id = self.story_id))
@staticmethod
def getSiteDomain():
return "fiction.live"
@classmethod
def getAcceptDomains(cls):
return ["fiction.live", "beta.fiction.live"] # I still remember anonkun, but the domain has now lapsed
def getSiteURLPattern(self):
# I'd like to thank regex101.com for helping me screw this up less
return r"https?://(beta\.)?fiction\.live/[^/]*/[^/]*/([a-zA-Z0-9\-]+)(/(home)?)?$"
@classmethod
def getSiteExampleURLs(cls):
return ("https://fiction.live/stories/Example-Story-Title/17CharacterIDhere/home "
+"https://fiction.live/stories/Example-Story-With-Long-ID/-20CharacterIDisHere "
+"https://fiction.live/Sci-fi/Example-Story-With-URL-Genre/17CharacterIDhere/ "
+"https://fiction.live/stories/Example-Story-With-UUID/00000000-0000-4000-0000-000000000000/")
@classmethod
def get_section_url(cls,url):
## minimal URL used for section names in INI and reject list
## for comparison
# logger.debug("pre--url:%s"%url)
url = re.sub(r"https?://(beta\.)?fiction\.live/[^/]*/[^/]*/(?P<id>[a-zA-Z0-9\-]+)(/(home)?)?$",r'https://fiction.live/stories//\g<id>',url)
# logger.debug("post-url:%s"%url)
return url
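The normalization in `get_section_url` above can be exercised on one of the hypothetical example URLs (same regex, wrapped in a plain function for illustration):

```python
import re

# Same substitution as get_section_url() above.
def section_url(url):
    return re.sub(
        r"https?://(beta\.)?fiction\.live/[^/]*/[^/]*/(?P<id>[a-zA-Z0-9\-]+)(/(home)?)?$",
        r'https://fiction.live/stories//\g<id>', url)

print(section_url("https://fiction.live/stories/Example-Story-Title/17CharacterIDhere/home"))
# https://fiction.live/stories//17CharacterIDhere
```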
def parse_timestamp(self, timestamp):
# fiction.live date format is unix-epoch milliseconds. not a good fit for fanficfare's makeDate.
# doesn't use a timezone object and returns tz-naive datetimes. I *think* I can leave the rest to fanficfare
return datetime.fromtimestamp(timestamp / 1000.0, None)
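A minimal sketch of the epoch-millisecond conversion above; the adapter returns tz-naive local datetimes, so this variant pins UTC purely to make the output deterministic:

```python
from datetime import datetime, timezone

# fiction.live timestamps are unix-epoch milliseconds.
def parse_timestamp_utc(ms):
    return datetime.fromtimestamp(ms / 1000.0, timezone.utc)

print(parse_timestamp_utc(0).isoformat())  # 1970-01-01T00:00:00+00:00
```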
def img_url_trans(self,imgurl):
"Apparently site changed cdn URLs for images more than once."
# logger.debug("pre--imgurl:%s"%imgurl)
imgurl = re.sub(r'(\w+)\.cloudfront\.net',r'cdn6.fiction.live/file/fictionlive',imgurl)
imgurl = re.sub(r'www\.filepicker\.io/api/file/(\w+)',r'cdn4.fiction.live/fp/\1',imgurl)
imgurl = re.sub(r'cdn[34].fiction.live/(.+)',r'cdn6.fiction.live/file/fictionlive/\1',imgurl)
# logger.debug("post-imgurl:%s"%imgurl)
return imgurl
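The three CDN rewrites above chain in order (a filepicker URL is first moved to cdn4, then caught by the cdn[34] rule); a standalone sketch with hypothetical legacy URLs:

```python
import re

# Same three substitutions as img_url_trans() above, applied in order.
def img_url_trans(imgurl):
    imgurl = re.sub(r'(\w+)\.cloudfront\.net', r'cdn6.fiction.live/file/fictionlive', imgurl)
    imgurl = re.sub(r'www\.filepicker\.io/api/file/(\w+)', r'cdn4.fiction.live/fp/\1', imgurl)
    imgurl = re.sub(r'cdn[34].fiction.live/(.+)', r'cdn6.fiction.live/file/fictionlive/\1', imgurl)
    return imgurl

print(img_url_trans('https://d1abc.cloudfront.net/img.png'))
# https://cdn6.fiction.live/file/fictionlive/img.png
print(img_url_trans('https://www.filepicker.io/api/file/AbC123'))
# https://cdn6.fiction.live/file/fictionlive/fp/AbC123
```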
def doExtractChapterUrlsAndMetadata(self, get_cover=True):
metadata_url = "https://fiction.live/api/node/{s_id}/"
response = self.get_request(metadata_url.format(s_id = self.story_id))
if not response: # this is how fiction.live responds to nonsense urls -- HTTP200 with empty response
raise exceptions.StoryDoesNotExist("Empty response for " + self.url)
data = json.loads(response)
## get metadata for multi route chapters
if 'multiRoute' in data and data['multiRoute'] == True:
route_metadata_url = "https://fiction.live/api/anonkun/routes/{s_id}/"
response = self.get_request(route_metadata_url.format(s_id = self.story_id))
if not response: # this is how fiction.live responds to nonsense urls -- HTTP200 with empty response
raise exceptions.StoryDoesNotExist("Empty response for " + self.url)
data["route_metadata"] = json.loads(response)
self.extract_metadata(data, get_cover)
self.add_chapters(data)
def extract_metadata(self, data, get_cover):
# on one hand, we've got nicely-formatted JSON and can just index into the thing we want, no parsing needed.
# on the other, nearly *everything* in this api is optional. found that out the hard way.
# not optional
self.story.setMetadata('title', stripHTML(data['t']))
# stories have ut, rt, ct, and cht. fairly sure that ut = update time and rt = release time.
# ct is 'creation time' and everything in the api has it -- you can create stories and edit before publishing
# cht is *chunktime* -- newest story chunk added.
# ut for update time includes other kinds of update -- threads, chat etc
# ct <= rt <= cht <= ut
self.story.setMetadata("dateUpdated", self.parse_timestamp(data['cht']))
self.story.setMetadata("datePublished", self.parse_timestamp(data['rt']))
self.most_recent_chunk = data['cht'] if 'cht' in data else 9999999999999998
# nearly everything optional from here out
if 'storyStatus' in data:
status_translate = {'active': "In-Progress", 'finished': "Completed"} # fiction.live to fanficfare
status = data['storyStatus']
self.story.setMetadata('status', status_translate.get(status, status.title()))
elif 'complete' in data:
if data['complete'] == True:
self.story.setMetadata('status', "Completed")
else:
self.story.setMetadata('status', "In-Progress")
else:
self.story.setMetadata('status', "In-Progress")
if 'contentRating' in data:
self.story.setMetadata('rating', data['contentRating'])
elif 'tAge' in data:
self.story.setMetadata('rating', data['tAge'])
else:
self.story.setMetadata('rating', "teen")
if 'w' in data: self.story.setMetadata('numWords', data['w'])
if 'likeCount' in data: self.story.setMetadata('likes', data['likeCount'])
if 'rInput' in data: self.story.setMetadata('reader_input', data['rInput'].title())
summary = stripHTML(data['d']) if 'd' in data else ""
firstblock = data['b'].strip() if 'b' in data else ""
self.setDescription(self.url, summary if not firstblock else summary + "\n<br />\n" + firstblock)
tags = data['ta'] if 'ta' in data else []
if (self.story.getMetadataRaw('rating') in {"nsfw", "adult"} or 'smut' in tags) and \
not (self.is_adult or self.getConfig("is_adult")):
raise exceptions.AdultCheckRequired(self.url)
show_spoiler_tags = self.getConfig('show_spoiler_tags')
spoiler_tags = data['spoilerTags'] if 'spoilerTags' in data else []
for tag in tags:
if show_spoiler_tags or not tag in spoiler_tags:
self.story.addToList('tags', tag)
authors = data['u'] # non-optional
if len(authors) > 1:
for author in data['u']:
if '_id' in author and author['n']: # some stories have spurious co-authors (may have been fixed?)
self.story.addToList('author', author['n'])
self.story.addToList('authorUrl', "https://fiction.live/user/" + author['n'] + "/")
self.story.addToList('authorId', author['_id'])
else: # TODO: can avoid this?
author = authors[0]
self.story.setMetadata('author', author['n'])
self.story.setMetadata('authorUrl', "https://fiction.live/user/" + author['n'] + "/")
self.story.setMetadata('authorId', author['_id'])
if 'isLive' in data and data['isLive']:
self.story.setMetadata('live', "Now! (at time of download)")
elif 'nextLive' in data and data['nextLive']:
# formatted to match site, not other fanficfare timestamps
next_live_time = self.parse_timestamp(data['nextLive'])
self.story.setMetadata('live', next_live_time)
show_nsfw_cover_images = self.getConfig('show_nsfw_cover_images')
nsfw_cover = data['nsfwCover'] if 'nsfwCover' in data else False
if get_cover and 'i' in data:
if show_nsfw_cover_images or not nsfw_cover:
coverUrl = data['i'][0]
self.setCoverImage(self.url, coverUrl)
# gonna need these later for adding details to achievement-granting links in the text
try:
self.achievements = data['achievements']['achievements']
except KeyError:
self.achievements = []
def add_chapters(self, data):
## chapter urls are for the api. they return json and aren't user-navigable, nor the same as on the website
chunkrange_url = "https://fiction.live/api/anonkun/chapters/{s_id}/{start}/{end}/"
## api url to get content of a multi route chapter. requires only the route id and no timestamps
route_chunkrange_url = "https://fiction.live/api/anonkun/route/{c_id}/chapters"
def add_chapter_url(title, bounds):
"Adds a chapter url based on the start/end chunk-range timestamps."
start, end = bounds
end -= 1
chapter_url = chunkrange_url.format(s_id = data['_id'], start = start, end = end)
self.add_chapter(title, chapter_url)
def add_route_chapter_url(title, route_id):
"Adds a route chapter url based on the route id."
chapter_url = route_chunkrange_url.format(c_id = route_id)
self.add_chapter(title, chapter_url)
def pair(iterable):
"[1,2,3,4] -> [(1, 2), (2, 3), (3, 4)]"
a, b = itertools.tee(iterable, 2)
next(b, None)
return list(zip(a, b))
def map_chap_ids_to_api(chapter_ids, route_ids, times):
for index, bounds in enumerate(times):
start, end = bounds
end -= 1
chapter_url = chunkrange_url.format(s_id = data['_id'], start = start, end = end)
self.chapter_id_to_api[chapter_ids[index]] = chapter_url
for route_id in route_ids:
chapter_url = route_chunkrange_url.format(c_id = route_id)
self.chapter_id_to_api[route_id] = chapter_url
## first thing to do is separate out the appendices
appendices, maintext, routes = [], [], []
chapters = data['bm'] if 'bm' in data else []
## not all stories use multiple routes. Those that do have a route id and a title for each route
if 'route_metadata' in data and data['route_metadata']:
for r in data['route_metadata']:
# checking if route title even exists or is None, since most things in the api are optional
if 't' in r and r['t'] is not None:
title = r['t']
else:
title = ""
routes.append({"id": r['_id'], "title": title})
for c in chapters:
appendices.append(c) if c['title'].startswith('#special') else maintext.append(c)
## main-text chapter extraction processing. *should* now handle all the edge cases.
## relies on fanficfare ignoring empty chapters!
titles = ["Home"] + [c['title'] for c in maintext]
chapter_ids = ['home'] + [c['id'] for c in maintext]
times = [data['ct']] + [c['ct'] for c in maintext] + [self.most_recent_chunk + 2] # need to be 1 over, and add_url etc does -1
times = pair(times)
if self.getConfig('include_appendices', True): # Add appendices after main text if desired
titles = titles + ["Appendix: " + a['title'][9:] for a in appendices]
chapter_ids = chapter_ids + [a['id'] for a in appendices]
times = times + [(a['ct'], a['ct'] + 2) for a in appendices]
route_ids = [r['id'] for r in routes]
map_chap_ids_to_api(chapter_ids, route_ids, times) # Map chapter ids to API URLs for use when comparing the two
# map() is lazy in py3; nothing runs without the list() call.
list(map(add_chapter_url, titles, times))
for r in routes: # add route at the end, after appendices
route_id = r['id'] # to get route chapter content, the route id is needed, not the timestamp
chapter_title = "Route: " + r['title'] # 'Route: ' at beginning of name, since it's a multiroute chapter
add_route_chapter_url(chapter_title, route_id)
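The chunk-range bounds above come from pairing adjacent timestamps with the nested `pair` helper; the same `itertools.tee` idiom as a standalone sketch:

```python
import itertools

# Adjacent pairs: [1,2,3,4] -> [(1,2),(2,3),(3,4)]
def pair(iterable):
    a, b = itertools.tee(iterable, 2)
    next(b, None)  # advance the second copy by one
    return list(zip(a, b))

print(pair([1, 2, 3, 4]))  # [(1, 2), (2, 3), (3, 4)]
```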
def getChapterText(self, url):
chunk_handler = {
"choice" : self.format_choice,
"readerPost" : self.format_readerposts,
"chapter" : self.format_chapter
}
response = self.get_request(url)
data = json.loads(response)
if data == []:
return ""
# and *now* we can assume there's at least one chunk in the data -- chapters can be totally empty.
# are we trying to read an appendix? check the first chunk to find out.
getting_appendix = len(data) == 1 and 't' in data[0] and data[0]['t'].startswith("#special")
text = ""
for count, chunk in enumerate(data):
# logger.debug(count) # pollutes the debug log, shows which chunk crashed the handler
text += "<div>" # chapter chunks aren't always well-delimited in their contents
# appendix chunks are mixed in with other things
if not getting_appendix and 't' in chunk and chunk['t'].startswith("#special"): # t = title = bookmark
continue
handler = chunk_handler.get(chunk['nt'], self.format_unknown) # nt = node type
text += handler(chunk)
show_timestamps = self.getConfig('show_timestamps')
if show_timestamps and 'ct' in chunk:
#logger.debug("Adding timestamp for chunk...")
timestamp = ensure_text(self.parse_timestamp(chunk['ct']).strftime("%x -- %X"))
text += '<div class="ut">' + timestamp + '</div>'
text += "</div><br />\n"
## soup to repair the most egregious HTML errors.
return self.utf8FromSoup(url,self.make_soup(text))
### everything from here out is chunk data handling.
def format_chapter(self, chunk):
"""Handles any formatting in the chapter body text for text chapters.
In the 'default case' where we're getting boring chapter-chunk body text, just calls utf8fromSoup
and returns the text as is on the website."""
soup = self.make_soup(chunk['b'] if 'b' in chunk else "")
if self.getConfig('legend_spoilers',True):
soup = self.add_spoiler_legends(soup)
if self.achievements:
soup = self.append_achievments(soup)
return str(soup)
def add_spoiler_legends(self, soup):
# find spoiler links and change link-anchor block to legend block
spoilers = soup.find_all('a', class_="tydai-spoiler")
for link_tag in spoilers:
link_tag.name = 'fieldset'
legend = soup.new_tag('legend')
legend.string = "Spoiler"
link_tag.insert(0, legend)
return soup
def fictionlive_normalize(self, string):
# might be able to use this to preserve titles in normalized urls, if the scheme is the same
# BUG: in achievement ids these are all replaced, but I *don't* know that the list is complete.
# should be rare, thankfully. *most* authors don't use any funny characters in the achievement's *ID*
special_chars = "\"\\,.!?+=/[](){}<>_'@#$%^&*~`;:|" # not the hyphen, which is used to represent spaces
return string.lower().replace(" ", "-").translate({ord(x) : None for x in special_chars})
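The normalization above (lowercase, spaces to hyphens, strip listed punctuation) on a hypothetical achievement title:

```python
# Same logic as fictionlive_normalize() above.
def fictionlive_normalize(string):
    special_chars = "\"\\,.!?+=/[](){}<>_'@#$%^&*~`;:|"  # not the hyphen
    return string.lower().replace(" ", "-").translate({ord(x): None for x in special_chars})

print(fictionlive_normalize("My Cool Achievement!"))  # my-cool-achievement
```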
def append_achievments(self, soup):
# achievements are present in the text as a kind of link, and you get the shiny popup by clicking them.
achievement_links = soup.find_all('a', class_="tydai-achievement")
achieved_ids = []
for link_tag in achievement_links:
# these are not only prepended by a unicode lightning-bolt, but also format clearly as a link
# should use .u css selector -- part of output_css defaults? or just let replace_tags_with_spans do it?
new_u = soup.new_tag('u')
new_u.string = link_tag.text # copy out the link text into a new element
# html entities for improved compatibility with AZW3 conversion
link_tag.string = "&#x26A1;" # then overwrite
link_tag.insert(1, new_u)
## while we've got the achievement links, get the ids from the link
a_id = link_tag['data-id']
a_id = self.fictionlive_normalize(a_id)
achieved_ids.append(a_id)
if achieved_ids:
logger.debug("achievements (this chunk): " + ", ".join(achieved_ids))
# can't replicate the animated shiny announcement popup, so have an end-of-chunk announcement instead
# TODO: achievement images -- does anyone use them?
a_source = "<br />\n<fieldset><legend>&#x26A1; Achievement obtained!</legend>\n<h4>{}</h4>\n{}</fieldset>\n"
for a_id in achieved_ids:
if a_id in self.achievements:
a_title = self.achievements[a_id]['t'] if 't' in self.achievements[a_id] else a_id.title()
a_text = self.achievements[a_id]['d'] if 'd' in self.achievements[a_id] else ""
soup.append(self.make_soup(a_source.format(a_title, a_text)))
else:
a_title = a_id.title()
error = "<br />\n<fieldset><legend>Error: Achievement not found.</legend>Couldn't find '{}'. Ask the story author to check if the achievement exists."
soup.append(self.make_soup(error.format(a_title)))
return soup
def count_votes(self, chunk):
"""So, fiction.live's api doesn't return the counted votes you see on the website.
After all, it needs to allow for things like revoking a vote,
with the count live and updated in realtime on your client.
So instead we get the raw vote-data, but have to count it ourselves."""
# optional.
choices = chunk['choices'] if 'choices' in chunk else []
def counter(votes):
output = [0] * len(choices)
for vote in votes.values():
## votes are either a single option-index or a list of option-indices, depending on the choice type
if 'multiple' in chunk and chunk['multiple'] == False:
vote = [vote] # normalize to list
for v in vote:
# v should only be int, but there is at least one story where some unrelated string was returned,
# so let's just ignore non-int values here
if not isinstance(v, int):
continue
if 0 <= v < len(choices):
output[v] += 1
return output
# I believe that verified is always a subset of all votes, but that's not enforced here
total_votes = counter(chunk['votes'] if 'votes' in chunk else {})
verified_votes = counter(chunk['userVotes'] if 'userVotes' in chunk else {})
# Choices can link to route chapters, where the index of the choice in list 'choices' is a key in the
# 'routes' dict and the dict value is the route id.
# That route id is needed for the url to create the internal link from the choice to the route chapter.
routes = chunk['routes'] if 'routes' in chunk else {}
if choices and len(routes) > 0:
altered_choices = []
for i, choice in enumerate(choices):
choice_index = str(i)
if choice_index in routes.keys():
route_chunkrange_url = "https://fiction.live/api/anonkun/route/{c_id}/chapters"
route_url = route_chunkrange_url.format(c_id=routes[choice_index])
choice_link = "<a data-orighref='" + route_url + "' >" + choice + "</a>"
altered_choices.append(choice_link)
else:
altered_choices.append(choice)
choices = altered_choices
return zip(choices, verified_votes, total_votes)
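A simplified mirror of the counter above, for illustration: votes are keyed by user id and are either a single option index (`multiple` off) or a list of indices, with non-int and out-of-range values ignored:

```python
# Tally raw vote data into per-choice counts, like count_votes() above.
def count(votes, n_choices, multiple=True):
    out = [0] * n_choices
    for vote in votes.values():
        if not multiple:
            vote = [vote]  # normalize single index to a list
        for v in vote:
            if isinstance(v, int) and 0 <= v < n_choices:
                out[v] += 1
    return out

print(count({'u1': [0], 'u2': [0, 1], 'u3': [1]}, 2))  # [2, 2]
```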
def format_choice(self, chunk):
options = self.count_votes(chunk)
# crossed-out writeins. authors can censor user-written choices, and (optionally) offer a reason.
x_outs = [int(x) for x in chunk['xOut']] if 'xOut' in chunk else []
x_reasons = chunk['xOutReasons'] if 'xOutReasons' in chunk else {}
closed = "closed" if 'closed' in chunk else "open" # BUG: check on reopened votes
num_voters = len(chunk['votes']) if 'votes' in chunk else 0
vote_title = chunk['b'] if 'b' in chunk else "Choices"
output = ""
# start with the header
output += u"<h4><span>" + vote_title + " — <small>Voting " + closed
output += u" — " + str(num_voters) + " voters</small></span></h4>\n"
# we've got everything needed to build the html for our vote table.
output += "<table class=\"voteblock\">\n"
# filter out the crossed-out options, which display last
crossed = []
for index, (choice_text, verified_votes, total_votes) in enumerate(options):
if index in x_outs:
crossed.append((index, choice_text, verified_votes, total_votes))
else:
output += "<tr class=\"choiceitem\"><td>" + str(choice_text) + "</td><td class=\"votecount\">"
if verified_votes > 0:
output += "" + str(verified_votes) + "/"
output += str(total_votes)+ " </td></tr>\n"
# crossed out options are: displayed last, struck through, smaller, with the reason below, and no vote count.
# also greyed out, but that's a bit much.
for index, choice_text, _, _ in crossed:
if choice_text == "permanentlyRemoved":
continue
else:
x_reason = x_reasons[str(index)] if str(index) in x_reasons else ""
output += "<tr class=\"choiceitem\"><td colspan=\"2\"><small><strike>" \
+ str(choice_text) + "</strike><br>" + str(x_reason) + "</small></td></tr>"
output += "</table>\n"
return output
def format_readerposts(self, chunk):
closed = "Closed" if 'closed' in chunk else "Open"
posts = chunk['votes'] if 'votes' in chunk else {}
dice = chunk['dice'] if 'dice' in chunk else {}
# now matches the site and does *not* include dicerolls as posts!
num_votes = str(len(posts)) + " posts" if len(posts) != 0 else "be the first to post."
posts_title = chunk['b'] if 'b' in chunk else "Reader Posts"
output = ""
output += u"<h4><span>" + posts_title + " — <small> Posting " + closed
output += u" — " + num_votes + "</small></span></h4>\n"
## so. a voter can roll with their post. these rolls are in a separate dict, but have the **same uid**.
## they're then formatted with the roll above the writein for that user.
## I *think* that formatting roll-only before writein-only posts is correct, but tbh, it's hard to tell.
## writeins are usually opened by the author for posts or rolls, not both at once.
## people tend to only mix the two by accident.
if dice != {}:
for uid, roll in dice.items():
output += '<div class="choiceitem">'
if roll: # optional. just because there's a list entry for it doesn't mean it has a value!
output += '<div class="dice">' + str(roll) + '</div>\n'
if uid in posts:
post = posts[uid]
if post:
output += str(post)
del posts[uid] # it's handled here with the roll instead of later
output += '</div>'
for post in posts.values():
if post:
output += '<div class="choiceitem">' + str(post) + '</div>\n'
return output
def normalize_chapterurl(self, url):
if url.startswith(r'https://fiction.live/api/anonkun/chapters'):
return url
pattern = None
if url.startswith(r'https://fiction.live/api/anonkun/route'):
pattern = r"https?://(?:beta\.)?fiction\.live/[^/]*/[^/]*/[a-zA-Z0-9]+/routes/([a-zA-Z0-9]+)"
elif url.startswith(r'https://fiction.live/'):
pattern = r"https?://(?:beta\.)?fiction\.live/[^/]*/[^/]*/[a-zA-Z0-9]+/[^/]*(/[a-zA-Z0-9]+|home)"
# regex101 rocks
if not pattern:
return url
match = re.match(pattern, url)
if not match:
return url
chapter_id = match.group(1)
if chapter_id.startswith('/'):
chapter_id = chapter_id[1:]
if chapter_id and chapter_id in self.chapter_id_to_api:
return self.chapter_id_to_api[chapter_id]
return url
def format_unknown(self, chunk):
raise NotImplementedError("Unknown chunk type ({}) in fiction.live story.".format(chunk))
# in future, I'd like to handle audio embeds somehow. but they're not available to add to stories right now.
# pretty sure they'll just format as a link (with a special tydai-audio class) and should be easier than achievements
# TODO: verify that show_timestamps is working, check times!
# TODO: find a story that uses achievement images and implement them?
### known bugs:
# TODO: support chapter urls for single-chapter / chapter-range downloads
# complicated -- urls for getChapterText are API urls generated by add_chapters, not the public/website ones
# in particular, may need more API reversing to figure out how to get the *end* of the chunk range
# find in 'bm' in the metadata?


@@ -1,8 +1,12 @@
from __future__ import absolute_import
import re
import urllib2
import urlparse
import logging
logger = logging.getLogger(__name__)
# py2 vs py3 transition
from ..six import text_type as unicode
from ..six.moves.urllib import parse as urlparse
from base_adapter import BaseSiteAdapter, makeDate
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
@@ -19,7 +23,7 @@ class FictionManiaTVAdapter(BaseSiteAdapter):
SITE_ABBREVIATION = 'fmt'
SITE_DOMAIN = 'fictionmania.tv'
BASE_URL = 'http://' + SITE_DOMAIN + '/stories/'
BASE_URL = 'https://' + SITE_DOMAIN + '/stories/'
READ_TEXT_STORY_URL_TEMPLATE = BASE_URL + 'readtextstory.html?storyID=%s'
DETAILS_URL_TEMPLATE = BASE_URL + 'details.html?storyID=%s'
@@ -36,23 +40,6 @@ class FictionManiaTVAdapter(BaseSiteAdapter):
self._setURL(self.READ_TEXT_STORY_URL_TEMPLATE % story_id)
self.story.setMetadata('siteabbrev', self.SITE_ABBREVIATION)
# Always single chapters, probably should use the Anthology feature to
# merge chapters of a story
self.story.setMetadata('numChapters', 1)
def _customized_fetch_url(self, url, exception=None, parameters=None):
if exception:
try:
data = self._fetchUrl(url, parameters)
except urllib2.HTTPError:
raise exception(self.url)
# Just let self._fetchUrl throw the exception, don't catch and
# customize it.
else:
data = self._fetchUrl(url, parameters)
return self.make_soup(data)
@staticmethod
def getSiteDomain():
return FictionManiaTVAdapter.SITE_DOMAIN
@@ -62,11 +49,11 @@ class FictionManiaTVAdapter(BaseSiteAdapter):
return cls.READ_TEXT_STORY_URL_TEMPLATE % 1234
def getSiteURLPattern(self):
return 'https?' + re.escape(self.BASE_URL[len('http'):]) + '(readtextstory|readxstory|details)\.html\?storyID=\d+$'
return r'https?' + re.escape(self.BASE_URL[len('https'):]) + r'(readtextstory|readhtmlstory|readxstory|details)\.html\?storyID=\d+$'
def extractChapterUrlsAndMetadata(self):
url = self.DETAILS_URL_TEMPLATE % self.story.getMetadata('storyId')
soup = self._customized_fetch_url(url)
soup = self.make_soup(self.get_request(url))
keep_summary_html = self.getConfig('keep_summary_html')
for row in soup.find('table')('tr'):
@@ -79,7 +66,7 @@ class FictionManiaTVAdapter(BaseSiteAdapter):
if key == 'Title':
self.story.setMetadata('title', value)
self.chapterUrls.append((value, self.url))
self.add_chapter(value, self.url)
elif key == 'File Name':
self.story.setMetadata('fileName', value)
@@ -119,7 +106,7 @@ class FictionManiaTVAdapter(BaseSiteAdapter):
self.story.setMetadata('rating', value)
elif key == 'Complete':
self.story.setMetadata('status', 'Completed' if value == 'Complete' else 'In-Progress')
self.story.setMetadata('status', 'Completed' if value == 'yes' else 'In-Progress')
elif key == 'Categories':
for element in cells[1]('a'):
@@ -149,20 +136,78 @@ class FictionManiaTVAdapter(BaseSiteAdapter):
self.story.setMetadata('readings', value)
def getChapterText(self, url):
soup = self._customized_fetch_url(url)
element = soup.find('pre')
element.name = 'div'
if self.getConfig("download_text_version",False):
soup = self.make_soup(self.get_request(url))
element = soup.find('pre')
element.name = 'div'
# The story's content is contained in a <pre> tag, probably taken 1:1
# from the source text file. A simple replacement of all newline
# characters with a break line tag should take care of formatting.
# The story's content is contained in a <pre> tag, probably taken 1:1
# from the source text file. A simple replacement of all newline
# characters with a break line tag should take care of formatting.
# While wrapping in paragraphs would be possible, it's too much work,
# I'd rather display the story 1:1 like it was found in the pre tag.
content = unicode(element)
content = content.replace('\n', '<br/>')
# While wrapping in paragraphs would be possible, it's too much work,
# I'd rather display the story 1:1 like it was found in the pre tag.
content = unicode(element)
content = content.replace('\n', '<br/>')
if self.getConfig('non_breaking_spaces'):
return content.replace(' ', '&nbsp;')
if self.getConfig('non_breaking_spaces'):
return content.replace(' ', '&nbsp;')
return content
## Normally, getChapterText should use self.utf8FromSoup(),
## but this is converting from plain(ish) text. -- JM
return content
else:
# try SWI (story with images) version first
# <div style="margin-left:10ex;margin-right:10ex">
## fetching SWI version now instead of text.
htmlurl = url.replace('readtextstory','readhtmlstory')
## Used to find by style, but it's inconsistent now. We've seen:
## margin-left:10ex;margin-right:10ex
## margin-right: 5%; margin-left: 5%
## margin-left:5%; margin-right:5%
## margin-left:5%; margin-right:5%; background: white
## And there's some without a <div> tag (or an unclosed div)
## Only the comments appear to be consistent.
beginmarker='<!--Read or display the file-->'
endmarker='''<hr size=1 noshade>
<!--review add read, top and bottom-->
'''
data = self.get_request(htmlurl)
try:
## if both markers are found, assume whatever is in between
## is the chapter text.
soup = self.make_soup(data[data.index(beginmarker):data.index(endmarker)])
return self.utf8FromSoup(htmlurl,soup)
except Exception as e:
# logger.debug(e)
# logger.debug(soup)
logger.debug("Story With Images(SWI) not found, falling back to HTML.")
## fetching html version now instead of text.
## Note that html and SWI pages are *not* formatted the same.
soup = self.make_soup(self.get_request(url.replace('readtextstory','readxstory')))
# logger.debug(soup)
# remove first hr and everything before
remove = soup.find('hr')
# logger.debug(remove)
for tag in remove.find_previous_siblings():
tag.extract()
remove.extract()
# remove trailing hr, parent tags and everything after.
remove = soup.find('hr',size='1') # <center><hr size=1>
if remove.parent.name == 'center':
## can also be directly in body without <center>
remove = remove.parent
# logger.debug(remove)
for tag in remove.find_next_siblings():
tag.extract()
remove.extract()
content = soup.find('body')
content.name='div'
return self.utf8FromSoup(url,content)
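For reference, the text-mode branch of getChapterText above reduces to a simple string transform once the `<pre>` element is in hand. This sketch uses a hypothetical helper name (`convert_pre_text`) and skips the BeautifulSoup step entirely:

```python
# Standalone sketch of the adapter's plain-text formatting: wrap the
# <pre> contents in a <div>, turn newlines into <br/> tags, and
# optionally turn spaces into &nbsp; to preserve column alignment.
# The function name is illustrative, not part of the adapter.
def convert_pre_text(text, non_breaking_spaces=False):
    content = '<div>%s</div>' % text
    content = content.replace('\n', '<br/>')
    if non_breaking_spaces:
        content = content.replace(' ', '&nbsp;')
    return content
```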


@@ -1,194 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2013 Fanficdownloader team, 2015 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import time
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
import time
import json
#from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
class FictionPadSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev','fpad')
self.dateformat = "%Y-%m-%dT%H:%M:%SZ"
self.is_adult=False
self.username = None
self.password = None
# get storyId from url--url validation guarantees query correct
m = re.match(self.getSiteURLPattern(),url)
if m:
self.story.setMetadata('storyId',m.group('id'))
# normalized story URL.
self._setURL("https://"+self.getSiteDomain()
+"/author/"+m.group('author')
+"/stories/"+self.story.getMetadata('storyId'))
else:
raise exceptions.InvalidStoryURL(url,
self.getSiteDomain(),
self.getSiteExampleURLs())
@staticmethod
def getSiteDomain():
return 'fictionpad.com'
@classmethod
def getSiteExampleURLs(cls):
return "https://fictionpad.com/author/Author/stories/1234/Some-Title"
def getSiteURLPattern(self):
# http://fictionpad.com/author/Serdd/stories/4275
return r"http(s)?://(www\.)?fictionpad\.com/author/(?P<author>[^/]+)/stories/(?P<id>\d+)"
# <form method="post" action="/signin">
# <input name="authenticity_token" type="hidden" value="u+cfdXh46dRnwVnSlmE2B2BFmHgu760paqgBG6KQeos=" />
# <input type="hidden" name="remember" value="1">
# <strong class="help-start text-center">or with FictionPad</strong>
# <label class="control-label hidden-placeholder">Pseudonym or Email Address</label>
# <input name="login" class="input-block-level" type="text" placeholder="Pseudonym or Email Address" maxlength="50" required autofocus>
# <label class="control-label hidden-placeholder">Password</label>
# <input name="password" class="input-block-level" type="password" placeholder="Password" minlength="6" required>
# <button type="submit" class="btn btn-primary btn-block">Sign In</button>
# <p class="help-end">
# <a href="/passwordreset">Forgot your password?</a>
# </p>
# </form>
def performLogin(self):
params = {}
if self.password:
params['login'] = self.username
params['password'] = self.password
else:
params['login'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['remember'] = '1'
loginUrl = 'http://' + self.getSiteDomain() + '/signin'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['login']))
## need to pull empty login page first to get authenticity_token
soup = self.make_soup(self._fetchUrl(loginUrl))
params['authenticity_token']=soup.find('input', {'name':'authenticity_token'})['value']
data = self._postUrl(loginUrl, params)
if "Invalid email/pseudonym and password combination." in data:
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['login']))
raise exceptions.FailedToLogin(loginUrl,params['login'])
def extractChapterUrlsAndMetadata(self):
# fetch the chapter. From that we will get almost all the
# metadata and chapter list
url=self.url
logger.debug("URL: "+url)
try:
data = self._fetchUrl(url)
if "This is a mature story. Please sign in to read it." in data:
self.performLogin()
data = self._fetchUrl(url)
find = "wordyarn.config.page = "
data = data[data.index(find)+len(find):]
data = data[:data.index("</script>")]
data = data[:data.rindex(";")]
data = data.replace('tables:','"tables":')
tables = json.loads(data)['tables']
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(url)
else:
raise e
# looks like only one author per story allowed.
author = tables['users'][0]
story = tables['stories'][0]
story_ver = tables['story_versions'][0]
logger.debug("story:%s"%story)
self.story.setMetadata('authorId',author['id'])
self.story.setMetadata('author',author['display_name'])
self.story.setMetadata('authorUrl','https://'+self.host+'/author/'+author['display_name']+'/stories')
self.story.setMetadata('title',story_ver['title'])
self.setDescription(url,story_ver['description'])
if not ('assets/story_versions/covers' in story_ver['profile_image_url@2x']):
self.setCoverImage(url,story_ver['profile_image_url@2x'])
self.story.setMetadata('datePublished',makeDate(story['published_at'], self.dateformat))
self.story.setMetadata('dateUpdated',makeDate(story['published_at'], self.dateformat))
self.story.setMetadata('followers',story['followers_count'])
self.story.setMetadata('comments',story['comments_count'])
self.story.setMetadata('views',story['views_count'])
self.story.setMetadata('likes',int(story['likes'])) # no idea why they floated these.
if 'dislikes' in story:
self.story.setMetadata('dislikes',int(story['dislikes']))
if story_ver['is_complete']:
self.story.setMetadata('status', 'Completed')
else:
self.story.setMetadata('status', 'In-Progress')
self.story.setMetadata('rating', story_ver['maturity_level'])
self.story.setMetadata('numWords', unicode(story_ver['word_count']))
for i in tables['fandoms']:
self.story.addToList('category',i['name'])
for i in tables['genres']:
self.story.addToList('genre',i['name'])
for i in tables['characters']:
self.story.addToList('characters',i['name'])
for c in tables['chapters']:
chtitle = "Chapter %d"%c['number']
if c['title']:
chtitle += " - %s"%c['title']
self.chapterUrls.append((chtitle,c['body_url']))
self.story.setMetadata('numChapters',len(self.chapterUrls))
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
if not url:
data = u"<em>This chapter has no text.</em>"
else:
data = self._fetchUrl(url)
soup = self.make_soup(u"<div id='story'>"+data+u"</div>")
return self.utf8FromSoup(url,soup)
def getClass():
return FictionPadSiteAdapter
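The deleted FictionPad adapter above pulled all of its metadata from a JSON blob assigned to `wordyarn.config.page` inside a `<script>` tag. The slicing logic can be sketched standalone (the helper name is illustrative, and the sample HTML is a made-up minimal page):

```python
import json

# Sketch of the script-blob extraction used by the removed adapter:
# find the assignment marker, cut at the closing </script>, strip the
# trailing semicolon, then quote the bare 'tables' key so the blob
# parses as strict JSON.
def extract_embedded_json(html):
    find = "wordyarn.config.page = "
    data = html[html.index(find) + len(find):]
    data = data[:data.index("</script>")]
    data = data[:data.rindex(";")]
    data = data.replace('tables:', '"tables":')
    return json.loads(data)['tables']
```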


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,15 +15,15 @@
# limitations under the License.
#
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
import time
# py2 vs py3 transition
## They're from the same people and pretty much identical.
from adapter_fanfictionnet import FanFictionNetSiteAdapter
from .adapter_fanfictionnet import FanFictionNetSiteAdapter
class FictionPressComSiteAdapter(FanFictionNetSiteAdapter):
@@ -43,8 +43,15 @@ class FictionPressComSiteAdapter(FanFictionNetSiteAdapter):
def getSiteExampleURLs(cls):
return "https://www.fictionpress.com/s/1234/1/ https://www.fictionpress.com/s/1234/12/ http://www.fictionpress.com/s/1234/1/Story_Title http://m.fictionpress.com/s/1234/1/"
def getSiteURLPattern(self):
return r"https?://(www|m)?\.fictionpress\.com/s/\d+(/\d+)?(/|/[a-zA-Z0-9_-]+)?/?$"
@classmethod
def _get_site_url_pattern(cls):
return r"https?://(www|m)?\.fictionpress\.com/s/(?P<id>\d+)(/\d+)?(/(?P<title>[^/]+))?/?$"
## normalized chapter URLs DO contain the story title now, but
## normalized to current urltitle in case of title changes.
def normalize_chapterurl(self,url):
return re.sub(r"https?://(www|m)\.(?P<keep>fictionpress\.com/s/\d+/\d+/).*",
r"https://www.\g<keep>",url)+self.urltitle
def getClass():
return FictionPressComSiteAdapter
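The new `normalize_chapterurl` above can be exercised in isolation. This sketch passes the current title slug in explicitly rather than reading `self.urltitle` (the parameter name is an assumption for the standalone version):

```python
import re

# Sketch of the chapter-URL normalization: collapse any scheme/host
# variant (http/https, www/m) to https://www., keep the story and
# chapter numbers, and append the current title slug so renamed
# stories still normalize consistently.
def normalize_chapter_url(url, urltitle):
    return re.sub(r"https?://(www|m)\.(?P<keep>fictionpress\.com/s/\d+/\d+/).*",
                  r"https://www.\g<keep>", url) + urltitle
```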


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,18 +15,18 @@
# limitations under the License.
#
import time
from __future__ import absolute_import
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
import time
import httplib, urllib
from .. import exceptions as exceptions
from ..htmlcleanup import stripHTML
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from .base_adapter import BaseSiteAdapter, makeDate
class FicwadComSiteAdapter(BaseSiteAdapter):
@@ -46,10 +46,10 @@ class FicwadComSiteAdapter(BaseSiteAdapter):
@classmethod
def getSiteExampleURLs(cls):
return "http://ficwad.com/story/1234"
return "https://ficwad.com/story/1234"
def getSiteURLPattern(self):
return re.escape(r"http://"+self.getSiteDomain())+"/story/\d+?$"
return r"https?:"+re.escape(r"//"+self.getSiteDomain())+r"/story/\d+?$"
def performLogin(self,url):
params = {}
@@ -61,12 +61,13 @@ class FicwadComSiteAdapter(BaseSiteAdapter):
params['username'] = self.getConfig("username")
params['password'] = self.getConfig("password")
loginUrl = 'http://' + self.getSiteDomain() + '/account/login'
loginUrl = 'https://' + self.getSiteDomain() + '/account/login'
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
params['username']))
d = self._postUrl(loginUrl,params,usecache=False)
d = self.post_request(loginUrl,params,usecache=False)
if "Login attempt failed..." in d:
if "Login attempt failed..." in d or \
'<div id="error">Please enter your username and password.</div>' in d:
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['username']))
raise exceptions.FailedToLogin(url,params['username'])
@@ -74,13 +75,6 @@ class FicwadComSiteAdapter(BaseSiteAdapter):
else:
return True
def use_pagecache(self):
'''
adapters that will work with the page cache need to implement
this and change it to True.
'''
return True
def extractChapterUrlsAndMetadata(self):
# fetch the chapter. From that we will get almost all the
@@ -89,58 +83,45 @@ class FicwadComSiteAdapter(BaseSiteAdapter):
url = self.url
logger.debug("URL: "+url)
# use BeautifulSoup HTML parser to make everything easier to find.
try:
data = self._fetchUrl(url)
# non-existent/removed story urls get thrown to the front page.
if "<h4>Featured Story</h4>" in data:
raise exceptions.StoryDoesNotExist(self.url)
soup = self.make_soup(data)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
# non-existent/removed story urls get thrown to the front page.
if "<h4>Featured Story</h4>" in data:
raise exceptions.StoryDoesNotExist(self.url)
soup = self.make_soup(data)
# if blocked, attempt login.
if soup.find("div",{"class":"blocked"}) or soup.find("li",{"class":"blocked"}):
if self.performLogin(url): # performLogin raises
# FailedToLogin if it fails.
soup = self.make_soup(self._fetchUrl(url,usecache=False))
soup = self.make_soup(self.get_request(url,usecache=False))
divstory = soup.find('div',id='story')
storya = divstory.find('a',href=re.compile("^/story/\d+$"))
storya = divstory.find('a',href=re.compile(r"^/story/\d+$"))
if storya : # if there's a story link in the divstory header, this is a chapter page.
# normalize story URL on chapter list.
self.story.setMetadata('storyId',storya['href'].split('/',)[2])
url = "http://"+self.getSiteDomain()+storya['href']
url = "https://"+self.getSiteDomain()+storya['href']
logger.debug("Normalizing to URL: "+url)
self._setURL(url)
try:
soup = self.make_soup(self._fetchUrl(url))
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
soup = self.make_soup(self.get_request(url))
# if blocked, attempt login.
if soup.find("div",{"class":"blocked"}) or soup.find("li",{"class":"blocked"}):
if self.performLogin(url): # performLogin raises
# FailedToLogin if it fails.
soup = self.make_soup(self._fetchUrl(url,usecache=False))
soup = self.make_soup(self.get_request(url,usecache=False))
# title - first h4 tag will be title.
titleh4 = soup.find('div',{'class':'storylist'}).find('h4')
self.story.setMetadata('title', stripHTML(titleh4.a))
if 'Deleted story' in self.story.getMetadata('title'):
if 'Deleted story' in self.story.getMetadataRaw('title'):
raise exceptions.StoryDoesNotExist("This story was deleted. %s"%self.url)
# Find authorid and URL from... author url.
a = soup.find('span',{'class':'author'}).find('a', href=re.compile(r"^/a/"))
self.story.setMetadata('authorId',a['href'].split('/')[2])
self.story.setMetadata('authorUrl','http://'+self.host+a['href'])
self.story.setMetadata('authorUrl','https://'+self.host+a['href'])
self.story.setMetadata('author',a.string)
# description
@@ -149,14 +130,14 @@ class FicwadComSiteAdapter(BaseSiteAdapter):
#self.story.setMetadata('description', storydiv.find("blockquote",{'class':'summary'}).p.string)
# most of the meta data is here:
metap = storydiv.find("p",{"class":"meta"})
metap = storydiv.find("div",{"class":"meta"})
self.story.addToList('category',metap.find("a",href=re.compile(r"^/category/\d+")).string)
# warnings
# <span class="req"><a href="/help/38" title="Medium Spoilers">[!!] </a> <a href="/help/38" title="Rape/Sexual Violence">[R] </a> <a href="/help/38" title="Violence">[V] </a> <a href="/help/38" title="Child/Underage Sex">[Y] </a></span>
spanreq = metap.find("span",{"class":"story-warnings"})
if spanreq: # can be no warnings.
for a in spanreq.findAll("a"):
for a in spanreq.find_all("a"):
self.story.addToList('warnings',a['title'])
## perhaps not the most efficient way to parse this, using
@@ -168,7 +149,9 @@ class FicwadComSiteAdapter(BaseSiteAdapter):
if m:
self.story.setMetadata('rating', m.group(1))
m = re.match(r".*?Genres: (.+?) -.*?",metastr)
## Genre appears even if list is empty. But there are a
## limited number of genres allowed by the site.
m = re.match(r".*?Genres: ((?:(?:Angst|Crossover|Drama|Erotica|Fantasy|Horror|Humor|Parody|Romance|Sci-fi)(?:,)?)+) -.*?",metastr)
if m:
for g in m.group(1).split(','):
self.story.addToList('genre',g)
@@ -202,27 +185,24 @@ class FicwadComSiteAdapter(BaseSiteAdapter):
storylistul = soup.find('ul',{'class':'storylist'})
if not storylistul:
# no list found, so it's a one-chapter story.
self.chapterUrls.append((self.story.getMetadata('title'),url))
self.add_chapter(self.story.getMetadata('title'),url)
else:
chapterlistlis = storylistul.findAll('li')
chapterlistlis = storylistul.find_all('li')
for chapterli in chapterlistlis:
if "blocked" in chapterli['class']:
# paranoia check. We should already be logged in by now.
raise exceptions.FailedToLogin(url,self.username)
else:
#print "chapterli.h4.a (%s)"%chapterli.h4.a
self.chapterUrls.append((chapterli.h4.a.string,
u'http://%s%s'%(self.getSiteDomain(),
chapterli.h4.a['href'])))
#print "self.chapterUrls:%s"%self.chapterUrls
self.story.setMetadata('numChapters',len(self.chapterUrls))
self.add_chapter(chapterli.h4.a.string,
u'https://%s%s'%(self.getSiteDomain(),
chapterli.h4.a['href']))
return
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
soup = self.make_soup(self._fetchUrl(url))
soup = self.make_soup(self.get_request(url))
span = soup.find('div', {'id' : 'storytext'})
@@ -233,4 +213,3 @@ class FicwadComSiteAdapter(BaseSiteAdapter):
def getClass():
return FicwadComSiteAdapter
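The tightened genre regex in the Ficwad diff above (the `Genres:` label appears even when the list is empty, so the match is restricted to the site's fixed genre vocabulary) can be tried on its own. `parse_genres` is an illustrative name; the alternation is the one hard-coded in the diff:

```python
import re

# The site's fixed set of allowed genres, as hard-coded in the diff.
ALLOWED = "Angst|Crossover|Drama|Erotica|Fantasy|Horror|Humor|Parody|Romance|Sci-fi"

# Sketch of the genre parse: only accept runs of known genre names
# between "Genres: " and the next " -" separator, so an empty
# "Genres:" label yields no genres instead of junk.
def parse_genres(metastr):
    m = re.match(r".*?Genres: ((?:(?:%s)(?:,)?)+) -.*?" % ALLOWED, metastr)
    return m.group(1).split(',') if m else []
```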


@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2020 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,20 +15,22 @@
# limitations under the License.
#
from __future__ import absolute_import
import time
from datetime import date
from datetime import timedelta
from datetime import date, datetime
import logging
logger = logging.getLogger(__name__)
import re
import urllib2
import cookielib as cl
import json
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions
from base_adapter import BaseSiteAdapter, makeDate
# py2 vs py3 transition
from ..six import text_type as unicode
from ..six.moves import http_cookiejar as cl
from .base_adapter import BaseSiteAdapter, makeDate
def getClass():
return FimFictionNetSiteAdapter
@@ -39,11 +41,12 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev','fimficnet')
self.story.setMetadata('storyId', self.parsedUrl.path.split('/',)[2])
self._setURL("http://"+self.getSiteDomain()+"/story/"+self.story.getMetadata('storyId')+"/")
self._setURL("https://"+self.getSiteDomain()+"/story/"+self.story.getMetadata('storyId')+"/")
self.is_adult = False
# The date format will vary from site to site.
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
# FYI, not the only format used in this file.
self.dateformat = "%d %b %Y"
@staticmethod
@@ -57,18 +60,11 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
@classmethod
def getSiteExampleURLs(cls):
return "http://www.fimfiction.net/story/1234/story-title-here http://www.fimfiction.net/story/1234/ http://www.fimfiction.com/story/1234/1/ http://mobile.fimfiction.net/story/1234/1/story-title-here/chapter-title-here"
return "https://www.fimfiction.net/story/1234/story-title-here https://www.fimfiction.net/story/1234/ https://www.fimfiction.com/story/1234/1/ https://mobile.fimfiction.net/story/1234/1/story-title-here/chapter-title-here"
def getSiteURLPattern(self):
return r"https?://(www|mobile)\.fimfiction\.(net|com)/story/\d+/?.*"
def use_pagecache(self):
'''
adapters that will work with the page cache need to implement
this and change it to True.
'''
return True
def set_adult_cookie(self):
cookie = cl.Cookie(version=0, name='view_mature', value='true',
port=None, port_specified=False,
@@ -81,27 +77,57 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
comment_url=None,
rest={'HttpOnly': None},
rfc2109=False)
self.get_cookiejar().set_cookie(cookie)
self.get_configuration().get_cookiejar().set_cookie(cookie)
def performLogin(self, url):
params = {}
if self.password:
params['username'] = self.username
params['password'] = self.password
else:
params['username'] = self.getConfig("username")
params['password'] = self.getConfig("password")
params['keep_logged_in'] = '1'
if params['username'] and params['password']:
loginUrl = 'https://' + self.getSiteDomain() + '/ajax/login'
logger.info("Will now login to URL (%s) as (%s)" % (loginUrl,
params['username']))
d = self.post_request(loginUrl, params)
if "signing_key" not in d :
logger.info("Failed to login to URL %s as %s" % (loginUrl,
params['username']))
raise exceptions.FailedToLogin(url,params['username'])
def make_soup(self,data):
soup = super(FimFictionNetSiteAdapter, self).make_soup(data)
for img in soup.select('img.lazy-img, img.user_image'):
## FimF has started a 'camo' mechanism for images that
## gets blocked by CF. attr data-source is original source.
if img.has_attr('data-source'):
img['src'] = img['data-source']
elif img.has_attr('data-src'):
img['src'] = img['data-src']
return soup
def doExtractChapterUrlsAndMetadata(self,get_cover=True):
if self.is_adult or self.getConfig("is_adult"):
self.set_adult_cookie()
## Only needed with password protected stories, which you have
## to have logged into in the website using this account.
if self.getConfig("always_login"):
self.performLogin(self.url)
##---------------------------------------------------------------------------------------------------
## Get the story's title page. Check if it exists.
try:
# don't use cache if manual is_adult--should only happen
# if it's an adult story and they don't have is_adult in ini.
data = self.do_fix_blockquotes(self._fetchUrl(self.url,
usecache=(not self.is_adult)))
soup = self.make_soup(data)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
# don't use cache if manual is_adult--should only happen
# if it's an adult story and they don't have is_adult in ini.
data = self.do_fix_blockquotes(self.get_request(self.url,
usecache=(not self.is_adult)))
soup = self.make_soup(data)
if "Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource" in data:
raise exceptions.StoryDoesNotExist(self.url)
@@ -109,18 +135,6 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
if "This story has been marked as having adult content. Please click below to confirm you are of legal age to view adult material in your country." in data:
raise exceptions.AdultCheckRequired(self.url)
if self.password:
params = {}
params['password'] = self.password
data = self._postUrl(self.url, params)
soup = self.make_soup(data)
if not (soup.find('form', {'id' : 'password_form'}) == None):
if self.getConfig('fail_on_password'):
raise exceptions.FailedToDownload("%s requires story password and fail_on_password is true."%self.url)
else:
raise exceptions.FailedToLogin(self.url,"Story requires individual password",passwdonly=True)
##----------------------------------------------------------------------------------------------------
## Extract metadata
@@ -131,11 +145,14 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
self.story.setMetadata('title',stripHTML(title))
# Author
author = storyContentBox.find('div', {'class':'author'}).find('a')
author = soup.find('div', {'class':'info-container'}).find('a')
self.story.setMetadata("author", stripHTML(author))
#No longer seems to be a way to access Fimfiction's internal author ID
self.story.setMetadata("authorId", self.story.getMetadata("author"))
self.story.setMetadata("authorUrl", "http://%s/user/%s" % (self.getSiteDomain(), stripHTML(author)))
# /user/288866/Stryker-Shadowpony-Blade
self.story.setMetadata("authorId", author['href'].split('/')[2])
self.story.setMetadata("authorUrl", "https://%s/user/%s/%s" % (self.getSiteDomain(),
self.story.getMetadata('authorId'),
# meta entry author can be changed by the user.
stripHTML(author)))
#Rating text is replaced with full words for historical compatibility after the site changed
#on 2014-10-27
@@ -144,10 +161,9 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
self.story.setMetadata("rating", rating)
# Chapters
for chapter in storyContentBox.find_all('a',{'class':'chapter_link'}):
self.chapterUrls.append((stripHTML(chapter), 'http://'+self.host+chapter['href']))
for chapter in soup.find('ul',{'class':'chapters'}).find_all('a',{'class':'chapter-title'}):
self.add_chapter(chapter, 'https://'+self.host+chapter['href'])
self.story.setMetadata('numChapters',len(self.chapterUrls))
# Status
# In the case of Fimfiction, possible statuses are 'Completed', 'Incomplete', 'On Hiatus' and 'Cancelled'
@@ -158,51 +174,53 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
status = status.replace("Incomplete", "In-Progress").replace("Complete", "Completed")
self.story.setMetadata("status", status)
# Genres and Warnings
# warnings were folded into general categories in the 2014-10-27 site update
categories = storyContentBox.find_all('a', {'class':re.compile(r'.*\bstory_category\b.*')})
for category in categories:
category = stripHTML(category)
if category == "Gore" or category == "Sex":
self.story.addToList('warnings', category)
else:
self.story.addToList('genre', category)
# Word count
wordCountText = stripHTML(storyContentBox.find('li', {'class':'bottom'}).find('div', {'class':'word_count'}))
wordCountText = stripHTML(storyContentBox.find('div', {'class':'chapters-footer'}).find('div', {'class':'word_count'}))
self.story.setMetadata("numWords", re.sub(r'[^0-9]', '', wordCountText))
# Cover image
storyImage = storyContentBox.find('div', {'class':'story_image'})
if storyImage:
coverurl = storyImage.find('a')['href']
if coverurl.startswith('//'): # fix for img urls missing 'http:'
coverurl = "http:"+coverurl
if get_cover:
# try setting from href, if fails, try using the img src
if self.setCoverImage(self.url,coverurl)[0] == "failedtoload":
img = storyImage.find('img')
# try src, then data-src, then leave None.
coverurl = img.get('src',img.get('data-src',None))
if coverurl:
self.setCoverImage(self.url,coverurl)
if get_cover:
storyImage = soup.select_one('div.story_container__story_image img')
if storyImage:
coverurl = storyImage['data-fullsize']
# try setting from data-fullsize, if fails, try using data-src
cover_set = self.setCoverImage(self.url,coverurl)[0]
if not cover_set or cover_set.startswith("failedtoload"):
coverurl = storyImage['src']
self.setCoverImage(self.url,coverurl)
coverSource = storyImage.find('a', {'class':'source'})
if coverSource:
self.story.setMetadata('coverSourceUrl', coverSource['href'])
#There's no text associated with the cover source link, so just
#reuse the URL. Makes it clear it's an external link leading
#outside of the fanfic site, at least.
self.story.setMetadata('coverSource', coverSource['href'])
coverSource = storyImage.parent.find('a', {'class':'source'})
if coverSource:
self.story.setMetadata('coverSourceUrl', coverSource['href'])
# There's no text associated with the cover source
# link, so just reuse the URL. Makes it clear it's
# an external link leading outside of the fanfic
# site, at least.
self.story.setMetadata('coverSource', coverSource['href'])
# fimf has started including extra stuff inside the description div.
descdivstr = u"%s"%storyContentBox.find("div", {"class":"description"})
hrstr=u"<hr/>"
descdivstr = u'<div class="description">'+descdivstr[descdivstr.index(hrstr)+len(hrstr):]
# specifically, the prequel link
description = storyContentBox.find("span", {"class":"description-text"})
description.name='div' # change to div, technically, spans
# aren't supposed to contain <p>'s.
descdivstr = u"%s"%description # string, but not stripHTML'ed
#The link to the prequel is embedded in the description text, so erring
#on the side of caution and wrapping this whole thing in a try block.
#If anything goes wrong this probably wasn't a valid prequel link.
try:
if "This story is a sequel to" in stripHTML(description):
link = description.find('a') # assume first link.
self.story.setMetadata("prequelUrl", 'https://'+self.host+link["href"])
self.story.setMetadata("prequel", stripHTML(link))
if not self.getConfig('keep_prequel_in_description',False):
hrstr=u"<hr/>"
descdivstr = u'<div class="description">'+descdivstr[descdivstr.index(hrstr)+len(hrstr):]
except:
logger.info("Prequel parsing failed...")
self.setDescription(self.url,descdivstr)
# Find the newest and oldest chapter dates
storyData = storyContentBox.find('div', {'class':'story_data'})
storyData = storyContentBox.find('ul', {'class':'chapters'})
oldestChapter = None
newestChapter = None
self.newestChapterNum = None # save for comparing during update.
@@ -230,7 +248,7 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
# Date published
# falls back to oldest chapter date for stories that haven't been officially published yet
pubdatetag = storyContentBox.find('span', {'class':'date_approved'})
pubdatetag = storyContentBox.find('span', {'class':'approved-date'})
if pubdatetag is None:
if oldestChapter is None:
#this will only be true when updating metadata for stories that have 0 chapters
@@ -240,16 +258,25 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
else:
self.story.setMetadata("datePublished", oldestChapter)
else:
pubDate = self.ordinal_date_string_to_date(pubdatetag('span')[1].text)
pubDate = self.date_span_tag_to_date(pubdatetag)
self.story.setMetadata("datePublished", pubDate)
# Characters
chars = storyContentBox.find("div", {"class":"extra_story_data"})
for character in chars.find_all("a", {"class":"character_icon"}):
self.story.addToList("characters", character['title'])
tags = storyContentBox.find("ul", {"class":"story-tags"})
for character in tags.find_all("a", {"class":"tag-character"}):
self.story.addToList("characters", stripHTML(character))
for genre in tags.find_all("a", {"class":"tag-genre"}):
self.story.addToList("genre", stripHTML(genre))
for series in tags.find_all("a", {"class":"tag-series"}):
#using 'fandoms' as the identifier to standardize with archiveofourown.org
self.story.addToList("fandoms", stripHTML(series))
for warning in tags.find_all("a", {"class":"tag-warning"}):
self.story.addToList("warnings", stripHTML(warning))
for content in tags.find_all("a", {"class":"tag-content"}):
self.story.addToList("content", stripHTML(content))
# Likes and dislikes
storyToolbar = soup.find('div', {'class':'story-toolbar'})
storyToolbar = soup.find('div', {'class':'story-top-toolbar'})
likes = storyToolbar.find('span', {'class':'likes'})
if not likes is None:
self.story.setMetadata("likes", stripHTML(likes))
@ -259,8 +286,9 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
# Highest view for a chapter and total views
viewSpan = storyToolbar.find('span', {'title':re.compile(r'.*\btotal views\b.*')})
self.story.setMetadata("views", re.sub(r'[^0-9]', '', stripHTML(viewSpan)))
self.story.setMetadata("total_views", re.sub(r'[^0-9]', '', viewSpan['title']))
viewResults = re.search(r'([0-9]*) views \/ ([0-9]*)', viewSpan['title'].replace(',',''))
self.story.setMetadata("views", viewResults.group(1))
self.story.setMetadata("total_views", viewResults.group(2))
# Comment count
commentSpan = storyToolbar.find('span', {'title':re.compile(r'.*\bcomments\b.*')})
@ -270,59 +298,68 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
descriptionMeta = soup.find('meta', {'property':'og:description'})
self.story.setMetadata("short_description", stripHTML(descriptionMeta['content']))
#groups
if soup.find('button', {'id':'button-view-all-groups'}):
groupResponse = self._fetchUrl("https://www.fimfiction.net/ajax/stories/%s/groups" % (self.story.getMetadata("storyId")))
groupData = json.loads(groupResponse)
groupList = self.make_soup(groupData["content"])
else:
# groups.
# If there are more than X groups, there's a 'Show all' button
# that calls for a JSON containing HTML with the full list.
# But it doesn't work reliably with FlareSolverr.
groupList = None
groupButton = soup.find('button', {'data-click':'showAll'})
if groupButton != None and groupButton.find('i', {'class':'fa-search-plus'}):
try:
groupResponse = self.get_request("https://www.fimfiction.net/ajax/stories/%s/groups" % (self.story.getMetadata("storyId")))
groupData = json.loads(groupResponse)
groupList = self.make_soup(groupData["content"])
except Exception as e:
logger.warning("Collecting 'groups' (AKA 'Featured In') from JSON failed:%s"%e)
logger.warning("Only 'groups' initially shown on the page will be collected.")
logger.warning("This is a known issue with JSON and FlareSolverr. See #1122")
if not groupList:
groupList = soup.find('ul', {'id':'story-groups-list'})
if not (groupList == None):
for groupName in groupList.find_all('a'):
self.story.addToList("groupsUrl", 'http://'+self.host+groupName["href"])
self.story.addToList("groups",stripHTML(groupName).replace(',', ';'))
if groupList:
for groupContent in groupList.find_all('a'):
self.story.addToList("groupsUrl", 'https://'+self.host+groupContent["href"])
groupName = groupContent.find('span', {"class":"group-name"})
if groupName != None:
self.story.addToList("groups",stripHTML(groupName).replace(',', ';'))
else:
self.story.addToList("groups",stripHTML(groupContent).replace(',', ';'))
#sequels
for header in soup.find_all('h1', {'class':'header-stories'}):
# I don't know why using text=re.compile with find() wouldn't work, but it didn't.
# I don't know why using string=re.compile with find() wouldn't work, but it didn't.
if header.text.startswith('Sequels'):
sequelContainer = header.parent
for sequel in sequelContainer.find_all('a', {'class':'story_link'}):
self.story.addToList("sequelsUrl", 'http://'+self.host+sequel["href"])
self.story.addToList("sequelsUrl", 'https://'+self.host+sequel["href"])
self.story.addToList("sequels", stripHTML(sequel).replace(',', ';'))
#author last login
userPageHeader = soup.find('div', {'class':re.compile(r'\buser-page-header\b')})
userPageHeader = soup.find('div', {'class':'user-page-header'})
if not userPageHeader == None:
infoContainer = userPageHeader.find('div', {'class':re.compile(r'\binfo-container\b')})
infoContainer = userPageHeader.find('ul', {'class':'mini-info-box'})
listItems = infoContainer.find_all('li')
lastLoginString = stripHTML(listItems[1])
lastLogin = None
if "online" in lastLoginString:
lastLogin = date.today()
elif "offline" in lastLoginString:
#this regex extracts the number of weeks and the number of days from the last login string.
#durations under a day are ignored.
#group 1 is weeks, group 2 is days
durationGroups = re.match(r"(?:[^0-9]*(\d+?)w)?[^0-9]*(?:(\d+?)d)?", lastLoginString)
lastLogin = date.today() - timedelta(days=int(durationGroups.group(2) or 0), weeks=int(durationGroups.group(1) or 0))
lastLogin = self.date_span_tag_to_date(listItems[1])
self.story.setMetadata("authorLastLogin", lastLogin)
#The link to the prequel is embedded in the description text, so erring
#on the side of caution and wrapping this whole thing in a try block.
#If anything goes wrong this probably wasn't a valid prequel link.
try:
description = soup.find('div', {'class':'description'})
firstHR = description.find("hr")
nextSib = firstHR.nextSibling
if "This story is a sequel to" in nextSib.string:
link = nextSib.nextSibling
if link.name == "a":
self.story.setMetadata("prequelUrl", 'http://'+self.host+link["href"])
self.story.setMetadata("prequel", stripHTML(link))
except:
pass
def date_span_tag_to_date(self, containingtag):
## <span data-time="1435421997" title="Saturday 27th of June 2015 @4:19pm">Jun 27th, 2015</span>
## No timezone adjustment is done.
span = containingtag.find('span',{'data-time':re.compile(r'^\d+$')})
if span != None:
return datetime.fromtimestamp(float(span['data-time']))
## Sometimes, for reasons that are unclear, data-time is not present. Parse the date out of the title instead.
else:
span = containingtag.find('span', title=True)
dateRegex = re.search('([a-zA-Z ]+)([0-9]+)(st of|th of|nd of|rd of)([a-zA-Z ]+[0-9]+)', span['title'])
dateString = dateRegex.group(2) + dateRegex.group(4)
return makeDate(dateString, "%d %B %Y")
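When data-time is absent, the fallback above rebuilds the date from the human-readable title. A standalone sketch using the example title from the comment; plain strptime stands in for FanFicFare's makeDate helper, and as in the method, no timezone handling is done:

```python
import re
from datetime import datetime

# Example title attribute, taken from the comment in the diff above.
title = "Saturday 27th of June 2015 @4:19pm"
m = re.search(r'([a-zA-Z ]+)([0-9]+)(st of|th of|nd of|rd of)([a-zA-Z ]+[0-9]+)', title)
date_string = m.group(2) + m.group(4)   # day digits + " Month Year"
parsed = datetime.strptime(date_string, "%d %B %Y")
```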
def ordinal_date_string_to_date(self, datestring):
datestripped=re.sub(r"(\d+)(st|nd|rd|th)", r"\1", datestring.strip())
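The re.sub above strips English ordinal suffixes so the remainder can be parsed with a plain date format. A quick sketch with an assumed input string:

```python
import re

# Hypothetical input; "st|nd|rd|th" suffixes after digits are removed.
datestring = "  27th Jun 2015  "
datestripped = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", datestring.strip())
```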
@ -346,21 +383,58 @@ class FimFictionNetSiteAdapter(BaseSiteAdapter):
def getChapterText(self, url):
logger.debug('Getting chapter text from: %s' % url)
data = self._fetchUrl(url)
data = self.get_request(url)
soup = self.make_soup(data)
if not (soup.find('form', {'id' : 'password_form'}) == None):
if self.password:
params = {}
params['password'] = self.password
data = self._postUrl(url, params)
else:
logger.error("Chapter %s needed password but no password was present" % url)
data = self.do_fix_blockquotes(data)
soup = self.make_soup(data).find('div', {'class' : 'chapter_content'})
if soup == None:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
if self.getConfig("include_author_notes",True):
soup = self.make_soup(data).find_all('div', {'class':re.compile(r'(.*\bauthors-note\b.*|.*\bchapter-body\b.*)')})
if soup == None:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
chapter_divs = [unicode(div) for div in soup]
soup = self.make_soup(" ".join(chapter_divs))
else:
soup = self.make_soup(data).find('div', {'id' : 'chapter-body'})
if soup == None:
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
return self.utf8FromSoup(url,soup)
def before_get_urls_from_page(self,url,normalize):
## Unlike most sites, which show links to 'adult' stories but protect
## them, FimF doesn't even show them if not logged in.
# data = self.get_request(url)
if self.getConfig("is_adult"):
self.set_adult_cookie()
def get_urls_from_page(self,url,normalize):
iterate = self.getConfig('scrape_bookshelf', default=False)
if not re.search(r'fimfiction\.net/bookshelf/(?P<listid>.+?)/',url) or iterate == 'legacy':
return super().get_urls_from_page(url,normalize)
self.before_get_urls_from_page(url,normalize)
final_urls = list()
while True:
data = self.get_request(url,usecache=True)
soup = self.make_soup(data)
paginator = soup.select_one('div.paginator-container > div.page_list > ul').find_all('li')
logger.debug("Paginator: " + str(len(paginator)))
stories_container = soup.select_one('div.content > div.two-columns > div.left').find_all('article', recursive=False)
x = 0
logger.debug("Container "+str(len(stories_container)))
for story_raw in stories_container:
x += 1
story_url = story_raw.select_one('div.story_content_box > header.title > div > a.story_name').get('href')
url_story = ('https://' + self.getSiteDomain() + story_url)
#logger.debug(url_story)
final_urls.append(url_story)
logger.debug("Discovered %s new stories."%str(x))
next_button = paginator[-1].select_one('a')
logger.debug("Next button: " + next_button.get_text())
if next_button.get_text() or not iterate:
return {'urllist': final_urls}
url = ('https://' + self.getSiteDomain() + next_button.get('href'))
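The bookshelf loop above follows the paginator's next link page by page, accumulating story URLs. A simplified, self-contained sketch of that control flow; fetch is a hypothetical stand-in for the get_request + soup scraping, and the stopping condition is reduced to "no next URL":

```python
def iterate_bookshelf(fetch, first_url):
    """Collect story URLs across pages.
    fetch(url) -> (story_urls, next_url_or_None) is a hypothetical
    stand-in for fetching and scraping one bookshelf page."""
    final_urls, url = [], first_url
    while url:
        page_urls, url = fetch(url)
        final_urls.extend(page_urls)
    return final_urls
```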


@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2011 Fanficdownloader team, 2015 FanFicFare team
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -17,14 +17,14 @@
####################################################################################################
# Adapted by GComyn - December 10, 2016
####################################################################################################
from __future__ import absolute_import
''' This adapter will download the stories from the www.fireflyfans.net forum pages '''
import logging
import re
import sys
import time
import urllib2
# py2 vs py3 transition
from ..six import text_type as unicode
from base_adapter import BaseSiteAdapter, makeDate
from .base_adapter import BaseSiteAdapter, makeDate
from .. import exceptions as exceptions
from ..htmlcleanup import stripHTML
@ -43,12 +43,6 @@ class FireFlyFansNetSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
self.story.setMetadata('siteabbrev', 'fffans')
self.decode = ["Windows-1252",
"utf8",
"iso-8859-1"] # 1252 is a superset of iso-8859-1.
# Most sites that claim to be
# iso-8859-1 (and some that claim to be
# utf8) are really windows-1252.
self.is_adult = False
# get storyId from url--url validation guarantees query is only
@ -83,19 +77,12 @@ class FireFlyFansNetSiteAdapter(BaseSiteAdapter):
url = self.url
logger.debug("URL: " + url)
try:
data = self._fetchUrl(url)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist(self.url)
else:
raise e
data = self.get_request(url)
if 'Something bad happened, but hell if I know what it is.' in data:
raise exceptions.StoryDoesNotExist(
'{0} says: GORAMIT!!! SOMETHING WENT WRONG! Something bad happened, but hell if I know what it is.'.format(self.url))
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# Title
@ -106,6 +93,9 @@ class FireFlyFansNetSiteAdapter(BaseSiteAdapter):
a = soup.find('a', href=re.compile(r"profileshow.aspx\?u="))
self.story.setMetadata('authorId', a['href'].split('=')[1])
if not self.story.getMetadata('authorId'):
logger.warning("Site authorUrl missing authorId, using SiteMissingAuthorId")
self.story.setMetadata('authorId', 'SiteMissingAuthorId')
self.story.setMetadata('authorUrl', 'http://' +
self.host + '/' + a['href'])
self.story.setMetadata('author', a.string)
@ -114,9 +104,8 @@ class FireFlyFansNetSiteAdapter(BaseSiteAdapter):
# way to determine if there are other chapters to the same story, so you have
# to download them one at a time yourself. I'm also setting the status to
# complete
self.chapterUrls.append((self.story.getMetadata('title'), self.url))
self.story.setMetadata('numChapters', 1)
self.story.setMetadata('status', 'Complete')
self.add_chapter(self.story.getMetadata('title'), self.url)
self.story.setMetadata('status', 'Completed')
## some stories do not have a summary listed, so I'm setting it here.
summary = soup.find('span', {'id': 'MainContent_txtItemDescription'})
@ -137,7 +126,7 @@ class FireFlyFansNetSiteAdapter(BaseSiteAdapter):
# which is usually FireFly on this site, but I'm going to get them
# anyway.
category = soup.find('span', {'id': 'MainContent_txtItemDetails'})
category = stripHTML(str(category).replace(b"\xc2\xa0", ' '))
category = stripHTML(unicode(category).replace(u"\xa0", u' '))
metad = category.split(' ')
for meta in metad:
if ":" in meta:


@ -1,329 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright 2016 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
####################################################################################################
### Adapted by GComyn on December 15, 2016
###=================================================================================================
### I ran this through a linter, and formatted it as per the suggestions, hence some of the lines
### are "chopped"
###=================================================================================================
### I have started to use lines of # on the line just before a function so they are easier to find.
####################################################################################################
''' This adapter scrapes the metadata and chapter text from stories on firefly.populli.org '''
import logging
import re
import urllib2
import sys
from base_adapter import BaseSiteAdapter, makeDate
from .. import exceptions as exceptions
from ..htmlcleanup import stripHTML
logger = logging.getLogger(__name__)
####################################################################################################
def getClass():
return FireflyPopulliOrgSiteAdapter
####################################################################################################
class FireflyPopulliOrgSiteAdapter(BaseSiteAdapter):
def __init__(self, config, url):
BaseSiteAdapter.__init__(self, config, url)
# 1252 is a superset of iso-8859-1.
# Most sites that claim to be iso-8859-1 (and some that claim to be utf8)
# are really windows-1252. I've put the iso-8859-1 in just to cover the bases [GComyn]
self.decode = ["Windows-1252", "utf8", "iso-8859-1"]
self.is_adult = False
# normalized story URL.
m = re.match(self.getSiteURLPattern(),url)
if m:
self.story.setMetadata('storyId',m.group('id'))
# normalized story URL.
self._setURL('http://' + self.getSiteDomain() + '/archive/' +m.group('cat') +
'/' + self.story.getMetadata('storyId') +'.shtml')
else:
raise exceptions.InvalidStoryURL(url,
self.getSiteDomain(),
self.getSiteExampleURLs())
## each adapter needs to have a unique abbreviation, which is set here.
self.story.setMetadata('siteabbrev', 'fga')
# The date format will vary from site to site.
# The below website give the list of variables that can be used to formulate the
# correct format
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
self.dateformat = "%m/%d/%y"
# This site has the entire story on one page, so I am initializing a variable to hold the
# soup so that the getChapterText function doesn't have to use bandwidth to get it again.
self.html = ''
################################################################################################
@staticmethod
def getSiteDomain():
return 'firefly.populli.org'
################################################################################################
@classmethod
def getSiteExampleURLs(cls):
return "http://" + cls.getSiteDomain() + "/archive/#/[storyId].shtml"
################################################################################################
def getSiteURLPattern(self):
return re.escape("http://"+self.getSiteDomain())+r'/archive/(?P<cat>\d+)/(?P<id>\S+)\.shtml'
################################################################################################
def get_page(self, page):
'''
This will download the url from the web and return the data
I'm using it since I call several places below, and this will
cut down on the size of the file
'''
try:
page_data = self._fetchUrl(page)
except urllib2.HTTPError, e:
if e.code == 404:
raise exceptions.StoryDoesNotExist('404 error: {}'.format(page))
else:
raise e
return page_data
################################################################################################
def extractChapterUrlsAndMetadata(self):
url = self.url
logger.debug("URL: " + url)
data = self.get_page(url)
# Since this is a site with the entire story on one page and there are no updates, I'm going
# to set the status to complete.
self.story.setMetadata('status', 'Complete')
# use BeautifulSoup HTML parser to make everything easier to find.
soup = self.make_soup(data)
# Title
## Some stories do not have the title in a tag that can be easily gotten.
title = soup.find('h2')
if not title:
raise exceptions.StoryDoesNotExist('Cannot find title on the page {}'.format(url))
self.story.setMetadata('title', stripHTML(soup.find('h2')))
# This site has the entire story on one page, so we will be using the normalized URL as
# the chapterUrl and the Title as the chapter Title
self.chapterUrls.append((self.story.getMetadata('title'), url))
## i would take this out, as it is not really needed, but the calibre plugin uses it,
## so it's staying
self.story.setMetadata('numChapters', 1)
# Find authorid and URL
## this site does not have dedicated pages for the authors; you have to use the search engine.
## so that is what I will do. Some of the stories have multiple author names separated by
## commas or a colon. I'm going to take the first name as the author name, and use the rest
## as a coauthor site specific tag. I did it this way so we keep all of the information,
## because the author can be used in the filename, and if it's too long windows systems
## won't be able to use it.
mdata = stripHTML(soup.find('a', href=re.compile('mailto')))
if ':' in mdata:
self.story.setMetadata('coauthor', ' '.join(mdata.split(':')[1:]).strip())
mdata = mdata.split(':')[0]
if ',' in mdata:
self.story.setMetadata('coauthor', ', '.join(mdata.split(',')[1:]).strip())
mdata = mdata.split(',')[0]
# print mdata
# self.story.getMetadata('coauthor')
# sys.exit()
self.story.setMetadata('authorId', mdata)
self.story.setMetadata('author', mdata.title())
# Some stories list multiple authors, but the search engine only uses 1 author, and since
# we can't tell how many 'words' are in each name, I'm going to do a work around.
author_name = mdata.split(' ')[0].strip()
author_url = ('http://'+self.getSiteDomain()+'/cgi-bin/search.cgi?Author={}&SortBy=0'+
'&SortOrder=0&NumToList=0&FastSearch=0&ShortResults=0').format(author_name)
story_found = False
while not story_found:
logger.debug('Getting author page: %s' % author_url)
adata = self.get_page(author_url)
if 'No stories found for your search choices.' in adata:
author_name = ' '.join(author_name.split()[:-1])
author_url = ('http://'+self.getSiteDomain(
)+'/cgi-bin/search.cgi?Author={}&SortBy=0'+
'&SortOrder=0&NumToList=0&FastSearch=0' +
'&ShortResults=0').format(author_name)
pass
else:
asoup = self.make_soup(adata)
# Ok...this site does not have the stories encompassed by any sort of tag... so I have
# to make it.
stories = asoup.find_all('p', {'class':'search'})
if stories:
for story in stories:
# There are a lot of nbsp's (non-breaking spaces) in here, so I'm going to remove them
# I'm also getting rid of the bold tags and the nextline characters to make it
# easier to get the information below
story = repr(story).replace(b'\\xa0', '').replace(' ',' ').replace(
'<b>','').replace('</b>','').replace(r'\n','')
story = self.make_soup(story).find('p')
story_a = story.find('a')
title = self.story.getMetadata('title').split('-')[0].strip()
if story_a.get_text() == title:
story_found = True
break
if not story_found:
raise exceptions.StoryDoesNotExist(
"Could not find the story {} on the author's {} search page {}".format(
url, author_name, author_url))
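The retry loop above trims the author name one word at a time until the site's search finds a match. The technique can be sketched in isolation; the found predicate here is hypothetical, standing in for "the search page returned results":

```python
def narrow(name, found):
    """Drop trailing words from name until found(name) succeeds or
    the name is exhausted. found is a hypothetical predicate."""
    while name and not found(name):
        name = ' '.join(name.split()[:-1])
    return name
```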
self.story.setMetadata('authorUrl', author_url)
# The first element is the author, which we already have, so I'm going to drop it.
# Some prequel and sequel have links, so we are going to process them here, and get the
# series at the same time, then catch those that don't have links below
links = story.find_all('a')
for link in links:
label = link.previousSibling.strip()
if label == 'Series Title:':
## there is no way to tell which number of the series the story is, so we won't
# put a number
series_url = 'http://'+self.getSiteDomain()+'/'+link['href']
self.story.setMetadata('series', link.get_text())
self.story.setMetadata('seriesUrl', series_url)
elif label == 'Prequel to:':
value = link.string + ' (' + 'http://'+self.getSiteDomain()+link['href'] + ')'
self.story.setMetadata('prequelto', value)
elif label == 'Sequel to:':
value = link.string + ' (' + 'http://'+self.getSiteDomain()+link['href'] + ')'
self.story.setMetadata('sequelto', value)
# Some stories have a lot of text in the "summary", and I've tried to keep down on creating
# new metadata from here, so I'm going to grab some, but the rest will be lumped into the
# summary metadata.
summary = ''
mdatas = story.find_all('br')
for mdata in mdatas:
meta = mdata.nextSibling.string
if meta:
# some of the "sentences" have a colon in them, but are not actually labels... so
# I'm checking to see if the colon is within the first 20 characters, and taking
# that as a label... otherwise, it will be added to the summary section below. I've
# decided that the entire section will be put into the summary section, unless it
# has specific labels
if meta.find(':') > 0 and meta.find(':') < 20:
label = meta.split(':', 2)[0].strip().lower()
value = meta[len(label)+1:].strip()
else:
label = meta.string
value = ''
if (label == 'series title' or label == 'author' or label == '[' or
label == 'prequel to'):
# we've either already got this or we don't want it so we'll pass
## I'm handling it here, to get it out of the way for the rest of the code since
# anything not captured is put into the summary
pass
elif label == 'details':
# for the details section, none of this is labeled, and some stories can have
# less than others, so I have to check what each is to determine where to put
# it.
for val in value.split('|'):
val = val.strip()
if len(val) == 0:
# we don't need the ones that don't have anything in it.
pass
elif val in ['Series', 'Standalone', 'Work-In-Progress']:
self.story.setMetadata('storytype', val)
elif val in ['G', 'NC-17', 'PG', 'PG-13', 'R']:
self.story.setMetadata('rating', val)
elif val.split()[0].replace(',','') in ['*slash*', 'gen', 'het']:
self.story.setMetadata('genre', val)
elif val[-1] == 'k':
self.story.setMetadata('size', val)
elif len(val) > 0:
# There is no update date, so I'm putting the date in both
self.story.setMetadata('datePublished',makeDate(val, self.dateformat))
self.story.setMetadata('dateUpdated',makeDate(val, self.dateformat))
else:
## This should catch anything else, and shouldn't ever really be gotten
# to, but I'm going to have it print out in the debugger, just in case
logger.debug('Metadata not caught: %s' % str(meta))
zzzzzzzz = 0
elif label == 'characters':
self.story.setMetadata('characters', value)
elif label == 'pairings':
self.story.setMetadata('ships', value)
elif label == 'warnings' or label == '[eta] warning':
self.story.setMetadata('warnings', value)
elif label == 'sequel to':
self.story.setMetadata('sequelto', value)
elif label == 'disclaimer':
self.story.setMetadata('disclaimer', value)
elif label == 'spoilers':
self.story.setMetadata('spoilers', value)
elif label == 'crossover with':
self.story.addToList('category', value)
elif label == 'summary':
summary += value + '<br/>'
else:
## since this is not really a labeled string, I'm adding the original string to
# the summary. This may cause some of the sentences from the other site specific
# labels to be separated, but this is the only way I can figure out how to do
# this, at this time.
summary += meta.string + '<br/>'
self.setDescription(url, summary)
# since this is the only "chapter" that will be retrieved, I'm going to save the soup here
# so the getChapterText function doesn't have to use more bandwidth to get it again
self.html = soup
################################################################################################
def getChapterText(self, url):
logger.debug('Using the html retrieved previously from: %s' % url)
soup = self.html
story = soup.find('blockquote')
if None == story:
raise exceptions.FailedToDownload(
"Error downloading Chapter: %s! Missing required element!" % url)
## now that we have the story, there needs to be a little cleanup before we send it to the
# writers. Some of them really need editing to be cleaned up
## I am converting the text to raw unicode, then removing the <blockquote>, then removing
# the end of the section, which has a lot of extraneous things, then adding my own div
# wrapper, recreating the soup, then getting that div from the soup again, before sending to
# the writers.
story = repr(story).replace(b'\\xa0', '').replace(' ',' ').replace(r'\n','').strip()
story = story[12:]
story = story[:story.find('<p align="center" class="comments">Please <')]
story = '<div class="chaptertext">' + story + '</div>'
story = self.make_soup(story).find('div', {'class':'chaptertext'})
return self.utf8FromSoup(url, story)


@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2015 FanFicFare team
# Copyright 2024 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -15,15 +15,18 @@
# limitations under the License.
#
from base_xenforoforum_adapter import BaseXenForoForumAdapter
from __future__ import absolute_import
import re
from .base_xenforo2forum_adapter import BaseXenForo2ForumAdapter
def getClass():
return QuestionablequestingComAdapter
class QuestionablequestingComAdapter(BaseXenForoForumAdapter):
class QuestionablequestingComAdapter(BaseXenForo2ForumAdapter):
def __init__(self, config, url):
BaseXenForoForumAdapter.__init__(self, config, url)
BaseXenForo2ForumAdapter.__init__(self, config, url)
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','qq')
@ -33,3 +36,12 @@ class QuestionablequestingComAdapter(BaseXenForoForumAdapter):
# The site domain. Does have www here, if it uses it.
return 'forum.questionablequesting.com'
@classmethod
def getAcceptDomains(cls):
return [cls.getSiteDomain(),
cls.getSiteDomain().replace('forum.','')]
def getSiteURLPattern(self):
## QQ accepts forum.questionablequesting.com and questionablequesting.com
## We will use forum. as canonical for all
return super(QuestionablequestingComAdapter, self).getSiteURLPattern().replace(re.escape("forum."),r"(forum\.)?")
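The pattern rewrite above makes the forum. prefix optional, so both domain forms validate. A sketch applied to the bare domain only (the full adapter pattern also covers scheme and thread path, which are omitted here):

```python
import re

# Escape the canonical domain, then make the "forum." prefix optional,
# mirroring the replace() in the method above.
pattern = re.escape("forum.questionablequesting.com")
pattern = pattern.replace(re.escape("forum."), r"(forum\.)?")
matches = [re.fullmatch(pattern, d) is not None
           for d in ("forum.questionablequesting.com",
                     "questionablequesting.com")]
```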


@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2015 FanFicFare team
# Copyright 2019 FanFicFare team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -15,15 +15,18 @@
# limitations under the License.
#
from base_xenforoforum_adapter import BaseXenForoForumAdapter
from __future__ import absolute_import
import re
from .base_xenforo2forum_adapter import BaseXenForo2ForumAdapter
def getClass():
return ForumsSpacebattlesComAdapter
class ForumsSpacebattlesComAdapter(BaseXenForoForumAdapter):
class ForumsSpacebattlesComAdapter(BaseXenForo2ForumAdapter):
def __init__(self, config, url):
BaseXenForoForumAdapter.__init__(self, config, url)
BaseXenForo2ForumAdapter.__init__(self, config, url)
# Each adapter needs to have a unique site abbreviation.
self.story.setMetadata('siteabbrev','fsb')
@ -33,3 +36,12 @@ class ForumsSpacebattlesComAdapter(BaseXenForoForumAdapter):
# The site domain. Does have www here, if it uses it.
return 'forums.spacebattles.com'
@classmethod
def getAcceptDomains(cls):
return [cls.getSiteDomain(),
cls.getSiteDomain().replace('forums.','forum.')]
def getSiteURLPattern(self):
## SB accepts forums.spacebattles.com and forum.spacebattles.com
## We will use forums. as canonical for all
return super(ForumsSpacebattlesComAdapter, self).getSiteURLPattern().replace(re.escape("forums."),r"forums?\.")

Some files were not shown because too many files have changed in this diff.