Mirror of https://github.com/JimmXinu/FanFicFare.git (synced 2026-05-09 05:21:13 +02:00)
Compare commits
No commits in common. "main" and "calibre-plugin-1.5.8" have entirely different histories.
536 changed files with 31301 additions and 221155 deletions
37
.gitignore
vendored
@@ -1,37 +0,0 @@
#############
## Python
#############
*.py[cod]

# Translations
*.mo

# Emacs autosave files
\#*#
*~

# Windows batch files
*.bat

# usually perl -pi.back -e edits.
*.back
*.bak

# pycharm project specific settings files
.idea

# vscode project specific settings file
.vscode

cleanup.sh
FanFictionDownLoader.zip
*.epub
*Thumbs.db
FanFicFare.zip
output
build
dist
FanFicFare.egg-info
personal.ini
appcfg_oauth2_tokens
venv/
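As a rough illustration of what the simpler patterns in the .gitignore hunk match, the sketch below uses Python's standard `fnmatch` module. It only approximates Git's matching rules: it does not handle directory-only patterns such as `venv/` or backslash escapes such as `\#*#`, and the helper name `is_ignored` and the sample file names are made up for this example.

```python
from fnmatch import fnmatchcase

# A few of the simple glob patterns taken from the .gitignore hunk.
PATTERNS = ["*.py[cod]", "*.mo", "*~", "*.bat", "*.back", "*.bak", "*.epub"]


def is_ignored(filename):
    """Return True if the bare file name matches any pattern.

    This is an approximation of Git's behaviour, not a reimplementation.
    """
    return any(fnmatchcase(filename, pattern) for pattern in PATTERNS)


# Hypothetical file names, for illustration only.
for name in ["story.pyc", "story.py", "notes.txt~", "book.epub"]:
    print(name, is_ignored(name))
```

The character class in `*.py[cod]` requires exactly one of `c`, `o` or `d` after `.py`, so compiled files match while plain `.py` sources do not.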
@@ -1,15 +0,0 @@
FanFicFare (FFF)
=======================

FanFicFare is a tool for downloading fanfiction and original stories
from various sites into ebook form.

FanFicFare(FFF) is the renamed successor to
FanFictionDownLoader(FFDL). The project was renamed due to another,
unrelated project sharing the same name.

FanFicFare can download stories from over 100 different fanfiction and
original fiction sites.

FanFicFare can output stories into EPUB (the preferred format), HTML,
plain text and MOBI formats.
884
LICENSE
@@ -1,884 +0,0 @@
The code in fanficfare and webservice are under the Apache License.

The code in calibre-plugin, because it derives from other GPLv3 code,
is also GPLv3. GPLv3 follows the Apache License.


Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The GNU General Public License is a free, copyleft license for
software and other kinds of works.

The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.

For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.

Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.

For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.

Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.

Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.

The precise terms and conditions for copying, distribution and
modification follow.

TERMS AND CONDITIONS

0. Definitions.

"This License" refers to version 3 of the GNU General Public License.

"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.

"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.

To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.

A "covered work" means either the unmodified Program or a work based
on the Program.

To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.

To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.

An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.

1. Source Code.

The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.

A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.

The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.

The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.

The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.

The Corresponding Source for a work in source code form is that
same work.

2. Basic Permissions.

All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.

You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.

Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.

3. Protecting Users' Legal Rights From Anti-Circumvention Law.

No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.

When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.

4. Conveying Verbatim Copies.

You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

5. Conveying Modified Source Versions.

You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.

b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".

c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.

d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.

A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:

a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.

b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.

c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.

d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.

e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.

A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.

A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.

"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.

If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).

The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.

Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
|
||||
|
||||
7. Additional Terms.
|
||||
|
||||
"Additional permissions" are terms that supplement the terms of this
|
||||
License by making exceptions from one or more of its conditions.
|
||||
Additional permissions that are applicable to the entire Program shall
|
||||
be treated as though they were included in this License, to the extent
|
||||
that they are valid under applicable law. If additional permissions
|
||||
apply only to part of the Program, that part may be used separately
|
||||
under those permissions, but the entire Program remains governed by
|
||||
this License without regard to the additional permissions.
|
||||
|
||||
When you convey a copy of a covered work, you may at your option
|
||||
remove any additional permissions from that copy, or from any part of
|
||||
it. (Additional permissions may be written to require their own
|
||||
removal in certain cases when you modify the work.) You may place
|
||||
additional permissions on material, added by you to a covered work,
|
||||
for which you have or can give appropriate copyright permission.
|
||||
|
||||
Notwithstanding any other provision of this License, for material you
|
||||
add to a covered work, you may (if authorized by the copyright holders of
|
||||
that material) supplement the terms of this License with terms:
|
||||
|
||||
a) Disclaiming warranty or limiting liability differently from the
|
||||
terms of sections 15 and 16 of this License; or
|
||||
|
||||
b) Requiring preservation of specified reasonable legal notices or
|
||||
author attributions in that material or in the Appropriate Legal
|
||||
Notices displayed by works containing it; or
|
||||
|
||||
c) Prohibiting misrepresentation of the origin of that material, or
|
||||
requiring that modified versions of such material be marked in
|
||||
reasonable ways as different from the original version; or
|
||||
|
||||
d) Limiting the use for publicity purposes of names of licensors or
|
||||
authors of the material; or
|
||||
|
||||
e) Declining to grant rights under trademark law for use of some
|
||||
trade names, trademarks, or service marks; or
|
||||
|
||||
f) Requiring indemnification of licensors and authors of that
|
||||
material by anyone who conveys the material (or modified versions of
|
||||
it) with contractual assumptions of liability to the recipient, for
|
||||
any liability that these contractual assumptions directly impose on
|
||||
those licensors and authors.
|
||||
|
||||
All other non-permissive additional terms are considered "further
|
||||
restrictions" within the meaning of section 10. If the Program as you
|
||||
received it, or any part of it, contains a notice stating that it is
|
||||
governed by this License along with a term that is a further
|
||||
restriction, you may remove that term. If a license document contains
|
||||
a further restriction but permits relicensing or conveying under this
|
||||
License, you may add to a covered work material governed by the terms
|
||||
of that license document, provided that the further restriction does
|
||||
not survive such relicensing or conveying.
|
||||
|
||||
If you add terms to a covered work in accord with this section, you
|
||||
must place, in the relevant source files, a statement of the
|
||||
additional terms that apply to those files, or a notice indicating
|
||||
where to find the applicable terms.
|
||||
|
||||
Additional terms, permissive or non-permissive, may be stated in the
|
||||
form of a separately written license, or stated as exceptions;
|
||||
the above requirements apply either way.
|
||||
|
||||
8. Termination.
|
||||
|
||||
You may not propagate or modify a covered work except as expressly
|
||||
provided under this License. Any attempt otherwise to propagate or
|
||||
modify it is void, and will automatically terminate your rights under
|
||||
this License (including any patent licenses granted under the third
|
||||
paragraph of section 11).
|
||||
|
||||
However, if you cease all violation of this License, then your
|
||||
license from a particular copyright holder is reinstated (a)
|
||||
provisionally, unless and until the copyright holder explicitly and
|
||||
finally terminates your license, and (b) permanently, if the copyright
|
||||
holder fails to notify you of the violation by some reasonable means
|
||||
prior to 60 days after the cessation.
|
||||
|
||||
Moreover, your license from a particular copyright holder is
|
||||
reinstated permanently if the copyright holder notifies you of the
|
||||
violation by some reasonable means, this is the first time you have
|
||||
received notice of violation of this License (for any work) from that
|
||||
copyright holder, and you cure the violation prior to 30 days after
|
||||
your receipt of the notice.
|
||||
|
||||
Termination of your rights under this section does not terminate the
|
||||
licenses of parties who have received copies or rights from you under
|
||||
this License. If your rights have been terminated and not permanently
|
||||
reinstated, you do not qualify to receive new licenses for the same
|
||||
material under section 10.
|
||||
|
||||
9. Acceptance Not Required for Having Copies.
|
||||
|
||||
You are not required to accept this License in order to receive or
|
||||
run a copy of the Program. Ancillary propagation of a covered work
|
||||
occurring solely as a consequence of using peer-to-peer transmission
|
||||
to receive a copy likewise does not require acceptance. However,
|
||||
nothing other than this License grants you permission to propagate or
|
||||
modify any covered work. These actions infringe copyright if you do
|
||||
not accept this License. Therefore, by modifying or propagating a
|
||||
covered work, you indicate your acceptance of this License to do so.
|
||||
|
||||
10. Automatic Licensing of Downstream Recipients.
|
||||
|
||||
Each time you convey a covered work, the recipient automatically
|
||||
receives a license from the original licensors, to run, modify and
|
||||
propagate that work, subject to this License. You are not responsible
|
||||
for enforcing compliance by third parties with this License.
|
||||
|
||||
An "entity transaction" is a transaction transferring control of an
|
||||
organization, or substantially all assets of one, or subdividing an
|
||||
organization, or merging organizations. If propagation of a covered
|
||||
work results from an entity transaction, each party to that
|
||||
transaction who receives a copy of the work also receives whatever
|
||||
licenses to the work the party's predecessor in interest had or could
|
||||
give under the previous paragraph, plus a right to possession of the
|
||||
Corresponding Source of the work from the predecessor in interest, if
|
||||
the predecessor has it or can get it with reasonable efforts.
|
||||
|
||||
You may not impose any further restrictions on the exercise of the
|
||||
rights granted or affirmed under this License. For example, you may
|
||||
not impose a license fee, royalty, or other charge for exercise of
|
||||
rights granted under this License, and you may not initiate litigation
|
||||
(including a cross-claim or counterclaim in a lawsuit) alleging that
|
||||
any patent claim is infringed by making, using, selling, offering for
|
||||
sale, or importing the Program or any portion of it.
|
||||
|
||||
11. Patents.
|
||||
|
||||
A "contributor" is a copyright holder who authorizes use under this
|
||||
License of the Program or a work on which the Program is based. The
|
||||
work thus licensed is called the contributor's "contributor version".
|
||||
|
||||
A contributor's "essential patent claims" are all patent claims
|
||||
owned or controlled by the contributor, whether already acquired or
|
||||
hereafter acquired, that would be infringed by some manner, permitted
|
||||
by this License, of making, using, or selling its contributor version,
|
||||
but do not include claims that would be infringed only as a
|
||||
consequence of further modification of the contributor version. For
|
||||
purposes of this definition, "control" includes the right to grant
|
||||
patent sublicenses in a manner consistent with the requirements of
|
||||
this License.
|
||||
|
||||
Each contributor grants you a non-exclusive, worldwide, royalty-free
|
||||
patent license under the contributor's essential patent claims, to
|
||||
make, use, sell, offer for sale, import and otherwise run, modify and
|
||||
propagate the contents of its contributor version.
|
||||
|
||||
In the following three paragraphs, a "patent license" is any express
|
||||
agreement or commitment, however denominated, not to enforce a patent
|
||||
(such as an express permission to practice a patent or covenant not to
|
||||
sue for patent infringement). To "grant" such a patent license to a
|
||||
party means to make such an agreement or commitment not to enforce a
|
||||
patent against the party.
|
||||
|
||||
If you convey a covered work, knowingly relying on a patent license,
|
||||
and the Corresponding Source of the work is not available for anyone
|
||||
to copy, free of charge and under the terms of this License, through a
|
||||
publicly available network server or other readily accessible means,
|
||||
then you must either (1) cause the Corresponding Source to be so
|
||||
available, or (2) arrange to deprive yourself of the benefit of the
|
||||
patent license for this particular work, or (3) arrange, in a manner
|
||||
consistent with the requirements of this License, to extend the patent
|
||||
license to downstream recipients. "Knowingly relying" means you have
|
||||
actual knowledge that, but for the patent license, your conveying the
|
||||
covered work in a country, or your recipient's use of the covered work
|
||||
in a country, would infringe one or more identifiable patents in that
|
||||
country that you have reason to believe are valid.
|
||||
|
||||
If, pursuant to or in connection with a single transaction or
|
||||
arrangement, you convey, or propagate by procuring conveyance of, a
|
||||
covered work, and grant a patent license to some of the parties
|
||||
receiving the covered work authorizing them to use, propagate, modify
|
||||
or convey a specific copy of the covered work, then the patent license
|
||||
you grant is automatically extended to all recipients of the covered
|
||||
work and works based on it.
|
||||
|
||||
A patent license is "discriminatory" if it does not include within
|
||||
the scope of its coverage, prohibits the exercise of, or is
|
||||
conditioned on the non-exercise of one or more of the rights that are
|
||||
specifically granted under this License. You may not convey a covered
|
||||
work if you are a party to an arrangement with a third party that is
|
||||
in the business of distributing software, under which you make payment
|
||||
to the third party based on the extent of your activity of conveying
|
||||
the work, and under which the third party grants, to any of the
|
||||
parties who would receive the covered work from you, a discriminatory
|
||||
patent license (a) in connection with copies of the covered work
|
||||
conveyed by you (or copies made from those copies), or (b) primarily
|
||||
for and in connection with specific products or compilations that
|
||||
contain the covered work, unless you entered into that arrangement,
|
||||
or that patent license was granted, prior to 28 March 2007.
|
||||
|
||||
Nothing in this License shall be construed as excluding or limiting
|
||||
any implied license or other defenses to infringement that may
|
||||
otherwise be available to you under applicable patent law.
|
||||
|
||||
12. No Surrender of Others' Freedom.
|
||||
|
||||
If conditions are imposed on you (whether by court order, agreement or
|
||||
otherwise) that contradict the conditions of this License, they do not
|
||||
excuse you from the conditions of this License. If you cannot convey a
|
||||
covered work so as to satisfy simultaneously your obligations under this
|
||||
License and any other pertinent obligations, then as a consequence you may
|
||||
not convey it at all. For example, if you agree to terms that obligate you
|
||||
to collect a royalty for further conveying from those to whom you convey
|
||||
the Program, the only way you could satisfy both those terms and this
|
||||
License would be to refrain entirely from conveying the Program.
|
||||
|
||||
13. Use with the GNU Affero General Public License.
|
||||
|
||||
Notwithstanding any other provision of this License, you have
|
||||
permission to link or combine any covered work with a work licensed
|
||||
under version 3 of the GNU Affero General Public License into a single
|
||||
combined work, and to convey the resulting work. The terms of this
|
||||
License will continue to apply to the part which is the covered work,
|
||||
but the special requirements of the GNU Affero General Public License,
|
||||
section 13, concerning interaction through a network will apply to the
|
||||
combination as such.
|
||||
|
||||
14. Revised Versions of this License.
|
||||
|
||||
The Free Software Foundation may publish revised and/or new versions of
|
||||
the GNU General Public License from time to time. Such new versions will
|
||||
be similar in spirit to the present version, but may differ in detail to
|
||||
address new problems or concerns.
|
||||
|
||||
Each version is given a distinguishing version number. If the
|
||||
Program specifies that a certain numbered version of the GNU General
|
||||
Public License "or any later version" applies to it, you have the
|
||||
option of following the terms and conditions either of that numbered
|
||||
version or of any later version published by the Free Software
|
||||
Foundation. If the Program does not specify a version number of the
|
||||
GNU General Public License, you may choose any version ever published
|
||||
by the Free Software Foundation.
|
||||
|
||||
If the Program specifies that a proxy can decide which future
|
||||
versions of the GNU General Public License can be used, that proxy's
|
||||
public statement of acceptance of a version permanently authorizes you
|
||||
to choose that version for the Program.
|
||||
|
||||
Later license versions may give you additional or different
|
||||
permissions. However, no additional obligations are imposed on any
|
||||
author or copyright holder as a result of your choosing to follow a
|
||||
later version.
|
||||
|
||||
15. Disclaimer of Warranty.
|
||||
|
||||
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
|
||||
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
|
||||
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
|
||||
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
|
||||
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
|
||||
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
|
||||
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
|
||||
|
||||
16. Limitation of Liability.
|
||||
|
||||
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
|
||||
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
|
||||
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
|
||||
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
|
||||
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
|
||||
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
|
||||
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
|
||||
SUCH DAMAGES.
|
||||
|
||||
17. Interpretation of Sections 15 and 16.
|
||||
|
||||
If the disclaimer of warranty and limitation of liability provided
|
||||
above cannot be given local legal effect according to their terms,
|
||||
reviewing courts shall apply local law that most closely approximates
|
||||
an absolute waiver of all civil liability in connection with the
|
||||
Program, unless a warranty or assumption of liability accompanies a
|
||||
copy of the Program in return for a fee.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
How to Apply These Terms to Your New Programs
|
||||
|
||||
If you develop a new program, and you want it to be of the greatest
|
||||
possible use to the public, the best way to achieve this is to make it
|
||||
free software which everyone can redistribute and change under these terms.
|
||||
|
||||
To do so, attach the following notices to the program. It is safest
|
||||
to attach them to the start of each source file to most effectively
|
||||
state the exclusion of warranty; and each file should have at least
|
||||
the "copyright" line and a pointer to where the full notice is found.
|
||||
|
||||
<one line to give the program's name and a brief idea of what it does.>
|
||||
Copyright (C) <year> <name of author>
|
||||
|
||||
This program is free software: you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation, either version 3 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program. If not, see <http://www.gnu.org/licenses/>.
|
||||
|
||||
Also add information on how to contact you by electronic and paper mail.
|
||||
|
||||
If the program does terminal interaction, make it output a short
|
||||
notice like this when it starts in an interactive mode:
|
||||
|
||||
<program> Copyright (C) <year> <name of author>
|
||||
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
|
||||
This is free software, and you are welcome to redistribute it
|
||||
under certain conditions; type `show c' for details.
|
||||
|
||||
The hypothetical commands `show w' and `show c' should show the appropriate
|
||||
parts of the General Public License. Of course, your program's commands
|
||||
might be different; for a GUI interface, you would use an "about box".
|
||||
|
||||
You should also get your employer (if you work as a programmer) or school,
|
||||
if any, to sign a "copyright disclaimer" for the program, if necessary.
|
||||
For more information on this, and how to apply and follow the GNU GPL, see
|
||||
<http://www.gnu.org/licenses/>.
|
||||
|
||||
The GNU General Public License does not permit incorporating your program
|
||||
into proprietary programs. If your program is a subroutine library, you
|
||||
may consider it more useful to permit linking proprietary applications with
|
||||
the library. If this is what you want to do, use the GNU Lesser General
|
||||
Public License instead of this License. But first, please read
|
||||
<http://www.gnu.org/philosophy/why-not-lgpl.html>.
|
||||
|
||||
|
|
@ -1,3 +0,0 @@
|
|||
include DESCRIPTION.rst
|
||||
include README.md
|
||||
include LICENSE
|
||||
71
README.md
|
|
@ -1,71 +0,0 @@
|
|||
[FanFicFare](https://github.com/JimmXinu/FanFicFare)
|
||||
==========
|
||||
|
||||
FanFicFare makes reading stories from various websites much easier by helping
|
||||
you download them to EBook files.
|
||||
|
||||
FanFicFare was previously known as FanFictionDownLoader (AKA
|
||||
FFDL, AKA fanficdownloader).
|
||||
|
||||
Main features:
|
||||
|
||||
- Download FanFiction stories from over [100 different sites](https://github.com/JimmXinu/FanFicFare/wiki/SupportedSites) into ebooks.
|
||||
|
||||
- Update previously downloaded EPUB format ebooks, downloading only new chapters.
|
||||
|
||||
- Get Story URLs from Web Pages.
|
||||
|
||||
- Support for downloading images in the story text. (EPUB and HTML
|
||||
only -- download EPUB and convert to AZW3 for Kindle) More details on
|
||||
configuring images in stories and cover images can be found in the
|
||||
[FAQs] or [this post in the old FFDL thread].
|
||||
|
||||
- Support for cover image. (EPUB only)
|
||||
|
||||
- Optionally keep an Update Log of past updates (EPUB only).
|
||||
|
||||
There's additional info in the project [wiki] pages.
|
||||
|
||||
There's also a [FanFicFare maillist] for discussion and announcements and a [discussion thread] for the Calibre plugin.
|
||||
|
||||
Getting FanFicFare
|
||||
==================
|
||||
|
||||
### Official Releases
|
||||
|
||||
This program is available as:
|
||||
|
||||
- A Calibre plugin from within Calibre or directly from the plugin [discussion thread], or;
|
||||
- A Command Line Interface (CLI) [Python
|
||||
package](https://pypi.python.org/pypi/FanFicFare) that you can
|
||||
install with:
|
||||
```
|
||||
pip install FanFicFare
|
||||
```
|
||||
- _As of late November 2019, the web service version is shut down. See the [Wiki Home](https://github.com/JimmXinu/FanFicFare/wiki#web-service-version) page for details._
|
||||
|
||||
### Test Versions
|
||||
|
||||
FanFicFare is released roughly every month, but new test versions are posted more frequently as changes are made.
|
||||
|
||||
Test versions are available at:
|
||||
|
||||
- The [test plugin] is posted at MobileRead.
|
||||
- The test version of CLI for pip install is uploaded to the testpypi repository and can be installed with:
|
||||
```
|
||||
pip install --extra-index-url https://test.pypi.org/simple/ --upgrade FanFicFare
|
||||
```
|
||||
|
||||
### Other Releases
|
||||
|
||||
Other versions may be available depending on your OS. I (JimmXinu) don't directly support these:
|
||||
|
||||
- **Arch Linux**: The latest CLI release can be obtained from the [fanficfare](https://aur.archlinux.org/packages/fanficfare) AUR package. It will install the calibre plugin, if calibre is installed.
|
||||
|
||||
|
||||
[this post in the old FFDL thread]: https://www.mobileread.com/forums/showthread.php?p=1982785#post1982785
|
||||
[FAQs]: https://github.com/JimmXinu/FanFicFare/wiki/FAQs#can-fanficfare-download-a-story-containing-images
|
||||
[FanFicFare maillist]: https://groups.google.com/group/fanfic-downloader
|
||||
[wiki]: https://github.com/JimmXinu/FanFicFare/wiki
|
||||
[discussion thread]: https://www.mobileread.com/forums/showthread.php?t=259221
|
||||
[test plugin]: https://www.mobileread.com/forums/showthread.php?p=3084025&postcount=2
|
||||
46
app.yaml
Normal file
|
|
@ -0,0 +1,46 @@
|
|||
# ffd-retief-hrd fanfictiondownloader
|
||||
application: fanfictiondownloader
|
||||
version: 4-4-2
|
||||
runtime: python27
|
||||
api_version: 1
|
||||
threadsafe: true
|
||||
|
||||
handlers:
|
||||
|
||||
- url: /r3m0v3r.*
|
||||
script: utils.remover.app
|
||||
login: admin
|
||||
|
||||
- url: /tally.*
|
||||
script: utils.tally.app
|
||||
login: admin
|
||||
|
||||
- url: /fdownloadtask
|
||||
script: main.app
|
||||
login: admin
|
||||
|
||||
- url: /css
|
||||
static_dir: css
|
||||
|
||||
- url: /js
|
||||
static_dir: js
|
||||
|
||||
- url: /static
|
||||
static_dir: static
|
||||
|
||||
- url: /favicon\.ico
|
||||
static_files: static/favicon.ico
|
||||
upload: static/favicon\.ico
|
||||
|
||||
- url: /.*
|
||||
script: main.app
|
||||
|
||||
builtins:
|
||||
- datastore_admin: on
|
||||
|
||||
libraries:
|
||||
- name: django
|
||||
version: "1.2"
|
||||
|
||||
- name: PIL
|
||||
version: "1.1.7"
|
||||
|
|
@ -1,9 +0,0 @@
|
|||
[main]
|
||||
host = https://www.transifex.com
|
||||
|
||||
[o:calibre:p:calibre-plugins:r:fanfictiondownloader]
|
||||
file_filter = translations/<lang>.po
|
||||
source_file = translations/en.po
|
||||
source_lang = en
|
||||
type = PO
|
||||
|
||||
|
|
@ -1,63 +1,39 @@
|
|||
#!/usr/bin/env python
|
||||
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
|
||||
# -*- coding: utf-8 -*-
|
||||
from __future__ import (unicode_literals, division, absolute_import,
|
||||
print_function)
|
||||
|
||||
__license__ = 'GPL v3'
|
||||
__copyright__ = '2019, Jim Miller'
|
||||
__copyright__ = '2011, Jim Miller'
|
||||
__docformat__ = 'restructuredtext en'
|
||||
|
||||
import sys, os
|
||||
if sys.version_info >= (2, 7):
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
loghandler=logging.StreamHandler()
|
||||
loghandler.setFormatter(logging.Formatter("FFF: %(levelname)s: %(asctime)s: %(filename)s(%(lineno)d): %(message)s"))
|
||||
logger.addHandler(loghandler)
|
||||
|
||||
from calibre.constants import DEBUG
|
||||
if os.environ.get('CALIBRE_WORKER', None) is not None or DEBUG:
|
||||
loghandler.setLevel(logging.DEBUG)
|
||||
logger.setLevel(logging.DEBUG)
|
||||
else:
|
||||
loghandler.setLevel(logging.CRITICAL)
|
||||
logger.setLevel(logging.CRITICAL)
|
||||
|
||||
# pulls in translation files for _() strings
|
||||
try:
|
||||
load_translations()
|
||||
except NameError:
|
||||
pass # load_translations() added in calibre 1.9
|
||||
|
||||
# The class that all Interface Action plugin wrappers must inherit from
|
||||
from calibre.customize import InterfaceActionBase
|
||||
|
||||
# pulled out from FanFicFareBase for saving in prefs.py
|
||||
__version__ = (4, 57, 7)
|
||||
|
||||
## Apparently the name for this class doesn't matter--it was still
|
||||
## 'demo' for the first few versions.
|
||||
class FanFicFareBase(InterfaceActionBase):
|
||||
class FanFictionDownLoaderBase(InterfaceActionBase):
|
||||
'''
|
||||
This class is a simple wrapper that provides information about the
|
||||
actual plugin class. The actual interface plugin class is called
|
||||
InterfacePlugin and is defined in the fff_plugin.py file, as
|
||||
InterfacePlugin and is defined in the ffdl_plugin.py file, as
|
||||
specified in the actual_plugin field below.
|
||||
|
||||
The reason for having two classes is that it allows the command line
|
||||
calibre utilities to run without needing to load the GUI libraries.
|
||||
'''
|
||||
name = 'FanFicFare'
|
||||
description = _('UI plugin to download FanFiction stories from various sites.')
|
||||
name = 'FanFictionDownLoader'
|
||||
description = 'UI plugin to download FanFiction stories from various sites.'
|
||||
supported_platforms = ['windows', 'osx', 'linux']
|
||||
author = 'Jim Miller'
|
||||
version = __version__
|
||||
minimum_calibre_version = (2, 85, 1)
|
||||
version = (1, 5, 8)
|
||||
minimum_calibre_version = (0, 8, 30)
|
||||
|
||||
#: This field defines the GUI plugin class that contains all the code
|
||||
#: that actually does something. Its format is module_path:class_name
|
||||
#: The specified class must be defined in the specified module.
|
||||
actual_plugin = 'calibre_plugins.fanficfare_plugin.fff_plugin:FanFicFarePlugin'
|
||||
actual_plugin = 'calibre_plugins.fanfictiondownloader_plugin.ffdl_plugin:FanFictionDownLoaderPlugin'
|
||||
|
||||
def is_customizable(self):
|
||||
'''
|
||||
|
|
@ -88,7 +64,7 @@ class FanFicFareBase(InterfaceActionBase):
|
|||
# top of the module as importing the config class will also cause the
|
||||
# GUI libraries to be loaded, which we do not want when using calibre
|
||||
# from the command line
|
||||
from calibre_plugins.fanficfare_plugin.config import ConfigWidget
|
||||
from calibre_plugins.fanfictiondownloader_plugin.config import ConfigWidget
|
||||
return ConfigWidget(self.actual_plugin_)
|
||||
|
||||
def save_settings(self, config_widget):
|
||||
|
|
@ -104,47 +80,11 @@ class FanFicFareBase(InterfaceActionBase):
|
|||
if ac is not None:
|
||||
ac.apply_settings()
|
||||
|
||||
def load_actual_plugin(self, gui):
|
||||
# so the sys.path was modified while loading the plug impl.
|
||||
with self:
|
||||
|
||||
# Make sure the fanficfare module is available globally
|
||||
# under its simple name, -- This is the only reason other
|
||||
# plugin files can import fanficfare instead of
|
||||
# calibre_plugins.fanficfare_plugin.fanficfare.
|
||||
#
|
||||
# Added specifically for the benefit of
|
||||
# eli-schwartz/eschwartz's Arch Linux distro that wants to
|
||||
# package FFF plugin outside Calibre.
|
||||
import fanficfare
|
||||
|
||||
return InterfaceActionBase.load_actual_plugin(self,gui)
|
||||
|
||||
def cli_main(self,argv):
|
||||
|
||||
with self: # so the sys.path was modified appropriately
|
||||
# I believe there's no performance hit loading these here when
|
||||
# CLI--it would load every time anyway.
|
||||
from calibre.library import db
|
||||
from fanficfare.cli import main as fff_main
|
||||
from calibre_plugins.fanficfare_plugin.prefs import PrefsFacade
|
||||
from fanficfare.six import ensure_text
|
||||
from optparse import OptionParser
|
||||
|
||||
parser = OptionParser('%prog --run-plugin '+self.name+' -- [options] <storyurl>')
|
||||
parser.add_option('--library-path', '--with-library', default=None, help=_('Path to the calibre library. Default is to use the path stored in the settings.'))
|
||||
# parser.add_option('--dont-notify-gui', default=False, action='store_true',
|
||||
# help=_('Do not notify the running calibre GUI (if any) that the database has'
|
||||
# ' changed. Use with care, as it can lead to database corruption!'))
|
||||
|
||||
pargs = [x for x in argv if x.startswith('--with-library') or x.startswith('--library-path')
|
||||
or not x.startswith('-')]
|
||||
opts, args = parser.parse_args(pargs)
|
||||
fff_prefs = PrefsFacade(db(path=opts.library_path,
|
||||
read_only=True))
|
||||
|
||||
fff_main(argv[1:],
|
||||
parser=parser,
|
||||
passed_defaultsini=ensure_text(get_resources("fanficfare/defaults.ini")),
|
||||
passed_personalini=ensure_text(fff_prefs["personal.ini"]),
|
||||
)
|
||||
# For testing, run from command line with this:
|
||||
# calibre-debug -e __init__.py
|
||||
#
|
||||
if __name__ == '__main__':
|
||||
from PyQt4.Qt import QApplication
|
||||
from calibre.gui2.preferences import test_widget
|
||||
app = QApplication([])
|
||||
test_widget('Advanced', 'Plugins')
|
||||
|
|
|
|||
|
|
@ -1,32 +0,0 @@
|
<hr />

<p>Plugin created by Jim Miller, originally borrowing heavily from Grant Drake's
'<a href="http://www.mobileread.com/forums/showthread.php?t=134856">Reading List</a>',
'<a href="http://www.mobileread.com/forums/showthread.php?t=126727">Extract ISBN</a>' and
'<a href="http://www.mobileread.com/forums/showthread.php?t=134000">Count Pages</a>'
plugins.</p>

<p>
Calibre officially distributes plugins from the mobileread.com forum site.
The official distro channel and discussion thread for this plugin is there: <a href="http://www.mobileread.com/forums/showthread.php?t=259221">FanFicFare</a>
</p>

<p> I also monitor the
<a href="http://groups.google.com/group/fanfic-downloader">general users
group</a> for the downloader CLI, too.
</p>

<p>
The source for this plugin is available at its
<a href="https://github.com/JimmXinu/FanFicFare">project home</a>.
</p>

<hr />

<p>
See the <a href="https://github.com/JimmXinu/FanFicFare/wiki/Supportedsites">list of supported sites</a>.
</p>

<p>
Read the <a href="https://github.com/JimmXinu/FanFicFare/wiki/FAQs">FAQs</a>.
</p>
20
calibre-plugin/about.txt
Normal file
|
|
@ -0,0 +1,20 @@
|
<hr />

<p>Created by Jim Miller, borrowing heavily from Grant Drake's
'<a href="http://www.mobileread.com/forums/showthread.php?t=134856">Reading List</a>',
'<a href="http://www.mobileread.com/forums/showthread.php?t=126727">Extract ISBN</a>' and
'<a href="http://www.mobileread.com/forums/showthread.php?t=134000">Count Pages</a>'
plugins.</p>

<p>
Calibre officially distributes plugins from the mobileread.com forum site.
The official distro channel for this plugin is there: <a href="http://www.mobileread.com/forums/showthread.php?t=163261">FanFictionDownLoader</a>
</p>

<p> I also monitor the
<a href="http://groups.google.com/group/fanfic-downloader">general users
group</a> for the downloader. That covers the web application and CLI, too.
</p>

The source for this plugin is available at its
<a href="http://code.google.com/p/fanficdownloader">project home</a>.
|
|
@ -1,20 +0,0 @@
|
|||
from __future__ import (unicode_literals, division, absolute_import,
                        print_function)

__license__   = 'GPL v3'
__copyright__ = '2024, Jim Miller'
__docformat__ = 'restructuredtext en'

## References:
## https://www.mobileread.com/forums/showthread.php?p=4435205&postcount=65
## https://www.mobileread.com/forums/showthread.php?p=4102834&postcount=389

from calibre_plugins.action_chains.events import ChainEvent

class FanFicFareDownloadFinished(ChainEvent):

    # replace with the name of your event
    name = 'FanFicFare Download Finished'

    def get_event_signal(self):
        return self.gui.iactions['FanFicFare'].download_finished_signal
|
|
@ -1,62 +0,0 @@
|
|||
# -*- coding: utf-8 -*-

from __future__ import (absolute_import, unicode_literals, division,
                        print_function)

__license__   = 'GPL v3'
__copyright__ = '2015, Jim Miller'
__docformat__ = 'restructuredtext en'

import re

from PyQt5.Qt import (Qt, QSyntaxHighlighter, QTextCharFormat, QBrush)

from fanficfare.six import string_types

class BasicIniHighlighter(QSyntaxHighlighter):
    '''
    QSyntaxHighlighter class for use with QTextEdit for highlighting
    ini config files.

    I looked high and low to find a highlighter for basic ini config
    format, so I'm leaving this in the project even though I'm not
    using it.
    '''

    def __init__( self, parent, theme ):
        QSyntaxHighlighter.__init__( self, parent )
        self.parent = parent

        self.highlightingRules = []

        # keyword
        self.highlightingRules.append( HighlightingRule( r"^[^:=\s][^:=]*[:=]",
                                                         Qt.blue,
                                                         Qt.SolidPattern ) )

        # section
        self.highlightingRules.append( HighlightingRule( r"^\[[^\]]+\]",
                                                         Qt.darkBlue,
                                                         Qt.SolidPattern ) )

        # comment
        self.highlightingRules.append( HighlightingRule( r"#[^\n]*" ,
                                                         Qt.darkYellow,
                                                         Qt.SolidPattern ) )

    def highlightBlock( self, text ):
        for rule in self.highlightingRules:
            for match in rule.pattern.finditer(text):
                self.setFormat( match.start(), match.end()-match.start(), rule.highlight )
        self.setCurrentBlockState( 0 )

class HighlightingRule():
    def __init__( self, pattern, color, style ):
        if isinstance(pattern, string_types):
            self.pattern = re.compile(pattern)
        else:
            self.pattern=pattern
        charfmt = QTextCharFormat()
        brush = QBrush(color, style)
        charfmt.setForeground(brush)
        self.highlight = charfmt
|
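The three patterns registered by `BasicIniHighlighter` can be exercised without Qt. The following is a rough sketch under that assumption: `RULES` and `classify` are hypothetical names introduced here, and the tuples stand in for the `(start, length)` arguments that `highlightBlock` passes to `setFormat`.

```python
import re

# Same patterns, in the same order, as BasicIniHighlighter registers them.
RULES = [
    ("keyword", re.compile(r"^[^:=\s][^:=]*[:=]")),
    ("section", re.compile(r"^\[[^\]]+\]")),
    ("comment", re.compile(r"#[^\n]*")),
]


def classify(line):
    """Return (rule_name, start, length) for each match, in application order."""
    spans = []
    for name, pattern in RULES:
        for match in pattern.finditer(line):
            spans.append((name, match.start(), match.end() - match.start()))
    return spans


if __name__ == '__main__':
    for sample in ("[defaults]", "is_adult:true", "# a comment", "#title: x"):
        print(sample, classify(sample))
```

A commented-out setting such as `#title: x` matches both the keyword and the comment rule. Because the comment rule is applied last, its format is presumably the one left on the line.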
|
@ -1,454 +1,447 @@
|
|||
# -*- coding: utf-8 -*-

from __future__ import (unicode_literals, division, absolute_import,
                        print_function)

__license__   = 'GPL v3'
__copyright__ = '2011, Grant Drake <grant.drake@gmail.com>, 2018, Jim Miller'
__docformat__ = 'restructuredtext en'

import os
from contextlib import contextmanager
from PyQt5.Qt import (QApplication, Qt, QIcon, QPixmap, QLabel, QDialog, QHBoxLayout,
                      QTableWidgetItem, QFont, QLineEdit, QComboBox,
                      QVBoxLayout, QDialogButtonBox, QStyledItemDelegate, QDateTime,
                      QTextEdit, QListWidget, QAbstractItemView, QCursor)

from calibre.constants import numeric_version as calibre_version
from calibre.constants import iswindows, DEBUG
from calibre.gui2 import UNDEFINED_QDATETIME, gprefs, info_dialog
from calibre.gui2.actions import menu_action_unique_name
from calibre.gui2.keyboard import ShortcutConfig
from calibre.utils.config import config_dir
from calibre.utils.date import now, format_date, qt_to_dt, UNDEFINED_DATE

import fanficfare.six as six
from six import text_type as unicode

# Global definition of our plugin name. Used for common functions that require this.
plugin_name = None
# Global definition of our plugin resources. Used to share between the xxxAction and xxxBase
# classes if you need any zip images to be displayed on the configuration dialog.
plugin_icon_resources = {}


def set_plugin_icon_resources(name, resources):
    '''
    Set our global store of plugin name and icon resources for sharing between
    the InterfaceAction class which reads them and the ConfigWidget
    if needed for use on the customization dialog for this plugin.
    '''
    global plugin_icon_resources, plugin_name
    plugin_name = name
    plugin_icon_resources = resources

# print_tracebacks_for_missing_resources first appears in cal 6.2.0
if calibre_version >= (6,2,0):
    def get_icons_nolog(icon_name,plugin_name):
        return get_icons(icon_name,
                         plugin_name,
                         print_tracebacks_for_missing_resources=False)
else:
    get_icons_nolog = get_icons

def get_icon_6plus(icon_name):
    '''
    Retrieve a QIcon for the named image from
    1. Calibre's image cache
    2. resources/images
    3. the icon theme
    4. the plugin zip
    Only plugin zip has images/ in the image name for backward
    compatibility.
    '''
    icon = None
    if icon_name:
        icon = QIcon.ic(icon_name)
        ## both .ic and get_icons return an empty QIcon if not found.
        if not icon or icon.isNull():
            # don't need a tracestack from get_icons just because
            # there's no icon in the theme
            icon = get_icons_nolog(icon_name.replace('images/',''),
                                   plugin_name)
        if not icon or icon.isNull():
            icon = get_icons(icon_name,plugin_name)
    if not icon:
        icon = QIcon()
    return icon

def get_icon_old(icon_name):
    '''
    Retrieve a QIcon for the named image from the zip file if it exists,
    or if not then from Calibre's image cache.
    '''
    if icon_name:
        pixmap = get_pixmap(icon_name)
        if pixmap is None:
            # Look in Calibre's cache for the icon
            return QIcon(I(icon_name))
        else:
            return QIcon(pixmap)
    return QIcon()

# get_icons changed in Cal6.
if calibre_version >= (6,0,0):
    get_icon = get_icon_6plus
else:
    get_icon = get_icon_old

def get_pixmap(icon_name):
    '''
    Retrieve a QPixmap for the named image
    Any icons belonging to the plugin must be prefixed with 'images/'
    '''
    global plugin_icon_resources, plugin_name

    if not icon_name.startswith('images/'):
        # We know this is definitely not an icon belonging to this plugin
        pixmap = QPixmap()
        pixmap.load(I(icon_name))
        return pixmap

    # Check to see whether the icon exists as a Calibre resource
    # This will enable skinning if the user stores icons within a folder like:
    # ...\AppData\Roaming\calibre\resources\images\Plugin Name\
    if plugin_name:
        local_images_dir = get_local_images_dir(plugin_name)
        local_image_path = os.path.join(local_images_dir, icon_name.replace('images/', ''))
        if os.path.exists(local_image_path):
            pixmap = QPixmap()
            pixmap.load(local_image_path)
            return pixmap

    # As we did not find an icon elsewhere, look within our zip resources
    if icon_name in plugin_icon_resources:
        pixmap = QPixmap()
        pixmap.loadFromData(plugin_icon_resources[icon_name])
        return pixmap
    return None


def get_local_images_dir(subfolder=None):
    '''
    Returns a path to the user's local resources/images folder
    If a subfolder name parameter is specified, appends this to the path
    '''
    images_dir = os.path.join(config_dir, 'resources/images')
    if subfolder:
        images_dir = os.path.join(images_dir, subfolder)
    if iswindows:
        images_dir = os.path.normpath(images_dir)
    return images_dir


def create_menu_action_unique(ia, parent_menu, menu_text, image=None, tooltip=None,
                              shortcut=None, triggered=None, is_checked=None, shortcut_name=None,
                              unique_name=None):
    '''
    Create a menu action with the specified criteria and action, using the new
    InterfaceAction.create_menu_action() function which ensures that regardless of
    whether a shortcut is specified it will appear in Preferences->Keyboard
    '''
    orig_shortcut = shortcut
    kb = ia.gui.keyboard
    if unique_name is None:
        unique_name = menu_text
    if not shortcut == False:
        full_unique_name = menu_action_unique_name(ia, unique_name)
        if full_unique_name in kb.shortcuts:
            shortcut = False
        else:
            if shortcut is not None and not shortcut == False:
                if len(shortcut) == 0:
                    shortcut = None
                else:
                    shortcut = shortcut

    if shortcut_name is None:
        shortcut_name = menu_text.replace('&','')

    ac = ia.create_menu_action(parent_menu, unique_name, menu_text, icon=None, shortcut=shortcut,
                               description=tooltip, triggered=triggered, shortcut_name=shortcut_name)
    if shortcut == False and not orig_shortcut == False:
        if ac.calibre_shortcut_unique_name in ia.gui.keyboard.shortcuts:
            kb.replace_action(ac.calibre_shortcut_unique_name, ac)
    if image:
        ac.setIcon(get_icon(image))
    if is_checked is not None:
        ac.setCheckable(True)
        if is_checked:
            ac.setChecked(True)
    return ac


def get_library_uuid(db):
    try:
        library_uuid = db.library_id
    except:
        library_uuid = ''
    return library_uuid

# Call as 'with busy_cursor():'
@contextmanager
def busy_cursor():
    try:
        QApplication.setOverrideCursor(QCursor(Qt.WaitCursor))
        yield
    finally:
        QApplication.restoreOverrideCursor()

class ImageTitleLayout(QHBoxLayout):
    '''
    A reusable layout widget displaying an image followed by a title
    '''
    def __init__(self, parent, icon_name, title, tooltip=None):
        QHBoxLayout.__init__(self)
        title_image_label = QLabel(parent)
        pixmap = get_pixmap(icon_name)
        if pixmap is None:
            pixmap = get_pixmap('library.png')
            # error_dialog(parent, _('Restart required'),
            #              _('You must restart Calibre before using this plugin!'), show=True)
        else:
            title_image_label.setPixmap(pixmap)
        title_image_label.setMaximumSize(32, 32)
        title_image_label.setScaledContents(True)
        self.addWidget(title_image_label)

        title_font = QFont()
        title_font.setPointSize(16)
        shelf_label = QLabel(title, parent)
        shelf_label.setFont(title_font)
        self.addWidget(shelf_label)
        self.insertStretch(-1)

        if tooltip:
            title_image_label.setToolTip(tooltip)
            shelf_label.setToolTip(tooltip)

class SizePersistedDialog(QDialog):
    '''
    This dialog is a base class for any dialogs that want their size/position
    restored when they are next opened.
    '''
    def __init__(self, parent, unique_pref_name):
        QDialog.__init__(self, parent)
        self.unique_pref_name = unique_pref_name
        self.geom = gprefs.get(unique_pref_name, None)
        self.finished.connect(self.dialog_closing)

    def resize_dialog(self):
        if self.geom is None:
            self.resize(self.sizeHint())
        else:
            self.restoreGeometry(self.geom)

    def dialog_closing(self, result):
        self.geom = bytearray(self.saveGeometry())
        gprefs[self.unique_pref_name] = self.geom

class EditableTableWidgetItem(QTableWidgetItem):

    def __init__(self, text):
        if text is None:
            text = ''
        QTableWidgetItem.__init__(self, text)
        self.setFlags(Qt.ItemIsSelectable|Qt.ItemIsEnabled|Qt.ItemIsEditable)

class ReadOnlyTableWidgetItem(QTableWidgetItem):

    def __init__(self, text):
        if text is None:
            text = ''
        QTableWidgetItem.__init__(self, text)
        self.setFlags(Qt.ItemIsSelectable|Qt.ItemIsEnabled)


class TextIconWidgetItem(QTableWidgetItem):

    def __init__(self, text, icon):
        QTableWidgetItem.__init__(self, text)
        if icon:
            self.setIcon(icon)


class ReadOnlyTextIconWidgetItem(ReadOnlyTableWidgetItem):

    def __init__(self, text, icon):
        ReadOnlyTableWidgetItem.__init__(self, text)
        if icon:
            self.setIcon(icon)


class KeyboardConfigDialog(SizePersistedDialog):
    '''
    This dialog is used to allow editing of keyboard shortcuts.
    '''
    def __init__(self, gui, group_name):
        SizePersistedDialog.__init__(self, gui, 'FanFicFare plugin:Keyboard shortcut dialog')
        self.gui = gui
        self.setWindowTitle(_('Keyboard shortcuts'))
        layout = QVBoxLayout(self)
        self.setLayout(layout)

        self.keyboard_widget = ShortcutConfig(self)
        layout.addWidget(self.keyboard_widget)
        self.group_name = group_name

        button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel)
        button_box.accepted.connect(self.commit)
        button_box.rejected.connect(self.reject)
        layout.addWidget(button_box)

        # Cause our dialog size to be restored from prefs or created on first usage
        self.resize_dialog()
        self.initialize()

    def initialize(self):
        self.keyboard_widget.initialize(self.gui.keyboard)
        self.keyboard_widget.highlight_group(self.group_name)

    def commit(self):
        self.keyboard_widget.commit()
        self.accept()


class PrefsViewerDialog(SizePersistedDialog):

    def __init__(self, gui, namespace):
        SizePersistedDialog.__init__(self, gui, _('Prefs Viewer dialog'))
        self.setWindowTitle(_('Preferences for: ')+namespace)

        self.gui = gui
        self.db = gui.current_db
        self.namespace = namespace
        self._init_controls()
        self.resize_dialog()

        self._populate_settings()

        if self.keys_list.count():
            self.keys_list.setCurrentRow(0)

    def _init_controls(self):
        layout = QVBoxLayout(self)
        self.setLayout(layout)

        ml = QHBoxLayout()
        layout.addLayout(ml, 1)

        self.keys_list = QListWidget(self)
        self.keys_list.setSelectionMode(QAbstractItemView.SingleSelection)
        self.keys_list.setFixedWidth(150)
        self.keys_list.setAlternatingRowColors(True)
        ml.addWidget(self.keys_list)
        self.value_text = QTextEdit(self)
        self.value_text.setReadOnly(True)
        ml.addWidget(self.value_text, 1)

        button_box = QDialogButtonBox(QDialogButtonBox.Ok)
        button_box.accepted.connect(self.accept)
        self.clear_button = button_box.addButton(_('Clear'), QDialogButtonBox.ResetRole)
        self.clear_button.setIcon(get_icon('trash.png'))
        self.clear_button.setToolTip(_('Clear all settings for this plugin'))
        self.clear_button.clicked.connect(self._clear_settings)

        if DEBUG:
            self.edit_button = button_box.addButton(_('Edit'), QDialogButtonBox.ResetRole)
            self.edit_button.setIcon(get_icon('edit_input.png'))
            self.edit_button.setToolTip(_('Edit settings.'))
            self.edit_button.clicked.connect(self._edit_settings)

            self.save_button = button_box.addButton(_('Save'), QDialogButtonBox.ResetRole)
            self.save_button.setIcon(get_icon('save.png'))
            self.save_button.setToolTip(_('Save setting for this plugin'))
            self.save_button.clicked.connect(self._save_settings)
            self.save_button.setEnabled(False)
        layout.addWidget(button_box)

    def _populate_settings(self):
        self.keys_list.clear()
        ns_prefix = self._get_ns_prefix()
        keys = sorted([k[len(ns_prefix):] for k in six.iterkeys(self.db.prefs)
                       if k.startswith(ns_prefix)])
        for key in keys:
            self.keys_list.addItem(key)
        self.keys_list.setMinimumWidth(self.keys_list.sizeHintForColumn(0))
        self.keys_list.currentRowChanged[int].connect(self._current_row_changed)

    def _current_row_changed(self, new_row):
        if new_row < 0:
            self.value_text.clear()
            return
        key = unicode(self.keys_list.currentItem().text())
        val = self.db.prefs.get_namespaced(self.namespace, key, '')
        self.value_text.setPlainText(self.db.prefs.to_raw(val))

    def _get_ns_prefix(self):
        return 'namespaced:%s:'% self.namespace

    def _edit_settings(self):
        from calibre.gui2.dialogs.confirm_delete import confirm
        message = '<p>' + _('Are you sure you want to edit settings in this library for this plugin?') + '</p>' \
                  + '<p>' + _('The FanFicFare team does not support hand edited configurations.') + '</p>'
        if confirm(message, self.namespace+'_edit_settings', self):
            self.save_button.setEnabled(True)
            self.edit_button.setEnabled(False)
            self.value_text.setReadOnly(False)

    def _save_settings(self):
        from calibre.gui2.dialogs.confirm_delete import confirm
        message = '<p>' + _('Are you sure you want to save this setting in this library for this plugin?') + '</p>' \
                  + '<p>' + _('Any settings in other libraries or stored in a JSON file in your calibre plugins folder will not be touched.') + '</p>' \
                  + '<p>' + _('You must restart calibre afterwards.') + '</p>'
        if not confirm(message, self.namespace+'_save_settings', self):
            return
        ns_prefix = self._get_ns_prefix()
        key = unicode(self.keys_list.currentItem().text())
        self.db.prefs.set_namespaced(self.namespace, key,
                                     self.db.prefs.raw_to_object(self.value_text.toPlainText()))
        d = info_dialog(self, 'Settings saved',
                        '<p>' + _('All settings for this plugin in this library have been saved.') + '</p>' \
                        + '<p>' + _('Please restart calibre now.') + '</p>',
                        show_copy_button=False)
        b = d.bb.addButton(_('Restart calibre now'), d.bb.AcceptRole)
        b.setIcon(QIcon(I('lt.png')))
        d.do_restart = False
        def rf():
            d.do_restart = True
        b.clicked.connect(rf)
        d.set_details('')
        d.exec_()
        b.clicked.disconnect()
        self.close()
        if d.do_restart:
            self.gui.quit(restart=True)

    def _clear_settings(self):
        from calibre.gui2.dialogs.confirm_delete import confirm
        message = '<p>' + _('Are you sure you want to clear your settings in this library for this plugin?') + '</p>' \
                  + '<p>' + _('Any settings in other libraries or stored in a JSON file in your calibre plugins folder will not be touched.') + '</p>' \
                  + '<p>' + _('You must restart calibre afterwards.') + '</p>'
        if not confirm(message, self.namespace+'_clear_settings', self):
            return
        ns_prefix = self._get_ns_prefix()
        keys = [k for k in six.iterkeys(self.db.prefs) if k.startswith(ns_prefix)]
        for k in keys:
            del self.db.prefs[k]
        self._populate_settings()
        d = info_dialog(self, 'Settings deleted',
                        '<p>' + _('All settings for this plugin in this library have been cleared.') + '</p>' \
                        + '<p>' + _('Please restart calibre now.') + '</p>',
                        show_copy_button=False)
        b = d.bb.addButton(_('Restart calibre now'), d.bb.AcceptRole)
        b.setIcon(QIcon(I('lt.png')))
        d.do_restart = False
        def rf():
            d.do_restart = True
        b.clicked.connect(rf)
        d.set_details('')
        d.exec_()
        b.clicked.disconnect()
        self.close()
        if d.do_restart:
            self.gui.quit(restart=True)
|
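`PrefsViewerDialog` above lists and clears settings by matching keys that start with `namespaced:<namespace>:`. The sketch below mirrors that key handling with a plain dict standing in for `db.prefs`. This is an assumption for illustration only: the real object is calibre's library prefs store, and the function names and the `'FanFicFarePlugin'` namespace are hypothetical.

```python
def ns_prefix(namespace):
    # Same key prefix that _get_ns_prefix() builds.
    return 'namespaced:%s:' % namespace


def list_namespace_keys(prefs, namespace):
    """Sorted key names with the namespace prefix stripped, as _populate_settings shows them."""
    prefix = ns_prefix(namespace)
    return sorted(k[len(prefix):] for k in prefs if k.startswith(prefix))


def clear_namespace(prefs, namespace):
    """Delete every key in the namespace, leaving other keys untouched."""
    prefix = ns_prefix(namespace)
    # Collect the keys first so the mapping is not mutated while iterating.
    for k in [k for k in prefs if k.startswith(prefix)]:
        del prefs[k]


if __name__ == '__main__':
    prefs = {'namespaced:FanFicFarePlugin:b': 1,
             'namespaced:FanFicFarePlugin:a': 2,
             'namespaced:Other:x': 3}
    print(list_namespace_keys(prefs, 'FanFicFarePlugin'))
    clear_namespace(prefs, 'FanFicFarePlugin')
    print(prefs)
```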
||||
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
from __future__ import (unicode_literals, division, absolute_import,
                        print_function)

__license__   = 'GPL v3'
__copyright__ = '2011, Grant Drake <grant.drake@gmail.com>'
__docformat__ = 'restructuredtext en'

import os
from PyQt4 import QtGui
from PyQt4.Qt import (Qt, QIcon, QPixmap, QLabel, QDialog, QHBoxLayout,
                      QTableWidgetItem, QFont, QLineEdit, QComboBox,
                      QVBoxLayout, QDialogButtonBox, QStyledItemDelegate, QDateTime)
from calibre.constants import iswindows
from calibre.gui2 import gprefs, error_dialog, UNDEFINED_QDATETIME
from calibre.gui2.actions import menu_action_unique_name
from calibre.gui2.keyboard import ShortcutConfig
from calibre.utils.config import config_dir
from calibre.utils.date import now, format_date, qt_to_dt, UNDEFINED_DATE

# Global definition of our plugin name. Used for common functions that require this.
plugin_name = None
# Global definition of our plugin resources. Used to share between the xxxAction and xxxBase
# classes if you need any zip images to be displayed on the configuration dialog.
plugin_icon_resources = {}


def set_plugin_icon_resources(name, resources):
    '''
    Set our global store of plugin name and icon resources for sharing between
    the InterfaceAction class which reads them and the ConfigWidget
    if needed for use on the customization dialog for this plugin.
    '''
    global plugin_icon_resources, plugin_name
    plugin_name = name
    plugin_icon_resources = resources


def get_icon(icon_name):
    '''
    Retrieve a QIcon for the named image from the zip file if it exists,
    or if not then from Calibre's image cache.
    '''
    if icon_name:
        pixmap = get_pixmap(icon_name)
        if pixmap is None:
            # Look in Calibre's cache for the icon
            return QIcon(I(icon_name))
        else:
            return QIcon(pixmap)
    return QIcon()


def get_pixmap(icon_name):
    '''
    Retrieve a QPixmap for the named image
    Any icons belonging to the plugin must be prefixed with 'images/'
    '''
    global plugin_icon_resources, plugin_name

    if not icon_name.startswith('images/'):
        # We know this is definitely not an icon belonging to this plugin
        pixmap = QPixmap()
        pixmap.load(I(icon_name))
        return pixmap

    # Check to see whether the icon exists as a Calibre resource
    # This will enable skinning if the user stores icons within a folder like:
    # ...\AppData\Roaming\calibre\resources\images\Plugin Name\
    if plugin_name:
        local_images_dir = get_local_images_dir(plugin_name)
        local_image_path = os.path.join(local_images_dir, icon_name.replace('images/', ''))
        if os.path.exists(local_image_path):
            pixmap = QPixmap()
            pixmap.load(local_image_path)
            return pixmap

    # As we did not find an icon elsewhere, look within our zip resources
    if icon_name in plugin_icon_resources:
        pixmap = QPixmap()
        pixmap.loadFromData(plugin_icon_resources[icon_name])
        return pixmap
    return None


def get_local_images_dir(subfolder=None):
    '''
    Returns a path to the user's local resources/images folder
    If a subfolder name parameter is specified, appends this to the path
    '''
    images_dir = os.path.join(config_dir, 'resources/images')
    if subfolder:
        images_dir = os.path.join(images_dir, subfolder)
    if iswindows:
        images_dir = os.path.normpath(images_dir)
    return images_dir


def create_menu_item(ia, parent_menu, menu_text, image=None, tooltip=None,
                     shortcut=(), triggered=None, is_checked=None):
    '''
    Create a menu action with the specified criteria and action
    Note that if no shortcut is specified, will not appear in Preferences->Keyboard
    This method should only be used for actions which either have no shortcuts,
    or register their menus only once. Use create_menu_action_unique for all else.
    '''
    if shortcut is not None:
        if len(shortcut) == 0:
            shortcut = ()
        else:
            shortcut = _(shortcut)
    ac = ia.create_action(spec=(menu_text, None, tooltip, shortcut),
                          attr=menu_text)
    if image:
        ac.setIcon(get_icon(image))
    if triggered is not None:
        ac.triggered.connect(triggered)
    if is_checked is not None:
        ac.setCheckable(True)
        if is_checked:
            ac.setChecked(True)

    parent_menu.addAction(ac)
    return ac


def create_menu_action_unique(ia, parent_menu, menu_text, image=None, tooltip=None,
                              shortcut=None, triggered=None, is_checked=None, shortcut_name=None,
                              unique_name=None):
    '''
    Create a menu action with the specified criteria and action, using the new
    InterfaceAction.create_menu_action() function which ensures that regardless of
    whether a shortcut is specified it will appear in Preferences->Keyboard
    '''
    orig_shortcut = shortcut
    kb = ia.gui.keyboard
    if unique_name is None:
        unique_name = menu_text
    if not shortcut == False:
        full_unique_name = menu_action_unique_name(ia, unique_name)
        if full_unique_name in kb.shortcuts:
            shortcut = False
        else:
            if shortcut is not None and not shortcut == False:
                if len(shortcut) == 0:
                    shortcut = None
                else:
                    shortcut = _(shortcut)

    if shortcut_name is None:
        shortcut_name = menu_text.replace('&','')

    ac = ia.create_menu_action(parent_menu, unique_name, menu_text, icon=None, shortcut=shortcut,
                               description=tooltip, triggered=triggered, shortcut_name=shortcut_name)
    if shortcut == False and not orig_shortcut == False:
        if ac.calibre_shortcut_unique_name in ia.gui.keyboard.shortcuts:
            kb.replace_action(ac.calibre_shortcut_unique_name, ac)
    if image:
        ac.setIcon(get_icon(image))
    if is_checked is not None:
        ac.setCheckable(True)
        if is_checked:
            ac.setChecked(True)
    return ac


def swap_author_names(author):
    if author.find(',') == -1:
        return author
    name_parts = author.strip().partition(',')
    return name_parts[2].strip() + ' ' + name_parts[0]


def get_library_uuid(db):
    try:
        library_uuid = db.library_id
    except:
        library_uuid = ''
    return library_uuid


class ImageLabel(QLabel):

    def __init__(self, parent, icon_name, size=16):
        QLabel.__init__(self, parent)
        pixmap = get_pixmap(icon_name)
        self.setPixmap(pixmap)
        self.setMaximumSize(size, size)
        self.setScaledContents(True)


class ImageTitleLayout(QHBoxLayout):
    '''
    A reusable layout widget displaying an image followed by a title
    '''
    def __init__(self, parent, icon_name, title):
        QHBoxLayout.__init__(self)
        title_image_label = QLabel(parent)
        pixmap = get_pixmap(icon_name)
        if pixmap is None:
            pixmap = get_pixmap('library.png')
            # error_dialog(parent, _('Restart required'),
            #              _('You must restart Calibre before using this plugin!'), show=True)
        else:
            title_image_label.setPixmap(pixmap)
        title_image_label.setMaximumSize(32, 32)
        title_image_label.setScaledContents(True)
        self.addWidget(title_image_label)

        title_font = QFont()
        title_font.setPointSize(16)
        shelf_label = QLabel(title, parent)
        shelf_label.setFont(title_font)
        self.addWidget(shelf_label)
        self.insertStretch(-1)


class SizePersistedDialog(QDialog):
    '''
    This dialog is a base class for any dialogs that want their size/position
    restored when they are next opened.
    '''
    def __init__(self, parent, unique_pref_name):
        QDialog.__init__(self, parent)
        self.unique_pref_name = unique_pref_name
        self.geom = gprefs.get(unique_pref_name, None)
        self.finished.connect(self.dialog_closing)

    def resize_dialog(self):
        if self.geom is None:
            self.resize(self.sizeHint())
        else:
            self.restoreGeometry(self.geom)

    def dialog_closing(self, result):
        geom = bytearray(self.saveGeometry())
        gprefs[self.unique_pref_name] = geom


class ReadOnlyTableWidgetItem(QTableWidgetItem):

    def __init__(self, text):
        if text is None:
            text = ''
        QTableWidgetItem.__init__(self, text, QtGui.QTableWidgetItem.UserType)
        self.setFlags(Qt.ItemIsSelectable|Qt.ItemIsEnabled)


class RatingTableWidgetItem(QTableWidgetItem):

    def __init__(self, rating, is_read_only=False):
        QTableWidgetItem.__init__(self, '', QtGui.QTableWidgetItem.UserType)
        self.setData(Qt.DisplayRole, rating)
        if is_read_only:
            self.setFlags(Qt.ItemIsSelectable|Qt.ItemIsEnabled)


class DateTableWidgetItem(QTableWidgetItem):

    def __init__(self, date_read, is_read_only=False, default_to_today=False):
        if date_read == UNDEFINED_DATE and default_to_today:
            date_read = now()
        if is_read_only:
            QTableWidgetItem.__init__(self, format_date(date_read, None), QtGui.QTableWidgetItem.UserType)
            self.setFlags(Qt.ItemIsSelectable|Qt.ItemIsEnabled)
        else:
            QTableWidgetItem.__init__(self, '', QtGui.QTableWidgetItem.UserType)
            self.setData(Qt.DisplayRole, QDateTime(date_read))


class NoWheelComboBox(QComboBox):
|
||||
|
||||
def wheelEvent (self, event):
|
||||
# Ignore wheel events so that scrolling over the combo box does not change its selection, which plays havoc in a grid
|
||||
event.ignore()
|
||||
|
||||
|
||||
class CheckableTableWidgetItem(QTableWidgetItem):
|
||||
|
||||
def __init__(self, checked=False, is_tristate=False):
|
||||
QTableWidgetItem.__init__(self, '')
|
||||
self.setFlags(Qt.ItemFlags(Qt.ItemIsSelectable | Qt.ItemIsUserCheckable | Qt.ItemIsEnabled ))
|
||||
if is_tristate:
|
||||
self.setFlags(self.flags() | Qt.ItemIsTristate)
|
||||
if checked:
|
||||
self.setCheckState(Qt.Checked)
|
||||
else:
|
||||
if is_tristate and checked is None:
|
||||
self.setCheckState(Qt.PartiallyChecked)
|
||||
else:
|
||||
self.setCheckState(Qt.Unchecked)
|
||||
|
||||
def get_boolean_value(self):
|
||||
'''
|
||||
Return a boolean value indicating whether checkbox is checked
|
||||
If this is a tristate checkbox, a partially checked value is returned as None
|
||||
'''
|
||||
if self.checkState() == Qt.PartiallyChecked:
|
||||
return None
|
||||
else:
|
||||
return self.checkState() == Qt.Checked
|
||||
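The True/False/None handling in CheckableTableWidgetItem can be sketched without Qt. This is only an illustration: the three constants are hypothetical stand-ins for the Qt check states, and the two function names are not part of the plugin.

```python
# Hypothetical stand-ins for Qt.Unchecked, Qt.PartiallyChecked and Qt.Checked.
UNCHECKED, PARTIALLY_CHECKED, CHECKED = 0, 1, 2


def to_check_state(checked, is_tristate=False):
    """Map True/False/None to a check state, following the constructor logic."""
    if checked:
        return CHECKED
    if is_tristate and checked is None:
        return PARTIALLY_CHECKED
    return UNCHECKED


def to_boolean(state):
    """Inverse mapping: a partially checked state is reported as None."""
    if state == PARTIALLY_CHECKED:
        return None
    return state == CHECKED
```

Note that None only maps to the partially checked state when the item is tristate; otherwise it is treated as unchecked.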
|
||||
|
||||
class TextIconWidgetItem(QTableWidgetItem):
|
||||
|
||||
def __init__(self, text, icon):
|
||||
QTableWidgetItem.__init__(self, text)
|
||||
if icon:
|
||||
self.setIcon(icon)
|
||||
|
||||
|
||||
class ReadOnlyTextIconWidgetItem(ReadOnlyTableWidgetItem):
|
||||
|
||||
def __init__(self, text, icon):
|
||||
ReadOnlyTableWidgetItem.__init__(self, text)
|
||||
if icon:
|
||||
self.setIcon(icon)
|
||||
|
||||
|
||||
class ReadOnlyLineEdit(QLineEdit):
|
||||
|
||||
def __init__(self, text, parent):
|
||||
if text is None:
|
||||
text = ''
|
||||
QLineEdit.__init__(self, text, parent)
|
||||
self.setEnabled(False)
|
||||
|
||||
|
||||
class KeyValueComboBox(QComboBox):
|
||||
|
||||
def __init__(self, parent, values, selected_key):
|
||||
QComboBox.__init__(self, parent)
|
||||
self.values = values
|
||||
self.populate_combo(selected_key)
|
||||
|
||||
def populate_combo(self, selected_key):
|
||||
self.clear()
|
||||
selected_idx = idx = -1
|
||||
for key, value in self.values.iteritems():
|
||||
idx = idx + 1
|
||||
self.addItem(value)
|
||||
if key == selected_key:
|
||||
selected_idx = idx
|
||||
self.setCurrentIndex(selected_idx)
|
||||
|
||||
def selected_key(self):
|
||||
for key, value in self.values.iteritems():
|
||||
if value == unicode(self.currentText()).strip():
|
||||
return key
|
||||
|
||||
|
||||
class CustomColumnComboBox(QComboBox):
|
||||
|
||||
def __init__(self, parent, custom_columns, selected_column, initial_items=['']):
|
||||
QComboBox.__init__(self, parent)
|
||||
self.populate_combo(custom_columns, selected_column, initial_items)
|
||||
|
||||
def populate_combo(self, custom_columns, selected_column, initial_items=['']):
|
||||
self.clear()
|
||||
self.column_names = list(initial_items)  # copy, so the shared default list is never mutated by append() below
|
||||
if len(initial_items) > 0:
|
||||
self.addItems(initial_items)
|
||||
selected_idx = 0
|
||||
for idx, value in enumerate(initial_items):
|
||||
if value == selected_column:
|
||||
selected_idx = idx
|
||||
for key in sorted(custom_columns.keys()):
|
||||
self.column_names.append(key)
|
||||
self.addItem('%s (%s)'%(key, custom_columns[key]['name']))
|
||||
if key == selected_column:
|
||||
selected_idx = len(self.column_names) - 1
|
||||
self.setCurrentIndex(selected_idx)
|
||||
|
||||
def get_selected_column(self):
|
||||
return self.column_names[self.currentIndex()]
|
||||
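A general Python note that applies to signatures such as `populate_combo(..., initial_items=[''])`: a list used as a default argument is created once, when the function is defined, so appending to it without copying accumulates entries across calls. The sketch below uses hypothetical function names and is independent of Qt and of the plugin.

```python
def collect_shared(key, names=['']):
    # The default list is created once at definition time, so every call
    # that relies on the default appends to the same list object.
    names.append(key)
    return names


def collect_copied(key, names=('',)):
    # Copying first leaves the default untouched between calls.
    names = list(names)
    names.append(key)
    return names
```

Calling `collect_shared` twice with the default grows one shared list, while `collect_copied` starts from a fresh copy each time.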
|
||||
|
||||
class KeyboardConfigDialog(SizePersistedDialog):
|
||||
'''
|
||||
This dialog is used to allow editing of keyboard shortcuts.
|
||||
'''
|
||||
def __init__(self, gui, group_name):
|
||||
SizePersistedDialog.__init__(self, gui, 'Keyboard shortcut dialog')
|
||||
self.gui = gui
|
||||
self.setWindowTitle('Keyboard shortcuts')
|
||||
layout = QVBoxLayout(self)
|
||||
self.setLayout(layout)
|
||||
|
||||
self.keyboard_widget = ShortcutConfig(self)
|
||||
layout.addWidget(self.keyboard_widget)
|
||||
self.group_name = group_name
|
||||
|
||||
button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel)
|
||||
button_box.accepted.connect(self.commit)
|
||||
button_box.rejected.connect(self.reject)
|
||||
layout.addWidget(button_box)
|
||||
|
||||
# Cause our dialog size to be restored from prefs or created on first usage
|
||||
self.resize_dialog()
|
||||
self.initialize()
|
||||
|
||||
def initialize(self):
|
||||
self.keyboard_widget.initialize(self.gui.keyboard)
|
||||
self.keyboard_widget.highlight_group(self.group_name)
|
||||
|
||||
def commit(self):
|
||||
self.keyboard_widget.commit()
|
||||
self.accept()
|
||||
|
||||
|
||||
class DateDelegate(QStyledItemDelegate):
|
||||
'''
|
||||
Delegate for dates. Because this delegate stores the
|
||||
format as an instance variable, a new instance must be created for each
|
||||
column. This differs from all the other delegates.
|
||||
'''
|
||||
def __init__(self, parent):
|
||||
QStyledItemDelegate.__init__(self, parent)
|
||||
self.format = 'dd MMM yyyy'
|
||||
|
||||
def displayText(self, val, locale):
|
||||
d = val.toDateTime()
|
||||
if d <= UNDEFINED_QDATETIME:
|
||||
return ''
|
||||
return format_date(qt_to_dt(d, as_utc=False), self.format)
|
||||
|
||||
def createEditor(self, parent, option, index):
|
||||
qde = QStyledItemDelegate.createEditor(self, parent, option, index)
|
||||
qde.setDisplayFormat(self.format)
|
||||
qde.setMinimumDateTime(UNDEFINED_QDATETIME)
|
||||
qde.setSpecialValueText(_('Undefined'))
|
||||
qde.setCalendarPopup(True)
|
||||
return qde
|
||||
|
||||
def setEditorData(self, editor, index):
|
||||
val = index.model().data(index, Qt.DisplayRole).toDateTime()
|
||||
if val is None or val == UNDEFINED_QDATETIME:
|
||||
val = now()
|
||||
editor.setDateTime(val)
|
||||
|
||||
def setModelData(self, editor, model, index):
|
||||
val = editor.dateTime()
|
||||
if val <= UNDEFINED_QDATETIME:
|
||||
model.setData(index, UNDEFINED_QDATETIME, Qt.EditRole)
|
||||
else:
|
||||
model.setData(index, QDateTime(val), Qt.EditRole)
|
||||
|
|
|
|||
File diff suppressed because it is too large
Load diff
990
calibre-plugin/ffdl_plugin.py
Normal file
|
|
@ -0,0 +1,990 @@
|
|||
#!/usr/bin/env python
|
||||
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
|
||||
from __future__ import (unicode_literals, division, absolute_import,
|
||||
print_function)
|
||||
|
||||
__license__ = 'GPL v3'
|
||||
__copyright__ = '2012, Jim Miller'
|
||||
__docformat__ = 'restructuredtext en'
|
||||
|
||||
import time, os, copy, threading
|
||||
from ConfigParser import SafeConfigParser
|
||||
from StringIO import StringIO
|
||||
from functools import partial
|
||||
from datetime import datetime
|
||||
|
||||
from PyQt4.Qt import (QApplication, QMenu, QToolButton)
|
||||
|
||||
from PyQt4.Qt import QPixmap, Qt
|
||||
from PyQt4.QtCore import QBuffer
|
||||
|
||||
|
||||
from calibre.ptempfile import PersistentTemporaryFile, PersistentTemporaryDirectory, remove_dir
|
||||
from calibre.ebooks.metadata import MetaInformation, authors_to_string
|
||||
from calibre.ebooks.metadata.meta import get_metadata
|
||||
from calibre.gui2 import error_dialog, warning_dialog, question_dialog, info_dialog
|
||||
from calibre.gui2.dialogs.message_box import ViewLog
|
||||
from calibre.gui2.dialogs.confirm_delete import confirm
|
||||
from calibre.utils.date import local_tz
|
||||
|
||||
# The class that all interface action plugins must inherit from
|
||||
from calibre.gui2.actions import InterfaceAction
|
||||
|
||||
from calibre_plugins.fanfictiondownloader_plugin.common_utils import (set_plugin_icon_resources, get_icon,
|
||||
create_menu_action_unique, get_library_uuid)
|
||||
|
||||
from calibre_plugins.fanfictiondownloader_plugin.fanficdownloader import adapters, writers, exceptions
|
||||
from calibre_plugins.fanfictiondownloader_plugin.fanficdownloader.htmlcleanup import stripHTML
|
||||
from calibre_plugins.fanfictiondownloader_plugin.fanficdownloader.epubutils import get_dcsource, get_dcsource_chaptercount
|
||||
|
||||
from calibre_plugins.fanfictiondownloader_plugin.config import (prefs, permitted_values)
|
||||
from calibre_plugins.fanfictiondownloader_plugin.dialogs import (
|
||||
AddNewDialog, UpdateExistingDialog, display_story_list, DisplayStoryListDialog,
|
||||
LoopProgressDialog, UserPassDialog, AboutDialog,
|
||||
OVERWRITE, OVERWRITEALWAYS, UPDATE, UPDATEALWAYS, ADDNEW, SKIP, CALIBREONLY,
|
||||
NotGoingToDownload )
|
||||
|
||||
# calibre immediately converts html into zip, and we don't want
|
||||
# to have an 'if html'. db.has_format is cool with the case mismatch,
|
||||
# but if I'm doing it anyway...
|
||||
formmapping = {
|
||||
'epub':'EPUB',
|
||||
'mobi':'MOBI',
|
||||
'html':'ZIP',
|
||||
'txt':'TXT'
|
||||
}
|
||||
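As a small illustration of how the mapping is meant to be used, the sketch below looks up the calibre format name for a download format. The helper name `calibre_format` is hypothetical and not part of the plugin; the dictionary is copied from the listing.

```python
formmapping = {
    'epub': 'EPUB',
    'mobi': 'MOBI',
    'html': 'ZIP',
    'txt': 'TXT',
}


def calibre_format(fileform):
    """Return the calibre format name for a download format (hypothetical helper)."""
    return formmapping[fileform]
```

With this, an html download is looked up in the library as a ZIP format without a special case at each call site.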
|
||||
PLUGIN_ICONS = ['images/icon.png']
|
||||
|
||||
class FanFictionDownLoaderPlugin(InterfaceAction):
|
||||
|
||||
name = 'FanFictionDownLoader'
|
||||
|
||||
# Declare the main action associated with this plugin
|
||||
# The keyboard shortcut can be None if you don't want to use a keyboard
|
||||
# shortcut. Remember that currently calibre has no central management for
|
||||
# keyboard shortcuts, so try to use an unusual/unused shortcut.
|
||||
# (text, icon_path, tooltip, keyboard shortcut)
|
||||
# icon_path isn't in the zip--icon loaded below.
|
||||
action_spec = (name, None,
|
||||
'Download FanFiction stories from various web sites', ())
|
||||
# None for keyboard shortcut doesn't allow shortcut. () does, there just isn't one yet
|
||||
|
||||
action_type = 'global'
|
||||
# make button menu drop down only
|
||||
#popup_type = QToolButton.InstantPopup
|
||||
|
||||
def genesis(self):
|
||||
|
||||
# This method is called once per plugin, do initial setup here
|
||||
|
||||
# Read the plugin icons and store for potential sharing with the config widget
|
||||
icon_resources = self.load_resources(PLUGIN_ICONS)
|
||||
set_plugin_icon_resources(self.name, icon_resources)
|
||||
|
||||
base = self.interface_action_base_plugin
|
||||
self.version = base.name+" v%d.%d.%d"%base.version
|
||||
|
||||
# Set the icon for this interface action
|
||||
# The get_icons function is a builtin function defined for all your
|
||||
# plugin code. It loads icons from the plugin zip file. It returns
|
||||
# QIcon objects, if you want the actual data, use the analogous
|
||||
# get_resources builtin function.
|
||||
|
||||
# Note that if you are loading more than one icon, for performance, you
|
||||
# should pass a list of names to get_icons. In this case, get_icons
|
||||
# will return a dictionary mapping names to QIcons. Names that
|
||||
# are not found in the zip file will result in null QIcons.
|
||||
icon = get_icon('images/icon.png')
|
||||
|
||||
#self.qaction.setText('FFDL')
|
||||
|
||||
# The qaction is automatically created from the action_spec defined
|
||||
# above
|
||||
self.qaction.setIcon(icon)
|
||||
|
||||
# Call function when plugin triggered.
|
||||
self.qaction.triggered.connect(self.plugin_button)
|
||||
|
||||
# Assign our menu to this action
|
||||
self.menu = QMenu(self.gui)
|
||||
self.old_actions_unique_map = {}
|
||||
self.qaction.setMenu(self.menu)
|
||||
self.menu.aboutToShow.connect(self.about_to_show_menu)
|
||||
|
||||
self.menus_lock = threading.RLock()
|
||||
|
||||
def initialization_complete(self):
|
||||
# otherwise configured hot keys won't work until the menu's
|
||||
# been displayed once.
|
||||
self.rebuild_menus()
|
||||
|
||||
def about_to_show_menu(self):
|
||||
self.rebuild_menus()
|
||||
|
||||
def library_changed(self, db):
|
||||
# We need to reset our menus after switching libraries
|
||||
self.rebuild_menus()
|
||||
|
||||
def rebuild_menus(self):
|
||||
with self.menus_lock:
|
||||
# Show the config dialog
|
||||
# The config dialog can also be shown from within
|
||||
# Preferences->Plugins, which is why the do_user_config
|
||||
# method is defined on the base plugin class
|
||||
do_user_config = self.interface_action_base_plugin.do_user_config
|
||||
self.menu.clear()
|
||||
self.actions_unique_map = {}
|
||||
self.add_action = self.create_menu_item_ex(self.menu, '&Add New from URL(s)', image='plus.png',
|
||||
unique_name='Add New FanFiction Book(s) from URL(s)',
|
||||
shortcut_name='Add New FanFiction Book(s) from URL(s)',
|
||||
triggered=self.add_dialog )
|
||||
|
||||
self.update_action = self.create_menu_item_ex(self.menu, '&Update Existing FanFiction Book(s)', image='plusplus.png',
|
||||
unique_name='Update Existing FanFiction Book(s)',
|
||||
shortcut_name='Update Existing FanFiction Book(s)',
|
||||
triggered=self.update_existing)
|
||||
|
||||
if 'Reading List' in self.gui.iactions and (prefs['addtolists'] or prefs['addtoreadlists']) :
|
||||
self.menu.addSeparator()
|
||||
addmenutxt, rmmenutxt = None, None
|
||||
if prefs['addtolists'] and prefs['addtoreadlists'] :
|
||||
addmenutxt = 'Add to "To Read" and "Send to Device" Lists'
|
||||
if prefs['addtolistsonread']:
|
||||
rmmenutxt = 'Remove from "To Read" and add to "Send to Device" Lists'
|
||||
else:
|
||||
rmmenutxt = 'Remove from "To Read" Lists'
|
||||
elif prefs['addtolists'] :
|
||||
addmenutxt = 'Add Selected to "Send to Device" Lists'
|
||||
elif prefs['addtoreadlists']:
|
||||
addmenutxt = 'Add to "To Read" Lists'
|
||||
rmmenutxt = 'Remove from "To Read" Lists'
|
||||
|
||||
if addmenutxt:
|
||||
self.add_send_action = self.create_menu_item_ex(self.menu, addmenutxt, image='plusplus.png',
|
||||
unique_name=addmenutxt,
|
||||
shortcut_name=addmenutxt,
|
||||
triggered=partial(self.update_lists,add=True))
|
||||
|
||||
if rmmenutxt:
|
||||
self.add_remove_action = self.create_menu_item_ex(self.menu, rmmenutxt, image='minusminus.png',
|
||||
unique_name=rmmenutxt,
|
||||
shortcut_name=rmmenutxt,
|
||||
triggered=partial(self.update_lists,add=False))
|
||||
|
||||
# try:
|
||||
# self.add_send_action.setEnabled( len(self.gui.library_view.get_selected_ids()) > 0 )
|
||||
# except:
|
||||
# pass
|
||||
# try:
|
||||
# self.add_remove_action.setEnabled( len(self.gui.library_view.get_selected_ids()) > 0 )
|
||||
# except:
|
||||
# pass
|
||||
|
||||
self.menu.addSeparator()
|
||||
self.get_list_action = self.create_menu_item_ex(self.menu, 'Get URLs from Selected Books', image='bookmarks.png',
|
||||
unique_name='Get URLs from Selected Books',
|
||||
shortcut_name='Get URLs from Selected Books',
|
||||
triggered=self.get_list_urls)
|
||||
|
||||
self.menu.addSeparator()
|
||||
self.config_action = create_menu_action_unique(self, self.menu, '&Configure Plugin', shortcut=False,
|
||||
image= 'config.png',
|
||||
unique_name='Configure FanFictionDownLoader',
|
||||
shortcut_name='Configure FanFictionDownLoader',
|
||||
triggered=partial(do_user_config,parent=self.gui))
|
||||
|
||||
self.about_action = create_menu_action_unique(self, self.menu, '&About Plugin', shortcut=False,
|
||||
image= 'images/icon.png',
|
||||
unique_name='About FanFictionDownLoader',
|
||||
shortcut_name='About FanFictionDownLoader',
|
||||
triggered=self.about)
|
||||
|
||||
# Before we finalize, make sure we delete any actions for menus that are no longer displayed
|
||||
for menu_id, unique_name in self.old_actions_unique_map.iteritems():
|
||||
if menu_id not in self.actions_unique_map:
|
||||
self.gui.keyboard.unregister_shortcut(unique_name)
|
||||
self.old_actions_unique_map = self.actions_unique_map
|
||||
self.gui.keyboard.finalize()
|
||||
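The cleanup step at the end of rebuild_menus compares the previous map of unique names with the freshly built one and unregisters whatever is no longer present. A minimal sketch of that comparison, using plain dictionaries and a hypothetical function name:

```python
def stale_shortcut_names(old_map, new_map):
    """Return the names registered last time whose keys are absent from new_map."""
    return [name for key, name in old_map.items() if key not in new_map]
```

In the plugin, each returned name would be passed to the keyboard manager's unregister call before the new map replaces the old one.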
|
||||
def about(self):
|
||||
# Get the about text from a file inside the plugin zip file
|
||||
# The get_resources function is a builtin function defined for all your
|
||||
# plugin code. It loads files from the plugin zip file. It returns
|
||||
# the bytes from the specified file.
|
||||
#
|
||||
# Note that if you are loading more than one file, for performance, you
|
||||
# should pass a list of names to get_resources. In this case,
|
||||
# get_resources will return a dictionary mapping names to bytes. Names that
|
||||
# are not found in the zip file will not be in the returned dictionary.
|
||||
|
||||
text = get_resources('about.txt')
|
||||
AboutDialog(self.gui,self.qaction.icon(),self.version + text).exec_()
|
||||
|
||||
def create_menu_item_ex(self, parent_menu, menu_text, image=None, tooltip=None,
|
||||
shortcut=None, triggered=None, is_checked=None, shortcut_name=None,
|
||||
unique_name=None):
|
||||
ac = create_menu_action_unique(self, parent_menu, menu_text, image, tooltip,
|
||||
shortcut, triggered, is_checked, shortcut_name, unique_name)
|
||||
self.actions_unique_map[ac.calibre_shortcut_unique_name] = ac.calibre_shortcut_unique_name
|
||||
return ac
|
||||
|
||||
def plugin_button(self):
|
||||
if len(self.gui.library_view.get_selected_ids()) > 0 and prefs['updatedefault']:
|
||||
self.update_existing()
|
||||
else:
|
||||
self.add_dialog()
|
||||
|
||||
def update_lists(self,add=True):
|
||||
if len(self.gui.library_view.get_selected_ids()) > 0 and \
|
||||
(prefs['addtolists'] or prefs['addtoreadlists']) :
|
||||
self._update_reading_lists(self.gui.library_view.get_selected_ids(),add)
|
||||
|
||||
def get_list_urls(self):
|
||||
if len(self.gui.library_view.get_selected_ids()) > 0:
|
||||
book_list = map( partial(self._convert_id_to_book, good=False), self.gui.library_view.get_selected_ids() )
|
||||
|
||||
LoopProgressDialog(self.gui,
|
||||
book_list,
|
||||
partial(self._get_story_url_for_list, db=self.gui.current_db),
|
||||
self._finish_get_list_urls,
|
||||
init_label="Collecting URLs for stories...",
|
||||
win_title="Get URLs for stories",
|
||||
status_prefix="URL retrieved")
|
||||
|
||||
def _get_story_url_for_list(self,book,db=None):
|
||||
book['url'] = self._get_story_url(db,book['calibre_id'])
|
||||
if book['url'] is None:
|
||||
book['good']=False
|
||||
else:
|
||||
book['good']=True
|
||||
|
||||
def _finish_get_list_urls(self, book_list):
|
||||
url_list = [ x['url'] for x in book_list if x['good'] ]
|
||||
if url_list:
|
||||
d = ViewLog(_("List of URLs"),"\n".join(url_list),parent=self.gui)
|
||||
d.setWindowIcon(get_icon('bookmarks.png'))
|
||||
d.exec_()
|
||||
else:
|
||||
info_dialog(self.gui, _('List of URLs'),
|
||||
_('No URLs found in selected books.'),
|
||||
show=True,
|
||||
show_copy_button=False)
|
||||
|
||||
def add_dialog(self):
|
||||
|
||||
#print("add_dialog()")
|
||||
|
||||
url_list = self.get_urls_clip()
|
||||
url_list_text = "\n".join(url_list)
|
||||
|
||||
# self.gui is the main calibre GUI. It acts as the gateway to access
|
||||
# all the elements of the calibre user interface, it should also be the
|
||||
# parent of the dialog
|
||||
# AddNewDialog just collects URLs, format and presents buttons.
|
||||
d = AddNewDialog(self.gui,
|
||||
prefs,
|
||||
self.qaction.icon(),
|
||||
url_list_text,
|
||||
)
|
||||
d.exec_()
|
||||
if d.result() != d.Accepted:
|
||||
return
|
||||
|
||||
url_list = get_url_list(d.get_urlstext())
|
||||
add_books = self._convert_urls_to_books(url_list)
|
||||
#print("add_books:%s"%add_books)
|
||||
#print("options:%s"%d.get_ffdl_options())
|
||||
|
||||
options = d.get_ffdl_options()
|
||||
options['version'] = self.version
|
||||
print(self.version)
|
||||
|
||||
self.start_downloads( options, add_books )
|
||||
|
||||
def update_existing(self):
|
||||
if len(self.gui.library_view.get_selected_ids()) == 0:
|
||||
return
|
||||
#print("update_existing()")
|
||||
|
||||
db = self.gui.current_db
|
||||
book_list = map( partial(self._convert_id_to_book, good=False), self.gui.library_view.get_selected_ids() )
|
||||
#book_ids = self.gui.library_view.get_selected_ids()
|
||||
|
||||
LoopProgressDialog(self.gui,
|
||||
book_list,
|
||||
partial(self._populate_book_from_calibre_id, db=self.gui.current_db),
|
||||
self._update_existing_2,
|
||||
init_label="Collecting stories for update...",
|
||||
win_title="Get stories for updates",
|
||||
status_prefix="URL retrieved")
|
||||
|
||||
#books = self._convert_calibre_ids_to_books(db, book_ids)
|
||||
#print("update books:%s"%books)
|
||||
|
||||
def _update_existing_2(self,book_list):
|
||||
|
||||
d = UpdateExistingDialog(self.gui,
|
||||
'Update Existing List',
|
||||
prefs,
|
||||
self.qaction.icon(),
|
||||
book_list,
|
||||
)
|
||||
d.exec_()
|
||||
if d.result() != d.Accepted:
|
||||
return
|
||||
|
||||
update_books = d.get_books()
|
||||
|
||||
#print("update_books:%s"%update_books)
|
||||
#print("options:%s"%d.get_ffdl_options())
|
||||
# only if there's some good ones.
|
||||
if 0 < len(filter(lambda x : x['good'], update_books)):
|
||||
options = d.get_ffdl_options()
|
||||
options['version'] = self.version
|
||||
print(self.version)
|
||||
self.start_downloads( options, update_books )
|
||||
|
||||
def get_urls_clip(self):
|
||||
url_list = []
|
||||
if prefs['urlsfromclip']:
|
||||
for url in unicode(QApplication.instance().clipboard().text()).split():
|
||||
if( self._is_good_downloader_url(url) ):
|
||||
url_list.append(url)
|
||||
return url_list
|
||||
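The clipboard handling in get_urls_clip amounts to splitting text on whitespace and keeping the tokens that pass a URL check. A rough sketch follows; `looks_like_http_url` is a hypothetical stand-in for the plugin's `_is_good_downloader_url`, whose real logic is not shown in this listing.

```python
def urls_from_text(text, is_good_url):
    """Split text on any whitespace and keep the tokens the predicate accepts."""
    return [token for token in text.split() if is_good_url(token)]


def looks_like_http_url(token):
    # Hypothetical check; the plugin asks its site adapters instead.
    return token.startswith(('http://', 'https://'))
```

Because `str.split()` with no argument splits on spaces, tabs and newlines alike, URLs can be pasted one per line or separated by spaces.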
|
||||
def apply_settings(self):
|
||||
# No need to do anything with prefs here, but we could.
|
||||
pass
|
||||
|
||||
def start_downloads(self, options, books):
|
||||
|
||||
#print("start_downloads:%s"%books)
|
||||
|
||||
# create and pass temp dir.
|
||||
tdir = PersistentTemporaryDirectory(prefix='fanfictiondownloader_')
|
||||
options['tdir']=tdir
|
||||
|
||||
self.gui.status_bar.show_message(_('Started fetching metadata for %s stories.')%len(books), 3000)
|
||||
|
||||
if 0 < len(filter(lambda x : x['good'], books)):
|
||||
LoopProgressDialog(self.gui,
|
||||
books,
|
||||
partial(self.get_metadata_for_book, options = options),
|
||||
partial(self.start_download_list, options = options))
|
||||
# LoopProgressDialog calls get_metadata_for_book for each 'good' story,
|
||||
# get_metadata_for_book updates book for each,
|
||||
# LoopProgressDialog calls start_download_list at the end which goes
|
||||
# into the BG, or shows list if no 'good' books.
|
||||
|
||||
def get_metadata_for_book(self,book,
|
||||
options={'fileform':'epub',
|
||||
'collision':ADDNEW,
|
||||
'updatemeta':True}):
|
||||
'''
|
||||
Update passed in book dict with metadata from website and
|
||||
necessary data. To be called from LoopProgressDialog
|
||||
'loop'. Also pops dialogs for is adult, user/pass.
|
||||
'''
|
||||
|
||||
# The current database shown in the GUI
|
||||
# db is an instance of the class LibraryDatabase2 from database.py
|
||||
# This class has many, many methods that allow you to do a lot of
|
||||
# things.
|
||||
db = self.gui.current_db
|
||||
|
||||
fileform = options['fileform']
|
||||
collision = options['collision']
|
||||
updatemeta= options['updatemeta']
|
||||
|
||||
if not book['good']:
|
||||
# book has already been flagged bad for whatever reason.
|
||||
return
|
||||
|
||||
url = book['url']
|
||||
print("url:%s"%url)
|
||||
skip_date_update = False
|
||||
|
||||
## was self.ffdlconfig, but we need to be able to change it
|
||||
## when doing epub update.
|
||||
ffdlconfig = SafeConfigParser()
|
||||
ffdlconfig.readfp(StringIO(get_resources("plugin-defaults.ini")))
|
||||
ffdlconfig.readfp(StringIO(prefs['personal.ini']))
|
||||
adapter = adapters.getAdapter(ffdlconfig,url,fileform)
|
||||
|
||||
options['personal.ini'] = prefs['personal.ini']
|
||||
if prefs['includeimages']:
|
||||
# this is a cheat to make it easier for users.
|
||||
options['personal.ini'] = '''[defaults]
|
||||
include_images:true
|
||||
keep_summary_html:true
|
||||
make_firstimage_cover:true
|
||||
''' + options['personal.ini']
|
||||
|
||||
## three tries, that's enough if both user/pass & is_adult needed,
|
||||
## or a couple tries of one or the other
|
||||
for x in range(0,2):
|
||||
try:
|
||||
adapter.getStoryMetadataOnly()
|
||||
except exceptions.FailedToLogin as f:
|
||||
print("Login Failed, Need Username/Password.")
|
||||
userpass = UserPassDialog(self.gui,url,f)
|
||||
userpass.exec_() # exec_ will make it act modal
|
||||
if userpass.status:
|
||||
adapter.username = userpass.user.text()
|
||||
adapter.password = userpass.passwd.text()
|
||||
|
||||
except exceptions.AdultCheckRequired:
|
||||
if question_dialog(self.gui, 'Are You Adult?', '<p>'+
|
||||
"%s requires that you be an adult. Please confirm you are an adult in your locale:"%url,
|
||||
show_copy_button=False):
|
||||
adapter.is_adult=True
|
||||
|
||||
# let other exceptions percolate up.
|
||||
story = adapter.getStoryMetadataOnly()
|
||||
writer = writers.getWriter(options['fileform'],adapter.config,adapter)
|
||||
|
||||
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
|
||||
book['title'] = story.getMetadata("title", removeallentities=True)
|
||||
book['author_sort'] = book['author'] = story.getMetadata("author", removeallentities=True)
|
||||
book['publisher'] = story.getMetadata("site")
|
||||
book['tags'] = writer.getTags() # getTags could be moved up into adapter now. The adapter previously did not know the fileform.
|
||||
book['comments'] = stripHTML(story.getMetadata("description")) #, removeallentities=True) comments handles entities better.
|
||||
book['series'] = story.getMetadata("series")
|
||||
|
||||
# adapter.opener is the element with a threadlock. But del
|
||||
# adapter.opener doesn't work--subproc fails when it tries
|
||||
# to pull in the adapter object that hasn't been imported yet.
|
||||
# book['adapter'] = adapter
|
||||
|
||||
book['is_adult'] = adapter.is_adult
|
||||
book['username'] = adapter.username
|
||||
book['password'] = adapter.password
|
||||
|
||||
book['icon'] = 'plus.png'
|
||||
if story.getMetadataRaw('datePublished'):
|
||||
# should only happen when an adapter is broken, but better to
|
||||
# fail gracefully.
|
||||
book['pubdate'] = story.getMetadataRaw('datePublished').replace(tzinfo=local_tz)
|
||||
book['timestamp'] = None # filled below if not skipped.
|
||||
|
||||
if collision == CALIBREONLY:
|
||||
book['icon'] = 'metadata.png'
|
||||
|
||||
# Dialogs should prevent this case now.
|
||||
if collision in (UPDATE,UPDATEALWAYS) and fileform != 'epub':
|
||||
raise NotGoingToDownload("Cannot update non-epub format.")
|
||||
|
||||
book_id = None
|
||||
|
||||
if book['calibre_id'] != None:
|
||||
# updating an existing book. Update mode applies.
|
||||
print("update existing id:%s"%book['calibre_id'])
|
||||
book_id = book['calibre_id']
|
||||
# No handling needed: OVERWRITEALWAYS,CALIBREONLY
|
||||
|
||||
# only care about collisions when not ADDNEW
|
||||
elif collision != ADDNEW:
|
||||
# 'new' book from URL. collision handling applies.
|
||||
print("from URL")
|
||||
|
||||
# find dups
|
||||
mi = MetaInformation(story.getMetadata("title", removeallentities=True),
|
||||
(story.getMetadata("author", removeallentities=True),)) # author is a list.
|
||||
identicalbooks = db.find_identical_books(mi)
|
||||
## removed for being overkill.
|
||||
# for ib in identicalbooks:
|
||||
# # only *really* identical if URL matches, too.
|
||||
# # XXX make an option?
|
||||
# if self._get_story_url(db,ib) == url:
|
||||
# identicalbooks.append(ib)
|
||||
#print("identicalbooks:%s"%identicalbooks)
|
||||
|
||||
if collision == SKIP and identicalbooks:
|
||||
raise NotGoingToDownload("Skipping duplicate story.","list_remove.png")
|
||||
|
||||
if len(identicalbooks) > 1:
|
||||
raise NotGoingToDownload("More than one identical book--can't tell which to update/overwrite.","minusminus.png")
|
||||
|
||||
if collision == CALIBREONLY and not identicalbooks:
|
||||
raise NotGoingToDownload("Not updating Calibre Metadata, no existing book to update.","search_delete_saved.png")
|
||||
|
||||
if len(identicalbooks)>0:
|
||||
book_id = identicalbooks.pop()
|
||||
book['calibre_id'] = book_id
|
||||
book['icon'] = 'edit-redo.png'
|
||||
|
||||
if book_id != None and collision != ADDNEW:
|
||||
if options['collision'] == CALIBREONLY:
|
||||
book['comment'] = 'Metadata collected.'
|
||||
# don't need temp file created below.
|
||||
return
|
||||
|
||||
## newer/chaptercount checks are the same for both:
|
||||
# Update epub, but only if more chapters.
|
||||
if collision in (UPDATE,UPDATEALWAYS): # collision == UPDATE
|
||||
# 'book' can exist without epub. If there's no existing epub,
|
||||
# let it go and it will download it.
|
||||
if db.has_format(book_id,fileform,index_is_id=True):
|
||||
(epuburl,chaptercount) = \
|
||||
get_dcsource_chaptercount(StringIO(db.format(book_id,'EPUB',
|
||||
index_is_id=True)))
|
||||
urlchaptercount = int(story.getMetadata('numChapters'))
|
||||
if chaptercount == urlchaptercount:
|
||||
if collision == UPDATE:
|
||||
raise NotGoingToDownload("Already contains %d chapters."%chaptercount,'edit-undo.png')
|
||||
else:
|
||||
# UPDATEALWAYS
|
||||
skip_date_update = True
|
||||
elif chaptercount > urlchaptercount:
|
||||
raise NotGoingToDownload("Existing epub contains %d chapters, web site only has %d. Use Overwrite to force update." % (chaptercount,urlchaptercount),'dialog_error.png')
|
||||
|
||||
if collision == OVERWRITE and \
|
||||
db.has_format(book_id,formmapping[fileform],index_is_id=True):
|
||||
# make sure the incoming story is newer than the existing file.
|
||||
lastupdated=story.getMetadataRaw('dateUpdated').date()
|
||||
fileupdated=datetime.fromtimestamp(os.stat(db.format_abspath(book_id, formmapping[fileform], index_is_id=True)).st_mtime).date()
|
||||
if fileupdated > lastupdated:
|
||||
raise NotGoingToDownload("Not Overwriting, web site is not newer.",'edit-undo.png')
|
||||
|
||||
# For update, provide a tmp file copy of the existing epub so
|
||||
# it can't change underneath us.
|
||||
if collision in (UPDATE,UPDATEALWAYS) and \
|
||||
db.has_format(book['calibre_id'],'EPUB',index_is_id=True):
|
||||
tmp = PersistentTemporaryFile(prefix='old-%s-'%book['calibre_id'],
|
||||
suffix='.epub',
|
||||
dir=options['tdir'])
|
||||
db.copy_format_to(book_id,fileform,tmp,index_is_id=True)
|
||||
print("existing epub tmp:"+tmp.name)
|
||||
book['epub_for_update'] = tmp.name
|
||||
|
||||
if collision != CALIBREONLY and not skip_date_update:
|
||||
# I'm half convinced this should be dateUpdated instead, but
|
||||
# this behavior matches how epubs come out when imported
|
||||
# dateCreated == packaged--epub/etc created.
|
||||
book['timestamp'] = story.getMetadataRaw('dateCreated').replace(tzinfo=local_tz)
|
||||
|
||||
if book['good']: # there shouldn't be any !'good' books at this point.
|
||||
# if still 'good', make a temp file to write the output to.
|
||||
tmp = PersistentTemporaryFile(prefix='new-%s-'%book['calibre_id'],
|
||||
suffix='.'+options['fileform'],
|
||||
dir=options['tdir'])
|
||||
print("title:"+book['title'])
|
||||
print("outfile:"+tmp.name)
|
||||
book['outfile'] = tmp.name
|
||||
|
||||
return
|
||||
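The chapter-count rules applied in get_metadata_for_book for the update modes can be condensed into a small decision function. This is a sketch under assumptions: the constant values and the returned labels are hypothetical, and the real method raises NotGoingToDownload or sets skip_date_update rather than returning a string.

```python
# Hypothetical values; the real constants are defined in the plugin's dialogs module.
UPDATE, UPDATEALWAYS = 'update', 'updatealways'


def update_decision(mode, existing_chapters, site_chapters):
    """Return 'skip', 'download' or 'download-keep-date' for an update mode."""
    if existing_chapters > site_chapters:
        # The local epub has more chapters than the site: never update.
        return 'skip'
    if existing_chapters == site_chapters:
        # Nothing new: plain UPDATE skips, UPDATEALWAYS re-downloads
        # but leaves the book's date alone.
        return 'skip' if mode == UPDATE else 'download-keep-date'
    return 'download'
```

The 'download-keep-date' outcome corresponds to the skip_date_update flag in the listing.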
|
||||
def start_download_list(self,book_list,
|
||||
options={'fileform':'epub',
|
||||
'collision':ADDNEW,
|
||||
'updatemeta':True}):
|
||||
'''
|
||||
Called by LoopProgressDialog to start story downloads BG processing.
|
||||
book_list is a list of book dicts, as filled in by get_metadata_for_book.
|
||||
'''
|
||||
#print("start_download_list:book_list:%s"%book_list)
|
||||
|
||||
## No need to BG process when CALIBREONLY! Fake it.
|
||||
if options['collision'] == CALIBREONLY:
|
||||
class NotJob(object):
|
||||
def __init__(self,result):
|
||||
self.failed=False
|
||||
self.result=result
|
||||
notjob = NotJob(book_list)
|
||||
self.download_list_completed(notjob,options=options)
|
||||
return
|
||||
|
||||
for book in book_list:
|
||||
if book['good']:
|
||||
break
|
||||
else:
|
||||
## No good stories to try to download, go straight to
|
||||
## list.
|
||||
d = DisplayStoryListDialog(self.gui,
|
||||
'Nothing to Download',
|
||||
prefs,
|
||||
self.qaction.icon(),
|
||||
book_list,
|
||||
label_text='None of the URLs/stories given can be/need to be downloaded.'
|
||||
)
|
||||
d.exec_()
|
||||
return
|
||||
|
||||
func = 'arbitrary_n'
|
||||
cpus = self.gui.job_manager.server.pool_size
|
||||
args = ['calibre_plugins.fanfictiondownloader_plugin.jobs', 'do_download_worker',
|
||||
(book_list, options, cpus)]
|
||||
desc = 'Download FanFiction Book'
|
||||
job = self.gui.job_manager.run_job(
|
||||
self.Dispatcher(partial(self.download_list_completed,options=options)),
|
||||
func, args=args,
|
||||
description=desc)
|
||||
|
||||
self.gui.status_bar.show_message('Starting %d FanFictionDownLoads'%len(book_list),3000)
|
||||
|
||||
def _update_book(self,book,db=None,
|
||||
options={'fileform':'epub',
|
||||
'collision':ADDNEW,
|
||||
'updatemeta':True}):
|
||||
print("add/update %s %s"%(book['title'],book['url']))
|
||||
mi = self._make_mi_from_book(book)
|
||||
|
||||
if options['collision'] != CALIBREONLY:
|
||||
self._add_or_update_book(book,options,prefs,mi)
|
||||
|
||||
if options['collision'] == CALIBREONLY or \
|
||||
(options['updatemeta'] and book['good']):
|
||||
self._update_metadata(db, book['calibre_id'], book, mi, options)
|
||||
|
||||
def _update_books_completed(self, book_list, options={}):
|
||||
|
||||
add_list = filter(lambda x : x['good'] and x['added'], book_list)
|
||||
update_list = filter(lambda x : x['good'] and not x['added'], book_list)
|
||||
update_ids = [ x['calibre_id'] for x in update_list ]
|
||||
|
||||
if len(add_list):
|
||||
## even shows up added to searches. Nice.
|
||||
self.gui.library_view.model().books_added(len(add_list))
|
||||
|
||||
if update_ids:
|
||||
self.gui.library_view.model().refresh_ids(update_ids)
|
||||
|
||||
current = self.gui.library_view.currentIndex()
|
||||
self.gui.library_view.model().current_changed(current, self.previous)
|
||||
self.gui.tags_view.recount()
|
||||
|
||||
if self.gui.cover_flow:
|
||||
self.gui.cover_flow.dataChanged()
|
||||
|
||||
self.gui.status_bar.show_message(_('Finished Adding/Updating %d books.'%(len(update_list) + len(add_list))), 3000)
|
||||
|
||||
if len(update_list) + len(add_list) != len(book_list):
|
||||
d = DisplayStoryListDialog(self.gui,
|
||||
'Updates completed, final status',
|
||||
prefs,
|
||||
self.qaction.icon(),
|
||||
book_list,
|
||||
label_text='Stories have been added or updated in Calibre, some had additional problems.'
|
||||
)
|
||||
d.exec_()
|
||||
|
||||
print("all done, remove temp dir.")
|
||||
remove_dir(options['tdir'])
|
||||
|
||||
def download_list_completed(self, job, options={}):
|
||||
if job.failed:
|
||||
self.gui.job_exception(job, dialog_title='Failed to Download Stories')
|
||||
return
|
||||
|
||||
self.previous = self.gui.library_view.currentIndex()
|
||||
db = self.gui.current_db
|
||||
|
||||
if display_story_list(self.gui,
|
||||
'Downloads finished, confirm to update Calibre',
|
||||
prefs,
|
||||
self.qaction.icon(),
|
||||
job.result,
|
||||
label_text='Stories will not be added or updated in Calibre without confirmation.',
|
||||
offer_skip=True):
|
||||
|
||||
book_list = job.result
|
||||
good_list = filter(lambda x : x['good'], book_list)
|
||||
total_good = len(good_list)
|
||||
|
||||
self.gui.status_bar.show_message(_('Adding/Updating %s books.'%total_good))
|
||||
|
||||
if total_good > 0:
|
||||
LoopProgressDialog(self.gui,
|
||||
good_list,
|
||||
partial(self._update_book, options=options, db=self.gui.current_db),
|
||||
partial(self._update_books_completed, options=options),
|
||||
init_label="Updating calibre for stories...",
|
||||
win_title="Update calibre for stories",
|
||||
status_prefix="Updated")
|
||||
|
||||
def _add_or_update_book(self,book,options,prefs,mi=None):
|
||||
db = self.gui.current_db
|
||||
|
||||
if mi is None:
|
||||
mi = self._make_mi_from_book(book)
|
||||
|
||||
book_id = book['calibre_id']
|
||||
if book_id is None:
|
||||
book_id = db.create_book_entry(mi,
|
||||
add_duplicates=True)
|
||||
book['calibre_id'] = book_id
|
||||
book['added'] = True
|
||||
else:
|
||||
book['added'] = False
|
||||
|
||||
if not db.add_format_with_hooks(book_id,
|
||||
options['fileform'],
|
||||
book['outfile'], index_is_id=True):
|
||||
book['comment'] = "Adding format to book failed for some reason..."
|
||||
book['good']=False
|
||||
book['icon']='dialog_error.png'
|
||||
|
||||
if prefs['deleteotherforms']:
|
||||
fmts = db.formats(book['calibre_id'], index_is_id=True).split(',')
|
||||
for fmt in fmts:
|
||||
if fmt != formmapping[options['fileform']]:
|
||||
print("remove f:"+fmt)
|
||||
db.remove_format(book['calibre_id'], fmt, index_is_id=True)#, notify=False
|
||||
|
||||
if prefs['addtolists'] or prefs['addtoreadlists']:
|
||||
self._update_reading_lists([book_id],add=True)
|
||||
|
||||
return book_id
|
||||
|
||||
def _update_metadata(self, db, book_id, book, mi, options):
|
||||
if prefs['keeptags']:
|
||||
old_tags = db.get_tags(book_id)
|
||||
# remove old Completed/In-Progress only if there's a new one.
|
||||
if 'Completed' in mi.tags or 'In-Progress' in mi.tags:
|
||||
old_tags = filter( lambda x : x not in ('Completed', 'In-Progress'), old_tags)
|
||||
# remove old Last Update tags if there are new ones.
|
||||
if len(filter( lambda x : x.startswith("Last Update"), mi.tags)) > 0:
|
||||
old_tags = filter( lambda x : not x.startswith("Last Update"), old_tags)
|
||||
# mi.tags needs to be list, but set kills dups.
|
||||
mi.tags = list(set(list(old_tags)+mi.tags))
|
||||
|
||||
if 'langcode' in book['all_metadata']:
|
||||
mi.languages=[book['all_metadata']['langcode']]
|
||||
else:
|
||||
# Set language english, but only if not already set.
|
||||
oldmi = db.get_metadata(book_id,index_is_id=True)
|
||||
if not oldmi.languages:
|
||||
mi.languages=['eng']
|
||||
|
||||
if options['fileform'] == 'epub' and prefs['updatecover']:
|
||||
existingepub = db.format(book_id,'EPUB',index_is_id=True, as_file=True)
|
||||
epubmi = get_metadata(existingepub,'EPUB')
|
||||
if epubmi.cover_data[1] is not None:
|
||||
db.set_cover(book_id, epubmi.cover_data[1])
|
||||
#mi.cover = epubmi.cover_data[1]
|
||||
|
||||
db.set_metadata(book_id,mi)
|
||||
|
||||
# do configured column updates here.
|
||||
#print("all_metadata: %s"%book['all_metadata'])
|
||||
custom_columns = self.gui.library_view.model().custom_columns
|
||||
|
||||
#print("prefs['custom_cols'] %s"%prefs['custom_cols'])
|
||||
for col, meta in prefs['custom_cols'].iteritems():
|
||||
#print("setting %s to %s"%(col,meta))
|
||||
if col not in custom_columns:
|
||||
print("%s not an existing column, skipping."%col)
|
||||
continue
|
||||
coldef = custom_columns[col]
|
||||
if not meta.startswith('status-') and meta not in book['all_metadata']:
|
||||
print("No value for %s, skipping."%meta)
|
||||
continue
|
||||
if meta not in permitted_values[coldef['datatype']]:
|
||||
print("%s not a valid column type for %s, skipping."%(col,meta))
|
||||
continue
|
||||
label = coldef['label']
|
||||
if coldef['datatype'] in ('enumeration','text','comments','datetime','series'):
|
||||
db.set_custom(book_id, book['all_metadata'][meta], label=label, commit=False)
|
||||
elif coldef['datatype'] in ('int','float'):
|
||||
num = unicode(book['all_metadata'][meta]).replace(",","")
|
||||
db.set_custom(book_id, num, label=label, commit=False)
|
||||
elif coldef['datatype'] == 'bool' and meta.startswith('status-'):
|
||||
if meta == 'status-C':
|
||||
val = book['all_metadata']['status'] == 'Completed'
|
||||
if meta == 'status-I':
|
||||
val = book['all_metadata']['status'] == 'In-Progress'
|
||||
db.set_custom(book_id, val, label=label, commit=False)
|
||||
|
||||
db.commit()
|
||||
|
||||
def _get_clean_reading_lists(self,lists):
|
||||
if lists is None or lists.strip() == "":
|
||||
return []
|
||||
else:
|
||||
return filter( lambda x : x, map( lambda x : x.strip(), lists.split(',') ) )
|
||||
|
||||
def _update_reading_lists(self,book_ids,add=True):
|
||||
try:
|
||||
rl_plugin = self.gui.iactions['Reading List']
|
||||
except:
|
||||
if prefs['addtolists'] or prefs['addtoreadlists']:
|
||||
message="<p>You configured FanFictionDownLoader to automatically update Reading Lists, but you don't have the Reading List plugin installed anymore?</p>"
|
||||
confirm(message,'fanfictiondownloader_no_reading_list_plugin', self.gui)
|
||||
return
|
||||
|
||||
# XXX check for existence of lists, warning if not.
|
||||
if prefs['addtoreadlists']:
|
||||
if add:
|
||||
addremovefunc = rl_plugin.add_books_to_list
|
||||
else:
|
||||
addremovefunc = rl_plugin.remove_books_from_list
|
||||
|
||||
lists = self._get_clean_reading_lists(prefs['read_lists'])
|
||||
if len(lists) < 1 :
|
||||
message="<p>You configured FanFictionDownLoader to automatically update \"To Read\" Reading Lists, but you don't have any lists set?</p>"
|
||||
confirm(message,'fanfictiondownloader_no_read_lists', self.gui)
|
||||
for l in lists:
|
||||
if l in rl_plugin.get_list_names():
|
||||
#print("add good read l:(%s)"%l)
|
||||
addremovefunc(l,
|
||||
book_ids,
|
||||
display_warnings=False)
|
||||
else:
|
||||
if l != '':
|
||||
message="<p>You configured FanFictionDownLoader to automatically update Reading List '%s', but you don't have a list of that name?</p>"%l
|
||||
confirm(message,'fanfictiondownloader_no_reading_list_%s'%l, self.gui)
|
||||
|
||||
if prefs['addtolists'] and (add or (prefs['addtolistsonread'] and prefs['addtoreadlists']) ):
|
||||
lists = self._get_clean_reading_lists(prefs['send_lists'])
|
||||
if len(lists) < 1 :
|
||||
message="<p>You configured FanFictionDownLoader to automatically update \"Send to Device\" Reading Lists, but you don't have any lists set?</p>"
|
||||
confirm(message,'fanfictiondownloader_no_send_lists', self.gui)
|
||||
for l in lists:
|
||||
if l in rl_plugin.get_list_names():
|
||||
#print("good send l:(%s)"%l)
|
||||
rl_plugin.add_books_to_list(l,
|
||||
book_ids,
|
||||
display_warnings=False)
|
||||
else:
|
||||
if l != '':
|
||||
message="<p>You configured FanFictionDownLoader to automatically update Reading List '%s', but you don't have a list of that name?</p>"%l
|
||||
confirm(message,'fanfictiondownloader_no_reading_list_%s'%l, self.gui)
|
||||
|
||||
def _find_existing_book_id(self,db,book,matchurl=True):
|
||||
mi = MetaInformation(book["title"],(book["author"],)) # author is a list.
|
||||
identicalbooks = db.find_identical_books(mi)
|
||||
if matchurl: # only *really* identical if URL matches, too.
|
||||
for ib in identicalbooks:
|
||||
if self._get_story_url(db,ib) == book['url']:
|
||||
return ib
|
||||
if identicalbooks:
|
||||
return identicalbooks.pop()
|
||||
return None
|
||||
|
||||
def _make_mi_from_book(self,book):
|
||||
mi = MetaInformation(book['title'],(book['author'],)) # author is a list.
|
||||
mi.set_identifiers({'url':book['url']})
|
||||
mi.publisher = book['publisher']
|
||||
mi.tags = book['tags']
|
||||
#mi.languages = ['en'] # handled in _update_metadata so it can check for existing lang.
|
||||
mi.pubdate = book['pubdate']
|
||||
mi.timestamp = book['timestamp']
|
||||
mi.comments = book['comments']
|
||||
mi.series = book['series']
|
||||
return mi
|
||||
|
||||
|
||||
def _convert_urls_to_books(self, urls):
|
||||
books = []
|
||||
uniqueurls = set()
|
||||
for url in urls:
|
||||
book = self._convert_url_to_book(url)
|
||||
if book['url'] in uniqueurls:
|
||||
book['good'] = False
|
||||
book['comment'] = "Same story already included."
|
||||
uniqueurls.add(book['url'])
|
||||
books.append(book)
|
||||
return books
|
||||
|
||||
def _convert_url_to_book(self, url):
|
||||
book = {}
|
||||
book['good'] = True
|
||||
book['calibre_id'] = None
|
||||
book['title'] = 'Unknown'
|
||||
book['author'] = 'Unknown'
|
||||
book['author_sort'] = 'Unknown'
|
||||
|
||||
book['comment'] = ''
|
||||
book['url'] = ''
|
||||
book['added'] = False
|
||||
|
||||
self._set_book_url_and_comment(book,url)
|
||||
return book
|
||||
|
||||
def _convert_id_to_book(self, idval, good=True):
|
||||
book = {}
|
||||
book['good'] = good
|
||||
book['calibre_id'] = idval
|
||||
book['title'] = 'Unknown'
|
||||
book['author'] = 'Unknown'
|
||||
book['author_sort'] = 'Unknown'
|
||||
|
||||
book['comment'] = ''
|
||||
book['url'] = ''
|
||||
book['added'] = False
|
||||
|
||||
return book
|
||||
|
||||
def _populate_book_from_calibre_id(self, book, db=None):
|
||||
mi = db.get_metadata(book['calibre_id'], index_is_id=True)
|
||||
#book = {}
|
||||
book['good'] = True
|
||||
book['calibre_id'] = mi.id
|
||||
book['title'] = mi.title
|
||||
book['author'] = authors_to_string(mi.authors)
|
||||
book['author_sort'] = mi.author_sort
|
||||
book['comment'] = ''
|
||||
book['url'] = ""
|
||||
book['added'] = False
|
||||
|
||||
url = self._get_story_url(db,book['calibre_id'])
|
||||
self._set_book_url_and_comment(book,url)
|
||||
#return book
|
||||
|
||||
def _set_book_url_and_comment(self,book,url):
|
||||
if not url:
|
||||
book['comment'] = "No story URL found."
|
||||
book['good'] = False
|
||||
book['icon'] = 'search_delete_saved.png'
|
||||
else:
|
||||
# get normalized url or None.
|
||||
book['url'] = self._is_good_downloader_url(url)
|
||||
if book['url'] is None:
|
||||
book['url'] = url
|
||||
book['comment'] = "URL is not a valid story URL."
|
||||
book['good'] = False
|
||||
book['icon']='dialog_error.png'
|
||||
|
||||
def _get_story_url(self, db, book_id):
|
||||
identifiers = db.get_identifiers(book_id,index_is_id=True)
|
||||
if 'url' in identifiers:
|
||||
# identifiers have :->| in url.
|
||||
#print("url from book:"+identifiers['url'].replace('|',':'))
|
||||
return identifiers['url'].replace('|',':')
|
||||
else:
|
||||
## only epub has URL in it--at least where I can easily find it.
|
||||
if db.has_format(book_id,'EPUB',index_is_id=True):
|
||||
existingepub = db.format(book_id,'EPUB',index_is_id=True, as_file=True)
|
||||
mi = get_metadata(existingepub,'EPUB')
|
||||
identifiers = mi.get_identifiers()
|
||||
if 'url' in identifiers:
|
||||
#print("url from epub:"+identifiers['url'].replace('|',':'))
|
||||
return identifiers['url'].replace('|',':')
|
||||
# look for dc:source
|
||||
return get_dcsource(existingepub)
|
||||
return None
|
||||
|
||||
def _is_good_downloader_url(self,url):
|
||||
# this is the accepted way to 'check for existence'? really?
|
||||
try:
|
||||
self.dummyconfig
|
||||
except AttributeError:
|
||||
self.dummyconfig = SafeConfigParser()
|
||||
# pulling up an adapter is pretty low over-head. If
|
||||
# it fails, it's a bad url.
|
||||
try:
|
||||
adapter = adapters.getAdapter(self.dummyconfig,url)
|
||||
url = adapter.url
|
||||
del adapter
|
||||
return url
|
||||
except:
|
||||
return None
|
||||
|
||||
def get_url_list(urls):
|
||||
def f(x):
|
||||
if x.strip(): return True
|
||||
else: return False
|
||||
# set removes dups.
|
||||
return set(filter(f,urls.strip().splitlines()))
|
||||
|
||||
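The `get_url_list` helper above keeps the non-blank lines of a pasted block of text and relies on a set to drop duplicates. A minimal Python 3 sketch of the same idea follows; the name `unique_url_lines` is hypothetical and not part of the plugin, and, like the original, the sketch does not preserve input order.

```python
def unique_url_lines(urls):
    """Return the distinct non-blank lines of a multi-line string.

    Hypothetical stand-in for get_url_list(); a set removes duplicates,
    so the order of the input lines is not preserved.
    """
    # strip() trims leading/trailing blank lines before splitting.
    return set(line for line in urls.strip().splitlines() if line.strip())


if __name__ == "__main__":
    pasted = "http://a\n   \nhttp://b\nhttp://a\n"
    print(sorted(unique_url_lines(pasted)))
```

If input order matters to the caller, a list with a "seen" set would be the usual alternative; the original code does not do that.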
File diff suppressed because it is too large
|
|
@ -1,116 +0,0 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
from __future__ import (unicode_literals, division, absolute_import,
|
||||
print_function)
|
||||
|
||||
__license__ = 'GPL v3'
|
||||
__copyright__ = '2020, Jim Miller'
|
||||
__docformat__ = 'restructuredtext en'
|
||||
|
||||
from functools import reduce
|
||||
|
||||
from io import StringIO
|
||||
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
from fanficfare import adapters
|
||||
from fanficfare.configurable import Configuration
|
||||
from calibre_plugins.fanficfare_plugin.prefs import prefs
|
||||
from fanficfare.six import ensure_text
|
||||
from fanficfare.six.moves import configparser
|
||||
from fanficfare.six.moves import collections_abc
|
||||
|
||||
def get_fff_personalini():
|
||||
return prefs['personal.ini']
|
||||
|
||||
def get_fff_config(url,fileform="epub",personalini=None):
|
||||
if not personalini:
|
||||
personalini = get_fff_personalini()
|
||||
sections=['unknown']
|
||||
try:
|
||||
sections = adapters.getConfigSectionsFor(url)
|
||||
except Exception as e:
|
||||
logger.debug("Failed trying to get ini config for url(%s): %s, using section %s instead"%(url,e,sections))
|
||||
configuration = Configuration(sections,fileform)
|
||||
configuration.read_file(StringIO(ensure_text(get_resources("plugin-defaults.ini"))))
|
||||
configuration.read_file(StringIO(ensure_text(personalini)))
|
||||
|
||||
return configuration
|
||||
|
||||
def get_fff_adapter(url,fileform="epub",personalini=None):
|
||||
return adapters.getAdapter(get_fff_config(url,fileform,personalini),url)
|
||||
|
||||
def test_config(initext):
|
||||
try:
|
||||
configini = get_fff_config("test1.com?sid=555",
|
||||
personalini=initext)
|
||||
errors = configini.test_config()
|
||||
except configparser.ParsingError as pe:
|
||||
errors = pe.errors
|
||||
|
||||
return errors
|
||||
|
||||
|
||||
class OrderedSet(collections_abc.MutableSet):
|
||||
|
||||
def __init__(self, iterable=None):
|
||||
self.end = end = []
|
||||
end += [None, end, end] # sentinel node for doubly linked list
|
||||
self.map = {} # key --> [key, prev, next]
|
||||
if iterable is not None:
|
||||
self |= iterable
|
||||
|
||||
def __len__(self):
|
||||
return len(self.map)
|
||||
|
||||
def __contains__(self, key):
|
||||
return key in self.map
|
||||
|
||||
def add(self, key):
|
||||
if key not in self.map:
|
||||
end = self.end
|
||||
curr = end[1]
|
||||
curr[2] = end[1] = self.map[key] = [key, curr, end]
|
||||
|
||||
def discard(self, key):
|
||||
if key in self.map:
|
||||
key, prev, next = self.map.pop(key)
|
||||
prev[2] = next
|
||||
next[1] = prev
|
||||
|
||||
def __iter__(self):
|
||||
end = self.end
|
||||
curr = end[2]
|
||||
while curr is not end:
|
||||
yield curr[0]
|
||||
curr = curr[2]
|
||||
|
||||
def __reversed__(self):
|
||||
end = self.end
|
||||
curr = end[1]
|
||||
while curr is not end:
|
||||
yield curr[0]
|
||||
curr = curr[1]
|
||||
|
||||
def pop(self, last=True):
|
||||
if not self:
|
||||
raise KeyError('set is empty')
|
||||
key = self.end[1][0] if last else self.end[2][0]
|
||||
self.discard(key)
|
||||
return key
|
||||
|
||||
def __repr__(self):
|
||||
if not self:
|
||||
return '%s()' % (self.__class__.__name__,)
|
||||
return '%s(%r)' % (self.__class__.__name__, list(self))
|
||||
|
||||
def __eq__(self, other):
|
||||
if isinstance(other, OrderedSet):
|
||||
return len(self) == len(other) and list(self) == list(other)
|
||||
return set(self) == set(other)
|
||||
|
||||
def get_common_elements(ll):
|
||||
## returns a list of elements common to all lists in ll
|
||||
## https://www.tutorialspoint.com/find-common-elements-in-list-of-lists-in-python
|
||||
return list(reduce(lambda i, j: i & j, (OrderedSet(n) for n in ll)))
|
||||
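`get_common_elements` above intersects ordered sets so the result has a stable order. As a rough illustration of the same idea without the `OrderedSet` class, the sketch below uses plain sets for the intersection and then walks the first list to fix the order. The name `common_elements` and the choice of first-list ordering are assumptions of this sketch; the plugin's `OrderedSet` may order its result differently.

```python
from functools import reduce


def common_elements(lists):
    """Return the elements present in every list, in first-list order.

    Hypothetical sketch: ordering follows the first list, which is a
    choice made here and not necessarily what the plugin produces.
    """
    if not lists:
        return []
    # Intersect all lists as plain (unordered) sets.
    shared = reduce(lambda a, b: a & b, (set(items) for items in lists))
    seen = set()
    result = []
    # Walk the first list to give the result a deterministic order.
    for item in lists[0]:
        if item in shared and item not in seen:
            seen.add(item)
            result.append(item)
    return result


if __name__ == "__main__":
    print(common_elements([[3, 1, 2, 3], [2, 3, 4], [3, 2, 9]]))
```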
Binary file not shown.
|
Before: Size 23 KiB. After: Size 24 KiB.
Binary file not shown.
|
|
@ -1,159 +0,0 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
from __future__ import (absolute_import, unicode_literals, division,
|
||||
print_function)
|
||||
|
||||
__license__ = 'GPL v3'
|
||||
__copyright__ = '2020, Jim Miller'
|
||||
__docformat__ = 'restructuredtext en'
|
||||
|
||||
import re
|
||||
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
from PyQt5.Qt import (QApplication, Qt, QColor, QSyntaxHighlighter,
|
||||
QTextCharFormat, QBrush, QFont)
|
||||
|
||||
try:
|
||||
# qt6 Calibre v6+
|
||||
QFontNormal = QFont.Weight.Normal
|
||||
QFontBold = QFont.Weight.Bold
|
||||
except:
|
||||
# qt5 Calibre v2-5
|
||||
QFontNormal = QFont.Normal
|
||||
QFontBold = QFont.Bold
|
||||
|
||||
from fanficfare.six import string_types
|
||||
|
||||
class IniHighlighter(QSyntaxHighlighter):
|
||||
'''
|
||||
QSyntaxHighlighter class for use with QTextEdit for highlighting
|
||||
ini config files.
|
||||
'''
|
||||
|
||||
def __init__( self, parent, sections=[], keywords=[], entries=[], entry_keywords=[] ):
|
||||
QSyntaxHighlighter.__init__( self, parent )
|
||||
self.parent = parent
|
||||
|
||||
self.highlightingRules = []
|
||||
|
||||
colors = {
|
||||
'knownentries':Qt.darkGreen,
|
||||
'errors':Qt.red,
|
||||
'allkeywords':Qt.darkMagenta,
|
||||
'knownkeywords':Qt.blue,
|
||||
'knownsections':Qt.darkBlue,
|
||||
'teststories':Qt.darkCyan,
|
||||
'storyUrls':Qt.darkMagenta,
|
||||
'comments':Qt.darkYellow
|
||||
}
|
||||
try:
|
||||
if( hasattr(QApplication.instance(),'is_dark_theme')
|
||||
and QApplication.instance().is_dark_theme ):
|
||||
colors = {
|
||||
'knownentries':Qt.green,
|
||||
'errors':Qt.red,
|
||||
'allkeywords':Qt.magenta,
|
||||
'knownkeywords':QColor(Qt.blue).lighter(150),
|
||||
'knownsections':Qt.darkCyan,
|
||||
'teststories':Qt.cyan,
|
||||
'storyUrls':QColor(Qt.magenta).lighter(150),
|
||||
'comments':Qt.yellow
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error("Failed to set dark theme highlight colors: %s"%e)
|
||||
|
||||
if entries:
|
||||
# *known* entries
|
||||
reentries = r'('+(r'|'.join(entries))+r')'
|
||||
self.highlightingRules.append( HighlightingRule( r"\b"+reentries+r"\b", colors['knownentries'] ) )
|
||||
|
||||
# true/false -- just to be nice.
|
||||
self.highlightingRules.append( HighlightingRule( r"\b(true|false)\b", colors['knownentries'] ) )
|
||||
|
||||
# *all* keywords -- change known later.
|
||||
self.errorRule = HighlightingRule( r"^[^:=\s][^:=]*[:=]", colors['errors'] )
|
||||
self.highlightingRules.append( self.errorRule )
|
||||
|
||||
# *all* entry keywords -- change known later.
|
||||
reentrykeywords = r'('+(r'|'.join([ e % r'[a-zA-Z0-9_]+' for e in entry_keywords ]))+r')'
|
||||
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+reentrykeywords+r"(_filelist)?\s*[:=]", colors['allkeywords'] ) )
|
||||
|
||||
if entries: # separate from known entries so entry named keyword won't be masked.
|
||||
# *known* entry keywords
|
||||
reentrykeywords = r'('+(r'|'.join([ e % reentries for e in entry_keywords ]))+r')'
|
||||
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+reentrykeywords+r"(_filelist)?\s*[:=]", colors['knownkeywords'] ) )
|
||||
|
||||
# *known* keywords
|
||||
rekeywords = r'('+(r'|'.join(keywords))+r')'
|
||||
self.highlightingRules.append( HighlightingRule( r"^(add_to_)?"+rekeywords+r"(_filelist)?\s*[:=]", colors['knownkeywords'] ) )
|
||||
|
||||
# *all* sections -- change known later.
|
||||
self.highlightingRules.append( HighlightingRule( r"^\[[^\]]+\].*?$", colors['errors'], QFontBold, blocknum=1 ) )
|
||||
|
||||
if sections:
|
||||
# *known* sections
|
||||
resections = r'('+(r'|'.join(sections))+r')'
|
||||
resections = resections.replace('.',r'\.') #escape dots.
|
||||
self.highlightingRules.append( HighlightingRule( r"^\["+resections+r"\]\s*$", colors['knownsections'], QFontBold, blocknum=2 ) )
|
||||
|
||||
# test story sections
|
||||
self.teststoryRule = HighlightingRule( r"^\[teststory:([0-9]+|defaults)\]", colors['teststories'], blocknum=3 )
|
||||
self.highlightingRules.append( self.teststoryRule )
|
||||
|
||||
# storyUrl sections
|
||||
# StoryUrls are *not* checked beyond looking for https?://
|
||||
self.storyUrlRule = HighlightingRule( r"^\[https?://.*\]", colors['storyUrls'], QFontBold, blocknum=2 )
|
||||
self.highlightingRules.append( self.storyUrlRule )
|
||||
|
||||
# NOT comments -- but can be custom columns, so don't flag.
|
||||
#self.highlightingRules.append( HighlightingRule( r"(?<!^)#[^\n]*" , colors['errors'] ) )
|
||||
|
||||
# comments -- comments must start from column 0.
|
||||
self.commentRule = HighlightingRule( r"^#[^\n]*" , colors['comments'] )
|
||||
self.highlightingRules.append( self.commentRule )
|
||||
|
||||
def highlightBlock( self, text ):
|
||||
|
||||
is_comment = False
|
||||
blocknum = self.previousBlockState()
|
||||
for rule in self.highlightingRules:
|
||||
for match in rule.pattern.finditer(text):
|
||||
self.setFormat( match.start(), match.end()-match.start(), rule.highlight )
|
||||
if rule == self.commentRule:
|
||||
is_comment = True
|
||||
if rule.blocknum > 0:
|
||||
blocknum = rule.blocknum
|
||||
|
||||
if not is_comment:
|
||||
# unknown section, error all:
|
||||
if blocknum == 1 and blocknum == self.previousBlockState():
|
||||
self.setFormat( 0, len(text), self.errorRule.highlight )
|
||||
|
||||
# teststory section rules:
|
||||
if blocknum == 3:
|
||||
self.setFormat( 0, len(text), self.teststoryRule.highlight )
|
||||
|
||||
## changed storyUrl section to also be blocknum=2 April 2023
|
||||
## storyUrl section rules:
|
||||
# if blocknum == 4:
|
||||
# self.setFormat( 0, len(text), self.storyUrlRule.highlight )
|
||||
|
||||
self.setCurrentBlockState( blocknum )
|
||||
|
||||
class HighlightingRule():
|
||||
def __init__( self, pattern, color,
|
||||
weight=QFontNormal,
|
||||
style=Qt.SolidPattern,
|
||||
blocknum=0):
|
||||
if isinstance(pattern, string_types):
|
||||
self.pattern = re.compile(pattern)
|
||||
else:
|
||||
self.pattern=pattern
|
||||
charfmt = QTextCharFormat()
|
||||
brush = QBrush(color, style)
|
||||
charfmt.setForeground(brush)
|
||||
charfmt.setFontWeight(weight)
|
||||
self.highlight = charfmt
|
||||
self.blocknum=blocknum
|
||||
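The highlighter above applies each rule's regular expression to one line of text and formats every match, with later rules overwriting earlier ones. The Qt-free sketch below reuses three of the patterns from the source to show which spans each rule would claim. The `RULES` list and the `classify` function are hypothetical names introduced only for this illustration.

```python
import re

# Hypothetical, Qt-free stand-ins for three of the highlighter's rules.
# The patterns are copied from the source; the names are illustrative.
RULES = [
    ("keyword", re.compile(r"^[^:=\s][^:=]*[:=]")),
    ("section", re.compile(r"^\[[^\]]+\].*?$")),
    ("comment", re.compile(r"^#[^\n]*")),
]


def classify(line):
    """Return (rule name, start, length) for every rule match in one line."""
    spans = []
    for name, pattern in RULES:
        for match in pattern.finditer(line):
            spans.append((name, match.start(), match.end() - match.start()))
    return spans


if __name__ == "__main__":
    for text in ("include_images:true", "[defaults]", "# note: x"):
        print(text, classify(text))
```

A comment line that contains a colon matches both the keyword rule and the comment rule. In the highlighter the comment rule is appended last, so its format is applied after the others.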
|
|
@ -1,403 +1,163 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
from __future__ import (unicode_literals, division, absolute_import,
|
||||
print_function)
|
||||
|
||||
__license__ = 'GPL v3'
|
||||
__copyright__ = '2020, Jim Miller, 2011, Grant Drake <grant.drake@gmail.com>'
|
||||
__docformat__ = 'restructuredtext en'
|
||||
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
from time import sleep
|
||||
from datetime import time
|
||||
from io import StringIO
|
||||
from collections import defaultdict
|
||||
import sys
|
||||
|
||||
from calibre.utils.date import local_tz
|
||||
|
||||
# pulls in translation files for _() strings
|
||||
try:
|
||||
load_translations()
|
||||
except NameError:
|
||||
pass # load_translations() added in calibre 1.9
|
||||
|
||||
# ------------------------------------------------------------------------------
|
||||
#
|
||||
# Functions to perform downloads using worker jobs
|
||||
#
|
||||
# ------------------------------------------------------------------------------
|
||||
|
||||
def do_download_worker_single(site,
|
||||
book_list,
|
||||
options,
|
||||
merge,
|
||||
notification=lambda x,y:x):
|
||||
|
||||
logger.info(options['version'])
|
||||
|
||||
## same info debug calibre prints out at startup. For when users
|
||||
## give me job output instead of debug log.
|
||||
from calibre.debug import print_basic_debug_info
|
||||
print_basic_debug_info(sys.stderr)
|
||||
|
||||
notification(0.01, _('Downloading FanFiction Stories'))
|
||||
from calibre_plugins.fanficfare_plugin import FanFicFareBase
|
||||
fffbase = FanFicFareBase(options['plugin_path'])
|
||||
with fffbase: # so the sys.path was modified while loading the
|
||||
# plug impl.
|
||||
from fanficfare.fff_profile import do_cprofile
|
||||
|
||||
## extra function just so I can easily use the same
|
||||
## @do_cprofile decorator
|
||||
@do_cprofile
|
||||
def profiled_func():
|
||||
count = 0
|
||||
totals = {}
|
||||
# can't do direct assignment in list comprehension? I'm sure it
|
||||
# makes sense to some pythonista.
|
||||
# [ totals[x['url']]=0.0 for x in book_list if x['good'] ]
|
||||
[ totals.update({x['url']:0.0}) for x in book_list if x['good'] ]
|
||||
# logger.debug(sites_lists.keys())
|
||||
|
||||
def do_indiv_notif(percent,msg):
|
||||
totals[msg] = percent/len(totals)
|
||||
notification(max(0.01,sum(totals.values())), _('%(count)d of %(total)d stories finished downloading')%{'count':count,'total':len(totals)})
|
||||
|
||||
do_list = []
|
||||
done_list = []
|
||||
logger.info("\n\n"+_("Downloading FanFiction Stories")+"\n%s\n"%("\n".join([ "%(status)s %(url)s %(comment)s" % book for book in book_list])))
|
||||
## pass failures from metadata through bg job so all results are
|
||||
## together.
|
||||
for book in book_list:
|
||||
if book['good']:
|
||||
do_list.append(book)
|
||||
else:
|
||||
done_list.append(book)
|
||||
for book in do_list:
|
||||
# logger.info("%s"%book['url'])
|
||||
done_list.append(do_download_for_worker(book,options,merge,do_indiv_notif))
|
||||
count += 1
|
||||
return finish_download(done_list)
|
||||
return profiled_func()
|
||||
|
||||
def finish_download(donelist):
|
||||
book_list = sorted(donelist,key=lambda x : x['listorder'])
|
||||
logger.info("\n"+_("Download Results:")+"\n%s\n"%("\n".join([ "%(status)s %(url)s %(comment)s" % book for book in book_list])))
|
||||
|
||||
good_lists = defaultdict(list)
|
||||
bad_lists = defaultdict(list)
|
||||
for book in book_list:
|
||||
if book['good']:
|
||||
good_lists[book['status']].append(book)
|
||||
else:
|
||||
bad_lists[book['status']].append(book)
|
||||
|
||||
order = [_('Add'),
|
||||
_('Update'),
|
||||
_('Meta'),
|
||||
_('Different URL'),
|
||||
_('Rejected'),
|
||||
_('Skipped'),
|
||||
_('Bad'),
|
||||
_('Error'),
|
||||
]
|
||||
stnum = 0
|
||||
for d in [ good_lists, bad_lists ]:
|
||||
for status in order:
|
||||
stnum += 1
|
||||
if d[status]:
|
||||
l = d[status]
|
||||
logger.info("\n"+status+"\n%s\n"%("\n".join([book['url'] for book in l])))
|
||||
for book in l:
|
||||
# Add prior listorder to 10000 * status num for
|
||||
# ordering of accumulated results with multiple bg
|
||||
# jobs
|
||||
book['reportorder'] = stnum*10000 + book['listorder']
|
||||
del d[status]
|
||||
# just in case a status is added but doesn't appear in order.
|
||||
for status in d.keys():
|
||||
logger.info("\n"+status+"\n%s\n"%("\n".join([book['url'] for book in d[status]])))
|
||||
|
||||
# return the book list as the job result
|
||||
return book_list
|
||||
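`finish_download` above gives each book a `reportorder` of status number times 10000 plus its prior `listorder`, so that results from several background jobs sort by status group first and original position second. A small sketch of that arithmetic follows. The `ORDER` list is a shortened, untranslated stand-in for the plugin's status labels, and `assign_report_order` is a hypothetical name.

```python
from collections import defaultdict

# Shortened, untranslated stand-in for the plugin's status labels.
ORDER = ["Add", "Update", "Meta", "Error"]


def assign_report_order(book_list):
    """Set book['reportorder'] and return the books sorted by it.

    Assumes every book's status appears in ORDER; a status outside
    ORDER would get no reportorder in this sketch.
    """
    good, bad = defaultdict(list), defaultdict(list)
    for book in book_list:
        (good if book["good"] else bad)[book["status"]].append(book)
    stnum = 0
    # Good statuses are numbered first, then the same statuses for bad books.
    for group in (good, bad):
        for status in ORDER:
            stnum += 1
            for book in group.get(status, []):
                book["reportorder"] = stnum * 10000 + book["listorder"]
    return sorted(book_list, key=lambda b: b["reportorder"])


if __name__ == "__main__":
    books = [
        {"good": False, "status": "Error", "listorder": 0},
        {"good": True, "status": "Update", "listorder": 1},
        {"good": True, "status": "Add", "listorder": 2},
    ]
    for b in assign_report_order(books):
        print(b["status"], b["reportorder"])
```

The multiplier only keeps groups apart while `listorder` stays below 10000, which is the same limit the original arithmetic implies.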
|
||||
def do_download_for_worker(book,options,merge,notification=lambda x,y:x):
|
||||
'''
|
||||
Child job, to download story when run as a worker job
|
||||
'''
|
||||
|
||||
from calibre_plugins.fanficfare_plugin import FanFicFareBase
|
||||
fffbase = FanFicFareBase(options['plugin_path'])
|
||||
with fffbase: # so the sys.path was modified while loading the
|
||||
# plug impl.
|
||||
from calibre_plugins.fanficfare_plugin.prefs import (
|
||||
SAVE_YES, SAVE_YES_UNLESS_SITE, OVERWRITE, OVERWRITEALWAYS, UPDATE,
|
||||
UPDATEALWAYS, ADDNEW, SKIP, CALIBREONLY, CALIBREONLYSAVECOL)
|
||||
from calibre_plugins.fanficfare_plugin.wordcount import get_word_count
|
||||
from fanficfare import adapters, writers
|
||||
from fanficfare.epubutils import get_update_data
|
||||
from fanficfare.exceptions import NotGoingToDownload
|
||||
from fanficfare.six import text_type as unicode
|
||||
|
||||
from calibre_plugins.fanficfare_plugin.fff_util import get_fff_config
|
||||
|
||||
try:
|
||||
logger.info("\n\n" + ("-"*80) + " " + book['url'])
|
||||
## No need to download at all. Can happen now due to
|
||||
## collision moving into book for CALIBREONLY changing to
|
||||
## ADDNEW when story URL not in library.
|
||||
if book['collision'] in (CALIBREONLY, CALIBREONLYSAVECOL):
|
||||
logger.info("Skipping CALIBREONLY 'update' down inside worker")
|
||||
return book
|
||||
|
||||
book['comment'] = _('Download started...')
|
||||
|
||||
configuration = get_fff_config(book['url'],
|
||||
options['fileform'],
|
||||
options['personal.ini'])
|
||||
|
||||
# images only for epub, html, even if the user mistakenly
|
||||
# turned it on elsewhere.
|
||||
if options['fileform'] not in ("epub","html"):
|
||||
configuration.set("overrides","include_images","false")
|
||||
|
||||
adapter = adapters.getAdapter(configuration,book['url'])
|
||||
adapter.is_adult = book['is_adult']
|
||||
adapter.username = book['username']
|
||||
adapter.password = book['password']
|
||||
adapter.totp = book['totp']
|
||||
adapter.setChaptersRange(book['begin'],book['end'])
|
||||
|
||||
## each site download job starts with a new copy of the
|
||||
## cookiejar and basic_cache from the FG process. They
|
||||
## are not shared between different sites' BG downloads
|
||||
if 'basic_cache' in options:
|
||||
configuration.set_basic_cache(options['basic_cache'])
|
||||
else:
|
||||
options['basic_cache'] = configuration.get_basic_cache()
|
||||
options['basic_cache'].load_cache(options['basic_cachefile'])
|
||||
if 'cookiejar' in options:
|
||||
configuration.set_cookiejar(options['cookiejar'])
|
||||
else:
|
||||
options['cookiejar'] = configuration.get_cookiejar()
|
||||
options['cookiejar'].load_cookiejar(options['cookiejarfile'])
|
||||
|
||||
story = adapter.getStoryMetadataOnly()
|
||||
if not story.getMetadata("series") and 'calibre_series' in book:
|
||||
adapter.setSeries(book['calibre_series'][0],book['calibre_series'][1])
|
||||
|
||||
# logger.debug(merge)
|
||||
# logger.debug(book.get('epub_for_update','(NONE)'))
|
||||
# logger.debug(options.get('mergebook','(NOMERGEBOOK)'))
|
||||
|
||||
# is a merge, is a pre-existing anthology, and is not a pre-existing book in anthology.
|
||||
if merge and 'mergebook' in options and 'epub_for_update' not in book:
|
||||
# internal for plugin anthologies to mark chapters
|
||||
# (new) in new stories
|
||||
story.setMetadata("newforanthology","true")
|
||||
logger.debug("metadata newforanthology:%s"%story.getMetadata("newforanthology"))
|
||||
|
||||
# set PI version instead of default.
|
||||
if 'version' in options:
|
||||
story.setMetadata('version',options['version'])
|
||||
|
||||
book['title'] = story.getMetadata("title", removeallentities=True)
|
||||
book['author_sort'] = book['author'] = story.getList("author", removeallentities=True)
|
||||
book['publisher'] = story.getMetadata("publisher")
|
||||
book['url'] = story.getMetadata("storyUrl", removeallentities=True)
|
||||
book['comments'] = story.get_sanitized_description()
|
||||
book['series'] = story.getMetadata("series", removeallentities=True)
|
||||
|
||||
if story.getMetadataRaw('datePublished'):
|
||||
book['pubdate'] = story.getMetadataRaw('datePublished').replace(tzinfo=local_tz)
|
||||
if story.getMetadataRaw('dateUpdated'):
|
||||
book['updatedate'] = story.getMetadataRaw('dateUpdated').replace(tzinfo=local_tz)
|
||||
if story.getMetadataRaw('dateCreated'):
|
||||
book['timestamp'] = story.getMetadataRaw('dateCreated').replace(tzinfo=local_tz)
|
||||
else:
|
||||
book['timestamp'] = datetime.now().replace(tzinfo=local_tz) # need *something* there for calibre.
|
||||
|
||||
writer = writers.getWriter(options['fileform'],configuration,adapter)
|
||||
outfile = book['outfile']
|
||||
|
||||
## checks were done earlier, it's new or not dup or newer--just write it.
|
||||
if book['collision'] in (ADDNEW, SKIP, OVERWRITE, OVERWRITEALWAYS) or \
|
||||
('epub_for_update' not in book and book['collision'] in (UPDATE, UPDATEALWAYS)):
|
||||
|
||||
# preserve logfile even on overwrite.
|
||||
if 'epub_for_update' in book:
|
||||
adapter.logfile = get_update_data(book['epub_for_update'])[6]
|
||||
# change the existing entries id to notid so
|
||||
# write_epub writes a whole new set to indicate overwrite.
|
||||
if adapter.logfile:
|
||||
adapter.logfile = adapter.logfile.replace("span id","span notid")
|
||||
|
||||
if book['collision'] == OVERWRITE and 'fileupdated' in book:
|
||||
lastupdated=story.getMetadataRaw('dateUpdated')
|
||||
fileupdated=book['fileupdated']
|
||||
|
||||
# updated doesn't have time (or is midnight), use dates only.
|
||||
# updated does have time, use full timestamps.
|
||||
if (lastupdated.time() == time.min and fileupdated.date() > lastupdated.date()) or \
|
||||
(lastupdated.time() != time.min and fileupdated > lastupdated):
|
||||
raise NotGoingToDownload(_("Not Overwriting, web site is not newer."),'edit-undo.png',showerror=False)
|
||||
|
||||
|
||||
logger.info("write to %s"%outfile)
|
||||
inject_cal_cols(book,story,configuration)
|
||||
writer.writeStory(outfilename=outfile,
|
||||
forceOverwrite=True,
|
||||
notification=notification)
|
||||
|
||||
if adapter.story.chapter_error_count > 0:
|
||||
book['comment'] = _('Download %(fileform)s completed, %(failed)s failed chapters, %(total)s total chapters.')%\
|
||||
{'fileform':options['fileform'],
|
||||
'failed':adapter.story.chapter_error_count,
|
||||
'total':story.getMetadata("numChapters")}
|
||||
book['chapter_error_count'] = adapter.story.chapter_error_count
|
||||
else:
|
||||
book['comment'] = _('Download %(fileform)s completed, %(total)s chapters.')%\
|
||||
{'fileform':options['fileform'],
|
||||
'total':story.getMetadata("numChapters")}
|
||||
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
|
||||
if options['savemetacol'] != '':
|
||||
book['savemetacol'] = story.dump_html_metadata()
|
||||
|
||||
## checks were done earlier, just update it.
|
||||
elif 'epub_for_update' in book and book['collision'] in (UPDATE, UPDATEALWAYS):
|
||||
|
||||
# update now handled by pre-populating the old images and
|
||||
# chapters in the adapter rather than merging epubs.
|
||||
#urlchaptercount = int(story.getMetadata('numChapters').replace(',',''))
|
||||
# returns int adjusted for start-end range.
|
||||
urlchaptercount = story.getChapterCount()
|
||||
(url,
|
||||
chaptercount,
|
||||
adapter.oldchapters,
|
||||
adapter.oldimgs,
|
||||
adapter.oldcover,
|
||||
adapter.calibrebookmark,
|
||||
adapter.logfile,
|
||||
adapter.oldchaptersmap,
|
||||
adapter.oldchaptersdata) = get_update_data(book['epub_for_update'])[0:9]
|
||||
|
||||
# dup handling from fff_plugin needed for anthology updates & BG metadata.
|
||||
if book['collision'] in (UPDATE,UPDATEALWAYS):
|
||||
if chaptercount == urlchaptercount and book['collision'] == UPDATE:
|
||||
if merge:
|
||||
## Deliberately pass for UPDATEALWAYS merge.
|
||||
book['comment']=_("Already contains %d chapters. Reuse as is.")%chaptercount
|
||||
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
|
||||
if options['savemetacol'] != '':
|
||||
book['savemetacol'] = story.dump_html_metadata()
|
||||
book['outfile'] = book['epub_for_update'] # for anthology merge ops.
|
||||
return book
|
||||
else:
|
||||
raise NotGoingToDownload(_("Already contains %d chapters.")%chaptercount,'edit-undo.png',showerror=False)
|
||||
elif chaptercount > urlchaptercount and not (book['collision'] == UPDATEALWAYS and adapter.getConfig('force_update_epub_always')):
|
||||
raise NotGoingToDownload(_("Existing epub contains %d chapters, web site only has %d. Use Overwrite or force_update_epub_always to force update.") % (chaptercount,urlchaptercount),'dialog_error.png')
|
||||
elif chaptercount == 0:
|
||||
raise NotGoingToDownload(_("FanFicFare doesn't recognize chapters in existing epub, epub is probably from a different source. Use Overwrite to force update."),'dialog_error.png')
|
||||
|
||||
if not (book['collision'] == UPDATEALWAYS and chaptercount == urlchaptercount) \
|
||||
and adapter.getConfig("do_update_hook"):
|
||||
chaptercount = adapter.hookForUpdates(chaptercount)
|
||||
|
||||
logger.info("Do update - epub(%d) vs url(%d)" % (chaptercount, urlchaptercount))
|
||||
logger.info("write to %s"%outfile)
|
||||
|
||||
inject_cal_cols(book,story,configuration)
|
||||
writer.writeStory(outfilename=outfile,
|
||||
forceOverwrite=True,
|
||||
notification=notification)
|
||||
|
||||
if adapter.story.chapter_error_count > 0:
|
||||
book['comment'] = _('Update %(fileform)s completed, added %(added)s chapters, %(failed)s failed chapters, for %(total)s total.')%\
|
||||
{'fileform':options['fileform'],
|
||||
'failed':adapter.story.chapter_error_count,
|
||||
'added':(urlchaptercount-chaptercount),
|
||||
'total':urlchaptercount}
|
||||
book['chapter_error_count'] = adapter.story.chapter_error_count
|
||||
else:
|
||||
book['comment'] = _('Update %(fileform)s completed, added %(added)s chapters for %(total)s total.')%\
|
||||
{'fileform':options['fileform'],'added':(urlchaptercount-chaptercount),'total':urlchaptercount}
|
||||
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
|
||||
if options['savemetacol'] != '':
|
||||
book['savemetacol'] = story.dump_html_metadata()
|
||||
else:
|
||||
## Shouldn't ever get here, but hey, it happened once
|
||||
## before with prefs['collision']
|
||||
raise Exception("Impossible state reached -- Book: %s:\nOptions:%s:"%(book,options))
|
||||
|
||||
if options['do_wordcount'] == SAVE_YES or (
|
||||
options['do_wordcount'] == SAVE_YES_UNLESS_SITE and not story.getMetadataRaw('numWords') ):
|
||||
try:
|
||||
wordcount = get_word_count(outfile)
|
||||
# logger.info("get_word_count:%s"%wordcount)
|
||||
# clear cache for the rather unusual case of
|
||||
# numWords affecting other previously cached
|
||||
# entries.
|
||||
story.clear_processed_metadata_cache()
|
||||
story.setMetadata('numWords',wordcount)
|
||||
writer.writeStory(outfilename=outfile, forceOverwrite=True)
|
||||
book['all_metadata'] = story.getAllMetadata(removeallentities=True)
|
||||
if options['savemetacol'] != '':
|
||||
book['savemetacol'] = story.dump_html_metadata()
|
||||
except Exception:
|
||||
logger.error("WordCount failed")
|
||||
|
||||
if options['smarten_punctuation'] and options['fileform'] == "epub":
|
||||
# for smarten punc
|
||||
from calibre.ebooks.oeb.polish.main import polish, ALL_OPTS
|
||||
from calibre.utils.logging import Log
|
||||
from collections import namedtuple
|
||||
|
||||
# do smarten_punctuation from calibre's polish feature
|
||||
data = {'smarten_punctuation':True}
|
||||
opts = ALL_OPTS.copy()
|
||||
opts.update(data)
|
||||
O = namedtuple('Options', ' '.join(ALL_OPTS.keys()))
|
||||
opts = O(**opts)
|
||||
|
||||
log = Log(level=Log.DEBUG)
|
||||
polish({outfile:outfile}, opts, log, logger.info)
|
||||
## here to catch tags set in chapters in literotica for
|
||||
## both overwrites and updates.
|
||||
book['tags'] = story.getSubjectTags(removeallentities=True)
|
||||
except NotGoingToDownload as d:
|
||||
book['good']=False
|
||||
book['status']=_('Bad')
|
||||
book['showerror']=d.showerror
|
||||
book['comment']=unicode(d)
|
||||
book['icon'] = d.icon
|
||||
|
||||
except Exception as e:
|
||||
book['good']=False
|
||||
book['status']=_('Error')
|
||||
book['comment']=unicode(e)
|
||||
book['icon']='dialog_error.png'
|
||||
logger.info("Exception: %s:%s"%(book,book['comment']),exc_info=True)
|
||||
return book
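The "Overwrite if Newer" check above compares calendar dates when the site's update stamp has no time of day, and full timestamps otherwise. A minimal sketch of that rule follows, assuming naive `datetime` values; the function name `site_is_not_newer` is hypothetical and not part of the plugin.

```python
from datetime import datetime, time


def site_is_not_newer(lastupdated, fileupdated):
    """Return True when the existing file is newer than the site's stamp."""
    if lastupdated.time() == time.min:
        # Site gave a date only (or exactly midnight): compare dates.
        return fileupdated.date() > lastupdated.date()
    # Site gave a time of day: compare full timestamps.
    return fileupdated > lastupdated


# Site stamp is date-only, file was written later the same day.
same_day = site_is_not_newer(datetime(2024, 5, 1),
                             datetime(2024, 5, 1, 18, 30))
# Site stamp has a time of day, file was written later the same day.
with_time = site_is_not_newer(datetime(2024, 5, 1, 12, 0),
                              datetime(2024, 5, 1, 18, 30))
print(same_day, with_time)  # prints: False True
```

With a date-only stamp, a file written later on the same day does not block the overwrite, since the site may have been updated after the file was saved.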
|
||||
|
||||
## calibre's columns for an existing book are passed in and injected
|
||||
## into the story's metadata. For convenience, we also add labels and
|
||||
## valid_entries for them in a special [injected] section that has
|
||||
## even less precedence than [defaults]
|
||||
def inject_cal_cols(book,story,configuration):
|
||||
configuration.remove_section('injected')
|
||||
if 'calibre_columns' in book:
|
||||
injectini = ['[injected]']
|
||||
extra_valid = []
|
||||
for k in book['calibre_columns'].keys():
|
||||
v = book['calibre_columns'][k]
|
||||
story.setMetadata(k,v['val'])
|
||||
injectini.append('%s_label:%s'%(k,v['label']))
|
||||
extra_valid.append(k)
|
||||
if extra_valid: # if empty, there's nothing to add.
|
||||
injectini.append("add_to_extra_valid_entries:,"+','.join(extra_valid))
|
||||
configuration.read_file(StringIO('\n'.join(injectini)))
|
||||
#print("added:\n%s\n"%('\n'.join(injectini)))
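As a rough illustration of what `inject_cal_cols` builds, the sketch below assembles the same `[injected]` text and reads it back. It assumes the standard library `ConfigParser` as a stand-in for FanFicFare's own configuration class, and the column name `calibre_cust_rating` is made up for the example.

```python
from configparser import ConfigParser
from io import StringIO


def build_injected_ini(calibre_columns):
    """Build the text of an [injected] section from calibre column data."""
    lines = ['[injected]']
    extra_valid = []
    for key, col in calibre_columns.items():
        lines.append('%s_label:%s' % (key, col['label']))
        extra_valid.append(key)
    if extra_valid:  # nothing to add when there are no columns
        lines.append('add_to_extra_valid_entries:,' + ','.join(extra_valid))
    return '\n'.join(lines)


# Hypothetical column data in the shape book['calibre_columns'] uses.
columns = {'calibre_cust_rating': {'val': '4', 'label': 'My Rating'}}
ini_text = build_injected_ini(columns)

parser = ConfigParser()  # stand-in for the plugin's configuration object
parser.read_file(StringIO(ini_text))
label = parser.get('injected', 'calibre_cust_rating_label')
valid = parser.get('injected', 'add_to_extra_valid_entries')
print(label, valid)
```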
|
||||
#!/usr/bin/env python
|
||||
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
|
||||
from __future__ import (unicode_literals, division, absolute_import,
|
||||
print_function)
|
||||
|
||||
__license__ = 'GPL v3'
|
||||
__copyright__ = '2012, Jim Miller'
|
||||
__copyright__ = '2011, Grant Drake <grant.drake@gmail.com>'
|
||||
__docformat__ = 'restructuredtext en'
|
||||
|
||||
import time, os, traceback
|
||||
|
||||
from ConfigParser import SafeConfigParser
|
||||
from StringIO import StringIO
|
||||
|
||||
from calibre.utils.ipc.server import Server
|
||||
from calibre.utils.ipc.job import ParallelJob
|
||||
from calibre.utils.logging import Log
|
||||
|
||||
from calibre_plugins.fanfictiondownloader_plugin.dialogs import (NotGoingToDownload,
|
||||
OVERWRITE, OVERWRITEALWAYS, UPDATE, UPDATEALWAYS, ADDNEW, SKIP, CALIBREONLY)
|
||||
from calibre_plugins.fanfictiondownloader_plugin.fanficdownloader import adapters, writers, exceptions
|
||||
from calibre_plugins.fanfictiondownloader_plugin.fanficdownloader.epubutils import get_update_data
|
||||
|
||||
# ------------------------------------------------------------------------------
|
||||
#
|
||||
# Functions to perform downloads using worker jobs
|
||||
#
|
||||
# ------------------------------------------------------------------------------
|
||||
|
||||
def do_download_worker(book_list, options,
|
||||
cpus, notification=lambda x,y:x):
|
||||
'''
|
||||
Master job, to launch child jobs to download a set of stories.
|
||||
This is run as a worker job in the background to keep the UI more
|
||||
responsive and get around the memory leak issues as it will launch
|
||||
a child job for each book as a worker process
|
||||
'''
|
||||
server = Server(pool_size=cpus)
|
||||
|
||||
print(options['version'])
|
||||
total = 0
|
||||
# Queue all the jobs
|
||||
print("Adding jobs for URLs:")
|
||||
for book in book_list:
|
||||
if book['good']:
|
||||
print("%s"%book['url'])
|
||||
total += 1
|
||||
args = ['calibre_plugins.fanfictiondownloader_plugin.jobs',
|
||||
'do_download_for_worker',
|
||||
(book,options)]
|
||||
job = ParallelJob('arbitrary',
|
||||
"url:(%s) id:(%s)"%(book['url'],book['calibre_id']),
|
||||
done=None,
|
||||
args=args)
|
||||
job._book = book
|
||||
# job._book_id = book_id
|
||||
# job._title = title
|
||||
# job._modified_date = modified_date
|
||||
# job._existing_isbn = existing_isbn
|
||||
server.add_job(job)
|
||||
|
||||
# This server is an arbitrary_n job, so there is a notifier available.
|
||||
# Set the % complete to a small number to avoid the 'unavailable' indicator
|
||||
notification(0.01, 'Downloading FanFiction Stories')
|
||||
|
||||
# dequeue the job results as they arrive, saving the results
|
||||
count = 0
|
||||
while True:
|
||||
job = server.changed_jobs_queue.get()
|
||||
# A job can 'change' when it is not finished, for example if it
|
||||
# produces a notification. Ignore these.
|
||||
job.update()
|
||||
if not job.is_finished:
|
||||
continue
|
||||
# A job really finished. Get the information.
|
||||
output_book = job.result
|
||||
#print("output_book:%s"%output_book)
|
||||
book_list.remove(job._book)
|
||||
book_list.append(job.result)
|
||||
book_id = job._book['calibre_id']
|
||||
#title = job._title
|
||||
count = count + 1
|
||||
notification(float(count)/total, 'Downloaded Story')
|
||||
# Add this job's output to the current log
|
||||
print('Logfile for book ID %s (%s)'%(book_id, job._book['title']))
|
||||
print(job.details)
|
||||
|
||||
if count >= total:
|
||||
# All done!
|
||||
break
|
||||
|
||||
server.close()
|
||||
|
||||
# return the book list as the job result
|
||||
return book_list
|
||||
|
||||
def do_download_for_worker(book,options):
|
||||
'''
|
||||
Child job, to download the story for this specific book,
|
||||
when run as a worker job
|
||||
'''
|
||||
try:
|
||||
book['comment'] = 'Download started...'
|
||||
|
||||
ffdlconfig = SafeConfigParser()
|
||||
ffdlconfig.readfp(StringIO(get_resources("plugin-defaults.ini")))
|
||||
ffdlconfig.readfp(StringIO(options['personal.ini']))
|
||||
|
||||
adapter = adapters.getAdapter(ffdlconfig,book['url'],options['fileform'])
|
||||
adapter.is_adult = book['is_adult']
|
||||
adapter.username = book['username']
|
||||
adapter.password = book['password']
|
||||
|
||||
story = adapter.getStoryMetadataOnly()
|
||||
writer = writers.getWriter(options['fileform'],adapter.config,adapter)
|
||||
|
||||
outfile = book['outfile']
|
||||
|
||||
## No need to download at all. Shouldn't ever get down here.
|
||||
if options['collision'] in (CALIBREONLY,):
|
||||
print("Skipping CALIBREONLY 'update' down inside worker--this shouldn't be happening...")
|
||||
book['comment'] = 'Metadata collected.'
|
||||
|
||||
## checks were done earlier, it's new or not dup or newer--just write it.
|
||||
elif options['collision'] in (ADDNEW, SKIP, OVERWRITE, OVERWRITEALWAYS) or \
|
||||
('epub_for_update' not in book and options['collision'] in (UPDATE, UPDATEALWAYS)):
|
||||
print("write to %s"%outfile)
|
||||
writer.writeStory(outfilename=outfile, forceOverwrite=True)
|
||||
book['comment'] = 'Download %s completed, %s chapters.'%(options['fileform'],story.getMetadata("numChapters"))
|
||||
|
||||
## checks were done earlier, just update it.
|
||||
elif 'epub_for_update' in book and options['collision'] in (UPDATE, UPDATEALWAYS):
|
||||
|
||||
# update now handled by pre-populating the old images and
|
||||
# chapters in the adapter rather than merging epubs.
|
||||
urlchaptercount = int(story.getMetadata('numChapters'))
|
||||
(url,chaptercount,
|
||||
adapter.oldchapters,
|
||||
adapter.oldimgs) = get_update_data(book['epub_for_update'])
|
||||
|
||||
print("Do update - epub(%d) vs url(%d)" % (chaptercount, urlchaptercount))
|
||||
print("write to %s"%outfile)
|
||||
|
||||
writer.writeStory(outfilename=outfile, forceOverwrite=True)
|
||||
|
||||
book['comment'] = 'Update %s completed, added %s chapters for %s total.'%\
|
||||
(options['fileform'],(urlchaptercount-chaptercount),urlchaptercount)
|
||||
|
||||
except NotGoingToDownload as d:
|
||||
book['good']=False
|
||||
book['comment']=unicode(d)
|
||||
book['icon'] = d.icon
|
||||
|
||||
except Exception as e:
|
||||
book['good']=False
|
||||
book['comment']=unicode(e)
|
||||
book['icon']='dialog_error.png'
|
||||
print("Exception: %s:%s"%(book,unicode(e)))
|
||||
traceback.print_exc()
|
||||
|
||||
#time.sleep(10)
|
||||
return book
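The master/child pattern in this file (queue one job per good book, collect results as they finish, report fractional progress) can be sketched with the standard library. This is only an analogy: it uses threads from `concurrent.futures` rather than calibre's `Server` and `ParallelJob` worker processes, and `fake_download` is a hypothetical stand-in for the per-book job.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_download(book):
    """Hypothetical stand-in for the per-book child job."""
    result = dict(book)
    result['comment'] = 'Download completed.'
    return result


def run_jobs(book_list, workers=2, notification=lambda fraction, msg: None):
    """Run one job per good book and report progress as each finishes."""
    good = [b for b in book_list if b['good']]
    results = [b for b in book_list if not b['good']]  # passed through as-is
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fake_download, b) for b in good]
        for count, future in enumerate(as_completed(futures), start=1):
            results.append(future.result())
            notification(count / len(good), 'Downloaded Story')
    return results


books = [{'url': 'http://example.com/s/1', 'good': True},
         {'url': 'http://example.com/s/2', 'good': False}]
progress = []
done = run_jobs(books, notification=lambda f, m: progress.append(f))
print(len(done), progress)
```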
|
||||
|
|
|
|||
File diff suppressed because it is too large
Load diff
|
|
@ -1,76 +0,0 @@
|
|||
## This is an example of what your personal configuration might look
|
||||
## like. Uncomment options by removing the '#' in front of them.
|
||||
|
||||
[defaults]
|
||||
## [defaults] section applies to all formats and sites but may be
|
||||
## overridden at several levels. See
|
||||
## https://github.com/JimmXinu/FanFicFare/wiki/INI-File for more
|
||||
## details.
|
||||
|
||||
## Some sites also require the user to confirm they are adult for
|
||||
## adult content. Uncomment by removing '#' in front of is_adult.
|
||||
#is_adult:true
|
||||
|
||||
## Don't like the numbers at the start of chapter titles on some
|
||||
## sites? You can use strip_chapter_numbers to strip them off. Just
|
||||
## want to make them all look the same? Strip them off, then add them
|
||||
## back on with add_chapter_numbers. Don't like the way it strips
|
||||
## numbers or adds them back? See chapter_title_strip_pattern and
|
||||
## chapter_title_add_pattern in defaults.ini.
|
||||
#strip_chapter_numbers:true
|
||||
#add_chapter_numbers:true
|
||||
|
||||
|
||||
[epub]
|
||||
## Include images from img tags in the body and summary of stories.
|
||||
## Images will be converted to jpg for size if possible. Images work
|
||||
## in epub format only. To get mobi or other format with images,
|
||||
## download as epub and use Calibre to convert.
|
||||
## true by default, uncomment and set false to not include images.
|
||||
#include_images:true
|
||||
|
||||
## If set false, the summary will have all html stripped for safety.
|
||||
## Both this and include_images must be true to get images in the
|
||||
## summary.
|
||||
## true by default, uncomment and set false to not keep summary html.
|
||||
#keep_summary_html:true
|
||||
|
||||
## If set true, and there isn't a specific cover image, the first
|
||||
## image found in the story will be made the cover image. If
|
||||
## keep_summary_html is true, images in the summary will be before any
|
||||
## in chapters.
|
||||
## true by default, uncomment and set false to turn off
|
||||
#make_firstimage_cover:true
|
||||
|
||||
|
||||
## Most commonly, I expect this will be used to save username/passwords
|
||||
## for different sites. Here are a few examples. See defaults.ini
|
||||
## for the full list.
|
||||
|
||||
[www.twilighted.net]
|
||||
#username:YourPenname
|
||||
#password:YourPassword
|
||||
## default is false
|
||||
#collect_series: true
|
||||
|
||||
[www.fimfiction.net]
|
||||
#is_adult:true
|
||||
#fail_on_password: false
|
||||
|
||||
[www.tthfanfic.org]
|
||||
#is_adult:true
|
||||
## tth is a little unusual--it doesn't require user/pass, but the site
|
||||
## keeps track of which chapters you've read and won't send another
|
||||
## update until it thinks you're up to date. If you set
|
||||
## username/password, FFF will login to download. Then the site
|
||||
## thinks you're up to date.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
|
||||
## This section will override anything in the system defaults or other
|
||||
## sections here.
|
||||
[overrides]
|
||||
## default varies by site. Set true here to force all sites to
|
||||
## collect series.
|
||||
#collect_series: true
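The section comments in this example file describe a precedence order: `[overrides]` wins over site sections, which win over `[defaults]`. A simplified sketch of such a lookup is shown below. The three-level order and the `lookup` helper are assumptions for illustration; FanFicFare's real resolution has more levels (see the INI-File wiki page referenced in the file).

```python
from configparser import ConfigParser

EXAMPLE_INI = """
[defaults]
collect_series: false
is_adult: false

[www.fimfiction.net]
is_adult: true

[overrides]
collect_series: true
"""


def lookup(parser, site, option):
    """Return the first value found, searching highest precedence first."""
    for section in ('overrides', site, 'defaults'):
        if parser.has_option(section, option):
            return parser.get(section, option)
    return None


parser = ConfigParser()
parser.read_string(EXAMPLE_INI)
adult = lookup(parser, 'www.fimfiction.net', 'is_adult')
series = lookup(parser, 'www.fimfiction.net', 'collect_series')
print(adult, series)  # prints: true true
```

A site with no section of its own, such as `www.twilighted.net` here, falls through to `[defaults]` for `is_adult`.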
|
||||
|
|
@ -1,282 +0,0 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
from __future__ import (unicode_literals, division, absolute_import,
|
||||
print_function)
|
||||
|
||||
__license__ = 'GPL v3'
|
||||
__copyright__ = '2021, Jim Miller'
|
||||
__docformat__ = 'restructuredtext en'
|
||||
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
import copy
|
||||
|
||||
from calibre.gui2.ui import get_gui
|
||||
|
||||
# pulls in translation files for _() strings
|
||||
try:
|
||||
load_translations()
|
||||
except NameError:
|
||||
pass # load_translations() added in calibre 1.9
|
||||
|
||||
from calibre_plugins.fanficfare_plugin import __version__ as plugin_version
|
||||
from calibre_plugins.fanficfare_plugin.common_utils import get_library_uuid
|
||||
|
||||
SKIP=_('Skip')
|
||||
ADDNEW=_('Add New Book')
|
||||
UPDATE=_('Update EPUB if New Chapters')
|
||||
UPDATEALWAYS=_('Update EPUB Always')
|
||||
OVERWRITE=_('Overwrite if Newer')
|
||||
OVERWRITEALWAYS=_('Overwrite Always')
|
||||
CALIBREONLY=_('Update Calibre Metadata from Web Site')
|
||||
CALIBREONLYSAVECOL=_('Update Calibre Metadata from Saved Metadata Column')
|
||||
collision_order=[SKIP,
|
||||
ADDNEW,
|
||||
UPDATE,
|
||||
UPDATEALWAYS,
|
||||
OVERWRITE,
|
||||
OVERWRITEALWAYS,
|
||||
CALIBREONLY,
|
||||
CALIBREONLYSAVECOL,]
|
||||
|
||||
# best idea I've had for how to deal with config/pref saving the
|
||||
# collision name in english.
|
||||
SAVE_SKIP='Skip'
|
||||
SAVE_ADDNEW='Add New Book'
|
||||
SAVE_UPDATE='Update EPUB if New Chapters'
|
||||
SAVE_UPDATEALWAYS='Update EPUB Always'
|
||||
SAVE_OVERWRITE='Overwrite if Newer'
|
||||
SAVE_OVERWRITEALWAYS='Overwrite Always'
|
||||
SAVE_CALIBREONLY='Update Calibre Metadata Only'
|
||||
SAVE_CALIBREONLYSAVECOL='Update Calibre Metadata Only(Saved Column)'
|
||||
save_collisions={
|
||||
SKIP:SAVE_SKIP,
|
||||
ADDNEW:SAVE_ADDNEW,
|
||||
UPDATE:SAVE_UPDATE,
|
||||
UPDATEALWAYS:SAVE_UPDATEALWAYS,
|
||||
OVERWRITE:SAVE_OVERWRITE,
|
||||
OVERWRITEALWAYS:SAVE_OVERWRITEALWAYS,
|
||||
CALIBREONLY:SAVE_CALIBREONLY,
|
||||
CALIBREONLYSAVECOL:SAVE_CALIBREONLYSAVECOL,
|
||||
SAVE_SKIP:SKIP,
|
||||
SAVE_ADDNEW:ADDNEW,
|
||||
SAVE_UPDATE:UPDATE,
|
||||
SAVE_UPDATEALWAYS:UPDATEALWAYS,
|
||||
SAVE_OVERWRITE:OVERWRITE,
|
||||
SAVE_OVERWRITEALWAYS:OVERWRITEALWAYS,
|
||||
SAVE_CALIBREONLY:CALIBREONLY,
|
||||
SAVE_CALIBREONLYSAVECOL:CALIBREONLYSAVECOL,
|
||||
}
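The `save_collisions` dict maps each translated label to a fixed English string and back, so saved prefs stay the same across UI languages. A small sketch of building such a two-way table is below; `make_two_way` and the Spanish labels are hypothetical, not taken from the plugin's translations.

```python
def make_two_way(pairs):
    """Build a dict mapping display label -> saved value and back."""
    mapping = {}
    for shown, saved in pairs:
        mapping[shown] = saved
        mapping[saved] = shown
    return mapping


# Hypothetical translated labels paired with stable English strings.
collisions = make_two_way([
    ('Omitir', 'Skip'),
    ('Sobrescribir siempre', 'Overwrite Always'),
])

saved = collisions['Omitir']   # what would be written to prefs
shown = collisions['Skip']     # what would be displayed in the UI
print(saved, shown)  # prints: Skip Omitir
```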
|
||||
|
||||
anthology_collision_order=[UPDATE,
|
||||
UPDATEALWAYS,
|
||||
OVERWRITEALWAYS]
|
||||
|
||||
|
||||
# Show translated strings, but save the same string in prefs so your
|
||||
# prefs are the same in different languages.
|
||||
YES=_('Yes, Always')
|
||||
SAVE_YES='Yes'
|
||||
YES_IF_IMG=_('Yes, if EPUB has a cover image')
|
||||
SAVE_YES_IF_IMG='Yes, if img'
|
||||
YES_UNLESS_IMG=_('Yes, unless FanFicFare found a cover image')
|
||||
SAVE_YES_UNLESS_IMG='Yes, unless img'
|
||||
YES_UNLESS_SITE=_('Yes, unless found on site')
|
||||
SAVE_YES_UNLESS_SITE='Yes, unless site'
|
||||
NO=_('No')
|
||||
SAVE_NO='No'
|
||||
prefs_save_options = {
|
||||
YES:SAVE_YES,
|
||||
SAVE_YES:YES,
|
||||
YES_IF_IMG:SAVE_YES_IF_IMG,
|
||||
SAVE_YES_IF_IMG:YES_IF_IMG,
|
||||
YES_UNLESS_IMG:SAVE_YES_UNLESS_IMG,
|
||||
SAVE_YES_UNLESS_IMG:YES_UNLESS_IMG,
|
||||
NO:SAVE_NO,
|
||||
SAVE_NO:NO,
|
||||
YES_UNLESS_SITE:SAVE_YES_UNLESS_SITE,
|
||||
SAVE_YES_UNLESS_SITE:YES_UNLESS_SITE,
|
||||
}
|
||||
updatecalcover_order=[YES,YES_IF_IMG,NO]
|
||||
gencalcover_order=[YES,YES_UNLESS_IMG,NO]
|
||||
do_wordcount_order=[YES,YES_UNLESS_SITE,NO]
|
||||
|
||||
PREFS_NAMESPACE = 'FanFicFarePlugin'
|
||||
PREFS_KEY_SETTINGS = 'settings'
|
||||
|
||||
# Set defaults used by all. Library specific settings continue to
|
||||
# take from here.
|
||||
default_prefs = {}
|
||||
default_prefs['last_saved_version'] = (0,0,0)
|
||||
default_prefs['personal.ini'] = get_resources('plugin-example.ini')
|
||||
default_prefs['cal_cols_pass_in'] = False
|
||||
default_prefs['rejecturls'] = '' # removed, but need empty default for fallback
|
||||
default_prefs['rejectreasons'] = '''Sucked
|
||||
Boring
|
||||
Dup from another site'''
|
||||
default_prefs['reject_always'] = False
|
||||
default_prefs['reject_delete_default'] = True
|
||||
|
||||
default_prefs['updatemeta'] = True
|
||||
default_prefs['bgmeta'] = False
|
||||
#default_prefs['updateepubcover'] = True # removed in favor of always True Oct 2022
|
||||
default_prefs['keeptags'] = False
|
||||
default_prefs['suppressauthorsort'] = False
|
||||
default_prefs['suppresstitlesort'] = False
|
||||
default_prefs['authorcase'] = False
|
||||
default_prefs['titlecase'] = False
|
||||
default_prefs['seriescase'] = False
|
||||
default_prefs['setanthologyseries'] = False
|
||||
default_prefs['mark'] = False
|
||||
default_prefs['mark_success'] = True
|
||||
default_prefs['mark_failed'] = True
|
||||
default_prefs['mark_chapter_error'] = True
|
||||
default_prefs['showmarked'] = False
|
||||
default_prefs['autoconvert'] = False
|
||||
default_prefs['urlsfromclip'] = True
|
||||
default_prefs['button_instantpopup'] = False
|
||||
default_prefs['updatedefault'] = True
|
||||
default_prefs['fileform'] = 'epub'
|
||||
default_prefs['collision'] = SAVE_UPDATE
|
||||
default_prefs['deleteotherforms'] = False
|
||||
default_prefs['adddialogstaysontop'] = False
|
||||
default_prefs['lookforurlinhtml'] = False
|
||||
default_prefs['checkforseriesurlid'] = True
|
||||
default_prefs['auto_reject_seriesurlid'] = False
|
||||
default_prefs['mark_series_anthologies'] = False
|
||||
default_prefs['checkforurlchange'] = True
|
||||
default_prefs['injectseries'] = False
|
||||
default_prefs['matchtitleauth'] = True
|
||||
default_prefs['do_wordcount'] = SAVE_YES_UNLESS_SITE
|
||||
default_prefs['smarten_punctuation'] = False
|
||||
default_prefs['show_est_time'] = False
|
||||
|
||||
default_prefs['send_lists'] = ''
|
||||
default_prefs['read_lists'] = ''
|
||||
default_prefs['addtolists'] = False
|
||||
default_prefs['addtoreadlists'] = False
|
||||
default_prefs['addtolistsonread'] = False
|
||||
default_prefs['autounnew'] = False
|
||||
|
||||
default_prefs['updatecalcover'] = SAVE_YES_IF_IMG
|
||||
default_prefs['covernewonly'] = False
|
||||
default_prefs['gencalcover'] = SAVE_YES_UNLESS_IMG
|
||||
default_prefs['updatecover'] = False
|
||||
default_prefs['calibre_gen_cover'] = True
|
||||
default_prefs['plugin_gen_cover'] = False
|
||||
default_prefs['gcnewonly'] = True
|
||||
default_prefs['gc_site_settings'] = {}
|
||||
default_prefs['allow_gc_from_ini'] = True
|
||||
default_prefs['gc_polish_cover'] = False
|
||||
|
||||
default_prefs['countpagesstats'] = []
|
||||
default_prefs['wordcountmissing'] = False
|
||||
|
||||
default_prefs['errorcol'] = ''
|
||||
default_prefs['save_all_errors'] = True
|
||||
default_prefs['savemetacol'] = ''
|
||||
default_prefs['lastcheckedcol'] = ''
|
||||
default_prefs['custom_cols'] = {}
|
||||
default_prefs['custom_cols_newonly'] = {}
|
||||
default_prefs['allow_custcol_from_ini'] = True
|
||||
|
||||
default_prefs['std_cols_newonly'] = {}
|
||||
default_prefs['set_author_url'] = True
|
||||
default_prefs['set_series_url'] = True
|
||||
default_prefs['includecomments'] = False
|
||||
default_prefs['anth_comments_newonly'] = True
|
||||
|
||||
default_prefs['imapserver'] = ''
|
||||
default_prefs['imapuser'] = ''
|
||||
default_prefs['imappass'] = ''
|
||||
default_prefs['imapsessionpass'] = False
|
||||
default_prefs['imapfolder'] = 'INBOX'
|
||||
default_prefs['imaptags'] = ''
|
||||
default_prefs['imapmarkread'] = True
|
||||
default_prefs['auto_reject_from_email'] = False
|
||||
default_prefs['update_existing_only_from_email'] = False
|
||||
default_prefs['download_from_email_immediately'] = False
|
||||
|
||||
|
||||
#default_prefs['single_proc_jobs'] = True # setting and code removed
|
||||
default_prefs['site_split_jobs'] = True
|
||||
default_prefs['reconsolidate_jobs'] = True
|
||||
|
||||
def set_library_config(library_config,db,setting=PREFS_KEY_SETTINGS):
|
||||
db.prefs.set_namespaced(PREFS_NAMESPACE,
|
||||
setting,
|
||||
library_config)
|
||||
|
||||
def get_library_config(db,setting=PREFS_KEY_SETTINGS,def_prefs=default_prefs):
|
||||
library_id = get_library_uuid(db)
|
||||
library_config = None
|
||||
|
||||
if library_config is None:
|
||||
#print("get prefs from db")
|
||||
library_config = db.prefs.get_namespaced(PREFS_NAMESPACE,
|
||||
setting)
|
||||
|
||||
if library_config is None:
|
||||
# defaults.
|
||||
logger.info("Using default settings")
|
||||
library_config = copy.deepcopy(def_prefs)
|
||||
|
||||
return library_config
|
||||
|
||||
# fake out so I don't have to change the prefs calls anywhere. The
|
||||
# Java programmer in me is offended by op-overloading, but it's very
|
||||
# tidy.
|
||||
class PrefsFacade():
|
||||
def _get_db(self):
|
||||
if self.passed_db:
|
||||
return self.passed_db
|
||||
else:
|
||||
# In the GUI plugin we want current db so we detect when
|
||||
# it's changed. CLI plugin calls need to pass db in.
|
||||
return get_gui().current_db
|
||||
|
||||
def __init__(self,passed_db=None,setting=PREFS_KEY_SETTINGS,def_prefs=default_prefs):
|
||||
self.default_prefs = def_prefs
|
||||
self.setting=setting
|
||||
self.libraryid = None
|
||||
self.current_prefs = None
|
||||
self.passed_db=passed_db
|
||||
|
||||
def _get_prefs(self):
|
||||
libraryid = get_library_uuid(self._get_db())
|
||||
if self.current_prefs is None or self.libraryid != libraryid:
|
||||
#print("self.current_prefs == None(%s) or self.libraryid != libraryid(%s)"%(self.current_prefs == None,self.libraryid != libraryid))
|
||||
self.libraryid = libraryid
|
||||
self.current_prefs = get_library_config(self._get_db(),
|
||||
setting=self.setting,
|
||||
def_prefs=self.default_prefs)
|
||||
return self.current_prefs
|
||||
|
||||
def __getitem__(self,k):
|
||||
prefs = self._get_prefs()
|
||||
if k not in prefs:
|
||||
# fall back to default_prefs when the key is not set in the library's prefs
|
||||
return self.default_prefs[k]
|
||||
return prefs[k]
|
||||
|
||||
def __setitem__(self,k,v):
|
||||
prefs = self._get_prefs()
|
||||
prefs[k]=v
|
||||
# self._save_prefs(prefs)
|
||||
|
||||
def __delitem__(self,k):
|
||||
prefs = self._get_prefs()
|
||||
if k in prefs:
|
||||
del prefs[k]
|
||||
|
||||
def save_to_db(self):
|
||||
self['last_saved_version'] = plugin_version
|
||||
set_library_config(self._get_prefs(),self._get_db(),setting=self.setting)
|
||||
|
||||
prefs = PrefsFacade(setting=PREFS_KEY_SETTINGS,
|
||||
def_prefs=default_prefs)
|
||||
|
||||
rejects_data = PrefsFacade(setting="rejects_data",
|
||||
def_prefs={'rejecturls_data':[]})
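The facade idea above (dict-style access that falls back to defaults for unset keys) can be shown without calibre. The class below is a minimal sketch under that assumption: `DefaultingPrefs` is a hypothetical name, and the library database lookup is replaced by a plain dict passed in as `stored`.

```python
import copy


class DefaultingPrefs:
    """Dict-like wrapper that falls back to defaults for missing keys."""

    def __init__(self, stored=None, defaults=None):
        self.defaults = defaults or {}
        # Use stored values when present, else start from a copy of defaults.
        if stored is not None:
            self.current = stored
        else:
            self.current = copy.deepcopy(self.defaults)

    def __getitem__(self, key):
        if key not in self.current:
            return self.defaults[key]
        return self.current[key]

    def __setitem__(self, key, value):
        self.current[key] = value

    def __delitem__(self, key):
        self.current.pop(key, None)


example = DefaultingPrefs(stored={'fileform': 'mobi'},
                          defaults={'fileform': 'epub', 'keeptags': False})
before = example['fileform']   # stored value wins
del example['fileform']
after = example['fileform']    # falls back to the default
print(before, after, example['keeptags'])  # prints: mobi epub False
```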
|
||||
|
|
@ -1,6 +0,0 @@
|
|||
# Translations
|
||||
|
||||
If you're interested in helping provide translations for this project,
|
||||
please use the
|
||||
[Transifex](https://www.transifex.com/projects/p/calibre-plugins/resources/)
|
||||
website to add translations to this, or other calibre plugins that support it.
|
||||
File diff suppressed because it is too large
Load diff
|
|
@@ -1,95 +0,0 @@
|
|||
#!/usr/bin/env python
|
||||
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
|
||||
from __future__ import (unicode_literals, division, absolute_import,
|
||||
print_function)
|
||||
|
||||
__license__ = 'GPL v3'
|
||||
__copyright__ = '2016, Jim Miller, 2011, Grant Drake <grant.drake@gmail.com>'
|
||||
__docformat__ = 'restructuredtext en'
|
||||
|
||||
'''
|
||||
A lot of this is lifted from Count Pages plugin by Grant Drake (with
|
||||
some changes from davidfor.)
|
||||
'''
|
||||
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
import re
|
||||
|
||||
from calibre.ebooks.oeb.iterator import EbookIterator
|
||||
from fanficfare.six import text_type as unicode
|
||||
|
||||
RE_HTML_BODY = re.compile(u'<body[^>]*>(.*)</body>', re.UNICODE | re.DOTALL | re.IGNORECASE)
|
||||
RE_STRIP_MARKUP = re.compile(u'<[^>]+>', re.UNICODE)
|
||||
|
||||
|
||||
def get_word_count(book_path):
|
||||
'''
|
||||
Estimate a word count
|
||||
'''
|
||||
from calibre.utils.localization import get_lang
|
||||
|
||||
iterator = _open_epub_file(book_path)
|
||||
|
||||
lang = iterator.opf.language
|
||||
lang = get_lang() if not lang else lang
|
||||
count = _get_epub_standard_word_count(iterator, lang)
|
||||
|
||||
return count
|
||||
|
||||
def _open_epub_file(book_path, strip_html=False):
|
||||
'''
|
||||
Given a path to an EPUB file, read the contents into a giant block of text
|
||||
'''
|
||||
iterator = EbookIterator(book_path)
|
||||
iterator.__enter__(only_input_plugin=True, run_char_count=True,
|
||||
read_anchor_map=False)
|
||||
return iterator
|
||||
|
||||
def _get_epub_standard_word_count(iterator, lang='en'):
|
||||
'''
|
||||
This algorithm counts individual words instead of pages
|
||||
'''
|
||||
|
||||
book_text = _read_epub_contents(iterator, strip_html=True)
|
||||
|
||||
try:
|
||||
from calibre.spell.break_iterator import count_words
|
||||
wordcount = count_words(book_text, lang)
|
||||
logger.debug('\tWord count - count_words method:%s'%wordcount)
|
||||
except:
|
||||
try: # The above method is new and no-one will have it as of 08/01/2016. Use an older method for a beta.
|
||||
from calibre.spell.break_iterator import split_into_words_and_positions
|
||||
wordcount = len(split_into_words_and_positions(book_text, lang))
|
||||
logger.debug('\tWord count - split_into_words_and_positions method:%s'%wordcount)
|
||||
except:
|
||||
from calibre.utils.wordcount import get_wordcount_obj
|
||||
wordcount = get_wordcount_obj(book_text)
|
||||
wordcount = wordcount.words
|
||||
logger.debug('\tWord count - old method:%s'%wordcount)
|
||||
|
||||
return wordcount
|
||||
|
||||
def _read_epub_contents(iterator, strip_html=False):
|
||||
'''
|
||||
Given an iterator for an ePub file, read the contents into a giant block of text
|
||||
'''
|
||||
book_files = []
|
||||
for path in iterator.spine:
|
||||
with open(path, 'rb') as f:
|
||||
html = f.read().decode('utf-8', 'replace')
|
||||
if strip_html:
|
||||
html = unicode(_extract_body_text(html)).strip()
|
||||
#print('FOUND HTML:', html)
|
||||
book_files.append(html)
|
||||
return ''.join(book_files)
|
||||
|
||||
def _extract_body_text(data):
|
||||
'''
|
||||
Get the body text of this html content with any html tags stripped
|
||||
'''
|
||||
body = RE_HTML_BODY.findall(data)
|
||||
if body:
|
||||
return RE_STRIP_MARKUP.sub('', body[0]).replace('.','. ')
|
||||
return ''
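As an illustration of what `_extract_body_text` does, here is a minimal standalone sketch. It reuses the two regular expressions from the file but assumes a plain whitespace split as a stand-in for calibre's `count_words`; `naive_word_count` and the sample markup are hypothetical and not part of the plugin.

```python
import re

# Same patterns as in the file above, written as Python 3 raw strings.
RE_HTML_BODY = re.compile(r'<body[^>]*>(.*)</body>', re.DOTALL | re.IGNORECASE)
RE_STRIP_MARKUP = re.compile(r'<[^>]+>')


def extract_body_text(data):
    """Return the text inside <body>, with tags removed."""
    body = RE_HTML_BODY.findall(data)
    if body:
        # Adding a space after each period keeps sentences from adjacent
        # tags from fusing into one token once the tags are stripped.
        return RE_STRIP_MARKUP.sub('', body[0]).replace('.', '. ')
    return ''


def naive_word_count(html):
    """Hypothetical helper: count whitespace-separated tokens.

    This is only an approximation; the plugin delegates the real
    counting to calibre.
    """
    return len(extract_body_text(html).split())


sample = '<html><body><p>One two.</p><p>Three</p></body></html>'
print(naive_word_count(sample))
```

Without the `replace('.', '. ')` step, the stripped text would read `One two.Three`, and a whitespace split would count two words instead of three. That appears to be the reason for the replacement, although the source does not say so.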
|
||||
10
cron.yaml
Normal file
|
|
@@ -0,0 +1,10 @@
|
|||
cron:
|
||||
- description: cleanup job
|
||||
url: /r3m0v3r
|
||||
schedule: every 2 hours
|
||||
|
||||
# There's a bug in the Python 2.7 runtime that prevents this from
|
||||
# working properly. In theory, there should never be orphans anyway.
|
||||
#- description: orphan cleanup job
|
||||
# url: /r3m0v3rOrphans
|
||||
# schedule: every 4 hours
|
||||
73
css/index.css
Normal file
|
|
@@ -0,0 +1,73 @@
|
|||
body
|
||||
{
|
||||
font: 0.9em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif;
|
||||
}
|
||||
|
||||
#main
|
||||
{
|
||||
width: 60%;
|
||||
margin-left: 20%;
|
||||
background-color: #dae6ff;
|
||||
padding: 2em;
|
||||
}
|
||||
|
||||
#greeting
|
||||
{
|
||||
/* margin-bottom: 1em; */
|
||||
border-color: #efefef;
|
||||
}
|
||||
|
||||
|
||||
|
||||
#logpassword:hover, #logpasswordtable:hover, #urlbox:hover, #typebox:hover, #helpbox:hover, #yourfile:hover
|
||||
{
|
||||
border: thin solid #fffeff;
|
||||
}
|
||||
|
||||
h1
|
||||
{
|
||||
text-decoration: none;
|
||||
}
|
||||
|
||||
#logpasswordtable
|
||||
{
|
||||
padding: 1em;
|
||||
}
|
||||
|
||||
#logpassword, #logpasswordtable {
|
||||
/* display: none; */
|
||||
}
|
||||
|
||||
#urlbox, #typebox, #logpasswordtable, #logpassword, #helpbox, #yourfile
|
||||
{
|
||||
margin: 1em;
|
||||
padding: 1em;
|
||||
border: thin dotted #fffeff;
|
||||
}
|
||||
|
||||
div.field
|
||||
{
|
||||
margin-bottom: 0.5em;
|
||||
}
|
||||
|
||||
#submitbtn
|
||||
{
|
||||
padding: 1em;
|
||||
}
|
||||
|
||||
#typelabel
|
||||
{
|
||||
}
|
||||
|
||||
#typeoptions
|
||||
{
|
||||
margin-top: 0.5em;
|
||||
}
|
||||
|
||||
#error
|
||||
{
|
||||
color: #f00;
|
||||
}
|
||||
.recent {
|
||||
font-size: large;
|
||||
}
|
||||
478
defaults.ini
Normal file
|
|
@@ -0,0 +1,478 @@
|
|||
# Copyright 2012 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
[defaults]
|
||||
|
||||
## [defaults] section applies to all formats and sites but may be
|
||||
## overridden at several levels
|
||||
|
||||
## All available titlepage_entries and the label used for them:
|
||||
## <entryname>_label:<label>
|
||||
## Labels may be customized.
|
||||
title_label:Title
|
||||
storyUrl_label:Story URL
|
||||
description_label:Summary
|
||||
author_label:Author
|
||||
authorUrl_label:Author URL
|
||||
## epub, txt, html
|
||||
formatname_label:File Format
|
||||
## .epub, .txt, .html
|
||||
formatext_label:File Extension
|
||||
## Category and Genre have overlap, depending on the site.
|
||||
## Sometimes Harry Potter is a category and Fantasy a genre. (fanfiction.net)
|
||||
## Sometimes Fantasy is category *and* a genre (fictionpress.com)
|
||||
## Sometimes there are multiple categories and/or genres.
|
||||
category_label:Category
|
||||
genre_label:Genre
|
||||
language_label:Language
|
||||
characters_label:Characters
|
||||
series_label:Series
|
||||
## Completed/In-Progress
|
||||
status_label:Status
|
||||
## Dates story first published, last updated, and downloaded(last with time).
|
||||
datePublished_label:Published
|
||||
dateUpdated_label:Updated
|
||||
dateCreated_label:Packaged
|
||||
## Rating depends on the site. Some use K,T,M,etc, and some PG,R,NC-17
|
||||
rating_label:Rating
|
||||
## Also depends on the site.
|
||||
warnings_label:Warnings
|
||||
numChapters_label:Chapters
|
||||
numWords_label:Words
|
||||
## www.fanfiction.net, fictionalley.com, etc.
|
||||
site_label:Publisher
|
||||
## ffnet, fpcom, etc.
|
||||
siteabbrev_label:Site Abbrev
|
||||
## The site's unique story/author identifier. Usually a number.
|
||||
storyId_label:Story ID
|
||||
authorId_label:Author ID
|
||||
## Primarily to put specific values in dc:subject tags for epub. Will
|
||||
## show up in Calibre as tags. Also carried into mobi when converted.
|
||||
extratags_label:Extra Tags
|
||||
## The version of fanficdownloader
|
||||
##
|
||||
version_label:FFDL Version
|
||||
|
||||
## items to include in the title page
|
||||
## Empty entries will *not* appear, even if in the list.
|
||||
## All current formats already include title and author.
|
||||
titlepage_entries: series,category,genre,language,characters,status,datePublished,dateUpdated,dateCreated,rating,warnings,numChapters,numWords,site,description
|
||||
|
||||
## Try to collect series name and number of this story in series.
|
||||
## Some sites (ab)use 'series' for reading lists and personal
|
||||
## collections. This lets us turn it on and off by site without
|
||||
## keeping a lengthy titlepage_entries per site and prevents it
|
||||
## updating in the plugin.
|
||||
collect_series: true
|
||||
|
||||
## include title page as first page.
|
||||
include_titlepage: true
|
||||
|
||||
## include a TOC page before the story text
|
||||
include_tocpage: true
|
||||
|
||||
## website encoding(s). In theory, each website reports the character
|
||||
## encoding they use for each page. In practice, some sites report it
|
||||
## incorrectly. Each adapter has a default list, usually "utf8,
|
||||
## Windows-1252" or "Windows-1252, utf8", but this will let you
|
||||
## explicitly set the encoding and order if you need to. The special
|
||||
## value 'auto' will call chardet and use the encoding it reports if
|
||||
## it has +90% confidence. 'auto' is not reliable.
|
||||
#website_encodings: auto, utf8, Windows-1252
|
||||
|
||||
## python string Template, string with ${title}, ${author} etc, same as titlepage_entries
|
||||
## Can include directories. ${formatext} will be added if not in filename somewhere.
|
||||
#output_filename: books/${title}-${siteabbrev}_${storyId}${formatext}
|
||||
#output_filename: books/${formatname}/${siteabbrev}/${authorId}/${title}-${siteabbrev}_${storyId}${formatext}
|
||||
output_filename: ${title}-${siteabbrev}_${storyId}${formatext}
|
||||
|
||||
## Make directories as needed.
|
||||
make_directories: true
|
||||
|
||||
## Always overwrite output files. Otherwise, the downloader checks
|
||||
## the timestamp on the existing file and only overwrites if the story
|
||||
## has been updated more recently. Command line version only.
|
||||
#always_overwrite: true
|
||||
|
||||
## put output (with output_filename) in a zip file zip_filename.
|
||||
zip_output: false
|
||||
|
||||
## Can include directories. .zip will be added if not in name somewhere
|
||||
zip_filename: ${title}-${siteabbrev}_${storyId}${formatext}.zip
|
||||
|
||||
## Normally, try to make the output file name 'safe' by removing
|
||||
## invalid filename chars. Applies to both output_filename &
|
||||
## zip_filename.
|
||||
allow_unsafe_filename: false
|
||||
|
||||
## entries to make epub subjects and calibre tags
|
||||
## lastupdate creates two tags: "Last Update Year/Month: %Y/%m" and "Last Update: %Y/%m/%d"
|
||||
include_subject_tags: extratags, genre, category, characters, lastupdate, status
|
||||
|
||||
## extra tags (comma separated) to include, primarily for epub.
|
||||
extratags: FanFiction
|
||||
|
||||
## number of seconds to sleep between calls to the story site. May be
|
||||
## useful if pulling large numbers of stories or if the site is slow.
|
||||
#slow_down_sleep_time:0.5
|
||||
|
||||
## For use only with stand-alone CLI version--run a command on the
|
||||
## generated file after it's produced. All of the titlepage_entries
|
||||
## values are available, plus output_filename.
|
||||
#post_process_cmd: addbook -f "${output_filename}" -t "${title}"
|
||||
|
||||
## Use regular expressions to find and replace (or remove) metadata.
|
||||
## For example, you could change Sci-Fi=>SF, remove *-Centered tags,
|
||||
## etc. See http://docs.python.org/library/re.html (look for re.sub)
|
||||
## for regexp details.
|
||||
## Make sure to keep at least one space at the start of each line and
|
||||
## to escape % to %%, if used.
|
||||
#replace_metadata:
|
||||
# Sci-Fi=>SF
|
||||
# Puella Magi Madoka Magica.* => Madoka
|
||||
# Comedy=>Humor
|
||||
# Crossover: (.*)=>\1
|
||||
# (.*)Great(.*)=>\1Moderate\2
|
||||
# .*-Centered=>
|
||||
|
||||
## Some readers don't show horizontal rule (<hr />) tags correctly.
|
||||
## This replaces them all with a centered '* * *'. (Note centering
|
||||
## doesn't work on some devices either.)
|
||||
#replace_hr: false
|
||||
|
||||
## Each output format has a section that overrides [defaults]
|
||||
[html]
|
||||
|
||||
## output background color--only used by html and epub (and ignored in
|
||||
## epub by many readers). Included below in output_css--will be
|
||||
## ignored if not in output_css.
|
||||
background_color: ffffff
|
||||
|
||||
## Allow customization of CSS. Make sure to keep at least one space
|
||||
## at the start of each line and to escape % to %%. Also need
|
||||
## background_color to be in the same section, if included in CSS.
|
||||
output_css:
|
||||
body { background-color: #%(background_color)s; }
|
||||
.CI {
|
||||
text-align:center;
|
||||
margin-top:0px;
|
||||
margin-bottom:0px;
|
||||
padding:0px;
|
||||
}
|
||||
.center {text-align: center;}
|
||||
.cover {text-align: center;}
|
||||
.full {width: 100%%; }
|
||||
.quarter {width: 25%%; }
|
||||
.smcap {font-variant: small-caps;}
|
||||
.u {text-decoration: underline;}
|
||||
.bold {font-weight: bold;}
|
||||
|
||||
[txt]
|
||||
## Add URLs since there aren't links.
|
||||
titlepage_entries: series,category,genre,language,status,datePublished,dateUpdated,dateCreated,rating,warnings,numChapters,numWords,site,storyUrl, authorUrl, description
|
||||
|
||||
## use \r\n for line endings, the windows convention. text output only.
|
||||
windows_eol: true
|
||||
|
||||
[epub]
|
||||
|
||||
## epub is already a zip file.
|
||||
zip_output: false
|
||||
|
||||
## epub carries the TOC in metadata.
|
||||
## mobi generated from epub will have a TOC at the end.
|
||||
include_tocpage: false
|
||||
|
||||
## epub->mobi conversions typically don't like tables.
|
||||
titlepage_use_table: false
|
||||
|
||||
## When using tables, make these span both columns.
|
||||
wide_titlepage_entries: description, storyUrl, authorUrl
|
||||
|
||||
## output background color--only used by html and epub (and ignored in
|
||||
## epub by many readers). Included below in output_css--will be
|
||||
## ignored if not in output_css.
|
||||
background_color: ffffff
|
||||
|
||||
## Allow customization of CSS. Make sure to keep at least one space
|
||||
## at the start of each line and to escape % to %%. Also need
|
||||
## background_color to be in the same section, if included in CSS.
|
||||
output_css:
|
||||
body { background-color: #%(background_color)s;
|
||||
text-align: justify;
|
||||
margin: 2%%; }
|
||||
pre { font-size: x-small; }
|
||||
sml { font-size: small; }
|
||||
h1 { text-align: center; }
|
||||
h2 { text-align: center; }
|
||||
h3 { text-align: center; }
|
||||
h4 { text-align: center; }
|
||||
h5 { text-align: center; }
|
||||
h6 { text-align: center; }
|
||||
.CI {
|
||||
text-align:center;
|
||||
margin-top:0px;
|
||||
margin-bottom:0px;
|
||||
padding:0px;
|
||||
}
|
||||
.center {text-align: center;}
|
||||
.cover {text-align: center;}
|
||||
.full {width: 100%%; }
|
||||
.quarter {width: 25%%; }
|
||||
.smcap {font-variant: small-caps;}
|
||||
.u {text-decoration: underline;}
|
||||
.bold {font-weight: bold;}
|
||||
|
||||
## include images from img tags in the body and summary of
|
||||
## stories. Images will be converted to jpg for size if possible.
|
||||
#include_images:false
|
||||
|
||||
## If not set, the summary will have all html stripped for safety.
|
||||
## Both this and include_images must be true to get images in the
|
||||
## summary.
|
||||
#keep_summary_html:false
|
||||
|
||||
## If set, the first image found will be made the cover image. If
|
||||
## keep_summary_html is true, any images in summary will be before any
|
||||
## in chapters.
|
||||
#make_firstimage_cover: false
|
||||
|
||||
## If set, the epub will never have a cover, even if include_images is on
|
||||
## and the site has specific cover images.
|
||||
#never_make_cover: false
|
||||
|
||||
## If set, and there isn't already a cover image from the adapter or
|
||||
## from make_firstimage_cover, this image will be made the cover.
|
||||
## It can be either a 'file:' or 'http:' url.
|
||||
## Note that if you enable make_firstimage_cover in [epub], but want
|
||||
## to use default_cover_image for a specific site, use the site:format
|
||||
## section, for example: [www.ficwad.com:epub]
|
||||
#default_cover_image:file:///C:/Users/username/Desktop/nook/images/icon.png
|
||||
#default_cover_image:http://www.somesite.com/someimage.gif
|
||||
|
||||
## Resize images down to width, height, preserving aspect ratio.
|
||||
## Nook size, with margin.
|
||||
image_max_size: 580, 725
|
||||
|
||||
## Change image to grayscale, if graphics library allows, to save
|
||||
## space.
|
||||
#grayscale_images: false
|
||||
|
||||
## if the <img> tag doesn't have a div or a p around it, nook gets
|
||||
## confused and displays it on every page after that under the text
|
||||
## for the rest of the chapter. I doubt adding a div around the img
|
||||
## will break any other readers, but in case it does, the fix can be
|
||||
## turned off.
|
||||
nook_img_fix:true
|
||||
|
||||
[mobi]
|
||||
## mobi TOC cannot be turned off right now.
|
||||
#include_tocpage: true
|
||||
|
||||
## Each site has a section that overrides [defaults] *and* the format
|
||||
## sections. test1.com specifically is not a real story site. Instead,
|
||||
## it is a fake site for testing configuration and output. It uses
|
||||
## URLs like: http://test1.com?sid=12345
|
||||
[test1.com]
|
||||
extratags: FanFiction,Testing
|
||||
|
||||
## If necessary, you can define [<site>:<format>] sections to
|
||||
## customize the formats differently for the same site. Overrides
|
||||
## defaults, format and site.
|
||||
[test1.com:txt]
|
||||
extratags: FanFiction,Testing,Text
|
||||
|
||||
[test1.com:html]
|
||||
extratags: FanFiction,Testing,HTML
|
||||
|
||||
[castlefans.org]
|
||||
## Some sites require login (or login for some rated stories). The
|
||||
## program can prompt you, or you can save it in config. In
|
||||
## commandline version, this should go in your personal.ini, not
|
||||
## defaults.ini.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
## Some sites also require the user to confirm they are adult for
|
||||
## adult content. In commandline version, this should go in your
|
||||
## personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[fanfiction.mugglenet.com]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content. In commandline version,
|
||||
## this should go in your personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[fanfiction.portkey.org]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content. In commandline version,
|
||||
## this should go in your personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[fanfiction.tenhawkpresents.com]
|
||||
## Some sites require login (or login for some rated stories). The
|
||||
## program can prompt you, or you can save it in config. In
|
||||
## commandline version, this should go in your personal.ini, not
|
||||
## defaults.ini.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
## Some sites also require the user to confirm they are adult for
|
||||
## adult content. In commandline version, this should go in your
|
||||
## personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[thequidditchpitch.org]
|
||||
## Some sites require login (or login for some rated stories). The
|
||||
## program can prompt you, or you can save it in config. In
|
||||
## commandline version, this should go in your personal.ini, not
|
||||
## defaults.ini.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
## Some sites also require the user to confirm they are adult for
|
||||
## adult content. In commandline version, this should go in your
|
||||
## personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[www.adastrafanfic.com]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content. In commandline version,
|
||||
## this should go in your personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[www.archiveofourown.org]
|
||||
## Some sites also require the user to confirm they are adult for
|
||||
## adult content. In commandline version, this should go in your
|
||||
## personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[www.fanfiction.net]
|
||||
|
||||
[www.ficbook.net]
|
||||
|
||||
[www.fictionalley.org]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content. In commandline version,
|
||||
## this should go in your personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
## fictionalley.org storyIds are not unique. Combine with authorId.
|
||||
output_filename: ${title}-${siteabbrev}_${authorId}_${storyId}${formatext}
|
||||
|
||||
[www.fictionpress.com]
|
||||
## Clear FanFiction from defaults, fictionpress.com is original fiction.
|
||||
extratags:
|
||||
|
||||
[www.ficwad.com]
|
||||
## Some sites require login (or login for some rated stories). The
|
||||
## program can prompt you, or you can save it in config. In
|
||||
## commandline version, this should go in your personal.ini, not
|
||||
## defaults.ini.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
[www.fimfiction.net]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content. In commandline version,
|
||||
## this should go in your personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
## fimfiction.net stories can be locked requiring individual
|
||||
## passwords. If fail_on_password is set, the downloader will fail
|
||||
## when a password is required rather than prompting every time.
|
||||
#fail_on_password: false
|
||||
|
||||
[www.gayauthors.org]
|
||||
|
||||
[www.harrypotterfanfiction.com]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content. In commandline version,
|
||||
## this should go in your personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[www.hpfandom.net]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content. In commandline version,
|
||||
## this should go in your personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
[www.mediaminer.org]
|
||||
|
||||
[www.potionsandsnitches.net]
|
||||
|
||||
[www.siye.co.uk]
|
||||
|
||||
[www.thewriterscoffeeshop.com]
|
||||
## Some sites require login (or login for some rated stories). The
|
||||
## program can prompt you, or you can save it in config. In
|
||||
## commandline version, this should go in your personal.ini, not
|
||||
## defaults.ini.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
## Some sites also require the user to confirm they are adult for
|
||||
## adult content. In commandline version, this should go in your
|
||||
## personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
## thewriterscoffeeshop.com (ab)uses series as personal reading lists.
|
||||
collect_series: false
|
||||
|
||||
[www.tthfanfic.org]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content. In commandline version,
|
||||
## this should go in your personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
## tth is a little unusual--it doesn't require user/pass, but the site
|
||||
## keeps track of which chapters you've read and won't send another
|
||||
## update until it thinks you're up to date. This way, on download,
|
||||
## it thinks you're up to date.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
[www.twilighted.net]
|
||||
## Some sites require login (or login for some rated stories). The
|
||||
## program can prompt you, or you can save it in config. In
|
||||
## commandline version, this should go in your personal.ini, not
|
||||
## defaults.ini.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
## twilighted.net (ab)uses series as personal reading lists.
|
||||
collect_series: false
|
||||
|
||||
[www.twiwrite.net]
|
||||
## Some sites require login (or login for some rated stories). The
|
||||
## program can prompt you, or you can save it in config. In
|
||||
## commandline version, this should go in your personal.ini, not
|
||||
## defaults.ini.
|
||||
#username:YourName
|
||||
#password:yourpassword
|
||||
|
||||
## twiwrite.net (ab)uses series as personal reading lists.
|
||||
collect_series: false
|
||||
|
||||
[www.whofic.com]
|
||||
|
||||
[overrides]
|
||||
## It may sometimes be useful to override all of the specific format,
|
||||
## site and site:format sections in your private configuration. For
|
||||
## example, this extratags param here would override all of the
|
||||
## extratags params in all other sections. Only commandline options
|
||||
## beat overrides.
|
||||
#extratags:fanficdownloader
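The section comments in this file describe a layered lookup: `[overrides]` beats `[<site>:<format>]`, which beats the site section, which beats the format section, which beats `[defaults]`. A minimal sketch of that precedence follows, assuming Python 3's `configparser` rather than the project's own configuration class; the `lookup` helper and the trimmed sample values are hypothetical.

```python
import configparser

# A trimmed-down sample in the same style as defaults.ini.
INI = """
[defaults]
extratags: FanFiction
include_tocpage: true

[epub]
include_tocpage: false

[test1.com]
extratags: FanFiction,Testing

[test1.com:txt]
extratags: FanFiction,Testing,Text
"""


def lookup(config, key, site, fmt):
    """Hypothetical helper: return the most specific value for key.

    Sections are tried from most to least specific, following the
    order described in the file's comments.
    """
    order = ("overrides", "%s:%s" % (site, fmt), site, fmt, "defaults")
    for section in order:
        # has_option returns False for a section that does not exist.
        if config.has_option(section, key):
            return config.get(section, key)
    return None


# Interpolation is disabled so literal % characters need no escaping here.
config = configparser.ConfigParser(interpolation=None)
config.read_string(INI)

print(lookup(config, "extratags", "test1.com", "txt"))
print(lookup(config, "extratags", "test1.com", "epub"))
print(lookup(config, "include_tocpage", "test1.com", "epub"))
```

In this sketch a `[test1.com:txt]` value wins for text output, the plain `[test1.com]` value is used for epub, and `include_tocpage` falls through to the `[epub]` section because no site section sets it.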
|
||||
59
delete_fic.py
Normal file
|
|
@@ -0,0 +1,59 @@
|
|||
import os
|
||||
import cgi
|
||||
import sys
|
||||
import logging
|
||||
import traceback
|
||||
import StringIO
|
||||
|
||||
from google.appengine.api import users
|
||||
from google.appengine.ext import webapp
|
||||
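from google.appengine.ext.webapp import util
# template.render() is called at the end of DeleteFicHandler.get()
from google.appengine.ext.webapp import template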
|
||||
|
||||
from fanficdownloader.downloader import *
|
||||
from fanficdownloader.ffnet import *
|
||||
from fanficdownloader.output import *
|
||||
|
||||
from google.appengine.ext import db
|
||||
|
||||
from fanficdownloader.zipdir import *
|
||||
|
||||
from ffstorage import *
|
||||
|
||||
def create_mac(user, fic_id, fic_url):
|
||||
return str(abs(hash(user)+hash(fic_id)))+str(abs(hash(fic_url)))
|
||||
|
||||
def check_mac(user, fic_id, fic_url, mac):
|
||||
return (create_mac(user, fic_id, fic_url) == mac)
|
||||
|
||||
def create_mac_for_fic(user, fic_id):
|
||||
key = db.Key(fic_id)
|
||||
fanfic = db.get(key)
|
||||
if fanfic.user != user:
|
||||
return None
|
||||
else:
|
||||
return create_mac(user, key, fanfic.url)
|
||||
|
||||
class DeleteFicHandler(webapp.RequestHandler):
|
||||
def get(self):
|
||||
user = users.get_current_user()
|
||||
if not user:
|
||||
self.redirect('/login')
|
||||
|
||||
fic_id = self.request.get('fic_id')
|
||||
fic_mac = self.request.get('key_id')
|
||||
|
||||
actual_mac = create_mac_for_fic(user, fic_id)
|
||||
if actual_mac != fic_mac:
|
||||
self.response.out.write("Ooops")
|
||||
else:
|
||||
key = db.Key(fic_id)
|
||||
fanfic = db.get(key)
|
||||
fanfic.delete()
|
||||
self.redirect('/recent')
|
||||
|
||||
|
||||
fics = db.GqlQuery("Select * From DownloadedFanfic WHERE user = :1", user)
|
||||
template_values = dict(fics = fics, nickname = user.nickname())
|
||||
path = os.path.join(os.path.dirname(__file__), 'recent.html')
|
||||
self.response.out.write(template.render(path, template_values))
|
||||
|
||||
202
downloader.py
Normal file
|
|
@@ -0,0 +1,202 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import logging
|
||||
## XXX cli option for logging level.
|
||||
logging.basicConfig(level=logging.DEBUG,format="%(levelname)s:%(filename)s(%(lineno)d):%(message)s")
|
||||
|
||||
import sys, os
|
||||
from os.path import normpath, expanduser, isfile, join
|
||||
from StringIO import StringIO
|
||||
from optparse import OptionParser
|
||||
import getpass
|
||||
import string
|
||||
import ConfigParser
|
||||
from subprocess import call
|
||||
|
||||
from fanficdownloader import adapters,writers,exceptions
|
||||
from fanficdownloader.epubutils import get_dcsource_chaptercount, get_update_data
|
||||
|
||||
if sys.version_info < (2, 5):
|
||||
print "This program requires Python 2.5 or newer."
|
||||
sys.exit(1)
|
||||
|
||||
def writeStory(config,adapter,writeformat,metaonly=False,outstream=None):
|
||||
writer = writers.getWriter(writeformat,config,adapter)
|
||||
writer.writeStory(outstream=outstream,metaonly=metaonly)
|
||||
output_filename=writer.getOutputFileName()
|
||||
del writer
|
||||
return output_filename
|
||||
|
||||
def main():
|
||||
# read in args, anything starting with -- will be treated as --<variable>=<value>
|
||||
usage = "usage: %prog [options] storyurl"
|
||||
parser = OptionParser(usage)
|
||||
parser.add_option("-f", "--format", dest="format", default="epub",
|
||||
help="write story as FORMAT, epub(default), text or html", metavar="FORMAT")
|
||||
parser.add_option("-c", "--config",
|
||||
action="append", dest="configfile", default=None,
|
||||
help="read config from specified file(s) in addition to ~/.fanficdownloader/defaults.ini, ~/.fanficdownloader/personal.ini, ./defaults.ini, ./personal.ini", metavar="CONFIG")
|
||||
parser.add_option("-b", "--begin", dest="begin", default=None,
|
||||
help="Begin with Chapter START", metavar="START")
|
||||
parser.add_option("-e", "--end", dest="end", default=None,
|
||||
help="End with Chapter END", metavar="END")
|
||||
parser.add_option("-o", "--option",
|
||||
action="append", dest="options",
|
||||
help="set an option NAME=VALUE", metavar="NAME=VALUE")
|
||||
parser.add_option("-m", "--meta-only",
|
||||
action="store_true", dest="metaonly",
|
||||
help="Retrieve metadata and stop. Or, if --update-epub, update metadata title page only.",)
|
||||
parser.add_option("-u", "--update-epub",
|
||||
action="store_true", dest="update",
|
||||
help="Update an existing epub with new chapter, give epub filename instead of storyurl.",)
|
||||
parser.add_option("--force",
|
||||
action="store_true", dest="force",
|
||||
help="Force overwrite or update of an existing epub, download and overwrite all chapters.",)
|
||||
|
||||
(options, args) = parser.parse_args()
|
||||
|
||||
if len(args) != 1:
|
||||
parser.error("incorrect number of arguments")
|
||||
|
||||
if options.update and options.format != 'epub':
|
||||
parser.error("-u/--update-epub only works with epub")
|
||||
|
||||
config = ConfigParser.SafeConfigParser()
|
||||
|
||||
conflist = []
|
||||
homepath = join(expanduser("~"),".fanficdownloader")
|
||||
|
||||
if isfile(join(homepath,"defaults.ini")):
|
||||
conflist.append(join(homepath,"defaults.ini"))
|
||||
if isfile("defaults.ini"):
|
||||
conflist.append("defaults.ini")
|
||||
|
||||
if isfile(join(homepath,"personal.ini")):
|
||||
conflist.append(join(homepath,"personal.ini"))
|
||||
if isfile("personal.ini"):
|
||||
conflist.append("personal.ini")
|
||||
|
||||
if options.configfile:
|
||||
conflist.extend(options.configfile)
|
||||
|
||||
logging.debug('reading %s config file(s), if present'%conflist)
|
||||
config.read(conflist)
|
||||
|
||||
try:
|
||||
config.add_section("overrides")
|
||||
except ConfigParser.DuplicateSectionError:
|
||||
pass
|
||||
|
||||
if options.force:
|
||||
config.set("overrides","always_overwrite","true")
|
||||
|
||||
if options.options:
|
||||
for opt in options.options:
|
||||
(var,val) = opt.split('=',1)
|
||||
config.set("overrides",var,val)
|
||||
|
||||
try:
|
||||
## Attempt to update an existing epub.
|
||||
if options.update:
|
||||
(url,chaptercount) = get_dcsource_chaptercount(args[0])
|
||||
print "Updating %s, URL: %s" % (args[0],url)
|
||||
output_filename = args[0]
|
||||
config.set("overrides","output_filename",args[0])
|
||||
else:
|
||||
url = args[0]
|
||||
|
||||
adapter = adapters.getAdapter(config,url,options.format)
|
||||
|
||||
## Check for include_images and absence of PIL, give warning.
|
||||
if adapter.getConfig('include_images'):
|
||||
try:
|
||||
import Image
|
||||
except ImportError:
|
||||
print "You have include_images enabled, but Python Image Library(PIL) isn't found.\nImages will be included full size in original format.\nContinue? (y/n)?"
|
||||
if not sys.stdin.readline().strip().lower().startswith('y'):
|
||||
return
|
||||
|
||||
|
||||
## three tries, that's enough if both user/pass & is_adult needed,
|
||||
## or a couple tries of one or the other
|
||||
for x in range(0,2):
|
||||
try:
|
||||
adapter.getStoryMetadataOnly()
|
||||
except exceptions.FailedToLogin, f:
|
||||
if f.passwdonly:
|
||||
print "Story requires a password."
|
||||
else:
|
||||
print "Login Failed, Need Username/Password."
|
||||
sys.stdout.write("Username: ")
|
||||
adapter.username = sys.stdin.readline().strip()
|
||||
adapter.password = getpass.getpass(prompt='Password: ')
|
||||
#print("Login: `%s`, Password: `%s`" % (adapter.username, adapter.password))
|
||||
except exceptions.AdultCheckRequired:
|
||||
print "Please confirm you are an adult in your locale: (y/n)?"
|
||||
if sys.stdin.readline().strip().lower().startswith('y'):
|
||||
adapter.is_adult=True
|
||||
|
||||
if options.update and not options.force:
|
||||
urlchaptercount = int(adapter.getStoryMetadataOnly().getMetadata('numChapters'))
|
||||
|
||||
if chaptercount == urlchaptercount and not options.metaonly:
|
||||
print "%s already contains %d chapters." % (args[0],chaptercount)
|
||||
elif chaptercount > urlchaptercount:
|
||||
print "%s contains %d chapters, more than source: %d." % (args[0],chaptercount,urlchaptercount)
|
||||
else:
|
||||
print "Do update - epub(%d) vs url(%d)" % (chaptercount, urlchaptercount)
|
||||
if not options.metaonly:
|
||||
|
||||
# update now handled by pre-populating the old
|
||||
# images and chapters in the adapter rather than
|
||||
# merging epubs.
|
||||
(url,chaptercount,
|
||||
adapter.oldchapters,
|
||||
adapter.oldimgs) = get_update_data(args[0])
|
||||
|
||||
writeStory(config,adapter,"epub")
|
||||
|
||||
else:
|
||||
# regular download
|
||||
if options.metaonly:
|
||||
print adapter.getStoryMetadataOnly()
|
||||
|
||||
adapter.setChaptersRange(options.begin,options.end)
|
||||
|
||||
output_filename=writeStory(config,adapter,options.format,options.metaonly)
|
||||
|
||||
if not options.metaonly and adapter.getConfig("post_process_cmd"):
|
||||
metadata = adapter.story.metadata
|
||||
metadata['output_filename']=output_filename
|
||||
call(string.Template(adapter.getConfig("post_process_cmd"))
|
||||
.substitute(metadata), shell=True)
|
||||
|
||||
del adapter
|
||||
|
||||
except exceptions.InvalidStoryURL, isu:
|
||||
print isu
|
||||
except exceptions.StoryDoesNotExist, dne:
|
||||
print dne
|
||||
except exceptions.UnknownSite, us:
|
||||
print us
|
||||
|
||||
if __name__ == "__main__":
|
||||
#import time
|
||||
#start = time.time()
|
||||
main()
|
||||
#print("Total time seconds:%f"%(time.time()-start))
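
The config lookup in main() above can be summarised with a small sketch. This is a Python 3 illustration, not part of the original script: the function name `candidate_config_files` and its `homepath`/`workdir` parameters are hypothetical (the original uses `~/.fanficdownloader` and the current directory directly). It only shows the order in which existing files are collected before being passed to `config.read()`, where later files take precedence over earlier ones.

```python
from os.path import isfile, join

def candidate_config_files(homepath, workdir, extra=None):
    """List existing config files in the order main() collects them."""
    found = []
    # defaults.ini first (home dir, then working dir), then personal.ini.
    for name in ("defaults.ini", "personal.ini"):
        for folder in (homepath, workdir):
            path = join(folder, name)
            if isfile(path):
                found.append(path)
    # Files given with -c/--config are appended last, without an
    # existence check, so they are read last.
    if extra:
        found.extend(extra)
    return found
```

Because the files are read in list order, a value in personal.ini replaces the same key from defaults.ini, and a file named with -c replaces both.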
|
||||
89
editconfig.html
Normal file
|
|
@ -0,0 +1,89 @@
|
|||
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
|
||||
<html>
|
||||
<head>
|
||||
<link href="/css/index.css" rel="stylesheet" type="text/css">
|
||||
<title>FanFictionDownLoader - read fanfiction from twilighted.net, fanfiction.net, fictionpress.com, fictionalley.org, ficwad.com, potionsandsnitches.net, harrypotterfanfiction.com, mediaminer.org on Kindle, Nook, Sony Reader, iPad, iPhone, Android, Aldiko, Stanza</title>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
|
||||
<meta name="google-site-verification" content="kCFc-G4bka_pJN6Rv8CapPBcwmq0hbAUZPkKWqRsAYU" />
|
||||
<script type="text/javascript">
|
||||
|
||||
var _gaq = _gaq || [];
|
||||
_gaq.push(['_setAccount', 'UA-12136939-1']);
|
||||
_gaq.push(['_trackPageview']);
|
||||
|
||||
(function() {
|
||||
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
|
||||
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
|
||||
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
|
||||
})();
|
||||
|
||||
</script>
|
||||
</head>
|
||||
<body>
|
||||
<div id='main' style="width: 80%; margin-left: 10%;">
|
||||
<h1>
|
||||
<a href="/" style="text-decoration: none; color: black;">FanFictionDownLoader</a>
|
||||
</h1>
|
||||
|
||||
<div style="text-align: center">
|
||||
<script type="text/javascript"><!--
|
||||
google_ad_client = "ca-pub-0320924304307555";
|
||||
/* Standard */
|
||||
google_ad_slot = "8974025478";
|
||||
google_ad_width = 468;
|
||||
google_ad_height = 60;
|
||||
//-->
|
||||
</script>
|
||||
<script type="text/javascript"
|
||||
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
|
||||
</script>
|
||||
</div>
|
||||
|
||||
<form action="/editconfig" method="post">
|
||||
<input type="hidden" name="update" value="true" />
|
||||
<div id='logpasswordtable'>
|
||||
<h3>Edit Config</h3>
|
||||
<div id='logpassword'>
|
||||
Editing configuration for {{ nickname }}.
|
||||
</div>
|
||||
<div class='fieldandlabel'>
|
||||
<textarea name="config" style="width: 100%; height: 200px;" wrap='off'>{{ config }}</textarea>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div id='submitbtn'>
|
||||
<input type="submit" value="Save">
|
||||
</div>
|
||||
</form>
|
||||
|
||||
<div>
|
||||
<h3>Default System configuration</h3>
|
||||
<pre>
|
||||
{{ defaultsini }}
|
||||
</pre>
|
||||
</div>
|
||||
|
||||
<div style='text-align: center'>
|
||||
<img src="http://code.google.com/appengine/images/appengine-silver-120x30.gif"
|
||||
alt="Powered by Google App Engine" />
|
||||
<br/><br/>
|
||||
This is a web front-end to <a href="http://code.google.com/p/fanficdownloader/">FanFictionDownLoader</a><br/>
|
||||
Copyright © Fanficdownloader team
|
||||
</div>
|
||||
|
||||
<div style="margin-top: 1em; text-align: center">
|
||||
<script type="text/javascript"><!--
|
||||
google_ad_client = "pub-2027714004231956";
|
||||
/* FFD */
|
||||
google_ad_slot = "7330682770";
|
||||
google_ad_width = 468;
|
||||
google_ad_height = 60;
|
||||
//-->
|
||||
</script>
|
||||
<script type="text/javascript"
|
||||
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
|
||||
</script>
|
||||
</div>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
|
|
@ -1,35 +1,25 @@
|
|||
#!/usr/bin/python
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2019 FanFicFare team
|
||||
#
|
||||
# epubmerge.py 1.0
|
||||
|
||||
# Copyright 2011, Jim Miller
|
||||
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
from .adapter_test1 import TestSiteAdapter
|
||||
|
||||
class Test3SiteAdapter(TestSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
TestSiteAdapter.__init__(self, config, url)
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'test3.com'
|
||||
|
||||
def getClass():
|
||||
return Test3SiteAdapter
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
print('''
|
||||
This utility has been split out into its own project.
|
||||
See: http://code.google.com/p/epubmerge/
|
||||
...for a CLI epubmerge.py program and calibre plugin.
|
||||
''')
|
||||
40
example.ini
Normal file
|
|
@ -0,0 +1,40 @@
|
|||
## This is an example of what your personal configuration might look
|
||||
## like.
|
||||
|
||||
[defaults]
|
||||
## Some sites also require the user to confirm they are adult for
|
||||
## adult content. In commandline version, this should go in your
|
||||
## personal.ini, not defaults.ini.
|
||||
#is_adult:true
|
||||
|
||||
## The most common use, I expect, will be saving usernames/passwords
## for different sites.
|
||||
[www.twilighted.net]
|
||||
#username:YourPenname
|
||||
#password:YourPassword
|
||||
|
||||
[www.ficwad.com]
|
||||
#username:YourUsername
|
||||
#password:YourPassword
|
||||
|
||||
[www.adastrafanfic.com]
|
||||
## Some sites do not require a login, but do require the user to
|
||||
## confirm they are adult for adult content.
|
||||
#is_adult:true
|
||||
|
||||
## The [defaults] section here will override the system [defaults],
## but not the format, site, or site:format sections.
|
||||
[defaults]
|
||||
## Directories only useful in commandline or zip files.
|
||||
#output_filename: books/${title}-${siteabbrev}_${storyId}${formatext}
|
||||
#output_filename: books/${site}/${authorId}/${title}-${storyId}${formatext}
|
||||
|
||||
## For example, zip_output here will turn on zip for html and txt, but
|
||||
## not epub because the system [epub] section explicitly says
|
||||
## zip_output: false (epubs *are* specially formatted zip files.)
|
||||
#zip_output: true
|
||||
#zip_filename: ${title}-${siteabbrev}_${storyId}${formatext}.zip
|
||||
|
||||
## This section will override anything in the system defaults or other
|
||||
## sections here.
|
||||
[overrides]
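
The layering described in the comments above can be sketched in Python 3. This is an illustration under assumptions, not the project's implementation: the helper `lookup` is hypothetical, and only "[overrides] wins, [defaults] loses" is taken from the comments; the relative order of the site:format, site, and format sections is a guess.

```python
import configparser

def lookup(config, key, site, fmt):
    """Return the first value found for key, most specific section first."""
    # Assumed search order; only the first and last entries are stated
    # in the example.ini comments.
    order = ["overrides", "%s:%s" % (site, fmt), site, fmt, "defaults"]
    for section in order:
        if config.has_option(section, key):
            return config.get(section, key)
    return None

config = configparser.ConfigParser()
config.read_string("""
[defaults]
zip_output: true
[epub]
zip_output: false
""")
```

With this config, `lookup(config, "zip_output", "www.ficwad.com", "html")` falls through to [defaults], while the same call with "epub" stops at the [epub] section, which matches the zip_output note above.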
|
||||
2014
fanficdownloader/BeautifulSoup.py
Normal file
File diff suppressed because it is too large
Load diff
1
fanficdownloader/__init__.py
Normal file
|
|
@ -0,0 +1 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
103
fanficdownloader/adapters/__init__.py
Normal file
|
|
@ -0,0 +1,103 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import os, re, sys, glob, types
|
||||
from os.path import dirname, basename, normpath
|
||||
import logging
|
||||
import urlparse as up
|
||||
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
## must import each adapter here.
|
||||
|
||||
import adapter_test1
|
||||
import adapter_fanfictionnet
|
||||
import adapter_castlefansorg
|
||||
import adapter_fictionalleyorg
|
||||
import adapter_fictionpresscom
|
||||
import adapter_ficwadcom
|
||||
import adapter_fimfictionnet
|
||||
import adapter_harrypotterfanfictioncom
|
||||
import adapter_mediaminerorg
|
||||
import adapter_potionsandsnitchesnet
|
||||
import adapter_tenhawkpresentscom
|
||||
import adapter_adastrafanficcom
|
||||
import adapter_thewriterscoffeeshopcom
|
||||
import adapter_tthfanficorg
|
||||
import adapter_twilightednet
|
||||
import adapter_twiwritenet
|
||||
import adapter_whoficcom
|
||||
import adapter_siyecouk
|
||||
import adapter_archiveofourownorg
|
||||
import adapter_ficbooknet
|
||||
import adapter_gayauthorsorg
|
||||
import adapter_portkeyorg
|
||||
import adapter_mugglenetcom
|
||||
import adapter_hpfandomnet
|
||||
import adapter_thequidditchpitchorg
|
||||
|
||||
## This bit of complexity allows adapters to be added just by adding
## an import. It eliminates the long if/else clauses we used to need
## to pick out the adapter.
|
||||
|
||||
## List of registered site adapters.
|
||||
__class_list = []
|
||||
|
||||
def imports():
    for name, val in globals().items():
        if isinstance(val, types.ModuleType):
            yield val.__name__

for x in imports():
    if "fanficdownloader.adapters.adapter_" in x:
        #print x
        __class_list.append(sys.modules[x].getClass())
|
||||
|
||||
def getAdapter(config,url,fileform=None):
    ## fix up leading protocol.
    fixedurl = re.sub(r"(?i)^[htp]+[:/]+","http://",url.strip())
    if not fixedurl.startswith("http"):
        fixedurl = "http://%s"%url
    ## remove any trailing '#' locations.
    fixedurl = re.sub(r"#.*$","",fixedurl)

    ## remove any trailing '&' parameters--?sid=999 will be left.
    ## that's all that any of the current adapters need or want.
    fixedurl = re.sub(r"&.*$","",fixedurl)

    parsedUrl = up.urlparse(fixedurl)
    domain = parsedUrl.netloc.lower()
    if( domain != parsedUrl.netloc ):
        fixedurl = fixedurl.replace(parsedUrl.netloc,domain)

    logging.debug("site:"+domain)
    cls = getClassFor(domain)
    if not cls:
        logging.debug("trying site:www."+domain)
        cls = getClassFor("www."+domain)
        fixedurl = fixedurl.replace("http://","http://www.")
    if cls:
        adapter = cls(config,fixedurl) # raises InvalidStoryURL
        adapter.setSectionOrder(adapter.getSiteDomain(),fileform)
        return adapter
    # No adapter found.
    raise exceptions.UnknownSite( url, [cls.getSiteDomain() for cls in __class_list] )
|
||||
|
||||
def getClassFor(domain):
    for cls in __class_list:
        if cls.matchesSite(domain):
            return cls
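
The URL clean-up steps in getAdapter can be tried in isolation with a small sketch. This is a Python 3 port written for illustration; the function name `normalize_story_url` is hypothetical and the adapter lookup is left out, so it only shows what happens to the URL string before a site class is chosen.

```python
import re
from urllib.parse import urlparse

def normalize_story_url(url):
    """Return (fixed_url, domain) after the clean-up steps in getAdapter."""
    # Rewrite a leading run of h/t/p letters followed by ':' or '/' to
    # 'http://' (covers 'HTTP://', 'htp:/'); 'https://' does not match
    # this pattern and is left as it is.
    fixed = re.sub(r"(?i)^[htp]+[:/]+", "http://", url.strip())
    if not fixed.startswith("http"):
        fixed = "http://%s" % url.strip()
    # Drop any '#fragment', then everything from the first '&' on,
    # so '?sid=999' is kept.
    fixed = re.sub(r"#.*$", "", fixed)
    fixed = re.sub(r"&.*$", "", fixed)
    # Lower-case only the host part of the URL.
    netloc = urlparse(fixed).netloc
    domain = netloc.lower()
    if domain != netloc:
        fixed = fixed.replace(netloc, domain)
    return fixed, domain
```

The returned domain is what would be handed to getClassFor; if no class matches, the original code retries with "www." prepended.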
|
||||
228
fanficdownloader/adapters/adapter_adastrafanficcom.py
Normal file
|
|
@ -0,0 +1,228 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class AdAstraFanficComSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','aaff')
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.story.addToList("category","Star Trek")
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.adastrafanfic.com'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
addurl = "&warning=5"
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
url = self.url+'&index=1'+addurl
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if "Content is only suitable for mature adults. May contain explicit language and adult themes. Equivalent of NC-17." in data:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
# problems with some stories, but only in calibre. I suspect
|
||||
# issues with different SGML parsers in python. This is a
|
||||
# nasty hack, but it works.
|
||||
data = data[data.index("<body"):]
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
## <meta name='description' content='<p>Description</p> ...' >
|
||||
## Summary, strangely, is in the content attr of a <meta name='description'> tag
|
||||
## which is escaped HTML. Unfortunately, we can't use it because they don't
|
||||
## escape (') chars in the desc, breaking the tag.
|
||||
#meta_desc = soup.find('meta',{'name':'description'})
|
||||
#metasoup = bs.BeautifulStoneSoup(meta_desc['content'])
|
||||
#self.story.setMetadata('description',stripHTML(metasoup))
|
||||
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.findAll('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ''
|
||||
while value and not defaultGetattr(value,'class') == 'label':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
# sometimes poorly formatted desc (<p> w/o </p>) leads
|
||||
# to all labels being included.
|
||||
svalue=svalue[:svalue.find('<span class="label">')]
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1'))
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
if 'Warnings' in label:
|
||||
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2'))
|
||||
warningstext = [warning.string for warning in warnings]
|
||||
self.warning = ', '.join(warningstext)
|
||||
for warning in warningstext:
|
||||
self.story.addToList('warnings',warning.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(value.strip(), "%d %b %Y"))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(value.strip(), "%d %b %Y"))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data = self._fetchUrl(url)
|
||||
# problems with some stories, but only in calibre. I suspect
|
||||
# issues with different SGML parsers in python. This is a
|
||||
# nasty hack, but it works.
|
||||
data = data[data.index("<body"):]
|
||||
|
||||
soup = bs.BeautifulStoneSoup(data,
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
span = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if None == span:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,span)
|
||||
|
||||
def getClass():
|
||||
return AdAstraFanficComSiteAdapter
|
||||
|
||||
263
fanficdownloader/adapters/adapter_archiveofourownorg.py
Normal file
|
|
@ -0,0 +1,263 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
def getClass():
|
||||
return ArchiveOfOurOwnOrgAdapter
|
||||
|
||||
|
||||
class ArchiveOfOurOwnOrgAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["utf8",
|
||||
"Windows-1252"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
|
||||
|
||||
|
||||
# get storyId from url--url validation guarantees path is /works/1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/works/'+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','ao3')
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%Y-%b-%d"
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'www.archiveofourown.org'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/works/123456"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/works/")+r"\d+(/chapters/\d+)?/?$"
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
addurl = "?view_adult=true"
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
meta = self.url+addurl
|
||||
url = self.url+'/navigate'+addurl
|
||||
logging.debug("URL: "+meta)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
meta = self._fetchUrl(meta)
|
||||
|
||||
if "This work could have adult content. If you proceed you have agreed that you are willing to see such content." in meta:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
metasoup = bs.BeautifulSoup(meta)
|
||||
# print data
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r"^/works/\w+"))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"^/users/\w+/pseuds/\w+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('/')[2])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+a['href'])
|
||||
self.story.setMetadata('author',a.text)
|
||||
|
||||
# Find the chapters:
|
||||
chapters=soup.findAll('a', href=re.compile(r'/works/'+self.story.getMetadata('storyId')+"/chapters/\d+$"))
|
||||
self.story.setMetadata('numChapters',len(chapters))
|
||||
logging.debug("numChapters: (%s)"%self.story.getMetadata('numChapters'))
|
||||
for x in range(0,len(chapters)):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
chapter=chapters[x]
|
||||
if len(chapters)==1:
|
||||
self.chapterUrls.append((self.story.getMetadata('title'),'http://'+self.host+chapter['href']+addurl))
|
||||
else:
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+chapter['href']+addurl))
|
||||
|
||||
|
||||
|
||||
a = metasoup.find('blockquote',{'class':'userstuff'})
|
||||
if a != None:
|
||||
self.setDescription(url,a.text)
|
||||
#self.story.setMetadata('description',a.text)
|
||||
|
||||
a = metasoup.find('dd',{'class':"rating tags"})
|
||||
if a != None:
|
||||
self.story.setMetadata('rating',stripHTML(a.text))
|
||||
|
||||
a = metasoup.find('dd',{'class':"fandom tags"})
|
||||
fandoms = a.findAll('a',{'class':"tag"})
|
||||
for fandom in fandoms:
|
||||
self.story.addToList('category',fandom.string)
|
||||
|
||||
a = metasoup.find('dd',{'class':"warning tags"})
|
||||
if a != None:
|
||||
warnings = a.findAll('a',{'class':"tag"})
|
||||
for warning in warnings:
|
||||
if warning.string == "Author Chose Not To Use Archive Warnings":
|
||||
warning.string = "No Archive Warnings Apply"
|
||||
if warning.string != "No Archive Warnings Apply":
|
||||
self.story.addToList('warnings',warning.string)
|
||||
|
||||
a = metasoup.find('dd',{'class':"freeform tags"})
|
||||
if a != None:
|
||||
genres = a.findAll('a',{'class':"tag"})
|
||||
for genre in genres:
|
||||
self.story.addToList('genre',genre.string)
|
||||
a = metasoup.find('dd',{'class':"category tags"})
|
||||
if a != None:
|
||||
genres = a.findAll('a',{'class':"tag"})
|
||||
for genre in genres:
|
||||
if genre.string != "Gen":
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
a = metasoup.find('dd',{'class':"character tags"})
|
||||
if a != None:
|
||||
chars = a.findAll('a',{'class':"tag"})
|
||||
for char in chars:
|
||||
self.story.addToList('characters',char.string)
|
||||
a = metasoup.find('dd',{'class':"relationship tags"})
|
||||
if a != None:
|
||||
chars = a.findAll('a',{'class':"tag"})
|
||||
for char in chars:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
|
||||
stats = metasoup.find('dl',{'class':'stats'})
|
||||
dt = stats.findAll('dt')
|
||||
dd = stats.findAll('dd')
|
||||
for x in range(0,len(dt)):
|
||||
label = dt[x].text
|
||||
value = dd[x].text
|
||||
|
||||
if 'Words:' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Chapters:' in label:
|
||||
if value.split('/')[0] == value.split('/')[1]:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Completed' in label:
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = metasoup.find('dd',{'class':"series"})
|
||||
b = a.find('a', href=re.compile(r"/series/\d+"))
|
||||
series_name = b.string
|
||||
series_url = 'http://'+self.host+'/fanfic/'+b['href']
|
||||
series_index = int(a.text.split(' ')[1])
|
||||
self.setSeries(series_name, series_index)
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
chapter=bs.BeautifulSoup('<div class="story"></div>')
|
||||
data = self._fetchUrl(url)
|
||||
soup = bs.BeautifulSoup(data,selfClosingTags=('br','hr'))
|
||||
|
||||
headnotes = soup.find('div', {'class' : "preface group"}).find('div', {'class' : "notes module"})
|
||||
if headnotes != None:
|
||||
headnotes = headnotes.find('blockquote', {'class' : "userstuff"})
|
||||
if headnotes != None:
|
||||
chapter.append("<b>Author's Note:</b>")
|
||||
chapter.append(headnotes)
|
||||
|
||||
chapsumm = soup.find('div', {'id' : "summary"})
|
||||
if chapsumm != None:
|
||||
chapsumm = chapsumm.find('blockquote')
|
||||
chapter.append("<b>Summary for the Chapter:</b>")
|
||||
chapter.append(chapsumm)
|
||||
chapnotes = soup.find('div', {'id' : "notes"})
|
||||
if chapnotes != None:
|
||||
chapnotes = chapnotes.find('blockquote')
|
||||
if chapnotes != None:
|
||||
chapter.append("<b>Notes for the Chapter:</b>")
|
||||
chapter.append(chapnotes)
|
||||
|
||||
text = soup.find('div', {'class' : "userstuff module"})
|
||||
chtext = text.find('h3', {'class' : "landmark heading"})
|
||||
if chtext:
|
||||
chtext.extract()
|
||||
chapter.append(text)
|
||||
|
||||
chapfoot = soup.find('div', {'class' : "end notes module", 'role' : "complementary"})
|
||||
if chapfoot != None:
|
||||
chapfoot = chapfoot.find('blockquote')
|
||||
chapter.append("<b>Notes for the Chapter:</b>")
|
||||
chapter.append(chapfoot)
|
||||
|
||||
footnotes = soup.find('div', {'id' : "work_endnotes"})
|
||||
if footnotes != None:
|
||||
footnotes = footnotes.find('blockquote')
|
||||
chapter.append("<b>Author's Note:</b>")
|
||||
chapter.append(footnotes)
|
||||
|
||||
if None == soup:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,chapter)
|
||||
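The 'Chapters:' branch above marks a story as complete when both halves of the `x/y` value match. A minimal standalone sketch of that rule follows, assuming the value looks like `10/10`, `3/10`, or `3/?`. The helper name `status_from_chapters` is hypothetical and not part of the adapter. Unlike the adapter code, it treats a value with no `/` as in progress rather than raising an error.

```python
def status_from_chapters(value):
    """Return 'Completed' when posted == planned, else 'In-Progress'.

    Hypothetical helper illustrating the comparison done in the adapter.
    """
    posted, _, planned = value.strip().partition('/')
    # A missing or unknown ('?') planned count never equals the posted count.
    if planned and posted == planned:
        return 'Completed'
    return 'In-Progress'


print(status_from_chapters('10/10'))  # Completed
print(status_from_chapters('3/?'))    # In-Progress
```

Using `partition` instead of two `split('/')` calls avoids an `IndexError` when the site omits the planned chapter count.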
|
|
@ -1,301 +1,310 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import re
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
# py2 vs py3 transition
|
||||
from ..six import text_type as unicode
|
||||
|
||||
from .base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# By virtue of being recent and requiring both is_adult and user/pass,
|
||||
# adapter_fanficcastletvnet.py is the best choice for learning to
|
||||
# write adapters--especially for sites that use the eFiction system.
|
||||
# Most sites that have ".../viewstory.php?sid=123" in the story URL
|
||||
# are eFiction.
|
||||
|
||||
# For non-eFiction sites, it can be considerably more complex, but
|
||||
# this is still a good starting point.
|
||||
|
||||
# In general an 'adapter' needs to do these six things:
|
||||
|
||||
# - 'Register' correctly with the downloader
|
||||
# - Site Login (if needed)
|
||||
# - 'Are you adult?' check (if needed--some do one, some the other, some both)
|
||||
# - Grab the chapter list
|
||||
# - Grab the story meta-data (some (non-eFiction) adapters have to get it from the author page)
|
||||
# - Grab the chapter texts
|
||||
|
||||
# Search for XXX comments--that's where things are most likely to need changing.
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return SheppardWeirComAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is to camel-case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class SheppardWeirComAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
|
||||
|
||||
# normalized story URL.
|
||||
# XXX Most sites don't have the /fanfics part; a replace-all usually removes it.
|
||||
self._setURL('https://' + self.getSiteDomain() + '/fanfics/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','swf') # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%B %d, %Y" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Include www here if the site uses it.
|
||||
return 'sheppardweir.com' # XXX
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(self):
|
||||
return "https://"+self.getSiteDomain()+"/fanfics/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return r"https?://"+re.escape(self.getSiteDomain()+"/fanfics/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
## Login seems to be reasonably standard across eFiction sites.
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'https://' + self.getSiteDomain() + '/fanfics/user.php?action=login'
|
||||
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self.post_request(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logger.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
# Weirdly, different sites use different warning numbers.
|
||||
# If the title search below fails, there's a good chance
|
||||
# you need a different number. print data at that point
|
||||
# and see what the 'click here to continue' url says.
|
||||
addurl = "&ageconsent=ok&warning=4" # XXX
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
# index=1 makes sure we see the story chapter index. Some
|
||||
# sites skip that for one-chapter stories.
|
||||
url = self.url+'&index=1'+addurl
|
||||
logger.debug("URL: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
|
||||
if self.needToLoginCheck(data):
|
||||
# need to log in for this one.
|
||||
self.performLogin(url)
|
||||
data = self.get_request(url)
|
||||
|
||||
# The actual text that is used to announce you need to be an
|
||||
# adult varies from site to site. Again, print data before
|
||||
# the title search to troubleshoot.
|
||||
if "Age Consent Required" in data: # XXX
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
soup = self.make_soup(data)
|
||||
# print data
|
||||
|
||||
|
||||
pagetitle = soup.find('div',{'id':'pagetitle'})
|
||||
## Title
|
||||
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',stripHTML(a))
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
# (fetch multiple authors)
|
||||
alist = soup.find_all('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
for a in alist:
|
||||
self.story.addToList('authorId',a['href'].split('=')[1])
|
||||
self.story.addToList('authorUrl','https://'+self.host+'/fanfics/'+a['href'])
|
||||
self.story.addToList('author',a.string)
|
||||
|
||||
|
||||
# Reviews
|
||||
reviewdata = soup.find('div', {'id' : 'sort'})
|
||||
a = reviewdata.find_all('a', href=re.compile(r'reviews.php\?type=ST&(amp;)?item='+self.story.getMetadata('storyId')+"$"))[1] # second one.
|
||||
self.story.setMetadata('reviews',stripHTML(a))
|
||||
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.add_chapter(chapter,'https://'+self.host+'/fanfics/'+chapter['href']+addurl)
|
||||
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
|
||||
# utility method
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# Summary
|
||||
summarydata = unicode(soup.find('div',{'class':'content'}))
|
||||
start='<span class="label">Summary: </span>'
|
||||
end='</div>'
|
||||
summarydata = summarydata[summarydata.index(start)+len(start):summarydata.rindex(end)]
|
||||
self.setDescription(url,self.make_soup(summarydata))
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.find_all('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
## Not all sites use Genre, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=1')) # XXX
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
## Not all sites use Warnings, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Warnings' in label:
|
||||
warnings = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
warningstext = [warning.string for warning in warnings]
|
||||
self.warning = ', '.join(warningstext)
|
||||
for warning in warningstext:
|
||||
self.story.addToList('warnings',warning.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
value=value.replace(' - ','')
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'https://'+self.host+'/fanfics/'+a['href']
|
||||
|
||||
seriessoup = self.make_soup(self.get_request(series_url))
|
||||
storyas = seriessoup.find_all('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
self.story.setMetadata('seriesUrl',series_url)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logger.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = self.make_soup(self.get_request(url))
|
||||
|
||||
div = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if None == div:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
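The adapter above validates story URLs with a pattern built from the site domain and then reads the story id out of the query string. The sketch below reproduces those two steps outside the adapter, assuming Python 3 and the `sheppardweir.com` domain shown above. The function names `site_url_pattern` and `story_id` are hypothetical stand-ins for `getSiteURLPattern()` and the `__init__` logic.

```python
import re
from urllib.parse import urlparse

SITE_DOMAIN = 'sheppardweir.com'  # same value getSiteDomain() returns above


def site_url_pattern(domain=SITE_DOMAIN):
    # http or https, then the escaped literal path, then a numeric sid and nothing else.
    return r"https?://" + re.escape(domain + "/fanfics/viewstory.php?sid=") + r"\d+$"


def story_id(url):
    # The pattern guarantees the query is exactly 'sid=1234',
    # so the id is whatever follows the '='.
    return urlparse(url).query.split('=')[1]


url = "https://sheppardweir.com/fanfics/viewstory.php?sid=1234"
assert re.match(site_url_pattern(), url)
# Extra query parameters are rejected by the trailing '$'.
assert not re.match(site_url_pattern(), url + "&chapter=2")
print(story_id(url))  # 1234
```

Because the pattern ends in `\d+$`, chapter URLs and URLs carrying age-consent parameters do not match; the adapter appends those itself after normalizing the story URL.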
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# By virtue of being recent and requiring both is_adult and user/pass,
|
||||
# adapter_fanficcastletvnet.py is the best choice for learning to
|
||||
# write adapters--especially for sites that use the eFiction system.
|
||||
# Most sites that have ".../viewstory.php?sid=123" in the story URL
|
||||
# are eFiction.
|
||||
|
||||
# For non-eFiction sites, it can be considerably more complex, but
|
||||
# this is still a good starting point.
|
||||
|
||||
# In general an 'adapter' needs to do these six things:
|
||||
|
||||
# - 'Register' correctly with the downloader
|
||||
# - Site Login (if needed)
|
||||
# - 'Are you adult?' check (if needed--some do one, some the other, some both)
|
||||
# - Grab the chapter list
|
||||
# - Grab the story meta-data (some (non-eFiction) adapters have to get it from the author page)
|
||||
# - Grab the chapter texts
|
||||
|
||||
# Search for XXX comments--that's where things are most likely to need changing.
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return CastleFansOrgAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is to camel-case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class CastleFansOrgAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
# XXX Most sites don't have the /fanfic part; a replace-all usually removes it.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/fanfic/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','cslf') # XXX
|
||||
|
||||
# If all stories from the site fall into the same category,
|
||||
# the site itself isn't likely to label them as such, so we
|
||||
# do.
|
||||
self.story.addToList("category","Castle") # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%b %d, %Y" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Include www here if the site uses it.
|
||||
return 'castlefans.org' # XXX
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/fanfic/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/fanfic/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
## Login seems to be reasonably standard across eFiction sites.
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/fanfic/user.php?action=login'
|
||||
logging.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self._fetchUrl(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logging.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
# Weirdly, different sites use different warning numbers.
|
||||
# If the title search below fails, there's a good chance
|
||||
# you need a different number. print data at that point
|
||||
# and see what the 'click here to continue' url says.
|
||||
addurl = "&ageconsent=ok&warning=4" # XXX
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
# index=1 makes sure we see the story chapter index. Some
|
||||
# sites skip that for one-chapter stories.
|
||||
url = self.url+'&index=1'+addurl
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if self.needToLoginCheck(data):
|
||||
# need to log in for this one.
|
||||
self.performLogin(url)
|
||||
data = self._fetchUrl(url)
|
||||
|
||||
# The actual text that is used to announce you need to be an
|
||||
# adult varies from site to site. Again, print data before
|
||||
# the title search to troubleshoot.
|
||||
if "Age Consent Required" in data: # XXX
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
# print data
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/fanfic/'+chapter['href']+addurl))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
|
||||
# utility method
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.findAll('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while not defaultGetattr(value,'class') == 'label':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
## Not all sites use Genre, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
## Not all sites use Warnings, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Warnings' in label:
|
||||
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
warningstext = [warning.string for warning in warnings]
|
||||
self.warning = ', '.join(warningstext)
|
||||
for warning in warningstext:
|
||||
self.story.addToList('warnings',warning.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/fanfic/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
div = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if None == div:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
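The two adapter versions above set different `dateformat` strings: `"%B %d, %Y"` (full month name) and `"%b %d, %Y"` (abbreviated month name). The sketch below shows what each format accepts. It uses `datetime.strptime` as a stand-in for the project's `makeDate()` helper, whose exact behavior is not shown here, and assumes English month names. The function name `parse_site_date` is hypothetical.

```python
from datetime import datetime


def parse_site_date(text, dateformat):
    # Stand-in for makeDate(); strips whitespace left over from HTML
    # and parses with the adapter's site-specific format string.
    return datetime.strptime(text.strip(), dateformat)


# Full month name, as in the "%B %d, %Y" adapter.
print(parse_site_date(" January 05, 2012 ", "%B %d, %Y").date())  # 2012-01-05

# Abbreviated month name, as in the "%b %d, %Y" adapter.
print(parse_site_date("Jan 05, 2012", "%b %d, %Y").date())  # 2012-01-05
```

If the title and author lookups succeed but the date lines raise `ValueError`, a mismatched `dateformat` is a likely cause, which is why the adapters mark that line with `# XXX`.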
278
fanficdownloader/adapters/adapter_fanfictionnet.py
Normal file
|
|
@ -0,0 +1,278 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
import time
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class FanFictionNetSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','ffnet')
|
||||
|
||||
# get storyId from url--url validation guarantees second part is storyId
|
||||
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL("http://"+self.getSiteDomain()\
|
||||
+"/s/"+self.story.getMetadata('storyId')+"/1/")
|
||||
|
||||
# ffnet update emails have the latest chapter URL.
|
||||
# Frequently, when they arrive, not all the servers have the
|
||||
# latest chapter yet and going back to chapter 1 to pull the
|
||||
# chapter list doesn't get the latest. So save and use the
|
||||
# original URL given to pull chapter list & metadata.
|
||||
self.origurl = url
|
||||
if "http://m." in self.origurl:
|
||||
## accept m(mobile)url, but use www.
|
||||
self.origurl = self.origurl.replace("http://m.","http://www.")
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.fanfiction.net'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.fanfiction.net','m.fanfiction.net']
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://www.fanfiction.net/s/1234/1/ http://www.fanfiction.net/s/1234/12/ http://www.fanfiction.net/s/1234/1/Story_Title"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return r"http://(www|m)?\.fanfiction\.net/s/\d+(/\d+)?(/|/[a-zA-Z0-9_-]+)?/?$"
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
# fetch the chapter. From that we will get almost all the
|
||||
# metadata and chapter list
|
||||
|
||||
url = self.origurl
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
#print("\n===================\n%s\n===================\n"%data)
|
||||
soup = bs.BeautifulSoup(data)
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if "Unable to locate story with id of " in data:
|
||||
raise exceptions.StoryDoesNotExist(url)
|
||||
|
||||
# sometimes "Chapter not found...", sometimes "Chapter text not found..."
|
||||
if "not found. Please check to see you are not using an outdated url." in data:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! 'Chapter not found. Please check to see you are not using an outdated url.'" % url)
|
||||
|
||||
try:
|
||||
# rather nasty way to check for a newer chapter. ffnet has a
|
||||
# tendency to send out update notices in email before all
|
||||
# their servers are showing the update on the first chapter.
|
||||
try:
|
||||
chapcount = len(soup.find('select', { 'name' : 'chapter' } ).findAll('option'))
|
||||
# get chapter part of url.
|
||||
except:
|
||||
chapcount = 1
|
||||
chapter = url.split('/',)[5]
|
||||
tryurl = "http://%s/s/%s/%d/"%(self.getSiteDomain(),
|
||||
self.story.getMetadata('storyId'),
|
||||
chapcount+1)
|
||||
print('=Trying newer chapter: %s' % tryurl)
|
||||
newdata = self._fetchUrl(tryurl)
|
||||
if "not found. Please check to see you are not using an outdated url." \
|
||||
not in newdata:
|
||||
print('=======Found newer chapter: %s' % tryurl)
|
||||
soup = bs.BeautifulSoup(newdata)
|
||||
except:
|
||||
pass
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"^/u/\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('/')[2])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
|
||||
# start by finding a script towards the bottom that has a
|
||||
# bunch of useful stuff in it.
|
||||
|
||||
# var storyid = 6577076;
|
||||
# var chapter = 1;
|
||||
# var chapters = 17;
|
||||
# var words = 42787;
|
||||
# var userid = 2645830;
|
||||
# var title = 'The+Invitation';
|
||||
# var title_t = 'The Invitation';
|
||||
# var summary = 'Dudley Dursley would be the first to say he lived a very normal life. But what happens when he gets invited to his cousin Harry Potter\'s wedding? Will Dudley get the courage to apologize for the torture he caused all those years ago? Harry/Ginny story.';
|
||||
# var categoryid = 224;
|
||||
# var cat_title = 'Harry Potter';
|
||||
# var datep = '12-21-10';
|
||||
# var dateu = '04-06-11';
|
||||
# var author = 'U n F a b u l o u s M e';
|
||||
|
||||
for script in soup.findAll('script', src=None):
|
||||
if not script:
|
||||
continue
|
||||
if not script.string:
|
||||
continue
|
||||
if 'var storyid' in script.string:
|
||||
for line in script.string.split('\n'):
|
||||
m = re.match(r"^ +var ([^ ]+) = '?(.*?)'?;\r?$",line)
|
||||
if m is None: continue
|
||||
var,value = m.groups()
|
||||
# remove javascript escaping from values.
|
||||
value = re.sub(r'\\(.)',r'\1',value)
|
||||
#print var,value
|
||||
if 'words' in var:
|
||||
self.story.setMetadata('numWords', value)
|
||||
if 'title_t' in var:
|
||||
self.story.setMetadata('title', value)
|
||||
if 'summary' in var:
|
||||
self.setDescription(url,value)
|
||||
#self.story.setMetadata('description', value)
|
||||
if 'datep' in var:
|
||||
self.story.setMetadata('datePublished',makeDate(value, '%m-%d-%y'))
|
||||
if 'dateu' in var:
|
||||
self.story.setMetadata('dateUpdated',makeDate(value, '%m-%d-%y'))
|
||||
if 'cat_title' in var:
|
||||
if "Crossover" in value:
|
||||
value = re.sub(r' Crossover$','',value)
|
||||
for c in value.split(' and '):
|
||||
self.story.addToList('category',c)
|
||||
# Screws up when the category itself
|
||||
# contains ' and '. But that's rare
|
||||
# and the only alternative is to find
|
||||
# the 'Crossover' category URL and
|
||||
# parse that page to search for <a>
|
||||
# with href /crossovers/(name)/(num)/
|
||||
# <a href="/crossovers/Harry_Potter/224/">Harry Potter</a>
|
||||
# <a href="/crossovers/Naruto/1402/">Naruto</a>
|
||||
else:
|
||||
self.story.addToList('category',value)
|
||||
break # for script in soup.findAll('script', src=None):
|
||||
|
||||
# Find the chapter selector
|
||||
select = soup.find('select', { 'name' : 'chapter' } )
|
||||
|
||||
if select is None:
|
||||
# no selector found, so it's a one-chapter story.
|
||||
self.chapterUrls.append((self.story.getMetadata('title'),url))
|
||||
else:
|
||||
allOptions = select.findAll('option')
|
||||
for o in allOptions:
|
||||
url = u'http://%s/s/%s/%s/' % ( self.getSiteDomain(),
|
||||
self.story.getMetadata('storyId'),
|
||||
o['value'])
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
title = u"%s" % o
|
||||
title = re.sub(r'<[^>]+>','',title)
|
||||
self.chapterUrls.append((title,url))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
## Pull some additional data from html. Find Rating and look around it.
|
||||
|
||||
a = soup.find('a', href='http://www.fictionratings.com/')
|
||||
self.story.setMetadata('rating',a.string)
|
||||
|
||||
# used below to get correct characters.
|
||||
metatext = a.findNext(text=re.compile(r' - Reviews:'))
|
||||
if metatext is None: # no Reviews text found, so look for id: instead.
|
||||
metatext = a.findNext(text=re.compile(r' - id:'))
|
||||
|
||||
# after Rating, the same bit of text containing id:123456 contains
|
||||
# Complete--if completed.
|
||||
if 'Complete' in a.findNext(text=re.compile(r'id:'+self.story.getMetadata('storyId'))):
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
# Parse genre(s) from <meta name="description" content="..."
|
||||
# <meta name="description" content="A Transformers/Beast Wars - Humor fanfiction with characters Prowl & Sideswipe. Story summary: Sideswipe is bored. Prowl appears to be so, too or at least, Sideswipe thinks he looks bored . So Sideswipe entertains them. After all, what's more fun than a race? Song-fic.">
|
||||
# <meta name="description" content="Chapter 1 of a Transformers/Beast Wars - Adventure/Friendship fanfiction with characters Bumblebee. TFA: What would you do if you was being abused all you life? Follow NightRunner as she goes through her spark breaking adventure of getting away from her father..">
|
||||
# (fp)<meta name="description" content="Chapter 1 of a Sci-Fi - Adventure/Humor fiction. Felix Max was just your regular hyperactive kid until he accidently caused his own fathers death. Now he has meta-humans trying to hunt him down with a corrupt goverment to back them up. Oh, and did I mention he has no Powers yet?.">
|
||||
# <meta name="description" content="Chapter 1 of a Bleach - Adventure/Angst fanfiction with characters Ichigo K. & Neliel T. O./Nel. Time travel with a twist. Time can be a real bi***. Ichigo finds that fact out when he accidentally goes back in time. Is this his second chance or is fate just screwing with him. Not a crack fic.IchixNelXHime.">
|
||||
# <meta name="description" content="Chapter 1 of a Harry Potter and Transformers - Humor/Adventure crossover fanfiction with characters: Harry P. & Ironhide. IT’s one thing to be tossed thru the Veil for something he didn’t do. It was quite another to wake in his animigus form in a world not his own. Harry just knew someone was laughing at him somewhere. Mech/Mech pairings inside..">
|
||||
m = re.match(r"^(?:Chapter \d+ of a|A) (?:.*?) (?:- (?P<genres>.*?) )?(?:crossover )?(?:fan)?fiction(?P<chars>[ ]+with characters)?",
|
||||
soup.find('meta',{'name':'description'})['content'])
|
||||
if m is not None:
|
||||
genres=m.group('genres')
|
||||
if genres is not None:
|
||||
# Hurt/Comfort is one genre.
|
||||
genres=re.sub('Hurt/Comfort','Hurt-Comfort',genres)
|
||||
for g in genres.split('/'):
|
||||
self.story.addToList('genre',g)
|
||||
|
||||
if m.group('chars') is not None:
|
||||
|
||||
# At this point we've proven that there's character(s)
|
||||
# We can't reliably parse characters out of meta name="description".
|
||||
# There's no way to tell that "with characters Ichigo K. & Neliel T. O./Nel. " ends at "Nel.", not "T."
|
||||
# But we can pull them from the reviewstext line, now that we know about the existence of chars.
|
||||
# reviewstext can take form of:
|
||||
# - English - Shinji H. - Updated: 01-13-12 - Published: 12-20-11 - id:7654123
|
||||
# - English - Adventure/Angst - Ichigo K. & Neliel T. O./Nel - Reviews:
|
||||
# - English - Humor/Adventure - Harry P. & Ironhide - Reviews:
|
||||
mc = re.match(r" - (?P<lang>[^ ]+ - )(?P<genres>[^ ]+ - )? (?P<chars>.+?) - (Reviews|Updated|Published)",
|
||||
metatext)
|
||||
chars = mc.group("chars")
|
||||
for c in chars.split(' & '):
|
||||
self.story.addToList('characters',c)
|
||||
m = re.match(r" - (?P<lang>[^ ]+)",metatext)
|
||||
if m is not None and m.group('lang') is not None:
|
||||
self.story.setMetadata('language',m.group('lang'))
|
||||
|
||||
return
|
||||
|
||||
def getChapterText(self, url):
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
time.sleep(0.5) ## ffnet(and, I assume, fpcom) tends to fail
|
||||
## more if hit too fast. This is in
|
||||
## addition to whatever the
|
||||
## slow_down_sleep_time setting is.
|
||||
data = self._fetchUrl(url)
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
## Remove the 'share' button.
|
||||
sharediv = soup.find('div', {'class' : 'a2a_kit a2a_default_style'})
|
||||
if sharediv:
|
||||
sharediv.extract()
|
||||
else:
|
||||
logging.debug('share button div not found')
|
||||
|
||||
div = soup.find('div', {'id' : 'storytext'})
|
||||
|
||||
if div is None:
|
||||
logging.debug('div id=storytext not found. data:%s'%data)
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
|
||||
def getClass():
|
||||
return FanFictionNetSiteAdapter
|
||||
|
||||
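The fanfiction.net adapter above takes most of its metadata from inline `var name = value;` lines in a page script. The following is a minimal standalone sketch of that step, reusing the two regular expressions from the adapter. The function name `parse_js_vars` and the sample text are illustrative assumptions, not part of the project, and the sketch targets Python 3 rather than the Python 2 used in the diff.

```python
import re

# Same pattern the adapter applies to each script line: leading spaces,
# then "var NAME = VALUE;" with optional single quotes around VALUE.
VAR_LINE = re.compile(r"^ +var ([^ ]+) = '?(.*?)'?;\r?$")


def parse_js_vars(script_text):
    """Return a dict of the simple 'var name = value;' assignments found."""
    found = {}
    for line in script_text.split('\n'):
        m = VAR_LINE.match(line)
        if m is None:
            continue
        name, value = m.groups()
        # Undo JavaScript backslash escaping, e.g. \' becomes '.
        found[name] = re.sub(r'\\(.)', r'\1', value)
    return found


# Hypothetical sample modelled on the comment block in the adapter.
sample = "\n".join([
    "    var storyid = 6577076;",
    "    var title_t = 'The Invitation';",
    "    var summary = 'Harry Potter\\'s wedding';",
])
print(parse_js_vars(sample))
```

Every value comes back as a string; the adapter itself converts the date fields afterwards with `makeDate(value, '%m-%d-%y')`.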
222
fanficdownloader/adapters/adapter_ficbooknet.py
Normal file
|
|
@ -0,0 +1,222 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import datetime
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
from .. import translit
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
|
||||
def getClass():
|
||||
return FicBookNetAdapter
|
||||
|
||||
|
||||
class FicBookNetAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["utf8",
|
||||
"Windows-1252"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from the url path--url validation guarantees the form /readfic/1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/readfic/'+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','fbn')
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%d %m %Y"
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'www.ficbook.net'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/readfic/12345"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/readfic/")+r"\d+"
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
url=self.url
|
||||
logging.debug("URL: "+url)
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
|
||||
table = soup.find('td',{'width':'50%'})
|
||||
|
||||
## Title
|
||||
a = soup.find('h1')
|
||||
self.story.setMetadata('title',a.string)
|
||||
logging.debug("Title: (%s)"%self.story.getMetadata('title'))
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = table.find('a')
|
||||
self.story.setMetadata('authorId',a.text) # Author's name is unique
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.text)
|
||||
logging.debug("Author: (%s)"%self.story.getMetadata('author'))
|
||||
|
||||
# Find the chapters:
|
||||
chapters = soup.find('div', {'class' : 'part_list'})
|
||||
if chapters is not None:
|
||||
chapters=chapters.findAll('a', href=re.compile(r'/readfic/'+self.story.getMetadata('storyId')+r"/\d+#part_content$"))
|
||||
self.story.setMetadata('numChapters',len(chapters))
|
||||
for x in range(0,len(chapters)):
|
||||
chapter=chapters[x]
|
||||
churl='http://'+self.host+chapter['href']
|
||||
self.chapterUrls.append((stripHTML(chapter),churl))
|
||||
if x == 0:
|
||||
pubdate = translit.translit(stripHTML(bs.BeautifulSoup(self._fetchUrl(churl)).find('div', {'class' : 'part_added'}).find('span')))
|
||||
if x == len(chapters)-1:
|
||||
update = translit.translit(stripHTML(bs.BeautifulSoup(self._fetchUrl(churl)).find('div', {'class' : 'part_added'}).find('span')))
|
||||
else:
|
||||
self.chapterUrls.append((self.story.getMetadata('title'),url))
|
||||
self.story.setMetadata('numChapters',1)
|
||||
pubdate=translit.translit(stripHTML(soup.find('div', {'class' : 'part_added'}).find('span')))
|
||||
update=pubdate
|
||||
|
||||
logging.debug("numChapters: (%s)"%self.story.getMetadata('numChapters'))
|
||||
|
||||
if ',' not in pubdate:
|
||||
pubdate=datetime.date.today().strftime(self.dateformat)
|
||||
if ',' not in update:
|
||||
update=datetime.date.today().strftime(self.dateformat)
|
||||
pubdate=pubdate.split(',')[0]
|
||||
update=update.split(',')[0]
|
||||
|
||||
fullmon = {"yanvarya":"01", "января":"01",
|
||||
"fievralya":"02", "февраля":"02",
|
||||
"marta":"03", "марта":"03",
|
||||
"aprielya":"04", "апреля":"04",
|
||||
"maya":"05", "мая":"05",
|
||||
"iyunya":"06", "июня":"06",
|
||||
"iyulya":"07", "июля":"07",
|
||||
"avghusta":"08", "августа":"08",
|
||||
"sentyabrya":"09", "сентября":"09",
|
||||
"oktyabrya":"10", "октября":"10",
|
||||
"noyabrya":"11", "ноября":"11",
|
||||
"diekabrya":"12", "декабря":"12" }
|
||||
for (name,num) in fullmon.items():
|
||||
if name in pubdate:
|
||||
pubdate = pubdate.replace(name,num)
|
||||
if name in update:
|
||||
update = update.replace(name,num)
|
||||
|
||||
self.story.setMetadata('dateUpdated', makeDate(update, self.dateformat))
|
||||
self.story.setMetadata('datePublished', makeDate(pubdate, self.dateformat))
|
||||
self.story.setMetadata('language','Russian')
|
||||
|
||||
pr=soup.find('a', href=re.compile(r'/printfic/\w+'))
|
||||
pr='http://'+self.host+pr['href']
|
||||
pr = bs.BeautifulSoup(self._fetchUrl(pr))
|
||||
pr=pr.findAll('div', {'class' : 'part_text'})
|
||||
i=0
|
||||
for part in pr:
|
||||
i=i+len(stripHTML(part).split(' '))
|
||||
self.story.setMetadata('numWords', str(i))
|
||||
|
||||
i=0
|
||||
fandoms = table.findAll('a', href=re.compile(r'/fanfiction/\w+'))
|
||||
for fandom in fandoms:
|
||||
self.story.addToList('category',fandom.string)
|
||||
i=i+1
|
||||
if i > 1:
|
||||
self.story.addToList('genre', 'Кроссовер')
|
||||
|
||||
meta=table.findAll('a', href=re.compile(r'/ratings/'))
|
||||
i=0
|
||||
for m in meta:
|
||||
if i == 0:
|
||||
self.story.setMetadata('rating', m.find('b').text)
|
||||
i=1
|
||||
elif i == 1:
|
||||
if not "," in m.nextSibling:
|
||||
i=2
|
||||
self.story.addToList('genre', m.find('b').text)
|
||||
elif i == 2:
|
||||
self.story.addToList('warnings', m.find('b').text)
|
||||
|
||||
|
||||
if table.find('span', {'style' : 'color: green'}):
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
|
||||
tags = table.findAll('b')
|
||||
for tag in tags:
|
||||
label = translit.translit(tag.text)
|
||||
if 'Piersonazhi:' in label or 'Персонажи:' in label:
|
||||
chars=tag.nextSibling.string.split(', ')
|
||||
for char in chars:
|
||||
self.story.addToList('characters',char)
|
||||
break
|
||||
|
||||
summary=soup.find('span', {'class' : 'urlize'})
|
||||
self.setDescription(url,summary.text)
|
||||
#self.story.setMetadata('description', summary.text)
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
chapter = soup.find('div', {'class' : 'public_beta'})
|
||||
if chapter is None:
|
||||
chapter = soup.find('div', {'class' : 'public_beta_disabled'})
|
||||
|
||||
if chapter is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,chapter)
|
||||
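The ficbook.net adapter above turns Russian month names into numbers before handing the date to `makeDate` with the format `"%d %m %Y"`. The sketch below shows the same idea as a standalone function, with one deliberate difference: it looks up the month token directly instead of doing substring replacement. The name `parse_ru_date` and the sample string are assumptions for illustration; the "day month year, time" layout is inferred from how the adapter splits on the comma.

```python
import datetime

# Genitive Russian month names, taken from the adapter's month table.
MONTHS = {
    "января": 1, "февраля": 2, "марта": 3, "апреля": 4,
    "мая": 5, "июня": 6, "июля": 7, "августа": 8,
    "сентября": 9, "октября": 10, "ноября": 11, "декабря": 12,
}


def parse_ru_date(text):
    """Parse 'DD <month name> YYYY, ...' into a date, or return None."""
    if ',' not in text:
        # The adapter substitutes today's date in this case.
        return None
    # Keep only the part before the comma, then split into three tokens.
    day, month_name, year = text.split(',')[0].split()
    return datetime.date(int(year), MONTHS[month_name], int(day))


# Hypothetical input in the assumed layout.
print(parse_ru_date("5 марта 2012, 14:07"))
```

Looking up a whole token avoids any chance of one month name matching inside another word, at the cost of raising `KeyError` on an unknown month name.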
231
fanficdownloader/adapters/adapter_fictionalleyorg.py
Normal file
|
|
@ -0,0 +1,231 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class FictionAlleyOrgSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','fa')
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.story.addToList("category","Harry Potter")
|
||||
self.is_adult=False
|
||||
|
||||
# get authorId and storyId from the url by matching the site URL pattern.
|
||||
m = re.match(self.getSiteURLPattern(),url)
|
||||
if m:
|
||||
self.story.setMetadata('authorId',m.group('auth'))
|
||||
self.story.setMetadata('storyId',m.group('id'))
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
# normalized story URL.
|
||||
self._setURL(url)
|
||||
else:
|
||||
raise exceptions.InvalidStoryURL(url,
|
||||
self.getSiteDomain(),
|
||||
self.getSiteExampleURLs())
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.fictionalley.org'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/authors/drt/DA.html http://"+self.getSiteDomain()+"/authors/drt/JOTP01a.html"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
# http://www.fictionalley.org/authors/drt/DA.html
|
||||
# http://www.fictionalley.org/authors/drt/JOTP01a.html
|
||||
return re.escape("http://"+self.getSiteDomain())+r"/authors/(?P<auth>[a-zA-Z0-9_]+)/(?P<id>[a-zA-Z0-9_]+)\.html"
|
||||
|
||||
def _postFetchWithIAmOld(self,url):
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
params={'iamold':'Yes',
|
||||
'action':'ageanswer'}
|
||||
logging.info("Attempting to get cookie for %s" % url)
|
||||
## posting on list doesn't work, but doesn't hurt, either.
|
||||
data = self._postUrl(url,params)
|
||||
else:
|
||||
data = self._fetchUrl(url)
|
||||
return data
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
## could be either chapter list page or one-shot text page.
|
||||
url = self.url
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._postFetchWithIAmOld(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
chapterdata = data
|
||||
# If chapter list page, get the first chapter to look for adult check
|
||||
chapterlinklist = soup.findAll('a',{'class':'chapterlink'})
|
||||
if chapterlinklist:
|
||||
chapterdata = self._postFetchWithIAmOld(chapterlinklist[0]['href'])
|
||||
|
||||
if "Are you over seventeen years old" in chapterdata:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if not chapterlinklist:
|
||||
# no chapter list, chapter URL: change to list link.
|
||||
# second a tag inside div breadcrumbs
|
||||
storya = soup.find('div',{'class':'breadcrumbs'}).findAll('a')[1]
|
||||
self._setURL(storya['href'])
|
||||
url=self.url
|
||||
logging.debug("Normalizing to URL: "+url)
|
||||
## title's right there...
|
||||
self.story.setMetadata('title',storya.string)
|
||||
data = self._fetchUrl(url)
|
||||
soup = bs.BeautifulSoup(data)
|
||||
chapterlinklist = soup.findAll('a',{'class':'chapterlink'})
|
||||
else:
|
||||
## still need title from somewhere. If chapterlinklist,
|
||||
## then chapterdata contains a chapter, find title the
|
||||
## same way.
|
||||
chapsoup = bs.BeautifulSoup(chapterdata)
|
||||
storya = chapsoup.find('div',{'class':'breadcrumbs'}).findAll('a')[1]
|
||||
self.story.setMetadata('title',storya.string)
|
||||
del chapsoup
|
||||
|
||||
del chapterdata
|
||||
|
||||
## authorid already set.
|
||||
## <h1 class="title" align="center">Just Off The Platform II by <a href="http://www.fictionalley.org/authors/drt/">DrT</a></h1>
|
||||
authora=soup.find('h1',{'class':'title'}).find('a')
|
||||
self.story.setMetadata('author',authora.string)
|
||||
self.story.setMetadata('authorUrl',authora['href'])
|
||||
|
||||
if len(chapterlinklist) == 1:
|
||||
self.chapterUrls.append((self.story.getMetadata('title'),chapterlinklist[0]['href']))
|
||||
else:
|
||||
# Find the chapters:
|
||||
for chapter in chapterlinklist:
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),chapter['href']))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
## Go scrape the rest of the metadata from the author's page.
|
||||
data = self._fetchUrl(self.story.getMetadata('authorUrl'))
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
# <dl><dt><a class = "Rid story" href = "http://www.fictionalley.org/authors/aafro_man_ziegod/TMH.html">
|
||||
# [Rid] The Magical Hottiez</a> by <a class = "pen_name" href = "http://www.fictionalley.org/authors/aafro_man_ziegod/">Aafro Man Ziegod</a> </small></dt>
|
||||
# <dd><small class = "storyinfo"><a href = "http://www.fictionalley.org/ratings.html" target = "_new">Rating:</a> PG-13 - Spoilers: PS/SS, CoS, PoA, GoF, QTTA, FB - 4264 hits - 5060 words<br />
|
||||
# Genre: Humor, Romance - Main character(s): None - Ships: None - Era: Multiple Eras<br /></small>
|
||||
# Chaos ensues after Witch Weekly, seeking to increase readers, decides to create a boyband out of five seemingly talentless wizards: Harry Potter, Draco Malfoy, Ron Weasley, Neville Longbottom, and Oliver "Toss Your Knickers Here" Wood.<br />
|
||||
# <small class = "storyinfo">Published: June 3, 2002 (between Goblet of Fire and Order of Phoenix) - Updated: June 3, 2002</small>
|
||||
# </dd></dl>
|
||||
|
||||
storya = soup.find('a',{'href':self.story.getMetadata('storyUrl')})
|
||||
storydd = storya.findNext('dd')
|
||||
|
||||
# Rating: PG - Spoilers: None - 2525 hits - 736 words
|
||||
# Genre: Humor - Main character(s): H, R - Ships: None - Era: Multiple Eras
|
||||
# Harry and Ron are back at it again! They reeeeeeally don't want to be back, because they know what's awaiting them. "VH1 Goes Inside..." is back! Why? 'Cos there are soooo many more couples left to pick on.
|
||||
# Published: September 25, 2004 (between Order of Phoenix and Half-Blood Prince) - Updated: September 25, 2004
|
||||
|
||||
## change to text and regexp find.
|
||||
metastr = stripHTML(storydd).replace('\n',' ').replace('\t',' ')
|
||||
|
||||
m = re.match(r".*?Rating: (.+?) -.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('rating', m.group(1))
|
||||
|
||||
m = re.match(r".*?Genre: (.+?) -.*?",metastr)
|
||||
if m:
|
||||
for g in m.group(1).split(','):
|
||||
self.story.addToList('genre',g.strip())
|
||||
|
||||
m = re.match(r".*?Published: ([a-zA-Z]+ \d\d?, \d\d\d\d).*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('datePublished',makeDate(m.group(1), "%B %d, %Y"))
|
||||
|
||||
m = re.match(r".*?Updated: ([a-zA-Z]+ \d\d?, \d\d\d\d).*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('dateUpdated',makeDate(m.group(1), "%B %d, %Y"))
|
||||
|
||||
m = re.match(r".*? (\d+) words Genre.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('numWords', m.group(1))
|
||||
|
||||
for small in storydd.findAll('small'):
|
||||
small.extract() ## removes the <small> tags, leaving only the summary.
|
||||
self.setDescription(url,storydd)
|
||||
#self.story.setMetadata('description',stripHTML(storydd))
|
||||
|
||||
return
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data = self._fetchUrl(url)
|
||||
# find <!-- headerend --> & <!-- footerstart --> and
|
||||
# replace them with a matching tag pair for easier parsing.
|
||||
# Yes, it's an evil kludge, but what can ya do? Using
|
||||
# something other than div prevents soup from pairing
|
||||
# our div with poor html inside the story text.
|
||||
data = data.replace('<!-- headerend -->','<crazytagstringnobodywouldstumbleonaccidently id="storytext">').replace('<!-- footerstart -->','</crazytagstringnobodywouldstumbleonaccidently>')
|
||||
|
||||
# problems with some stories confusing Soup. This is a nasty
|
||||
# hack, but it works.
|
||||
data = data[data.index("<crazytagstringnobodywouldstumbleonaccidently"):]
|
||||
|
||||
soup = bs.BeautifulStoneSoup(data,
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
body = soup.findAll('body') ## some stories use a nested body and body
|
||||
## tag, in which case we don't
|
||||
## need crazytagstringnobodywouldstumbleonaccidently
|
||||
## and use the second one instead.
|
||||
if len(body)>1:
|
||||
text = body[1]
|
||||
text.name='div' # force to be a div to avoid multiple body tags.
|
||||
else:
|
||||
text = soup.find('crazytagstringnobodywouldstumbleonaccidently', {'id' : 'storytext'})
|
||||
text.name='div' # change to div tag.
|
||||
|
||||
if not data or not text:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,text)
|
||||
|
||||
def getClass():
|
||||
return FictionAlleyOrgSiteAdapter
|
||||
|
||||
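The FictionAlley adapter above flattens the story's `<dd>` block to text and then pulls each field out with a regular expression. A minimal sketch of that extraction follows, under a few stated assumptions: it uses `re.search` instead of the adapter's `re.match` with a leading `.*?`, it strips whitespace around each genre, and the function name `parse_storyinfo` is made up for illustration. The sample text is assembled from the example in the adapter's comments.

```python
import re
from datetime import datetime

DATE_FORMAT = "%B %d, %Y"  # same format string the adapter passes to makeDate


def parse_storyinfo(metastr):
    """Extract rating, genres, word count and dates from the flattened text."""
    info = {}
    m = re.search(r"Rating: (.+?) -", metastr)
    if m:
        info['rating'] = m.group(1)
    m = re.search(r"Genre: (.+?) -", metastr)
    if m:
        info['genres'] = [g.strip() for g in m.group(1).split(',')]
    m = re.search(r"(\d+) words", metastr)
    if m:
        info['words'] = int(m.group(1))
    for key in ('Published', 'Updated'):
        m = re.search(key + r": ([a-zA-Z]+ \d\d?, \d\d\d\d)", metastr)
        if m:
            info[key.lower()] = datetime.strptime(m.group(1), DATE_FORMAT).date()
    return info


sample = ("Rating: PG-13 - Spoilers: PS/SS, CoS - 4264 hits - 5060 words "
          "Genre: Humor, Romance - Main character(s): None - Ships: None - "
          "Era: Multiple Eras Published: June 3, 2002 (between Goblet of Fire "
          "and Order of Phoenix) - Updated: June 3, 2002")
print(parse_storyinfo(sample))
```

Month names are parsed with `%B`, which depends on the process locale; the sketch assumes the default English names.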
|
|
@ -1,58 +1,49 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import re
|
||||
|
||||
# py2 vs py3 transition
|
||||
|
||||
## They're from the same people and pretty much identical.
|
||||
from .adapter_fanfictionnet import FanFictionNetSiteAdapter
|
||||
|
||||
class FictionPressComSiteAdapter(FanFictionNetSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
FanFictionNetSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','fpcom')
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.fictionpress.com'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.fictionpress.com','m.fictionpress.com']
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(cls):
|
||||
return "https://www.fictionpress.com/s/1234/1/ https://www.fictionpress.com/s/1234/12/ http://www.fictionpress.com/s/1234/1/Story_Title http://m.fictionpress.com/s/1234/1/"
|
||||
|
||||
@classmethod
|
||||
def _get_site_url_pattern(cls):
|
||||
return r"https?://(www|m)?\.fictionpress\.com/s/(?P<id>\d+)(/\d+)?(/(?P<title>[^/]+))?/?$"
|
||||
|
||||
## normalized chapter URLs DO contain the story title now, but
|
||||
## normalized to current urltitle in case of title changes.
|
||||
def normalize_chapterurl(self,url):
|
||||
return re.sub(r"https?://(www|m)\.(?P<keep>fictionpress\.com/s/\d+/\d+/).*",
|
||||
r"https://www.\g<keep>",url)+self.urltitle
|
||||
|
||||
def getClass():
|
||||
return FictionPressComSiteAdapter
|
||||
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
import time
|
||||
|
||||
## They're from the same people and pretty much identical.
|
||||
from adapter_fanfictionnet import FanFictionNetSiteAdapter
|
||||
|
||||
class FictionPressComSiteAdapter(FanFictionNetSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
FanFictionNetSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','fpcom')
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.fictionpress.com'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.fictionpress.com','m.fictionpress.com']
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://www.fictionpress.com/s/1234/1/ http://www.fictionpress.com/s/1234/12/ http://www.fictionpress.com/s/1234/1/Story_Title"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return r"http://(www|m)?\.fictionpress\.com/s/\d+(/\d+)?(/|/[a-zA-Z0-9_-]+)?/?$"
|
||||
|
||||
def getClass():
|
||||
return FictionPressComSiteAdapter
|
||||
|
||||
|
|
@ -1,215 +1,217 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import re
|
||||
|
||||
from .. import exceptions as exceptions
|
||||
from ..htmlcleanup import stripHTML
|
||||
|
||||
# py2 vs py3 transition
|
||||
from ..six import text_type as unicode
|
||||
|
||||
from .base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class FicwadComSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','fw')
|
||||
|
||||
# get storyId from url--url validation guarantees second part is storyId
|
||||
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
|
||||
|
||||
self.username = "NoneGiven"
|
||||
self.password = ""
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'ficwad.com'
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(cls):
|
||||
return "https://ficwad.com/story/1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return r"https?:"+re.escape(r"//"+self.getSiteDomain())+r"/story/\d+?$"
|
||||
|
||||
def performLogin(self,url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['username'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['username'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
|
||||
loginUrl = 'https://' + self.getSiteDomain() + '/account/login'
|
||||
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['username']))
|
||||
d = self.post_request(loginUrl,params,usecache=False)
|
||||
|
||||
if "Login attempt failed..." in d or \
|
||||
'<div id="error">Please enter your username and password.</div>' in d:
|
||||
logger.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['username']))
|
||||
raise exceptions.FailedToLogin(url,params['username'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
# fetch the chapter. From that we will get almost all the
|
||||
# metadata and chapter list
|
||||
|
||||
url = self.url
|
||||
logger.debug("URL: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
# non-existent/removed story urls get thrown to the front page.
|
||||
if "<h4>Featured Story</h4>" in data:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
soup = self.make_soup(data)
|
||||
|
||||
# if blocked, attempt login.
|
||||
if soup.find("div",{"class":"blocked"}) or soup.find("li",{"class":"blocked"}):
|
||||
if self.performLogin(url): # performLogin raises
|
||||
# FailedToLogin if it fails.
|
||||
soup = self.make_soup(self.get_request(url,usecache=False))
|
||||
|
||||
divstory = soup.find('div',id='story')
|
||||
storya = divstory.find('a',href=re.compile(r"^/story/\d+$"))
|
||||
if storya : # if there's a story link in the divstory header, this is a chapter page.
|
||||
# normalize story URL on chapter list.
|
||||
self.story.setMetadata('storyId',storya['href'].split('/',)[2])
|
||||
url = "https://"+self.getSiteDomain()+storya['href']
|
||||
logger.debug("Normalizing to URL: "+url)
|
||||
self._setURL(url)
|
||||
soup = self.make_soup(self.get_request(url))
|
||||
|
||||
# if blocked, attempt login.
|
||||
if soup.find("div",{"class":"blocked"}) or soup.find("li",{"class":"blocked"}):
|
||||
if self.performLogin(url): # performLogin raises
|
||||
# FailedToLogin if it fails.
|
||||
soup = self.make_soup(self.get_request(url,usecache=False))
|
||||
|
||||
# title - first h4 tag will be title.
|
||||
titleh4 = soup.find('div',{'class':'storylist'}).find('h4')
|
||||
self.story.setMetadata('title', stripHTML(titleh4.a))
|
||||
|
||||
if 'Deleted story' in self.story.getMetadataRaw('title'):
|
||||
raise exceptions.StoryDoesNotExist("This story was deleted. %s"%self.url)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('span',{'class':'author'}).find('a', href=re.compile(r"^/a/"))
|
||||
self.story.setMetadata('authorId',a['href'].split('/')[2])
|
||||
self.story.setMetadata('authorUrl','https://'+self.host+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# description
|
||||
storydiv = soup.find("div",{"id":"story"})
|
||||
self.setDescription(url,storydiv.find("blockquote",{'class':'summary'}).p)
|
||||
#self.story.setMetadata('description', storydiv.find("blockquote",{'class':'summary'}).p.string)
|
||||
|
||||
# most of the meta data is here:
|
||||
metap = storydiv.find("div",{"class":"meta"})
|
||||
self.story.addToList('category',metap.find("a",href=re.compile(r"^/category/\d+")).string)
|
||||
|
||||
# warnings
|
||||
# <span class="req"><a href="/help/38" title="Medium Spoilers">[!!] </a> <a href="/help/38" title="Rape/Sexual Violence">[R] </a> <a href="/help/38" title="Violence">[V] </a> <a href="/help/38" title="Child/Underage Sex">[Y] </a></span>
|
||||
spanreq = metap.find("span",{"class":"story-warnings"})
|
||||
if spanreq: # can be no warnings.
|
||||
for a in spanreq.find_all("a"):
|
||||
self.story.addToList('warnings',a['title'])
|
||||
|
||||
## perhaps not the most efficient way to parse this, using
|
||||
## regexps for each rather than something more complex, but
|
||||
## IMO, it's more readable and amenable to change.
|
||||
metastr = stripHTML(unicode(metap)).replace('\n',' ').replace('\t',' ').replace(u'\u00a0',' ')
|
||||
|
||||
m = re.match(r".*?Rating: (.+?) -.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('rating', m.group(1))
|
||||
|
||||
## Genre appears even if list is empty. But there are a
|
||||
## limited number of genres allowed by the site.
|
||||
m = re.match(r".*?Genres: ((?:(?:Angst|Crossover|Drama|Erotica|Fantasy|Horror|Humor|Parody|Romance|Sci-fi)(?:,)?)+) -.*?",metastr)
|
||||
if m:
|
||||
for g in m.group(1).split(','):
|
||||
self.story.addToList('genre',g)
|
||||
|
||||
m = re.match(r".*?Characters: (.*?) -.*?",metastr)
|
||||
if m:
|
||||
for g in m.group(1).split(','):
|
||||
if g:
|
||||
self.story.addToList('characters',g)
|
||||
|
||||
m = re.match(r".*?Published: ([0-9-]+?) -.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('datePublished',makeDate(m.group(1), "%Y-%m-%d"))
|
||||
|
||||
# Updated can have more than one space after it. <shrug>
|
||||
m = re.match(r".*?Updated: ([0-9-]+?) +-.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('dateUpdated',makeDate(m.group(1), "%Y-%m-%d"))
|
||||
|
||||
m = re.match(r".*? - ([0-9,]+?) words.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('numWords',m.group(1))
|
||||
|
||||
if metastr.endswith("Complete"):
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
# get the chapter list first this time because that's how we
|
||||
# detect the need to login.
|
||||
storylistul = soup.find('ul',{'class':'storylist'})
|
||||
if not storylistul:
|
||||
# no list found, so it's a one-chapter story.
|
||||
self.add_chapter(self.story.getMetadata('title'),url)
|
||||
else:
|
||||
chapterlistlis = storylistul.find_all('li')
|
||||
for chapterli in chapterlistlis:
|
||||
if "blocked" in chapterli['class']:
|
||||
# paranoia check. We should already be logged in by now.
|
||||
raise exceptions.FailedToLogin(url,self.username)
|
||||
else:
|
||||
#print "chapterli.h4.a (%s)"%chapterli.h4.a
|
||||
self.add_chapter(chapterli.h4.a.string,
|
||||
u'https://%s%s'%(self.getSiteDomain(),
|
||||
chapterli.h4.a['href']))
|
||||
return
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
logger.debug('Getting chapter text from: %s' % url)
|
||||
soup = self.make_soup(self.get_request(url))
|
||||
|
||||
span = soup.find('div', {'id' : 'storytext'})
|
||||
|
||||
if None == span:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,span)
|
||||
|
||||
def getClass():
|
||||
return FicwadComSiteAdapter
|
||||
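Note on the one-regex-per-field metadata parsing in the adapter above: the sketch below applies the same idea to a flattened metadata line. It is an illustration under assumptions, not the adapter's code. `parse_meta` and the sample `metastr` are hypothetical, and `datetime.strptime` stands in for the project's `makeDate` helper.

```python
import re
from datetime import datetime

# Genre names copied from the adapter's pattern; the site allows only these.
GENRES = "Angst|Crossover|Drama|Erotica|Fantasy|Horror|Humor|Parody|Romance|Sci-fi"

def parse_meta(metastr):
    """Pull fields out of a flattened metadata line, one regex per field."""
    meta = {}
    m = re.match(r".*?Rating: (.+?) -", metastr)
    if m:
        meta["rating"] = m.group(1)
    m = re.match(r".*?Genres: ((?:(?:%s),?)+) -" % GENRES, metastr)
    if m:
        meta["genre"] = m.group(1).split(",")
    m = re.match(r".*?Published: ([0-9-]+?) -", metastr)
    if m:
        meta["datePublished"] = datetime.strptime(m.group(1), "%Y-%m-%d")
    m = re.match(r".*? - ([0-9,]+?) words", metastr)
    if m:
        meta["numWords"] = m.group(1)
    # The line ends with "Complete" only for finished stories.
    meta["status"] = "Completed" if metastr.endswith("Complete") else "In-Progress"
    return meta

# Made-up sample line, shaped like the text the adapter strips from the page.
sample = ("Category: Example - Rating: PG-13 - Genres: Angst,Drama - "
          "Published: 2011-03-01 - Updated: 2011-04-15  - 12,345 words - Complete")
meta = parse_meta(sample)
print(meta)
```

Each field fails independently: if the site renames one label, only that key goes missing from the result.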
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
import time
|
||||
import httplib, urllib
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from .. import exceptions as exceptions
|
||||
from ..htmlcleanup import stripHTML
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class FicwadComSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','fw')
|
||||
|
||||
# get storyId from url--url validation guarantees second part is storyId
|
||||
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
|
||||
|
||||
self.username = "NoneGiven"
|
||||
self.password = ""
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.ficwad.com'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://www.ficwad.com/story/137169"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape(r"http://"+self.getSiteDomain())+"/story/\d+?$"
|
||||
|
||||
def performLogin(self,url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['username'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['username'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/account/login'
|
||||
logging.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['username']))
|
||||
d = self._postUrl(loginUrl,params)
|
||||
|
||||
if "Login attempt failed..." in d:
|
||||
logging.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['username']))
|
||||
raise exceptions.FailedToLogin(url,params['username'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
# fetch the chapter. From that we will get almost all the
|
||||
# metadata and chapter list
|
||||
|
||||
url = self.url
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
try:
|
||||
soup = bs.BeautifulSoup(self._fetchUrl(url))
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
h3 = soup.find('h3')
|
||||
storya = h3.find('a',href=re.compile("^/story/\d+$"))
|
||||
if storya : # if there's a story link in the h3 header, this is a chapter page.
|
||||
# normalize story URL on chapter list.
|
||||
self.story.setMetadata('storyId',storya['href'].split('/',)[2])
|
||||
url = "http://"+self.getSiteDomain()+storya['href']
|
||||
logging.debug("Normalizing to URL: "+url)
|
||||
self._setURL(url)
|
||||
try:
|
||||
soup = bs.BeautifulSoup(self._fetchUrl(url))
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
# if blocked, attempt login.
|
||||
if soup.find("li",{"class":"blocked"}):
|
||||
if self.performLogin(url): # performLogin raises
|
||||
# FailedToLogin if it fails.
|
||||
soup = bs.BeautifulSoup(self._fetchUrl(url))
|
||||
|
||||
# title - first h4 tag will be title.
|
||||
titleh4 = soup.find('h4')
|
||||
self.story.setMetadata('title', titleh4.a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"^/author/\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('/')[2])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# description
|
||||
storydiv = soup.find("div",{"id":"story"})
|
||||
self.setDescription(url,storydiv.find("blockquote",{'class':'summary'}).p.string)
|
||||
#self.story.setMetadata('description', storydiv.find("blockquote",{'class':'summary'}).p.string)
|
||||
|
||||
# most of the meta data is here:
|
||||
metap = storydiv.find("p",{"class":"meta"})
|
||||
self.story.addToList('category',metap.find("a",href=re.compile(r"^/category/\d+")).string)
|
||||
|
||||
# warnings
|
||||
# <span class="req"><a href="/help/38" title="Medium Spoilers">[!!] </a> <a href="/help/38" title="Rape/Sexual Violence">[R] </a> <a href="/help/38" title="Violence">[V] </a> <a href="/help/38" title="Child/Underage Sex">[Y] </a></span>
|
||||
spanreq = metap.find("span",{"class":"req"})
|
||||
if spanreq: # can be no warnings.
|
||||
for a in spanreq.findAll("a"):
|
||||
self.story.addToList('warnings',a['title'])
|
||||
|
||||
## perhaps not the most efficient way to parse this, using
|
||||
## regexps for each rather than something more complex, but
|
||||
## IMO, it's more readable and amenable to change.
|
||||
metastr = stripHTML(str(metap)).replace('\n',' ').replace('\t',' ')
|
||||
#print "metap: (%s)"%metastr
|
||||
|
||||
m = re.match(r".*?Rating: (.+?) -.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('rating', m.group(1))
|
||||
|
||||
m = re.match(r".*?Genres: (.+?) -.*?",metastr)
|
||||
if m:
|
||||
for g in m.group(1).split(','):
|
||||
self.story.addToList('genre',g)
|
||||
|
||||
m = re.match(r".*?Characters: (.*?) -.*?",metastr)
|
||||
if m:
|
||||
for g in m.group(1).split(','):
|
||||
if g:
|
||||
self.story.addToList('characters',g)
|
||||
|
||||
m = re.match(r".*?Published: ([0-9/]+?) -.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('datePublished',makeDate(m.group(1), "%Y/%m/%d"))
|
||||
|
||||
# Updated can have more than one space after it. <shrug>
|
||||
m = re.match(r".*?Updated: ([0-9/]+?) +-.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('dateUpdated',makeDate(m.group(1), "%Y/%m/%d"))
|
||||
|
||||
m = re.match(r".*? - ([0-9/]+?) words.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('numWords',m.group(1))
|
||||
|
||||
if metastr.endswith("Complete"):
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
# get the chapter list first this time because that's how we
|
||||
# detect the need to login.
|
||||
storylistul = soup.find('ul',{'id':'storylist'})
|
||||
if not storylistul:
|
||||
# no list found, so it's a one-chapter story.
|
||||
self.chapterUrls.append((self.story.getMetadata('title'),url))
|
||||
else:
|
||||
chapterlistlis = storylistul.findAll('li')
|
||||
for chapterli in chapterlistlis:
|
||||
if "blocked" in chapterli['class']:
|
||||
# paranoia check. We should already be logged in by now.
|
||||
raise exceptions.FailedToLogin(url,self.username)
|
||||
else:
|
||||
#print "chapterli.h4.a (%s)"%chapterli.h4.a
|
||||
self.chapterUrls.append((chapterli.h4.a.string,
|
||||
u'http://%s%s'%(self.getSiteDomain(),
|
||||
chapterli.h4.a['href'])))
|
||||
#print "self.chapterUrls:%s"%self.chapterUrls
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
return
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
span = soup.find('div', {'id' : 'storytext'})
|
||||
|
||||
if None == span:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,span)
|
||||
|
||||
def getClass():
|
||||
return FicwadComSiteAdapter
|
||||
|
||||
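Note on `getSiteURLPattern` in the older adapter above: the pattern is built by escaping the scheme and domain and appending a numeric story id. The sketch below is a standalone approximation. `site_url_pattern` is a hypothetical free function, and the test URLs other than the adapter's own example are invented.

```python
import re

def site_url_pattern(domain):
    # re.escape keeps the dots in the domain literal, so "." cannot
    # match an arbitrary character.
    return re.escape("http://" + domain) + r"/story/\d+?$"

pattern = site_url_pattern("www.ficwad.com")

# The adapter's own example URL is accepted.
ok = re.match(pattern, "http://www.ficwad.com/story/137169") is not None
# A host that only matches if "." is treated as a wildcard is rejected.
wrong_host = re.match(pattern, "http://wwwXficwad.com/story/137169") is not None
# Anything after the story id is rejected because of the trailing "$".
chapter_suffix = re.match(pattern, "http://www.ficwad.com/story/137169/2") is not None
print(ok, wrong_host, chapter_suffix)
```

The newer version of the adapter shown earlier uses `https?:` and a bare `ficwad.com` domain instead, so both schemes are accepted there.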
199
fanficdownloader/adapters/adapter_fimfictionnet.py
Normal file
|
|
@ -0,0 +1,199 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
import cookielib as cl
|
||||
import datetime
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
def getClass():
|
||||
return FimFictionNetSiteAdapter
|
||||
|
||||
class FimFictionNetSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','fimficnet')
|
||||
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
|
||||
self._setURL("http://"+self.getSiteDomain()+"/story/"+self.story.getMetadata('storyId')+"/")
|
||||
self.is_adult = False
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.fimfiction.net'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
# mobile.fimfiction.com isn't actually a valid domain, but we can still get the story id from URLs anyway
|
||||
return ['www.fimfiction.net','mobile.fimfiction.net', 'www.fimfiction.com', 'mobile.fimfiction.com']
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://www.fimfiction.net/story/1234/story-title-here http://www.fimfiction.net/story/1234/ http://www.fimfiction.com/story/1234/1/ http://mobile.fimfiction.net/story/1234/1/story-title-here/chapter-title-here"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return r"http://(www|mobile)\.fimfiction\.(net|com)/story/\d+/?.*"
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
cookieproc = urllib2.HTTPCookieProcessor()
|
||||
cookie = cl.Cookie(version=0, name='view_mature', value='true',
|
||||
port=None, port_specified=False,
|
||||
domain=self.getSiteDomain(), domain_specified=False, domain_initial_dot=False,
|
||||
path='/story', path_specified=True,
|
||||
secure=False,
|
||||
expires=time.time()+10000,
|
||||
discard=False,
|
||||
comment=None,
|
||||
comment_url=None,
|
||||
rest={'HttpOnly': None},
|
||||
rfc2109=False)
|
||||
cookieproc.cookiejar.set_cookie(cookie)
|
||||
self.opener = urllib2.build_opener(cookieproc)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(self.url)
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if "Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource" in data:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
|
||||
if "/images/missing_story.png" in data:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
|
||||
if "This story has been marked as having adult content." in data:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if self.password:
|
||||
params = {}
|
||||
params['password'] = self.password
|
||||
data = self._postUrl(self.url,params)
|
||||
|
||||
if "Enter the password the author set for this story to view it." in data:
|
||||
if self.getConfig('fail_on_password'):
|
||||
raise exceptions.FailedToDownload("%s requires story password and fail_on_password is true."%self.url)
|
||||
else:
|
||||
raise exceptions.FailedToLogin(self.url,"Story requires individual password",passwdonly=True)
|
||||
|
||||
soup = bs.BeautifulSoup(data).find("div", {"class":"content_box post_content_box"})
|
||||
|
||||
titleheader = soup.find("h2")
|
||||
title = titleheader.find("a", href=re.compile(r'^/story/')).text
|
||||
author = titleheader.find("a", href=re.compile(r'^/user/')).text
|
||||
self.story.setMetadata("title", title)
|
||||
self.story.setMetadata("author", author)
|
||||
self.story.setMetadata("authorId", author) # The author's name will be unique
|
||||
self.story.setMetadata("authorUrl", "http://%s/user/%s" % (self.getSiteDomain(),author))
|
||||
|
||||
chapterDates = []
|
||||
|
||||
for chapter in soup.findAll("a", {"class":"chapter_link"}):
|
||||
chapterDates.append(chapter.span.extract().text.strip("()"))
|
||||
self.chapterUrls.append((chapter.text.strip(), "http://"+self.getSiteDomain() + chapter['href']))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
for character in [character_icon['title'] for character_icon in soup.findAll("a", {"class":"character_icon"})]:
|
||||
self.story.addToList("characters", character)
|
||||
for category in [category.text for category in soup.find("div", {"class":"categories"}).findAll("a")]:
|
||||
self.story.addToList("genre", category)
|
||||
self.story.addToList("category", "My Little Pony")
|
||||
|
||||
|
||||
# The very last list element in the list of chapters contains the status, rating and word count e.g.:
|
||||
#
|
||||
# <li>
|
||||
# Incomplete | Rating:
|
||||
# <span style="color:#c78238;">Teen</span>
|
||||
# <div class="word_count"><b>5,203</b>words total</div>
|
||||
# </li>
|
||||
#
|
||||
|
||||
status_bar = soup.findAll('li')[-1]
|
||||
# In the case of fimfiction.net, possible statuses are 'Completed', 'Incomplete', 'On Hiatus' and 'Cancelled'
|
||||
# For the sake of bringing it in line with the other adapters, 'Incomplete' and 'On Hiatus' become 'In-Progress'
|
||||
# and 'Complete' becomes 'Completed'. 'Cancelled' seems an important enough (not to mention more strictly true)
|
||||
# status to leave unchanged.
|
||||
status = status_bar.text.split("|")[0].strip().replace("Incomplete", "In-Progress").replace("On Hiatus", "In-Progress").replace("Complete", "Completed")
|
||||
self.story.setMetadata('status', status)
|
||||
self.story.setMetadata('rating', status_bar.span.text)
|
||||
# This way is less elegant, perhaps, but more robust in face of format changes.
|
||||
numWords = status_bar.find("div",{"class":"word_count"}).b.text
|
||||
self.story.setMetadata('numWords', numWords)
|
||||
|
||||
description_soup = soup.find("div", {"class":"description"})
|
||||
# Sometimes the description has an expanding element
|
||||
# This removes the ellipsis and the expand button
|
||||
try:
|
||||
description_soup.find('span', {"id":re.compile(r"description_more_elipses_\d+")}).extract() # Web designer can't spell 'ellipsis'
|
||||
description_soup.find('a', {"class":"more"}).extract()
|
||||
except:
|
||||
pass
|
||||
|
||||
# fimfic is the first site with an explicit cover image.
|
||||
story_img = soup.find('img',{'class':'story_image'})
|
||||
if self.getConfig('include_images') and story_img:
|
||||
self.story.addImgUrl(self,self.url,story_img['src'],self._fetchUrlRaw,cover=True)
|
||||
self.setDescription(self.url,description_soup.text)
|
||||
#self.story.setMetadata('description', description_soup.text)
|
||||
|
||||
# Unfortunately, nowhere on the page is the year mentioned.
|
||||
# Best effort to deal with this:
|
||||
# Use this year, if that's a date in the future, subtract one year.
|
||||
# Their earliest story is Jun, so they'll probably change the date
|
||||
# around then.
|
||||
|
||||
now = datetime.datetime.now()
|
||||
|
||||
dateUpdated_soup = bs.BeautifulSoup(data).find("div", {"class":"calendar"})
|
||||
dateUpdated_soup.find('span').extract()
|
||||
dateUpdated = makeDate("%s%s"%(now.year,dateUpdated_soup.text), "%Y%b%d")
|
||||
if dateUpdated > now :
|
||||
dateUpdated = dateUpdated.replace(year=now.year-1)
|
||||
self.story.setMetadata("dateUpdated", dateUpdated)
|
||||
|
||||
# Get the date of creation from the first chapter
|
||||
if len(chapterDates) > 0:
|
||||
datePublished_text = chapterDates[0]
|
||||
day, month = datePublished_text.split()
|
||||
day = re.sub(r"[^\d.]+", '', day)
|
||||
datePublished = makeDate("%s%s%s"%(now.year,month,day), "%Y%b%d")
|
||||
if datePublished > now :
|
||||
datePublished = datePublished.replace(year=now.year-1)
|
||||
self.story.setMetadata("datePublished", datePublished)
|
||||
else:
|
||||
self.story.setMetadata("datePublished", dateUpdated)
|
||||
|
||||
def getChapterText(self, url):
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
soup = bs.BeautifulSoup(self._fetchUrl(url),selfClosingTags=('br','hr')).find('div', {'id' : 'chapter_container'})
|
||||
if soup == None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
return self.utf8FromSoup(url,soup)
|
||||
|
||||
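Note on the year guessing in the adapter above: the page gives only a month and day, so the code assumes the current year and steps back one year if that date would be in the future. A minimal sketch of that rule follows. `date_without_year` is a hypothetical helper, `now` is passed in to keep the example deterministic, and `datetime.strptime` stands in for the project's `makeDate`.

```python
import datetime

def date_without_year(month_day, now):
    # Assume the current year first...
    guess = datetime.datetime.strptime("%s%s" % (now.year, month_day), "%Y%b%d")
    # ...and fall back one year if that lands in the future.
    if guess > now:
        guess = guess.replace(year=now.year - 1)
    return guess

# Fixed "today" so the outcome does not depend on when this is run.
now = datetime.datetime(2012, 3, 10)
recent = date_without_year("Jan05", now)   # already past this year
older = date_without_year("Dec25", now)    # would be in the future
print(recent.date(), older.date())
```

The rule has known limits: anything more than a year old is misdated, `%b` depends on the locale's month abbreviations, and "Feb29" can raise `ValueError` when the assumed year is not a leap year.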
205
fanficdownloader/adapters/adapter_gayauthorsorg.py
Normal file
|
|
@ -0,0 +1,205 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import datetime
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
from urllib import unquote
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
def getClass():
|
||||
return GayAuthorsAdapter
|
||||
|
||||
# Class name has to be unique. Our convention is camel case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class GayAuthorsAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["utf8",
|
||||
"Windows-1252"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
|
||||
# get storyId from url--the path is /story/<author>/<storyId>, so storyId is the third path segment
|
||||
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[3])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# unquote, change '_' and ' ' to '-', downcase, and remove non-[a-z0-9-]
|
||||
authid = unquote(self.parsedUrl.path.split('/',)[2])
|
||||
authid = authid.lower().replace('_','-').replace(' ','-')
|
||||
authid = re.sub(r"[^a-z0-9-]","",authid)
|
||||
|
||||
self.story.setMetadata('authorId',authid)
|
||||
logging.debug("authorId: (%s)"%self.story.getMetadata('authorId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/story/'+self.story.getMetadata('authorId') + '/' + self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','ga')
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%d %b %Y"
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'www.gayauthors.org'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/story/author/storytitle"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/story/")+r".*?/\w+.*?$"
|
||||
|
||||
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
|
||||
url = self.url
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
# print data
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
msoup = soup.find('div', {'class' : 'story'})
|
||||
if msoup == None:
|
||||
msoup = soup.find('div', {'class' : 'story ispinned'})
|
||||
csoup = soup.find('div', {'id' : 'story_chapters'})
|
||||
|
||||
## Title
|
||||
a = msoup.find('span', {'class' : 'title'})
|
||||
title=a.find('span', {'itemprop' : 'name'})
|
||||
self.story.setMetadata('title',title.text)
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
series = a.find('span',{'class':"description"})
|
||||
series_name = series.find('a')
|
||||
series_name.extract()
|
||||
series_index = int(series.text.split(' ')[1])
|
||||
self.setSeries(series_name.text, series_index)
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = msoup.find('a', href=re.compile(r'/author/'+self.story.getMetadata('authorId')))
|
||||
self.story.setMetadata('authorUrl',a['href'])
|
||||
self.story.setMetadata('author',a.text)
|
||||
|
||||
|
||||
# Find the chapters:
|
||||
spans=csoup.findAll('span', {'class' : 'desc chapter-info'})
|
||||
for span in spans:
|
||||
span.extract()
|
||||
for chapter in csoup.findAll('a'):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
a=chapter['href'].split(self.story.getMetadata('author'))
|
||||
a=a[0]+self.story.getMetadata('authorId')+a[1]
|
||||
self.chapterUrls.append((stripHTML(chapter),a))
|
||||
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
cats = msoup.findAll('a', href=re.compile(r'/browse/list/page__filtertype_0__category\w+$'))
|
||||
for cat in cats:
|
||||
self.story.addToList('category',cat.text)
|
||||
|
||||
genres = msoup.findAll('a', href=re.compile(r'/browse/list/page__filtertype_1__genre\w+$'))
|
||||
for genre in genres:
|
||||
self.story.addToList('genre',genre.text)
|
||||
genres = msoup.findAll('a', href=re.compile(r'/browse/list/page__filtertype_2__tag\w+$'))
|
||||
for genre in genres:
|
||||
self.story.addToList('genre',genre.text)
|
||||
|
||||
status = msoup.find('a', href=re.compile(r'/browse/list/page__filtertype_3__status\w+$'))
|
||||
self.story.setMetadata('status',status.text)
|
||||
|
||||
rating = msoup.find('a', href=re.compile(r'/browse/list/page__filtertype_4__rating\w+$'))
|
||||
self.story.setMetadata('rating',rating.text)
|
||||
|
||||
summary = msoup.find('span', {'itemprop' : 'description'})
|
||||
self.setDescription(self.url,summary.text)
|
||||
#self.story.setMetadata('description',summary.text)
|
||||
|
||||
|
||||
stats = msoup.find('dl',{'class':'info'})
|
||||
dt = stats.findAll('dt')
|
||||
dd = stats.findAll('dd')
|
||||
for x in range(0,len(dt)):
|
||||
label = dt[x].text
|
||||
value = dd[x].text
|
||||
|
||||
if 'Words:' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Published:' in label:
|
||||
date=stripHTML(value.split(' - ')[0])
|
||||
if ',' in date:
|
||||
date=datetime.date.today().strftime(self.dateformat)
|
||||
self.story.setMetadata('datePublished', makeDate(date, self.dateformat))
|
||||
|
||||
if 'Updated:' in label:
|
||||
date=stripHTML(value.split(' - ')[0])
|
||||
if ',' in date:
|
||||
date=datetime.date.today().strftime(self.dateformat)
|
||||
self.story.setMetadata('dateUpdated', makeDate(date, self.dateformat))
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data = self._fetchUrl(url)
|
||||
data = data[data.index("<div id='chapter-content'>"):]
|
||||
soup = bs.BeautifulSoup(data) # this one's happier with Soup, not StoneSoup for some reason.
|
||||
|
||||
div = soup.find('div', {'id' : 'chapter-content'})
|
||||
|
||||
if div is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
201
fanficdownloader/adapters/adapter_harrypotterfanfictioncom.py
Normal file
|
|
@ -0,0 +1,201 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class HarryPotterFanFictionComSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','hp')
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.story.addToList("category","Harry Potter")
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only psid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=')[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?psid='+self.story.getMetadata('storyId'))
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.harrypotterfanfiction.com'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.harrypotterfanfiction.com','harrypotterfanfiction.com']
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://www.harrypotterfanfiction.com/viewstory.php?psid=1234 http://harrypotterfanfiction.com/viewstory.php?psid=5678"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://")+r"(www\.)?"+re.escape("harrypotterfanfiction.com/viewstory.php?psid=")+r"\d+$"
|
||||
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
url = self.url+'&index=1'
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'\?psid='+self.story.getMetadata('storyId')))
|
||||
self.story.setMetadata('title',a.string)
|
||||
## javascript:if (confirm('Please note. This story may contain adult themes. By clicking here you are stating that you are over 17. Click cancel if you do not meet this requirement.')) location = '?psid=290995'
|
||||
if "This story may contain adult themes." in a['href'] and not (self.is_adult or self.getConfig("is_adult")):
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?showuid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
## hpcom doesn't give us total words--but it does give
|
||||
## us words/chapter. I'd rather add than fetch and
|
||||
## parse another page.
|
||||
words=0
|
||||
for tr in soup.find('table',{'class':'text'}).findAll('tr'):
|
||||
tdstr = tr.findAll('td')[2].string
|
||||
if tdstr and tdstr.isdigit():
|
||||
words+=int(tdstr)
|
||||
self.story.setMetadata('numWords',str(words))
|
||||
|
||||
# Find the chapters:
|
||||
tablelist = soup.find('table',{'class':'text'})
|
||||
for chapter in tablelist.findAll('a', href=re.compile(r'\?chapterid=\d+')):
|
||||
#javascript:if (confirm('Please note. This story may contain adult themes. By clicking here you are stating that you are over 17. Click cancel if you do not meet this requirement.')) location = '?chapterid=433441&i=1'
|
||||
# stripHTML below, just in case there are tags, like <i>, in chapter titles.
|
||||
chpt=re.sub(r'^.*?(\?chapterid=\d+).*?',r'\1',chapter['href'])
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/viewstory.php'+chpt))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
## Finding the metadata is a bit of a pain. Desc is the only thing this color.
|
||||
desctable= soup.find('table',{'bgcolor':'#f0e8e8'})
|
||||
self.setDescription(url,desctable)
|
||||
#self.story.setMetadata('description',stripHTML(desctable))
|
||||
|
||||
## Finding the metadata is a bit of a pain. Most of the meta
|
||||
## data is in a center.table without a bgcolor.
|
||||
for center in soup.findAll('center'):
|
||||
table = center.find('table',{'bgcolor':None})
|
||||
if table:
|
||||
metastr = stripHTML(str(table)).replace('\n',' ').replace('\t',' ')
|
||||
# Rating: 12+ Story Reviews: 3
|
||||
# Chapters: 3
|
||||
# Characters: Andromeda, Ted, Bellatrix, R. Lestrange, Lucius, Narcissa, OC
|
||||
# Genre(s): Fluff, Romance, Young Adult Era: OtherPairings: Other Pairing, Lucius/Narcissa
|
||||
# Status: Completed
|
||||
# First Published: 2010.09.02
|
||||
# Last Published Chapter: 2010.09.28
|
||||
# Last Updated: 2010.09.28
|
||||
# Favorite Story Of: 1 users
|
||||
# Warnings: Scenes of a Mild Sexual Nature
|
||||
|
||||
m = re.match(r".*?Status: Completed.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('status','Completed')
|
||||
else:
|
||||
self.story.setMetadata('status','In-Progress')
|
||||
|
||||
m = re.match(r".*?Rating: (.+?) Story Reviews.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('rating', m.group(1))
|
||||
|
||||
m = re.match(r".*?Genre\(s\): (.+?) Era.*?",metastr)
|
||||
if m:
|
||||
for g in m.group(1).split(','):
|
||||
self.story.addToList('genre',g)
|
||||
|
||||
m = re.match(r".*?Characters: (.+?) Genre.*?",metastr)
|
||||
if m:
|
||||
for g in m.group(1).split(','):
|
||||
self.story.addToList('characters',g)
|
||||
|
||||
m = re.match(r".*?Warnings: (.+).*?",metastr)
|
||||
if m:
|
||||
for w in m.group(1).split(','):
|
||||
if w != 'Now Warnings':
|
||||
self.story.addToList('warnings',w)
|
||||
|
||||
m = re.match(r".*?First Published: ([0-9\.]+).*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('datePublished',makeDate(m.group(1), "%Y.%m.%d"))
|
||||
|
||||
# Updated can have more than one space after it. <shrug>
|
||||
m = re.match(r".*?Last Updated:\s+([0-9\.]+).*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('dateUpdated',makeDate(m.group(1), "%Y.%m.%d"))
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
## most adapters use BeautifulStoneSoup here, but non-Stone
|
||||
## allows nested div tags.
|
||||
soup = bs.BeautifulSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
div = soup.find('div', {'id' : 'fluidtext'})
|
||||
|
||||
if div is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
|
||||
def getClass():
|
||||
return HarryPotterFanFictionComSiteAdapter
|
||||
|
||||
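A standalone sketch of the metadata parsing in the adapter above, run against the sample text from the adapter's own comments. It is not the adapter's code: it uses `re.search` where the adapter uses `re.match` with a leading `.*?`, it strips whitespace from list items, and the names `parse_meta` and `SAMPLE`, along with `datetime.strptime` standing in for `makeDate`, are assumptions made for illustration.

```python
import datetime
import re

# Sample copied from the comment block in the adapter, flattened to one line
# the way stripHTML(...).replace('\n', ' ') would leave it (assumed).
SAMPLE = (
    "Rating: 12+ Story Reviews: 3 Chapters: 3 "
    "Characters: Andromeda, Ted, Bellatrix, R. Lestrange, Lucius, Narcissa, OC "
    "Genre(s): Fluff, Romance, Young Adult Era: OtherPairings: Other Pairing, Lucius/Narcissa "
    "Status: Completed First Published: 2010.09.02 "
    "Last Published Chapter: 2010.09.28 Last Updated: 2010.09.28 "
    "Favorite Story Of: 1 users Warnings: Scenes of a Mild Sexual Nature"
)


def _split_list(text):
    """Split a comma-separated field and drop surrounding whitespace."""
    return [item.strip() for item in text.split(',') if item.strip()]


def _parse_date(text):
    """Hypothetical stand-in for makeDate(text, "%Y.%m.%d")."""
    return datetime.datetime.strptime(text, "%Y.%m.%d").date()


def parse_meta(metastr):
    """Pull the labelled fields out of the flattened metadata string."""
    meta = {}
    meta['status'] = 'Completed' if 'Status: Completed' in metastr else 'In-Progress'

    m = re.search(r"Rating: (.+?) Story Reviews", metastr)
    if m:
        meta['rating'] = m.group(1)

    m = re.search(r"Characters: (.+?) Genre", metastr)
    if m:
        meta['characters'] = _split_list(m.group(1))

    m = re.search(r"Genre\(s\): (.+?) Era", metastr)
    if m:
        meta['genre'] = _split_list(m.group(1))

    m = re.search(r"Warnings: (.+)", metastr)
    if m:
        meta['warnings'] = _split_list(m.group(1))

    m = re.search(r"First Published: ([0-9.]+)", metastr)
    if m:
        meta['datePublished'] = _parse_date(m.group(1))

    # \s+ tolerates more than one space after the label.
    m = re.search(r"Last Updated:\s+([0-9.]+)", metastr)
    if m:
        meta['dateUpdated'] = _parse_date(m.group(1))

    return meta


if __name__ == "__main__":
    for key, value in sorted(parse_meta(SAMPLE).items()):
        print("%s: %s" % (key, value))
```

Each field depends on the label that follows it ("Story Reviews", "Genre", "Era"), so a change in the site's field order would silently drop that field rather than raise an error.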
234
fanficdownloader/adapters/adapter_hpfandomnet.py
Normal file
|
|
@ -0,0 +1,234 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return HPFandomNetAdapterAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is camel case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class HPFandomNetAdapterAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=')[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
# XXX Most sites don't have the /eff part. Replace all to remove it usually.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/eff/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','hpfdm') # XXX
|
||||
|
||||
# If all stories from the site fall into the same category,
|
||||
# the site itself isn't likely to label them as such, so we
|
||||
# do.
|
||||
self.story.addToList("category","Harry Potter") # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%Y.%m.%d" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'www.hpfandom.net' # XXX
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/eff/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/eff/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
url = self.url
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
# print data
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/eff/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
## Going to get the rest from the author page.
|
||||
authdata = self._fetchUrl(self.story.getMetadata('authorUrl'))
|
||||
# fix a typo in the site HTML so I can find the Characters list.
|
||||
authdata = authdata.replace('<td width=10%">','<td width="10%">')
|
||||
|
||||
# hpfandom.net only seems to indicate adult-only by javascript on the story/chapter links.
|
||||
if "javascript:if (confirm('Slash/het fiction which incorporates sexual situations to a somewhat graphic degree and some violence. ')) location = 'viewstory.php?sid=%s'"%self.story.getMetadata('storyId') in authdata \
|
||||
and not (self.is_adult or self.getConfig("is_adult")):
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
authsoup = bs.BeautifulSoup(authdata)
|
||||
|
||||
reviewsa = authsoup.find('a', href="reviews.php?sid="+self.story.getMetadata('storyId')+"&a=")
|
||||
# <table><tr><td><p><b><a ...>
|
||||
metablock = reviewsa.findParent("table")
|
||||
#print("metablock:%s"%metablock)
|
||||
|
||||
## Title
|
||||
titlea = metablock.find('a', href=re.compile("viewstory.php"))
|
||||
#print("titlea:%s"%titlea)
|
||||
if titlea is None:
|
||||
raise exceptions.FailedToDownload("Story URL (%s) not found on author's page, can't use chapter URLs"%url)
|
||||
self.story.setMetadata('title',stripHTML(titlea))
|
||||
|
||||
# Find the chapters: !!! hpfandom.net differs from every other
|
||||
# eFiction site--the sid on viewstory for chapters is
|
||||
# *different* for each chapter
|
||||
for chapter in soup.findAll('a', {'href':re.compile(r"viewstory.php\?sid=\d+&i=\d+")}):
|
||||
m = re.match(r'.*?(viewstory.php\?sid=\d+&i=\d+).*?',chapter['href'])
|
||||
# stripHTML below, just in case there are tags, like <i>, in chapter titles.
|
||||
#print("====chapter===%s"%m.group(1))
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/eff/'+m.group(1)))
|
||||
|
||||
if len(self.chapterUrls) == 0:
|
||||
self.chapterUrls.append((stripHTML(self.story.getMetadata('title')),url))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
|
||||
# utility method
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
summary = metablock.find("td",{"class":"summary"})
|
||||
summary.name='span'
|
||||
self.setDescription(url,summary)
|
||||
|
||||
# words & completed in first row of metablock.
|
||||
firstrow = stripHTML(metablock.find('tr'))
|
||||
# A Mother's Love xx Going Grey 1 (G+) by Kiristeen | Reviews - 18 | Words: 27468 | Completed: Yes
|
||||
m = re.match(r".*?\((?P<rating>[^)]+)\).*?Words: (?P<words>\d+).*?Completed: (?P<status>Yes|No)",firstrow)
|
||||
if m is not None:
|
||||
if m.group('rating') != None:
|
||||
self.story.setMetadata('rating', m.group('rating'))
|
||||
|
||||
if m.group('words') != None:
|
||||
self.story.setMetadata('numWords', m.group('words'))
|
||||
|
||||
if m.group('status') != None:
|
||||
if 'Yes' in m.group('status'):
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
|
||||
# <tr><td width="10%" valign="top">Chapters:</td><td width="40%" valign="top">4</td>
|
||||
# <td width="10%" valign="top">Published:</td><td width="40%" valign="top">2010.09.29</td></tr>
|
||||
# <tr><td width="10%" valign="top">Completed:</td><td width="40%" valign="top">Yes</td><td width="10%" valign="top">Updated:</td><td width="40%" valign="top">2010.10.03</td></tr>
|
||||
labels = metablock.findAll('td',{'width':'10%'})
|
||||
for td in labels:
|
||||
label = td.string
|
||||
value = td.nextSibling.string
|
||||
#print("\nlabel:%s\nvalue:%s\n"%(label,value))
|
||||
|
||||
if 'Category' in label:
|
||||
cats = td.parent.findAll('a',href=re.compile(r'categories.php'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat) # catstext already holds the link strings.
|
||||
|
||||
if 'Characters' in label:
|
||||
for char in value.split(','):
|
||||
self.story.addToList('characters',char.strip())
|
||||
|
||||
if 'Genre' in label:
|
||||
for genre in value.split(','):
|
||||
self.story.addToList('genre',genre.strip())
|
||||
|
||||
if 'Warnings' in label:
|
||||
for warning in value.split(','):
|
||||
if warning.strip() != 'none':
|
||||
self.story.addToList('warnings',warning.strip())
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data = self._fetchUrl(url)
|
||||
# There's no good wrapper around the chapter text. :-/
|
||||
# There are, however, tables with width=100% just above and below the real text.
|
||||
data = re.sub(r'<table width="100%">.*?</table>','<div name="storybody">',
|
||||
data,count=1,flags=re.DOTALL)
|
||||
|
||||
data = re.sub(r'<table width="100%">.*?</table>','</div>',
|
||||
data,count=1,flags=re.DOTALL)
|
||||
|
||||
soup = bs.BeautifulStoneSoup(data,selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
div = soup.find("div",{'name':'storybody'})
|
||||
#print("\n\ndiv:%s\n\n"%div)
|
||||
|
||||
if div is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
return self.utf8FromSoup(url,div)
|
||||
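A small sketch of the first-row parsing used in the adapter above, using the same regular expression and the sample row quoted in its comment. The function name `parse_first_row` and the returned dict keys are assumptions for illustration; the adapter itself writes the values through `self.story.setMetadata`.

```python
import re

# Same pattern as the adapter: the first parenthesised text is taken as the
# rating, then the word count and the Yes/No completion flag.
FIRST_ROW = re.compile(
    r".*?\((?P<rating>[^)]+)\)"
    r".*?Words: (?P<words>\d+)"
    r".*?Completed: (?P<status>Yes|No)"
)


def parse_first_row(text):
    """Return rating, word count and status from the first metadata row."""
    m = FIRST_ROW.match(text)
    if m is None:
        return {}
    return {
        'rating': m.group('rating'),
        'numWords': m.group('words'),
        'status': 'Completed' if m.group('status') == 'Yes' else 'In-Progress',
    }


if __name__ == "__main__":
    row = ("A Mother's Love xx Going Grey 1 (G+) by Kiristeen | "
           "Reviews - 18 | Words: 27468 | Completed: Yes")
    print(parse_first_row(row))
```

One caveat of this pattern: because the lazy `.*?` stops at the first opening parenthesis, a story title that itself contains parentheses would be read as the rating.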
235
fanficdownloader/adapters/adapter_mediaminerorg.py
Normal file
|
|
@ -0,0 +1,235 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class MediaMinerOrgSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','mm')
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
|
||||
# get storyId from url--url validation guarantees query correct
|
||||
m = re.match(self.getSiteURLPattern(),url)
|
||||
if m:
|
||||
self.story.setMetadata('storyId',m.group('id'))
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/fanfic/view_st.php/'+self.story.getMetadata('storyId'))
|
||||
else:
|
||||
raise exceptions.InvalidStoryURL(url,
|
||||
self.getSiteDomain(),
|
||||
self.getSiteExampleURLs())
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.mediaminer.org'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/fanfic/view_st.php/123456 http://"+self.getSiteDomain()+"/fanfic/view_ch.php/1234123/123444#fic_c"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
## http://www.mediaminer.org/fanfic/view_st.php/76882
|
||||
## http://www.mediaminer.org/fanfic/view_ch.php/167618/594087#fic_c
|
||||
return re.escape("http://"+self.getSiteDomain())+\
|
||||
r"/fanfic/view_(st|ch)\.php/(?P<id>\d+)(/\d+(#fic_c)?)?$"
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
url = self.url
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
# [ A - All Readers ], strip '[' ']'
|
||||
## Above title because we remove the smtxt font to get title.
|
||||
smtxt = soup.find("font",{"class":"smtxt"})
|
||||
if not smtxt:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
rating = smtxt.string[1:-1]
|
||||
self.story.setMetadata('rating',rating)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"/fanfic/src.php/u/\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('/')[-1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
## Title - Good grief. The markup varies between multi-chapter, single-chapter and 'type=one shot' stories, and even one-shots can have a titled chapter.
|
||||
## But, if colspan=2, there's no chapter title.
|
||||
## <td class="ffh">Atmosphere: Chapter 1</b> <font class="smtxt">[ P - Pre-Teen ]</font></td>
|
||||
## <td colspan=2 class="ffh">Hearts of Ice <font class="smtxt">[ P - Pre-Teen ]</font></td>
|
||||
## <td colspan=2 class="ffh">Suzaku no Princess <font class="smtxt">[ P - Pre-Teen ]</font></td>
|
||||
## <td class="ffh">The Kraut, The Bartender, and The Drunkard: Chapter 1</b> <font class="smtxt">[ P - Pre-Teen ]</font></td>
|
||||
## <td class="ffh">Betrayal and Justice: A Cold Heart</b> <font size="-1">( Chapter 1 )</font> <font class="smtxt">[ A - All Readers ]</font></td>
|
||||
## <td class="ffh">Question and Answer: Question and Answer</b> <font size="-1">( One-Shot )</font> <font class="smtxt">[ A - All Readers ]</font></td>
|
||||
title = soup.find('td',{'class':'ffh'})
|
||||
for font in title.findAll('font'):
|
||||
font.extract() # removes 'font' tags from inside the td.
|
||||
if title.has_key('colspan'):
|
||||
titlet = title.text
|
||||
else:
|
||||
## No colspan, so the text includes a chapter title, even if it's a one-shot.
|
||||
titlet = ':'.join(title.text.split(':')[:-1]) # strip trailing 'Chapter X' or chapter title
|
||||
self.story.setMetadata('title',titlet)
|
||||
## The story title is difficult to reliably parse from the
|
||||
## story pages. Getting it from the author page is reliable, but costs
|
||||
## another fetch.
|
||||
# authsoup = bs.BeautifulSoup(self._fetchUrl(self.story.getMetadata('authorUrl')))
|
||||
# titlea = authsoup.find('a',{'href':'/fanfic/view_st.php/'+self.story.getMetadata('storyId')})
|
||||
# self.story.setMetadata('title',titlea.text)
|
||||
|
||||
# save date from first for later.
|
||||
firstdate=None
|
||||
|
||||
# Find the chapters
|
||||
select = soup.find('select',{'name':'cid'})
|
||||
if not select:
|
||||
self.chapterUrls.append(( self.story.getMetadata('title'),self.url))
|
||||
else:
|
||||
for option in select.findAll("option"):
|
||||
chapter = stripHTML(option.string)
|
||||
## chapter can be: Chapter 7 [Jan 23, 2011]
|
||||
## or: Vigilant Moonlight ( Chapter 1 ) [Jan 30, 2004]
|
||||
## or even: Prologue ( Prologue ) [Jul 31, 2010]
|
||||
m = re.match(r'^(.*?) (\( .*? \) )?\[(.*?)\]$',chapter)
|
||||
chapter = m.group(1)
|
||||
# save date from first for later.
|
||||
if not firstdate:
|
||||
firstdate = m.group(3)
|
||||
self.chapterUrls.append((chapter,'http://'+self.host+'/fanfic/view_ch.php/'+self.story.getMetadata('storyId')+'/'+option['value']))
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
# category
|
||||
# <a href="/fanfic/src.php/a/567">Ranma 1/2</a>
|
||||
for a in soup.findAll('a',href=re.compile(r"^/fanfic/src.php/a/")):
|
||||
self.story.addToList('category',a.string)
|
||||
|
||||
# genre
|
||||
# genre links use /fanfic/src.php/g/ instead of /fanfic/src.php/a/
|
||||
for a in soup.findAll('a',href=re.compile(r"^/fanfic/src.php/g/")):
|
||||
self.story.addToList('genre',a.string)
|
||||
|
||||
# if firstdate, then the block below will only have last updated.
|
||||
if firstdate:
|
||||
self.story.setMetadata('datePublished', makeDate(firstdate, "%b %d, %Y"))
|
||||
# Everything else is in <tr bgcolor="#EEEED4">
|
||||
|
||||
metastr = stripHTML(soup.find("tr",{"bgcolor":"#EEEED4"})).replace('\n',' ').replace('\r',' ').replace('\t',' ')
|
||||
# Latest Revision: August 03, 2010
|
||||
m = re.match(r".*?(?:Latest Revision|Uploaded On): ([a-zA-Z]+ \d\d, \d\d\d\d)",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('dateUpdated', makeDate(m.group(1), "%B %d, %Y"))
|
||||
if not firstdate:
|
||||
self.story.setMetadata('datePublished',
|
||||
self.story.getMetadataRaw('dateUpdated'))
|
||||
|
||||
else:
|
||||
self.story.setMetadata('dateUpdated',
|
||||
self.story.getMetadataRaw('datePublished'))
|
||||
|
||||
# Words: 123456
|
||||
m = re.match(r".*?\| Words: (\d+) \|",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('numWords', m.group(1))
|
||||
|
||||
# Summary: ....
|
||||
m = re.match(r".*?Summary: (.*)$",metastr)
|
||||
if m:
|
||||
self.setDescription(url, m.group(1))
|
||||
#self.story.setMetadata('description', m.group(1))
|
||||
|
||||
# completed
|
||||
m = re.match(r".*?Status: Completed.*?",metastr)
|
||||
if m:
|
||||
self.story.setMetadata('status','Completed')
|
||||
else:
|
||||
self.story.setMetadata('status','In-Progress')
|
||||
|
||||
return
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data=self._fetchUrl(url)
|
||||
soup = bs.BeautifulStoneSoup(data,
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
anchor = soup.find('a',{'name':'fic_c'})
|
||||
|
||||
if anchor is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
## find divs with align=left, those are paragraphs in newer stories.
|
||||
divlist = anchor.findAllNext('div',{'align':'left'})
|
||||
if divlist:
|
||||
for div in divlist:
|
||||
div.name='p' # convert to <p> mediaminer uses div with
|
||||
# a margin for paragraphs.
|
||||
anchor.append(div) # cheat! stuff all the content
|
||||
# divs into anchor just as a
|
||||
# holder.
|
||||
del div['style']
|
||||
del div['align']
|
||||
anchor.name='div'
|
||||
return self.utf8FromSoup(url,anchor)
|
||||
|
||||
else:
|
||||
logging.debug('Using kludgey text find for older mediaminer story.')
|
||||
## Some older mediaminer stories are unparsable with BeautifulSoup.
|
||||
## Really nasty formatting. Sooo... Cheat! Parse it ourselves a bit first.
|
||||
## Story stuff falls between:
|
||||
data = "<div id='HERE'>" + data[data.find('<a name="fic_c">'):] +"</div>"
|
||||
soup = bs.BeautifulStoneSoup(data,
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
for tag in soup.findAll('td',{'class':'ffh'}) + \
|
||||
soup.findAll('div',{'class':'acl'}) + \
|
||||
soup.findAll('div',{'class':'footer smtxt'}) + \
|
||||
soup.findAll('table',{'class':'tbbrdr'}):
|
||||
tag.extract() # remove tag from soup.
|
||||
|
||||
return self.utf8FromSoup(url,soup)
|
||||
|
||||
|
||||
def getClass():
|
||||
return MediaMinerOrgSiteAdapter
|
||||
|
||||
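A sketch of how the chapter `<option>` labels in the mediaminer adapter above split into a chapter title and a date, using the same regular expression and the three label shapes listed in its comments. The function name `parse_option` is hypothetical, and `datetime.strptime` stands in for the project's `makeDate` helper, whose exact behavior is not shown here.

```python
import datetime
import re

# title, optional "( subtitle ) " part, then the date in square brackets.
OPTION_RE = re.compile(r'^(.*?) (\( .*? \) )?\[(.*?)\]$')


def parse_option(label):
    """Return (chapter_title, date) for one option label, or None if the
    label does not have the expected shape."""
    m = OPTION_RE.match(label)
    if m is None:
        return None
    # "%b" expects an abbreviated month name such as "Jan" (default C locale).
    when = datetime.datetime.strptime(m.group(3), "%b %d, %Y").date()
    return m.group(1), when


if __name__ == "__main__":
    for label in ("Chapter 7 [Jan 23, 2011]",
                  "Vigilant Moonlight ( Chapter 1 ) [Jan 30, 2004]",
                  "Prologue ( Prologue ) [Jul 31, 2010]"):
        print(parse_option(label))
```

Returning `None` for an unexpected label is a choice made for this sketch; the adapter calls `m.group(1)` directly, so there a label without a bracketed date would presumably fail with an `AttributeError`.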
|
|
@ -1,277 +1,250 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
# Software: eFiction
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import re
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
# py2 vs py3 transition
|
||||
from ..six import text_type as unicode
|
||||
|
||||
from .base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# Search for XXX comments--that's where things are most likely to need changing.
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return MidnightwhispersAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is camel case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class MidnightwhispersAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
|
||||
|
||||
# normalized story URL.
|
||||
# XXX Most sites don't have the /fanfic part. Replace all to remove it usually.
|
||||
self._setURL('https://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','mw') # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%B %d, %Y" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'www.midnightwhispers.net' # XXX
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.midnightwhispers.net','www.midnightwhispers.ca']
|
||||
|
||||
@classmethod
|
||||
def getConfigSections(cls):
|
||||
"Only needs to be overridden if the adapter has additional ini sections."
|
||||
return ['www.midnightwhispers.ca',cls.getSiteDomain()]
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(cls):
|
||||
return "https://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return r"https?:"+re.escape("//www.midnightwhispers.")+r"(ca|net)"+re.escape("/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
# Weirdly, different sites use different warning numbers.
|
||||
# If the title search below fails, there's a good chance
|
||||
# you need a different number. print data at that point
|
||||
# and see what the 'click here to continue' url says.
|
||||
|
||||
# Furthermore, there's a couple sites now with more than
|
||||
# one warning level for different ratings. And they're
|
||||
# fussy about it. midnightwhispers has three: 10, 3 & 5.
|
||||
# we'll try 5 first.
|
||||
addurl = "&ageconsent=ok&warning=5" # XXX
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
# index=1 makes sure we see the story chapter index. Some
|
||||
# sites skip that for one-chapter stories.
|
||||
url = self.url+'&index=1'+addurl
|
||||
logger.debug("URL: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
|
||||
# The actual text that is used to announce you need to be an
|
||||
# adult varies from site to site. Again, print data before
|
||||
# the title search to troubleshoot.
|
||||
|
||||
# Since the warning text can change by warning level, let's
|
||||
# look for the warning pass url. nfacommunity uses
|
||||
# &warning= -- actually, so do other sites. Must be an
|
||||
# eFiction book.
|
||||
|
||||
# viewstory.php?sid=1882&warning=4
|
||||
# viewstory.php?sid=1654&ageconsent=ok&warning=5
|
||||
#print data
|
||||
#m = re.search(r"'viewstory.php\?sid=1882(&warning=4)'",data)
|
||||
m = re.search(r"'viewstory.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'",data)
|
||||
if m != None:
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
# We tried the default and still got a warning, so
|
||||
# let's pull the warning number from the 'continue'
|
||||
# link and reload data.
|
||||
addurl = m.group(1)
|
||||
# correct stupid &amp; error in url.
|
||||
addurl = addurl.replace("&amp;","&")
|
||||
url = self.url+'&index=1'+addurl
|
||||
logger.debug("URL 2nd try: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
else:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
soup = self.make_soup(data)
|
||||
# print data
|
||||
|
||||
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',stripHTML(a)) # title's inside a <b> tag.
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','https://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
self.add_chapter(chapter,'https://'+self.host+'/'+chapter['href']+addurl)
|
||||
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
|
||||
# utility method
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.find_all('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while 'label' not in defaultGetattr(value,'class'):
|
||||
svalue += unicode(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
## Not all sites use Genre, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
## Not all sites use Warnings, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Warnings' in label:
|
||||
warnings = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
warningstext = [warning.string for warning in warnings]
|
||||
self.warning = ', '.join(warningstext)
|
||||
for warning in warningstext:
|
||||
self.story.addToList('warnings',warning.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'https://'+self.host+'/'+a['href']
|
||||
|
||||
seriessoup = self.make_soup(self.get_request(series_url))
|
||||
storyas = seriessoup.find_all('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
# skip 'report this' and 'TOC' links
|
||||
if 'contact.php' not in a['href'] and 'index' not in a['href']:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
self.story.setMetadata('seriesUrl',series_url)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logger.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data = self.get_request(url)
|
||||
soup = self.make_soup(data)
|
||||
|
||||
div = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if None == div:
|
||||
if "A fatal MySQL error was encountered" in data:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Database error on the site reported!" % url)
|
||||
else:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
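# A minimal, standalone sketch of the warning-link handling used in
# extractChapterUrlsAndMetadata() above. It assumes the site HTML-escapes
# the ampersands in the 'continue' link as &amp;. SAMPLE, WARNING_LINK and
# extract_warning_suffix are hypothetical names for illustration only and
# are not part of the adapter.

```python
import re

# Hypothetical page fragment; real eFiction markup may differ.
SAMPLE = "<a href='viewstory.php?sid=1654&amp;ageconsent=ok&amp;warning=5'>Continue</a>"

# Same shape as the adapter's search: capture everything after the story id.
WARNING_LINK = re.compile(
    r"'viewstory\.php\?sid=\d+((?:&amp;ageconsent=ok)?&amp;warning=\d+)'")


def extract_warning_suffix(html):
    """Return the query suffix from the 'continue' link, or None if absent."""
    m = WARNING_LINK.search(html)
    if m is None:
        return None
    # Undo the HTML escaping so the suffix can be appended to a real URL.
    return m.group(1).replace("&amp;", "&")


print(extract_warning_suffix(SAMPLE))  # &ageconsent=ok&warning=5
```

# Reading the suffix from the page, rather than hard-coding it, is what lets
# one adapter cope with several warning levels on the same site.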
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return MuggleNetComAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is camel case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class MuggleNetComAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','mgln') # XXX
|
||||
|
||||
# If all stories from the site fall into the same category,
|
||||
# the site itself isn't likely to label them as such, so we
|
||||
# do.
|
||||
self.story.addToList("category","Harry Potter") # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%m/%d/%y" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'fanfiction.mugglenet.com' # XXX
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
# Weirdly, different sites use different warning numbers.
|
||||
# If the title search below fails, there's a good chance
|
||||
# you need a different number. print data at that point
|
||||
# and see what the 'click here to continue' url says.
|
||||
addurl = "&warning=5" # XXX
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
# index=1 makes sure we see the story chapter index. Some
|
||||
# sites skip that for one-chapter stories.
|
||||
url = self.url+'&index=1'+addurl
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
# The actual text that is used to announce you need to be an
|
||||
# adult varies from site to site. Again, print data before
|
||||
# the title search to troubleshoot.
|
||||
if "This story may contain some sexuality, violence and or profanity not suitable for younger readers." in data: # XXX
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
# Dunno if this site uses this or not.
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
# print data
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
|
||||
# utility method
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.findAll('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while not defaultGetattr(value,'class') == 'label':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
## Not all sites use Genre, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
## Not all sites use Warnings, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Warnings' in label:
|
||||
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
warningstext = [warning.string for warning in warnings]
|
||||
self.warning = ', '.join(warningstext)
|
||||
for warning in warningstext:
|
||||
self.story.addToList('warnings',warning.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
div = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if None == div:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
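# A short sketch of why each adapter sets its own self.dateformat. It calls
# datetime.strptime directly and assumes makeDate() behaves similarly;
# makeDate's real implementation is not shown in this listing.
# parse_site_date is a hypothetical helper name.

```python
from datetime import datetime

# Format strings copied from the adapters in this listing.
MUGGLENET_FORMAT = "%m/%d/%y"   # month first
PORTKEY_FORMAT = "%d/%m/%y"     # day first


def parse_site_date(text, fmt):
    """Parse a date string scraped from a story page using the site's format."""
    return datetime.strptime(text.strip(), fmt)


# The same text yields different dates depending on the site's convention.
print(parse_site_date(" 12/11/07 ", MUGGLENET_FORMAT).date())  # 2007-12-11
print(parse_site_date("12/11/07", PORTKEY_FORMAT).date())      # 2007-11-12
```

# A wrong format string can parse without raising and still give the wrong
# day, which is why the XXX marker on self.dateformat deserves a check
# against a real story page.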
274
fanficdownloader/adapters/adapter_portkeyorg.py
Normal file
|
|
@ -0,0 +1,274 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
import cookielib as cl
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# Search for XXX comments--that's where things are most likely to need changing.
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return PortkeyOrgAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is camel case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class PortkeyOrgAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees path is /story/1234 or /story/1234/5
|
||||
self.story.setMetadata('storyId',self.parsedUrl.path.split('/',)[2])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/story/'+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','prtky') # XXX
|
||||
|
||||
# If all stories from the site fall into the same category,
|
||||
# the site itself isn't likely to label them as such, so we
|
||||
# do.
|
||||
self.story.addToList("category","Harry Potter") # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%d/%m/%y" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'fanfiction.portkey.org' # XXX
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/story/1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/story/")+r"\d+(/\d+)?$"
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
url = self.url
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
# portkey screws around with using a different URL to set the
|
||||
# cookie and it's a pain. So... cheat!
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
cookieproc = urllib2.HTTPCookieProcessor()
|
||||
cookie = cl.Cookie(version=0, name='verify17', value='1',
|
||||
port=None, port_specified=False,
|
||||
domain=self.getSiteDomain(), domain_specified=False, domain_initial_dot=False,
|
||||
path='/', path_specified=True,
|
||||
secure=False,
|
||||
expires=time.time()+10000,
|
||||
discard=False,
|
||||
comment=None,
|
||||
comment_url=None,
|
||||
rest={'HttpOnly': None},
|
||||
rfc2109=False)
|
||||
cookieproc.cookiejar.set_cookie(cookie)
|
||||
self.opener = urllib2.build_opener(cookieproc)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if "You must be over 18 years of age to view it" in data: # XXX
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
#print data
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"/profile/\d+"))
|
||||
#print("======a:%s"%a)
|
||||
self.story.setMetadata('authorId',a['href'].split('/')[-1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
## Going to get the rest from the author page.
|
||||
authsoup = bs.BeautifulSoup(self._fetchUrl(self.story.getMetadata('authorUrl')))
|
||||
|
||||
## Title
|
||||
titlea = authsoup.find('a', href=re.compile(r'/story/'+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',stripHTML(titlea))
|
||||
metablock = titlea.parent
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.find('select',{'name':'select5'}).findAll('option', {'value':re.compile(r'/story/'+self.story.getMetadata('storyId')+r"/\d+$")}):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
chtitle = stripHTML(chapter)
|
||||
if not chtitle:
|
||||
chtitle = "(Untitled Chapter)"
|
||||
self.chapterUrls.append((chtitle,'http://'+self.host+chapter['value']))
|
||||
|
||||
if len(self.chapterUrls) == 0:
|
||||
self.chapterUrls.append((stripHTML(self.story.getMetadata('title')),url))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
|
||||
# utility method
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
# <SPAN class="dark-small-bold">Contents:</SPAN> <SPAN class="small-grey">NC17 </SPAN>
|
||||
# <SPAN class="dark-small-bold">Published: </SPAN><SPAN class="small-grey">12/11/07</SPAN>
|
||||
# <SPAN class="dark-small-bold"><BR>
|
||||
# Description:</SPAN> <SPAN class="small-black">A special book helps Harry tap into the power the Dark Lord knows not. Of course it’s a book on sex magic and rituals… but Harry’s not complaining. Spurned on by the ghost of a pervert founder, Harry leads his friends in the hunt for Voldemort’s Horcruxes.
|
||||
# EROTIC COMEDY! Loads of crude humor and sexual situations!
|
||||
# </SPAN>
|
||||
labels = metablock.findAll('span',{'class':'dark-small-bold'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.findNext('span').string
|
||||
label = stripHTML(labelspan)
|
||||
# print("\nlabel:%s\nlabel:%s\nvalue:%s\n"%(labelspan,label,value))
|
||||
|
||||
if 'Description' in label:
|
||||
self.setDescription(url,value)
|
||||
|
||||
if 'Contents' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Words' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
# if 'Categories' in label:
|
||||
# cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
# catstext = [cat.string for cat in cats]
|
||||
# for cat in catstext:
|
||||
# self.story.addToList('category',cat.string)
|
||||
|
||||
# if 'Characters' in label:
|
||||
# chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
# charstext = [char.string for char in chars]
|
||||
# for char in charstext:
|
||||
# self.story.addToList('characters',char.string)
|
||||
|
||||
if 'Genre' in label:
|
||||
# genre is typo'ed on the site--it falls between the
|
||||
# dark-small-bold label and dark-small-bold content
|
||||
# spans.
|
||||
svalue = ""
|
||||
value = labelspan.nextSibling
|
||||
while not defaultGetattr(value,'class') == 'dark-small-bold':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
|
||||
for genre in svalue.split("/"):
|
||||
genre = genre.strip()
|
||||
if genre != 'None':
|
||||
self.story.addToList('genre',genre)
|
||||
|
||||
## Not all sites use Warnings, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
# if 'Warnings' in label:
|
||||
# warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
# warningstext = [warning.string for warning in warnings]
|
||||
# self.warning = ', '.join(warningstext)
|
||||
# for warning in warningstext:
|
||||
# self.story.addToList('warnings',warning.string)
|
||||
|
||||
if 'Status' in label:
|
||||
if 'Completed' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
# try:
|
||||
# # Find Series name from series URL.
|
||||
# a = metablock.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
# series_name = a.string
|
||||
# series_url = 'http://'+self.host+'/'+a['href']
|
||||
|
||||
# # use BeautifulSoup HTML parser to make everything easier to find.
|
||||
# seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
# storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
# i=1
|
||||
# for a in storyas:
|
||||
# if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
# self.setSeries(series_name, i)
|
||||
# break
|
||||
# i+=1
|
||||
# except:
|
||||
# # I find it hard to care if the series parsing fails
|
||||
# pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
#print("soup:%s"%soup)
|
||||
td = soup.find('td', {'class' : 'story'})
|
||||
|
||||
if td is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
centers = td.findAll('center')
|
||||
# first two and last two center tags are some script, 'report
|
||||
# story', 'report story' and an ad.
|
||||
centers[0].extract()
|
||||
centers[1].extract()
|
||||
centers[-1].extract()
|
||||
centers[-2].extract()
|
||||
|
||||
return self.utf8FromSoup(url,td)
|
||||
|
|
@ -1,215 +1,209 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
# Software: eFiction
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import re
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
# py2 vs py3 transition
|
||||
from ..six import text_type as unicode
|
||||
|
||||
from .base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class PotionsAndSnitchesOrgSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','pns')
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=')[1])
|
||||
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/fanfiction/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.potionsandsnitches.org'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['potionsandsnitches.org','potionsandsnitches.net']
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(cls):
|
||||
return "http://www.potionsandsnitches.org/fanfiction/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://")+r"(www\.)?potionsandsnitches\.(net|org)/fanfiction/viewstory\.php\?sid=\d+$"
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
url = self.url+'&index=1'
|
||||
logger.debug("URL: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
soup = self.make_soup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',stripHTML(a))
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/fanfiction/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.add_chapter(chapter,'http://'+self.host+'/fanfiction/'+chapter['href'])
|
||||
|
||||
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.find_all('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next div class='listbox'
|
||||
svalue = ""
|
||||
while 'listbox' not in defaultGetattr(value,'class'):
|
||||
svalue += unicode(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Read' in label:
|
||||
self.story.setMetadata('reads', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
if "Snape and Harry (required)" in char:
|
||||
self.story.addToList('characters',"Snape")
|
||||
self.story.addToList('characters',"Harry")
|
||||
else:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
if 'Warning' in label:
|
||||
warnings = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class'))
|
||||
for warning in warnings:
|
||||
self.story.addToList('warnings',stripHTML(warning))
|
||||
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class'))
|
||||
for genre in genres:
|
||||
self.story.addToList('genre',stripHTML(genre))
|
||||
|
||||
if 'Takes Place' in label:
|
||||
takesplaces = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class'))
|
||||
for takesplace in takesplaces:
|
||||
self.story.addToList('takesplaces',stripHTML(takesplace))
|
||||
|
||||
if 'Snape flavour' in label:
|
||||
snapeflavours = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class'))
|
||||
for snapeflavour in snapeflavours:
|
||||
self.story.addToList('snapeflavours',stripHTML(snapeflavour))
|
||||
|
||||
if 'Tags' in label:
|
||||
sitetags = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class'))
|
||||
for sitetag in sitetags:
|
||||
self.story.addToList('sitetags',stripHTML(sitetag))
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
# limit date values, there's some extra chars.
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value[:12]), "%d %b %Y"))
|
||||
|
||||
if 'Updated' in label:
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value[:12]), "%d %b %Y"))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/fanfiction/'+a['href']
|
||||
|
||||
seriessoup = self.make_soup(self.get_request(series_url))
|
||||
storyas = seriessoup.find_all('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
self.story.setMetadata('seriesUrl',series_url)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
divsort = soup.find('div',id='sort')
|
||||
stars = len(divsort.find_all('img',src='images/star.gif'))
|
||||
stars = stars + 0.5 * len(divsort.find_all('img',src='images/starhalf.gif'))
|
||||
self.story.setMetadata('stars',stars)
|
||||
|
||||
a = divsort.find_all('a', href=re.compile(r'reviews.php\?type=ST&(amp;)?item='+self.story.getMetadata('storyId')+"$"))[1] # second one.
|
||||
self.story.setMetadata('reviews',stripHTML(a))
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logger.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = self.make_soup(self.get_request(url))
|
||||
|
||||
div = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if div is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
|
||||
def getClass():
|
||||
return PotionsAndSnitchesOrgSiteAdapter
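
The URL handling in the adapter above can be tried in isolation. This is a minimal sketch, assuming the framework validates a story URL with `re.match` against the string returned by `getSiteURLPattern()`; the helper name `story_id_from_url` is hypothetical and not part of FanFicFare.

```python
import re
from urllib.parse import urlparse

# Pattern copied from getSiteURLPattern() in the adapter above.
SITE_URL_PATTERN = (re.escape("http://")
                    + r"(www\.)?potionsandsnitches\.(net|org)"
                    + r"/fanfiction/viewstory\.php\?sid=\d+$")


def story_id_from_url(url):
    """Hypothetical helper: return the sid value, or None if the URL is rejected."""
    if re.match(SITE_URL_PATTERN, url) is None:
        return None
    # The pattern guarantees the query string is exactly "sid=<digits>",
    # so splitting on '=' yields the story id as the second element.
    return urlparse(url).query.split('=')[1]


# A story URL is accepted; a chapter URL is rejected by the trailing "$".
print(story_id_from_url("http://www.potionsandsnitches.org/fanfiction/viewstory.php?sid=1234"))
print(story_id_from_url("http://www.potionsandsnitches.org/fanfiction/viewstory.php?sid=1234&chapter=2"))
```

Because the pattern ends in `\d+$`, any extra query parameter makes the match fail, which is what lets `__init__` take the story id with a plain `split('=')`.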
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class PotionsAndSnitchesNetSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','pns')
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.story.addToList("category","Harry Potter")
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=')[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/fanfiction/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.potionsandsnitches.net'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.potionsandsnitches.net','potionsandsnitches.net']
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://www.potionsandsnitches.net/fanfiction/viewstory.php?sid=1234 http://potionsandsnitches.net/fanfiction/viewstory.php?sid=5678"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://")+r"(www\.)?"+re.escape("potionsandsnitches.net/fanfiction/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
url = self.url+'&index=1'
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/fanfiction/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/fanfiction/'+chapter['href']))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
## <meta name='description' content='<p>Description</p> ...' >
|
||||
## Summary, strangely, is in the content attr of a <meta name='description'> tag
|
||||
## which is escaped HTML. Unfortunately, we can't use it because they don't
|
||||
## escape (') chars in the desc, breaking the tag.
|
||||
#meta_desc = soup.find('meta',{'name':'description'})
|
||||
#metasoup = bs.BeautifulStoneSoup(meta_desc['content'])
|
||||
#self.story.setMetadata('description',stripHTML(metasoup))
|
||||
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.findAll('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next div class='listbox'
|
||||
svalue = ""
|
||||
while not defaultGetattr(value,'class') == 'listbox':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
if char == "!Snape and Harry (required)":
|
||||
self.story.addToList('characters',"Snape")
|
||||
self.story.addToList('characters',"Harry")
|
||||
else:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class'))
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), "%b %d %Y"))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), "%b %d %Y"))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/fanfiction/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
div = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if div is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
|
||||
def getClass():
|
||||
return PotionsAndSnitchesNetSiteAdapter
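
The two versions of this adapter parse dates with different format strings: `"%d %b %Y"` after slicing the value to 12 characters in one, and `"%b %d %Y"` in the other. The sketch below illustrates both, assuming `makeDate()` behaves roughly like `datetime.strptime`; the function `parse_site_date` and the sample strings are hypothetical, and the real page text may differ.

```python
from datetime import datetime


def parse_site_date(raw, dateformat, limit=None):
    """Stand-in for makeDate(): optionally truncate, trim, then strptime."""
    text = raw[:limit] if limit is not None else raw
    return datetime.strptime(text.strip(), dateformat)


# Day-first format, with trailing characters cut off by the 12-char limit.
day_first = parse_site_date(" 15 Mar 2011 [ Report ]", "%d %b %Y", limit=12)

# Month-first format, no truncation needed.
month_first = parse_site_date(" Mar 15 2011 ", "%b %d %Y")

print(day_first.date(), month_first.date())
```

Note that `%b` matches abbreviated month names in the current locale, so this sketch assumes the default C/English locale.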
|
||||
|
||||
|
|
@ -1,243 +1,238 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2020 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
# Software: eFiction
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
import re
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from .base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return SiyeCoUkAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is to camel-case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class SiyeCoUkAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.story.setMetadata("storyId", self.parsed_QS["sid"])
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('https://' + self.getSiteDomain() + '/siye/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','siye') # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%Y.%m.%d" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'www.siye.co.uk' # XXX
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.siye.co.uk','siye.co.uk']
|
||||
|
||||
@classmethod
|
||||
def stripURLParameters(cls, url):
|
||||
return url
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(cls):
|
||||
return "https://"+cls.getSiteDomain()+"/siye/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return r"https?://(www\.)?siye\.co\.uk/(siye/)?viewstory.php\?.*sid=\d+"
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
# index=1 makes sure we see the story chapter index. Some
|
||||
# sites skip that for one-chapter stories.
|
||||
# Except it doesn't this time. :-/
|
||||
url = self.url #+'&index=1'+addurl
|
||||
logger.debug("URL: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
|
||||
soup = self.make_soup(data)
|
||||
# print data
|
||||
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
if a is None:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','https://'+self.host+'/siye/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# need to (or it's easier to) pull other metadata from the author's list page.
|
||||
authsoup = self.make_soup(self.get_request(self.story.getMetadata('authorUrl')))
|
||||
|
||||
# remove author profile in case they've put the story URL in their bio.
|
||||
profile = authsoup.find('div',{'id':'profile'})
|
||||
if profile: # in case it changes.
|
||||
profile.extract()
|
||||
|
||||
## Title
|
||||
titlea = authsoup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',stripHTML(titlea))
|
||||
|
||||
# Find the chapters (from soup, not authsoup):
|
||||
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.add_chapter(chapter,'https://'+self.host+'/siye/'+chapter['href'])
|
||||
|
||||
if self.num_chapters() < 1:
|
||||
self.add_chapter(self.story.getMetadata('title'),url)
|
||||
|
||||
# The stuff we can get from the chapter list/one-shot page is
|
||||
# in the first table with 95% width.
|
||||
metatable = soup.find('table',{'width':'95%'})
|
||||
|
||||
# Categories
|
||||
cat_as = metatable.find_all('a', href=re.compile(r'categories.php'))
|
||||
for cat_a in cat_as:
|
||||
self.story.addToList('category',stripHTML(cat_a))
|
||||
|
||||
for label in metatable.find_all('b'):
|
||||
# html5lib doesn't give me \n for <br> anymore.
|
||||
# I expect there's a better way, but this is what came to
|
||||
# mind today. -JM
|
||||
part = stripHTML(label)
|
||||
nxtbr = label.find_next_sibling('br')
|
||||
nxtsib = label.next_sibling
|
||||
value = ""
|
||||
while nxtsib != nxtbr:
|
||||
value += stripHTML(nxtsib)
|
||||
nxtsib = nxtsib.next_sibling
|
||||
# logger.debug("label:%s value:%s"%(part,value))
|
||||
|
||||
if part.startswith("Characters:"):
|
||||
for item in value.split(', '):
|
||||
if item == "Harry/Ginny":
|
||||
self.story.addToList('characters',"Harry Potter")
|
||||
self.story.addToList('characters',"Ginny Weasley")
|
||||
elif item not in ("None","All"):
|
||||
self.story.addToList('characters',item)
|
||||
|
||||
if part.startswith("Genres:"):
|
||||
self.story.extendList('genre',value.split(', '))
|
||||
|
||||
if part.startswith("Warnings:"):
|
||||
if value != "None":
|
||||
self.story.extendList('warnings',value.split(', '))
|
||||
|
||||
if part.startswith("Rating:"):
|
||||
self.story.setMetadata('rating',value)
|
||||
|
||||
if part.startswith("Summary:"):
|
||||
# summary can include extra br and b tags; go until Hitcount
|
||||
summary = ""
|
||||
nxt = label.next_sibling
|
||||
while nxt and "Hitcount:" not in stripHTML(nxt):
|
||||
summary += "%s"%nxt
|
||||
# logger.debug(summary)
|
||||
nxt = nxt.next_sibling
|
||||
if summary.strip().endswith("<br/>"):
|
||||
summary = summary.strip()[0:-len("<br/>")]
|
||||
self.setDescription(url,summary)
|
||||
|
||||
# Stuff from author block:
|
||||
|
||||
# SIYE formats stories in the author list differently when
|
||||
# they're part of a series. Look for non-series...
|
||||
divdesc = titlea.parent.parent.find('div',{'class':'desc'})
|
||||
if not divdesc:
|
||||
# ... now look for series.
|
||||
divdesc = titlea.parent.parent.findNextSibling('tr').find('div',{'class':'desc'})
|
||||
|
||||
moremeta = stripHTML(divdesc)
|
||||
# logger.debug("moremeta:%s"%moremeta)
|
||||
# html5lib doesn't give me \n for <br> anymore.
|
||||
for part in moremeta.replace(' - ','\n').replace("Completed","\nCompleted").split('\n'):
|
||||
# logger.debug("part:%s"%part)
|
||||
try:
|
||||
(name,value) = part.split(': ')
|
||||
except:
|
||||
# not going to worry about fancier processing for the bits
|
||||
# that don't match.
|
||||
continue
|
||||
name=name.strip()
|
||||
value=value.strip()
|
||||
if name == 'Published':
|
||||
self.story.setMetadata('datePublished', makeDate(value, self.dateformat))
|
||||
if name == 'Updated':
|
||||
self.story.setMetadata('dateUpdated', makeDate(value, self.dateformat))
|
||||
if name == 'Completed':
|
||||
if value == 'Yes':
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
if name == 'Words':
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = titlea.findPrevious('a', href=re.compile(r"series.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'https://'+self.host+'/'+a['href']
|
||||
|
||||
seriessoup = self.make_soup(self.get_request(series_url))
|
||||
storyas = seriessoup.find_all('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
self.story.setMetadata('seriesUrl',series_url)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logger.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
# soup = self.make_soup(self.get_request(url))
|
||||
# BeautifulSoup objects to <p> inside <span>, which
|
||||
# technically isn't allowed.
|
||||
soup = self.make_soup(self.get_request(url))
|
||||
|
||||
# not the most unique thing in the world, but it appears to be
|
||||
# the best we can do here.
|
||||
story = soup.find('span', {'style' : 'font-size: 100%;'})
|
||||
|
||||
if story is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
story.name='div'
|
||||
|
||||
return self.utf8FromSoup(url,story)
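
The "author block" parsing in the adapter above splits one flattened line of text into name/value pairs. A minimal sketch of that step follows, assuming the stripped text looks like the hypothetical sample below; the function name `parse_moremeta` is illustrative, and it catches `ValueError` where the adapter uses a bare `except`.

```python
def parse_moremeta(moremeta):
    """Split 'Name: value - Name: value...' text into a dict, skipping odd parts."""
    meta = {}
    # ' - ' separates most fields; 'Completed' follows the word count with
    # no separator, so a newline is forced in front of it before splitting.
    for part in moremeta.replace(' - ', '\n').replace("Completed", "\nCompleted").split('\n'):
        try:
            name, value = part.split(': ')
        except ValueError:
            # Parts without exactly one ': ' are ignored, as in the adapter.
            continue
        meta[name.strip()] = value.strip()
    return meta


# Hypothetical sample of the flattened description text.
sample = "Hitcount 10 - Published: 2011.03.15 - Updated: 2011.04.02 - Words: 12345Completed: Yes"
print(parse_moremeta(sample))
```

The resulting `Published` and `Updated` strings are in the `%Y.%m.%d` form that the adapter stores in `self.dateformat`.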
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return SiyeCoUkAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is to camel-case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class SiyeCoUkAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8",]# 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=')[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/siye/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','siye') # XXX
|
||||
|
||||
# If all stories from the site fall into the same category,
|
||||
# the site itself isn't likely to label them as such, so we
|
||||
# do.
|
||||
self.story.addToList("category","Harry Potter") # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%Y.%m.%d" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Does have www here, if it uses it.
|
||||
return 'www.siye.co.uk' # XXX
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.siye.co.uk','siye.co.uk']
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/siye/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://")+r"(www\.)?siye\.co\.uk/(siye/)?"+re.escape("viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
# index=1 makes sure we see the story chapter index. Some
|
||||
# sites skip that for one-chapter stories.
|
||||
# Except it doesn't this time. :-/
|
||||
url = self.url #+'&index=1'+addurl
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
# print data
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/siye/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# need to (or it's easier to) pull other metadata from the author's list page.
|
||||
authsoup = bs.BeautifulSoup(self._fetchUrl(self.story.getMetadata('authorUrl')))
|
||||
|
||||
## Title
|
||||
titlea = authsoup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',titlea.string)
|
||||
|
||||
# Find the chapters (from soup, not authsoup):
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/siye/'+chapter['href']))
|
||||
|
||||
if self.chapterUrls:
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
else:
|
||||
self.chapterUrls.append((self.story.getMetadata('title'),url))
|
||||
self.story.setMetadata('numChapters',1)
|
||||
|
||||
# The stuff we can get from the chapter list/one-shot page are
|
||||
# in the first table with 95% width.
|
||||
metatable = soup.find('table',{'width':'95%'})
|
||||
|
||||
# Categories
|
||||
cat_as = metatable.findAll('a', href=re.compile(r'categories.php'))
|
||||
for cat_a in cat_as:
|
||||
self.story.addToList('category',stripHTML(cat_a))
|
||||
|
||||
moremetaparts = stripHTML(metatable).split('\n')
|
||||
for part in moremetaparts:
|
||||
part = part.strip()
|
||||
if part.startswith("Characters:"):
|
||||
part = part[part.find(':')+1:]
|
||||
for item in part.split(','):
|
||||
if item.strip() == "Harry/Ginny":
|
||||
self.story.addToList('characters',"Harry")
|
||||
self.story.addToList('characters',"Ginny")
|
||||
elif item.strip() not in ("None","All"):
|
||||
self.story.addToList('characters',item)
|
||||
|
||||
if part.startswith("Genres:"):
|
||||
part = part[part.find(':')+1:]
|
||||
for item in part.split(','):
|
||||
if item.strip() != "None":
|
||||
self.story.addToList('genre',item)
|
||||
|
||||
if part.startswith("Warnings:"):
|
||||
part = part[part.find(':')+1:]
|
||||
for item in part.split(','):
|
||||
if item.strip() != "None":
|
||||
self.story.addToList('warnings',item)
|
||||
|
||||
if part.startswith("Rating:"):
|
||||
part = part[part.find(':')+1:]
|
||||
self.story.setMetadata('rating',part)
|
||||
|
||||
if part.startswith("Summary:"):
|
||||
part = part[part.find(':')+1:]
|
||||
self.setDescription(url,part)
|
||||
#self.story.setMetadata('description',part)
|
||||
|
||||
# want to get the next tr of the table.
|
||||
#print("%s"%titlea.parent.parent.findNextSibling('tr'))
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
moremeta = stripHTML(titlea.parent.parent.parent.find('div',{'class':'desc'}))
|
||||
for part in moremeta.replace(' - ','\n').split('\n'):
|
||||
#print("part:%s"%part)
|
||||
try:
|
||||
(name,value) = part.split(': ')
|
||||
except:
|
||||
# not going to worry about fancier processing for the bits
|
||||
# that don't match.
|
||||
continue
|
||||
name=name.strip()
|
||||
value=value.strip()
|
||||
if name == 'Published':
|
||||
self.story.setMetadata('datePublished', makeDate(value, self.dateformat))
|
||||
if name == 'Updated':
|
||||
self.story.setMetadata('dateUpdated', makeDate(value, self.dateformat))
|
||||
if name == 'Completed':
|
||||
if value == 'Yes':
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
if name == 'Words':
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = titlea.findPrevious('a', href=re.compile(r"series.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
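# A minimal, standalone sketch of the "Name: value" parsing applied above to
# the author-page description. Assumptions (not taken from the adapter): the
# helper name parse_desc, the sample text, and the date format "%Y.%m.%d";
# datetime.strptime stands in for makeDate, whose implementation is not shown.

```python
from datetime import datetime


def parse_desc(moremeta, dateformat="%Y.%m.%d"):
    """Split an eFiction-style description into a metadata dict (hypothetical helper)."""
    meta = {}
    # ' - ' separates fields on one line, so treat it like a newline.
    for part in moremeta.replace(' - ', '\n').split('\n'):
        try:
            name, value = part.split(': ')
        except ValueError:
            # Zero or several ': ' separators: skip the fragment.
            continue
        name, value = name.strip(), value.strip()
        if name in ('Published', 'Updated'):
            meta[name] = datetime.strptime(value, dateformat)
        elif name == 'Completed':
            meta['status'] = 'Completed' if value == 'Yes' else 'In-Progress'
        elif name == 'Words':
            meta['numWords'] = value
    return meta


sample = ("Published: 2011.05.01 - Updated: 2011.06.15\n"
          "Completed: Yes - Words: 12345\n"
          "no colon here")
result = parse_desc(sample)
```

# Catching ValueError only (rather than a bare except) keeps the "skip what
# doesn't match" behaviour while letting unrelated errors surface.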
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
# soup = bs.BeautifulSoup(self._fetchUrl(url))
|
||||
# BeautifulSoup objects to <p> inside <span>, which
|
||||
# technically isn't allowed.
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
# not the most unique thing in the world, but it appears to be
|
||||
# the best we can do here.
|
||||
story = soup.find('span', {'style' : 'font-size: 100%;'})
|
||||
|
||||
if None == story:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,story)
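# A small sketch exercising the URL pattern that getSiteURLPattern builds for
# this adapter. Assumptions: the helper name is_story_url and the use of
# re.match are illustrative; how the downloader itself applies the pattern is
# not shown in this file.

```python
import re

# Same construction as getSiteURLPattern: literal pieces escaped, optional
# "www." and "siye/" parts, and a numeric story id anchored at the end.
pattern = (re.escape("http://") + r"(www\.)?siye\.co\.uk/(siye/)?"
           + re.escape("viewstory.php?sid=") + r"\d+$")


def is_story_url(url):
    """Return True if url looks like a story URL for this site (hypothetical helper)."""
    return re.match(pattern, url) is not None


accepted = [
    "http://www.siye.co.uk/siye/viewstory.php?sid=1234",
    "http://siye.co.uk/viewstory.php?sid=1234",
]
rejected = [
    "http://www.siye.co.uk/siye/viewstory.php?sid=1234&chapter=2",
    "https://www.siye.co.uk/siye/viewstory.php?sid=1234",
]
```

# Because of the trailing "$", chapter URLs are rejected; only the bare story
# URL with sid=<digits> is accepted.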
|
||||
|
|
@ -1,234 +1,246 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
# Software: eFiction
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import re
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
# py2 vs py3 transition
|
||||
from ..six import text_type as unicode
|
||||
|
||||
from .base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class TenhawkPresentsSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','thpi')
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
self.dateformat = "%b %d %Y"
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 't.evancurrie.ca'
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(cls):
|
||||
return "http://"+cls.getSiteDomain()+"/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
# accept https, but don't use it--site SSL is broken.
|
||||
return r"https?:"+re.escape("//"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
    def needToLoginCheck(self, data):
        if 'Registered Users Only' in data \
               or 'There is no such account on our website' in data \
               or "That password doesn't match the one in our database" in data:
            return True
        else:
            return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
|
||||
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self.post_request(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logger.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
addurl = "&ageconsent=ok&warning=3"
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
url = self.url+'&index=1'+addurl
|
||||
logger.debug("URL: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
|
||||
if self.needToLoginCheck(data):
|
||||
# need to log in for this one.
|
||||
addurl = "&ageconsent=ok&warning=4"
|
||||
url = self.url+'&index=1'+addurl
|
||||
logger.debug("Changing URL: "+url)
|
||||
self.performLogin(url)
|
||||
data = self.get_request(url,usecache=False)
|
||||
|
||||
if "This story contains mature content which may include violence, sexual situations, and coarse language" in data:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
soup = self.make_soup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')))
|
||||
self.story.setMetadata('title',stripHTML(a))
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.add_chapter(chapter,'http://'+self.host+'/'+chapter['href']+addurl)
|
||||
|
||||
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.find_all('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while 'label' not in defaultGetattr(value,'class'):
|
||||
svalue += unicode(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class'))
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/'+a['href']
|
||||
|
||||
seriessoup = self.make_soup(self.get_request(series_url))
|
||||
storyas = seriessoup.find_all('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
self.story.setMetadata('seriesUrl',series_url)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logger.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = self.make_soup(self.get_request(url))
|
||||
|
||||
span = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if None == span:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,span)
|
||||
|
||||
def getClass():
|
||||
return TenhawkPresentsSiteAdapter
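# A standalone sketch of the series-position lookup done at the end of
# extractChapterUrlsAndMetadata: walk the story links on the series page in
# order and report the 1-based position of the current story. Assumptions:
# the helper name series_index and the plain list of href strings are
# illustrative; the adapter works on soup tags instead.

```python
def series_index(hrefs, story_id):
    """Return the 1-based position of story_id among hrefs, or None (hypothetical helper)."""
    target = 'viewstory.php?sid=' + story_id
    for i, href in enumerate(hrefs, start=1):
        if href == target:
            return i
    return None


series_page_links = [
    'viewstory.php?sid=10',
    'viewstory.php?sid=42',
    'viewstory.php?sid=7',
]
position = series_index(series_page_links, '42')
```

# Returning None for a missing story mirrors the adapter's behaviour of simply
# not setting a series when the lookup fails.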
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class TenhawkPresentsComSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','thpc')
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
self.dateformat = "%b %d %Y"
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'fanfiction.tenhawkpresents.com'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
|
||||
logging.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self._fetchUrl(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logging.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
addurl = "&ageconsent=ok&warning=3"
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
url = self.url+'&index=1'+addurl
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if self.needToLoginCheck(data):
|
||||
# need to log in for this one.
|
||||
addurl = "&ageconsent=ok&warning=4"
|
||||
url = self.url+'&index=1'+addurl
|
||||
logging.debug("Changing URL: "+url)
|
||||
self.performLogin(url)
|
||||
data = self._fetchUrl(url)
|
||||
|
||||
if "This story contains mature content which may include violence, sexual situations, and coarse language" in data:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.findAll('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while not defaultGetattr(value,'class') == 'label':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class'))
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
span = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if None == span:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,span)
|
||||
|
||||
def getClass():
|
||||
return TenhawkPresentsComSiteAdapter
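# A compact sketch of the check in needToLoginCheck: the page needs a login if
# any of the site's known messages appears in the fetched HTML. Assumptions:
# the names LOGIN_MARKERS and need_to_login are illustrative; the marker
# strings are the ones used by the adapter above.

```python
LOGIN_MARKERS = (
    'Registered Users Only',
    'There is no such account on our website',
    "That password doesn't match the one in our database",
)


def need_to_login(data):
    """Return True if the page text contains any login-required marker (hypothetical helper)."""
    return any(marker in data for marker in LOGIN_MARKERS)


gated_page = "<html><body>Registered Users Only</body></html>"
open_page = "<html><body>Chapter 1</body></html>"
```

# Keeping the markers in one tuple makes it easy to add another message if the
# site changes its wording.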
|
||||
|
||||
199
fanficdownloader/adapters/adapter_test1.py
Normal file
|
|
@ -0,0 +1,199 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import datetime
|
||||
import time
|
||||
import logging
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from .. import exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class TestSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','tst1')
|
||||
self.crazystring = u" crazy tests:[bare amp(&) quote(') amp(&) gt(>) lt(<) ATnT(AT&T) pound(£)]"
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
self.username=''
|
||||
self.is_adult=False
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'test1.com'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return BaseSiteAdapter.getSiteURLPattern(self)+r'/?\?sid=\d+$'
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.story.getMetadata('storyId') == '665' and not (self.is_adult or self.getConfig("is_adult")):
|
||||
logging.warn("self.is_adult:%s"%self.is_adult)
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if self.story.getMetadata('storyId') == '666':
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
|
||||
if self.story.getMetadata('storyId').startswith('670'):
|
||||
time.sleep(1.0)
|
||||
|
||||
if self.story.getMetadata('storyId').startswith('671'):
|
||||
time.sleep(1.0)
|
||||
|
||||
if self.getConfig("username"):
|
||||
self.username = self.getConfig("username")
|
||||
|
||||
if self.story.getMetadata('storyId') == '668' and self.username != "Me" :
|
||||
raise exceptions.FailedToLogin(self.url,self.username)
|
||||
|
||||
if self.story.getMetadata('storyId') == '664':
|
||||
self.story.setMetadata(u'title',"Test Story Title "+self.story.getMetadata('storyId')+self.crazystring)
|
||||
self.story.setMetadata('author','Test Author aa bare amp(&) quote(') amp(&)')
|
||||
else:
|
||||
self.story.setMetadata(u'title',"Test Story Title "+self.story.getMetadata('storyId'))
|
||||
self.story.setMetadata('author','Test Author aa')
|
||||
self.story.setMetadata('storyUrl',self.url)
|
||||
self.story.setMetadata('description',u'Description '+self.crazystring+u''' Done
|
||||
|
||||
Some more longer description. "I suck at summaries!" "Better than it sounds!" "My first fic"
|
||||
''')
|
||||
self.story.setMetadata('datePublished',makeDate("1975-03-15","%Y-%m-%d"))
|
||||
if self.story.getMetadata('storyId') == '669':
|
||||
self.story.setMetadata('dateUpdated',datetime.datetime.now())
|
||||
else:
|
||||
self.story.setMetadata('dateUpdated',makeDate("1975-04-15","%Y-%m-%d"))
|
||||
self.story.setMetadata('numWords','123456')
|
||||
|
||||
idnum = int(self.story.getMetadata('storyId'))
|
||||
if idnum % 2 == 1:
|
||||
self.story.setMetadata('status','In-Progress')
|
||||
else:
|
||||
self.story.setMetadata('status','Completed')
|
||||
|
||||
        langs = {
            0:"English",
            1:"Russian",
            2:"French",
            3:"German",
            }
        if idnum < 10:
            self.story.setMetadata('language',langs[idnum%len(langs)])
        # 10 or greater: no language.
|
||||
|
||||
self.setSeries('The Great Test',idnum)
|
||||
|
||||
self.story.setMetadata('rating','Tweenie')
|
||||
|
||||
self.story.setMetadata('authorId','98765')
|
||||
self.story.setMetadata('authorUrl','http://author/url')
|
||||
|
||||
self.story.addToList('warnings','Swearing')
|
||||
self.story.addToList('warnings','Violence')
|
||||
|
||||
self.story.addToList('category','Harry Potter')
|
||||
self.story.addToList('category','Furbie')
|
||||
self.story.addToList('category','Crossover')
|
||||
self.story.addToList('category',u'Puella Magi Madoka Magica/魔法少女まどか★マギカ')
|
||||
self.story.addToList('category',u'Magical Girl Lyrical Nanoha')
|
||||
self.story.addToList('genre','Fantasy')
|
||||
self.story.addToList('genre','SF')
|
||||
self.story.addToList('genre','Noir')
|
||||
|
||||
self.chapterUrls = [(u'Prologue '+self.crazystring,self.url+"&chapter=1"),
|
||||
('Chapter 1, Xenos on Cinnabar',self.url+"&chapter=2"),
|
||||
('Chapter 2, Sinmay on Kintikin',self.url+"&chapter=3"),
|
||||
('Chapter 3, Over Cinnabar',self.url+"&chapter=4"),
|
||||
('Chapter 4',self.url+"&chapter=5"),
|
||||
('Chapter 5',self.url+"&chapter=6"),
|
||||
('Chapter 6',self.url+"&chapter=6"),
|
||||
# ('Chapter 7',self.url+"&chapter=6"),
|
||||
# ('Chapter 8',self.url+"&chapter=6"),
|
||||
# ('Chapter 9',self.url+"&chapter=6"),
|
||||
# ('Chapter 0',self.url+"&chapter=6"),
|
||||
# ('Chapter a',self.url+"&chapter=6"),
|
||||
# ('Chapter b',self.url+"&chapter=6"),
|
||||
# ('Chapter c',self.url+"&chapter=6"),
|
||||
# ('Chapter d',self.url+"&chapter=6"),
|
||||
# ('Chapter e',self.url+"&chapter=6"),
|
||||
# ('Chapter f',self.url+"&chapter=6"),
|
||||
# ('Chapter g',self.url+"&chapter=6"),
|
||||
# ('Chapter h',self.url+"&chapter=6"),
|
||||
# ('Chapter i',self.url+"&chapter=6"),
|
||||
# ('Chapter j',self.url+"&chapter=6"),
|
||||
# ('Chapter k',self.url+"&chapter=6"),
|
||||
# ('Chapter l',self.url+"&chapter=6"),
|
||||
# ('Chapter m',self.url+"&chapter=6"),
|
||||
# ('Chapter n',self.url+"&chapter=6"),
|
||||
]
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
if self.story.getMetadata('storyId') == '667':
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s!" % url)
|
||||
|
||||
if self.story.getMetadata('storyId').startswith('670') or \
|
||||
self.story.getMetadata('storyId').startswith('672'):
|
||||
time.sleep(1.0)
|
||||
|
||||
if "chapter=1" in url :
|
||||
text=u'''
|
||||
<div>
|
||||
<h3>Prologue</h3>
|
||||
<p>This is a fake adapter for testing purposes. Different storyIds will give different errors:</p>
|
||||
<p>http://test1.com?sid=664 - Crazy string title</p>
|
||||
<p>http://test1.com?sid=665 - raises AdultCheckRequired</p>
|
||||
<p>http://test1.com?sid=666 - raises StoryDoesNotExist</p>
|
||||
<p>http://test1.com?sid=667 - raises FailedToDownload on chapter 1</p>
|
||||
<p>http://test1.com?sid=668 - raises FailedToLogin unless username='Me'</p>
|
||||
<p>http://test1.com?sid=669 - Succeeds with Updated Date=now</p>
|
||||
<p>http://test1.com?sid=670 - Succeeds, but sleeps 1sec on each chapter</p>
|
||||
<p>http://test1.com?sid=671 - Succeeds, but sleeps 1sec metadata only</p>
|
||||
<p>http://test1.com?sid=672 - Succeeds, quick meta, sleeps 1sec chapters only</p>
|
||||
<p>Any other storyId will succeed with the same output.</p>
|
||||
</div>
|
||||
'''
|
||||
else:
|
||||
text=u'''
|
||||
<div>
|
||||
<h3>Chapter title from site</h3>
|
||||
<p><center>Centered text</center></p>
|
||||
<p>Lorem '''+self.crazystring+''' <i>italics</i>, <b>bold</b>, <u>underline</u> consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
|
||||
br breaks<br><br>
|
||||
|
||||
<a href="http://code.google.com/p/fanficdownloader/wiki/FanFictionDownLoaderPluginWithReadingList" title="Tilt-a-Whirl by Jim & Sarah, on Flickr"><img src="http://i.imgur.com/bo8eD.png"></a><br/>
|
||||
br breaks<br><br>
|
||||
<hr>
|
||||
horizontal rules
|
||||
<hr size=1 noshade>
|
||||
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
|
||||
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
|
||||
</div>
|
||||
'''
|
||||
soup = bs.BeautifulStoneSoup(text,selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
return self.utf8FromSoup(url,soup)
|
||||
|
||||
def getClass():
|
||||
return TestSiteAdapter
|
||||
|
||||
|
|
@ -1,319 +1,293 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import re
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
# py2 vs py3 transition
|
||||
|
||||
from .base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# By virtue of being recent and requiring both is_adult and user/pass,
|
||||
# adapter_fanficcastletvnet.py is the best choice for learning to
|
||||
# write adapters--especially for sites that use the eFiction system.
|
||||
# Most sites that have ".../viewstory.php?sid=123" in the story URL
|
||||
# are eFiction.
|
||||
|
||||
# For non-eFiction sites, it can be considerably more complex, but
|
||||
# this is still a good starting point.
|
||||
|
||||
# In general an 'adapter' needs to do these six things:
|
||||
|
||||
# - 'Register' correctly with the downloader
|
||||
# - Site Login (if needed)
|
||||
# - 'Are you adult?' check (if needed--some do one, some the other, some both)
|
||||
# - Grab the chapter list
|
||||
# - Grab the story meta-data (some (non-eFiction) adapters have to get it from the author page)
|
||||
# - Grab the chapter texts
|
||||
|
||||
# Search for XXX comments--that's where things are most likely to need changing.
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return SamAndJackNetAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is camel case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class SamAndJackNetAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
|
||||
|
||||
# normalized story URL.
|
||||
# XXX Most sites don't have the /fanfic part. Replace all to remove it usually.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/fanfics/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','sjn') # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%b %d, %Y" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Include www here if the site uses it.
|
||||
return 'samandjack.net' # XXX
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/fanfics/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/fanfics/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
## Login seems to be reasonably standard across eFiction sites.
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/fanfics/user.php?action=login'
|
||||
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self.post_request(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logger.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
# Weirdly, different sites use different warning numbers.
|
||||
# If the title search below fails, there's a good chance
|
||||
# you need a different number. print data at that point
|
||||
# and see what the 'click here to continue' url says.
|
||||
|
||||
# Furthermore, there's a couple sites now with more than
|
||||
# one warning level for different ratings. And they're
|
||||
# fussy about it. midnightwhispers has three: 10, 3 & 5.
|
||||
# we'll try 5 first.
|
||||
addurl = "&ageconsent=ok&warning=5" # XXX
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
# index=1 makes sure we see the story chapter index. Some
|
||||
# sites skip that for one-chapter stories.
|
||||
url = self.url+'&index=1'+addurl
|
||||
logger.debug("URL: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
|
||||
# The actual text that is used to announce you need to be an
|
||||
# adult varies from site to site. Again, print data before
|
||||
# the title search to troubleshoot.
|
||||
|
||||
# Since the warning text can change by warning level, let's
|
||||
# look for the warning pass url. nfacommunity uses
|
||||
# &warning= -- actually, so do other sites. Must be an
|
||||
# eFiction book.
|
||||
|
||||
# viewstory.php?sid=1882&warning=4
|
||||
# viewstory.php?sid=1654&ageconsent=ok&warning=5
|
||||
#print data
|
||||
#m = re.search(r"'viewstory.php\?sid=1882(&warning=4)'",data)
|
||||
m = re.search(r"'viewstory.php\?sid=\d+((?:&ageconsent=ok)?&warning=\d+)'",data)
|
||||
if m != None:
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
# We tried the default and still got a warning, so
|
||||
# let's pull the warning number from the 'continue'
|
||||
# link and reload data.
|
||||
addurl = m.group(1)
|
||||
# correct stupid &amp; error in url.
|
||||
addurl = addurl.replace("&amp;","&")
|
||||
url = self.url+'&index=1'+addurl
|
||||
logger.debug("URL 2nd try: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
else:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
soup = self.make_soup(data)
|
||||
# print data
|
||||
|
||||
|
||||
pagetitle = soup.find('div',{'id':'pagetitle'})
|
||||
## Title
|
||||
a = pagetitle.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',stripHTML(a))
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
# (fetch multiple authors)
|
||||
alist = soup.find_all('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
for a in alist:
|
||||
self.story.addToList('authorId',a['href'].split('=')[1])
|
||||
self.story.addToList('authorUrl','http://'+self.host+'/fanfics/'+a['href'])
|
||||
self.story.addToList('author',a.string)
|
||||
|
||||
# Reviews
|
||||
reviewdata = soup.find('div', {'id' : 'sort'})
|
||||
a = reviewdata.find_all('a', href=re.compile(r'reviews.php\?type=ST&(amp;)?item='+self.story.getMetadata('storyId')+"$"))[1] # second one.
|
||||
self.story.setMetadata('reviews',stripHTML(a))
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
self.add_chapter(chapter,'http://'+self.host+'/fanfics/'+chapter['href']+addurl)
|
||||
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
|
||||
# utility method
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.find_all('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
self.setDescription(url,value)
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
## Not all sites use Genre, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=1')) # XXX
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
## Not all sites use Warnings, but there's no harm to
|
||||
## leaving it in. Check to make sure the type_id number
|
||||
## is correct, though--it's site specific.
|
||||
if 'Warnings' in label:
|
||||
warnings = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
warningstext = [warning.string for warning in warnings]
|
||||
self.warning = ', '.join(warningstext)
|
||||
for warning in warningstext:
|
||||
self.story.addToList('warnings',warning.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
value=value.replace(' | ','')
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/fanfics/'+a['href']
|
||||
|
||||
seriessoup = self.make_soup(self.get_request(series_url))
|
||||
storyas = seriessoup.find_all('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
self.story.setMetadata('seriesUrl',series_url)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logger.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
soup = self.make_soup(self.get_request(url))
|
||||
|
||||
div = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if None == div:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
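The age-warning handling in the adapter above can be illustrated on its own. The following is a minimal sketch, not the adapter's actual code: `SAMPLE_PAGE`, `WARNING_LINK` and `warning_suffix` are hypothetical names, the sample markup is an assumption about what an eFiction 'continue' link looks like, and the pattern is widened slightly so it also accepts ampersands escaped as `&amp;`.

```python
import re

# Hypothetical fragment of an eFiction age-warning page (assumed markup).
SAMPLE_PAGE = (
    "<a href='viewstory.php?sid=1654&amp;ageconsent=ok&amp;warning=5'>"
    "Continue</a>"
)

# Same idea as the adapter's pattern: capture the optional ageconsent
# parameter plus the warning number from the 'continue' link.
WARNING_LINK = re.compile(
    r"'viewstory\.php\?sid=\d+((?:&(?:amp;)?ageconsent=ok)?&(?:amp;)?warning=\d+)'"
)


def warning_suffix(page):
    """Return the query suffix to append to the story URL, or '' if none."""
    match = WARNING_LINK.search(page)
    if match is None:
        return ""
    # The page escapes '&' as '&amp;'; undo that before reusing it in a URL.
    return match.group(1).replace("&amp;", "&")
```

Appending the returned suffix to the story URL and fetching again corresponds to the "2nd try" request in the adapter; an empty string means no warning link was found on the page.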
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
# This function is called by the downloader in all adapter_*.py files
|
||||
# in this dir to register the adapter class. So it needs to be
|
||||
# updated to reflect the class below it. That, plus getSiteDomain()
|
||||
# take care of 'Registering'.
|
||||
def getClass():
|
||||
return TheQuidditchPitchOrgAdapter # XXX
|
||||
|
||||
# Class name has to be unique. Our convention is camel case the
|
||||
# sitename with Adapter at the end. www is skipped.
|
||||
class TheQuidditchPitchOrgAdapter(BaseSiteAdapter): # XXX
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
# XXX Most sites don't have a path prefix before viewstory.php. If this one does, add it here and in the other URLs.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
# Each adapter needs to have a unique site abbreviation.
|
||||
self.story.setMetadata('siteabbrev','tqdpch') # XXX
|
||||
|
||||
# If all stories from the site fall into the same category,
|
||||
# the site itself isn't likely to label them as such, so we
|
||||
# do.
|
||||
self.story.addToList("category","Harry Potter") # XXX
|
||||
|
||||
# The date format will vary from site to site.
|
||||
# http://docs.python.org/library/datetime.html#strftime-strptime-behavior
|
||||
self.dateformat = "%m/%d/%Y" # XXX
|
||||
|
||||
@staticmethod # must be @staticmethod, don't remove it.
|
||||
def getSiteDomain():
|
||||
# The site domain. Include www here if the site uses it.
|
||||
return 'thequidditchpitch.org' # XXX
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.thequidditchpitch.org','thequidditchpitch.org']
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://")+"(www\.)?"+re.escape(self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
## Login seems to be reasonably standard across eFiction sites.
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only - Not suitable for readers under the age of legal consent in their country.' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
|
||||
logging.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self._fetchUrl(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logging.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
## Getting the chapter list and the meta data, plus 'is adult' checking.
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
# Weirdly, different sites use different warning numbers.
|
||||
# If the title search below fails, there's a good chance
|
||||
# you need a different number. print data at that point
|
||||
# and see what the 'click here to continue' url says.
|
||||
addurl = "&ageconsent=ok&warning=4" # XXX
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
# index=1 makes sure we see the story chapter index. Some
|
||||
# sites skip that for one-chapter stories.
|
||||
url = self.url+'&index=1'+addurl
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if self.needToLoginCheck(data):
|
||||
# need to log in for this one.
|
||||
self.performLogin(url)
|
||||
data = self._fetchUrl(url)
|
||||
|
||||
# The actual text that is used to announce you need to be an
|
||||
# adult varies from site to site. Again, print data before
|
||||
# the title search to troubleshoot.
|
||||
if ("Not suitable for readers under the age of legal consent in their country." in data \
|
||||
or "Not suitable for readers under 16 yrs. \r\nStories may contain violence, slight nudity, and/or sexual situations." in data ) \
|
||||
and not (self.is_adult or self.getConfig("is_adult")): # XXX
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
#print data
|
||||
|
||||
# Now go hunting for all the meta data and the chapter list.
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']+addurl))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
# eFiction sites don't help us out a lot with their meta data
|
||||
# formatting, so it's a little ugly.
|
||||
|
||||
# utility method
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.findAll('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while not defaultGetattr(value,'class') == 'label':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
#self.story.setMetadata('description',stripHTML(svalue))
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=1')) # XXX
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
if 'Warnings' in label:
|
||||
warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=2')) # XXX
|
||||
warningstext = [warning.string for warning in warnings]
|
||||
self.warning = ', '.join(warningstext)
|
||||
for warning in warningstext:
|
||||
self.story.addToList('warnings',warning.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
# grab the text for an individual chapter.
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
|
||||
soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
# span? Really? span? Yeah... I don't think so.
|
||||
div = soup.find('span', {'style' : 'font-size: 100%;'})
|
||||
div.name='div'
|
||||
|
||||
if None == div:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,div)
|
||||
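The `self.decode` list in the adapter above names Windows-1252 before utf8. How the base adapter consumes that list is not shown here, so the following is only a sketch under the assumption that the encodings are tried in order; `decode_with_fallback` is a hypothetical helper, not part of the project.

```python
def decode_with_fallback(raw, encodings=("Windows-1252", "utf8")):
    """Try each encoding in order and return the first successful decode."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    # Last resort: decode with replacement characters instead of failing.
    return raw.decode(encodings[-1], errors="replace")
```

Bytes such as `0x93` and `0x94` are curly quotes in Windows-1252 but control characters in iso-8859-1, which is one reason a page labelled iso-8859-1 often reads better when decoded as Windows-1252.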
252
fanficdownloader/adapters/adapter_thewriterscoffeeshopcom.py
Normal file
|
|
@ -0,0 +1,252 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class TheWritersCoffeeShopComSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','twcs')
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
self.is_adult=False
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/library/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
self.dateformat = "%B %d, %Y"
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.thewriterscoffeeshop.com'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://"+self.getSiteDomain()+"/library/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://"+self.getSiteDomain()+"/library/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/library/user.php?action=login'
|
||||
logging.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self._fetchUrl(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logging.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
addurl = "&ageconsent=ok&warning=3"
|
||||
else:
|
||||
addurl=""
|
||||
|
||||
url = self.url+'&index=1'+addurl
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError, e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if self.needToLoginCheck(data):
|
||||
# need to log in for this one.
|
||||
self.performLogin(url)
|
||||
data = self._fetchUrl(url)
|
||||
|
||||
if "Age Consent Required" in data:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# problems with some stories, but only in calibre. I suspect
|
||||
# issues with different SGML parsers in python. This is a
|
||||
# nasty hack, but it works.
|
||||
data = data[data.index("<body"):]
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/library/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"&chapter=\d+$")):
|
||||
# just in case there's tags, like <i> in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/library/'+chapter['href']+addurl))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.findAll('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while not defaultGetattr(value,'class') == 'label':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
if 'Genre' in label:
|
||||
genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class'))
|
||||
genrestext = [genre.string for genre in genres]
|
||||
self.genre = ', '.join(genrestext)
|
||||
for genre in genrestext:
|
||||
self.story.addToList('genre',genre.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(stripHTML(value), self.dateformat))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/library/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data = self._fetchUrl(url)
|
||||
# problems with some stories, but only in calibre. I suspect
|
||||
# issues with different SGML parsers in python. This is a
|
||||
# nasty hack, but it works.
|
||||
data = data[data.index("<body"):]
|
||||
|
||||
soup = bs.BeautifulStoneSoup(data,
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
span = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if span is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,span)
|
||||
|
||||
def getClass():
|
||||
return TheWritersCoffeeShopComSiteAdapter
|
||||
|
||||
258
fanficdownloader/adapters/adapter_tthfanficorg.py
Normal file
|
|
@@ -0,0 +1,258 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib2
|
||||
import time
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class TwistingTheHellmouthSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','tth')
|
||||
self.dateformat = "%d %b %y"
|
||||
self.is_adult=False
|
||||
self.username = None
|
||||
self.password = None
|
||||
# get storyId from url--url validation guarantees the query is correct
|
||||
m = re.match(self.getSiteURLPattern(),url)
|
||||
if m:
|
||||
self.story.setMetadata('storyId',m.group('id'))
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
# normalized story URL.
|
||||
self._setURL("http://"+self.getSiteDomain()\
|
||||
+"/Story-"+self.story.getMetadata('storyId'))
|
||||
else:
|
||||
raise exceptions.InvalidStoryURL(url,
|
||||
self.getSiteDomain(),
|
||||
self.getSiteExampleURLs())
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.tthfanfic.org'
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://www.tthfanfic.org/Story-5583 http://www.tthfanfic.org/Story-5583/Greywizard+Marked+By+Kane.htm http://www.tthfanfic.org/T-526321777890480578489880055880/Story-26448-15/batzulger+Willow+Rosenberg+and+the+Mind+Riders.htm"
|
||||
|
||||
# http://www.tthfanfic.org/T-526321777848988007890480555880/Story-26448-15/batzulger+Willow+Rosenberg+and+the+Mind+Riders.htm
|
||||
# http://www.tthfanfic.org/Story-5583
|
||||
# http://www.tthfanfic.org/Story-5583/Greywizard+Marked+By+Kane.htm
|
||||
# http://www.tthfanfic.org/story.php?no=26093
|
||||
def getSiteURLPattern(self):
|
||||
return r"http://www.tthfanfic.org(/(T-\d+/)?Story-|/story.php\?no=)(?P<id>\d+)(-\d+)?(/.*)?$"
|
||||
|
||||
# tth won't send you future updates if you aren't 'caught up'
|
||||
# on the story. Login isn't required for FR21, but logging in will
|
||||
# mark stories you've downloaded as 'read' on tth.
|
||||
def performLogin(self):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['urealname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['urealname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['loginsubmit'] = 'Login'
|
||||
|
||||
if not params['password']:
|
||||
return
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/login.php'
|
||||
logging.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['urealname']))
|
||||
|
||||
## need to pull empty login page first to get ctkn and
|
||||
## password name, which are BUSs
|
||||
# <form method='post' action='/login.php' accept-charset="utf-8">
|
||||
# <input type='hidden' name='ctkn' value='4bdf761f5bea06bf4477072afcbd0f8d721d1a4f989c09945a9e87afb7a66de1'/>
|
||||
# <input type='text' id='urealname' name='urealname' value=''/>
|
||||
# <input type='password' id='password' name='6bb3fcd148d148629223690bf19733b8'/>
|
||||
# <input type='submit' value='Login' name='loginsubmit'/>
|
||||
soup = bs.BeautifulSoup(self._fetchUrl(loginUrl))
|
||||
params['ctkn']=soup.find('input', {'name':'ctkn'})['value']
|
||||
params[soup.find('input', {'id':'password'})['name']] = params['password']
|
||||
|
||||
d = self._fetchUrl(loginUrl, params)
|
||||
|
||||
if "Stories Published" not in d : #Member Account
|
||||
logging.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['urealname']))
|
||||
raise exceptions.FailedToLogin(self.url,params['urealname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
# fetch the chapter. From that we will get almost all the
|
||||
# metadata and chapter list
|
||||
|
||||
url=self.url
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
# tth won't send you future updates if you aren't 'caught up'
|
||||
# on the story. Login isn't required for FR21, but logging in will
|
||||
# mark stories you've downloaded as 'read' on tth.
|
||||
self.performLogin()
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
soup = bs.BeautifulSoup(data)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
descurl = url
|
||||
|
||||
if "<h2>Story Not Found</h2>" in data:
|
||||
raise exceptions.StoryDoesNotExist(url)
|
||||
|
||||
if "NOTE: This story is rated FR21 which is above your chosen filter level." in data:
|
||||
if self.is_adult or self.getConfig("is_adult"):
|
||||
form = soup.find('form', {'id':'sitemaxratingform'})
|
||||
params={'ctkn':form.find('input', {'name':'ctkn'})['value'],
|
||||
'sitemaxrating':'5'}
|
||||
logging.info("Attempting to get rating cookie for %s" % url)
|
||||
data = self._postUrl("http://"+self.getSiteDomain()+'/setmaxrating.php',params)
|
||||
# refetch story page.
|
||||
data = self._fetchUrl(url)
|
||||
soup = bs.BeautifulSoup(data)
|
||||
else:
|
||||
raise exceptions.AdultCheckRequired(self.url)
|
||||
|
||||
# http://www.tthfanfic.org/AuthorStories-3449/Greywizard.htm
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"^/AuthorStories-\d+"))
|
||||
self.story.setMetadata('authorId',a['href'].split('/')[1].split('-')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+a['href'])
|
||||
self.story.setMetadata('author',stripHTML(a))
|
||||
|
||||
try:
|
||||
# going to pull part of the meta data from author list page.
|
||||
logging.debug("**AUTHOR** URL: "+self.story.getMetadata('authorUrl'))
|
||||
authordata = self._fetchUrl(self.story.getMetadata('authorUrl'))
|
||||
descurl=self.story.getMetadata('authorUrl')
|
||||
authorsoup = bs.BeautifulSoup(authordata)
|
||||
# author can have several pages, scan until we find it.
|
||||
while( not authorsoup.find('a', href=re.compile(r"^/Story-"+self.story.getMetadata('storyId'))) ):
|
||||
nextpage = 'http://'+self.host+authorsoup.find('a', {'class':'arrowf'})['href']
|
||||
logging.debug("**AUTHOR** nextpage URL: "+nextpage)
|
||||
authordata = self._fetchUrl(nextpage)
|
||||
descurl=nextpage
|
||||
authorsoup = bs.BeautifulSoup(authordata)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
storydiv = authorsoup.find('div', {'id':'st'+self.story.getMetadata('storyId'), 'class':re.compile(r"storylistitem")})
|
||||
self.setDescription(descurl,storydiv.find('div',{'class':'storydesc'}))
|
||||
#self.story.setMetadata('description',stripHTML(storydiv.find('div',{'class':'storydesc'})))
|
||||
self.story.setMetadata('title',stripHTML(storydiv.find('a',{'class':'storylink'})))
|
||||
|
||||
verticaltable = soup.find('table', {'class':'verticaltable'})
|
||||
|
||||
BtVS = True
|
||||
BtVSNonX = False
|
||||
for cat in verticaltable.findAll('a', href=re.compile(r"^/Category-")):
|
||||
if cat.string not in ['General', 'Non-BtVS/AtS Stories', 'BtVS/AtS Non-Crossover', 'Non-BtVS Crossovers']:
|
||||
self.story.addToList('category',cat.string)
|
||||
else:
|
||||
if 'Non-BtVS' in cat.string:
|
||||
BtVS = False
|
||||
if 'BtVS/AtS Non-Crossover' == cat.string:
|
||||
BtVSNonX = True
|
||||
|
||||
verticaltabletds = verticaltable.findAll('td')
|
||||
self.story.setMetadata('rating', verticaltabletds[2].string)
|
||||
self.story.setMetadata('numWords', verticaltabletds[4].string)
|
||||
|
||||
# Complete--if completed.
|
||||
if 'Yes' in verticaltabletds[10].string:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
self.story.setMetadata('datePublished',makeDate(stripHTML(verticaltabletds[8].string), self.dateformat))
|
||||
self.story.setMetadata('dateUpdated',makeDate(stripHTML(verticaltabletds[9].string), self.dateformat))
|
||||
|
||||
for icon in storydiv.find('span',{'class':'storyicons'}).findAll('img'):
|
||||
if( icon['title'] not in ['Non-Crossover'] ) :
|
||||
self.story.addToList('genre',icon['title'])
|
||||
else:
|
||||
if not BtVSNonX:
|
||||
BtVS = False # Don't add BtVS if Non-Crossover, unless it's a BtVS/AtS Non-Crossover
|
||||
|
||||
logging.debug("BtVS: %s BtVSNonX: %s"%(BtVS,BtVSNonX))
|
||||
if BtVS:
|
||||
self.story.addToList('category','Buffy: The Vampire Slayer')
|
||||
|
||||
# Find the chapter selector
|
||||
select = soup.find('select', { 'name' : 'chapnav' } )
|
||||
|
||||
if select is None:
|
||||
# no selector found, so it's a one-chapter story.
|
||||
self.chapterUrls.append((self.story.getMetadata('title'),url))
|
||||
else:
|
||||
allOptions = select.findAll('option')
|
||||
for o in allOptions:
|
||||
url = "http://"+self.host+o['value']
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(o),url))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
pseries = soup.find('p', {'style':'margin-top:0px'})
|
||||
m = re.match(r'This story is No\. (?P<num>\d+) in the series "(?P<series>.+)"\.',
|
||||
pseries.text)
|
||||
if m:
|
||||
self.setSeries(m.group('series'),m.group('num'))
|
||||
|
||||
return
|
||||
|
||||
|
||||
def getChapterText(self, url):
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
soup = bs.BeautifulSoup(self._fetchUrl(url))
|
||||
|
||||
div = soup.find('div', {'id' : 'storyinnerbody'})
|
||||
|
||||
if div is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
# strip out included chapter title, if present, to avoid doubling up.
|
||||
try:
|
||||
div.find('h3').extract()
|
||||
except:
|
||||
pass
|
||||
return self.utf8FromSoup(url,div)
|
||||
|
||||
def getClass():
|
||||
return TwistingTheHellmouthSiteAdapter
|
||||
|
||||
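The `getSiteURLPattern` regex in the tthfanfic.org adapter above accepts the four URL shapes listed in its comments and captures the story id in the named group `id`, which `__init__` then uses to build the normalized `/Story-<id>` URL. A minimal standalone sketch of that matching follows. The pattern is copied from the diff; the helper names `story_id` and `normalized_url` are hypothetical and are not part of the adapter.

```python
import re

# Pattern copied verbatim from getSiteURLPattern() in adapter_tthfanficorg.py.
TTH_URL_PATTERN = (
    r"http://www.tthfanfic.org(/(T-\d+/)?Story-|/story.php\?no=)"
    r"(?P<id>\d+)(-\d+)?(/.*)?$"
)


def story_id(url):
    """Hypothetical helper: return the story id, or None if the URL does not match."""
    m = re.match(TTH_URL_PATTERN, url)
    return m.group('id') if m else None


def normalized_url(url):
    """Hypothetical helper: rebuild the canonical story URL, as __init__ does via _setURL."""
    sid = story_id(url)
    if sid is None:
        return None
    return "http://www.tthfanfic.org/Story-" + sid


if __name__ == "__main__":
    for candidate in (
        "http://www.tthfanfic.org/Story-5583",
        "http://www.tthfanfic.org/Story-5583/Greywizard+Marked+By+Kane.htm",
        "http://www.tthfanfic.org/story.php?no=26093",
    ):
        print(candidate, "->", normalized_url(candidate))
```

Note that the dots in the host name are not escaped in the original pattern, so they match any character; the sketch keeps that behavior rather than silently tightening it.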
|
|
@@ -1,236 +1,250 @@
|
|||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team, 2018 FanFicFare team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
# Software: eFiction
|
||||
from __future__ import absolute_import
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
import re
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
# py2 vs py3 transition
|
||||
from ..six import text_type as unicode
|
||||
|
||||
from .base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class TwilightedNetSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','tw')
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('https://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.twilighted.net'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.twilighted.net','twilighted.net']
|
||||
|
||||
@classmethod
|
||||
def getSiteExampleURLs(cls):
|
||||
return "https://www.twilighted.net/viewstory.php?sid=1234"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return r"https?"+re.escape("://")+r"(www\.)?"+re.escape("twilighted.net/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'https://' + self.getSiteDomain() + '/user.php?action=login'
|
||||
logger.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self.post_request(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logger.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
url = self.url+'&index=1'
|
||||
logger.debug("URL: "+url)
|
||||
|
||||
data = self.get_request(url)
|
||||
|
||||
if self.needToLoginCheck(data):
|
||||
# need to log in for this one.
|
||||
self.performLogin(url)
|
||||
data = self.get_request(url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.AccessDenied(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# problems with some stories, but only in calibre. I suspect
|
||||
# issues with different SGML parsers in python. This is a
|
||||
# nasty hack, but it works.
|
||||
# twilighted isn't writing <body> ??? wtf?
|
||||
data = "<html><body>"+data[data.index("</head>"):]
|
||||
|
||||
soup = self.make_soup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',stripHTML(a))
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','https://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.find_all('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.add_chapter(chapter,'https://'+self.host+'/'+chapter['href'])
|
||||
|
||||
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.find_all('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while 'label' not in defaultGetattr(value,'class'):
|
||||
svalue += unicode(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
## twilighted.net doesn't use genre.
|
||||
# if 'Genre' in label:
|
||||
# genres = labelspan.parent.find_all('a',href=re.compile(r'browse.php\?type=class'))
|
||||
# genrestext = [genre.string for genre in genres]
|
||||
# self.genre = ', '.join(genrestext)
|
||||
# for genre in genrestext:
|
||||
# self.story.addToList('genre',genre.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(value.strip(), "%B %d, %Y"))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(value.strip(), "%B %d, %Y"))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'https://'+self.host+'/'+a['href']
|
||||
|
||||
seriessoup = self.make_soup(self.get_request(series_url))
|
||||
storyas = seriessoup.find_all('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
self.story.setMetadata('seriesUrl',series_url)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logger.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data = self.get_request(url)
|
||||
# problems with some stories, but only in calibre. I suspect
|
||||
# issues with different SGML parsers in python. This is a
|
||||
# nasty hack, but it works.
|
||||
# twilighted isn't writing <body> ??? wtf?
|
||||
data = "<html><body>"+data[data.index("</head>"):]
|
||||
|
||||
soup = self.make_soup(data)
|
||||
|
||||
span = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if span is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,span)
|
||||
|
||||
def getClass():
|
||||
return TwilightedNetSiteAdapter
|
||||
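The adapters hand site-specific `strptime`-style format strings to `makeDate`: `"%d %b %y"` for tthfanfic.org and `"%B %d, %Y"` for twilighted.net. `makeDate` itself is not shown in this part of the diff, so the sketch below uses the standard library's `datetime.strptime` as a stand-in. The function name `parse_site_date` and the sample date strings are assumptions made up for illustration.

```python
from datetime import datetime


def parse_site_date(text, fmt):
    """Stand-in for makeDate (assumption): parse a scraped date with a site-specific format."""
    # Scraped values often carry stray whitespace, hence the strip(),
    # mirroring the value.strip() calls in the Twilighted adapter.
    return datetime.strptime(text.strip(), fmt)


# tthfanfic.org style: day, abbreviated month name, two-digit year.
tth_date = parse_site_date("12 Mar 11", "%d %b %y")

# twilighted.net style: full month name, day, four-digit year.
tw_date = parse_site_date(" March 12, 2011 ", "%B %d, %Y")

if __name__ == "__main__":
    print(tth_date.date(), tw_date.date())
```

Month names are parsed according to the current locale, so this assumes the default English month names.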
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright 2011 Fanficdownloader team
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import time
|
||||
import logging
|
||||
import re
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
from .. import BeautifulSoup as bs
|
||||
from ..htmlcleanup import stripHTML
|
||||
from .. import exceptions as exceptions
|
||||
|
||||
from base_adapter import BaseSiteAdapter, makeDate
|
||||
|
||||
class TwilightedNetSiteAdapter(BaseSiteAdapter):
|
||||
|
||||
def __init__(self, config, url):
|
||||
BaseSiteAdapter.__init__(self, config, url)
|
||||
self.story.setMetadata('siteabbrev','tw')
|
||||
self.decode = ["Windows-1252",
|
||||
"utf8"] # 1252 is a superset of iso-8859-1.
|
||||
# Most sites that claim to be
|
||||
# iso-8859-1 (and some that claim to be
|
||||
# utf8) are really windows-1252.
|
||||
self.story.addToList("category","Twilight")
|
||||
self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
|
||||
self.password = ""
|
||||
|
||||
# get storyId from url--url validation guarantees query is only sid=1234
|
||||
self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
|
||||
logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))
|
||||
|
||||
# normalized story URL.
|
||||
self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))
|
||||
|
||||
|
||||
@staticmethod
|
||||
def getSiteDomain():
|
||||
return 'www.twilighted.net'
|
||||
|
||||
@classmethod
|
||||
def getAcceptDomains(cls):
|
||||
return ['www.twilighted.net','twilighted.net']
|
||||
|
||||
def getSiteExampleURLs(self):
|
||||
return "http://www.twilighted.net/viewstory.php?sid=1234 http://twilighted.net/viewstory.php?sid=5678"
|
||||
|
||||
def getSiteURLPattern(self):
|
||||
return re.escape("http://")+r"(www\.)?"+re.escape("twilighted.net/viewstory.php?sid=")+r"\d+$"
|
||||
|
||||
def needToLoginCheck(self, data):
|
||||
if 'Registered Users Only' in data \
|
||||
or 'There is no such account on our website' in data \
|
||||
or "That password doesn't match the one in our database" in data:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def performLogin(self, url):
|
||||
params = {}
|
||||
|
||||
if self.password:
|
||||
params['penname'] = self.username
|
||||
params['password'] = self.password
|
||||
else:
|
||||
params['penname'] = self.getConfig("username")
|
||||
params['password'] = self.getConfig("password")
|
||||
params['cookiecheck'] = '1'
|
||||
params['submit'] = 'Submit'
|
||||
|
||||
loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
|
||||
logging.debug("Will now login to URL (%s) as (%s)" % (loginUrl,
|
||||
params['penname']))
|
||||
|
||||
d = self._fetchUrl(loginUrl, params)
|
||||
|
||||
if "Member Account" not in d : #Member Account
|
||||
logging.info("Failed to login to URL %s as %s" % (loginUrl,
|
||||
params['penname']))
|
||||
raise exceptions.FailedToLogin(url,params['penname'])
|
||||
return False
|
||||
else:
|
||||
return True
|
||||
|
||||
def extractChapterUrlsAndMetadata(self):
|
||||
|
||||
url = self.url+'&index=1'
|
||||
logging.debug("URL: "+url)
|
||||
|
||||
try:
|
||||
data = self._fetchUrl(url)
|
||||
except urllib2.HTTPError as e:
|
||||
if e.code == 404:
|
||||
raise exceptions.StoryDoesNotExist(self.url)
|
||||
else:
|
||||
raise e
|
||||
|
||||
if self.needToLoginCheck(data):
|
||||
# need to log in for this one.
|
||||
self.performLogin(url)
|
||||
data = self._fetchUrl(url)
|
||||
|
||||
if "Access denied. This story has not been validated by the adminstrators of this site." in data:
|
||||
raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")
|
||||
|
||||
# problems with some stories, but only in calibre. I suspect
|
||||
# issues with different SGML parsers in python. This is a
|
||||
# nasty hack, but it works.
|
||||
# twilighted isn't writing <body> ??? wtf?
|
||||
data = "<html><body>"+data[data.index("</head>"):]
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
soup = bs.BeautifulSoup(data)
|
||||
|
||||
## Title
|
||||
a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
|
||||
self.story.setMetadata('title',a.string)
|
||||
|
||||
# Find authorid and URL from... author url.
|
||||
a = soup.find('a', href=re.compile(r"viewuser.php"))
|
||||
self.story.setMetadata('authorId',a['href'].split('=')[1])
|
||||
self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
|
||||
self.story.setMetadata('author',a.string)
|
||||
|
||||
# Find the chapters:
|
||||
for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
|
||||
# just in case there are tags, like <i>, in chapter titles.
|
||||
self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']))
|
||||
|
||||
self.story.setMetadata('numChapters',len(self.chapterUrls))
|
||||
|
||||
def defaultGetattr(d,k):
|
||||
try:
|
||||
return d[k]
|
||||
except:
|
||||
return ""
|
||||
|
||||
# <span class="label">Rated:</span> NC-17<br /> etc
|
||||
labels = soup.findAll('span',{'class':'label'})
|
||||
for labelspan in labels:
|
||||
value = labelspan.nextSibling
|
||||
label = labelspan.string
|
||||
|
||||
if 'Summary' in label:
|
||||
## Everything until the next span class='label'
|
||||
svalue = ""
|
||||
while not defaultGetattr(value,'class') == 'label':
|
||||
svalue += str(value)
|
||||
value = value.nextSibling
|
||||
self.setDescription(url,svalue)
|
||||
|
||||
if 'Rated' in label:
|
||||
self.story.setMetadata('rating', value)
|
||||
|
||||
if 'Word count' in label:
|
||||
self.story.setMetadata('numWords', value)
|
||||
|
||||
if 'Categories' in label:
|
||||
cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
|
||||
catstext = [cat.string for cat in cats]
|
||||
for cat in catstext:
|
||||
self.story.addToList('category',cat.string)
|
||||
|
||||
if 'Characters' in label:
|
||||
chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
|
||||
charstext = [char.string for char in chars]
|
||||
for char in charstext:
|
||||
self.story.addToList('characters',char.string)
|
||||
|
||||
## twilighted.net doesn't use genre.
|
||||
# if 'Genre' in label:
|
||||
# genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class'))
|
||||
# genrestext = [genre.string for genre in genres]
|
||||
# self.genre = ', '.join(genrestext)
|
||||
# for genre in genrestext:
|
||||
# self.story.addToList('genre',genre.string)
|
||||
|
||||
if 'Completed' in label:
|
||||
if 'Yes' in value:
|
||||
self.story.setMetadata('status', 'Completed')
|
||||
else:
|
||||
self.story.setMetadata('status', 'In-Progress')
|
||||
|
||||
if 'Published' in label:
|
||||
self.story.setMetadata('datePublished', makeDate(value.strip(), "%B %d, %Y"))
|
||||
|
||||
if 'Updated' in label:
|
||||
# there's a stray [ at the end.
|
||||
#value = value[0:-1]
|
||||
self.story.setMetadata('dateUpdated', makeDate(value.strip(), "%B %d, %Y"))
|
||||
|
||||
try:
|
||||
# Find Series name from series URL.
|
||||
a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
|
||||
series_name = a.string
|
||||
series_url = 'http://'+self.host+'/'+a['href']
|
||||
|
||||
# use BeautifulSoup HTML parser to make everything easier to find.
|
||||
seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
|
||||
storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
|
||||
i=1
|
||||
for a in storyas:
|
||||
if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
|
||||
self.setSeries(series_name, i)
|
||||
break
|
||||
i+=1
|
||||
|
||||
except:
|
||||
# I find it hard to care if the series parsing fails
|
||||
pass
|
||||
|
||||
def getChapterText(self, url):
|
||||
|
||||
logging.debug('Getting chapter text from: %s' % url)
|
||||
|
||||
data = self._fetchUrl(url)
|
||||
# problems with some stories, but only in calibre. I suspect
|
||||
# issues with different SGML parsers in python. This is a
|
||||
# nasty hack, but it works.
|
||||
# twilighted isn't writing <body> ??? wtf?
|
||||
data = "<html><body>"+data[data.index("</head>"):]
|
||||
|
||||
soup = bs.BeautifulStoneSoup(data,
|
||||
selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.
|
||||
|
||||
span = soup.find('div', {'id' : 'story'})
|
||||
|
||||
if span is None:
|
||||
raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)
|
||||
|
||||
return self.utf8FromSoup(url,span)
|
||||
|
||||
def getClass():
|
||||
return TwilightedNetSiteAdapter
|
||||
|
||||
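The series lookup at the end of `extractChapterUrlsAndMetadata` above comes down to finding the 1-based position of this story's link among the story links on the series page. A minimal sketch of that step in isolation, written for Python 3 rather than the adapter's Python 2; `series_index` and the sample hrefs are illustrative names, not part of the adapter:

```python
def series_index(hrefs, story_id):
    """Return the 1-based position of the story in the series, or None."""
    # Same comparison the adapter makes against each <a href="..."> value.
    target = 'viewstory.php?sid=' + story_id
    for position, href in enumerate(hrefs, start=1):
        if href == target:
            return position
    return None


# Hypothetical hrefs, in the order they would appear on a series page.
hrefs = ['viewstory.php?sid=10', 'viewstory.php?sid=22', 'viewstory.php?sid=35']
print(series_index(hrefs, '22'))
```

The adapter wraps this step in a bare `try`/`except` and ignores failures. The sketch instead returns `None` when the story is not listed, which makes the "not in a series" case explicit.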
263
fanficdownloader/adapters/adapter_twiwritenet.py
Normal file
@ -0,0 +1,263 @@
# -*- coding: utf-8 -*-

# Copyright 2011 Fanficdownloader team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import time
import logging
import re
import urllib
import urllib2

from .. import BeautifulSoup as bs
from ..htmlcleanup import stripHTML
from .. import exceptions as exceptions

from base_adapter import BaseSiteAdapter, makeDate

class TwiwriteNetSiteAdapter(BaseSiteAdapter):

    def __init__(self, config, url):
        BaseSiteAdapter.__init__(self, config, url)
        self.story.setMetadata('siteabbrev','twrt')
        self.decode = ["Windows-1252",
                       "utf8"] # 1252 is a superset of iso-8859-1.
                               # Most sites that claim to be
                               # iso-8859-1 (and some that claim to be
                               # utf8) are really windows-1252.
        self.story.addToList("category","Twilight")
        self.username = "NoneGiven" # if left empty, twiwrite.net doesn't return any message at all.
        self.password = ""

        # get storyId from url--url validation guarantees query is only sid=1234
        self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])
        logging.debug("storyId: (%s)"%self.story.getMetadata('storyId'))

        # normalized story URL.
        self._setURL('http://' + self.getSiteDomain() + '/viewstory.php?sid='+self.story.getMetadata('storyId'))


    @staticmethod
    def getSiteDomain():
        return 'www.twiwrite.net'

    @classmethod
    def getAcceptDomains(cls):
        return ['www.twiwrite.net','twiwrite.net']

    def getSiteExampleURLs(self):
        return "http://www.twiwrite.net/viewstory.php?sid=1234 http://twiwrite.net/viewstory.php?sid=5678"

    def getSiteURLPattern(self):
        return re.escape("http://")+r"(www\.)?"+re.escape("twiwrite.net/viewstory.php?sid=")+r"\d+$"

    def needToLoginCheck(self, data):
        if 'Registered Users Only' in data \
                or 'There is no such account on our website' in data \
                or "That password doesn't match the one in our database" in data:
            return True
        else:
            return False

    def performLogin(self, url):
        params = {}

        if self.password:
            params['penname'] = self.username
            params['password'] = self.password
        else:
            params['penname'] = self.getConfig("username")
            params['password'] = self.getConfig("password")
        params['cookiecheck'] = '1'
        params['submit'] = 'Submit'

        loginUrl = 'http://' + self.getSiteDomain() + '/user.php?action=login'
        logging.info("Will now login to URL (%s) as (%s)" % (loginUrl,
                                                             params['penname']))

        d = self._fetchUrl(loginUrl, params)

        if "Member Account" not in d : #Member Account
            logging.info("Failed to login to URL %s as %s" % (loginUrl,
                                                              params['penname']))
            raise exceptions.FailedToLogin(url,params['penname'])
            return False
        else:
            return True

    def extractChapterUrlsAndMetadata(self):

        url = self.url+'&index=1'
        logging.debug("URL: "+url)

        try:
            data = self._fetchUrl(url)
        except urllib2.HTTPError, e:
            if e.code == 404:
                raise exceptions.StoryDoesNotExist(self.url)
            else:
                raise e

        if self.needToLoginCheck(data):
            # need to log in for this one.
            self.performLogin(url)
            data = self._fetchUrl(url)

        if "Access denied. This story has not been validated by the adminstrators of this site." in data:
            raise exceptions.FailedToDownload(self.getSiteDomain() +" says: Access denied. This story has not been validated by the adminstrators of this site.")

        # problems with some stories, but only in calibre. I suspect
        # issues with different SGML parsers in python. This is a
        # nasty hack, but it works.
        data = data[data.index("<body"):]

        # use BeautifulSoup HTML parser to make everything easier to find.
        soup = bs.BeautifulSoup(data)

        ## Title
        a = soup.find('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+"$"))
        self.story.setMetadata('title',a.string)

        # Find authorid and URL from... author url.
        a = soup.find('a', href=re.compile(r"viewuser.php\?uid=\d+"))
        self.story.setMetadata('authorId',a['href'].split('=')[1])
        self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])
        self.story.setMetadata('author',a.string)

        # Find the chapters:
        for chapter in soup.findAll('a', href=re.compile(r'viewstory.php\?sid='+self.story.getMetadata('storyId')+r"&chapter=\d+$")):
            # just in case there's tags, like <i> in chapter titles.
            self.chapterUrls.append((stripHTML(chapter),'http://'+self.host+'/'+chapter['href']))

        self.story.setMetadata('numChapters',len(self.chapterUrls))

        ## <meta name='description' content='<p>Description</p> ...' >
        ## Summary, strangely, is in the content attr of a <meta name='description'> tag
        ## which is escaped HTML. Unfortunately, we can't use it because they don't
        ## escape (') chars in the desc, breaking the tag.
        #meta_desc = soup.find('meta',{'name':'description'})
        #metasoup = bs.BeautifulStoneSoup(meta_desc['content'])
        #self.story.setMetadata('description',stripHTML(metasoup))

        def defaultGetattr(d,k):
            try:
                return d[k]
            except:
                return ""

        # <span class="label">Rated:</span> NC-17<br /> etc
        labels = soup.findAll('span',{'class':'label'})
        for labelspan in labels:
            value = labelspan.nextSibling
            label = labelspan.string

            if 'Summary' in label:
                ## Everything until the next span class='label'
                svalue = ""
                while not defaultGetattr(value,'class') == 'label':
                    svalue += str(value)
                    value = value.nextSibling
                self.setDescription(url,svalue)
                #self.story.setMetadata('description',stripHTML(svalue))

            if 'Rated' in label:
                self.story.setMetadata('rating', value)

            if 'Word count' in label:
                self.story.setMetadata('numWords', value)

            if 'Categories' in label:
                cats = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=categories'))
                catstext = [cat.string for cat in cats]
                for cat in catstext:
                    self.story.addToList('category',cat.string)

            if 'Characters' in label:
                chars = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=characters'))
                charstext = [char.string for char in chars]
                for char in charstext:
                    self.story.addToList('characters',char.string)

            if 'Genre' in label:
                genres = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=3'))
                genrestext = [genre.string for genre in genres]
                self.genre = ', '.join(genrestext)
                for genre in genrestext:
                    self.story.addToList('genre',genre.string)

            if 'Warnings' in label:
                warnings = labelspan.parent.findAll('a',href=re.compile(r'browse.php\?type=class&type_id=8'))
                warningstext = [warning.string for warning in warnings]
                self.warning = ', '.join(warningstext)
                for warning in warningstext:
                    self.story.addToList('warning',warning.string)

            if 'Completed' in label:
                if 'Yes' in value:
                    self.story.setMetadata('status', 'Completed')
                else:
                    self.story.setMetadata('status', 'In-Progress')

            if 'Published' in label:
                self.story.setMetadata('datePublished', makeDate(value.strip(), "%B %d, %Y"))

            if 'Updated' in label:
                # there's a stray [ at the end.
                value = value[0:-1]
                self.story.setMetadata('dateUpdated', makeDate(value.strip(), "%B %d, %Y"))

        try:
            # Find Series name from series URL.
            a = soup.find('a', href=re.compile(r"viewseries.php\?seriesid=\d+"))
            series_name = a.string
            series_url = 'http://'+self.host+'/'+a['href']

            # use BeautifulSoup HTML parser to make everything easier to find.
            seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
            storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
            i=1
            for a in storyas:
                if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
                    self.setSeries(series_name, i)
                    break
                i+=1

        except:
            # I find it hard to care if the series parsing fails
            pass

    def getChapterText(self, url):

        logging.debug('Getting chapter text from: %s' % url)

        data = self._fetchUrl(url)
        # problems with some stories, but only in calibre. I suspect
        # issues with different SGML parsers in python. This is a
        # nasty hack, but it works.
        data = data[data.index("<body"):]

        soup = bs.BeautifulStoneSoup(data,
                                     selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.

        span = soup.find('div', {'id' : 'story'})

        if None == span:
            raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)

        return self.utf8FromSoup(url,span)

def getClass():
    return TwiwriteNetSiteAdapter

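The twiwrite adapter above validates the story URL against `getSiteURLPattern()` and then reads the story id from the query string. A minimal sketch of those two steps together, assuming Python 3 (`urllib.parse` in place of the Python 2 `urlparse` module); `SITE_URL_PATTERN` and `story_id_from_url` are illustrative names, not part of the adapter:

```python
import re
from urllib.parse import urlparse

# Same expression the adapter builds in getSiteURLPattern().
SITE_URL_PATTERN = (re.escape("http://") + r"(www\.)?" +
                    re.escape("twiwrite.net/viewstory.php?sid=") + r"\d+$")


def story_id_from_url(url):
    """Return the sid value for a valid story URL, or None otherwise."""
    if not re.match(SITE_URL_PATTERN, url):
        return None
    # Validation guarantees the query is exactly "sid=<digits>".
    return urlparse(url).query.split('=')[1]


print(story_id_from_url("http://www.twiwrite.net/viewstory.php?sid=1234"))
```

Because the pattern ends in `\d+$`, a URL with extra query parameters such as `&chapter=2` is rejected before the split runs, which is why indexing `[1]` after the split is safe here.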
232
fanficdownloader/adapters/adapter_whoficcom.py
Normal file
@ -0,0 +1,232 @@
# -*- coding: utf-8 -*-

# Copyright 2011 Fanficdownloader team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import time
import logging
import re
import urllib2

from .. import BeautifulSoup as bs
from .. import exceptions as exceptions

from base_adapter import BaseSiteAdapter, makeDate

class WhoficComSiteAdapter(BaseSiteAdapter):

    def __init__(self, config, url):
        BaseSiteAdapter.__init__(self, config, url)
        self.story.setMetadata('siteabbrev','whof')
        self.decode = ["Windows-1252",
                       "utf8"] # 1252 is a superset of iso-8859-1.
                               # Most sites that claim to be
                               # iso-8859-1 (and some that claim to be
                               # utf8) are really windows-1252.

    @staticmethod
    def getSiteDomain():
        return 'www.whofic.com'

    def getSiteExampleURLs(self):
        return "http://"+self.getSiteDomain()+"/viewstory.php?sid=1234"

    def getSiteURLPattern(self):
        return re.escape("http://"+self.getSiteDomain()+"/viewstory.php?sid=")+r"\d+$"

    def extractChapterUrlsAndMetadata(self):

        # get storyId from url--url validation guarantees query is only sid=1234
        self.story.setMetadata('storyId',self.parsedUrl.query.split('=',)[1])

        # fetch the first chapter. From that we will:
        # - determine title, authorname, authorid
        # - get chapter list, if not one-shot.

        url = self.url+'&chapter=1'
        logging.debug("URL: "+url)

        # use BeautifulSoup HTML parser to make everything easier to find.
        try:
            soup = bs.BeautifulSoup(self._fetchUrl(url))
        except urllib2.HTTPError, e:
            if e.code == 404:
                raise exceptions.StoryDoesNotExist(self.url)
            else:
                raise e

        # pull title(title) and author from the HTML title.
        title = soup.find('title').string
        logging.debug('Title: %s' % title)
        title = title.split('::')[1].strip()
        self.story.setMetadata('title',title.split(' by ')[0].strip())
        self.story.setMetadata('author',title.split(' by ')[1].strip())

        # Find authorid and URL from... author url.
        a = soup.find('a', href=re.compile(r"viewuser.php"))
        self.story.setMetadata('authorId',a['href'].split('=')[1])
        self.story.setMetadata('authorUrl','http://'+self.host+'/'+a['href'])

        # Find the chapter selector
        select = soup.find('select', { 'name' : 'chapter' } )

        if select is None:
            # no selector found, so it's a one-chapter story.
            self.chapterUrls.append((self.story.getMetadata('title'),url))
        else:
            allOptions = select.findAll('option')
            for o in allOptions:
                url = self.url + "&chapter=%s" % o['value']
                # just in case there's tags, like <i> in chapter titles.
                title = "%s" % o
                title = re.sub(r'<[^>]+>','',title)
                self.chapterUrls.append((title,url))

        self.story.setMetadata('numChapters',len(self.chapterUrls))

        ## Whofic.com puts none of the other meta data in the chapters
        ## or even the story chapter index page. Need to scrape the
        ## author page to find it.

        # <table width="100%" bordercolor="#333399" border="0" cellspacing="0" cellpadding="2"><tr><td>
        # <b><a href="viewstory.php?sid=38220">Accompaniment 2</a></b> by <a href="viewuser.php?uid=12412">clandestinemiscreant</a> [<a href="reviews.php?sid=38220">Reviews</a> - <a href="reviews.php?sid=38220">0</a>] <br>
        # This is a series of short stories written as an accompaniment to Season 2, Season 28 for us oldies, and each is unrelated except for that one factor. Each story is canon, in that it does not change established events at time of airing, based on things mentioned and/or implied and missing or deleted scenes that were not seen in the final aired episodes.<br>
        # <font size="-1"><b><a href="categories.php?catid=15">Tenth Doctor</a></b> - All Ages - None - Humor, Hurt/Comfort, Romance<br>
        # <i>Characters:</i> Rose Tyler<br>
        # <i>Series:</i> None<br>
        # <i>Published:</i> 2010.08.15 - <i>Updated:</i> 2010.08.16 - <i>Chapters:</i> 4 - <i>Completed:</i> Yes - <i>Word Count:</i> 4890 </font>
        # </td></tr></table>

        logging.debug("Author URL: "+self.story.getMetadata('authorUrl'))
        soup = bs.BeautifulStoneSoup(self._fetchUrl(self.story.getMetadata('authorUrl')),
                                     selfClosingTags=('br')) # normalize <br> tags to <br />
        # find this story in the list, parse its metadata based on
        # lots of assumptions about the html, since there's little
        # tagging.
        # Found a story once that had the story URL in the desc for a
        # series on the same author's page. Now using the reviews
        # link instead to find the appropriate metadata.
        a = soup.find('a', href=re.compile(r'reviews.php\?sid='+self.story.getMetadata('storyId')))
        metadata = a.findParent('td')
        metadatachunks = self.utf8FromSoup(None,metadata).split('<br />')
        # process metadata for this story.
        self.setDescription(url,metadatachunks[1])
        #self.story.setMetadata('description', metadatachunks[1])

        # First line of the stuff with ' - ' separators
        moremeta = metadatachunks[2]
        moremeta = re.sub(r'<[^>]+>','',moremeta) # strip tags.

        moremetaparts = moremeta.split(' - ')

        # first part is category--whofic.com has categories
        # Doctor One-11, Torchwood, etc. We're going to
        # prepend any with 'Doctor' or 'Era' (Multi-Era, Other
        # Era) as 'Doctor Who'.
        #
        # Also push each in as 'extra tags'.
        category = moremetaparts[0]
        if 'Doctor' in category or 'Era' in category :
            self.story.addToList('category','Doctor Who')

        for cat in category.split(', '):
            self.story.addToList('category',cat)

        # next in that line is age rating.
        self.story.setMetadata('rating',moremetaparts[1])

        # after that is a possible list of specific warnings,
        # Explicit Violence, Swearing, etc
        if "None" not in moremetaparts[2]:
            for warn in moremetaparts[2].split(', '):
                self.story.addToList('warnings',warn)

        # then genre. It's another comma list. All together
        # in genre, plus each in extra tags.
        genre=moremetaparts[3]
        for g in genre.split(r', '):
            self.story.addToList('genre',g)

        # line 3 is characters.
        chars = metadatachunks[3]
        charsearch="<i>Characters:</i>"
        if charsearch in chars:
            chars = chars[metadatachunks[3].index(charsearch)+len(charsearch):]
            for c in chars.split(','):
                if c.strip() != u'None':
                    # strip the whitespace left over from the comma split.
                    self.story.addToList('characters',c.strip())

        # the next line is stuff with ' - ' separators *and* names--with tags.
        moremeta = metadatachunks[5]
        moremeta = re.sub(r'<[^>]+>','',moremeta) # strip tags.

        moremetaparts = moremeta.split(' - ')

        for part in moremetaparts:
            (name,value) = part.split(': ')
            name=name.strip()
            value=value.strip()
            if name == 'Published':
                self.story.setMetadata('datePublished', makeDate(value, '%Y.%m.%d'))
            if name == 'Updated':
                self.story.setMetadata('dateUpdated', makeDate(value, '%Y.%m.%d'))
            if name == 'Completed':
                if value == 'Yes':
                    self.story.setMetadata('status', 'Completed')
                else:
                    self.story.setMetadata('status', 'In-Progress')
            if name == 'Word Count':
                self.story.setMetadata('numWords', value)

        try:
            # Find Series name from series URL.
            a = metadata.find('a', href=re.compile(r"series.php\?seriesid=\d+"))
            series_name = a.string
            series_url = 'http://'+self.host+'/'+a['href']

            # use BeautifulSoup HTML parser to make everything easier to find.
            seriessoup = bs.BeautifulSoup(self._fetchUrl(series_url))
            storyas = seriessoup.findAll('a', href=re.compile(r'^viewstory.php\?sid=\d+$'))
            i=1
            for a in storyas:
                if a['href'] == ('viewstory.php?sid='+self.story.getMetadata('storyId')):
                    self.setSeries(series_name, i)
                    break
                i+=1

        except:
            # I find it hard to care if the series parsing fails
            pass

    def getChapterText(self, url):

        logging.debug('Getting chapter text from: %s' % url)

        soup = bs.BeautifulStoneSoup(self._fetchUrl(url),
                                     selfClosingTags=('br','hr')) # otherwise soup eats the br/hr tags.


        # hardly a great identifier, I know, but whofic really doesn't
        # give us anything better to work with.
        span = soup.find('span', {'style' : 'font-size: 100%;'})

        if None == span:
            raise exceptions.FailedToDownload("Error downloading Chapter: %s! Missing required element!" % url)

        return self.utf8FromSoup(url,span)

def getClass():
    return WhoficComSiteAdapter

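The whofic adapter above recovers most metadata from one untagged line of the author page: it strips the tags, splits on `' - '`, then splits each part into a name and a value. A minimal sketch of that parsing, assuming Python 3 and using `datetime.strptime` as a stand-in for the project's `makeDate` helper; `parse_meta_line` is an illustrative name, and the sample line is the one quoted in the adapter's own comment:

```python
import re
from datetime import datetime


def parse_meta_line(line):
    """Split 'Name: value - Name: value' text into a dict, ignoring tags."""
    text = re.sub(r'<[^>]+>', '', line)  # strip tags, as the adapter does
    fields = {}
    for part in text.split(' - '):
        name, value = part.split(': ', 1)
        fields[name.strip()] = value.strip()
    return fields


sample = ('<i>Published:</i> 2010.08.15 - <i>Updated:</i> 2010.08.16 - '
          '<i>Chapters:</i> 4 - <i>Completed:</i> Yes - '
          '<i>Word Count:</i> 4890 </font>')
fields = parse_meta_line(sample)
published = datetime.strptime(fields['Published'], '%Y.%m.%d')
status = 'Completed' if fields['Completed'] == 'Yes' else 'In-Progress'
print(published.date(), status, fields['Word Count'])
```

The sketch passes `1` as the split limit so that a value containing `': '` would not break the unpacking. The adapter itself splits without a limit and relies on the site never producing such a value.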
364
fanficdownloader/adapters/base_adapter.py
Normal file
@ -0,0 +1,364 @@
# -*- coding: utf-8 -*-

# Copyright 2011 Fanficdownloader team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import re
import datetime
import time
import logging
import urllib
import urllib2 as u2
import urlparse as up
from functools import partial

from .. import BeautifulSoup as bs
from ..htmlcleanup import stripHTML

try:
    from google.appengine.api import apiproxy_stub_map
    def urlfetch_timeout_hook(service, call, request, response):
        if call != 'Fetch':
            return
        # Make the default deadline 10 seconds instead of 5.
        if not request.has_deadline():
            request.set_deadline(10.0)

    apiproxy_stub_map.apiproxy.GetPreCallHooks().Append(
        'urlfetch_timeout_hook', urlfetch_timeout_hook, 'urlfetch')
    logging.info("Hook to make default deadline 10.0 installed.")
except:
    pass
    #logging.info("Hook to make default deadline 10.0 NOT installed--not using appengine")

from ..story import Story
from ..gziphttp import GZipProcessor
from ..configurable import Configurable
from ..htmlcleanup import removeEntities, removeAllEntities, stripHTML
from ..exceptions import InvalidStoryURL

try:
    from .. import chardet as chardet
except ImportError:
    chardet = None

class BaseSiteAdapter(Configurable):

    @classmethod
    def matchesSite(cls,site):
        return site in cls.getAcceptDomains()

    @classmethod
    def getAcceptDomains(cls):
        return [cls.getSiteDomain()]

    def validateURL(self):
        return re.match(self.getSiteURLPattern(), self.url)

    def __init__(self, config, url):
        self.config = config
        Configurable.__init__(self, config)
        self.setSectionOrder(self.getSiteDomain())
        # self.addConfigSection(self.getSiteDomain())
        # self.addConfigSection("overrides")

        self.username = "NoneGiven" # if left empty, site doesn't return any message at all.
        self.password = ""
        self.is_adult=False

        self.opener = u2.build_opener(u2.HTTPCookieProcessor(),GZipProcessor())
        self.storyDone = False
        self.metadataDone = False
        self.story = Story()
        self.story.setMetadata('site',self.getSiteDomain())
        self.story.setMetadata('dateCreated',datetime.datetime.now())
        self.chapterUrls = [] # tuples of (chapter title,chapter url)
        self.chapterFirst = None
        self.chapterLast = None
        self.oldchapters = None
        self.oldimgs = None
        ## order of preference for decoding.
        self.decode = ["utf8",
                       "Windows-1252"] # 1252 is a superset of
                                       # iso-8859-1. Most sites that
                                       # claim to be iso-8859-1 (and
                                       # some that claim to be utf8)
                                       # are really windows-1252.
        self._setURL(url)
        if not self.validateURL():
            raise InvalidStoryURL(url,
                                  self.getSiteDomain(),
                                  self.getSiteExampleURLs())

    def _setURL(self,url):
        self.url = url
        self.parsedUrl = up.urlparse(url)
        self.host = self.parsedUrl.netloc
        self.path = self.parsedUrl.path
        self.story.setMetadata('storyUrl',self.url)

    ## website encoding(s)--in theory, each website reports the character
    ## encoding they use for each page. In practice, some sites report it
    ## incorrectly. Each adapter has a default list, usually "utf8,
    ## Windows-1252" or "Windows-1252, utf8". The special value 'auto'
    ## will call chardet and use the encoding it reports if it has more
    ## than 90% confidence. 'auto' is not reliable.
    def _decode(self,data):
        if self.getConfig('website_encodings'):
            decode = self.getConfigList('website_encodings')
        else:
            decode = self.decode

        for code in decode:
            try:
                #print code
                if code == "auto":
                    if not chardet:
                        logging.info("chardet not available, skipping 'auto' encoding")
                        continue
                    detected = chardet.detect(data)
                    #print detected
                    if detected['confidence'] > 0.9:
                        code=detected['encoding']
                    else:
                        continue
                return data.decode(code)
            except:
                logging.debug("code failed:"+code)
                pass
        logging.info("Could not decode story, tried:%s Stripping non-ASCII."%decode)
        return "".join([x for x in data if ord(x) < 128])

    # Assumes application/x-www-form-urlencoded. parameters, headers are dict()s
    def _postUrl(self, url, parameters={}, headers={}):
        if self.getConfig('slow_down_sleep_time'):
            time.sleep(float(self.getConfig('slow_down_sleep_time')))

        # copy so neither the shared default dict nor the caller's dict is mutated.
        headers = dict(headers)

        ## u2.Request assumes POST when data!=None. Also assumes data
        ## is application/x-www-form-urlencoded.
        if 'Content-type' not in headers:
            headers['Content-type']='application/x-www-form-urlencoded'
        if 'Accept' not in headers:
            headers['Accept']="text/html,*/*"
        req = u2.Request(url,
                         data=urllib.urlencode(parameters),
                         headers=headers)
        return self._decode(self.opener.open(req).read())

    def _fetchUrlRaw(self, url, parameters=None):
        if parameters != None:
            return self.opener.open(url,urllib.urlencode(parameters)).read()
        else:
            return self.opener.open(url).read()

    # parameters is a dict()
    def _fetchUrl(self, url, parameters=None):
        if self.getConfig('slow_down_sleep_time'):
            time.sleep(float(self.getConfig('slow_down_sleep_time')))

        excpt=None
        for sleeptime in [0, 0.5, 4, 9]:
            time.sleep(sleeptime)
            try:
                return self._decode(self._fetchUrlRaw(url,parameters))
            except Exception, e:
                excpt=e
                logging.warn("Caught an exception reading URL: %s Exception %s."%(unicode(url),unicode(e)))

        logging.error("Giving up on %s" %url)
        logging.exception(excpt)
        raise(excpt)

    # Limit chapters to download. Input starts at 1, list starts at 0
    def setChaptersRange(self,first=None,last=None):
        if first:
            self.chapterFirst=int(first)-1
        if last:
            self.chapterLast=int(last)-1

    # Does the download the first time it's called.
    def getStory(self):
        if not self.storyDone:
            self.getStoryMetadataOnly()

            for index, (title,url) in enumerate(self.chapterUrls):
                if (self.chapterFirst!=None and index < self.chapterFirst) or \
                        (self.chapterLast!=None and index > self.chapterLast):
                    self.story.addChapter(removeEntities(title),
                                          None)
                else:
                    if self.oldchapters and index < len(self.oldchapters):
                        data = self.utf8FromSoup(None,
                                                 self.oldchapters[index],
                                                 partial(cachedfetch,self._fetchUrlRaw,self.oldimgs))
                    else:
                        data = self.getChapterText(url)
                    self.story.addChapter(removeEntities(title),
                                          removeEntities(data))
            self.storyDone = True

            # include image, but no cover from story, add default_cover_image cover.
            if self.getConfig('include_images') and \
                    not self.story.cover and \
                    self.getConfig('default_cover_image'):
                self.story.addImgUrl(self,
                                     None,
                                     self.getConfig('default_cover_image'),
                                     self._fetchUrlRaw,
                                     cover=True)
        return self.story

    def getStoryMetadataOnly(self):
        if not self.metadataDone:
            self.extractChapterUrlsAndMetadata()
            self.metadataDone = True
        return self.story

    ###############################

    @staticmethod
    def getSiteDomain():
        "Needs to be overridden in each adapter class."
        return 'no such domain'

    ## URL pattern validation is done *after* picking an adapter based
    ## on domain instead of *as* the adapter selector so we can offer
    ## the user example(s) for that particular site.
    ## Override validateURL(self) instead if you need more control.
    def getSiteURLPattern(self):
        "Used to validate URL. Should be overridden in each adapter class."
        return '^http://'+re.escape(self.getSiteDomain())

    def getSiteExampleURLs(self):
        """
        Needs to be overridden in each adapter class. It's the adapter
        writer's responsibility to make sure the example(s) pass the
        URL validation.
        """
        return 'no such example'

    def extractChapterUrlsAndMetadata(self):
        "Needs to be overridden in each adapter class. Populates self.story metadata and self.chapterUrls"
        pass

    def getChapterText(self, url):
        "Needs to be overridden in each adapter class."
        pass

    # Just for series, in case we choose to change how it's stored or represented later.
    def setSeries(self,name,num):
        if self.getConfig('collect_series'):
            self.story.setMetadata('series','%s [%s]'%(name, num))

    def setDescription(self,url,svalue):
        #print("\n\nsvalue:\n%s\n"%svalue)
        if self.getConfig('keep_summary_html'):
            if isinstance(svalue,str) or isinstance(svalue,unicode):
                svalue = bs.BeautifulSoup(svalue)
            self.story.setMetadata('description',self.utf8FromSoup(url,svalue))
        else:
            self.story.setMetadata('description',stripHTML(svalue))
        #print("\n\ndescription:\n"+self.story.getMetadata('description')+"\n\n")

# This gives us a unicode object, not just a string containing bytes.
|
||||
# (I gave soup a unicode string, you'd think it could give it back...)
|
||||
# Now also does a bunch of other common processing for us.
|
||||
def utf8FromSoup(self,url,soup,fetch=None):
|
||||
if not fetch:
|
||||
fetch=self._fetchUrlRaw
|
||||
|
||||
acceptable_attributes = ['href','name']
|
||||
#print("include_images:"+self.getConfig('include_images'))
|
||||
if self.getConfig('include_images'):
|
||||
acceptable_attributes.extend(('src','alt','longdesc'))
|
||||
for img in soup.findAll('img'):
|
||||
# some pre-existing epubs have img tags that had src stripped off.
|
||||
if img.has_key('src'):
|
||||
img['longdesc']=img['src']
|
||||
img['src']=self.story.addImgUrl(self,url,img['src'],fetch)
|
||||
|
||||
for attr in soup._getAttrMap().keys():
|
||||
if attr not in acceptable_attributes:
|
||||
del soup[attr] ## strip all tag attributes except href and name
|
||||
|
||||
for t in soup.findAll(recursive=True):
|
||||
for attr in t._getAttrMap().keys():
|
||||
if attr not in acceptable_attributes:
|
||||
del t[attr] ## strip all tag attributes except href and name
|
||||
|
||||
# these are not acceptable strict XHTML. But we do already have
|
||||
# CSS classes of the same names defined
|
||||
if t.name == 'u':
|
||||
t['class']=t.name
|
||||
t.name='span'
|
||||
if t.name == 'center':
|
||||
t['class']=t.name
|
||||
t.name='div'
|
||||
# removes paired, but empty tags.
|
||||
if t.string is not None and len(t.string.strip()) == 0:
|
||||
t.extract()
|
||||
|
||||
retval = soup.__str__('utf8').decode('utf-8')
|
||||
|
||||
if self.getConfig('replace_hr'):
|
||||
# replacing a self-closing tag with a container tag in the
|
||||
# soup is more difficult than it first appears. So cheat.
|
||||
retval = retval.replace("<hr />","<div class='center'>* * *</div>")
|
||||
|
||||
if self.getConfig('nook_img_fix'):
|
||||
# if the <img> tag doesn't have a div or a p around it,
|
||||
# nook gets confused and displays it on every page after
|
||||
# that under the text for the rest of the chapter.
|
||||
retval = re.sub(r"(?!<(div|p)>)\s*(?P<imgtag><img[^>]+>)\s*(?!</(div|p)>)",
|
||||
r"<div>\g<imgtag></div>",retval)
|
||||
|
||||
# Don't want body tags in chapter html--writers add them.
|
||||
# This is primarily for epub updates.
|
||||
return re.sub(r"</?body>\r?\n?","",retval)
|
||||
|
||||
fullmon = {"January":"01", "February":"02", "March":"03", "April":"04", "May":"05",
|
||||
"June":"06","July":"07", "August":"08", "September":"09", "October":"10",
|
||||
"November":"11", "December":"12" }
|
||||
|
||||
def cachedfetch(realfetch,cache,url):
|
||||
if url in cache:
|
||||
print("cache hit")
|
||||
return cache[url]
|
||||
else:
|
||||
return realfetch(url)
|
||||
|
||||
|
||||
def makeDate(string,format):
|
||||
# Surprise! Abstracting this turned out to be more useful than
|
||||
# just saving bytes.
|
||||
|
||||
# fudge English month names for people whose locale is set to
|
||||
# non-English.  All our current sites date in English, even if
|
||||
# there's non-English content. -- ficbook.net now makes that a
|
||||
# lie. It has to do something even more complicated to get
|
||||
# Russian month names correct everywhere.
|
||||
do_abbrev = "%b" in format
|
||||
|
||||
if "%B" in format or do_abbrev:
|
||||
format = format.replace("%B","%m").replace("%b","%m")
|
||||
for (name,num) in fullmon.items():
|
||||
if do_abbrev:
|
||||
name = name[:3] # first three for abbrev
|
||||
if name in string:
|
||||
string = string.replace(name,num)
|
||||
break
|
||||
|
||||
return datetime.datetime.strptime(string,format)
|
||||
|
||||
|
|
@ -1,36 +1,26 @@
|
|||
######################## BEGIN LICENSE BLOCK ########################
|
||||
# Contributor(s):
|
||||
# Dan Blanchard
|
||||
# Ian Cordasco
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
import sys
|
||||
__version__ = "2.0.1"
|
||||
|
||||
|
||||
if sys.version_info < (3, 0):
|
||||
PY2 = True
|
||||
PY3 = False
|
||||
string_types = (str, unicode)
|
||||
text_type = unicode
|
||||
iteritems = dict.iteritems
|
||||
else:
|
||||
PY2 = False
|
||||
PY3 = True
|
||||
string_types = (bytes, str)
|
||||
text_type = str
|
||||
iteritems = dict.items
|
||||
def detect(aBuf):
|
||||
import universaldetector
|
||||
u = universaldetector.UniversalDetector()
|
||||
u.reset()
|
||||
u.feed(aBuf)
|
||||
u.close()
|
||||
return u.result
|
||||
923
fanficdownloader/chardet/big5freq.py
Normal file
|
|
@ -0,0 +1,923 @@
|
|||
######################## BEGIN LICENSE BLOCK ########################
|
||||
# The Original Code is Mozilla Communicator client code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 1998
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Mark Pilgrim - port to Python
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
# Big5 frequency table
|
||||
# by Taiwan's Mandarin Promotion Council
|
||||
# <http://www.edu.tw:81/mandr/>
|
||||
#
|
||||
# 128 --> 0.42261
|
||||
# 256 --> 0.57851
|
||||
# 512 --> 0.74851
|
||||
# 1024 --> 0.89384
|
||||
# 2048 --> 0.97583
|
||||
#
|
||||
# Ideal Distribution Ratio = 0.74851/(1-0.74851) =2.98
|
||||
# Random Distribution Ratio = 512/(5401-512)=0.105
|
||||
#
|
||||
# Typical Distribution Ratio about 25% of Ideal one, still much higher than RDR
|
||||
|
||||
BIG5_TYPICAL_DISTRIBUTION_RATIO = 0.75
|
||||
|
||||
#Char to FreqOrder table
|
||||
BIG5_TABLE_SIZE = 5376
|
||||
|
||||
Big5CharToFreqOrder = ( \
|
||||
1,1801,1506, 255,1431, 198, 9, 82, 6,5008, 177, 202,3681,1256,2821, 110, # 16
|
||||
3814, 33,3274, 261, 76, 44,2114, 16,2946,2187,1176, 659,3971, 26,3451,2653, # 32
|
||||
1198,3972,3350,4202, 410,2215, 302, 590, 361,1964, 8, 204, 58,4510,5009,1932, # 48
|
||||
63,5010,5011, 317,1614, 75, 222, 159,4203,2417,1480,5012,3555,3091, 224,2822, # 64
|
||||
3682, 3, 10,3973,1471, 29,2787,1135,2866,1940, 873, 130,3275,1123, 312,5013, # 80
|
||||
4511,2052, 507, 252, 682,5014, 142,1915, 124, 206,2947, 34,3556,3204, 64, 604, # 96
|
||||
5015,2501,1977,1978, 155,1991, 645, 641,1606,5016,3452, 337, 72, 406,5017, 80, # 112
|
||||
630, 238,3205,1509, 263, 939,1092,2654, 756,1440,1094,3453, 449, 69,2987, 591, # 128
|
||||
179,2096, 471, 115,2035,1844, 60, 50,2988, 134, 806,1869, 734,2036,3454, 180, # 144
|
||||
995,1607, 156, 537,2907, 688,5018, 319,1305, 779,2145, 514,2379, 298,4512, 359, # 160
|
||||
2502, 90,2716,1338, 663, 11, 906,1099,2553, 20,2441, 182, 532,1716,5019, 732, # 176
|
||||
1376,4204,1311,1420,3206, 25,2317,1056, 113, 399, 382,1950, 242,3455,2474, 529, # 192
|
||||
3276, 475,1447,3683,5020, 117, 21, 656, 810,1297,2300,2334,3557,5021, 126,4205, # 208
|
||||
706, 456, 150, 613,4513, 71,1118,2037,4206, 145,3092, 85, 835, 486,2115,1246, # 224
|
||||
1426, 428, 727,1285,1015, 800, 106, 623, 303,1281,5022,2128,2359, 347,3815, 221, # 240
|
||||
3558,3135,5023,1956,1153,4207, 83, 296,1199,3093, 192, 624, 93,5024, 822,1898, # 256
|
||||
2823,3136, 795,2065, 991,1554,1542,1592, 27, 43,2867, 859, 139,1456, 860,4514, # 272
|
||||
437, 712,3974, 164,2397,3137, 695, 211,3037,2097, 195,3975,1608,3559,3560,3684, # 288
|
||||
3976, 234, 811,2989,2098,3977,2233,1441,3561,1615,2380, 668,2077,1638, 305, 228, # 304
|
||||
1664,4515, 467, 415,5025, 262,2099,1593, 239, 108, 300, 200,1033, 512,1247,2078, # 320
|
||||
5026,5027,2176,3207,3685,2682, 593, 845,1062,3277, 88,1723,2038,3978,1951, 212, # 336
|
||||
266, 152, 149, 468,1899,4208,4516, 77, 187,5028,3038, 37, 5,2990,5029,3979, # 352
|
||||
5030,5031, 39,2524,4517,2908,3208,2079, 55, 148, 74,4518, 545, 483,1474,1029, # 368
|
||||
1665, 217,1870,1531,3138,1104,2655,4209, 24, 172,3562, 900,3980,3563,3564,4519, # 384
|
||||
32,1408,2824,1312, 329, 487,2360,2251,2717, 784,2683, 4,3039,3351,1427,1789, # 400
|
||||
188, 109, 499,5032,3686,1717,1790, 888,1217,3040,4520,5033,3565,5034,3352,1520, # 416
|
||||
3687,3981, 196,1034, 775,5035,5036, 929,1816, 249, 439, 38,5037,1063,5038, 794, # 432
|
||||
3982,1435,2301, 46, 178,3278,2066,5039,2381,5040, 214,1709,4521, 804, 35, 707, # 448
|
||||
324,3688,1601,2554, 140, 459,4210,5041,5042,1365, 839, 272, 978,2262,2580,3456, # 464
|
||||
2129,1363,3689,1423, 697, 100,3094, 48, 70,1231, 495,3139,2196,5043,1294,5044, # 480
|
||||
2080, 462, 586,1042,3279, 853, 256, 988, 185,2382,3457,1698, 434,1084,5045,3458, # 496
|
||||
314,2625,2788,4522,2335,2336, 569,2285, 637,1817,2525, 757,1162,1879,1616,3459, # 512
|
||||
287,1577,2116, 768,4523,1671,2868,3566,2526,1321,3816, 909,2418,5046,4211, 933, # 528
|
||||
3817,4212,2053,2361,1222,4524, 765,2419,1322, 786,4525,5047,1920,1462,1677,2909, # 544
|
||||
1699,5048,4526,1424,2442,3140,3690,2600,3353,1775,1941,3460,3983,4213, 309,1369, # 560
|
||||
1130,2825, 364,2234,1653,1299,3984,3567,3985,3986,2656, 525,1085,3041, 902,2001, # 576
|
||||
1475, 964,4527, 421,1845,1415,1057,2286, 940,1364,3141, 376,4528,4529,1381, 7, # 592
|
||||
2527, 983,2383, 336,1710,2684,1846, 321,3461, 559,1131,3042,2752,1809,1132,1313, # 608
|
||||
265,1481,1858,5049, 352,1203,2826,3280, 167,1089, 420,2827, 776, 792,1724,3568, # 624
|
||||
4214,2443,3281,5050,4215,5051, 446, 229, 333,2753, 901,3818,1200,1557,4530,2657, # 640
|
||||
1921, 395,2754,2685,3819,4216,1836, 125, 916,3209,2626,4531,5052,5053,3820,5054, # 656
|
||||
5055,5056,4532,3142,3691,1133,2555,1757,3462,1510,2318,1409,3569,5057,2146, 438, # 672
|
||||
2601,2910,2384,3354,1068, 958,3043, 461, 311,2869,2686,4217,1916,3210,4218,1979, # 688
|
||||
383, 750,2755,2627,4219, 274, 539, 385,1278,1442,5058,1154,1965, 384, 561, 210, # 704
|
||||
98,1295,2556,3570,5059,1711,2420,1482,3463,3987,2911,1257, 129,5060,3821, 642, # 720
|
||||
523,2789,2790,2658,5061, 141,2235,1333, 68, 176, 441, 876, 907,4220, 603,2602, # 736
|
||||
710, 171,3464, 404, 549, 18,3143,2398,1410,3692,1666,5062,3571,4533,2912,4534, # 752
|
||||
5063,2991, 368,5064, 146, 366, 99, 871,3693,1543, 748, 807,1586,1185, 22,2263, # 768
|
||||
379,3822,3211,5065,3212, 505,1942,2628,1992,1382,2319,5066, 380,2362, 218, 702, # 784
|
||||
1818,1248,3465,3044,3572,3355,3282,5067,2992,3694, 930,3283,3823,5068, 59,5069, # 800
|
||||
585, 601,4221, 497,3466,1112,1314,4535,1802,5070,1223,1472,2177,5071, 749,1837, # 816
|
||||
690,1900,3824,1773,3988,1476, 429,1043,1791,2236,2117, 917,4222, 447,1086,1629, # 832
|
||||
5072, 556,5073,5074,2021,1654, 844,1090, 105, 550, 966,1758,2828,1008,1783, 686, # 848
|
||||
1095,5075,2287, 793,1602,5076,3573,2603,4536,4223,2948,2302,4537,3825, 980,2503, # 864
|
||||
544, 353, 527,4538, 908,2687,2913,5077, 381,2629,1943,1348,5078,1341,1252, 560, # 880
|
||||
3095,5079,3467,2870,5080,2054, 973, 886,2081, 143,4539,5081,5082, 157,3989, 496, # 896
|
||||
4224, 57, 840, 540,2039,4540,4541,3468,2118,1445, 970,2264,1748,1966,2082,4225, # 912
|
||||
3144,1234,1776,3284,2829,3695, 773,1206,2130,1066,2040,1326,3990,1738,1725,4226, # 928
|
||||
279,3145, 51,1544,2604, 423,1578,2131,2067, 173,4542,1880,5083,5084,1583, 264, # 944
|
||||
610,3696,4543,2444, 280, 154,5085,5086,5087,1739, 338,1282,3096, 693,2871,1411, # 960
|
||||
1074,3826,2445,5088,4544,5089,5090,1240, 952,2399,5091,2914,1538,2688, 685,1483, # 976
|
||||
4227,2475,1436, 953,4228,2055,4545, 671,2400, 79,4229,2446,3285, 608, 567,2689, # 992
|
||||
3469,4230,4231,1691, 393,1261,1792,2401,5092,4546,5093,5094,5095,5096,1383,1672, # 1008
|
||||
3827,3213,1464, 522,1119, 661,1150, 216, 675,4547,3991,1432,3574, 609,4548,2690, # 1024
|
||||
2402,5097,5098,5099,4232,3045, 0,5100,2476, 315, 231,2447, 301,3356,4549,2385, # 1040
|
||||
5101, 233,4233,3697,1819,4550,4551,5102, 96,1777,1315,2083,5103, 257,5104,1810, # 1056
|
||||
3698,2718,1139,1820,4234,2022,1124,2164,2791,1778,2659,5105,3097, 363,1655,3214, # 1072
|
||||
5106,2993,5107,5108,5109,3992,1567,3993, 718, 103,3215, 849,1443, 341,3357,2949, # 1088
|
||||
1484,5110,1712, 127, 67, 339,4235,2403, 679,1412, 821,5111,5112, 834, 738, 351, # 1104
|
||||
2994,2147, 846, 235,1497,1881, 418,1993,3828,2719, 186,1100,2148,2756,3575,1545, # 1120
|
||||
1355,2950,2872,1377, 583,3994,4236,2581,2995,5113,1298,3699,1078,2557,3700,2363, # 1136
|
||||
78,3829,3830, 267,1289,2100,2002,1594,4237, 348, 369,1274,2197,2178,1838,4552, # 1152
|
||||
1821,2830,3701,2757,2288,2003,4553,2951,2758, 144,3358, 882,4554,3995,2759,3470, # 1168
|
||||
4555,2915,5114,4238,1726, 320,5115,3996,3046, 788,2996,5116,2831,1774,1327,2873, # 1184
|
||||
3997,2832,5117,1306,4556,2004,1700,3831,3576,2364,2660, 787,2023, 506, 824,3702, # 1200
|
||||
534, 323,4557,1044,3359,2024,1901, 946,3471,5118,1779,1500,1678,5119,1882,4558, # 1216
|
||||
165, 243,4559,3703,2528, 123, 683,4239, 764,4560, 36,3998,1793, 589,2916, 816, # 1232
|
||||
626,1667,3047,2237,1639,1555,1622,3832,3999,5120,4000,2874,1370,1228,1933, 891, # 1248
|
||||
2084,2917, 304,4240,5121, 292,2997,2720,3577, 691,2101,4241,1115,4561, 118, 662, # 1264
|
||||
5122, 611,1156, 854,2386,1316,2875, 2, 386, 515,2918,5123,5124,3286, 868,2238, # 1280
|
||||
1486, 855,2661, 785,2216,3048,5125,1040,3216,3578,5126,3146, 448,5127,1525,5128, # 1296
|
||||
2165,4562,5129,3833,5130,4242,2833,3579,3147, 503, 818,4001,3148,1568, 814, 676, # 1312
|
||||
1444, 306,1749,5131,3834,1416,1030, 197,1428, 805,2834,1501,4563,5132,5133,5134, # 1328
|
||||
1994,5135,4564,5136,5137,2198, 13,2792,3704,2998,3149,1229,1917,5138,3835,2132, # 1344
|
||||
5139,4243,4565,2404,3580,5140,2217,1511,1727,1120,5141,5142, 646,3836,2448, 307, # 1360
|
||||
5143,5144,1595,3217,5145,5146,5147,3705,1113,1356,4002,1465,2529,2530,5148, 519, # 1376
|
||||
5149, 128,2133, 92,2289,1980,5150,4003,1512, 342,3150,2199,5151,2793,2218,1981, # 1392
|
||||
3360,4244, 290,1656,1317, 789, 827,2365,5152,3837,4566, 562, 581,4004,5153, 401, # 1408
|
||||
4567,2252, 94,4568,5154,1399,2794,5155,1463,2025,4569,3218,1944,5156, 828,1105, # 1424
|
||||
4245,1262,1394,5157,4246, 605,4570,5158,1784,2876,5159,2835, 819,2102, 578,2200, # 1440
|
||||
2952,5160,1502, 436,3287,4247,3288,2836,4005,2919,3472,3473,5161,2721,2320,5162, # 1456
|
||||
5163,2337,2068, 23,4571, 193, 826,3838,2103, 699,1630,4248,3098, 390,1794,1064, # 1472
|
||||
3581,5164,1579,3099,3100,1400,5165,4249,1839,1640,2877,5166,4572,4573, 137,4250, # 1488
|
||||
598,3101,1967, 780, 104, 974,2953,5167, 278, 899, 253, 402, 572, 504, 493,1339, # 1504
|
||||
5168,4006,1275,4574,2582,2558,5169,3706,3049,3102,2253, 565,1334,2722, 863, 41, # 1520
|
||||
5170,5171,4575,5172,1657,2338, 19, 463,2760,4251, 606,5173,2999,3289,1087,2085, # 1536
|
||||
1323,2662,3000,5174,1631,1623,1750,4252,2691,5175,2878, 791,2723,2663,2339, 232, # 1552
|
||||
2421,5176,3001,1498,5177,2664,2630, 755,1366,3707,3290,3151,2026,1609, 119,1918, # 1568
|
||||
3474, 862,1026,4253,5178,4007,3839,4576,4008,4577,2265,1952,2477,5179,1125, 817, # 1584
|
||||
4254,4255,4009,1513,1766,2041,1487,4256,3050,3291,2837,3840,3152,5180,5181,1507, # 1600
|
||||
5182,2692, 733, 40,1632,1106,2879, 345,4257, 841,2531, 230,4578,3002,1847,3292, # 1616
|
||||
3475,5183,1263, 986,3476,5184, 735, 879, 254,1137, 857, 622,1300,1180,1388,1562, # 1632
|
||||
4010,4011,2954, 967,2761,2665,1349, 592,2134,1692,3361,3003,1995,4258,1679,4012, # 1648
|
||||
1902,2188,5185, 739,3708,2724,1296,1290,5186,4259,2201,2202,1922,1563,2605,2559, # 1664
|
||||
1871,2762,3004,5187, 435,5188, 343,1108, 596, 17,1751,4579,2239,3477,3709,5189, # 1680
|
||||
4580, 294,3582,2955,1693, 477, 979, 281,2042,3583, 643,2043,3710,2631,2795,2266, # 1696
|
||||
1031,2340,2135,2303,3584,4581, 367,1249,2560,5190,3585,5191,4582,1283,3362,2005, # 1712
|
||||
240,1762,3363,4583,4584, 836,1069,3153, 474,5192,2149,2532, 268,3586,5193,3219, # 1728
|
||||
1521,1284,5194,1658,1546,4260,5195,3587,3588,5196,4261,3364,2693,1685,4262, 961, # 1744
|
||||
1673,2632, 190,2006,2203,3841,4585,4586,5197, 570,2504,3711,1490,5198,4587,2633, # 1760
|
||||
3293,1957,4588, 584,1514, 396,1045,1945,5199,4589,1968,2449,5200,5201,4590,4013, # 1776
|
||||
619,5202,3154,3294, 215,2007,2796,2561,3220,4591,3221,4592, 763,4263,3842,4593, # 1792
|
||||
5203,5204,1958,1767,2956,3365,3712,1174, 452,1477,4594,3366,3155,5205,2838,1253, # 1808
|
||||
2387,2189,1091,2290,4264, 492,5206, 638,1169,1825,2136,1752,4014, 648, 926,1021, # 1824
|
||||
1324,4595, 520,4596, 997, 847,1007, 892,4597,3843,2267,1872,3713,2405,1785,4598, # 1840
|
||||
1953,2957,3103,3222,1728,4265,2044,3714,4599,2008,1701,3156,1551, 30,2268,4266, # 1856
|
||||
5207,2027,4600,3589,5208, 501,5209,4267, 594,3478,2166,1822,3590,3479,3591,3223, # 1872
|
||||
829,2839,4268,5210,1680,3157,1225,4269,5211,3295,4601,4270,3158,2341,5212,4602, # 1888
|
||||
4271,5213,4015,4016,5214,1848,2388,2606,3367,5215,4603, 374,4017, 652,4272,4273, # 1904
|
||||
375,1140, 798,5216,5217,5218,2366,4604,2269, 546,1659, 138,3051,2450,4605,5219, # 1920
|
||||
2254, 612,1849, 910, 796,3844,1740,1371, 825,3845,3846,5220,2920,2562,5221, 692, # 1936
|
||||
444,3052,2634, 801,4606,4274,5222,1491, 244,1053,3053,4275,4276, 340,5223,4018, # 1952
|
||||
1041,3005, 293,1168, 87,1357,5224,1539, 959,5225,2240, 721, 694,4277,3847, 219, # 1968
|
||||
1478, 644,1417,3368,2666,1413,1401,1335,1389,4019,5226,5227,3006,2367,3159,1826, # 1984
|
||||
730,1515, 184,2840, 66,4607,5228,1660,2958, 246,3369, 378,1457, 226,3480, 975, # 2000
|
||||
4020,2959,1264,3592, 674, 696,5229, 163,5230,1141,2422,2167, 713,3593,3370,4608, # 2016
|
||||
4021,5231,5232,1186, 15,5233,1079,1070,5234,1522,3224,3594, 276,1050,2725, 758, # 2032
|
||||
1126, 653,2960,3296,5235,2342, 889,3595,4022,3104,3007, 903,1250,4609,4023,3481, # 2048
|
||||
3596,1342,1681,1718, 766,3297, 286, 89,2961,3715,5236,1713,5237,2607,3371,3008, # 2064
|
||||
5238,2962,2219,3225,2880,5239,4610,2505,2533, 181, 387,1075,4024, 731,2190,3372, # 2080
|
||||
5240,3298, 310, 313,3482,2304, 770,4278, 54,3054, 189,4611,3105,3848,4025,5241, # 2096
|
||||
1230,1617,1850, 355,3597,4279,4612,3373, 111,4280,3716,1350,3160,3483,3055,4281, # 2112
|
||||
2150,3299,3598,5242,2797,4026,4027,3009, 722,2009,5243,1071, 247,1207,2343,2478, # 2128
|
||||
1378,4613,2010, 864,1437,1214,4614, 373,3849,1142,2220, 667,4615, 442,2763,2563, # 2144
|
||||
3850,4028,1969,4282,3300,1840, 837, 170,1107, 934,1336,1883,5244,5245,2119,4283, # 2160
|
||||
2841, 743,1569,5246,4616,4284, 582,2389,1418,3484,5247,1803,5248, 357,1395,1729, # 2176
|
||||
3717,3301,2423,1564,2241,5249,3106,3851,1633,4617,1114,2086,4285,1532,5250, 482, # 2192
|
||||
2451,4618,5251,5252,1492, 833,1466,5253,2726,3599,1641,2842,5254,1526,1272,3718, # 2208
|
||||
4286,1686,1795, 416,2564,1903,1954,1804,5255,3852,2798,3853,1159,2321,5256,2881, # 2224
|
||||
4619,1610,1584,3056,2424,2764, 443,3302,1163,3161,5257,5258,4029,5259,4287,2506, # 2240
|
||||
3057,4620,4030,3162,2104,1647,3600,2011,1873,4288,5260,4289, 431,3485,5261, 250, # 2256
|
||||
97, 81,4290,5262,1648,1851,1558, 160, 848,5263, 866, 740,1694,5264,2204,2843, # 2272
|
||||
3226,4291,4621,3719,1687, 950,2479, 426, 469,3227,3720,3721,4031,5265,5266,1188, # 2288
|
||||
424,1996, 861,3601,4292,3854,2205,2694, 168,1235,3602,4293,5267,2087,1674,4622, # 2304
|
||||
3374,3303, 220,2565,1009,5268,3855, 670,3010, 332,1208, 717,5269,5270,3603,2452, # 2320
|
||||
4032,3375,5271, 513,5272,1209,2882,3376,3163,4623,1080,5273,5274,5275,5276,2534, # 2336
|
||||
3722,3604, 815,1587,4033,4034,5277,3605,3486,3856,1254,4624,1328,3058,1390,4035, # 2352
|
||||
1741,4036,3857,4037,5278, 236,3858,2453,3304,5279,5280,3723,3859,1273,3860,4625, # 2368
|
||||
5281, 308,5282,4626, 245,4627,1852,2480,1307,2583, 430, 715,2137,2454,5283, 270, # 2384
|
||||
199,2883,4038,5284,3606,2727,1753, 761,1754, 725,1661,1841,4628,3487,3724,5285, # 2400
|
||||
5286, 587, 14,3305, 227,2608, 326, 480,2270, 943,2765,3607, 291, 650,1884,5287, # 2416
|
||||
1702,1226, 102,1547, 62,3488, 904,4629,3489,1164,4294,5288,5289,1224,1548,2766, # 2432
|
||||
391, 498,1493,5290,1386,1419,5291,2056,1177,4630, 813, 880,1081,2368, 566,1145, # 2448
|
||||
4631,2291,1001,1035,2566,2609,2242, 394,1286,5292,5293,2069,5294, 86,1494,1730, # 2464
|
||||
4039, 491,1588, 745, 897,2963, 843,3377,4040,2767,2884,3306,1768, 998,2221,2070, # 2480
|
||||
397,1827,1195,1970,3725,3011,3378, 284,5295,3861,2507,2138,2120,1904,5296,4041, # 2496
|
||||
2151,4042,4295,1036,3490,1905, 114,2567,4296, 209,1527,5297,5298,2964,2844,2635, # 2512
|
||||
2390,2728,3164, 812,2568,5299,3307,5300,1559, 737,1885,3726,1210, 885, 28,2695, # 2528
|
||||
3608,3862,5301,4297,1004,1780,4632,5302, 346,1982,2222,2696,4633,3863,1742, 797, # 2544
|
||||
1642,4043,1934,1072,1384,2152, 896,4044,3308,3727,3228,2885,3609,5303,2569,1959, # 2560
|
||||
4634,2455,1786,5304,5305,5306,4045,4298,1005,1308,3728,4299,2729,4635,4636,1528, # 2576
|
||||
2610, 161,1178,4300,1983, 987,4637,1101,4301, 631,4046,1157,3229,2425,1343,1241, # 2592
|
||||
1016,2243,2570, 372, 877,2344,2508,1160, 555,1935, 911,4047,5307, 466,1170, 169, # 2608
|
||||
1051,2921,2697,3729,2481,3012,1182,2012,2571,1251,2636,5308, 992,2345,3491,1540, # 2624
|
||||
2730,1201,2071,2406,1997,2482,5309,4638, 528,1923,2191,1503,1874,1570,2369,3379, # 2640
|
||||
3309,5310, 557,1073,5311,1828,3492,2088,2271,3165,3059,3107, 767,3108,2799,4639, # 2656
|
||||
1006,4302,4640,2346,1267,2179,3730,3230, 778,4048,3231,2731,1597,2667,5312,4641, # 2672
|
||||
5313,3493,5314,5315,5316,3310,2698,1433,3311, 131, 95,1504,4049, 723,4303,3166, # 2688
|
||||
1842,3610,2768,2192,4050,2028,2105,3731,5317,3013,4051,1218,5318,3380,3232,4052, # 2704
|
||||
4304,2584, 248,1634,3864, 912,5319,2845,3732,3060,3865, 654, 53,5320,3014,5321, # 2720
|
||||
1688,4642, 777,3494,1032,4053,1425,5322, 191, 820,2121,2846, 971,4643, 931,3233, # 2736
|
||||
135, 664, 783,3866,1998, 772,2922,1936,4054,3867,4644,2923,3234, 282,2732, 640, # 2752
|
||||
1372,3495,1127, 922, 325,3381,5323,5324, 711,2045,5325,5326,4055,2223,2800,1937, # 2768
|
||||
4056,3382,2224,2255,3868,2305,5327,4645,3869,1258,3312,4057,3235,2139,2965,4058, # 2784
|
||||
4059,5328,2225, 258,3236,4646, 101,1227,5329,3313,1755,5330,1391,3314,5331,2924, # 2800
|
||||
2057, 893,5332,5333,5334,1402,4305,2347,5335,5336,3237,3611,5337,5338, 878,1325, # 2816
|
||||
1781,2801,4647, 259,1385,2585, 744,1183,2272,4648,5339,4060,2509,5340, 684,1024, # 2832
|
||||
4306,5341, 472,3612,3496,1165,3315,4061,4062, 322,2153, 881, 455,1695,1152,1340, # 2848
|
||||
660, 554,2154,4649,1058,4650,4307, 830,1065,3383,4063,4651,1924,5342,1703,1919, # 2864
|
||||
5343, 932,2273, 122,5344,4652, 947, 677,5345,3870,2637, 297,1906,1925,2274,4653, # 2880
|
||||
2322,3316,5346,5347,4308,5348,4309, 84,4310, 112, 989,5349, 547,1059,4064, 701, # 2896
|
||||
3613,1019,5350,4311,5351,3497, 942, 639, 457,2306,2456, 993,2966, 407, 851, 494, # 2912
|
||||
4654,3384, 927,5352,1237,5353,2426,3385, 573,4312, 680, 921,2925,1279,1875, 285, # 2928
|
||||
790,1448,1984, 719,2168,5354,5355,4655,4065,4066,1649,5356,1541, 563,5357,1077, # 2944
|
||||
5358,3386,3061,3498, 511,3015,4067,4068,3733,4069,1268,2572,3387,3238,4656,4657, # 2960
|
||||
5359, 535,1048,1276,1189,2926,2029,3167,1438,1373,2847,2967,1134,2013,5360,4313, # 2976
|
||||
1238,2586,3109,1259,5361, 700,5362,2968,3168,3734,4314,5363,4315,1146,1876,1907, # 2992
|
||||
4658,2611,4070, 781,2427, 132,1589, 203, 147, 273,2802,2407, 898,1787,2155,4071, # 3008
|
||||
4072,5364,3871,2803,5365,5366,4659,4660,5367,3239,5368,1635,3872, 965,5369,1805, # 3024
|
||||
2699,1516,3614,1121,1082,1329,3317,4073,1449,3873, 65,1128,2848,2927,2769,1590, # 3040
|
||||
3874,5370,5371, 12,2668, 45, 976,2587,3169,4661, 517,2535,1013,1037,3240,5372, # 3056
|
||||
3875,2849,5373,3876,5374,3499,5375,2612, 614,1999,2323,3877,3110,2733,2638,5376, # 3072
|
||||
2588,4316, 599,1269,5377,1811,3735,5378,2700,3111, 759,1060, 489,1806,3388,3318, # 3088
|
||||
1358,5379,5380,2391,1387,1215,2639,2256, 490,5381,5382,4317,1759,2392,2348,5383, # 3104
|
||||
4662,3878,1908,4074,2640,1807,3241,4663,3500,3319,2770,2349, 874,5384,5385,3501, # 3120
|
||||
3736,1859, 91,2928,3737,3062,3879,4664,5386,3170,4075,2669,5387,3502,1202,1403, # 3136
|
||||
3880,2969,2536,1517,2510,4665,3503,2511,5388,4666,5389,2701,1886,1495,1731,4076, # 3152
|
||||
2370,4667,5390,2030,5391,5392,4077,2702,1216, 237,2589,4318,2324,4078,3881,4668, # 3168
|
||||
4669,2703,3615,3504, 445,4670,5393,5394,5395,5396,2771, 61,4079,3738,1823,4080, # 3184
|
||||
5397, 687,2046, 935, 925, 405,2670, 703,1096,1860,2734,4671,4081,1877,1367,2704, # 3200
|
||||
3389, 918,2106,1782,2483, 334,3320,1611,1093,4672, 564,3171,3505,3739,3390, 945, # 3216
|
||||
2641,2058,4673,5398,1926, 872,4319,5399,3506,2705,3112, 349,4320,3740,4082,4674, # 3232
|
||||
3882,4321,3741,2156,4083,4675,4676,4322,4677,2408,2047, 782,4084, 400, 251,4323, # 3248
|
||||
1624,5400,5401, 277,3742, 299,1265, 476,1191,3883,2122,4324,4325,1109, 205,5402, # 3264
|
||||
2590,1000,2157,3616,1861,5403,5404,5405,4678,5406,4679,2573, 107,2484,2158,4085, # 3280
|
||||
3507,3172,5407,1533, 541,1301, 158, 753,4326,2886,3617,5408,1696, 370,1088,4327, # 3296
|
||||
4680,3618, 579, 327, 440, 162,2244, 269,1938,1374,3508, 968,3063, 56,1396,3113, # 3312
|
||||
2107,3321,3391,5409,1927,2159,4681,3016,5410,3619,5411,5412,3743,4682,2485,5413, # 3328
|
||||
2804,5414,1650,4683,5415,2613,5416,5417,4086,2671,3392,1149,3393,4087,3884,4088, # 3344
|
||||
5418,1076, 49,5419, 951,3242,3322,3323, 450,2850, 920,5420,1812,2805,2371,4328, # 3360
|
||||
1909,1138,2372,3885,3509,5421,3243,4684,1910,1147,1518,2428,4685,3886,5422,4686, # 3376
|
||||
2393,2614, 260,1796,3244,5423,5424,3887,3324, 708,5425,3620,1704,5426,3621,1351, # 3392
|
||||
1618,3394,3017,1887, 944,4329,3395,4330,3064,3396,4331,5427,3744, 422, 413,1714, # 3408
|
||||
3325, 500,2059,2350,4332,2486,5428,1344,1911, 954,5429,1668,5430,5431,4089,2409, # 3424
|
||||
4333,3622,3888,4334,5432,2307,1318,2512,3114, 133,3115,2887,4687, 629, 31,2851, # 3440
|
||||
2706,3889,4688, 850, 949,4689,4090,2970,1732,2089,4335,1496,1853,5433,4091, 620, # 3456
|
||||
3245, 981,1242,3745,3397,1619,3746,1643,3326,2140,2457,1971,1719,3510,2169,5434, # 3472
|
||||
3246,5435,5436,3398,1829,5437,1277,4690,1565,2048,5438,1636,3623,3116,5439, 869, # 3488
|
||||
2852, 655,3890,3891,3117,4092,3018,3892,1310,3624,4691,5440,5441,5442,1733, 558, # 3504
|
||||
4692,3747, 335,1549,3065,1756,4336,3748,1946,3511,1830,1291,1192, 470,2735,2108, # 3520
|
||||
2806, 913,1054,4093,5443,1027,5444,3066,4094,4693, 982,2672,3399,3173,3512,3247, # 3536
|
||||
3248,1947,2807,5445, 571,4694,5446,1831,5447,3625,2591,1523,2429,5448,2090, 984, # 3552
|
||||
4695,3749,1960,5449,3750, 852, 923,2808,3513,3751, 969,1519, 999,2049,2325,1705, # 3568
|
||||
5450,3118, 615,1662, 151, 597,4095,2410,2326,1049, 275,4696,3752,4337, 568,3753, # 3584
|
||||
3626,2487,4338,3754,5451,2430,2275, 409,3249,5452,1566,2888,3514,1002, 769,2853, # 3600
|
||||
194,2091,3174,3755,2226,3327,4339, 628,1505,5453,5454,1763,2180,3019,4096, 521, # 3616
|
||||
1161,2592,1788,2206,2411,4697,4097,1625,4340,4341, 412, 42,3119, 464,5455,2642, # 3632
|
||||
4698,3400,1760,1571,2889,3515,2537,1219,2207,3893,2643,2141,2373,4699,4700,3328, # 3648
|
||||
1651,3401,3627,5456,5457,3628,2488,3516,5458,3756,5459,5460,2276,2092, 460,5461, # 3664
|
||||
4701,5462,3020, 962, 588,3629, 289,3250,2644,1116, 52,5463,3067,1797,5464,5465, # 3680
|
||||
5466,1467,5467,1598,1143,3757,4342,1985,1734,1067,4702,1280,3402, 465,4703,1572, # 3696
|
||||
510,5468,1928,2245,1813,1644,3630,5469,4704,3758,5470,5471,2673,1573,1534,5472, # 3712
|
||||
5473, 536,1808,1761,3517,3894,3175,2645,5474,5475,5476,4705,3518,2929,1912,2809, # 3728
|
||||
5477,3329,1122, 377,3251,5478, 360,5479,5480,4343,1529, 551,5481,2060,3759,1769, # 3744
|
||||
2431,5482,2930,4344,3330,3120,2327,2109,2031,4706,1404, 136,1468,1479, 672,1171, # 3760
|
||||
3252,2308, 271,3176,5483,2772,5484,2050, 678,2736, 865,1948,4707,5485,2014,4098, # 3776
|
||||
2971,5486,2737,2227,1397,3068,3760,4708,4709,1735,2931,3403,3631,5487,3895, 509, # 3792
|
||||
2854,2458,2890,3896,5488,5489,3177,3178,4710,4345,2538,4711,2309,1166,1010, 552, # 3808
|
||||
681,1888,5490,5491,2972,2973,4099,1287,1596,1862,3179, 358, 453, 736, 175, 478, # 3824
|
||||
1117, 905,1167,1097,5492,1854,1530,5493,1706,5494,2181,3519,2292,3761,3520,3632, # 3840
|
||||
4346,2093,4347,5495,3404,1193,2489,4348,1458,2193,2208,1863,1889,1421,3331,2932, # 3856
|
||||
3069,2182,3521, 595,2123,5496,4100,5497,5498,4349,1707,2646, 223,3762,1359, 751, # 3872
|
||||
3121, 183,3522,5499,2810,3021, 419,2374, 633, 704,3897,2394, 241,5500,5501,5502, # 3888
|
||||
838,3022,3763,2277,2773,2459,3898,1939,2051,4101,1309,3122,2246,1181,5503,1136, # 3904
|
||||
2209,3899,2375,1446,4350,2310,4712,5504,5505,4351,1055,2615, 484,3764,5506,4102, # 3920
|
||||
625,4352,2278,3405,1499,4353,4103,5507,4104,4354,3253,2279,2280,3523,5508,5509, # 3936
|
||||
2774, 808,2616,3765,3406,4105,4355,3123,2539, 526,3407,3900,4356, 955,5510,1620, # 3952
|
||||
4357,2647,2432,5511,1429,3766,1669,1832, 994, 928,5512,3633,1260,5513,5514,5515, # 3968
|
||||
1949,2293, 741,2933,1626,4358,2738,2460, 867,1184, 362,3408,1392,5516,5517,4106, # 3984
|
||||
4359,1770,1736,3254,2934,4713,4714,1929,2707,1459,1158,5518,3070,3409,2891,1292, # 4000
|
||||
1930,2513,2855,3767,1986,1187,2072,2015,2617,4360,5519,2574,2514,2170,3768,2490, # 4016
|
||||
3332,5520,3769,4715,5521,5522, 666,1003,3023,1022,3634,4361,5523,4716,1814,2257, # 4032
|
||||
574,3901,1603, 295,1535, 705,3902,4362, 283, 858, 417,5524,5525,3255,4717,4718, # 4048
|
||||
3071,1220,1890,1046,2281,2461,4107,1393,1599, 689,2575, 388,4363,5526,2491, 802, # 4064
|
||||
5527,2811,3903,2061,1405,2258,5528,4719,3904,2110,1052,1345,3256,1585,5529, 809, # 4080
|
||||
5530,5531,5532, 575,2739,3524, 956,1552,1469,1144,2328,5533,2329,1560,2462,3635, # 4096
|
||||
3257,4108, 616,2210,4364,3180,2183,2294,5534,1833,5535,3525,4720,5536,1319,3770, # 4112
|
||||
3771,1211,3636,1023,3258,1293,2812,5537,5538,5539,3905, 607,2311,3906, 762,2892, # 4128
|
||||
1439,4365,1360,4721,1485,3072,5540,4722,1038,4366,1450,2062,2648,4367,1379,4723, # 4144
|
||||
2593,5541,5542,4368,1352,1414,2330,2935,1172,5543,5544,3907,3908,4724,1798,1451, # 4160
|
||||
5545,5546,5547,5548,2936,4109,4110,2492,2351, 411,4111,4112,3637,3333,3124,4725, # 4176
|
||||
1561,2674,1452,4113,1375,5549,5550, 47,2974, 316,5551,1406,1591,2937,3181,5552, # 4192
|
||||
1025,2142,3125,3182, 354,2740, 884,2228,4369,2412, 508,3772, 726,3638, 996,2433, # 4208
|
||||
3639, 729,5553, 392,2194,1453,4114,4726,3773,5554,5555,2463,3640,2618,1675,2813, # 4224
|
||||
919,2352,2975,2353,1270,4727,4115, 73,5556,5557, 647,5558,3259,2856,2259,1550, # 4240
|
||||
1346,3024,5559,1332, 883,3526,5560,5561,5562,5563,3334,2775,5564,1212, 831,1347, # 4256
|
||||
4370,4728,2331,3909,1864,3073, 720,3910,4729,4730,3911,5565,4371,5566,5567,4731, # 4272
|
||||
5568,5569,1799,4732,3774,2619,4733,3641,1645,2376,4734,5570,2938, 669,2211,2675, # 4288
|
||||
2434,5571,2893,5572,5573,1028,3260,5574,4372,2413,5575,2260,1353,5576,5577,4735, # 4304
|
||||
3183, 518,5578,4116,5579,4373,1961,5580,2143,4374,5581,5582,3025,2354,2355,3912, # 4320
|
||||
516,1834,1454,4117,2708,4375,4736,2229,2620,1972,1129,3642,5583,2776,5584,2976, # 4336
|
||||
1422, 577,1470,3026,1524,3410,5585,5586, 432,4376,3074,3527,5587,2594,1455,2515, # 4352
|
||||
2230,1973,1175,5588,1020,2741,4118,3528,4737,5589,2742,5590,1743,1361,3075,3529, # 4368
|
||||
2649,4119,4377,4738,2295, 895, 924,4378,2171, 331,2247,3076, 166,1627,3077,1098, # 4384
|
||||
5591,1232,2894,2231,3411,4739, 657, 403,1196,2377, 542,3775,3412,1600,4379,3530, # 4400
|
||||
5592,4740,2777,3261, 576, 530,1362,4741,4742,2540,2676,3776,4120,5593, 842,3913, # 4416
|
||||
5594,2814,2032,1014,4121, 213,2709,3413, 665, 621,4380,5595,3777,2939,2435,5596, # 4432
|
||||
2436,3335,3643,3414,4743,4381,2541,4382,4744,3644,1682,4383,3531,1380,5597, 724, # 4448
|
||||
2282, 600,1670,5598,1337,1233,4745,3126,2248,5599,1621,4746,5600, 651,4384,5601, # 4464
|
||||
1612,4385,2621,5602,2857,5603,2743,2312,3078,5604, 716,2464,3079, 174,1255,2710, # 4480
|
||||
4122,3645, 548,1320,1398, 728,4123,1574,5605,1891,1197,3080,4124,5606,3081,3082, # 4496
|
||||
3778,3646,3779, 747,5607, 635,4386,4747,5608,5609,5610,4387,5611,5612,4748,5613, # 4512
|
||||
3415,4749,2437, 451,5614,3780,2542,2073,4388,2744,4389,4125,5615,1764,4750,5616, # 4528
|
||||
4390, 350,4751,2283,2395,2493,5617,4391,4126,2249,1434,4127, 488,4752, 458,4392, # 4544
|
||||
4128,3781, 771,1330,2396,3914,2576,3184,2160,2414,1553,2677,3185,4393,5618,2494, # 4560
|
||||
2895,2622,1720,2711,4394,3416,4753,5619,2543,4395,5620,3262,4396,2778,5621,2016, # 4576
|
||||
2745,5622,1155,1017,3782,3915,5623,3336,2313, 201,1865,4397,1430,5624,4129,5625, # 4592
|
||||
5626,5627,5628,5629,4398,1604,5630, 414,1866, 371,2595,4754,4755,3532,2017,3127, # 4608
|
||||
4756,1708, 960,4399, 887, 389,2172,1536,1663,1721,5631,2232,4130,2356,2940,1580, # 4624
|
||||
5632,5633,1744,4757,2544,4758,4759,5634,4760,5635,2074,5636,4761,3647,3417,2896, # 4640
|
||||
4400,5637,4401,2650,3418,2815, 673,2712,2465, 709,3533,4131,3648,4402,5638,1148, # 4656
|
||||
502, 634,5639,5640,1204,4762,3649,1575,4763,2623,3783,5641,3784,3128, 948,3263, # 4672
|
||||
121,1745,3916,1110,5642,4403,3083,2516,3027,4132,3785,1151,1771,3917,1488,4133, # 4688
|
||||
1987,5643,2438,3534,5644,5645,2094,5646,4404,3918,1213,1407,2816, 531,2746,2545, # 4704
|
||||
3264,1011,1537,4764,2779,4405,3129,1061,5647,3786,3787,1867,2897,5648,2018, 120, # 4720
|
||||
4406,4407,2063,3650,3265,2314,3919,2678,3419,1955,4765,4134,5649,3535,1047,2713, # 4736
|
||||
1266,5650,1368,4766,2858, 649,3420,3920,2546,2747,1102,2859,2679,5651,5652,2000, # 4752
|
||||
5653,1111,3651,2977,5654,2495,3921,3652,2817,1855,3421,3788,5655,5656,3422,2415, # 4768
|
||||
2898,3337,3266,3653,5657,2577,5658,3654,2818,4135,1460, 856,5659,3655,5660,2899, # 4784
|
||||
2978,5661,2900,3922,5662,4408, 632,2517, 875,3923,1697,3924,2296,5663,5664,4767, # 4800
|
||||
3028,1239, 580,4768,4409,5665, 914, 936,2075,1190,4136,1039,2124,5666,5667,5668, # 4816
|
||||
5669,3423,1473,5670,1354,4410,3925,4769,2173,3084,4137, 915,3338,4411,4412,3339, # 4832
|
||||
1605,1835,5671,2748, 398,3656,4413,3926,4138, 328,1913,2860,4139,3927,1331,4414, # 4848
|
||||
3029, 937,4415,5672,3657,4140,4141,3424,2161,4770,3425, 524, 742, 538,3085,1012, # 4864
|
||||
5673,5674,3928,2466,5675, 658,1103, 225,3929,5676,5677,4771,5678,4772,5679,3267, # 4880
|
||||
1243,5680,4142, 963,2250,4773,5681,2714,3658,3186,5682,5683,2596,2332,5684,4774, # 4896
|
||||
5685,5686,5687,3536, 957,3426,2547,2033,1931,2941,2467, 870,2019,3659,1746,2780, # 4912
|
||||
2781,2439,2468,5688,3930,5689,3789,3130,3790,3537,3427,3791,5690,1179,3086,5691, # 4928
|
||||
3187,2378,4416,3792,2548,3188,3131,2749,4143,5692,3428,1556,2549,2297, 977,2901, # 4944
|
||||
2034,4144,1205,3429,5693,1765,3430,3189,2125,1271, 714,1689,4775,3538,5694,2333, # 4960
|
||||
3931, 533,4417,3660,2184, 617,5695,2469,3340,3539,2315,5696,5697,3190,5698,5699, # 4976
|
||||
3932,1988, 618, 427,2651,3540,3431,5700,5701,1244,1690,5702,2819,4418,4776,5703, # 4992
|
||||
3541,4777,5704,2284,1576, 473,3661,4419,3432, 972,5705,3662,5706,3087,5707,5708, # 5008
|
||||
4778,4779,5709,3793,4145,4146,5710, 153,4780, 356,5711,1892,2902,4420,2144, 408, # 5024
|
||||
803,2357,5712,3933,5713,4421,1646,2578,2518,4781,4782,3934,5714,3935,4422,5715, # 5040
|
||||
2416,3433, 752,5716,5717,1962,3341,2979,5718, 746,3030,2470,4783,4423,3794, 698, # 5056
|
||||
4784,1893,4424,3663,2550,4785,3664,3936,5719,3191,3434,5720,1824,1302,4147,2715, # 5072
|
||||
3937,1974,4425,5721,4426,3192, 823,1303,1288,1236,2861,3542,4148,3435, 774,3938, # 5088
|
||||
5722,1581,4786,1304,2862,3939,4787,5723,2440,2162,1083,3268,4427,4149,4428, 344, # 5104
|
||||
1173, 288,2316, 454,1683,5724,5725,1461,4788,4150,2597,5726,5727,4789, 985, 894, # 5120
|
||||
5728,3436,3193,5729,1914,2942,3795,1989,5730,2111,1975,5731,4151,5732,2579,1194, # 5136
|
||||
425,5733,4790,3194,1245,3796,4429,5734,5735,2863,5736, 636,4791,1856,3940, 760, # 5152
|
||||
1800,5737,4430,2212,1508,4792,4152,1894,1684,2298,5738,5739,4793,4431,4432,2213, # 5168
|
||||
479,5740,5741, 832,5742,4153,2496,5743,2980,2497,3797, 990,3132, 627,1815,2652, # 5184
|
||||
4433,1582,4434,2126,2112,3543,4794,5744, 799,4435,3195,5745,4795,2113,1737,3031, # 5200
|
||||
1018, 543, 754,4436,3342,1676,4796,4797,4154,4798,1489,5746,3544,5747,2624,2903, # 5216
|
||||
4155,5748,5749,2981,5750,5751,5752,5753,3196,4799,4800,2185,1722,5754,3269,3270, # 5232
|
||||
1843,3665,1715, 481, 365,1976,1857,5755,5756,1963,2498,4801,5757,2127,3666,3271, # 5248
|
||||
433,1895,2064,2076,5758, 602,2750,5759,5760,5761,5762,5763,3032,1628,3437,5764, # 5264
|
||||
3197,4802,4156,2904,4803,2519,5765,2551,2782,5766,5767,5768,3343,4804,2905,5769, # 5280
|
||||
4805,5770,2864,4806,4807,1221,2982,4157,2520,5771,5772,5773,1868,1990,5774,5775, # 5296
|
||||
5776,1896,5777,5778,4808,1897,4158, 318,5779,2095,4159,4437,5780,5781, 485,5782, # 5312
|
||||
938,3941, 553,2680, 116,5783,3942,3667,5784,3545,2681,2783,3438,3344,2820,5785, # 5328
|
||||
3668,2943,4160,1747,2944,2983,5786,5787, 207,5788,4809,5789,4810,2521,5790,3033, # 5344
|
||||
890,3669,3943,5791,1878,3798,3439,5792,2186,2358,3440,1652,5793,5794,5795, 941, # 5360
|
||||
2299, 208,3546,4161,2020, 330,4438,3944,2906,2499,3799,4439,4811,5796,5797,5798, # 5376 #last 512
|
||||
# Everything below is of no interest for detection purposes
|
||||
2522,1613,4812,5799,3345,3945,2523,5800,4162,5801,1637,4163,2471,4813,3946,5802, # 5392
|
||||
2500,3034,3800,5803,5804,2195,4814,5805,2163,5806,5807,5808,5809,5810,5811,5812, # 5408
|
||||
5813,5814,5815,5816,5817,5818,5819,5820,5821,5822,5823,5824,5825,5826,5827,5828, # 5424
|
||||
5829,5830,5831,5832,5833,5834,5835,5836,5837,5838,5839,5840,5841,5842,5843,5844, # 5440
|
||||
5845,5846,5847,5848,5849,5850,5851,5852,5853,5854,5855,5856,5857,5858,5859,5860, # 5456
|
||||
5861,5862,5863,5864,5865,5866,5867,5868,5869,5870,5871,5872,5873,5874,5875,5876, # 5472
|
||||
5877,5878,5879,5880,5881,5882,5883,5884,5885,5886,5887,5888,5889,5890,5891,5892, # 5488
|
||||
5893,5894,5895,5896,5897,5898,5899,5900,5901,5902,5903,5904,5905,5906,5907,5908, # 5504
|
||||
5909,5910,5911,5912,5913,5914,5915,5916,5917,5918,5919,5920,5921,5922,5923,5924, # 5520
|
||||
5925,5926,5927,5928,5929,5930,5931,5932,5933,5934,5935,5936,5937,5938,5939,5940, # 5536
|
||||
5941,5942,5943,5944,5945,5946,5947,5948,5949,5950,5951,5952,5953,5954,5955,5956, # 5552
|
||||
5957,5958,5959,5960,5961,5962,5963,5964,5965,5966,5967,5968,5969,5970,5971,5972, # 5568
|
||||
5973,5974,5975,5976,5977,5978,5979,5980,5981,5982,5983,5984,5985,5986,5987,5988, # 5584
|
||||
5989,5990,5991,5992,5993,5994,5995,5996,5997,5998,5999,6000,6001,6002,6003,6004, # 5600
|
||||
6005,6006,6007,6008,6009,6010,6011,6012,6013,6014,6015,6016,6017,6018,6019,6020, # 5616
|
||||
6021,6022,6023,6024,6025,6026,6027,6028,6029,6030,6031,6032,6033,6034,6035,6036, # 5632
|
||||
6037,6038,6039,6040,6041,6042,6043,6044,6045,6046,6047,6048,6049,6050,6051,6052, # 5648
|
||||
6053,6054,6055,6056,6057,6058,6059,6060,6061,6062,6063,6064,6065,6066,6067,6068, # 5664
|
||||
6069,6070,6071,6072,6073,6074,6075,6076,6077,6078,6079,6080,6081,6082,6083,6084, # 5680
|
||||
6085,6086,6087,6088,6089,6090,6091,6092,6093,6094,6095,6096,6097,6098,6099,6100, # 5696
|
||||
6101,6102,6103,6104,6105,6106,6107,6108,6109,6110,6111,6112,6113,6114,6115,6116, # 5712
|
||||
6117,6118,6119,6120,6121,6122,6123,6124,6125,6126,6127,6128,6129,6130,6131,6132, # 5728
|
||||
6133,6134,6135,6136,6137,6138,6139,6140,6141,6142,6143,6144,6145,6146,6147,6148, # 5744
|
||||
6149,6150,6151,6152,6153,6154,6155,6156,6157,6158,6159,6160,6161,6162,6163,6164, # 5760
|
||||
6165,6166,6167,6168,6169,6170,6171,6172,6173,6174,6175,6176,6177,6178,6179,6180, # 5776
|
||||
6181,6182,6183,6184,6185,6186,6187,6188,6189,6190,6191,6192,6193,6194,6195,6196, # 5792
|
||||
6197,6198,6199,6200,6201,6202,6203,6204,6205,6206,6207,6208,6209,6210,6211,6212, # 5808
|
||||
6213,6214,6215,6216,6217,6218,6219,6220,6221,6222,6223,3670,6224,6225,6226,6227, # 5824
|
||||
6228,6229,6230,6231,6232,6233,6234,6235,6236,6237,6238,6239,6240,6241,6242,6243, # 5840
|
||||
6244,6245,6246,6247,6248,6249,6250,6251,6252,6253,6254,6255,6256,6257,6258,6259, # 5856
|
||||
6260,6261,6262,6263,6264,6265,6266,6267,6268,6269,6270,6271,6272,6273,6274,6275, # 5872
|
||||
6276,6277,6278,6279,6280,6281,6282,6283,6284,6285,4815,6286,6287,6288,6289,6290, # 5888
|
||||
6291,6292,4816,6293,6294,6295,6296,6297,6298,6299,6300,6301,6302,6303,6304,6305, # 5904
|
||||
6306,6307,6308,6309,6310,6311,4817,4818,6312,6313,6314,6315,6316,6317,6318,4819, # 5920
|
||||
6319,6320,6321,6322,6323,6324,6325,6326,6327,6328,6329,6330,6331,6332,6333,6334, # 5936
|
||||
6335,6336,6337,4820,6338,6339,6340,6341,6342,6343,6344,6345,6346,6347,6348,6349, # 5952
|
||||
6350,6351,6352,6353,6354,6355,6356,6357,6358,6359,6360,6361,6362,6363,6364,6365, # 5968
|
||||
6366,6367,6368,6369,6370,6371,6372,6373,6374,6375,6376,6377,6378,6379,6380,6381, # 5984
|
||||
6382,6383,6384,6385,6386,6387,6388,6389,6390,6391,6392,6393,6394,6395,6396,6397, # 6000
|
||||
6398,6399,6400,6401,6402,6403,6404,6405,6406,6407,6408,6409,6410,3441,6411,6412, # 6016
|
||||
6413,6414,6415,6416,6417,6418,6419,6420,6421,6422,6423,6424,6425,4440,6426,6427, # 6032
|
||||
6428,6429,6430,6431,6432,6433,6434,6435,6436,6437,6438,6439,6440,6441,6442,6443, # 6048
|
||||
6444,6445,6446,6447,6448,6449,6450,6451,6452,6453,6454,4821,6455,6456,6457,6458, # 6064
|
||||
6459,6460,6461,6462,6463,6464,6465,6466,6467,6468,6469,6470,6471,6472,6473,6474, # 6080
|
||||
6475,6476,6477,3947,3948,6478,6479,6480,6481,3272,4441,6482,6483,6484,6485,4442, # 6096
|
||||
6486,6487,6488,6489,6490,6491,6492,6493,6494,6495,6496,4822,6497,6498,6499,6500, # 6112
|
||||
6501,6502,6503,6504,6505,6506,6507,6508,6509,6510,6511,6512,6513,6514,6515,6516, # 6128
|
||||
6517,6518,6519,6520,6521,6522,6523,6524,6525,6526,6527,6528,6529,6530,6531,6532, # 6144
|
||||
6533,6534,6535,6536,6537,6538,6539,6540,6541,6542,6543,6544,6545,6546,6547,6548, # 6160
|
||||
6549,6550,6551,6552,6553,6554,6555,6556,2784,6557,4823,6558,6559,6560,6561,6562, # 6176
|
||||
6563,6564,6565,6566,6567,6568,6569,3949,6570,6571,6572,4824,6573,6574,6575,6576, # 6192
|
||||
6577,6578,6579,6580,6581,6582,6583,4825,6584,6585,6586,3950,2785,6587,6588,6589, # 6208
|
||||
6590,6591,6592,6593,6594,6595,6596,6597,6598,6599,6600,6601,6602,6603,6604,6605, # 6224
|
||||
6606,6607,6608,6609,6610,6611,6612,4826,6613,6614,6615,4827,6616,6617,6618,6619, # 6240
|
||||
6620,6621,6622,6623,6624,6625,4164,6626,6627,6628,6629,6630,6631,6632,6633,6634, # 6256
|
||||
3547,6635,4828,6636,6637,6638,6639,6640,6641,6642,3951,2984,6643,6644,6645,6646, # 6272
|
||||
6647,6648,6649,4165,6650,4829,6651,6652,4830,6653,6654,6655,6656,6657,6658,6659, # 6288
|
||||
6660,6661,6662,4831,6663,6664,6665,6666,6667,6668,6669,6670,6671,4166,6672,4832, # 6304
|
||||
3952,6673,6674,6675,6676,4833,6677,6678,6679,4167,6680,6681,6682,3198,6683,6684, # 6320
|
||||
6685,6686,6687,6688,6689,6690,6691,6692,6693,6694,6695,6696,6697,4834,6698,6699, # 6336
|
||||
6700,6701,6702,6703,6704,6705,6706,6707,6708,6709,6710,6711,6712,6713,6714,6715, # 6352
|
||||
6716,6717,6718,6719,6720,6721,6722,6723,6724,6725,6726,6727,6728,6729,6730,6731, # 6368
|
||||
6732,6733,6734,4443,6735,6736,6737,6738,6739,6740,6741,6742,6743,6744,6745,4444, # 6384
|
||||
6746,6747,6748,6749,6750,6751,6752,6753,6754,6755,6756,6757,6758,6759,6760,6761, # 6400
|
||||
6762,6763,6764,6765,6766,6767,6768,6769,6770,6771,6772,6773,6774,6775,6776,6777, # 6416
|
||||
6778,6779,6780,6781,4168,6782,6783,3442,6784,6785,6786,6787,6788,6789,6790,6791, # 6432
|
||||
4169,6792,6793,6794,6795,6796,6797,6798,6799,6800,6801,6802,6803,6804,6805,6806, # 6448
|
||||
6807,6808,6809,6810,6811,4835,6812,6813,6814,4445,6815,6816,4446,6817,6818,6819, # 6464
|
||||
6820,6821,6822,6823,6824,6825,6826,6827,6828,6829,6830,6831,6832,6833,6834,6835, # 6480
|
||||
3548,6836,6837,6838,6839,6840,6841,6842,6843,6844,6845,6846,4836,6847,6848,6849, # 6496
|
||||
6850,6851,6852,6853,6854,3953,6855,6856,6857,6858,6859,6860,6861,6862,6863,6864, # 6512
|
||||
6865,6866,6867,6868,6869,6870,6871,6872,6873,6874,6875,6876,6877,3199,6878,6879, # 6528
|
||||
6880,6881,6882,4447,6883,6884,6885,6886,6887,6888,6889,6890,6891,6892,6893,6894, # 6544
|
||||
6895,6896,6897,6898,6899,6900,6901,6902,6903,6904,4170,6905,6906,6907,6908,6909, # 6560
|
||||
6910,6911,6912,6913,6914,6915,6916,6917,6918,6919,6920,6921,6922,6923,6924,6925, # 6576
|
||||
6926,6927,4837,6928,6929,6930,6931,6932,6933,6934,6935,6936,3346,6937,6938,4838, # 6592
|
||||
6939,6940,6941,4448,6942,6943,6944,6945,6946,4449,6947,6948,6949,6950,6951,6952, # 6608
|
||||
6953,6954,6955,6956,6957,6958,6959,6960,6961,6962,6963,6964,6965,6966,6967,6968, # 6624
|
||||
6969,6970,6971,6972,6973,6974,6975,6976,6977,6978,6979,6980,6981,6982,6983,6984, # 6640
|
||||
6985,6986,6987,6988,6989,6990,6991,6992,6993,6994,3671,6995,6996,6997,6998,4839, # 6656
|
||||
6999,7000,7001,7002,3549,7003,7004,7005,7006,7007,7008,7009,7010,7011,7012,7013, # 6672
|
||||
7014,7015,7016,7017,7018,7019,7020,7021,7022,7023,7024,7025,7026,7027,7028,7029, # 6688
|
||||
7030,4840,7031,7032,7033,7034,7035,7036,7037,7038,4841,7039,7040,7041,7042,7043, # 6704
|
||||
7044,7045,7046,7047,7048,7049,7050,7051,7052,7053,7054,7055,7056,7057,7058,7059, # 6720
|
||||
7060,7061,7062,7063,7064,7065,7066,7067,7068,7069,7070,2985,7071,7072,7073,7074, # 6736
|
||||
7075,7076,7077,7078,7079,7080,4842,7081,7082,7083,7084,7085,7086,7087,7088,7089, # 6752
|
||||
7090,7091,7092,7093,7094,7095,7096,7097,7098,7099,7100,7101,7102,7103,7104,7105, # 6768
|
||||
7106,7107,7108,7109,7110,7111,7112,7113,7114,7115,7116,7117,7118,4450,7119,7120, # 6784
|
||||
7121,7122,7123,7124,7125,7126,7127,7128,7129,7130,7131,7132,7133,7134,7135,7136, # 6800
|
||||
7137,7138,7139,7140,7141,7142,7143,4843,7144,7145,7146,7147,7148,7149,7150,7151, # 6816
|
||||
7152,7153,7154,7155,7156,7157,7158,7159,7160,7161,7162,7163,7164,7165,7166,7167, # 6832
|
||||
7168,7169,7170,7171,7172,7173,7174,7175,7176,7177,7178,7179,7180,7181,7182,7183, # 6848
|
||||
7184,7185,7186,7187,7188,4171,4172,7189,7190,7191,7192,7193,7194,7195,7196,7197, # 6864
|
||||
7198,7199,7200,7201,7202,7203,7204,7205,7206,7207,7208,7209,7210,7211,7212,7213, # 6880
|
||||
7214,7215,7216,7217,7218,7219,7220,7221,7222,7223,7224,7225,7226,7227,7228,7229, # 6896
|
||||
7230,7231,7232,7233,7234,7235,7236,7237,7238,7239,7240,7241,7242,7243,7244,7245, # 6912
|
||||
7246,7247,7248,7249,7250,7251,7252,7253,7254,7255,7256,7257,7258,7259,7260,7261, # 6928
|
||||
7262,7263,7264,7265,7266,7267,7268,7269,7270,7271,7272,7273,7274,7275,7276,7277, # 6944
|
||||
7278,7279,7280,7281,7282,7283,7284,7285,7286,7287,7288,7289,7290,7291,7292,7293, # 6960
|
||||
7294,7295,7296,4844,7297,7298,7299,7300,7301,7302,7303,7304,7305,7306,7307,7308, # 6976
|
||||
7309,7310,7311,7312,7313,7314,7315,7316,4451,7317,7318,7319,7320,7321,7322,7323, # 6992
|
||||
7324,7325,7326,7327,7328,7329,7330,7331,7332,7333,7334,7335,7336,7337,7338,7339, # 7008
|
||||
7340,7341,7342,7343,7344,7345,7346,7347,7348,7349,7350,7351,7352,7353,4173,7354, # 7024
|
||||
7355,4845,7356,7357,7358,7359,7360,7361,7362,7363,7364,7365,7366,7367,7368,7369, # 7040
|
||||
7370,7371,7372,7373,7374,7375,7376,7377,7378,7379,7380,7381,7382,7383,7384,7385, # 7056
|
||||
7386,7387,7388,4846,7389,7390,7391,7392,7393,7394,7395,7396,7397,7398,7399,7400, # 7072
|
||||
7401,7402,7403,7404,7405,3672,7406,7407,7408,7409,7410,7411,7412,7413,7414,7415, # 7088
|
||||
7416,7417,7418,7419,7420,7421,7422,7423,7424,7425,7426,7427,7428,7429,7430,7431, # 7104
|
||||
7432,7433,7434,7435,7436,7437,7438,7439,7440,7441,7442,7443,7444,7445,7446,7447, # 7120
|
||||
7448,7449,7450,7451,7452,7453,4452,7454,3200,7455,7456,7457,7458,7459,7460,7461, # 7136
|
||||
7462,7463,7464,7465,7466,7467,7468,7469,7470,7471,7472,7473,7474,4847,7475,7476, # 7152
|
||||
7477,3133,7478,7479,7480,7481,7482,7483,7484,7485,7486,7487,7488,7489,7490,7491, # 7168
|
||||
7492,7493,7494,7495,7496,7497,7498,7499,7500,7501,7502,3347,7503,7504,7505,7506, # 7184
|
||||
7507,7508,7509,7510,7511,7512,7513,7514,7515,7516,7517,7518,7519,7520,7521,4848, # 7200
|
||||
7522,7523,7524,7525,7526,7527,7528,7529,7530,7531,7532,7533,7534,7535,7536,7537, # 7216
|
||||
7538,7539,7540,7541,7542,7543,7544,7545,7546,7547,7548,7549,3801,4849,7550,7551, # 7232
|
||||
7552,7553,7554,7555,7556,7557,7558,7559,7560,7561,7562,7563,7564,7565,7566,7567, # 7248
|
||||
7568,7569,3035,7570,7571,7572,7573,7574,7575,7576,7577,7578,7579,7580,7581,7582, # 7264
|
||||
7583,7584,7585,7586,7587,7588,7589,7590,7591,7592,7593,7594,7595,7596,7597,7598, # 7280
|
||||
7599,7600,7601,7602,7603,7604,7605,7606,7607,7608,7609,7610,7611,7612,7613,7614, # 7296
|
||||
7615,7616,4850,7617,7618,3802,7619,7620,7621,7622,7623,7624,7625,7626,7627,7628, # 7312
|
||||
7629,7630,7631,7632,4851,7633,7634,7635,7636,7637,7638,7639,7640,7641,7642,7643, # 7328
|
||||
7644,7645,7646,7647,7648,7649,7650,7651,7652,7653,7654,7655,7656,7657,7658,7659, # 7344
|
||||
7660,7661,7662,7663,7664,7665,7666,7667,7668,7669,7670,4453,7671,7672,7673,7674, # 7360
|
||||
7675,7676,7677,7678,7679,7680,7681,7682,7683,7684,7685,7686,7687,7688,7689,7690, # 7376
|
||||
7691,7692,7693,7694,7695,7696,7697,3443,7698,7699,7700,7701,7702,4454,7703,7704, # 7392
|
||||
7705,7706,7707,7708,7709,7710,7711,7712,7713,2472,7714,7715,7716,7717,7718,7719, # 7408
|
||||
7720,7721,7722,7723,7724,7725,7726,7727,7728,7729,7730,7731,3954,7732,7733,7734, # 7424
|
||||
7735,7736,7737,7738,7739,7740,7741,7742,7743,7744,7745,7746,7747,7748,7749,7750, # 7440
|
||||
3134,7751,7752,4852,7753,7754,7755,4853,7756,7757,7758,7759,7760,4174,7761,7762, # 7456
|
||||
7763,7764,7765,7766,7767,7768,7769,7770,7771,7772,7773,7774,7775,7776,7777,7778, # 7472
|
||||
7779,7780,7781,7782,7783,7784,7785,7786,7787,7788,7789,7790,7791,7792,7793,7794, # 7488
|
||||
7795,7796,7797,7798,7799,7800,7801,7802,7803,7804,7805,4854,7806,7807,7808,7809, # 7504
|
||||
7810,7811,7812,7813,7814,7815,7816,7817,7818,7819,7820,7821,7822,7823,7824,7825, # 7520
|
||||
4855,7826,7827,7828,7829,7830,7831,7832,7833,7834,7835,7836,7837,7838,7839,7840, # 7536
|
||||
7841,7842,7843,7844,7845,7846,7847,3955,7848,7849,7850,7851,7852,7853,7854,7855, # 7552
|
||||
7856,7857,7858,7859,7860,3444,7861,7862,7863,7864,7865,7866,7867,7868,7869,7870, # 7568
|
||||
7871,7872,7873,7874,7875,7876,7877,7878,7879,7880,7881,7882,7883,7884,7885,7886, # 7584
|
||||
7887,7888,7889,7890,7891,4175,7892,7893,7894,7895,7896,4856,4857,7897,7898,7899, # 7600
|
||||
7900,2598,7901,7902,7903,7904,7905,7906,7907,7908,4455,7909,7910,7911,7912,7913, # 7616
|
||||
7914,3201,7915,7916,7917,7918,7919,7920,7921,4858,7922,7923,7924,7925,7926,7927, # 7632
|
||||
7928,7929,7930,7931,7932,7933,7934,7935,7936,7937,7938,7939,7940,7941,7942,7943, # 7648
|
||||
7944,7945,7946,7947,7948,7949,7950,7951,7952,7953,7954,7955,7956,7957,7958,7959, # 7664
|
||||
7960,7961,7962,7963,7964,7965,7966,7967,7968,7969,7970,7971,7972,7973,7974,7975, # 7680
|
||||
7976,7977,7978,7979,7980,7981,4859,7982,7983,7984,7985,7986,7987,7988,7989,7990, # 7696
|
||||
7991,7992,7993,7994,7995,7996,4860,7997,7998,7999,8000,8001,8002,8003,8004,8005, # 7712
|
||||
8006,8007,8008,8009,8010,8011,8012,8013,8014,8015,8016,4176,8017,8018,8019,8020, # 7728
|
||||
8021,8022,8023,4861,8024,8025,8026,8027,8028,8029,8030,8031,8032,8033,8034,8035, # 7744
|
||||
8036,4862,4456,8037,8038,8039,8040,4863,8041,8042,8043,8044,8045,8046,8047,8048, # 7760
|
||||
8049,8050,8051,8052,8053,8054,8055,8056,8057,8058,8059,8060,8061,8062,8063,8064, # 7776
|
||||
8065,8066,8067,8068,8069,8070,8071,8072,8073,8074,8075,8076,8077,8078,8079,8080, # 7792
|
||||
8081,8082,8083,8084,8085,8086,8087,8088,8089,8090,8091,8092,8093,8094,8095,8096, # 7808
|
||||
8097,8098,8099,4864,4177,8100,8101,8102,8103,8104,8105,8106,8107,8108,8109,8110, # 7824
|
||||
8111,8112,8113,8114,8115,8116,8117,8118,8119,8120,4178,8121,8122,8123,8124,8125, # 7840
|
||||
8126,8127,8128,8129,8130,8131,8132,8133,8134,8135,8136,8137,8138,8139,8140,8141, # 7856
|
||||
8142,8143,8144,8145,4865,4866,8146,8147,8148,8149,8150,8151,8152,8153,8154,8155, # 7872
|
||||
8156,8157,8158,8159,8160,8161,8162,8163,8164,8165,4179,8166,8167,8168,8169,8170, # 7888
|
||||
8171,8172,8173,8174,8175,8176,8177,8178,8179,8180,8181,4457,8182,8183,8184,8185, # 7904
|
||||
8186,8187,8188,8189,8190,8191,8192,8193,8194,8195,8196,8197,8198,8199,8200,8201, # 7920
|
||||
8202,8203,8204,8205,8206,8207,8208,8209,8210,8211,8212,8213,8214,8215,8216,8217, # 7936
|
||||
8218,8219,8220,8221,8222,8223,8224,8225,8226,8227,8228,8229,8230,8231,8232,8233, # 7952
|
||||
8234,8235,8236,8237,8238,8239,8240,8241,8242,8243,8244,8245,8246,8247,8248,8249, # 7968
|
||||
8250,8251,8252,8253,8254,8255,8256,3445,8257,8258,8259,8260,8261,8262,4458,8263, # 7984
|
||||
8264,8265,8266,8267,8268,8269,8270,8271,8272,4459,8273,8274,8275,8276,3550,8277, # 8000
|
||||
8278,8279,8280,8281,8282,8283,8284,8285,8286,8287,8288,8289,4460,8290,8291,8292, # 8016
|
||||
8293,8294,8295,8296,8297,8298,8299,8300,8301,8302,8303,8304,8305,8306,8307,4867, # 8032
|
||||
8308,8309,8310,8311,8312,3551,8313,8314,8315,8316,8317,8318,8319,8320,8321,8322, # 8048
|
||||
8323,8324,8325,8326,4868,8327,8328,8329,8330,8331,8332,8333,8334,8335,8336,8337, # 8064
|
||||
8338,8339,8340,8341,8342,8343,8344,8345,8346,8347,8348,8349,8350,8351,8352,8353, # 8080
|
||||
8354,8355,8356,8357,8358,8359,8360,8361,8362,8363,4869,4461,8364,8365,8366,8367, # 8096
|
||||
8368,8369,8370,4870,8371,8372,8373,8374,8375,8376,8377,8378,8379,8380,8381,8382, # 8112
|
||||
8383,8384,8385,8386,8387,8388,8389,8390,8391,8392,8393,8394,8395,8396,8397,8398, # 8128
|
||||
8399,8400,8401,8402,8403,8404,8405,8406,8407,8408,8409,8410,4871,8411,8412,8413, # 8144
|
||||
8414,8415,8416,8417,8418,8419,8420,8421,8422,4462,8423,8424,8425,8426,8427,8428, # 8160
|
||||
8429,8430,8431,8432,8433,2986,8434,8435,8436,8437,8438,8439,8440,8441,8442,8443, # 8176
|
||||
8444,8445,8446,8447,8448,8449,8450,8451,8452,8453,8454,8455,8456,8457,8458,8459, # 8192
|
||||
8460,8461,8462,8463,8464,8465,8466,8467,8468,8469,8470,8471,8472,8473,8474,8475, # 8208
|
||||
8476,8477,8478,4180,8479,8480,8481,8482,8483,8484,8485,8486,8487,8488,8489,8490, # 8224
|
||||
8491,8492,8493,8494,8495,8496,8497,8498,8499,8500,8501,8502,8503,8504,8505,8506, # 8240
|
||||
8507,8508,8509,8510,8511,8512,8513,8514,8515,8516,8517,8518,8519,8520,8521,8522, # 8256
|
||||
8523,8524,8525,8526,8527,8528,8529,8530,8531,8532,8533,8534,8535,8536,8537,8538, # 8272
|
||||
8539,8540,8541,8542,8543,8544,8545,8546,8547,8548,8549,8550,8551,8552,8553,8554, # 8288
|
||||
8555,8556,8557,8558,8559,8560,8561,8562,8563,8564,4872,8565,8566,8567,8568,8569, # 8304
|
||||
8570,8571,8572,8573,4873,8574,8575,8576,8577,8578,8579,8580,8581,8582,8583,8584, # 8320
|
||||
8585,8586,8587,8588,8589,8590,8591,8592,8593,8594,8595,8596,8597,8598,8599,8600, # 8336
|
||||
8601,8602,8603,8604,8605,3803,8606,8607,8608,8609,8610,8611,8612,8613,4874,3804, # 8352
|
||||
8614,8615,8616,8617,8618,8619,8620,8621,3956,8622,8623,8624,8625,8626,8627,8628, # 8368
|
||||
8629,8630,8631,8632,8633,8634,8635,8636,8637,8638,2865,8639,8640,8641,8642,8643, # 8384
|
||||
8644,8645,8646,8647,8648,8649,8650,8651,8652,8653,8654,8655,8656,4463,8657,8658, # 8400
|
||||
8659,4875,4876,8660,8661,8662,8663,8664,8665,8666,8667,8668,8669,8670,8671,8672, # 8416
|
||||
8673,8674,8675,8676,8677,8678,8679,8680,8681,4464,8682,8683,8684,8685,8686,8687, # 8432
|
||||
8688,8689,8690,8691,8692,8693,8694,8695,8696,8697,8698,8699,8700,8701,8702,8703, # 8448
|
||||
8704,8705,8706,8707,8708,8709,2261,8710,8711,8712,8713,8714,8715,8716,8717,8718, # 8464
|
||||
8719,8720,8721,8722,8723,8724,8725,8726,8727,8728,8729,8730,8731,8732,8733,4181, # 8480
|
||||
8734,8735,8736,8737,8738,8739,8740,8741,8742,8743,8744,8745,8746,8747,8748,8749, # 8496
|
||||
8750,8751,8752,8753,8754,8755,8756,8757,8758,8759,8760,8761,8762,8763,4877,8764, # 8512
|
||||
8765,8766,8767,8768,8769,8770,8771,8772,8773,8774,8775,8776,8777,8778,8779,8780, # 8528
|
||||
8781,8782,8783,8784,8785,8786,8787,8788,4878,8789,4879,8790,8791,8792,4880,8793, # 8544
|
||||
8794,8795,8796,8797,8798,8799,8800,8801,4881,8802,8803,8804,8805,8806,8807,8808, # 8560
|
||||
8809,8810,8811,8812,8813,8814,8815,3957,8816,8817,8818,8819,8820,8821,8822,8823, # 8576
|
||||
8824,8825,8826,8827,8828,8829,8830,8831,8832,8833,8834,8835,8836,8837,8838,8839, # 8592
|
||||
8840,8841,8842,8843,8844,8845,8846,8847,4882,8848,8849,8850,8851,8852,8853,8854, # 8608
|
||||
8855,8856,8857,8858,8859,8860,8861,8862,8863,8864,8865,8866,8867,8868,8869,8870, # 8624
|
||||
8871,8872,8873,8874,8875,8876,8877,8878,8879,8880,8881,8882,8883,8884,3202,8885, # 8640
|
||||
8886,8887,8888,8889,8890,8891,8892,8893,8894,8895,8896,8897,8898,8899,8900,8901, # 8656
|
||||
8902,8903,8904,8905,8906,8907,8908,8909,8910,8911,8912,8913,8914,8915,8916,8917, # 8672
|
||||
8918,8919,8920,8921,8922,8923,8924,4465,8925,8926,8927,8928,8929,8930,8931,8932, # 8688
|
||||
4883,8933,8934,8935,8936,8937,8938,8939,8940,8941,8942,8943,2214,8944,8945,8946, # 8704
|
||||
8947,8948,8949,8950,8951,8952,8953,8954,8955,8956,8957,8958,8959,8960,8961,8962, # 8720
|
||||
8963,8964,8965,4884,8966,8967,8968,8969,8970,8971,8972,8973,8974,8975,8976,8977, # 8736
|
||||
8978,8979,8980,8981,8982,8983,8984,8985,8986,8987,8988,8989,8990,8991,8992,4885, # 8752
|
||||
8993,8994,8995,8996,8997,8998,8999,9000,9001,9002,9003,9004,9005,9006,9007,9008, # 8768
|
||||
9009,9010,9011,9012,9013,9014,9015,9016,9017,9018,9019,9020,9021,4182,9022,9023, # 8784
|
||||
9024,9025,9026,9027,9028,9029,9030,9031,9032,9033,9034,9035,9036,9037,9038,9039, # 8800
|
||||
9040,9041,9042,9043,9044,9045,9046,9047,9048,9049,9050,9051,9052,9053,9054,9055, # 8816
|
||||
9056,9057,9058,9059,9060,9061,9062,9063,4886,9064,9065,9066,9067,9068,9069,4887, # 8832
|
||||
9070,9071,9072,9073,9074,9075,9076,9077,9078,9079,9080,9081,9082,9083,9084,9085, # 8848
|
||||
9086,9087,9088,9089,9090,9091,9092,9093,9094,9095,9096,9097,9098,9099,9100,9101, # 8864
|
||||
9102,9103,9104,9105,9106,9107,9108,9109,9110,9111,9112,9113,9114,9115,9116,9117, # 8880
|
||||
9118,9119,9120,9121,9122,9123,9124,9125,9126,9127,9128,9129,9130,9131,9132,9133, # 8896
|
||||
9134,9135,9136,9137,9138,9139,9140,9141,3958,9142,9143,9144,9145,9146,9147,9148, # 8912
|
||||
9149,9150,9151,4888,9152,9153,9154,9155,9156,9157,9158,9159,9160,9161,9162,9163, # 8928
|
||||
9164,9165,9166,9167,9168,9169,9170,9171,9172,9173,9174,9175,4889,9176,9177,9178, # 8944
|
||||
9179,9180,9181,9182,9183,9184,9185,9186,9187,9188,9189,9190,9191,9192,9193,9194, # 8960
|
||||
9195,9196,9197,9198,9199,9200,9201,9202,9203,4890,9204,9205,9206,9207,9208,9209, # 8976
|
||||
9210,9211,9212,9213,9214,9215,9216,9217,9218,9219,9220,9221,9222,4466,9223,9224, # 8992
|
||||
9225,9226,9227,9228,9229,9230,9231,9232,9233,9234,9235,9236,9237,9238,9239,9240, # 9008
|
||||
9241,9242,9243,9244,9245,4891,9246,9247,9248,9249,9250,9251,9252,9253,9254,9255, # 9024
|
||||
9256,9257,4892,9258,9259,9260,9261,4893,4894,9262,9263,9264,9265,9266,9267,9268, # 9040
|
||||
9269,9270,9271,9272,9273,4467,9274,9275,9276,9277,9278,9279,9280,9281,9282,9283, # 9056
|
||||
9284,9285,3673,9286,9287,9288,9289,9290,9291,9292,9293,9294,9295,9296,9297,9298, # 9072
|
||||
9299,9300,9301,9302,9303,9304,9305,9306,9307,9308,9309,9310,9311,9312,9313,9314, # 9088
|
||||
9315,9316,9317,9318,9319,9320,9321,9322,4895,9323,9324,9325,9326,9327,9328,9329, # 9104
|
||||
9330,9331,9332,9333,9334,9335,9336,9337,9338,9339,9340,9341,9342,9343,9344,9345, # 9120
|
||||
9346,9347,4468,9348,9349,9350,9351,9352,9353,9354,9355,9356,9357,9358,9359,9360, # 9136
|
||||
9361,9362,9363,9364,9365,9366,9367,9368,9369,9370,9371,9372,9373,4896,9374,4469, # 9152
|
||||
9375,9376,9377,9378,9379,4897,9380,9381,9382,9383,9384,9385,9386,9387,9388,9389, # 9168
|
||||
9390,9391,9392,9393,9394,9395,9396,9397,9398,9399,9400,9401,9402,9403,9404,9405, # 9184
|
||||
9406,4470,9407,2751,9408,9409,3674,3552,9410,9411,9412,9413,9414,9415,9416,9417, # 9200
|
||||
9418,9419,9420,9421,4898,9422,9423,9424,9425,9426,9427,9428,9429,3959,9430,9431, # 9216
|
||||
9432,9433,9434,9435,9436,4471,9437,9438,9439,9440,9441,9442,9443,9444,9445,9446, # 9232
|
||||
9447,9448,9449,9450,3348,9451,9452,9453,9454,9455,9456,9457,9458,9459,9460,9461, # 9248
|
||||
9462,9463,9464,9465,9466,9467,9468,9469,9470,9471,9472,4899,9473,9474,9475,9476, # 9264
|
||||
9477,4900,9478,9479,9480,9481,9482,9483,9484,9485,9486,9487,9488,3349,9489,9490, # 9280
|
||||
9491,9492,9493,9494,9495,9496,9497,9498,9499,9500,9501,9502,9503,9504,9505,9506, # 9296
|
||||
9507,9508,9509,9510,9511,9512,9513,9514,9515,9516,9517,9518,9519,9520,4901,9521, # 9312
|
||||
9522,9523,9524,9525,9526,4902,9527,9528,9529,9530,9531,9532,9533,9534,9535,9536, # 9328
|
||||
9537,9538,9539,9540,9541,9542,9543,9544,9545,9546,9547,9548,9549,9550,9551,9552, # 9344
|
||||
9553,9554,9555,9556,9557,9558,9559,9560,9561,9562,9563,9564,9565,9566,9567,9568, # 9360
|
||||
9569,9570,9571,9572,9573,9574,9575,9576,9577,9578,9579,9580,9581,9582,9583,9584, # 9376
|
||||
3805,9585,9586,9587,9588,9589,9590,9591,9592,9593,9594,9595,9596,9597,9598,9599, # 9392
|
||||
9600,9601,9602,4903,9603,9604,9605,9606,9607,4904,9608,9609,9610,9611,9612,9613, # 9408
|
||||
9614,4905,9615,9616,9617,9618,9619,9620,9621,9622,9623,9624,9625,9626,9627,9628, # 9424
|
||||
9629,9630,9631,9632,4906,9633,9634,9635,9636,9637,9638,9639,9640,9641,9642,9643, # 9440
|
||||
4907,9644,9645,9646,9647,9648,9649,9650,9651,9652,9653,9654,9655,9656,9657,9658, # 9456
|
||||
9659,9660,9661,9662,9663,9664,9665,9666,9667,9668,9669,9670,9671,9672,4183,9673, # 9472
|
||||
9674,9675,9676,9677,4908,9678,9679,9680,9681,4909,9682,9683,9684,9685,9686,9687, # 9488
|
||||
9688,9689,9690,4910,9691,9692,9693,3675,9694,9695,9696,2945,9697,9698,9699,9700, # 9504
|
||||
9701,9702,9703,9704,9705,4911,9706,9707,9708,9709,9710,9711,9712,9713,9714,9715, # 9520
|
||||
9716,9717,9718,9719,9720,9721,9722,9723,9724,9725,9726,9727,9728,9729,9730,9731, # 9536
|
||||
9732,9733,9734,9735,4912,9736,9737,9738,9739,9740,4913,9741,9742,9743,9744,9745, # 9552
|
||||
9746,9747,9748,9749,9750,9751,9752,9753,9754,9755,9756,9757,9758,4914,9759,9760, # 9568
|
||||
9761,9762,9763,9764,9765,9766,9767,9768,9769,9770,9771,9772,9773,9774,9775,9776, # 9584
|
||||
9777,9778,9779,9780,9781,9782,4915,9783,9784,9785,9786,9787,9788,9789,9790,9791, # 9600
|
||||
9792,9793,4916,9794,9795,9796,9797,9798,9799,9800,9801,9802,9803,9804,9805,9806, # 9616
|
||||
9807,9808,9809,9810,9811,9812,9813,9814,9815,9816,9817,9818,9819,9820,9821,9822, # 9632
|
||||
9823,9824,9825,9826,9827,9828,9829,9830,9831,9832,9833,9834,9835,9836,9837,9838, # 9648
|
||||
9839,9840,9841,9842,9843,9844,9845,9846,9847,9848,9849,9850,9851,9852,9853,9854, # 9664
|
||||
9855,9856,9857,9858,9859,9860,9861,9862,9863,9864,9865,9866,9867,9868,4917,9869, # 9680
|
||||
9870,9871,9872,9873,9874,9875,9876,9877,9878,9879,9880,9881,9882,9883,9884,9885, # 9696
|
||||
9886,9887,9888,9889,9890,9891,9892,4472,9893,9894,9895,9896,9897,3806,9898,9899, # 9712
|
||||
9900,9901,9902,9903,9904,9905,9906,9907,9908,9909,9910,9911,9912,9913,9914,4918, # 9728
|
||||
9915,9916,9917,4919,9918,9919,9920,9921,4184,9922,9923,9924,9925,9926,9927,9928, # 9744
|
||||
9929,9930,9931,9932,9933,9934,9935,9936,9937,9938,9939,9940,9941,9942,9943,9944, # 9760
|
||||
9945,9946,4920,9947,9948,9949,9950,9951,9952,9953,9954,9955,4185,9956,9957,9958, # 9776
|
||||
9959,9960,9961,9962,9963,9964,9965,4921,9966,9967,9968,4473,9969,9970,9971,9972, # 9792
|
||||
9973,9974,9975,9976,9977,4474,9978,9979,9980,9981,9982,9983,9984,9985,9986,9987, # 9808
|
||||
9988,9989,9990,9991,9992,9993,9994,9995,9996,9997,9998,9999,10000,10001,10002,10003, # 9824
|
||||
10004,10005,10006,10007,10008,10009,10010,10011,10012,10013,10014,10015,10016,10017,10018,10019, # 9840
|
||||
10020,10021,4922,10022,4923,10023,10024,10025,10026,10027,10028,10029,10030,10031,10032,10033, # 9856
|
||||
10034,10035,10036,10037,10038,10039,10040,10041,10042,10043,10044,10045,10046,10047,10048,4924, # 9872
|
||||
10049,10050,10051,10052,10053,10054,10055,10056,10057,10058,10059,10060,10061,10062,10063,10064, # 9888
|
||||
10065,10066,10067,10068,10069,10070,10071,10072,10073,10074,10075,10076,10077,10078,10079,10080, # 9904
|
||||
10081,10082,10083,10084,10085,10086,10087,4475,10088,10089,10090,10091,10092,10093,10094,10095, # 9920
|
||||
10096,10097,4476,10098,10099,10100,10101,10102,10103,10104,10105,10106,10107,10108,10109,10110, # 9936
|
||||
10111,2174,10112,10113,10114,10115,10116,10117,10118,10119,10120,10121,10122,10123,10124,10125, # 9952
|
||||
10126,10127,10128,10129,10130,10131,10132,10133,10134,10135,10136,10137,10138,10139,10140,3807, # 9968
|
||||
4186,4925,10141,10142,10143,10144,10145,10146,10147,4477,4187,10148,10149,10150,10151,10152, # 9984
|
||||
10153,4188,10154,10155,10156,10157,10158,10159,10160,10161,4926,10162,10163,10164,10165,10166, #10000
|
||||
10167,10168,10169,10170,10171,10172,10173,10174,10175,10176,10177,10178,10179,10180,10181,10182, #10016
|
||||
10183,10184,10185,10186,10187,10188,10189,10190,10191,10192,3203,10193,10194,10195,10196,10197, #10032
|
||||
10198,10199,10200,4478,10201,10202,10203,10204,4479,10205,10206,10207,10208,10209,10210,10211, #10048
|
||||
10212,10213,10214,10215,10216,10217,10218,10219,10220,10221,10222,10223,10224,10225,10226,10227, #10064
|
||||
10228,10229,10230,10231,10232,10233,10234,4927,10235,10236,10237,10238,10239,10240,10241,10242, #10080
|
||||
10243,10244,10245,10246,10247,10248,10249,10250,10251,10252,10253,10254,10255,10256,10257,10258, #10096
|
||||
10259,10260,10261,10262,10263,10264,10265,10266,10267,10268,10269,10270,10271,10272,10273,4480, #10112
|
||||
4928,4929,10274,10275,10276,10277,10278,10279,10280,10281,10282,10283,10284,10285,10286,10287, #10128
|
||||
10288,10289,10290,10291,10292,10293,10294,10295,10296,10297,10298,10299,10300,10301,10302,10303, #10144
|
||||
10304,10305,10306,10307,10308,10309,10310,10311,10312,10313,10314,10315,10316,10317,10318,10319, #10160
|
||||
10320,10321,10322,10323,10324,10325,10326,10327,10328,10329,10330,10331,10332,10333,10334,4930, #10176
|
||||
10335,10336,10337,10338,10339,10340,10341,10342,4931,10343,10344,10345,10346,10347,10348,10349, #10192
|
||||
10350,10351,10352,10353,10354,10355,3088,10356,2786,10357,10358,10359,10360,4189,10361,10362, #10208
|
||||
10363,10364,10365,10366,10367,10368,10369,10370,10371,10372,10373,10374,10375,4932,10376,10377, #10224
|
||||
10378,10379,10380,10381,10382,10383,10384,10385,10386,10387,10388,10389,10390,10391,10392,4933, #10240
|
||||
10393,10394,10395,4934,10396,10397,10398,10399,10400,10401,10402,10403,10404,10405,10406,10407, #10256
|
||||
10408,10409,10410,10411,10412,3446,10413,10414,10415,10416,10417,10418,10419,10420,10421,10422, #10272
|
||||
10423,4935,10424,10425,10426,10427,10428,10429,10430,4936,10431,10432,10433,10434,10435,10436, #10288
|
||||
10437,10438,10439,10440,10441,10442,10443,4937,10444,10445,10446,10447,4481,10448,10449,10450, #10304
|
||||
10451,10452,10453,10454,10455,10456,10457,10458,10459,10460,10461,10462,10463,10464,10465,10466, #10320
|
||||
10467,10468,10469,10470,10471,10472,10473,10474,10475,10476,10477,10478,10479,10480,10481,10482, #10336
|
||||
10483,10484,10485,10486,10487,10488,10489,10490,10491,10492,10493,10494,10495,10496,10497,10498, #10352
|
||||
10499,10500,10501,10502,10503,10504,10505,4938,10506,10507,10508,10509,10510,2552,10511,10512, #10368
|
||||
10513,10514,10515,10516,3447,10517,10518,10519,10520,10521,10522,10523,10524,10525,10526,10527, #10384
|
||||
10528,10529,10530,10531,10532,10533,10534,10535,10536,10537,10538,10539,10540,10541,10542,10543, #10400
|
||||
4482,10544,4939,10545,10546,10547,10548,10549,10550,10551,10552,10553,10554,10555,10556,10557, #10416
|
||||
10558,10559,10560,10561,10562,10563,10564,10565,10566,10567,3676,4483,10568,10569,10570,10571, #10432
|
||||
10572,3448,10573,10574,10575,10576,10577,10578,10579,10580,10581,10582,10583,10584,10585,10586, #10448
|
||||
10587,10588,10589,10590,10591,10592,10593,10594,10595,10596,10597,10598,10599,10600,10601,10602, #10464
|
||||
10603,10604,10605,10606,10607,10608,10609,10610,10611,10612,10613,10614,10615,10616,10617,10618, #10480
|
||||
10619,10620,10621,10622,10623,10624,10625,10626,10627,4484,10628,10629,10630,10631,10632,4940, #10496
|
||||
10633,10634,10635,10636,10637,10638,10639,10640,10641,10642,10643,10644,10645,10646,10647,10648, #10512
|
||||
10649,10650,10651,10652,10653,10654,10655,10656,4941,10657,10658,10659,2599,10660,10661,10662, #10528
|
||||
10663,10664,10665,10666,3089,10667,10668,10669,10670,10671,10672,10673,10674,10675,10676,10677, #10544
|
||||
10678,10679,10680,4942,10681,10682,10683,10684,10685,10686,10687,10688,10689,10690,10691,10692, #10560
|
||||
10693,10694,10695,10696,10697,4485,10698,10699,10700,10701,10702,10703,10704,4943,10705,3677, #10576
|
||||
10706,10707,10708,10709,10710,10711,10712,4944,10713,10714,10715,10716,10717,10718,10719,10720, #10592
|
||||
10721,10722,10723,10724,10725,10726,10727,10728,4945,10729,10730,10731,10732,10733,10734,10735, #10608
|
||||
10736,10737,10738,10739,10740,10741,10742,10743,10744,10745,10746,10747,10748,10749,10750,10751, #10624
|
||||
10752,10753,10754,10755,10756,10757,10758,10759,10760,10761,4946,10762,10763,10764,10765,10766, #10640
|
||||
10767,4947,4948,10768,10769,10770,10771,10772,10773,10774,10775,10776,10777,10778,10779,10780, #10656
|
||||
10781,10782,10783,10784,10785,10786,10787,10788,10789,10790,10791,10792,10793,10794,10795,10796, #10672
|
||||
10797,10798,10799,10800,10801,10802,10803,10804,10805,10806,10807,10808,10809,10810,10811,10812, #10688
|
||||
10813,10814,10815,10816,10817,10818,10819,10820,10821,10822,10823,10824,10825,10826,10827,10828, #10704
|
||||
10829,10830,10831,10832,10833,10834,10835,10836,10837,10838,10839,10840,10841,10842,10843,10844, #10720
|
||||
10845,10846,10847,10848,10849,10850,10851,10852,10853,10854,10855,10856,10857,10858,10859,10860, #10736
|
||||
10861,10862,10863,10864,10865,10866,10867,10868,10869,10870,10871,10872,10873,10874,10875,10876, #10752
|
||||
10877,10878,4486,10879,10880,10881,10882,10883,10884,10885,4949,10886,10887,10888,10889,10890, #10768
|
||||
10891,10892,10893,10894,10895,10896,10897,10898,10899,10900,10901,10902,10903,10904,10905,10906, #10784
|
||||
10907,10908,10909,10910,10911,10912,10913,10914,10915,10916,10917,10918,10919,4487,10920,10921, #10800
|
||||
10922,10923,10924,10925,10926,10927,10928,10929,10930,10931,10932,4950,10933,10934,10935,10936, #10816
|
||||
10937,10938,10939,10940,10941,10942,10943,10944,10945,10946,10947,10948,10949,4488,10950,10951, #10832
|
||||
10952,10953,10954,10955,10956,10957,10958,10959,4190,10960,10961,10962,10963,10964,10965,10966, #10848
|
||||
10967,10968,10969,10970,10971,10972,10973,10974,10975,10976,10977,10978,10979,10980,10981,10982, #10864
|
||||
10983,10984,10985,10986,10987,10988,10989,10990,10991,10992,10993,10994,10995,10996,10997,10998, #10880
|
||||
10999,11000,11001,11002,11003,11004,11005,11006,3960,11007,11008,11009,11010,11011,11012,11013, #10896
|
||||
11014,11015,11016,11017,11018,11019,11020,11021,11022,11023,11024,11025,11026,11027,11028,11029, #10912
|
||||
11030,11031,11032,4951,11033,11034,11035,11036,11037,11038,11039,11040,11041,11042,11043,11044, #10928
|
||||
11045,11046,11047,4489,11048,11049,11050,11051,4952,11052,11053,11054,11055,11056,11057,11058, #10944
|
||||
4953,11059,11060,11061,11062,11063,11064,11065,11066,11067,11068,11069,11070,11071,4954,11072, #10960
|
||||
11073,11074,11075,11076,11077,11078,11079,11080,11081,11082,11083,11084,11085,11086,11087,11088, #10976
|
||||
11089,11090,11091,11092,11093,11094,11095,11096,11097,11098,11099,11100,11101,11102,11103,11104, #10992
|
||||
11105,11106,11107,11108,11109,11110,11111,11112,11113,11114,11115,3808,11116,11117,11118,11119, #11008
|
||||
11120,11121,11122,11123,11124,11125,11126,11127,11128,11129,11130,11131,11132,11133,11134,4955, #11024
|
||||
11135,11136,11137,11138,11139,11140,11141,11142,11143,11144,11145,11146,11147,11148,11149,11150, #11040
|
||||
11151,11152,11153,11154,11155,11156,11157,11158,11159,11160,11161,4956,11162,11163,11164,11165, #11056
|
||||
11166,11167,11168,11169,11170,11171,11172,11173,11174,11175,11176,11177,11178,11179,11180,4957, #11072
|
||||
11181,11182,11183,11184,11185,11186,4958,11187,11188,11189,11190,11191,11192,11193,11194,11195, #11088
|
||||
11196,11197,11198,11199,11200,3678,11201,11202,11203,11204,11205,11206,4191,11207,11208,11209, #11104
|
||||
11210,11211,11212,11213,11214,11215,11216,11217,11218,11219,11220,11221,11222,11223,11224,11225, #11120
|
||||
11226,11227,11228,11229,11230,11231,11232,11233,11234,11235,11236,11237,11238,11239,11240,11241, #11136
|
||||
11242,11243,11244,11245,11246,11247,11248,11249,11250,11251,4959,11252,11253,11254,11255,11256, #11152
|
||||
11257,11258,11259,11260,11261,11262,11263,11264,11265,11266,11267,11268,11269,11270,11271,11272, #11168
|
||||
11273,11274,11275,11276,11277,11278,11279,11280,11281,11282,11283,11284,11285,11286,11287,11288, #11184
|
||||
11289,11290,11291,11292,11293,11294,11295,11296,11297,11298,11299,11300,11301,11302,11303,11304, #11200
|
||||
11305,11306,11307,11308,11309,11310,11311,11312,11313,11314,3679,11315,11316,11317,11318,4490, #11216
|
||||
11319,11320,11321,11322,11323,11324,11325,11326,11327,11328,11329,11330,11331,11332,11333,11334, #11232
|
||||
11335,11336,11337,11338,11339,11340,11341,11342,11343,11344,11345,11346,11347,4960,11348,11349, #11248
|
||||
11350,11351,11352,11353,11354,11355,11356,11357,11358,11359,11360,11361,11362,11363,11364,11365, #11264
|
||||
11366,11367,11368,11369,11370,11371,11372,11373,11374,11375,11376,11377,3961,4961,11378,11379, #11280
|
||||
11380,11381,11382,11383,11384,11385,11386,11387,11388,11389,11390,11391,11392,11393,11394,11395, #11296
|
||||
11396,11397,4192,11398,11399,11400,11401,11402,11403,11404,11405,11406,11407,11408,11409,11410, #11312
|
||||
11411,4962,11412,11413,11414,11415,11416,11417,11418,11419,11420,11421,11422,11423,11424,11425, #11328
|
||||
11426,11427,11428,11429,11430,11431,11432,11433,11434,11435,11436,11437,11438,11439,11440,11441, #11344
|
||||
11442,11443,11444,11445,11446,11447,11448,11449,11450,11451,11452,11453,11454,11455,11456,11457, #11360
|
||||
11458,11459,11460,11461,11462,11463,11464,11465,11466,11467,11468,11469,4963,11470,11471,4491, #11376
|
||||
11472,11473,11474,11475,4964,11476,11477,11478,11479,11480,11481,11482,11483,11484,11485,11486, #11392
|
||||
11487,11488,11489,11490,11491,11492,4965,11493,11494,11495,11496,11497,11498,11499,11500,11501, #11408
|
||||
11502,11503,11504,11505,11506,11507,11508,11509,11510,11511,11512,11513,11514,11515,11516,11517, #11424
|
||||
11518,11519,11520,11521,11522,11523,11524,11525,11526,11527,11528,11529,3962,11530,11531,11532, #11440
|
||||
11533,11534,11535,11536,11537,11538,11539,11540,11541,11542,11543,11544,11545,11546,11547,11548, #11456
|
||||
11549,11550,11551,11552,11553,11554,11555,11556,11557,11558,11559,11560,11561,11562,11563,11564, #11472
|
||||
4193,4194,11565,11566,11567,11568,11569,11570,11571,11572,11573,11574,11575,11576,11577,11578, #11488
|
||||
11579,11580,11581,11582,11583,11584,11585,11586,11587,11588,11589,11590,11591,4966,4195,11592, #11504
|
||||
11593,11594,11595,11596,11597,11598,11599,11600,11601,11602,11603,11604,3090,11605,11606,11607, #11520
|
||||
11608,11609,11610,4967,11611,11612,11613,11614,11615,11616,11617,11618,11619,11620,11621,11622, #11536
|
||||
11623,11624,11625,11626,11627,11628,11629,11630,11631,11632,11633,11634,11635,11636,11637,11638, #11552
|
||||
11639,11640,11641,11642,11643,11644,11645,11646,11647,11648,11649,11650,11651,11652,11653,11654, #11568
|
||||
11655,11656,11657,11658,11659,11660,11661,11662,11663,11664,11665,11666,11667,11668,11669,11670, #11584
|
||||
11671,11672,11673,11674,4968,11675,11676,11677,11678,11679,11680,11681,11682,11683,11684,11685, #11600
|
||||
11686,11687,11688,11689,11690,11691,11692,11693,3809,11694,11695,11696,11697,11698,11699,11700, #11616
|
||||
11701,11702,11703,11704,11705,11706,11707,11708,11709,11710,11711,11712,11713,11714,11715,11716, #11632
|
||||
11717,11718,3553,11719,11720,11721,11722,11723,11724,11725,11726,11727,11728,11729,11730,4969, #11648
|
||||
11731,11732,11733,11734,11735,11736,11737,11738,11739,11740,4492,11741,11742,11743,11744,11745, #11664
|
||||
11746,11747,11748,11749,11750,11751,11752,4970,11753,11754,11755,11756,11757,11758,11759,11760, #11680
|
||||
11761,11762,11763,11764,11765,11766,11767,11768,11769,11770,11771,11772,11773,11774,11775,11776, #11696
|
||||
11777,11778,11779,11780,11781,11782,11783,11784,11785,11786,11787,11788,11789,11790,4971,11791, #11712
|
||||
11792,11793,11794,11795,11796,11797,4972,11798,11799,11800,11801,11802,11803,11804,11805,11806, #11728
|
||||
11807,11808,11809,11810,4973,11811,11812,11813,11814,11815,11816,11817,11818,11819,11820,11821, #11744
|
||||
11822,11823,11824,11825,11826,11827,11828,11829,11830,11831,11832,11833,11834,3680,3810,11835, #11760
|
||||
11836,4974,11837,11838,11839,11840,11841,11842,11843,11844,11845,11846,11847,11848,11849,11850, #11776
|
||||
11851,11852,11853,11854,11855,11856,11857,11858,11859,11860,11861,11862,11863,11864,11865,11866, #11792
|
||||
11867,11868,11869,11870,11871,11872,11873,11874,11875,11876,11877,11878,11879,11880,11881,11882, #11808
|
||||
11883,11884,4493,11885,11886,11887,11888,11889,11890,11891,11892,11893,11894,11895,11896,11897, #11824
|
||||
11898,11899,11900,11901,11902,11903,11904,11905,11906,11907,11908,11909,11910,11911,11912,11913, #11840
|
||||
11914,11915,4975,11916,11917,11918,11919,11920,11921,11922,11923,11924,11925,11926,11927,11928, #11856
|
||||
11929,11930,11931,11932,11933,11934,11935,11936,11937,11938,11939,11940,11941,11942,11943,11944, #11872
|
||||
11945,11946,11947,11948,11949,4976,11950,11951,11952,11953,11954,11955,11956,11957,11958,11959, #11888
|
||||
11960,11961,11962,11963,11964,11965,11966,11967,11968,11969,11970,11971,11972,11973,11974,11975, #11904
|
||||
11976,11977,11978,11979,11980,11981,11982,11983,11984,11985,11986,11987,4196,11988,11989,11990, #11920
|
||||
11991,11992,4977,11993,11994,11995,11996,11997,11998,11999,12000,12001,12002,12003,12004,12005, #11936
|
||||
12006,12007,12008,12009,12010,12011,12012,12013,12014,12015,12016,12017,12018,12019,12020,12021, #11952
|
||||
12022,12023,12024,12025,12026,12027,12028,12029,12030,12031,12032,12033,12034,12035,12036,12037, #11968
|
||||
12038,12039,12040,12041,12042,12043,12044,12045,12046,12047,12048,12049,12050,12051,12052,12053, #11984
|
||||
12054,12055,12056,12057,12058,12059,12060,12061,4978,12062,12063,12064,12065,12066,12067,12068, #12000
|
||||
12069,12070,12071,12072,12073,12074,12075,12076,12077,12078,12079,12080,12081,12082,12083,12084, #12016
|
||||
12085,12086,12087,12088,12089,12090,12091,12092,12093,12094,12095,12096,12097,12098,12099,12100, #12032
|
||||
12101,12102,12103,12104,12105,12106,12107,12108,12109,12110,12111,12112,12113,12114,12115,12116, #12048
|
||||
12117,12118,12119,12120,12121,12122,12123,4979,12124,12125,12126,12127,12128,4197,12129,12130, #12064
|
||||
12131,12132,12133,12134,12135,12136,12137,12138,12139,12140,12141,12142,12143,12144,12145,12146, #12080
|
||||
12147,12148,12149,12150,12151,12152,12153,12154,4980,12155,12156,12157,12158,12159,12160,4494, #12096
|
||||
12161,12162,12163,12164,3811,12165,12166,12167,12168,12169,4495,12170,12171,4496,12172,12173, #12112
|
||||
12174,12175,12176,3812,12177,12178,12179,12180,12181,12182,12183,12184,12185,12186,12187,12188, #12128
|
||||
12189,12190,12191,12192,12193,12194,12195,12196,12197,12198,12199,12200,12201,12202,12203,12204, #12144
|
||||
12205,12206,12207,12208,12209,12210,12211,12212,12213,12214,12215,12216,12217,12218,12219,12220, #12160
|
||||
12221,4981,12222,12223,12224,12225,12226,12227,12228,12229,12230,12231,12232,12233,12234,12235, #12176
|
||||
4982,12236,12237,12238,12239,12240,12241,12242,12243,12244,12245,4983,12246,12247,12248,12249, #12192
|
||||
4984,12250,12251,12252,12253,12254,12255,12256,12257,12258,12259,12260,12261,12262,12263,12264, #12208
|
||||
4985,12265,4497,12266,12267,12268,12269,12270,12271,12272,12273,12274,12275,12276,12277,12278, #12224
|
||||
12279,12280,12281,12282,12283,12284,12285,12286,12287,4986,12288,12289,12290,12291,12292,12293, #12240
|
||||
12294,12295,12296,2473,12297,12298,12299,12300,12301,12302,12303,12304,12305,12306,12307,12308, #12256
|
||||
12309,12310,12311,12312,12313,12314,12315,12316,12317,12318,12319,3963,12320,12321,12322,12323, #12272
|
||||
12324,12325,12326,12327,12328,12329,12330,12331,12332,4987,12333,12334,12335,12336,12337,12338, #12288
|
||||
12339,12340,12341,12342,12343,12344,12345,12346,12347,12348,12349,12350,12351,12352,12353,12354, #12304
|
||||
12355,12356,12357,12358,12359,3964,12360,12361,12362,12363,12364,12365,12366,12367,12368,12369, #12320
|
||||
12370,3965,12371,12372,12373,12374,12375,12376,12377,12378,12379,12380,12381,12382,12383,12384, #12336
|
||||
12385,12386,12387,12388,12389,12390,12391,12392,12393,12394,12395,12396,12397,12398,12399,12400, #12352
|
||||
12401,12402,12403,12404,12405,12406,12407,12408,4988,12409,12410,12411,12412,12413,12414,12415, #12368
|
||||
12416,12417,12418,12419,12420,12421,12422,12423,12424,12425,12426,12427,12428,12429,12430,12431, #12384
|
||||
12432,12433,12434,12435,12436,12437,12438,3554,12439,12440,12441,12442,12443,12444,12445,12446, #12400
|
||||
12447,12448,12449,12450,12451,12452,12453,12454,12455,12456,12457,12458,12459,12460,12461,12462, #12416
|
||||
12463,12464,4989,12465,12466,12467,12468,12469,12470,12471,12472,12473,12474,12475,12476,12477, #12432
|
||||
12478,12479,12480,4990,12481,12482,12483,12484,12485,12486,12487,12488,12489,4498,12490,12491, #12448
|
||||
12492,12493,12494,12495,12496,12497,12498,12499,12500,12501,12502,12503,12504,12505,12506,12507, #12464
|
||||
12508,12509,12510,12511,12512,12513,12514,12515,12516,12517,12518,12519,12520,12521,12522,12523, #12480
|
||||
12524,12525,12526,12527,12528,12529,12530,12531,12532,12533,12534,12535,12536,12537,12538,12539, #12496
|
||||
12540,12541,12542,12543,12544,12545,12546,12547,12548,12549,12550,12551,4991,12552,12553,12554, #12512
|
||||
12555,12556,12557,12558,12559,12560,12561,12562,12563,12564,12565,12566,12567,12568,12569,12570, #12528
|
||||
12571,12572,12573,12574,12575,12576,12577,12578,3036,12579,12580,12581,12582,12583,3966,12584, #12544
|
||||
12585,12586,12587,12588,12589,12590,12591,12592,12593,12594,12595,12596,12597,12598,12599,12600, #12560
|
||||
12601,12602,12603,12604,12605,12606,12607,12608,12609,12610,12611,12612,12613,12614,12615,12616, #12576
|
||||
12617,12618,12619,12620,12621,12622,12623,12624,12625,12626,12627,12628,12629,12630,12631,12632, #12592
|
||||
12633,12634,12635,12636,12637,12638,12639,12640,12641,12642,12643,12644,12645,12646,4499,12647, #12608
|
||||
12648,12649,12650,12651,12652,12653,12654,12655,12656,12657,12658,12659,12660,12661,12662,12663, #12624
|
||||
12664,12665,12666,12667,12668,12669,12670,12671,12672,12673,12674,12675,12676,12677,12678,12679, #12640
|
||||
12680,12681,12682,12683,12684,12685,12686,12687,12688,12689,12690,12691,12692,12693,12694,12695, #12656
|
||||
12696,12697,12698,4992,12699,12700,12701,12702,12703,12704,12705,12706,12707,12708,12709,12710, #12672
|
||||
12711,12712,12713,12714,12715,12716,12717,12718,12719,12720,12721,12722,12723,12724,12725,12726, #12688
|
||||
12727,12728,12729,12730,12731,12732,12733,12734,12735,12736,12737,12738,12739,12740,12741,12742, #12704
|
||||
12743,12744,12745,12746,12747,12748,12749,12750,12751,12752,12753,12754,12755,12756,12757,12758, #12720
|
||||
12759,12760,12761,12762,12763,12764,12765,12766,12767,12768,12769,12770,12771,12772,12773,12774, #12736
|
||||
12775,12776,12777,12778,4993,2175,12779,12780,12781,12782,12783,12784,12785,12786,4500,12787, #12752
|
||||
12788,12789,12790,12791,12792,12793,12794,12795,12796,12797,12798,12799,12800,12801,12802,12803, #12768
|
||||
12804,12805,12806,12807,12808,12809,12810,12811,12812,12813,12814,12815,12816,12817,12818,12819, #12784
|
||||
12820,12821,12822,12823,12824,12825,12826,4198,3967,12827,12828,12829,12830,12831,12832,12833, #12800
|
||||
12834,12835,12836,12837,12838,12839,12840,12841,12842,12843,12844,12845,12846,12847,12848,12849, #12816
|
||||
12850,12851,12852,12853,12854,12855,12856,12857,12858,12859,12860,12861,4199,12862,12863,12864, #12832
|
||||
12865,12866,12867,12868,12869,12870,12871,12872,12873,12874,12875,12876,12877,12878,12879,12880, #12848
|
||||
12881,12882,12883,12884,12885,12886,12887,4501,12888,12889,12890,12891,12892,12893,12894,12895, #12864
|
||||
12896,12897,12898,12899,12900,12901,12902,12903,12904,12905,12906,12907,12908,12909,12910,12911, #12880
|
||||
12912,4994,12913,12914,12915,12916,12917,12918,12919,12920,12921,12922,12923,12924,12925,12926, #12896
|
||||
12927,12928,12929,12930,12931,12932,12933,12934,12935,12936,12937,12938,12939,12940,12941,12942, #12912
|
||||
12943,12944,12945,12946,12947,12948,12949,12950,12951,12952,12953,12954,12955,12956,1772,12957, #12928
|
||||
12958,12959,12960,12961,12962,12963,12964,12965,12966,12967,12968,12969,12970,12971,12972,12973, #12944
|
||||
12974,12975,12976,12977,12978,12979,12980,12981,12982,12983,12984,12985,12986,12987,12988,12989, #12960
|
||||
12990,12991,12992,12993,12994,12995,12996,12997,4502,12998,4503,12999,13000,13001,13002,13003, #12976
|
||||
4504,13004,13005,13006,13007,13008,13009,13010,13011,13012,13013,13014,13015,13016,13017,13018, #12992
|
||||
13019,13020,13021,13022,13023,13024,13025,13026,13027,13028,13029,3449,13030,13031,13032,13033, #13008
|
||||
13034,13035,13036,13037,13038,13039,13040,13041,13042,13043,13044,13045,13046,13047,13048,13049, #13024
|
||||
13050,13051,13052,13053,13054,13055,13056,13057,13058,13059,13060,13061,13062,13063,13064,13065, #13040
|
||||
13066,13067,13068,13069,13070,13071,13072,13073,13074,13075,13076,13077,13078,13079,13080,13081, #13056
|
||||
13082,13083,13084,13085,13086,13087,13088,13089,13090,13091,13092,13093,13094,13095,13096,13097, #13072
|
||||
13098,13099,13100,13101,13102,13103,13104,13105,13106,13107,13108,13109,13110,13111,13112,13113, #13088
|
||||
13114,13115,13116,13117,13118,3968,13119,4995,13120,13121,13122,13123,13124,13125,13126,13127, #13104
|
||||
4505,13128,13129,13130,13131,13132,13133,13134,4996,4506,13135,13136,13137,13138,13139,4997, #13120
|
||||
13140,13141,13142,13143,13144,13145,13146,13147,13148,13149,13150,13151,13152,13153,13154,13155, #13136
|
||||
13156,13157,13158,13159,4998,13160,13161,13162,13163,13164,13165,13166,13167,13168,13169,13170, #13152
|
||||
13171,13172,13173,13174,13175,13176,4999,13177,13178,13179,13180,13181,13182,13183,13184,13185, #13168
|
||||
13186,13187,13188,13189,13190,13191,13192,13193,13194,13195,13196,13197,13198,13199,13200,13201, #13184
|
||||
13202,13203,13204,13205,13206,5000,13207,13208,13209,13210,13211,13212,13213,13214,13215,13216, #13200
|
||||
13217,13218,13219,13220,13221,13222,13223,13224,13225,13226,13227,4200,5001,13228,13229,13230, #13216
|
||||
13231,13232,13233,13234,13235,13236,13237,13238,13239,13240,3969,13241,13242,13243,13244,3970, #13232
|
||||
13245,13246,13247,13248,13249,13250,13251,13252,13253,13254,13255,13256,13257,13258,13259,13260, #13248
|
||||
13261,13262,13263,13264,13265,13266,13267,13268,3450,13269,13270,13271,13272,13273,13274,13275, #13264
|
||||
13276,5002,13277,13278,13279,13280,13281,13282,13283,13284,13285,13286,13287,13288,13289,13290, #13280
|
||||
13291,13292,13293,13294,13295,13296,13297,13298,13299,13300,13301,13302,3813,13303,13304,13305, #13296
|
||||
13306,13307,13308,13309,13310,13311,13312,13313,13314,13315,13316,13317,13318,13319,13320,13321, #13312
|
||||
13322,13323,13324,13325,13326,13327,13328,4507,13329,13330,13331,13332,13333,13334,13335,13336, #13328
|
||||
13337,13338,13339,13340,13341,5003,13342,13343,13344,13345,13346,13347,13348,13349,13350,13351, #13344
|
||||
13352,13353,13354,13355,13356,13357,13358,13359,13360,13361,13362,13363,13364,13365,13366,13367, #13360
|
||||
5004,13368,13369,13370,13371,13372,13373,13374,13375,13376,13377,13378,13379,13380,13381,13382, #13376
|
||||
13383,13384,13385,13386,13387,13388,13389,13390,13391,13392,13393,13394,13395,13396,13397,13398, #13392
|
||||
13399,13400,13401,13402,13403,13404,13405,13406,13407,13408,13409,13410,13411,13412,13413,13414, #13408
|
||||
13415,13416,13417,13418,13419,13420,13421,13422,13423,13424,13425,13426,13427,13428,13429,13430, #13424
|
||||
13431,13432,4508,13433,13434,13435,4201,13436,13437,13438,13439,13440,13441,13442,13443,13444, #13440
|
||||
13445,13446,13447,13448,13449,13450,13451,13452,13453,13454,13455,13456,13457,5005,13458,13459, #13456
|
||||
13460,13461,13462,13463,13464,13465,13466,13467,13468,13469,13470,4509,13471,13472,13473,13474, #13472
|
||||
13475,13476,13477,13478,13479,13480,13481,13482,13483,13484,13485,13486,13487,13488,13489,13490, #13488
|
||||
13491,13492,13493,13494,13495,13496,13497,13498,13499,13500,13501,13502,13503,13504,13505,13506, #13504
|
||||
13507,13508,13509,13510,13511,13512,13513,13514,13515,13516,13517,13518,13519,13520,13521,13522, #13520
|
||||
13523,13524,13525,13526,13527,13528,13529,13530,13531,13532,13533,13534,13535,13536,13537,13538, #13536
|
||||
13539,13540,13541,13542,13543,13544,13545,13546,13547,13548,13549,13550,13551,13552,13553,13554, #13552
|
||||
13555,13556,13557,13558,13559,13560,13561,13562,13563,13564,13565,13566,13567,13568,13569,13570, #13568
|
||||
13571,13572,13573,13574,13575,13576,13577,13578,13579,13580,13581,13582,13583,13584,13585,13586, #13584
|
||||
13587,13588,13589,13590,13591,13592,13593,13594,13595,13596,13597,13598,13599,13600,13601,13602, #13600
|
||||
13603,13604,13605,13606,13607,13608,13609,13610,13611,13612,13613,13614,13615,13616,13617,13618, #13616
|
||||
13619,13620,13621,13622,13623,13624,13625,13626,13627,13628,13629,13630,13631,13632,13633,13634, #13632
|
||||
13635,13636,13637,13638,13639,13640,13641,13642,5006,13643,13644,13645,13646,13647,13648,13649, #13648
|
||||
13650,13651,5007,13652,13653,13654,13655,13656,13657,13658,13659,13660,13661,13662,13663,13664, #13664
|
||||
13665,13666,13667,13668,13669,13670,13671,13672,13673,13674,13675,13676,13677,13678,13679,13680, #13680
|
||||
13681,13682,13683,13684,13685,13686,13687,13688,13689,13690,13691,13692,13693,13694,13695,13696, #13696
|
||||
13697,13698,13699,13700,13701,13702,13703,13704,13705,13706,13707,13708,13709,13710,13711,13712, #13712
|
||||
13713,13714,13715,13716,13717,13718,13719,13720,13721,13722,13723,13724,13725,13726,13727,13728, #13728
|
||||
13729,13730,13731,13732,13733,13734,13735,13736,13737,13738,13739,13740,13741,13742,13743,13744, #13744
|
||||
13745,13746,13747,13748,13749,13750,13751,13752,13753,13754,13755,13756,13757,13758,13759,13760, #13760
|
||||
13761,13762,13763,13764,13765,13766,13767,13768,13769,13770,13771,13772,13773,13774,3273,13775, #13776
|
||||
13776,13777,13778,13779,13780,13781,13782,13783,13784,13785,13786,13787,13788,13789,13790,13791, #13792
|
||||
13792,13793,13794,13795,13796,13797,13798,13799,13800,13801,13802,13803,13804,13805,13806,13807, #13808
|
||||
13808,13809,13810,13811,13812,13813,13814,13815,13816,13817,13818,13819,13820,13821,13822,13823, #13824
|
||||
13824,13825,13826,13827,13828,13829,13830,13831,13832,13833,13834,13835,13836,13837,13838,13839, #13840
|
||||
13840,13841,13842,13843,13844,13845,13846,13847,13848,13849,13850,13851,13852,13853,13854,13855, #13856
|
||||
13856,13857,13858,13859,13860,13861,13862,13863,13864,13865,13866,13867,13868,13869,13870,13871, #13872
|
||||
13872,13873,13874,13875,13876,13877,13878,13879,13880,13881,13882,13883,13884,13885,13886,13887, #13888
|
||||
13888,13889,13890,13891,13892,13893,13894,13895,13896,13897,13898,13899,13900,13901,13902,13903, #13904
|
||||
13904,13905,13906,13907,13908,13909,13910,13911,13912,13913,13914,13915,13916,13917,13918,13919, #13920
|
||||
13920,13921,13922,13923,13924,13925,13926,13927,13928,13929,13930,13931,13932,13933,13934,13935, #13936
|
||||
13936,13937,13938,13939,13940,13941,13942,13943,13944,13945,13946,13947,13948,13949,13950,13951, #13952
|
||||
13952,13953,13954,13955,13956,13957,13958,13959,13960,13961,13962,13963,13964,13965,13966,13967, #13968
|
||||
13968,13969,13970,13971,13972) #13973
@ -1,11 +1,11 @
######################## BEGIN LICENSE BLOCK ########################
# The Original Code is Mozilla Communicator client code.
#
#
# The Initial Developer of the Original Code is
# Netscape Communications Corporation.
# Portions created by the Initial Developer are Copyright (C) 1998
# the Initial Developer. All Rights Reserved.
#
#
# Contributor(s):
# Mark Pilgrim - port to Python
#
@ -13,35 +13,29 @@
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
# 02110-1301 USA
######################### END LICENSE BLOCK #########################

from .mbcharsetprober import MultiByteCharSetProber
from .codingstatemachine import CodingStateMachine
from .chardistribution import Big5DistributionAnalysis
from .mbcssm import BIG5_SM_MODEL

from mbcharsetprober import MultiByteCharSetProber
from codingstatemachine import CodingStateMachine
from chardistribution import Big5DistributionAnalysis
from mbcssm import Big5SMModel

class Big5Prober(MultiByteCharSetProber):
    def __init__(self):
        super(Big5Prober, self).__init__()
        self.coding_sm = CodingStateMachine(BIG5_SM_MODEL)
        self.distribution_analyzer = Big5DistributionAnalysis()
        MultiByteCharSetProber.__init__(self)
        self._mCodingSM = CodingStateMachine(Big5SMModel)
        self._mDistributionAnalyzer = Big5DistributionAnalysis()
        self.reset()

    @property
    def charset_name(self):
    def get_charset_name(self):
        return "Big5"

    @property
    def language(self):
        return "Chinese"
200
fanficdownloader/chardet/chardistribution.py
Normal file
@ -0,0 +1,200 @
######################## BEGIN LICENSE BLOCK ########################
# The Original Code is Mozilla Communicator client code.
#
# The Initial Developer of the Original Code is
# Netscape Communications Corporation.
# Portions created by the Initial Developer are Copyright (C) 1998
# the Initial Developer. All Rights Reserved.
#
# Contributor(s):
# Mark Pilgrim - port to Python
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
# 02110-1301 USA
######################### END LICENSE BLOCK #########################
|
||||
import constants
|
||||
from euctwfreq import EUCTWCharToFreqOrder, EUCTW_TABLE_SIZE, EUCTW_TYPICAL_DISTRIBUTION_RATIO
|
||||
from euckrfreq import EUCKRCharToFreqOrder, EUCKR_TABLE_SIZE, EUCKR_TYPICAL_DISTRIBUTION_RATIO
|
||||
from gb2312freq import GB2312CharToFreqOrder, GB2312_TABLE_SIZE, GB2312_TYPICAL_DISTRIBUTION_RATIO
|
||||
from big5freq import Big5CharToFreqOrder, BIG5_TABLE_SIZE, BIG5_TYPICAL_DISTRIBUTION_RATIO
|
||||
from jisfreq import JISCharToFreqOrder, JIS_TABLE_SIZE, JIS_TYPICAL_DISTRIBUTION_RATIO
|
||||
|
||||
ENOUGH_DATA_THRESHOLD = 1024
|
||||
SURE_YES = 0.99
|
||||
SURE_NO = 0.01
|
||||
|
||||
class CharDistributionAnalysis:
|
||||
def __init__(self):
|
||||
self._mCharToFreqOrder = None # Mapping table to get frequency order from char order (get from GetOrder())
|
||||
self._mTableSize = None # Size of above table
|
||||
self._mTypicalDistributionRatio = None # This is a constant value which varies from language to language, used in calculating confidence. See http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html for further detail.
|
||||
self.reset()
|
||||
|
||||
def reset(self):
|
||||
"""reset analyser, clear any state"""
|
||||
self._mDone = constants.False # If this flag is set to constants.True, detection is done and conclusion has been made
|
||||
self._mTotalChars = 0 # Total characters encountered
|
||||
self._mFreqChars = 0 # The number of characters whose frequency order is less than 512
|
||||
|
||||
def feed(self, aStr, aCharLen):
|
||||
"""feed a character with known length"""
|
||||
if aCharLen == 2:
|
||||
# we only care about 2-byte characters in our distribution analysis
|
||||
order = self.get_order(aStr)
|
||||
else:
|
||||
order = -1
|
||||
if order >= 0:
|
||||
self._mTotalChars += 1
|
||||
# order is valid
|
||||
if order < self._mTableSize:
|
||||
if 512 > self._mCharToFreqOrder[order]:
|
||||
self._mFreqChars += 1
|
||||
|
||||
def get_confidence(self):
|
||||
"""return confidence based on existing data"""
|
||||
# if we didn't receive any character in our consideration range, return negative answer
|
||||
if self._mTotalChars <= 0:
|
||||
return SURE_NO
|
||||
|
||||
if self._mTotalChars != self._mFreqChars:
|
||||
r = self._mFreqChars / ((self._mTotalChars - self._mFreqChars) * self._mTypicalDistributionRatio)
|
||||
if r < SURE_YES:
|
||||
return r
|
||||
|
||||
# normalize confidence (we don't want to be 100% sure)
|
||||
return SURE_YES
|
||||
|
||||
def got_enough_data(self):
|
||||
# It is not necessary to receive all data to draw a conclusion. For charset detection,
|
||||
# a certain amount of data is enough
|
||||
return self._mTotalChars > ENOUGH_DATA_THRESHOLD
|
||||
|
||||
def get_order(self, aStr):
|
||||
# We do not handle characters based on the original encoding string, but
|
||||
# convert this encoding string to a number, here called order.
|
||||
# This allows multiple encodings of a language to share one frequency table.
|
||||
return -1
|
||||
|
||||
class EUCTWDistributionAnalysis(CharDistributionAnalysis):
|
||||
def __init__(self):
|
||||
CharDistributionAnalysis.__init__(self)
|
||||
self._mCharToFreqOrder = EUCTWCharToFreqOrder
|
||||
self._mTableSize = EUCTW_TABLE_SIZE
|
||||
self._mTypicalDistributionRatio = EUCTW_TYPICAL_DISTRIBUTION_RATIO
|
||||
|
||||
def get_order(self, aStr):
|
||||
# for euc-TW encoding, we are interested
|
||||
# first byte range: 0xc4 -- 0xfe
|
||||
# second byte range: 0xa1 -- 0xfe
|
||||
# no validation needed here. State machine has done that
|
||||
if aStr[0] >= '\xC4':
|
||||
return 94 * (ord(aStr[0]) - 0xC4) + ord(aStr[1]) - 0xA1
|
||||
else:
|
||||
return -1
|
||||
|
||||
class EUCKRDistributionAnalysis(CharDistributionAnalysis):
|
||||
def __init__(self):
|
||||
CharDistributionAnalysis.__init__(self)
|
||||
self._mCharToFreqOrder = EUCKRCharToFreqOrder
|
||||
self._mTableSize = EUCKR_TABLE_SIZE
|
||||
self._mTypicalDistributionRatio = EUCKR_TYPICAL_DISTRIBUTION_RATIO
|
||||
|
||||
def get_order(self, aStr):
|
||||
# for euc-KR encoding, we are interested
|
||||
# first byte range: 0xb0 -- 0xfe
|
||||
# second byte range: 0xa1 -- 0xfe
|
||||
# no validation needed here. State machine has done that
|
||||
if aStr[0] >= '\xB0':
|
||||
return 94 * (ord(aStr[0]) - 0xB0) + ord(aStr[1]) - 0xA1
|
||||
else:
|
||||
return -1
|
||||
|
||||
class GB2312DistributionAnalysis(CharDistributionAnalysis):
|
||||
def __init__(self):
|
||||
CharDistributionAnalysis.__init__(self)
|
||||
self._mCharToFreqOrder = GB2312CharToFreqOrder
|
||||
self._mTableSize = GB2312_TABLE_SIZE
|
||||
self._mTypicalDistributionRatio = GB2312_TYPICAL_DISTRIBUTION_RATIO
|
||||
|
||||
def get_order(self, aStr):
|
||||
# for GB2312 encoding, we are interested
|
||||
# first byte range: 0xb0 -- 0xfe
|
||||
# second byte range: 0xa1 -- 0xfe
|
||||
# no validation needed here. State machine has done that
|
||||
if (aStr[0] >= '\xB0') and (aStr[1] >= '\xA1'):
|
||||
return 94 * (ord(aStr[0]) - 0xB0) + ord(aStr[1]) - 0xA1
|
||||
else:
|
||||
return -1
|
||||
|
||||
class Big5DistributionAnalysis(CharDistributionAnalysis):
|
||||
def __init__(self):
|
||||
CharDistributionAnalysis.__init__(self)
|
||||
self._mCharToFreqOrder = Big5CharToFreqOrder
|
||||
self._mTableSize = BIG5_TABLE_SIZE
|
||||
self._mTypicalDistributionRatio = BIG5_TYPICAL_DISTRIBUTION_RATIO
|
||||
|
||||
def get_order(self, aStr):
|
||||
# for big5 encoding, we are interested
|
||||
# first byte range: 0xa4 -- 0xfe
|
||||
# second byte range: 0x40 -- 0x7e , 0xa1 -- 0xfe
|
||||
# no validation needed here. State machine has done that
|
||||
if aStr[0] >= '\xA4':
|
||||
if aStr[1] >= '\xA1':
|
||||
return 157 * (ord(aStr[0]) - 0xA4) + ord(aStr[1]) - 0xA1 + 63
|
||||
else:
|
||||
return 157 * (ord(aStr[0]) - 0xA4) + ord(aStr[1]) - 0x40
|
||||
else:
|
||||
return -1
|
||||
|
||||
class SJISDistributionAnalysis(CharDistributionAnalysis):
|
||||
def __init__(self):
|
||||
CharDistributionAnalysis.__init__(self)
|
||||
self._mCharToFreqOrder = JISCharToFreqOrder
|
||||
self._mTableSize = JIS_TABLE_SIZE
|
||||
self._mTypicalDistributionRatio = JIS_TYPICAL_DISTRIBUTION_RATIO
|
||||
|
||||
def get_order(self, aStr):
|
||||
# for sjis encoding, we are interested
|
||||
# first byte range: 0x81 -- 0x9f , 0xe0 -- 0xfe
|
||||
# second byte range: 0x40 -- 0x7e, 0x81 -- 0xfe
|
||||
# no validation needed here. State machine has done that
|
||||
if (aStr[0] >= '\x81') and (aStr[0] <= '\x9F'):
|
||||
order = 188 * (ord(aStr[0]) - 0x81)
|
||||
elif (aStr[0] >= '\xE0') and (aStr[0] <= '\xEF'):
|
||||
order = 188 * (ord(aStr[0]) - 0xE0 + 31)
|
||||
else:
|
||||
return -1
|
||||
order = order + ord(aStr[1]) - 0x40
|
||||
if aStr[1] > '\x7F':
|
||||
order = -1
|
||||
return order
|
||||
|
||||
class EUCJPDistributionAnalysis(CharDistributionAnalysis):
|
||||
def __init__(self):
|
||||
CharDistributionAnalysis.__init__(self)
|
||||
self._mCharToFreqOrder = JISCharToFreqOrder
|
||||
self._mTableSize = JIS_TABLE_SIZE
|
||||
self._mTypicalDistributionRatio = JIS_TYPICAL_DISTRIBUTION_RATIO
|
||||
|
||||
def get_order(self, aStr):
|
||||
# for euc-JP encoding, we are interested
|
||||
# first byte range: 0xa0 -- 0xfe
|
||||
# second byte range: 0xa1 -- 0xfe
|
||||
# no validation needed here. State machine has done that
|
||||
if aStr[0] >= '\xA0':
|
||||
return 94 * (ord(aStr[0]) - 0xA1) + ord(aStr[1]) - 0xa1
|
||||
else:
|
||||
return -1
|
||||
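The confidence rule and the byte-pair-to-order mapping in chardistribution.py can be sketched in isolation. The following is a minimal Python 3 illustration, not the file's own code: the function names `distribution_confidence` and `euckr_order` are hypothetical, and the ratio passed in the usage note is an arbitrary example value rather than one of the real `*_TYPICAL_DISTRIBUTION_RATIO` constants.

```python
SURE_YES = 0.99
SURE_NO = 0.01


def distribution_confidence(total_chars, freq_chars, typical_ratio):
    """Mirror of get_confidence(): ratio of frequent to non-frequent characters,
    scaled by a per-language constant and capped at SURE_YES."""
    if total_chars <= 0:
        # nothing in the range we care about was seen
        return SURE_NO
    if total_chars != freq_chars:
        r = freq_chars / ((total_chars - freq_chars) * typical_ratio)
        if r < SURE_YES:
            return r
    # never report 100% certainty
    return SURE_YES


def euckr_order(pair):
    """Mirror of EUCKRDistributionAnalysis.get_order() for a 2-byte bytes object.
    Indexing bytes in Python 3 yields ints, so no ord() is needed."""
    first, second = pair[0], pair[1]
    if first >= 0xB0:
        return 94 * (first - 0xB0) + second - 0xA1
    return -1
```

For example, `distribution_confidence(100, 50, 6.0)` divides 50 frequent characters by 50 non-frequent ones times the assumed ratio 6.0, and `euckr_order(b"\xb0\xa1")` maps the first character of the range to order 0.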
96
fanficdownloader/chardet/charsetgroupprober.py
Normal file
|
|
@ -0,0 +1,96 @@
|
|||
######################## BEGIN LICENSE BLOCK ########################
|
||||
# The Original Code is Mozilla Communicator client code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 1998
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Mark Pilgrim - port to Python
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
import constants, sys
|
||||
from charsetprober import CharSetProber
|
||||
|
||||
class CharSetGroupProber(CharSetProber):
|
||||
def __init__(self):
|
||||
CharSetProber.__init__(self)
|
||||
self._mActiveNum = 0
|
||||
self._mProbers = []
|
||||
self._mBestGuessProber = None
|
||||
|
||||
def reset(self):
|
||||
CharSetProber.reset(self)
|
||||
self._mActiveNum = 0
|
||||
for prober in self._mProbers:
|
||||
if prober:
|
||||
prober.reset()
|
||||
prober.active = constants.True
|
||||
self._mActiveNum += 1
|
||||
self._mBestGuessProber = None
|
||||
|
||||
def get_charset_name(self):
|
||||
if not self._mBestGuessProber:
|
||||
self.get_confidence()
|
||||
if not self._mBestGuessProber: return None
|
||||
# self._mBestGuessProber = self._mProbers[0]
|
||||
return self._mBestGuessProber.get_charset_name()
|
||||
|
||||
def feed(self, aBuf):
|
||||
for prober in self._mProbers:
|
||||
if not prober: continue
|
||||
if not prober.active: continue
|
||||
st = prober.feed(aBuf)
|
||||
if not st: continue
|
||||
if st == constants.eFoundIt:
|
||||
self._mBestGuessProber = prober
|
||||
return self.get_state()
|
||||
elif st == constants.eNotMe:
|
||||
prober.active = constants.False
|
||||
self._mActiveNum -= 1
|
||||
if self._mActiveNum <= 0:
|
||||
self._mState = constants.eNotMe
|
||||
return self.get_state()
|
||||
return self.get_state()
|
||||
|
||||
def get_confidence(self):
|
||||
st = self.get_state()
|
||||
if st == constants.eFoundIt:
|
||||
return 0.99
|
||||
elif st == constants.eNotMe:
|
||||
return 0.01
|
||||
bestConf = 0.0
|
||||
self._mBestGuessProber = None
|
||||
for prober in self._mProbers:
|
||||
if not prober: continue
|
||||
if not prober.active:
|
||||
if constants._debug:
|
||||
sys.stderr.write(prober.get_charset_name() + ' not active\n')
|
||||
continue
|
||||
cf = prober.get_confidence()
|
||||
if constants._debug:
|
||||
sys.stderr.write('%s confidence = %s\n' % (prober.get_charset_name(), cf))
|
||||
if bestConf < cf:
|
||||
bestConf = cf
|
||||
self._mBestGuessProber = prober
|
||||
if not self._mBestGuessProber: return 0.0
|
||||
return bestConf
|
||||
# else:
|
||||
# self._mBestGuessProber = self._mProbers[0]
|
||||
# return self._mBestGuessProber.get_confidence()
|
||||
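The best-guess selection in `CharSetGroupProber.get_confidence()` reduces to a strict "greater than" scan over the active probers. A minimal Python 3 sketch of that loop follows; `best_guess` is a hypothetical helper, and the (name, confidence) pairs stand in for real prober objects.

```python
def best_guess(confidences):
    """Return (name, confidence) of the highest-scoring entry.

    Uses the same strict comparison as the group prober, so the first
    entry wins a tie and a confidence of 0.0 is never selected.
    """
    best_name, best_conf = None, 0.0
    for name, conf in confidences:
        if best_conf < conf:
            best_name, best_conf = name, conf
    return best_name, best_conf
```

With an empty list, or with every confidence at 0.0, the helper returns `(None, 0.0)`, which corresponds to the `return 0.0` branch taken when no best-guess prober was found.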
|
|
@ -1,49 +1,60 @@
|
|||
######################## BEGIN LICENSE BLOCK ########################
|
||||
# The Original Code is mozilla.org code.
|
||||
#
|
||||
# The Original Code is Mozilla Universal charset detector code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 1998
|
||||
# Portions created by the Initial Developer are Copyright (C) 2001
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
#
|
||||
# Contributor(s):
|
||||
# Mark Pilgrim - port to Python
|
||||
# Shy Shalom - original C code
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
from .chardistribution import EUCKRDistributionAnalysis
|
||||
from .codingstatemachine import CodingStateMachine
|
||||
from .mbcharsetprober import MultiByteCharSetProber
|
||||
from .mbcssm import CP949_SM_MODEL
|
||||
import constants, re
|
||||
|
||||
|
||||
class CP949Prober(MultiByteCharSetProber):
|
||||
class CharSetProber:
|
||||
def __init__(self):
|
||||
super(CP949Prober, self).__init__()
|
||||
self.coding_sm = CodingStateMachine(CP949_SM_MODEL)
|
||||
# NOTE: CP949 is a superset of EUC-KR, so the distribution should not
|
||||
# differ.
|
||||
self.distribution_analyzer = EUCKRDistributionAnalysis()
|
||||
self.reset()
|
||||
pass
|
||||
|
||||
def reset(self):
|
||||
self._mState = constants.eDetecting
|
||||
|
||||
def get_charset_name(self):
|
||||
return None
|
||||
|
||||
@property
|
||||
def charset_name(self):
|
||||
return "CP949"
|
||||
def feed(self, aBuf):
|
||||
pass
|
||||
|
||||
@property
|
||||
def language(self):
|
||||
return "Korean"
|
||||
def get_state(self):
|
||||
return self._mState
|
||||
|
||||
def get_confidence(self):
|
||||
return 0.0
|
||||
|
||||
def filter_high_bit_only(self, aBuf):
|
||||
aBuf = re.sub(r'([\x00-\x7F])+', ' ', aBuf)
|
||||
return aBuf
|
||||
|
||||
def filter_without_english_letters(self, aBuf):
|
||||
aBuf = re.sub(r'([A-Za-z])+', ' ', aBuf)
|
||||
return aBuf
|
||||
|
||||
def filter_with_english_letters(self, aBuf):
|
||||
# TODO
|
||||
return aBuf
|
||||
56
fanficdownloader/chardet/codingstatemachine.py
Normal file
|
|
@ -0,0 +1,56 @@
|
|||
######################## BEGIN LICENSE BLOCK ########################
|
||||
# The Original Code is mozilla.org code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 1998
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Mark Pilgrim - port to Python
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
from constants import eStart, eError, eItsMe
|
||||
|
||||
class CodingStateMachine:
|
||||
def __init__(self, sm):
|
||||
self._mModel = sm
|
||||
self._mCurrentBytePos = 0
|
||||
self._mCurrentCharLen = 0
|
||||
self.reset()
|
||||
|
||||
def reset(self):
|
||||
self._mCurrentState = eStart
|
||||
|
||||
def next_state(self, c):
|
||||
# for each byte we get its class
|
||||
# if it is first byte, we also get byte length
|
||||
byteCls = self._mModel['classTable'][ord(c)]
|
||||
if self._mCurrentState == eStart:
|
||||
self._mCurrentBytePos = 0
|
||||
self._mCurrentCharLen = self._mModel['charLenTable'][byteCls]
|
||||
# from byte's class and stateTable, we get its next state
|
||||
self._mCurrentState = self._mModel['stateTable'][self._mCurrentState * self._mModel['classFactor'] + byteCls]
|
||||
self._mCurrentBytePos += 1
|
||||
return self._mCurrentState
|
||||
|
||||
def get_current_charlen(self):
|
||||
return self._mCurrentCharLen
|
||||
|
||||
def get_coding_state_machine(self):
|
||||
return self._mModel['name']
|
||||
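The table lookup in `next_state()` is easier to follow with a tiny model. The sketch below is a Python 3 illustration under stated assumptions: `TOY_MODEL` is an invented two-class model (ASCII versus high-bit bytes), not one of the real models from mbcssm.py or escsm.py, and `run_state_machine` is a hypothetical driver that stops on the first error, which in the real code is the calling prober's job.

```python
E_START, E_ERROR, E_ITS_ME = 0, 1, 2

# Hypothetical model: class 0 = ASCII byte, class 1 = byte with the high bit set.
TOY_MODEL = {
    "classTable": tuple(0 if b < 0x80 else 1 for b in range(256)),
    "classFactor": 2,  # number of byte classes, i.e. the row width of stateTable
    "stateTable": (
        E_START, E_ERROR,    # transitions out of E_START
        E_ERROR, E_ERROR,    # transitions out of E_ERROR
        E_ITS_ME, E_ITS_ME,  # transitions out of E_ITS_ME (unreachable here)
    ),
    "charLenTable": (1, 0),
    "name": "toy-ascii",
}


def run_state_machine(model, data):
    """Feed bytes through the model; return the final state."""
    state = E_START
    for byte in data:  # iterating bytes yields ints in Python 3
        byte_cls = model["classTable"][byte]
        # same indexing as next_state(): row = current state, column = byte class
        state = model["stateTable"][state * model["classFactor"] + byte_cls]
        if state == E_ERROR:
            break
    return state
```

The flat `stateTable` tuple is a row-major matrix, which is why the lookup multiplies the current state by `classFactor` before adding the byte class.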
47
fanficdownloader/chardet/constants.py
Normal file
|
|
@ -0,0 +1,47 @@
|
|||
######################## BEGIN LICENSE BLOCK ########################
|
||||
# The Original Code is Mozilla Universal charset detector code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 2001
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Mark Pilgrim - port to Python
|
||||
# Shy Shalom - original C code
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
_debug = 0
|
||||
|
||||
eDetecting = 0
|
||||
eFoundIt = 1
|
||||
eNotMe = 2
|
||||
|
||||
eStart = 0
|
||||
eError = 1
|
||||
eItsMe = 2
|
||||
|
||||
SHORTCUT_THRESHOLD = 0.95
|
||||
|
||||
import __builtin__
|
||||
if not hasattr(__builtin__, 'False'):
|
||||
False = 0
|
||||
True = 1
|
||||
else:
|
||||
False = __builtin__.False
|
||||
True = __builtin__.True
|
||||
79
fanficdownloader/chardet/escprober.py
Normal file
|
|
@ -0,0 +1,79 @@
|
|||
######################## BEGIN LICENSE BLOCK ########################
|
||||
# The Original Code is mozilla.org code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 1998
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Mark Pilgrim - port to Python
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
import constants, sys
|
||||
from escsm import HZSMModel, ISO2022CNSMModel, ISO2022JPSMModel, ISO2022KRSMModel
|
||||
from charsetprober import CharSetProber
|
||||
from codingstatemachine import CodingStateMachine
|
||||
|
||||
class EscCharSetProber(CharSetProber):
|
||||
def __init__(self):
|
||||
CharSetProber.__init__(self)
|
||||
self._mCodingSM = [ \
|
||||
CodingStateMachine(HZSMModel),
|
||||
CodingStateMachine(ISO2022CNSMModel),
|
||||
CodingStateMachine(ISO2022JPSMModel),
|
||||
CodingStateMachine(ISO2022KRSMModel)
|
||||
]
|
||||
self.reset()
|
||||
|
||||
def reset(self):
|
||||
CharSetProber.reset(self)
|
||||
for codingSM in self._mCodingSM:
|
||||
if not codingSM: continue
|
||||
codingSM.active = constants.True
|
||||
codingSM.reset()
|
||||
self._mActiveSM = len(self._mCodingSM)
|
||||
self._mDetectedCharset = None
|
||||
|
||||
def get_charset_name(self):
|
||||
return self._mDetectedCharset
|
||||
|
||||
def get_confidence(self):
|
||||
if self._mDetectedCharset:
|
||||
return 0.99
|
||||
else:
|
||||
return 0.00
|
||||
|
||||
def feed(self, aBuf):
|
||||
for c in aBuf:
|
||||
for codingSM in self._mCodingSM:
|
||||
if not codingSM: continue
|
||||
if not codingSM.active: continue
|
||||
codingState = codingSM.next_state(c)
|
||||
if codingState == constants.eError:
|
||||
codingSM.active = constants.False
|
||||
self._mActiveSM -= 1
|
||||
if self._mActiveSM <= 0:
|
||||
self._mState = constants.eNotMe
|
||||
return self.get_state()
|
||||
elif codingState == constants.eItsMe:
|
||||
self._mState = constants.eFoundIt
|
||||
self._mDetectedCharset = codingSM.get_coding_state_machine()
|
||||
return self.get_state()
|
||||
|
||||
return self.get_state()
|
||||
240
fanficdownloader/chardet/escsm.py
Normal file
|
|
@ -0,0 +1,240 @@
|
|||
######################## BEGIN LICENSE BLOCK ########################
|
||||
# The Original Code is mozilla.org code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 1998
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Mark Pilgrim - port to Python
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
from constants import eStart, eError, eItsMe
|
||||
|
||||
HZ_cls = ( \
|
||||
1,0,0,0,0,0,0,0, # 00 - 07
|
||||
0,0,0,0,0,0,0,0, # 08 - 0f
|
||||
0,0,0,0,0,0,0,0, # 10 - 17
|
||||
0,0,0,1,0,0,0,0, # 18 - 1f
|
||||
0,0,0,0,0,0,0,0, # 20 - 27
|
||||
0,0,0,0,0,0,0,0, # 28 - 2f
|
||||
0,0,0,0,0,0,0,0, # 30 - 37
|
||||
0,0,0,0,0,0,0,0, # 38 - 3f
|
||||
0,0,0,0,0,0,0,0, # 40 - 47
|
||||
0,0,0,0,0,0,0,0, # 48 - 4f
|
||||
0,0,0,0,0,0,0,0, # 50 - 57
|
||||
0,0,0,0,0,0,0,0, # 58 - 5f
|
||||
0,0,0,0,0,0,0,0, # 60 - 67
|
||||
0,0,0,0,0,0,0,0, # 68 - 6f
|
||||
0,0,0,0,0,0,0,0, # 70 - 77
|
||||
0,0,0,4,0,5,2,0, # 78 - 7f
|
||||
1,1,1,1,1,1,1,1, # 80 - 87
|
||||
1,1,1,1,1,1,1,1, # 88 - 8f
|
||||
1,1,1,1,1,1,1,1, # 90 - 97
|
||||
1,1,1,1,1,1,1,1, # 98 - 9f
|
||||
1,1,1,1,1,1,1,1, # a0 - a7
|
||||
1,1,1,1,1,1,1,1, # a8 - af
|
||||
1,1,1,1,1,1,1,1, # b0 - b7
|
||||
1,1,1,1,1,1,1,1, # b8 - bf
|
||||
1,1,1,1,1,1,1,1, # c0 - c7
|
||||
1,1,1,1,1,1,1,1, # c8 - cf
|
||||
1,1,1,1,1,1,1,1, # d0 - d7
|
||||
1,1,1,1,1,1,1,1, # d8 - df
|
||||
1,1,1,1,1,1,1,1, # e0 - e7
|
||||
1,1,1,1,1,1,1,1, # e8 - ef
|
||||
1,1,1,1,1,1,1,1, # f0 - f7
|
||||
1,1,1,1,1,1,1,1, # f8 - ff
|
||||
)
|
||||
|
||||
HZ_st = ( \
|
||||
eStart,eError, 3,eStart,eStart,eStart,eError,eError,# 00-07
|
||||
eError,eError,eError,eError,eItsMe,eItsMe,eItsMe,eItsMe,# 08-0f
|
||||
eItsMe,eItsMe,eError,eError,eStart,eStart, 4,eError,# 10-17
|
||||
5,eError, 6,eError, 5, 5, 4,eError,# 18-1f
|
||||
4,eError, 4, 4, 4,eError, 4,eError,# 20-27
|
||||
4,eItsMe,eStart,eStart,eStart,eStart,eStart,eStart,# 28-2f
|
||||
)
|
||||
|
||||
HZCharLenTable = (0, 0, 0, 0, 0, 0)
|
||||
|
||||
HZSMModel = {'classTable': HZ_cls,
|
||||
'classFactor': 6,
|
||||
'stateTable': HZ_st,
|
||||
'charLenTable': HZCharLenTable,
|
||||
'name': "HZ-GB-2312"}
|
||||
|
||||
ISO2022CN_cls = ( \
|
||||
2,0,0,0,0,0,0,0, # 00 - 07
|
||||
0,0,0,0,0,0,0,0, # 08 - 0f
|
||||
0,0,0,0,0,0,0,0, # 10 - 17
|
||||
0,0,0,1,0,0,0,0, # 18 - 1f
|
||||
0,0,0,0,0,0,0,0, # 20 - 27
|
||||
0,3,0,0,0,0,0,0, # 28 - 2f
|
||||
0,0,0,0,0,0,0,0, # 30 - 37
|
||||
0,0,0,0,0,0,0,0, # 38 - 3f
|
||||
0,0,0,4,0,0,0,0, # 40 - 47
|
||||
0,0,0,0,0,0,0,0, # 48 - 4f
|
||||
0,0,0,0,0,0,0,0, # 50 - 57
|
||||
0,0,0,0,0,0,0,0, # 58 - 5f
|
||||
0,0,0,0,0,0,0,0, # 60 - 67
|
||||
0,0,0,0,0,0,0,0, # 68 - 6f
|
||||
0,0,0,0,0,0,0,0, # 70 - 77
|
||||
0,0,0,0,0,0,0,0, # 78 - 7f
|
||||
2,2,2,2,2,2,2,2, # 80 - 87
|
||||
2,2,2,2,2,2,2,2, # 88 - 8f
|
||||
2,2,2,2,2,2,2,2, # 90 - 97
|
||||
2,2,2,2,2,2,2,2, # 98 - 9f
|
||||
2,2,2,2,2,2,2,2, # a0 - a7
|
||||
2,2,2,2,2,2,2,2, # a8 - af
|
||||
2,2,2,2,2,2,2,2, # b0 - b7
|
||||
2,2,2,2,2,2,2,2, # b8 - bf
|
||||
2,2,2,2,2,2,2,2, # c0 - c7
|
||||
2,2,2,2,2,2,2,2, # c8 - cf
|
||||
2,2,2,2,2,2,2,2, # d0 - d7
|
||||
2,2,2,2,2,2,2,2, # d8 - df
|
||||
2,2,2,2,2,2,2,2, # e0 - e7
|
||||
2,2,2,2,2,2,2,2, # e8 - ef
|
||||
2,2,2,2,2,2,2,2, # f0 - f7
|
||||
2,2,2,2,2,2,2,2, # f8 - ff
|
||||
)
|
||||
|
||||
ISO2022CN_st = ( \
|
||||
eStart, 3,eError,eStart,eStart,eStart,eStart,eStart,# 00-07
|
||||
eStart,eError,eError,eError,eError,eError,eError,eError,# 08-0f
|
||||
eError,eError,eItsMe,eItsMe,eItsMe,eItsMe,eItsMe,eItsMe,# 10-17
|
||||
eItsMe,eItsMe,eItsMe,eError,eError,eError, 4,eError,# 18-1f
|
||||
eError,eError,eError,eItsMe,eError,eError,eError,eError,# 20-27
|
||||
5, 6,eError,eError,eError,eError,eError,eError,# 28-2f
|
||||
eError,eError,eError,eItsMe,eError,eError,eError,eError,# 30-37
|
||||
eError,eError,eError,eError,eError,eItsMe,eError,eStart,# 38-3f
|
||||
)
|
||||
|
||||
ISO2022CNCharLenTable = (0, 0, 0, 0, 0, 0, 0, 0, 0)
|
||||
|
||||
ISO2022CNSMModel = {'classTable': ISO2022CN_cls,
|
||||
'classFactor': 9,
|
||||
'stateTable': ISO2022CN_st,
|
||||
'charLenTable': ISO2022CNCharLenTable,
|
||||
'name': "ISO-2022-CN"}
|
||||
|
||||
ISO2022JP_cls = ( \
|
||||
2,0,0,0,0,0,0,0, # 00 - 07
|
||||
0,0,0,0,0,0,2,2, # 08 - 0f
|
||||
0,0,0,0,0,0,0,0, # 10 - 17
|
||||
0,0,0,1,0,0,0,0, # 18 - 1f
|
||||
0,0,0,0,7,0,0,0, # 20 - 27
|
||||
3,0,0,0,0,0,0,0, # 28 - 2f
|
||||
0,0,0,0,0,0,0,0, # 30 - 37
|
||||
0,0,0,0,0,0,0,0, # 38 - 3f
|
||||
6,0,4,0,8,0,0,0, # 40 - 47
|
||||
0,9,5,0,0,0,0,0, # 48 - 4f
|
||||
0,0,0,0,0,0,0,0, # 50 - 57
|
||||
0,0,0,0,0,0,0,0, # 58 - 5f
|
||||
0,0,0,0,0,0,0,0, # 60 - 67
|
||||
0,0,0,0,0,0,0,0, # 68 - 6f
|
||||
0,0,0,0,0,0,0,0, # 70 - 77
|
||||
0,0,0,0,0,0,0,0, # 78 - 7f
|
||||
2,2,2,2,2,2,2,2, # 80 - 87
|
||||
2,2,2,2,2,2,2,2, # 88 - 8f
|
||||
2,2,2,2,2,2,2,2, # 90 - 97
|
||||
2,2,2,2,2,2,2,2, # 98 - 9f
|
||||
2,2,2,2,2,2,2,2, # a0 - a7
|
||||
2,2,2,2,2,2,2,2, # a8 - af
|
||||
2,2,2,2,2,2,2,2, # b0 - b7
|
||||
2,2,2,2,2,2,2,2, # b8 - bf
|
||||
2,2,2,2,2,2,2,2, # c0 - c7
|
||||
2,2,2,2,2,2,2,2, # c8 - cf
|
||||
2,2,2,2,2,2,2,2, # d0 - d7
|
||||
2,2,2,2,2,2,2,2, # d8 - df
|
||||
2,2,2,2,2,2,2,2, # e0 - e7
|
||||
2,2,2,2,2,2,2,2, # e8 - ef
|
||||
2,2,2,2,2,2,2,2, # f0 - f7
|
||||
2,2,2,2,2,2,2,2, # f8 - ff
|
||||
)
|
||||
|
||||
ISO2022JP_st = ( \
|
||||
eStart, 3,eError,eStart,eStart,eStart,eStart,eStart,# 00-07
|
||||
eStart,eStart,eError,eError,eError,eError,eError,eError,# 08-0f
|
||||
eError,eError,eError,eError,eItsMe,eItsMe,eItsMe,eItsMe,# 10-17
|
||||
eItsMe,eItsMe,eItsMe,eItsMe,eItsMe,eItsMe,eError,eError,# 18-1f
|
||||
eError, 5,eError,eError,eError, 4,eError,eError,# 20-27
|
||||
eError,eError,eError, 6,eItsMe,eError,eItsMe,eError,# 28-2f
|
||||
eError,eError,eError,eError,eError,eError,eItsMe,eItsMe,# 30-37
|
||||
eError,eError,eError,eItsMe,eError,eError,eError,eError,# 38-3f
|
||||
eError,eError,eError,eError,eItsMe,eError,eStart,eStart,# 40-47
|
||||
)
|
||||
|
||||
ISO2022JPCharLenTable = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
|
||||
|
||||
ISO2022JPSMModel = {'classTable': ISO2022JP_cls,
|
||||
'classFactor': 10,
|
||||
'stateTable': ISO2022JP_st,
|
||||
'charLenTable': ISO2022JPCharLenTable,
|
||||
'name': "ISO-2022-JP"}
|
||||
|
||||
ISO2022KR_cls = ( \
|
||||
2,0,0,0,0,0,0,0, # 00 - 07
|
||||
0,0,0,0,0,0,0,0, # 08 - 0f
|
||||
0,0,0,0,0,0,0,0, # 10 - 17
|
||||
0,0,0,1,0,0,0,0, # 18 - 1f
|
||||
0,0,0,0,3,0,0,0, # 20 - 27
|
||||
0,4,0,0,0,0,0,0, # 28 - 2f
|
||||
0,0,0,0,0,0,0,0, # 30 - 37
|
||||
0,0,0,0,0,0,0,0, # 38 - 3f
|
||||
0,0,0,5,0,0,0,0, # 40 - 47
|
||||
0,0,0,0,0,0,0,0, # 48 - 4f
|
||||
0,0,0,0,0,0,0,0, # 50 - 57
|
||||
0,0,0,0,0,0,0,0, # 58 - 5f
|
||||
0,0,0,0,0,0,0,0, # 60 - 67
|
||||
0,0,0,0,0,0,0,0, # 68 - 6f
|
||||
0,0,0,0,0,0,0,0, # 70 - 77
|
||||
0,0,0,0,0,0,0,0, # 78 - 7f
|
||||
2,2,2,2,2,2,2,2, # 80 - 87
|
||||
2,2,2,2,2,2,2,2, # 88 - 8f
|
||||
2,2,2,2,2,2,2,2, # 90 - 97
|
||||
2,2,2,2,2,2,2,2, # 98 - 9f
|
||||
2,2,2,2,2,2,2,2, # a0 - a7
|
||||
2,2,2,2,2,2,2,2, # a8 - af
|
||||
2,2,2,2,2,2,2,2, # b0 - b7
|
||||
2,2,2,2,2,2,2,2, # b8 - bf
|
||||
2,2,2,2,2,2,2,2, # c0 - c7
|
||||
2,2,2,2,2,2,2,2, # c8 - cf
|
||||
2,2,2,2,2,2,2,2, # d0 - d7
|
||||
2,2,2,2,2,2,2,2, # d8 - df
|
||||
2,2,2,2,2,2,2,2, # e0 - e7
|
||||
2,2,2,2,2,2,2,2, # e8 - ef
|
||||
2,2,2,2,2,2,2,2, # f0 - f7
|
||||
2,2,2,2,2,2,2,2, # f8 - ff
|
||||
)
|
||||
|
||||
ISO2022KR_st = ( \
|
||||
eStart, 3,eError,eStart,eStart,eStart,eError,eError,# 00-07
|
||||
eError,eError,eError,eError,eItsMe,eItsMe,eItsMe,eItsMe,# 08-0f
|
||||
eItsMe,eItsMe,eError,eError,eError, 4,eError,eError,# 10-17
|
||||
eError,eError,eError,eError, 5,eError,eError,eError,# 18-1f
|
||||
eError,eError,eError,eItsMe,eStart,eStart,eStart,eStart,# 20-27
|
||||
)
|
||||
|
||||
ISO2022KRCharLenTable = (0, 0, 0, 0, 0, 0)
|
||||
|
||||
ISO2022KRSMModel = {'classTable': ISO2022KR_cls,
|
||||
'classFactor': 6,
|
||||
'stateTable': ISO2022KR_st,
|
||||
'charLenTable': ISO2022KRCharLenTable,
|
||||
'name': "ISO-2022-KR"}
|
||||
85
fanficdownloader/chardet/eucjpprober.py
Normal file
|
|
@ -0,0 +1,85 @@
######################## BEGIN LICENSE BLOCK ########################
# The Original Code is mozilla.org code.
#
# The Initial Developer of the Original Code is
# Netscape Communications Corporation.
# Portions created by the Initial Developer are Copyright (C) 1998
# the Initial Developer. All Rights Reserved.
#
# Contributor(s):
#   Mark Pilgrim - port to Python
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
# 02110-1301  USA
######################### END LICENSE BLOCK #########################

import constants, sys
from constants import eStart, eError, eItsMe
from mbcharsetprober import MultiByteCharSetProber
from codingstatemachine import CodingStateMachine
from chardistribution import EUCJPDistributionAnalysis
from jpcntx import EUCJPContextAnalysis
from mbcssm import EUCJPSMModel

class EUCJPProber(MultiByteCharSetProber):
    def __init__(self):
        MultiByteCharSetProber.__init__(self)
        self._mCodingSM = CodingStateMachine(EUCJPSMModel)
        self._mDistributionAnalyzer = EUCJPDistributionAnalysis()
        self._mContextAnalyzer = EUCJPContextAnalysis()
        self.reset()

    def reset(self):
        MultiByteCharSetProber.reset(self)
        self._mContextAnalyzer.reset()

    def get_charset_name(self):
        return "EUC-JP"

    def feed(self, aBuf):
        aLen = len(aBuf)
        for i in range(0, aLen):
            codingState = self._mCodingSM.next_state(aBuf[i])
            if codingState == eError:
                if constants._debug:
                    sys.stderr.write(self.get_charset_name() + ' prober hit error at byte ' + str(i) + '\n')
                self._mState = constants.eNotMe
                break
            elif codingState == eItsMe:
                self._mState = constants.eFoundIt
                break
            elif codingState == eStart:
                charLen = self._mCodingSM.get_current_charlen()
                if i == 0:
                    self._mLastChar[1] = aBuf[0]
                    self._mContextAnalyzer.feed(self._mLastChar, charLen)
                    self._mDistributionAnalyzer.feed(self._mLastChar, charLen)
                else:
                    self._mContextAnalyzer.feed(aBuf[i-1:i+1], charLen)
                    self._mDistributionAnalyzer.feed(aBuf[i-1:i+1], charLen)

        self._mLastChar[0] = aBuf[aLen - 1]

        if self.get_state() == constants.eDetecting:
            if self._mContextAnalyzer.got_enough_data() and \
                   (self.get_confidence() > constants.SHORTCUT_THRESHOLD):
                self._mState = constants.eFoundIt

        return self.get_state()

    def get_confidence(self):
        contxtCf = self._mContextAnalyzer.get_confidence()
        distribCf = self._mDistributionAnalyzer.get_confidence()
        return max(contxtCf, distribCf)
594
fanficdownloader/chardet/euckrfreq.py
Normal file
@ -0,0 +1,594 @@
######################## BEGIN LICENSE BLOCK ########################
# The Original Code is Mozilla Communicator client code.
#
# The Initial Developer of the Original Code is
# Netscape Communications Corporation.
# Portions created by the Initial Developer are Copyright (C) 1998
# the Initial Developer. All Rights Reserved.
#
# Contributor(s):
#   Mark Pilgrim - port to Python
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
# 02110-1301  USA
######################### END LICENSE BLOCK #########################

# Sampled from about 20M of text materials, including literature and computer technology

# 128  --> 0.79
# 256  --> 0.92
# 512  --> 0.986
# 1024 --> 0.99944
# 2048 --> 0.99999
#
# Ideal Distribution Ratio = 0.98653 / (1-0.98653) = 73.24
# Random Distribution Ratio = 512 / (2350-512) = 0.279
#
# Typical Distribution Ratio

EUCKR_TYPICAL_DISTRIBUTION_RATIO = 6.0

EUCKR_TABLE_SIZE = 2352

# Char to FreqOrder table
EUCKRCharToFreqOrder = ( \
|
||||
13, 130, 120,1396, 481,1719,1720, 328, 609, 212,1721, 707, 400, 299,1722, 87,
|
||||
1397,1723, 104, 536,1117,1203,1724,1267, 685,1268, 508,1725,1726,1727,1728,1398,
|
||||
1399,1729,1730,1731, 141, 621, 326,1057, 368,1732, 267, 488, 20,1733,1269,1734,
|
||||
945,1400,1735, 47, 904,1270,1736,1737, 773, 248,1738, 409, 313, 786, 429,1739,
|
||||
116, 987, 813,1401, 683, 75,1204, 145,1740,1741,1742,1743, 16, 847, 667, 622,
|
||||
708,1744,1745,1746, 966, 787, 304, 129,1747, 60, 820, 123, 676,1748,1749,1750,
|
||||
1751, 617,1752, 626,1753,1754,1755,1756, 653,1757,1758,1759,1760,1761,1762, 856,
|
||||
344,1763,1764,1765,1766, 89, 401, 418, 806, 905, 848,1767,1768,1769, 946,1205,
|
||||
709,1770,1118,1771, 241,1772,1773,1774,1271,1775, 569,1776, 999,1777,1778,1779,
|
||||
1780, 337, 751,1058, 28, 628, 254,1781, 177, 906, 270, 349, 891,1079,1782, 19,
|
||||
1783, 379,1784, 315,1785, 629, 754,1402, 559,1786, 636, 203,1206,1787, 710, 567,
|
||||
1788, 935, 814,1789,1790,1207, 766, 528,1791,1792,1208,1793,1794,1795,1796,1797,
|
||||
1403,1798,1799, 533,1059,1404,1405,1156,1406, 936, 884,1080,1800, 351,1801,1802,
|
||||
1803,1804,1805, 801,1806,1807,1808,1119,1809,1157, 714, 474,1407,1810, 298, 899,
|
||||
885,1811,1120, 802,1158,1812, 892,1813,1814,1408, 659,1815,1816,1121,1817,1818,
|
||||
1819,1820,1821,1822, 319,1823, 594, 545,1824, 815, 937,1209,1825,1826, 573,1409,
|
||||
1022,1827,1210,1828,1829,1830,1831,1832,1833, 556, 722, 807,1122,1060,1834, 697,
|
||||
1835, 900, 557, 715,1836,1410, 540,1411, 752,1159, 294, 597,1211, 976, 803, 770,
|
||||
1412,1837,1838, 39, 794,1413, 358,1839, 371, 925,1840, 453, 661, 788, 531, 723,
|
||||
544,1023,1081, 869, 91,1841, 392, 430, 790, 602,1414, 677,1082, 457,1415,1416,
|
||||
1842,1843, 475, 327,1024,1417, 795, 121,1844, 733, 403,1418,1845,1846,1847, 300,
|
||||
119, 711,1212, 627,1848,1272, 207,1849,1850, 796,1213, 382,1851, 519,1852,1083,
|
||||
893,1853,1854,1855, 367, 809, 487, 671,1856, 663,1857,1858, 956, 471, 306, 857,
|
||||
1859,1860,1160,1084,1861,1862,1863,1864,1865,1061,1866,1867,1868,1869,1870,1871,
|
||||
282, 96, 574,1872, 502,1085,1873,1214,1874, 907,1875,1876, 827, 977,1419,1420,
|
||||
1421, 268,1877,1422,1878,1879,1880, 308,1881, 2, 537,1882,1883,1215,1884,1885,
|
||||
127, 791,1886,1273,1423,1887, 34, 336, 404, 643,1888, 571, 654, 894, 840,1889,
|
||||
0, 886,1274, 122, 575, 260, 908, 938,1890,1275, 410, 316,1891,1892, 100,1893,
|
||||
1894,1123, 48,1161,1124,1025,1895, 633, 901,1276,1896,1897, 115, 816,1898, 317,
|
||||
1899, 694,1900, 909, 734,1424, 572, 866,1425, 691, 85, 524,1010, 543, 394, 841,
|
||||
1901,1902,1903,1026,1904,1905,1906,1907,1908,1909, 30, 451, 651, 988, 310,1910,
|
||||
1911,1426, 810,1216, 93,1912,1913,1277,1217,1914, 858, 759, 45, 58, 181, 610,
|
||||
269,1915,1916, 131,1062, 551, 443,1000, 821,1427, 957, 895,1086,1917,1918, 375,
|
||||
1919, 359,1920, 687,1921, 822,1922, 293,1923,1924, 40, 662, 118, 692, 29, 939,
|
||||
887, 640, 482, 174,1925, 69,1162, 728,1428, 910,1926,1278,1218,1279, 386, 870,
|
||||
217, 854,1163, 823,1927,1928,1929,1930, 834,1931, 78,1932, 859,1933,1063,1934,
|
||||
1935,1936,1937, 438,1164, 208, 595,1938,1939,1940,1941,1219,1125,1942, 280, 888,
|
||||
1429,1430,1220,1431,1943,1944,1945,1946,1947,1280, 150, 510,1432,1948,1949,1950,
|
||||
1951,1952,1953,1954,1011,1087,1955,1433,1043,1956, 881,1957, 614, 958,1064,1065,
|
||||
1221,1958, 638,1001, 860, 967, 896,1434, 989, 492, 553,1281,1165,1959,1282,1002,
|
||||
1283,1222,1960,1961,1962,1963, 36, 383, 228, 753, 247, 454,1964, 876, 678,1965,
|
||||
1966,1284, 126, 464, 490, 835, 136, 672, 529, 940,1088,1435, 473,1967,1968, 467,
|
||||
50, 390, 227, 587, 279, 378, 598, 792, 968, 240, 151, 160, 849, 882,1126,1285,
|
||||
639,1044, 133, 140, 288, 360, 811, 563,1027, 561, 142, 523,1969,1970,1971, 7,
|
||||
103, 296, 439, 407, 506, 634, 990,1972,1973,1974,1975, 645,1976,1977,1978,1979,
|
||||
1980,1981, 236,1982,1436,1983,1984,1089, 192, 828, 618, 518,1166, 333,1127,1985,
|
||||
818,1223,1986,1987,1988,1989,1990,1991,1992,1993, 342,1128,1286, 746, 842,1994,
|
||||
1995, 560, 223,1287, 98, 8, 189, 650, 978,1288,1996,1437,1997, 17, 345, 250,
|
||||
423, 277, 234, 512, 226, 97, 289, 42, 167,1998, 201,1999,2000, 843, 836, 824,
|
||||
532, 338, 783,1090, 182, 576, 436,1438,1439, 527, 500,2001, 947, 889,2002,2003,
|
||||
2004,2005, 262, 600, 314, 447,2006, 547,2007, 693, 738,1129,2008, 71,1440, 745,
|
||||
619, 688,2009, 829,2010,2011, 147,2012, 33, 948,2013,2014, 74, 224,2015, 61,
|
||||
191, 918, 399, 637,2016,1028,1130, 257, 902,2017,2018,2019,2020,2021,2022,2023,
|
||||
2024,2025,2026, 837,2027,2028,2029,2030, 179, 874, 591, 52, 724, 246,2031,2032,
|
||||
2033,2034,1167, 969,2035,1289, 630, 605, 911,1091,1168,2036,2037,2038,1441, 912,
|
||||
2039, 623,2040,2041, 253,1169,1290,2042,1442, 146, 620, 611, 577, 433,2043,1224,
|
||||
719,1170, 959, 440, 437, 534, 84, 388, 480,1131, 159, 220, 198, 679,2044,1012,
|
||||
819,1066,1443, 113,1225, 194, 318,1003,1029,2045,2046,2047,2048,1067,2049,2050,
|
||||
2051,2052,2053, 59, 913, 112,2054, 632,2055, 455, 144, 739,1291,2056, 273, 681,
|
||||
499,2057, 448,2058,2059, 760,2060,2061, 970, 384, 169, 245,1132,2062,2063, 414,
|
||||
1444,2064,2065, 41, 235,2066, 157, 252, 877, 568, 919, 789, 580,2067, 725,2068,
|
||||
2069,1292,2070,2071,1445,2072,1446,2073,2074, 55, 588, 66,1447, 271,1092,2075,
|
||||
1226,2076, 960,1013, 372,2077,2078,2079,2080,2081,1293,2082,2083,2084,2085, 850,
|
||||
2086,2087,2088,2089,2090, 186,2091,1068, 180,2092,2093,2094, 109,1227, 522, 606,
|
||||
2095, 867,1448,1093, 991,1171, 926, 353,1133,2096, 581,2097,2098,2099,1294,1449,
|
||||
1450,2100, 596,1172,1014,1228,2101,1451,1295,1173,1229,2102,2103,1296,1134,1452,
|
||||
949,1135,2104,2105,1094,1453,1454,1455,2106,1095,2107,2108,2109,2110,2111,2112,
|
||||
2113,2114,2115,2116,2117, 804,2118,2119,1230,1231, 805,1456, 405,1136,2120,2121,
|
||||
2122,2123,2124, 720, 701,1297, 992,1457, 927,1004,2125,2126,2127,2128,2129,2130,
|
||||
22, 417,2131, 303,2132, 385,2133, 971, 520, 513,2134,1174, 73,1096, 231, 274,
|
||||
962,1458, 673,2135,1459,2136, 152,1137,2137,2138,2139,2140,1005,1138,1460,1139,
|
||||
2141,2142,2143,2144, 11, 374, 844,2145, 154,1232, 46,1461,2146, 838, 830, 721,
|
||||
1233, 106,2147, 90, 428, 462, 578, 566,1175, 352,2148,2149, 538,1234, 124,1298,
|
||||
2150,1462, 761, 565,2151, 686,2152, 649,2153, 72, 173,2154, 460, 415,2155,1463,
|
||||
2156,1235, 305,2157,2158,2159,2160,2161,2162, 579,2163,2164,2165,2166,2167, 747,
|
||||
2168,2169,2170,2171,1464, 669,2172,2173,2174,2175,2176,1465,2177, 23, 530, 285,
|
||||
2178, 335, 729,2179, 397,2180,2181,2182,1030,2183,2184, 698,2185,2186, 325,2187,
|
||||
2188, 369,2189, 799,1097,1015, 348,2190,1069, 680,2191, 851,1466,2192,2193, 10,
|
||||
2194, 613, 424,2195, 979, 108, 449, 589, 27, 172, 81,1031, 80, 774, 281, 350,
|
||||
1032, 525, 301, 582,1176,2196, 674,1045,2197,2198,1467, 730, 762,2199,2200,2201,
|
||||
2202,1468,2203, 993,2204,2205, 266,1070, 963,1140,2206,2207,2208, 664,1098, 972,
|
||||
2209,2210,2211,1177,1469,1470, 871,2212,2213,2214,2215,2216,1471,2217,2218,2219,
|
||||
2220,2221,2222,2223,2224,2225,2226,2227,1472,1236,2228,2229,2230,2231,2232,2233,
|
||||
2234,2235,1299,2236,2237, 200,2238, 477, 373,2239,2240, 731, 825, 777,2241,2242,
|
||||
2243, 521, 486, 548,2244,2245,2246,1473,1300, 53, 549, 137, 875, 76, 158,2247,
|
||||
1301,1474, 469, 396,1016, 278, 712,2248, 321, 442, 503, 767, 744, 941,1237,1178,
|
||||
1475,2249, 82, 178,1141,1179, 973,2250,1302,2251, 297,2252,2253, 570,2254,2255,
|
||||
2256, 18, 450, 206,2257, 290, 292,1142,2258, 511, 162, 99, 346, 164, 735,2259,
|
||||
1476,1477, 4, 554, 343, 798,1099,2260,1100,2261, 43, 171,1303, 139, 215,2262,
|
||||
2263, 717, 775,2264,1033, 322, 216,2265, 831,2266, 149,2267,1304,2268,2269, 702,
|
||||
1238, 135, 845, 347, 309,2270, 484,2271, 878, 655, 238,1006,1478,2272, 67,2273,
|
||||
295,2274,2275, 461,2276, 478, 942, 412,2277,1034,2278,2279,2280, 265,2281, 541,
|
||||
2282,2283,2284,2285,2286, 70, 852,1071,2287,2288,2289,2290, 21, 56, 509, 117,
|
||||
432,2291,2292, 331, 980, 552,1101, 148, 284, 105, 393,1180,1239, 755,2293, 187,
|
||||
2294,1046,1479,2295, 340,2296, 63,1047, 230,2297,2298,1305, 763,1306, 101, 800,
|
||||
808, 494,2299,2300,2301, 903,2302, 37,1072, 14, 5,2303, 79, 675,2304, 312,
|
||||
2305,2306,2307,2308,2309,1480, 6,1307,2310,2311,2312, 1, 470, 35, 24, 229,
|
||||
2313, 695, 210, 86, 778, 15, 784, 592, 779, 32, 77, 855, 964,2314, 259,2315,
|
||||
501, 380,2316,2317, 83, 981, 153, 689,1308,1481,1482,1483,2318,2319, 716,1484,
|
||||
2320,2321,2322,2323,2324,2325,1485,2326,2327, 128, 57, 68, 261,1048, 211, 170,
|
||||
1240, 31,2328, 51, 435, 742,2329,2330,2331, 635,2332, 264, 456,2333,2334,2335,
|
||||
425,2336,1486, 143, 507, 263, 943,2337, 363, 920,1487, 256,1488,1102, 243, 601,
|
||||
1489,2338,2339,2340,2341,2342,2343,2344, 861,2345,2346,2347,2348,2349,2350, 395,
|
||||
2351,1490,1491, 62, 535, 166, 225,2352,2353, 668, 419,1241, 138, 604, 928,2354,
|
||||
1181,2355,1492,1493,2356,2357,2358,1143,2359, 696,2360, 387, 307,1309, 682, 476,
|
||||
2361,2362, 332, 12, 222, 156,2363, 232,2364, 641, 276, 656, 517,1494,1495,1035,
|
||||
416, 736,1496,2365,1017, 586,2366,2367,2368,1497,2369, 242,2370,2371,2372,1498,
|
||||
2373, 965, 713,2374,2375,2376,2377, 740, 982,1499, 944,1500,1007,2378,2379,1310,
|
||||
1501,2380,2381,2382, 785, 329,2383,2384,1502,2385,2386,2387, 932,2388,1503,2389,
|
||||
2390,2391,2392,1242,2393,2394,2395,2396,2397, 994, 950,2398,2399,2400,2401,1504,
|
||||
1311,2402,2403,2404,2405,1049, 749,2406,2407, 853, 718,1144,1312,2408,1182,1505,
|
||||
2409,2410, 255, 516, 479, 564, 550, 214,1506,1507,1313, 413, 239, 444, 339,1145,
|
||||
1036,1508,1509,1314,1037,1510,1315,2411,1511,2412,2413,2414, 176, 703, 497, 624,
|
||||
593, 921, 302,2415, 341, 165,1103,1512,2416,1513,2417,2418,2419, 376,2420, 700,
|
||||
2421,2422,2423, 258, 768,1316,2424,1183,2425, 995, 608,2426,2427,2428,2429, 221,
|
||||
2430,2431,2432,2433,2434,2435,2436,2437, 195, 323, 726, 188, 897, 983,1317, 377,
|
||||
644,1050, 879,2438, 452,2439,2440,2441,2442,2443,2444, 914,2445,2446,2447,2448,
|
||||
915, 489,2449,1514,1184,2450,2451, 515, 64, 427, 495,2452, 583,2453, 483, 485,
|
||||
1038, 562, 213,1515, 748, 666,2454,2455,2456,2457, 334,2458, 780, 996,1008, 705,
|
||||
1243,2459,2460,2461,2462,2463, 114,2464, 493,1146, 366, 163,1516, 961,1104,2465,
|
||||
291,2466,1318,1105,2467,1517, 365,2468, 355, 951,1244,2469,1319,2470, 631,2471,
|
||||
2472, 218,1320, 364, 320, 756,1518,1519,1321,1520,1322,2473,2474,2475,2476, 997,
|
||||
2477,2478,2479,2480, 665,1185,2481, 916,1521,2482,2483,2484, 584, 684,2485,2486,
|
||||
797,2487,1051,1186,2488,2489,2490,1522,2491,2492, 370,2493,1039,1187, 65,2494,
|
||||
434, 205, 463,1188,2495, 125, 812, 391, 402, 826, 699, 286, 398, 155, 781, 771,
|
||||
585,2496, 590, 505,1073,2497, 599, 244, 219, 917,1018, 952, 646,1523,2498,1323,
|
||||
2499,2500, 49, 984, 354, 741,2501, 625,2502,1324,2503,1019, 190, 357, 757, 491,
|
||||
95, 782, 868,2504,2505,2506,2507,2508,2509, 134,1524,1074, 422,1525, 898,2510,
|
||||
161,2511,2512,2513,2514, 769,2515,1526,2516,2517, 411,1325,2518, 472,1527,2519,
|
||||
2520,2521,2522,2523,2524, 985,2525,2526,2527,2528,2529,2530, 764,2531,1245,2532,
|
||||
2533, 25, 204, 311,2534, 496,2535,1052,2536,2537,2538,2539,2540,2541,2542, 199,
|
||||
704, 504, 468, 758, 657,1528, 196, 44, 839,1246, 272, 750,2543, 765, 862,2544,
|
||||
2545,1326,2546, 132, 615, 933,2547, 732,2548,2549,2550,1189,1529,2551, 283,1247,
|
||||
1053, 607, 929,2552,2553,2554, 930, 183, 872, 616,1040,1147,2555,1148,1020, 441,
|
||||
249,1075,2556,2557,2558, 466, 743,2559,2560,2561, 92, 514, 426, 420, 526,2562,
|
||||
2563,2564,2565,2566,2567,2568, 185,2569,2570,2571,2572, 776,1530, 658,2573, 362,
|
||||
2574, 361, 922,1076, 793,2575,2576,2577,2578,2579,2580,1531, 251,2581,2582,2583,
|
||||
2584,1532, 54, 612, 237,1327,2585,2586, 275, 408, 647, 111,2587,1533,1106, 465,
|
||||
3, 458, 9, 38,2588, 107, 110, 890, 209, 26, 737, 498,2589,1534,2590, 431,
|
||||
202, 88,1535, 356, 287,1107, 660,1149,2591, 381,1536, 986,1150, 445,1248,1151,
|
||||
974,2592,2593, 846,2594, 446, 953, 184,1249,1250, 727,2595, 923, 193, 883,2596,
|
||||
2597,2598, 102, 324, 539, 817,2599, 421,1041,2600, 832,2601, 94, 175, 197, 406,
|
||||
2602, 459,2603,2604,2605,2606,2607, 330, 555,2608,2609,2610, 706,1108, 389,2611,
|
||||
2612,2613,2614, 233,2615, 833, 558, 931, 954,1251,2616,2617,1537, 546,2618,2619,
|
||||
1009,2620,2621,2622,1538, 690,1328,2623, 955,2624,1539,2625,2626, 772,2627,2628,
|
||||
2629,2630,2631, 924, 648, 863, 603,2632,2633, 934,1540, 864, 865,2634, 642,1042,
|
||||
670,1190,2635,2636,2637,2638, 168,2639, 652, 873, 542,1054,1541,2640,2641,2642, # 512, 256
|
||||
# Everything below is of no interest for detection purposes
|
||||
2643,2644,2645,2646,2647,2648,2649,2650,2651,2652,2653,2654,2655,2656,2657,2658,
|
||||
2659,2660,2661,2662,2663,2664,2665,2666,2667,2668,2669,2670,2671,2672,2673,2674,
|
||||
2675,2676,2677,2678,2679,2680,2681,2682,2683,2684,2685,2686,2687,2688,2689,2690,
|
||||
2691,2692,2693,2694,2695,2696,2697,2698,2699,1542, 880,2700,2701,2702,2703,2704,
|
||||
2705,2706,2707,2708,2709,2710,2711,2712,2713,2714,2715,2716,2717,2718,2719,2720,
|
||||
2721,2722,2723,2724,2725,1543,2726,2727,2728,2729,2730,2731,2732,1544,2733,2734,
|
||||
2735,2736,2737,2738,2739,2740,2741,2742,2743,2744,2745,2746,2747,2748,2749,2750,
|
||||
2751,2752,2753,2754,1545,2755,2756,2757,2758,2759,2760,2761,2762,2763,2764,2765,
|
||||
2766,1546,2767,1547,2768,2769,2770,2771,2772,2773,2774,2775,2776,2777,2778,2779,
|
||||
2780,2781,2782,2783,2784,2785,2786,1548,2787,2788,2789,1109,2790,2791,2792,2793,
|
||||
2794,2795,2796,2797,2798,2799,2800,2801,2802,2803,2804,2805,2806,2807,2808,2809,
|
||||
2810,2811,2812,1329,2813,2814,2815,2816,2817,2818,2819,2820,2821,2822,2823,2824,
|
||||
2825,2826,2827,2828,2829,2830,2831,2832,2833,2834,2835,2836,2837,2838,2839,2840,
|
||||
2841,2842,2843,2844,2845,2846,2847,2848,2849,2850,2851,2852,2853,2854,2855,2856,
|
||||
1549,2857,2858,2859,2860,1550,2861,2862,1551,2863,2864,2865,2866,2867,2868,2869,
|
||||
2870,2871,2872,2873,2874,1110,1330,2875,2876,2877,2878,2879,2880,2881,2882,2883,
|
||||
2884,2885,2886,2887,2888,2889,2890,2891,2892,2893,2894,2895,2896,2897,2898,2899,
|
||||
2900,2901,2902,2903,2904,2905,2906,2907,2908,2909,2910,2911,2912,2913,2914,2915,
|
||||
2916,2917,2918,2919,2920,2921,2922,2923,2924,2925,2926,2927,2928,2929,2930,1331,
|
||||
2931,2932,2933,2934,2935,2936,2937,2938,2939,2940,2941,2942,2943,1552,2944,2945,
|
||||
2946,2947,2948,2949,2950,2951,2952,2953,2954,2955,2956,2957,2958,2959,2960,2961,
|
||||
2962,2963,2964,1252,2965,2966,2967,2968,2969,2970,2971,2972,2973,2974,2975,2976,
|
||||
2977,2978,2979,2980,2981,2982,2983,2984,2985,2986,2987,2988,2989,2990,2991,2992,
|
||||
2993,2994,2995,2996,2997,2998,2999,3000,3001,3002,3003,3004,3005,3006,3007,3008,
|
||||
3009,3010,3011,3012,1553,3013,3014,3015,3016,3017,1554,3018,1332,3019,3020,3021,
|
||||
3022,3023,3024,3025,3026,3027,3028,3029,3030,3031,3032,3033,3034,3035,3036,3037,
|
||||
3038,3039,3040,3041,3042,3043,3044,3045,3046,3047,3048,3049,3050,1555,3051,3052,
|
||||
3053,1556,1557,3054,3055,3056,3057,3058,3059,3060,3061,3062,3063,3064,3065,3066,
|
||||
3067,1558,3068,3069,3070,3071,3072,3073,3074,3075,3076,1559,3077,3078,3079,3080,
|
||||
3081,3082,3083,1253,3084,3085,3086,3087,3088,3089,3090,3091,3092,3093,3094,3095,
|
||||
3096,3097,3098,3099,3100,3101,3102,3103,3104,3105,3106,3107,3108,1152,3109,3110,
|
||||
3111,3112,3113,1560,3114,3115,3116,3117,1111,3118,3119,3120,3121,3122,3123,3124,
|
||||
3125,3126,3127,3128,3129,3130,3131,3132,3133,3134,3135,3136,3137,3138,3139,3140,
|
||||
3141,3142,3143,3144,3145,3146,3147,3148,3149,3150,3151,3152,3153,3154,3155,3156,
|
||||
3157,3158,3159,3160,3161,3162,3163,3164,3165,3166,3167,3168,3169,3170,3171,3172,
|
||||
3173,3174,3175,3176,1333,3177,3178,3179,3180,3181,3182,3183,3184,3185,3186,3187,
|
||||
3188,3189,1561,3190,3191,1334,3192,3193,3194,3195,3196,3197,3198,3199,3200,3201,
|
||||
3202,3203,3204,3205,3206,3207,3208,3209,3210,3211,3212,3213,3214,3215,3216,3217,
|
||||
3218,3219,3220,3221,3222,3223,3224,3225,3226,3227,3228,3229,3230,3231,3232,3233,
|
||||
3234,1562,3235,3236,3237,3238,3239,3240,3241,3242,3243,3244,3245,3246,3247,3248,
|
||||
3249,3250,3251,3252,3253,3254,3255,3256,3257,3258,3259,3260,3261,3262,3263,3264,
|
||||
3265,3266,3267,3268,3269,3270,3271,3272,3273,3274,3275,3276,3277,1563,3278,3279,
|
||||
3280,3281,3282,3283,3284,3285,3286,3287,3288,3289,3290,3291,3292,3293,3294,3295,
|
||||
3296,3297,3298,3299,3300,3301,3302,3303,3304,3305,3306,3307,3308,3309,3310,3311,
|
||||
3312,3313,3314,3315,3316,3317,3318,3319,3320,3321,3322,3323,3324,3325,3326,3327,
|
||||
3328,3329,3330,3331,3332,3333,3334,3335,3336,3337,3338,3339,3340,3341,3342,3343,
|
||||
3344,3345,3346,3347,3348,3349,3350,3351,3352,3353,3354,3355,3356,3357,3358,3359,
|
||||
3360,3361,3362,3363,3364,1335,3365,3366,3367,3368,3369,3370,3371,3372,3373,3374,
|
||||
3375,3376,3377,3378,3379,3380,3381,3382,3383,3384,3385,3386,3387,1336,3388,3389,
|
||||
3390,3391,3392,3393,3394,3395,3396,3397,3398,3399,3400,3401,3402,3403,3404,3405,
|
||||
3406,3407,3408,3409,3410,3411,3412,3413,3414,1337,3415,3416,3417,3418,3419,1338,
|
||||
3420,3421,3422,1564,1565,3423,3424,3425,3426,3427,3428,3429,3430,3431,1254,3432,
|
||||
3433,3434,1339,3435,3436,3437,3438,3439,1566,3440,3441,3442,3443,3444,3445,3446,
|
||||
3447,3448,3449,3450,3451,3452,3453,3454,1255,3455,3456,3457,3458,3459,1567,1191,
|
||||
3460,1568,1569,3461,3462,3463,1570,3464,3465,3466,3467,3468,1571,3469,3470,3471,
|
||||
3472,3473,1572,3474,3475,3476,3477,3478,3479,3480,3481,3482,3483,3484,3485,3486,
|
||||
1340,3487,3488,3489,3490,3491,3492,1021,3493,3494,3495,3496,3497,3498,1573,3499,
|
||||
1341,3500,3501,3502,3503,3504,3505,3506,3507,3508,3509,3510,3511,1342,3512,3513,
|
||||
3514,3515,3516,1574,1343,3517,3518,3519,1575,3520,1576,3521,3522,3523,3524,3525,
|
||||
3526,3527,3528,3529,3530,3531,3532,3533,3534,3535,3536,3537,3538,3539,3540,3541,
|
||||
3542,3543,3544,3545,3546,3547,3548,3549,3550,3551,3552,3553,3554,3555,3556,3557,
|
||||
3558,3559,3560,3561,3562,3563,3564,3565,3566,3567,3568,3569,3570,3571,3572,3573,
|
||||
3574,3575,3576,3577,3578,3579,3580,1577,3581,3582,1578,3583,3584,3585,3586,3587,
|
||||
3588,3589,3590,3591,3592,3593,3594,3595,3596,3597,3598,3599,3600,3601,3602,3603,
|
||||
3604,1579,3605,3606,3607,3608,3609,3610,3611,3612,3613,3614,3615,3616,3617,3618,
|
||||
3619,3620,3621,3622,3623,3624,3625,3626,3627,3628,3629,1580,3630,3631,1581,3632,
|
||||
3633,3634,3635,3636,3637,3638,3639,3640,3641,3642,3643,3644,3645,3646,3647,3648,
|
||||
3649,3650,3651,3652,3653,3654,3655,3656,1582,3657,3658,3659,3660,3661,3662,3663,
|
||||
3664,3665,3666,3667,3668,3669,3670,3671,3672,3673,3674,3675,3676,3677,3678,3679,
|
||||
3680,3681,3682,3683,3684,3685,3686,3687,3688,3689,3690,3691,3692,3693,3694,3695,
|
||||
3696,3697,3698,3699,3700,1192,3701,3702,3703,3704,1256,3705,3706,3707,3708,1583,
|
||||
1257,3709,3710,3711,3712,3713,3714,3715,3716,1584,3717,3718,3719,3720,3721,3722,
|
||||
3723,3724,3725,3726,3727,3728,3729,3730,3731,3732,3733,3734,3735,3736,3737,3738,
|
||||
3739,3740,3741,3742,3743,3744,3745,1344,3746,3747,3748,3749,3750,3751,3752,3753,
|
||||
3754,3755,3756,1585,3757,3758,3759,3760,3761,3762,3763,3764,3765,3766,1586,3767,
|
||||
3768,3769,3770,3771,3772,3773,3774,3775,3776,3777,3778,1345,3779,3780,3781,3782,
|
||||
3783,3784,3785,3786,3787,3788,3789,3790,3791,3792,3793,3794,3795,1346,1587,3796,
|
||||
3797,1588,3798,3799,3800,3801,3802,3803,3804,3805,3806,1347,3807,3808,3809,3810,
|
||||
3811,1589,3812,3813,3814,3815,3816,3817,3818,3819,3820,3821,1590,3822,3823,1591,
|
||||
1348,3824,3825,3826,3827,3828,3829,3830,1592,3831,3832,1593,3833,3834,3835,3836,
|
||||
3837,3838,3839,3840,3841,3842,3843,3844,1349,3845,3846,3847,3848,3849,3850,3851,
|
||||
3852,3853,3854,3855,3856,3857,3858,1594,3859,3860,3861,3862,3863,3864,3865,3866,
|
||||
3867,3868,3869,1595,3870,3871,3872,3873,1596,3874,3875,3876,3877,3878,3879,3880,
|
||||
3881,3882,3883,3884,3885,3886,1597,3887,3888,3889,3890,3891,3892,3893,3894,3895,
|
||||
1598,3896,3897,3898,1599,1600,3899,1350,3900,1351,3901,3902,1352,3903,3904,3905,
|
||||
3906,3907,3908,3909,3910,3911,3912,3913,3914,3915,3916,3917,3918,3919,3920,3921,
|
||||
3922,3923,3924,1258,3925,3926,3927,3928,3929,3930,3931,1193,3932,1601,3933,3934,
|
||||
3935,3936,3937,3938,3939,3940,3941,3942,3943,1602,3944,3945,3946,3947,3948,1603,
|
||||
3949,3950,3951,3952,3953,3954,3955,3956,3957,3958,3959,3960,3961,3962,3963,3964,
|
||||
3965,1604,3966,3967,3968,3969,3970,3971,3972,3973,3974,3975,3976,3977,1353,3978,
|
||||
3979,3980,3981,3982,3983,3984,3985,3986,3987,3988,3989,3990,3991,1354,3992,3993,
|
||||
3994,3995,3996,3997,3998,3999,4000,4001,4002,4003,4004,4005,4006,4007,4008,4009,
|
||||
4010,4011,4012,4013,4014,4015,4016,4017,4018,4019,4020,4021,4022,4023,1355,4024,
|
||||
4025,4026,4027,4028,4029,4030,4031,4032,4033,4034,4035,4036,4037,4038,4039,4040,
|
||||
1605,4041,4042,4043,4044,4045,4046,4047,4048,4049,4050,4051,4052,4053,4054,4055,
|
||||
4056,4057,4058,4059,4060,1606,4061,4062,4063,4064,1607,4065,4066,4067,4068,4069,
|
||||
4070,4071,4072,4073,4074,4075,4076,1194,4077,4078,1608,4079,4080,4081,4082,4083,
|
||||
4084,4085,4086,4087,1609,4088,4089,4090,4091,4092,4093,4094,4095,4096,4097,4098,
|
||||
4099,4100,4101,4102,4103,4104,4105,4106,4107,4108,1259,4109,4110,4111,4112,4113,
|
||||
4114,4115,4116,4117,4118,4119,4120,4121,4122,4123,4124,1195,4125,4126,4127,1610,
|
||||
4128,4129,4130,4131,4132,4133,4134,4135,4136,4137,1356,4138,4139,4140,4141,4142,
|
||||
4143,4144,1611,4145,4146,4147,4148,4149,4150,4151,4152,4153,4154,4155,4156,4157,
|
||||
4158,4159,4160,4161,4162,4163,4164,4165,4166,4167,4168,4169,4170,4171,4172,4173,
|
||||
4174,4175,4176,4177,4178,4179,4180,4181,4182,4183,4184,4185,4186,4187,4188,4189,
|
||||
4190,4191,4192,4193,4194,4195,4196,4197,4198,4199,4200,4201,4202,4203,4204,4205,
|
||||
4206,4207,4208,4209,4210,4211,4212,4213,4214,4215,4216,4217,4218,4219,1612,4220,
|
||||
4221,4222,4223,4224,4225,4226,4227,1357,4228,1613,4229,4230,4231,4232,4233,4234,
|
||||
4235,4236,4237,4238,4239,4240,4241,4242,4243,1614,4244,4245,4246,4247,4248,4249,
|
||||
4250,4251,4252,4253,4254,4255,4256,4257,4258,4259,4260,4261,4262,4263,4264,4265,
|
||||
4266,4267,4268,4269,4270,1196,1358,4271,4272,4273,4274,4275,4276,4277,4278,4279,
|
||||
4280,4281,4282,4283,4284,4285,4286,4287,1615,4288,4289,4290,4291,4292,4293,4294,
|
||||
4295,4296,4297,4298,4299,4300,4301,4302,4303,4304,4305,4306,4307,4308,4309,4310,
|
||||
4311,4312,4313,4314,4315,4316,4317,4318,4319,4320,4321,4322,4323,4324,4325,4326,
|
||||
4327,4328,4329,4330,4331,4332,4333,4334,1616,4335,4336,4337,4338,4339,4340,4341,
|
||||
4342,4343,4344,4345,4346,4347,4348,4349,4350,4351,4352,4353,4354,4355,4356,4357,
|
||||
4358,4359,4360,1617,4361,4362,4363,4364,4365,1618,4366,4367,4368,4369,4370,4371,
|
||||
4372,4373,4374,4375,4376,4377,4378,4379,4380,4381,4382,4383,4384,4385,4386,4387,
|
||||
4388,4389,4390,4391,4392,4393,4394,4395,4396,4397,4398,4399,4400,4401,4402,4403,
|
||||
4404,4405,4406,4407,4408,4409,4410,4411,4412,4413,4414,4415,4416,1619,4417,4418,
|
||||
4419,4420,4421,4422,4423,4424,4425,1112,4426,4427,4428,4429,4430,1620,4431,4432,
|
||||
4433,4434,4435,4436,4437,4438,4439,4440,4441,4442,1260,1261,4443,4444,4445,4446,
|
||||
4447,4448,4449,4450,4451,4452,4453,4454,4455,1359,4456,4457,4458,4459,4460,4461,
|
||||
4462,4463,4464,4465,1621,4466,4467,4468,4469,4470,4471,4472,4473,4474,4475,4476,
|
||||
4477,4478,4479,4480,4481,4482,4483,4484,4485,4486,4487,4488,4489,1055,4490,4491,
|
||||
4492,4493,4494,4495,4496,4497,4498,4499,4500,4501,4502,4503,4504,4505,4506,4507,
|
||||
4508,4509,4510,4511,4512,4513,4514,4515,4516,4517,4518,1622,4519,4520,4521,1623,
|
||||
4522,4523,4524,4525,4526,4527,4528,4529,4530,4531,4532,4533,4534,4535,1360,4536,
|
||||
4537,4538,4539,4540,4541,4542,4543, 975,4544,4545,4546,4547,4548,4549,4550,4551,
|
||||
4552,4553,4554,4555,4556,4557,4558,4559,4560,4561,4562,4563,4564,4565,4566,4567,
|
||||
4568,4569,4570,4571,1624,4572,4573,4574,4575,4576,1625,4577,4578,4579,4580,4581,
|
||||
4582,4583,4584,1626,4585,4586,4587,4588,4589,4590,4591,4592,4593,4594,4595,1627,
|
||||
4596,4597,4598,4599,4600,4601,4602,4603,4604,4605,4606,4607,4608,4609,4610,4611,
|
||||
4612,4613,4614,4615,1628,4616,4617,4618,4619,4620,4621,4622,4623,4624,4625,4626,
|
||||
4627,4628,4629,4630,4631,4632,4633,4634,4635,4636,4637,4638,4639,4640,4641,4642,
|
||||
4643,4644,4645,4646,4647,4648,4649,1361,4650,4651,4652,4653,4654,4655,4656,4657,
|
||||
4658,4659,4660,4661,1362,4662,4663,4664,4665,4666,4667,4668,4669,4670,4671,4672,
|
||||
4673,4674,4675,4676,4677,4678,4679,4680,4681,4682,1629,4683,4684,4685,4686,4687,
|
||||
1630,4688,4689,4690,4691,1153,4692,4693,4694,1113,4695,4696,4697,4698,4699,4700,
|
||||
4701,4702,4703,4704,4705,4706,4707,4708,4709,4710,4711,1197,4712,4713,4714,4715,
|
||||
4716,4717,4718,4719,4720,4721,4722,4723,4724,4725,4726,4727,4728,4729,4730,4731,
|
||||
4732,4733,4734,4735,1631,4736,1632,4737,4738,4739,4740,4741,4742,4743,4744,1633,
|
||||
4745,4746,4747,4748,4749,1262,4750,4751,4752,4753,4754,1363,4755,4756,4757,4758,
|
||||
4759,4760,4761,4762,4763,4764,4765,4766,4767,4768,1634,4769,4770,4771,4772,4773,
|
||||
4774,4775,4776,4777,4778,1635,4779,4780,4781,4782,4783,4784,4785,4786,4787,4788,
|
||||
4789,1636,4790,4791,4792,4793,4794,4795,4796,4797,4798,4799,4800,4801,4802,4803,
|
||||
4804,4805,4806,1637,4807,4808,4809,1638,4810,4811,4812,4813,4814,4815,4816,4817,
|
||||
4818,1639,4819,4820,4821,4822,4823,4824,4825,4826,4827,4828,4829,4830,4831,4832,
|
||||
4833,1077,4834,4835,4836,4837,4838,4839,4840,4841,4842,4843,4844,4845,4846,4847,
|
||||
4848,4849,4850,4851,4852,4853,4854,4855,4856,4857,4858,4859,4860,4861,4862,4863,
|
||||
4864,4865,4866,4867,4868,4869,4870,4871,4872,4873,4874,4875,4876,4877,4878,4879,
|
||||
4880,4881,4882,4883,1640,4884,4885,1641,4886,4887,4888,4889,4890,4891,4892,4893,
|
||||
4894,4895,4896,4897,4898,4899,4900,4901,4902,4903,4904,4905,4906,4907,4908,4909,
|
||||
4910,4911,1642,4912,4913,4914,1364,4915,4916,4917,4918,4919,4920,4921,4922,4923,
|
||||
4924,4925,4926,4927,4928,4929,4930,4931,1643,4932,4933,4934,4935,4936,4937,4938,
|
||||
4939,4940,4941,4942,4943,4944,4945,4946,4947,4948,4949,4950,4951,4952,4953,4954,
|
||||
4955,4956,4957,4958,4959,4960,4961,4962,4963,4964,4965,4966,4967,4968,4969,4970,
|
||||
4971,4972,4973,4974,4975,4976,4977,4978,4979,4980,1644,4981,4982,4983,4984,1645,
|
||||
4985,4986,1646,4987,4988,4989,4990,4991,4992,4993,4994,4995,4996,4997,4998,4999,
|
||||
5000,5001,5002,5003,5004,5005,1647,5006,1648,5007,5008,5009,5010,5011,5012,1078,
|
||||
5013,5014,5015,5016,5017,5018,5019,5020,5021,5022,5023,5024,5025,5026,5027,5028,
|
||||
1365,5029,5030,5031,5032,5033,5034,5035,5036,5037,5038,5039,1649,5040,5041,5042,
|
||||
5043,5044,5045,1366,5046,5047,5048,5049,5050,5051,5052,5053,5054,5055,1650,5056,
|
||||
5057,5058,5059,5060,5061,5062,5063,5064,5065,5066,5067,5068,5069,5070,5071,5072,
|
||||
5073,5074,5075,5076,5077,1651,5078,5079,5080,5081,5082,5083,5084,5085,5086,5087,
|
||||
5088,5089,5090,5091,5092,5093,5094,5095,5096,5097,5098,5099,5100,5101,5102,5103,
|
||||
5104,5105,5106,5107,5108,5109,5110,1652,5111,5112,5113,5114,5115,5116,5117,5118,
|
||||
1367,5119,5120,5121,5122,5123,5124,5125,5126,5127,5128,5129,1653,5130,5131,5132,
|
||||
5133,5134,5135,5136,5137,5138,5139,5140,5141,5142,5143,5144,5145,5146,5147,5148,
|
||||
5149,1368,5150,1654,5151,1369,5152,5153,5154,5155,5156,5157,5158,5159,5160,5161,
|
||||
5162,5163,5164,5165,5166,5167,5168,5169,5170,5171,5172,5173,5174,5175,5176,5177,
|
||||
5178,1370,5179,5180,5181,5182,5183,5184,5185,5186,5187,5188,5189,5190,5191,5192,
|
||||
5193,5194,5195,5196,5197,5198,1655,5199,5200,5201,5202,1656,5203,5204,5205,5206,
|
||||
1371,5207,1372,5208,5209,5210,5211,1373,5212,5213,1374,5214,5215,5216,5217,5218,
|
||||
5219,5220,5221,5222,5223,5224,5225,5226,5227,5228,5229,5230,5231,5232,5233,5234,
|
||||
5235,5236,5237,5238,5239,5240,5241,5242,5243,5244,5245,5246,5247,1657,5248,5249,
|
||||
5250,5251,1658,1263,5252,5253,5254,5255,5256,1375,5257,5258,5259,5260,5261,5262,
|
||||
5263,5264,5265,5266,5267,5268,5269,5270,5271,5272,5273,5274,5275,5276,5277,5278,
|
||||
5279,5280,5281,5282,5283,1659,5284,5285,5286,5287,5288,5289,5290,5291,5292,5293,
|
||||
5294,5295,5296,5297,5298,5299,5300,1660,5301,5302,5303,5304,5305,5306,5307,5308,
|
||||
5309,5310,5311,5312,5313,5314,5315,5316,5317,5318,5319,5320,5321,1376,5322,5323,
|
||||
5324,5325,5326,5327,5328,5329,5330,5331,5332,5333,1198,5334,5335,5336,5337,5338,
|
||||
5339,5340,5341,5342,5343,1661,5344,5345,5346,5347,5348,5349,5350,5351,5352,5353,
|
||||
5354,5355,5356,5357,5358,5359,5360,5361,5362,5363,5364,5365,5366,5367,5368,5369,
|
||||
5370,5371,5372,5373,5374,5375,5376,5377,5378,5379,5380,5381,5382,5383,5384,5385,
|
||||
5386,5387,5388,5389,5390,5391,5392,5393,5394,5395,5396,5397,5398,1264,5399,5400,
|
||||
5401,5402,5403,5404,5405,5406,5407,5408,5409,5410,5411,5412,1662,5413,5414,5415,
|
||||
5416,1663,5417,5418,5419,5420,5421,5422,5423,5424,5425,5426,5427,5428,5429,5430,
|
||||
5431,5432,5433,5434,5435,5436,5437,5438,1664,5439,5440,5441,5442,5443,5444,5445,
|
||||
5446,5447,5448,5449,5450,5451,5452,5453,5454,5455,5456,5457,5458,5459,5460,5461,
|
||||
5462,5463,5464,5465,5466,5467,5468,5469,5470,5471,5472,5473,5474,5475,5476,5477,
|
||||
5478,1154,5479,5480,5481,5482,5483,5484,5485,1665,5486,5487,5488,5489,5490,5491,
|
||||
5492,5493,5494,5495,5496,5497,5498,5499,5500,5501,5502,5503,5504,5505,5506,5507,
|
||||
5508,5509,5510,5511,5512,5513,5514,5515,5516,5517,5518,5519,5520,5521,5522,5523,
|
||||
5524,5525,5526,5527,5528,5529,5530,5531,5532,5533,5534,5535,5536,5537,5538,5539,
|
||||
5540,5541,5542,5543,5544,5545,5546,5547,5548,1377,5549,5550,5551,5552,5553,5554,
|
||||
5555,5556,5557,5558,5559,5560,5561,5562,5563,5564,5565,5566,5567,5568,5569,5570,
|
||||
1114,5571,5572,5573,5574,5575,5576,5577,5578,5579,5580,5581,5582,5583,5584,5585,
|
||||
5586,5587,5588,5589,5590,5591,5592,1378,5593,5594,5595,5596,5597,5598,5599,5600,
|
||||
5601,5602,5603,5604,5605,5606,5607,5608,5609,5610,5611,5612,5613,5614,1379,5615,
|
||||
5616,5617,5618,5619,5620,5621,5622,5623,5624,5625,5626,5627,5628,5629,5630,5631,
|
||||
5632,5633,5634,1380,5635,5636,5637,5638,5639,5640,5641,5642,5643,5644,5645,5646,
|
||||
5647,5648,5649,1381,1056,5650,5651,5652,5653,5654,5655,5656,5657,5658,5659,5660,
|
||||
1666,5661,5662,5663,5664,5665,5666,5667,5668,1667,5669,1668,5670,5671,5672,5673,
|
||||
5674,5675,5676,5677,5678,1155,5679,5680,5681,5682,5683,5684,5685,5686,5687,5688,
|
||||
5689,5690,5691,5692,5693,5694,5695,5696,5697,5698,1669,5699,5700,5701,5702,5703,
|
||||
5704,5705,1670,5706,5707,5708,5709,5710,1671,5711,5712,5713,5714,1382,5715,5716,
|
||||
5717,5718,5719,5720,5721,5722,5723,5724,5725,1672,5726,5727,1673,1674,5728,5729,
|
||||
5730,5731,5732,5733,5734,5735,5736,1675,5737,5738,5739,5740,5741,5742,5743,5744,
|
||||
1676,5745,5746,5747,5748,5749,5750,5751,1383,5752,5753,5754,5755,5756,5757,5758,
|
||||
5759,5760,5761,5762,5763,5764,5765,5766,5767,5768,1677,5769,5770,5771,5772,5773,
|
||||
1678,5774,5775,5776, 998,5777,5778,5779,5780,5781,5782,5783,5784,5785,1384,5786,
|
||||
5787,5788,5789,5790,5791,5792,5793,5794,5795,5796,5797,5798,5799,5800,1679,5801,
|
||||
5802,5803,1115,1116,5804,5805,5806,5807,5808,5809,5810,5811,5812,5813,5814,5815,
|
||||
5816,5817,5818,5819,5820,5821,5822,5823,5824,5825,5826,5827,5828,5829,5830,5831,
|
||||
5832,5833,5834,5835,5836,5837,5838,5839,5840,5841,5842,5843,5844,5845,5846,5847,
|
||||
5848,5849,5850,5851,5852,5853,5854,5855,1680,5856,5857,5858,5859,5860,5861,5862,
|
||||
5863,5864,1681,5865,5866,5867,1682,5868,5869,5870,5871,5872,5873,5874,5875,5876,
|
||||
5877,5878,5879,1683,5880,1684,5881,5882,5883,5884,1685,5885,5886,5887,5888,5889,
|
||||
5890,5891,5892,5893,5894,5895,5896,5897,5898,5899,5900,5901,5902,5903,5904,5905,
|
||||
5906,5907,1686,5908,5909,5910,5911,5912,5913,5914,5915,5916,5917,5918,5919,5920,
|
||||
5921,5922,5923,5924,5925,5926,5927,5928,5929,5930,5931,5932,5933,5934,5935,1687,
|
||||
5936,5937,5938,5939,5940,5941,5942,5943,5944,5945,5946,5947,5948,5949,5950,5951,
|
||||
5952,1688,1689,5953,1199,5954,5955,5956,5957,5958,5959,5960,5961,1690,5962,5963,
|
||||
5964,5965,5966,5967,5968,5969,5970,5971,5972,5973,5974,5975,5976,5977,5978,5979,
|
||||
5980,5981,1385,5982,1386,5983,5984,5985,5986,5987,5988,5989,5990,5991,5992,5993,
|
||||
5994,5995,5996,5997,5998,5999,6000,6001,6002,6003,6004,6005,6006,6007,6008,6009,
|
||||
6010,6011,6012,6013,6014,6015,6016,6017,6018,6019,6020,6021,6022,6023,6024,6025,
|
||||
6026,6027,1265,6028,6029,1691,6030,6031,6032,6033,6034,6035,6036,6037,6038,6039,
|
||||
6040,6041,6042,6043,6044,6045,6046,6047,6048,6049,6050,6051,6052,6053,6054,6055,
|
||||
6056,6057,6058,6059,6060,6061,6062,6063,6064,6065,6066,6067,6068,6069,6070,6071,
|
||||
6072,6073,6074,6075,6076,6077,6078,6079,6080,6081,6082,6083,6084,1692,6085,6086,
|
||||
6087,6088,6089,6090,6091,6092,6093,6094,6095,6096,6097,6098,6099,6100,6101,6102,
|
||||
6103,6104,6105,6106,6107,6108,6109,6110,6111,6112,6113,6114,6115,6116,6117,6118,
|
||||
6119,6120,6121,6122,6123,6124,6125,6126,6127,6128,6129,6130,6131,1693,6132,6133,
|
||||
6134,6135,6136,1694,6137,6138,6139,6140,6141,1695,6142,6143,6144,6145,6146,6147,
|
||||
6148,6149,6150,6151,6152,6153,6154,6155,6156,6157,6158,6159,6160,6161,6162,6163,
|
||||
6164,6165,6166,6167,6168,6169,6170,6171,6172,6173,6174,6175,6176,6177,6178,6179,
|
||||
6180,6181,6182,6183,6184,6185,1696,6186,6187,6188,6189,6190,6191,6192,6193,6194,
|
||||
6195,6196,6197,6198,6199,6200,6201,6202,6203,6204,6205,6206,6207,6208,6209,6210,
|
||||
6211,6212,6213,6214,6215,6216,6217,6218,6219,1697,6220,6221,6222,6223,6224,6225,
|
||||
6226,6227,6228,6229,6230,6231,6232,6233,6234,6235,6236,6237,6238,6239,6240,6241,
|
||||
6242,6243,6244,6245,6246,6247,6248,6249,6250,6251,6252,6253,1698,6254,6255,6256,
|
||||
6257,6258,6259,6260,6261,6262,6263,1200,6264,6265,6266,6267,6268,6269,6270,6271, #1024
|
||||
6272,6273,6274,6275,6276,6277,6278,6279,6280,6281,6282,6283,6284,6285,6286,6287,
|
||||
6288,6289,6290,6291,6292,6293,6294,6295,6296,6297,6298,6299,6300,6301,6302,1699,
|
||||
6303,6304,1700,6305,6306,6307,6308,6309,6310,6311,6312,6313,6314,6315,6316,6317,
|
||||
6318,6319,6320,6321,6322,6323,6324,6325,6326,6327,6328,6329,6330,6331,6332,6333,
|
||||
6334,6335,6336,6337,6338,6339,1701,6340,6341,6342,6343,6344,1387,6345,6346,6347,
|
||||
6348,6349,6350,6351,6352,6353,6354,6355,6356,6357,6358,6359,6360,6361,6362,6363,
|
||||
6364,6365,6366,6367,6368,6369,6370,6371,6372,6373,6374,6375,6376,6377,6378,6379,
|
||||
6380,6381,6382,6383,6384,6385,6386,6387,6388,6389,6390,6391,6392,6393,6394,6395,
|
||||
6396,6397,6398,6399,6400,6401,6402,6403,6404,6405,6406,6407,6408,6409,6410,6411,
|
||||
6412,6413,1702,6414,6415,6416,6417,6418,6419,6420,6421,6422,1703,6423,6424,6425,
|
||||
6426,6427,6428,6429,6430,6431,6432,6433,6434,6435,6436,6437,6438,1704,6439,6440,
|
||||
6441,6442,6443,6444,6445,6446,6447,6448,6449,6450,6451,6452,6453,6454,6455,6456,
|
||||
6457,6458,6459,6460,6461,6462,6463,6464,6465,6466,6467,6468,6469,6470,6471,6472,
|
||||
6473,6474,6475,6476,6477,6478,6479,6480,6481,6482,6483,6484,6485,6486,6487,6488,
|
||||
6489,6490,6491,6492,6493,6494,6495,6496,6497,6498,6499,6500,6501,6502,6503,1266,
|
||||
6504,6505,6506,6507,6508,6509,6510,6511,6512,6513,6514,6515,6516,6517,6518,6519,
|
||||
6520,6521,6522,6523,6524,6525,6526,6527,6528,6529,6530,6531,6532,6533,6534,6535,
|
||||
6536,6537,6538,6539,6540,6541,6542,6543,6544,6545,6546,6547,6548,6549,6550,6551,
|
||||
1705,1706,6552,6553,6554,6555,6556,6557,6558,6559,6560,6561,6562,6563,6564,6565,
|
||||
6566,6567,6568,6569,6570,6571,6572,6573,6574,6575,6576,6577,6578,6579,6580,6581,
|
||||
6582,6583,6584,6585,6586,6587,6588,6589,6590,6591,6592,6593,6594,6595,6596,6597,
|
||||
6598,6599,6600,6601,6602,6603,6604,6605,6606,6607,6608,6609,6610,6611,6612,6613,
|
||||
6614,6615,6616,6617,6618,6619,6620,6621,6622,6623,6624,6625,6626,6627,6628,6629,
|
||||
6630,6631,6632,6633,6634,6635,6636,6637,1388,6638,6639,6640,6641,6642,6643,6644,
|
||||
1707,6645,6646,6647,6648,6649,6650,6651,6652,6653,6654,6655,6656,6657,6658,6659,
|
||||
6660,6661,6662,6663,1708,6664,6665,6666,6667,6668,6669,6670,6671,6672,6673,6674,
|
||||
1201,6675,6676,6677,6678,6679,6680,6681,6682,6683,6684,6685,6686,6687,6688,6689,
|
||||
6690,6691,6692,6693,6694,6695,6696,6697,6698,6699,6700,6701,6702,6703,6704,6705,
|
||||
6706,6707,6708,6709,6710,6711,6712,6713,6714,6715,6716,6717,6718,6719,6720,6721,
|
||||
6722,6723,6724,6725,1389,6726,6727,6728,6729,6730,6731,6732,6733,6734,6735,6736,
|
||||
1390,1709,6737,6738,6739,6740,6741,6742,1710,6743,6744,6745,6746,1391,6747,6748,
|
||||
6749,6750,6751,6752,6753,6754,6755,6756,6757,1392,6758,6759,6760,6761,6762,6763,
|
||||
6764,6765,6766,6767,6768,6769,6770,6771,6772,6773,6774,6775,6776,6777,6778,6779,
|
||||
6780,1202,6781,6782,6783,6784,6785,6786,6787,6788,6789,6790,6791,6792,6793,6794,
|
||||
6795,6796,6797,6798,6799,6800,6801,6802,6803,6804,6805,6806,6807,6808,6809,1711,
|
||||
6810,6811,6812,6813,6814,6815,6816,6817,6818,6819,6820,6821,6822,6823,6824,6825,
|
||||
6826,6827,6828,6829,6830,6831,6832,6833,6834,6835,6836,1393,6837,6838,6839,6840,
|
||||
6841,6842,6843,6844,6845,6846,6847,6848,6849,6850,6851,6852,6853,6854,6855,6856,
|
||||
6857,6858,6859,6860,6861,6862,6863,6864,6865,6866,6867,6868,6869,6870,6871,6872,
|
||||
6873,6874,6875,6876,6877,6878,6879,6880,6881,6882,6883,6884,6885,6886,6887,6888,
|
||||
6889,6890,6891,6892,6893,6894,6895,6896,6897,6898,6899,6900,6901,6902,1712,6903,
|
||||
6904,6905,6906,6907,6908,6909,6910,1713,6911,6912,6913,6914,6915,6916,6917,6918,
|
||||
6919,6920,6921,6922,6923,6924,6925,6926,6927,6928,6929,6930,6931,6932,6933,6934,
|
||||
6935,6936,6937,6938,6939,6940,6941,6942,6943,6944,6945,6946,6947,6948,6949,6950,
|
||||
6951,6952,6953,6954,6955,6956,6957,6958,6959,6960,6961,6962,6963,6964,6965,6966,
|
||||
6967,6968,6969,6970,6971,6972,6973,6974,1714,6975,6976,6977,6978,6979,6980,6981,
|
||||
6982,6983,6984,6985,6986,6987,6988,1394,6989,6990,6991,6992,6993,6994,6995,6996,
|
||||
6997,6998,6999,7000,1715,7001,7002,7003,7004,7005,7006,7007,7008,7009,7010,7011,
|
||||
7012,7013,7014,7015,7016,7017,7018,7019,7020,7021,7022,7023,7024,7025,7026,7027,
|
||||
7028,1716,7029,7030,7031,7032,7033,7034,7035,7036,7037,7038,7039,7040,7041,7042,
|
||||
7043,7044,7045,7046,7047,7048,7049,7050,7051,7052,7053,7054,7055,7056,7057,7058,
|
||||
7059,7060,7061,7062,7063,7064,7065,7066,7067,7068,7069,7070,7071,7072,7073,7074,
|
||||
7075,7076,7077,7078,7079,7080,7081,7082,7083,7084,7085,7086,7087,7088,7089,7090,
|
||||
7091,7092,7093,7094,7095,7096,7097,7098,7099,7100,7101,7102,7103,7104,7105,7106,
|
||||
7107,7108,7109,7110,7111,7112,7113,7114,7115,7116,7117,7118,7119,7120,7121,7122,
|
||||
7123,7124,7125,7126,7127,7128,7129,7130,7131,7132,7133,7134,7135,7136,7137,7138,
|
||||
7139,7140,7141,7142,7143,7144,7145,7146,7147,7148,7149,7150,7151,7152,7153,7154,
|
||||
7155,7156,7157,7158,7159,7160,7161,7162,7163,7164,7165,7166,7167,7168,7169,7170,
|
||||
7171,7172,7173,7174,7175,7176,7177,7178,7179,7180,7181,7182,7183,7184,7185,7186,
|
||||
7187,7188,7189,7190,7191,7192,7193,7194,7195,7196,7197,7198,7199,7200,7201,7202,
|
||||
7203,7204,7205,7206,7207,1395,7208,7209,7210,7211,7212,7213,1717,7214,7215,7216,
|
||||
7217,7218,7219,7220,7221,7222,7223,7224,7225,7226,7227,7228,7229,7230,7231,7232,
|
||||
7233,7234,7235,7236,7237,7238,7239,7240,7241,7242,7243,7244,7245,7246,7247,7248,
|
||||
7249,7250,7251,7252,7253,7254,7255,7256,7257,7258,7259,7260,7261,7262,7263,7264,
|
||||
7265,7266,7267,7268,7269,7270,7271,7272,7273,7274,7275,7276,7277,7278,7279,7280,
|
||||
7281,7282,7283,7284,7285,7286,7287,7288,7289,7290,7291,7292,7293,7294,7295,7296,
|
||||
7297,7298,7299,7300,7301,7302,7303,7304,7305,7306,7307,7308,7309,7310,7311,7312,
|
||||
7313,1718,7314,7315,7316,7317,7318,7319,7320,7321,7322,7323,7324,7325,7326,7327,
|
||||
7328,7329,7330,7331,7332,7333,7334,7335,7336,7337,7338,7339,7340,7341,7342,7343,
|
||||
7344,7345,7346,7347,7348,7349,7350,7351,7352,7353,7354,7355,7356,7357,7358,7359,
|
||||
7360,7361,7362,7363,7364,7365,7366,7367,7368,7369,7370,7371,7372,7373,7374,7375,
|
||||
7376,7377,7378,7379,7380,7381,7382,7383,7384,7385,7386,7387,7388,7389,7390,7391,
|
||||
7392,7393,7394,7395,7396,7397,7398,7399,7400,7401,7402,7403,7404,7405,7406,7407,
|
||||
7408,7409,7410,7411,7412,7413,7414,7415,7416,7417,7418,7419,7420,7421,7422,7423,
|
||||
7424,7425,7426,7427,7428,7429,7430,7431,7432,7433,7434,7435,7436,7437,7438,7439,
|
||||
7440,7441,7442,7443,7444,7445,7446,7447,7448,7449,7450,7451,7452,7453,7454,7455,
|
||||
7456,7457,7458,7459,7460,7461,7462,7463,7464,7465,7466,7467,7468,7469,7470,7471,
|
||||
7472,7473,7474,7475,7476,7477,7478,7479,7480,7481,7482,7483,7484,7485,7486,7487,
|
||||
7488,7489,7490,7491,7492,7493,7494,7495,7496,7497,7498,7499,7500,7501,7502,7503,
|
||||
7504,7505,7506,7507,7508,7509,7510,7511,7512,7513,7514,7515,7516,7517,7518,7519,
|
||||
7520,7521,7522,7523,7524,7525,7526,7527,7528,7529,7530,7531,7532,7533,7534,7535,
|
||||
7536,7537,7538,7539,7540,7541,7542,7543,7544,7545,7546,7547,7548,7549,7550,7551,
|
||||
7552,7553,7554,7555,7556,7557,7558,7559,7560,7561,7562,7563,7564,7565,7566,7567,
|
||||
7568,7569,7570,7571,7572,7573,7574,7575,7576,7577,7578,7579,7580,7581,7582,7583,
|
||||
7584,7585,7586,7587,7588,7589,7590,7591,7592,7593,7594,7595,7596,7597,7598,7599,
|
||||
7600,7601,7602,7603,7604,7605,7606,7607,7608,7609,7610,7611,7612,7613,7614,7615,
|
||||
7616,7617,7618,7619,7620,7621,7622,7623,7624,7625,7626,7627,7628,7629,7630,7631,
|
||||
7632,7633,7634,7635,7636,7637,7638,7639,7640,7641,7642,7643,7644,7645,7646,7647,
|
||||
7648,7649,7650,7651,7652,7653,7654,7655,7656,7657,7658,7659,7660,7661,7662,7663,
|
||||
7664,7665,7666,7667,7668,7669,7670,7671,7672,7673,7674,7675,7676,7677,7678,7679,
|
||||
7680,7681,7682,7683,7684,7685,7686,7687,7688,7689,7690,7691,7692,7693,7694,7695,
|
||||
7696,7697,7698,7699,7700,7701,7702,7703,7704,7705,7706,7707,7708,7709,7710,7711,
|
||||
7712,7713,7714,7715,7716,7717,7718,7719,7720,7721,7722,7723,7724,7725,7726,7727,
|
||||
7728,7729,7730,7731,7732,7733,7734,7735,7736,7737,7738,7739,7740,7741,7742,7743,
|
||||
7744,7745,7746,7747,7748,7749,7750,7751,7752,7753,7754,7755,7756,7757,7758,7759,
|
||||
7760,7761,7762,7763,7764,7765,7766,7767,7768,7769,7770,7771,7772,7773,7774,7775,
|
||||
7776,7777,7778,7779,7780,7781,7782,7783,7784,7785,7786,7787,7788,7789,7790,7791,
|
||||
7792,7793,7794,7795,7796,7797,7798,7799,7800,7801,7802,7803,7804,7805,7806,7807,
|
||||
7808,7809,7810,7811,7812,7813,7814,7815,7816,7817,7818,7819,7820,7821,7822,7823,
|
||||
7824,7825,7826,7827,7828,7829,7830,7831,7832,7833,7834,7835,7836,7837,7838,7839,
|
||||
7840,7841,7842,7843,7844,7845,7846,7847,7848,7849,7850,7851,7852,7853,7854,7855,
|
||||
7856,7857,7858,7859,7860,7861,7862,7863,7864,7865,7866,7867,7868,7869,7870,7871,
|
||||
7872,7873,7874,7875,7876,7877,7878,7879,7880,7881,7882,7883,7884,7885,7886,7887,
|
||||
7888,7889,7890,7891,7892,7893,7894,7895,7896,7897,7898,7899,7900,7901,7902,7903,
|
||||
7904,7905,7906,7907,7908,7909,7910,7911,7912,7913,7914,7915,7916,7917,7918,7919,
|
||||
7920,7921,7922,7923,7924,7925,7926,7927,7928,7929,7930,7931,7932,7933,7934,7935,
|
||||
7936,7937,7938,7939,7940,7941,7942,7943,7944,7945,7946,7947,7948,7949,7950,7951,
|
||||
7952,7953,7954,7955,7956,7957,7958,7959,7960,7961,7962,7963,7964,7965,7966,7967,
|
||||
7968,7969,7970,7971,7972,7973,7974,7975,7976,7977,7978,7979,7980,7981,7982,7983,
|
||||
7984,7985,7986,7987,7988,7989,7990,7991,7992,7993,7994,7995,7996,7997,7998,7999,
|
||||
8000,8001,8002,8003,8004,8005,8006,8007,8008,8009,8010,8011,8012,8013,8014,8015,
|
||||
8016,8017,8018,8019,8020,8021,8022,8023,8024,8025,8026,8027,8028,8029,8030,8031,
|
||||
8032,8033,8034,8035,8036,8037,8038,8039,8040,8041,8042,8043,8044,8045,8046,8047,
|
||||
8048,8049,8050,8051,8052,8053,8054,8055,8056,8057,8058,8059,8060,8061,8062,8063,
|
||||
8064,8065,8066,8067,8068,8069,8070,8071,8072,8073,8074,8075,8076,8077,8078,8079,
|
||||
8080,8081,8082,8083,8084,8085,8086,8087,8088,8089,8090,8091,8092,8093,8094,8095,
|
||||
8096,8097,8098,8099,8100,8101,8102,8103,8104,8105,8106,8107,8108,8109,8110,8111,
|
||||
8112,8113,8114,8115,8116,8117,8118,8119,8120,8121,8122,8123,8124,8125,8126,8127,
|
||||
8128,8129,8130,8131,8132,8133,8134,8135,8136,8137,8138,8139,8140,8141,8142,8143,
|
||||
8144,8145,8146,8147,8148,8149,8150,8151,8152,8153,8154,8155,8156,8157,8158,8159,
|
||||
8160,8161,8162,8163,8164,8165,8166,8167,8168,8169,8170,8171,8172,8173,8174,8175,
|
||||
8176,8177,8178,8179,8180,8181,8182,8183,8184,8185,8186,8187,8188,8189,8190,8191,
|
||||
8192,8193,8194,8195,8196,8197,8198,8199,8200,8201,8202,8203,8204,8205,8206,8207,
|
||||
8208,8209,8210,8211,8212,8213,8214,8215,8216,8217,8218,8219,8220,8221,8222,8223,
|
||||
8224,8225,8226,8227,8228,8229,8230,8231,8232,8233,8234,8235,8236,8237,8238,8239,
|
||||
8240,8241,8242,8243,8244,8245,8246,8247,8248,8249,8250,8251,8252,8253,8254,8255,
|
||||
8256,8257,8258,8259,8260,8261,8262,8263,8264,8265,8266,8267,8268,8269,8270,8271,
|
||||
8272,8273,8274,8275,8276,8277,8278,8279,8280,8281,8282,8283,8284,8285,8286,8287,
|
||||
8288,8289,8290,8291,8292,8293,8294,8295,8296,8297,8298,8299,8300,8301,8302,8303,
|
||||
8304,8305,8306,8307,8308,8309,8310,8311,8312,8313,8314,8315,8316,8317,8318,8319,
|
||||
8320,8321,8322,8323,8324,8325,8326,8327,8328,8329,8330,8331,8332,8333,8334,8335,
|
||||
8336,8337,8338,8339,8340,8341,8342,8343,8344,8345,8346,8347,8348,8349,8350,8351,
|
||||
8352,8353,8354,8355,8356,8357,8358,8359,8360,8361,8362,8363,8364,8365,8366,8367,
|
||||
8368,8369,8370,8371,8372,8373,8374,8375,8376,8377,8378,8379,8380,8381,8382,8383,
|
||||
8384,8385,8386,8387,8388,8389,8390,8391,8392,8393,8394,8395,8396,8397,8398,8399,
|
||||
8400,8401,8402,8403,8404,8405,8406,8407,8408,8409,8410,8411,8412,8413,8414,8415,
|
||||
8416,8417,8418,8419,8420,8421,8422,8423,8424,8425,8426,8427,8428,8429,8430,8431,
|
||||
8432,8433,8434,8435,8436,8437,8438,8439,8440,8441,8442,8443,8444,8445,8446,8447,
|
||||
8448,8449,8450,8451,8452,8453,8454,8455,8456,8457,8458,8459,8460,8461,8462,8463,
|
||||
8464,8465,8466,8467,8468,8469,8470,8471,8472,8473,8474,8475,8476,8477,8478,8479,
|
||||
8480,8481,8482,8483,8484,8485,8486,8487,8488,8489,8490,8491,8492,8493,8494,8495,
|
||||
8496,8497,8498,8499,8500,8501,8502,8503,8504,8505,8506,8507,8508,8509,8510,8511,
|
||||
8512,8513,8514,8515,8516,8517,8518,8519,8520,8521,8522,8523,8524,8525,8526,8527,
|
||||
8528,8529,8530,8531,8532,8533,8534,8535,8536,8537,8538,8539,8540,8541,8542,8543,
|
||||
8544,8545,8546,8547,8548,8549,8550,8551,8552,8553,8554,8555,8556,8557,8558,8559,
|
||||
8560,8561,8562,8563,8564,8565,8566,8567,8568,8569,8570,8571,8572,8573,8574,8575,
|
||||
8576,8577,8578,8579,8580,8581,8582,8583,8584,8585,8586,8587,8588,8589,8590,8591,
|
||||
8592,8593,8594,8595,8596,8597,8598,8599,8600,8601,8602,8603,8604,8605,8606,8607,
|
||||
8608,8609,8610,8611,8612,8613,8614,8615,8616,8617,8618,8619,8620,8621,8622,8623,
|
||||
8624,8625,8626,8627,8628,8629,8630,8631,8632,8633,8634,8635,8636,8637,8638,8639,
|
||||
8640,8641,8642,8643,8644,8645,8646,8647,8648,8649,8650,8651,8652,8653,8654,8655,
|
||||
8656,8657,8658,8659,8660,8661,8662,8663,8664,8665,8666,8667,8668,8669,8670,8671,
|
||||
8672,8673,8674,8675,8676,8677,8678,8679,8680,8681,8682,8683,8684,8685,8686,8687,
|
||||
8688,8689,8690,8691,8692,8693,8694,8695,8696,8697,8698,8699,8700,8701,8702,8703,
|
||||
8704,8705,8706,8707,8708,8709,8710,8711,8712,8713,8714,8715,8716,8717,8718,8719,
|
||||
8720,8721,8722,8723,8724,8725,8726,8727,8728,8729,8730,8731,8732,8733,8734,8735,
|
||||
8736,8737,8738,8739,8740,8741)
|
||||
|
|
@ -13,35 +13,29 @@
|
|||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
# 02110-1301 USA
|
||||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
from .mbcharsetprober import MultiByteCharSetProber
|
||||
from .codingstatemachine import CodingStateMachine
|
||||
from .chardistribution import EUCKRDistributionAnalysis
|
||||
from .mbcssm import EUCKR_SM_MODEL
|
||||
|
||||
from mbcharsetprober import MultiByteCharSetProber
|
||||
from codingstatemachine import CodingStateMachine
|
||||
from chardistribution import EUCKRDistributionAnalysis
|
||||
from mbcssm import EUCKRSMModel
|
||||
|
||||
class EUCKRProber(MultiByteCharSetProber):
|
||||
    def __init__(self):
|
||||
        super(EUCKRProber, self).__init__()
|
||||
        self.coding_sm = CodingStateMachine(EUCKR_SM_MODEL)
|
||||
        self.distribution_analyzer = EUCKRDistributionAnalysis()
|
||||
        MultiByteCharSetProber.__init__(self)
|
||||
        self._mCodingSM = CodingStateMachine(EUCKRSMModel)
|
||||
        self._mDistributionAnalyzer = EUCKRDistributionAnalysis()
|
||||
        self.reset()
|
||||
|
||||
    @property
|
||||
    def charset_name(self):
|
||||
    def get_charset_name(self):
|
||||
        return "EUC-KR"
|
||||
|
||||
    @property
|
||||
    def language(self):
|
||||
        return "Korean"
|
||||
|
|
@ -13,12 +13,12 @@
|
|||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License as published by the Free Software Foundation; either
|
||||
# version 2.1 of the License, or (at your option) any later version.
|
||||
#
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
|
||||
|
|
@ -26,8 +26,8 @@
|
|||
######################### END LICENSE BLOCK #########################
|
||||
|
||||
# EUCTW frequency table
|
||||
# Converted from big5 work
|
||||
# by Taiwan's Mandarin Promotion Council
|
||||
# Converted from big5 work
|
||||
# by Taiwan's Mandarin Promotion Council
|
||||
# <http:#www.edu.tw:81/mandr/>
|
||||
|
||||
# 128 --> 0.42261
|
||||
|
|
@ -38,350 +38,389 @@
|
|||
#
|
||||
# Ideal Distribution Ratio = 0.74851/(1-0.74851) = 2.98
|
||||
# Random Distribution Ratio = 512/(5401-512) = 0.105
|
||||
#
|
||||
#
|
||||
# Typical Distribution Ratio is about 25% of the Ideal one, still much higher than the RDR
|
||||
|
||||
EUCTW_TYPICAL_DISTRIBUTION_RATIO = 0.75
|
||||
|
||||
# Char to FreqOrder table
|
||||
EUCTW_TABLE_SIZE = 5376
|
||||
|
||||
EUCTW_CHAR_TO_FREQ_ORDER = (
|
||||
1,1800,1506, 255,1431, 198, 9, 82, 6,7310, 177, 202,3615,1256,2808, 110, # 2742
|
||||
3735, 33,3241, 261, 76, 44,2113, 16,2931,2184,1176, 659,3868, 26,3404,2643, # 2758
|
||||
1198,3869,3313,4060, 410,2211, 302, 590, 361,1963, 8, 204, 58,4296,7311,1931, # 2774
|
||||
63,7312,7313, 317,1614, 75, 222, 159,4061,2412,1480,7314,3500,3068, 224,2809, # 2790
|
||||
3616, 3, 10,3870,1471, 29,2774,1135,2852,1939, 873, 130,3242,1123, 312,7315, # 2806
|
||||
4297,2051, 507, 252, 682,7316, 142,1914, 124, 206,2932, 34,3501,3173, 64, 604, # 2822
|
||||
7317,2494,1976,1977, 155,1990, 645, 641,1606,7318,3405, 337, 72, 406,7319, 80, # 2838
|
||||
630, 238,3174,1509, 263, 939,1092,2644, 756,1440,1094,3406, 449, 69,2969, 591, # 2854
|
||||
179,2095, 471, 115,2034,1843, 60, 50,2970, 134, 806,1868, 734,2035,3407, 180, # 2870
|
||||
995,1607, 156, 537,2893, 688,7320, 319,1305, 779,2144, 514,2374, 298,4298, 359, # 2886
|
||||
2495, 90,2707,1338, 663, 11, 906,1099,2545, 20,2436, 182, 532,1716,7321, 732, # 2902
|
||||
1376,4062,1311,1420,3175, 25,2312,1056, 113, 399, 382,1949, 242,3408,2467, 529, # 2918
|
||||
3243, 475,1447,3617,7322, 117, 21, 656, 810,1297,2295,2329,3502,7323, 126,4063, # 2934
|
||||
706, 456, 150, 613,4299, 71,1118,2036,4064, 145,3069, 85, 835, 486,2114,1246, # 2950
|
||||
1426, 428, 727,1285,1015, 800, 106, 623, 303,1281,7324,2127,2354, 347,3736, 221, # 2966
|
||||
3503,3110,7325,1955,1153,4065, 83, 296,1199,3070, 192, 624, 93,7326, 822,1897, # 2982
|
||||
2810,3111, 795,2064, 991,1554,1542,1592, 27, 43,2853, 859, 139,1456, 860,4300, # 2998
|
||||
437, 712,3871, 164,2392,3112, 695, 211,3017,2096, 195,3872,1608,3504,3505,3618, # 3014
|
||||
3873, 234, 811,2971,2097,3874,2229,1441,3506,1615,2375, 668,2076,1638, 305, 228, # 3030
|
||||
1664,4301, 467, 415,7327, 262,2098,1593, 239, 108, 300, 200,1033, 512,1247,2077, # 3046
|
||||
7328,7329,2173,3176,3619,2673, 593, 845,1062,3244, 88,1723,2037,3875,1950, 212, # 3062
|
||||
266, 152, 149, 468,1898,4066,4302, 77, 187,7330,3018, 37, 5,2972,7331,3876, # 3078
|
||||
7332,7333, 39,2517,4303,2894,3177,2078, 55, 148, 74,4304, 545, 483,1474,1029, # 3094
|
||||
1665, 217,1869,1531,3113,1104,2645,4067, 24, 172,3507, 900,3877,3508,3509,4305, # 3110
|
||||
32,1408,2811,1312, 329, 487,2355,2247,2708, 784,2674, 4,3019,3314,1427,1788, # 3126
|
||||
188, 109, 499,7334,3620,1717,1789, 888,1217,3020,4306,7335,3510,7336,3315,1520, # 3142
|
||||
3621,3878, 196,1034, 775,7337,7338, 929,1815, 249, 439, 38,7339,1063,7340, 794, # 3158
|
||||
3879,1435,2296, 46, 178,3245,2065,7341,2376,7342, 214,1709,4307, 804, 35, 707, # 3174
|
||||
324,3622,1601,2546, 140, 459,4068,7343,7344,1365, 839, 272, 978,2257,2572,3409, # 3190
|
||||
2128,1363,3623,1423, 697, 100,3071, 48, 70,1231, 495,3114,2193,7345,1294,7346, # 3206
|
||||
2079, 462, 586,1042,3246, 853, 256, 988, 185,2377,3410,1698, 434,1084,7347,3411, # 3222
|
||||
314,2615,2775,4308,2330,2331, 569,2280, 637,1816,2518, 757,1162,1878,1616,3412, # 3238
|
||||
287,1577,2115, 768,4309,1671,2854,3511,2519,1321,3737, 909,2413,7348,4069, 933, # 3254
|
||||
3738,7349,2052,2356,1222,4310, 765,2414,1322, 786,4311,7350,1919,1462,1677,2895, # 3270
|
||||
1699,7351,4312,1424,2437,3115,3624,2590,3316,1774,1940,3413,3880,4070, 309,1369, # 3286
|
||||
1130,2812, 364,2230,1653,1299,3881,3512,3882,3883,2646, 525,1085,3021, 902,2000, # 3302
|
||||
1475, 964,4313, 421,1844,1415,1057,2281, 940,1364,3116, 376,4314,4315,1381, 7, # 3318
|
||||
2520, 983,2378, 336,1710,2675,1845, 321,3414, 559,1131,3022,2742,1808,1132,1313, # 3334
|
||||
265,1481,1857,7352, 352,1203,2813,3247, 167,1089, 420,2814, 776, 792,1724,3513, # 3350
|
||||
4071,2438,3248,7353,4072,7354, 446, 229, 333,2743, 901,3739,1200,1557,4316,2647, # 3366
|
||||
1920, 395,2744,2676,3740,4073,1835, 125, 916,3178,2616,4317,7355,7356,3741,7357, # 3382
|
||||
7358,7359,4318,3117,3625,1133,2547,1757,3415,1510,2313,1409,3514,7360,2145, 438, # 3398
|
||||
2591,2896,2379,3317,1068, 958,3023, 461, 311,2855,2677,4074,1915,3179,4075,1978, # 3414
|
||||
383, 750,2745,2617,4076, 274, 539, 385,1278,1442,7361,1154,1964, 384, 561, 210, # 3430
|
||||
98,1295,2548,3515,7362,1711,2415,1482,3416,3884,2897,1257, 129,7363,3742, 642, # 3446
|
||||
523,2776,2777,2648,7364, 141,2231,1333, 68, 176, 441, 876, 907,4077, 603,2592, # 3462
|
||||
710, 171,3417, 404, 549, 18,3118,2393,1410,3626,1666,7365,3516,4319,2898,4320, # 3478
|
||||
7366,2973, 368,7367, 146, 366, 99, 871,3627,1543, 748, 807,1586,1185, 22,2258, # 3494
|
||||
379,3743,3180,7368,3181, 505,1941,2618,1991,1382,2314,7369, 380,2357, 218, 702, # 3510
|
||||
1817,1248,3418,3024,3517,3318,3249,7370,2974,3628, 930,3250,3744,7371, 59,7372, # 3526
|
||||
585, 601,4078, 497,3419,1112,1314,4321,1801,7373,1223,1472,2174,7374, 749,1836, # 3542
|
||||
690,1899,3745,1772,3885,1476, 429,1043,1790,2232,2116, 917,4079, 447,1086,1629, # 3558
|
||||
7375, 556,7376,7377,2020,1654, 844,1090, 105, 550, 966,1758,2815,1008,1782, 686, # 3574
|
||||
1095,7378,2282, 793,1602,7379,3518,2593,4322,4080,2933,2297,4323,3746, 980,2496, # 3590
|
||||
544, 353, 527,4324, 908,2678,2899,7380, 381,2619,1942,1348,7381,1341,1252, 560, # 3606
|
||||
3072,7382,3420,2856,7383,2053, 973, 886,2080, 143,4325,7384,7385, 157,3886, 496, # 3622
|
||||
4081, 57, 840, 540,2038,4326,4327,3421,2117,1445, 970,2259,1748,1965,2081,4082, # 3638
|
||||
3119,1234,1775,3251,2816,3629, 773,1206,2129,1066,2039,1326,3887,1738,1725,4083, # 3654
|
||||
279,3120, 51,1544,2594, 423,1578,2130,2066, 173,4328,1879,7386,7387,1583, 264, # 3670
|
||||
610,3630,4329,2439, 280, 154,7388,7389,7390,1739, 338,1282,3073, 693,2857,1411, # 3686
|
||||
1074,3747,2440,7391,4330,7392,7393,1240, 952,2394,7394,2900,1538,2679, 685,1483, # 3702
|
||||
4084,2468,1436, 953,4085,2054,4331, 671,2395, 79,4086,2441,3252, 608, 567,2680, # 3718
|
||||
3422,4087,4088,1691, 393,1261,1791,2396,7395,4332,7396,7397,7398,7399,1383,1672, # 3734
|
||||
3748,3182,1464, 522,1119, 661,1150, 216, 675,4333,3888,1432,3519, 609,4334,2681, # 3750
|
||||
2397,7400,7401,7402,4089,3025, 0,7403,2469, 315, 231,2442, 301,3319,4335,2380, # 3766
|
||||
7404, 233,4090,3631,1818,4336,4337,7405, 96,1776,1315,2082,7406, 257,7407,1809, # 3782
|
||||
3632,2709,1139,1819,4091,2021,1124,2163,2778,1777,2649,7408,3074, 363,1655,3183, # 3798
|
||||
7409,2975,7410,7411,7412,3889,1567,3890, 718, 103,3184, 849,1443, 341,3320,2934, # 3814
|
||||
1484,7413,1712, 127, 67, 339,4092,2398, 679,1412, 821,7414,7415, 834, 738, 351, # 3830
|
||||
2976,2146, 846, 235,1497,1880, 418,1992,3749,2710, 186,1100,2147,2746,3520,1545, # 3846
|
||||
1355,2935,2858,1377, 583,3891,4093,2573,2977,7416,1298,3633,1078,2549,3634,2358, # 3862
|
||||
78,3750,3751, 267,1289,2099,2001,1594,4094, 348, 369,1274,2194,2175,1837,4338, # 3878
|
||||
1820,2817,3635,2747,2283,2002,4339,2936,2748, 144,3321, 882,4340,3892,2749,3423, # 3894
|
||||
4341,2901,7417,4095,1726, 320,7418,3893,3026, 788,2978,7419,2818,1773,1327,2859, # 3910
|
||||
3894,2819,7420,1306,4342,2003,1700,3752,3521,2359,2650, 787,2022, 506, 824,3636, # 3926
|
||||
534, 323,4343,1044,3322,2023,1900, 946,3424,7421,1778,1500,1678,7422,1881,4344, # 3942
|
||||
165, 243,4345,3637,2521, 123, 683,4096, 764,4346, 36,3895,1792, 589,2902, 816, # 3958
|
||||
626,1667,3027,2233,1639,1555,1622,3753,3896,7423,3897,2860,1370,1228,1932, 891, # 3974
|
||||
2083,2903, 304,4097,7424, 292,2979,2711,3522, 691,2100,4098,1115,4347, 118, 662, # 3990
|
||||
7425, 611,1156, 854,2381,1316,2861, 2, 386, 515,2904,7426,7427,3253, 868,2234, # 4006
|
||||
1486, 855,2651, 785,2212,3028,7428,1040,3185,3523,7429,3121, 448,7430,1525,7431, # 4022
|
||||
2164,4348,7432,3754,7433,4099,2820,3524,3122, 503, 818,3898,3123,1568, 814, 676, # 4038
|
||||
1444, 306,1749,7434,3755,1416,1030, 197,1428, 805,2821,1501,4349,7435,7436,7437, # 4054
|
||||
1993,7438,4350,7439,7440,2195, 13,2779,3638,2980,3124,1229,1916,7441,3756,2131, # 4070
|
||||
7442,4100,4351,2399,3525,7443,2213,1511,1727,1120,7444,7445, 646,3757,2443, 307, # 4086
|
||||
7446,7447,1595,3186,7448,7449,7450,3639,1113,1356,3899,1465,2522,2523,7451, 519, # 4102
|
||||
7452, 128,2132, 92,2284,1979,7453,3900,1512, 342,3125,2196,7454,2780,2214,1980, # 4118
|
||||
3323,7455, 290,1656,1317, 789, 827,2360,7456,3758,4352, 562, 581,3901,7457, 401, # 4134
|
||||
4353,2248, 94,4354,1399,2781,7458,1463,2024,4355,3187,1943,7459, 828,1105,4101, # 4150
|
||||
1262,1394,7460,4102, 605,4356,7461,1783,2862,7462,2822, 819,2101, 578,2197,2937, # 4166
|
||||
7463,1502, 436,3254,4103,3255,2823,3902,2905,3425,3426,7464,2712,2315,7465,7466, # 4182
|
||||
2332,2067, 23,4357, 193, 826,3759,2102, 699,1630,4104,3075, 390,1793,1064,3526, # 4198
|
||||
7467,1579,3076,3077,1400,7468,4105,1838,1640,2863,7469,4358,4359, 137,4106, 598, # 4214
|
||||
3078,1966, 780, 104, 974,2938,7470, 278, 899, 253, 402, 572, 504, 493,1339,7471, # 4230
|
||||
3903,1275,4360,2574,2550,7472,3640,3029,3079,2249, 565,1334,2713, 863, 41,7473, # 4246
|
||||
7474,4361,7475,1657,2333, 19, 463,2750,4107, 606,7476,2981,3256,1087,2084,1323, # 4262
|
||||
2652,2982,7477,1631,1623,1750,4108,2682,7478,2864, 791,2714,2653,2334, 232,2416, # 4278
|
||||
7479,2983,1498,7480,2654,2620, 755,1366,3641,3257,3126,2025,1609, 119,1917,3427, # 4294
|
||||
862,1026,4109,7481,3904,3760,4362,3905,4363,2260,1951,2470,7482,1125, 817,4110, # 4310
|
||||
4111,3906,1513,1766,2040,1487,4112,3030,3258,2824,3761,3127,7483,7484,1507,7485, # 4326
|
||||
2683, 733, 40,1632,1106,2865, 345,4113, 841,2524, 230,4364,2984,1846,3259,3428, # 4342
|
||||
7486,1263, 986,3429,7487, 735, 879, 254,1137, 857, 622,1300,1180,1388,1562,3907, # 4358
|
||||
3908,2939, 967,2751,2655,1349, 592,2133,1692,3324,2985,1994,4114,1679,3909,1901, # 4374
|
||||
2185,7488, 739,3642,2715,1296,1290,7489,4115,2198,2199,1921,1563,2595,2551,1870, # 4390
|
||||
2752,2986,7490, 435,7491, 343,1108, 596, 17,1751,4365,2235,3430,3643,7492,4366, # 4406
|
||||
294,3527,2940,1693, 477, 979, 281,2041,3528, 643,2042,3644,2621,2782,2261,1031, # 4422
|
||||
2335,2134,2298,3529,4367, 367,1249,2552,7493,3530,7494,4368,1283,3325,2004, 240, # 4438
|
||||
1762,3326,4369,4370, 836,1069,3128, 474,7495,2148,2525, 268,3531,7496,3188,1521, # 4454
|
||||
1284,7497,1658,1546,4116,7498,3532,3533,7499,4117,3327,2684,1685,4118, 961,1673, # 4470
|
||||
2622, 190,2005,2200,3762,4371,4372,7500, 570,2497,3645,1490,7501,4373,2623,3260, # 4486
|
||||
1956,4374, 584,1514, 396,1045,1944,7502,4375,1967,2444,7503,7504,4376,3910, 619, # 4502
|
||||
7505,3129,3261, 215,2006,2783,2553,3189,4377,3190,4378, 763,4119,3763,4379,7506, # 4518
|
||||
7507,1957,1767,2941,3328,3646,1174, 452,1477,4380,3329,3130,7508,2825,1253,2382, # 4534
|
||||
2186,1091,2285,4120, 492,7509, 638,1169,1824,2135,1752,3911, 648, 926,1021,1324, # 4550
|
||||
4381, 520,4382, 997, 847,1007, 892,4383,3764,2262,1871,3647,7510,2400,1784,4384, # 4566
|
||||
1952,2942,3080,3191,1728,4121,2043,3648,4385,2007,1701,3131,1551, 30,2263,4122, # 4582
|
||||
7511,2026,4386,3534,7512, 501,7513,4123, 594,3431,2165,1821,3535,3432,3536,3192, # 4598
|
||||
829,2826,4124,7514,1680,3132,1225,4125,7515,3262,4387,4126,3133,2336,7516,4388, # 4614
|
||||
4127,7517,3912,3913,7518,1847,2383,2596,3330,7519,4389, 374,3914, 652,4128,4129, # 4630
|
||||
375,1140, 798,7520,7521,7522,2361,4390,2264, 546,1659, 138,3031,2445,4391,7523, # 4646
|
||||
2250, 612,1848, 910, 796,3765,1740,1371, 825,3766,3767,7524,2906,2554,7525, 692, # 4662
|
||||
444,3032,2624, 801,4392,4130,7526,1491, 244,1053,3033,4131,4132, 340,7527,3915, # 4678
|
||||
1041,2987, 293,1168, 87,1357,7528,1539, 959,7529,2236, 721, 694,4133,3768, 219, # 4694
|
||||
1478, 644,1417,3331,2656,1413,1401,1335,1389,3916,7530,7531,2988,2362,3134,1825, # 4710
|
||||
730,1515, 184,2827, 66,4393,7532,1660,2943, 246,3332, 378,1457, 226,3433, 975, # 4726
|
||||
3917,2944,1264,3537, 674, 696,7533, 163,7534,1141,2417,2166, 713,3538,3333,4394, # 4742
|
||||
3918,7535,7536,1186, 15,7537,1079,1070,7538,1522,3193,3539, 276,1050,2716, 758, # 4758
|
||||
1126, 653,2945,3263,7539,2337, 889,3540,3919,3081,2989, 903,1250,4395,3920,3434, # 4774
|
||||
3541,1342,1681,1718, 766,3264, 286, 89,2946,3649,7540,1713,7541,2597,3334,2990, # 4790
|
||||
7542,2947,2215,3194,2866,7543,4396,2498,2526, 181, 387,1075,3921, 731,2187,3335, # 4806
|
||||
7544,3265, 310, 313,3435,2299, 770,4134, 54,3034, 189,4397,3082,3769,3922,7545, # 4822
|
||||
1230,1617,1849, 355,3542,4135,4398,3336, 111,4136,3650,1350,3135,3436,3035,4137, # 4838
|
||||
2149,3266,3543,7546,2784,3923,3924,2991, 722,2008,7547,1071, 247,1207,2338,2471, # 4854
|
||||
1378,4399,2009, 864,1437,1214,4400, 373,3770,1142,2216, 667,4401, 442,2753,2555, # 4870
|
||||
3771,3925,1968,4138,3267,1839, 837, 170,1107, 934,1336,1882,7548,7549,2118,4139, # 4886
|
||||
2828, 743,1569,7550,4402,4140, 582,2384,1418,3437,7551,1802,7552, 357,1395,1729, # 4902
|
||||
3651,3268,2418,1564,2237,7553,3083,3772,1633,4403,1114,2085,4141,1532,7554, 482, # 4918
|
||||
2446,4404,7555,7556,1492, 833,1466,7557,2717,3544,1641,2829,7558,1526,1272,3652, # 4934
|
||||
4142,1686,1794, 416,2556,1902,1953,1803,7559,3773,2785,3774,1159,2316,7560,2867, # 4950
|
||||
4405,1610,1584,3036,2419,2754, 443,3269,1163,3136,7561,7562,3926,7563,4143,2499, # 4966
|
||||
3037,4406,3927,3137,2103,1647,3545,2010,1872,4144,7564,4145, 431,3438,7565, 250, # 4982
|
||||
97, 81,4146,7566,1648,1850,1558, 160, 848,7567, 866, 740,1694,7568,2201,2830, # 4998
|
||||
3195,4147,4407,3653,1687, 950,2472, 426, 469,3196,3654,3655,3928,7569,7570,1188, # 5014
|
||||
424,1995, 861,3546,4148,3775,2202,2685, 168,1235,3547,4149,7571,2086,1674,4408, # 5030
|
||||
3337,3270, 220,2557,1009,7572,3776, 670,2992, 332,1208, 717,7573,7574,3548,2447, # 5046
|
||||
3929,3338,7575, 513,7576,1209,2868,3339,3138,4409,1080,7577,7578,7579,7580,2527, # 5062
|
||||
3656,3549, 815,1587,3930,3931,7581,3550,3439,3777,1254,4410,1328,3038,1390,3932, # 5078
|
||||
1741,3933,3778,3934,7582, 236,3779,2448,3271,7583,7584,3657,3780,1273,3781,4411, # 5094
|
||||
7585, 308,7586,4412, 245,4413,1851,2473,1307,2575, 430, 715,2136,2449,7587, 270, # 5110
|
||||
199,2869,3935,7588,3551,2718,1753, 761,1754, 725,1661,1840,4414,3440,3658,7589, # 5126
|
||||
7590, 587, 14,3272, 227,2598, 326, 480,2265, 943,2755,3552, 291, 650,1883,7591, # 5142
|
||||
1702,1226, 102,1547, 62,3441, 904,4415,3442,1164,4150,7592,7593,1224,1548,2756, # 5158
|
||||
391, 498,1493,7594,1386,1419,7595,2055,1177,4416, 813, 880,1081,2363, 566,1145, # 5174
|
||||
4417,2286,1001,1035,2558,2599,2238, 394,1286,7596,7597,2068,7598, 86,1494,1730, # 5190
|
||||
3936, 491,1588, 745, 897,2948, 843,3340,3937,2757,2870,3273,1768, 998,2217,2069, # 5206
|
||||
397,1826,1195,1969,3659,2993,3341, 284,7599,3782,2500,2137,2119,1903,7600,3938, # 5222
|
||||
2150,3939,4151,1036,3443,1904, 114,2559,4152, 209,1527,7601,7602,2949,2831,2625, # 5238
|
||||
2385,2719,3139, 812,2560,7603,3274,7604,1559, 737,1884,3660,1210, 885, 28,2686, # 5254
|
||||
3553,3783,7605,4153,1004,1779,4418,7606, 346,1981,2218,2687,4419,3784,1742, 797, # 5270
|
||||
1642,3940,1933,1072,1384,2151, 896,3941,3275,3661,3197,2871,3554,7607,2561,1958, # 5286
|
||||
4420,2450,1785,7608,7609,7610,3942,4154,1005,1308,3662,4155,2720,4421,4422,1528, # 5302
|
||||
2600, 161,1178,4156,1982, 987,4423,1101,4157, 631,3943,1157,3198,2420,1343,1241, # 5318
|
||||
1016,2239,2562, 372, 877,2339,2501,1160, 555,1934, 911,3944,7611, 466,1170, 169, # 5334
|
||||
1051,2907,2688,3663,2474,2994,1182,2011,2563,1251,2626,7612, 992,2340,3444,1540, # 5350
|
||||
2721,1201,2070,2401,1996,2475,7613,4424, 528,1922,2188,1503,1873,1570,2364,3342, # 5366
|
||||
3276,7614, 557,1073,7615,1827,3445,2087,2266,3140,3039,3084, 767,3085,2786,4425, # 5382
|
||||
1006,4158,4426,2341,1267,2176,3664,3199, 778,3945,3200,2722,1597,2657,7616,4427, # 5398
|
||||
7617,3446,7618,7619,7620,3277,2689,1433,3278, 131, 95,1504,3946, 723,4159,3141, # 5414
|
||||
1841,3555,2758,2189,3947,2027,2104,3665,7621,2995,3948,1218,7622,3343,3201,3949, # 5430
|
||||
4160,2576, 248,1634,3785, 912,7623,2832,3666,3040,3786, 654, 53,7624,2996,7625, # 5446
|
||||
1688,4428, 777,3447,1032,3950,1425,7626, 191, 820,2120,2833, 971,4429, 931,3202, # 5462
|
||||
135, 664, 783,3787,1997, 772,2908,1935,3951,3788,4430,2909,3203, 282,2723, 640, # 5478
|
||||
1372,3448,1127, 922, 325,3344,7627,7628, 711,2044,7629,7630,3952,2219,2787,1936, # 5494
|
||||
3953,3345,2220,2251,3789,2300,7631,4431,3790,1258,3279,3954,3204,2138,2950,3955, # 5510
|
||||
3956,7632,2221, 258,3205,4432, 101,1227,7633,3280,1755,7634,1391,3281,7635,2910, # 5526
|
||||
2056, 893,7636,7637,7638,1402,4161,2342,7639,7640,3206,3556,7641,7642, 878,1325, # 5542
|
||||
1780,2788,4433, 259,1385,2577, 744,1183,2267,4434,7643,3957,2502,7644, 684,1024, # 5558
|
||||
4162,7645, 472,3557,3449,1165,3282,3958,3959, 322,2152, 881, 455,1695,1152,1340, # 5574
|
||||
660, 554,2153,4435,1058,4436,4163, 830,1065,3346,3960,4437,1923,7646,1703,1918, # 5590
|
||||
7647, 932,2268, 122,7648,4438, 947, 677,7649,3791,2627, 297,1905,1924,2269,4439, # 5606
|
||||
2317,3283,7650,7651,4164,7652,4165, 84,4166, 112, 989,7653, 547,1059,3961, 701, # 5622
|
||||
3558,1019,7654,4167,7655,3450, 942, 639, 457,2301,2451, 993,2951, 407, 851, 494, # 5638
|
||||
4440,3347, 927,7656,1237,7657,2421,3348, 573,4168, 680, 921,2911,1279,1874, 285, # 5654
|
||||
790,1448,1983, 719,2167,7658,7659,4441,3962,3963,1649,7660,1541, 563,7661,1077, # 5670
|
||||
7662,3349,3041,3451, 511,2997,3964,3965,3667,3966,1268,2564,3350,3207,4442,4443, # 5686
|
||||
7663, 535,1048,1276,1189,2912,2028,3142,1438,1373,2834,2952,1134,2012,7664,4169, # 5702
|
||||
1238,2578,3086,1259,7665, 700,7666,2953,3143,3668,4170,7667,4171,1146,1875,1906, # 5718
|
||||
4444,2601,3967, 781,2422, 132,1589, 203, 147, 273,2789,2402, 898,1786,2154,3968, # 5734
|
||||
3969,7668,3792,2790,7669,7670,4445,4446,7671,3208,7672,1635,3793, 965,7673,1804, # 5750
|
||||
2690,1516,3559,1121,1082,1329,3284,3970,1449,3794, 65,1128,2835,2913,2759,1590, # 5766
|
||||
3795,7674,7675, 12,2658, 45, 976,2579,3144,4447, 517,2528,1013,1037,3209,7676, # 5782
|
||||
3796,2836,7677,3797,7678,3452,7679,2602, 614,1998,2318,3798,3087,2724,2628,7680, # 5798
|
||||
2580,4172, 599,1269,7681,1810,3669,7682,2691,3088, 759,1060, 489,1805,3351,3285, # 5814
|
||||
1358,7683,7684,2386,1387,1215,2629,2252, 490,7685,7686,4173,1759,2387,2343,7687, # 5830
|
||||
4448,3799,1907,3971,2630,1806,3210,4449,3453,3286,2760,2344, 874,7688,7689,3454, # 5846
|
||||
3670,1858, 91,2914,3671,3042,3800,4450,7690,3145,3972,2659,7691,3455,1202,1403, # 5862
|
||||
3801,2954,2529,1517,2503,4451,3456,2504,7692,4452,7693,2692,1885,1495,1731,3973, # 5878
|
||||
2365,4453,7694,2029,7695,7696,3974,2693,1216, 237,2581,4174,2319,3975,3802,4454, # 5894
|
||||
4455,2694,3560,3457, 445,4456,7697,7698,7699,7700,2761, 61,3976,3672,1822,3977, # 5910
|
||||
7701, 687,2045, 935, 925, 405,2660, 703,1096,1859,2725,4457,3978,1876,1367,2695, # 5926
|
||||
3352, 918,2105,1781,2476, 334,3287,1611,1093,4458, 564,3146,3458,3673,3353, 945, # 5942
|
||||
2631,2057,4459,7702,1925, 872,4175,7703,3459,2696,3089, 349,4176,3674,3979,4460, # 5958
|
||||
3803,4177,3675,2155,3980,4461,4462,4178,4463,2403,2046, 782,3981, 400, 251,4179, # 5974
|
||||
1624,7704,7705, 277,3676, 299,1265, 476,1191,3804,2121,4180,4181,1109, 205,7706, # 5990
|
||||
2582,1000,2156,3561,1860,7707,7708,7709,4464,7710,4465,2565, 107,2477,2157,3982, # 6006
|
||||
3460,3147,7711,1533, 541,1301, 158, 753,4182,2872,3562,7712,1696, 370,1088,4183, # 6022
|
||||
4466,3563, 579, 327, 440, 162,2240, 269,1937,1374,3461, 968,3043, 56,1396,3090, # 6038
|
||||
2106,3288,3354,7713,1926,2158,4467,2998,7714,3564,7715,7716,3677,4468,2478,7717, # 6054
|
||||
2791,7718,1650,4469,7719,2603,7720,7721,3983,2661,3355,1149,3356,3984,3805,3985, # 6070
|
||||
7722,1076, 49,7723, 951,3211,3289,3290, 450,2837, 920,7724,1811,2792,2366,4184, # 6086
|
||||
1908,1138,2367,3806,3462,7725,3212,4470,1909,1147,1518,2423,4471,3807,7726,4472, # 6102
|
||||
2388,2604, 260,1795,3213,7727,7728,3808,3291, 708,7729,3565,1704,7730,3566,1351, # 6118
|
||||
1618,3357,2999,1886, 944,4185,3358,4186,3044,3359,4187,7731,3678, 422, 413,1714, # 6134
|
||||
3292, 500,2058,2345,4188,2479,7732,1344,1910, 954,7733,1668,7734,7735,3986,2404, # 6150
|
||||
4189,3567,3809,4190,7736,2302,1318,2505,3091, 133,3092,2873,4473, 629, 31,2838, # 6166
|
||||
2697,3810,4474, 850, 949,4475,3987,2955,1732,2088,4191,1496,1852,7737,3988, 620, # 6182
|
||||
3214, 981,1242,3679,3360,1619,3680,1643,3293,2139,2452,1970,1719,3463,2168,7738, # 6198
|
||||
3215,7739,7740,3361,1828,7741,1277,4476,1565,2047,7742,1636,3568,3093,7743, 869, # 6214
|
||||
2839, 655,3811,3812,3094,3989,3000,3813,1310,3569,4477,7744,7745,7746,1733, 558, # 6230
|
||||
4478,3681, 335,1549,3045,1756,4192,3682,1945,3464,1829,1291,1192, 470,2726,2107, # 6246
|
||||
2793, 913,1054,3990,7747,1027,7748,3046,3991,4479, 982,2662,3362,3148,3465,3216, # 6262
|
||||
3217,1946,2794,7749, 571,4480,7750,1830,7751,3570,2583,1523,2424,7752,2089, 984, # 6278
|
||||
4481,3683,1959,7753,3684, 852, 923,2795,3466,3685, 969,1519, 999,2048,2320,1705, # 6294
|
||||
7754,3095, 615,1662, 151, 597,3992,2405,2321,1049, 275,4482,3686,4193, 568,3687, # 6310
|
||||
3571,2480,4194,3688,7755,2425,2270, 409,3218,7756,1566,2874,3467,1002, 769,2840, # 6326
|
||||
194,2090,3149,3689,2222,3294,4195, 628,1505,7757,7758,1763,2177,3001,3993, 521, # 6342
|
||||
1161,2584,1787,2203,2406,4483,3994,1625,4196,4197, 412, 42,3096, 464,7759,2632, # 6358
|
||||
4484,3363,1760,1571,2875,3468,2530,1219,2204,3814,2633,2140,2368,4485,4486,3295, # 6374
|
||||
1651,3364,3572,7760,7761,3573,2481,3469,7762,3690,7763,7764,2271,2091, 460,7765, # 6390
|
||||
4487,7766,3002, 962, 588,3574, 289,3219,2634,1116, 52,7767,3047,1796,7768,7769, # 6406
|
||||
7770,1467,7771,1598,1143,3691,4198,1984,1734,1067,4488,1280,3365, 465,4489,1572, # 6422
|
||||
510,7772,1927,2241,1812,1644,3575,7773,4490,3692,7774,7775,2663,1573,1534,7776, # 6438
|
||||
7777,4199, 536,1807,1761,3470,3815,3150,2635,7778,7779,7780,4491,3471,2915,1911, # 6454
|
||||
2796,7781,3296,1122, 377,3220,7782, 360,7783,7784,4200,1529, 551,7785,2059,3693, # 6470
|
||||
1769,2426,7786,2916,4201,3297,3097,2322,2108,2030,4492,1404, 136,1468,1479, 672, # 6486
|
||||
1171,3221,2303, 271,3151,7787,2762,7788,2049, 678,2727, 865,1947,4493,7789,2013, # 6502
|
||||
3995,2956,7790,2728,2223,1397,3048,3694,4494,4495,1735,2917,3366,3576,7791,3816, # 6518
|
||||
509,2841,2453,2876,3817,7792,7793,3152,3153,4496,4202,2531,4497,2304,1166,1010, # 6534
|
||||
552, 681,1887,7794,7795,2957,2958,3996,1287,1596,1861,3154, 358, 453, 736, 175, # 6550
|
||||
478,1117, 905,1167,1097,7796,1853,1530,7797,1706,7798,2178,3472,2287,3695,3473, # 6566
|
||||
3577,4203,2092,4204,7799,3367,1193,2482,4205,1458,2190,2205,1862,1888,1421,3298, # 6582
|
||||
2918,3049,2179,3474, 595,2122,7800,3997,7801,7802,4206,1707,2636, 223,3696,1359, # 6598
|
||||
751,3098, 183,3475,7803,2797,3003, 419,2369, 633, 704,3818,2389, 241,7804,7805, # 6614
|
||||
7806, 838,3004,3697,2272,2763,2454,3819,1938,2050,3998,1309,3099,2242,1181,7807, # 6630
|
||||
1136,2206,3820,2370,1446,4207,2305,4498,7808,7809,4208,1055,2605, 484,3698,7810, # 6646
|
||||
3999, 625,4209,2273,3368,1499,4210,4000,7811,4001,4211,3222,2274,2275,3476,7812, # 6662
|
||||
7813,2764, 808,2606,3699,3369,4002,4212,3100,2532, 526,3370,3821,4213, 955,7814, # 6678
|
||||
1620,4214,2637,2427,7815,1429,3700,1669,1831, 994, 928,7816,3578,1260,7817,7818, # 6694
|
||||
7819,1948,2288, 741,2919,1626,4215,2729,2455, 867,1184, 362,3371,1392,7820,7821, # 6710
|
||||
4003,4216,1770,1736,3223,2920,4499,4500,1928,2698,1459,1158,7822,3050,3372,2877, # 6726
|
||||
1292,1929,2506,2842,3701,1985,1187,2071,2014,2607,4217,7823,2566,2507,2169,3702, # 6742
|
||||
2483,3299,7824,3703,4501,7825,7826, 666,1003,3005,1022,3579,4218,7827,4502,1813, # 6758
|
||||
2253, 574,3822,1603, 295,1535, 705,3823,4219, 283, 858, 417,7828,7829,3224,4503, # 6774
|
||||
4504,3051,1220,1889,1046,2276,2456,4004,1393,1599, 689,2567, 388,4220,7830,2484, # 6790
|
||||
802,7831,2798,3824,2060,1405,2254,7832,4505,3825,2109,1052,1345,3225,1585,7833, # 6806
|
||||
809,7834,7835,7836, 575,2730,3477, 956,1552,1469,1144,2323,7837,2324,1560,2457, # 6822
|
||||
3580,3226,4005, 616,2207,3155,2180,2289,7838,1832,7839,3478,4506,7840,1319,3704, # 6838
|
||||
3705,1211,3581,1023,3227,1293,2799,7841,7842,7843,3826, 607,2306,3827, 762,2878, # 6854
|
||||
1439,4221,1360,7844,1485,3052,7845,4507,1038,4222,1450,2061,2638,4223,1379,4508, # 6870
|
||||
2585,7846,7847,4224,1352,1414,2325,2921,1172,7848,7849,3828,3829,7850,1797,1451, # 6886
|
||||
7851,7852,7853,7854,2922,4006,4007,2485,2346, 411,4008,4009,3582,3300,3101,4509, # 6902
|
||||
1561,2664,1452,4010,1375,7855,7856, 47,2959, 316,7857,1406,1591,2923,3156,7858, # 6918
|
||||
1025,2141,3102,3157, 354,2731, 884,2224,4225,2407, 508,3706, 726,3583, 996,2428, # 6934
|
||||
3584, 729,7859, 392,2191,1453,4011,4510,3707,7860,7861,2458,3585,2608,1675,2800, # 6950
|
||||
919,2347,2960,2348,1270,4511,4012, 73,7862,7863, 647,7864,3228,2843,2255,1550, # 6966
|
||||
1346,3006,7865,1332, 883,3479,7866,7867,7868,7869,3301,2765,7870,1212, 831,1347, # 6982
|
||||
4226,4512,2326,3830,1863,3053, 720,3831,4513,4514,3832,7871,4227,7872,7873,4515, # 6998
|
||||
7874,7875,1798,4516,3708,2609,4517,3586,1645,2371,7876,7877,2924, 669,2208,2665, # 7014
|
||||
2429,7878,2879,7879,7880,1028,3229,7881,4228,2408,7882,2256,1353,7883,7884,4518, # 7030
|
||||
3158, 518,7885,4013,7886,4229,1960,7887,2142,4230,7888,7889,3007,2349,2350,3833, # 7046
|
||||
516,1833,1454,4014,2699,4231,4519,2225,2610,1971,1129,3587,7890,2766,7891,2961, # 7062
|
||||
1422, 577,1470,3008,1524,3373,7892,7893, 432,4232,3054,3480,7894,2586,1455,2508, # 7078
|
||||
2226,1972,1175,7895,1020,2732,4015,3481,4520,7896,2733,7897,1743,1361,3055,3482, # 7094
|
||||
2639,4016,4233,4521,2290, 895, 924,4234,2170, 331,2243,3056, 166,1627,3057,1098, # 7110
|
||||
7898,1232,2880,2227,3374,4522, 657, 403,1196,2372, 542,3709,3375,1600,4235,3483, # 7126
|
||||
7899,4523,2767,3230, 576, 530,1362,7900,4524,2533,2666,3710,4017,7901, 842,3834, # 7142
|
||||
7902,2801,2031,1014,4018, 213,2700,3376, 665, 621,4236,7903,3711,2925,2430,7904, # 7158
|
||||
2431,3302,3588,3377,7905,4237,2534,4238,4525,3589,1682,4239,3484,1380,7906, 724, # 7174
|
||||
2277, 600,1670,7907,1337,1233,4526,3103,2244,7908,1621,4527,7909, 651,4240,7910, # 7190
|
||||
1612,4241,2611,7911,2844,7912,2734,2307,3058,7913, 716,2459,3059, 174,1255,2701, # 7206
|
||||
4019,3590, 548,1320,1398, 728,4020,1574,7914,1890,1197,3060,4021,7915,3061,3062, # 7222
|
||||
3712,3591,3713, 747,7916, 635,4242,4528,7917,7918,7919,4243,7920,7921,4529,7922, # 7238
|
||||
3378,4530,2432, 451,7923,3714,2535,2072,4244,2735,4245,4022,7924,1764,4531,7925, # 7254
|
||||
4246, 350,7926,2278,2390,2486,7927,4247,4023,2245,1434,4024, 488,4532, 458,4248, # 7270
|
||||
4025,3715, 771,1330,2391,3835,2568,3159,2159,2409,1553,2667,3160,4249,7928,2487, # 7286
|
||||
2881,2612,1720,2702,4250,3379,4533,7929,2536,4251,7930,3231,4252,2768,7931,2015, # 7302
|
||||
2736,7932,1155,1017,3716,3836,7933,3303,2308, 201,1864,4253,1430,7934,4026,7935, # 7318
|
||||
7936,7937,7938,7939,4254,1604,7940, 414,1865, 371,2587,4534,4535,3485,2016,3104, # 7334
|
||||
4536,1708, 960,4255, 887, 389,2171,1536,1663,1721,7941,2228,4027,2351,2926,1580, # 7350
|
||||
7942,7943,7944,1744,7945,2537,4537,4538,7946,4539,7947,2073,7948,7949,3592,3380, # 7366
|
||||
2882,4256,7950,4257,2640,3381,2802, 673,2703,2460, 709,3486,4028,3593,4258,7951, # 7382
|
||||
1148, 502, 634,7952,7953,1204,4540,3594,1575,4541,2613,3717,7954,3718,3105, 948, # 7398
|
||||
3232, 121,1745,3837,1110,7955,4259,3063,2509,3009,4029,3719,1151,1771,3838,1488, # 7414
|
||||
4030,1986,7956,2433,3487,7957,7958,2093,7959,4260,3839,1213,1407,2803, 531,2737, # 7430
|
||||
2538,3233,1011,1537,7960,2769,4261,3106,1061,7961,3720,3721,1866,2883,7962,2017, # 7446
|
||||
120,4262,4263,2062,3595,3234,2309,3840,2668,3382,1954,4542,7963,7964,3488,1047, # 7462
|
||||
2704,1266,7965,1368,4543,2845, 649,3383,3841,2539,2738,1102,2846,2669,7966,7967, # 7478
|
||||
1999,7968,1111,3596,2962,7969,2488,3842,3597,2804,1854,3384,3722,7970,7971,3385, # 7494
|
||||
2410,2884,3304,3235,3598,7972,2569,7973,3599,2805,4031,1460, 856,7974,3600,7975, # 7510
|
||||
2885,2963,7976,2886,3843,7977,4264, 632,2510, 875,3844,1697,3845,2291,7978,7979, # 7526
|
||||
4544,3010,1239, 580,4545,4265,7980, 914, 936,2074,1190,4032,1039,2123,7981,7982, # 7542
|
||||
7983,3386,1473,7984,1354,4266,3846,7985,2172,3064,4033, 915,3305,4267,4268,3306, # 7558
|
||||
1605,1834,7986,2739, 398,3601,4269,3847,4034, 328,1912,2847,4035,3848,1331,4270, # 7574
|
||||
3011, 937,4271,7987,3602,4036,4037,3387,2160,4546,3388, 524, 742, 538,3065,1012, # 7590
|
||||
7988,7989,3849,2461,7990, 658,1103, 225,3850,7991,7992,4547,7993,4548,7994,3236, # 7606
|
||||
1243,7995,4038, 963,2246,4549,7996,2705,3603,3161,7997,7998,2588,2327,7999,4550, # 7622
|
||||
8000,8001,8002,3489,3307, 957,3389,2540,2032,1930,2927,2462, 870,2018,3604,1746, # 7638
|
||||
2770,2771,2434,2463,8003,3851,8004,3723,3107,3724,3490,3390,3725,8005,1179,3066, # 7654
|
||||
8006,3162,2373,4272,3726,2541,3163,3108,2740,4039,8007,3391,1556,2542,2292, 977, # 7670
|
||||
2887,2033,4040,1205,3392,8008,1765,3393,3164,2124,1271,1689, 714,4551,3491,8009, # 7686
|
||||
2328,3852, 533,4273,3605,2181, 617,8010,2464,3308,3492,2310,8011,8012,3165,8013, # 7702
|
||||
8014,3853,1987, 618, 427,2641,3493,3394,8015,8016,1244,1690,8017,2806,4274,4552, # 7718
|
||||
8018,3494,8019,8020,2279,1576, 473,3606,4275,3395, 972,8021,3607,8022,3067,8023, # 7734
|
||||
8024,4553,4554,8025,3727,4041,4042,8026, 153,4555, 356,8027,1891,2888,4276,2143, # 7750
|
||||
408, 803,2352,8028,3854,8029,4277,1646,2570,2511,4556,4557,3855,8030,3856,4278, # 7766
|
||||
8031,2411,3396, 752,8032,8033,1961,2964,8034, 746,3012,2465,8035,4279,3728, 698, # 7782
|
||||
4558,1892,4280,3608,2543,4559,3609,3857,8036,3166,3397,8037,1823,1302,4043,2706, # 7798
|
||||
3858,1973,4281,8038,4282,3167, 823,1303,1288,1236,2848,3495,4044,3398, 774,3859, # 7814
|
||||
8039,1581,4560,1304,2849,3860,4561,8040,2435,2161,1083,3237,4283,4045,4284, 344, # 7830
|
||||
1173, 288,2311, 454,1683,8041,8042,1461,4562,4046,2589,8043,8044,4563, 985, 894, # 7846
|
||||
8045,3399,3168,8046,1913,2928,3729,1988,8047,2110,1974,8048,4047,8049,2571,1194, # 7862
|
||||
425,8050,4564,3169,1245,3730,4285,8051,8052,2850,8053, 636,4565,1855,3861, 760, # 7878
|
||||
1799,8054,4286,2209,1508,4566,4048,1893,1684,2293,8055,8056,8057,4287,4288,2210, # 7894
|
||||
479,8058,8059, 832,8060,4049,2489,8061,2965,2490,3731, 990,3109, 627,1814,2642, # 7910
|
||||
4289,1582,4290,2125,2111,3496,4567,8062, 799,4291,3170,8063,4568,2112,1737,3013, # 7926
|
||||
1018, 543, 754,4292,3309,1676,4569,4570,4050,8064,1489,8065,3497,8066,2614,2889, # 7942
|
||||
4051,8067,8068,2966,8069,8070,8071,8072,3171,4571,4572,2182,1722,8073,3238,3239, # 7958
|
||||
1842,3610,1715, 481, 365,1975,1856,8074,8075,1962,2491,4573,8076,2126,3611,3240, # 7974
|
||||
433,1894,2063,2075,8077, 602,2741,8078,8079,8080,8081,8082,3014,1628,3400,8083, # 7990
|
||||
3172,4574,4052,2890,4575,2512,8084,2544,2772,8085,8086,8087,3310,4576,2891,8088, # 8006
|
||||
4577,8089,2851,4578,4579,1221,2967,4053,2513,8090,8091,8092,1867,1989,8093,8094, # 8022
|
||||
8095,1895,8096,8097,4580,1896,4054, 318,8098,2094,4055,4293,8099,8100, 485,8101, # 8038
|
||||
938,3862, 553,2670, 116,8102,3863,3612,8103,3498,2671,2773,3401,3311,2807,8104, # 8054
|
||||
3613,2929,4056,1747,2930,2968,8105,8106, 207,8107,8108,2672,4581,2514,8109,3015, # 8070
|
||||
890,3614,3864,8110,1877,3732,3402,8111,2183,2353,3403,1652,8112,8113,8114, 941, # 8086
|
||||
2294, 208,3499,4057,2019, 330,4294,3865,2892,2492,3733,4295,8115,8116,8117,8118, # 8102
|
||||
)
|
||||
# Char to FreqOrder table
|
||||
EUCTW_TABLE_SIZE = 8102
|
||||
|
||||
EUCTWCharToFreqOrder = (
|
||||
1,1800,1506, 255,1431, 198, 9, 82, 6,7310, 177, 202,3615,1256,2808, 110, # 2742
|
||||
3735, 33,3241, 261, 76, 44,2113, 16,2931,2184,1176, 659,3868, 26,3404,2643, # 2758
|
||||
1198,3869,3313,4060, 410,2211, 302, 590, 361,1963, 8, 204, 58,4296,7311,1931, # 2774
|
||||
63,7312,7313, 317,1614, 75, 222, 159,4061,2412,1480,7314,3500,3068, 224,2809, # 2790
|
||||
3616, 3, 10,3870,1471, 29,2774,1135,2852,1939, 873, 130,3242,1123, 312,7315, # 2806
|
||||
4297,2051, 507, 252, 682,7316, 142,1914, 124, 206,2932, 34,3501,3173, 64, 604, # 2822
|
||||
7317,2494,1976,1977, 155,1990, 645, 641,1606,7318,3405, 337, 72, 406,7319, 80, # 2838
|
||||
630, 238,3174,1509, 263, 939,1092,2644, 756,1440,1094,3406, 449, 69,2969, 591, # 2854
|
||||
179,2095, 471, 115,2034,1843, 60, 50,2970, 134, 806,1868, 734,2035,3407, 180, # 2870
|
||||
995,1607, 156, 537,2893, 688,7320, 319,1305, 779,2144, 514,2374, 298,4298, 359, # 2886
|
||||
2495, 90,2707,1338, 663, 11, 906,1099,2545, 20,2436, 182, 532,1716,7321, 732, # 2902
|
||||
1376,4062,1311,1420,3175, 25,2312,1056, 113, 399, 382,1949, 242,3408,2467, 529, # 2918
|
||||
3243, 475,1447,3617,7322, 117, 21, 656, 810,1297,2295,2329,3502,7323, 126,4063, # 2934
|
||||
706, 456, 150, 613,4299, 71,1118,2036,4064, 145,3069, 85, 835, 486,2114,1246, # 2950
|
||||
1426, 428, 727,1285,1015, 800, 106, 623, 303,1281,7324,2127,2354, 347,3736, 221, # 2966
|
||||
3503,3110,7325,1955,1153,4065, 83, 296,1199,3070, 192, 624, 93,7326, 822,1897, # 2982
|
||||
2810,3111, 795,2064, 991,1554,1542,1592, 27, 43,2853, 859, 139,1456, 860,4300, # 2998
|
||||
437, 712,3871, 164,2392,3112, 695, 211,3017,2096, 195,3872,1608,3504,3505,3618, # 3014
|
||||
3873, 234, 811,2971,2097,3874,2229,1441,3506,1615,2375, 668,2076,1638, 305, 228, # 3030
|
||||
1664,4301, 467, 415,7327, 262,2098,1593, 239, 108, 300, 200,1033, 512,1247,2077, # 3046
|
||||
7328,7329,2173,3176,3619,2673, 593, 845,1062,3244, 88,1723,2037,3875,1950, 212, # 3062
|
||||
266, 152, 149, 468,1898,4066,4302, 77, 187,7330,3018, 37, 5,2972,7331,3876, # 3078
|
||||
7332,7333, 39,2517,4303,2894,3177,2078, 55, 148, 74,4304, 545, 483,1474,1029, # 3094
|
||||
1665, 217,1869,1531,3113,1104,2645,4067, 24, 172,3507, 900,3877,3508,3509,4305, # 3110
|
||||
32,1408,2811,1312, 329, 487,2355,2247,2708, 784,2674, 4,3019,3314,1427,1788, # 3126
|
||||
188, 109, 499,7334,3620,1717,1789, 888,1217,3020,4306,7335,3510,7336,3315,1520, # 3142
|
||||
3621,3878, 196,1034, 775,7337,7338, 929,1815, 249, 439, 38,7339,1063,7340, 794, # 3158
|
||||
3879,1435,2296, 46, 178,3245,2065,7341,2376,7342, 214,1709,4307, 804, 35, 707, # 3174
|
||||
324,3622,1601,2546, 140, 459,4068,7343,7344,1365, 839, 272, 978,2257,2572,3409, # 3190
|
||||
2128,1363,3623,1423, 697, 100,3071, 48, 70,1231, 495,3114,2193,7345,1294,7346, # 3206
|
||||
2079, 462, 586,1042,3246, 853, 256, 988, 185,2377,3410,1698, 434,1084,7347,3411, # 3222
|
||||
314,2615,2775,4308,2330,2331, 569,2280, 637,1816,2518, 757,1162,1878,1616,3412, # 3238
|
||||
287,1577,2115, 768,4309,1671,2854,3511,2519,1321,3737, 909,2413,7348,4069, 933, # 3254
|
||||
3738,7349,2052,2356,1222,4310, 765,2414,1322, 786,4311,7350,1919,1462,1677,2895, # 3270
|
||||
1699,7351,4312,1424,2437,3115,3624,2590,3316,1774,1940,3413,3880,4070, 309,1369, # 3286
|
||||
1130,2812, 364,2230,1653,1299,3881,3512,3882,3883,2646, 525,1085,3021, 902,2000, # 3302
|
||||
1475, 964,4313, 421,1844,1415,1057,2281, 940,1364,3116, 376,4314,4315,1381, 7, # 3318
|
||||
2520, 983,2378, 336,1710,2675,1845, 321,3414, 559,1131,3022,2742,1808,1132,1313, # 3334
|
||||
265,1481,1857,7352, 352,1203,2813,3247, 167,1089, 420,2814, 776, 792,1724,3513, # 3350
|
||||
4071,2438,3248,7353,4072,7354, 446, 229, 333,2743, 901,3739,1200,1557,4316,2647, # 3366
|
||||
1920, 395,2744,2676,3740,4073,1835, 125, 916,3178,2616,4317,7355,7356,3741,7357, # 3382
|
||||
7358,7359,4318,3117,3625,1133,2547,1757,3415,1510,2313,1409,3514,7360,2145, 438, # 3398
|
||||
2591,2896,2379,3317,1068, 958,3023, 461, 311,2855,2677,4074,1915,3179,4075,1978, # 3414
|
||||
383, 750,2745,2617,4076, 274, 539, 385,1278,1442,7361,1154,1964, 384, 561, 210, # 3430
|
||||
98,1295,2548,3515,7362,1711,2415,1482,3416,3884,2897,1257, 129,7363,3742, 642, # 3446
|
||||
523,2776,2777,2648,7364, 141,2231,1333, 68, 176, 441, 876, 907,4077, 603,2592, # 3462
|
||||
710, 171,3417, 404, 549, 18,3118,2393,1410,3626,1666,7365,3516,4319,2898,4320, # 3478
|
||||
7366,2973, 368,7367, 146, 366, 99, 871,3627,1543, 748, 807,1586,1185, 22,2258, # 3494
|
||||
379,3743,3180,7368,3181, 505,1941,2618,1991,1382,2314,7369, 380,2357, 218, 702, # 3510
|
||||
1817,1248,3418,3024,3517,3318,3249,7370,2974,3628, 930,3250,3744,7371, 59,7372, # 3526
|
||||
585, 601,4078, 497,3419,1112,1314,4321,1801,7373,1223,1472,2174,7374, 749,1836, # 3542
|
||||
690,1899,3745,1772,3885,1476, 429,1043,1790,2232,2116, 917,4079, 447,1086,1629, # 3558
|
||||
7375, 556,7376,7377,2020,1654, 844,1090, 105, 550, 966,1758,2815,1008,1782, 686, # 3574
|
||||
1095,7378,2282, 793,1602,7379,3518,2593,4322,4080,2933,2297,4323,3746, 980,2496, # 3590
|
||||
544, 353, 527,4324, 908,2678,2899,7380, 381,2619,1942,1348,7381,1341,1252, 560, # 3606
|
||||
3072,7382,3420,2856,7383,2053, 973, 886,2080, 143,4325,7384,7385, 157,3886, 496, # 3622
|
||||
4081, 57, 840, 540,2038,4326,4327,3421,2117,1445, 970,2259,1748,1965,2081,4082, # 3638
|
||||
3119,1234,1775,3251,2816,3629, 773,1206,2129,1066,2039,1326,3887,1738,1725,4083, # 3654
|
||||
279,3120, 51,1544,2594, 423,1578,2130,2066, 173,4328,1879,7386,7387,1583, 264, # 3670
|
||||
610,3630,4329,2439, 280, 154,7388,7389,7390,1739, 338,1282,3073, 693,2857,1411, # 3686
|
||||
1074,3747,2440,7391,4330,7392,7393,1240, 952,2394,7394,2900,1538,2679, 685,1483, # 3702
|
||||
4084,2468,1436, 953,4085,2054,4331, 671,2395, 79,4086,2441,3252, 608, 567,2680, # 3718
|
||||
3422,4087,4088,1691, 393,1261,1791,2396,7395,4332,7396,7397,7398,7399,1383,1672, # 3734
|
||||
3748,3182,1464, 522,1119, 661,1150, 216, 675,4333,3888,1432,3519, 609,4334,2681, # 3750
|
||||
2397,7400,7401,7402,4089,3025, 0,7403,2469, 315, 231,2442, 301,3319,4335,2380, # 3766
|
||||
7404, 233,4090,3631,1818,4336,4337,7405, 96,1776,1315,2082,7406, 257,7407,1809, # 3782
|
||||
3632,2709,1139,1819,4091,2021,1124,2163,2778,1777,2649,7408,3074, 363,1655,3183, # 3798
|
||||
7409,2975,7410,7411,7412,3889,1567,3890, 718, 103,3184, 849,1443, 341,3320,2934, # 3814
|
||||
1484,7413,1712, 127, 67, 339,4092,2398, 679,1412, 821,7414,7415, 834, 738, 351, # 3830
|
||||
2976,2146, 846, 235,1497,1880, 418,1992,3749,2710, 186,1100,2147,2746,3520,1545, # 3846
|
||||
1355,2935,2858,1377, 583,3891,4093,2573,2977,7416,1298,3633,1078,2549,3634,2358, # 3862
|
||||
78,3750,3751, 267,1289,2099,2001,1594,4094, 348, 369,1274,2194,2175,1837,4338, # 3878
|
||||
1820,2817,3635,2747,2283,2002,4339,2936,2748, 144,3321, 882,4340,3892,2749,3423, # 3894
|
||||
4341,2901,7417,4095,1726, 320,7418,3893,3026, 788,2978,7419,2818,1773,1327,2859, # 3910
|
||||
3894,2819,7420,1306,4342,2003,1700,3752,3521,2359,2650, 787,2022, 506, 824,3636, # 3926
|
||||
534, 323,4343,1044,3322,2023,1900, 946,3424,7421,1778,1500,1678,7422,1881,4344, # 3942
|
||||
165, 243,4345,3637,2521, 123, 683,4096, 764,4346, 36,3895,1792, 589,2902, 816, # 3958
|
||||
626,1667,3027,2233,1639,1555,1622,3753,3896,7423,3897,2860,1370,1228,1932, 891, # 3974
|
||||
2083,2903, 304,4097,7424, 292,2979,2711,3522, 691,2100,4098,1115,4347, 118, 662, # 3990
|
||||
7425, 611,1156, 854,2381,1316,2861, 2, 386, 515,2904,7426,7427,3253, 868,2234, # 4006
|
||||
1486, 855,2651, 785,2212,3028,7428,1040,3185,3523,7429,3121, 448,7430,1525,7431, # 4022
|
||||
2164,4348,7432,3754,7433,4099,2820,3524,3122, 503, 818,3898,3123,1568, 814, 676, # 4038
|
||||
1444, 306,1749,7434,3755,1416,1030, 197,1428, 805,2821,1501,4349,7435,7436,7437, # 4054
|
||||
1993,7438,4350,7439,7440,2195, 13,2779,3638,2980,3124,1229,1916,7441,3756,2131, # 4070
|
||||
7442,4100,4351,2399,3525,7443,2213,1511,1727,1120,7444,7445, 646,3757,2443, 307, # 4086
|
||||
7446,7447,1595,3186,7448,7449,7450,3639,1113,1356,3899,1465,2522,2523,7451, 519, # 4102
|
||||
7452, 128,2132, 92,2284,1979,7453,3900,1512, 342,3125,2196,7454,2780,2214,1980, # 4118
|
||||
3323,7455, 290,1656,1317, 789, 827,2360,7456,3758,4352, 562, 581,3901,7457, 401, # 4134
|
||||
4353,2248, 94,4354,1399,2781,7458,1463,2024,4355,3187,1943,7459, 828,1105,4101, # 4150
|
||||
1262,1394,7460,4102, 605,4356,7461,1783,2862,7462,2822, 819,2101, 578,2197,2937, # 4166
|
||||
7463,1502, 436,3254,4103,3255,2823,3902,2905,3425,3426,7464,2712,2315,7465,7466, # 4182
|
||||
2332,2067, 23,4357, 193, 826,3759,2102, 699,1630,4104,3075, 390,1793,1064,3526, # 4198
|
||||
7467,1579,3076,3077,1400,7468,4105,1838,1640,2863,7469,4358,4359, 137,4106, 598, # 4214
|
||||
3078,1966, 780, 104, 974,2938,7470, 278, 899, 253, 402, 572, 504, 493,1339,7471, # 4230
|
||||
3903,1275,4360,2574,2550,7472,3640,3029,3079,2249, 565,1334,2713, 863, 41,7473, # 4246
|
||||
7474,4361,7475,1657,2333, 19, 463,2750,4107, 606,7476,2981,3256,1087,2084,1323, # 4262
|
||||
2652,2982,7477,1631,1623,1750,4108,2682,7478,2864, 791,2714,2653,2334, 232,2416, # 4278
|
||||
7479,2983,1498,7480,2654,2620, 755,1366,3641,3257,3126,2025,1609, 119,1917,3427, # 4294
|
||||
862,1026,4109,7481,3904,3760,4362,3905,4363,2260,1951,2470,7482,1125, 817,4110, # 4310
|
||||
4111,3906,1513,1766,2040,1487,4112,3030,3258,2824,3761,3127,7483,7484,1507,7485, # 4326
|
||||
2683, 733, 40,1632,1106,2865, 345,4113, 841,2524, 230,4364,2984,1846,3259,3428, # 4342
|
||||
7486,1263, 986,3429,7487, 735, 879, 254,1137, 857, 622,1300,1180,1388,1562,3907, # 4358
|
||||
3908,2939, 967,2751,2655,1349, 592,2133,1692,3324,2985,1994,4114,1679,3909,1901, # 4374
|
||||
2185,7488, 739,3642,2715,1296,1290,7489,4115,2198,2199,1921,1563,2595,2551,1870, # 4390
|
||||
2752,2986,7490, 435,7491, 343,1108, 596, 17,1751,4365,2235,3430,3643,7492,4366, # 4406
|
||||
294,3527,2940,1693, 477, 979, 281,2041,3528, 643,2042,3644,2621,2782,2261,1031, # 4422
|
||||
2335,2134,2298,3529,4367, 367,1249,2552,7493,3530,7494,4368,1283,3325,2004, 240, # 4438
|
||||
1762,3326,4369,4370, 836,1069,3128, 474,7495,2148,2525, 268,3531,7496,3188,1521, # 4454
|
||||
1284,7497,1658,1546,4116,7498,3532,3533,7499,4117,3327,2684,1685,4118, 961,1673, # 4470
|
||||
2622, 190,2005,2200,3762,4371,4372,7500, 570,2497,3645,1490,7501,4373,2623,3260, # 4486
|
||||
1956,4374, 584,1514, 396,1045,1944,7502,4375,1967,2444,7503,7504,4376,3910, 619, # 4502
|
||||
7505,3129,3261, 215,2006,2783,2553,3189,4377,3190,4378, 763,4119,3763,4379,7506, # 4518
|
||||
7507,1957,1767,2941,3328,3646,1174, 452,1477,4380,3329,3130,7508,2825,1253,2382, # 4534
|
||||
2186,1091,2285,4120, 492,7509, 638,1169,1824,2135,1752,3911, 648, 926,1021,1324, # 4550
|
||||
4381, 520,4382, 997, 847,1007, 892,4383,3764,2262,1871,3647,7510,2400,1784,4384, # 4566
|
||||
1952,2942,3080,3191,1728,4121,2043,3648,4385,2007,1701,3131,1551, 30,2263,4122, # 4582
|
||||
7511,2026,4386,3534,7512, 501,7513,4123, 594,3431,2165,1821,3535,3432,3536,3192, # 4598
|
||||
829,2826,4124,7514,1680,3132,1225,4125,7515,3262,4387,4126,3133,2336,7516,4388, # 4614
|
||||
4127,7517,3912,3913,7518,1847,2383,2596,3330,7519,4389, 374,3914, 652,4128,4129, # 4630
|
||||
375,1140, 798,7520,7521,7522,2361,4390,2264, 546,1659, 138,3031,2445,4391,7523, # 4646
|
||||
2250, 612,1848, 910, 796,3765,1740,1371, 825,3766,3767,7524,2906,2554,7525, 692, # 4662
|
||||
444,3032,2624, 801,4392,4130,7526,1491, 244,1053,3033,4131,4132, 340,7527,3915, # 4678
|
||||
1041,2987, 293,1168, 87,1357,7528,1539, 959,7529,2236, 721, 694,4133,3768, 219, # 4694
|
||||
1478, 644,1417,3331,2656,1413,1401,1335,1389,3916,7530,7531,2988,2362,3134,1825, # 4710
|
||||
730,1515, 184,2827, 66,4393,7532,1660,2943, 246,3332, 378,1457, 226,3433, 975, # 4726
|
||||
3917,2944,1264,3537, 674, 696,7533, 163,7534,1141,2417,2166, 713,3538,3333,4394, # 4742
|
||||
3918,7535,7536,1186, 15,7537,1079,1070,7538,1522,3193,3539, 276,1050,2716, 758, # 4758
|
||||
1126, 653,2945,3263,7539,2337, 889,3540,3919,3081,2989, 903,1250,4395,3920,3434, # 4774
|
||||
3541,1342,1681,1718, 766,3264, 286, 89,2946,3649,7540,1713,7541,2597,3334,2990, # 4790
|
||||
7542,2947,2215,3194,2866,7543,4396,2498,2526, 181, 387,1075,3921, 731,2187,3335, # 4806
|
||||
7544,3265, 310, 313,3435,2299, 770,4134, 54,3034, 189,4397,3082,3769,3922,7545, # 4822
|
||||
1230,1617,1849, 355,3542,4135,4398,3336, 111,4136,3650,1350,3135,3436,3035,4137, # 4838
|
||||
2149,3266,3543,7546,2784,3923,3924,2991, 722,2008,7547,1071, 247,1207,2338,2471, # 4854
|
||||
1378,4399,2009, 864,1437,1214,4400, 373,3770,1142,2216, 667,4401, 442,2753,2555, # 4870
|
||||
3771,3925,1968,4138,3267,1839, 837, 170,1107, 934,1336,1882,7548,7549,2118,4139, # 4886
|
||||
2828, 743,1569,7550,4402,4140, 582,2384,1418,3437,7551,1802,7552, 357,1395,1729, # 4902
|
||||
3651,3268,2418,1564,2237,7553,3083,3772,1633,4403,1114,2085,4141,1532,7554, 482, # 4918
|
||||
2446,4404,7555,7556,1492, 833,1466,7557,2717,3544,1641,2829,7558,1526,1272,3652, # 4934
|
||||
4142,1686,1794, 416,2556,1902,1953,1803,7559,3773,2785,3774,1159,2316,7560,2867, # 4950
|
||||
4405,1610,1584,3036,2419,2754, 443,3269,1163,3136,7561,7562,3926,7563,4143,2499, # 4966
|
||||
3037,4406,3927,3137,2103,1647,3545,2010,1872,4144,7564,4145, 431,3438,7565, 250, # 4982
|
||||
97, 81,4146,7566,1648,1850,1558, 160, 848,7567, 866, 740,1694,7568,2201,2830, # 4998
|
||||
3195,4147,4407,3653,1687, 950,2472, 426, 469,3196,3654,3655,3928,7569,7570,1188, # 5014
|
||||
424,1995, 861,3546,4148,3775,2202,2685, 168,1235,3547,4149,7571,2086,1674,4408, # 5030
|
||||
3337,3270, 220,2557,1009,7572,3776, 670,2992, 332,1208, 717,7573,7574,3548,2447, # 5046
|
||||
3929,3338,7575, 513,7576,1209,2868,3339,3138,4409,1080,7577,7578,7579,7580,2527, # 5062
|
||||
3656,3549, 815,1587,3930,3931,7581,3550,3439,3777,1254,4410,1328,3038,1390,3932, # 5078
|
||||
1741,3933,3778,3934,7582, 236,3779,2448,3271,7583,7584,3657,3780,1273,3781,4411, # 5094
|
||||
7585, 308,7586,4412, 245,4413,1851,2473,1307,2575, 430, 715,2136,2449,7587, 270, # 5110
|
||||
199,2869,3935,7588,3551,2718,1753, 761,1754, 725,1661,1840,4414,3440,3658,7589, # 5126
|
||||
7590, 587, 14,3272, 227,2598, 326, 480,2265, 943,2755,3552, 291, 650,1883,7591, # 5142
|
||||
1702,1226, 102,1547, 62,3441, 904,4415,3442,1164,4150,7592,7593,1224,1548,2756, # 5158
|
||||
391, 498,1493,7594,1386,1419,7595,2055,1177,4416, 813, 880,1081,2363, 566,1145, # 5174
|
||||
4417,2286,1001,1035,2558,2599,2238, 394,1286,7596,7597,2068,7598, 86,1494,1730, # 5190
|
||||
3936, 491,1588, 745, 897,2948, 843,3340,3937,2757,2870,3273,1768, 998,2217,2069, # 5206
|
||||
397,1826,1195,1969,3659,2993,3341, 284,7599,3782,2500,2137,2119,1903,7600,3938, # 5222
|
||||
2150,3939,4151,1036,3443,1904, 114,2559,4152, 209,1527,7601,7602,2949,2831,2625, # 5238
|
||||
2385,2719,3139, 812,2560,7603,3274,7604,1559, 737,1884,3660,1210, 885, 28,2686, # 5254
|
||||
3553,3783,7605,4153,1004,1779,4418,7606, 346,1981,2218,2687,4419,3784,1742, 797, # 5270
|
||||
1642,3940,1933,1072,1384,2151, 896,3941,3275,3661,3197,2871,3554,7607,2561,1958, # 5286
|
||||
4420,2450,1785,7608,7609,7610,3942,4154,1005,1308,3662,4155,2720,4421,4422,1528, # 5302
|
||||
2600, 161,1178,4156,1982, 987,4423,1101,4157, 631,3943,1157,3198,2420,1343,1241, # 5318
|
||||
1016,2239,2562, 372, 877,2339,2501,1160, 555,1934, 911,3944,7611, 466,1170, 169, # 5334
|
||||
1051,2907,2688,3663,2474,2994,1182,2011,2563,1251,2626,7612, 992,2340,3444,1540, # 5350
|
||||
2721,1201,2070,2401,1996,2475,7613,4424, 528,1922,2188,1503,1873,1570,2364,3342, # 5366
|
||||
3276,7614, 557,1073,7615,1827,3445,2087,2266,3140,3039,3084, 767,3085,2786,4425, # 5382
|
||||
1006,4158,4426,2341,1267,2176,3664,3199, 778,3945,3200,2722,1597,2657,7616,4427, # 5398
|
||||
7617,3446,7618,7619,7620,3277,2689,1433,3278, 131, 95,1504,3946, 723,4159,3141, # 5414
|
||||
1841,3555,2758,2189,3947,2027,2104,3665,7621,2995,3948,1218,7622,3343,3201,3949, # 5430
|
||||
4160,2576, 248,1634,3785, 912,7623,2832,3666,3040,3786, 654, 53,7624,2996,7625, # 5446
|
||||
1688,4428, 777,3447,1032,3950,1425,7626, 191, 820,2120,2833, 971,4429, 931,3202, # 5462
|
||||
135, 664, 783,3787,1997, 772,2908,1935,3951,3788,4430,2909,3203, 282,2723, 640, # 5478
|
||||
1372,3448,1127, 922, 325,3344,7627,7628, 711,2044,7629,7630,3952,2219,2787,1936, # 5494
|
||||
3953,3345,2220,2251,3789,2300,7631,4431,3790,1258,3279,3954,3204,2138,2950,3955, # 5510
|
||||
3956,7632,2221, 258,3205,4432, 101,1227,7633,3280,1755,7634,1391,3281,7635,2910, # 5526
|
||||
2056, 893,7636,7637,7638,1402,4161,2342,7639,7640,3206,3556,7641,7642, 878,1325, # 5542
|
||||
1780,2788,4433, 259,1385,2577, 744,1183,2267,4434,7643,3957,2502,7644, 684,1024, # 5558
|
||||
4162,7645, 472,3557,3449,1165,3282,3958,3959, 322,2152, 881, 455,1695,1152,1340, # 5574
|
||||
660, 554,2153,4435,1058,4436,4163, 830,1065,3346,3960,4437,1923,7646,1703,1918, # 5590
|
||||
7647, 932,2268, 122,7648,4438, 947, 677,7649,3791,2627, 297,1905,1924,2269,4439, # 5606
|
||||
2317,3283,7650,7651,4164,7652,4165, 84,4166, 112, 989,7653, 547,1059,3961, 701, # 5622
|
||||
3558,1019,7654,4167,7655,3450, 942, 639, 457,2301,2451, 993,2951, 407, 851, 494, # 5638
|
||||
4440,3347, 927,7656,1237,7657,2421,3348, 573,4168, 680, 921,2911,1279,1874, 285, # 5654
|
||||
790,1448,1983, 719,2167,7658,7659,4441,3962,3963,1649,7660,1541, 563,7661,1077, # 5670
|
||||
7662,3349,3041,3451, 511,2997,3964,3965,3667,3966,1268,2564,3350,3207,4442,4443, # 5686
|
||||
7663, 535,1048,1276,1189,2912,2028,3142,1438,1373,2834,2952,1134,2012,7664,4169, # 5702
|
||||
1238,2578,3086,1259,7665, 700,7666,2953,3143,3668,4170,7667,4171,1146,1875,1906, # 5718
|
||||
4444,2601,3967, 781,2422, 132,1589, 203, 147, 273,2789,2402, 898,1786,2154,3968, # 5734
|
||||
3969,7668,3792,2790,7669,7670,4445,4446,7671,3208,7672,1635,3793, 965,7673,1804, # 5750
|
||||
2690,1516,3559,1121,1082,1329,3284,3970,1449,3794, 65,1128,2835,2913,2759,1590, # 5766
|
||||
3795,7674,7675, 12,2658, 45, 976,2579,3144,4447, 517,2528,1013,1037,3209,7676, # 5782
|
||||
3796,2836,7677,3797,7678,3452,7679,2602, 614,1998,2318,3798,3087,2724,2628,7680, # 5798
|
||||
2580,4172, 599,1269,7681,1810,3669,7682,2691,3088, 759,1060, 489,1805,3351,3285, # 5814
|
||||
1358,7683,7684,2386,1387,1215,2629,2252, 490,7685,7686,4173,1759,2387,2343,7687, # 5830
|
||||
4448,3799,1907,3971,2630,1806,3210,4449,3453,3286,2760,2344, 874,7688,7689,3454, # 5846
|
||||
3670,1858, 91,2914,3671,3042,3800,4450,7690,3145,3972,2659,7691,3455,1202,1403, # 5862
|
||||
3801,2954,2529,1517,2503,4451,3456,2504,7692,4452,7693,2692,1885,1495,1731,3973, # 5878
|
||||
2365,4453,7694,2029,7695,7696,3974,2693,1216, 237,2581,4174,2319,3975,3802,4454, # 5894
|
||||
4455,2694,3560,3457, 445,4456,7697,7698,7699,7700,2761, 61,3976,3672,1822,3977, # 5910
|
||||
7701, 687,2045, 935, 925, 405,2660, 703,1096,1859,2725,4457,3978,1876,1367,2695, # 5926
|
||||
3352, 918,2105,1781,2476, 334,3287,1611,1093,4458, 564,3146,3458,3673,3353, 945, # 5942
|
||||
2631,2057,4459,7702,1925, 872,4175,7703,3459,2696,3089, 349,4176,3674,3979,4460, # 5958
|
||||
3803,4177,3675,2155,3980,4461,4462,4178,4463,2403,2046, 782,3981, 400, 251,4179, # 5974
|
||||
1624,7704,7705, 277,3676, 299,1265, 476,1191,3804,2121,4180,4181,1109, 205,7706, # 5990
|
||||
2582,1000,2156,3561,1860,7707,7708,7709,4464,7710,4465,2565, 107,2477,2157,3982, # 6006
|
||||
3460,3147,7711,1533, 541,1301, 158, 753,4182,2872,3562,7712,1696, 370,1088,4183, # 6022
|
||||
4466,3563, 579, 327, 440, 162,2240, 269,1937,1374,3461, 968,3043, 56,1396,3090, # 6038
|
||||
2106,3288,3354,7713,1926,2158,4467,2998,7714,3564,7715,7716,3677,4468,2478,7717, # 6054
|
||||
2791,7718,1650,4469,7719,2603,7720,7721,3983,2661,3355,1149,3356,3984,3805,3985, # 6070
|
||||
7722,1076, 49,7723, 951,3211,3289,3290, 450,2837, 920,7724,1811,2792,2366,4184, # 6086
|
||||
1908,1138,2367,3806,3462,7725,3212,4470,1909,1147,1518,2423,4471,3807,7726,4472, # 6102
|
||||
2388,2604, 260,1795,3213,7727,7728,3808,3291, 708,7729,3565,1704,7730,3566,1351, # 6118
|
||||
1618,3357,2999,1886, 944,4185,3358,4186,3044,3359,4187,7731,3678, 422, 413,1714, # 6134
|
||||
3292, 500,2058,2345,4188,2479,7732,1344,1910, 954,7733,1668,7734,7735,3986,2404, # 6150
|
||||
4189,3567,3809,4190,7736,2302,1318,2505,3091, 133,3092,2873,4473, 629, 31,2838, # 6166
|
||||
2697,3810,4474, 850, 949,4475,3987,2955,1732,2088,4191,1496,1852,7737,3988, 620, # 6182
|
||||
3214, 981,1242,3679,3360,1619,3680,1643,3293,2139,2452,1970,1719,3463,2168,7738, # 6198
|
||||
3215,7739,7740,3361,1828,7741,1277,4476,1565,2047,7742,1636,3568,3093,7743, 869, # 6214
|
||||
2839, 655,3811,3812,3094,3989,3000,3813,1310,3569,4477,7744,7745,7746,1733, 558, # 6230
|
||||
4478,3681, 335,1549,3045,1756,4192,3682,1945,3464,1829,1291,1192, 470,2726,2107, # 6246
|
||||
2793, 913,1054,3990,7747,1027,7748,3046,3991,4479, 982,2662,3362,3148,3465,3216, # 6262
|
||||
3217,1946,2794,7749, 571,4480,7750,1830,7751,3570,2583,1523,2424,7752,2089, 984, # 6278
|
||||
4481,3683,1959,7753,3684, 852, 923,2795,3466,3685, 969,1519, 999,2048,2320,1705, # 6294
|
||||
7754,3095, 615,1662, 151, 597,3992,2405,2321,1049, 275,4482,3686,4193, 568,3687, # 6310
|
||||
3571,2480,4194,3688,7755,2425,2270, 409,3218,7756,1566,2874,3467,1002, 769,2840, # 6326
|
||||
194,2090,3149,3689,2222,3294,4195, 628,1505,7757,7758,1763,2177,3001,3993, 521, # 6342
|
||||
1161,2584,1787,2203,2406,4483,3994,1625,4196,4197, 412, 42,3096, 464,7759,2632, # 6358
|
||||
4484,3363,1760,1571,2875,3468,2530,1219,2204,3814,2633,2140,2368,4485,4486,3295, # 6374
|
||||
1651,3364,3572,7760,7761,3573,2481,3469,7762,3690,7763,7764,2271,2091, 460,7765, # 6390
|
||||
4487,7766,3002, 962, 588,3574, 289,3219,2634,1116, 52,7767,3047,1796,7768,7769, # 6406
|
||||
7770,1467,7771,1598,1143,3691,4198,1984,1734,1067,4488,1280,3365, 465,4489,1572, # 6422
|
||||
510,7772,1927,2241,1812,1644,3575,7773,4490,3692,7774,7775,2663,1573,1534,7776, # 6438
|
||||
7777,4199, 536,1807,1761,3470,3815,3150,2635,7778,7779,7780,4491,3471,2915,1911, # 6454
|
||||
2796,7781,3296,1122, 377,3220,7782, 360,7783,7784,4200,1529, 551,7785,2059,3693, # 6470
|
||||
1769,2426,7786,2916,4201,3297,3097,2322,2108,2030,4492,1404, 136,1468,1479, 672, # 6486
|
||||
1171,3221,2303, 271,3151,7787,2762,7788,2049, 678,2727, 865,1947,4493,7789,2013, # 6502
|
||||
3995,2956,7790,2728,2223,1397,3048,3694,4494,4495,1735,2917,3366,3576,7791,3816, # 6518
|
||||
509,2841,2453,2876,3817,7792,7793,3152,3153,4496,4202,2531,4497,2304,1166,1010, # 6534
|
||||
552, 681,1887,7794,7795,2957,2958,3996,1287,1596,1861,3154, 358, 453, 736, 175, # 6550
|
||||
478,1117, 905,1167,1097,7796,1853,1530,7797,1706,7798,2178,3472,2287,3695,3473, # 6566
|
||||
3577,4203,2092,4204,7799,3367,1193,2482,4205,1458,2190,2205,1862,1888,1421,3298, # 6582
|
||||
2918,3049,2179,3474, 595,2122,7800,3997,7801,7802,4206,1707,2636, 223,3696,1359, # 6598
|
||||
751,3098, 183,3475,7803,2797,3003, 419,2369, 633, 704,3818,2389, 241,7804,7805, # 6614
|
||||
7806, 838,3004,3697,2272,2763,2454,3819,1938,2050,3998,1309,3099,2242,1181,7807, # 6630
|
||||
1136,2206,3820,2370,1446,4207,2305,4498,7808,7809,4208,1055,2605, 484,3698,7810, # 6646
|
||||
3999, 625,4209,2273,3368,1499,4210,4000,7811,4001,4211,3222,2274,2275,3476,7812, # 6662
|
||||
7813,2764, 808,2606,3699,3369,4002,4212,3100,2532, 526,3370,3821,4213, 955,7814, # 6678
|
||||
1620,4214,2637,2427,7815,1429,3700,1669,1831, 994, 928,7816,3578,1260,7817,7818, # 6694
|
||||
7819,1948,2288, 741,2919,1626,4215,2729,2455, 867,1184, 362,3371,1392,7820,7821, # 6710
|
||||
4003,4216,1770,1736,3223,2920,4499,4500,1928,2698,1459,1158,7822,3050,3372,2877, # 6726
|
||||
1292,1929,2506,2842,3701,1985,1187,2071,2014,2607,4217,7823,2566,2507,2169,3702, # 6742
|
||||
2483,3299,7824,3703,4501,7825,7826, 666,1003,3005,1022,3579,4218,7827,4502,1813, # 6758
|
||||
2253, 574,3822,1603, 295,1535, 705,3823,4219, 283, 858, 417,7828,7829,3224,4503, # 6774
|
||||
4504,3051,1220,1889,1046,2276,2456,4004,1393,1599, 689,2567, 388,4220,7830,2484, # 6790
|
||||
802,7831,2798,3824,2060,1405,2254,7832,4505,3825,2109,1052,1345,3225,1585,7833, # 6806
|
||||
809,7834,7835,7836, 575,2730,3477, 956,1552,1469,1144,2323,7837,2324,1560,2457, # 6822
|
||||
3580,3226,4005, 616,2207,3155,2180,2289,7838,1832,7839,3478,4506,7840,1319,3704, # 6838
|
||||
3705,1211,3581,1023,3227,1293,2799,7841,7842,7843,3826, 607,2306,3827, 762,2878, # 6854
|
||||
1439,4221,1360,7844,1485,3052,7845,4507,1038,4222,1450,2061,2638,4223,1379,4508, # 6870
|
||||
2585,7846,7847,4224,1352,1414,2325,2921,1172,7848,7849,3828,3829,7850,1797,1451, # 6886
|
||||
7851,7852,7853,7854,2922,4006,4007,2485,2346, 411,4008,4009,3582,3300,3101,4509, # 6902
|
||||
1561,2664,1452,4010,1375,7855,7856, 47,2959, 316,7857,1406,1591,2923,3156,7858, # 6918
|
||||
1025,2141,3102,3157, 354,2731, 884,2224,4225,2407, 508,3706, 726,3583, 996,2428, # 6934
|
||||
3584, 729,7859, 392,2191,1453,4011,4510,3707,7860,7861,2458,3585,2608,1675,2800, # 6950
|
||||
919,2347,2960,2348,1270,4511,4012, 73,7862,7863, 647,7864,3228,2843,2255,1550, # 6966
|
||||
1346,3006,7865,1332, 883,3479,7866,7867,7868,7869,3301,2765,7870,1212, 831,1347, # 6982
|
||||
4226,4512,2326,3830,1863,3053, 720,3831,4513,4514,3832,7871,4227,7872,7873,4515, # 6998
|
||||
7874,7875,1798,4516,3708,2609,4517,3586,1645,2371,7876,7877,2924, 669,2208,2665, # 7014
|
||||
2429,7878,2879,7879,7880,1028,3229,7881,4228,2408,7882,2256,1353,7883,7884,4518, # 7030
|
||||
3158, 518,7885,4013,7886,4229,1960,7887,2142,4230,7888,7889,3007,2349,2350,3833, # 7046
|
||||
516,1833,1454,4014,2699,4231,4519,2225,2610,1971,1129,3587,7890,2766,7891,2961, # 7062
|
||||
1422, 577,1470,3008,1524,3373,7892,7893, 432,4232,3054,3480,7894,2586,1455,2508, # 7078
|
||||
2226,1972,1175,7895,1020,2732,4015,3481,4520,7896,2733,7897,1743,1361,3055,3482, # 7094
|
||||
2639,4016,4233,4521,2290, 895, 924,4234,2170, 331,2243,3056, 166,1627,3057,1098, # 7110
|
||||
7898,1232,2880,2227,3374,4522, 657, 403,1196,2372, 542,3709,3375,1600,4235,3483, # 7126
|
||||
7899,4523,2767,3230, 576, 530,1362,7900,4524,2533,2666,3710,4017,7901, 842,3834, # 7142
|
||||
7902,2801,2031,1014,4018, 213,2700,3376, 665, 621,4236,7903,3711,2925,2430,7904, # 7158
|
||||
2431,3302,3588,3377,7905,4237,2534,4238,4525,3589,1682,4239,3484,1380,7906, 724, # 7174
|
||||
2277, 600,1670,7907,1337,1233,4526,3103,2244,7908,1621,4527,7909, 651,4240,7910, # 7190
|
||||
1612,4241,2611,7911,2844,7912,2734,2307,3058,7913, 716,2459,3059, 174,1255,2701, # 7206
|
||||
4019,3590, 548,1320,1398, 728,4020,1574,7914,1890,1197,3060,4021,7915,3061,3062, # 7222
|
||||
3712,3591,3713, 747,7916, 635,4242,4528,7917,7918,7919,4243,7920,7921,4529,7922, # 7238
|
||||
3378,4530,2432, 451,7923,3714,2535,2072,4244,2735,4245,4022,7924,1764,4531,7925, # 7254
|
||||
4246, 350,7926,2278,2390,2486,7927,4247,4023,2245,1434,4024, 488,4532, 458,4248, # 7270
|
||||
4025,3715, 771,1330,2391,3835,2568,3159,2159,2409,1553,2667,3160,4249,7928,2487, # 7286
|
||||
2881,2612,1720,2702,4250,3379,4533,7929,2536,4251,7930,3231,4252,2768,7931,2015, # 7302
|
||||
2736,7932,1155,1017,3716,3836,7933,3303,2308, 201,1864,4253,1430,7934,4026,7935, # 7318
|
||||
7936,7937,7938,7939,4254,1604,7940, 414,1865, 371,2587,4534,4535,3485,2016,3104, # 7334
|
||||
4536,1708, 960,4255, 887, 389,2171,1536,1663,1721,7941,2228,4027,2351,2926,1580, # 7350
|
||||
7942,7943,7944,1744,7945,2537,4537,4538,7946,4539,7947,2073,7948,7949,3592,3380, # 7366
|
||||
2882,4256,7950,4257,2640,3381,2802, 673,2703,2460, 709,3486,4028,3593,4258,7951, # 7382
|
||||
1148, 502, 634,7952,7953,1204,4540,3594,1575,4541,2613,3717,7954,3718,3105, 948, # 7398
|
||||
3232, 121,1745,3837,1110,7955,4259,3063,2509,3009,4029,3719,1151,1771,3838,1488, # 7414
|
||||
4030,1986,7956,2433,3487,7957,7958,2093,7959,4260,3839,1213,1407,2803, 531,2737, # 7430
|
||||
2538,3233,1011,1537,7960,2769,4261,3106,1061,7961,3720,3721,1866,2883,7962,2017, # 7446
|
||||
120,4262,4263,2062,3595,3234,2309,3840,2668,3382,1954,4542,7963,7964,3488,1047, # 7462
|
||||
2704,1266,7965,1368,4543,2845, 649,3383,3841,2539,2738,1102,2846,2669,7966,7967, # 7478
|
||||
1999,7968,1111,3596,2962,7969,2488,3842,3597,2804,1854,3384,3722,7970,7971,3385, # 7494
|
||||
2410,2884,3304,3235,3598,7972,2569,7973,3599,2805,4031,1460, 856,7974,3600,7975, # 7510
|
||||
2885,2963,7976,2886,3843,7977,4264, 632,2510, 875,3844,1697,3845,2291,7978,7979, # 7526
|
||||
4544,3010,1239, 580,4545,4265,7980, 914, 936,2074,1190,4032,1039,2123,7981,7982, # 7542
|
||||
7983,3386,1473,7984,1354,4266,3846,7985,2172,3064,4033, 915,3305,4267,4268,3306, # 7558
|
||||
1605,1834,7986,2739, 398,3601,4269,3847,4034, 328,1912,2847,4035,3848,1331,4270, # 7574
|
||||
3011, 937,4271,7987,3602,4036,4037,3387,2160,4546,3388, 524, 742, 538,3065,1012, # 7590
|
||||
7988,7989,3849,2461,7990, 658,1103, 225,3850,7991,7992,4547,7993,4548,7994,3236, # 7606
|
||||
1243,7995,4038, 963,2246,4549,7996,2705,3603,3161,7997,7998,2588,2327,7999,4550, # 7622
|
||||
8000,8001,8002,3489,3307, 957,3389,2540,2032,1930,2927,2462, 870,2018,3604,1746, # 7638
|
||||
2770,2771,2434,2463,8003,3851,8004,3723,3107,3724,3490,3390,3725,8005,1179,3066, # 7654
|
||||
8006,3162,2373,4272,3726,2541,3163,3108,2740,4039,8007,3391,1556,2542,2292, 977, # 7670
|
||||
2887,2033,4040,1205,3392,8008,1765,3393,3164,2124,1271,1689, 714,4551,3491,8009, # 7686
|
||||
2328,3852, 533,4273,3605,2181, 617,8010,2464,3308,3492,2310,8011,8012,3165,8013, # 7702
|
||||
8014,3853,1987, 618, 427,2641,3493,3394,8015,8016,1244,1690,8017,2806,4274,4552, # 7718
|
||||
8018,3494,8019,8020,2279,1576, 473,3606,4275,3395, 972,8021,3607,8022,3067,8023, # 7734
|
||||
8024,4553,4554,8025,3727,4041,4042,8026, 153,4555, 356,8027,1891,2888,4276,2143, # 7750
|
||||
408, 803,2352,8028,3854,8029,4277,1646,2570,2511,4556,4557,3855,8030,3856,4278, # 7766
|
||||
8031,2411,3396, 752,8032,8033,1961,2964,8034, 746,3012,2465,8035,4279,3728, 698, # 7782
|
||||
4558,1892,4280,3608,2543,4559,3609,3857,8036,3166,3397,8037,1823,1302,4043,2706, # 7798
|
||||
3858,1973,4281,8038,4282,3167, 823,1303,1288,1236,2848,3495,4044,3398, 774,3859, # 7814
|
||||
8039,1581,4560,1304,2849,3860,4561,8040,2435,2161,1083,3237,4283,4045,4284, 344, # 7830
|
||||
1173, 288,2311, 454,1683,8041,8042,1461,4562,4046,2589,8043,8044,4563, 985, 894, # 7846
|
||||
8045,3399,3168,8046,1913,2928,3729,1988,8047,2110,1974,8048,4047,8049,2571,1194, # 7862
|
||||
425,8050,4564,3169,1245,3730,4285,8051,8052,2850,8053, 636,4565,1855,3861, 760, # 7878
|
||||
1799,8054,4286,2209,1508,4566,4048,1893,1684,2293,8055,8056,8057,4287,4288,2210, # 7894
|
||||
479,8058,8059, 832,8060,4049,2489,8061,2965,2490,3731, 990,3109, 627,1814,2642, # 7910
|
||||
4289,1582,4290,2125,2111,3496,4567,8062, 799,4291,3170,8063,4568,2112,1737,3013, # 7926
|
||||
1018, 543, 754,4292,3309,1676,4569,4570,4050,8064,1489,8065,3497,8066,2614,2889, # 7942
|
||||
4051,8067,8068,2966,8069,8070,8071,8072,3171,4571,4572,2182,1722,8073,3238,3239, # 7958
|
||||
1842,3610,1715, 481, 365,1975,1856,8074,8075,1962,2491,4573,8076,2126,3611,3240, # 7974
|
||||
433,1894,2063,2075,8077, 602,2741,8078,8079,8080,8081,8082,3014,1628,3400,8083, # 7990
|
||||
3172,4574,4052,2890,4575,2512,8084,2544,2772,8085,8086,8087,3310,4576,2891,8088, # 8006
|
||||
4577,8089,2851,4578,4579,1221,2967,4053,2513,8090,8091,8092,1867,1989,8093,8094, # 8022
|
||||
8095,1895,8096,8097,4580,1896,4054, 318,8098,2094,4055,4293,8099,8100, 485,8101, # 8038
|
||||
938,3862, 553,2670, 116,8102,3863,3612,8103,3498,2671,2773,3401,3311,2807,8104, # 8054
|
||||
3613,2929,4056,1747,2930,2968,8105,8106, 207,8107,8108,2672,4581,2514,8109,3015, # 8070
|
||||
890,3614,3864,8110,1877,3732,3402,8111,2183,2353,3403,1652,8112,8113,8114, 941, # 8086
|
||||
2294, 208,3499,4057,2019, 330,4294,3865,2892,2492,3733,4295,8115,8116,8117,8118, # 8102
|
||||
# Everything below is of no interest for detection purposes
|
||||
2515,1613,4582,8119,3312,3866,2516,8120,4058,8121,1637,4059,2466,4583,3867,8122, # 8118
|
||||
2493,3016,3734,8123,8124,2192,8125,8126,2162,8127,8128,8129,8130,8131,8132,8133, # 8134
|
||||
8134,8135,8136,8137,8138,8139,8140,8141,8142,8143,8144,8145,8146,8147,8148,8149, # 8150
|
||||
8150,8151,8152,8153,8154,8155,8156,8157,8158,8159,8160,8161,8162,8163,8164,8165, # 8166
|
||||
8166,8167,8168,8169,8170,8171,8172,8173,8174,8175,8176,8177,8178,8179,8180,8181, # 8182
|
||||
8182,8183,8184,8185,8186,8187,8188,8189,8190,8191,8192,8193,8194,8195,8196,8197, # 8198
|
||||
8198,8199,8200,8201,8202,8203,8204,8205,8206,8207,8208,8209,8210,8211,8212,8213, # 8214
|
||||
8214,8215,8216,8217,8218,8219,8220,8221,8222,8223,8224,8225,8226,8227,8228,8229, # 8230
|
||||
8230,8231,8232,8233,8234,8235,8236,8237,8238,8239,8240,8241,8242,8243,8244,8245, # 8246
|
||||
8246,8247,8248,8249,8250,8251,8252,8253,8254,8255,8256,8257,8258,8259,8260,8261, # 8262
|
||||
8262,8263,8264,8265,8266,8267,8268,8269,8270,8271,8272,8273,8274,8275,8276,8277, # 8278
|
||||
8278,8279,8280,8281,8282,8283,8284,8285,8286,8287,8288,8289,8290,8291,8292,8293, # 8294
|
||||
8294,8295,8296,8297,8298,8299,8300,8301,8302,8303,8304,8305,8306,8307,8308,8309, # 8310
|
||||
8310,8311,8312,8313,8314,8315,8316,8317,8318,8319,8320,8321,8322,8323,8324,8325, # 8326
|
||||
8326,8327,8328,8329,8330,8331,8332,8333,8334,8335,8336,8337,8338,8339,8340,8341, # 8342
|
||||
8342,8343,8344,8345,8346,8347,8348,8349,8350,8351,8352,8353,8354,8355,8356,8357, # 8358
|
||||
8358,8359,8360,8361,8362,8363,8364,8365,8366,8367,8368,8369,8370,8371,8372,8373, # 8374
|
||||
8374,8375,8376,8377,8378,8379,8380,8381,8382,8383,8384,8385,8386,8387,8388,8389, # 8390
|
||||
8390,8391,8392,8393,8394,8395,8396,8397,8398,8399,8400,8401,8402,8403,8404,8405, # 8406
|
||||
8406,8407,8408,8409,8410,8411,8412,8413,8414,8415,8416,8417,8418,8419,8420,8421, # 8422
|
||||
8422,8423,8424,8425,8426,8427,8428,8429,8430,8431,8432,8433,8434,8435,8436,8437, # 8438
|
||||
8438,8439,8440,8441,8442,8443,8444,8445,8446,8447,8448,8449,8450,8451,8452,8453, # 8454
|
||||
8454,8455,8456,8457,8458,8459,8460,8461,8462,8463,8464,8465,8466,8467,8468,8469, # 8470
|
||||
8470,8471,8472,8473,8474,8475,8476,8477,8478,8479,8480,8481,8482,8483,8484,8485, # 8486
|
||||
8486,8487,8488,8489,8490,8491,8492,8493,8494,8495,8496,8497,8498,8499,8500,8501, # 8502
|
||||
8502,8503,8504,8505,8506,8507,8508,8509,8510,8511,8512,8513,8514,8515,8516,8517, # 8518
|
||||
8518,8519,8520,8521,8522,8523,8524,8525,8526,8527,8528,8529,8530,8531,8532,8533, # 8534
|
||||
8534,8535,8536,8537,8538,8539,8540,8541,8542,8543,8544,8545,8546,8547,8548,8549, # 8550
|
||||
8550,8551,8552,8553,8554,8555,8556,8557,8558,8559,8560,8561,8562,8563,8564,8565, # 8566
|
||||
8566,8567,8568,8569,8570,8571,8572,8573,8574,8575,8576,8577,8578,8579,8580,8581, # 8582
|
||||
8582,8583,8584,8585,8586,8587,8588,8589,8590,8591,8592,8593,8594,8595,8596,8597, # 8598
|
||||
8598,8599,8600,8601,8602,8603,8604,8605,8606,8607,8608,8609,8610,8611,8612,8613, # 8614
|
||||
8614,8615,8616,8617,8618,8619,8620,8621,8622,8623,8624,8625,8626,8627,8628,8629, # 8630
|
||||
8630,8631,8632,8633,8634,8635,8636,8637,8638,8639,8640,8641,8642,8643,8644,8645, # 8646
|
||||
8646,8647,8648,8649,8650,8651,8652,8653,8654,8655,8656,8657,8658,8659,8660,8661, # 8662
|
||||
8662,8663,8664,8665,8666,8667,8668,8669,8670,8671,8672,8673,8674,8675,8676,8677, # 8678
|
||||
8678,8679,8680,8681,8682,8683,8684,8685,8686,8687,8688,8689,8690,8691,8692,8693, # 8694
|
||||
8694,8695,8696,8697,8698,8699,8700,8701,8702,8703,8704,8705,8706,8707,8708,8709, # 8710
|
||||
8710,8711,8712,8713,8714,8715,8716,8717,8718,8719,8720,8721,8722,8723,8724,8725, # 8726
|
||||
8726,8727,8728,8729,8730,8731,8732,8733,8734,8735,8736,8737,8738,8739,8740,8741) # 8742
|
||||
Some files were not shown because too many files have changed in this diff