Changeset Tag Branch User Date Summary

53:c93e07566456 wsuetholz 2010-11-10 10:11 Ficwad wants you to log in in order to view some stories. The old ficwad.py got around that by starting with the first chapter instead of the story index page. Since I needed the story index page, I had changed it to switch to that page and then scrape the chapter information from there, which doesn't work if the chapters are blocked. While it still won't work if you pass in the URL for the story index page, I now switch back to the page that you passed in originally when looking for the chapters to download. The one problem I have with this is that I change self.url to the story index page, which should probably remain so that we have a consistent self.url even if the user starts with chapter 9 this time instead of chapter 1.
2025-12-29 12:16:27 +01:00 · 2010-11-10 10:18:46 -06:00 · 33d1fabd11
commit 33d1fabd11
parent c326aa47b0
1 changed file with 6 additions and 0 deletions
--- a/ficwad.py
+++ b/ficwad.py
@@ -71,6 +71,7 @@ class FicWad(FanfictionSiteAdapter):
 		self.storyUserRating = '0'
 		self.storyCharacters = []
 		self.storySeries = ''
+		oldurl = ''
 		
 		data = u2.urlopen(self.url).read()
 		soup = bs.BeautifulStoneSoup(data)
@@ -84,6 +85,7 @@ class FicWad(FanfictionSiteAdapter):
 		if u1 == "story":
 			# This page does not have the correct information on it..  Need to get the Story Title Page
 			logging.debug('URL %s is a chapter URL.  Getting Title Page http://%s/%s/%s.' % (self.url, self.host, u1, storyid))
+			oldurl = self.url
 			self.url = 'http://' + self.host + '/' + u1 + '/' + storyid
 			data = u2.urlopen(self.url).read()
 			soup = bs.BeautifulStoneSoup(data)
@@ -177,6 +179,10 @@ class FicWad(FanfictionSiteAdapter):
 		result = []
 		ii = 1

+		if oldurl is not None and len(oldurl) > 0:
+			data = u2.urlopen(oldurl).read()
+			soup = bs.BeautifulStoneSoup(data)
+			
 		storylist = soup.find('ul', {'id' : 'storylist'})
 		if storylist is not None:
 			allH4s = storylist.findAll('h4')
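
The idea in the commit message (read metadata from the story index page, but read the chapter list from the page the user originally passed in, while keeping the index URL as the consistent one) could be sketched roughly as below. This is only an illustration in Python 3, not the adapter's actual code: `pick_pages`, its parameters, and the injected `fetch` callable are hypothetical names, and unlike the diff above it reuses the first download instead of opening the original URL a second time.

```python
from typing import Callable, Optional, Tuple


def pick_pages(
    requested_url: str,
    index_url: Optional[str],
    fetch: Callable[[str], str],
) -> Tuple[str, str, str]:
    """Return (canonical_url, metadata_page, chapter_page).

    Hypothetical helper: `fetch` stands in for whatever downloads a page,
    and `index_url` is the story index URL, if one was derived from the
    requested page (None when the request already is the index page).
    """
    # Always download the page the user actually asked for.
    requested_page = fetch(requested_url)

    if index_url is None or index_url == requested_url:
        # The user passed the index page itself, so there is nothing to
        # switch back to; both roles are served by the same page.
        return requested_url, requested_page, requested_page

    # Metadata comes from the index page, which also becomes the
    # canonical URL, so it is the same whichever chapter was passed in.
    metadata_page = fetch(index_url)

    # The chapter list is scraped from the originally requested page,
    # which is the role `oldurl` plays in the diff.
    return index_url, metadata_page, requested_page
```

Passing the downloader in as `fetch` keeps the sketch testable without network access. The case where the index page itself is passed in still has the limitation the commit message describes.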