Merge pull request #3995 from wisp3rwind/pr_lyrics_tekstowo_no_crashes

Crash-resilient Tekstowo lyrics source
Adrian Sampson 2021-07-05 09:52:05 -04:00 committed by GitHub
commit 0f9ffeec3e
GPG key ID: 4AEE18F83AFDEB23
2 changed files with 23 additions and 15 deletions

@@ -442,14 +442,15 @@ class Tekstowo(Backend):
         search_results = self.fetch_url(url)
         if not search_results:
             return None
-        song_page_url = self.parse_search_results(search_results)
+        song_page_url = self.parse_search_results(search_results)
         if not song_page_url:
             return None
         song_page_html = self.fetch_url(song_page_url)
         if not song_page_html:
             return None
         return self.extract_lyrics(song_page_html)
     def parse_search_results(self, html):
@@ -460,20 +461,27 @@ class Tekstowo(Backend):
         if not soup:
             return None
-        song_rows = soup.find("div", class_="content"). \
-            find("div", class_="card"). \
-            find_all("div", class_="box-przeboje")
+        content_div = soup.find("div", class_="content")
+        if not content_div:
+            return None
+        card_div = content_div.find("div", class_="card")
+        if not card_div:
+            return None
+        song_rows = card_div.find_all("div", class_="box-przeboje")
         if not song_rows:
             return None
         song_row = song_rows[0]
         if not song_row:
             return None
-        href = song_row.find('a').get('href')
-        return self.BASE_URL + href
+        link = song_row.find('a')
+        if not link:
+            return None
+        return self.BASE_URL + link.get('href')
     def extract_lyrics(self, html):
         html = _scrape_strip_cruft(html)
@@ -483,10 +491,11 @@ class Tekstowo(Backend):
         if not soup:
             return None
-        c = soup.find("div", class_="song-text")
-        if c:
-            return c.get_text()
-        return None
+        lyrics_div = soup.find("div", class_="song-text")
+        if not lyrics_div:
+            return None
+        return lyrics_div.get_text()
 def remove_credits(text):
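
The guard-per-lookup pattern in the hunks above can be tried in isolation. The following is a minimal sketch, not the plugin's code: it swaps BeautifulSoup for the standard library's `xml.etree.ElementTree` so that it runs without third-party packages, and the names `first_song_url`, `BASE_URL`, `COMPLETE` and `TRUNCATED` as well as the sample markup are assumptions made for illustration. The point it shows is the same: each lookup may come back empty, so each result is checked before the next call is chained onto it.

```python
import xml.etree.ElementTree as ET

# Placeholder base URL for the sketch (an assumption, not the real site URL).
BASE_URL = "https://example.invalid"


def first_song_url(markup):
    """Return the URL of the first search hit, or None if the page
    does not have the expected structure."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # Malformed input: treat it like a page without results.
        return None

    # Element.find() returns None when nothing matches. Compare with
    # "is None": an Element without children is falsy in ElementTree.
    content_div = root.find(".//div[@class='content']")
    if content_div is None:
        return None

    card_div = content_div.find(".//div[@class='card']")
    if card_div is None:
        return None

    # findall() returns a (possibly empty) list, never None.
    song_rows = card_div.findall(".//div[@class='box-przeboje']")
    if not song_rows:
        return None

    link = song_rows[0].find(".//a")
    if link is None:
        return None

    href = link.get("href")
    if href is None:
        return None
    return BASE_URL + href


# Hypothetical sample pages, shaped like the class names in the diff.
COMPLETE = (
    "<html><body><div class='content'><div class='card'>"
    "<div class='box-przeboje'><a href='/piosenka.html'>hit</a></div>"
    "</div></div></body></html>"
)
TRUNCATED = "<html><body><div class='content'></div></body></html>"

if __name__ == "__main__":
    print(first_song_url(COMPLETE))   # https://example.invalid/piosenka.html
    print(first_song_url(TRUNCATED))  # None
```

With a chained call such as `a.find(...).find(...)`, the second page would raise `AttributeError` on `None`; with the guards, the caller simply gets `None` and can fall through to the next lyrics source.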

@@ -212,6 +212,9 @@ Other new things:
 * Get ISRC identifiers from musicbrainz
   Thanks to :user:`aereaux`.
 * :doc:`/plugins/metasync`: The ``metasync`` plugin now also fetches the ``Date Added`` field from iTunes databases and stores it in the``itunes_dateadded`` field.Thanks to :user:`sandersantema`.
+* :doc:`/plugins/lyrics`: Added Tekstowo.pl lyrics provider. Thanks to various
+  people for the implementation and for reporting issues with the initial version.
+  :bug:`3344` :bug:`3904` :bug:`3905` :bug:`3994`
 .. _py7zr: https://pypi.org/project/py7zr/
@@ -294,8 +297,6 @@ Fixes:
 * Removed ``@classmethod`` decorator from dbcore.query.NoneQuery.match method
   failing with AttributeError when called. It is now an instance method.
   :bug:`3516` :bug:`3517`
-* :doc:`/plugins/lyrics`: Added Tekstowo.pl lyrics provider
-  :bug:`3344`
 * :doc:`/plugins/lyrics`: Tolerate missing lyrics div in Genius scraper.
   Thanks to :user:`thejli21`.
   :bug:`3535` :bug:`3554`
@@ -355,8 +356,6 @@ Fixes:
   :bug:`3870`
 * Allow equals within ``--set`` value when importing.
   :bug:`2984`
-* :doc:`/plugins/lyrics`: Fix crashes for Tekstowo false positives
-  :bug:`3904`
 * :doc`/reference/cli`: Remove reference to rarfile version in link
 * Fix :bug:`2873`. Duplicates can now generate checksums. Thanks user:`wisp3rwind`
   for the pointer to how to solve. Thanks to :user:`arogl`.