Fixes for tables: add keep_empty_tags:p,td,th and add colspan,rowspan to keep_html_attrs.

Jim Miller 2018-03-30 14:36:11 -05:00
parent d8224d129e
commit 7b34b7e5c2
4 changed files with 16 additions and 3 deletions

@@ -312,7 +312,7 @@ keep_summary_html:true
## Example: To add 'style', 'title' and 'align' to the list to keep,
## in your personal.ini [defaults] put:
## add_to_keep_html_attrs:,style,title,align
-keep_html_attrs:href,name,class,id
+keep_html_attrs:href,name,class,id,colspan,rowspan
## Tags listed here will be replaced with <span class="tagname">.
## For example: <u>underlined text</u> becomes
@@ -323,6 +323,12 @@ keep_html_attrs:href,name,class,id
## HTML and EPUB standards.
replace_tags_with_spans:u,big,small
+## By default, empty tags are removed as part of cleaning up the
+## source HTML. However, a few tags should be kept even if empty.
+## (Whitespace only, including &nbsp; is considered empty.) This
+## setting can adjust which tags are kept.
+keep_empty_tags:p,td,th
## If a chapter range was given, use this pattern for the book title.
## replace_metadata and include/exclude will be applied *after* this.
## Set to empty value to disable.
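
The `add_to_keep_html_attrs:,style,title,align` example above starts its value with a comma, which suggests the extra value is simply appended to the existing comma-separated string before it is split. The following is a minimal sketch of that idea only. The helper name `merged_list` is hypothetical, and the append-then-split behavior is an assumption, not taken from this commit.

```python
def merged_list(base, addition=""):
    """Join two comma-separated setting values and split them into a list.

    Hypothetical helper: assumes an add_to_* value is appended to the
    base value as plain text, which is why the example starts with ','.
    """
    combined = base + addition
    # Drop empty entries so stray or doubled commas are harmless.
    return [item.strip() for item in combined.split(",") if item.strip()]


# New default from this commit plus the personal.ini example value.
keep_html_attrs = merged_list(
    "href,name,class,id,colspan,rowspan",
    ",style,title,align",
)
```

Under that assumption, a personal.ini entry only needs to list the extra attributes. The defaults, now including colspan and rowspan, stay in effect.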

@@ -456,7 +456,7 @@ class BaseSiteAdapter(Configurable):
t['class']=t.name
t.name='div'
# removes paired, but empty non paragraph tags.
-if t.name not in ('p') and t.string != None and len(t.string.strip()) == 0 :
+if t.name not in self.getConfigList('keep_empty_tags',['p','td','th']) and t.string != None and len(t.string.strip()) == 0 :
t.extract()
# remove script tags across the board.
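
The changed condition above can be read in isolation as a small predicate. The sketch below is a simplified stand-in, not the adapter code: the function name `is_removable_empty_tag` is hypothetical, and a plain `text` argument replaces BeautifulSoup's `t.string` so the example runs without extra libraries.

```python
def is_removable_empty_tag(name, text, keep_empty_tags=("p", "td", "th")):
    """Return True when a tag with this name and text would be dropped.

    `text` stands in for `t.string`: None when the tag has no single
    string child, otherwise the tag's text.
    """
    if name in keep_empty_tags:
        return False
    if text is None:
        return False
    # In Python 3, str.strip() with no arguments also strips U+00A0,
    # the character that &nbsp; decodes to.
    return len(text.strip()) == 0


# ('p') is just the string 'p', not a one-element tuple, so the old
# check was a substring test. It still behaved as intended for real
# tag names, because only 'p' itself is a non-empty substring of 'p'.
assert ("p") == "p"
assert "td" not in ("p")
assert "td" in ("p", "td", "th")
```

With the list-based check, an empty `<td>` or `<th>` is kept, so table rows keep their cell count.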

@@ -372,6 +372,7 @@ def get_valid_keywords():
'keep_title_attr',
'keep_html_attrs',
'replace_tags_with_spans',
+'keep_empty_tags',
'keep_summary_html',
'logpage_end',
'logpage_entries',

@@ -315,7 +315,7 @@ keep_summary_html:true
## Example: To add 'style', 'title' and 'align' to the list to keep,
## in your personal.ini [defaults] put:
## add_to_keep_html_attrs:,style,title,align
-keep_html_attrs:href,name,class,id
+keep_html_attrs:href,name,class,id,colspan,rowspan
## Tags listed here will be replaced with <span class="tagname">.
## For example: <u>underlined text</u> becomes
@@ -326,6 +326,12 @@ keep_html_attrs:href,name,class,id
## HTML and EPUB standards.
replace_tags_with_spans:u,big,small
+## By default, empty tags are removed as part of cleaning up the
+## source HTML. However, a few tags should be kept even if empty.
+## (Whitespace only, including &nbsp; is considered empty.) This
+## setting can adjust which tags are kept.
+keep_empty_tags:p,td,th
## If a chapter range was given, use this pattern for the book title.
## replace_metadata and include/exclude will be applied *after* this.
## Set to empty value to disable.