Old 06-05-2013, 04:00 AM   #1
SauliusP.
Plugin developer
Posts: 108
Karma: 24394
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
DOCX Input [Calibre native]

As the first DOCX Input developer, I am starting this thread for Kovid's new baby: native DOCX Input.

So, with great interest, I hurried to get the dev trunk and, unfortunately, got this exception on the first try:

Spoiler:
InputFormatPlugin: DOCX Input running
on /tmp/calibre_0.9.33_tmp_HdwPid/qemoxw.docx
Traceback (most recent call last):
  File "site.py", line 58, in main
  File "/Advanced/workspace/Calibre/src/calibre/utils/ipc/worker.py", line 189, in main
    result = func(*args, **kwargs)
  File "/Advanced/workspace/Calibre/src/calibre/gui2/convert/gui_conversion.py", line 31, in gui_convert_override
    override_input_metadata=True)
  File "/Advanced/workspace/Calibre/src/calibre/gui2/convert/gui_conversion.py", line 25, in gui_convert
    plumber.run()
  File "/Advanced/workspace/Calibre/src/calibre/ebooks/conversion/plumber.py", line 1010, in run
    accelerators, tdir)
  File "/Advanced/workspace/Calibre/src/calibre/customize/conversion.py", line 239, in __call__
    log, accelerators)
  File "/Advanced/workspace/Calibre/src/calibre/ebooks/conversion/plugins/docx_input.py", line 21, in convert
    return Convert(stream, log=log)()
  File "/Advanced/workspace/Calibre/src/calibre/ebooks/docx/to_html.py", line 85, in __call__
    self.read_page_properties(doc)
  File "/Advanced/workspace/Calibre/src/calibre/ebooks/docx/to_html.py", line 164, in read_page_properties
    for p in descendants(doc, 'w:p', 'w:tbl'):
  File "/Advanced/workspace/Calibre/src/calibre/ebooks/docx/names.py", line 105, in descendants
    return elem.iterdescendants(*map(expand, args))
  File "lxml.etree.pyx", line 1290, in lxml.etree._Element.iterdescendants (src/lxml/lxml.etree.c:40339)
TypeError: iterdescendants() takes at most 1 positional argument (2 given)

Old 06-05-2013, 04:10 AM   #2
kovidgoyal
creator of calibre
Posts: 43,962
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That's caused by the version of lxml in the calibre OS X build being too old; it will be updated when the next calibre release is made. In the meantime, if you want to work around it, you can replace the line

Code:
descendants(doc, 'w:p', 'w:tbl')

with

Code:
XPath('descendant::w:p|descendant::w:tbl')(doc)

There may be other places where you will need to make the change as well.
Old 06-05-2013, 04:33 AM   #3
kovidgoyal
creator of calibre
I just pushed a commit that should make it work even with older lxml. I haven't tested it, however.
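
Editor's sketch: one way a helper like descendants() could avoid depending on the library version is to walk every descendant and filter by tag itself, instead of passing several tags to iterdescendants(). This is not the actual commit, only an illustration. It uses the standard library's xml.etree.ElementTree so it runs anywhere, and the NAMESPACES table, the expand() body and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, which DOCX files bind to the "w" prefix.
W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
NAMESPACES = {'w': W_NS}  # assumed prefix table for this sketch


def expand(name):
    """Turn a prefixed name such as 'w:p' into Clark notation '{uri}p'."""
    prefix, _, local = name.partition(':')
    return '{%s}%s' % (NAMESPACES[prefix], local)


def descendants(elem, *names):
    """Yield descendants of elem matching any of names, in document order."""
    wanted = {expand(n) for n in names}
    for node in elem.iter():
        # iter() also yields elem itself, which is not a descendant.
        if node is not elem and node.tag in wanted:
            yield node


# Hypothetical miniature document: two top-level paragraphs and one table
# that contains a paragraph of its own.
SAMPLE = (
    '<w:document xmlns:w="%s"><w:body>'
    '<w:p/><w:tbl><w:tr><w:tc><w:p/></w:tc></w:tr></w:tbl><w:p/>'
    '</w:body></w:document>' % W_NS
)

doc = ET.fromstring(SAMPLE)
found = [node.tag.rpartition('}')[2] for node in descendants(doc, 'w:p', 'w:tbl')]
print(found)  # ['p', 'tbl', 'p', 'p']
```

Filtering in Python is slower than letting the library match tags natively, so a real implementation might try the multi-tag call first and fall back to something like this only when it fails.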
Old 06-06-2013, 02:28 AM   #4
SauliusP.
Plugin developer
Updated, and it works. However, what about font embedding? I saw some code regarding fonts, but how is it intended to work? My plugin did font scanning and caching; what approach does the native plugin take?

At least for me, fonts are not embedded in the resulting EPUB/AZW3.

Update: OS is 64-bit Ubuntu Precise.

Last edited by SauliusP.; 06-06-2013 at 02:31 AM.
Old 06-06-2013, 02:29 AM   #5
kovidgoyal
creator of calibre
Font embedding works fine for me. Look for the @font-face rules that define the embedded fonts.
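
Editor's sketch: a quick way to look for @font-face rules in a converted EPUB is to open it as a ZIP archive and search its stylesheets. The function name, the regular expression and the in-memory demo archive are all assumptions made for this illustration. It only checks .css files (not inline style blocks), and it does not apply to AZW3, which is not a ZIP container.

```python
import io
import re
import zipfile

# Matches "@font-face { ... }" blocks; good enough for a quick check,
# not a full CSS parser.
FONT_FACE_RE = re.compile(r'@font-face\s*\{[^}]*\}', re.IGNORECASE)


def find_font_face_rules(epub):
    """Return {css_name: [rule, ...]} for stylesheets containing @font-face.

    `epub` is a path or a binary file-like object holding the EPUB (a ZIP).
    """
    rules = {}
    with zipfile.ZipFile(epub) as zf:
        for name in zf.namelist():
            if name.lower().endswith('.css'):
                css = zf.read(name).decode('utf-8', errors='replace')
                matches = FONT_FACE_RE.findall(css)
                if matches:
                    rules[name] = matches
    return rules


# Tiny in-memory archive standing in for a converted book (hypothetical names).
demo = io.BytesIO()
with zipfile.ZipFile(demo, 'w') as zf:
    zf.writestr('stylesheet.css',
                '@font-face { font-family: "Demo"; src: url(fonts/demo.ttf) }\n'
                'p { margin: 0 }')
    zf.writestr('page_styles.css', 'body { margin: 5pt }')
demo.seek(0)

print(find_font_face_rules(demo))
```

An empty result means no stylesheet in the archive declares an embedded font.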
Old 06-06-2013, 02:39 AM   #6
SauliusP.
Plugin developer
I cannot find any in the stylesheets.
Old 06-06-2013, 02:46 AM   #7
SauliusP.
Plugin developer
Might this line in the console output be related?

'embed_font_family': None,

If so, where do I set this to "True" or something like that?
Old 06-06-2013, 02:55 AM   #8
kovidgoyal
creator of calibre
That has nothing to do with it; it comes from a later stage in the conversion pipeline. The docx input plugin embeds fonts right at the beginning.

Post a docx file for which you fail to get embedded fonts.
Old 06-06-2013, 03:36 AM   #9
SauliusP.
Plugin developer
Quote:
Originally Posted by kovidgoyal
Post a docx file for which you fail to get embedded fonts.

I didn't get fonts embedded for any of 20 docx files, so I believe it is not file-related, but here you go: the one I used to demo features.
Attached Files
File Type: zip DOCX Input features demo - Saulius P_.docx.zip (30.4 KB, 286 views)
Old 06-06-2013, 03:56 AM   #10
kovidgoyal
creator of calibre
That docx file has no embedded fonts in it. If you want the fonts to be embedded you need to tell Word to do that explicitly in the Word options.

In other words, only if a font is embedded in the input document will it be embedded in the output document.
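
Editor's sketch: to check whether a DOCX actually carries embedded fonts, one can look in its font table for the elements that reference embedded font files. This assumes the font table sits at the conventional path word/fontTable.xml and that embedded fonts are marked with w:embedRegular, w:embedBold, w:embedItalic or w:embedBoldItalic children. The function name and the demo archive are hypothetical, and this is not how the calibre plugin itself is known to do it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
R_NS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'
# Children of <w:font> that point at an embedded font file.
EMBED_TAGS = ('embedRegular', 'embedBold', 'embedItalic', 'embedBoldItalic')


def embedded_font_names(docx):
    """Return the names of fonts that the DOCX font table marks as embedded."""
    with zipfile.ZipFile(docx) as zf:
        # Conventional location; assumed here rather than resolved
        # through the package relationships.
        if 'word/fontTable.xml' not in zf.namelist():
            return []
        root = ET.fromstring(zf.read('word/fontTable.xml'))
    names = []
    for font in root.iter('{%s}font' % W_NS):
        if any(font.find('{%s}%s' % (W_NS, tag)) is not None
               for tag in EMBED_TAGS):
            names.append(font.get('{%s}name' % W_NS))
    return names


# Hypothetical font table: Verdana is only referenced, "Demo Serif" is embedded.
FONT_TABLE = (
    '<w:fonts xmlns:w="%s" xmlns:r="%s">'
    '<w:font w:name="Verdana"/>'
    '<w:font w:name="Demo Serif"><w:embedRegular r:id="rId1"/></w:font>'
    '</w:fonts>' % (W_NS, R_NS)
)

demo = io.BytesIO()
with zipfile.ZipFile(demo, 'w') as zf:
    zf.writestr('word/fontTable.xml', FONT_TABLE)
demo.seek(0)

print(embedded_font_names(demo))  # ['Demo Serif']
```

An empty list for a file would be consistent with the situation described above: the fonts are only referenced by name, so there is nothing for the input plugin to carry over.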
Old 06-06-2013, 04:03 AM   #11
SauliusP.
Plugin developer
Are there any plans to embed external fonts that are simply used in the document?
Old 06-06-2013, 04:15 AM   #12
kovidgoyal
creator of calibre
I don't mind adding an option to do that, but it belongs in the main conversion pipeline, not in the docx input plugin. I'll get to it when I have time, or you are welcome to submit a patch for it.
Old 06-06-2013, 04:26 AM   #13
SauliusP.
Plugin developer
All right. I simply think that, given the nature of docx, it is inconvenient for users to embed fonts into it: a docx is a "draft" of the book, not the book itself. Also, the embedding option is global to Word, not per file, which is also inconvenient.

I'll consider the patch, however. I wonder if font scanning and caching would be a good option for that? It might also be an idea to enable font scanning on calibre's startup. Otherwise, searching for system fonts on every conversion would add performance overhead when several conversions take place.
Old 06-06-2013, 04:33 AM   #14
kovidgoyal
creator of calibre
calibre already does cached font scanning. That's how the embed font family option (among many other font processing tasks in calibre) works. If you want to work on the patch, look at how the embed font family option is implemented first.

Also, I don't see why embedding fonts in a draft in progress makes any difference; it's not like it significantly changes the save time for the docx.
Old 06-06-2013, 04:41 AM   #15
SauliusP.
Plugin developer
I'm from the old school, where every kilobyte was valuable :-D So embedding, say, the Verdana font in every DOCX I have (>1000), when it is already installed on the system, seems like overhead and wasted space to me :-)

Also, if I have a docx book, I now need to open it in Word with the "embed fonts" feature enabled (which is turned off for most people), save it, and only then convert it.

No, no, not convenient!

Tags
calibre, docx input