06-05-2013, 04:00 AM | #1 |
Plugin developer
Posts: 108
Karma: 24394
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
|
DOCX Input [Calibre native]
As the first DOCX Input developer, I am starting this thread for Kovid's new baby: native DOCX Input.
With great interest I hurried to get the dev trunk and, unfortunately, got an exception on the first try: Spoiler:
|
06-05-2013, 04:10 AM | #2 |
creator of calibre
Posts: 43,962
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That's caused by the version of lxml in the calibre OS X build being too old; it will be updated when the next calibre release is made. In the meantime, if you want to work around it, you can replace the line
Code:
descendants(doc, 'w:p', 'w:tbl')
with
Code:
XPath('descendant::w:p|descendant::w:tbl')(doc)
|
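For readers unfamiliar with the expression in this workaround: both forms select every `w:p` (paragraph) and `w:tbl` (table) element below `doc`, in document order. The sketch below shows the same selection using only the Python standard library. It is an illustration, not calibre's code; the helper name `paragraphs_and_tables` and the sample markup are made up for this example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace that the "w:" prefix stands for in DOCX files.
W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'


def paragraphs_and_tables(root):
    """Return every w:p and w:tbl descendant of root, in document order.

    Hypothetical helper for illustration only.
    """
    wanted = {'{%s}p' % W_NS, '{%s}tbl' % W_NS}
    # iter() walks the tree in document order and yields root first,
    # so root itself is skipped to keep only true descendants.
    return [el for el in root.iter() if el is not root and el.tag in wanted]


# Made-up sample: a paragraph, a table containing a paragraph, a paragraph.
SAMPLE = (
    '<w:body xmlns:w="%s">'
    '<w:p/>'
    '<w:tbl><w:tr><w:tc><w:p/></w:tc></w:tr></w:tbl>'
    '<w:p/>'
    '</w:body>' % W_NS
)

body = ET.fromstring(SAMPLE)
found = [el.tag.rpartition('}')[2] for el in paragraphs_and_tables(body)]
print(found)
```

Note that the paragraph nested inside the table cell is selected too, since `descendant::` matches at any depth.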
06-05-2013, 04:33 AM | #3 |
creator of calibre
|
I just pushed a commit that should make it work even with older lxml; I haven't tested it, however.
|
06-06-2013, 02:28 AM | #4 |
Plugin developer
|
Updated, and it is working. However, what about font embedding? I saw some code regarding fonts, but how is it intended to work? I was doing font scanning and caching; what approach does the native plugin take?
At least for me, fonts are not embedded in the resulting epub/azw3.
Update: OS is 64-bit Ubuntu Precise.
Last edited by SauliusP.; 06-06-2013 at 02:31 AM. |
06-06-2013, 02:29 AM | #5 |
creator of calibre
|
Font embedding works fine for me. Look for the @font-face rules that define the embedded fonts.
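One way to look for those rules in an EPUB is to treat it as the zip archive it is and search each stylesheet. The following is a minimal sketch under stated assumptions: the function name `find_font_face_rules` is hypothetical, the regular expression only handles simple, unnested `@font-face` blocks, and the in-memory archive merely stands in for a converted book.

```python
import io
import re
import zipfile

# Matches a simple "@font-face { ... }" block (no nested braces).
FONT_FACE = re.compile(r'@font-face\s*\{[^}]*\}', re.IGNORECASE)


def find_font_face_rules(epub):
    """Map each stylesheet in the EPUB (path or file object) to its @font-face rules."""
    rules = {}
    with zipfile.ZipFile(epub) as zf:
        for name in zf.namelist():
            if name.lower().endswith('.css'):
                css = zf.read(name).decode('utf-8', 'replace')
                matches = FONT_FACE.findall(css)
                if matches:
                    rules[name] = matches
    return rules


# Tiny in-memory archive standing in for a converted book (made-up content).
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('stylesheet.css',
                '@font-face { font-family: "Demo"; src: url(fonts/demo.ttf) }\n'
                'p { margin: 0 }')
    zf.writestr('page_styles.css', 'body { margin: 0 }')
buf.seek(0)

rules = find_font_face_rules(buf)
print(rules)
```

For a real book, pass the path of the `.epub` file instead of the in-memory buffer. An empty result means no stylesheet declares an embedded font. This only applies to EPUB output; AZW3 is not a plain zip archive.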
|
06-06-2013, 02:39 AM | #6 |
Plugin developer
|
I cannot find any in the stylesheets.
|
06-06-2013, 02:46 AM | #7 |
Plugin developer
|
Might this line in the console output be related?
'embed_font_family': None,
If so, where do I set it to "True" or something like that? |
06-06-2013, 02:55 AM | #8 |
creator of calibre
|
That has nothing to do with it. It comes from a later stage in the conversion pipeline, whereas the docx input plugin embeds fonts right at the beginning.
Post a docx file for which you fail to get embedded fonts. |
06-06-2013, 03:36 AM | #9 |
Plugin developer
|
I didn't get fonts embedded for any of 20 docx files, so I believe it is not file-related, but here you go: this is one I've used to demo features.
|
06-06-2013, 03:56 AM | #10 |
creator of calibre
|
That docx file has no embedded fonts in it. If you want the fonts to be embedded you need to tell Word to do that explicitly in the Word options.
In other words, only if a font is embedded in the input document will it be embedded in the output document. |
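Whether a given docx file carries embedded fonts can be checked without opening Word, since a docx is a zip archive. The sketch below assumes that embedded fonts are stored as parts under `word/fonts/` (typically obfuscated `.odttf` files); that layout is an assumption about how Word writes them, and the function name `embedded_font_parts` is hypothetical. It is not how calibre's plugin detects fonts.

```python
import io
import zipfile


def embedded_font_parts(docx):
    """List archive members under word/fonts/ (assumed location of embedded fonts)."""
    with zipfile.ZipFile(docx) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.lower().startswith('word/fonts/') and not name.endswith('/')
        )


def make_fake_docx(with_font):
    """Build a minimal in-memory zip that only mimics a docx layout."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as zf:
        zf.writestr('word/document.xml', '<document/>')
        if with_font:
            zf.writestr('word/fonts/font1.odttf', b'placeholder bytes')
    buf.seek(0)
    return buf


with_fonts = embedded_font_parts(make_fake_docx(True))
without_fonts = embedded_font_parts(make_fake_docx(False))
print(with_fonts, without_fonts)
```

An empty list for a real file suggests the fonts were only referenced by name, which matches the situation described in this post.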
06-06-2013, 04:03 AM | #11 |
Plugin developer
|
Any plans to embed external fonts that are simply used in the document?
|
06-06-2013, 04:15 AM | #12 |
creator of calibre
|
I don't mind adding an option to do that, but it belongs in the main conversion pipeline, not in the docx input plugin. I'll get to it when I have time, or you are welcome to submit a patch for it.
|
06-06-2013, 04:26 AM | #13 |
Plugin developer
|
All right. I simply think that, given the nature of docx, it is inconvenient for users to embed fonts into a docx file: it is a "draft" of the book, not the book itself. Also, the embedding option is global to Word, not per file, which is also inconvenient.
I'll consider the patch, however. I wonder whether font scanning and caching would be a good option for that? It might also be an idea to enable font scanning on Calibre's startup. Otherwise, searching for system fonts on every conversion would be a performance overhead when several conversions take place. |
06-06-2013, 04:33 AM | #14 |
creator of calibre
|
calibre already does cached font scanning. That's how the embed font family option (among many other font processing tasks in calibre) works. If you want to work on the patch, look at how the embed font family option is implemented first.
Also, I don't see why embedding fonts in a draft in progress makes any difference; it's not like it significantly changes the save time for the docx. |
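The general idea behind cached font scanning can be sketched as follows: remember what was read from each font file together with its modification time and size, and re-read a file only when that stamp changes. This is a generic illustration, not calibre's implementation; the class `FontScanCache`, the `read_info` callable and the fake reader are all hypothetical.

```python
import os
import tempfile


class FontScanCache:
    """Remember per-file results so unchanged font files are not re-read."""

    def __init__(self, read_info):
        self.read_info = read_info  # hypothetical callable: path -> metadata
        self.entries = {}           # path -> ((mtime_ns, size), metadata)

    def scan(self, paths):
        result = {}
        for path in paths:
            st = os.stat(path)
            stamp = (st.st_mtime_ns, st.st_size)
            cached = self.entries.get(path)
            if cached is None or cached[0] != stamp:
                # New or modified file: do the expensive read and remember it.
                cached = (stamp, self.read_info(path))
                self.entries[path] = cached
            result[path] = cached[1]
        return result


calls = []


def fake_read_info(path):
    """Stand-in for real font parsing; derives a 'family' from the file name."""
    calls.append(path)
    return {'family': os.path.basename(path).split('.')[0]}


with tempfile.TemporaryDirectory() as folder:
    font = os.path.join(folder, 'Verdana.ttf')
    with open(font, 'wb') as f:
        f.write(b'not a real font')
    cache = FontScanCache(fake_read_info)
    first = cache.scan([font])
    second = cache.scan([font])  # served from the cache, no second read

print(first == second, len(calls))
```

A persistent version would also save `entries` to disk between runs, so that a later conversion does not pay the scanning cost again.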
06-06-2013, 04:41 AM | #15 |
Plugin developer
|
I'm from the old school, where each kilobyte was valuable :-D So embedding, say, the Verdana font in every DOCX I have (>1000), when it is already installed on the system, seems like overhead and wasted space to me :-)
Also, if I have some docx book, I now need to open it in Word with the "embed fonts" feature enabled (which is turned off for most people), save it, and only then convert it. No no, not convenient! |
Tags |
calibre, docx input |
|