Old 03-18-2024, 01:46 PM   #8
slm
Fool
Posts: 383
Karma: 3557934
Join Date: Feb 2003
Device: Kindle Voyage, Kindle PW1, Kobo Glo HD, Nook Glowlight Plus ...
Quote:
Originally Posted by Quoth
My experience is that the Internet Archive "ebooks" are worthless. They are generated automatically from un-proofed OCR text only marginally good for searching.

So I deleted them all and only download PDFs (after checking they are really PD) and read them on a tablet.

You are better off doing your own OCR of the PDF and proofing it. Do put page breaks at chapters, sections or other natural breaks in your word processor. Later those will start new files in the epub. A new file is the only reliable page break and works for epub converted to mobi, azw3/KF8, dual mobi and KFX.
Just for the record--and to offset the quoted view--my experience with reading many Internet Archive "epubs" OCR'd from PDFs over many years is that about two-thirds of them are perfectly OK and about 10% are completely unusable. Note that this is just for casual reading.