Old 10-15-2014, 04:56 PM   #1
Rob557
Zealot
Posts: 108
Karma: 810
Join Date: Jul 2012
Device: Kobo
Adding books - filename RegEx author FN (initials) LN

Apologies if an answer is already posted, but I couldn't find one as I struggle to understand RegEx (even with the help of RegEx Buddy). FWIW this is a special case situation where I do not have metadata and rely on the existing filename to determine the author and title.

(?P<author>.+) - (?P<title>[^_]+) is the default RegEx that I currently have under "preferences" for adding books to Calibre, for obtaining the author and title from the filename rather than from metadata, but that RegEx flips the order of FN LN for the author:

filename 1: Tom Jones - My Book.epub
resulting author: Jones Tom

filename 2: Tom G. Jones - My Book.epub
resulting author: G. Jones Tom
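
One way to check whether the expression itself reorders the name is to run it outside Calibre. The following is a minimal Python sketch, assuming Calibre matches the pattern against the filename without its extension and that Python's `re` module treats this pattern the same way; the helper name `parse_filename` is made up for illustration.

```python
import os
import re

# The default expression quoted in the post above.
DEFAULT_PATTERN = re.compile(r"(?P<author>.+) - (?P<title>[^_]+)")

def parse_filename(filename):
    """Return the named groups captured from a filename (hypothetical helper)."""
    stem, _ext = os.path.splitext(filename)  # assumption: extension is dropped first
    match = DEFAULT_PATTERN.search(stem)
    return match.groupdict() if match else {}

for name in ("Tom Jones - My Book.epub", "Tom G. Jones - My Book.epub"):
    print(name, "->", parse_filename(name))
```

Under those assumptions the author group comes back in filename order, first name first, which suggests the reordering happens somewhere after the regex step.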

I've tried other RegEx expressions that I have seen posted that are even more sophisticated, to optionally accommodate series name and number, but those RegEx all seem to have the same strange effect when determining the author.
e.g.
^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?(\[?(?P<series>[^0-9\-]+) (- )?(?P<series_index>[0-9.]+)\]?\s*-\s*)?(?P<title>.+)
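
The longer expression can be checked the same way. This is only a sketch in Python's `re` module, which may differ in small ways from the engine Calibre uses; the filename with a series ("Saga 2") is an invented example, and `extract` is a hypothetical helper. The author group can capture trailing whitespace, so the sketch strips each value.

```python
import re

# The series-aware expression quoted in the post above.
SERIES_PATTERN = re.compile(
    r"^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?"
    r"(\[?(?P<series>[^0-9\-]+) (- )?(?P<series_index>[0-9.]+)\]?\s*-\s*)?"
    r"(?P<title>.+)"
)

def extract(stem):
    """Return the stripped named groups for a filename stem (hypothetical helper)."""
    match = SERIES_PATTERN.match(stem)
    if match is None:
        return {}
    # Groups that did not take part in the match are None and are left as None.
    return {key: value.strip() if value else value
            for key, value in match.groupdict().items()}

print(extract("Tom Jones - My Book"))
print(extract("Tom Jones - Saga 2 - My Book"))
```

Here too the author is captured in the order it appears in the filename, so this expression does not swap the names either.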

I think I must be missing something because it seems non-intuitive for the default for these various RegEx approaches to switch the order of FN and LN as it appears in the filename. The results don't even generate the comma version of FN, LN so that the problem can be fixed in Calibre.

Thanks for any help on this.
Old 10-15-2014, 05:16 PM   #2
BetterRed
null operator (he/him)
Posts: 20,665
Karma: 26966376
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@Rob557 - maybe it's not the regex that's flipping the names; maybe it's the settings in Tweaks

Quote:
ID: author_sort_copy_method

The algorithm used to copy author to author_sort. Possible values are:

invert: use "fn ln" -> "ln, fn"
copy : copy author to author_sort without modification
comma : use 'copy' if there is a ',' in the name, otherwise use 'invert'
nocomma : "fn ln" -> "ln fn" (without the comma)

When this tweak is changed, the author_sort values stored with each author must be recomputed by right-clicking on an author in the left-hand tags pane, selecting 'manage authors', and pressing 'Recalculate all author sort values'.

The author name suffixes are words that are ignored when they occur at the end of an author name. The case of the suffix is ignored and trailing periods are automatically handled. The same is true for prefixes.

The author name copy words are a set of words which if they occur in an author name cause the automatically generated author sort string to be identical to the author name. This means that the sort for a string like Acme Inc. will be Acme Inc. instead of Inc., Acme

# Author sort name algorithm

Code:
author_sort_copy_method = 'comma'
author_name_suffixes = ('Jr', 'Sr', 'Inc', 'Ph.D', 'Phd', 'MD', 'M.D', 'I', 'II', 'III', 'IV', 'Junior', 'Senior')
author_name_prefixes = ('Mr', 'Mrs', 'Ms', 'Dr', 'Prof')
author_name_copywords = ('Corporation', 'Company', 'Co.', 'Agency', 'Council', 'Committee', 'Inc.', 'Institute', 'Society', 'Club', 'Team')
I struggled with this issue, until I found this plug-in ==>> [GUI Plugin] Quick Preferences.

BR
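
The four `author_sort_copy_method` values in the quoted tweak can be sketched roughly as follows. This is a simplified illustration in Python, not Calibre's actual code: the function name `author_to_author_sort` is hypothetical, and the sketch ignores the prefix, suffix, and copy-word lists. It treats the last word as the last name.

```python
def author_to_author_sort(author, method="comma"):
    """Simplified model of the quoted tweak (hypothetical, not Calibre's code)."""
    if method == "copy":
        return author                      # copy: no modification
    if method == "comma" and "," in author:
        return author                      # comma: behave like 'copy' when a ',' is present
    tokens = author.split()
    if len(tokens) < 2:
        return author                      # a single word has nothing to invert
    first_names, last_name = " ".join(tokens[:-1]), tokens[-1]
    separator = " " if method == "nocomma" else ", "
    return last_name + separator + first_names

for method in ("invert", "copy", "comma", "nocomma"):
    print(method, "->", author_to_author_sort("Tom Jones", method))
```

Note that this tweak only controls the author_sort value derived from the author, not the author field itself.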
Old 10-15-2014, 08:13 PM   #3
Rob557
Quote:
Originally Posted by BetterRed View Post
@Rob557 - maybe it's not the regex that's flipping the names; maybe it's the settings in Tweaks

I struggled with this issue, until I found this plug-in ==>> [GUI Plugin] Quick Preferences.
BR
Thanks for your response, BetterRed. I think the circumstances here are a bit different, and neither that tweak setting nor the author-sort parameters seem to be at issue. The Quick Preferences plugin has some useful features, but I think what this problem needs is a RegEx that does a straightforward extraction of the author name (and title) from the filenames illustrated in my first post. Please let me know if I have misunderstood.

Within the Preferences / Adding Books screen there is a basic test for seeing the results of the RegEx on a sample filename, and that test confirms the problem as well.
Old 10-15-2014, 08:52 PM   #4
BetterRed
@Rob557 - Every time I think I understand the author name handling something comes along that I don't understand.

I can't get the behaviour you are getting even if I try - see first attachment. No matter what value I put in the Author sort name algorithm tweak, what you see is what I get with that first template.

Surely the regex expression in the Add Books controls is limited to extraction?

To change the name (after extracting) you would need a replace regex... wouldn't you? Something like "\2 \1", where \2 was the last token and \1 was everything before the last token.
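
The replace idea in the previous paragraph can be tried in isolation. This is a sketch using Python's `re.sub`; the search pattern is an assumption, since the post only gives the replacement string `\2 \1`, and `move_last_token_first` is a hypothetical helper name.

```python
import re

# \1 = everything before the last token, \2 = the last token (assumed pattern).
NAME_PATTERN = re.compile(r"^(.+)\s+(\S+)$")

def move_last_token_first(name):
    """Apply the "\\2 \\1" replacement to a name (hypothetical helper)."""
    return NAME_PATTERN.sub(r"\2 \1", name)

for name in ("Tom Jones", "Tom G. Jones"):
    print(name, "->", move_last_token_first(name))
```

A single-word name does not match the pattern and is returned unchanged.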

Added : I added the file Tom Jones - My Book.epub with that template, the resultant Metadata is shown in second attachment

Can you confirm that's what you want, because I may be misunderstanding you ?

BR
Attached Thumbnails: Capture1.JPG (ID 129717), Capture2.JPG (ID 129718)

Last edited by BetterRed; 10-15-2014 at 09:08 PM.
Old 10-15-2014, 10:51 PM   #5
Rob557
Hi BetterRed,

The RegEx that you used is a bit different from what I had as my current default, but when I use your RegEx you will see from the image that the result that it produces is different from yours. Very strange!??

One difference is that I am using Calibre version 2.3 and yours is 2.5 but I wouldn't think that would be the reason? I can look at updating the version tomorrow but would be surprised. Can you see anything else I might be missing?
Attached Thumbnails: RexEx for adding book.jpg (ID 129721)
Old 10-16-2014, 12:46 AM   #6
BetterRed
@Rob557 - I think I found it, right under our collective blithering noses. See the attachment: uncheck the highlighted box.

Sorry - I guess I won't forget that one again Ψ²

BR
Attached Thumbnails: Capture.JPG (ID 129722)
Old 10-16-2014, 10:08 AM   #7
Rob557
Hi BetterRed,

You are correct.

At first I thought you were mistaken because that box had no effect on the "test" results shown under Preferences / Adding Books, but now I realize that with the "Swap author firstname and lastname" box unchecked, the actual results when adding the book differ from the "test" results and correctly put the first name first.

To hopefully avoid confusion for others I've replaced my prior comments and attachments posted a couple minutes ago with this acknowledgment that you are correct about the importance of that box that you highlighted. I've left in the attachment that shows the comment for that option box is in fact applicable when relying on the filename to define the author and title. Hopefully you will see my revised comments as you form any further response.

Thank you !!!
Attached Thumbnails: RexEx for adding book - 2.jpg (ID 129765)

Last edited by Rob557; 10-16-2014 at 10:26 AM. Reason: oops
Old 10-16-2014, 11:20 AM   #8
Rob557
Just adding a cautionary note with regards to how the RegEx "test" works under Preferences / Adding Books.

While applying that RegEx test, it turns out that the test results are NOT immediately or directly affected by checking or unchecking the option box "Swap author firstname and lastname". Instead, the test results depend only on whether that box was checked or unchecked THE LAST TIME the settings were saved using the "Apply" button.

That explains why, in my prior image attachment, the test results still show the firstname and lastname flipped around even though the option box is not checked. So long as the option box is unchecked, the actual results will be okay if that option selection is saved, and even the test results will be okay the next time the "Preferences / Adding books" RegEx test is run with that option box unchecked.
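
For anyone curious what the swap option appears to do, here is a small Python sketch that reproduces the results reported in the first post. It is inferred only from those two examples and is not taken from Calibre's source; the function name `swap_author_names` is hypothetical. It moves the first word of the name to the end.

```python
def swap_author_names(author):
    """Move the first word to the end, matching the outputs seen in this thread."""
    parts = author.split(None, 1)  # split once, on the first run of whitespace
    if len(parts) < 2:
        return author              # a single word has nothing to swap
    first_token, remainder = parts
    return remainder + " " + first_token

for name in ("Tom Jones", "Tom G. Jones"):
    print(name, "->", swap_author_names(name))
```

That matches "Jones Tom" and "G. Jones Tom" from the filenames at the top of the thread, which is consistent with the checkbox, rather than the RegEx, being responsible for the flipped names.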
Old 10-16-2014, 04:53 PM   #9
BetterRed
Maybe Test could be disabled if there are pending changes to the actual settings. Also the option should be moved so that it lines up with other controls such as Mark Books (see attachment) - IMO of course.

I only noticed the presence of the "Swap author firstname and lastname" checkbox (its default value is off) when I installed a fresh portable copy and changed the first setting, "Read metadata from file contents rather than file name", to do a test with a real file.

BR
Attached Thumbnails: Capture.JPG (ID 129778)