08-21-2023, 07:46 PM | #16 |
Wizard
Posts: 1,146
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
You could also use the Characters list in the Reports module. Double-clicking an entry in the character list jumps to the next instance of that character.
|
08-21-2023, 11:04 PM | #17 |
creator of calibre
Posts: 43,954
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
This is a Python-based regex engine, so you need \U followed by 8 hex digits; insert three leading zeros. For example, for pouting cat face (😾, U+1f63e):
\U0001f63e |
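A minimal sketch of the 8-digit escape, using Python's standard `re` module as a stand-in for the editor's engine (an assumption; the sample text and variable names are made up for illustration). The pattern is a raw string, so the regex engine itself interprets `\U0001f63e`:

```python
import re

# Sample text containing a non-BMP character (U+1F63E, pouting cat face).
text = "angry \U0001f63e cat"

# Raw string: the backslash escape reaches the regex engine unchanged,
# which then reads \U followed by exactly 8 hex digits.
pattern = re.compile(r"\U0001f63e")

found = pattern.findall(text)    # every occurrence of the character
cleaned = pattern.sub("", text)  # "replace all" with an empty string

# The short 5-digit form is rejected by the stdlib engine as an
# incomplete escape, which is why the leading zeros are needed.
try:
    re.compile(r"\U1f63e")
    short_form_ok = True
except re.error:
    short_form_ok = False
```

Here `found` holds the single emoji and `cleaned` is the text with it removed; `short_form_ok` ends up `False` with the standard library engine.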
08-21-2023, 11:17 PM | #18 | |
Wizard
Posts: 1,146
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
Why does the Search find the character, then not recognize it when trying to replace it? |
|
08-21-2023, 11:23 PM | #19 |
creator of calibre
Posts: 43,954
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That's a limitation of Qt with non-BMP Unicode characters: find will select only the first UTF-16 code unit making up the character. I could possibly work around it, but it's a lot of effort for a niche use case. Just use Replace All for this case.
|
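To illustrate why a non-BMP character spans more than one unit in UTF-16 (the encoding Qt strings use internally), here is a small Python sketch. The helper name `utf16_code_units` is hypothetical and not part of calibre or Qt:

```python
def utf16_code_units(ch):
    """Return the UTF-16 code units that encode the string ch."""
    data = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

cat = "\U0001f63e"             # one code point in Python
units = utf16_code_units(cat)  # two code units: a surrogate pair
bmp_units = utf16_code_units("A")

print([hex(u) for u in units])
```

Python counts `cat` as a single character (`len(cat) == 1`), while the UTF-16 form is the surrogate pair `0xd83d 0xde3e`. A selection that covers only the first of those two units does not cover the whole character.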
08-21-2023, 11:37 PM | #20 |
Wizard
Posts: 1,146
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
|
08-21-2023, 11:45 PM | #21 |
creator of calibre
Posts: 43,954
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Actually the workaround turns out to be quite easy
https://github.com/kovidgoyal/calibr...f210a1129f00ab |
08-22-2023, 12:46 AM | #22 | |
want to learn what I want
Posts: 1,039
Karma: 6422750
Join Date: Sep 2020
Device: Calibre E-book viewer
|
Quote:
Using ^(.), it would mark the first-character occurrences in the search panel, but would not actually replace them ('\1'). Then I found two alternative ways to do it: one is to use ^.?(.*) instead, and the other is surprisingly simple: Alt-select the text "column" vertically and delete it. |
|
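A rough sketch of the two patterns from the post above, run with Python's standard `re` module in multiline mode (an assumption about how the editor applies them; the sample text is invented). Read literally, replacing `^(.)` with `\1` puts the captured character straight back, whereas `^.?(.*)` keeps only the rest of the line:

```python
import re

# Three lines, each starting with a character we want to drop.
text = "1abc\n2def\n3ghi"

# ^(.) -> \1 : the first character is captured and written back unchanged.
unchanged = re.sub(r"^(.)", r"\1", text, flags=re.MULTILINE)

# ^.?(.*) -> \1 : the first character is matched outside the group,
# so only the remainder of each line survives.
stripped = re.sub(r"^.?(.*)", r"\1", text, flags=re.MULTILINE)

print(stripped)
```

With this input, `unchanged` equals the original text and `stripped` is `abc`, `def`, `ghi` on separate lines.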
08-22-2023, 02:38 AM | #23 | ||
Wizard
Posts: 1,146
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
Yep, looks like a simple fix... but probably took a bit to figure out that's what was needed. Quote:
Thanks for the tip!! |
||
08-22-2023, 08:22 PM | #24 |
Junior Member
Posts: 5
Karma: 10
Join Date: Jun 2023
Device: Kobo Clara HD
|
Thank you for the feedback. With the syntax "\U0001D4B7" it finds some characters.
If I have the text "𝐧𝒐𝗏𝑬𝔩𝓤𝗌𝒷.𝓬𝑶𝐦" and search for this, it shows me 4 occurrences (including the wrong ones). But when replacing, it properly replaces only the one character it should match. I assume this also corresponds to your bugfix. In some cases it doesn't seem to work consistently, but I'm looking forward to this bugfix to get better visual information about the matches. Then I'll continue testing. |
|
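As a cross-check outside the editor, the same search can be sketched with Python's standard `re` module (a stand-in for the editor's engine; the helper `high_surrogate` is hypothetical). It also shows that every styled letter in the sample shares the same leading UTF-16 code unit. Whether that is what caused the extra highlights is an assumption, not something confirmed in the thread:

```python
import re

text = "𝐧𝒐𝗏𝑬𝔩𝓤𝗌𝒷.𝓬𝑶𝐦"

# The regex engine works on whole code points, so it matches exactly
# the characters equal to U+1D4B7.
matches = re.findall(r"\U0001D4B7", text)

def high_surrogate(ch):
    """Leading UTF-16 code unit of a non-BMP character."""
    return 0xD800 + ((ord(ch) - 0x10000) >> 10)

# Collect the leading code unit of every non-BMP character in the text.
leads = {hex(high_surrogate(ch)) for ch in text if ord(ch) > 0xFFFF}

print(len(matches), leads)
```

The pattern matches once, and all the styled letters begin with the same high surrogate, so their first code units alone cannot tell them apart.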