
Allow transcripts of umlauts #16

Open
sebix opened this issue Jan 22, 2015 · 5 comments

Comments

sebix commented Jan 22, 2015

For all users of OpenThesaurus who don't use a German keyboard layout and don't know how to use compose keys, it would be very nice to have an automatic conversion for umlaut transcriptions:

  • ue -> ü
  • ae -> ä
  • oe -> ö
  • ss or sz -> ß

dict.cc does the same.
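The transcription rules above can be sketched as a simple substitution pass. This is only an illustration of the requested behavior, not OpenThesaurus code; the function name and the mapping order are assumptions, and a real implementation would need to handle capitalized sequences ('Ae' etc.) and words where 'ue' is not a transcribed umlaut (e.g. 'Feuer').

```python
# Illustrative sketch of the requested conversion (not OpenThesaurus code).
# Maps ASCII transcriptions to their umlaut forms; order matters, so the
# vowel digraphs are applied before the 'ss'/'sz' rules.
TRANSCRIPTIONS = [
    ("ae", "ä"),
    ("oe", "ö"),
    ("ue", "ü"),
    ("ss", "ß"),
    ("sz", "ß"),
]

def deascii(term: str) -> str:
    """Replace ASCII umlaut transcriptions with their umlaut characters."""
    for ascii_seq, umlaut in TRANSCRIPTIONS:
        term = term.replace(ascii_seq, umlaut)
    return term

print(deascii("Kostuem"))  # Kostüm
```

A naive single-pass replacement like this is lossy (it would also rewrite non-transcribed 'ss'), which is one reason a search engine would rather generate candidate variants than rewrite terms destructively.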

@janschreiber
This feature is now implemented and seems to work. However, I found one strange result today:

  • Search for 'Evaskostum' finds 'Evaskostüm' via fuzzy search
  • Search for 'Evaskostuem' finds nothing
  • Search for 'Kostuem' finds 'Kostüm'


sebix commented Aug 10, 2016

Great!

@danielnaber (Owner)

@janschreiber Thanks for the report. Unfortunately it's not easy to fix, as we apply some tricks to make the search fast: the substring search needed here (the actual item is 'im Evaskostüm') works on an in-memory table, but that table doesn't contain the normalized terms this feature would need.

janschreiber commented Aug 11, 2016

@danielnaber Thanks for your explanation. I'm not sure if the following suggestion makes any sense whatsoever, but wouldn't it be possible to apply the normalization to the search terms rather than to the searched data? I mean, couldn't a search for words that contain "umlaut-ish" character combinations such as 'ae' be transformed into a search for (Cäsar|Caesar) before it is even sent to the search algorithm?
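The query-rewriting idea suggested here can be sketched as expanding the search term into an alternation of spelling variants, leaving the indexed data untouched. This is a hypothetical illustration of the suggestion, not the project's implementation; the function name and variant set are assumptions.

```python
import re

# Sketch of query expansion: rewrite a term containing "umlaut-ish"
# sequences into a regex alternation of all spelling variants, e.g.
# 'Caesar' -> '(Caesar|Cäsar)'. Illustrative only.
PAIRS = [("ae", "ä"), ("oe", "ö"), ("ue", "ü")]

def expand_query(term: str) -> str:
    """Return a regex alternation matching the term and its umlaut variants."""
    variants = {term}
    for ascii_seq, umlaut in PAIRS:
        # Each pass adds the variant with this sequence replaced.
        variants |= {v.replace(ascii_seq, umlaut) for v in variants}
    return "(" + "|".join(sorted(variants)) + ")"

pattern = expand_query("Caesar")
print(pattern)  # (Caesar|Cäsar)
```

Because both spellings stay in the alternation, 'Caesar' still matches literal occurrences of 'Caesar' as well as 'Cäsar'.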

@danielnaber (Owner)

The normalization needs to be applied to both the query and the data, but our in-memory database currently isn't a mapping, just a list of words. We'd need to extend it to map each normalized term back to its original term. (Plus, we actually have two different kinds of normalization.)
