-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Datahub: Add generic "keyword" filter with translated values #782
Conversation
ef86949
to
c710132
Compare
Affected libs:
|
d4215ab
to
65319d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not looked at the code yet but just tested a bit, some issues on geocat dev for instance:
- the limit of 1000 for the
tag
facet is not enough as there are more than 1000 keywords - the limit of 1000 for thesaurus keys is not enough for
http://www.eionet.europa.eu/gemet/concept
thesaurus - the runtime_mapping slows a lot the request
- the runtime_mapping is present in all state search request payloads, which makes the search very slow, even when the runtime is not needed
- could be some issues, for instance
opendata.swiss
which have 2 entries, one for the keywords from the thesaurus, one for the same keyword which is not in the thesaurus opendata.swiss
entries sum count is not the same as the same facet in gn legacy ui900 + 1149 != 2566
- might be better to sort alphabetically
I think our method is accurate, but leads to some inconsistency and is slower than just targeting an index entry (eg tag.langfre
)
Might need to talk with Benoit about what is the best tradeoff.
You're right, the performance on the keyword fields with runtime mappings is abysmal on large catalogs, I didn't even realize that. We could use the pipeline system to register a scripted field specifically for keywords, and use it if available to improve the results. If not available, we can just target a As for the GEMET concepts, the thesaurus contains more than 1000 entries so currently some won't be fetched; I thought about doing pagination on these queries, not sure yet. Maybe this is going too far in terms of trying to improve the search results. I'm going to leave this PR for now and will revisit it later; I'll create another PR to extract the changes to the docker composition though, because this will improve e2e tests. |
GitHub Pages links:
|
65319d0
to
0ec742a
Compare
… on the keyword URIs Also read the description from the keywords, just in case
…eric keyword field The INSPIRE keyword field now also uses the same TranslateSearchField implementation, since the thesaurus loading is done behind the scenes by the Platform service.
0ec742a
to
e082a9a
Compare
I have reworked the PR to not rely on pipelines or runtime fields, and simply look for either This is ready for review, although I still have to add documentation on the topic. |
Multilingual fields rely on different fields depending on the current defined metadata language
e082a9a
to
3caa5cf
Compare
2650967
to
d59d9f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good.
I've tested the keyword
and works well, but the inspireKeyword
list is empty, on geocat dev.
Thanks, there was indeed a regression on the inspire keywords. I've added E2E tests for all the new filters to make sure this does not happen again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jahow ! I manually tested it and it LGTM! 🚀
77d8b0a
to
27ae152
Compare
Description
This PR introduces a generic
keyword
filter. Unlike the INSPIRE keyword filter which fetches translation dynamically from the GeoNetwork API, thekeyword
filter simply looks at eithertag.default
ortag.langxyz
fields depending on whether a hardcoded metadata language is defined in the configuration.This PR also simplifies the implementation of both topics and INSPIRE keyword filters by introducing a
TranslatedSearchField
class which relies on the platform service to show a label in the correct language.Architectural changes
No significant changes.
Screenshots
Quality Assurance Checklist
breaking change
labelbackport <release branch>
labelThis work is sponsored by Swisstopo / Geocat.ch.