(Lots of) Missing Words #76
Comments
Heya, I will have a look on Sunday. Thanks!
Heya, I have added the ones you placed in normal view:
Oxford claims "bazoo" is a US word. Thank you for the wordlist; I have downloaded it and will slowly add the words. It may take a very long time to add so many, as they have to be analysed one by one to see whether they accept plurals, etc. Currently I lack the spare time as I have an ongoing PhD, so it may take years to add them all.
Hi Marco, I just found your website and I am impressed by the amount of work you have put into this. I made a comment on this bug report, as I feel it is a general issue across all languages, especially in a more global world: hunspell/hunspell#113. My question in your case: how can I help, both with the list mentioned by Crissum and with this newly proposed list? I can make a pull request for the latter. How did you add the names of cities, for example? I found fairly small towns included, like Frimely.
Heya @amunizp, currently I can't do much, since I have a private PhD presentation in two weeks and I don't know yet when the real presentation will follow. I can't focus on things that take whole days, weeks, or months to do; I am just doing basic stuff, such as creating a rule or two in LanguageTool, updating the website with small enhancements, etc. Here is my plan: on 1-JAN-2025 I will make another release, and if I already have the PhD by then, things will go back to normal in 2025. Thanks!
I acquired an electronic version of the Oxford Dictionary of English and compared its list of headwords with en_GB (Marco Pinto)/wordlist_marcoagpinto_20240501_276252w.txt. I found 72,767 missing words in total, but of course this list requires manual proofreading. Here's a sample:
Among these, 'bawdry' seems like a serious omission, 'bbl.' actually already exists as 'bbl', and 'be-' should not be included, but others are probably too rare to be of any interest…
I browsed through the list and did find some other words that at least I don't have to consult a dictionary to understand:
nonexisting_words.txt
By the way, I'd like to suggest a couple of proper nouns: Brontë, Gaskell, Turgenev, Zola. Actually I removed all headwords with white space. I could upload another version with proper nouns included.
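For anyone who wants to reproduce this kind of comparison, here is a minimal sketch in Python. It is not the script used for the list above; the function name `compare_headwords` and the filtering rules are assumptions based on the cases mentioned in this issue (headwords containing white space are set aside, combining forms like 'be-' are set aside, and an abbreviation like 'bbl.' counts as present if 'bbl' is in the wordlist).

```python
def compare_headwords(headwords, wordlist):
    """Return (missing, skipped) for a headword list checked against a wordlist.

    Hypothetical helper: the rules below are assumptions drawn from the
    examples in this issue, not the procedure actually used.
    """
    # Words already in the dictionary wordlist, one per line, blanks ignored.
    known = {line.strip() for line in wordlist if line.strip()}

    missing = []   # candidates that still need manual proofreading
    skipped = []   # multi-word headwords and combining forms, set aside
    seen = set()   # avoid reporting a duplicated headword twice

    for raw in headwords:
        word = raw.strip()
        if not word or word in seen:
            continue
        seen.add(word)

        if any(ch.isspace() for ch in word):
            skipped.append(word)      # headword containing white space
        elif word.endswith("-"):
            skipped.append(word)      # combining form such as 'be-'
        elif word in known or word.rstrip(".") in known:
            continue                  # present, possibly without the trailing dot
        else:
            missing.append(word)

    return missing, skipped


if __name__ == "__main__":
    # Tiny in-memory example; real input would be read from text files
    # opened with encoding="utf-8", one entry per line.
    sample_headwords = ["bawdry", "bbl.", "be-", "ice cream", "bazoo"]
    sample_wordlist = ["bbl", "bazoo"]
    missing, skipped = compare_headwords(sample_headwords, sample_wordlist)
    print("missing:", missing)
    print("skipped:", skipped)
```

The output of such a script is only a list of candidates; each word would still need to be checked by hand for plurals and other affixes before being added.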