Heterophones not in the list? #23
Comments
Thanks for raising this question, but I'm not sure I quite understand the issue. Heterophones are explicitly included in the data structure, as described in the Readme:
Furthermore, the entries for
These seem to pretty unambiguously include the multiple possible pronunciations for each word. If you are referring to the data in the |
Right, yes! Sorry, I was, as I am British and so naturally gravitated towards that list 😛 On the topic of heterophones, as this would affect my pull request: what would the opinion be about including different pronunciations from different accents? For instance grass is pronounced |
@sancarn The more accents/dialects/speech varieties the better! The current approach is to separate these into different dictionaries so that the list for each language variant is, internally speaking, as phonemically consistent as possible. So, just, as I'm hoping someone will generate
In terms of expanding the
Looking at the code that generated that list, it appears that heteronyms are intentionally stripped out based on a list of 972 heteronyms (which include both
If you'd like to give this a try, please go ahead, and I would be happy to accept the resulting PR. I would also be glad to look into this myself but likely won't have a chance to do so until early August at the earliest. |
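For anyone picking this up, here is a rough sketch of the kind of filtering described above: entries whose word appears in a heteronym list are set aside instead of being written to the output. This is not the project's actual generation code. The function name `split_heteronyms`, the dict-of-lists layout, and the sample transcriptions (roughly General American) are all assumptions made for illustration.

```python
def split_heteronyms(entries, heteronyms):
    """Partition word -> pronunciations entries by membership in a heteronym list.

    `entries` maps a word to a list of IPA strings (hypothetical layout).
    `heteronyms` is any iterable of words to set aside.
    """
    excluded = {word.lower() for word in heteronyms}
    kept, stripped = {}, {}
    for word, pronunciations in entries.items():
        # Case-insensitive match so "Tear" and "tear" are treated alike.
        target = stripped if word.lower() in excluded else kept
        target[word] = list(pronunciations)
    return kept, stripped


# Illustrative sample data only; not taken from the real dictionary files.
entries = {
    "tear": ["/tɪɹ/", "/tɛɹ/"],
    "record": ["/ˈɹɛkɚd/", "/ɹɪˈkɔɹd/"],
    "grass": ["/ɡɹæs/"],
}
heteronyms = ["Tear", "Record"]

kept, stripped = split_heteronyms(entries, heteronyms)
```

Keeping the stripped entries in their own dict, instead of discarding them, would make it straightforward to merge them back in with all of their pronunciations if the list is later expanded.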
Heterophones like Tear and Record are not present in the IPA dictionary for English. Is there a specific reason for this, as I understand these wouldn't fit neatly into the data structure? Or is it just an oversight?