Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Naive Bayes: Ignore existing classes in Laplacian smoothing #3575

Merged
merged 1 commit into from
Feb 15, 2019

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Feb 2, 2019

Issue

As suggested by @lanzagar, issue #2943 can be fixed by ignoring empty classes in Laplacian smoothing.

Plainly, say that y has values male, female, yes and no because y appeared in two different data sets loaded in the same session. Of course, only two values appear in each set. With this PR, only 2 is added to denominators in computation of probabilities, and the probabilities of empty classes are set to 0.

Variable reuse remains a large open issue.

Description of changes

The change also required a minor change in computation of probabilities: instead of computing exp(log(class_probs) + sum(conditional_probs)), the code now computes class_probs * exp(sum(conditional_probls)) because class_probs can now be 0. This should not affect numerical stability.

Includes
  • Code changes
  • Tests

@janezd janezd force-pushed the naive-bayes-ignore-empty-classes branch from 325575e to 6fcd7cf Compare February 2, 2019 15:45
@codecov
Copy link

codecov bot commented Feb 2, 2019

Codecov Report

Merging #3575 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3575      +/-   ##
==========================================
+ Coverage   83.97%   83.97%   +<.01%     
==========================================
  Files         370      370              
  Lines       66944    66945       +1     
==========================================
+ Hits        56213    56214       +1     
  Misses      10731    10731

@janezd janezd force-pushed the naive-bayes-ignore-empty-classes branch from 6fcd7cf to de273b1 Compare February 2, 2019 16:52
@janezd janezd force-pushed the naive-bayes-ignore-empty-classes branch from de273b1 to 5e0fa06 Compare February 2, 2019 17:56
@janezd janezd changed the title Naive Bayes: Ignore existing classes in Laplacian smoothing [RFC] Naive Bayes: Ignore existing classes in Laplacian smoothing Feb 2, 2019
@lanzagar
Copy link
Contributor

Looks good to me.
It still has the [RFC] tag, but unless others have additional comments, I am fine with changing this to [FIX] and merging.

@janezd janezd changed the title [RFC] Naive Bayes: Ignore existing classes in Laplacian smoothing Naive Bayes: Ignore existing classes in Laplacian smoothing Feb 15, 2019
@lanzagar lanzagar changed the title Naive Bayes: Ignore existing classes in Laplacian smoothing [FIX] Naive Bayes: Ignore existing classes in Laplacian smoothing Feb 15, 2019
@lanzagar lanzagar merged commit 5fd880c into biolab:master Feb 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants