Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document and test utility functions (e.g., mean_na) #53

Open
5 tasks
grinnellm opened this issue Jun 28, 2022 · 1 comment
Open
5 tasks

Document and test utility functions (e.g., mean_na) #53

grinnellm opened this issue Jun 28, 2022 · 1 comment
Assignees

Comments

@grinnellm
Copy link
Owner

grinnellm commented Jun 28, 2022

Write documentation and proper tests for utility functions. For all, complete tests including proper class for input and output. Also, use ... to pass say, omit_na to the mean() function in mean_na(x, ...) as described here. Note that this will require changing the functions where they are used because the default for na.rm in mean() is FALSE but the default for omit_na in mean_na() is TRUE.

  • max_na
  • sum_na
  • mean_na
  • wt_mean_na
  • unique_na: Add a test for empty set.

These were copied from the "pbs-assess/gfiscamutils" repo because they are on the "herring" branch but not the main branch; was causing an issue with installing.

@grinnellm
Copy link
Owner Author

grinnellm commented Jun 29, 2022

Include some motivation for these functions in the documentation. They're the result of some unexpected (to me) behaviour using sum() in the "dplyr" summarise functions:

require(dplyr)

dat <- tibble( Location = rep(c("A", "B", "C"), each = 3), Length = c(1:5, rep(NA, times = 4)) )

res <- dat %>% group_by(Location) %>% summarise( Max = max(Length, na.rm = TRUE), MaxNA = max_na(Length), Sum = sum(Length, na.rm = TRUE), SumNA = sum_na(Length), Mean = mean(Length, na.rm = TRUE), MeanNA = mean_na(Length), ) %>% ungroup()

The mean is as I expected, but the sum caught me by surprise. I expected the Sum for Location "C" to be NA instead of 0. It turns out this is the correct behaviour for the sum:

sum(NA, na.rm = TRUE)

should give 0, not NA (or NaN) because the sum of an empty set is zero.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant