Review validation and shapes in demography #342
Conversation
```diff
     _validate_demography_array_arguments(
         trait_args={"h_max": h_max, "a_hd": a_hd}, size_args={"dbh": dbh}
     )

-    return h_max * (1 - np.exp(-a_hd * dbh / h_max))
+    return _enforce_2D(h_max * (1 - np.exp(-a_hd * dbh / h_max)))
```
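As an illustration of the problem discussed below (the numbers here are made up, not from the PR), the same height calculation returns 0, 1 or 2 dimensional results depending on the shapes of its inputs:

```python
import numpy as np

def stem_height(h_max, a_hd, dbh):
    """The height calculation from the diff above."""
    return h_max * (1 - np.exp(-a_hd * dbh / h_max))

h_max = np.array([25.0, 30.0])  # one trait value per stem
a_hd = np.array([116.0, 116.0])
dbh_col = np.array([[0.1], [0.2], [0.4]])  # sizes as a column vector

print(np.ndim(stem_height(25.0, 116.0, 0.2)))   # 0 - scalar inputs give a 0D result
print(stem_height(h_max, a_hd, 0.2).shape)      # (2,) - 1D, one height per stem
print(stem_height(h_max, a_hd, dbh_col).shape)  # (3, 2) - 2D via broadcasting
```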
@j-emberton and @MarionBWeinzierl
I just wanted to pause here and get a sanity check before rolling this out further. Basically, there are two axes here: traits of a cohort (columns) and some measure of size for which something is being calculated. All of the functions accept arrays with varying dimensions, and the calculations can give rise to output arrays with 0, 1 or 2 dimensions. The calculations all work, but the response dimensionality is then unpredictable. My plan here is to promote all outputs to (n_traits, n_sizes), so that the outputs are consistent.
Does that seem sane?
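For concreteness, a minimal sketch of that promotion (my illustration of the idea, not the actual `_enforce_2D` implementation):

```python
import numpy as np

def _enforce_2D_sketch(values):
    """Promote a 0D or 1D result to 2D: scalar -> (1, 1), (n,) -> (1, n)."""
    arr = np.asarray(values)
    if arr.ndim > 2:
        raise ValueError("Demography outputs should be at most 2D")
    # np.atleast_2d prepends axes, so a 1D vector becomes a single row and
    # an existing 2D array passes through unchanged.
    return np.atleast_2d(arr)
```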
Yes, I think having a consistent shape is useful and reasonable: all downstream calculations can then be simplified to assume that shape.
Oh god. This actually opens a bit of a can of worms. I've added some discussion to the PR description.
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff            @@
##           develop     #342     +/- ##
=========================================
+ Coverage    95.29%   96.09%   +0.80%
=========================================
  Files           28       35       +7
  Lines         1720     2766    +1046
=========================================
+ Hits          1639     2658    +1019
- Misses          81      108      +27
```

☔ View full report in Codecov by Sentry.
Overall, it looks sensible to me. I agree that `_enforce_2D` is not ideal, but I cannot think of a more elegant way of doing this right now.
I think the error messages in two of the tests might not be complete, see above.
Description

This PR:

- Creates some new validation tools in `demography.core`:
  - `_validate_demography_array_arguments`, which is intended to replace `demography.t_model_functions._validate_t_model_args` and `demography.crown._validate_z_qz_args`. These two overlap a lot. The function will check that the trait and size array arguments are consistent (see the sketch after this list).
  - `_enforce_2D`, which will be used to ensure a common return dimensionality from functions: various legal inputs can give rise to 0, 1 or 2D results from calculations, and it seems easier to fix the single return array.
- Updates validation and shapes in demography:
  - `StemTraits` is now validated when created directly, but `Flora.get_stem_traits()` bypasses validation because the inputs are guaranteed correct.
  - `t_model_functions` now use `_validate_demography_array_arguments` and have new validation flag arguments.
  - The `StemAllometry` and `StemAllocation` classes now also optionally validate the inputs once; the inputs should then be congruent, so the individual function validation is skipped.
  - There are matching changes in `crown`, `community` and `canopy`. The tests are updated to match and the docs issues are fixed.
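The sketch mentioned above, with hypothetical names and assumed behaviour (trait arguments as equal-length 1D arrays, size arguments broadcastable against them); the real validator may well check more than this:

```python
import numpy as np

def _validate_demography_arrays_sketch(trait_args: dict, size_args: dict) -> None:
    """Hypothetical sketch of a shared validator for demography array inputs."""
    trait_shapes = {name: np.shape(val) for name, val in trait_args.items()}
    if any(len(shape) != 1 for shape in trait_shapes.values()):
        raise ValueError(f"Trait arguments must be 1D arrays: {trait_shapes}")
    if len(set(trait_shapes.values())) > 1:
        raise ValueError(f"Trait arguments are of unequal length: {trait_shapes}")
    # Assumes at least one trait argument is supplied.
    (n_stems,) = next(iter(trait_shapes.values()))
    for name, val in size_args.items():
        try:
            np.broadcast_shapes(np.shape(val), (1, n_stems))
        except ValueError as excep:
            raise ValueError(
                f"Size argument '{name}' is not broadcastable to {n_stems} stems"
            ) from excep
```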
Dimensionality and validation

There's an issue with differing approaches to validation here. The `_enforce_2D` function is about ensuring that the output dimensions are consistent with the sizes vs stems array size model.

I have used `check_input_shapes` within the validation. This is a `pyrealm.core` function that was set up for use with the `PModel` parts of the code, and it insists that arrays are not simply broadcastable, but are identically shaped or scalar. That is a part of the `pyrealm` API that I dislike and would like to get rid of: users have to broadcast inputs to get shapes to match before anything will run. I suspect we could relax `check_input_shapes` and simply apply `np.broadcast_shapes` as a validation tool (see the sketch below). The existing code should then handle broadcasting correctly and users would not need to pre-broadcast data. They would still have to know which axis is which, and allowing `xarray` inputs would make this easier.
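A minimal sketch of that relaxed check (hypothetical code, not an existing `pyrealm` function):

```python
import numpy as np

def check_broadcastable(*arrays) -> tuple[int, ...]:
    """Accept any mutually broadcastable inputs and return the common shape.

    Unlike an identical-shape check, this only requires that numpy can
    broadcast the inputs together, raising ValueError if it cannot.
    """
    try:
        return np.broadcast_shapes(*(np.shape(arr) for arr in arrays))
    except ValueError as excep:
        raise ValueError("Arguments are not broadcastable") from excep

# For example, a (3, 1) sizes array and a (2,) traits array are compatible:
# check_broadcastable(np.ones((3, 1)), np.ones(2)) returns (3, 2)
```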
But if we use `check_input_shapes` here, then the internals of the demography are going to start insisting on identical shapes where values are actually sanely broadcastable. That ends up with insisting that all the initial demography inputs are identically shaped, which is a significant usability issue and makes for uglier and more convoluted code.

So, I'm going to try and get the `demography` code working with the less constraining 'broadcastable' checking. I think that's something we should try and adopt more widely, but there is a difference in approach for now.

Fixes #317
Type of change
Key checklist
- `pre-commit` checks: `$ pre-commit run -a`
- `$ poetry run pytest`
Further checks