Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract forecast reference time #118

Merged
merged 19 commits into from
Oct 10, 2024
Merged

Conversation

truth-quark
Copy link
Collaborator

@truth-quark truth-quark commented Oct 1, 2024

First step to doing #100.

This PR is intended for facilitating testing functionality for fixing the forecast reference time.

As of late Sept 2024, work up to commit fd664c7 only provides a skeleton of unit tests. It doesn't meaningfully test any cube time changes. This PR includes some skipped tests as the testing required is unclear.

Work on this PR needs to answer & address the following:

  • Is the time.units.calendar == 'gregorian' code branch & testing required?
  • Is the time.units.calendar == 'proleptic_gregorian' code branch & testing required?
  • What inputs should be tested?
  • What outputs should be tested?
  • What cube state/attrs should be tested?
  • Is the code branch covered by test_fix_forecast_reference_time_standard() required?

These questions will help determine what work is required. Any comments/ideas welcome!

Fixing & testing this function is specified in #141.

@truth-quark
Copy link
Collaborator Author

Adding Martin as a reviewer to help clarify what is required.

umpost/um2netcdf.py Outdated Show resolved Hide resolved
@CodeGat CodeGat self-requested a review October 2, 2024 00:27
test/test_um2netcdf.py Outdated Show resolved Hide resolved
CodeGat
CodeGat previously approved these changes Oct 2, 2024
Copy link

@CodeGat CodeGat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. But note that I'm missing some of the domain knowledge needed to give a more 'conceptual' approval, so probably wait for others to give their review :D

@blimlim
Copy link
Collaborator

blimlim commented Oct 2, 2024

Pulling this section out into a separate function sounds like a good idea!

I'm still trying to understand the original version of the code, but have a couple of comments and questions so far:

It's a bit tricky to follow I think because of the dynamic overwrite of the PPfile.calendar method:

# Override the PP file calendar function to use Proleptic Gregorian rather than Gregorian.
# This matters for control runs with model years < 1600.
@property
def pg_calendar(self):
"""Return the calendar of the field."""
# TODO #577 What calendar to return when ibtim.ic in [0, 3]
calendar = cf_units.CALENDAR_PROLEPTIC_GREGORIAN
if self.lbtim.ic == 2:
calendar = cf_units.CALENDAR_360_DAY
elif self.lbtim.ic == 4:
calendar = cf_units.CALENDAR_365_DAY
return calendar
# TODO: Is dynamically overwriting PPField acceptable?
PPField.calendar = pg_calendar

When I load in an ESM1.5 output file independently, e.g. /g/data/tm70/sw6175/nc_conversion/output_dirs_for_testing/output000/atmosphere/aiihca.paa1jan, I get cube.coord("time").units.calendar is standard. However, when I run um2nc on it, the if time.units.calendar == 'proleptic_gregorian' and refdate.year < 1600: branch is still executed because of the above overwrite.

Is the time.units.calendar == 'gregorian' code branch & testing required?

Because of the PPfile.calendar overwrite, I suspect we will never have a cube with a gregorian or standard calendar, and so I'm wondering if we can remove this branch entirely, or at least skip tests for it?

Is the time.units.calendar == 'proleptic_gregorian' code branch & testing required?

I think it would be good to test this branch eventually. Paleoclimate simulations usually use years before 1600, and so this branch would be used quite regularly.


@MartinDix, do you know why the forecast_reference_time coordinate is used to determine if the date is before 1600, rather than using the time coordinate directly?

reftime = cube.coord('forecast_reference_time')
time = cube.coord('time')
refdate = reftime.units.num2date(reftime.points[0])

I'm wondering if it's to ensure that the calendar of dump files isn't changed, e.g.

except iris.exceptions.CoordinateNotFoundError:
# Dump files don't have forecast_reference_time
pass

but thought it would be good to double check!

@truth-quark
Copy link
Collaborator Author

I'll link in some previous work on convert_propleptic() in issue #37.
There's part of a unit test here which might be useful...

@aidanheerdegen aidanheerdegen self-requested a review October 9, 2024 05:20
@truth-quark
Copy link
Collaborator Author

truth-quark commented Oct 9, 2024

Update: following the 9/10/2024 handover meeting, work on this will be split into 2 parts:

  1. Extract the time/calendar functions & provide basic unit test scaffolding (this PR)
  2. Retrofit testing in Retrofit unit tests for fix_forecast_reference_time() #141 & cover in a second PR.

@truth-quark truth-quark changed the title Extract & test forecast reference time Extract forecast reference time Oct 9, 2024
@truth-quark truth-quark requested review from blimlim and removed request for MartinDix October 9, 2024 05:30
Copy link
Member

@aidanheerdegen aidanheerdegen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass, didn't have time to check the tests. Can take another look tomorrow.

umpost/um2netcdf.py Show resolved Hide resolved
umpost/um2netcdf.py Show resolved Hide resolved
umpost/um2netcdf.py Show resolved Hide resolved
@truth-quark
Copy link
Collaborator Author

First pass, didn't have time to check the tests. Can take another look tomorrow.

This PR has given us several open questions & highlighted that the requirements are not really known. This function partly blocks work to fix structural/architectural repairs too. To address all these, how is the following?

How does that sound?

umpost/um2netcdf.py Outdated Show resolved Hide resolved
umpost/um2netcdf.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@blimlim blimlim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the modularisation step looks good, and looks like it will behave exactly the same as before.

Setting up the tests and fixtures for this seems very tricky and intricate though, especially due to the pg_calendar overwrite. I'm wondering whether it would be easiest/fastest for this PR to just do the extraction into a separate function, and defer all of the tests to #141?

test/test_um2netcdf.py Show resolved Hide resolved
test/test_um2netcdf.py Show resolved Hide resolved
test/test_um2netcdf.py Show resolved Hide resolved
test/test_um2netcdf.py Show resolved Hide resolved
test/test_um2netcdf.py Show resolved Hide resolved
@truth-quark
Copy link
Collaborator Author

@blimlim ^ @aidanheerdegen do you want to check over the recent sets of changes & confirm if they cover the discussion threads here? I'll define "cover" as implements smaller fixes OR makes a note for a future fix.

@blimlim
Copy link
Collaborator

blimlim commented Oct 10, 2024

confirm if they cover the discussion threads here?

Changes look good

Would you be happy to add comments to the forecast_ref_time_coord and time_coord fixtures to note that their unit.calendar attributes will likely have to be changed in the future, since um2nc only ever uses the cf_units.CALENDAR_PROLEPTIC_GREGORIAN, cf_units.CALENDAR_360_DAY, and cf_units.CALENDAR_365_DAY calendars?

@truth-quark
Copy link
Collaborator Author

Would you be happy to add comments to the forecast_ref_time_coord and time_coord fixtures to note that their unit.calendar attributes will likely have to be changed in the future

That's gone in with 3820528.

Copy link
Member

@aidanheerdegen aidanheerdegen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LFTM

Copy link
Collaborator

@blimlim blimlim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@truth-quark
Copy link
Collaborator Author

truth-quark commented Oct 10, 2024

Setting up the tests and fixtures for this seems very tricky and intricate though, especially due to the pg_calendar overwrite. I'm wondering whether it would be easiest/fastest for this PR to just do the extraction into a separate function, and defer all of the tests to #141?

Yep, initial attempts at testing were fiddly & nasty to set up the data structures. Approaching in small, logical steps, extracting helper funcs etc should make things better with gradual logic reduction.

I've also referenced #141 in the main body of this PR.

@truth-quark truth-quark merged commit 43836d2 into main Oct 10, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants