-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add numeric_only=True to relevant pandas operations #259
Conversation
Objects: DataFrameGroupBy -> mean(), min(), max(), median(), quantile(), Roller -> same If this is correct, it should replicate old pandas behaviour (i. e., v0.XX)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blychs I think you could go ahead with the change to using 'h'
for hourly freq instead of 'H'
. I tested using freq='h'
in date range with pandas v1.0.5 (June 2020) and it worked fine.
Some places I identified:
melodies_monet/_cli.py
date ranges- date range in
docs/examples/idealized.ipynb
ts_avg_window
indocs/examples/
andexamples/yaml
YAML files- resamples in driver and
melodies_monet/util/tools.py
- comments "pandas resample rule (e.g., 'H', 'D')" and "Pandas resampling rule (e.g., 'H', 'D')", which appear multiple places
The reformat*
notebooks we are probably going to remove in the future so I wouldn't bother modifying them.
"H" is deprecated "h" also works with older versions This conforms also with numpy TODO: Modify examples, documentation and batch scripts
"H" is deptracted "h" also works with older versions This also conforms with numpy TODO: Modify ipynb files to conform with this. Just running them should do the trick.
@zmoon, I just prepared a new version, but now the newer scorecard plots (which I did not see there before) are failing in the updated version. I will update this PR once I've corrected the issue. It should not be merged until then. |
Older versions did this automatically, but with newer versions, it returns a Datarray with 1 value. This affects plots/surfplot.py scorecard plot.
The corrections are done. airnow_wrfchem.ipynb has the same result as the current code. The mistake was probably due to newer package versions, and is explained in the commit message. Let me know if any other issue appears. Having said that, all the notebooks in docs/examples run without issues, except for camchem.ipynb. When I am running that, there is a conflict in the renaming of lat and lon. If I am not mistaken, though, the problem seems to be in the monet_accessor.py, since the dataset seems to have latitude and longitude created before renaming. I get the same error if running the original melodies_monet develop branch, without my modifications. |
So, @rschwant or @zmoon , the issue seems to have nothing to do with this PR. I added it as Issue (#261) , together with a complete description and a few ideas of how to solve it. These can be at different levels, but I guess that that discussion should be continued there. I have not found any other problems with this PR, and I think it is ready to undergo the tests that you deem necessary before (hopefully, if nothing else goes wrong) merging. Edit: corrected typos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just had 2 quick questions and then I can approve this later today. We can discuss quickly in the meeting today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zach confirmed my questions. Looks great thanks for updating this so we can move to Pandas 2!
@zmoon, Do you know why these tests are failing though? Is it using ts_avg_window: '3h' in the idealized example? What error have you all been seeing on this? |
@rschwant the issue is an incompat between pandas v1 and matplotlib 3.9, which was released on conda-forge a week ago. pandas v2 deals with the matplotlib change but I guess it wasn't backported to v1. To fix this for now, we could add |
Yes, that sounds good we can recommend an earlier version of matplotlib now too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great thanks for making that update Zach!
Zach suggested it should be okay. Pablo added it in along with the other
updates to convert our code to be acceptable to Pandas 2.
It looks like our ReadTheDocs on the develop branch is still running okay
here:
https://melodies-monet.readthedocs.io/en/develop/examples/airnow_wrfchem.html
So it seems like you can go either way with this. Beiming let me know if it
does cause problems in your code though?
…On Mon, Jul 15, 2024 at 11:58 PM Beiming Tang ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In melodies_monet/plots/surfplots.py
<#259 (comment)>
:
> @@ -1250,7 +1250,7 @@ def scorecard_step4_GetRegionLUCDate(ds_name=None,region_list=None,datelist=None
region_date_rural_here = []
region_date_urban_here = []
for i in range(len(region_here['Time'])):
- date_here1 = region_here['Time'][i]
+ date_here1 = region_here['Time'].values[i]
@rschwant <https://github.com/rschwant> @zmoon <https://github.com/zmoon>
I saw the change, I cannot say it works. we need to run and test, if test
if past then we can change?
btw, who changed it?
—
Reply to this email directly, view it on GitHub
<#259 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AJD4V5WIFLKDH4HEPPORYPTZMSZANAVCNFSM6AAAAABJW3NSMCVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCNZZGI4DANRRHE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
Becky Schwantes
Research Chemist
NOAA Chemical Sciences Laboratory
|
This pull request adds the keyword
numeric_only=True
to relevant pandas objects,i.e. pandas.core.groupby.DataFrameGroupBy and pandas.core.window.rolling.Rolling .
The relevant operations are
mean()
,min()
,max()
,median()
andquantile()
.This makes it compatible with pandas 2.2.2, and should be a nice start for future proofing MM, without breaking backwards compatibility (i., e., pandas 1, for monet).
If we want to keep it future proof, we should also update the references to freq='H' to freq='h', but I am not sure about compatibility with older versions of pandas for this, so I have kept it the way it was for now.
This has been tested for the surface plots with wrf-chem and AirNow.
Let me know if additional tests are required.