How to select candidate equations by test-data performance more efficiently? #622
Hi, thanks for developing such a useful tool! I am trying to discover equations from observational data. I run PySR with different parameter settings (e.g., complexity, operators), and I want to select equations according to their performance on test data (e.g., RMSE < a and R > b). At the moment I have to select the equations manually, which is time-consuming. Is there a way to select candidate equations more efficiently? Best regards,
Answered by
MilesCranmer
May 7, 2024
Thanks! Would the following help?

import copy

# my_metric is your own scoring function, called as my_metric(y_pred, y_true)
equations = copy.deepcopy(model.equations_)
# this is a pandas dataframe, so we can add new columns:
equations["my_metric"] = [
    my_metric(
        model.predict(Xtest, index=i),
        ytest
    )
    for i in range(len(equations))
]
choice = equations["my_metric"].idxmin()
# ^ or idxmax() if maximizing
model.predict(X, index=choice)
# ^ Predict with best (or can pass to .sympy/.latex/.jax/.pytorch)
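
For the threshold case in the question (RMSE < a and R > b), the same per-equation loop can return every equation that passes both criteria rather than a single best one. The following is only a minimal sketch: `rmse`, `pearson_r`, and `select_candidates` are hypothetical helper names, not part of PySR, and the demo uses stand-in predictions so that it runs without a fitted model.

```python
import math


def rmse(y_pred, y_true):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n)


def pearson_r(y_pred, y_true):
    """Pearson correlation; NaN when either sequence is constant."""
    n = len(y_true)
    mean_p = sum(y_pred) / n
    mean_t = sum(y_true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(y_pred, y_true))
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    if var_p == 0 or var_t == 0:
        return float("nan")
    return cov / math.sqrt(var_p * var_t)


def select_candidates(predict_fn, n_equations, y_test, max_rmse, min_r):
    """Return (index, rmse, r) for each equation with rmse < max_rmse and r > min_r.

    predict_fn(i) must return the test-set predictions of equation i.
    """
    selected = []
    for i in range(n_equations):
        y_pred = predict_fn(i)
        err = rmse(y_pred, y_test)
        r = pearson_r(y_pred, y_test)
        # A NaN correlation fails the comparison, so constant equations are dropped.
        if err < max_rmse and r > min_r:
            selected.append((i, err, r))
    return selected


# Stand-in data (hypothetical) in place of a fitted model's predictions.
y_test = [1.0, 2.0, 3.0, 4.0]
fake_predictions = [
    [1.1, 1.9, 3.2, 3.8],  # close fit
    [4.0, 3.0, 2.0, 1.0],  # anti-correlated
    [2.5, 2.5, 2.5, 2.5],  # constant prediction
]
candidates = select_candidates(
    lambda i: fake_predictions[i],
    len(fake_predictions),
    y_test,
    max_rmse=0.5,
    min_r=0.9,
)
print([i for i, _, _ in candidates])
```

With a fitted single-output model, `predict_fn` would presumably be `lambda i: model.predict(Xtest, index=i)` and `n_equations` would be `len(model.equations_)`, mirroring the loop in the answer above. Each returned index can then be passed as `index=` to `.predict`, `.sympy`, or `.latex`.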
Answer selected by
leelew