Presumably feature selection should only be carried out on the CV dataset - how do I ensure this in the context of a sklearn pipeline?
This raises another spectre - I want to perform (tsfresh's) feature selection on tsfresh features, while simultaneously fitting non-tsfresh-derived features. Is this even possible and if so how can we make it work?
Thanks again!
Hi @jtlz2,
Thanks for pointing out that we are missing an example of how to use FeatureSelector in an sklearn pipeline.
Let's assume that you have already extracted the time-series features using the extract_features() function. You can join the DataFrame of time-series features with another feature matrix, provided both share the same index.
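As a minimal sketch of that join, assuming both tables are pandas DataFrames indexed by the same sample ids: the frames `ts_features` and `other_features`, their column names, and their values below are hypothetical stand-ins made up for illustration, not real extract_features() output.

```python
import pandas as pd

# Hypothetical stand-in for the output of extract_features():
# one row per sample id, one column per time-series feature.
ts_features = pd.DataFrame(
    {"value__mean": [0.1, 0.4, 0.3], "value__maximum": [1.0, 2.0, 1.5]},
    index=pd.Index(["a", "b", "c"], name="id"),
)

# Hypothetical non-time-series features for the same ids (row order may differ).
other_features = pd.DataFrame(
    {"age": [52, 31, 47], "weight": [70.5, 82.0, 64.2]},
    index=pd.Index(["c", "a", "b"], name="id"),
)

# DataFrame.join aligns rows on the index; how="inner" keeps only ids present in both.
X = ts_features.join(other_features, how="inner")
print(X)
```

The join aligns rows by index label rather than by position, so the two tables do not need to be sorted the same way beforehand.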
The sklearn pipeline can be built as follows:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from tsfresh.transformers import FeatureSelector

# X: the joined feature matrix, y: the target (assumed to share X's index)
clf = make_pipeline(FeatureSelector(),
                    RandomForestClassifier())
cross_val_score(clf, X, y)
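Then you can fit your model with clf.fit(X, y) or use clf with the tools provided in sklearn.model_selection. Because the selector is a step of the pipeline, it is refit on the training portion of each cross-validation split, so the held-out fold is never used for feature selection.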
Discussed in #959
Originally posted by jtlz2 August 3, 2022
Awesome package, thanks!
I'm trying to use the feature-selector transformer within a sklearn pipeline but keep getting errors like
Now, this raises a few questions for me:
Thanks again!