Replies: 1 comment
Hi @haoweini - sorry for the late reply. Think about it like this: even assuming there is a real relation between your time series data and your target (some underlying truth), your time series are still randomly distributed, and so are the extracted features. Depending on your "luck", this can make the features look "more relevant" or "more irrelevant". Since you probably do not know the process and distributions behind your data (otherwise you would not need to do any of this), you can never tell whether a feature was rejected just by chance or because it is really irrelevant. All you can do is increase the number of samples. There is, however, one "test" you can run to check whether everything is at least technically correct: use one of the feature columns as the target. That should always work, because your features will certainly be relevant for predicting one of the features.
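As a rough illustration of that sanity check (a sketch, not part of the original reply; it assumes extracted_features is the DataFrame returned by extract_features as in the question below, and that the column reused as a stand-in target is not constant):
from tsfresh import select_features
from tsfresh.utilities.dataframe_functions import impute
# extracted_features: the DataFrame returned by extract_features (see the question below)
impute(extracted_features)  # select_features cannot handle NaN/inf values
fake_target = extracted_features.iloc[:, 0]  # any non-constant feature column works; the choice here is arbitrary
sanity_check = select_features(extracted_features, fake_target)
print(sanity_check.shape)  # should keep at least one column if the setup is technically correct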
Dear tsfresh developers,
I have time-series data with 30 samples, and each sample has 2500~5000 data points. After I ran the extract_features function and applied the select_features function to its output, the result is an empty DataFrame with only an index.
from tsfresh import extract_features, select_features
from tsfresh.feature_extraction import EfficientFCParameters
extracted_features = extract_features(x, column_id='Trail', column_sort='Time (ms)', default_fc_parameters=EfficientFCParameters())
filtered_features = select_features(extracted_features, y, fdr_level=0.005)
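To see why the filtered DataFrame ends up empty, the per-feature p-values can be inspected directly; the following is a sketch that assumes tsfresh's calculate_relevance_table helper and reuses extracted_features and y from the snippet above:
from tsfresh.utilities.dataframe_functions import impute
from tsfresh.feature_selection.relevance import calculate_relevance_table
impute(extracted_features)  # replace NaN/inf before testing, as select_features would require
relevance_table = calculate_relevance_table(extracted_features, y, fdr_level=0.005)
print(relevance_table.sort_values("p_value")[["p_value", "relevant"]].head(20))  # smallest p-values first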
I read another post about the same issue, and the suggestion was that more samples might be needed. I am just wondering: what is the minimum number of samples required for select_features to work?
Thanks.