Implement volatility features (#187)
* Add steps volatility features

* Update volatility testing

* Updated change log

Co-authored-by: Jim Zhu <[email protected]>
ChinW97 and JimZhu96 authored Jun 29, 2022
1 parent 4b60e35 commit 4f8f66e
Showing 17 changed files with 95 additions and 61 deletions.
2 changes: 1 addition & 1 deletion config.yaml
@@ -435,7 +435,7 @@ FITBIT_STEPS_SUMMARY:
  PROVIDERS:
    RAPIDS:
      COMPUTE: False
      FEATURES: ["maxsumsteps", "minsumsteps", "avgsumsteps", "mediansumsteps", "stdsumsteps"]
      FEATURES: ["maxsumsteps", "minsumsteps", "avgsumsteps", "mediansumsteps", "stdsumsteps", "maxvolatilitysteps", "minvolatilitysteps", "avgvolatilitysteps", "medianvolatilitysteps", "stdvolatilitysteps", "annualizedvolatilitysteps"]
      SRC_SCRIPT: src/features/fitbit_steps_summary/rapids/main.py

# See https://www.rapids.science/latest/features/fitbit-steps-intraday/
2 changes: 2 additions & 0 deletions docs/change-log.md
@@ -4,6 +4,8 @@
- Optimize memory usage in readable_datetime.R script
- Fix the bug of missing local_segment column in FITBIT_SLEEP_SUMMARY RAPIDS provider
- Add TYPING_SESSION_DURATION parameter for typing sessions detection to PHONE_KEYBOARD RAPIDS provider
- Add steps volatility features
- Add tests for steps volatility features
## v1.8.0
- Add data stream for AWARE Micro server
- Fix the NA bug in PHONE_LOCATIONS BARNETT provider
10 changes: 6 additions & 4 deletions docs/developers/test-cases.md
@@ -34,7 +34,6 @@ The following is a list of the sensors that testing is currently available.
| Fitbit Steps Summary | RAPIDS | Y | Y | Y |
| Fitbit Steps Intraday | RAPIDS | Y | Y | Y |


## Accelerometer

Description
@@ -532,7 +531,7 @@ Checklist

Description

- The 4-day raw heartrate summary data is contained in `fitbit_steps_intraday_raw.csv`
- The 4-day raw step summary data is contained in `fitbit_steps_intraday_raw.csv`
- One episode for each daily segment (`night`, `morning`, `afternoon` and `evening`) on each day
- Two episodes within the same 30-min segment (`Fri 05:58:00` and `Fri 05:59:00`)
- A one-min episode at `2020-03-07 09:00:00` that will be converted to New York time `2020-03-07 12:00:00`
@@ -556,8 +555,11 @@ Checklist

Description

- The 4-day raw heartrate summary data is contained in `fitbit_steps_summary_raw.csv`.
- As heartrate summary is periodic, it only generates results in periodic feature, there will be no result in frequency and event.
- The 4-day calculated step summary data is contained in `fitbit_steps_summary_raw.csv`.
- The 4-day calculated step volatility summary data is contained in `fitbit_steps_summary_raw.csv`.
- The step summary includes the max, min, median, mean and standard deviation values.
- The volatility summary includes the max, min, median, mean, standard deviation and annualized volatility values.
- As the step summary is periodic, it only generates results for periodic segments; there are no results for frequency and event segments.
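
To make the listed statistics concrete, here is a minimal sketch of a per-segment aggregation with pandas. The segment labels (`seg_a`, `seg_b`) and the sample values are made up for illustration. Only the feature names, and the use of the sample variance (`ddof = 1`) for the annualized feature, are taken from the `statsFeatures` code in this commit.

```python
import numpy as np
import pandas as pd

# Hypothetical input: one volatility value per day, tagged with its segment.
rows = pd.DataFrame({
    "local_segment": ["seg_a", "seg_a", "seg_a", "seg_b"],
    "volatilitysteps": [np.nan, -0.3, -0.1, 0.05],
})

grouped = rows.groupby("local_segment")["volatilitysteps"]

# One row per segment; NaN inputs are skipped by every aggregation.
features = pd.DataFrame({
    "maxvolatilitysteps": grouped.max(),
    "minvolatilitysteps": grouped.min(),
    "avgvolatilitysteps": grouped.mean(),
    "medianvolatilitysteps": grouped.median(),
    "stdvolatilitysteps": grouped.std(),
    # Mirrors the commit: the "annualized" feature is the sample variance.
    "annualizedvolatilitysteps": grouped.var(ddof=1),
})

print(features)
```

A segment with a single non-missing value, such as `seg_b` here, gets `NaN` for the standard deviation and the variance. This is consistent with the `NA` entries in the expected test output for one-day segments.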


Checklist
36 changes: 33 additions & 3 deletions src/features/fitbit_steps_summary/rapids/main.py
@@ -6,8 +6,10 @@ def statsFeatures(steps_data, features_to_compute, features_type, steps_features
        col_name = "steps"
    elif features_type == "durationsedentarybout" or features_type == "durationactivebout":
        col_name = "duration"
    elif features_type == "volatilitysteps":
        col_name = "volatilitysteps"
    else:
        raise ValueError("features_type can only be one of ['steps', 'sumsteps', 'durationsedentarybout', 'durationactivebout'].")
        raise ValueError("features_type can only be one of ['steps', 'sumsteps', 'durationsedentarybout', 'durationactivebout', 'volatilitysteps'].")

    if "count" + features_type.replace("duration", "episode") in features_to_compute:
        steps_features["count" + features_type.replace("duration", "episode")] = steps_data.groupby(["local_segment"])[col_name].count()
@@ -23,14 +25,35 @@ def statsFeatures(steps_data, features_to_compute, features_type, steps_features
        steps_features["median" + features_type] = steps_data.groupby(["local_segment"])[col_name].median()
    if "std" + features_type in features_to_compute:
        steps_features["std" + features_type] = steps_data.groupby(["local_segment"])[col_name].std()

    if "annualized" + features_type in features_to_compute:
        steps_features["annualized" + features_type] = steps_data.groupby(["local_segment"])[col_name].var(ddof = 1)
    return steps_features

def extractStepsVolatility(steps_data):
    #Set index to be datetime
    steps_data.index = pd.DatetimeIndex(steps_data["local_date"])

    #Add in the missing dates between date range
    date_range = pd.date_range(start = steps_data.index.min(), end = steps_data.index.max(), freq="D")
    steps_data = steps_data.reindex(date_range, fill_value = np.nan)

    #Create the denominator for volatility function
    steps_data.loc[steps_data["steps"] == 0, "steps"] = np.nan
    steps_data["last_steps"] = steps_data["steps"].shift(periods = 1, fill_value = np.nan)

    steps_data['volatilitysteps'] = np.log(steps_data["steps"] / steps_data["last_steps"])

    steps_data.dropna(subset = ["local_segment"], inplace = True)
    steps_data.reset_index(drop = True, inplace = True)

    return steps_data

def extractStepsFeaturesFromSummaryData(steps_summary_data, summary_features_to_compute):
    steps_summary_features = pd.DataFrame()

    # statistics features of daily steps count
    steps_summary_features = statsFeatures(steps_summary_data, summary_features_to_compute, "sumsteps", steps_summary_features)
    steps_summary_features = statsFeatures(steps_summary_data, summary_features_to_compute, "volatilitysteps", steps_summary_features)

    steps_summary_features.reset_index(inplace=True)

@@ -44,16 +67,20 @@ def rapids_features(sensor_data_files, time_segment, provider, filter_data_by_se
    requested_summary_features = provider["FEATURES"]

    # name of the features this function can compute
    base_summary_features = ["maxsumsteps", "minsumsteps", "avgsumsteps", "mediansumsteps", "stdsumsteps"]
    base_summary_features = ["maxsumsteps", "minsumsteps", "avgsumsteps", "mediansumsteps", "stdsumsteps",
                            "maxvolatilitysteps", "minvolatilitysteps", "avgvolatilitysteps", "medianvolatilitysteps", "stdvolatilitysteps", "annualizedvolatilitysteps"]

    # the subset of requested features this function can compute
    summary_features_to_compute = list(set(requested_summary_features) & set(base_summary_features))

    # extract features from summary data
    steps_summary_features = pd.DataFrame(columns=["local_segment"] + summary_features_to_compute)

    if not steps_summary_data.empty:
        steps_summary_data = filter_data_by_segment(steps_summary_data, time_segment)

        if not steps_summary_data.empty:

            # only keep the segments start at 00:00:00 and end at 23:59:59
            datetime_start_regex = "[0-9]{4}[\\-|\\/][0-9]{2}[\\-|\\/][0-9]{2} 00:00:00"
            datetime_end_regex = "[0-9]{4}[\\-|\\/][0-9]{2}[\\-|\\/][0-9]{2} 23:59:59"
@@ -62,6 +89,9 @@ def rapids_features(sensor_data_files, time_segment, provider, filter_data_by_se
            steps_summary_data = steps_summary_data[steps_summary_data["local_segment"].str.match(segment_regex)]

            if not steps_summary_data.empty:

                steps_summary_data = extractStepsVolatility(steps_summary_data)
                steps_summary_features = extractStepsFeaturesFromSummaryData(steps_summary_data, summary_features_to_compute)

    return steps_summary_features
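
As an illustration of what `extractStepsVolatility` computes, the sketch below derives the day-over-day log ratio of steps for the four daily totals that appear in the test fixture (1379, 1021, 960, 1006). It is a simplified, stand-alone version and not the provider code: the variable names (`daily`, `all_days`, `log_change`, `window`) are hypothetical, and the three-day window is chosen to match the first `threeday` segment in the expected output.

```python
import numpy as np
import pandas as pd

# Daily step totals taken from the test fixture (2020-03-06 to 2020-03-09).
daily = pd.DataFrame({
    "local_date": ["2020-03-06", "2020-03-07", "2020-03-08", "2020-03-09"],
    "steps": [1379.0, 1021.0, 960.0, 1006.0],
})

# Index by date and fill any gaps so that shift(1) always means "previous day".
daily.index = pd.DatetimeIndex(daily["local_date"])
all_days = pd.date_range(start=daily.index.min(), end=daily.index.max(), freq="D")
daily = daily.reindex(all_days)

# Treat zero-step days as missing to avoid log(0) and division by zero.
steps = daily["steps"].where(daily["steps"] != 0)

# Volatility of day t is log(steps_t / steps_{t-1}); the first day has no predecessor.
log_change = np.log(steps / steps.shift(1))

# Aggregate over a three-day window, as a threeday segment would.
window = log_change.loc["2020-03-06":"2020-03-08"]
window_std = window.std()        # sample standard deviation (ddof=1 by default)
window_var = window.var(ddof=1)  # what the commit stores as the "annualized" feature

print(log_change)
print(window_std, window_var)
```

Under these assumptions the window's standard deviation and variance agree with the `stdvolatilitysteps` and `annualizedvolatilitysteps` values in the fixture row for `threeday#2020-03-06 00:00:00,2020-03-08 23:59:59`. Note that the variance is not scaled by any number of periods.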

@@ -1 +1 @@
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgsumsteps"
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_annualizedvolatilitysteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdvolatilitysteps","fitbit_steps_summary_rapids_maxvolatilitysteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_avgsumsteps","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_minvolatilitysteps","fitbit_steps_summary_rapids_medianvolatilitysteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgvolatilitysteps"
@@ -1 +1 @@
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgsumsteps"
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_annualizedvolatilitysteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdvolatilitysteps","fitbit_steps_summary_rapids_maxvolatilitysteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_avgsumsteps","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_minvolatilitysteps","fitbit_steps_summary_rapids_medianvolatilitysteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgvolatilitysteps"
@@ -1,21 +1,21 @@
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgsumsteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_mediansumsteps"
"daily#2020-03-06 00:00:00,2020-03-06 23:59:59","daily","2020-03-06 00:00:00","2020-03-06 23:59:59",1379,1379,NA,1379,1379
"daily#2020-03-07 00:00:00,2020-03-07 23:59:59","daily","2020-03-07 00:00:00","2020-03-07 23:59:59",1021,1021,NA,1021,1021
"daily#2020-03-08 00:00:00,2020-03-08 23:59:59","daily","2020-03-08 00:00:00","2020-03-08 23:59:59",960,960,NA,960,960
"daily#2020-03-09 00:00:00,2020-03-09 23:59:59","daily","2020-03-09 00:00:00","2020-03-09 23:59:59",1006,1006,NA,1006,1006
"daily#2020-10-30 00:00:00,2020-10-30 23:59:59","daily","2020-10-30 00:00:00","2020-10-30 23:59:59",1379,1379,NA,1379,1379
"daily#2020-10-31 00:00:00,2020-10-31 23:59:59","daily","2020-10-31 00:00:00","2020-10-31 23:59:59",1021,1021,NA,1021,1021
"daily#2020-11-01 00:00:00,2020-11-01 23:59:59","daily","2020-11-01 00:00:00","2020-11-01 23:59:59",960,960,NA,960,960
"daily#2020-11-02 00:00:00,2020-11-02 23:59:59","daily","2020-11-02 00:00:00","2020-11-02 23:59:59",1006,1006,NA,1006,1006
"threeday#2020-03-06 00:00:00,2020-03-08 23:59:59","threeday","2020-03-06 00:00:00","2020-03-08 23:59:59",1379,1120,226.364749905987,960,1021
"threeday#2020-03-07 00:00:00,2020-03-09 23:59:59","threeday","2020-03-07 00:00:00","2020-03-09 23:59:59",1021,995.666666666667,31.785741037977,960,1006
"threeday#2020-03-08 00:00:00,2020-03-10 23:59:59","threeday","2020-03-08 00:00:00","2020-03-10 23:59:59",1006,983,32.5269119345812,960,983
"threeday#2020-03-09 00:00:00,2020-03-11 23:59:59","threeday","2020-03-09 00:00:00","2020-03-11 23:59:59",1006,1006,NA,1006,1006
"threeday#2020-10-28 00:00:00,2020-10-30 23:59:59","threeday","2020-10-28 00:00:00","2020-10-30 23:59:59",1379,1379,NA,1379,1379
"threeday#2020-10-29 00:00:00,2020-10-31 23:59:59","threeday","2020-10-29 00:00:00","2020-10-31 23:59:59",1379,1200,253.144227664784,1021,1200
"threeday#2020-10-30 00:00:00,2020-11-01 23:59:59","threeday","2020-10-30 00:00:00","2020-11-01 23:59:59",1379,1120,226.364749905987,960,1021
"threeday#2020-10-31 00:00:00,2020-11-02 23:59:59","threeday","2020-10-31 00:00:00","2020-11-02 23:59:59",1021,995.666666666667,31.785741037977,960,1006
"threeday#2020-11-01 00:00:00,2020-11-03 23:59:59","threeday","2020-11-01 00:00:00","2020-11-03 23:59:59",1006,983,32.5269119345812,960,983
"threeday#2020-11-02 00:00:00,2020-11-04 23:59:59","threeday","2020-11-02 00:00:00","2020-11-04 23:59:59",1006,1006,NA,1006,1006
"weekend#2020-03-06 00:00:00,2020-03-08 23:59:59","weekend","2020-03-06 00:00:00","2020-03-08 23:59:59",1379,1120,226.364749905987,960,1021
"weekend#2020-10-30 00:00:00,2020-11-01 23:59:59","weekend","2020-10-30 00:00:00","2020-11-01 23:59:59",1379,1120,226.364749905987,960,1021
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_annualizedvolatilitysteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdvolatilitysteps","fitbit_steps_summary_rapids_maxvolatilitysteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_avgsumsteps","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_minvolatilitysteps","fitbit_steps_summary_rapids_medianvolatilitysteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgvolatilitysteps"
"daily#2020-03-06 00:00:00,2020-03-06 23:59:59","daily","2020-03-06 00:00:00","2020-03-06 23:59:59",NA,1379,NA,NA,NA,1379,1379,NA,NA,1379,NA
"daily#2020-03-07 00:00:00,2020-03-07 23:59:59","daily","2020-03-07 00:00:00","2020-03-07 23:59:59",NA,1021,NA,-0.300576059628636,NA,1021,1021,-0.300576059628636,-0.300576059628636,1021,-0.300576059628636
"daily#2020-03-08 00:00:00,2020-03-08 23:59:59","daily","2020-03-08 00:00:00","2020-03-08 23:59:59",NA,960,NA,-0.0616045337027836,NA,960,960,-0.0616045337027836,-0.0616045337027836,960,-0.0616045337027836
"daily#2020-03-09 00:00:00,2020-03-09 23:59:59","daily","2020-03-09 00:00:00","2020-03-09 23:59:59",NA,1006,NA,0.0468040661978025,NA,1006,1006,0.0468040661978025,0.0468040661978025,1006,0.0468040661978025
"daily#2020-10-30 00:00:00,2020-10-30 23:59:59","daily","2020-10-30 00:00:00","2020-10-30 23:59:59",NA,1379,NA,NA,NA,1379,1379,NA,NA,1379,NA
"daily#2020-10-31 00:00:00,2020-10-31 23:59:59","daily","2020-10-31 00:00:00","2020-10-31 23:59:59",NA,1021,NA,-0.300576059628636,NA,1021,1021,-0.300576059628636,-0.300576059628636,1021,-0.300576059628636
"daily#2020-11-01 00:00:00,2020-11-01 23:59:59","daily","2020-11-01 00:00:00","2020-11-01 23:59:59",NA,960,NA,-0.0616045337027836,NA,960,960,-0.0616045337027836,-0.0616045337027836,960,-0.0616045337027836
"daily#2020-11-02 00:00:00,2020-11-02 23:59:59","daily","2020-11-02 00:00:00","2020-11-02 23:59:59",NA,1006,NA,0.0468040661978025,NA,1006,1006,0.0468040661978025,0.0468040661978025,1006,0.0468040661978025
"threeday#2020-03-06 00:00:00,2020-03-08 23:59:59","threeday","2020-03-06 00:00:00","2020-03-08 23:59:59",0.0285536951016652,1021,0.168978386492667,-0.0616045337027836,226.364749905987,1120,960,-0.300576059628636,-0.18109029666571,1379,-0.18109029666571
"threeday#2020-03-07 00:00:00,2020-03-09 23:59:59","threeday","2020-03-07 00:00:00","2020-03-09 23:59:59",0.00587621226620268,1006,0.0766564561286438,0.0468040661978025,31.785741037977,995.666666666667,960,-0.0616045337027836,-0.00740023375249054,1021,-0.00740023375249054
"threeday#2020-03-08 00:00:00,2020-03-10 23:59:59","threeday","2020-03-08 00:00:00","2020-03-10 23:59:59",NA,983,NA,0.0468040661978025,32.5269119345812,983,960,0.0468040661978025,0.0468040661978025,1006,0.0468040661978025
"threeday#2020-03-09 00:00:00,2020-03-11 23:59:59","threeday","2020-03-09 00:00:00","2020-03-11 23:59:59",NA,1006,NA,0.0468040661978025,NA,1006,1006,0.0468040661978025,0.0468040661978025,1006,0.0468040661978025
"threeday#2020-10-28 00:00:00,2020-10-30 23:59:59","threeday","2020-10-28 00:00:00","2020-10-30 23:59:59",NA,1379,NA,NA,NA,1379,1379,NA,NA,1379,NA
"threeday#2020-10-29 00:00:00,2020-10-31 23:59:59","threeday","2020-10-29 00:00:00","2020-10-31 23:59:59",NA,1200,NA,-0.300576059628636,253.144227664784,1200,1021,-0.300576059628636,-0.300576059628636,1379,-0.300576059628636
"threeday#2020-10-30 00:00:00,2020-11-01 23:59:59","threeday","2020-10-30 00:00:00","2020-11-01 23:59:59",0.0285536951016652,1021,0.168978386492667,-0.0616045337027836,226.364749905987,1120,960,-0.300576059628636,-0.18109029666571,1379,-0.18109029666571
"threeday#2020-10-31 00:00:00,2020-11-02 23:59:59","threeday","2020-10-31 00:00:00","2020-11-02 23:59:59",0.0315887944258214,1006,0.177732367411851,0.0468040661978025,31.785741037977,995.666666666667,960,-0.300576059628636,-0.0616045337027836,1021,-0.105125509044539
"threeday#2020-11-01 00:00:00,2020-11-03 23:59:59","threeday","2020-11-01 00:00:00","2020-11-03 23:59:59",0.00587621226620268,983,0.0766564561286438,0.0468040661978025,32.5269119345812,983,960,-0.0616045337027836,-0.00740023375249054,1006,-0.00740023375249054
"threeday#2020-11-02 00:00:00,2020-11-04 23:59:59","threeday","2020-11-02 00:00:00","2020-11-04 23:59:59",NA,1006,NA,0.0468040661978025,NA,1006,1006,0.0468040661978025,0.0468040661978025,1006,0.0468040661978025
"weekend#2020-03-06 00:00:00,2020-03-08 23:59:59","weekend","2020-03-06 00:00:00","2020-03-08 23:59:59",0.0285536951016652,1021,0.168978386492667,-0.0616045337027836,226.364749905987,1120,960,-0.300576059628636,-0.18109029666571,1379,-0.18109029666571
"weekend#2020-10-30 00:00:00,2020-11-01 23:59:59","weekend","2020-10-30 00:00:00","2020-11-01 23:59:59",0.0285536951016652,1021,0.168978386492667,-0.0616045337027836,226.364749905987,1120,960,-0.300576059628636,-0.18109029666571,1379,-0.18109029666571
@@ -1 +1 @@
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgsumsteps"
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_annualizedvolatilitysteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdvolatilitysteps","fitbit_steps_summary_rapids_maxvolatilitysteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_avgsumsteps","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_minvolatilitysteps","fitbit_steps_summary_rapids_medianvolatilitysteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgvolatilitysteps"
@@ -1 +1 @@
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgsumsteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdsumsteps"
"local_segment","local_segment_label","local_segment_start_datetime","local_segment_end_datetime","fitbit_steps_summary_rapids_annualizedvolatilitysteps","fitbit_steps_summary_rapids_mediansumsteps","fitbit_steps_summary_rapids_stdvolatilitysteps","fitbit_steps_summary_rapids_maxvolatilitysteps","fitbit_steps_summary_rapids_stdsumsteps","fitbit_steps_summary_rapids_avgsumsteps","fitbit_steps_summary_rapids_minsumsteps","fitbit_steps_summary_rapids_minvolatilitysteps","fitbit_steps_summary_rapids_medianvolatilitysteps","fitbit_steps_summary_rapids_maxsumsteps","fitbit_steps_summary_rapids_avgvolatilitysteps"