Newton's implementation during RLOS Fest 2020 #10

newtonmwai · 2020-08-22T12:13:09Z

- Added implementations of the doubly robust (DR) estimator, including DR in episodic settings, to the estimator library
- Added a simulator interface that evaluates a target policy against a logging policy
- Added support for custom Vowpal Wabbit policies
- Generates a random logging policy and a random target policy to use for evaluation
- Transforms a supervised dataset into a contextual bandit (CB) dataset
- Transforms a supervised predictor into a stochastic policy using custom softening
- Visualizes the comparison
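
For readers unfamiliar with DR, here is a minimal sketch of the standard (non-episodic) doubly robust estimate on logged bandit data. It is not the code in this PR: the function name, the `(context, action, reward, logging_prob)` record layout, and the `target_probs` / `reward_model` callables are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical record layout: (context, action, reward, logging_prob),
# where logging_prob is the probability the logging policy gave to `action`.
LoggedSample = Tuple[object, int, float, float]


def doubly_robust_estimate(
    logged: Sequence[LoggedSample],
    target_probs: Callable[[object], Sequence[float]],
    reward_model: Callable[[object, int], float],
    num_actions: int,
) -> float:
    """Estimate the target policy's average reward from logged data."""
    total = 0.0
    for context, action, reward, logging_prob in logged:
        pi = target_probs(context)
        # Direct-method term: model-predicted reward under the target policy.
        direct = sum(pi[a] * reward_model(context, a) for a in range(num_actions))
        # Importance-weighted correction of the model's error on the logged action.
        correction = pi[action] / logging_prob * (reward - reward_model(context, action))
        total += direct + correction
    return total / len(logged)
```

With a reward model that always returns 0 this reduces to inverse propensity scoring, and with a perfect reward model the correction term vanishes, which is a handy sanity check when debugging an estimator.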

To finalize:
- Fix friendly and adversarial softening, which currently produce incorrect results
- Fix episodic DR
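
As a possible baseline for checking the softening variants, here is a sketch of plain uniform (epsilon) softening and of turning a supervised dataset into CB data with it. This is deliberately not the friendly or adversarial scheme, and it is not taken from this PR: `soften`, `supervised_to_cb`, and the 0/1 reward for matching the label are assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def soften(predicted_action: int, num_actions: int, epsilon: float) -> List[float]:
    """Spread epsilon uniformly over all actions; the rest goes to the prediction."""
    probs = [epsilon / num_actions] * num_actions
    probs[predicted_action] += 1.0 - epsilon
    return probs


def supervised_to_cb(
    examples: Sequence[Tuple[object, int]],
    predictor: Callable[[object], int],
    num_actions: int,
    epsilon: float,
    rng: random.Random,
) -> List[Tuple[object, int, float, float]]:
    """Turn (features, label) pairs into (features, action, reward, prob) records."""
    logged = []
    for features, label in examples:
        probs = soften(predictor(features), num_actions, epsilon)
        # Sample the logged action from the softened policy.
        action = rng.choices(range(num_actions), weights=probs)[0]
        # Assumed reward: 1 if the sampled action matches the true label, else 0.
        reward = 1.0 if action == label else 0.0
        logged.append((features, action, reward, probs[action]))
    return logged
```

Setting `epsilon` to 0 recovers the deterministic predictor, so the logged actions should then equal the predictions with probability 1.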

jackgerrits

Thanks for opening this, Newton! A couple of things:

- Can you please remove the few pyc files that are in this PR?
- The license on this repo was added after this PR was opened. Can you confirm you are okay with merging under that license (standard BSD 3-Clause)?
- Are there any outstanding items here that still need to be done?

Newton's implementation

88cf195

jackgerrits requested changes Aug 31, 2020

Remove pyc files

813cc87

JuiP mentioned this pull request Feb 27, 2021

Initial file structure changes for Package #16

Merged
