Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] C++ Export #228

Open
wants to merge 20 commits into
base: master
Choose a base branch
from
Open

[DRAFT] C++ Export #228

wants to merge 20 commits into from

Conversation

Gregwar
Copy link
Contributor

@Gregwar Gregwar commented Apr 1, 2022

Description

This is a draft, I suggest we keep the conversation in the associated issue:
DLR-RM/stable-baselines3#836

Motivation and Context

  • I have raised an issue to propose this change (required for new features and bug fixes)

DLR-RM/stable-baselines3#836

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist:

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)

@Gregwar
Copy link
Contributor Author

Gregwar commented Apr 6, 2022

For some application, I would also be interrested in having the values predictions available, so I am adding them as well, some questions:

  • Are the Q/V networks decent estimations of the true functions or not ? (answer may depend on algorithm involved)
  • ContinuousCritic has several q_networks
    • Is that only for the twin trick ?
    • Is it ok in production to only look at the value of the first network, or should we compute both and get the min ?

@Gregwar
Copy link
Contributor Author

Gregwar commented Apr 6, 2022

Also, is there a consistent pattern across all algorithms to access values functions just like predict for the action ?

@Gregwar
Copy link
Contributor Author

Gregwar commented Apr 6, 2022

Another question/note:

If we use the normalize hyperparameter while training, I guess we will need to use the normalization during the inference, right? That would imply getting some data from the env wrapper and exporting it as well

@araffin
Copy link
Member

araffin commented Apr 6, 2022

I guess we will need to use the normalization during the inference, right?

yes, we already save the mean and std for observation in a separate file (vecnormalize.pkl) but should be easy to export.

@araffin
Copy link
Member

araffin commented Apr 6, 2022

onsistent pattern across all algorithms to access values functions just like predict for the action ?

not really... Only between algorithms of the same family...

For some application, I would also be interrested in having the values predictions available,

I agree but I would not focus on that for now (value functions are not needed for inference, unless it is DQN).
I would do it in a follow up PR, once we have the basic feature ready and working reliably.

@araffin
Copy link
Member

araffin commented Apr 6, 2022

re the Q/V networks decent estimations of the true functions or not ?

that's hard to give only one answer, we surely hope they are, but that doesn't mean it is always the case.

Is that only for the twin trick ?

yes

Is it ok in production to only look at the value of the first network

for rough estimation, only one is needed.

@araffin
Copy link
Member

araffin commented Apr 8, 2022

@Gregwar I could successfully test it =)
I had to do some tweaks to use it with conda env:

export CMAKE_PREFIX_PATH="${HOME}/.local/lib/libtorch:/home/user/miniconda3"

where the conda prefix can be retrieved with ${CONDA_PREFIX:-"$(dirname $(which conda))/../"}

I also had to comment out the hardcoded set (Python_EXECUTABLE "/usr/bin/python3.8") (I don't think that's needed)

@Gregwar
Copy link
Contributor Author

Gregwar commented Apr 11, 2022

@Gregwar I could successfully test it =)
That is nice!

I had to do some tweaks to use it with conda env:
Yes you are right this should not be there

I think I will have to setup some (automated) tests to check for consistency between Python and C++ predictions. It will be hard to be sure to cover all the cases else, I am digging in the code of all the possible algorithms and I might miss some information (there are many possible options as well like using images as input that should get normalized, using SDE, the Wrapping "normalize" that is not handled at all currently etc.)

@araffin
Copy link
Member

araffin commented Apr 11, 2022

It will be hard to be sure to cover all the cases else,

Let's do a first working version that covers only some algorithms (let's say PPO, DQN and SAC) and only covers basic case (MLP, no images, no additional feature like normalization or SDE).

Once that's working and merged, we can work on adding additional features, I would start with normalization and then image support ;)

@Gregwar
Copy link
Contributor Author

Gregwar commented Jun 26, 2022

Hello,

Sorry for the lag, we are currently working on our humanoid robots for RoboCup, we integrate DRL algorithms in the robots for the first year.

We spent some time investigating and finally using OpenVino runtime because of our robots architecture (we use ONNX as intermediary representation OpenVino's model exporter).

In first place I thought of implementing pre/post processing in C++ but it is actually a better idea to use it in the PyTorch module that is being traced or exported. We can't provide a lot of runtime-specific implementation, so we could focus on libtorch as first intended and provide ONNX possibility for people that want to use something else.

@araffin araffin added the Maintainers on vacation Maintainers are on vacation so they can recharge their batteries, we will be back soon ;) label Jun 29, 2022
@araffin araffin removed the Maintainers on vacation Maintainers are on vacation so they can recharge their batteries, we will be back soon ;) label Aug 13, 2022
@araffin araffin self-assigned this Aug 13, 2022
@araffin
Copy link
Member

araffin commented Sep 26, 2022

@Gregwar could you give me access to your repo so I can push changes? (mainly merging master with this branch)

@zoythum
Copy link

zoythum commented Jan 18, 2024

@araffin Is there any update on this PR? I would be interested in exporting SB3 models into C++ executables but I am not sure on how to approach this problem

@araffin
Copy link
Member

araffin commented Jan 18, 2024

I would be interested in exporting SB3 models into C++ executables but I am not sure on how to approach this problem

For inference, you can have a look at https://stable-baselines3.readthedocs.io/en/master/guide/export.html
and DLR-RM/stable-baselines3#1349 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants