Develop #137

Open
wants to merge 422 commits into
base: master
6b11f7d
- prepared experiment weekend pc2
Jun 11, 2021
eb18a30
- sleep 10 s after alloc
Jun 11, 2021
36c450f
- logfile path for ccsjobs changed
Jun 14, 2021
22fac16
- deleted Failed "RUNNING" trials and created a new study and run on …
Jun 14, 2021
47bc9ff
- prepared compareStudy 3 (every 5 min + 256 startup samples)
Jun 15, 2021
643787f
- prepared study bestSetEval
Jun 16, 2021
9c8bd56
- prepared study 4 out of stuy_3 without RUNNING trials
Jun 16, 2021
2ba9165
- prepared study 5: added network size + no Liar in optuna
Jun 16, 2021
2ffcef7
- prepared study 5 with more a&c size
Jun 17, 2021
57856b9
- best trials study 5 -> try 100 times
Jun 17, 2021
cef8da3
- prepared study 4 out of stuy_3 without RUNNING trials
Jun 18, 2021
a3309aa
- prepared study 7: PI approch with 3 actor outputs
Jun 21, 2021
a3f3ab6
- prepared study 8 using integrator and *Ts
Jun 22, 2021
1cc0042
- prepared study 9, used integrator weight as HP
Jun 22, 2021
be049d0
- prepared study 9, minor fix
Jun 22, 2021
9be2f45
- prepared study 10 - PI+AntiWindUp
Jun 25, 2021
d3e08df
- prepared study 11, 6 actions(P&I seperated, get added in env_Wrapper)
Jun 28, 2021
95f1898
- prepared study 11, 6 actions(P&I seperated, get added in env_Wrapper)
Jun 28, 2021
1334581
- prepared study 11, 6 actions(P&I seperated, get added in env_Wrapper)
Jun 30, 2021
5490786
- prepared study 14, NO HPs best13 + LR from best8
Jul 1, 2021
1c59018
- prepared study 14, wrong config name
Jul 1, 2021
fef1ece
- prepared study 14, NO HPs best13 + LR from best8
Jul 7, 2021
5efb600
- prepared study 16, penalties on action P and I
Jul 9, 2021
e181224
- prepared study 16, penalties on action P and I
Jul 9, 2021
f82b6c8
- prepared study 17
Jul 12, 2021
5beb15d
- prepared study 17
Jul 12, 2021
f8caeff
- prepared study 18, penalties on action P and I with sceduler, added…
Jul 15, 2021
47fb388
- prepared study 18 minor fix
Jul 15, 2021
aee6156
- prepared study 16, penalties on action P and I
Jul 19, 2021
f238188
- forget to scale alpha of second critic
Jul 19, 2021
eaa94fb
- split Actor DDPG single run best 18 _6462
Jul 19, 2021
d973d13
- execution files
Jul 20, 2021
15ed823
- retrain study18_6462
Jul 20, 2021
18295c7
- File to evaluate std in replacing testcase by x-times trainingset
Jul 21, 2021
5bbc57b
- reset agent before starting new episode
Jul 21, 2021
148cadd
- show progress only in upper loops
Jul 21, 2021
d4539be
- ddpg testcase eval
Jul 22, 2021
f5646fd
- added titles
Jul 22, 2021
783cbab
- added ddpg model
Jul 22, 2021
0344bff
- prepared DDPG study 19
Jul 22, 2021
50f66e7
- prepaired study 20 ddpg new testcase
Jul 22, 2021
c7537e2
- minor fix - mysql using now
Jul 22, 2021
ac41fff
- minor fu
Jul 22, 2021
e7ffe1a
- minor fu2
Jul 22, 2021
af276c6
- testcase eval results
Jul 23, 2021
5b47e24
Merge remote-tracking branch 'origin/feature/51_rl-agents' into featu…
Jul 23, 2021
ef627a6
- prepaired holiday studies for pc2 on ddpg and td3
Jul 23, 2021
65fff65
- stopping pc2 after 12.000 completed trials in study
Jul 23, 2021
d9ed462
- used correced storage
Jul 23, 2021
1e7b473
- print compl. trials
Jul 23, 2021
a236477
- corrected port
Jul 23, 2021
e902144
- prepaired study
Jul 23, 2021
66032e7
- prepaired study
Jul 23, 2021
4603f01
- prepaired study
Jul 23, 2021
7d752ba
- prepaired study
Jul 23, 2021
9d5ee60
- corrected Trial count
Jul 23, 2021
67191fa
- generate PI safeopt file for RL comparison
Aug 17, 2021
c890675
- file to run PI and DDPG agents on tow (similar) envs at the same ti…
Aug 18, 2021
87891d3
- added logging infos abot the controllers
Aug 18, 2021
50d9da8
changed to 4D optimization
Aug 19, 2021
fb5a0de
- introducing MC-runner where every 1000 steps the env and controller…
Aug 20, 2021
5e9fd74
- plt file to compare PI and RL agents
Aug 20, 2021
215e1c7
- prepaired comparison file
Aug 20, 2021
a5ed505
- prepaired study
Aug 24, 2021
008d258
- added best DDPG model for comparison
Aug 24, 2021
6f41e8e
- save fix
Aug 24, 2021
363e2ff
- added plotting possibility to env wrapper
Aug 26, 2021
f1f5165
added config json
Aug 26, 2021
749ef55
- retrained model (only one loadstep per train episode)
Aug 27, 2021
22cda19
- added past (10) voltage-obs to features
Aug 27, 2021
62db3c9
- reduced to 100 trials in "HP"
Aug 27, 2021
17a73c1
- best past vals model
Aug 30, 2021
b178a24
- best(?) past vals model
Aug 30, 2021
e689bad
- give possibility to turn on/off past observations (v_dq0) as features
Aug 31, 2021
71cd445
- added testcase load data 10 s doing random loadsteps all the time i…
Aug 31, 2021
bd3965e
prepaired experiment without pastVals
Aug 31, 2021
85bb941
- Prepaired experiement without PastVals using MSE
Aug 31, 2021
3e7c912
- using pastVal feature in MSE calculation
Sep 1, 2021
9236fa3
- MSE experiment W/O pastVal Feature
Sep 1, 2021
26208e2
- prepaired experiment with load current as feature
Sep 2, 2021
489a9af
- added loadcurrent ti net
Sep 2, 2021
8355aba
- shifted Fastqueue delays to list, introduced pastVals-class to inhe…
Sep 2, 2021
200e6ed
- added pkl Load file for test comparison
Sep 2, 2021
f34e590
more samples for HPO
Sep 2, 2021
830c511
- give possibility to turn on/off past observations (v_dq0) as features
Sep 6, 2021
77b53c2
- run 20 noPastVal on new testcase for comparison
Sep 6, 2021
f1c2fa2
- give possibility to turn on/off past observations (v_dq0) as features
Sep 6, 2021
5a43fc7
- files for trained model evaluation and plotting
Sep 7, 2021
3b0c295
- Study 24 whole HPO including pastVals
Sep 7, 2021
6aaf72b
- scheduler study 24 to 5000 samples
Sep 7, 2021
c6fe445
- gen new testecases for RL Rload
Sep 8, 2021
988e0c0
- shifted files to local backup folder old
Sep 8, 2021
40fa8c6
- runner reset after 1000 steps
Webbah Sep 8, 2021
0fb1423
- 60 seconds hard testcase data without reset
Sep 8, 2021
d66f7ee
Merge remote-tracking branch 'origin/feature/51_rl-agents' into featu…
Sep 8, 2021
8c1338f
- added feature wrapper to test env
Sep 9, 2021
73e9f74
- back to HPO 24 + 400 Cores for PC2
Sep 9, 2021
7b94618
- minor fix
Sep 9, 2021
2a7e1f8
- study24 more samples
Sep 10, 2021
f1303d6
- HPO 24 continue with less cores
Sep 15, 2021
71d8bc1
- increased number of HPO samples to 12k
Sep 15, 2021
5343e6b
- trained future model
Webbah Sep 16, 2021
8202609
Merge remote-tracking branch 'origin/feature/51_rl-agents' into featu…
Webbah Sep 16, 2021
f631793
- introduced env wrapper combining I-Cotroller with DDPG as "p-Term"
Sep 16, 2021
40f011e
- minor fix, added int_sum to features
Sep 17, 2021
0943406
- bugfix in test_env_wrapper definition
Sep 17, 2021
fc2320c
- env_wrapper update for DDPG without I-term
Sep 17, 2021
0b73030
- prepared pastValHPO 2 without Phase feature
Sep 17, 2021
b2254ea
- prepaired study 25: DDPG without Integrator
Sep 20, 2021
f8f0082
- added loadcurve data
Sep 20, 2021
55f4924
- removed sin/cos feature in row DDPG
Sep 20, 2021
825fea0
- config change
Sep 20, 2021
7713f07
- normalized future feature values in wrapper
Sep 20, 2021
493e7e3
- trained future model with normalization
Webbah Sep 21, 2021
1ec597a
- deleted wrong future models
Sep 21, 2021
9c82f9b
- trained future model with normalization and correct future vals in …
Webbah Sep 21, 2021
63cac2e
- introduced string to define future data in all scripts besides vctr…
Sep 21, 2021
cb17ce9
- CAUTION - set risk to zero for unsafe but ongoing learning (also lo…
Sep 21, 2021
3d93e84
- trained future model with normalization and correct future vals in …
Webbah Sep 23, 2021
b8385fc
Merge remote-tracking branch 'origin/feature/51_rl-agents' into featu…
Webbah Sep 23, 2021
762cf47
- local study for DDPG-P-term + standard Ki-I-controller
Sep 27, 2021
b910a7d
- experiment with P10 setting
Sep 27, 2021
3bbf5c7
- corrected vdc and lim
Sep 28, 2021
9ac511b
- experiment with P10 setting and 2nd norm
Sep 28, 2021
8f24f8a
- clipping action
Sep 28, 2021
8bd2789
- turn off abort reward during testing
Sep 29, 2021
e93c9b0
- experiment with P10 setting abourt 2
Sep 29, 2021
887dc24
- new reward design, clipping is punished [-1,0] and mre [0,1]
Sep 29, 2021
de071f2
- debug nan, put log-lvl to train
Sep 30, 2021
c3bea13
- log testdata locally for NaN Debug
Oct 1, 2021
dc0ad7c
- clipped negative load values arising from std on lower bound
Oct 4, 2021
10e5e66
- GEM experiment no-I-term
Oct 6, 2021
3e0ef5f
- config path correction
Oct 6, 2021
9fe8e6b
- adjust reporter to grep GEM data
Oct 6, 2021
2b9fa39
- created own GEM reporter/recorder
Oct 6, 2021
e030366
- GEM experiment with no I term
Oct 7, 2021
37116bc
minor save fix
Oct 7, 2021
d358854
- indtroduced Dessca refenece case in test setting
Oct 7, 2021
e55ab95
- fixed testcase
Oct 7, 2021
7b66b92
- prepaired experiment for pastVals GEM
Oct 7, 2021
c7792ea
- MRE PIPI for P10 comparison
Oct 7, 2021
2e451cd
- more cores for second project
Oct 7, 2021
6cb8fd7
- add a few no I term samples
Oct 7, 2021
2b493e5
- add a few no I term samples
Oct 7, 2021
52ce6f7
- corrected error calculation and reset in test
Oct 7, 2021
c1563e3
- study for using integrator
Oct 7, 2021
6d96733
- delete render + minor fixes
Oct 13, 2021
a1a5e1b
- study with longer training + no-I-term
Oct 13, 2021
8e91179
- allow workers for 2nd study
Oct 13, 2021
ddbf181
- reporter config to log study 1
Oct 13, 2021
9eab90b
- reporter config to log study 2
Oct 13, 2021
40da172
- reporter adjusted to grep only file with specific numbers ending
Oct 14, 2021
3fe37ba
- load results from iterm study to moongodb
Oct 14, 2021
c1fd64a
- debug
Oct 14, 2021
3c7ac59
- debug
Oct 14, 2021
35bb3d4
- debug
Oct 14, 2021
82a58dd
- normed error feature correctly
Oct 18, 2021
97236af
- first i_load_feature model
Webbah Oct 21, 2021
6ea7f6b
- Reward back to MRE
Oct 21, 2021
b917100
- readjsuted loadsetting back to old tb
Oct 21, 2021
0f5dd97
- first i_load_feature model 2
Webbah Oct 21, 2021
4e5dcd7
- first i_load_feature_2 model 2
Webbah Oct 22, 2021
fe0a323
- best(?) past vals model
Oct 22, 2021
955dde5
Merge remote-tracking branch 'origin/feature/51_rl-agents' into featu…
Oct 22, 2021
36e4c2e
- adjusted PI reward to MRE only
Oct 22, 2021
f8cdae0
- best(?) past vals model
Nov 2, 2021
3011342
- first noIntegrator model
Nov 3, 2021
17a5755
- neglegt integrator during save if not used
Nov 3, 2021
8a8ee1b
- best model wo integrator so far
Nov 4, 2021
1498af6
- pc2 study for no I term
Nov 4, 2021
12523f9
- pc2 study for I term with no pastVals
Nov 4, 2021
2abd95f
- pc2 study for I term with no pastVals allow 50 nodes
Nov 4, 2021
f22ee75
- pc2 study for I term with no pastVals and i_load_feature
Nov 4, 2021
d8e8769
- pc2 study for I term with no pastVals and i_load_feature corrected …
Nov 4, 2021
6115db4
- pc2 study for I term with no pastVals and i_load_feature corrected …
Nov 4, 2021
8a979fd
- pc2 study for I term with no pastVals and i_load_feature corrected …
Nov 4, 2021
fae4f2f
- pc2 study for I term with no pastVals corrected due to pastVals
Nov 4, 2021
a400400
- pc2 study for I term with pastVals
Nov 4, 2021
10653af
- pc2 run results
Nov 5, 2021
5c57422
- pc2 run results
Nov 5, 2021
df6e758
- pc2 run results
Nov 5, 2021
abbca19
- pc2 run results
Nov 5, 2021
c005fdf
- pc2 run results
Nov 5, 2021
3d4786a
- pc2 run results
Nov 5, 2021
c8e0d36
- pc2 run results
Nov 5, 2021
a59df93
- pc2 run results non deterministic
Nov 5, 2021
bcff564
- pc2 run results non deterministic
Nov 5, 2021
c1a79a7
- pc2 logging to mongodb lea38
Nov 5, 2021
59f3933
- pc2 pastVal-study run next 250
Nov 8, 2021
7e50c43
- pc2 pastVal-study run next 500
Nov 8, 2021
ef335e4
- pc2 pastVal-study run next 500
Nov 8, 2021
7967a8c
- pc2 pastVal-study run next 50
Nov 9, 2021
f545042
- pc2 no pastVal-study run next 500
Nov 9, 2021
3abac77
- pc2 no pastVal-study run next 500 without i_load feature
Nov 9, 2021
f2d2f0d
- pc2 no pastVal-study run next 500
Nov 9, 2021
6c122cf
- pc2 next 250 runs for Actor without I-Term
Nov 10, 2021
e1c8886
- typo
Nov 10, 2021
7727d99
- OMG new models
Nov 11, 2021
ce1ee02
- rerun i_load_feature
Nov 11, 2021
16de2e8
- add v_dq_mess and reduce obs_space
Nov 11, 2021
2f3d522
- next 500 GEM-I-4
Nov 11, 2021
c940d64
- next 500 GEM-no_I-4
Nov 12, 2021
9db8e36
- next 250 GEM-I-4
Nov 15, 2021
7ca295c
- P10 HPO using I+pastVals
Nov 15, 2021
66470d4
- P10 reporter to send all files in directory
Nov 16, 2021
959b61f
- P10 run testcase for model and PI on pc2
Nov 16, 2021
eff10e1
- run PI stuff locally on laptop
Nov 16, 2021
e50be65
- plotting moinor
Nov 16, 2021
fa5e393
- plotting moinor
Nov 16, 2021
b1bf6a3
- plotting moinor
Nov 16, 2021
289bf03
- deterministic testcase
Nov 16, 2021
e0dde4b
- orderd to alphabet
Nov 16, 2021
b8ba8a6
- save plts for paper
Nov 17, 2021
92e7e01
- save plts for paper
Nov 19, 2021
29d42b7
- P10_pastVals Reward redesign
Nov 25, 2021
b1f453d
- log reward
Nov 26, 2021
d44c8c7
- test 2_213
Nov 26, 2021
e952f7b
- test 2_213
Nov 26, 2021
9a08969
- test 2_213
Nov 26, 2021
a071255
- test 2_204
Nov 26, 2021
6fb26de
- test 2_204
Nov 26, 2021
143fb2a
- test 2_204
Nov 26, 2021
6f39b02
- test 2_374
Nov 26, 2021
96ab03d
- test 2_1080
Nov 26, 2021
c76fbd1
- P10 HPO 3 with new reward design
Nov 26, 2021
34b28d9
- P10 HPO 3 with new reward design
Nov 26, 2021
95ec255
- changed label
Dec 9, 2021
e1e43b2
- minor plt stuff
Dec 17, 2021
68801d5
- generated dessca load data for paper
Jan 17, 2022
476dc69
- minor fix
Jan 17, 2022
d925efe
- minor
Jan 17, 2022
cd7a972
- added pastvals to ddpg
Jan 17, 2022
49a0c39
- mine data for paper ddpg 1 to 50
Jan 19, 2022
2882064
- added function to get named log files with reporter
Jan 19, 2022
8fc87f4
- updated ssh function new functionalty to grep specific files
Jan 19, 2022
585e12f
- minor plt stuff
Jan 19, 2022
fd1b0fe
- updated plotting for paper results
Jan 21, 2022
dc19d5c
- changed plt sizes
Jan 27, 2022
ce43831
- add T14 to VPN nodes
Feb 14, 2022
e555c51
- introduced logpath in config
Mar 10, 2022
cf61e62
- last changes for R_load experiment P10 on PC2
Mar 11, 2022
fc022bf
- last changes for R_load experiment P10 on PC2
Mar 11, 2022
f481c42
- last changes for R_load experiment P10 on PC2
Mar 11, 2022
d77416f
- last changes for R_load experiment P10 on PC2
Mar 11, 2022
2c15fb4
- last changes for R_load experiment P10 on PC2
Mar 11, 2022
c233444
- increased nodes
Mar 11, 2022
fad1732
- undo commit
Mar 11, 2022
df1ccd3
- Comparison for P10 trained agents
Apr 13, 2022
9905bd1
Merge remote-tracking branch 'origin/feature/51_rl-agents' into develop
May 18, 2022
d9c8ea2
- Added fmu for P10 experiment
Jun 13, 2022
Binary file added DDPG_study18_best_test_varianz.pkl
Binary file not shown.
Binary file added OMG_Integrator_Actor/3/model.zip
Binary file not shown.
Binary file added OMG_Integrator_Actor/32/model.zip
Binary file not shown.
Binary file added OMG_Integrator_Actor_i_load_feature/0/model.zip
Binary file not shown.
Binary file added OMG_Integrator_Actor_i_load_feature/1/model.zip
Binary file not shown.
Binary file added OMG_Integrator_Actor_i_load_feature_2/1/model.zip
Binary file not shown.
Binary file added Pipi.pkl
Binary file not shown.
226 changes: 226 additions & 0 deletions experiments/DQN/env/Custom_Cartpole.py
@@ -0,0 +1,226 @@
"""
Classic cart-pole system implemented by Rich Sutton et al.
Copied from http://incompleteideas.net/sutton/book/code/pole.c
permalink: https://perma.cc/C9ZM-652R
"""

import math
import gym
from gym import spaces, logger
from gym.utils import seeding
import numpy as np


class CartPoleEnv(gym.Env):
    """
    Description:
        A pole is attached by an un-actuated joint to a cart, which moves along
        a frictionless track. The pendulum starts upright, and the goal is to
        prevent it from falling over by increasing and reducing the cart's
        velocity.

    Source:
        This environment corresponds to the version of the cart-pole problem
        described by Barto, Sutton, and Anderson

    Observation:
        Type: Box(4)
        Num     Observation               Min                     Max
        0       Cart Position             -4.8                    4.8
        1       Cart Velocity             -Inf                    Inf
        2       Pole Angle                -0.418 rad (-24 deg)    0.418 rad (24 deg)
        3       Pole Angular Velocity     -Inf                    Inf

    Actions:
        Type: Discrete(2)
        Num   Action
        0     Push cart to the left
        1     Push cart to the right

        Note: The amount by which the velocity is reduced or increased is not
        fixed; it depends on the angle the pole is pointing. This is because
        the center of gravity of the pole increases the amount of energy needed
        to move the cart underneath it

    Reward:
        Reward is 1 - |theta| / pi for every step before termination (1 when
        the pole is upright, approaching 0 when it hangs straight down) and 0
        once the episode has terminated.

    Starting State:
        All observations are assigned a uniform random value in [-0.05..0.05]

    Episode Termination:
        Cart Position is more than 2.4 (center of the cart reaches the edge of
        the display).
        Episode length is greater than 200.
        The pole angle does not end the episode in this custom version (the
        12 degree check is commented out in step()).
    Solved Requirements:
        Considered solved when the average return is greater than or equal to
        195.0 over 100 consecutive trials.
    """

    metadata = {
        'render.modes': ['human', 'rgb_array'],
        'video.frames_per_second': 50
    }

    def __init__(self):
        self.gravity = 9.8
        self.masscart = 1.0
        self.masspole = 0.1
        self.total_mass = (self.masspole + self.masscart)
        self.length = 0.5  # actually half the pole's length
        self.polemass_length = (self.masspole * self.length)
        self.force_mag = 10.0
        self.tau = 0.02  # seconds between state updates
        self.kinematics_integrator = 'euler'

        # Angle at which to fail the episode
        self.theta_threshold_radians = 12 * 2 * math.pi / 360
        self.x_threshold = 2.4

        # Angle limit set to 2 * theta_threshold_radians so failing observation
        # is still within bounds.
        high = np.array([self.x_threshold * 2,
                         np.finfo(np.float32).max,
                         self.theta_threshold_radians * 2,
                         np.finfo(np.float32).max],
                        dtype=np.float32)

        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(-high, high, dtype=np.float32)

        self.seed()
        self.viewer = None
        self.state = None

        self.steps_beyond_done = None

    def seed(self, seed=None):
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

    def step(self, action):
        err_msg = "%r (%s) invalid" % (action, type(action))
        assert self.action_space.contains(action), err_msg

        x, x_dot, theta, theta_dot = self.state
        force = self.force_mag if action == 1 else -self.force_mag
        costheta = math.cos(theta)
        sintheta = math.sin(theta)

        # For the interested reader:
        # https://coneural.org/florian/papers/05_cart_pole.pdf
        temp = (force + self.polemass_length * theta_dot ** 2 * sintheta) / self.total_mass
        thetaacc = (self.gravity * sintheta - costheta * temp) / (
            self.length * (4.0 / 3.0 - self.masspole * costheta ** 2 / self.total_mass))
        xacc = temp - self.polemass_length * thetaacc * costheta / self.total_mass

        if self.kinematics_integrator == 'euler':
            x = x + self.tau * x_dot
            x_dot = x_dot + self.tau * xacc
            theta = theta + self.tau * theta_dot
            theta_dot = theta_dot + self.tau * thetaacc
        else:  # semi-implicit euler
            x_dot = x_dot + self.tau * xacc
            x = x + self.tau * x_dot
            theta_dot = theta_dot + self.tau * thetaacc
            theta = theta + self.tau * theta_dot

        # Wrap the pole angle back into [-pi, pi]
        if theta >= np.pi:
            theta -= 2 * np.pi
        elif theta <= -np.pi:
            theta += 2 * np.pi

        self.state = (x, x_dot, theta, theta_dot)

        # Only the cart position ends the episode; the angle check is disabled
        done = bool(
            x < -self.x_threshold
            or x > self.x_threshold
            # or theta < -self.theta_threshold_radians
            # or theta > self.theta_threshold_radians
        )

        if not done:
            # Shaped reward: 1 when upright, approaching 0 when hanging down
            reward = 1 - (abs(theta) / np.pi)
            # reward = 1.0
        elif self.steps_beyond_done is None:
            # Episode just ended (cart left the track)
            self.steps_beyond_done = 0
            reward = 0.0
        else:
            if self.steps_beyond_done == 0:
                logger.warn(
                    "You are calling 'step()' even though this "
                    "environment has already returned done = True. You "
                    "should always call 'reset()' once you receive 'done = "
                    "True' -- any further steps are undefined behavior."
                )
            self.steps_beyond_done += 1
            reward = 0.0

        return np.array(self.state), reward, done, {}

    def reset(self):
        self.state = self.np_random.uniform(low=-0.05, high=0.05, size=(4,))
        self.steps_beyond_done = None
        return np.array(self.state)

    def render(self, mode='human'):
        screen_width = 600
        screen_height = 400

        world_width = self.x_threshold * 2
        scale = screen_width / world_width
        carty = 100  # TOP OF CART
        polewidth = 10.0
        polelen = scale * (2 * self.length)
        cartwidth = 50.0
        cartheight = 30.0

        if self.viewer is None:
            from gym.envs.classic_control import rendering
            self.viewer = rendering.Viewer(screen_width, screen_height)
            l, r, t, b = -cartwidth / 2, cartwidth / 2, cartheight / 2, -cartheight / 2
            axleoffset = cartheight / 4.0
            cart = rendering.FilledPolygon([(l, b), (l, t), (r, t), (r, b)])
            self.carttrans = rendering.Transform()
            cart.add_attr(self.carttrans)
            self.viewer.add_geom(cart)
            l, r, t, b = -polewidth / 2, polewidth / 2, polelen - polewidth / 2, -polewidth / 2
            pole = rendering.FilledPolygon([(l, b), (l, t), (r, t), (r, b)])
            pole.set_color(.8, .6, .4)
            self.poletrans = rendering.Transform(translation=(0, axleoffset))
            pole.add_attr(self.poletrans)
            pole.add_attr(self.carttrans)
            self.viewer.add_geom(pole)
            self.axle = rendering.make_circle(polewidth / 2)
            self.axle.add_attr(self.poletrans)
            self.axle.add_attr(self.carttrans)
            self.axle.set_color(.5, .5, .8)
            self.viewer.add_geom(self.axle)
            self.track = rendering.Line((0, carty), (screen_width, carty))
            self.track.set_color(0, 0, 0)
            self.viewer.add_geom(self.track)

            self._pole_geom = pole

        if self.state is None:
            return None

        # Edit the pole polygon vertex
        pole = self._pole_geom
        l, r, t, b = -polewidth / 2, polewidth / 2, polelen - polewidth / 2, -polewidth / 2
        pole.v = [(l, b), (l, t), (r, t), (r, b)]

        x = self.state
        cartx = x[0] * scale + screen_width / 2.0  # MIDDLE OF CART
        self.carttrans.set_translation(cartx, carty)
        self.poletrans.set_rotation(-x[2])

        return self.viewer.render(return_rgb_array=mode == 'rgb_array')

    def close(self):
        if self.viewer:
            self.viewer.close()
            self.viewer = None
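
Note on experiments/DQN/env/Custom_Cartpole.py: compared with the stock cart-pole code this file was copied from, three things in `step()` appear to differ: the pole angle is wrapped back into [-pi, pi], the episode ends only when the cart leaves the track, and the reward is shaped as `1 - |theta| / pi`. The snippet below is a minimal standalone sketch of just that logic, so it can be checked without gym installed. The helper names `wrap_angle`, `is_done` and `shaped_reward` are hypothetical and do not exist in the PR; the threshold default 2.4 is taken from `x_threshold` in `__init__`.

```python
import math


def wrap_angle(theta: float) -> float:
    """Single-pass wrap into [-pi, pi], mirroring the wrap in step().

    Assumes theta overshoots the interval by less than 2*pi, which holds
    for one integration step with tau = 0.02 s at moderate velocities.
    """
    if theta >= math.pi:
        theta -= 2 * math.pi
    elif theta <= -math.pi:
        theta += 2 * math.pi
    return theta


def is_done(x: float, x_threshold: float = 2.4) -> bool:
    """Episode ends only when the cart leaves the track (no angle check)."""
    return x < -x_threshold or x > x_threshold


def shaped_reward(theta: float, done: bool) -> float:
    """1 when the pole is upright, falling linearly to 0 when it hangs down.

    Once the episode is done the reward is 0, as in the diff.
    """
    if done:
        return 0.0
    return 1.0 - abs(theta) / math.pi
```

Because the angle no longer terminates the episode, the agent is rewarded for how close to upright the pole stays rather than for mere survival, which is presumably why the wrap into [-pi, pi] was added: without it `abs(theta) / pi` could exceed 1 and the reward would turn negative.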