Derk’s Gym 1.1.1¶
This is the documentation for gym-derk, a Python package that exposes the game “Dr. Derk’s Mutant Battlegrounds” as an OpenAI Gym environment.
Main website: https://gym.derkgame.com. Please get a license on the website if you’re using this in a commercial or academic context.
Installing: pip install gym-derk (see Installation & Running, os specific instructions for details)
Examples¶
Basic example¶
In this example, the Derklings just take random actions:
from gym_derk.envs import DerkEnv

env = DerkEnv()

for t in range(3):
  observation_n = env.reset()
  while True:
    action_n = [env.action_space.sample() for i in range(env.n_agents)]
    observation_n, reward_n, done_n, info = env.step(action_n)
    if all(done_n):
      print("Episode finished")
      break

env.close()
Neural network example¶
This is an example of how to use a simple genetic algorithm to train a single-layer neural network.
from gym_derk.envs import DerkEnv
from gym_derk import ObservationKeys
import numpy as np
import gym
import math
import os.path

env = DerkEnv()

class Network:
  def __init__(self, weights=None, biases=None):
    self.network_outputs = 13
    if weights is None:
      weights_shape = (self.network_outputs, len(ObservationKeys))
      self.weights = np.random.normal(size=weights_shape)
    else:
      self.weights = weights
    if biases is None:
      self.biases = np.random.normal(size=(self.network_outputs))
    else:
      self.biases = biases

  def clone(self):
    return Network(np.copy(self.weights), np.copy(self.biases))

  def forward(self, observations):
    outputs = np.add(np.matmul(self.weights, observations), self.biases)
    casts = outputs[3:6]
    cast_i = np.argmax(casts)
    focuses = outputs[6:13]
    focus_i = np.argmax(focuses)
    return (
      math.tanh(outputs[0]),  # MoveX
      math.tanh(outputs[1]),  # Rotate
      max(min(outputs[2], 1), 0),  # ChaseFocus
      (cast_i + 1) if casts[cast_i] > 0 else 0,  # CastSlot
      (focus_i + 1) if focuses[focus_i] > 0 else 0,  # Focus
    )

  def copy_and_mutate(self, network, mr=0.1):
    self.weights = np.add(network.weights, np.random.normal(size=self.weights.shape) * mr)
    self.biases = np.add(network.biases, np.random.normal(size=self.biases.shape) * mr)

weights = np.load('weights.npy') if os.path.isfile('weights.npy') else None
biases = np.load('biases.npy') if os.path.isfile('biases.npy') else None
networks = [Network(weights, biases) for i in range(env.n_agents)]

for e in range(10):
  observation_n = env.reset()
  while True:
    action_n = [networks[i].forward(observation_n[i]) for i in range(env.n_agents)]
    observation_n, reward_n, done_n, info = env.step(action_n)
    if all(done_n):
      print("Episode finished")
      break
  if env.mode == 'train':
    reward_n = env.total_reward
  print(reward_n)
  top_network_i = np.argmax(reward_n)
  top_network = networks[top_network_i].clone()
  for network in networks:
    network.copy_and_mutate(top_network)
  print('top reward', reward_n[top_network_i])
  np.save('weights.npy', top_network.weights)
  np.save('biases.npy', top_network.biases)

env.close()
Environment¶
Environment details¶
This is a MOBA-inspired RL environment, where two teams battle each other while trying to defend their own “statue”. Each team is composed of three units, and each unit gets a random loadout (see Items for available items). The goal is to attack the opponent’s statue and units while defending your own. With the default reward, you get one point for killing an enemy creature, and four points for killing an enemy statue.
Arenas and parallelism¶
The environment is designed to run multiple game instances in parallel on the GPU. Each game instance is called an arena. Functions such as step and reset accept and return values for all arenas at once. Thanks to this functionality, it’s possible to collect a large amount of experience very quickly.
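For instance, here is a rough sketch of what the shapes look like with multiple arenas (assuming the default setup where this environment controls both teams, with 3 Derklings per team):

from gym_derk.envs import DerkEnv

# 16 parallel arenas; n_agents = n_teams * n_agents_per_team = (16 * 2) * 3 = 96
env = DerkEnv(n_arenas=16)

observation_n = env.reset()
print(observation_n.shape)  # expected: (96, len(ObservationKeys)), one row per Derkling across all arenas
env.close()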
Team and episode stats¶
There are a number of statistics you can access about teams. Use gym_derk.envs.DerkEnv.team_stats or gym_derk.DerkSession.team_stats to get the data. See gym_derk.TeamStatsKeys for available keys.
For example, to read the Reward of the third team:
env.team_stats[2, TeamStatsKeys.Reward.value]
You can also get stats for all arenas in an episode with gym_derk.envs.DerkEnv.episode_stats and gym_derk.DerkSession.episode_stats.
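As a small sketch (assuming env is a DerkEnv and an episode has already finished, so the stats are populated), the team_stats columns can be labeled with the enum names:

from gym_derk import TeamStatsKeys

for team_i in range(env.n_teams):
  for key in TeamStatsKeys:
    print(team_i, key.name, env.team_stats[team_i, key.value])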
Running against other agents / Benchmarking¶
First, we need an agent to run; you can try for instance https://github.com/MountRouke/Randos. Clone the repo and start it with python main.py --server. This will start a websocket server for that agent.
Next, set mode="connected" in your own agent’s environment. The environment will connect with a websocket to the server running locally (by default), and the away team will now be controlled by the server. You can now train against these agents, or, if you wish to benchmark against them, look at episode_stats at the end of an episode to see how your agents performed against the opponents.
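A minimal sketch of what the connected side can look like (assuming a Randos-style agent server is already running locally on its default port; random actions stand in for your policy):

from gym_derk.envs import DerkEnv

env = DerkEnv(mode="connected", turbo_mode=True)

for episode in range(10):
  observation_n = env.reset()
  while True:
    # Your home-team policy goes here; random actions as a placeholder
    action_n = [env.action_space.sample() for i in range(env.n_agents)]
    observation_n, reward_n, done_n, info = env.step(action_n)
    if all(done_n):
      break
  # Summary of how the home team did against the connected opponent
  print(env.episode_stats)

env.close()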
Installation & Running, os specific instructions¶
The Derk environment is implemented as a WebGL2 web app and runs in a Chromium instance through pyppeteer. This means that you can get the environment working anywhere you can get Chromium with WebGL2 working.
On desktop systems (Windows, OSX, desktop Linux), using the Derk environment is fairly straightforward; just run pip install gym-derk. If you get any errors, make sure that WebGL2 works on your system; you can verify that it does by visiting WebGL2 report. Unfortunately it’s not possible to run the environment in a headless mode, since Chromium doesn’t support GPU acceleration in headless mode yet (see this issue).
On a server, or if you’re using Docker, there are two main ways to run the environment. The easiest is to use xvfb, which usually means the environment will run on the CPU. See https://github.com/MountRouke/DerkAppInstanceDockerized for a Debian based Docker image, and https://github.com/MountRouke/DerkAppInstanceDockerized/tree/ubuntu for an Ubuntu based image, both using xvfb. To utilize GPU acceleration on a server or in Docker, you’ll need to use VirtualGL. VirtualGL can be a bit tricky to set up, but there are Docker images with it that could serve as a base.
The environment can also be run on Google Colab. See the Derk Colab GPU example (virtualgl based) or Derk Colab CPU example (xvfb based).
Finally, it’s also possible to set up an agent as a server, without running the environment. This makes it possible to set up a trained agent as a service which you can connect to. See https://github.com/MountRouke/Randos for an example of how to do this. This is for instance useful for running a competition, where participants can submit Dockerized images with their agents, but where the actual environment is run outside of their images.
Competition (AICrowd)¶
We’re partnering with AICrowd to run a competition for Derk, where you can submit your agents to see how well they are performing compared to other participants’ agents. The API is free to use for the competition.
Competition page (with starter kit and submission guidelines): https://aicrowd.com/derk
Configuring your Derklings¶
You can configure a number of attributes on your Derklings, such as their appearance and their loadout. The configurations are applied modulo the number you provide, so you can specify 1, 3 or n_arenas * 3 configurations (or any other number) depending on how you want them repeated. Here’s a basic example:
env = DerkEnv(
  home_team=[
    { 'primaryColor': '#ff00ff' },
    { 'primaryColor': '#00ff00', 'slots': ['Talons', None, None] },
    { 'primaryColor': '#ff0000', 'rewardFunction': { 'healTeammate1': 1 } }
  ]
)
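Because the configurations repeat, a single entry is enough to give every home Derkling in every arena the same look and loadout. A small sketch (the color and slot choices are just illustrative picks from the Items list):

# One config entry, applied modulo to all home Derklings in all arenas
env = DerkEnv(
  n_arenas=4,
  home_team=[
    { 'primaryColor': '#00ffff', 'slots': ['Pistol', 'HealingGland', 'FrogLegs'] },
  ],
)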
The properties you can configure for a Derkling are:
- Cosmetics:
  - primaryColor: A hex color, e.g. #ff00ff
  - secondaryColor: Also a hex color
  - ears: Integer between 1-4
  - eyes: Integer between 1-5
  - backSpikes: Integer between 1-7
- slots: An array with exactly 3 items. Each item is a weapon/attachment slot. The first one is the arms attachment, the second the tail attachment and the third the misc attachment. See Items for available items.
- rewardFunction: A specific reward function for this Derkling. See Reward function
Citation¶
Please use this BibTeX to cite this environment in your publications:
@misc{gym_derk,
  author = {John Fredrik Wilhelm Norén},
  title = {Derk Gym Environment},
  year = {2020},
  publisher = {Mount Rouke},
  journal = {Mount Rouke},
  howpublished = {\url{https://gym.derkgame.com}},
}
API reference¶
High-level API¶
The high-level API provides a simple, OpenAI Gym compatible DerkEnv class which is suitable for a Python notebook environment.
class gym_derk.envs.DerkEnv(mode=False, n_arenas=None, reward_function=None, turbo_mode=False, home_team=None, away_team=None, session_args={}, app_args={}, agent_server_args={})
Reinforcement Learning environment for “Dr. Derk’s Mutant Battlegrounds”.
There are two modes for the environment:
- mode="normal": You control both the home and away teams.
- mode="connected": Connects this environment to another agent. See Running against other agents / Benchmarking.
Parameters:
- mode (str) – "normal" (default) or "connected". See above for details. (Environment variable: DERK_MODE)
- n_arenas (Optional[int]) – Number of parallel arenas to run
- reward_function (Optional[Dict]) – Reward function. See Reward function for available options
- turbo_mode (bool) – Skip rendering to the screen to run as fast as possible
- home_team (Optional[List[Dict]]) – Home team creatures. See Configuring your Derklings.
- away_team (Optional[List[Dict]]) – Away team creatures. See Configuring your Derklings.
- session_args (Dict) – See arguments to gym_derk.DerkAppInstance.create_session()
- app_args (Dict) – See arguments to gym_derk.DerkAppInstance
- agent_server_args (Dict) – See arguments to gym_derk.DerkAgentServer
This is a convenience wrapper of the more low-level API of gym_derk.DerkAppInstance, gym_derk.DerkAgentServer and gym_derk.DerkSession.
- property action_space – Gym space for actions
- async async_step(action_n=None) – Async version of step(). Return type: Tuple[ndarray, ndarray, List[bool], List[Dict]]
- close() – Shut down environment
- property episode_stats – Stats for the last episode. Return type: Dict
- property n_agents – Number of agents controlled by this environment, i.e. env.n_teams * env.n_agents_per_team. Return type: int
- property n_agents_per_team – Number of agents in a team (3). Return type: int
- property n_teams – Number of teams controlled by this environment. Return type: int
- property observation_space – Gym space for observations
- reset() – Resets the state of the environment and returns an initial observation.
  Return type: ndarray
  Returns: The initial observation for each agent, with shape (n_agents, len(gym_derk.ObservationKeys)).
  Raises: ConnectionLostError – If there was a connection error in connected mode
- step(action_n=None) – Run one timestep.
  Accepts a list of actions, one for each agent, and returns the current state. Actions can have one of the following formats/shapes:
  - Numpy array of shape (n_teams, n_agents_per_team, len(gym_derk.ActionKeys))
  - Numpy array of shape (n_agents, len(gym_derk.ActionKeys))
  - List of actions (i.e. [[1, 0, 0, 2, 0], [0, 1, 0, 0, 3], ...]), one inner list per agent. This is just cast to a numpy array of shape (n_agents, len(gym_derk.ActionKeys)).
  The returned observations are laid out in the same way as the actions, and can therefore be reshaped as above. For instance: observations.reshape((env.n_teams, env.n_agents_per_team, -1))
  Parameters: action_n (Optional[ndarray]) – Numpy array or list of actions. See gym_derk.ActionKeys
  Return type: Tuple[ndarray, ndarray, List[bool], List[Dict]]
  Returns: A tuple of (observation_n, reward_n, done_n, info). observation_n has shape (n_agents, len(gym_derk.ObservationKeys))
  Raises: ConnectionLostError – If there was a connection error in connected mode
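A short usage sketch of the team-major action layout and the matching reshape of the observations (the all-zero action is just a placeholder; 5 is len(gym_derk.ActionKeys)):

import numpy as np

# One action row per Derkling, grouped by team
action_n = np.zeros((env.n_teams, env.n_agents_per_team, 5))
observation_n, reward_n, done_n, info = env.step(action_n)

# Observations come back with one row per agent; regroup them by team the same way
by_team = observation_n.reshape((env.n_teams, env.n_agents_per_team, -1))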
- property team_stats – Stats for each team for the last episode. Numpy array of shape (env.n_teams, len(gym_derk.TeamStatsKeys)). Return type: ndarray
- property total_reward – Accumulated rewards over an episode. Numpy array of shape (n_agents). Return type: ndarray
Low-level API¶
The low-level API is more versatile and makes it possible to do things like setting up an agent as a service or running many different agents together, even if they are running on completely different machines. Here’s an example of how it works:
from gym_derk import DerkAgentServer, DerkSession, DerkAppInstance
import asyncio

async def run_fixed(env: DerkSession, actions):
  await env.reset()
  while not env.done:
    await env.step([actions for i in range(env.n_agents)])

async def main():
  # Agent servers are just websocket servers which can be connected to by a DerkAppInstance.
  # That means these three could be running in different processes or even on different machines.
  agent_walk = DerkAgentServer(run_fixed, args={ 'actions': [0.1, 0, 0, 0, 0] }, port=8788)
  agent_turn = DerkAgentServer(run_fixed, args={ 'actions': [0, 0.1, 0, 0, 0] }, port=8789)
  agent_chase = DerkAgentServer(run_fixed, args={ 'actions': [0, 0, 1, 1, 5] }, port=8790)
  await agent_walk.start()
  await agent_turn.start()
  await agent_chase.start()

  # This creates an actual instance of the game to run simulations in
  app = DerkAppInstance()
  await app.start()

  # We can specify any number of agent hosts here, and which sides and arenas they control
  await app.run_session(
    n_arenas=2,
    agent_hosts=[
      { 'uri': agent_walk.uri, 'regions': [{ 'sides': 'home' }] },
      { 'uri': agent_turn.uri, 'regions': [{ 'sides': 'away', 'start_arena': 0, 'n_arenas': 1 }] },
      { 'uri': agent_chase.uri, 'regions': [{ 'sides': 'away', 'start_arena': 1, 'n_arenas': 1 }] },
    ]
  )
  await app.print_team_stats()

asyncio.get_event_loop().run_until_complete(main())
class gym_derk.DerkAgentServer(handle_session, port=None, host=None, args={})
Agent server. This creates a websocket agent server, listening on host:port.
Parameters:
- handle_session – A coroutine accepting the session and, optionally, additional arguments
- port (Optional[int]) – Port to listen on. Defaults to 8789
- host (Optional[str]) – Host to listen on. Defaults to 127.0.0.1
- args (Dict) – Dictionary of args passed to handle_session
- close() – Shutdown
- async start() – Start the server
class gym_derk.DerkSession(websocket, init_msg)
A single training/evaluation session, consisting of multiple episodes.
- n_teams – Number of teams controlled by this environment
- n_agents_per_team – Number of agents in a team (3)
- action_space – Gym space for actions
- observation_space – Gym space for observations
- total_reward – Accumulated rewards over an episode. Numpy array of shape (n_agents)
- team_stats – Stats for each team for the last episode. Numpy array of shape (n_teams, len(gym_derk.TeamStatsKeys)). See Team and episode stats
- episode_stats – Stats for the last episode. See Team and episode stats
- async close() – Close session
- property n_agents – Number of agents controlled by this environment, i.e. env.n_teams * env.n_agents_per_team
- async reset() – See gym_derk.envs.DerkEnv.reset(). Return type: ndarray
- async step(action_n=None) – See gym_derk.envs.DerkEnv.step(). Return type: Tuple[ndarray, ndarray, List[bool], List[Dict]]
class gym_derk.DerkAppInstance(app_host=None, chrome_executable=None, chrome_args=[], chrome_devtools=False, window_size=[1000, 750], browser=None, browser_logs=False, internal_http_server=False)
Application instance of “Dr. Derk’s Mutant Battlegrounds”.
Parameters:
- app_host (Optional[str]) – Configure an alternative app bundle host. (Environment variable: DERK_APP_HOST)
- chrome_executable (Optional[str]) – Path to chrome or chromium. (Environment variable: DERK_CHROME_EXECUTABLE)
- chrome_args (List[str]) – List of command line switches passed to chrome
- chrome_devtools (bool) – Launch devtools when chrome starts
- window_size (Tuple[int, int]) – Tuple with the size of the window
- browser (Optional[Browser]) – A pyppeteer browser instance
- browser_logs (bool) – Show log output from browser
- web_socket_worker – Run websockets in a web worker
- async async_get_webgl_renderer() – Async version of get_webgl_renderer()
- async close() – Shut down app instance
- async connect_to_agent_hosts() – Connect to agent hosts specified when the session was created.
  Returns: True if all hosts are connected, False otherwise.
  This method can be called in a loop to wait for all hosts to come online.
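For example, a minimal polling sketch for waiting until all agent hosts are up (the one-second interval is an arbitrary choice):

import asyncio

async def wait_for_hosts(app):
  # Keep retrying until every configured agent host has connected
  while not await app.connect_to_agent_hosts():
    await asyncio.sleep(1)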
- async create_session(n_arenas=1, reward_function=None, turbo_mode=False, home_team=None, away_team=None, substeps=8, interleaved=True, agent_hosts=None, debug_no_observations=False, web_socket_worker=None, ai_crowd_logo=False, read_game_state=False) – Create a session. All arguments are optional.
  Parameters:
  - n_arenas (int) – Number of parallel arenas to run
  - reward_function (Optional[Dict]) – Reward function. See Reward function for available options
  - turbo_mode (bool) – Skip rendering to the screen to run as fast as possible
  - home_team (Optional[List[Dict]]) – Home team creatures. See Configuring your Derklings.
  - away_team (Optional[List[Dict]]) – Away team creatures. See Configuring your Derklings.
  - substeps (int) – Number of game steps to run for each call to step
  - interleaved (bool) – Run each step in the background, returning the previous step’s observations
  - agent_hosts (Union[List[Dict], str, None]) – List of DerkAgentServers to connect to, or "single_local", or "dual_local". See below for details.
  - read_game_state (bool) – Read the entire internal game state each step, and provide it as JSON in the info object returned from the step function.
  With interleaved mode on, there’s a delay between observation and action of size substeps. E.g. if substeps=8 there’s an 8 * 16 ms = 128 ms “reaction time” from observation to action. This means that the game and the Python code can in effect run in parallel.
  The agent_hosts argument takes a list of dicts with the following format: { uri: str, regions: [{ side: str, start_arena: int, n_arenas: int }] }, where uri specifies a running DerkAgentServer to connect to, and regions defines which arenas and sides that agent will control. side can be 'home', 'away' or 'both'. start_arena and n_arenas can be omitted to run the agent on all arenas. You can also pass the string value "single_local", in which case agent_hosts defaults to [{ 'uri': 'ws://127.0.0.1:8788', 'regions': [{ 'sides': 'both' }] }], or "dual_local", in which case it defaults to [{ 'uri': 'ws://127.0.0.1:8788', 'regions': [{ 'sides': 'home' }] }, { 'uri': 'ws://127.0.0.1:8789', 'regions': [{ 'sides': 'away' }] }].
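A small sketch of creating a session with the "dual_local" shorthand described above (assumed to run inside an async function, with agent servers already listening on ports 8788 and 8789):

app = DerkAppInstance()
await app.start()

# Equivalent to passing the two-host list shown above
await app.create_session(n_arenas=4, substeps=8, agent_hosts="dual_local")
await app.connect_to_agent_hosts()
await app.run_episodes_loop()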
- async disconnect_all_remotes() – Disconnect all remotes
- async episode_reset() – Reset for an episode
- async episode_step() – Step for an episode
- async get_episode_stats() – Gets a summary of stats for the last episode, based on team_stats
- async get_team_stats() – Read all team stats from the last episode.
  Return type: ndarray
  Returns: Team stats for all teams; a numpy array of shape (2, n_arenas, len(gym_derk.TeamStatsKeys)). The first dimension is the side (0=home, 1=away).
- get_webgl_renderer() – Return which webgl renderer is being used by the game. Return type: str
- async print_team_stats(team_stats=None) – Reads and prints the team stats from the last episode
- async reload() – Reload the game
- async run_episode() – Run a single episode. Shorthand for:
  try:
    await app.episode_reset()
    while not (await app.episode_step()):
      pass
  except Exception as e:
    app.disconnect_all_remotes()
- async run_episodes_loop() – Runs episodes in a loop until agents disconnect
- async run_session(**kwargs) – Creates a session, connects hosts and runs the episodes loop. See create_session() for args. This is just a shorthand for:
  await self.create_session(**kwargs)
  await self.connect_to_agent_hosts()
  await self.run_episodes_loop()
- property running – Returns true if the app is still running
- async start() – Start the application
- async update_away_team_config(config) – Update the away team’s configuration. The session needs to be created first. Parameters: config – See Configuring your Derklings
- async update_home_team_config(config) – Update the home team’s configuration. The session needs to be created first. Parameters: config – See Configuring your Derklings
- async update_reward_function(reward_function) – Update the reward function. The session needs to be created first. Parameters: reward_function – See Reward function
class gym_derk.ObservationKeys(value)
An enumeration.
- Hitpoints = 0
- Ability0Ready = 1
- FriendStatueDistance = 2
- FriendStatueAngle = 3
- Friend1Distance = 4
- Friend1Angle = 5
- Friend2Distance = 6
- Friend2Angle = 7
- EnemyStatueDistance = 8
- EnemyStatueAngle = 9
- Enemy1Distance = 10
- Enemy1Angle = 11
- Enemy2Distance = 12
- Enemy2Angle = 13
- Enemy3Distance = 14
- Enemy3Angle = 15
- HasFocus = 16
- FocusRelativeRotation = 17
- FocusFacingUs = 18
- FocusFocusingBack = 19
- FocusHitpoints = 20
- Ability1Ready = 21
- Ability2Ready = 22
- FocusDazed = 23
- FocusCrippled = 24
- HeightFront1 = 25
- HeightFront5 = 26
- HeightBack2 = 27
- PositionLeftRight = 28
- PositionUpDown = 29
- Stuck = 30
- UnusedSense31 = 31
- HasTalons = 32
- HasBloodClaws = 33
- HasCleavers = 34
- HasCripplers = 35
- HasHealingGland = 36
- HasVampireGland = 37
- HasFrogLegs = 38
- HasPistol = 39
- HasMagnum = 40
- HasBlaster = 41
- HasParalyzingDart = 42
- HasIronBubblegum = 43
- HasHeliumBubblegum = 44
- HasShell = 45
- HasTrombone = 46
- FocusHasTalons = 47
- FocusHasBloodClaws = 48
- FocusHasCleavers = 49
- FocusHasCripplers = 50
- FocusHasHealingGland = 51
- FocusHasVampireGland = 52
- FocusHasFrogLegs = 53
- FocusHasPistol = 54
- FocusHasMagnum = 55
- FocusHasBlaster = 56
- FocusHasParalyzingDart = 57
- FocusHasIronBubblegum = 58
- FocusHasHeliumBubblegum = 59
- FocusHasShell = 60
- FocusHasTrombone = 61
- UnusedExtraSense30 = 62
- UnusedExtraSense31 = 63
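A short sketch of reading individual senses out of an observation row by enum value (observation_n is assumed to come from a previous reset or step; agent index 0 is just an example):

from gym_derk import ObservationKeys

obs = observation_n[0]  # one Derkling's observations, length len(ObservationKeys)
hitpoints = obs[ObservationKeys.Hitpoints.value]
enemy_statue_distance = obs[ObservationKeys.EnemyStatueDistance.value]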
class gym_derk.ActionKeys(value)
These are the actions a Derkling can take, which you send to the step function.
- MoveX = 0 – A number between -1 and 1. This controls forward/backward movement of the Derkling.
- Rotate = 1 – A number between -1 and 1. This controls the rotation of the Derkling. Rotate = -1 means turn left at full speed.
- ChaseFocus = 2 – A number between 0 and 1. If this is 1, the MoveX and Rotate actions are ignored and the Derkling instead runs towards its current focus. Numbers between 0 and 1 interpolate between this behavior and the MoveX/Rotate actions, and 0 means only MoveX and Rotate are used.
- CastingSlot = 3 – 0 = don’t cast. 1-3 = cast the corresponding ability.
- ChangeFocus = 4 – 0 = keep current focus. 1 = focus home statue. 2-3 = focus teammates. 4 = focus enemy statue. 5-7 = focus enemies.
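For example, a sketch of building a single agent's action by key rather than by position (the chosen values are arbitrary illustrations):

from gym_derk import ActionKeys
import numpy as np

action = np.zeros(len(ActionKeys))
action[ActionKeys.MoveX.value] = 1.0      # full speed forward
action[ActionKeys.CastingSlot.value] = 2  # cast the ability in slot 2
action[ActionKeys.ChangeFocus.value] = 4  # focus the enemy statue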
class gym_derk.TeamStatsKeys(value)
An enumeration.
- Reward = 0
- OpponentReward = 1
- Hitpoints = 2
- AliveTime = 3
- CumulativeHitpoints = 4
gym_derk.run_derk_agent_server_in_background(handle_session, **kwargs)
Launch a DerkAgentServer in a background thread. Accepts the same arguments as gym_derk.DerkAgentServer.
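A minimal sketch of how this can be used (the handler and port are illustrative; the handler coroutine has the same shape as the ones passed to DerkAgentServer in the example above):

from gym_derk import run_derk_agent_server_in_background, DerkSession

async def handle_session(env: DerkSession):
  await env.reset()
  while not env.done:
    # Stand still: one all-zero action per controlled Derkling
    await env.step([[0, 0, 0, 0, 0] for i in range(env.n_agents)])

run_derk_agent_server_in_background(handle_session, port=8788)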
Reward function¶
The reward function is based on the OpenAI Five reward function (https://gist.github.com/dfarhi/66ec9d760ae0c49a5c492c9fae93984a). These are the possible fields:
Field | Default value | Notes
---|---|---
damageEnemyStatue | 0 | Per hitpoint
damageEnemyUnit | 0 | Per hitpoint
killEnemyStatue | 4 |
killEnemyUnit | 1 |
healFriendlyStatue | 0 | Per hitpoint
healTeammate1 | 0 | Per hitpoint
healTeammate2 | 0 | Per hitpoint
timeSpentHomeBase | 0 | Every 5 seconds
timeSpentHomeTerritory | 0 | Every 5 seconds
timeSpentAwayTerritory | 0 | Every 5 seconds
timeSpentAwayBase | 0 | Every 5 seconds
damageTaken | 0 | Per hitpoint
friendlyFire | 0 | Per hitpoint
healEnemy | 0 | Per hitpoint
fallDamageTaken | 0 | Per hitpoint
statueDamageTaken | 0 | Per hitpoint (the team’s own statue)
teamSpirit | 0 | If this is 1, all rewards are averaged between teammates
timeScaling | 1 | A linear falloff of the reward with time; 0 means no reward at all at the last step
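For example, a sketch of a custom reward function passed to the environment (the particular weights are arbitrary; any subset of the fields above can be given):

env = DerkEnv(
  reward_function={
    'damageEnemyUnit': 1,
    'damageEnemyStatue': 2,
    'killEnemyStatue': 8,
    'damageTaken': -1,
    'teamSpirit': 0.5,
  }
)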
Items¶
By default, a Derkling gets a random loadout assigned. Each slot has a 70% chance of being filled, which means there’s a 34% chance of three items, a 44% chance of two items, a 19% chance of one item and a 3% chance of no items.
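Those percentages follow from three independent 70% draws; a quick check of the arithmetic:

from math import comb

p = 0.7
for k in range(3, -1, -1):
  # Binomial probability of exactly k of the 3 slots being filled
  print(k, round(comb(3, k) * p**k * (1 - p)**(3 - k), 3))
# prints: 3 0.343, 2 0.441, 1 0.189, 0 0.027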
Name | Slot | Description
---|---|---
Talons | arms | Melee item dealing good, steady damage to a target.
BloodClaws | arms | Damage-dealing melee item that also heals the equipper with each hit.
Cleavers | arms | Heavy and powerful, but slow-hitting melee item.
Cripplers | arms | Melee item that also cripples the opponent, making them move slower.
Pistol | arms | Ranged weapon. Pew pew!
Magnum | arms | Heavy ranged weapon that knocks the target back.
Blaster | arms | Heavy ranged weapon that deals massive damage.
FrogLegs | misc | Long strong legs, enabling the Derkling to quickly jump forward.
IronBubblegum | misc | Blows an iron-enforced bubble around a target, protecting them from damage.
HeliumBubblegum | misc | Blows a bubble filled with helium around a target, making them float up into the air.
Shell | misc | Increases the armor of a Derkling. Armor is further increased when they duck.
Trombone | misc | When the horn is blown, all enemies are forced to focus on the musician.
HealingGland | tail | Siphons hitpoints to the target.
VampireGland | tail | Drains a target of hitpoints and restores the caster’s hitpoints.
ParalyzingDart | tail | Launches a projectile at a target, dazing them for a short moment.
Changelog¶
1.1.1: Fix bug that made gym_derk.DerkEnv.reset() return None
1.1.0: Tweaked how reset/step were executed in DerkEnv (lock-step instead of free running)
1.0.1: Fix Derkling configuration reading (should be modulo the number of configs). See Configuring your Derklings for details.
1.0.0:
The API has now been live for a while, so it’s time to call it 1.0. No changes from the previous version except the one below:
Introduced read_game_state to gym_derk.DerkAppInstance.create_session()
0.16.4: Tweak random loadout item chance (50% -> 70%). See Items for details.
0.16.3: Remove .GymLoaded timeout (fixed)
0.16.2: Remove .GymLoaded timeout
0.16.1: Fix bug with victory/loss reward function calculations
0.16.0: Remove all xxx_keys enum properties, and add gym_derk.ObservationKeys, gym_derk.ActionKeys and gym_derk.TeamStatsKeys
0.15.6: Fix exception thrown at env.close (https://github.com/MountRouke/DerkGymIssues/issues/6)
0.15.5: Tweak AI crowd logo display
0.15.4: Added gym_derk.DerkAppInstance.get_episode_stats()
0.15.3: Remove headless switch (wasn’t working), and improve installation instructions.
0.15.2:
0.15.1: Make it possible to display the AICrowd logo in-game
0.15.0
Remove “Points”; we only have Reward now
Update default reward
Update team_stats_keys; remove “Gold”, “Points” and “OpponentPoints” and add “Reward” and “OpponentReward”
Winner is now based on the team with the highest reward
0.14.3: Prevent Derklings from moving too far off camera
0.14.2: Tweak camera to make more of the map visible
0.14.1: Configurable window size
0.14.0: Improve Derkling configuration documentation, and change the Derkling bounties field name to rewardFunction.
0.13.2: Fix two memory leaks
0.13.1: Fix bug that prevented DerkEnv from starting since 0.13.0
0.13.0:
gym_derk.DerkAppInstance.create_session() no longer automatically connects to hosts. When this method is used you also need to call gym_derk.DerkAppInstance.connect_to_agent_hosts().
0.12.4:
Added:
gym_derk.envs.DerkEnv.team_stats_keys
gym_derk.DerkSession.team_stats_keys
gym_derk.DerkAppInstance.team_stats_keys
0.12.3:
Fix argument bug to run_derk_agent_server_in_background
0.12.2:
Add run_derk_agent_server_in_background convenience method
0.12.1:
Re-added a couple of convenience arguments to DerkEnv: n_arenas, reward_function, turbo_mode, home_team and away_team. These simply get added to the session_args argument.
0.12.0:
Rename DerkAppInstance.async_init_browser to DerkAppInstance.start, which the user now needs to call to start the app
If the app closes for any reason, the DerkEnv is now able to restart it on reset
0.11.1:
Add args to DerkAgentServer, which are passed to the session runner
0.11.0:
This version breaks the API into two parts; a high-level DerkEnv, suitable for working in for instance a notebook environment, and a low-level API with DerkAgentServer, DerkSession and DerkAppInstance.
The arguments to DerkEnv have changed and are now just three dict args that get passed down to the low-level API. To set for instance n_arenas and app_host, it would look like this now:
DerkEnv(session_args={ 'n_arenas': 10 }, app_args={ 'app_host': 'http://localhost:3000' })
0.10.0:
Added env.action_keys and env.observation_keys
Removed env.n_actions; use len(env.action_keys) instead.
Removed env.n_senses; use len(env.observation_keys) instead.
0.9.0:
Started keeping a changelog
The connected_host argument was replaced with a connected_envs argument, and documentation was added on how to specify it