Derk’s Gym 1.1.1¶

This is the documentation for gym-derk, a python package that exposes the game “Dr. Derk’s Mutant Battlegrounds” as an OpenAI gym environment.

Main website: https://gym.derkgame.com Please get a license on the website if you’re using this in a commercial or academic context.

Installing: pip install gym-derk (see Installation & Running, os specific instructions for details)

Examples¶

Basic example¶

In this example the Derklings just take random actions

from gym_derk.envs import DerkEnv

env = DerkEnv()

for t in range(3):
  observation_n = env.reset()
  while True:
    action_n = [env.action_space.sample() for i in range(env.n_agents)]
    observation_n, reward_n, done_n, info = env.step(action_n)
    if all(done_n):
      print("Episode finished")
      break
env.close()

Neural network example¶

This is an example of how to use the Genetic Algorithm to train a single layer neural network.

from gym_derk.envs import DerkEnv
from gym_derk import ObservationKeys
import numpy as np
import gym
import math
import os.path

env = DerkEnv()

class Network:
  def __init__(self, weights=None, biases=None):
    self.network_outputs = 13
    if weights is None:
      weights_shape = (self.network_outputs, len(ObservationKeys))
      self.weights = np.random.normal(size=weights_shape)
    else:
      self.weights = weights
    if biases is None:
      self.biases = np.random.normal(size=(self.network_outputs))
    else:
      self.biases = biases

  def clone(self):
    return Network(np.copy(self.weights), np.copy(self.biases))

  def forward(self, observations):
    outputs = np.add(np.matmul(self.weights, observations), self.biases)
    casts = outputs[3:6]
    cast_i = np.argmax(casts)
    focuses = outputs[6:13]
    focus_i = np.argmax(focuses)
    return (
      math.tanh(outputs[0]), # MoveX
      math.tanh(outputs[1]), # Rotate
      max(min(outputs[2], 1), 0), # ChaseFocus
      (cast_i + 1) if casts[cast_i] > 0 else 0, # CastSlot
      (focus_i + 1) if focuses[focus_i] > 0 else 0, # Focus
    )

  def copy_and_mutate(self, network, mr=0.1):
    self.weights = np.add(network.weights, np.random.normal(size=self.weights.shape) * mr)
    self.biases = np.add(network.biases, np.random.normal(size=self.biases.shape) * mr)

weights = np.load('weights.npy') if os.path.isfile('weights.npy') else None
biases = np.load('biases.npy') if os.path.isfile('biases.npy') else None

networks = [Network(weights, biases) for i in range(env.n_agents)]

for e in range(10):
  observation_n = env.reset()
  while True:
    action_n = [networks[i].forward(observation_n[i]) for i in range(env.n_agents)]
    observation_n, reward_n, done_n, info = env.step(action_n)
    if all(done_n):
        print("Episode finished")
        break
  if env.mode == 'train':
    reward_n = env.total_reward
    print(reward_n)
    top_network_i = np.argmax(reward_n)
    top_network = networks[top_network_i].clone()
    for network in networks:
      network.copy_and_mutate(top_network)
    print('top reward', reward_n[top_network_i])
    np.save('weights.npy', top_network.weights)
    np.save('biases.npy', top_network.biases)
env.close()

Environment¶

Environment details¶

This is a MOBA inspired RL environment, where two teams battle each other, while trying to defend their own “statue”. Each team is composed of three units, and each unit gets a random loadout (see Items for available items). The goal is to try to attack the opponents statue and units, while defending your own. With the default reward, you get one point for killing an enemy creature, and four points for killing an enemy statue.

Arenas and parallelism¶

The environment is designed to run multiple game instances in parallel on the GPU. Each game instance is called an arena. Functions such as step and reset provide and return values from multiple arenas each. Thanks to this functionallity, it’s possible to collect a large amount of experiences very quickly.

Team and episode stats¶

There are a number of statistics you can access about teams. Use gym_derk.envs.DerkEnv.team_stats or gym_derk.DerkSession.team_stats to get the data. See gym_derk.TeamStatsKeys for available keys. For example, to read the Reward of the third team:

env.team_stats[2, TeamStatsKeys.Reward.value]

You can also get stats for all arenas in an episode with gym_derk.envs.DerkEnv.episode_stats and gym_derk.DerkSession.episode_stats.

Running against other agents / Benchmarking¶

First, we need an agent to run; you can try for instance https://github.com/MountRouke/Randos. Clone the repo, and start them with python main.py --server. This will start a websocket server for that agent.

Next, set mode="connected" in your own agent environment. The environment will connect with a websocket to the server running locally (by default), and the away team will now be controlled by the server. You can now train against these agents, or if you wish to benchmark against them you can look at episode_stats at the end of an episode to see how your agents were performing against the opponents.

Installation & Running, os specific instructions¶

The Derk environment is implemented as a WebGL2 web app, and runs in a chromium instance, through pyppeteer. This means that you can get the environment working anywhere where you can get chromium with WebGL2 working. On Desktop systems (Windows, OSX, Desktop linux), using the Derk environment is fairly straightforward; just run pip install gym-derk. If you get any errors, make sure that WebGL2 works on your system; you can verify that it does by visiting WebGL2 report. Unfortunately it’s not possible to run the environment in a headless mode, since chromium doesn’t support GPU acceleration in headless mode yet (see this issue).

On a server environment, or if you’re using Docker, there are two main ways to run the environment. The easiest is to use xvfb, which usually means the environment will run on the CPU. See https://github.com/MountRouke/DerkAppInstanceDockerized for a Debian based Docker image, and https://github.com/MountRouke/DerkAppInstanceDockerized/tree/ubuntu for an Ubuntu based image, both using xvfb. To utilize GPU acceleration on a server/in docker, you’ll need to use virtualgl. Virtualgl can be a bit tricky to set up, but there are Docker images with it that could serve as a base.

The environment can also be run on Google Colab. See the Derk Colab GPU example (virtualgl based) or Derk Colab CPU example (xvfb based).

Finally, it’s also possible to set up an agent as a server, without running the environment. This makes it possible to set up a trained agent as a service which you can connect to. See https://github.com/MountRouke/Randos for an example of how to do this. This is for instance useful for running a competition, where participants can submit Dockerized images with their agents, but where the actual environment is run outside of their images.

Competition (AICrowd)¶

We’re partnering with AICrowd to run a competition for Derk, where you can submit your agents to see how well they are performing compared to other participants’ agents. The API is free to use for the competition.

Competition page (with starter kit and submission guidelines): https://aicrowd.com/derk

Configuring your Derklings¶

You can configure a number of attributes on your Derklings, such as their appearance and their load-out. The configuration is read modulous, so you can specify 1, 3 or n_arenas * 3 configurations (or any other number) depending on how you want it repeated. Here’s a basic example:

env = DerkEnv(
   home_team=[
      { 'primaryColor': '#ff00ff' },
      { 'primaryColor': '#00ff00', 'slots': ['Talons', None, None] },
      { 'primaryColor': '#ff0000', 'rewardFunction': { 'healTeammate1': 1 } }
   ]
)

The properties you can configure for a Derkling are:

Cosmetics:
- primaryColor: A hex color: e.g. #ff00ff
- secondaryColor: Also a hex color
- ears: Integer between 1-4
- eyes: Integer between 1-5
- backSpikes: Integer between 1-7
slots: An array with exactly 3 items. Each item is a weapon/attachment slot. The first one is the arms attachment, the second tail attachment and the third is the misc attachment. See Items for available items.
rewardFunction: A specific reward function for this Derkling. See Reward function

Citation¶

Please use this BibTeX to cite this environment in your publications:

@misc{gym_derk,
   author = {John Fredrik Wilhelm Norén},
   title = {Derk Gym Environment},
   year = {2020},
   publisher = {Mount Rouke},
   journal = {Mount Rouke},
   howpublished = {\url{https://gym.derkgame.com}},
}

API reference¶

High-level API¶

The high-level API provides a simple, OpenAI Gym compatible DerkEnv class which is suitible for a Python notebook environment.

class gym_derk.envs.DerkEnv(mode=False, n_arenas=None, reward_function=None, turbo_mode=False, home_team=None, away_team=None, session_args={}, app_args={}, agent_server_args={})¶

Reinforcement Learning environment for “Dr. Derk’s Mutant Battlegrounds”

There are two modes for the environment:

mode="normal": You control both the home and away teams.
mode="connected": Connects this environment to another agent. See Running against other agents / Benchmarking.

Parameters

mode (str) – "normal" (default), "connected". See above for details. (Environment variable: DERK_MODE)
n_arenas (Optional[int]) – Number of parallel arenas to run
reward_function (Optional[Dict]) – Reward function. See Reward function for available options
turbo_mode (bool) – Skip rendering to the screen to run as fast as possible
home_team (Optional[List[Dict]]) – Home team creatures. See Configuring your Derklings.
away_team (Optional[List[Dict]]) – Away team creatures. See Configuring your Derklings.
session_args (Dict) – See arguments to gym_derk.DerkAppInstance.create_session()
app_args (Dict) – See arguments to gym_derk.DerkAppInstance
agent_server_args (Dict) – See arguments to gym_derk.DerkAgentServer

This is a convenience wrapper of the more low level api of gym_derk.DerkAppInstance, gym_derk.DerkAgentServer and gym_derk.DerkSession.

property action_space¶: Gym space for actions

async async_close()¶: Async version of close()

async async_reset()¶: Async version of reset()

async async_step(action_n=None)¶

Async version of step()

Return type: Tuple[ndarray, ndarray, List[bool], List[Dict]]

close()¶: Shut down environment

property episode_stats¶

Stats for the last episode

Return type: Dict

property n_agents¶

Number of agents controlled by this environment

I.e. env.n_teams * env.n_agents_per_team

Return type: int

property n_agents_per_team¶

Number of agents in a team (3)

Return type: int

property n_teams¶

Number of teams controlled by this environment

Return type: int

property observation_space¶: Gym space for observations

reset()¶

Resets the state of the environment and returns an initial observation.

Return type: ndarray
Returns: The initial observation for each agent, with shape (n_agents, len(gym_derk.ObservationKeys)).
Raises: ConnectionLostError – If there was a connection error in connected mode

step(action_n=None)¶

Run one timestep.

Accepts a list of actions, one for each agent, and returns the current state.

Actions can have one of the following formats/shapes:

Numpy array of shape (n_teams, n_agents_per_team, len(gym_derk.ActionKeys))
Numpy array of shape (n_agents, len(gym_derk.ActionKeys))
List of actions (i.e. [[1, 0, 0, 2, 0], [0, 1, 0, 0, 3], ...]), one inner array per agent. This is just cast to a numpy array of shape (n_agents, len(gym_derk.ActionKeys)).

The returned observations are laid out in the same way as the actions, and can therefore be reshape like the above. For instance: observations.reshape((env.n_teams, env.n_agents_per_team, -1))

Parameters: action_n (Optional[ndarray]) – Numpy array or list of actions. See gym_derk.ActionKeys
Return type: Tuple[ndarray, ndarray, List[bool], List[Dict]]
Returns: A tuple of (observation_n, reward_n, done_n, info). observation_n has shape (n_agents, len(gym_derk.ObservationKeys))
Raises: ConnectionLostError – If there was a connection error in connected mode

property team_stats¶

Stats for each team for the last episode

Numpy array of shape (env.n_teams, len(gym_derk.TeamStatsKeys))

See Team and episode stats

Return type: ndarray

property total_reward¶

Accumulated rewards over an episode

Numpy array of shape (n_agents)

Return type: ndarray

Low-level API¶

The low-level API is more versatile and makes it possible to do things like setting up an agent as a service or running many different agents together, even if they are running on completely different machines. Here’s an example of how it works:

from gym_derk import DerkAgentServer, DerkSession, DerkAppInstance
import asyncio

async def run_fixed(env: DerkSession, actions):
  await env.reset()
  while not env.done:
    await env.step([actions for i in range(env.n_agents)])

async def main():
  # Agent servers are just websocket servers which can be connected to by a DerkAppInstance
  # That means these three could be running in different processes or even on different machines
  agent_walk  = DerkAgentServer(run_fixed, args={ 'actions': [0.1,  0, 0, 0, 0] }, port=8788)
  agent_turn  = DerkAgentServer(run_fixed, args={ 'actions': [0,  0.1, 0, 0, 0] }, port=8789)
  agent_chase = DerkAgentServer(run_fixed, args={ 'actions': [0,    0, 1, 1, 5] }, port=8790)

  await agent_walk.start()
  await agent_turn.start()
  await agent_chase.start()

  # This creates an actual instance of the game to run simulations in
  app = DerkAppInstance()
  await app.start()
  # We can specify any number of agent hosts here, and which sides and arenas they control
  await app.run_session(
    n_arenas=2,
    agent_hosts=[
      { 'uri': agent_walk.uri,  'regions': [{ 'sides': 'home' }] },
      { 'uri': agent_turn.uri,  'regions': [{ 'sides': 'away', 'start_arena': 0, 'n_arenas': 1 }] },
      { 'uri': agent_chase.uri, 'regions': [{ 'sides': 'away', 'start_arena': 1, 'n_arenas': 1 }] },
    ]
  )
  await app.print_team_stats()

asyncio.get_event_loop().run_until_complete(main())

class gym_derk.DerkAgentServer(handle_session, port=None, host=None, args={})¶

Agent server

This creates a websocket agent server, listening on host:port

Parameters

handle_session – A coroutine accepting the session and optionally a list org argument
port (Optional[int]) – Port to listen to. Defaults to 8789
host (Optional[str]) – Host to listen to. Defaults to 127.0.0.1
args (Dict) – Dictonary of args passed to handle_session

close()¶: Shutdown

async start()¶: Start the server

class gym_derk.DerkSession(websocket, init_msg)¶

A single training/evaluation session, consisting of multiple episodes

n_teams¶: Number of teams controlled by this environment

n_agents_per_team¶: Number of agents in a team (3)

action_space¶: Gym space for actions

observation_space¶: Gym space for observations

total_reward¶: Accumulated rewards over an episode. Numpy array of shape (n_agents)

team_stats¶: Stats for each team for the last episode. Numpy array of shape (n_teams, len(gym_derk.TeamStatsKeys)). See Team and episode stats

episode_stats¶: Stats for the last episode. See Team and episode stats

async close()¶: Close session

property n_agents¶

Number of agents controlled by this environment

I.e. env.n_teams * env.n_agents_per_team

async reset()¶

See gym_derk.envs.DerkEnv.reset()

Return type: ndarray

async step(action_n=None)¶

See gym_derk.envs.DerkEnv.step()

Return type: Tuple[ndarray, ndarray, List[bool], List[Dict]]

class gym_derk.DerkAppInstance(app_host=None, chrome_executable=None, chrome_args=[], chrome_devtools=False, window_size=[1000, 750], browser=None, browser_logs=False, internal_http_server=False)¶

Application instance of “Dr. Derk’s Mutant Battlegrounds”

Parameters

app_host (Optional[str]) – Configure an alternative app bundle host. (Environment variable: DERK_APP_HOST)
chrome_executable (Optional[str]) – Path to chrome or chromium. (Environment variable: DERK_CHROME_EXECUTABLE)
chrome_args (List[str]) – List of command line switches passed to chrome
chrome_devtools (bool) – Launch devtools when chrome starts
window_size (Tuple[int, int]) – Tuple with the size of the window
browser (Optional[Browser]) – A pyppeteer browser instance
browser_logs (bool) – Show log output from browser
web_socket_worker – Run websockets in a web worker

async async_get_webgl_renderer()¶: Async version of get_webgl_renderer()

async close()¶: Shut down app instance

async connect_to_agent_hosts()¶

Connect to agent hosts specified when the session was created

Returns: True if all hosts are connected, False otherwise

This method can be called in a loop to wait for all hosts to come online.

async create_session(n_arenas=1, reward_function=None, turbo_mode=False, home_team=None, away_team=None, substeps=8, interleaved=True, agent_hosts=None, debug_no_observations=False, web_socket_worker=None, ai_crowd_logo=False, read_game_state=False)¶

Create a session

All arguments are optional.

Parameters

n_arenas (int) – Number of parallel arenas to run
reward_function (Optional[Dict]) – Reward function. See Reward function for available options
turbo_mode (bool) – Skip rendering to the screen to run as fast as possible
home_team (Optional[List[Dict]]) – Home team creatures. See Configuring your Derklings.
away_team (Optional[List[Dict]]) – Away team creatures. See Configuring your Derklings.
substeps (int) – Number of game steps to run for each call to step
interleaved (bool) – Run each step in the background, returning the previous steps observations
agent_hosts (Union[List[Dict], str, None]) – List of DerkAgentServer’s to connect to, or "single_local", or "dual_local". See below for details.
read_game_state (bool) – Read the entire internal game state each step, and provide it as a JSON in the info object returned from the step function.

With the interleaved mode on, there’s a delay between observation and action of size substeps. E.g. if substeps=8 there’s an 8*16ms = 128ms “reaction time” from observation to action. This means that the game and the python code can in effect run in parallel.

The agent_hosts argument takes list of dicts with the following format: { uri: str, regions: [{ side: str, start_arena: int, n_arenas: int }] }, where uri specifies a running DerkAgentServer to connect to, and regions define which arenas and sides that agent will control. side can be 'home', 'away' or 'both'. start_arena and n_arenas can be ommitted to run the agent on all arenas. You can also pass a string value of "single_local", in which case the agent_hosts defaults to [{ 'uri': 'ws://127.0.0.1:8788', 'regions': [{ 'sides': 'both' }] }], or if you specify "dual_local" it defaults to

[
  { 'uri': 'ws://127.0.0.1:8788', 'regions': [{ 'sides': 'home' }] },
  { 'uri': 'ws://127.0.0.1:8789', 'regions': [{ 'sides': 'away' }] }
]

async disconnect_all_remotes()¶: Disconnect all remotes

async episode_reset()¶: Reset for an episode

async episode_step()¶: Step for an episode

async get_episode_stats()¶: Gets a summary of stats for the last episode, based on team_stats

async get_team_stats()¶

Read all team stats from the last episode

Return type: ndarray
Returns: Team stats for all teams; a numpy array of shape (2, n_arenas, len(gym_derk.TeamStatsKeys)). The first dimension is the side (0=home, 1=away).

get_webgl_renderer()¶

Return which webgl renderer is being used by the game

Return type: str

async print_team_stats(team_stats=None)¶: Reads and prints the team stats from the last episode

async reload()¶: Reload the game

async run_episode()¶

Run a single episode

Shorthand for:

try:
  await app.episode_reset()
  while not (await app.episode_step()):
    pass
except Exception as e:
  app.disconnect_all_remotes()

async run_episodes_loop()¶: Runs episodes in a loop until agents disconnect

async run_session(**kwargs)¶

Creates a session, connect hosts and runs episodes loop.

See create_session() for args.

This is just a shorthand for:

`python await self.create_session(args) await self.connect_to_agent_hosts() await self.run_episodes_loop() `

property running¶: Returns true if the app is still running

async start()¶: Start the application

async update_away_team_config(config)¶

Update the away teams configuration.

The session needs to be created first.

Parameters: config – See Configuring your Derklings

async update_home_team_config(config)¶

Update the home teams configuration.

The session needs to be created first.

Parameters: config – See Configuring your Derklings

async update_reward_function(reward_function)¶

Update the reward function.

The session needs to be created first.

Parameters: reward_function – See Reward function

class gym_derk.ObservationKeys(value)¶

An enumeration.

Hitpoints = 0¶

Ability0Ready = 1¶

FriendStatueDistance = 2¶

FriendStatueAngle = 3¶

Friend1Distance = 4¶

Friend1Angle = 5¶

Friend2Distance = 6¶

Friend2Angle = 7¶

EnemyStatueDistance = 8¶

EnemyStatueAngle = 9¶

Enemy1Distance = 10¶

Enemy1Angle = 11¶

Enemy2Distance = 12¶

Enemy2Angle = 13¶

Enemy3Distance = 14¶

Enemy3Angle = 15¶

HasFocus = 16¶

FocusRelativeRotation = 17¶

FocusFacingUs = 18¶

FocusFocusingBack = 19¶

FocusHitpoints = 20¶

Ability1Ready = 21¶

Ability2Ready = 22¶

FocusDazed = 23¶

FocusCrippled = 24¶

HeightFront1 = 25¶

HeightFront5 = 26¶

HeightBack2 = 27¶

PositionLeftRight = 28¶

PositionUpDown = 29¶

Stuck = 30¶

UnusedSense31 = 31¶

HasTalons = 32¶

HasBloodClaws = 33¶

HasCleavers = 34¶

HasCripplers = 35¶

HasHealingGland = 36¶

HasVampireGland = 37¶

HasFrogLegs = 38¶

HasPistol = 39¶

HasMagnum = 40¶

HasBlaster = 41¶

HasParalyzingDart = 42¶

HasIronBubblegum = 43¶

HasHeliumBubblegum = 44¶

HasShell = 45¶

HasTrombone = 46¶

FocusHasTalons = 47¶

FocusHasBloodClaws = 48¶

FocusHasCleavers = 49¶

FocusHasCripplers = 50¶

FocusHasHealingGland = 51¶

FocusHasVampireGland = 52¶

FocusHasFrogLegs = 53¶

FocusHasPistol = 54¶

FocusHasMagnum = 55¶

FocusHasBlaster = 56¶

FocusHasParalyzingDart = 57¶

FocusHasIronBubblegum = 58¶

FocusHasHeliumBubblegum = 59¶

FocusHasShell = 60¶

FocusHasTrombone = 61¶

UnusedExtraSense30 = 62¶

UnusedExtraSense31 = 63¶

class gym_derk.ActionKeys(value)¶

These are the actions a Derkling can take, which you send to the step function.

MoveX = 0: A number between -1 and 1. This controlls forward/backwords movement of the Derkling.

Rotate = 1: A number between -1 and 1. This controlls the rotation of the Derklin. Rotate -1 mean turn left full speed.

ChaseFocus = 2: A number between 0 and 1. If this is 1, the MoveX and Rotate actions are ignored and instead the Derkling runs towards its current focus. Numbers between 0-1 interpolates between this behavior and the MoveX/Rotate actions, and 0 means only MoveX and Rotate are used.

CastingSlot = 3: 0=don’t cast. 1-3=cast corresponding ability.

ChangeFocus = 4: 0=keep current focus. 1=focus home statue. 2-3=focus teammates, 4=focus enemy statue, 5-7=focus enemy

class gym_derk.TeamStatsKeys(value)¶

An enumeration.

Reward = 0¶

OpponentReward = 1¶

Hitpoints = 2¶

AliveTime = 3¶

CumulativeHitpoints = 4¶

gym_derk.run_derk_agent_server_in_background(handle_session, **kwargs)¶

Launch a DerkAgentServer a background thread

Accepts the same arguments as gym_derk.DerkAgentServer

Reward function¶

The reward function is based on the OpenAI Five reward function (https://gist.github.com/dfarhi/66ec9d760ae0c49a5c492c9fae93984a). These are the possible fields:

Field	Default value	Notes
damageEnemyStatue	0	Per hitpoint
damageEnemyUnit	0	Per hitpoint
killEnemyStatue	4
killEnemyUnit	1
healFriendlyStatue	0	Per hitpoint
healTeammate1	0	Per hitpoint
healTeammate2	0	Per hitpoint
timeSpentHomeBase	0	Every 5 seconds
timeSpentHomeTerritory	0	Every 5 seconds
timeSpentAwayTerritory	0	Every 5 seconds
timeSpentAwayBase	0	Every 5 seconds
damageTaken	0	Per hitpoint
friendlyFire	0	Per hitpoint
healEnemy	0	Per hitpoint
fallDamageTaken	0	Per hitpoint
statueDamageTaken	0	Per hitpoint (the teams own statue)
teamSpirit	0	If this is 1, it means all rewards are averaged between teammates
timeScaling	1	This is a linear falloff with time for reward; 0 means no reward at all at the last step

Items¶

By default, a Derkling gets a random loadout assigend. Each slot has a 70% chance to be filled, which means there’s a 34% chance of three items, 44% chance of two items, 19% chance of one item and 3% chance of no items.

Name	Slot	Description
Talons	arms	Melee item dealing good, steady damge to a target.
BloodClaws	arms	Damage dealing melee item that also heals the equipper with each hit.
Cleavers	arms	Heave and powerful, but slow hitting melee item.
Cripplers	arms	Melee item that also cripple the opponent, making them move slower.
Pistol	arms	Ranged weapon. Pew pew!
Magnum	arms	Heavy ranged weapon that knocks the target back.
Blaster	arms	Heavy ranged weapon that deals massive damge.
FrogLegs	misc	Long strong legs, enabling the Derkling to quickly jump forward.
IronBubblegum	misc	Blows an iron-enforced bubble around a target, protecting them from damage.
HeliumBubblegum	misc	Blows a bubble filled with helium around a target, making them float up into the air.
Shell	misc	Increases the armor of a Derkling. Armor is further increased when they duck.
Trombone	misc	When the horn is blown, all enemies are forced to focus on the musician.
HealingGland	tail	Siphons hitpoints to the target.
VampireGland	tail	Drains a target of hitpoints and restores the casters hitpoints.
ParalyzingDart	tail	Launches a projectile at a target, dazing them for a short moment.

Changelog¶

1.1.1: Fix bug that made gym_derk.DerkEnv.reset() return None
1.1.0:
- Add gym_derk.DerkAppInstance.update_home_team_config()
- Add gym_derk.DerkAppInstance.update_away_team_config()
- Add gym_derk.DerkAppInstance.update_reward_function()
- Tweaked how reset/step were executed in DerkEnv (lock-step instead of free running)
1.0.1: Fix derkling configuration reading (should be mod the number of configs). Configuring your Derklings for details.
1.0.0:

The API has now been live for a while, so it’s time to call it 1.0. No changes from the previous version except the one below:

Introduced read_game_state to gym_derk.DerkAppInstance.create_session()

0.16.4: Tweak random loadout item chance (50% -> 70%). See Items for details.
0.16.3: Remove .GymLoaded timeout (fixed)
0.16.2: Remove .GymLoaded timeout
0.16.1: Fix bug with victory/loss reward function calculations
0.16.0: Remove all xxx_keys enum properties, and add gym_derk.ObservationKeys, gym_derk.ActionKeys and gym_derk.TeamStatsKeys
0.15.6: Fix exception thrown at env.close (https://github.com/MountRouke/DerkGymIssues/issues/6)
0.15.5: Tweak AI crowd logo display
0.15.4: Added gym_derk.DerkAppInstance.get_episode_stats()
0.15.3: Remove headless switch (wasn’t working), and improve installation instructions.
0.15.2:

Added: gym_derk.DerkAppInstance.episode_reset()

Added: gym_derk.DerkAppInstance.episode_step()

Added: gym_derk.DerkAppInstance.disconnect_all_remotes()

0.15.1: Make it possible to display the AICrowd logo in-game
0.15.0
- Remove “Points”; we only have Reward now
- Update default reward
- Update team_stats_keys; remove “Gold”, “Points” and “OpponentPoints” and add “Reward” and “OpponentReward”
- Winner is now based on the team with the highest reward
0.14.3: Prevent Derklings from moving too far off camera
0.14.2: Tweak camera to make more of the map visible
0.14.1: Configurable window size
0.14.0: Improve Derklig configuration documentation, and change Derkling bounties field name to rewardFunction.
0.13.2: Fix two memory leaks
0.13.1: Fix bug that prevented DerkEnv to start since 0.13.0
0.13.0
- Added gym_derk.DerkAppInstance.connect_to_agent_hosts()
- gym_derk.DerkAppInstance.create_session() no longer automatically connects to hosts. When this method is used you need to call gym_derk.DerkAppInstance.connect_to_agent_hosts() as well.
0.12.4:
- Added:
  - gym_derk.envs.DerkEnv.team_stats_keys
  - gym_derk.envs.DerkEnv.team_stats
  - gym_derk.DerkSession.team_stats_keys
  - gym_derk.DerkSession.team_stats
  - gym_derk.DerkAppInstance.team_stats_keys
  - gym_derk.DerkAppInstance.get_team_stats()
  - gym_derk.DerkAppInstance.print_team_stats()
0.12.3:
- Fix argument bug to run_derk_agent_server_in_background
0.12.2:
- Add run_derk_agent_server_in_background convenience method
0.12.1:
- Re-added a couple of convenience arguments to DerkEnv: n_arenas, reward_function, turbo_mode, home_team and away_team. These simply get added to the sesssion_args argument.
0.12.0:
- Rename DerkAppInstance.async_init_browser to DerkAppInstance.start, which the user now needs to call to start the app
- If the app closes for any reason, the DerkEnv is now able to restart it on reset
0.11.1:
- Add args to DerkAgentServer which are passed to the session runner
0.11.0:
- This version breaks the API into two parts; a high-level DerkEnv, suitable for working in for instance a notebook environment, and a low-level API with DerkAgentServer, DerkSession and DerkAppInstance.
- The arguments to DerkEnv have changed and is just three dict args that gets passed down the the low-level API. To set for instance n_arenas and app_host, it would look like this now: DerkEnv(session_args={ 'n_arenas': 10 }, app_args={ 'app_host': 'http://localhost:3000' })
0.10.0:
- Added env.action_keys and env.observation_keys
- Removed env.n_actions; use len(env.action_keys) instead.
- Removed env.n_senses; use len(env.observation_keys) instead.
0.9.0:
- Started keeping a changelog
- The connected_host argument was replaced with a connected_envs argument, and documentation added on how to specify it

Derk’s Gym 1.1.1¶

Examples¶

Basic example¶

Neural network example¶

Environment¶

Environment details¶

Arenas and parallelism¶

Team and episode stats¶

Running against other agents / Benchmarking¶

Installation & Running, os specific instructions¶

Competition (AICrowd)¶

Configuring your Derklings¶

Citation¶

API reference¶

High-level API¶

Low-level API¶

Reward function¶

Items¶

Changelog¶

Indices and tables¶