minihack package

class minihack.LevelGenerator(map=None, w=8, h=8, fill='.', lit=True, flags=('hardfloor',), solidfill=' ')[source]

Bases: object

LevelGenerator provides a convenient Python interface for quickly writing description files for MiniHack. The LevelGenerator class can be used to create MAZE-type levels with specified heights and widths, and can then fill those levels with objects, monsters and terrain, and specify the start point of the level.

Parameters

map (str or None) – The description of the map block of the environment. If None, the map will have a rectangle layout with the given height and width. Defaults to None.
w (int) – The width of map. Only used when map=None. Defaults to 8.
h (int) – The height of map. Only used when map=None. Defaults to 8.
fill (str) – A character describing the environment feature that fills the map. Only used when map=None. Defaults to “.”, which corresponds to floor.
lit (bool) – Whether the layout is lit or not. This affects the observations the agent will receive. If an area is not lit, the agent can only see directly adjacent grids. Defaults to True.
flags (tuple) – Flags of the environment. For the full list, see https://nethackwiki.com/wiki/Des-file_format#FLAGS. Defaults to (“hardfloor”,).
solidfill (str) – A character describing the environment feature used for filling solid / unspecified parts of the map. Defaults to ” “, which corresponds to solid wall.

__init__(map=None, w=8, h=8, fill='.', lit=True, flags=('hardfloor',), solidfill=' ')[source]: Initialize self. See help(type(self)) for accurate signature.
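
As a minimal sketch of how the constructor arguments and the methods documented below fit together, the helper here builds a small level and returns its des-file. The helper name make_level and the chosen coordinates are illustrative assumptions; only the LevelGenerator calls come from this reference. The generator class is passed in as an argument so that the sketch does not depend on MiniHack at import time.

```python
def make_level(generator_cls):
    """Build a 5x5 lit room and return its des-file as a string.

    `generator_cls` is expected to be minihack.LevelGenerator (or any
    class exposing the same methods); make_level itself is hypothetical.
    """
    gen = generator_cls(w=5, h=5, lit=True)
    gen.set_start_pos((0, 0))   # agent starts at (x, y) = (0, 0)
    gen.add_goal_pos((4, 4))    # documented as the same as add_stair_down
    gen.add_monster(name="jackal", symbol="d", place=(2, 2))
    gen.add_gold(amount=10)     # place=None: location selected randomly
    return gen.get_des()
```

With MiniHack installed, importing LevelGenerator from minihack and calling make_level(LevelGenerator) should give a string that can be passed as the des_file argument of the environment classes.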

add_altar(place=None, align='random', type='random')[source]

Add an altar.

Parameters

place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.
align (str) – The alignment. Possible values are “noalign”, “law”, “neutral”, “chaos”, “coaligned”, “noncoaligned”, and “random”. Defaults to “random”.
type (str) – The type of the altar. Possible values are “sanctum”, “shrine”, “altar”, and “random”. Defaults to random.
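
The place argument recurs across most add_* methods. The sketch below illustrates the three accepted forms as described above. The helper place_to_des is hypothetical, and the exact strings it returns ("random" and "(x,y)") are assumptions about des-file syntax rather than the output LevelGenerator is confirmed to produce.

```python
def place_to_des(place):
    """Render a `place` argument following the rules in this reference.

    Hypothetical helper: the strings LevelGenerator really emits may differ.
    """
    if place is None:
        return "random"          # location is selected randomly
    if isinstance(place, tuple):
        x, y = place
        return f"({x},{y})"      # exact (x, y) coordinates
    if isinstance(place, str):
        return place             # copied to the des-file as is
    raise TypeError(f"unsupported place: {place!r}")
```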

add_boulder(place=None)[source]

Add a boulder to the floor.

Parameters

place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.

add_door(state, place=None)[source]

Add a door.

Parameters

state (str) – The state of the door. Possible values are “locked”, “closed”, “open”, “nodoor”, and “random”.
place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.

add_fountain(place=None)[source]

Add a fountain.

Parameters: place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.

add_goal_pos(place=None)[source]: Add a goal at the given place. Same as add_stair_down.

add_gold(amount, place=None)[source]

Add gold on the floor.

Parameters

amount (int) – The amount of gold.
place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.

add_line(str)[source]

Add a custom string to the bottom of the description file.

Parameters: str (str) – The string to be concatenated to the des-file.

add_mazewalk(coord=None, dir='east')[source]

Creates a random maze, starting from the given coordinate.

Mazewalk turns map grids with solid stone into floor. From the starting position, it checks the mapgrid in the direction given, and if it’s solid stone, it will move there, and turn that place into floor. Then it will choose a random direction, jump over the nearest mapgrid in that direction, and check the next mapgrid for solid stone. If there is solid stone, mazewalk will move that direction, changing that place and the intervening mapgrid to floor. Normally the generated maze will not have any loops.

For example, pointing mazewalk at a MAP block in which trees separate cells of solid stone will create a small maze of trees. However, unless that map (at the place where it is put into the level) is surrounded by something other than solid stone, mazewalk will escape the MAP. Substituting floor characters for some of the trees “in the maze” creates loops in the maze, which are not otherwise possible. Substituting floor characters for some of the trees at the edges of the map creates maze entrances and exits at those places.

For more details see https://nethackwiki.com/wiki/Des-file_format#MAZEWALK.

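Parameters

coord (tuple or None) – A tuple with length two representing the (x, y) coordinates. If None is passed, the middle point of the map is selected. Defaults to None.
dir (str) – The direction in which mazewalk first checks from the starting coordinate. Defaults to “east”.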
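
The carving procedure described above can be illustrated with a simplified sketch. This is not NetHack's implementation: it skips the initial step in the given direction, ignores what the intervening cell contains, and uses an explicit stack to backtrack. The function name and the grid representation (a list of lists with ' ' for solid stone and '.' for floor) are assumptions made for illustration.

```python
import random


def mazewalk(grid, start, rng):
    """Carve a loop-free maze into `grid` in place and return it.

    ' ' is solid stone and '.' is floor. From each position a random
    direction is tried: the walk jumps over the nearest cell and, if the
    cell beyond it is still solid stone, turns both cells into floor.
    """
    height, width = len(grid), len(grid[0])
    x, y = start
    grid[y][x] = "."
    stack = [(x, y)]
    while stack:
        x, y = stack[-1]
        directions = [(0, -1), (1, 0), (0, 1), (-1, 0)]
        rng.shuffle(directions)
        for dx, dy in directions:
            nx, ny = x + 2 * dx, y + 2 * dy
            if 0 <= nx < width and 0 <= ny < height and grid[ny][nx] == " ":
                grid[y + dy][x + dx] = "."  # the intervening mapgrid
                grid[ny][nx] = "."          # the new position
                stack.append((nx, ny))
                break
        else:
            stack.pop()  # dead end: backtrack to the previous position
    return grid
```

Because a cell is only entered while it is still solid stone, every floor cell is reached by exactly one path, which is why no loops appear unless floor is substituted into the map beforehand.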

add_monster(name='random', symbol=None, place=None, args=())[source]

Add a monster to the map.

Parameters

name (str) – The name of the monster. Defaults to “random”.
symbol (str or None) – The symbol of the monster. The symbol should correspond to the family of the specified monster. For example, the “d” symbol corresponds to canine monsters, so the name of the monster should also correspond to a canine (e.g. jackal). Not used when name is “random”. Defaults to None.
place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.
args (tuple) – Additional monster arguments, e.g. “hostile” or “peaceful”, “asleep” or “awake”, etc. For more details, see https://nethackwiki.com/wiki/Des-file_format#MONSTER.

add_object(name='random', symbol='%', place=None, cursestate=None)[source]

Add an object to the map.

Parameters

name (str) – The name of the object. Defaults to random.
symbol (str) – The symbol of the object. The symbol should correspond to the given object name. For example, the “%” symbol corresponds to comestibles, so the name of the object should also correspond to a comestible (e.g. apple). Not used when name is “random”. Defaults to “%”.
place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.
cursestate (str or None) – The cursed state of the object. Can be “blessed”, “uncursed”, “cursed” or “random”. Defaults to None (not used).

add_object_area(area_name, name='random', symbol='%', cursestate=None)[source]: Add an object in an area of the map defined by area_name variable. See add_object for more details.

add_sink(place=None)[source]

Add a sink.

Parameters: place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.

add_stair_down(place=None)[source]

Add a stair down at the given place.

Parameters: place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.

add_terrain(coord, flag, in_footer=False)[source]

Add terrain features to the map.

Parameters

coord (tuple) – A tuple with length two representing the (x, y) coordinates.
flag (str) – The flag corresponding to the desired terrain feature. Should belong to minihack.level_generator.MAP_CHARS. For more details, see https://nethackwiki.com/wiki/Des-file_format#Map_characters
in_footer (bool) – Whether to define the terrain feature as an additional line in the description file (True) or directly modify the map block with the given flag (False). Defaults to False.

add_trap(name='teleport', place=None)[source]

Add a trap.

Parameters

name (str) – The name of the trap. For possible values, see minihack.level_generator.TRAP_NAMES. Defaults to “teleport”.
place (None, tuple or str) – The place of the added object. If None, the location is selected randomly. Tuple values are used for providing exact (x, y) coordinates. String values are copied to des-file as is. Defaults to None.

fill_terrain(type, flag, x1, y1, x2, y2)[source]

Fill the area between (x1, y1) and (x2, y2) with the given dungeon feature.

Parameters

type (str) – The type of filling. “rect” indicates an unfilled rectangle, containing just the edges and none of the interior points. “fillrect” denotes filled rectangle containing the edges and all of the interior points. “line” is used for a straight line drawn from one pair of coordinates to the other using Bresenham’s line algorithm.
flag (str) – The flag corresponding to the desired terrain feature. Should belong to minihack.level_generator.MAP_CHARS. For more details, see https://nethackwiki.com/wiki/Des-file_format#Map_characters
x1 (int) – x coordinate of point 1.
y1 (int) – y coordinate of point 1.
x2 (int) – x coordinate of point 2.
y2 (int) – y coordinate of point 2.
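
To make the three type values concrete, the sketch below enumerates the grid points each one would cover. It assumes both corner points are inclusive, which this reference does not state. The function names area_points and bresenham are hypothetical, and the line variant is a standard integer form of Bresenham's algorithm rather than code taken from MiniHack or NetHack.

```python
def bresenham(x1, y1, x2, y2):
    """Integer points on a straight line from (x1, y1) to (x2, y2)."""
    points = []
    dx, dy = abs(x2 - x1), -abs(y2 - y1)
    step_x = 1 if x1 < x2 else -1
    step_y = 1 if y1 < y2 else -1
    err = dx + dy
    x, y = x1, y1
    while True:
        points.append((x, y))
        if (x, y) == (x2, y2):
            return points
        doubled = 2 * err
        if doubled >= dy:
            err += dy
            x += step_x
        if doubled <= dx:
            err += dx
            y += step_y


def area_points(kind, x1, y1, x2, y2):
    """Points covered by a "rect", "fillrect" or "line" area."""
    xs = range(min(x1, x2), max(x1, x2) + 1)
    ys = range(min(y1, y2), max(y1, y2) + 1)
    if kind == "fillrect":  # edges and all interior points
        return [(x, y) for y in ys for x in xs]
    if kind == "rect":      # edges only
        return [
            (x, y)
            for y in ys
            for x in xs
            if x in (xs[0], xs[-1]) or y in (ys[0], ys[-1])
        ]
    if kind == "line":
        return bresenham(x1, y1, x2, y2)
    raise ValueError(f"unknown type: {kind!r}")
```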

get_des()[source]

Returns the description file.

Returns: the description file as a string.
Return type: str

get_map_array()[source]: Returns the map as a two-dimensional numpy array.

get_map_str()[source]: Returns the map as a string.

init_map(map=None, x=8, y=8, fill='.')[source]: Initialise the map block of the des-file.

set_area_variable(var_name, type, x1, y1, x2, y2)[source]

Set a variable representing an area on the map.

Parameters

var_name (str) – The name of the variable.
type (str) – The type of filling. “rect” indicates an unfilled rectangle, containing just the edges and none of the interior points. “fillrect” denotes filled rectangle containing the edges and all of the interior points. “line” is used for a straight line drawn from one pair of coordinates to the other using Bresenham’s line algorithm.
x1 (int) – x coordinate of point 1.
y1 (int) – y coordinate of point 1.
x2 (int) – x coordinate of point 2.
y2 (int) – y coordinate of point 2.

set_start_pos(coord)[source]

Set the starting position of the agent.

Parameters: coord (tuple) – A tuple with length two representing the (x, y) coordinates.

set_start_rect(p1, p2)[source]

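Set the rectangle within which the starting position of the agent is placed.

Parameters

p1 (tuple) – A tuple with length two representing the (x, y) coordinates of the first corner of the rectangle.
p2 (tuple) – A tuple with length two representing the (x, y) coordinates of the opposite corner of the rectangle.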

wallify()[source]: Wallify the map. Turns walls completely surrounded by other walls into solid stone ‘ ‘.

class minihack.MiniHack(*args: Any, **kwargs: Any)[source]

Bases: nle.env.tasks.

MiniHack base class.

All MiniHack environments are derived from this class, which itself is derived from NLE base class.

Note that this class itself is not used for creating new environment instances. Instead, MiniHackNavigation and MiniHackSkill provide a more convenient interface for doing this, both of which are directly derived from MiniHack for specific types of environments.

__init__(*args, des_file: str, reward_win=1, reward_lose=0, obs_crop_h=9, obs_crop_w=9, obs_crop_pad=0, reward_manager=None, use_wiki=False, autopickup=True, pet=False, observation_keys=['glyphs', 'chars', 'colors', 'specials', 'glyphs_crop', 'chars_crop', 'colors_crop', 'specials_crop', 'blstats', 'message'], seeds=None, include_see_actions=True, include_alignment_blstats=True, **kwargs)[source]

Constructs a new MiniHack environment.

Parameters

des_file (str) – The description file for the environment.
reward_win (float) – The reward received upon successfully completing an episode. Defaults to 1.
reward_lose (float) – The reward received upon death or aborting. Defaults to 0.
obs_crop_h (int) – The height of agent-centred cropped observation. Defaults to 9.
obs_crop_w (int) – The width of agent-centred cropped observation. Defaults to 9.
obs_crop_pad (int) – The padding for agent-centred cropped observation. Defaults to 0.
reward_manager (RewardManager or None) – The reward manager that describes the custom reward function of the agent. If None, the goal of the agent is to reach the stair down. Defaults to None.
use_wiki (bool) – Whether to use the NetHack wiki. Defaults to False.
autopickup (bool) – Turning autopickup on or off. Defaults to True.
pet (bool) – Whether to include the pet. Defaults to False.
observation_keys (list) – The keys of observations returned after every timestep by the environment as a dictionary. Defaults to minihack.base.MH_DEFAULT_OBS_KEYS.
seeds (list or None) – A list of integers used as level seeds for sampling episodes. The reset() function samples a seed from this list uniformly at random and uses it for setting the level. When the sample_seed argument of the reset function is set to False, a random level will not be sampled from this list during environment resetting. If None, the entire level distribution is used. Defaults to None.
penalty_mode (str) – The name of the mode for calculating the time step penalty. Can be constant, exp, square, linear, or always. Defaults to constant. Inherited from NetHackScore.
penalty_step (float) – A constant applied to amount of frozen steps. Defaults to -0.01. Inherited from NetHackScore.
penalty_time (float) – A constant applied to amount of frozen steps. Defaults to -0.0. Inherited from NetHackScore.
save_ttyrec_every (int) – Integer, if 0, no ttyrecs (game recordings) will be saved. Otherwise, save a ttyrec every Nth episode. Defaults to 0. Inherited from NLE.
savedir (str or None) – Path to save ttyrecs (game recordings) into, if save_ttyrec_every is nonzero. If nonempty string, interpreted as a path to a new or existing directory. If “” (empty string) or None, NLE chooses a unique directory name. Defaults to None. Inherited from NLE.
character (str) – Name of character. Defaults to “mon-hum-neu-mal”. Inherited from NLE.
max_episode_steps (int) – Maximum number of steps allowed before the game is forcefully quit. In such cases, info["end_status"] will be equal to StepStatus.ABORTED. Defaults to 200. Inherited from NLE.
actions (list) – List of actions. If None, the full action space will be used, i.e. nle.nethack.ACTIONS. Defaults to MH_FULL_ACTIONS. Inherited from NLE.
wizard (bool) – Activate wizard mode. Defaults to False. Inherited from NLE.
allow_all_yn_questions (bool) – If set to True, no y/n questions in step() are declined. If set to False, only elements of SKIP_EXCEPTIONS are not declined. Defaults to True. Inherited from NLE.
allow_all_modes (bool) – If set to True, do not decline menus, text input or auto ‘MORE’. If set to False, only skip click through ‘MORE’ on death. Defaults to False. Inherited from NLE.
spawn_monsters (bool) – If False, disables normal NetHack behavior to randomly create monsters. Defaults to False. Inherited from NLE.
include_see_actions (bool) – If True, the agent’s action space includes the additional NLE actions introduced in the 0.8.1 release. Has no effect when the actions parameter is specified. Defaults to True.
include_alignment_blstats (bool) – If True, the agent’s observation space includes the alignment information in the blstats. This is introduced in NLE 0.9.0 release. Defaults to True.
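
The obs_crop_h, obs_crop_w and obs_crop_pad parameters describe an agent-centred window. The sketch below shows one way such a crop could be computed, assuming the window is centred on the agent and that cells outside the map take the padding value. Both points are assumptions; the helper crop_around is hypothetical and is not MiniHack code.

```python
def crop_around(grid, x, y, crop_h=9, crop_w=9, pad=0):
    """Return a crop_h x crop_w window of `grid` centred on (x, y).

    `grid` is a list of rows; cells outside it are filled with `pad`.
    """
    top = y - crop_h // 2
    left = x - crop_w // 2
    rows = []
    for yy in range(top, top + crop_h):
        row = []
        for xx in range(left, left + crop_w):
            inside = 0 <= yy < len(grid) and 0 <= xx < len(grid[0])
            row.append(grid[yy][xx] if inside else pad)
        rows.append(row)
    return rows
```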

get_neighbor_descriptions(observation=None)[source]: Returns the descriptions of nine neighboring grids around the agent.

get_neighbor_wiki_pages(observation=None)[source]: Returns the page contents of the neighboring objects from NetHack wiki.

get_object_direction(name, observation=None)[source]

Find the game direction of the (first) object in the neighboring nine tiles that contains the given name in its description.

Parameters

name (str) – Name of the object.
observation (dict) – Agent observation.

Returns

The index of the direction. None if not found.

Return type

int

get_screen_description(x, y, observation=None)[source]: Returns the description of the screen on (x,y) coordinates.

get_screen_wiki_page(x, y, observation=None)[source]: Returns the wiki page matching the object on (x,y) coordinates.

key_in_inventory(name)[source]

Returns key of the given object in the inventory.

Parameters: name (str) – Name of the object.
Returns: the key of the first item in the inventory that includes the argument name as a substring. Returns None if not found.
Return type: str
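
The substring lookup described here can be sketched in a few lines. The helper first_matching_key and its input format (a list of (key, description) pairs in inventory order) are assumptions for illustration; the real method reads the inventory from the environment's observations.

```python
def first_matching_key(inventory, name):
    """Return the key of the first item whose description contains `name`.

    `inventory` is a list of (key, description) pairs in inventory order.
    Returns None if no description contains `name`.
    """
    for key, description in inventory:
        if name in description:
            return key
    return None
```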

reset(*args, sample_seed=True, **kwargs)[source]

screen_contains(name, observation=None)[source]

Whether an object with the given name is visible on the screen, i.e. included in the screen descriptions of the observation dictionary.

Parameters

name (str) – Name of the object or monster.
observation (dict) – Agent observation.

Returns

True if the name is contained on the screen, False otherwise.

Return type

bool

step(action: int)[source]

update(des_file)[source]: Update the current environment by replacing its description file.

class minihack.MiniHackNavigation(*args: Any, **kwargs: Any)[source]

Bases: nle.env.tasks.

The base class for MiniHack Navigation tasks.

Navigation tasks have the following characteristics:

Restricted action space: By default, the agent can only move in the eight compass directions.
Yes/No questions, as well as menu-selection actions, are disabled by default.
The character is set to a chaotic human male rogue.
Autopickup is enabled by default.
The maximum episode length defaults to 100 steps (can be overridden via the max_episode_steps argument).
The default goal is to reach the stair down. This can be changed using a reward manager.

__init__(*args, des_file: Optional[str] = None, **kwargs)[source]

Constructs a new MiniHack environment.

Parameters

des_file (str) – The description file for the environment.
reward_win (float) – The reward received upon successfully completing an episode. Defaults to 1.
reward_lose (float) – The reward received upon death or aborting. Defaults to 0.
obs_crop_h (int) – The height of agent-centred cropped observation. Defaults to 9.
obs_crop_w (int) – The width of agent-centred cropped observation. Defaults to 9.
obs_crop_pad (int) – The padding for agent-centred cropped observation. Defaults to 0.
reward_manager (RewardManager or None) – The reward manager that describes the custom reward function of the agent. If None, the goal of the agent is to reach the stair down. Defaults to None.
use_wiki (bool) – Whether to use the NetHack wiki. Defaults to False.
autopickup (bool) – Turning autopickup on or off. Defaults to True.
pet (bool) – Whether to include the pet. Defaults to False.
observation_keys (list) – The keys of observations returned after every timestep by the environment as a dictionary. Defaults to minihack.base.MH_DEFAULT_OBS_KEYS.
seeds (list or None) – A list of integers used as level seeds for sampling episodes. The reset() function samples a seed from this list uniformly at random and uses it for setting the level. When the sample_seed argument of the reset function is set to False, a random level will not be sampled from this list during environment resetting. If None, the entire level distribution is used. Defaults to None.
penalty_mode (str) – The name of the mode for calculating the time step penalty. Can be constant, exp, square, linear, or always. Defaults to constant. Inherited from NetHackScore.
penalty_step (float) – A constant applied to amount of frozen steps. Defaults to -0.01. Inherited from NetHackScore.
penalty_time (float) – A constant applied to amount of frozen steps. Defaults to -0.0. Inherited from NetHackScore.
save_ttyrec_every (int) – Integer, if 0, no ttyrecs (game recordings) will be saved. Otherwise, save a ttyrec every Nth episode. Defaults to 0. Inherited from NLE.
savedir (str or None) – Path to save ttyrecs (game recordings) into, if save_ttyrec_every is nonzero. If nonempty string, interpreted as a path to a new or existing directory. If “” (empty string) or None, NLE chooses a unique directory name. Defaults to None. Inherited from NLE.
character (str) – Name of character. Defaults to “mon-hum-neu-mal”. Inherited from NLE.
max_episode_steps (int) – Maximum number of steps allowed before the game is forcefully quit. In such cases, info["end_status"] will be equal to StepStatus.ABORTED. Defaults to 200. Inherited from NLE.
actions (list) – List of actions. If None, the full action space will be used, i.e. nle.nethack.ACTIONS. Defaults to MH_FULL_ACTIONS. Inherited from NLE.
wizard (bool) – Activate wizard mode. Defaults to False. Inherited from NLE.
allow_all_yn_questions (bool) – If set to True, no y/n questions in step() are declined. If set to False, only elements of SKIP_EXCEPTIONS are not declined. Defaults to True. Inherited from NLE.
allow_all_modes (bool) – If set to True, do not decline menus, text input or auto ‘MORE’. If set to False, only skip click through ‘MORE’ on death. Defaults to False. Inherited from NLE.
spawn_monsters (bool) – If False, disables normal NetHack behavior to randomly create monsters. Defaults to False. Inherited from NLE.
include_see_actions (bool) – If True, the agent’s action space includes the additional NLE actions introduced in the 0.8.1 release. Has no effect when the actions parameter is specified. Defaults to True.
include_alignment_blstats (bool) – If True, the agent’s observation space includes the alignment information in the blstats. This is introduced in NLE 0.9.0 release. Defaults to True.

class minihack.MiniHackSkill(*args: Any, **kwargs: Any)[source]

Bases: nle.env.tasks.

The base class for MiniHack Skill Acquisition tasks.

Skill acquisition tasks have the following characteristics:

The full action space is used.
Yes/No questions are enabled, but the menu-selection actions are disabled by default.
The character is set to a neutral human male caveman.
The maximum episode length defaults to 250 steps (can be overridden via the max_episode_steps argument).
The default goal is to reach the stair down. This can be changed using a reward manager.
Autopickup is disabled by default.
Inventory strings and their corresponding letters are also included as part of the agent observations.

__init__(*args, des_file, **kwargs)[source]

Constructs a new MiniHack environment.

Parameters

des_file (str) – The description file for the environment.
reward_win (float) – The reward received upon successfully completing an episode. Defaults to 1.
reward_lose (float) – The reward received upon death or aborting. Defaults to 0.
obs_crop_h (int) – The height of agent-centred cropped observation. Defaults to 9.
obs_crop_w (int) – The width of agent-centred cropped observation. Defaults to 9.
obs_crop_pad (int) – The padding for agent-centred cropped observation. Defaults to 0.
reward_manager (RewardManager or None) – The reward manager that describes the custom reward function of the agent. If None, the goal of the agent is to reach the stair down. Defaults to None.
use_wiki (bool) – Whether to use the NetHack wiki. Defaults to False.
autopickup (bool) – Turning autopickup on or off. Defaults to True.
pet (bool) – Whether to include the pet. Defaults to False.
observation_keys (list) – The keys of observations returned after every timestep by the environment as a dictionary. Defaults to minihack.base.MH_DEFAULT_OBS_KEYS.
seeds (list or None) – A list of integers used as level seeds for sampling episodes. The reset() function samples a seed from this list uniformly at random and uses it for setting the level. When the sample_seed argument of the reset function is set to False, a random level will not be sampled from this list during environment resetting. If None, the entire level distribution is used. Defaults to None.
penalty_mode (str) – The name of the mode for calculating the time step penalty. Can be constant, exp, square, linear, or always. Defaults to constant. Inherited from NetHackScore.
penalty_step (float) – A constant applied to amount of frozen steps. Defaults to -0.01. Inherited from NetHackScore.
penalty_time (float) – A constant applied to amount of frozen steps. Defaults to -0.0. Inherited from NetHackScore.
save_ttyrec_every (int) – Integer, if 0, no ttyrecs (game recordings) will be saved. Otherwise, save a ttyrec every Nth episode. Defaults to 0. Inherited from NLE.
savedir (str or None) – Path to save ttyrecs (game recordings) into, if save_ttyrec_every is nonzero. If nonempty string, interpreted as a path to a new or existing directory. If “” (empty string) or None, NLE chooses a unique directory name. Defaults to None. Inherited from NLE.
character (str) – Name of character. Defaults to “mon-hum-neu-mal”. Inherited from NLE.
max_episode_steps (int) – Maximum number of steps allowed before the game is forcefully quit. In such cases, info["end_status"] will be equal to StepStatus.ABORTED. Defaults to 200. Inherited from NLE.
actions (list) – List of actions. If None, the full action space will be used, i.e. nle.nethack.ACTIONS. Defaults to MH_FULL_ACTIONS. Inherited from NLE.
wizard (bool) – Activate wizard mode. Defaults to False. Inherited from NLE.
allow_all_yn_questions (bool) – If set to True, no y/n questions in step() are declined. If set to False, only elements of SKIP_EXCEPTIONS are not declined. Defaults to True. Inherited from NLE.
allow_all_modes (bool) – If set to True, do not decline menus, text input or auto ‘MORE’. If set to False, only skip click through ‘MORE’ on death. Defaults to False. Inherited from NLE.
spawn_monsters (bool) – If False, disables normal NetHack behavior to randomly create monsters. Defaults to False. Inherited from NLE.
include_see_actions (bool) – If True, the agent’s action space includes the additional NLE actions introduced in the 0.8.1 release. Has no effect when the actions parameter is specified. Defaults to True.
include_alignment_blstats (bool) – If True, the agent’s observation space includes the alignment information in the blstats. This is introduced in NLE 0.9.0 release. Defaults to True.

class minihack.NetHackWiki(raw_wiki_file_name: str, processed_wiki_file_name: str, save_processed_json: bool = True, ignore_inpage_anchors: bool = True, preprocess_input: bool = True, exceptions: Optional[tuple] = None)[source]

Bases: object

A class representing Nethack Wiki Data - pages and links between them.

Parameters

raw_wiki_file_name (str) – The path to the raw file of NetHack wiki. The raw file can be downloaded using the get_nhwiki_data.sh script located in minihack/scripts.
processed_wiki_file_name (str) – The path to the processed file of NetHack wiki. The processing is performed in the __init__ function of this class.
save_processed_json (bool) – Whether to save the processed json file of the wiki. Only considered when a raw wiki file is passed. Defaults to True.
ignore_inpage_anchors (bool) – Whether to ignore in-page anchors. Defaults to True.
preprocess_input (bool) – Whether to perform preprocessing on wiki data. Defaults to True.
exceptions (Tuple[str] or None) – Names of entities in screen descriptions that are ignored. If None, there are no exceptions. Defaults to None.

__init__(raw_wiki_file_name: str, processed_wiki_file_name: str, save_processed_json: bool = True, ignore_inpage_anchors: bool = True, preprocess_input: bool = True, exceptions: Optional[tuple] = None) → None[source]: Initialize self. See help(type(self)) for accurate signature.

get_page_data(page: str) → dict[source]

Get the data of a page.

Parameters: page (str) – The page name.
Returns: The page data as a dict.
Return type: dict

get_page_text(page: str) → str[source]

Get the text of a page.

Parameters: page (str) – The page name.
Returns: The text of the page.
Return type: str

class minihack.RewardManager[source]

Bases: minihack.reward_manager.AbstractRewardManager

This class is used for managing rewards, events and termination for MiniHack tasks.

Some notes on the ordering of calls in the MiniHack/NetHack base class:

step(action) is called on the environment.
Within step, a copy of the last observation is made first, and then the underlying NetHack game is stepped.
Then _is_episode_end(observation) is called to check whether the episode has ended. This is overridden if we have gone over max_steps, or if the underlying NetHack game says we are done (i.e. we died).
Then _reward_fn(last_observation, observation) is called to calculate the reward at this time-step.
If end_status tells us the game is done, we quit the game.
Then step returns the observation, calculated reward, done, and some statistics.

All this means that we need to check whether an observation is terminal in _is_episode_end before we’re calculating the reward function.

The call of _is_episode_end in MiniHack will call check_episode_end_call in this class, which checks for termination and accumulates any reward, which is returned and zeroed in collect_reward.
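
The order of calls described above can be mimicked with a toy stand-in. This is a sketch only: ToyRewardManager and toy_step are hypothetical, the "goal" observation is an invented termination rule, and the real environment passes itself and real observations instead of None and strings.

```python
class ToyRewardManager:
    """Stand-in that accumulates reward during the termination check."""

    def __init__(self):
        self._reward = 0.0

    def check_episode_end_call(self, env, previous_observation, action, observation):
        # Toy rule: reaching the "goal" observation gives reward 1 and ends
        # the episode. Reward is accumulated here, not returned.
        if observation == "goal":
            self._reward += 1.0
            return True
        return False

    def collect_reward(self):
        # Return the accumulated reward, then zero it.
        reward, self._reward = self._reward, 0.0
        return reward


def toy_step(manager, last_observation, action, observation):
    """Termination is checked first; the reward is collected afterwards."""
    done = manager.check_episode_end_call(None, last_observation, action, observation)
    reward = manager.collect_reward()
    return observation, reward, done
```

The sketch shows why the terminal check has to run before the reward is computed: the reward only exists once check_episode_end_call has accumulated it.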

__init__()[source]: Initialize self. See help(type(self)) for accurate signature.

add_amulet_event(reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add event which is triggered when an amulet is worn.

Parameters

reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.
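
The terminal_required and terminal_sufficient flags appear on every event method below. One plausible reading of them is sketched here: an episode is complete as soon as any sufficient event has been achieved, or once every required event has been achieved. This reading is an assumption and is not confirmed by this reference; ToyEvent and episode_complete are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ToyEvent:
    achieved: bool = False
    terminal_required: bool = True
    terminal_sufficient: bool = False


def episode_complete(events):
    """True if a sufficient event is achieved or no required event is pending."""
    all_required_done = True
    for event in events:
        if event.achieved and event.terminal_sufficient:
            return True  # this event alone is enough
        if event.terminal_required and not event.achieved:
            all_required_done = False  # still waiting on this one
    return all_required_done
```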

add_coordinate_event(coordinates: Tuple[int, int], reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add event which is triggered when reaching the specified coordinates.

Parameters

coordinates (Tuple[int, int]) – The coordinates to be reached (tuple of ints).
reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.

add_custom_reward_fn(reward_fn: Callable[[MiniHack, Any, int, Any], float]) → None[source]

Add a custom reward function which is called after every step to calculate reward.

The function should be a callable which takes the environment, previous observation, action and current observation and returns a float reward.

Parameters: reward_fn (Callable[[MiniHack, Any, int, Any], float]) – A reward function which takes an environment, previous observation, action, next observation and returns a reward.
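
A minimal sketch of a function with the required shape follows. The factory make_step_cost and the default cost value are illustrative assumptions; only the four-argument signature and the float return value come from this reference.

```python
def make_step_cost(cost=0.01):
    """Build a reward function that charges a fixed cost on every step."""

    def reward_fn(env, previous_observation, action, observation):
        # The arguments are unused here; a real reward function would
        # typically compare previous_observation with observation.
        return -float(cost)

    return reward_fn
```

The result can then be registered with add_custom_reward_fn(make_step_cost(0.01)) on a RewardManager instance.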

add_eat_event(name: str, reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add an event which is triggered when name is eaten.

Parameters

name (str) – The name of the object being eaten.
reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.

add_event(event: minihack.reward_manager.Event)[source]

Add an event to be managed by the reward manager.

Parameters: event (Event) – The event to be added.

add_kill_event(name: str, reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add event which is triggered when a specified monster is killed.

Parameters

name (str) – The name of the monster to be killed.
reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.

add_location_event(location: str, reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add event which is triggered on reaching a specified location.

Parameters

location (str) – The name of the location to be reached.
reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.

add_message_event(msgs: List[str], reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add event which is triggered when any of the given messages are seen.

Parameters

msgs (List[str]) – The list of messages, any one of which triggers the event when seen.
reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.

add_positional_event(place_name: str, action_name: str, reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add event which is triggered on taking a given action at a given place.

Parameters

place_name (str) – The name of the place to trigger the event.
action_name (str) – The name of the action to trigger the event.
reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.

add_wear_event(name: str, reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add event which is triggered when a specific armor is worn.

Parameters

name (str) – The name of the armor to be worn.
reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.

add_wield_event(name: str, reward=1, repeatable=False, terminal_required=True, terminal_sufficient=False)[source]

Add event which is triggered when a specific weapon is wielded.

Parameters

name (str) – The name of the weapon to be wielded.
reward (float) – The reward for this event. Defaults to 1.
repeatable (bool) – Whether this event can be triggered multiple times. Defaults to False.
terminal_required (bool) – Whether this event is required for termination. Defaults to True.
terminal_sufficient (bool) – Whether this event is sufficient for termination. Defaults to False.
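
The event methods above are typically combined on a single manager. The sketch below registers three events using the documented signatures. The helper configure_rewards, the object and monster names, the message text and the reward values are all illustrative assumptions.

```python
def configure_rewards(manager):
    """Register a few example events on a RewardManager-like object."""
    # Eating an apple is enough on its own to end the episode.
    manager.add_eat_event("apple", reward=1, terminal_sufficient=True)
    # Killing a jackal is rewarded but not needed for termination.
    manager.add_kill_event("jackal", reward=2, terminal_required=False)
    # A message-based event that may fire several times per episode.
    manager.add_message_event(
        ["You feel better."],  # hypothetical message text
        reward=0.5,
        repeatable=True,
        terminal_required=False,
    )
    return manager
```

With MiniHack installed, the configured manager would be passed to an environment through the reward_manager argument documented for MiniHack.__init__.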

check_episode_end_call(env, previous_observation, action, observation) → bool[source]

Check if the task has ended, and accumulate any reward from the transition in self._reward.

Parameters

env (MiniHack) – The MiniHack environment in question.
previous_observation (tuple) – The previous state observation.
action (int) – The action taken.
observation (tuple) – The current observation.

Returns

Boolean whether the episode has ended.

Return type

bool

collect_reward() → float[source]

Return reward calculated and accumulated in check_episode_end_call, and then reset it.

Returns: The reward.
Return type: float

reset()[source]: Reset all events, to be called when a new episode occurs.
