minihack.skills module
- class minihack.skills.MiniHackSkill(*args: Any, **kwargs: Any)
Bases: nle.env.tasks.
The base class for MiniHack Skill Acquisition tasks.
Skill acquisition tasks have the following characteristics:
The full action space is used.
Yes/No questions are enabled, but the menu-selection actions are disabled by default.
The character is set to a neutral human male caveman.
The maximum episode length defaults to 250 steps (can be overridden via the max_episode_steps argument).
The default goal is to reach the stair down. This can be changed using a reward manager.
Auto-pick is disabled by default.
Inventory strings and their corresponding letters are also included as part of the agent observations.
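The defaults listed above can be pictured as keyword arguments that are filled in only when the caller has not supplied them. The sketch below is a minimal illustration of that idea, not MiniHack's actual code: SKILL_DEFAULTS and resolve_skill_kwargs are hypothetical names, and the character string "cav-hum-neu-mal" is an assumption based on NLE's role-race-alignment-gender naming format.

```python
# Hypothetical illustration of the documented skill-task defaults.
# None of these names are part of the MiniHack API.
SKILL_DEFAULTS = {
    "autopickup": False,             # auto-pick is disabled by default
    "allow_all_yn_questions": True,  # yes/no questions are enabled
    "allow_all_modes": False,        # menu-selection actions are disabled
    "character": "cav-hum-neu-mal",  # assumed code for a neutral human male caveman
    "max_episode_steps": 250,        # default episode limit
}


def resolve_skill_kwargs(**user_kwargs):
    """Return the defaults, with caller-supplied values taking precedence."""
    resolved = dict(SKILL_DEFAULTS)  # copy so the defaults are never mutated
    resolved.update(user_kwargs)
    return resolved


if __name__ == "__main__":
    # Override only the episode limit; every other default is kept.
    print(resolve_skill_kwargs(max_episode_steps=1000))
```

Under this reading, passing max_episode_steps=1000 changes only the episode limit and leaves the other defaults in place.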
- __init__(*args, des_file, **kwargs)
Constructs a new MiniHack environment.
- Parameters
des_file (str) – The description file for the environment.
reward_win (float) – The reward received upon successfully completing an episode. Defaults to 1.
reward_lose (float) – The reward received upon death or aborting. Defaults to 1.
obs_crop_h (int) – The height of agent-centred cropped observation. Defaults to 9.
obs_crop_w (int) – The width of agent-centred cropped observation. Defaults to 9.
obs_crop_pad (int) – The padding for agent-centred cropped observation. Defaults to 0.
reward_manager (RewardManager or None) – The reward manager that describes the custom reward function of the agent. If None, the goal of the agent is to reach the stair down. Defaults to None.
use_wiki (bool) – Whether to use the NetHack wiki. Defaults to False.
autopickup (bool) – Turning autopickup on or off. Defaults to True.
pet (bool) – Whether to include the pet. Defaults to False.
observation_keys (list) – The keys of observations returned after every timestep by the environment as a dictionary. Defaults to minihack.base.MH_DEFAULT_OBS_KEYS.
seeds (list or None) – A list of random seeds for sampling episodes. If None, the entire level distribution is used. Defaults to None.
penalty_mode (str) – The name of the mode for calculating the time step penalty. Can be constant, exp, square, linear, or always. Defaults to constant. Inherited from NetHackScore.
penalty_step (float) – A constant applied to the amount of frozen steps. Defaults to -0.01. Inherited from NetHackScore.
penalty_time (float) – A constant applied to the amount of frozen steps. Defaults to -0.0. Inherited from NetHackScore.
savedir (str or None) – path to save ttyrecs (game recordings) into. Defaults to None, which doesn’t save any data. Otherwise, interpreted as a path to a new or existing directory. If “” (empty string), NLE chooses a unique directory name. Inherited from NLE.
character (str) – Name of character. Defaults to “mon-hum-neu-mal”. Inherited from NLE.
max_episode_steps (int) – maximum amount of steps allowed before the game is forcefully quit. In such cases, info["end_status"] will be equal to StepStatus.ABORTED. Defaults to 5000. Inherited from NLE.
actions (list) – list of actions. If None, the full action space will be used, i.e. nle.nethack.ACTIONS. Defaults to None. Inherited from NLE.
wizard (bool) – activate wizard mode. Defaults to False.
allow_all_yn_questions (bool) – If set to True, no y/n questions in step() are declined. If set to False, only elements of SKIP_EXCEPTIONS are not declined. Defaults to False. Inherited from NLE.
allow_all_modes (bool) – If set to True, do not decline menus, text input or auto ‘MORE’. If set to False, only skip click through ‘MORE’ on death. Inherited from NLE.
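To make the penalty_mode options more concrete, the sketch below shows one plausible way a per-step penalty could be derived from the number of consecutive frozen steps (steps in which in-game time did not advance). It is an illustration inferred from the mode names, not a copy of the NetHackScore implementation: the function name time_penalty and the exact formulas are assumptions.

```python
def time_penalty(frozen_steps, mode="constant", penalty_step=-0.01):
    """Illustrative per-step penalty for a given count of frozen steps.

    The formulas are assumptions inferred from the mode names,
    not taken from the NLE source.
    """
    if mode == "constant":
        # Flat penalty, applied only while the game is frozen.
        return penalty_step if frozen_steps > 0 else 0.0
    if mode == "exp":
        # Penalty doubles with every additional frozen step.
        return (2 ** frozen_steps) * penalty_step
    if mode == "square":
        # Penalty grows with the square of the frozen-step count.
        return (frozen_steps ** 2) * penalty_step
    if mode == "linear":
        # Penalty grows proportionally to the frozen-step count.
        return frozen_steps * penalty_step
    if mode == "always":
        # Flat penalty on every step, frozen or not.
        return penalty_step
    raise ValueError(f"Unknown penalty_mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("constant", "exp", "square", "linear", "always"):
        print(mode, time_penalty(3, mode=mode))
```

In this reading, constant and always differ only in whether an unfrozen step is penalised, while exp, square and linear scale the penalty with how long the agent has been stuck.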