minihack.skills module

class minihack.skills.MiniHackSkill(*args: Any, **kwargs: Any)

Bases: nle.env.tasks.

The base class for MiniHack Skill Acquisition tasks.

Skill acquisition tasks have the following characteristics:

  • The full action space is used.

  • Yes/No questions are enabled, but the menu-selection actions are disabled by default.

  • The character is set to a neutral human male caveman.

  • The maximum episode length defaults to 250 steps (can be overridden via the max_episode_steps argument).

  • The default goal is to reach the stair down. This can be changed using a reward manager.

  • Auto-pick is disabled by default.

  • Inventory strings and their corresponding letters are also included in the agent observations.
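The defaults listed above can be overridden through keyword arguments. A minimal sketch, assuming MiniHack is installed and that you supply your own level description: the helper name make_skill_env and the override values are hypothetical, while des_file, max_episode_steps and autopickup are the parameter names documented in this section.

```python
def make_skill_env(des_file, **overrides):
    """Build a MiniHackSkill environment, forwarding keyword overrides.

    `des_file` is the level description required by the constructor;
    any other keyword replaces the corresponding task default.
    """
    # Imported lazily so this helper can be defined even where MiniHack
    # is not installed; the import only runs when an environment is built.
    from minihack.skills import MiniHackSkill

    return MiniHackSkill(des_file=des_file, **overrides)


# Hypothetical overrides: a longer episode limit and autopickup switched on.
OVERRIDES = {"max_episode_steps": 500, "autopickup": True}

# Usage (commented out because it needs MiniHack and a real description file):
# env = make_skill_env(my_des_file, **OVERRIDES)
```

Keeping the overrides in a plain dictionary makes it easy to see which task defaults a given experiment changes.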

__init__(*args, des_file, **kwargs)

Constructs a new MiniHack environment.

Parameters
  • des_file (str) – The description file for the environment.

  • reward_win (float) – The reward received upon successfully completing an episode. Defaults to 1.

  • reward_lose (float) – The reward received upon death or aborting. Defaults to 0.

  • obs_crop_h (int) – The height of the agent-centred cropped observation. Defaults to 9.

  • obs_crop_w (int) – The width of the agent-centred cropped observation. Defaults to 9.

  • obs_crop_pad (int) – The padding for the agent-centred cropped observation. Defaults to 0.

  • reward_manager (RewardManager or None) – The reward manager that describes the custom reward function of the agent. If None, the goal of the agent is to reach the stair down. Defaults to None.

  • use_wiki (bool) – Whether to use the NetHack wiki. Defaults to False.

  • autopickup (bool) – Turning autopickup on or off. Defaults to True.

  • observation_keys (list) – The keys of observations returned after every timestep by the environment as a dictionary. Defaults to minihack.base.MH_DEFAULT_OBS_KEYS.

  • seeds (list or None) – A list of random seeds for sampling episodes. If None, the entire level distribution is used. Defaults to None.

  • penalty_mode (str) – The name of the mode for calculating the time step penalty. Can be constant, exp, square, linear, or always. Defaults to constant. Inherited from NetHackScore.

  • penalty_step (float) – A constant applied to the number of frozen steps. Defaults to -0.01. Inherited from NetHackScore.

  • penalty_time (float) – A constant applied to the amount of in-game time that elapses during a step. Defaults to -0.0. Inherited from NetHackScore.

  • savedir (str or None) – The path to save ttyrecs (game recordings) into. Defaults to None, which doesn’t save any data. Otherwise, interpreted as a path to a new or existing directory. If “” (empty string), NLE chooses a unique directory name. Inherited from NLE.

  • character (str) – The name of the character. Defaults to “mon-hum-neu-mal”. Inherited from NLE.

  • max_episode_steps (int) – The maximum number of steps allowed before the game is forcefully quit. In such cases, info["end_status"] will be equal to StepStatus.ABORTED. Defaults to 5000. Inherited from NLE.

  • actions (list) – A list of actions. If None, the full action space will be used, i.e. nle.nethack.ACTIONS. Defaults to None. Inherited from NLE.

  • wizard (bool) – Whether to activate wizard mode. Defaults to False.

  • allow_all_yn_questions (bool) – If set to True, no y/n questions in step() are declined. If set to False, only elements of SKIP_EXCEPTIONS are not declined. Defaults to False. Inherited from NLE.

  • allow_all_modes (bool) – If set to True, do not decline menus, text input or auto ‘MORE’. If set to False, only skip click through ‘MORE’ on death. Inherited from NLE.
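To make the obs_crop_h, obs_crop_w and obs_crop_pad parameters above more concrete, here is a rough pure-Python sketch of an agent-centred crop. It assumes that obs_crop_pad is the fill value for cells outside the map and that the window is centred on the agent; neither is confirmed by this page, and the function name crop_around_agent and the toy map are hypothetical, not MiniHack code.

```python
def crop_around_agent(grid, agent_row, agent_col, crop_h=9, crop_w=9, pad=0):
    """Return a crop_h x crop_w window of `grid` centred on the agent.

    Cells of the window that fall outside `grid` are filled with `pad`
    (assumed here to play the role of obs_crop_pad).
    """
    half_h, half_w = crop_h // 2, crop_w // 2
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0

    window = []
    for r in range(agent_row - half_h, agent_row - half_h + crop_h):
        row = []
        for c in range(agent_col - half_w, agent_col - half_w + crop_w):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            row.append(grid[r][c] if inside else pad)
        window.append(row)
    return window


# A 4x5 toy map whose cell values encode their own coordinates.
toy_map = [[10 * r + c for c in range(5)] for r in range(4)]

# The agent stands in the top-left corner, so part of the window is padding.
view = crop_around_agent(toy_map, agent_row=0, agent_col=0,
                         crop_h=3, crop_w=3, pad=-1)
# view == [[-1, -1, -1], [-1, 0, 1], [-1, 10, 11]]
```

A fixed-size window keeps the observation shape constant no matter where the agent stands, which is why out-of-map cells need a fill value at all.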