{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Multi-Agent Environments\n",
    "\n",
    "Two multiagent environments are given in the package:\n",
    "\n",
    "* [GeneralSatelliteTasking](../api_reference/index.rst#bsk_rl.GeneralSatelliteTasking), \n",
    "  a [Gymnasium](https://gymnasium.farama.org)-based environment and the basis for all other environments.\n",
    "* [ConstellationTasking](../api_reference/index.rst#bsk_rl.ConstellationTasking), which\n",
    "  implements the [PettingZoo parallel API](https://pettingzoo.farama.org/api/parallel/).\n",
    "\n",
    "The latter is preferable for multi-agent RL (MARL) settings, as most algorithms are designed\n",
    "for this kind of API.\n",
    "\n",
    "## Configuring the Environment\n",
    "\n",
    "For this example, a multisatellite target imaging environment will be used. The goal is\n",
    "to maximize the value of unique images taken.\n",
    "\n",
    "As usual, the satellite type is defined first."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bsk_rl import sats, act, obs, scene, data, comm\n",
    "from bsk_rl.sim import dyn, fsw\n",
    "\n",
    "class ImagingSatellite(sats.ImagingSatellite):\n",
    "    observation_spec = [\n",
    "        obs.OpportunityProperties(\n",
    "            dict(prop=\"priority\"), \n",
    "            dict(prop=\"opportunity_open\", norm=5700.0),\n",
    "            n_ahead_observe=10,\n",
    "        )\n",
    "    ]\n",
    "    action_spec = [act.Image(n_ahead_image=10)]\n",
    "    dyn_type = dyn.FullFeaturedDynModel\n",
    "    fsw_type = fsw.SteeringImagerFSWModel"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Satellite properties are set to give the satellite near-unlimited power and storage resources. To randomize some parameters in a correlated manner across satellites, a ``sat_arg_randomizer`` is set and passed to the environment. In this case, the satellites are distributed in a trivial single-plane Walker-delta constellation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "from bsk_rl.utils.orbital import walker_delta_args\n",
    "\n",
    "sat_args = dict(\n",
    "    imageAttErrorRequirement=0.01,\n",
    "    imageRateErrorRequirement=0.01,\n",
    "    batteryStorageCapacity=1e9,\n",
    "    storedCharge_Init=1e9,\n",
    "    dataStorageCapacity=1e12,\n",
    "    u_max=0.4,\n",
    "    K1=0.25,\n",
    "    K3=3.0,\n",
    "    omega_max=0.087,\n",
    "    servo_Ki=5.0,\n",
    "    servo_P=150 / 5,\n",
    ")\n",
    "sat_arg_randomizer = walker_delta_args(altitude=800.0, inc=60.0, n_planes=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gym API\n",
    "\n",
    "GeneralSatelliteTasking uses tuples of actions and observations to interact with the\n",
    "environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bsk_rl import GeneralSatelliteTasking\n",
    "\n",
    "env = GeneralSatelliteTasking(\n",
    "    satellites=[\n",
    "        ImagingSatellite(\"EO-1\", sat_args),\n",
    "        ImagingSatellite(\"EO-2\", sat_args),\n",
    "        ImagingSatellite(\"EO-3\", sat_args),\n",
    "    ],\n",
    "    scenario=scene.UniformTargets(1000),\n",
    "    rewarder=data.UniqueImageReward(),\n",
    "    communicator=comm.LOSCommunication(),  # Note that dyn must inherit from LOSCommunication\n",
    "    sat_arg_randomizer=sat_arg_randomizer,\n",
    "    log_level=\"INFO\",\n",
    ")\n",
    "env.reset()\n",
    "\n",
    "env.observation_space"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.action_space"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consequently, actions are passed as a tuple. The step will stop the first time any\n",
    "satellite completes an action."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "observation, reward, terminated, truncated, info = env.step([7, 9, 8])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "observation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At this point, either every satellite can be retasked, or satellites can continue their\n",
    "previous action by passing `None` as the action. To see which satellites must be\n",
    "retasked (i.e. their previous action is done and they have nothing more to do), look at\n",
    "`\"requires_retasking\"` in each satellite's info."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "info"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Based on this list, we decide here to only retask the satellite that needs it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "actions = [0 if info[sat.name][\"requires_retasking\"] else None for sat in env.unwrapped.satellites]\n",
    "actions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "observation, reward, terminated, truncated, info = env.step(actions)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this environment, the environment will stop if any agent dies. To demonstrate this,\n",
    "one satellite is forcibly killed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from Basilisk.architecture import messaging\n",
    "\n",
    "def isnt_alive(log_failure=False):\n",
    "    \"\"\"Mock satellite 0 dying.\"\"\"\n",
    "    self = env.unwrapped.satellites[0]\n",
    "    death_message = messaging.PowerStorageStatusMsgPayload()\n",
    "    death_message.storageLevel = 0.0\n",
    "    self.dynamics.powerMonitor.batPowerOutMsg.write(death_message)\n",
    "    return self.dynamics.is_alive(log_failure=log_failure) and self.fsw.is_alive(\n",
    "        log_failure=log_failure\n",
    "    )\n",
    "\n",
    "env.unwrapped.satellites[0].is_alive = isnt_alive\n",
    "observation, reward, terminated, truncated, info = env.step([6, 7, 9])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## PettingZoo API\n",
    "\n",
    "The [PettingZoo parallel API](https://pettingzoo.farama.org/api/parallel/) environment, \n",
    "ConstellationTasking, is largely the same as GeneralSatelliteTasking. See their\n",
    "documentation for a full description of the API. It tends to separate things into\n",
    "dictionaries keyed by agent, rather than tuples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bsk_rl import ConstellationTasking\n",
    "\n",
    "env = ConstellationTasking(\n",
    "    satellites=[\n",
    "        ImagingSatellite(\"EO-1\", sat_args),\n",
    "        ImagingSatellite(\"EO-2\", sat_args),\n",
    "        ImagingSatellite(\"EO-3\", sat_args),\n",
    "    ],\n",
    "    scenario=scene.UniformTargets(1000),\n",
    "    rewarder=data.UniqueImageReward(),\n",
    "    communicator=comm.LOSCommunication(),  # Note that dyn must inherit from LOSCommunication\n",
    "    sat_arg_randomizer=sat_arg_randomizer,\n",
    "    log_level=\"INFO\",\n",
    ")\n",
    "env.reset()\n",
    "\n",
    "env.observation_spaces"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.action_spaces"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Actions are passed as a dictionary; the agent names can be accessed through the `agents`\n",
    "property."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "observation, reward, terminated, truncated, info = env.step(\n",
    "    {\n",
    "        env.agents[0]: 7,\n",
    "        env.agents[1]: 9,\n",
    "        env.agents[2]: 8,\n",
    "    }\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "observation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Other than compatibility with MARL algorithms, the main benefit of the PettingZoo API\n",
    "is that it allows for individual agents to fail without terminating the entire environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Immediately kill satellite 0\n",
    "env.unwrapped.satellites[0].is_alive = isnt_alive\n",
    "env.agents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "observation, reward, terminated, truncated, info = env.step({\n",
    "        env.agents[0]: 7,\n",
    "        env.agents[1]: 9,\n",
    "    }\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv_refactor",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}