Getting Started

This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).

Load Modules

In this tutorial, the environment is created with gym.make, so gym must be imported along with the top-level bsk_rl package and the bsk_rl components used to configure the satellite.

[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw

from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)

If no errors were raised, you have a functional installation of bsk_rl.

Configure the Satellite

Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.

[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel

Based on this class specification, a list of configurable parameters for the satellite can be generated.

[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
 'maxCounterValue': 4,
 'thrMinFireTime': 0.02,
 'desatAttitude': 'sun',
 'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
 'thrForceSign': 1,
 'K': 7.0,
 'Ki': -1,
 'P': 35.0,
 'imageAttErrorRequirement': 0.01,
 'imageRateErrorRequirement': None,
 'inst_pHat_B': [0, 0, 1],
 'batteryStorageCapacity': 288000.0,
 'storedCharge_Init': <function bsk_rl.sim.dyn.BasicDynamicsModel.<lambda>()>,
 'disturbance_vector': None,
 'dragCoeff': 2.2,
 'imageTargetMaximumRange': -1,
 'instrumentBaudRate': 8000000.0,
 'instrumentPowerDraw': -30.0,
 'basePowerDraw': 0.0,
 'wheelSpeeds': <function bsk_rl.sim.dyn.BasicDynamicsModel.<lambda>()>,
 'maxWheelSpeed': inf,
 'u_max': 0.2,
 'rwBasePower': 0.4,
 'rwMechToElecEfficiency': 0.0,
 'rwElecToMechEfficiency': 0.5,
 'panelArea': 1.0,
 'panelEfficiency': 0.2,
 'nHat_B': array([0, 1, 0]),
 'mass': 330,
 'width': 1.38,
 'depth': 1.04,
 'height': 1.58,
 'sigma_init': <function bsk_rl.sim.dyn.BasicDynamicsModel.<lambda>()>,
 'omega_init': <function bsk_rl.sim.dyn.BasicDynamicsModel.<lambda>()>,
 'rN': None,
 'vN': None,
 'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = 45.0, alt: float = 500, r_body: float = 6371, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = 0, f: Optional[float] = None) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
 'mu': 398600436000000.0,
 'dataStorageCapacity': 160000000.0,
 'storageUnitValidCheck': False,
 'storageInit': 0,
 'thrusterPowerDraw': 0.0,
 'transmitterBaudRate': -8000000.0,
 'transmitterNumBuffers': 100,
 'transmitterPowerDraw': -15.0}

When instantiating a satellite, these parameters can be overridden through the sat_args dictionary, either with a constant or with a function that is called to rerandomize the value every time the environment is reset.

[4]:
sat_args = {}

# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0

# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10

# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
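
The difference between a constant and a rerandomized entry can be pictured with a small stand-alone sketch. The helper resolve_sat_args is hypothetical and is not part of bsk_rl; it only illustrates the idea that callable values are evaluated again on each reset while other values pass through unchanged. The standard library's random module stands in for NumPy here to keep the sketch dependency-free.

```python
import random


def resolve_sat_args(sat_args):
    """Hypothetical helper (not part of bsk_rl): call callables, keep constants."""
    return {
        key: value() if callable(value) else value
        for key, value in sat_args.items()
    }


sat_args = {
    "dataStorageCapacity": 1e10,  # constant: identical in every episode
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # redrawn each time
}

# Each call stands in for one environment reset.
for episode in range(3):
    resolved = resolve_sat_args(sat_args)
    print(episode, resolved["dataStorageCapacity"], f"{resolved['storageInit']:.3e}")
```

The capacity printed is the same on every line, while the initial storage level differs between lines but always stays between 25% and 75% of the capacity.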

Making the Environment

For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.

[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2024-05-30 12:06:15,586 gym                            INFO       Calling env.reset() to get observation space
2024-05-30 12:06:15,587 gym                            INFO       Resetting environment with seed=907950944
2024-05-30 12:06:15,669 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2024-05-30 12:06:15,683 gym                            INFO       <0.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:15,683 gym                            INFO       <0.00> Environment reset
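
The comment on time_limit describes 5700 seconds as approximately one orbit. A quick check, assuming a circular two-body orbit at the default values listed by default_sat_args (alt of 500 km, r_body of 6371 km, and the listed mu), gives a period close to that figure. This is an illustrative calculation, not something the environment computes in this form.

```python
import math

# Values taken from the default_sat_args output; units are an assumption
# based on their magnitudes (mu in m^3/s^2, distances in km).
mu = 398600436000000.0
r_body_km = 6371.0
alt_km = 500.0

# Semi-major axis of a circular orbit, converted to meters.
a = (r_body_km + alt_km) * 1e3

# Kepler's third law for a two-body orbit: T = 2*pi*sqrt(a^3 / mu).
period = 2 * math.pi * math.sqrt(a**3 / mu)
print(f"Orbital period: {period:.0f} s")
```

The result is roughly 5670 seconds, slightly under the 5700-second limit.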

Interacting with the Environment

First, the environment is reset.

[6]:
observation, info = env.reset(seed=1)
2024-05-30 12:06:15,787 gym                            INFO       Resetting environment with seed=1
2024-05-30 12:06:15,955 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2024-05-30 12:06:15,967 gym                            INFO       <0.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:15,967 gym                            INFO       <0.00> Environment reset

Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude in the nadir-pointing mode so that the imaging conditions are satisfied. Note that the logs show little or no reward in the first two steps (0 and 30) while the satellite settles, but the third step earns a reward of 60, corresponding to 60 seconds of imaging.

[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print("  Final data level:", observation[0])
2024-05-30 12:06:15,971 gym                            INFO       <0.00> === STARTING STEP ===
2024-05-30 12:06:15,971 sats.satellite.EO1             INFO       <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2024-05-30 12:06:15,971 sats.satellite.EO1             INFO       <0.00> EO1: setting timed terminal event at 60.0
2024-05-30 12:06:15,972 sim.simulator                  INFO       <0.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:15,975 sats.satellite.EO1             INFO       <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2024-05-30 12:06:15,976 data.base                      INFO       <60.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:15,976 gym                            INFO       <60.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:15,977 gym                            INFO       <60.00> Step reward: 0.0
2024-05-30 12:06:15,977 gym                            INFO       <60.00> === STARTING STEP ===
2024-05-30 12:06:15,977 sats.satellite.EO1             INFO       <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2024-05-30 12:06:15,977 sats.satellite.EO1             INFO       <60.00> EO1: setting timed terminal event at 120.0
2024-05-30 12:06:15,978 sim.simulator                  INFO       <60.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:15,981 sats.satellite.EO1             INFO       <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2024-05-30 12:06:15,981 data.base                      INFO       <120.00> Data reward: {'EO1_11826041840': 30.0}
2024-05-30 12:06:15,982 gym                            INFO       <120.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:15,982 gym                            INFO       <120.00> Step reward: 30.0
2024-05-30 12:06:15,982 gym                            INFO       <120.00> === STARTING STEP ===
2024-05-30 12:06:15,983 sats.satellite.EO1             INFO       <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2024-05-30 12:06:15,983 sats.satellite.EO1             INFO       <120.00> EO1: setting timed terminal event at 180.0
2024-05-30 12:06:15,983 sim.simulator                  INFO       <120.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:15,987 sats.satellite.EO1             INFO       <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2024-05-30 12:06:15,987 data.base                      INFO       <180.00> Data reward: {'EO1_11826041840': 60.0}
2024-05-30 12:06:15,987 gym                            INFO       <180.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:15,988 gym                            INFO       <180.00> Step reward: 60.0
Initial data level: 0.7341307878 (randomized by sat_args)
  Final data level: 0.8241307878

The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
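
The size of the increase can be checked by hand. The sketch below assumes that the reward equals the number of seconds spent imaging and that data accrues at instrumentBaudRate only during those seconds; both are readings of the logs above rather than documented guarantees.

```python
# Values set in sat_args earlier in this tutorial.
instrument_baud_rate = 1e7  # instrumentBaudRate
data_storage_capacity = 1e10  # dataStorageCapacity

# Step rewards reported in the logs for the three scan actions.
step_rewards = [0.0, 30.0, 60.0]
imaging_seconds = sum(step_rewards)

# Expected change in storage_level_fraction.
expected_increase = imaging_seconds * instrument_baud_rate / data_storage_capacity

# Change between the two printed data levels.
observed_increase = 0.8241307878 - 0.7341307878

print(f"expected: {expected_increase:.4f}, observed: {observed_increase:.4f}")
```

Both values come out to 0.09, so the 90 seconds of rewarded imaging account for the whole change in the first observation element.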

Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.

[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(f"Charge level: {observation[1]:.3f} ({env.unwrapped.simulator.sim_time:.1f} seconds)\n\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2024-05-30 12:06:15,991 gym                            INFO       <180.00> === STARTING STEP ===
2024-05-30 12:06:15,991 sats.satellite.EO1             INFO       <180.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:15,991 sats.satellite.EO1             INFO       <180.00> EO1: setting timed terminal event at 780.0
2024-05-30 12:06:15,992 sim.simulator                  INFO       <180.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,024 sats.satellite.EO1             INFO       <780.00> EO1: timed termination at 780.0 for action_charge
2024-05-30 12:06:16,024 data.base                      INFO       <780.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,047 gym                            INFO       <780.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:16,048 gym                            INFO       <780.00> Step reward: 0.0
2024-05-30 12:06:16,048 gym                            INFO       <780.00> === STARTING STEP ===
2024-05-30 12:06:16,048 sats.satellite.EO1             INFO       <780.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,048 sats.satellite.EO1             INFO       <780.00> EO1: setting timed terminal event at 1380.0
2024-05-30 12:06:16,049 sim.simulator                  INFO       <780.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,080 sats.satellite.EO1             INFO       <1380.00> EO1: timed termination at 1380.0 for action_charge
2024-05-30 12:06:16,080 data.base                      INFO       <1380.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,081 gym                            INFO       <1380.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:16,081 gym                            INFO       <1380.00> Step reward: 0.0
2024-05-30 12:06:16,081 gym                            INFO       <1380.00> === STARTING STEP ===
2024-05-30 12:06:16,081 sats.satellite.EO1             INFO       <1380.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,082 sats.satellite.EO1             INFO       <1380.00> EO1: setting timed terminal event at 1980.0
2024-05-30 12:06:16,082 sim.simulator                  INFO       <1380.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,113 sats.satellite.EO1             INFO       <1980.00> EO1: timed termination at 1980.0 for action_charge
2024-05-30 12:06:16,113 data.base                      INFO       <1980.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,114 gym                            INFO       <1980.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:16,114 gym                            INFO       <1980.00> Step reward: 0.0
2024-05-30 12:06:16,114 gym                            INFO       <1980.00> === STARTING STEP ===
2024-05-30 12:06:16,114 sats.satellite.EO1             INFO       <1980.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,115 sats.satellite.EO1             INFO       <1980.00> EO1: setting timed terminal event at 2580.0
2024-05-30 12:06:16,115 sim.simulator                  INFO       <1980.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,146 sats.satellite.EO1             INFO       <2580.00> EO1: timed termination at 2580.0 for action_charge
2024-05-30 12:06:16,146 data.base                      INFO       <2580.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,175 gym                            INFO       <2580.00> Satellites requiring retasking: ['EO1_11826041840']
Charge level: 0.160 (780.0 seconds)
        Eclipse: start: 5340.0 end: 1800.0
Charge level: 0.158 (1380.0 seconds)
        Eclipse: start: 4740.0 end: 1200.0
Charge level: 0.155 (1980.0 seconds)
        Eclipse: start: 4140.0 end: 600.0
2024-05-30 12:06:16,175 gym                            INFO       <2580.00> Step reward: 0.0
2024-05-30 12:06:16,175 gym                            INFO       <2580.00> === STARTING STEP ===
2024-05-30 12:06:16,176 sats.satellite.EO1             INFO       <2580.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,176 sats.satellite.EO1             INFO       <2580.00> EO1: setting timed terminal event at 3180.0
2024-05-30 12:06:16,176 sim.simulator                  INFO       <2580.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,207 sats.satellite.EO1             INFO       <3180.00> EO1: timed termination at 3180.0 for action_charge
2024-05-30 12:06:16,207 data.base                      INFO       <3180.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,208 gym                            INFO       <3180.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:16,208 gym                            INFO       <3180.00> Step reward: 0.0
2024-05-30 12:06:16,208 gym                            INFO       <3180.00> === STARTING STEP ===
2024-05-30 12:06:16,209 sats.satellite.EO1             INFO       <3180.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,209 sats.satellite.EO1             INFO       <3180.00> EO1: setting timed terminal event at 3780.0
2024-05-30 12:06:16,209 sim.simulator                  INFO       <3180.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,240 sats.satellite.EO1             INFO       <3780.00> EO1: timed termination at 3780.0 for action_charge
2024-05-30 12:06:16,241 data.base                      INFO       <3780.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,241 gym                            INFO       <3780.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:16,241 gym                            INFO       <3780.00> Step reward: 0.0
2024-05-30 12:06:16,242 gym                            INFO       <3780.00> === STARTING STEP ===
2024-05-30 12:06:16,242 sats.satellite.EO1             INFO       <3780.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,242 sats.satellite.EO1             INFO       <3780.00> EO1: setting timed terminal event at 4380.0
2024-05-30 12:06:16,242 sim.simulator                  INFO       <3780.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,274 sats.satellite.EO1             INFO       <4380.00> EO1: timed termination at 4380.0 for action_charge
Charge level: 0.175 (2580.0 seconds)
        Eclipse: start: 3540.0 end: 5670.0
Charge level: 0.763 (3180.0 seconds)
        Eclipse: start: 2940.0 end: 5070.0
Charge level: 1.000 (3780.0 seconds)
        Eclipse: start: 2340.0 end: 4470.0
2024-05-30 12:06:16,275 data.base                      INFO       <4380.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,276 gym                            INFO       <4380.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:16,276 gym                            INFO       <4380.00> Step reward: 0.0
2024-05-30 12:06:16,276 gym                            INFO       <4380.00> === STARTING STEP ===
2024-05-30 12:06:16,276 sats.satellite.EO1             INFO       <4380.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,277 sats.satellite.EO1             INFO       <4380.00> EO1: setting timed terminal event at 4980.0
2024-05-30 12:06:16,277 sim.simulator                  INFO       <4380.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,309 sats.satellite.EO1             INFO       <4980.00> EO1: timed termination at 4980.0 for action_charge
2024-05-30 12:06:16,309 data.base                      INFO       <4980.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,310 gym                            INFO       <4980.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:16,310 gym                            INFO       <4980.00> Step reward: 0.0
2024-05-30 12:06:16,311 gym                            INFO       <4980.00> === STARTING STEP ===
2024-05-30 12:06:16,311 sats.satellite.EO1             INFO       <4980.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,311 sats.satellite.EO1             INFO       <4980.00> EO1: setting timed terminal event at 5580.0
2024-05-30 12:06:16,311 sim.simulator                  INFO       <4980.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,344 sats.satellite.EO1             INFO       <5580.00> EO1: timed termination at 5580.0 for action_charge
2024-05-30 12:06:16,344 data.base                      INFO       <5580.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,345 gym                            INFO       <5580.00> Satellites requiring retasking: ['EO1_11826041840']
2024-05-30 12:06:16,345 gym                            INFO       <5580.00> Step reward: 0.0
2024-05-30 12:06:16,345 gym                            INFO       <5580.00> === STARTING STEP ===
2024-05-30 12:06:16,345 sats.satellite.EO1             INFO       <5580.00> EO1: action_charge tasked for 600.0 seconds
2024-05-30 12:06:16,346 sats.satellite.EO1             INFO       <5580.00> EO1: setting timed terminal event at 6180.0
2024-05-30 12:06:16,346 sim.simulator                  INFO       <5580.00> Running simulation at most to 5700.00 seconds
2024-05-30 12:06:16,353 data.base                      INFO       <5700.00> Data reward: {'EO1_11826041840': 0.0}
2024-05-30 12:06:16,353 gym                            INFO       <5700.00> Step reward: 0.0
2024-05-30 12:06:16,353 gym                            INFO       <5700.00> Episode terminated: False
2024-05-30 12:06:16,354 gym                            INFO       <5700.00> Episode truncated: True
Charge level: 1.000 (4380.0 seconds)
        Eclipse: start: 1740.0 end: 3870.0
Charge level: 1.000 (4980.0 seconds)
        Eclipse: start: 1140.0 end: 3270.0
Charge level: 1.000 (5580.0 seconds)
        Eclipse: start: 540.0 end: 2670.0
Charge level: 1.000 (5700.0 seconds)
        Eclipse: start: 420.0 end: 2550.0
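
The eclipse values in the printed output can be read as the time until the next eclipse start and the time until the next eclipse end. That interpretation is an assumption drawn from the printed numbers; under it, an end time smaller than the start time means an eclipse is already in progress. The helper in_eclipse below is hypothetical and is not part of bsk_rl.

```python
# (sim_time_s, charge_fraction, eclipse_start, eclipse_end),
# copied from the first six printed lines of the output above.
samples = [
    (780.0, 0.160, 5340.0, 1800.0),
    (1380.0, 0.158, 4740.0, 1200.0),
    (1980.0, 0.155, 4140.0, 600.0),
    (2580.0, 0.175, 3540.0, 5670.0),
    (3180.0, 0.763, 2940.0, 5070.0),
    (3780.0, 1.000, 2340.0, 4470.0),
]


def in_eclipse(start, end):
    """Assumed reading: the next end precedes the next start while in eclipse."""
    return end < start


for sim_time, charge, start, end in samples:
    state = "eclipse" if in_eclipse(start, end) else "sunlit"
    print(f"{sim_time:7.1f} s  charge={charge:.3f}  {state}")
```

Under this reading the first three samples fall in eclipse, where the charge level drifts down, and the remaining samples are sunlit, where it climbs to full.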

The battery charge decreases while the satellite is in eclipse, but once the satellite leaves eclipse, the battery quickly recharges to full.
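
The charging loop above stops only on truncated, which is sufficient here because the episode never terminates early. A more defensive rollout checks both flags. The sketch below uses CountdownEnv, a hypothetical stand-in that mimics only the Gymnasium step and reset signatures; it is not a bsk_rl environment, and its fail_at parameter is invented to simulate an early termination.

```python
class CountdownEnv:
    """Hypothetical stand-in for a Gymnasium-style environment (not bsk_rl)."""

    def __init__(self, time_limit=5700.0, fail_at=None):
        self.time_limit = time_limit
        self.fail_at = fail_at  # simulated failure time, or None for no failure
        self.sim_time = 0.0

    def reset(self, seed=None):
        self.sim_time = 0.0
        return [self.sim_time], {}

    def step(self, action):
        # Action 0 lasts 60 s and action 1 lasts 600 s, as in this tutorial.
        duration = 60.0 if action == 0 else 600.0
        self.sim_time = min(self.sim_time + duration, self.time_limit)
        terminated = self.fail_at is not None and self.sim_time >= self.fail_at
        truncated = self.sim_time >= self.time_limit
        return [self.sim_time], 0.0, terminated, truncated, {}


def rollout(env, action):
    """Repeat one action until the episode ends for either reason."""
    observation, info = env.reset(seed=1)
    terminated = truncated = False
    steps = 0
    while not (terminated or truncated):
        observation, reward, terminated, truncated, info = env.step(action)
        steps += 1
    return steps, terminated, truncated


print(rollout(CountdownEnv(), action=1))
print(rollout(CountdownEnv(fail_at=1500.0), action=1))
```

The first call runs until the time limit and ends with truncated set; the second ends early with terminated set, a case in which a loop that tests truncated alone would keep stepping a finished episode.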