Getting Started

This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).

Load Modules

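In this tutorial, the environment will be created with gym.make, so bsk_rl must be imported along with gymnasium (as gym) and the bsk_rl components used to configure the satellite and environment. The Basilisk log level is also set to warnings only, which keeps the simulation output short.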

[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw

from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)

If no errors were raised, you have a functional installation of bsk_rl.

Configure the Satellite

Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.

[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel

Based on this class specification, a list of configurable parameters for the satellite can be generated.

[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
 'maxCounterValue': 4,
 'thrMinFireTime': 0.02,
 'desatAttitude': 'sun',
 'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
 'thrForceSign': 1,
 'K': 7.0,
 'Ki': -1,
 'P': 35.0,
 'imageAttErrorRequirement': 0.01,
 'imageRateErrorRequirement': None,
 'inst_pHat_B': [0, 0, 1],
 'utc_init': 'this value will be set by the world model',
 'batteryStorageCapacity': 288000.0,
 'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'disturbance_vector': None,
 'dragCoeff': 2.2,
 'imageTargetMaximumRange': -1,
 'instrumentBaudRate': 8000000.0,
 'instrumentPowerDraw': -30.0,
 'basePowerDraw': 0.0,
 'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'maxWheelSpeed': inf,
 'u_max': 0.2,
 'rwBasePower': 0.4,
 'rwMechToElecEfficiency': 0.0,
 'rwElecToMechEfficiency': 0.5,
 'panelArea': 1.0,
 'panelEfficiency': 0.2,
 'nHat_B': array([ 0,  0, -1]),
 'mass': 330,
 'width': 1.38,
 'depth': 1.04,
 'height': 1.58,
 'sigma_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'omega_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'rN': None,
 'vN': None,
 'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
 'mu': 398600436000000.0,
 'dataStorageCapacity': 160000000.0,
 'storageUnitValidCheck': False,
 'storageInit': 0,
 'thrusterPowerDraw': 0.0,
 'transmitterBaudRate': -8000000.0,
 'transmitterNumBuffers': 100,
 'transmitterPacketSize': None,
 'transmitterPowerDraw': -15.0}

When instantiating a satellite, these parameters can be overridden through the sat_args dictionary, either with a constant value or with a function that is called to rerandomize the value every time the environment is reset.

[4]:
sat_args = {}

# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0

# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10

# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
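
The difference between a constant entry and a callable entry in sat_args can be pictured with a minimal sketch. This is not BSK-RL's implementation: resolve_args is a hypothetical helper, and the sketch assumes only that callable values are invoked once per reset while all other values are passed through unchanged. The defaults shown are a small subset of the list above.

```python
import random


def resolve_args(defaults, overrides):
    """Merge overrides into defaults, calling any callable value to sample it.

    Hypothetical helper for illustration only; not part of bsk_rl.
    """
    merged = {**defaults, **overrides}
    return {
        key: value() if callable(value) else value
        for key, value in merged.items()
    }


# Subset of the default parameters listed above
defaults = {
    "imageAttErrorRequirement": 0.01,
    "dataStorageCapacity": 160000000.0,
    "storageInit": 0,
}
overrides = {
    "imageAttErrorRequirement": 0.05,  # constant: same on every reset
    "dataStorageCapacity": 1e10,  # constant: same on every reset
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # resampled
}

# Each call stands in for one environment reset
first_reset = resolve_args(defaults, overrides)
second_reset = resolve_args(defaults, overrides)
print(first_reset["storageInit"], second_reset["storageInit"])
```

The two printed storage levels will generally differ, while the constant entries are identical across both calls.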

Making the Environment

For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.

[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2025-09-30 17:49:51,912 gym                            INFO       Calling env.reset() to get observation space
2025-09-30 17:49:51,913 gym                            INFO       Resetting environment with seed=4235484085
2025-09-30 17:49:52,003 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-09-30 17:49:52,019 gym                            INFO       <0.00> Environment reset
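
The "approximately 1 orbit" comment on time_limit can be checked with a short calculation. The sketch below assumes a circular two-body orbit and that the satellite keeps the defaults shown in the parameter list above: the gravitational parameter mu and the default semi-major axis of 6871 km from the random_orbit signature.

```python
import math

mu = 398600436000000.0  # m^3/s^2, the 'mu' default listed above
a = 6871e3  # m, default semi-major axis of random_orbit (6871 km)

# Two-body orbital period: T = 2 * pi * sqrt(a^3 / mu)
period = 2 * math.pi * math.sqrt(a**3 / mu)
print(f"Orbital period: {period:.0f} s")
```

Under these assumptions the period is about 5668 s, so a time limit of 5700 s covers slightly more than one orbit.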

Interacting with the Environment

First, the environment is reset.

[6]:
observation, info = env.reset(seed=1)
2025-09-30 17:49:52,080 gym                            INFO       Resetting environment with seed=1
2025-09-30 17:49:52,213 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-09-30 17:49:52,228 gym                            INFO       <0.00> Environment reset

Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude in the nadir-pointing mode and satisfy the imaging conditions. The logs show little or no reward in the first two steps (0.0 and 18.0) while the attitude settles, but the third step earns a reward of 60, corresponding to 60 seconds of imaging.

[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print("  Final data level:", observation[0])
2025-09-30 17:49:52,234 gym                            INFO       <0.00> === STARTING STEP ===
2025-09-30 17:49:52,235 sats.satellite.EO1             INFO       <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-09-30 17:49:52,235 sats.satellite.EO1             INFO       <0.00> EO1: setting timed terminal event at 60.0
2025-09-30 17:49:52,243 sats.satellite.EO1             INFO       <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2025-09-30 17:49:52,243 data.base                      INFO       <60.00> Total reward: {}
2025-09-30 17:49:52,244 comm.communication             INFO       <60.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,244 sats.satellite.EO1             INFO       <60.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,246 gym                            INFO       <60.00> Step reward: 0.0
2025-09-30 17:49:52,247 gym                            INFO       <60.00> === STARTING STEP ===
2025-09-30 17:49:52,247 sats.satellite.EO1             INFO       <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-09-30 17:49:52,247 sats.satellite.EO1             INFO       <60.00> EO1: setting timed terminal event at 120.0
2025-09-30 17:49:52,255 sats.satellite.EO1             INFO       <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2025-09-30 17:49:52,255 data.base                      INFO       <120.00> Total reward: {'EO1': 18.0}
2025-09-30 17:49:52,256 comm.communication             INFO       <120.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,256 sats.satellite.EO1             INFO       <120.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,257 gym                            INFO       <120.00> Step reward: 18.0
2025-09-30 17:49:52,258 gym                            INFO       <120.00> === STARTING STEP ===
2025-09-30 17:49:52,259 sats.satellite.EO1             INFO       <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-09-30 17:49:52,260 sats.satellite.EO1             INFO       <120.00> EO1: setting timed terminal event at 180.0
2025-09-30 17:49:52,267 sats.satellite.EO1             INFO       <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2025-09-30 17:49:52,267 data.base                      INFO       <180.00> Total reward: {'EO1': 60.0}
2025-09-30 17:49:52,268 comm.communication             INFO       <180.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,269 sats.satellite.EO1             INFO       <180.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,270 gym                            INFO       <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
  Final data level: 0.6741613078

The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
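
As a rough consistency check, the change in storage_level_fraction can be converted back into imaging time. The sketch below uses the values printed above and the sat_args set earlier. It assumes that storage fills at instrumentBaudRate while imaging, that capacity and baud rate use the same data unit, and that the rewarder grants one unit of reward per second of imaging.

```python
# Values copied from the printed output and the sat_args above
initial_fraction = 0.5961613078
final_fraction = 0.6741613078
storage_capacity = 1e10  # dataStorageCapacity
baud_rate = 1e7  # instrumentBaudRate, data units per second

# Data added to storage, then the imaging time needed to collect it
data_collected = (final_fraction - initial_fraction) * storage_capacity
imaging_seconds = data_collected / baud_rate

step_rewards = [0.0, 18.0, 60.0]  # "Step reward" values from the logs
print(f"Imaging time implied by storage: {imaging_seconds:.1f} s")
print(f"Sum of step rewards: {sum(step_rewards):.1f}")
```

Both numbers come out to 78.0, so under these assumptions the stored data and the accumulated reward describe the same 78 seconds of imaging.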

Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.

[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(f"Charge level: {observation[1]:.3f} ({env.unwrapped.simulator.sim_time:.1f} seconds)\n\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2025-09-30 17:49:52,275 gym                            INFO       <180.00> === STARTING STEP ===
2025-09-30 17:49:52,276 sats.satellite.EO1             INFO       <180.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,276 sats.satellite.EO1             INFO       <180.00> EO1: setting timed terminal event at 780.0
2025-09-30 17:49:52,341 sats.satellite.EO1             INFO       <780.00> EO1: timed termination at 780.0 for action_charge
2025-09-30 17:49:52,341 data.base                      INFO       <780.00> Total reward: {}
2025-09-30 17:49:52,342 comm.communication             INFO       <780.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,343 sats.satellite.EO1             INFO       <780.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,348 gym                            INFO       <780.00> Step reward: 0.0
2025-09-30 17:49:52,348 gym                            INFO       <780.00> === STARTING STEP ===
2025-09-30 17:49:52,349 sats.satellite.EO1             INFO       <780.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,350 sats.satellite.EO1             INFO       <780.00> EO1: setting timed terminal event at 1380.0
2025-09-30 17:49:52,409 sats.satellite.EO1             INFO       <1380.00> EO1: timed termination at 1380.0 for action_charge
2025-09-30 17:49:52,410 data.base                      INFO       <1380.00> Total reward: {}
2025-09-30 17:49:52,410 comm.communication             INFO       <1380.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,411 sats.satellite.EO1             INFO       <1380.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,412 gym                            INFO       <1380.00> Step reward: 0.0
2025-09-30 17:49:52,413 gym                            INFO       <1380.00> === STARTING STEP ===
2025-09-30 17:49:52,414 sats.satellite.EO1             INFO       <1380.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,414 sats.satellite.EO1             INFO       <1380.00> EO1: setting timed terminal event at 1980.0
Charge level: 0.637 (780.0 seconds)
        Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.635 (1380.0 seconds)
        Eclipse: start: 5010.0 end: 1440.0
2025-09-30 17:49:52,478 sats.satellite.EO1             INFO       <1980.00> EO1: timed termination at 1980.0 for action_charge
2025-09-30 17:49:52,479 data.base                      INFO       <1980.00> Total reward: {}
2025-09-30 17:49:52,479 comm.communication             INFO       <1980.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,479 sats.satellite.EO1             INFO       <1980.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,481 gym                            INFO       <1980.00> Step reward: 0.0
2025-09-30 17:49:52,482 gym                            INFO       <1980.00> === STARTING STEP ===
2025-09-30 17:49:52,482 sats.satellite.EO1             INFO       <1980.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,483 sats.satellite.EO1             INFO       <1980.00> EO1: setting timed terminal event at 2580.0
2025-09-30 17:49:52,547 sats.satellite.EO1             INFO       <2580.00> EO1: timed termination at 2580.0 for action_charge
2025-09-30 17:49:52,548 data.base                      INFO       <2580.00> Total reward: {}
2025-09-30 17:49:52,548 comm.communication             INFO       <2580.00> Optimizing data communication between all pairs of satellites
Charge level: 0.632 (1980.0 seconds)
        Eclipse: start: 4410.0 end: 840.0
2025-09-30 17:49:52,549 sats.satellite.EO1             INFO       <2580.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,550 gym                            INFO       <2580.00> Step reward: 0.0
2025-09-30 17:49:52,551 gym                            INFO       <2580.00> === STARTING STEP ===
2025-09-30 17:49:52,551 sats.satellite.EO1             INFO       <2580.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,552 sats.satellite.EO1             INFO       <2580.00> EO1: setting timed terminal event at 3180.0
2025-09-30 17:49:52,613 sats.satellite.EO1             INFO       <3180.00> EO1: timed termination at 3180.0 for action_charge
2025-09-30 17:49:52,614 data.base                      INFO       <3180.00> Total reward: {}
2025-09-30 17:49:52,614 comm.communication             INFO       <3180.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,615 sats.satellite.EO1             INFO       <3180.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,624 gym                            INFO       <3180.00> Step reward: 0.0
2025-09-30 17:49:52,625 gym                            INFO       <3180.00> === STARTING STEP ===
2025-09-30 17:49:52,625 sats.satellite.EO1             INFO       <3180.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,626 sats.satellite.EO1             INFO       <3180.00> EO1: setting timed terminal event at 3780.0
Charge level: 0.630 (2580.0 seconds)
        Eclipse: start: 3810.0 end: 240.0
Charge level: 1.000 (3180.0 seconds)
        Eclipse: start: 3210.0 end: 5310.0
2025-09-30 17:49:52,685 sats.satellite.EO1             INFO       <3780.00> EO1: timed termination at 3780.0 for action_charge
2025-09-30 17:49:52,686 data.base                      INFO       <3780.00> Total reward: {}
2025-09-30 17:49:52,686 comm.communication             INFO       <3780.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,687 sats.satellite.EO1             INFO       <3780.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,688 gym                            INFO       <3780.00> Step reward: 0.0
2025-09-30 17:49:52,689 gym                            INFO       <3780.00> === STARTING STEP ===
2025-09-30 17:49:52,690 sats.satellite.EO1             INFO       <3780.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,690 sats.satellite.EO1             INFO       <3780.00> EO1: setting timed terminal event at 4380.0
2025-09-30 17:49:52,750 sats.satellite.EO1             INFO       <4380.00> EO1: timed termination at 4380.0 for action_charge
2025-09-30 17:49:52,750 data.base                      INFO       <4380.00> Total reward: {}
2025-09-30 17:49:52,751 comm.communication             INFO       <4380.00> Optimizing data communication between all pairs of satellites
Charge level: 1.000 (3780.0 seconds)
        Eclipse: start: 2610.0 end: 4710.0
2025-09-30 17:49:52,752 sats.satellite.EO1             INFO       <4380.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,753 gym                            INFO       <4380.00> Step reward: 0.0
2025-09-30 17:49:52,754 gym                            INFO       <4380.00> === STARTING STEP ===
2025-09-30 17:49:52,754 sats.satellite.EO1             INFO       <4380.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,754 sats.satellite.EO1             INFO       <4380.00> EO1: setting timed terminal event at 4980.0
2025-09-30 17:49:52,815 sats.satellite.EO1             INFO       <4980.00> EO1: timed termination at 4980.0 for action_charge
2025-09-30 17:49:52,815 data.base                      INFO       <4980.00> Total reward: {}
2025-09-30 17:49:52,816 comm.communication             INFO       <4980.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,816 sats.satellite.EO1             INFO       <4980.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,818 gym                            INFO       <4980.00> Step reward: 0.0
2025-09-30 17:49:52,818 gym                            INFO       <4980.00> === STARTING STEP ===
2025-09-30 17:49:52,819 sats.satellite.EO1             INFO       <4980.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,819 sats.satellite.EO1             INFO       <4980.00> EO1: setting timed terminal event at 5580.0
2025-09-30 17:49:52,879 sats.satellite.EO1             INFO       <5580.00> EO1: timed termination at 5580.0 for action_charge
2025-09-30 17:49:52,880 data.base                      INFO       <5580.00> Total reward: {}
2025-09-30 17:49:52,880 comm.communication             INFO       <5580.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,881 sats.satellite.EO1             INFO       <5580.00> EO1: Satellite EO1 requires retasking
2025-09-30 17:49:52,882 gym                            INFO       <5580.00> Step reward: 0.0
2025-09-30 17:49:52,883 gym                            INFO       <5580.00> === STARTING STEP ===
2025-09-30 17:49:52,883 sats.satellite.EO1             INFO       <5580.00> EO1: action_charge tasked for 600.0 seconds
2025-09-30 17:49:52,884 sats.satellite.EO1             INFO       <5580.00> EO1: setting timed terminal event at 6180.0
Charge level: 1.000 (4380.0 seconds)
        Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
        Eclipse: start: 1410.0 end: 3510.0
Charge level: 1.000 (5580.0 seconds)
        Eclipse: start: 810.0 end: 2910.0
2025-09-30 17:49:52,897 data.base                      INFO       <5700.00> Total reward: {}
2025-09-30 17:49:52,898 comm.communication             INFO       <5700.00> Optimizing data communication between all pairs of satellites
2025-09-30 17:49:52,899 gym                            INFO       <5700.00> Step reward: 0.0
2025-09-30 17:49:52,900 gym                            INFO       <5700.00> Episode terminated: False
2025-09-30 17:49:52,900 gym                            INFO       <5700.00> Episode truncated: True
Charge level: 1.000 (5700.0 seconds)
        Eclipse: start: 690.0 end: 2790.0

The battery charge decreases while the satellite is in eclipse, but once the satellite leaves eclipse, the battery quickly returns to full charge.
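
The eclipse entries in the observation can be used to tell which steps were spent in shadow. The sketch below is an interpretation inferred from the printed values in this run, not a documented rule: it assumes the two entries are the times from the current step to the next eclipse start and the next eclipse end, so the satellite is in eclipse whenever the end comes before the start. in_eclipse is a hypothetical helper.

```python
def in_eclipse(time_to_start, time_to_end):
    """Return True if the next eclipse end comes before the next eclipse start.

    Hypothetical helper; assumes both inputs are times relative to now.
    """
    return time_to_end < time_to_start


# (sim time, charge fraction, eclipse start, eclipse end) from the output above
samples = [
    (780.0, 0.637, 5610.0, 2040.0),
    (1380.0, 0.635, 5010.0, 1440.0),
    (1980.0, 0.632, 4410.0, 840.0),
    (2580.0, 0.630, 3810.0, 240.0),
    (3180.0, 1.000, 3210.0, 5310.0),
    (3780.0, 1.000, 2610.0, 4710.0),
]

for sim_time, charge, start, end in samples:
    state = "eclipse" if in_eclipse(start, end) else "sunlit"
    print(f"{sim_time:6.0f} s  charge={charge:.3f}  {state}")
```

With this reading, the four steps with a falling charge level are labeled as eclipse and the steps at full charge are labeled as sunlit.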