Getting Started

This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).

Load Modules

In this tutorial, the environment will be created with gym.make, so it is necessary to import the top-level bsk_rl module as well as gym and bsk_rl components.

[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw

from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)

If no errors were raised, you have a functional installation of bsk_rl.

Configure the Satellite

Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.

[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel

Based on this class specification, a list of configurable parameters for the satellite can be generated.

[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
 'maxCounterValue': 4,
 'thrMinFireTime': 0.02,
 'desatAttitude': 'sun',
 'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
 'thrForceSign': 1,
 'K': 7.0,
 'Ki': -1,
 'P': 35.0,
 'imageAttErrorRequirement': 0.01,
 'imageRateErrorRequirement': None,
 'inst_pHat_B': [0, 0, 1],
 'utc_init': 'this value will be set by the world model',
 'batteryStorageCapacity': 288000.0,
 'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'disturbance_vector': None,
 'dragCoeff': 2.2,
 'imageTargetMaximumRange': -1,
 'instrumentBaudRate': 8000000.0,
 'instrumentPowerDraw': -30.0,
 'basePowerDraw': 0.0,
 'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'maxWheelSpeed': inf,
 'u_max': 0.2,
 'rwBasePower': 0.4,
 'rwMechToElecEfficiency': 0.0,
 'rwElecToMechEfficiency': 0.5,
 'panelArea': 1.0,
 'panelEfficiency': 0.2,
 'nHat_B': array([ 0,  0, -1]),
 'mass': 330,
 'width': 1.38,
 'depth': 1.04,
 'height': 1.58,
 'sigma_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'omega_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'rN': None,
 'vN': None,
 'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
 'mu': 398600436000000.0,
 'min_orbital_radius': 6578136.6,
 'dataStorageCapacity': 160000000.0,
 'storageUnitValidCheck': False,
 'storageInit': 0,
 'thrusterPowerDraw': 0.0,
 'transmitterBaudRate': -8000000.0,
 'transmitterNumBuffers': 100,
 'transmitterPacketSize': None,
 'transmitterPowerDraw': -15.0}

When instantiating a satellite, these parameters can be overridden through the sat_args dictionary, either with a constant or with a function that is called to rerandomize the value every time the environment is reset.

[4]:
sat_args = {}

# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0

# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10

# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
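
The constant-versus-function convention can be pictured with a small stand-alone sketch. This is not BSK-RL's implementation: resolve_sat_args is a hypothetical helper, and the only idea taken from the text above is that constants pass through unchanged while zero-argument functions are called again on each reset.

```python
import random


def resolve_sat_args(defaults, overrides):
    """Hypothetical helper: apply overrides, then call any zero-argument functions."""
    merged = {**defaults, **overrides}
    return {key: (value() if callable(value) else value) for key, value in merged.items()}


# Two of the defaults listed above, and the overrides used in this tutorial
defaults = {"dataStorageCapacity": 160000000.0, "storageInit": 0}
overrides = {
    "dataStorageCapacity": 1e10,  # constant: identical on every reset
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # redrawn on every reset
}

first_reset = resolve_sat_args(defaults, overrides)
second_reset = resolve_sat_args(defaults, overrides)
print(first_reset["storageInit"], second_reset["storageInit"])
```

Each call stands in for one environment reset, so the two printed storage levels will generally differ while dataStorageCapacity stays fixed.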

Making the Environment

For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.

[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2025-12-03 22:24:17,371 gym                            INFO       Calling env.reset() to get observation space
2025-12-03 22:24:17,372 gym                            INFO       Resetting environment with seed=4195239533
2025-12-03 22:24:17,461 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2025-12-03 22:24:17,471 gym                            INFO       <0.00> Environment reset
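
As a rough check on the "approximately 1 orbit" comment, the two-body orbital period can be estimated from the defaults listed earlier. The sketch below assumes a circular orbit with the default semi-major axis of 6871 km from the random_orbit signature and the default mu value; it is a back-of-the-envelope calculation, not something the environment computes in this form.

```python
import math

MU_EARTH = 398600436000000.0  # m^3/s^2, the 'mu' default listed above
SEMI_MAJOR_AXIS = 6871e3  # m, assuming the default a = 6871 km


def orbital_period(semi_major_axis, mu=MU_EARTH):
    """Two-body orbital period T = 2*pi*sqrt(a^3 / mu), in seconds."""
    return 2.0 * math.pi * math.sqrt(semi_major_axis**3 / mu)


period = orbital_period(SEMI_MAJOR_AXIS)
print(f"Orbital period: {period:.0f} s ({period / 60:.1f} min)")
```

The result is a little under the 5700-second time limit, so an episode covers slightly more than one orbit under these assumptions.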

Interacting with the Environment

First, the environment is reset.

[6]:
observation, info = env.reset(seed=1)
2025-12-03 22:24:17,522 gym                            INFO       Resetting environment with seed=1
2025-12-03 22:24:17,644 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2025-12-03 22:24:17,653 gym                            INFO       <0.00> Environment reset

Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude in the nadir-pointing mode so that imaging conditions are satisfied. Note that the logs show little or no reward in the first two steps while the satellite settles, but the third step earns a reward of 60 (corresponding to 60 seconds of imaging).

[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print("  Final data level:", observation[0])
2025-12-03 22:24:17,658 gym                            INFO       <0.00> === STARTING STEP ===
2025-12-03 22:24:17,659 sats.satellite.EO1             INFO       <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-12-03 22:24:17,659 sats.satellite.EO1             INFO       <0.00> EO1: setting timed terminal event at 60.0
2025-12-03 22:24:17,664 sats.satellite.EO1             INFO       <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2025-12-03 22:24:17,665 data.base                      INFO       <60.00> Total reward: {}
2025-12-03 22:24:17,665 comm.communication             INFO       <60.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,666 sats.satellite.EO1             INFO       <60.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,667 gym                            INFO       <60.00> Step reward: 0.0
2025-12-03 22:24:17,668 gym                            INFO       <60.00> === STARTING STEP ===
2025-12-03 22:24:17,668 sats.satellite.EO1             INFO       <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-12-03 22:24:17,668 sats.satellite.EO1             INFO       <60.00> EO1: setting timed terminal event at 120.0
2025-12-03 22:24:17,673 sats.satellite.EO1             INFO       <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2025-12-03 22:24:17,674 data.base                      INFO       <120.00> Total reward: {'EO1': 18.0}
2025-12-03 22:24:17,674 comm.communication             INFO       <120.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,675 sats.satellite.EO1             INFO       <120.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,676 gym                            INFO       <120.00> Step reward: 18.0
2025-12-03 22:24:17,677 gym                            INFO       <120.00> === STARTING STEP ===
2025-12-03 22:24:17,677 sats.satellite.EO1             INFO       <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-12-03 22:24:17,678 sats.satellite.EO1             INFO       <120.00> EO1: setting timed terminal event at 180.0
2025-12-03 22:24:17,682 sats.satellite.EO1             INFO       <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2025-12-03 22:24:17,683 data.base                      INFO       <180.00> Total reward: {'EO1': 60.0}
2025-12-03 22:24:17,683 comm.communication             INFO       <180.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,684 sats.satellite.EO1             INFO       <180.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,686 gym                            INFO       <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
  Final data level: 0.6741613078

The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
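
The size of the increase can be checked by hand. The sketch below assumes that data accrues at instrumentBaudRate for every second of rewarded imaging and that the observation is the stored amount divided by dataStorageCapacity; both are assumptions, although they are consistent with the numbers printed above.

```python
instrument_baud_rate = 1e7  # data units per second, from sat_args
data_storage_capacity = 1e10  # data units, from sat_args
initial_fraction = 0.5961613078  # printed before stepping
imaging_seconds = 18.0 + 60.0  # step rewards logged for the second and third steps

# Data collected while imaging, expressed as a fraction of the storage capacity
collected_fraction = instrument_baud_rate * imaging_seconds / data_storage_capacity
expected_final = initial_fraction + collected_fraction
print(f"Increase: {collected_fraction:.3f}, expected final level: {expected_final:.10f}")
```

Under these assumptions, 78 seconds of imaging fills 0.078 of the buffer, which matches the difference between the initial and final data levels in the output.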

Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.

[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    sim_time = env.unwrapped.simulator.sim_time
    print(f"Charge level: {observation[1]:.3f} ({sim_time:.1f} seconds)")
    print(f"\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2025-12-03 22:24:17,691 gym                            INFO       <180.00> === STARTING STEP ===
2025-12-03 22:24:17,691 sats.satellite.EO1             INFO       <180.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,692 sats.satellite.EO1             INFO       <180.00> EO1: setting timed terminal event at 780.0
2025-12-03 22:24:17,724 sats.satellite.EO1             INFO       <780.00> EO1: timed termination at 780.0 for action_charge
2025-12-03 22:24:17,724 data.base                      INFO       <780.00> Total reward: {}
2025-12-03 22:24:17,725 comm.communication             INFO       <780.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,725 sats.satellite.EO1             INFO       <780.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,729 gym                            INFO       <780.00> Step reward: 0.0
2025-12-03 22:24:17,730 gym                            INFO       <780.00> === STARTING STEP ===
2025-12-03 22:24:17,731 sats.satellite.EO1             INFO       <780.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,731 sats.satellite.EO1             INFO       <780.00> EO1: setting timed terminal event at 1380.0
2025-12-03 22:24:17,763 sats.satellite.EO1             INFO       <1380.00> EO1: timed termination at 1380.0 for action_charge
2025-12-03 22:24:17,763 data.base                      INFO       <1380.00> Total reward: {}
2025-12-03 22:24:17,764 comm.communication             INFO       <1380.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,764 sats.satellite.EO1             INFO       <1380.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,766 gym                            INFO       <1380.00> Step reward: 0.0
2025-12-03 22:24:17,766 gym                            INFO       <1380.00> === STARTING STEP ===
2025-12-03 22:24:17,767 sats.satellite.EO1             INFO       <1380.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,767 sats.satellite.EO1             INFO       <1380.00> EO1: setting timed terminal event at 1980.0
2025-12-03 22:24:17,799 sats.satellite.EO1             INFO       <1980.00> EO1: timed termination at 1980.0 for action_charge
2025-12-03 22:24:17,799 data.base                      INFO       <1980.00> Total reward: {}
2025-12-03 22:24:17,800 comm.communication             INFO       <1980.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,800 sats.satellite.EO1             INFO       <1980.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,801 gym                            INFO       <1980.00> Step reward: 0.0
2025-12-03 22:24:17,802 gym                            INFO       <1980.00> === STARTING STEP ===
2025-12-03 22:24:17,802 sats.satellite.EO1             INFO       <1980.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,803 sats.satellite.EO1             INFO       <1980.00> EO1: setting timed terminal event at 2580.0
2025-12-03 22:24:17,835 sats.satellite.EO1             INFO       <2580.00> EO1: timed termination at 2580.0 for action_charge
2025-12-03 22:24:17,835 data.base                      INFO       <2580.00> Total reward: {}
2025-12-03 22:24:17,836 comm.communication             INFO       <2580.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,837 sats.satellite.EO1             INFO       <2580.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,838 gym                            INFO       <2580.00> Step reward: 0.0
2025-12-03 22:24:17,839 gym                            INFO       <2580.00> === STARTING STEP ===
2025-12-03 22:24:17,839 sats.satellite.EO1             INFO       <2580.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,840 sats.satellite.EO1             INFO       <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.637 (780.0 seconds)
        Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.635 (1380.0 seconds)
        Eclipse: start: 5010.0 end: 1440.0
Charge level: 0.632 (1980.0 seconds)
        Eclipse: start: 4410.0 end: 840.0
Charge level: 0.630 (2580.0 seconds)
        Eclipse: start: 3810.0 end: 240.0
2025-12-03 22:24:17,871 sats.satellite.EO1             INFO       <3180.00> EO1: timed termination at 3180.0 for action_charge
2025-12-03 22:24:17,872 data.base                      INFO       <3180.00> Total reward: {}
2025-12-03 22:24:17,872 comm.communication             INFO       <3180.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,873 sats.satellite.EO1             INFO       <3180.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,880 gym                            INFO       <3180.00> Step reward: 0.0
2025-12-03 22:24:17,880 gym                            INFO       <3180.00> === STARTING STEP ===
2025-12-03 22:24:17,880 sats.satellite.EO1             INFO       <3180.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,882 sats.satellite.EO1             INFO       <3180.00> EO1: setting timed terminal event at 3780.0
2025-12-03 22:24:17,912 sats.satellite.EO1             INFO       <3780.00> EO1: timed termination at 3780.0 for action_charge
2025-12-03 22:24:17,913 data.base                      INFO       <3780.00> Total reward: {}
2025-12-03 22:24:17,914 comm.communication             INFO       <3780.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,914 sats.satellite.EO1             INFO       <3780.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,916 gym                            INFO       <3780.00> Step reward: 0.0
2025-12-03 22:24:17,916 gym                            INFO       <3780.00> === STARTING STEP ===
2025-12-03 22:24:17,916 sats.satellite.EO1             INFO       <3780.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,917 sats.satellite.EO1             INFO       <3780.00> EO1: setting timed terminal event at 4380.0
Charge level: 1.000 (3180.0 seconds)
        Eclipse: start: 3210.0 end: 5310.0
Charge level: 1.000 (3780.0 seconds)
        Eclipse: start: 2610.0 end: 4710.0
2025-12-03 22:24:17,948 sats.satellite.EO1             INFO       <4380.00> EO1: timed termination at 4380.0 for action_charge
2025-12-03 22:24:17,949 data.base                      INFO       <4380.00> Total reward: {}
2025-12-03 22:24:17,950 comm.communication             INFO       <4380.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,950 sats.satellite.EO1             INFO       <4380.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,952 gym                            INFO       <4380.00> Step reward: 0.0
2025-12-03 22:24:17,952 gym                            INFO       <4380.00> === STARTING STEP ===
2025-12-03 22:24:17,953 sats.satellite.EO1             INFO       <4380.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,953 sats.satellite.EO1             INFO       <4380.00> EO1: setting timed terminal event at 4980.0
2025-12-03 22:24:17,984 sats.satellite.EO1             INFO       <4980.00> EO1: timed termination at 4980.0 for action_charge
2025-12-03 22:24:17,985 data.base                      INFO       <4980.00> Total reward: {}
2025-12-03 22:24:17,986 comm.communication             INFO       <4980.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:17,986 sats.satellite.EO1             INFO       <4980.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:17,988 gym                            INFO       <4980.00> Step reward: 0.0
2025-12-03 22:24:17,988 gym                            INFO       <4980.00> === STARTING STEP ===
2025-12-03 22:24:17,988 sats.satellite.EO1             INFO       <4980.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:17,989 sats.satellite.EO1             INFO       <4980.00> EO1: setting timed terminal event at 5580.0
2025-12-03 22:24:18,021 sats.satellite.EO1             INFO       <5580.00> EO1: timed termination at 5580.0 for action_charge
2025-12-03 22:24:18,021 data.base                      INFO       <5580.00> Total reward: {}
2025-12-03 22:24:18,022 comm.communication             INFO       <5580.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:18,022 sats.satellite.EO1             INFO       <5580.00> EO1: Satellite EO1 requires retasking
2025-12-03 22:24:18,024 gym                            INFO       <5580.00> Step reward: 0.0
2025-12-03 22:24:18,024 gym                            INFO       <5580.00> === STARTING STEP ===
2025-12-03 22:24:18,025 sats.satellite.EO1             INFO       <5580.00> EO1: action_charge tasked for 600.0 seconds
2025-12-03 22:24:18,025 sats.satellite.EO1             INFO       <5580.00> EO1: setting timed terminal event at 6180.0
2025-12-03 22:24:18,033 data.base                      INFO       <5700.00> Total reward: {}
2025-12-03 22:24:18,033 comm.communication             INFO       <5700.00> Optimizing data communication between all pairs of satellites
2025-12-03 22:24:18,034 gym                            INFO       <5700.00> Step reward: 0.0
2025-12-03 22:24:18,035 gym                            INFO       <5700.00> Episode terminated: False
2025-12-03 22:24:18,036 gym                            INFO       <5700.00> Episode truncated: True
Charge level: 1.000 (4380.0 seconds)
        Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
        Eclipse: start: 1410.0 end: 3510.0
Charge level: 1.000 (5580.0 seconds)
        Eclipse: start: 810.0 end: 2910.0
Charge level: 1.000 (5700.0 seconds)
        Eclipse: start: 690.0 end: 2790.0
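
The step times in this output follow from the action duration and the time limit. The sketch below is only an illustration of that arithmetic: step_end_times is a hypothetical helper, and it assumes each charge action runs for its full 600 seconds unless the 5700-second limit cuts it short, as the final step in the logs suggests.

```python
def step_end_times(start, duration, time_limit):
    """Hypothetical helper: simulation time at the end of each step until the limit."""
    current = start
    ends = []
    while current < time_limit:
        # A step ends at its timed termination or at the time limit, whichever is first
        current = min(current + duration, time_limit)
        ends.append(current)
    return ends


ends = step_end_times(start=180.0, duration=600.0, time_limit=5700.0)
print(len(ends), ends)
```

Starting from 180 seconds, nine full charge actions end at 5580 seconds and a tenth is cut off at 5700 seconds, which is where the environment reports truncation.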

It is observed that the battery charge decreases while the satellite is in eclipse, but once the satellite leaves eclipse, the battery quickly recharges to full.
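
The eclipse entries in the observation can be used to tell which regime the satellite is in. The sketch below assumes the two values are the times until the next eclipse start and the next eclipse end, measured from the current simulation time; that reading is inferred from the printed values, which each drop by 600 per step, and in_eclipse is a hypothetical helper.

```python
def in_eclipse(time_to_start, time_to_end):
    """Hypothetical check: if the next eclipse end comes first, an eclipse is in progress."""
    return time_to_end < time_to_start


# (simulation time, eclipse start, eclipse end) copied from the printed output above
samples = [
    (780.0, 5610.0, 2040.0),
    (2580.0, 3810.0, 240.0),
    (3180.0, 3210.0, 5310.0),
    (5700.0, 690.0, 2790.0),
]
flags = [in_eclipse(start, end) for _, start, end in samples]
for (sim_time, _, _), flag in zip(samples, flags):
    print(f"{sim_time:7.1f} s: {'eclipse' if flag else 'sunlit'}")
```

With this reading, the samples up to 2580 seconds fall in eclipse, where the charge level slowly drops, and the samples from 3180 seconds onward are sunlit, where the charge level is reported as 1.000.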