Getting Started

This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).

Load Modules

In this tutorial, the environment is created with gym.make, so it is necessary to import gymnasium (as gym) and the bsk_rl package, along with the bsk_rl components used to configure the satellite and environment.

[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw

from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)

If no errors were raised, you have a functional installation of bsk_rl.

Configure the Satellite

Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.

[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel

Based on this class specification, a list of configurable parameters for the satellite can be generated.

[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
 'maxCounterValue': 4,
 'thrMinFireTime': 0.02,
 'desatAttitude': 'sun',
 'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
 'thrForceSign': 1,
 'K': 7.0,
 'Ki': -1,
 'P': 35.0,
 'imageAttErrorRequirement': 0.01,
 'imageRateErrorRequirement': None,
 'inst_pHat_B': [0, 0, 1],
 'utc_init': 'this value will be set by the world model',
 'batteryStorageCapacity': 288000.0,
 'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'disturbance_vector': None,
 'dragCoeff': 2.2,
 'panelArea': 1.0,
 'imageTargetMaximumRange': -1,
 'instrumentBaudRate': 8000000.0,
 'instrumentPowerDraw': -30.0,
 'basePowerDraw': 0.0,
 'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'maxWheelSpeed': inf,
 'u_max': 0.2,
 'rwBasePower': 0.4,
 'rwMechToElecEfficiency': 0.0,
 'rwElecToMechEfficiency': 0.5,
 'panelEfficiency': 0.2,
 'nHat_B': array([ 0,  0, -1]),
 'mass': 330,
 'width': 1.38,
 'depth': 1.04,
 'height': 1.58,
 'sigma_init': <function bsk_rl.sim.dyn.base.DynamicsModel.<lambda>()>,
 'omega_init': <function bsk_rl.sim.dyn.base.DynamicsModel.<lambda>()>,
 'rN': None,
 'vN': None,
 'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
 'mu': 398600436000000.0,
 'min_orbital_radius': 6578136.6,
 'dataStorageCapacity': 160000000.0,
 'storageUnitValidCheck': False,
 'storageInit': 0,
 'thrusterPowerDraw': 0.0,
 'transmitterBaudRate': -8000000.0,
 'transmitterNumBuffers': 100,
 'transmitterPacketSize': None,
 'transmitterPowerDraw': -15.0}

When instantiating a satellite, these parameters can be overridden through the sat_args dictionary. Each entry is either a constant or a function that is called to rerandomize the value every time the environment is reset.

[4]:
sat_args = {}

# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0

# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10

# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
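
The difference between constant and callable entries can be illustrated with a small standalone sketch. This is not BSK-RL's implementation: `resolve_args` is a hypothetical helper written only for this illustration, and it assumes that overrides replace defaults and that any callable value is invoked once per reset to produce a concrete value.

```python
import random


def resolve_args(defaults, overrides):
    """Hypothetical helper: merge overrides into defaults, then call any
    callable values to obtain concrete parameters for one episode."""
    merged = {**defaults, **overrides}
    return {key: (value() if callable(value) else value) for key, value in merged.items()}


# A two-entry stand-in for the much longer default_sat_args() dictionary
defaults = {"imageAttErrorRequirement": 0.01, "storageInit": 0}
overrides = {
    "imageAttErrorRequirement": 0.05,  # constant: identical on every reset
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # redrawn on every reset
}

first_reset = resolve_args(defaults, overrides)
second_reset = resolve_args(defaults, overrides)
print(first_reset)
print(second_reset)
```

Under these assumptions, the constant entry is the same in both dictionaries, while storageInit is drawn independently each time.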

Making the Environment

For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.

[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2026-03-09 19:09:53,009 gym                            INFO       Calling env.reset() to get observation space
2026-03-09 19:09:53,010 gym                            INFO       Resetting environment with seed=2776843430
2026-03-09 19:09:53,092 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2026-03-09 19:09:53,103 gym                            INFO       <0.00> Environment reset
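
As a rough check on the "approximately 1 orbit" comment, the period of a circular two-body orbit can be computed from the default values listed above. The sketch assumes the satellite uses the default semi-major axis of 6871 km from the random_orbit signature and the default mu; it does not use any BSK-RL code.

```python
import math

mu = 398600436000000.0  # gravitational parameter in m^3/s^2 (default 'mu' above)
semi_major_axis = 6871e3  # meters (assumed: the default a=6871 km of random_orbit)

# Kepler's third law for a two-body orbit: T = 2*pi*sqrt(a^3 / mu)
period = 2 * math.pi * math.sqrt(semi_major_axis**3 / mu)
print(f"Orbital period: {period:.0f} s ({period / 60:.1f} min)")
```

The result is close to 5670 seconds, so a time_limit of 5700.0 covers slightly more than one orbit.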

Interacting with the Environment

First, the environment is reset.

[6]:
observation, info = env.reset(seed=1)
2026-03-09 19:09:53,109 gym                            INFO       Resetting environment with seed=1
2026-03-09 19:09:53,168 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2026-03-09 19:09:53,178 gym                            INFO       <0.00> Environment reset

Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude in the nadir-pointing mode and satisfy the imaging conditions. The logs show no reward in the first step and only 24 in the second while the attitude settles; the third step earns the full 60 reward, corresponding to 60 seconds of imaging.

[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print("  Final data level:", observation[0])
2026-03-09 19:09:53,184 gym                            INFO       <0.00> === STARTING STEP ===
2026-03-09 19:09:53,185 sats.satellite.EO1             INFO       <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-03-09 19:09:53,185 sats.satellite.EO1             INFO       <0.00> EO1: setting timed terminal event at 60.0
2026-03-09 19:09:53,190 sats.satellite.EO1             INFO       <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2026-03-09 19:09:53,191 data.base                      INFO       <60.00> Total reward: {}
2026-03-09 19:09:53,191 comm.communication             INFO       <60.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,192 sats.satellite.EO1             INFO       <60.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,193 gym                            INFO       <60.00> Step reward: 0.0
2026-03-09 19:09:53,194 gym                            INFO       <60.00> === STARTING STEP ===
2026-03-09 19:09:53,194 sats.satellite.EO1             INFO       <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-03-09 19:09:53,195 sats.satellite.EO1             INFO       <60.00> EO1: setting timed terminal event at 120.0
2026-03-09 19:09:53,199 sats.satellite.EO1             INFO       <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2026-03-09 19:09:53,200 data.base                      INFO       <120.00> Total reward: {'EO1': 24.0}
2026-03-09 19:09:53,200 comm.communication             INFO       <120.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,201 sats.satellite.EO1             INFO       <120.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,203 gym                            INFO       <120.00> Step reward: 24.0
2026-03-09 19:09:53,203 gym                            INFO       <120.00> === STARTING STEP ===
2026-03-09 19:09:53,204 sats.satellite.EO1             INFO       <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-03-09 19:09:53,204 sats.satellite.EO1             INFO       <120.00> EO1: setting timed terminal event at 180.0
2026-03-09 19:09:53,209 sats.satellite.EO1             INFO       <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2026-03-09 19:09:53,210 data.base                      INFO       <180.00> Total reward: {'EO1': 60.0}
2026-03-09 19:09:53,210 comm.communication             INFO       <180.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,211 sats.satellite.EO1             INFO       <180.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,213 gym                            INFO       <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
  Final data level: 0.6801613078

The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
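
The size of the increase can be checked by hand. The sketch below assumes that data accumulates at instrumentBaudRate for every second of imaging and that each unit of reward corresponds to one second of imaging; the step rewards and data levels are copied from the output above.

```python
instrument_baud_rate = 1e7  # bits per second, set in sat_args
data_storage_capacity = 1e10  # bits, set in sat_args

step_rewards = [0.0, 24.0, 60.0]  # from the log output of the three scan steps
imaging_seconds = sum(step_rewards)  # assumed: 1 reward = 1 second of imaging

expected_increase = imaging_seconds * instrument_baud_rate / data_storage_capacity
observed_increase = 0.6801613078 - 0.5961613078  # final minus initial data level

print(f"Expected increase: {expected_increase:.3f}")
print(f"Observed increase: {observed_increase:.3f}")
```

Both values come out to 0.084, so 84 seconds of imaging account for the change in storage_level_fraction.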

Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.

[8]:
while not (terminated or truncated):
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(f"Charge level: {observation[1]:.3f} ({env.unwrapped.simulator.sim_time:.1f} seconds)\n\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2026-03-09 19:09:53,218 gym                            INFO       <180.00> === STARTING STEP ===
2026-03-09 19:09:53,219 sats.satellite.EO1             INFO       <180.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,220 sats.satellite.EO1             INFO       <180.00> EO1: setting timed terminal event at 780.0
2026-03-09 19:09:53,253 sats.satellite.EO1             INFO       <780.00> EO1: timed termination at 780.0 for action_charge
2026-03-09 19:09:53,254 data.base                      INFO       <780.00> Total reward: {}
2026-03-09 19:09:53,255 comm.communication             INFO       <780.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,255 sats.satellite.EO1             INFO       <780.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,260 gym                            INFO       <780.00> Step reward: 0.0
2026-03-09 19:09:53,261 gym                            INFO       <780.00> === STARTING STEP ===
2026-03-09 19:09:53,261 sats.satellite.EO1             INFO       <780.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,261 sats.satellite.EO1             INFO       <780.00> EO1: setting timed terminal event at 1380.0
2026-03-09 19:09:53,294 sats.satellite.EO1             INFO       <1380.00> EO1: timed termination at 1380.0 for action_charge
2026-03-09 19:09:53,295 data.base                      INFO       <1380.00> Total reward: {}
2026-03-09 19:09:53,295 comm.communication             INFO       <1380.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,296 sats.satellite.EO1             INFO       <1380.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,297 gym                            INFO       <1380.00> Step reward: 0.0
2026-03-09 19:09:53,298 gym                            INFO       <1380.00> === STARTING STEP ===
2026-03-09 19:09:53,298 sats.satellite.EO1             INFO       <1380.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,299 sats.satellite.EO1             INFO       <1380.00> EO1: setting timed terminal event at 1980.0
2026-03-09 19:09:53,331 sats.satellite.EO1             INFO       <1980.00> EO1: timed termination at 1980.0 for action_charge
2026-03-09 19:09:53,331 data.base                      INFO       <1980.00> Total reward: {}
2026-03-09 19:09:53,332 comm.communication             INFO       <1980.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,332 sats.satellite.EO1             INFO       <1980.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,334 gym                            INFO       <1980.00> Step reward: 0.0
2026-03-09 19:09:53,335 gym                            INFO       <1980.00> === STARTING STEP ===
2026-03-09 19:09:53,335 sats.satellite.EO1             INFO       <1980.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,336 sats.satellite.EO1             INFO       <1980.00> EO1: setting timed terminal event at 2580.0
2026-03-09 19:09:53,368 sats.satellite.EO1             INFO       <2580.00> EO1: timed termination at 2580.0 for action_charge
2026-03-09 19:09:53,369 data.base                      INFO       <2580.00> Total reward: {}
2026-03-09 19:09:53,369 comm.communication             INFO       <2580.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,370 sats.satellite.EO1             INFO       <2580.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,371 gym                            INFO       <2580.00> Step reward: 0.0
2026-03-09 19:09:53,372 gym                            INFO       <2580.00> === STARTING STEP ===
2026-03-09 19:09:53,373 sats.satellite.EO1             INFO       <2580.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,373 sats.satellite.EO1             INFO       <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.637 (780.0 seconds)
        Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.634 (1380.0 seconds)
        Eclipse: start: 5010.0 end: 1440.0
Charge level: 0.632 (1980.0 seconds)
        Eclipse: start: 4410.0 end: 840.0
Charge level: 0.629 (2580.0 seconds)
        Eclipse: start: 3810.0 end: 240.0
2026-03-09 19:09:53,407 sats.satellite.EO1             INFO       <3180.00> EO1: timed termination at 3180.0 for action_charge
2026-03-09 19:09:53,407 data.base                      INFO       <3180.00> Total reward: {}
2026-03-09 19:09:53,408 comm.communication             INFO       <3180.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,409 sats.satellite.EO1             INFO       <3180.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,416 gym                            INFO       <3180.00> Step reward: 0.0
2026-03-09 19:09:53,417 gym                            INFO       <3180.00> === STARTING STEP ===
2026-03-09 19:09:53,418 sats.satellite.EO1             INFO       <3180.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,418 sats.satellite.EO1             INFO       <3180.00> EO1: setting timed terminal event at 3780.0
2026-03-09 19:09:53,450 sats.satellite.EO1             INFO       <3780.00> EO1: timed termination at 3780.0 for action_charge
2026-03-09 19:09:53,450 data.base                      INFO       <3780.00> Total reward: {}
2026-03-09 19:09:53,451 comm.communication             INFO       <3780.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,451 sats.satellite.EO1             INFO       <3780.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,453 gym                            INFO       <3780.00> Step reward: 0.0
2026-03-09 19:09:53,453 gym                            INFO       <3780.00> === STARTING STEP ===
2026-03-09 19:09:53,454 sats.satellite.EO1             INFO       <3780.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,454 sats.satellite.EO1             INFO       <3780.00> EO1: setting timed terminal event at 4380.0
Charge level: 1.000 (3180.0 seconds)
        Eclipse: start: 3210.0 end: 5310.0
Charge level: 1.000 (3780.0 seconds)
        Eclipse: start: 2610.0 end: 4710.0
2026-03-09 19:09:53,487 sats.satellite.EO1             INFO       <4380.00> EO1: timed termination at 4380.0 for action_charge
2026-03-09 19:09:53,487 data.base                      INFO       <4380.00> Total reward: {}
2026-03-09 19:09:53,488 comm.communication             INFO       <4380.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,488 sats.satellite.EO1             INFO       <4380.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,490 gym                            INFO       <4380.00> Step reward: 0.0
2026-03-09 19:09:53,491 gym                            INFO       <4380.00> === STARTING STEP ===
2026-03-09 19:09:53,492 sats.satellite.EO1             INFO       <4380.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,492 sats.satellite.EO1             INFO       <4380.00> EO1: setting timed terminal event at 4980.0
2026-03-09 19:09:53,524 sats.satellite.EO1             INFO       <4980.00> EO1: timed termination at 4980.0 for action_charge
2026-03-09 19:09:53,524 data.base                      INFO       <4980.00> Total reward: {}
2026-03-09 19:09:53,525 comm.communication             INFO       <4980.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,525 sats.satellite.EO1             INFO       <4980.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,527 gym                            INFO       <4980.00> Step reward: 0.0
2026-03-09 19:09:53,527 gym                            INFO       <4980.00> === STARTING STEP ===
2026-03-09 19:09:53,528 sats.satellite.EO1             INFO       <4980.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,528 sats.satellite.EO1             INFO       <4980.00> EO1: setting timed terminal event at 5580.0
2026-03-09 19:09:53,560 sats.satellite.EO1             INFO       <5580.00> EO1: timed termination at 5580.0 for action_charge
2026-03-09 19:09:53,561 data.base                      INFO       <5580.00> Total reward: {}
2026-03-09 19:09:53,561 comm.communication             INFO       <5580.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,562 sats.satellite.EO1             INFO       <5580.00> EO1: Satellite EO1 requires retasking
2026-03-09 19:09:53,563 gym                            INFO       <5580.00> Step reward: 0.0
2026-03-09 19:09:53,564 gym                            INFO       <5580.00> === STARTING STEP ===
2026-03-09 19:09:53,564 sats.satellite.EO1             INFO       <5580.00> EO1: action_charge tasked for 600.0 seconds
2026-03-09 19:09:53,565 sats.satellite.EO1             INFO       <5580.00> EO1: setting timed terminal event at 6180.0
2026-03-09 19:09:53,573 data.base                      INFO       <5700.00> Total reward: {}
2026-03-09 19:09:53,573 comm.communication             INFO       <5700.00> Optimizing data communication between all pairs of satellites
2026-03-09 19:09:53,575 gym                            INFO       <5700.00> Step reward: 0.0
2026-03-09 19:09:53,575 gym                            INFO       <5700.00> Episode terminated: False
2026-03-09 19:09:53,576 gym                            INFO       <5700.00> Episode truncated: True
Charge level: 1.000 (4380.0 seconds)
        Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
        Eclipse: start: 1410.0 end: 3510.0
Charge level: 1.000 (5580.0 seconds)
        Eclipse: start: 810.0 end: 2910.0
Charge level: 1.000 (5700.0 seconds)
        Eclipse: start: 690.0 end: 2790.0

The battery charge decreases while the satellite is in eclipse; once the satellite exits eclipse, the battery quickly recharges to full.
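
The eclipse values printed above can be used to tell whether the satellite is currently in eclipse. The sketch assumes, based on how the printed values decrease by 600 between 600-second steps, that the two eclipse observations are the times remaining until the next eclipse start and the next eclipse end. The `in_eclipse` helper is hypothetical and not part of BSK-RL.

```python
def in_eclipse(time_to_start, time_to_end):
    """Hypothetical helper: if the next eclipse end comes before the next
    eclipse start, the satellite must currently be inside an eclipse."""
    return time_to_end < time_to_start


# (simulation time, eclipse start, eclipse end) copied from the output above
samples = [
    (780.0, 5610.0, 2040.0),
    (2580.0, 3810.0, 240.0),
    (3180.0, 3210.0, 5310.0),
]

results = []
for sim_time, start, end in samples:
    eclipsed = in_eclipse(start, end)
    # Absolute time of the next transition into or out of eclipse
    next_change = sim_time + (end if eclipsed else start)
    results.append((eclipsed, next_change))
    state = "in eclipse" if eclipsed else "in sunlight"
    print(f"t={sim_time:.0f} s: {state}, next transition at {next_change:.0f} s")
```

Under this reading, the satellite is in eclipse at 780 and 2580 seconds and leaves it at 2820 seconds, which matches the charge level dropping until 2580 seconds and reaching 1.000 by 3180 seconds.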