Getting Started

This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).

Load Modules

In this tutorial, the environment is created with gym.make, so gymnasium (imported as gym) is needed along with the bsk_rl package and the bsk_rl components used to configure the satellite, scenario, and rewarder.

[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw

from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)

If no errors were raised, you have a functional installation of bsk_rl.

Configure the Satellite

Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.

[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel

Based on this class specification, a list of configurable parameters for the satellite can be generated.

[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
 'maxCounterValue': 4,
 'thrMinFireTime': 0.02,
 'desatAttitude': 'sun',
 'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
 'thrForceSign': 1,
 'K': 7.0,
 'Ki': -1,
 'P': 35.0,
 'imageAttErrorRequirement': 0.01,
 'imageRateErrorRequirement': None,
 'inst_pHat_B': [0, 0, 1],
 'utc_init': 'this value will be set by the world model',
 'batteryStorageCapacity': 288000.0,
 'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'disturbance_vector': None,
 'dragCoeff': 2.2,
 'panelArea': 1.0,
 'imageTargetMaximumRange': -1,
 'instrumentBaudRate': 8000000.0,
 'instrumentPowerDraw': -30.0,
 'basePowerDraw': 0.0,
 'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'maxWheelSpeed': inf,
 'u_max': 0.2,
 'rwBasePower': 0.4,
 'rwMechToElecEfficiency': 0.0,
 'rwElecToMechEfficiency': 0.5,
 'panelEfficiency': 0.2,
 'nHat_B': array([ 0,  0, -1]),
 'mass': 330,
 'width': 1.38,
 'depth': 1.04,
 'height': 1.58,
 'sigma_init': <function bsk_rl.sim.dyn.base.DynamicsModel.<lambda>()>,
 'omega_init': <function bsk_rl.sim.dyn.base.DynamicsModel.<lambda>()>,
 'rN': None,
 'vN': None,
 'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
 'mu': 398600436000000.0,
 'min_orbital_radius': 6578136.6,
 'dataStorageCapacity': 160000000.0,
 'storageUnitValidCheck': False,
 'storageInit': 0,
 'thrusterPowerDraw': 0.0,
 'transmitterBaudRate': -8000000.0,
 'transmitterNumBuffers': 100,
 'transmitterPacketSize': None,
 'transmitterPowerDraw': -15.0}

When instantiating a satellite, these parameters can be overridden through the sat_args dictionary. Each entry is either a constant or a function that is called to draw a new value every time the environment is reset.

[4]:
sat_args = {}

# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0

# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10

# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
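
To illustrate the difference between constant and randomized entries, the following sketch mimics how such a dictionary could be resolved at each reset: callables are invoked to draw a fresh value, and everything else is passed through unchanged. The helper resolve_sat_args is hypothetical and written only for illustration; it is not BSK-RL's internal implementation, and it uses the standard-library random module in place of NumPy to stay self-contained.

```python
import random


def resolve_sat_args(sat_args):
    """Return a copy of sat_args with every callable replaced by its result.

    Hypothetical helper for illustration only; not part of bsk_rl.
    """
    return {
        key: value() if callable(value) else value
        for key, value in sat_args.items()
    }


example_args = {
    "dataStorageCapacity": 1e10,  # constant: identical on every reset
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # redrawn each time
}

# Each call stands in for one environment reset.
first_reset = resolve_sat_args(example_args)
second_reset = resolve_sat_args(example_args)

print(first_reset["dataStorageCapacity"] == second_reset["dataStorageCapacity"])
print(first_reset["storageInit"], second_reset["storageInit"])
```

The constant entry is the same in both resolved dictionaries, while the storageInit entry is drawn independently each time and will generally differ.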

Making the Environment

For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.

[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2026-01-05 18:32:31,670 gym                            INFO       Calling env.reset() to get observation space
2026-01-05 18:32:31,670 gym                            INFO       Resetting environment with seed=148424196
2026-01-05 18:32:31,760 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2026-01-05 18:32:31,771 gym                            INFO       <0.00> Environment reset
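
As a rough check on the "approximately 1 orbit" comment, the two-body orbital period can be computed from values that appear in the default arguments listed earlier: the gravitational parameter mu and the default semi-major axis a = 6871 km in the random_orbit signature. This sketch assumes a circular Keplerian orbit at that default semi-major axis and ignores perturbations.

```python
import math

mu = 398600436000000.0  # gravitational parameter [m^3/s^2], from the default sat_args
a = 6871e3  # default semi-major axis in random_orbit, converted to [m]

# Kepler's third law for a two-body orbit
period_s = 2 * math.pi * math.sqrt(a**3 / mu)
print(f"Orbital period: {period_s:.0f} s ({period_s / 60:.1f} min)")
```

The result is roughly 5670 s, slightly shorter than the 5700 s time limit.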

Interacting with the Environment

First, the environment is reset.

[6]:
observation, info = env.reset(seed=1)
2026-01-05 18:32:31,833 gym                            INFO       Resetting environment with seed=1
2026-01-05 18:32:31,967 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2026-01-05 18:32:31,976 gym                            INFO       <0.00> Environment reset

Next, we take the scan action (action=0) three times. This gives the satellite time to settle its attitude in the nadir-pointing mode so that the imaging conditions are satisfied. The logs show no reward in the first step and only 18 in the second while the attitude settles; the third step earns the full reward of 60, corresponding to 60 seconds of imaging.

[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print("  Final data level:", observation[0])
2026-01-05 18:32:31,982 gym                            INFO       <0.00> === STARTING STEP ===
2026-01-05 18:32:31,983 sats.satellite.EO1             INFO       <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-01-05 18:32:31,983 sats.satellite.EO1             INFO       <0.00> EO1: setting timed terminal event at 60.0
2026-01-05 18:32:31,989 sats.satellite.EO1             INFO       <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2026-01-05 18:32:31,989 data.base                      INFO       <60.00> Total reward: {}
2026-01-05 18:32:31,990 comm.communication             INFO       <60.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:31,990 sats.satellite.EO1             INFO       <60.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:31,992 gym                            INFO       <60.00> Step reward: 0.0
2026-01-05 18:32:31,992 gym                            INFO       <60.00> === STARTING STEP ===
2026-01-05 18:32:31,993 sats.satellite.EO1             INFO       <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-01-05 18:32:31,993 sats.satellite.EO1             INFO       <60.00> EO1: setting timed terminal event at 120.0
2026-01-05 18:32:31,999 sats.satellite.EO1             INFO       <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2026-01-05 18:32:31,999 data.base                      INFO       <120.00> Total reward: {'EO1': 18.0}
2026-01-05 18:32:32,000 comm.communication             INFO       <120.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,001 sats.satellite.EO1             INFO       <120.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,002 gym                            INFO       <120.00> Step reward: 18.0
2026-01-05 18:32:32,003 gym                            INFO       <120.00> === STARTING STEP ===
2026-01-05 18:32:32,004 sats.satellite.EO1             INFO       <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-01-05 18:32:32,004 sats.satellite.EO1             INFO       <120.00> EO1: setting timed terminal event at 180.0
2026-01-05 18:32:32,009 sats.satellite.EO1             INFO       <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2026-01-05 18:32:32,009 data.base                      INFO       <180.00> Total reward: {'EO1': 60.0}
2026-01-05 18:32:32,010 comm.communication             INFO       <180.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,010 sats.satellite.EO1             INFO       <180.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,012 gym                            INFO       <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
  Final data level: 0.6741613078

The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
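
The size of the increase can be cross-checked by hand. Assuming data accumulates at instrumentBaudRate while imaging and that each unit of reward corresponds to one second of imaging (as the logs above suggest), the 18 + 60 = 78 seconds of imaging should raise the storage fraction by 78 × 1e7 / 1e10 = 0.078. The variable names below are chosen for this sketch; the numbers are copied from sat_args and the output above.

```python
instrument_baud_rate = 1e7  # sat_args["instrumentBaudRate"]
data_storage_capacity = 1e10  # sat_args["dataStorageCapacity"]
step_rewards = [0.0, 18.0, 60.0]  # step rewards reported in the logs

# Assumption: one unit of reward equals one second of imaging.
imaging_seconds = sum(step_rewards)
expected_increase = imaging_seconds * instrument_baud_rate / data_storage_capacity

initial_level = 0.5961613078  # printed before the scan steps
final_level = 0.6741613078  # printed after the scan steps
observed_increase = final_level - initial_level

print(f"Expected increase: {expected_increase:.3f}")
print(f"Observed increase: {observed_increase:.3f}")
```

Both values come out to 0.078, so the observed change in storage_level_fraction matches the imaging time reported by the rewarder.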

Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.

[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(
        f"Charge level: {observation[1]:.3f} "
        f"({env.unwrapped.simulator.sim_time:.1f} seconds)\n"
        f"\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}"
    )
2026-01-05 18:32:32,018 gym                            INFO       <180.00> === STARTING STEP ===
2026-01-05 18:32:32,019 sats.satellite.EO1             INFO       <180.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,019 sats.satellite.EO1             INFO       <180.00> EO1: setting timed terminal event at 780.0
2026-01-05 18:32:32,053 sats.satellite.EO1             INFO       <780.00> EO1: timed termination at 780.0 for action_charge
2026-01-05 18:32:32,054 data.base                      INFO       <780.00> Total reward: {}
2026-01-05 18:32:32,055 comm.communication             INFO       <780.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,055 sats.satellite.EO1             INFO       <780.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,059 gym                            INFO       <780.00> Step reward: 0.0
2026-01-05 18:32:32,060 gym                            INFO       <780.00> === STARTING STEP ===
2026-01-05 18:32:32,060 sats.satellite.EO1             INFO       <780.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,061 sats.satellite.EO1             INFO       <780.00> EO1: setting timed terminal event at 1380.0
2026-01-05 18:32:32,094 sats.satellite.EO1             INFO       <1380.00> EO1: timed termination at 1380.0 for action_charge
2026-01-05 18:32:32,095 data.base                      INFO       <1380.00> Total reward: {}
2026-01-05 18:32:32,096 comm.communication             INFO       <1380.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,096 sats.satellite.EO1             INFO       <1380.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,098 gym                            INFO       <1380.00> Step reward: 0.0
2026-01-05 18:32:32,099 gym                            INFO       <1380.00> === STARTING STEP ===
2026-01-05 18:32:32,099 sats.satellite.EO1             INFO       <1380.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,099 sats.satellite.EO1             INFO       <1380.00> EO1: setting timed terminal event at 1980.0
2026-01-05 18:32:32,132 sats.satellite.EO1             INFO       <1980.00> EO1: timed termination at 1980.0 for action_charge
2026-01-05 18:32:32,133 data.base                      INFO       <1980.00> Total reward: {}
2026-01-05 18:32:32,134 comm.communication             INFO       <1980.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,134 sats.satellite.EO1             INFO       <1980.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,135 gym                            INFO       <1980.00> Step reward: 0.0
2026-01-05 18:32:32,136 gym                            INFO       <1980.00> === STARTING STEP ===
2026-01-05 18:32:32,137 sats.satellite.EO1             INFO       <1980.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,137 sats.satellite.EO1             INFO       <1980.00> EO1: setting timed terminal event at 2580.0
2026-01-05 18:32:32,170 sats.satellite.EO1             INFO       <2580.00> EO1: timed termination at 2580.0 for action_charge
2026-01-05 18:32:32,170 data.base                      INFO       <2580.00> Total reward: {}
2026-01-05 18:32:32,171 comm.communication             INFO       <2580.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,171 sats.satellite.EO1             INFO       <2580.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,173 gym                            INFO       <2580.00> Step reward: 0.0
2026-01-05 18:32:32,173 gym                            INFO       <2580.00> === STARTING STEP ===
2026-01-05 18:32:32,174 sats.satellite.EO1             INFO       <2580.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,174 sats.satellite.EO1             INFO       <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.637 (780.0 seconds)
        Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.635 (1380.0 seconds)
        Eclipse: start: 5010.0 end: 1440.0
Charge level: 0.632 (1980.0 seconds)
        Eclipse: start: 4410.0 end: 840.0
Charge level: 0.630 (2580.0 seconds)
        Eclipse: start: 3810.0 end: 240.0
2026-01-05 18:32:32,207 sats.satellite.EO1             INFO       <3180.00> EO1: timed termination at 3180.0 for action_charge
2026-01-05 18:32:32,208 data.base                      INFO       <3180.00> Total reward: {}
2026-01-05 18:32:32,209 comm.communication             INFO       <3180.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,209 sats.satellite.EO1             INFO       <3180.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,216 gym                            INFO       <3180.00> Step reward: 0.0
2026-01-05 18:32:32,216 gym                            INFO       <3180.00> === STARTING STEP ===
2026-01-05 18:32:32,217 sats.satellite.EO1             INFO       <3180.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,218 sats.satellite.EO1             INFO       <3180.00> EO1: setting timed terminal event at 3780.0
2026-01-05 18:32:32,250 sats.satellite.EO1             INFO       <3780.00> EO1: timed termination at 3780.0 for action_charge
2026-01-05 18:32:32,250 data.base                      INFO       <3780.00> Total reward: {}
2026-01-05 18:32:32,251 comm.communication             INFO       <3780.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,251 sats.satellite.EO1             INFO       <3780.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,253 gym                            INFO       <3780.00> Step reward: 0.0
2026-01-05 18:32:32,253 gym                            INFO       <3780.00> === STARTING STEP ===
2026-01-05 18:32:32,254 sats.satellite.EO1             INFO       <3780.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,254 sats.satellite.EO1             INFO       <3780.00> EO1: setting timed terminal event at 4380.0
Charge level: 1.000 (3180.0 seconds)
        Eclipse: start: 3210.0 end: 5310.0
Charge level: 1.000 (3780.0 seconds)
        Eclipse: start: 2610.0 end: 4710.0
2026-01-05 18:32:32,288 sats.satellite.EO1             INFO       <4380.00> EO1: timed termination at 4380.0 for action_charge
2026-01-05 18:32:32,288 data.base                      INFO       <4380.00> Total reward: {}
2026-01-05 18:32:32,289 comm.communication             INFO       <4380.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,289 sats.satellite.EO1             INFO       <4380.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,291 gym                            INFO       <4380.00> Step reward: 0.0
2026-01-05 18:32:32,291 gym                            INFO       <4380.00> === STARTING STEP ===
2026-01-05 18:32:32,292 sats.satellite.EO1             INFO       <4380.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,292 sats.satellite.EO1             INFO       <4380.00> EO1: setting timed terminal event at 4980.0
2026-01-05 18:32:32,324 sats.satellite.EO1             INFO       <4980.00> EO1: timed termination at 4980.0 for action_charge
2026-01-05 18:32:32,325 data.base                      INFO       <4980.00> Total reward: {}
2026-01-05 18:32:32,326 comm.communication             INFO       <4980.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,326 sats.satellite.EO1             INFO       <4980.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,327 gym                            INFO       <4980.00> Step reward: 0.0
2026-01-05 18:32:32,328 gym                            INFO       <4980.00> === STARTING STEP ===
2026-01-05 18:32:32,328 sats.satellite.EO1             INFO       <4980.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,329 sats.satellite.EO1             INFO       <4980.00> EO1: setting timed terminal event at 5580.0
2026-01-05 18:32:32,362 sats.satellite.EO1             INFO       <5580.00> EO1: timed termination at 5580.0 for action_charge
2026-01-05 18:32:32,363 data.base                      INFO       <5580.00> Total reward: {}
2026-01-05 18:32:32,363 comm.communication             INFO       <5580.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,364 sats.satellite.EO1             INFO       <5580.00> EO1: Satellite EO1 requires retasking
2026-01-05 18:32:32,366 gym                            INFO       <5580.00> Step reward: 0.0
2026-01-05 18:32:32,366 gym                            INFO       <5580.00> === STARTING STEP ===
2026-01-05 18:32:32,367 sats.satellite.EO1             INFO       <5580.00> EO1: action_charge tasked for 600.0 seconds
2026-01-05 18:32:32,368 sats.satellite.EO1             INFO       <5580.00> EO1: setting timed terminal event at 6180.0
2026-01-05 18:32:32,375 data.base                      INFO       <5700.00> Total reward: {}
2026-01-05 18:32:32,376 comm.communication             INFO       <5700.00> Optimizing data communication between all pairs of satellites
2026-01-05 18:32:32,377 gym                            INFO       <5700.00> Step reward: 0.0
2026-01-05 18:32:32,377 gym                            INFO       <5700.00> Episode terminated: False
2026-01-05 18:32:32,378 gym                            INFO       <5700.00> Episode truncated: True
Charge level: 1.000 (4380.0 seconds)
        Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
        Eclipse: start: 1410.0 end: 3510.0
Charge level: 1.000 (5580.0 seconds)
        Eclipse: start: 810.0 end: 2910.0
Charge level: 1.000 (5700.0 seconds)
        Eclipse: start: 690.0 end: 2790.0

The battery charge slowly decreases while the satellite is in eclipse; once the satellite exits eclipse, the battery quickly returns to full charge.
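
The eclipse observation can be used to confirm this. The printed values are consistent with both elements being times until the next eclipse start and the next eclipse end, measured from the current simulation time. Under that assumption, the satellite is in eclipse whenever the next end comes before the next start. The helper in_eclipse below is a hypothetical illustration applied to values copied from the output above; it is not part of bsk_rl.

```python
def in_eclipse(time_to_start, time_to_end):
    """Return True if the next eclipse end precedes the next eclipse start.

    Hypothetical helper; assumes both inputs are times until the next event.
    """
    return time_to_end < time_to_start


# (sim_time, charge, time_to_start, time_to_end) copied from the printed output
samples = [
    (780.0, 0.637, 5610.0, 2040.0),
    (2580.0, 0.630, 3810.0, 240.0),
    (3180.0, 1.000, 3210.0, 5310.0),
]

for sim_time, charge, start, end in samples:
    state = "eclipse" if in_eclipse(start, end) else "sunlit"
    print(f"t={sim_time:6.0f} s  charge={charge:.3f}  {state}")
```

Under this reading, the first two samples fall in eclipse, with the exit at about 780 + 2040 = 2580 + 240 = 2820 s, and the third is in sunlight. That matches the jump to full charge between 2580 s and 3180 s.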