Getting Started

This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).

Load Modules

In this tutorial, the environment will be created with gym.make, so it is necessary to import the top-level bsk_rl module as well as gym and bsk_rl components.

[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw

from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)

If no errors were raised, you have a functional installation of bsk_rl.

Configure the Satellite

Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.

[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel
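
The later cells index the observation by position (observation[0] through observation[3]). The following is a minimal sketch of that layout, assuming the observation is flattened in the order given by observation_spec and that obs.Eclipse contributes two elements. The helper label_observation and the sample values are hypothetical and are not part of bsk_rl.

```python
# Hypothetical helper (not part of bsk_rl): name the elements of the flat
# observation vector, assuming they follow the order of observation_spec.
OBSERVATION_LABELS = [
    "storage_level_fraction",   # obs.SatProperties, first prop
    "battery_charge_fraction",  # obs.SatProperties, second prop
    "eclipse_start",            # obs.Eclipse, first element
    "eclipse_end",              # obs.Eclipse, second element
]


def label_observation(observation):
    """Return a dict mapping each assumed label to its observation value."""
    if len(observation) != len(OBSERVATION_LABELS):
        raise ValueError("observation length does not match the assumed layout")
    return dict(zip(OBSERVATION_LABELS, observation))


# Illustrative values only; a real observation comes from env.reset() or env.step().
example = label_observation([0.5, 0.64, 5610.0, 2040.0])
print(example["battery_charge_fraction"])
```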

Based on this class specification, a list of configurable parameters for the satellite can be generated.

[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
 'maxCounterValue': 4,
 'thrMinFireTime': 0.02,
 'desatAttitude': 'sun',
 'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
 'thrForceSign': 1,
 'K': 7.0,
 'Ki': -1,
 'P': 35.0,
 'imageAttErrorRequirement': 0.01,
 'imageRateErrorRequirement': None,
 'inst_pHat_B': [0, 0, 1],
 'utc_init': 'this value will be set by the world model',
 'batteryStorageCapacity': 288000.0,
 'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'disturbance_vector': None,
 'dragCoeff': 2.2,
 'panelArea': 1.0,
 'imageTargetMaximumRange': -1,
 'instrumentBaudRate': 8000000.0,
 'instrumentPowerDraw': -30.0,
 'basePowerDraw': 0.0,
 'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'maxWheelSpeed': inf,
 'u_max': 0.2,
 'rwBasePower': 0.4,
 'rwMechToElecEfficiency': 0.0,
 'rwElecToMechEfficiency': 0.5,
 'panelEfficiency': 0.2,
 'nHat_B': array([ 0,  0, -1]),
 'mass': 330,
 'width': 1.38,
 'depth': 1.04,
 'height': 1.58,
 'sigma_init': <function bsk_rl.sim.dyn.base.DynamicsModel.<lambda>()>,
 'omega_init': <function bsk_rl.sim.dyn.base.DynamicsModel.<lambda>()>,
 'rN': None,
 'vN': None,
 'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
 'mu': 398600436000000.0,
 'min_orbital_radius': 6578136.6,
 'dataStorageCapacity': 160000000.0,
 'storageUnitValidCheck': False,
 'storageInit': 0,
 'thrusterPowerDraw': 0.0,
 'transmitterBaudRate': -8000000.0,
 'transmitterNumBuffers': 100,
 'transmitterPacketSize': None,
 'transmitterPowerDraw': -15.0}

When instantiating a satellite, these parameters can be overridden through the sat_args dictionary, either with a constant or with a function that is called to rerandomize the value every time the environment is reset.

[4]:
sat_args = {}

# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0

# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10

# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
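
To illustrate the difference between constant and randomized entries, here is a small sketch of how a dictionary like sat_args could be resolved at each reset: constants are passed through and callables are called for a fresh value. The function resolve_sat_args is hypothetical and only mimics the behavior described above; it is not bsk_rl's internal implementation.

```python
import random


def resolve_sat_args(sat_args):
    """Return a copy of sat_args with each callable replaced by its return value."""
    return {
        key: (value() if callable(value) else value)
        for key, value in sat_args.items()
    }


sat_args = {
    "dataStorageCapacity": 1e10,  # constant: identical on every reset
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # redrawn per reset
}

first_reset = resolve_sat_args(sat_args)
second_reset = resolve_sat_args(sat_args)

print(first_reset["dataStorageCapacity"] == second_reset["dataStorageCapacity"])
print(first_reset["storageInit"], second_reset["storageInit"])
```

The original dictionary keeps its callables, so it can be resolved again on the next reset.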

Making the Environment

For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.

[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2026-02-25 01:08:37,407 gym                            INFO       Calling env.reset() to get observation space
2026-02-25 01:08:37,407 gym                            INFO       Resetting environment with seed=3465948439
2026-02-25 01:08:37,490 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2026-02-25 01:08:37,501 gym                            INFO       <0.00> Environment reset

Interacting with the Environment

First, the environment is reset.

[6]:
observation, info = env.reset(seed=1)
2026-02-25 01:08:37,507 gym                            INFO       Resetting environment with seed=1
2026-02-25 01:08:37,563 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2026-02-25 01:08:37,572 gym                            INFO       <0.00> Environment reset

Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude in the nadir-pointing mode and satisfy the imaging conditions. The logs show no reward in the first step and only a partial reward of 24 in the second step while the attitude settles, but the full reward of 60 (corresponding to 60 seconds of imaging) is achieved by the third step.

[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print("  Final data level:", observation[0])
2026-02-25 01:08:37,577 gym                            INFO       <0.00> === STARTING STEP ===
2026-02-25 01:08:37,578 sats.satellite.EO1             INFO       <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-02-25 01:08:37,578 sats.satellite.EO1             INFO       <0.00> EO1: setting timed terminal event at 60.0
2026-02-25 01:08:37,583 sats.satellite.EO1             INFO       <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2026-02-25 01:08:37,584 data.base                      INFO       <60.00> Total reward: {}
2026-02-25 01:08:37,584 comm.communication             INFO       <60.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,585 sats.satellite.EO1             INFO       <60.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,586 gym                            INFO       <60.00> Step reward: 0.0
2026-02-25 01:08:37,587 gym                            INFO       <60.00> === STARTING STEP ===
2026-02-25 01:08:37,587 sats.satellite.EO1             INFO       <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-02-25 01:08:37,587 sats.satellite.EO1             INFO       <60.00> EO1: setting timed terminal event at 120.0
2026-02-25 01:08:37,592 sats.satellite.EO1             INFO       <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2026-02-25 01:08:37,593 data.base                      INFO       <120.00> Total reward: {'EO1': 24.0}
2026-02-25 01:08:37,593 comm.communication             INFO       <120.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,594 sats.satellite.EO1             INFO       <120.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,595 gym                            INFO       <120.00> Step reward: 24.0
2026-02-25 01:08:37,596 gym                            INFO       <120.00> === STARTING STEP ===
2026-02-25 01:08:37,596 sats.satellite.EO1             INFO       <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2026-02-25 01:08:37,597 sats.satellite.EO1             INFO       <120.00> EO1: setting timed terminal event at 180.0
2026-02-25 01:08:37,601 sats.satellite.EO1             INFO       <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2026-02-25 01:08:37,602 data.base                      INFO       <180.00> Total reward: {'EO1': 60.0}
2026-02-25 01:08:37,602 comm.communication             INFO       <180.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,603 sats.satellite.EO1             INFO       <180.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,604 gym                            INFO       <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
  Final data level: 0.6801613078

The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
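
As a consistency check, the change in storage level can be compared against the step rewards in the log. The sketch below assumes that the reward counts seconds of imaging and that the instrument fills storage at instrumentBaudRate while imaging, with the baud rate and the storage capacity expressed in the same data unit. All numbers are copied from the sat_args cell and the output above.

```python
# Values copied from the tutorial output and the sat_args cell.
initial_fraction = 0.5961613078
final_fraction = 0.6801613078
data_storage_capacity = 1e10      # sat_args["dataStorageCapacity"]
instrument_baud_rate = 1e7        # sat_args["instrumentBaudRate"]
step_rewards = [0.0, 24.0, 60.0]  # per-step rewards from the log

# Assumption: each unit of reward is one second of imaging.
imaging_seconds = sum(step_rewards)

# Assumption: storage fills at the instrument baud rate while imaging.
expected_increase = imaging_seconds * instrument_baud_rate / data_storage_capacity
observed_increase = final_fraction - initial_fraction

print(f"expected: {expected_increase:.3f}, observed: {observed_increase:.3f}")
# prints "expected: 0.084, observed: 0.084"
```

Under these assumptions, 84 seconds of imaging accounts for the full 0.084 increase in storage_level_fraction.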

Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.

[8]:
# Stop on truncation (time limit) or termination (e.g. a satellite failure)
while not (terminated or truncated):
    observation, reward, terminated, truncated, info = env.step(action=1)
    sim_time = env.unwrapped.simulator.sim_time
    print(f"Charge level: {observation[1]:.3f} ({sim_time:.1f} seconds)")
    print(f"\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2026-02-25 01:08:37,610 gym                            INFO       <180.00> === STARTING STEP ===
2026-02-25 01:08:37,610 sats.satellite.EO1             INFO       <180.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,611 sats.satellite.EO1             INFO       <180.00> EO1: setting timed terminal event at 780.0
2026-02-25 01:08:37,643 sats.satellite.EO1             INFO       <780.00> EO1: timed termination at 780.0 for action_charge
2026-02-25 01:08:37,644 data.base                      INFO       <780.00> Total reward: {}
2026-02-25 01:08:37,644 comm.communication             INFO       <780.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,645 sats.satellite.EO1             INFO       <780.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,649 gym                            INFO       <780.00> Step reward: 0.0
2026-02-25 01:08:37,649 gym                            INFO       <780.00> === STARTING STEP ===
2026-02-25 01:08:37,650 sats.satellite.EO1             INFO       <780.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,650 sats.satellite.EO1             INFO       <780.00> EO1: setting timed terminal event at 1380.0
2026-02-25 01:08:37,683 sats.satellite.EO1             INFO       <1380.00> EO1: timed termination at 1380.0 for action_charge
2026-02-25 01:08:37,683 data.base                      INFO       <1380.00> Total reward: {}
2026-02-25 01:08:37,684 comm.communication             INFO       <1380.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,684 sats.satellite.EO1             INFO       <1380.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,686 gym                            INFO       <1380.00> Step reward: 0.0
2026-02-25 01:08:37,686 gym                            INFO       <1380.00> === STARTING STEP ===
2026-02-25 01:08:37,687 sats.satellite.EO1             INFO       <1380.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,687 sats.satellite.EO1             INFO       <1380.00> EO1: setting timed terminal event at 1980.0
2026-02-25 01:08:37,720 sats.satellite.EO1             INFO       <1980.00> EO1: timed termination at 1980.0 for action_charge
2026-02-25 01:08:37,720 data.base                      INFO       <1980.00> Total reward: {}
2026-02-25 01:08:37,720 comm.communication             INFO       <1980.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,721 sats.satellite.EO1             INFO       <1980.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,723 gym                            INFO       <1980.00> Step reward: 0.0
2026-02-25 01:08:37,723 gym                            INFO       <1980.00> === STARTING STEP ===
2026-02-25 01:08:37,723 sats.satellite.EO1             INFO       <1980.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,724 sats.satellite.EO1             INFO       <1980.00> EO1: setting timed terminal event at 2580.0
2026-02-25 01:08:37,756 sats.satellite.EO1             INFO       <2580.00> EO1: timed termination at 2580.0 for action_charge
2026-02-25 01:08:37,757 data.base                      INFO       <2580.00> Total reward: {}
2026-02-25 01:08:37,757 comm.communication             INFO       <2580.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,758 sats.satellite.EO1             INFO       <2580.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,760 gym                            INFO       <2580.00> Step reward: 0.0
2026-02-25 01:08:37,760 gym                            INFO       <2580.00> === STARTING STEP ===
2026-02-25 01:08:37,760 sats.satellite.EO1             INFO       <2580.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,761 sats.satellite.EO1             INFO       <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.637 (780.0 seconds)
        Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.634 (1380.0 seconds)
        Eclipse: start: 5010.0 end: 1440.0
Charge level: 0.632 (1980.0 seconds)
        Eclipse: start: 4410.0 end: 840.0
Charge level: 0.629 (2580.0 seconds)
        Eclipse: start: 3810.0 end: 240.0
2026-02-25 01:08:37,794 sats.satellite.EO1             INFO       <3180.00> EO1: timed termination at 3180.0 for action_charge
2026-02-25 01:08:37,794 data.base                      INFO       <3180.00> Total reward: {}
2026-02-25 01:08:37,794 comm.communication             INFO       <3180.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,795 sats.satellite.EO1             INFO       <3180.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,802 gym                            INFO       <3180.00> Step reward: 0.0
2026-02-25 01:08:37,802 gym                            INFO       <3180.00> === STARTING STEP ===
2026-02-25 01:08:37,803 sats.satellite.EO1             INFO       <3180.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,803 sats.satellite.EO1             INFO       <3180.00> EO1: setting timed terminal event at 3780.0
2026-02-25 01:08:37,835 sats.satellite.EO1             INFO       <3780.00> EO1: timed termination at 3780.0 for action_charge
2026-02-25 01:08:37,835 data.base                      INFO       <3780.00> Total reward: {}
2026-02-25 01:08:37,836 comm.communication             INFO       <3780.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,836 sats.satellite.EO1             INFO       <3780.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,838 gym                            INFO       <3780.00> Step reward: 0.0
2026-02-25 01:08:37,838 gym                            INFO       <3780.00> === STARTING STEP ===
2026-02-25 01:08:37,839 sats.satellite.EO1             INFO       <3780.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,839 sats.satellite.EO1             INFO       <3780.00> EO1: setting timed terminal event at 4380.0
Charge level: 1.000 (3180.0 seconds)
        Eclipse: start: 3210.0 end: 5310.0
Charge level: 1.000 (3780.0 seconds)
        Eclipse: start: 2610.0 end: 4710.0
2026-02-25 01:08:37,871 sats.satellite.EO1             INFO       <4380.00> EO1: timed termination at 4380.0 for action_charge
2026-02-25 01:08:37,872 data.base                      INFO       <4380.00> Total reward: {}
2026-02-25 01:08:37,873 comm.communication             INFO       <4380.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,873 sats.satellite.EO1             INFO       <4380.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,874 gym                            INFO       <4380.00> Step reward: 0.0
2026-02-25 01:08:37,875 gym                            INFO       <4380.00> === STARTING STEP ===
2026-02-25 01:08:37,875 sats.satellite.EO1             INFO       <4380.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,877 sats.satellite.EO1             INFO       <4380.00> EO1: setting timed terminal event at 4980.0
2026-02-25 01:08:37,909 sats.satellite.EO1             INFO       <4980.00> EO1: timed termination at 4980.0 for action_charge
2026-02-25 01:08:37,910 data.base                      INFO       <4980.00> Total reward: {}
2026-02-25 01:08:37,910 comm.communication             INFO       <4980.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,911 sats.satellite.EO1             INFO       <4980.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,912 gym                            INFO       <4980.00> Step reward: 0.0
2026-02-25 01:08:37,913 gym                            INFO       <4980.00> === STARTING STEP ===
2026-02-25 01:08:37,913 sats.satellite.EO1             INFO       <4980.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,914 sats.satellite.EO1             INFO       <4980.00> EO1: setting timed terminal event at 5580.0
2026-02-25 01:08:37,946 sats.satellite.EO1             INFO       <5580.00> EO1: timed termination at 5580.0 for action_charge
2026-02-25 01:08:37,946 data.base                      INFO       <5580.00> Total reward: {}
2026-02-25 01:08:37,947 comm.communication             INFO       <5580.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,948 sats.satellite.EO1             INFO       <5580.00> EO1: Satellite EO1 requires retasking
2026-02-25 01:08:37,949 gym                            INFO       <5580.00> Step reward: 0.0
2026-02-25 01:08:37,950 gym                            INFO       <5580.00> === STARTING STEP ===
2026-02-25 01:08:37,950 sats.satellite.EO1             INFO       <5580.00> EO1: action_charge tasked for 600.0 seconds
2026-02-25 01:08:37,951 sats.satellite.EO1             INFO       <5580.00> EO1: setting timed terminal event at 6180.0
2026-02-25 01:08:37,958 data.base                      INFO       <5700.00> Total reward: {}
2026-02-25 01:08:37,958 comm.communication             INFO       <5700.00> Optimizing data communication between all pairs of satellites
2026-02-25 01:08:37,959 gym                            INFO       <5700.00> Step reward: 0.0
2026-02-25 01:08:37,960 gym                            INFO       <5700.00> Episode terminated: False
2026-02-25 01:08:37,960 gym                            INFO       <5700.00> Episode truncated: True
Charge level: 1.000 (4380.0 seconds)
        Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
        Eclipse: start: 1410.0 end: 3510.0
Charge level: 1.000 (5580.0 seconds)
        Eclipse: start: 810.0 end: 2910.0
Charge level: 1.000 (5700.0 seconds)
        Eclipse: start: 690.0 end: 2790.0
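
The simulation times printed above can be reproduced by hand. The sketch below assumes that each charge action runs for its full 600 seconds unless the 5700-second time limit is reached first, in which case the step is cut short, as the final log lines suggest. The function step_end_times is hypothetical and does not call bsk_rl.

```python
def step_end_times(start, duration, time_limit):
    """Return the end time of each step, clipping the last one at time_limit."""
    times = []
    t = start
    while t < time_limit:
        t = min(t + duration, time_limit)
        times.append(t)
    return times


# Charging starts at 180 s, after the three 60 s scan steps.
ends = step_end_times(start=180.0, duration=600.0, time_limit=5700.0)
print(ends)
# prints [780.0, 1380.0, 1980.0, 2580.0, 3180.0, 3780.0, 4380.0, 4980.0, 5580.0, 5700.0]
```

The last step is tasked until 6180 seconds but ends at 5700 seconds, where the episode is truncated.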

The battery charge decreases while the satellite is in eclipse, but once the satellite leaves eclipse, the battery quickly recharges to full.
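
The eclipse observations in the output can be read as follows. This sketch assumes that the two Eclipse elements are the times until the next eclipse start and the next eclipse end, measured from the current simulation time, so the satellite is in eclipse whenever the next end comes before the next start. The helper eclipse_status is hypothetical; the sample tuples are copied from the printed output.

```python
def eclipse_status(sim_time, next_start, next_end):
    """Interpret relative eclipse times (assumed to be seconds from sim_time)."""
    return {
        # If the next end precedes the next start, an eclipse is in progress.
        "in_eclipse": next_end < next_start,
        "next_start_abs": sim_time + next_start,
        "next_end_abs": sim_time + next_end,
    }


# (simulation time, eclipse start, eclipse end) copied from the output above.
printed = [
    (780.0, 5610.0, 2040.0),
    (2580.0, 3810.0, 240.0),
    (3180.0, 3210.0, 5310.0),
]

for sim_time, start, end in printed:
    print(sim_time, eclipse_status(sim_time, start, end))
```

Under this reading, the samples at 780 and 2580 seconds both place the end of the eclipse at 2820 seconds, and the sample at 3180 seconds falls outside eclipse, which matches the charge level dropping slowly up to 2580 seconds and reading 1.000 at 3180 seconds.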