Getting Started

This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and its dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).

Load Modules

In this tutorial, the environment will be created with gym.make, so it is necessary to import the top-level bsk_rl module as well as gym and bsk_rl components.

[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw

from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)

If no errors were raised, you have a functional installation of bsk_rl.

Configure the Satellite

Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.

[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel

Based on this class specification, a list of configurable parameters for the satellite can be generated.

[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
 'maxCounterValue': 4,
 'thrMinFireTime': 0.02,
 'desatAttitude': 'sun',
 'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
 'thrForceSign': 1,
 'K': 7.0,
 'Ki': -1,
 'P': 35.0,
 'imageAttErrorRequirement': 0.01,
 'imageRateErrorRequirement': None,
 'inst_pHat_B': [0, 0, 1],
 'utc_init': 'this value will be set by the world model',
 'batteryStorageCapacity': 288000.0,
 'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'disturbance_vector': None,
 'dragCoeff': 2.2,
 'imageTargetMaximumRange': -1,
 'instrumentBaudRate': 8000000.0,
 'instrumentPowerDraw': -30.0,
 'basePowerDraw': 0.0,
 'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'maxWheelSpeed': inf,
 'u_max': 0.2,
 'rwBasePower': 0.4,
 'rwMechToElecEfficiency': 0.0,
 'rwElecToMechEfficiency': 0.5,
 'panelArea': 1.0,
 'panelEfficiency': 0.2,
 'nHat_B': array([ 0,  0, -1]),
 'mass': 330,
 'width': 1.38,
 'depth': 1.04,
 'height': 1.58,
 'sigma_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'omega_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
 'rN': None,
 'vN': None,
 'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
 'mu': 398600436000000.0,
 'dataStorageCapacity': 160000000.0,
 'storageUnitValidCheck': False,
 'storageInit': 0,
 'thrusterPowerDraw': 0.0,
 'transmitterBaudRate': -8000000.0,
 'transmitterNumBuffers': 100,
 'transmitterPacketSize': None,
 'transmitterPowerDraw': -15.0}

When instantiating a satellite, these parameters can be overridden through the sat_args dictionary, either with a constant or with a function that is called to rerandomize the value every time the environment is reset.

[4]:
sat_args = {}

# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0

# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10

# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
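
The distinction between constant and callable entries can be illustrated with a small stand-alone sketch. The helper resolve_args below is hypothetical and is not BSK-RL code; it only assumes that overrides replace defaults and that any zero-argument callable is evaluated afresh on each reset, which is what the lambda entries in the defaults listing suggest.

```python
import random
from typing import Any, Dict


def resolve_args(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Merge overrides into defaults, then call any zero-argument callables.

    Hypothetical stand-in for what may happen on each reset; not BSK-RL code.
    """
    merged = {**defaults, **overrides}
    return {key: (value() if callable(value) else value) for key, value in merged.items()}


# Two entries taken from the defaults listing above
defaults = {"storageInit": 0, "dataStorageCapacity": 160000000.0}
overrides = {
    "dataStorageCapacity": 1e10,  # constant: identical on every reset
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # redrawn each time
}

first = resolve_args(defaults, overrides)
second = resolve_args(defaults, overrides)
print(first["dataStorageCapacity"] == second["dataStorageCapacity"])  # True
print(first["storageInit"], second["storageInit"])  # two independent draws
```

Passing a function rather than a sampled number is what keeps the randomization tied to each reset instead of to the moment the dictionary was built.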

Making the Environment

For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.

[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2025-11-03 18:03:11,489 gym                            INFO       Calling env.reset() to get observation space
2025-11-03 18:03:11,489 gym                            INFO       Resetting environment with seed=1936775291
2025-11-03 18:03:11,583 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2025-11-03 18:03:11,594 gym                            INFO       <0.00> Environment reset
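
The "approximately 1 orbit" comment on time_limit can be checked with the two-body period formula, T = 2*pi*sqrt(a^3 / mu). The sketch below assumes the satellite uses the default semi-major axis of 6871 km shown in the random_orbit signature and the default mu from the parameter listing; the function name orbital_period is introduced here for illustration only.

```python
import math

MU_EARTH = 398600436000000.0  # m^3/s^2, the 'mu' default listed above
SEMI_MAJOR_AXIS_M = 6871e3  # m, the default 'a' of 6871 km in random_orbit


def orbital_period(a_m: float, mu: float = MU_EARTH) -> float:
    """Keplerian period T = 2*pi*sqrt(a^3 / mu), in seconds."""
    return 2.0 * math.pi * math.sqrt(a_m**3 / mu)


period_s = orbital_period(SEMI_MAJOR_AXIS_M)
print(f"{period_s:.0f} s, {period_s / 60:.1f} min")  # 5668 s, 94.5 min
```

Under these assumptions the 5700-second limit is slightly longer than one full orbit.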

Interacting with the Environment

First, the environment is reset.

[6]:
observation, info = env.reset(seed=1)
2025-11-03 18:03:11,657 gym                            INFO       Resetting environment with seed=1
2025-11-03 18:03:11,794 sats.satellite.EO1             INFO       <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2025-11-03 18:03:11,804 gym                            INFO       <0.00> Environment reset
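
The observation returned here is a flat vector whose elements follow the order of observation_spec: the two SatProperties entries first, then the two Eclipse values. The cells below index it as observation[0] through observation[3]. As a reading aid, a named view can be sketched as follows; ScanObservation, label_observation, and the eclipse field names are hypothetical and not part of BSK-RL.

```python
from typing import NamedTuple, Sequence


class ScanObservation(NamedTuple):
    """Hypothetical named view of the 4-element observation vector."""

    storage_level_fraction: float  # obs.SatProperties, first prop
    battery_charge_fraction: float  # obs.SatProperties, second prop
    eclipse_start: float  # obs.Eclipse, first value
    eclipse_end: float  # obs.Eclipse, second value


def label_observation(observation: Sequence[float]) -> ScanObservation:
    """Attach field names to a 4-element observation, in spec order."""
    if len(observation) != 4:
        raise ValueError(f"expected 4 elements, got {len(observation)}")
    return ScanObservation(*(float(x) for x in observation))


# Illustrative values only, not the output of a real environment step
example = label_observation([0.60, 0.64, 5610.0, 2040.0])
print(example.battery_charge_fraction)  # 0.64
print(example._asdict())
```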

Next, we take the scan action (action=0) a few times. This allows the satellite to settle its attitude in the nadir-pointing mode so that the imaging conditions are satisfied. Note that the logs show little or no reward in the first two steps while the attitude settles (0 and 18), but the third step achieves a reward of 60, corresponding to 60 seconds of imaging.

[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print("  Final data level:", observation[0])
2025-11-03 18:03:11,810 gym                            INFO       <0.00> === STARTING STEP ===
2025-11-03 18:03:11,811 sats.satellite.EO1             INFO       <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-11-03 18:03:11,811 sats.satellite.EO1             INFO       <0.00> EO1: setting timed terminal event at 60.0
2025-11-03 18:03:11,817 sats.satellite.EO1             INFO       <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2025-11-03 18:03:11,817 data.base                      INFO       <60.00> Total reward: {}
2025-11-03 18:03:11,818 comm.communication             INFO       <60.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:11,818 sats.satellite.EO1             INFO       <60.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:11,820 gym                            INFO       <60.00> Step reward: 0.0
2025-11-03 18:03:11,820 gym                            INFO       <60.00> === STARTING STEP ===
2025-11-03 18:03:11,821 sats.satellite.EO1             INFO       <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-11-03 18:03:11,821 sats.satellite.EO1             INFO       <60.00> EO1: setting timed terminal event at 120.0
2025-11-03 18:03:11,826 sats.satellite.EO1             INFO       <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2025-11-03 18:03:11,826 data.base                      INFO       <120.00> Total reward: {'EO1': 18.0}
2025-11-03 18:03:11,827 comm.communication             INFO       <120.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:11,828 sats.satellite.EO1             INFO       <120.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:11,829 gym                            INFO       <120.00> Step reward: 18.0
2025-11-03 18:03:11,829 gym                            INFO       <120.00> === STARTING STEP ===
2025-11-03 18:03:11,829 sats.satellite.EO1             INFO       <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-11-03 18:03:11,830 sats.satellite.EO1             INFO       <120.00> EO1: setting timed terminal event at 180.0
2025-11-03 18:03:11,835 sats.satellite.EO1             INFO       <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2025-11-03 18:03:11,835 data.base                      INFO       <180.00> Total reward: {'EO1': 60.0}
2025-11-03 18:03:11,836 comm.communication             INFO       <180.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:11,836 sats.satellite.EO1             INFO       <180.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:11,837 gym                            INFO       <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
  Final data level: 0.6741613078

The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
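
The size of the increase can be reproduced by hand. The sketch below assumes that data accrues at instrumentBaudRate for every rewarded second of imaging and that the observed fraction is the buffer level divided by dataStorageCapacity. The numbers are copied from sat_args and from the logs above.

```python
INSTRUMENT_BAUD_RATE = 1e7  # sat_args["instrumentBaudRate"]
DATA_STORAGE_CAPACITY = 1e10  # sat_args["dataStorageCapacity"]

initial_fraction = 0.5961613078  # printed initial data level
step_rewards = [0.0, 18.0, 60.0]  # step rewards from the log (seconds imaged)

imaging_seconds = sum(step_rewards)
added_fraction = imaging_seconds * INSTRUMENT_BAUD_RATE / DATA_STORAGE_CAPACITY
final_fraction = initial_fraction + added_fraction

print(f"{added_fraction:.3f}")  # 0.078
print(f"{final_fraction:.10f}")  # 0.6741613078
```

The result matches the printed final data level, which is consistent with 78 seconds of effective imaging across the three steps.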

Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.

[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(
        f"Charge level: {observation[1]:.3f} "
        f"({env.unwrapped.simulator.sim_time:.1f} seconds)\n"
        f"\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}"
    )
2025-11-03 18:03:11,843 gym                            INFO       <180.00> === STARTING STEP ===
2025-11-03 18:03:11,843 sats.satellite.EO1             INFO       <180.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:11,844 sats.satellite.EO1             INFO       <180.00> EO1: setting timed terminal event at 780.0
2025-11-03 18:03:11,878 sats.satellite.EO1             INFO       <780.00> EO1: timed termination at 780.0 for action_charge
2025-11-03 18:03:11,879 data.base                      INFO       <780.00> Total reward: {}
2025-11-03 18:03:11,879 comm.communication             INFO       <780.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:11,880 sats.satellite.EO1             INFO       <780.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:11,884 gym                            INFO       <780.00> Step reward: 0.0
2025-11-03 18:03:11,885 gym                            INFO       <780.00> === STARTING STEP ===
2025-11-03 18:03:11,885 sats.satellite.EO1             INFO       <780.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:11,886 sats.satellite.EO1             INFO       <780.00> EO1: setting timed terminal event at 1380.0
2025-11-03 18:03:11,919 sats.satellite.EO1             INFO       <1380.00> EO1: timed termination at 1380.0 for action_charge
2025-11-03 18:03:11,920 data.base                      INFO       <1380.00> Total reward: {}
2025-11-03 18:03:11,921 comm.communication             INFO       <1380.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:11,922 sats.satellite.EO1             INFO       <1380.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:11,923 gym                            INFO       <1380.00> Step reward: 0.0
2025-11-03 18:03:11,924 gym                            INFO       <1380.00> === STARTING STEP ===
2025-11-03 18:03:11,924 sats.satellite.EO1             INFO       <1380.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:11,925 sats.satellite.EO1             INFO       <1380.00> EO1: setting timed terminal event at 1980.0
2025-11-03 18:03:11,959 sats.satellite.EO1             INFO       <1980.00> EO1: timed termination at 1980.0 for action_charge
2025-11-03 18:03:11,959 data.base                      INFO       <1980.00> Total reward: {}
2025-11-03 18:03:11,960 comm.communication             INFO       <1980.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:11,960 sats.satellite.EO1             INFO       <1980.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:11,962 gym                            INFO       <1980.00> Step reward: 0.0
2025-11-03 18:03:11,962 gym                            INFO       <1980.00> === STARTING STEP ===
2025-11-03 18:03:11,963 sats.satellite.EO1             INFO       <1980.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:11,963 sats.satellite.EO1             INFO       <1980.00> EO1: setting timed terminal event at 2580.0
2025-11-03 18:03:11,997 sats.satellite.EO1             INFO       <2580.00> EO1: timed termination at 2580.0 for action_charge
2025-11-03 18:03:11,998 data.base                      INFO       <2580.00> Total reward: {}
2025-11-03 18:03:11,998 comm.communication             INFO       <2580.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:11,999 sats.satellite.EO1             INFO       <2580.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:12,000 gym                            INFO       <2580.00> Step reward: 0.0
2025-11-03 18:03:12,001 gym                            INFO       <2580.00> === STARTING STEP ===
2025-11-03 18:03:12,001 sats.satellite.EO1             INFO       <2580.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:12,002 sats.satellite.EO1             INFO       <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.637 (780.0 seconds)
        Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.635 (1380.0 seconds)
        Eclipse: start: 5010.0 end: 1440.0
Charge level: 0.632 (1980.0 seconds)
        Eclipse: start: 4410.0 end: 840.0
Charge level: 0.630 (2580.0 seconds)
        Eclipse: start: 3810.0 end: 240.0
2025-11-03 18:03:12,036 sats.satellite.EO1             INFO       <3180.00> EO1: timed termination at 3180.0 for action_charge
2025-11-03 18:03:12,037 data.base                      INFO       <3180.00> Total reward: {}
2025-11-03 18:03:12,037 comm.communication             INFO       <3180.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:12,038 sats.satellite.EO1             INFO       <3180.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:12,045 gym                            INFO       <3180.00> Step reward: 0.0
2025-11-03 18:03:12,045 gym                            INFO       <3180.00> === STARTING STEP ===
2025-11-03 18:03:12,046 sats.satellite.EO1             INFO       <3180.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:12,047 sats.satellite.EO1             INFO       <3180.00> EO1: setting timed terminal event at 3780.0
2025-11-03 18:03:12,080 sats.satellite.EO1             INFO       <3780.00> EO1: timed termination at 3780.0 for action_charge
2025-11-03 18:03:12,081 data.base                      INFO       <3780.00> Total reward: {}
2025-11-03 18:03:12,082 comm.communication             INFO       <3780.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:12,082 sats.satellite.EO1             INFO       <3780.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:12,084 gym                            INFO       <3780.00> Step reward: 0.0
2025-11-03 18:03:12,085 gym                            INFO       <3780.00> === STARTING STEP ===
2025-11-03 18:03:12,085 sats.satellite.EO1             INFO       <3780.00> EO1: action_charge tasked for 600.0 seconds
Charge level: 1.000 (3180.0 seconds)
        Eclipse: start: 3210.0 end: 5310.0
Charge level: 1.000 (3780.0 seconds)
        Eclipse: start: 2610.0 end: 4710.0
2025-11-03 18:03:12,085 sats.satellite.EO1             INFO       <3780.00> EO1: setting timed terminal event at 4380.0
2025-11-03 18:03:12,119 sats.satellite.EO1             INFO       <4380.00> EO1: timed termination at 4380.0 for action_charge
2025-11-03 18:03:12,119 data.base                      INFO       <4380.00> Total reward: {}
2025-11-03 18:03:12,120 comm.communication             INFO       <4380.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:12,120 sats.satellite.EO1             INFO       <4380.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:12,122 gym                            INFO       <4380.00> Step reward: 0.0
2025-11-03 18:03:12,123 gym                            INFO       <4380.00> === STARTING STEP ===
2025-11-03 18:03:12,123 sats.satellite.EO1             INFO       <4380.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:12,124 sats.satellite.EO1             INFO       <4380.00> EO1: setting timed terminal event at 4980.0
2025-11-03 18:03:12,157 sats.satellite.EO1             INFO       <4980.00> EO1: timed termination at 4980.0 for action_charge
2025-11-03 18:03:12,158 data.base                      INFO       <4980.00> Total reward: {}
2025-11-03 18:03:12,158 comm.communication             INFO       <4980.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:12,159 sats.satellite.EO1             INFO       <4980.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:12,161 gym                            INFO       <4980.00> Step reward: 0.0
2025-11-03 18:03:12,161 gym                            INFO       <4980.00> === STARTING STEP ===
2025-11-03 18:03:12,162 sats.satellite.EO1             INFO       <4980.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:12,162 sats.satellite.EO1             INFO       <4980.00> EO1: setting timed terminal event at 5580.0
2025-11-03 18:03:12,196 sats.satellite.EO1             INFO       <5580.00> EO1: timed termination at 5580.0 for action_charge
2025-11-03 18:03:12,196 data.base                      INFO       <5580.00> Total reward: {}
2025-11-03 18:03:12,197 comm.communication             INFO       <5580.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:12,197 sats.satellite.EO1             INFO       <5580.00> EO1: Satellite EO1 requires retasking
2025-11-03 18:03:12,199 gym                            INFO       <5580.00> Step reward: 0.0
2025-11-03 18:03:12,200 gym                            INFO       <5580.00> === STARTING STEP ===
2025-11-03 18:03:12,201 sats.satellite.EO1             INFO       <5580.00> EO1: action_charge tasked for 600.0 seconds
2025-11-03 18:03:12,201 sats.satellite.EO1             INFO       <5580.00> EO1: setting timed terminal event at 6180.0
2025-11-03 18:03:12,211 data.base                      INFO       <5700.00> Total reward: {}
2025-11-03 18:03:12,211 comm.communication             INFO       <5700.00> Optimizing data communication between all pairs of satellites
2025-11-03 18:03:12,213 gym                            INFO       <5700.00> Step reward: 0.0
2025-11-03 18:03:12,213 gym                            INFO       <5700.00> Episode terminated: False
2025-11-03 18:03:12,214 gym                            INFO       <5700.00> Episode truncated: True
Charge level: 1.000 (4380.0 seconds)
        Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
        Eclipse: start: 1410.0 end: 3510.0
Charge level: 1.000 (5580.0 seconds)
        Eclipse: start: 810.0 end: 2910.0
Charge level: 1.000 (5700.0 seconds)
        Eclipse: start: 690.0 end: 2790.0
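
The step times in this output follow from simple arithmetic: the three scans end at 180 seconds, each charge action lasts 600 seconds, and the final step is cut short at the 5700-second time limit. A plain-Python model of that timing, inferred from the logs rather than taken from the environment's code, is sketched below; step_end_times is a name introduced for illustration.

```python
from typing import List


def step_end_times(start: float, duration: float, time_limit: float) -> List[float]:
    """End time of each fixed-duration step, with the last one cut at the limit."""
    ends = []
    t = start
    while t < time_limit:
        t = min(t + duration, time_limit)
        ends.append(t)
    return ends


# Three 60 s scans end at 180 s; charging then proceeds in 600 s steps
charge_ends = step_end_times(start=180.0, duration=600.0, time_limit=5700.0)
print(len(charge_ends))  # 10
print(charge_ends[-2:])  # [5580.0, 5700.0]
```

This agrees with the ten "Charge level" lines printed above, the last of which is reported at 5700.0 seconds even though the action was tasked until 6180.0.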

The battery charge decreases while the satellite is in eclipse, but once the satellite is out of eclipse, the battery quickly increases to full charge.
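
The eclipse values printed above can be used to tell which regime the satellite is in. The sketch below assumes the two Eclipse elements are the times, in seconds from the current simulation time, until the next eclipse start and the next eclipse end; the printed values are consistent with this, since adding them to the simulation time gives the same absolute times from step to step. The helper names are hypothetical.

```python
from typing import Tuple


def in_eclipse(time_to_start: float, time_to_end: float) -> bool:
    """True if the next eclipse end comes before the next eclipse start."""
    return time_to_end < time_to_start


def next_eclipse_times(
    sim_time: float, time_to_start: float, time_to_end: float
) -> Tuple[float, float]:
    """Convert relative eclipse times into absolute simulation times."""
    return sim_time + time_to_start, sim_time + time_to_end


# Values copied from the printed output above
print(in_eclipse(5610.0, 2040.0))  # True  (at 780 s, charge level falling)
print(in_eclipse(3210.0, 5310.0))  # False (at 3180 s, charge level 1.000)
print(next_eclipse_times(780.0, 5610.0, 2040.0))  # (6390.0, 2820.0)
```

Under this reading, the eclipse ends at about 2820 seconds, which falls between the 2580-second step (charge 0.630) and the 3180-second step (charge 1.000).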