Getting Started
This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).
Load Modules
In this tutorial, the environment will be created with gym.make, so it is necessary to import gymnasium (as gym) as well as the bsk_rl components used to configure the environment.
[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw
from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)
If no errors were raised, you have a functional installation of bsk_rl.
Configure the Satellite
Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.
[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel
Based on this class specification, a list of configurable parameters for the satellite can be generated.
[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
'maxCounterValue': 4,
'thrMinFireTime': 0.02,
'desatAttitude': 'sun',
'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
'thrForceSign': 1,
'K': 7.0,
'Ki': -1,
'P': 35.0,
'imageAttErrorRequirement': 0.01,
'imageRateErrorRequirement': None,
'inst_pHat_B': [0, 0, 1],
'utc_init': 'this value will be set by the world model',
'batteryStorageCapacity': 288000.0,
'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'disturbance_vector': None,
'dragCoeff': 2.2,
'imageTargetMaximumRange': -1,
'instrumentBaudRate': 8000000.0,
'instrumentPowerDraw': -30.0,
'basePowerDraw': 0.0,
'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'maxWheelSpeed': inf,
'u_max': 0.2,
'rwBasePower': 0.4,
'rwMechToElecEfficiency': 0.0,
'rwElecToMechEfficiency': 0.5,
'panelArea': 1.0,
'panelEfficiency': 0.2,
'nHat_B': array([ 0, 0, -1]),
'mass': 330,
'width': 1.38,
'depth': 1.04,
'height': 1.58,
'sigma_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'omega_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'rN': None,
'vN': None,
'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
'mu': 398600436000000.0,
'min_orbital_radius': 6578136.6,
'dataStorageCapacity': 160000000.0,
'storageUnitValidCheck': False,
'storageInit': 0,
'thrusterPowerDraw': 0.0,
'transmitterBaudRate': -8000000.0,
'transmitterNumBuffers': 100,
'transmitterPacketSize': None,
'transmitterPowerDraw': -15.0}
When instantiating a satellite, these parameters can be overridden through the sat_args dictionary, either with a constant or with a function that is called to re-randomize the value every time the environment is reset.
[4]:
sat_args = {}
# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0
# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10
# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
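The difference between a constant and a function in sat_args can be illustrated without BSK-RL. The following is a minimal sketch, not the library's implementation: resolve_sat_args is a hypothetical helper that mimics what a reset might do, namely call every callable entry and pass constants through unchanged. The standard-library random module stands in for np.random here.

```python
import random


def resolve_sat_args(sat_args):
    """Hypothetical helper: replace each callable value with its return value."""
    return {
        key: value() if callable(value) else value
        for key, value in sat_args.items()
    }


sat_args = {
    # Constant: identical after every reset
    "dataStorageCapacity": 1e10,
    # Function: a new value is drawn each time it is resolved
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,
}

# Simulate two resets; sat_args itself is left untouched
first_reset = resolve_sat_args(sat_args)
second_reset = resolve_sat_args(sat_args)

print("Capacity:", first_reset["dataStorageCapacity"], second_reset["dataStorageCapacity"])
print("Initial storage:", first_reset["storageInit"], second_reset["storageInit"])
```

Under this sketch, the capacity is the same in both resolved dictionaries, while the initial storage level is drawn independently each time and always falls between 25% and 75% of the capacity.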
Making the Environment
For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.
[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2025-11-05 22:49:57,110 gym INFO Calling env.reset() to get observation space
2025-11-05 22:49:57,110 gym INFO Resetting environment with seed=2499520810
2025-11-05 22:49:57,198 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2025-11-05 22:49:57,208 gym INFO <0.00> Environment reset
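The comment that 5700 seconds is approximately one orbit can be sanity-checked with the two-body period formula T = 2π·sqrt(a³/μ). The sketch below assumes the satellite keeps the default orbit from the parameter listing (semi-major axis a = 6871 from the random_orbit signature, taken here to be in kilometers) and the listed mu of 398600436000000.0 (taken to be in m³/s²). Both unit interpretations are assumptions.

```python
import math

# Values read from the default_sat_args() listing; units are assumed
mu = 398600436000000.0      # gravitational parameter, assumed m^3/s^2
semi_major_axis_km = 6871   # default `a` in random_orbit, assumed km

# Two-body orbital period: T = 2*pi*sqrt(a^3 / mu)
semi_major_axis_m = semi_major_axis_km * 1000.0
period = 2.0 * math.pi * math.sqrt(semi_major_axis_m**3 / mu)

time_limit = 5700.0
print(f"Orbital period: {period:.1f} s")
print(f"time_limit / period: {time_limit / period:.3f}")
```

Under these assumptions the period comes out a little under 5700 seconds, so the time limit covers slightly more than one full orbit.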
Interacting with the Environment
First, the environment is reset.
[6]:
observation, info = env.reset(seed=1)
2025-11-05 22:49:57,271 gym INFO Resetting environment with seed=1
2025-11-05 22:49:57,400 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 6000.00 seconds
2025-11-05 22:49:57,409 gym INFO <0.00> Environment reset
Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude into the nadir-pointing mode and satisfy the imaging conditions. The logs show little or no reward in the first two steps while the attitude settles (0.0, then 18.0), and the full 60 reward (corresponding to 60 seconds of imaging) on the third step.
[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print("  Final data level:", observation[0])
2025-11-05 22:49:57,414 gym INFO <0.00> === STARTING STEP ===
2025-11-05 22:49:57,415 sats.satellite.EO1 INFO <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-11-05 22:49:57,416 sats.satellite.EO1 INFO <0.00> EO1: setting timed terminal event at 60.0
2025-11-05 22:49:57,420 sats.satellite.EO1 INFO <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2025-11-05 22:49:57,420 data.base INFO <60.00> Total reward: {}
2025-11-05 22:49:57,421 comm.communication INFO <60.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,421 sats.satellite.EO1 INFO <60.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,423 gym INFO <60.00> Step reward: 0.0
2025-11-05 22:49:57,423 gym INFO <60.00> === STARTING STEP ===
2025-11-05 22:49:57,424 sats.satellite.EO1 INFO <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-11-05 22:49:57,424 sats.satellite.EO1 INFO <60.00> EO1: setting timed terminal event at 120.0
2025-11-05 22:49:57,429 sats.satellite.EO1 INFO <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2025-11-05 22:49:57,429 data.base INFO <120.00> Total reward: {'EO1': 18.0}
2025-11-05 22:49:57,430 comm.communication INFO <120.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,430 sats.satellite.EO1 INFO <120.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,432 gym INFO <120.00> Step reward: 18.0
2025-11-05 22:49:57,432 gym INFO <120.00> === STARTING STEP ===
2025-11-05 22:49:57,433 sats.satellite.EO1 INFO <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-11-05 22:49:57,433 sats.satellite.EO1 INFO <120.00> EO1: setting timed terminal event at 180.0
2025-11-05 22:49:57,437 sats.satellite.EO1 INFO <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2025-11-05 22:49:57,438 data.base INFO <180.00> Total reward: {'EO1': 60.0}
2025-11-05 22:49:57,439 comm.communication INFO <180.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,440 sats.satellite.EO1 INFO <180.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,441 gym INFO <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
Final data level: 0.6741613078
The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
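The size of that increase can be reproduced by hand from the values already shown. The sketch below assumes that each unit of reward equals one second of imaging (as the 60-reward step suggests), that data accrues at instrumentBaudRate while imaging, and that the baud rate and storage capacity share the same data unit.

```python
# Values taken from sat_args and the step logs above
instrument_baud_rate = 1e7       # sat_args["instrumentBaudRate"], data units per second
data_storage_capacity = 1e10     # sat_args["dataStorageCapacity"], same data units
step_rewards = [0.0, 18.0, 60.0]  # "Step reward" log lines for the three scan steps

# Assumption: one unit of reward corresponds to one second of imaging
imaging_seconds = sum(step_rewards)
fraction_gained = imaging_seconds * instrument_baud_rate / data_storage_capacity

initial_fraction = 0.5961613078  # printed initial data level
expected_final = initial_fraction + fraction_gained

print(f"Imaging time: {imaging_seconds:.0f} s")
print(f"Storage fraction gained: {fraction_gained:.3f}")
print(f"Expected final level: {expected_final:.10f}")
```

The 78 seconds of imaging add 0.078 to the storage fraction, which matches the difference between the printed initial and final data levels.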
Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.
[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(f"Charge level: {observation[1]:.3f} ({env.unwrapped.simulator.sim_time:.1f} seconds)\n\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2025-11-05 22:49:57,446 gym INFO <180.00> === STARTING STEP ===
2025-11-05 22:49:57,447 sats.satellite.EO1 INFO <180.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,447 sats.satellite.EO1 INFO <180.00> EO1: setting timed terminal event at 780.0
2025-11-05 22:49:57,479 sats.satellite.EO1 INFO <780.00> EO1: timed termination at 780.0 for action_charge
2025-11-05 22:49:57,480 data.base INFO <780.00> Total reward: {}
2025-11-05 22:49:57,480 comm.communication INFO <780.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,481 sats.satellite.EO1 INFO <780.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,485 gym INFO <780.00> Step reward: 0.0
2025-11-05 22:49:57,485 gym INFO <780.00> === STARTING STEP ===
2025-11-05 22:49:57,486 sats.satellite.EO1 INFO <780.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,486 sats.satellite.EO1 INFO <780.00> EO1: setting timed terminal event at 1380.0
2025-11-05 22:49:57,517 sats.satellite.EO1 INFO <1380.00> EO1: timed termination at 1380.0 for action_charge
2025-11-05 22:49:57,518 data.base INFO <1380.00> Total reward: {}
2025-11-05 22:49:57,519 comm.communication INFO <1380.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,519 sats.satellite.EO1 INFO <1380.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,520 gym INFO <1380.00> Step reward: 0.0
2025-11-05 22:49:57,521 gym INFO <1380.00> === STARTING STEP ===
2025-11-05 22:49:57,521 sats.satellite.EO1 INFO <1380.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,522 sats.satellite.EO1 INFO <1380.00> EO1: setting timed terminal event at 1980.0
2025-11-05 22:49:57,554 sats.satellite.EO1 INFO <1980.00> EO1: timed termination at 1980.0 for action_charge
2025-11-05 22:49:57,554 data.base INFO <1980.00> Total reward: {}
2025-11-05 22:49:57,555 comm.communication INFO <1980.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,555 sats.satellite.EO1 INFO <1980.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,557 gym INFO <1980.00> Step reward: 0.0
2025-11-05 22:49:57,557 gym INFO <1980.00> === STARTING STEP ===
2025-11-05 22:49:57,558 sats.satellite.EO1 INFO <1980.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,558 sats.satellite.EO1 INFO <1980.00> EO1: setting timed terminal event at 2580.0
2025-11-05 22:49:57,590 sats.satellite.EO1 INFO <2580.00> EO1: timed termination at 2580.0 for action_charge
2025-11-05 22:49:57,590 data.base INFO <2580.00> Total reward: {}
2025-11-05 22:49:57,591 comm.communication INFO <2580.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,591 sats.satellite.EO1 INFO <2580.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,593 gym INFO <2580.00> Step reward: 0.0
2025-11-05 22:49:57,593 gym INFO <2580.00> === STARTING STEP ===
2025-11-05 22:49:57,594 sats.satellite.EO1 INFO <2580.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,595 sats.satellite.EO1 INFO <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.637 (780.0 seconds)
Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.635 (1380.0 seconds)
Eclipse: start: 5010.0 end: 1440.0
Charge level: 0.632 (1980.0 seconds)
Eclipse: start: 4410.0 end: 840.0
Charge level: 0.630 (2580.0 seconds)
Eclipse: start: 3810.0 end: 240.0
2025-11-05 22:49:57,626 sats.satellite.EO1 INFO <3180.00> EO1: timed termination at 3180.0 for action_charge
2025-11-05 22:49:57,627 data.base INFO <3180.00> Total reward: {}
2025-11-05 22:49:57,628 comm.communication INFO <3180.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,628 sats.satellite.EO1 INFO <3180.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,635 gym INFO <3180.00> Step reward: 0.0
2025-11-05 22:49:57,636 gym INFO <3180.00> === STARTING STEP ===
2025-11-05 22:49:57,636 sats.satellite.EO1 INFO <3180.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,636 sats.satellite.EO1 INFO <3180.00> EO1: setting timed terminal event at 3780.0
2025-11-05 22:49:57,668 sats.satellite.EO1 INFO <3780.00> EO1: timed termination at 3780.0 for action_charge
2025-11-05 22:49:57,668 data.base INFO <3780.00> Total reward: {}
2025-11-05 22:49:57,669 comm.communication INFO <3780.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,669 sats.satellite.EO1 INFO <3780.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,671 gym INFO <3780.00> Step reward: 0.0
2025-11-05 22:49:57,671 gym INFO <3780.00> === STARTING STEP ===
2025-11-05 22:49:57,672 sats.satellite.EO1 INFO <3780.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,672 sats.satellite.EO1 INFO <3780.00> EO1: setting timed terminal event at 4380.0
Charge level: 1.000 (3180.0 seconds)
Eclipse: start: 3210.0 end: 5310.0
Charge level: 1.000 (3780.0 seconds)
Eclipse: start: 2610.0 end: 4710.0
2025-11-05 22:49:57,703 sats.satellite.EO1 INFO <4380.00> EO1: timed termination at 4380.0 for action_charge
2025-11-05 22:49:57,704 data.base INFO <4380.00> Total reward: {}
2025-11-05 22:49:57,704 comm.communication INFO <4380.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,705 sats.satellite.EO1 INFO <4380.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,706 gym INFO <4380.00> Step reward: 0.0
2025-11-05 22:49:57,706 gym INFO <4380.00> === STARTING STEP ===
2025-11-05 22:49:57,707 sats.satellite.EO1 INFO <4380.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,707 sats.satellite.EO1 INFO <4380.00> EO1: setting timed terminal event at 4980.0
2025-11-05 22:49:57,738 sats.satellite.EO1 INFO <4980.00> EO1: timed termination at 4980.0 for action_charge
2025-11-05 22:49:57,739 data.base INFO <4980.00> Total reward: {}
2025-11-05 22:49:57,739 comm.communication INFO <4980.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,740 sats.satellite.EO1 INFO <4980.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,741 gym INFO <4980.00> Step reward: 0.0
2025-11-05 22:49:57,741 gym INFO <4980.00> === STARTING STEP ===
2025-11-05 22:49:57,742 sats.satellite.EO1 INFO <4980.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,742 sats.satellite.EO1 INFO <4980.00> EO1: setting timed terminal event at 5580.0
2025-11-05 22:49:57,774 sats.satellite.EO1 INFO <5580.00> EO1: timed termination at 5580.0 for action_charge
2025-11-05 22:49:57,775 data.base INFO <5580.00> Total reward: {}
2025-11-05 22:49:57,775 comm.communication INFO <5580.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,775 sats.satellite.EO1 INFO <5580.00> EO1: Satellite EO1 requires retasking
2025-11-05 22:49:57,777 gym INFO <5580.00> Step reward: 0.0
2025-11-05 22:49:57,777 gym INFO <5580.00> === STARTING STEP ===
2025-11-05 22:49:57,779 sats.satellite.EO1 INFO <5580.00> EO1: action_charge tasked for 600.0 seconds
2025-11-05 22:49:57,779 sats.satellite.EO1 INFO <5580.00> EO1: setting timed terminal event at 6180.0
2025-11-05 22:49:57,786 data.base INFO <5700.00> Total reward: {}
2025-11-05 22:49:57,787 comm.communication INFO <5700.00> Optimizing data communication between all pairs of satellites
2025-11-05 22:49:57,788 gym INFO <5700.00> Step reward: 0.0
2025-11-05 22:49:57,789 gym INFO <5700.00> Episode terminated: False
2025-11-05 22:49:57,789 gym INFO <5700.00> Episode truncated: True
Charge level: 1.000 (4380.0 seconds)
Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
Eclipse: start: 1410.0 end: 3510.0
Charge level: 1.000 (5580.0 seconds)
Eclipse: start: 810.0 end: 2910.0
Charge level: 1.000 (5700.0 seconds)
Eclipse: start: 690.0 end: 2790.0
The battery charge decreases slowly while the satellite is in eclipse; once the satellite leaves eclipse, the battery quickly recharges to full.
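The eclipse values in the printed output can be used to tell which steps ended in eclipse. The sketch below rests on an interpretation inferred from the output rather than confirmed by it: the two values appear to be the seconds remaining until the next eclipse start and the next eclipse end, so an end that comes before the start means the satellite is currently in eclipse. The helper in_eclipse is hypothetical and not part of bsk_rl.

```python
def in_eclipse(eclipse_start, eclipse_end):
    """Hypothetical helper: True if the next eclipse end precedes the next start.

    Assumes both arguments are seconds from the current time to the next
    eclipse start and end, as inferred from the printed output.
    """
    return eclipse_end < eclipse_start


# (sim_time, charge, eclipse_start, eclipse_end) copied from the printed output
samples = [
    (780.0, 0.637, 5610.0, 2040.0),
    (2580.0, 0.630, 3810.0, 240.0),
    (3180.0, 1.000, 3210.0, 5310.0),
    (5700.0, 1.000, 690.0, 2790.0),
]

states = [in_eclipse(start, end) for _, _, start, end in samples]
for (sim_time, charge, start, end), eclipsed in zip(samples, states):
    print(f"t={sim_time:6.1f} s  charge={charge:.3f}  in eclipse: {eclipsed}")
```

Under this reading, the steps with a falling charge level (780 to 2580 seconds) are the ones flagged as in eclipse, and adding the eclipse start to the simulation time gives the same absolute time (6390 seconds) for every sample, which supports the interpretation.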