Getting Started
This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).
Load Modules
In this tutorial, the environment will be created with gym.make, so it is necessary to import the top-level bsk_rl module as well as gym and bsk_rl components.
[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw
from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)
If no errors were raised, you have a functional installation of bsk_rl.
Configure the Satellite
Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.
[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel
Based on this class specification, a list of configurable parameters for the satellite can be generated.
[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
'maxCounterValue': 4,
'thrMinFireTime': 0.02,
'desatAttitude': 'sun',
'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
'thrForceSign': 1,
'K': 7.0,
'Ki': -1,
'P': 35.0,
'imageAttErrorRequirement': 0.01,
'imageRateErrorRequirement': None,
'inst_pHat_B': [0, 0, 1],
'utc_init': 'this value will be set by the world model',
'batteryStorageCapacity': 288000.0,
'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'disturbance_vector': None,
'dragCoeff': 2.2,
'imageTargetMaximumRange': -1,
'instrumentBaudRate': 8000000.0,
'instrumentPowerDraw': -30.0,
'basePowerDraw': 0.0,
'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'maxWheelSpeed': inf,
'u_max': 0.2,
'rwBasePower': 0.4,
'rwMechToElecEfficiency': 0.0,
'rwElecToMechEfficiency': 0.5,
'panelArea': 1.0,
'panelEfficiency': 0.2,
'nHat_B': array([ 0, 0, -1]),
'mass': 330,
'width': 1.38,
'depth': 1.04,
'height': 1.58,
'sigma_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'omega_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'rN': None,
'vN': None,
'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
'mu': 398600436000000.0,
'dataStorageCapacity': 160000000.0,
'storageUnitValidCheck': False,
'storageInit': 0,
'thrusterPowerDraw': 0.0,
'transmitterBaudRate': -8000000.0,
'transmitterNumBuffers': 100,
'transmitterPacketSize': None,
'transmitterPowerDraw': -15.0}
When instantiating a satellite, these parameters can be overridden through the sat_args dictionary, either with a constant or with a function that is called to rerandomize the value every time the environment is reset.
[4]:
sat_args = {}
# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0
# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10
# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
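The difference between a constant and a rerandomized entry in sat_args can be illustrated without BSK-RL. The following is only a minimal sketch: resolve_sat_args is a hypothetical helper written for this illustration, not part of the bsk_rl API, and it assumes that a callable entry is invoked with no arguments at each reset while any other value is passed through unchanged.

```python
import random

# Seeded generator so the sketch is repeatable; the tutorial itself uses np.random.
rng = random.Random(0)

sat_args = {
    "dataStorageCapacity": 1e10,  # constant: identical on every reset
    "storageInit": lambda: rng.uniform(0.25, 0.75) * 1e10,  # callable: redrawn on every reset
}


def resolve_sat_args(args):
    """Hypothetical helper: call zero-argument callables, pass constants through."""
    return {key: (value() if callable(value) else value) for key, value in args.items()}


first_reset = resolve_sat_args(sat_args)
second_reset = resolve_sat_args(sat_args)

print("capacity:", first_reset["dataStorageCapacity"], second_reset["dataStorageCapacity"])
print("initial storage:", first_reset["storageInit"], second_reset["storageInit"])
```

Under these assumptions the capacity stays the same across both simulated resets, while the initial storage level is drawn anew each time.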
Making the Environment
For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.
[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2025-06-20 19:57:43,363 gym INFO Calling env.reset() to get observation space
2025-06-20 19:57:43,364 gym INFO Resetting environment with seed=1025121480
2025-06-20 19:57:43,442 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-06-20 19:57:43,462 gym INFO <0.00> Environment reset
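The comment on time_limit describes 5700 seconds as approximately one orbit. As a rough check, the sketch below computes the two-body period of a circular orbit. It assumes the default semi-major axis a=6871 from the random_orbit signature shown earlier is in kilometers and that the default mu is in m^3/s^2; the function name orbital_period is chosen here for illustration.

```python
import math

MU_EARTH = 398600436000000.0  # m^3/s^2, the 'mu' default listed in default_sat_args
SEMI_MAJOR_AXIS_KM = 6871.0  # km (assumed unit), default 'a' of random_orbit


def orbital_period(semi_major_axis_m, mu):
    """Two-body orbital period: T = 2*pi*sqrt(a^3 / mu)."""
    return 2.0 * math.pi * math.sqrt(semi_major_axis_m**3 / mu)


period_s = orbital_period(SEMI_MAJOR_AXIS_KM * 1e3, MU_EARTH)
print(f"Orbital period: {period_s:.0f} s ({period_s / 60.0:.1f} min)")
```

With these inputs the period comes out slightly under 5700 seconds, which is consistent with the comment in the cell above.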
Interacting with the Environment
First, the environment is reset.
[6]:
observation, info = env.reset(seed=1)
2025-06-20 19:57:43,468 gym INFO Resetting environment with seed=1
2025-06-20 19:57:43,559 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-06-20 19:57:43,579 gym INFO <0.00> Environment reset
Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude in the nadir-pointing mode so that the imaging conditions are satisfied. Note that the logs show little or no reward in the first two steps while the attitude settles (0 and 18), but the third step achieves a reward of 60, corresponding to 60 seconds of imaging.
[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print(" Final data level:", observation[0])
2025-06-20 19:57:43,584 gym INFO <0.00> === STARTING STEP ===
2025-06-20 19:57:43,585 sats.satellite.EO1 INFO <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-06-20 19:57:43,586 sats.satellite.EO1 INFO <0.00> EO1: setting timed terminal event at 60.0
2025-06-20 19:57:43,594 sats.satellite.EO1 INFO <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2025-06-20 19:57:43,594 data.base INFO <60.00> Total reward: {}
2025-06-20 19:57:43,595 comm.communication INFO <60.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:43,595 sats.satellite.EO1 INFO <60.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:43,597 gym INFO <60.00> Step reward: 0.0
2025-06-20 19:57:43,598 gym INFO <60.00> === STARTING STEP ===
2025-06-20 19:57:43,599 sats.satellite.EO1 INFO <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-06-20 19:57:43,599 sats.satellite.EO1 INFO <60.00> EO1: setting timed terminal event at 120.0
2025-06-20 19:57:43,607 sats.satellite.EO1 INFO <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2025-06-20 19:57:43,608 data.base INFO <120.00> Total reward: {'EO1': 18.0}
2025-06-20 19:57:43,609 comm.communication INFO <120.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:43,609 sats.satellite.EO1 INFO <120.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:43,611 gym INFO <120.00> Step reward: 18.0
2025-06-20 19:57:43,612 gym INFO <120.00> === STARTING STEP ===
2025-06-20 19:57:43,612 sats.satellite.EO1 INFO <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-06-20 19:57:43,613 sats.satellite.EO1 INFO <120.00> EO1: setting timed terminal event at 180.0
2025-06-20 19:57:43,621 sats.satellite.EO1 INFO <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2025-06-20 19:57:43,622 data.base INFO <180.00> Total reward: {'EO1': 60.0}
2025-06-20 19:57:43,622 comm.communication INFO <180.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:43,623 sats.satellite.EO1 INFO <180.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:43,625 gym INFO <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
Final data level: 0.6741613078
The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
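The size of that increase can be checked by hand from the numbers printed above. The sketch below assumes that each unit of reward corresponds to one second of imaging (as the 60-second step suggests), that data accrues at instrumentBaudRate while imaging, and that the baud rate and dataStorageCapacity use consistent units. The variable names are chosen for this illustration only.

```python
# Step rewards reported in the log for the three scan actions.
step_rewards = [0.0, 18.0, 60.0]
imaging_seconds = sum(step_rewards)  # assumed: 1 reward unit == 1 second of imaging

# Values set in sat_args earlier in the tutorial.
instrument_baud_rate = 1e7
data_storage_capacity = 1e10

# Expected change in storage_level_fraction under the assumptions above.
expected_increase = imaging_seconds * instrument_baud_rate / data_storage_capacity

# Change actually printed by the notebook cell.
observed_increase = 0.6741613078 - 0.5961613078

print(f"expected: {expected_increase:.6f}, observed: {observed_increase:.6f}")
```

Both values agree, which supports reading the 18 and 60 reward steps as 78 seconds of imaging in total.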
Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.
[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(f"Charge level: {observation[1]:.3f} ({env.unwrapped.simulator.sim_time:.1f} seconds)\n\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2025-06-20 19:57:43,631 gym INFO <180.00> === STARTING STEP ===
2025-06-20 19:57:43,632 sats.satellite.EO1 INFO <180.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:43,632 sats.satellite.EO1 INFO <180.00> EO1: setting timed terminal event at 780.0
2025-06-20 19:57:43,703 sats.satellite.EO1 INFO <780.00> EO1: timed termination at 780.0 for action_charge
2025-06-20 19:57:43,704 data.base INFO <780.00> Total reward: {}
2025-06-20 19:57:43,704 comm.communication INFO <780.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:43,705 sats.satellite.EO1 INFO <780.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:43,711 gym INFO <780.00> Step reward: 0.0
2025-06-20 19:57:43,711 gym INFO <780.00> === STARTING STEP ===
2025-06-20 19:57:43,712 sats.satellite.EO1 INFO <780.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:43,713 sats.satellite.EO1 INFO <780.00> EO1: setting timed terminal event at 1380.0
2025-06-20 19:57:43,782 sats.satellite.EO1 INFO <1380.00> EO1: timed termination at 1380.0 for action_charge
2025-06-20 19:57:43,783 data.base INFO <1380.00> Total reward: {}
2025-06-20 19:57:43,784 comm.communication INFO <1380.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:43,784 sats.satellite.EO1 INFO <1380.00> EO1: Satellite EO1 requires retasking
Charge level: 0.637 (780.0 seconds)
Eclipse: start: 5610.0 end: 2040.0
2025-06-20 19:57:43,786 gym INFO <1380.00> Step reward: 0.0
2025-06-20 19:57:43,787 gym INFO <1380.00> === STARTING STEP ===
2025-06-20 19:57:43,788 sats.satellite.EO1 INFO <1380.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:43,788 sats.satellite.EO1 INFO <1380.00> EO1: setting timed terminal event at 1980.0
2025-06-20 19:57:43,849 sats.satellite.EO1 INFO <1980.00> EO1: timed termination at 1980.0 for action_charge
2025-06-20 19:57:43,850 data.base INFO <1980.00> Total reward: {}
2025-06-20 19:57:43,850 comm.communication INFO <1980.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:43,851 sats.satellite.EO1 INFO <1980.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:43,853 gym INFO <1980.00> Step reward: 0.0
2025-06-20 19:57:43,854 gym INFO <1980.00> === STARTING STEP ===
2025-06-20 19:57:43,854 sats.satellite.EO1 INFO <1980.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:43,855 sats.satellite.EO1 INFO <1980.00> EO1: setting timed terminal event at 2580.0
Charge level: 0.635 (1380.0 seconds)
Eclipse: start: 5010.0 end: 1440.0
Charge level: 0.632 (1980.0 seconds)
Eclipse: start: 4410.0 end: 840.0
2025-06-20 19:57:43,924 sats.satellite.EO1 INFO <2580.00> EO1: timed termination at 2580.0 for action_charge
2025-06-20 19:57:43,924 data.base INFO <2580.00> Total reward: {}
2025-06-20 19:57:43,925 comm.communication INFO <2580.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:43,925 sats.satellite.EO1 INFO <2580.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:43,927 gym INFO <2580.00> Step reward: 0.0
2025-06-20 19:57:43,928 gym INFO <2580.00> === STARTING STEP ===
2025-06-20 19:57:43,928 sats.satellite.EO1 INFO <2580.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:43,929 sats.satellite.EO1 INFO <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.630 (2580.0 seconds)
Eclipse: start: 3810.0 end: 240.0
2025-06-20 19:57:44,000 sats.satellite.EO1 INFO <3180.00> EO1: timed termination at 3180.0 for action_charge
2025-06-20 19:57:44,000 data.base INFO <3180.00> Total reward: {}
2025-06-20 19:57:44,001 comm.communication INFO <3180.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:44,001 sats.satellite.EO1 INFO <3180.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:44,011 gym INFO <3180.00> Step reward: 0.0
2025-06-20 19:57:44,012 gym INFO <3180.00> === STARTING STEP ===
2025-06-20 19:57:44,013 sats.satellite.EO1 INFO <3180.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:44,013 sats.satellite.EO1 INFO <3180.00> EO1: setting timed terminal event at 3780.0
2025-06-20 19:57:44,072 sats.satellite.EO1 INFO <3780.00> EO1: timed termination at 3780.0 for action_charge
2025-06-20 19:57:44,073 data.base INFO <3780.00> Total reward: {}
2025-06-20 19:57:44,073 comm.communication INFO <3780.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:44,074 sats.satellite.EO1 INFO <3780.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:44,076 gym INFO <3780.00> Step reward: 0.0
2025-06-20 19:57:44,076 gym INFO <3780.00> === STARTING STEP ===
2025-06-20 19:57:44,077 sats.satellite.EO1 INFO <3780.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:44,077 sats.satellite.EO1 INFO <3780.00> EO1: setting timed terminal event at 4380.0
Charge level: 1.000 (3180.0 seconds)
Eclipse: start: 3210.0 end: 5310.0
Charge level: 1.000 (3780.0 seconds)
Eclipse: start: 2610.0 end: 4710.0
2025-06-20 19:57:44,134 sats.satellite.EO1 INFO <4380.00> EO1: timed termination at 4380.0 for action_charge
2025-06-20 19:57:44,135 data.base INFO <4380.00> Total reward: {}
2025-06-20 19:57:44,135 comm.communication INFO <4380.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:44,135 sats.satellite.EO1 INFO <4380.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:44,137 gym INFO <4380.00> Step reward: 0.0
2025-06-20 19:57:44,138 gym INFO <4380.00> === STARTING STEP ===
2025-06-20 19:57:44,138 sats.satellite.EO1 INFO <4380.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:44,138 sats.satellite.EO1 INFO <4380.00> EO1: setting timed terminal event at 4980.0
2025-06-20 19:57:44,195 sats.satellite.EO1 INFO <4980.00> EO1: timed termination at 4980.0 for action_charge
2025-06-20 19:57:44,195 data.base INFO <4980.00> Total reward: {}
2025-06-20 19:57:44,196 comm.communication INFO <4980.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:44,197 sats.satellite.EO1 INFO <4980.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:44,198 gym INFO <4980.00> Step reward: 0.0
2025-06-20 19:57:44,199 gym INFO <4980.00> === STARTING STEP ===
2025-06-20 19:57:44,199 sats.satellite.EO1 INFO <4980.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:44,199 sats.satellite.EO1 INFO <4980.00> EO1: setting timed terminal event at 5580.0
Charge level: 1.000 (4380.0 seconds)
Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
Eclipse: start: 1410.0 end: 3510.0
2025-06-20 19:57:44,266 sats.satellite.EO1 INFO <5580.00> EO1: timed termination at 5580.0 for action_charge
2025-06-20 19:57:44,267 data.base INFO <5580.00> Total reward: {}
2025-06-20 19:57:44,267 comm.communication INFO <5580.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:44,267 sats.satellite.EO1 INFO <5580.00> EO1: Satellite EO1 requires retasking
2025-06-20 19:57:44,269 gym INFO <5580.00> Step reward: 0.0
2025-06-20 19:57:44,269 gym INFO <5580.00> === STARTING STEP ===
2025-06-20 19:57:44,271 sats.satellite.EO1 INFO <5580.00> EO1: action_charge tasked for 600.0 seconds
2025-06-20 19:57:44,271 sats.satellite.EO1 INFO <5580.00> EO1: setting timed terminal event at 6180.0
2025-06-20 19:57:44,286 data.base INFO <5700.00> Total reward: {}
2025-06-20 19:57:44,287 comm.communication INFO <5700.00> Optimizing data communication between all pairs of satellites
2025-06-20 19:57:44,288 gym INFO <5700.00> Step reward: 0.0
2025-06-20 19:57:44,289 gym INFO <5700.00> Episode terminated: False
2025-06-20 19:57:44,289 gym INFO <5700.00> Episode truncated: True
Charge level: 1.000 (5580.0 seconds)
Eclipse: start: 810.0 end: 2910.0
Charge level: 1.000 (5700.0 seconds)
Eclipse: start: 690.0 end: 2790.0
The battery charge decreases while the satellite is in eclipse, but once the satellite exits eclipse, the battery quickly recharges to full.
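The printed eclipse values can be used to tell when the satellite is in eclipse. The sketch below assumes that the two values are the times in seconds until the next eclipse start and the next eclipse end, an interpretation that is consistent with both values dropping by 600 per step in the output above but that should be confirmed against the obs.Eclipse documentation. The helper in_eclipse is hypothetical and written only for this illustration.

```python
def in_eclipse(time_to_start, time_to_end):
    """Assumed rule: if the next eclipse end comes before the next start, we are inside one."""
    return time_to_end < time_to_start


# (simulation time, eclipse start, eclipse end) copied from the printed output above.
samples = [
    (780.0, 5610.0, 2040.0),
    (2580.0, 3810.0, 240.0),
    (3180.0, 3210.0, 5310.0),
]

for sim_time, start, end in samples:
    state = "eclipse" if in_eclipse(start, end) else "sunlit"
    # Absolute time of the next eclipse exit under the relative-time assumption.
    next_exit = sim_time + end
    print(f"t={sim_time:.0f} s: {state}, next eclipse exit at t={next_exit:.0f} s")
```

Under this reading, the first two samples point to the same eclipse exit time, and the satellite is sunlit by the 3180-second sample, which lines up with the charge level reaching 1.000 there.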