Getting Started
This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).
Load Modules
In this tutorial, the environment will be created with gym.make, so it is necessary to import the top-level bsk_rl module as well as gym and bsk_rl components.
[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw
from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)
If no errors were raised, you have a functional installation of bsk_rl.
Configure the Satellite
Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.
[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel
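The order of entries in observation_spec determines the layout of the observation vector that this tutorial indexes later (observation[0] through observation[3]). The following plain-Python sketch pictures that layout. It is not bsk_rl code: the labels eclipse_start and eclipse_end are names assumed here for the two values contributed by obs.Eclipse(), label_observation is a hypothetical helper, and the sample numbers are placeholders.

```python
# Illustrative only: mirrors the order of observation_spec in the class above.
# "eclipse_start" and "eclipse_end" are assumed labels, not bsk_rl identifiers.
OBSERVATION_LAYOUT = [
    "storage_level_fraction",   # obs.SatProperties, first prop
    "battery_charge_fraction",  # obs.SatProperties, second prop
    "eclipse_start",            # obs.Eclipse, first value
    "eclipse_end",              # obs.Eclipse, second value
]


def label_observation(observation):
    """Pair each element of a flat observation with its assumed label."""
    if len(observation) != len(OBSERVATION_LAYOUT):
        raise ValueError("unexpected observation length")
    return dict(zip(OBSERVATION_LAYOUT, observation))


# Placeholder values, used only to show the indexing.
labeled = label_observation([0.73, 0.34, 5340.0, 1800.0])
print(labeled["battery_charge_fraction"])
```

Labeling the vector this way is only a reading aid; the environment itself returns a flat array.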
Based on this class specification, a list of configurable parameters for the satellite can be generated.
[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
'maxCounterValue': 4,
'thrMinFireTime': 0.02,
'desatAttitude': 'sun',
'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
'thrForceSign': 1,
'K': 7.0,
'Ki': -1,
'P': 35.0,
'imageAttErrorRequirement': 0.01,
'imageRateErrorRequirement': None,
'inst_pHat_B': [0, 0, 1],
'utc_init': 'this value will be set by the world model',
'batteryStorageCapacity': 288000.0,
'storedCharge_Init': <function bsk_rl.sim.dyn.BasicDynamicsModel.<lambda>()>,
'disturbance_vector': None,
'dragCoeff': 2.2,
'imageTargetMaximumRange': -1,
'instrumentBaudRate': 8000000.0,
'instrumentPowerDraw': -30.0,
'basePowerDraw': 0.0,
'wheelSpeeds': <function bsk_rl.sim.dyn.BasicDynamicsModel.<lambda>()>,
'maxWheelSpeed': inf,
'u_max': 0.2,
'rwBasePower': 0.4,
'rwMechToElecEfficiency': 0.0,
'rwElecToMechEfficiency': 0.5,
'panelArea': 1.0,
'panelEfficiency': 0.2,
'nHat_B': array([ 0, 0, -1]),
'mass': 330,
'width': 1.38,
'depth': 1.04,
'height': 1.58,
'sigma_init': <function bsk_rl.sim.dyn.BasicDynamicsModel.<lambda>()>,
'omega_init': <function bsk_rl.sim.dyn.BasicDynamicsModel.<lambda>()>,
'rN': None,
'vN': None,
'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = 45.0, alt: float = 500, r_body: float = 6371, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = 0, f: Optional[float] = None) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
'mu': 398600436000000.0,
'dataStorageCapacity': 160000000.0,
'storageUnitValidCheck': False,
'storageInit': 0,
'thrusterPowerDraw': 0.0,
'transmitterBaudRate': -8000000.0,
'transmitterNumBuffers': 100,
'transmitterPowerDraw': -15.0}
When instantiating a satellite, these parameters can be overridden through the sat_args dictionary, either with a constant or with a function that rerandomizes the value every time the environment is reset.
[4]:
sat_args = {}
# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0
# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10
# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
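The difference between constant and callable entries can be pictured with the sketch below. It is not bsk_rl's implementation: resolve_sat_args is a hypothetical helper, and it assumes callables are invoked with no arguments once per reset while constants pass through unchanged.

```python
import random


def resolve_sat_args(defaults, overrides):
    """Hypothetical helper: merge overrides into defaults, then call any callables."""
    merged = {**defaults, **overrides}
    return {
        key: value() if callable(value) else value
        for key, value in merged.items()
    }


defaults = {"storageInit": 0, "dataStorageCapacity": 160000000.0}
overrides = {
    "dataStorageCapacity": 1e10,  # constant: identical on every reset
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # resampled per reset
}

# Simulate two resets: the constant is stable, the callable is re-evaluated.
first_reset = resolve_sat_args(defaults, overrides)
second_reset = resolve_sat_args(defaults, overrides)
print(first_reset["dataStorageCapacity"] == second_reset["dataStorageCapacity"])
print(0.25e10 <= first_reset["storageInit"] <= 0.75e10)
```

Passing a function rather than a pre-drawn number is what lets each episode start from a different storage level.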
Making the Environment
For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.
[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2025-05-13 19:53:02,805 gym INFO Calling env.reset() to get observation space
2025-05-13 19:53:02,806 gym INFO Resetting environment with seed=1490332139
2025-05-13 19:53:02,896 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-05-13 19:53:02,913 gym INFO <0.00> Environment reset
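As a rough check on the "approximately 1 orbit" comment, the period of a circular orbit can be estimated from defaults listed earlier: mu in the satellite arguments, and the 500 km altitude and 6371 km body radius in the random_orbit signature. This is a back-of-the-envelope two-body calculation under those assumptions, not a value the environment reports; circular_orbit_period is a helper defined only for this sketch.

```python
import math

MU = 398600436000000.0   # m^3/s^2, the 'mu' default in the satellite arguments
R_BODY_KM = 6371.0       # km, r_body default of random_orbit
ALTITUDE_KM = 500.0      # km, alt default of random_orbit


def circular_orbit_period(mu, radius_m):
    """Two-body period of a circular orbit: T = 2*pi*sqrt(a^3 / mu)."""
    return 2.0 * math.pi * math.sqrt(radius_m**3 / mu)


period_s = circular_orbit_period(MU, (R_BODY_KM + ALTITUDE_KM) * 1e3)
print(f"{period_s:.0f} s, about {period_s / 60:.1f} min")
```

The estimate comes out slightly under the 5700-second time limit, so one episode covers just over one orbit.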
Interacting with the Environment
First, the environment is reset.
[6]:
observation, info = env.reset(seed=1)
2025-05-13 19:53:02,967 gym INFO Resetting environment with seed=1
2025-05-13 19:53:03,090 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-05-13 19:53:03,108 gym INFO <0.00> Environment reset
Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude in the nadir-pointing mode and satisfy the imaging conditions. Note that the logs show no reward in the first step and only 30 in the second as the satellite settles, then the full 60 reward (corresponding to 60 seconds of imaging) by the third step.
[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print(" Final data level:", observation[0])
2025-05-13 19:53:03,113 gym INFO <0.00> === STARTING STEP ===
2025-05-13 19:53:03,114 sats.satellite.EO1 INFO <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-05-13 19:53:03,115 sats.satellite.EO1 INFO <0.00> EO1: setting timed terminal event at 60.0
2025-05-13 19:53:03,122 sats.satellite.EO1 INFO <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2025-05-13 19:53:03,123 data.base INFO <60.00> Total reward: {}
2025-05-13 19:53:03,123 comm.communication INFO <60.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,124 sats.satellite.EO1 INFO <60.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,125 gym INFO <60.00> Step reward: 0.0
2025-05-13 19:53:03,126 gym INFO <60.00> === STARTING STEP ===
2025-05-13 19:53:03,127 sats.satellite.EO1 INFO <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-05-13 19:53:03,127 sats.satellite.EO1 INFO <60.00> EO1: setting timed terminal event at 120.0
2025-05-13 19:53:03,134 sats.satellite.EO1 INFO <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2025-05-13 19:53:03,135 data.base INFO <120.00> Total reward: {'EO1': 30.0}
2025-05-13 19:53:03,135 comm.communication INFO <120.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,136 sats.satellite.EO1 INFO <120.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,138 gym INFO <120.00> Step reward: 30.0
2025-05-13 19:53:03,138 gym INFO <120.00> === STARTING STEP ===
2025-05-13 19:53:03,139 sats.satellite.EO1 INFO <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-05-13 19:53:03,139 sats.satellite.EO1 INFO <120.00> EO1: setting timed terminal event at 180.0
2025-05-13 19:53:03,146 sats.satellite.EO1 INFO <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2025-05-13 19:53:03,147 data.base INFO <180.00> Total reward: {'EO1': 60.0}
2025-05-13 19:53:03,148 comm.communication INFO <180.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,148 sats.satellite.EO1 INFO <180.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,149 gym INFO <180.00> Step reward: 60.0
Initial data level: 0.7341307878 (randomized by sat_args)
Final data level: 0.8241307878
The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
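The size of the increase can be checked against the configuration. Assuming data accrues at instrumentBaudRate for every second of rewarded imaging (an assumption about the model, not something the tutorial states), the step rewards of 0, 30, and 60 seconds and the values set in sat_args predict the change in storage_level_fraction:

```python
import math

INSTRUMENT_BAUD_RATE = 1e7     # from sat_args["instrumentBaudRate"]
DATA_STORAGE_CAPACITY = 1e10   # from sat_args["dataStorageCapacity"]
step_rewards = [0.0, 30.0, 60.0]  # seconds of imaging, read from the step logs

# Assumed model: stored data grows by baud rate times rewarded imaging time.
imaging_seconds = sum(step_rewards)
predicted_increase = imaging_seconds * INSTRUMENT_BAUD_RATE / DATA_STORAGE_CAPACITY

# Final minus initial data level, as printed by the cell above.
observed_increase = 0.8241307878 - 0.7341307878

print(f"predicted {predicted_increase:.4f}, observed {observed_increase:.4f}")
print(math.isclose(predicted_increase, observed_increase, abs_tol=1e-9))
```

Under that assumption, 90 seconds of imaging fills 9% of the storage, which matches the printed change in the data level.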
Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.
[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(f"Charge level: {observation[1]:.3f} ({env.unwrapped.simulator.sim_time:.1f} seconds)\n\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2025-05-13 19:53:03,155 gym INFO <180.00> === STARTING STEP ===
2025-05-13 19:53:03,155 sats.satellite.EO1 INFO <180.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,156 sats.satellite.EO1 INFO <180.00> EO1: setting timed terminal event at 780.0
2025-05-13 19:53:03,212 sats.satellite.EO1 INFO <780.00> EO1: timed termination at 780.0 for action_charge
2025-05-13 19:53:03,213 data.base INFO <780.00> Total reward: {}
2025-05-13 19:53:03,214 comm.communication INFO <780.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,214 sats.satellite.EO1 INFO <780.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,218 gym INFO <780.00> Step reward: 0.0
2025-05-13 19:53:03,219 gym INFO <780.00> === STARTING STEP ===
2025-05-13 19:53:03,220 sats.satellite.EO1 INFO <780.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,220 sats.satellite.EO1 INFO <780.00> EO1: setting timed terminal event at 1380.0
2025-05-13 19:53:03,276 sats.satellite.EO1 INFO <1380.00> EO1: timed termination at 1380.0 for action_charge
2025-05-13 19:53:03,277 data.base INFO <1380.00> Total reward: {}
2025-05-13 19:53:03,277 comm.communication INFO <1380.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,278 sats.satellite.EO1 INFO <1380.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,279 gym INFO <1380.00> Step reward: 0.0
2025-05-13 19:53:03,279 gym INFO <1380.00> === STARTING STEP ===
2025-05-13 19:53:03,280 sats.satellite.EO1 INFO <1380.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,280 sats.satellite.EO1 INFO <1380.00> EO1: setting timed terminal event at 1980.0
Charge level: 0.339 (780.0 seconds)
Eclipse: start: 5340.0 end: 1800.0
Charge level: 0.337 (1380.0 seconds)
Eclipse: start: 4740.0 end: 1200.0
2025-05-13 19:53:03,337 sats.satellite.EO1 INFO <1980.00> EO1: timed termination at 1980.0 for action_charge
2025-05-13 19:53:03,338 data.base INFO <1980.00> Total reward: {}
2025-05-13 19:53:03,339 comm.communication INFO <1980.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,339 sats.satellite.EO1 INFO <1980.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,340 gym INFO <1980.00> Step reward: 0.0
2025-05-13 19:53:03,341 gym INFO <1980.00> === STARTING STEP ===
2025-05-13 19:53:03,341 sats.satellite.EO1 INFO <1980.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,342 sats.satellite.EO1 INFO <1980.00> EO1: setting timed terminal event at 2580.0
2025-05-13 19:53:03,398 sats.satellite.EO1 INFO <2580.00> EO1: timed termination at 2580.0 for action_charge
2025-05-13 19:53:03,398 data.base INFO <2580.00> Total reward: {}
2025-05-13 19:53:03,399 comm.communication INFO <2580.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,400 sats.satellite.EO1 INFO <2580.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,410 gym INFO <2580.00> Step reward: 0.0
2025-05-13 19:53:03,411 gym INFO <2580.00> === STARTING STEP ===
2025-05-13 19:53:03,412 sats.satellite.EO1 INFO <2580.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,412 sats.satellite.EO1 INFO <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.334 (1980.0 seconds)
Eclipse: start: 4140.0 end: 600.0
Charge level: 0.354 (2580.0 seconds)
Eclipse: start: 3540.0 end: 5670.0
2025-05-13 19:53:03,479 sats.satellite.EO1 INFO <3180.00> EO1: timed termination at 3180.0 for action_charge
2025-05-13 19:53:03,479 data.base INFO <3180.00> Total reward: {}
2025-05-13 19:53:03,480 comm.communication INFO <3180.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,480 sats.satellite.EO1 INFO <3180.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,482 gym INFO <3180.00> Step reward: 0.0
2025-05-13 19:53:03,482 gym INFO <3180.00> === STARTING STEP ===
2025-05-13 19:53:03,483 sats.satellite.EO1 INFO <3180.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,484 sats.satellite.EO1 INFO <3180.00> EO1: setting timed terminal event at 3780.0
Charge level: 0.942 (3180.0 seconds)
Eclipse: start: 2940.0 end: 5070.0
2025-05-13 19:53:03,551 sats.satellite.EO1 INFO <3780.00> EO1: timed termination at 3780.0 for action_charge
2025-05-13 19:53:03,551 data.base INFO <3780.00> Total reward: {}
2025-05-13 19:53:03,552 comm.communication INFO <3780.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,552 sats.satellite.EO1 INFO <3780.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,554 gym INFO <3780.00> Step reward: 0.0
2025-05-13 19:53:03,554 gym INFO <3780.00> === STARTING STEP ===
2025-05-13 19:53:03,555 sats.satellite.EO1 INFO <3780.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,556 sats.satellite.EO1 INFO <3780.00> EO1: setting timed terminal event at 4380.0
2025-05-13 19:53:03,611 sats.satellite.EO1 INFO <4380.00> EO1: timed termination at 4380.0 for action_charge
2025-05-13 19:53:03,612 data.base INFO <4380.00> Total reward: {}
2025-05-13 19:53:03,612 comm.communication INFO <4380.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,613 sats.satellite.EO1 INFO <4380.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,615 gym INFO <4380.00> Step reward: 0.0
2025-05-13 19:53:03,615 gym INFO <4380.00> === STARTING STEP ===
2025-05-13 19:53:03,615 sats.satellite.EO1 INFO <4380.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,616 sats.satellite.EO1 INFO <4380.00> EO1: setting timed terminal event at 4980.0
2025-05-13 19:53:03,672 sats.satellite.EO1 INFO <4980.00> EO1: timed termination at 4980.0 for action_charge
2025-05-13 19:53:03,672 data.base INFO <4980.00> Total reward: {}
2025-05-13 19:53:03,673 comm.communication INFO <4980.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,673 sats.satellite.EO1 INFO <4980.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,675 gym INFO <4980.00> Step reward: 0.0
2025-05-13 19:53:03,675 gym INFO <4980.00> === STARTING STEP ===
2025-05-13 19:53:03,676 sats.satellite.EO1 INFO <4980.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,676 sats.satellite.EO1 INFO <4980.00> EO1: setting timed terminal event at 5580.0
Charge level: 1.000 (3780.0 seconds)
Eclipse: start: 2340.0 end: 4470.0
Charge level: 1.000 (4380.0 seconds)
Eclipse: start: 1740.0 end: 3870.0
Charge level: 1.000 (4980.0 seconds)
Eclipse: start: 1140.0 end: 3270.0
2025-05-13 19:53:03,732 sats.satellite.EO1 INFO <5580.00> EO1: timed termination at 5580.0 for action_charge
2025-05-13 19:53:03,733 data.base INFO <5580.00> Total reward: {}
2025-05-13 19:53:03,733 comm.communication INFO <5580.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,733 sats.satellite.EO1 INFO <5580.00> EO1: Satellite EO1 requires retasking
2025-05-13 19:53:03,735 gym INFO <5580.00> Step reward: 0.0
2025-05-13 19:53:03,736 gym INFO <5580.00> === STARTING STEP ===
2025-05-13 19:53:03,736 sats.satellite.EO1 INFO <5580.00> EO1: action_charge tasked for 600.0 seconds
2025-05-13 19:53:03,737 sats.satellite.EO1 INFO <5580.00> EO1: setting timed terminal event at 6180.0
2025-05-13 19:53:03,751 data.base INFO <5700.00> Total reward: {}
2025-05-13 19:53:03,751 comm.communication INFO <5700.00> Optimizing data communication between all pairs of satellites
2025-05-13 19:53:03,752 gym INFO <5700.00> Step reward: 0.0
2025-05-13 19:53:03,753 gym INFO <5700.00> Episode terminated: False
2025-05-13 19:53:03,754 gym INFO <5700.00> Episode truncated: True
Charge level: 1.000 (5580.0 seconds)
Eclipse: start: 540.0 end: 2670.0
Charge level: 1.000 (5700.0 seconds)
Eclipse: start: 420.0 end: 2550.0
The battery charge decreases while the satellite is in eclipse; once the satellite leaves eclipse, the battery quickly recharges to full.
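The printed eclipse values make this visible. Assuming the two Eclipse observations are the times until the next eclipse start and the next eclipse end (which is consistent with how the printed values count down), the satellite is in eclipse whenever the next end comes before the next start. The helper in_eclipse below is a hypothetical illustration applied to the first six printed samples.

```python
# (sim_time, charge_fraction, eclipse_start, eclipse_end) copied from the printed output
samples = [
    (780.0, 0.339, 5340.0, 1800.0),
    (1380.0, 0.337, 4740.0, 1200.0),
    (1980.0, 0.334, 4140.0, 600.0),
    (2580.0, 0.354, 3540.0, 5670.0),
    (3180.0, 0.942, 2940.0, 5070.0),
    (3780.0, 1.000, 2340.0, 4470.0),
]


def in_eclipse(time_to_start, time_to_end):
    """Hypothetical helper: the next end precedes the next start only inside an eclipse."""
    return time_to_end < time_to_start


for sim_time, charge, start, end in samples:
    state = "eclipse" if in_eclipse(start, end) else "sunlit"
    print(f"t={sim_time:6.0f} s  charge={charge:.3f}  {state}")

eclipse_flags = [in_eclipse(start, end) for _, _, start, end in samples]
```

The three samples flagged as eclipse are the ones where the charge falls from 0.339 to 0.334; the charge climbs once the flag flips to sunlit.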