Getting Started
This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).
Load Modules
In this tutorial, the environment will be created with gym.make, so it is necessary to import the top-level bsk_rl module as well as gym and bsk_rl components.
[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw
from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)
If no errors were raised, you have a functional installation of bsk_rl.
Configure the Satellite
Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.
[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel
Based on this class specification, a list of configurable parameters for the satellite can be generated.
[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
'maxCounterValue': 4,
'thrMinFireTime': 0.02,
'desatAttitude': 'sun',
'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
'thrForceSign': 1,
'K': 7.0,
'Ki': -1,
'P': 35.0,
'imageAttErrorRequirement': 0.01,
'imageRateErrorRequirement': None,
'inst_pHat_B': [0, 0, 1],
'utc_init': 'this value will be set by the world model',
'batteryStorageCapacity': 288000.0,
'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'disturbance_vector': None,
'dragCoeff': 2.2,
'imageTargetMaximumRange': -1,
'instrumentBaudRate': 8000000.0,
'instrumentPowerDraw': -30.0,
'basePowerDraw': 0.0,
'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'maxWheelSpeed': inf,
'u_max': 0.2,
'rwBasePower': 0.4,
'rwMechToElecEfficiency': 0.0,
'rwElecToMechEfficiency': 0.5,
'panelArea': 1.0,
'panelEfficiency': 0.2,
'nHat_B': array([ 0, 0, -1]),
'mass': 330,
'width': 1.38,
'depth': 1.04,
'height': 1.58,
'sigma_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'omega_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'rN': None,
'vN': None,
'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
'mu': 398600436000000.0,
'dataStorageCapacity': 160000000.0,
'storageUnitValidCheck': False,
'storageInit': 0,
'thrusterPowerDraw': 0.0,
'transmitterBaudRate': -8000000.0,
'transmitterNumBuffers': 100,
'transmitterPacketSize': None,
'transmitterPowerDraw': -15.0}
When instantiating a satellite, these parameters can be overridden with a constant or rerandomized every time the environment is reset using the sat_args dictionary.
[4]:
sat_args = {}
# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0
# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10
# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
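The difference between constant and callable entries can be illustrated in plain Python. The following is a minimal sketch, not BSK-RL's actual implementation: the helper name resolve_sat_args is hypothetical, and it simply assumes that callable values are invoked once per reset while all other values are passed through unchanged.

```python
import random


def resolve_sat_args(sat_args):
    """Hypothetical helper: call zero-argument callables, keep constants as-is."""
    return {
        key: value() if callable(value) else value
        for key, value in sat_args.items()
    }


example_args = {
    "dataStorageCapacity": 1e10,  # constant: identical after every reset
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # resampled each call
}

# Each call stands in for one environment reset.
first_reset = resolve_sat_args(example_args)
second_reset = resolve_sat_args(example_args)
```

Under this assumption, dataStorageCapacity is the same in both results, while storageInit is drawn independently each time.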
Making the Environment
For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.
[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2025-08-25 18:17:16,260 gym INFO Calling env.reset() to get observation space
2025-08-25 18:17:16,261 gym INFO Resetting environment with seed=1747159948
2025-08-25 18:17:16,351 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-08-25 18:17:16,367 gym INFO <0.00> Environment reset
Interacting with the Environment
First, the environment is reset.
[6]:
observation, info = env.reset(seed=1)
2025-08-25 18:17:16,422 gym INFO Resetting environment with seed=1
2025-08-25 18:17:16,545 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-08-25 18:17:16,561 gym INFO <0.00> Environment reset
Next, we take the scan action (action=0) a few times. This allows the satellite to settle its attitude in the nadir-pointing mode and satisfy the imaging conditions. Note that the logs show little or no reward in the first two steps while the attitude settles (0 and 18), but the third step earns a reward of 60, corresponding to a full 60 seconds of imaging.
[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print(" Final data level:", observation[0])
2025-08-25 18:17:16,567 gym INFO <0.00> === STARTING STEP ===
2025-08-25 18:17:16,567 sats.satellite.EO1 INFO <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-08-25 18:17:16,568 sats.satellite.EO1 INFO <0.00> EO1: setting timed terminal event at 60.0
2025-08-25 18:17:16,576 sats.satellite.EO1 INFO <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2025-08-25 18:17:16,576 data.base INFO <60.00> Total reward: {}
2025-08-25 18:17:16,577 comm.communication INFO <60.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:16,577 sats.satellite.EO1 INFO <60.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:16,578 gym INFO <60.00> Step reward: 0.0
2025-08-25 18:17:16,579 gym INFO <60.00> === STARTING STEP ===
2025-08-25 18:17:16,580 sats.satellite.EO1 INFO <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-08-25 18:17:16,580 sats.satellite.EO1 INFO <60.00> EO1: setting timed terminal event at 120.0
2025-08-25 18:17:16,588 sats.satellite.EO1 INFO <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2025-08-25 18:17:16,588 data.base INFO <120.00> Total reward: {'EO1': 18.0}
2025-08-25 18:17:16,589 comm.communication INFO <120.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:16,590 sats.satellite.EO1 INFO <120.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:16,591 gym INFO <120.00> Step reward: 18.0
2025-08-25 18:17:16,591 gym INFO <120.00> === STARTING STEP ===
2025-08-25 18:17:16,592 sats.satellite.EO1 INFO <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-08-25 18:17:16,592 sats.satellite.EO1 INFO <120.00> EO1: setting timed terminal event at 180.0
2025-08-25 18:17:16,599 sats.satellite.EO1 INFO <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2025-08-25 18:17:16,600 data.base INFO <180.00> Total reward: {'EO1': 60.0}
2025-08-25 18:17:16,601 comm.communication INFO <180.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:16,601 sats.satellite.EO1 INFO <180.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:16,602 gym INFO <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
Final data level: 0.6741613078
The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
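As a rough consistency check, the change in storage_level_fraction can be reproduced from the logged rewards. The sketch below assumes that each unit of reward corresponds to one second of imaging (as stated above) and that the buffer fills at instrumentBaudRate while imaging. Both are interpretations of the output shown here rather than documented guarantees.

```python
instrument_baud_rate = 1e7    # sat_args["instrumentBaudRate"]
data_storage_capacity = 1e10  # sat_args["dataStorageCapacity"]

step_rewards = [0.0, 18.0, 60.0]     # "Step reward" values from the log above
imaging_seconds = sum(step_rewards)  # assumption: 1 unit of reward == 1 s of imaging

# Fraction of the buffer filled during the three scan steps
expected_increase = imaging_seconds * instrument_baud_rate / data_storage_capacity
# Difference between the printed final and initial data levels
observed_increase = 0.6741613078 - 0.5961613078

print(f"expected: {expected_increase:.3f}, observed: {observed_increase:.3f}")
```

Both values come out to 0.078, so 78 seconds of imaging is consistent with the printed change in the data level.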
Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.
[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(f"Charge level: {observation[1]:.3f} ({env.unwrapped.simulator.sim_time:.1f} seconds)\n\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2025-08-25 18:17:16,608 gym INFO <180.00> === STARTING STEP ===
2025-08-25 18:17:16,609 sats.satellite.EO1 INFO <180.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:16,609 sats.satellite.EO1 INFO <180.00> EO1: setting timed terminal event at 780.0
2025-08-25 18:17:16,666 sats.satellite.EO1 INFO <780.00> EO1: timed termination at 780.0 for action_charge
2025-08-25 18:17:16,667 data.base INFO <780.00> Total reward: {}
2025-08-25 18:17:16,667 comm.communication INFO <780.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:16,668 sats.satellite.EO1 INFO <780.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:16,673 gym INFO <780.00> Step reward: 0.0
2025-08-25 18:17:16,673 gym INFO <780.00> === STARTING STEP ===
2025-08-25 18:17:16,674 sats.satellite.EO1 INFO <780.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:16,675 sats.satellite.EO1 INFO <780.00> EO1: setting timed terminal event at 1380.0
2025-08-25 18:17:16,731 sats.satellite.EO1 INFO <1380.00> EO1: timed termination at 1380.0 for action_charge
2025-08-25 18:17:16,732 data.base INFO <1380.00> Total reward: {}
2025-08-25 18:17:16,732 comm.communication INFO <1380.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:16,733 sats.satellite.EO1 INFO <1380.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:16,734 gym INFO <1380.00> Step reward: 0.0
2025-08-25 18:17:16,735 gym INFO <1380.00> === STARTING STEP ===
2025-08-25 18:17:16,735 sats.satellite.EO1 INFO <1380.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:16,736 sats.satellite.EO1 INFO <1380.00> EO1: setting timed terminal event at 1980.0
Charge level: 0.637 (780.0 seconds)
Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.635 (1380.0 seconds)
Eclipse: start: 5010.0 end: 1440.0
2025-08-25 18:17:16,794 sats.satellite.EO1 INFO <1980.00> EO1: timed termination at 1980.0 for action_charge
2025-08-25 18:17:16,794 data.base INFO <1980.00> Total reward: {}
2025-08-25 18:17:16,795 comm.communication INFO <1980.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:16,795 sats.satellite.EO1 INFO <1980.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:16,796 gym INFO <1980.00> Step reward: 0.0
2025-08-25 18:17:16,797 gym INFO <1980.00> === STARTING STEP ===
2025-08-25 18:17:16,798 sats.satellite.EO1 INFO <1980.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:16,798 sats.satellite.EO1 INFO <1980.00> EO1: setting timed terminal event at 2580.0
2025-08-25 18:17:16,856 sats.satellite.EO1 INFO <2580.00> EO1: timed termination at 2580.0 for action_charge
2025-08-25 18:17:16,856 data.base INFO <2580.00> Total reward: {}
2025-08-25 18:17:16,857 comm.communication INFO <2580.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:16,857 sats.satellite.EO1 INFO <2580.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:16,859 gym INFO <2580.00> Step reward: 0.0
2025-08-25 18:17:16,860 gym INFO <2580.00> === STARTING STEP ===
2025-08-25 18:17:16,860 sats.satellite.EO1 INFO <2580.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:16,860 sats.satellite.EO1 INFO <2580.00> EO1: setting timed terminal event at 3180.0
Charge level: 0.632 (1980.0 seconds)
Eclipse: start: 4410.0 end: 840.0
Charge level: 0.630 (2580.0 seconds)
Eclipse: start: 3810.0 end: 240.0
2025-08-25 18:17:16,924 sats.satellite.EO1 INFO <3180.00> EO1: timed termination at 3180.0 for action_charge
2025-08-25 18:17:16,925 data.base INFO <3180.00> Total reward: {}
2025-08-25 18:17:16,925 comm.communication INFO <3180.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:16,926 sats.satellite.EO1 INFO <3180.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:16,934 gym INFO <3180.00> Step reward: 0.0
2025-08-25 18:17:16,935 gym INFO <3180.00> === STARTING STEP ===
2025-08-25 18:17:16,936 sats.satellite.EO1 INFO <3180.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:16,936 sats.satellite.EO1 INFO <3180.00> EO1: setting timed terminal event at 3780.0
Charge level: 1.000 (3180.0 seconds)
Eclipse: start: 3210.0 end: 5310.0
2025-08-25 18:17:17,006 sats.satellite.EO1 INFO <3780.00> EO1: timed termination at 3780.0 for action_charge
2025-08-25 18:17:17,006 data.base INFO <3780.00> Total reward: {}
2025-08-25 18:17:17,007 comm.communication INFO <3780.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:17,007 sats.satellite.EO1 INFO <3780.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:17,009 gym INFO <3780.00> Step reward: 0.0
2025-08-25 18:17:17,009 gym INFO <3780.00> === STARTING STEP ===
2025-08-25 18:17:17,010 sats.satellite.EO1 INFO <3780.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:17,010 sats.satellite.EO1 INFO <3780.00> EO1: setting timed terminal event at 4380.0
2025-08-25 18:17:17,067 sats.satellite.EO1 INFO <4380.00> EO1: timed termination at 4380.0 for action_charge
2025-08-25 18:17:17,068 data.base INFO <4380.00> Total reward: {}
2025-08-25 18:17:17,068 comm.communication INFO <4380.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:17,069 sats.satellite.EO1 INFO <4380.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:17,070 gym INFO <4380.00> Step reward: 0.0
2025-08-25 18:17:17,070 gym INFO <4380.00> === STARTING STEP ===
2025-08-25 18:17:17,071 sats.satellite.EO1 INFO <4380.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:17,071 sats.satellite.EO1 INFO <4380.00> EO1: setting timed terminal event at 4980.0
2025-08-25 18:17:17,131 sats.satellite.EO1 INFO <4980.00> EO1: timed termination at 4980.0 for action_charge
2025-08-25 18:17:17,131 data.base INFO <4980.00> Total reward: {}
2025-08-25 18:17:17,132 comm.communication INFO <4980.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:17,132 sats.satellite.EO1 INFO <4980.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:17,134 gym INFO <4980.00> Step reward: 0.0
2025-08-25 18:17:17,134 gym INFO <4980.00> === STARTING STEP ===
2025-08-25 18:17:17,135 sats.satellite.EO1 INFO <4980.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:17,135 sats.satellite.EO1 INFO <4980.00> EO1: setting timed terminal event at 5580.0
Charge level: 1.000 (3780.0 seconds)
Eclipse: start: 2610.0 end: 4710.0
Charge level: 1.000 (4380.0 seconds)
Eclipse: start: 2010.0 end: 4110.0
Charge level: 1.000 (4980.0 seconds)
Eclipse: start: 1410.0 end: 3510.0
2025-08-25 18:17:17,195 sats.satellite.EO1 INFO <5580.00> EO1: timed termination at 5580.0 for action_charge
2025-08-25 18:17:17,196 data.base INFO <5580.00> Total reward: {}
2025-08-25 18:17:17,196 comm.communication INFO <5580.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:17,197 sats.satellite.EO1 INFO <5580.00> EO1: Satellite EO1 requires retasking
2025-08-25 18:17:17,198 gym INFO <5580.00> Step reward: 0.0
2025-08-25 18:17:17,199 gym INFO <5580.00> === STARTING STEP ===
2025-08-25 18:17:17,199 sats.satellite.EO1 INFO <5580.00> EO1: action_charge tasked for 600.0 seconds
2025-08-25 18:17:17,199 sats.satellite.EO1 INFO <5580.00> EO1: setting timed terminal event at 6180.0
Charge level: 1.000 (5580.0 seconds)
Eclipse: start: 810.0 end: 2910.0
2025-08-25 18:17:17,213 data.base INFO <5700.00> Total reward: {}
2025-08-25 18:17:17,213 comm.communication INFO <5700.00> Optimizing data communication between all pairs of satellites
2025-08-25 18:17:17,215 gym INFO <5700.00> Step reward: 0.0
2025-08-25 18:17:17,215 gym INFO <5700.00> Episode terminated: False
2025-08-25 18:17:17,216 gym INFO <5700.00> Episode truncated: True
Charge level: 1.000 (5700.0 seconds)
Eclipse: start: 690.0 end: 2790.0
The battery charge decreases while the satellite is in eclipse, but once the satellite leaves eclipse, the battery quickly recharges to full.
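The printed eclipse values can be converted to absolute simulation times to see this more directly. The sketch below assumes that the two obs.Eclipse() entries are the times until the next eclipse start and the next eclipse end, measured from the current simulation time. This reading is inferred from the output above, where both values shrink by 600 seconds per 600-second step, and is not taken from the BSK-RL documentation.

```python
# (sim_time, eclipse_start, eclipse_end, charge) copied from the printed output
samples = [
    (780.0, 5610.0, 2040.0, 0.637),
    (2580.0, 3810.0, 240.0, 0.630),
    (3180.0, 3210.0, 5310.0, 1.000),
]

results = []
for sim_time, to_start, to_end, charge in samples:
    # Assumption: if the next eclipse end comes before the next eclipse start,
    # the satellite is currently inside an eclipse.
    in_eclipse = to_end < to_start
    results.append((sim_time + to_start, sim_time + to_end, in_eclipse))
    print(
        f"t={sim_time:.0f}s charge={charge:.3f} in_eclipse={in_eclipse} "
        f"next_start={sim_time + to_start:.0f}s next_end={sim_time + to_end:.0f}s"
    )
```

Under this reading, the eclipse ends at about 2820 seconds. That matches the charge level falling through 2580 seconds and reading 1.000 at 3180 seconds.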