Getting Started
This tutorial demonstrates the configuration and use of a simple BSK-RL environment. BSK-RL and dependencies should already be installed at this point (see Installation if you haven’t installed the package yet).
Load Modules
In this tutorial, the environment will be created with gym.make, so it is necessary to import the top-level bsk_rl module as well as gym and bsk_rl components.
[1]:
import gymnasium as gym
import numpy as np
from bsk_rl import act, data, obs, scene, sats
from bsk_rl.sim import dyn, fsw
from Basilisk.architecture import bskLogging
bskLogging.setDefaultLogLevel(bskLogging.BSK_WARNING)
If no errors were raised, you have a functional installation of bsk_rl.
Configure the Satellite
Satellites are configurable agents in the environment. To make a new environment, start by specifying the observations and actions of a satellite type, as well as the underlying Basilisk simulation models used by the satellite.
[2]:
class MyScanningSatellite(sats.AccessSatellite):
    observation_spec = [
        obs.SatProperties(
            dict(prop="storage_level_fraction"),
            dict(prop="battery_charge_fraction")
        ),
        obs.Eclipse(),
    ]
    action_spec = [
        act.Scan(duration=60.0),  # Scan for 1 minute
        act.Charge(duration=600.0),  # Charge for 10 minutes
    ]
    dyn_type = dyn.ContinuousImagingDynModel
    fsw_type = fsw.ContinuousImagingFSWModel
Based on this class specification, a list of configurable parameters for the satellite can be generated.
[3]:
MyScanningSatellite.default_sat_args()
[3]:
{'hs_min': 0.0,
'maxCounterValue': 4,
'thrMinFireTime': 0.02,
'desatAttitude': 'sun',
'controlAxes_B': [1, 0, 0, 0, 1, 0, 0, 0, 1],
'thrForceSign': 1,
'K': 7.0,
'Ki': -1,
'P': 35.0,
'imageAttErrorRequirement': 0.01,
'imageRateErrorRequirement': None,
'inst_pHat_B': [0, 0, 1],
'utc_init': 'this value will be set by the world model',
'batteryStorageCapacity': 288000.0,
'storedCharge_Init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'disturbance_vector': None,
'dragCoeff': 2.2,
'imageTargetMaximumRange': -1,
'instrumentBaudRate': 8000000.0,
'instrumentPowerDraw': -30.0,
'basePowerDraw': 0.0,
'wheelSpeeds': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'maxWheelSpeed': inf,
'u_max': 0.2,
'rwBasePower': 0.4,
'rwMechToElecEfficiency': 0.0,
'rwElecToMechEfficiency': 0.5,
'panelArea': 1.0,
'panelEfficiency': 0.2,
'nHat_B': array([ 0, 0, -1]),
'mass': 330,
'width': 1.38,
'depth': 1.04,
'height': 1.58,
'sigma_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'omega_init': <function bsk_rl.sim.dyn.base.BasicDynamicsModel.<lambda>()>,
'rN': None,
'vN': None,
'oe': <function bsk_rl.utils.orbital.random_orbit(i: Optional[float] = None, a: Optional[float] = 6871, e: float = 0, Omega: Optional[float] = None, omega: Optional[float] = None, f: Optional[float] = None, alt: float = None, r_body: float = 6371) -> Basilisk.utilities.orbitalMotion.ClassicElements>,
'mu': 398600436000000.0,
'dataStorageCapacity': 160000000.0,
'storageUnitValidCheck': False,
'storageInit': 0,
'thrusterPowerDraw': 0.0,
'transmitterBaudRate': -8000000.0,
'transmitterNumBuffers': 100,
'transmitterPacketSize': None,
'transmitterPowerDraw': -15.0}
When instantiating a satellite, the sat_args dictionary can override these parameters, either with a constant or with a function that is called to rerandomize the value every time the environment is reset.
[4]:
sat_args = {}
# Set some parameters as constants
sat_args["imageAttErrorRequirement"] = 0.05
sat_args["dataStorageCapacity"] = 1e10
sat_args["instrumentBaudRate"] = 1e7
sat_args["storedCharge_Init"] = 50000.0
# Randomize the initial storage level on every reset
sat_args["storageInit"] = lambda: np.random.uniform(0.25, 0.75) * 1e10
# Make the satellite
sat = MyScanningSatellite(name="EO1", sat_args=sat_args)
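One way to picture how sat_args mixes constants with per-reset randomization is sketched below. This is a plain-Python illustration, not the bsk_rl implementation: resolve_sat_args is a hypothetical helper that calls any callable value and passes every other value through unchanged, and it uses the standard random module instead of NumPy.

```python
import random


def resolve_sat_args(sat_args):
    """Hypothetical helper (not part of bsk_rl): build the concrete
    parameters for one episode from a sat_args-style dictionary."""
    return {
        key: value() if callable(value) else value
        for key, value in sat_args.items()
    }


sat_args = {
    "dataStorageCapacity": 1e10,  # constant: identical in every episode
    "storageInit": lambda: random.uniform(0.25, 0.75) * 1e10,  # resampled
}

# Resolving twice stands in for two consecutive environment resets
episode_a = resolve_sat_args(sat_args)
episode_b = resolve_sat_args(sat_args)
print(episode_a["dataStorageCapacity"] == episode_b["dataStorageCapacity"])
print(episode_a["storageInit"], episode_b["storageInit"])
```

The constant is the same in both resolved dictionaries, while storageInit is drawn again on each call.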
Making the Environment
For this example, we will be using the single-agent SatelliteTasking environment. Along with passing the satellite that we configured, the environment takes a scenario, which defines the environment the satellite is acting in, and a rewarder, which defines how data collected from the scenario is rewarded.
[5]:
env = gym.make(
    "SatelliteTasking-v1",
    satellite=sat,
    scenario=scene.UniformNadirScanning(),
    rewarder=data.ScanningTimeReward(),
    time_limit=5700.0,  # approximately 1 orbit
    log_level="INFO",
)
2025-07-02 01:15:22,033 gym INFO Calling env.reset() to get observation space
2025-07-02 01:15:22,034 gym INFO Resetting environment with seed=3972050394
2025-07-02 01:15:22,123 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-07-02 01:15:22,143 gym INFO <0.00> Environment reset
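The comment that 5700 seconds is approximately one orbit can be checked with Kepler's third law, T = 2π sqrt(a³/μ). The sketch below assumes a circular orbit at the default semi-major axis of 6871 km that appears in the random_orbit signature, together with the default mu listed by default_sat_args(); the variable names are illustrative only.

```python
import math

MU_EARTH = 398600436000000.0  # m^3/s^2, the 'mu' default listed above
SEMI_MAJOR_AXIS = 6871e3  # m, assumed from the default a=6871 km of random_orbit

# Kepler's third law for the period of a two-body orbit
period = 2 * math.pi * math.sqrt(SEMI_MAJOR_AXIS**3 / MU_EARTH)

print(f"Orbital period: {period:.0f} s")
print(f"time_limit / period: {5700.0 / period:.3f}")
```

Under these assumptions the period comes out slightly below 5700 s, so the time limit covers a little more than one full orbit.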
Interacting with the Environment
First, the environment is reset.
[6]:
observation, info = env.reset(seed=1)
2025-07-02 01:15:22,210 gym INFO Resetting environment with seed=1
2025-07-02 01:15:22,344 sats.satellite.EO1 INFO <0.00> EO1: Finding opportunity windows from 0.00 to 5700.00 seconds
2025-07-02 01:15:22,366 gym INFO <0.00> Environment reset
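The reset call above and the step calls that follow use the standard Gymnasium interface: reset returns (observation, info) and step returns (observation, reward, terminated, truncated, info). A minimal sketch of a complete rollout loop built on that interface is given below. To keep it runnable without Basilisk, it uses CountdownEnv, a hypothetical stand-in that only mimics the interface and the 5700-second time limit; neither it nor run_episode is part of bsk_rl.

```python
class CountdownEnv:
    """Hypothetical stand-in with a Gymnasium-style reset/step interface."""

    def __init__(self, time_limit=5700.0, step_duration=600.0):
        self.time_limit = time_limit
        self.step_duration = step_duration
        self.sim_time = 0.0

    def reset(self, seed=None):
        self.sim_time = 0.0
        return [self.sim_time], {}

    def step(self, action):
        # Advance time by one action, clipping at the time limit
        self.sim_time = min(self.sim_time + self.step_duration, self.time_limit)
        truncated = self.sim_time >= self.time_limit
        reward = 0.0  # this stand-in never rewards anything
        return [self.sim_time], reward, False, truncated, {}


def run_episode(env, policy, seed=None):
    """Step env with policy until the episode terminates or truncates."""
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


# Always pick action 1, mirroring a fixed "charge" choice
total_reward, steps = run_episode(CountdownEnv(), policy=lambda obs: 1, seed=1)
print(steps, total_reward)
```

The loop checks both terminated and truncated because a Gymnasium episode can end through either flag, not only by reaching the time limit.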
Next, we take the scan action (action=0) a few times. This gives the satellite time to settle its attitude in the nadir-pointing mode and satisfy the imaging conditions. Note that the logs show no reward in the first step and only 18 in the second while the satellite settles, but the third step earns the full 60 reward (corresponding to 60 seconds of imaging).
[7]:
print("Initial data level:", observation[0], "(randomized by sat_args)")
for _ in range(3):
    observation, reward, terminated, truncated, info = env.step(action=0)
print(" Final data level:", observation[0])
2025-07-02 01:15:22,371 gym INFO <0.00> === STARTING STEP ===
2025-07-02 01:15:22,372 sats.satellite.EO1 INFO <0.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-07-02 01:15:22,372 sats.satellite.EO1 INFO <0.00> EO1: setting timed terminal event at 60.0
2025-07-02 01:15:22,380 sats.satellite.EO1 INFO <60.00> EO1: timed termination at 60.0 for action_nadir_scan
2025-07-02 01:15:22,380 data.base INFO <60.00> Total reward: {}
2025-07-02 01:15:22,381 comm.communication INFO <60.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,381 sats.satellite.EO1 INFO <60.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,382 gym INFO <60.00> Step reward: 0.0
2025-07-02 01:15:22,383 gym INFO <60.00> === STARTING STEP ===
2025-07-02 01:15:22,384 sats.satellite.EO1 INFO <60.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-07-02 01:15:22,384 sats.satellite.EO1 INFO <60.00> EO1: setting timed terminal event at 120.0
2025-07-02 01:15:22,392 sats.satellite.EO1 INFO <120.00> EO1: timed termination at 120.0 for action_nadir_scan
2025-07-02 01:15:22,392 data.base INFO <120.00> Total reward: {'EO1': 18.0}
2025-07-02 01:15:22,393 comm.communication INFO <120.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,393 sats.satellite.EO1 INFO <120.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,394 gym INFO <120.00> Step reward: 18.0
2025-07-02 01:15:22,395 gym INFO <120.00> === STARTING STEP ===
2025-07-02 01:15:22,395 sats.satellite.EO1 INFO <120.00> EO1: action_nadir_scan tasked for 60.0 seconds
2025-07-02 01:15:22,396 sats.satellite.EO1 INFO <120.00> EO1: setting timed terminal event at 180.0
2025-07-02 01:15:22,403 sats.satellite.EO1 INFO <180.00> EO1: timed termination at 180.0 for action_nadir_scan
2025-07-02 01:15:22,404 data.base INFO <180.00> Total reward: {'EO1': 60.0}
2025-07-02 01:15:22,404 comm.communication INFO <180.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,405 sats.satellite.EO1 INFO <180.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,406 gym INFO <180.00> Step reward: 60.0
Initial data level: 0.5961613078 (randomized by sat_args)
Final data level: 0.6741613078
The observation reflects the increase in stored data. The first element, corresponding to storage_level_fraction, starts at a random value set by the storageInit function in sat_args and increases based on the time spent imaging.
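The size of the increase can be cross-checked against the parameters set earlier. The sketch below assumes that the instrument writes data at instrumentBaudRate only during the seconds credited as reward (0, 18, and 60 in the logs above) and that the observed fraction is the stored data divided by dataStorageCapacity. Both assumptions are inferred from the output rather than from the simulation internals.

```python
instrument_baud_rate = 1e7  # from sat_args["instrumentBaudRate"]
data_storage_capacity = 1e10  # from sat_args["dataStorageCapacity"]
imaging_seconds = 0.0 + 18.0 + 60.0  # step rewards reported in the logs

initial_fraction = 0.5961613078  # printed initial data level

# Data collected while imaging, expressed as a fraction of the capacity
expected_increase = instrument_baud_rate * imaging_seconds / data_storage_capacity
expected_final = initial_fraction + expected_increase

print(f"Expected increase: {expected_increase:.3f}")
print(f"Expected final level: {expected_final:.10f}")
```

Under these assumptions the expected final level agrees with the printed value of 0.6741613078.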
Finally, the charging mode is tasked repeatedly in 10-minute increments until the environment time limit is reached.
[8]:
while not truncated:
    observation, reward, terminated, truncated, info = env.step(action=1)
    print(f"Charge level: {observation[1]:.3f} ({env.unwrapped.simulator.sim_time:.1f} seconds)\n\tEclipse: start: {observation[2]:.1f} end: {observation[3]:.1f}")
2025-07-02 01:15:22,411 gym INFO <180.00> === STARTING STEP ===
2025-07-02 01:15:22,412 sats.satellite.EO1 INFO <180.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,413 sats.satellite.EO1 INFO <180.00> EO1: setting timed terminal event at 780.0
2025-07-02 01:15:22,475 sats.satellite.EO1 INFO <780.00> EO1: timed termination at 780.0 for action_charge
2025-07-02 01:15:22,476 data.base INFO <780.00> Total reward: {}
2025-07-02 01:15:22,477 comm.communication INFO <780.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,477 sats.satellite.EO1 INFO <780.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,482 gym INFO <780.00> Step reward: 0.0
2025-07-02 01:15:22,483 gym INFO <780.00> === STARTING STEP ===
2025-07-02 01:15:22,484 sats.satellite.EO1 INFO <780.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,484 sats.satellite.EO1 INFO <780.00> EO1: setting timed terminal event at 1380.0
2025-07-02 01:15:22,554 sats.satellite.EO1 INFO <1380.00> EO1: timed termination at 1380.0 for action_charge
2025-07-02 01:15:22,555 data.base INFO <1380.00> Total reward: {}
2025-07-02 01:15:22,555 comm.communication INFO <1380.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,555 sats.satellite.EO1 INFO <1380.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,557 gym INFO <1380.00> Step reward: 0.0
2025-07-02 01:15:22,557 gym INFO <1380.00> === STARTING STEP ===
2025-07-02 01:15:22,558 sats.satellite.EO1 INFO <1380.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,558 sats.satellite.EO1 INFO <1380.00> EO1: setting timed terminal event at 1980.0
Charge level: 0.637 (780.0 seconds)
Eclipse: start: 5610.0 end: 2040.0
Charge level: 0.635 (1380.0 seconds)
Eclipse: start: 5010.0 end: 1440.0
2025-07-02 01:15:22,620 sats.satellite.EO1 INFO <1980.00> EO1: timed termination at 1980.0 for action_charge
2025-07-02 01:15:22,621 data.base INFO <1980.00> Total reward: {}
2025-07-02 01:15:22,622 comm.communication INFO <1980.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,622 sats.satellite.EO1 INFO <1980.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,624 gym INFO <1980.00> Step reward: 0.0
2025-07-02 01:15:22,625 gym INFO <1980.00> === STARTING STEP ===
2025-07-02 01:15:22,625 sats.satellite.EO1 INFO <1980.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,626 sats.satellite.EO1 INFO <1980.00> EO1: setting timed terminal event at 2580.0
Charge level: 0.632 (1980.0 seconds)
Eclipse: start: 4410.0 end: 840.0
2025-07-02 01:15:22,684 sats.satellite.EO1 INFO <2580.00> EO1: timed termination at 2580.0 for action_charge
2025-07-02 01:15:22,684 data.base INFO <2580.00> Total reward: {}
2025-07-02 01:15:22,685 comm.communication INFO <2580.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,685 sats.satellite.EO1 INFO <2580.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,687 gym INFO <2580.00> Step reward: 0.0
2025-07-02 01:15:22,688 gym INFO <2580.00> === STARTING STEP ===
2025-07-02 01:15:22,688 sats.satellite.EO1 INFO <2580.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,689 sats.satellite.EO1 INFO <2580.00> EO1: setting timed terminal event at 3180.0
2025-07-02 01:15:22,752 sats.satellite.EO1 INFO <3180.00> EO1: timed termination at 3180.0 for action_charge
2025-07-02 01:15:22,753 data.base INFO <3180.00> Total reward: {}
2025-07-02 01:15:22,753 comm.communication INFO <3180.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,753 sats.satellite.EO1 INFO <3180.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,764 gym INFO <3180.00> Step reward: 0.0
2025-07-02 01:15:22,764 gym INFO <3180.00> === STARTING STEP ===
2025-07-02 01:15:22,765 sats.satellite.EO1 INFO <3180.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,765 sats.satellite.EO1 INFO <3180.00> EO1: setting timed terminal event at 3780.0
2025-07-02 01:15:22,822 sats.satellite.EO1 INFO <3780.00> EO1: timed termination at 3780.0 for action_charge
2025-07-02 01:15:22,823 data.base INFO <3780.00> Total reward: {}
2025-07-02 01:15:22,823 comm.communication INFO <3780.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,823 sats.satellite.EO1 INFO <3780.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,825 gym INFO <3780.00> Step reward: 0.0
2025-07-02 01:15:22,825 gym INFO <3780.00> === STARTING STEP ===
Charge level: 0.630 (2580.0 seconds)
Eclipse: start: 3810.0 end: 240.0
Charge level: 1.000 (3180.0 seconds)
Eclipse: start: 3210.0 end: 5310.0
Charge level: 1.000 (3780.0 seconds)
Eclipse: start: 2610.0 end: 4710.0
2025-07-02 01:15:22,826 sats.satellite.EO1 INFO <3780.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,826 sats.satellite.EO1 INFO <3780.00> EO1: setting timed terminal event at 4380.0
2025-07-02 01:15:22,883 sats.satellite.EO1 INFO <4380.00> EO1: timed termination at 4380.0 for action_charge
2025-07-02 01:15:22,884 data.base INFO <4380.00> Total reward: {}
2025-07-02 01:15:22,884 comm.communication INFO <4380.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,885 sats.satellite.EO1 INFO <4380.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,886 gym INFO <4380.00> Step reward: 0.0
2025-07-02 01:15:22,887 gym INFO <4380.00> === STARTING STEP ===
2025-07-02 01:15:22,887 sats.satellite.EO1 INFO <4380.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,888 sats.satellite.EO1 INFO <4380.00> EO1: setting timed terminal event at 4980.0
Charge level: 1.000 (4380.0 seconds)
Eclipse: start: 2010.0 end: 4110.0
2025-07-02 01:15:22,945 sats.satellite.EO1 INFO <4980.00> EO1: timed termination at 4980.0 for action_charge
2025-07-02 01:15:22,946 data.base INFO <4980.00> Total reward: {}
2025-07-02 01:15:22,946 comm.communication INFO <4980.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:22,947 sats.satellite.EO1 INFO <4980.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:22,948 gym INFO <4980.00> Step reward: 0.0
2025-07-02 01:15:22,948 gym INFO <4980.00> === STARTING STEP ===
2025-07-02 01:15:22,949 sats.satellite.EO1 INFO <4980.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:22,950 sats.satellite.EO1 INFO <4980.00> EO1: setting timed terminal event at 5580.0
2025-07-02 01:15:23,019 sats.satellite.EO1 INFO <5580.00> EO1: timed termination at 5580.0 for action_charge
2025-07-02 01:15:23,020 data.base INFO <5580.00> Total reward: {}
2025-07-02 01:15:23,020 comm.communication INFO <5580.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:23,021 sats.satellite.EO1 INFO <5580.00> EO1: Satellite EO1 requires retasking
2025-07-02 01:15:23,022 gym INFO <5580.00> Step reward: 0.0
2025-07-02 01:15:23,023 gym INFO <5580.00> === STARTING STEP ===
2025-07-02 01:15:23,023 sats.satellite.EO1 INFO <5580.00> EO1: action_charge tasked for 600.0 seconds
2025-07-02 01:15:23,024 sats.satellite.EO1 INFO <5580.00> EO1: setting timed terminal event at 6180.0
2025-07-02 01:15:23,037 data.base INFO <5700.00> Total reward: {}
2025-07-02 01:15:23,037 comm.communication INFO <5700.00> Optimizing data communication between all pairs of satellites
2025-07-02 01:15:23,038 gym INFO <5700.00> Step reward: 0.0
2025-07-02 01:15:23,039 gym INFO <5700.00> Episode terminated: False
2025-07-02 01:15:23,040 gym INFO <5700.00> Episode truncated: True
Charge level: 1.000 (4980.0 seconds)
Eclipse: start: 1410.0 end: 3510.0
Charge level: 1.000 (5580.0 seconds)
Eclipse: start: 810.0 end: 2910.0
Charge level: 1.000 (5700.0 seconds)
Eclipse: start: 690.0 end: 2790.0
The output shows that the battery charge decreases slowly while the satellite is in eclipse, but once the satellite leaves eclipse, the battery quickly recharges to full.
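The link between eclipse and charge can be read directly from the printed observations. The sketch below assumes that the two Eclipse values are the times until the next eclipse start and the next eclipse end, measured from the current simulation time. Under that reading, the satellite is in eclipse whenever the next end comes before the next start. The tuples are copied from the output above, and in_eclipse is a hypothetical helper, not a bsk_rl function.

```python
# (sim_time, charge_fraction, eclipse_start, eclipse_end) from the printed output
samples = [
    (780.0, 0.637, 5610.0, 2040.0),
    (1380.0, 0.635, 5010.0, 1440.0),
    (1980.0, 0.632, 4410.0, 840.0),
    (2580.0, 0.630, 3810.0, 240.0),
    (3180.0, 1.000, 3210.0, 5310.0),
    (3780.0, 1.000, 2610.0, 4710.0),
]


def in_eclipse(eclipse_start, eclipse_end):
    """Hypothetical helper: True if the next eclipse end precedes the next start."""
    return eclipse_end < eclipse_start


for sim_time, charge, start, end in samples:
    state = "eclipse" if in_eclipse(start, end) else "sunlit"
    # Absolute time of the next eclipse exit, for comparison across steps
    print(
        f"t={sim_time:6.0f} s  charge={charge:.3f}  {state:7s}  "
        f"next exit at {sim_time + end:.0f} s"
    )
```

With this interpretation, the first four samples all place the eclipse exit at 2820 s, which is consistent with the charge falling through the 2580-second sample and being full by the 3180-second sample.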