Description
The planning and execution of observational cosmology campaigns have grown substantially more complex, particularly for advanced facilities such as the Rubin Observatory’s LSST, JWST, and the Nancy Grace Roman Space Telescope.
Traditionally, observatories have relied on manual planning to scan a predefined list of astronomical targets, which usually results in suboptimal observations.
We are developing a framework for statistical learning-based optimization of telescope pointings to gather data that is most useful for a pre-defined scientific reward.
We frame the observational campaign as a Markov Decision Process (MDP), which captures the sequential nature of the decision-making problem.
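Concretely, one standard way to write this framing is as the tuple
\[
\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma),
\]
where \(\mathcal{S}\) is the set of observation states (e.g., current sky conditions and survey history; the exact state features are illustrative here), \(\mathcal{A}\) is the discrete set of allowed telescope pointings, \(P(s' \mid s, a)\) are the transition dynamics, \(R(s, a)\) is the scientific reward attached to an observation, and \(\gamma \in [0, 1)\) is a discount factor.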
We implement this through reinforcement learning (RL), which has emerged in the field of artificial intelligence as a powerful approach to training autonomous systems.
In this study, we focus on the application of RL algorithms to an offline dataset of simulated observations, with a discrete set of sky locations the telescope is allowed to visit, referred to as the “action space.”
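Because learning is offline, the agent never queries the telescope directly and only sees logged transitions. As a rough illustration, such a dataset might be stored as arrays of (state, action, reward, next state) tuples; the shapes and field names below are purely hypothetical and do not reflect the actual dataset schema.

import numpy as np

# Hypothetical offline dataset of logged simulated observations;
# each row corresponds to one pointing decision in the campaign.
n_steps, obs_dim, n_pointings = 10_000, 32, 100
dataset = {
    "obs":      np.zeros((n_steps, obs_dim), dtype=np.float32),  # state before the decision
    "action":   np.zeros(n_steps, dtype=np.int64),               # index into the discrete set of sky locations
    "reward":   np.zeros(n_steps, dtype=np.float32),             # scientific reward of the chosen observation
    "next_obs": np.zeros((n_steps, obs_dim), dtype=np.float32),  # state after the observation
    "done":     np.zeros(n_steps, dtype=bool),                   # end of an observing night or campaign
}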
Two key aspects are investigated: 1) the preprocessing of the dataset using normalization techniques and potential observation space reduction, and 2) the application of value-based networks for decision-making.
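As an illustration of the first aspect, a minimal sketch of per-feature standardization is given below; the array layout and feature semantics are assumptions made for the example, not the study’s actual preprocessing pipeline.

import numpy as np

def standardize(observations: np.ndarray) -> np.ndarray:
    # Zero-mean, unit-variance normalization applied independently to each feature column.
    mean = observations.mean(axis=0)
    std = observations.std(axis=0) + 1e-8  # guard against constant features
    return (observations - mean) / std

# Hypothetical observation matrix: rows are decision steps, columns are features
# (e.g., airmass, seeing, sky brightness for candidate pointings).
obs = np.random.rand(1000, 32).astype(np.float32)
obs_norm = standardize(obs)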
Among the range of well-known RL algorithms, this study mainly targets value-based networks, in particular Deep Q-Networks (DQNs), since they outperform policy-based methods on our offline dataset.
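For reference, a DQN learns an action-value function \(Q(s, a; \theta)\) over the discrete action space, regressing it toward the one-step target \(r + \gamma \max_{a'} Q(s', a'; \theta^{-})\) computed from logged transitions, and acts greedily with respect to it. The sketch below shows a minimal PyTorch Q-network of this kind; the layer sizes, observation dimension, and number of pointings are illustrative only.

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps an observation vector to one Q-value per discrete pointing (action)."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

# Greedy pointing selection over the discrete action space of sky locations.
q_net = QNetwork(obs_dim=32, n_actions=100)
obs = torch.randn(1, 32)
best_pointing = q_net(obs).argmax(dim=-1)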
Our experimental results demonstrate that combining these preprocessing techniques with value-based networks yields high performance and the ability to generalize to unseen data on our task.
Furthermore, the analysis highlights how varying certain hyperparameters significantly affects the results.
Our results contribute to the advancement of autonomous systems, specifically in the context of process scheduling.