Decision Making with Unknown Future Costs

dc.contributor.author: Chen, Yitian
dc.date.accessioned: 2026-02-04T07:51:39Z
dc.date.available: 2026-02-04T07:51:39Z
dc.date.issued: 2026
dc.description.abstract: This thesis develops a unified framework for decision-making problems with unknown future costs, providing both theoretical guarantees and empirical evaluations of its performance. We begin by studying the online Linear Quadratic (LQ) optimal control problem for the cases where (i) future costs are unknown beyond a certain preview horizon and sequentially revealed over time; and (ii) costs are unknown and must be inferred from observed optimal trajectory data. We then extend the framework to dynamic LQ games with sequentially revealed (and potentially previewed) costs. In all settings, the proposed framework is based on predicting and tracking a candidate optimal trajectory using the available costs. We first apply the proposed framework to the online LQ control problem with sequentially revealed costs, adopting regret as the measure of decision quality. We show that the regret of the proposed method is upper bounded by terms that decay exponentially fast as the preview horizon of future costs increases. Simulations verify this exponential decay and demonstrate that our controller outperforms state-of-the-art methods that do not leverage cost feedback. We then consider the case where the costs must be inferred from observed optimal trajectory data. This provides a new framework for solving the learning-from-demonstration problem. We establish a theoretical connection between the regret and the estimation error of the optimal control gain. A regret bound is derived under an Extended Kalman Filter (EKF)-based parameter estimation scheme, and its performance is validated through numerical experiments. We then apply this framework to a new dynamic LQ game problem, where the costs are sequentially revealed to the players (and may be previewed). We introduce the notion of price of uncertainty (PoU), which generalises the notion of regret to multi-agent settings. We establish bounds on the PoU incurred when all players adopt the controller designed using our framework. Simulation results validate the theoretical bounds on the PoU.
dc.identifier.uri: https://hdl.handle.net/1885/733805235
dc.language.iso: en_AU
dc.title: Decision Making with Unknown Future Costs
dc.type: Thesis (PhD)
local.contributor.affiliation: College of Systems and Society, The Australian National University
local.contributor.supervisor: Shames, Iman
local.identifier.doi: 10.25911/3G84-S219
local.identifier.proquest: Yes
local.identifier.researcherID: OEN-2202-2025
local.mintdoi: mint
local.thesisANUonly.author: 48588eac-27bf-4cbd-a144-c1d5bbcc2e11
local.thesisANUonly.key: f602f089-a566-2903-c0ee-fc50eea7113b
local.thesisANUonly.title: 000000026363_TC_1

Downloads

Original bundle

Name:
Chen_Yitian_Thesis.pdf
Size:
1.21 MB
Format:
Adobe Portable Document Format
Description:
Thesis Material