
Reinforcement Learning with Feedback Graphs. (arXiv:2005.03789v1 [cs.LG])

[Submitted on 7 May 2020]


Abstract: We study episodic reinforcement learning in Markov decision processes when
the agent receives additional feedback per step in the form of several
transition observations. Such additional observations are available in a range
of tasks through extended sensors or prior knowledge about the environment
(e.g., when certain actions yield similar outcomes). We formalize this setting
using a feedback graph over state-action pairs and show that model-based
algorithms can leverage the additional feedback for more sample-efficient
learning. We give a regret bound that, ignoring logarithmic factors and
lower-order terms, depends only on the size of the maximum acyclic subgraph of
the feedback graph, in contrast with a polynomial dependency on the number of
states and actions in the absence of a feedback graph. Finally, we highlight
challenges when leveraging a small dominating set of the feedback graph as
compared to the bandit setting and propose a new algorithm that can use
knowledge of such a dominating set for more sample-efficient learning of a
near-optimal policy.
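
To make the setting concrete, the following is a minimal sketch (not the paper's algorithm) of how a feedback graph over state-action pairs can yield several transition observations per step: playing one state-action pair also reveals samples for its out-neighbors in the graph. The class and function names, the toy transition kernel, and the example edges are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of side observations via a feedback graph over
# state-action pairs. Names and the toy MDP are hypothetical.
from collections import defaultdict
import random


class FeedbackGraph:
    """Directed graph over state-action pairs: an edge (x, a) -> (x', a')
    means that playing action a in state x also reveals a transition
    sample for the pair (x', a')."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_edge(self, src_pair, dst_pair):
        self.edges[src_pair].add(dst_pair)

    def observed_pairs(self, pair):
        # The pair actually played is always observed (self-loop),
        # plus all of its out-neighbors in the feedback graph.
        return {pair} | self.edges[pair]


def simulate_step(state, action, rng):
    # Toy stochastic transition kernel, for illustration only.
    noise = 1 if rng.random() < 0.1 else 0
    next_state = (state + action + noise) % 3
    reward = 1.0 if next_state == 0 else 0.0
    return next_state, reward


def collect_observations(state, action, graph, rng):
    """Return a dict {(s, a): (s', r)} of all transition samples revealed
    by playing `action` in `state`, as dictated by the feedback graph."""
    samples = {}
    for (s, a) in graph.observed_pairs((state, action)):
        samples[(s, a)] = simulate_step(s, a, rng)
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    g = FeedbackGraph()
    # Prior knowledge: actions 0 and 1 in state 2 have similar outcomes,
    # so observing one also reveals a sample for the other.
    g.add_edge((2, 0), (2, 1))
    g.add_edge((2, 1), (2, 0))
    print(collect_observations(2, 0, g, rng))
```

A model-based learner would feed every revealed sample into its empirical transition and reward estimates, which is why, informally, the regret can scale with the size of the maximum acyclic subgraph of the feedback graph rather than with the full number of state-action pairs.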

Submission history

From: Christoph Dann
[v1]
Thu, 7 May 2020 22:35:37 UTC (388 KB)

Source: http://arxiv.org/abs/2005.03789
