Efficient Planning in a Compact Latent Action Space

We propose a novel planning-based sequence modelling method that scales to high-dimensional state-action spaces.

Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian

Optimal Transport for Offline Imitation Learning

We present an offline imitation learning method based on optimal transport that demonstrates strong performance and sample efficiency.

Yicheng Luo, Zhengyao Jiang, Samuel Cohen, Edward Grefenstette, Marc Deisenroth

Graph Backup: Data Efficient Backup Exploiting Markovian Data

We propose treating the transition data of an MDP as a graph and define a novel backup operator that exploits this graph structure. Compared to multi-step backup, graph backup allows counterfactual credit assignment and can reduce the variance caused by stochastic environment dynamics.

Zhengyao Jiang, Tianjun Zhang, Robert Kirk, Tim Rocktäschel, Edward Grefenstette

Grid-to-Graph: Flexible Spatial Relational Inductive Biases for Reinforcement Learning

We propose a principled and flexible framework for encoding relational inductive biases using a relational graph. Such biases are crucial for the generalization of neural network models but are usually hard-coded into the neural architecture.

Zhengyao Jiang, Pasquale Minervini, Minqi Jiang, Tim Rocktäschel

Neural Logic Reinforcement Learning

To address the interpretability and generalization of deep reinforcement learning (DRL), we propose Neural Logic Reinforcement Learning (NLRL), a novel algorithm that represents reinforcement learning policies in first-order logic.

Zhengyao Jiang, Shan Luo