
Policy Gradient Bayesian Robust Optimization for Imitation Learning
The difficulty in specifying rewards for many real-world problems has le...

Robust Maximum Entropy Behavior Cloning
Imitation learning (IL) algorithms use expert demonstrations to learn a ...

Soft-Robust Algorithms for Handling Model Misspecification
In reinforcement learning, robust policies for high-stakes decision-maki...

Bayesian Robust Optimization for Imitation Learning
One of the main challenges in imitation learning is determining what act...

Entropic Risk Constrained Soft-Robust Policy Optimization
Having a perfect model to compute the optimal policy is often infeasible...

Partial Policy Iteration for L1-Robust Markov Decision Processes
Robust Markov decision processes (MDPs) allow to compute reliable soluti...

Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity
In this paper, we introduce proximal gradient temporal difference learni...

Optimizing Norm-Bounded Weighted Ambiguity Sets for Robust MDPs
Optimal policies in Markov decision processes (MDPs) are very sensitive ...

High-Confidence Policy Optimization: Reshaping Ambiguity Sets in Robust MDPs
Robust MDPs are a promising framework for computing robust policies in r...

Robust Exploration with Tight Bayesian Plausibility Sets
Optimism about the poorly understood states and actions is the main driv...

Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs
Robust MDPs (RMDPs) can be used to compute policies with provable worst...

Tight Bayesian Ambiguity Sets for Robust MDPs
Robustness is important for sequential decision making in a stochastic d...

Interpretable Reinforcement Learning with Ensemble Methods
We propose to use boosted regression trees as a way to compute human-int...

A Practical Method for Solving Contextual Bandit Problems Using Decision Trees
Many efficient algorithms with strong theoretical guarantees have been p...

Value Directed Exploration in Multi-Armed Bandits with Structured Priors
Multi-armed bandits are a quintessential machine learning problem requir...

Safe Policy Improvement by Minimizing Robust Baseline Regret
An important problem in sequential decision-making under uncertainty is ...

Building an Interpretable Recommender via Loss-Preserving Transformation
We propose a method for building an interpretable recommender system for...

Robust Partially-Compressed Least-Squares
Randomized matrix compression techniques, such as the Johnson-Lindenstra...

A Bilinear Programming Approach for Multiagent Planning
Multiagent planning and coordination problems are common and known to be...

Solution Methods for Constrained Markov Decision Process with Continuous Probability Modulation
We propose solution methods for previously-unsolved constrained MDPs in ...

An Approximate Solution Method for Large Risk-Averse Markov Decision Processes
Stochastic domains often involve risk-averse decision makers. While rece...

Approximate Dynamic Programming By Minimizing Distributionally Robust Bounds
Approximate dynamic programming is a popular method for solving large Ma...

Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
Approximate dynamic programming has been used successfully in a large va...
Marek Petrik