Zhang, Xinhua; Aberdeen, Douglas; Vishwanathan, S
Conditional random fields (CRFs) are graphical models of the probability of labels given observations. They have traditionally been trained using a set of observation and label pairs. Underlying all CRFs is the assumption that, conditioned on the training data, the labels are independent and identically distributed (iid). In this paper we explore the use of CRFs in a class of temporal learning algorithms, namely policy-gradient reinforcement learning (RL). Now the labels are...