- Title
- Cluster BP and cluster CCCP: Two simple methods for
computing Kikuchi approximations
- Author(s)
- Taisuke Sato
- Contact person
- Taisuke Sato
- Abstract
- Loopy BP (belief propagation) originated in the UAI
community and is expected to be an efficient
approximation method for computing marginal
distributions, not only for Bayesian networks but for
any distribution specified by a product of potentials.
In this paper, we first review the derivation of
loopy BP from the Bethe approximation to the variational
free energy, and its generalization to the Kikuchi
approximation. We then propose two new algorithms,
cluster BP and cluster CCCP, which run on cluster
graphs and are guaranteed to compute the Kikuchi
approximation when the graph satisfies the tree
condition. Cluster CCCP is also guaranteed to compute a
local minimum of the Kikuchi approximation.
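As a small illustration of computing marginals for a distribution given as a product of potentials, the sketch below runs plain sum-product BP on a three-variable chain, where the graph is a tree and BP is exact. It is not the cluster BP or cluster CCCP algorithm proposed in the paper; the potentials `PSI_01` and `PSI_12` and all function names are hypothetical and chosen only for this example.

```python
from itertools import product

# Hypothetical pairwise potentials on a chain x0 - x1 - x2 of binary variables.
PSI_01 = [[2.0, 1.0], [1.0, 3.0]]  # indexed as PSI_01[x0][x1]
PSI_12 = [[1.0, 4.0], [2.0, 1.0]]  # indexed as PSI_12[x1][x2]


def normalize(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def bp_marginal_x1():
    """Marginal of x1 obtained from the two incoming sum-product messages."""
    # Message from x0 to x1: sum out x0 from PSI_01.
    m0 = [sum(PSI_01[x0][x1] for x0 in range(2)) for x1 in range(2)]
    # Message from x2 to x1: sum out x2 from PSI_12.
    m2 = [sum(PSI_12[x1][x2] for x2 in range(2)) for x1 in range(2)]
    return normalize([m0[x1] * m2[x1] for x1 in range(2)])


def brute_force_marginal_x1():
    """Marginal of x1 obtained by enumerating all joint configurations."""
    p = [0.0, 0.0]
    for x0, x1, x2 in product(range(2), repeat=3):
        p[x1] += PSI_01[x0][x1] * PSI_12[x1][x2]
    return normalize(p)
```

On this chain the two functions agree, which is the tree-structured special case; on graphs with cycles the same message updates give only an approximation.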
- Title
- Generating Self-Evaluations to Learn Appropriate Actions in Various Games
- Author(s)
- Koichi Moriyama and Masayuki Numao
- Contact person
- Koichi Moriyama (moriyama@mori.cs.titech.ac.jp)
- Abstract
- In game theory, the combination of the players' actions settles at a
Nash equilibrium. However, this combination is not optimal in some
games, such as the prisoner's dilemma (PD). Nevertheless, almost all
multi-agent reinforcement learning algorithms aim to converge to a
Nash equilibrium. There are, on the other hand, several methods that
aim to depart from bad Nash equilibria and proceed to a better
combination by handling rewards in PD-like games, but since they
use fixed handling methods, they perform worse in non-PD-like games.
In this paper, we construct an agent that learns appropriate actions
in both PD-like and non-PD-like games through self-evaluations. The
agent has two conditions for judging whether the game is PD-like or
not, and two methods that generate self-evaluations according to
that judgement. We conducted experiments in two kinds of games played
by these agents.
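The claim that a Nash equilibrium need not be optimal can be checked on a concrete prisoner's dilemma. The sketch below is only an illustration with a hypothetical payoff table (`PAYOFF`, with the commonly used values 3, 0, 5, 1); it is not the self-evaluation method of the paper. It enumerates the pure-strategy Nash equilibria and tests whether another action combination is better for both players.

```python
from itertools import product

# Hypothetical prisoner's dilemma payoffs: (row player, column player).
# Actions: 0 = cooperate, 1 = defect.
PAYOFF = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)


def is_nash(profile):
    """True if neither player gains by unilaterally changing its action."""
    a, b = profile
    row_ok = all(PAYOFF[(a, b)][0] >= PAYOFF[(alt, b)][0] for alt in ACTIONS)
    col_ok = all(PAYOFF[(a, b)][1] >= PAYOFF[(a, alt)][1] for alt in ACTIONS)
    return row_ok and col_ok


def pareto_dominated(profile):
    """True if another profile is no worse for both players and differs in payoff."""
    u = PAYOFF[profile]
    return any(v[0] >= u[0] and v[1] >= u[1] and v != u for v in PAYOFF.values())


# Pure-strategy Nash equilibria of the hypothetical game.
nash = [p for p in product(ACTIONS, repeat=2) if is_nash(p)]
```

With these payoffs, mutual defection is the only pure-strategy equilibrium, yet mutual cooperation gives both players more, which is the kind of bad equilibrium the abstract refers to.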