Counterfactual Regret Minimization
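Counterfactual Regret Minimization (CFR) is a self-play algorithm: it learns by playing against itself. It starts with a random strategy and iterates toward a Nash equilibrium strategy; in two-player zero-sum games, it is the average strategy over all iterations that converges to the equilibrium. We'll start by defining counterfactual, regret, and minimization.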
- Counterfactual: Refers to what would have happened had things gone differently from what actually happened. For the statement "I forgot a portable charger and my phone died," the counterfactual statement would be "If I'd had a portable charger, my phone wouldn't have died." CFR assigns a "score" to the counterfactual by comparing the outcome of the move actually taken with what would have happened had you taken the counterfactual move.
- Regret: Refers to the difference in outcome between the decision made and the optimal one. It measures how much one should regret having taken an action. CFR assigns a "score" for the regret based solely on the outcome of the move one made versus the outcome of the optimal move.
- Minimization: Refers to the attempts to minimize the regret; in other words, attempts to close the gap between the move made and the optimal move.
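
To make the three definitions concrete, here is a minimal sketch for a single decision in rock-paper-scissors. It is not a full CFR implementation: the game, the payoff values, and the function names (`payoff`, `regrets`, `regret_matching`) are all assumptions chosen for illustration. The update rule shown is regret matching, which is commonly used inside CFR to turn accumulated regret into a new strategy.

```python
# Illustrative only: game, payoffs, and function names are hypothetical.
ACTIONS = ["rock", "paper", "scissors"]

def payoff(mine, theirs):
    """Return +1 for a win, 0 for a tie, -1 for a loss."""
    if mine == theirs:
        return 0
    wins = {("rock", "scissors"), ("paper", "rock"), ("scissors", "paper")}
    return 1 if (mine, theirs) in wins else -1

def regrets(played, opponent):
    """Regret per action: the counterfactual payoff (what that action
    would have earned) minus the payoff actually received."""
    actual = payoff(played, opponent)
    return [payoff(a, opponent) - actual for a in ACTIONS]

def regret_matching(cumulative):
    """Minimization step: play each action in proportion to its positive
    cumulative regret; fall back to uniform if no regret is positive."""
    positive = [max(r, 0.0) for r in cumulative]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(ACTIONS)] * len(ACTIONS)
    return [p / total for p in positive]

# We played rock, the opponent played paper, so we lost (-1).
r = regrets("rock", "paper")
strategy = regret_matching(r)
print(dict(zip(ACTIONS, r)))         # regret for each action
print(dict(zip(ACTIONS, strategy)))  # next strategy
```

In this sketch, scissors would have won, so it carries the largest regret and receives the most weight in the next strategy. Paper would have tied and gets a smaller share. Rock, the move actually played, has zero regret and gets none.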