New BEHAVE-research: Model of obfuscation-based decision-making

On Monday 9 April, Caspar presented, at his section’s colloquium series, a new model of decision-making with relevance for the moral choice behavior of humans and artificial agents. The so-called obfuscation-based model postulates that in some (moral) choice situations, agents may wish to hide the motivations (e.g. moral rules) underlying their choices from onlookers. Here are the slides of the presentation.

The abstract of the talk is as follows: Formal models of decision-making are routinely founded on the assumption that agents base their choices on underlying motivations (also named preferences, goals, decision rules, desires, etc.). This talk presents a new perspective on modelling decision-making, by assuming that agents, when making choices, aim to obfuscate (hide) their underlying motivations. In other words, where decision models usually assume that motivations echo through in choices, this model postulates that decision-makers may want to suppress that echo.

Such obfuscation behaviour is likely to occur in various situations: think of a person facing a moral dilemma, who is unsure which moral principle to apply and afraid that an onlooker (which may be her own ‘moral persona’) will punish her with contempt or feelings of guilt if the ‘wrong’ moral principle is applied. Or think of an artificial agent that is being trained using a penalty system to avoid implicit moral biases underlying her choices. In such situations, the agent benefits from choosing actions that, while being in line with her motivations, at the same time hide those motivations from onlookers or prosecutors.

Combining notions of Bayesian inference and information entropy, I present a mathematical representation of such obfuscating agent behaviour, and I illustrate how the actions chosen by obfuscators differ from those chosen by agents that do not attempt to obfuscate. I also show how an onlooker may try to design choice sets that maximize the information that may be extracted from choices made by (non-)obfuscating agents.
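
To make the combination of Bayesian inference and information entropy more concrete, here is a minimal sketch of one possible reading of the idea; it is not taken from the talk or the slides. It assumes that the onlooker holds a prior over a small set of candidate moral rules, updates it with Bayes' rule after observing an action, and that an obfuscating agent picks, among the actions her own rule permits, the one that leaves the onlooker's posterior with the highest Shannon entropy. The rule names, action names, probabilities and function names are all hypothetical.

```python
import math

def posterior(prior, likelihood, action):
    """Onlooker's belief over rules after seeing `action` (Bayes' rule)."""
    joint = {rule: prior[rule] * likelihood[rule][action] for rule in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("action has zero probability under every rule")
    return {rule: p / total for rule, p in joint.items()}

def entropy(dist):
    """Shannon entropy in bits; terms with zero probability contribute 0."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def most_obfuscating_action(prior, likelihood, permitted):
    """Among permitted actions, pick the one that leaves the onlooker most uncertain."""
    return max(permitted, key=lambda a: entropy(posterior(prior, likelihood, a)))

# Hypothetical setup: three candidate rules, three actions.
# likelihood[rule][action] is the onlooker's assumed P(action | rule).
prior = {"rule_1": 1 / 3, "rule_2": 1 / 3, "rule_3": 1 / 3}
likelihood = {
    "rule_1": {"a": 0.5, "b": 0.5, "c": 0.0},
    "rule_2": {"a": 0.0, "b": 0.5, "c": 0.5},
    "rule_3": {"a": 0.0, "b": 0.5, "c": 0.5},
}

# Assume the agent's true rule is rule_1, which permits actions "a" and "b".
permitted = ["a", "b"]
choice = most_obfuscating_action(prior, likelihood, permitted)

for action in permitted:
    h = entropy(posterior(prior, likelihood, action))
    print(f"action {action}: posterior entropy = {h:.3f} bits")
print("obfuscator chooses:", choice)
```

In this toy setup, action "a" is only consistent with rule_1 and so reveals the agent's rule completely, whereas action "b" is equally likely under all three rules and leaves the onlooker's belief unchanged. Under the stated assumptions an obfuscator therefore prefers "b", while an agent who does not care about onlookers has no reason to prefer it over "a".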