Title: Demand-Independent Optimal Tolls

Abstract: Wardrop equilibria in nonatomic congestion games are in general inefficient, as they do not induce an optimal flow that minimizes the total travel time. Network tolls are a prominent and popular way to induce an optimal flow in equilibrium. The classical approach to finding such tolls is marginal cost pricing, which requires exact knowledge of the demand on the network. In this paper, we investigate under which conditions demand-independent optimal tolls exist that induce the system optimum flow for any travel demand in the network. We give several characterizations for the existence of such tolls, both in terms of the cost structure and the network structure of the game. Specifically, we show that demand-independent optimal tolls exist if and only if the edge cost functions are shifted monomials, as used by the Bureau of Public Roads. Moreover, non-negative demand-independent optimal tolls exist when the network is a directed acyclic multi-graph. Finally, we show that any network with a single origin/destination pair admits demand-independent optimal tolls that, although not necessarily non-negative, satisfy a budget constraint.
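To illustrate the phenomenon the abstract describes, here is a minimal sketch (my own toy instance, not taken from the paper) of a two-link Pigou-style network with monomial costs c1(x) = x and c2(x) = 1. The system optimum always routes flow 1/2 on the first link whenever demand is at least 1/2, so the constant toll tau1 = 1/2 makes the tolled Wardrop equilibrium coincide with the system optimum for every such demand: a demand-independent optimal toll.

```python
# Toy instance (mine): two parallel links with costs c1(x) = x and c2(x) = 1.
# Total cost of routing x1 on link 1 out of demand d is x1^2 + (d - x1),
# minimized at x1 = 1/2 independently of d (for d >= 1/2), so a constant
# toll tau1 = 1/2 on link 1 equalizes tolled costs (x1 + 1/2 = 1 = c2)
# exactly at the system optimum, for every demand.

def system_optimum_x1(d):
    """Flow on link 1 minimizing total cost x1*c1(x1) + (d - x1)*c2."""
    # minimize x1^2 + (d - x1): derivative 2*x1 - 1 = 0  =>  x1 = 1/2
    return min(d, 0.5)

def tolled_equilibrium_x1(d, tau1):
    """Wardrop equilibrium with constant toll tau1 on link 1: x1 + tau1 = 1."""
    return min(d, max(0.0, 1.0 - tau1))

tau1 = 0.5  # candidate demand-independent toll
for d in [0.5, 1.0, 2.0, 5.0]:
    assert abs(tolled_equilibrium_x1(d, tau1) - system_optimum_x1(d)) < 1e-12
print("toll 1/2 induces the system optimum for all tested demands")
```

By contrast, the marginal cost toll x1 * c1'(x1) evaluated at an equilibrium would have to be recomputed whenever the demand estimate changes; the point of the toy example is that here a single constant toll suffices.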


Title: The efficiency of resource allocation mechanisms for budget-constrained users

Abstract: We study the efficiency of mechanisms for allocating a divisible resource. Given scalar signals submitted by all users, such a mechanism decides the fraction of the resource that each user will receive and a payment that will be collected from her. Users are self-interested and aim to maximize their utility (defined as their value for the resource fraction they receive minus their payment). Starting with the seminal work of Johari and Tsitsiklis [Operations Research, 2004], a long list of papers studied the price of anarchy (in terms of the social welfare, i.e., the total users’ value) of resource allocation mechanisms for a variety of allocation and payment rules. Here, we further assume that each user has a budget constraint that invalidates strategies that yield a payment that is higher than the user’s budget. This subtle assumption (which is arguably more realistic) renders the traditional price of anarchy analysis meaningless as the set of equilibria may change drastically and their social welfare can be arbitrarily far from optimal. Instead, we study the price of anarchy using the liquid welfare benchmark that measures efficiency taking budget constraints into account. We show a tight bound of 2 on the liquid price of anarchy of the well-known Kelly mechanism and prove that this result is essentially best possible among all multi-user resource allocation mechanisms. This comes in sharp contrast to the no-budget setting where there are mechanisms that considerably outperform Kelly in terms of social welfare and even achieve full efficiency. In our proofs, we exploit the particular structure of worst-case games and equilibria, which also allows us to design (nearly) optimal two-player mechanisms by solving differential equations.
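The Kelly mechanism mentioned above is concrete enough to sketch: each user submits a bid, receives the proportional fraction of the resource, and pays her bid. The following minimal sketch (the specific valuations and the best-response dynamics are my own illustration of the no-budget setting) computes an equilibrium for two users with linear valuations.

```python
# Minimal sketch of the Kelly (proportional allocation) mechanism with
# linear valuations v_i * x_i and no budget constraints. User i submits a
# bid w_i, receives fraction w_i / sum(w), and pays w_i. The best response
# to the others' total bid W maximizes  v * w / (w + W) - w, giving
# w* = sqrt(v * W) - W  (when positive).
import math

def best_response(v, w_other):
    w = math.sqrt(v * w_other) - w_other
    return max(w, 0.0)

def kelly_equilibrium(values, iters=200):
    bids = [1.0] * len(values)
    for _ in range(iters):  # best-response dynamics
        for i, v in enumerate(values):
            bids[i] = best_response(v, sum(bids) - bids[i])
    return bids

values = [2.0, 1.0]
bids = kelly_equilibrium(values)          # converges to (4/9, 2/9)
total = sum(bids)
alloc = [w / total for w in bids]         # allocation (2/3, 1/3)
welfare = sum(v * x for v, x in zip(values, alloc))
print(alloc, welfare / max(values))       # efficiency relative to optimum
```

For these valuations the equilibrium allocation is (2/3, 1/3) and the welfare ratio is 5/6, consistent with the 4/3 price-of-anarchy bound of Johari and Tsitsiklis for the no-budget case; the abstract's point is that once budgets bind, this kind of analysis breaks down and liquid welfare is the appropriate benchmark.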


Title: Monotone game theory for multi-agent systems

Abstract: Modern society is based on complex systems populated by selfish decision makers, generally called agents, e.g. energy companies in power grids or car drivers in road traffic. A fundamental challenge in these multi-agent systems is to achieve an efficient equilibrium where the decision of each agent is optimal given the collective decisions, or those of some other agents. Computational game theory has been addressing this challenge via so-called mechanism design. In this seminar, we will adopt a mathematical perspective on computational, monotone game theory and show how to embed mechanism design into fixed-point operator theory. This perspective allows us to design multi-agent dynamics that converge to an equilibrium under mild technical assumptions on the problem data, via decentralized computation and structured information exchange.
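A hedged sketch of the kind of dynamics the abstract alludes to (the quadratic game and all parameters are my own toy example): when the game's pseudo-gradient F(x) = Ax + b is strongly monotone, the forward fixed-point iteration x <- x - gamma * F(x) converges to the unique Nash equilibrium, and each agent only needs its own partial gradient.

```python
# Toy example (mine): a two-player quadratic game whose pseudo-gradient is
# F(x) = A x + b. Since A is positive definite (its symmetric part is
# diag(2, 2)), F is strongly monotone and the decentralized iteration
#   x_i <- x_i - gamma * (partial gradient of player i's cost)
# converges to the unique Nash equilibrium, the solution of A x + b = 0.

A = [[2.0, 1.0],
     [-1.0, 2.0]]
b = [-1.0, -1.0]

def F(x):
    return [sum(A[i][j] * x[j] for j in range(2)) + b[i] for i in range(2)]

x = [0.0, 0.0]
gamma = 0.2
for _ in range(500):
    g = F(x)
    x = [x[i] - gamma * g[i] for i in range(2)]  # each agent updates locally
print(x)  # approaches (0.2, 0.6), the solution of A x + b = 0
```

The iteration matrix I - gamma * A has spectral radius about 0.63 here, so convergence is geometric; in the operator-theoretic view this is a Banach-Picard iteration of a contractive operator.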


Title: Cooperation in bandit and expert networks

Abstract: We consider networks of communicating learning agents that cooperate to solve a common task in both the bandit and expert settings. Agents use an underlying communication network to get information about actions selected by other agents. In the talk, we show that cooperation allows us to obtain bounds on the average per-agent regret that are strictly better than those for non-cooperating agents. We also use the cooperative network setting to study trade-offs between delay and feedback in online learning.


Title: Markov Games with Incomplete Information on One Side

Abstract: We study the optimal use of information in Markov games with incomplete information on one side and two states. We provide a finite-stage algorithm for calculating the limit value as the gap between stages goes to 0, and an optimal strategy for the informed player in the limiting game in continuous time. This limiting strategy induces an ε-optimal strategy for the informed player, provided the gap between stages is small.



Title: Computation of Equilibria in Resource Allocation Games

Abstract: In this talk I discuss two recent results on the computation of pure Nash equilibria in atomic splittable congestion games. When cost functions are player-specific and affine, we devise the first polynomial-time algorithm computing a pure Nash equilibrium. Our algorithm is combinatorial and computes the exact equilibrium, assuming rational input. When cost functions are player-specific, convex, and bounded by a Lipschitz constant, an equilibrium for polymatroid congestion games can be found within a running time that is pseudo-polynomial in the aggregated demand of the players.

Lastly, we show that every multimarket oligopoly with affine price functions and quadratic production costs can be transformed into an atomic splittable singleton congestion game in polynomial time. Hence, all computational results obtained for atomic splittable congestion games also hold for this class of multimarket oligopolies.


Title: On the Performance of Deferred-Acceptance Auctions

Abstract: Deferred Acceptance (DA) auctions were introduced by Milgrom and Segal (2014), motivated by the design of spectrum auctions. This framework is based on backward-greedy algorithms and yields mechanisms with remarkable incentive properties. In particular, auctions that fit under this framework are group-strategyproof, and can be implemented as an obviously-strategyproof ascending auction. Most existing work on DA auctions, however, considered only binary single-parameter problems, where each bidder is either accepted or rejected by the mechanism. In this talk, we will first overview the framework along with results on the binary setting. We will then provide a generalization of the DA auction framework to non-binary settings, which allows outcomes with multiple “levels of service” instead of an accept/reject decision. We will then describe how to apply this in order to obtain approximately welfare-maximizing DA auctions for a number of basic mechanism design problems, including: problems with polymatroid constraints or multiple knapsack constraints, the problem of scheduling jobs to minimize their total weighted completion time, and multi-unit auctions.
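The backward-greedy skeleton behind binary DA auctions is short enough to sketch. The skeleton below is generic; the instantiation (feasibility = "at most k winners", score = own bid) and all names are my own illustration, not from the talk.

```python
# Hedged sketch of the backward-greedy (deferred-acceptance) skeleton:
# start with all bidders active and iteratively reject the active bidder
# with the lowest score until the active set becomes feasible. The simple
# instantiation below (at most k winners, score = own bid) is illustrative.

def da_winners(bids, feasible, score):
    active = set(bids)
    rejected = []
    while not feasible(active):
        worst = min(active, key=lambda i: score(i, bids[i], active))
        active.remove(worst)
        rejected.append(worst)
    return active, rejected

bids = {"a": 5, "b": 3, "c": 8, "d": 1}
k = 2
winners, rejected = da_winners(
    bids,
    feasible=lambda s: len(s) <= k,
    score=lambda i, bid, active: bid,
)
# In this special case each winner's threshold payment is the highest
# rejected bid (here 3): bidding below it would have led to rejection.
print(sorted(winners), rejected)
```

The incentive properties come from the threshold-payment rule combined with the fact that a bidder's score can depend on her own bid and on the still-active set, but not on rejected bids.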


Title: Simplifying optimal strategies in zero-sum stochastic games

Abstract: We consider two classes of zero-sum stochastic games: positive stochastic games and stochastic games with the limsup/liminf evaluation. In these games, it is known that a player may have no optimal strategy at his disposal, due to discontinuity in the payoffs. We prove that when a player does have an optimal strategy then he also has a simple optimal strategy: a stationary optimal strategy or an optimal strategy that only requires finite memory.


Title: Assume Admissible Synthesis

Abstract: In this talk, I will show how to use the notion of admissibility to solve the synthesis problem for reactive systems. In particular, I will introduce a novel rule for the synthesis of reactive systems, applicable to systems made of n components, each of which has its own objective. Contrary to previous proposals in the literature, the “assume admissible” rule defines sets of solutions which are rectangular. This property leads to solutions which are robust and resilient, and allows one to synthesize strategies separately for each component.


Title: No-regret learning, games and optimization: some comments on recent advances

Abstract: The purpose of this talk is to highlight links between algorithms used in convex optimization, games, and online learning. We will comment on recent advances, in particular on:

– discrete- and continuous-time approaches

– links with variational inequalities

– extensions to infinite-dimensional setups.
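One concrete point where these threads meet is the Hedge / multiplicative-weights algorithm, which can be read as mirror descent with the entropic regularizer. The toy loss sequence below is my own; the sketch just shows the regret staying far below the worst-case sublinear rate.

```python
# Minimal sketch: Hedge / multiplicative weights over n experts, an
# instance of mirror descent with the entropic regularizer. The toy loss
# sequence (expert 0 always better) is my own choice.
import math

def hedge(losses, eta):
    """losses: list of per-round loss vectors, one entry per expert."""
    n = len(losses[0])
    cum = [0.0] * n            # cumulative loss of each expert
    total = 0.0                # algorithm's expected loss
    for loss in losses:
        w = [math.exp(-eta * c) for c in cum]
        z = sum(w)
        p = [wi / z for wi in w]                  # exponential weights
        total += sum(pi * li for pi, li in zip(p, loss))
        cum = [c + li for c, li in zip(cum, loss)]
    return total, min(cum)

T = 100
losses = [[0.0, 1.0]] * T       # expert 0 incurs no loss, expert 1 loss 1
alg_loss, best_loss = hedge(losses, eta=0.5)
regret = alg_loss - best_loss
print(regret)  # stays bounded by a small constant on this easy sequence
```

The continuous-time counterpart of this update is the replicator-type dynamic, and the equilibrium conditions it approaches can be phrased as a variational inequality, which is exactly the bridge the talk discusses.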


Title: Extortion and cooperation in repeated games

Abstract: Social dilemmas are situations in which individuals have an incentive to defect at the expense of other group members. Fortunately, when such situations occur repeatedly, reciprocal strategies like Tit-for-Tat can resolve the social dilemma. However, William Press and Freeman Dyson, a computer scientist and a theoretical physicist, have recently shown that repeated interactions also allow individuals to extort their co-players. Using an extortionate strategy, an individual can force the co-player to cooperate, although the individual itself is not fully cooperative. In my talk, I will explain how these strategies work, and I will review recent theoretical and experimental results on when extortion pays.
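Press and Dyson's extortionate strategies are memory-one strategies that enforce a linear relation between the two players' long-run payoffs. Their published example for the standard prisoner's dilemma payoffs can be checked numerically; the specific opponent below (cooperating with probability 1/2 after every outcome) is my own arbitrary choice, since the relation holds against any opponent.

```python
# Numerical check of Press and Dyson's example extortionate strategy for
# payoffs (R, S, T, P) = (3, 0, 5, 1) and extortion factor chi = 3:
# p = (11/13, 1/2, 7/26, 0) is the extorter's cooperation probability
# after outcomes (CC, CD, DC, DD). In the stationary distribution the
# payoffs satisfy  s_X - P = chi * (s_Y - P)  against any opponent.

p = [11/13, 1/2, 7/26, 0.0]   # extorter X
q = 0.5                       # opponent Y cooperates w.p. 1/2 (my choice)
PAY_X = [3, 0, 5, 1]          # X's payoff in states (CC, CD, DC, DD)
PAY_Y = [3, 5, 0, 1]

# Transition matrix of the Markov chain over states (CC, CD, DC, DD).
M = [[p[s] * q, p[s] * (1 - q), (1 - p[s]) * q, (1 - p[s]) * (1 - q)]
     for s in range(4)]

# Stationary distribution via power iteration.
v = [0.25] * 4
for _ in range(2000):
    v = [sum(v[s] * M[s][t] for s in range(4)) for t in range(4)]

s_X = sum(vi * g for vi, g in zip(v, PAY_X))
s_Y = sum(vi * g for vi, g in zip(v, PAY_Y))
print(s_X - 1, 3 * (s_Y - 1))  # the two sides agree
```

So X's surplus over the punishment payoff is exactly three times Y's, however Y plays: this is the sense in which the strategy "extorts" a fixed share of the gains from cooperation.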


Title: Distributed decision making in fully probabilistic setting

Abstract: The talk will outline one promising direction of distributed decision making. We consider a flat structure of interacting learning agents without a facilitator. The decision-making task is inspected from the viewpoint of a single selfish agent. The concept relies on no central coordinator or mediator, but instead builds on the implicit need of an agent to respect the other agents in its neighborhood. The talk will outline possible cooperation scenarios and their conceptual solution, as well as knowledge transfer between agents.


Title: Dynamic Reduced Form Allocations

Abstract: In many dynamic environments with incomplete information, all that matters to Bayesian agents is the sequence of per-period probabilities with which they get a unit assigned over time. From the designer’s perspective, an allocation rule is a more complicated object, namely, a full-fledged strategy in the dynamic game. We investigate which sequences can be obtained from such strategies. This allows a reduction in the dimensionality of the problem. Applications include dynamic pricing, or matching without transfers.


Title: Risk-Aware Planning in Partially Observable Markov Decision Processes

Abstract: Partially observable Markov decision processes (POMDPs) are a standard model of decision making under uncertainty. The vast majority of work on POMDP planning focuses on the standard notion of expectation optimization, where the task is to find a policy which maximizes the expected utility. This is not always desirable, as policies that maximize expectation might take an unacceptable risk. In our work, we consider risk-constrained expectation optimization in POMDPs. That is, we aim to find a policy which ensures that a certain constraint on the probability of bad outcomes is satisfied and, among all such policies, maximizes the expected utility. We consider two types of risk constraints: a strict constraint, which stipulates that the policy never produces an outcome whose utility is below a given threshold (a bad outcome); and a probabilistic constraint, where the probability of a bad outcome must be below a given risk bound. We present new algorithms for both of these constrained problems. The new algorithms combine the Monte Carlo tree search approach with additional steps based on techniques from the theory of turn-based discounted games and constrained Markov decision processes. They provide a sound, convergent, and efficient approach for risk-constrained POMDPs.
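The two constraint types can be stated directly on the distribution over total utilities that a policy induces. The toy distributions and numbers below are my own illustration of the definitions, not from the work itself.

```python
# Toy illustration (numbers mine) of the two risk constraints, applied to
# the distribution over total utilities induced by a policy: utility -> prob.

def satisfies_strict(dist, threshold):
    """Strict constraint: no bad outcome has positive probability."""
    return all(u >= threshold for u, pr in dist.items() if pr > 0)

def satisfies_probabilistic(dist, threshold, risk_bound):
    """Probabilistic constraint: P[utility < threshold] <= risk_bound."""
    return sum(pr for u, pr in dist.items() if u < threshold) <= risk_bound

def expected_utility(dist):
    return sum(u * pr for u, pr in dist.items())

# Two candidate policies, given by their utility distributions.
risky = {10.0: 0.9, -5.0: 0.1}   # higher expectation, 10% bad outcomes
safe = {4.0: 1.0}                # lower expectation, never bad

threshold, risk_bound = 0.0, 0.05
# Among policies meeting the probabilistic constraint, pick the best:
feasible = [d for d in (risky, safe)
            if satisfies_probabilistic(d, threshold, risk_bound)]
best = max(feasible, key=expected_utility)
print(best)  # the safe policy: the risky one violates the 5% risk bound
```

The example shows why the constrained problem differs from plain expectation maximization: the risky policy has the higher expected utility (8.5 vs 4) yet is excluded by the risk bound.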


Title: Games with distorted information with two different forms of beliefs – Belief distorted Nash equilibria and their applications

Abstract: I want to present new concepts of equilibria applicable to dynamic game-theoretic problems in which players do not have perfect information about the game they play, a situation usually not taken into account in game-theoretic modelling of various aspects of real-life problems.

The questions these concepts are designed to answer are as follows.

- Can players’ beliefs about the future values of some parameters, which are themselves results of the players’ choices, cause these future values to behave according to those beliefs?
- Can this happen even if it contradicts objective knowledge about, e.g., the dynamics of the system?
- Can such players’ behaviour be sustainable even if it does not lead to a Nash or correlated equilibrium?
- What is the mechanism?
- Can such beliefs be self-verifying? Is it possible that rational players believe they play a Nash equilibrium?

The class of games under consideration consists of discrete-time dynamic games in which players have imperfect information about the game they play.

Players can observe the state variable, which changes in response to a statistic of the players’ decisions and past values of this statistic; they form expectations about future values based on their observations and best respond to these expectations. Expectations may take various forms: either probability distributions over future scenarios, given the history and the player’s current decision, or sets of scenarios regarded as possible. A mixed version taking ambiguity into account is also possible.

We introduce the concept of pre-belief-distorted Nash equilibrium (pre-BDNE), in which at each stage of the game players best respond to their beliefs and to the current decisions of the others; belief-distorted Nash equilibrium (BDNE), a pre-BDNE at which the beliefs cannot be falsified during the play; and various concepts of self-verification of beliefs. All these concepts have different definitions for each form of beliefs.

The general idea behind BDNE is similar to all the imperfect information concepts in game theory: players form some beliefs, they best respond to those beliefs (and, possibly, some other parameters) and if their beliefs are not falsified, there is no need to update them.

It is also the idea behind scientific research in the case when players use research results for their optimization which, in turn, influences the results of that research: a model is formulated (and used by the players in their optimization), players optimize given the assumed model, and if the subsequent observations do not contradict the model, it is often regarded as a correct model of reality. The fact that players’ behaviour resulting from assuming a certain model influences those observations is not taken into account, so profiles that are not Nash equilibria can be sustained and a false model of reality can be regarded as true.

The relations between BDNE and Nash or subjective Nash equilibria are examined, as well as the existence of these equilibria.

The following examples can be used to illustrate the concepts and their properties.

- A simple ecosystem constituting a common property of its users. We assume that the number of users is large and that every player may have problems with assessing their marginal influence on the aggregate extraction and, consequently, the future trajectory of the state of the resource.
- A repeated minority game, a modification of the El Farol problem. Players choose each time whether to stay at home or to go to the bar. If the bar is overcrowded, then it is better to stay at home; the less crowded it is, the better it is to go.
- A model of a market describing either Cournot oligopoly or competitive market. Players may have problems with assessing their actual share in the market and, therefore, their actual influence on prices.
- A repeated prisoner’s dilemma. At each stage, each of the two players assesses possible future reactions of the other player to their decision to cooperate or defect at this stage.
- Formation of traffic jams.
- Prevailing of taboos in primitive societies.
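The minority-game example above shows particularly simply how homogeneous naive beliefs can falsify themselves. A tiny simulation (all parameters are my own choices, not from the talk) makes the point:

```python
# Tiny simulation (parameters mine) of a repeated minority game:
# 100 agents, bar capacity 60. If every agent naively believes that
# tonight's attendance will equal last night's and best responds to that
# belief, the beliefs falsify themselves: attendance oscillates between
# 0 and 100 instead of settling near the capacity.

N, CAPACITY = 100, 60

def step(last_attendance):
    # Each agent predicts attendance = last_attendance and goes iff the
    # bar is then expected to be under capacity.
    predicted = last_attendance
    return N if predicted < CAPACITY else 0

attendance = [50]  # initial observation
for _ in range(6):
    attendance.append(step(attendance[-1]))
print(attendance)  # [50, 100, 0, 100, 0, 100, 0]
```

Such a profile is not a BDNE, since the beliefs are falsified at every stage; heterogeneous or probabilistic beliefs are needed for the play to sustain itself.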


Title: Verification of Markov Decision Processes using Learning Algorithms

Abstract: I will present an application of machine-learning algorithms in the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the state space.

I will concentrate on the computation of maximum reachability probabilities in MDPs and present two methods based on a heuristic-driven partial exploration of the model. The first is bounded real-time dynamic programming, which yields precise lower and upper bounds on the maximum reachability probability. The second is Monte Carlo tree search, a method celebrated in various machine-learning settings, which combines exact computation using search trees with sampling methods.
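The bound-maintenance at the heart of the first method can be sketched on a toy MDP (the MDP and the plain value-iteration driver below are my own illustration; full bounded real-time dynamic programming additionally steers the updates by simulations, and upper bounds need extra care in MDPs with end components, which this toy avoids).

```python
# Sketch of the bounds behind the first method: maintain
# L <= Pr_max(reach goal) <= U and tighten both with Bellman updates
# until they (nearly) meet. The toy MDP is mine.

# MDP: state -> action -> list of (probability, next state).
MDP = {
    "s0": {
        "a": [(0.6, "goal"), (0.4, "sink")],
        "b": [(0.5, "goal"), (0.3, "s0"), (0.2, "sink")],
    },
}

def bellman(val):
    new = dict(val)
    for s, actions in MDP.items():
        new[s] = max(sum(pr * val[t] for pr, t in succ)
                     for succ in actions.values())
    return new

L = {"s0": 0.0, "goal": 1.0, "sink": 0.0}   # lower bound: goal only
U = {"s0": 1.0, "goal": 1.0, "sink": 0.0}   # upper bound: all but the sink
for _ in range(200):
    L, U = bellman(L), bellman(U)

print(L["s0"], U["s0"])  # both approach 0.5 / 0.7 = 5/7, via action "b"
```

Once U - L falls below a tolerance at the initial state, the exploration can stop, which is exactly what allows these methods to avoid building the full state space.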