A central problem in artificial intelligence is to plan to maximizefuture reward under uncertainty in a partially observable environment. Models of such environments include Partially ObservableMarkov Decision Processes (POMDPs) as well as their generalizations, Predictive State Representations (PSRs) and Observable Operator Models (OOMs). POMDPs model the state ofthe world as a latent variable; in contrast, PSRs and OOMs represent state by tracking occurrence probabilities of a set of futureevents (called tests or characteristic events) conditioned on pastevents (called histories or indicative events). Unfortunately, exactplanning algorithms such as value iteration are intractable formost realistic POMDPs due to the curse of history and the curse ofdimensionality. However, PSRs and OOMs hold the promiseof mitigating both of these curses: first, many successful approximate planning techniques designed to address these problems inPOMDPs can easily be adapted to PSRs and OOMs. Second,PSRs and OOMs are often more compact than their correspondingPOMDPs (i.e., need fewer state dimensions), mitigating the curseof dimensionality. Finally, since tests and histories are observablequantities, it has been suggested that PSRs and OOMs should beeasier to learn than POMDPs; with a successful learning algorithm,we can look for a model which ignores all but the most importantcomponents of state, reducing dimensionality still further.In this paper we take an important step toward realizing the abovehopes. In particular, we propose and demonstrate a fast and statistically consistent spectral algorithm which learns the parametersof a PSR directly from sequences of action-observation pairs. Wethen close the loop from observations to actions by planning in thelearned model and recovering a policy which is near-optimal in theoriginal environment. Closing the loop is a much more stringenttest than simply checking short-term prediction accuracy, since thequality of an optimized policy depends strongly on the accuracy ofthe model: inaccurate models typically lead to useless plans. Closing the Learning-Planning Loop with Predictive State Representations
Intelligent agents have to be provided with different skillsand technological resources in order to deal with highly changing environments, uncertain, incomplete and potentially inconsistent information and bounded computational resources.BDI architectures, argumentation-based techniques and recent technologies like web services have been incorporatedin the design of intelligent agents to address some of thesedifficult aspects. However, they have usually been focusedon some partial aspects and it is not easy to see how thesemodels and technologies could be effectively integrated in asingle framework. In this work, we propose a general framework that integrates web services and argumentation-basedinference into the design of a BDI system. Our proposalextends the well known advantages of BDI systems by using web services to access relevant information available onthe Web and defeasible argumentation inference to manageincomplete and potentially inconsistent information. An approach to integrate web services and argumentation into a BDI system
Based on a semiotic perspective, this paper proposes a first step towards combining the advantages from both AOSE methods and AI inspired agent organization (AO) models in order to propose a MAS project development cycle using an organization centered approach. A Semiotic Perspective for Multiagent Systems Development
The Electronic Institutions (EIs) framework is designed forregulating interactions among heterogeneous agents in opensystems. In EIs, agent interactions are speech acts whoseexchange is organized as conversation protocols called scenes.Agents can participate simultaneously in multiple scenesplaying a single role in each one of them. However, at somepoint, the execution of a given scene may require the presence of an agent playing a particular role. When such anagent is missing, a deadlock may ensue unless the institution or the agents themselves can invoke the participationof an agent to play the missing role. Such functionality isnot provided in the current EI framework. We propose anextension of the framework that addresses that problem in ageneric way: the provision of an institutional agent in chargeof instantiating new agents and dispatching them to scenesthrough a participation request protocol. In this paper wemake the proposal precise and illustrate it with a use case. Requesting agent participation on Electronic Institutions
Wikis today are being used as a tool to conduct collaborativewriting assignments in classrooms. However, typical Wikis (1) donot provide group formation methods to improve the collaborativelearning of the students and (2) suffer from typical problems ofcollaborative learning like free-riding (earning credit withoutcontribution) and lacking conveniences to facilitate teacherinterventions. To improve the state of the art of the typical Wikisused in classrooms, we have designed and implementedClassroomWiki — a Web-based collaborative Wiki writing toolthat combines a set of learner pedagogy theories with multiagenttracking, modeling, and group formation. For the students,ClassroomWiki provides a Web interface for writing and revisingtheir group's Wiki and a topic-based forum for discussing theirideas during collaboration. When the students collaborate,ClassroomWiki's agents track all student activities and builddetailed student models that represent their contributions towardtheir groups and uses MHCF algorithm to form student groups toimprove the collaborative learning of students. We have deployedClassroomWiki in two university-level courses to investigate theimpact of ClassroomWiki. The results show that ClassroomWikican (1) improve the collaborative learning outcome of the studentsby its group formation framework and (2) alleviate free-riding andfacilitate teacher interventions by its multiagent tracking andmodeling. ClassroomWiki: A Wiki for the Classroom with Multiagent Tracking, Modeling, and Group Formation
n this paper we combine two different economically inspired mechanisms, acting at intersection and at networklevel, respectively, to accommodate driver preferences inreservation-based urban traffic management. At intersectionlevel, intersection manager agents assign space-time chunksthrough combinatorial auctions, while at network level apricing scheme, based on general market equilibrium, accounts for an efficient use of network resources. Our experiments show that this combined approach on the one handallows drivers to effectively improve their travel times if theyare willing to pay more money for their trip, while on theother hand the negative impact on social welfare (averagetravel times) is unnoticeable. Accommodating driver preferences in reservation-based urban traffic management
Abductive inference has many known applications in multi-agent systems. However, most abductive frameworks relyon a centrally executed proof procedure whereas many ofthe application problems are distributed by nature. Confidentiality and communication overhead concerns often preclude transmitting all the knowledge required for centralisedreasoning. We present in this paper a novel multi-agent abductive reasoning framework underpinned by a flexible andextensible distributed proof procedure that permits collaborative abductive reasoning with constraints between agentsover decentralised knowledge. Distributed Abductive Reasoning with Constraints
Norms, to become effective, must be recognised as normsby agents. These agents must be able to accept normsbut maintaining their autonomy. In this paper, the multi-context BDI agent architecture has been extended with arecognition and a normative context in order to allow agentsto acquire norms from their environment and consider normsin their decision making processes. A BDI Architecture for Normative Decision Making
We extend the MC-DCOP model to problems where each agentcontrols multiple variables, map a service-oriented computing domain to this MV-MC-DCOP model, and use the solutions as a preprocessing step to an existing inexact MDP solver. The Multi Variable Multi Constrained Distributed Constraint Optimization Framework
In this paper, we study the concepts of male optimality anduniqueness of stable marriages for partially ordered preferences. We give an algorithm to find a stable marriage that ismale optimal, and a sufficient condition on the preferences,which guarantees the uniqueness of stable marriages. Male optimality and uniqueness in stable marriage problems with partial orders
We explore how local interactions can simplify the process of decision-making in multiagent systems. We review decentralized sparse-interaction Markov decision process thatexplicitly distinguishes the situations in which the agentsin the team must coordinate from those in which they canact independently. We situate this class of problems withindifferent multiagent models, such as MMDPs and transitionindependent Dec-MDPs. We contribute new algorithm for efficient planning in this class of problems. We provideempirical comparisons between our algorithms and other existing algorithms for this class of problems. Approximate Planning for Decentralized MDPs with Sparse Interactions
Agents that solve problems in unknown graphs are usuallyrequired to iteratively explore parts of the graph. In thispaper we research the problem of finding a $k$-clique in anunknown graph while minimizing the number of required exploration actions. Two novel heuristics ($KnownDegree$ and$Clique^*$) are proposed to reduce the required explorationcost by carefully choosing which part of the environmentto explore. We further investigate the problem by addingprobabilistic knowledge of the graph and propose an MDPand a Monte Carlo based heuristic ($RClique^*$ ) that usesknowledge of edges probabilities to reduce the required exploration cost. The efficiency of the proposed approaches isdemonstrated on simulated random and scale free graphs. Searching for a k-Clique in Unknown Graphs
In this paper, we study a particular subclass of partially observablemodels, called quasi-deterministic partially observable Markov decision processes (Q{\sc DET-POMDP}s), characterized by deterministictransitions and stochastic observations. While this framework doesnot model the same general problems as POMDPs, it still captures anumber of interesting and challenging problems and have, in somecases, interesting properties. By studying the observability available in this subclass, we suggest that Q{\sc DET-POMDP}s may fall manysteps in the complexity hierarchy. An extension of this frameworkto the decentralized case also reveals a subclass of numerous problems that can be approximated in polynomial space. Quasi Deterministic POMDPs and DecPOMDPs
In this paper, we introduce a new heuristic search algorithmbased on mean values for real-time planning, called MHSP.It consists in associating the principles of UCT, a bandit-based algorithm which gave very good results in computergames, and especially in Computer Go, with heuristic searchin order to obtain a real-time planner in the context of classical planning. Compared to UCT, at leaf nodes, MHSPreplaces the simulations by heuristic values given by planning graph techniques. When the heuristic is admissible,the initial mean values of nodes are optimistic, which is acorrect way of guiding exploration. A Mean-based Approach for Real-Time Planning
Decentralized POMDPs are powerful theoretical models for coordinating agents' decisions in environments with uncertainty, but the generally intractable complexity of optimal joint policy construction presents a significant obstacle in applying DEC-POMDPs to problems where many agents face many policy choices. Here, we argue that when most agent choices are independent of peers' choices, much of this complexity can be avoided: instead of coordinating full policies, agents need only coordinate policy abstractions that explicitly convey the essential interaction influences. To this end, we develop a novel framework for abstracting the influences of a general class of transition-dependent Dec-POMDP agents where the compactness of agents' nonlocal models is a function of the degree to which they interact with their peers (and not the number of peers). In addition to the computational advantages over state-of-the-art policy search method (supported by an initial empirical comparison), our framework has the benefits of agent privacy and flexibility for approximation. From Policies to Influences: A Framework For Nonlocal Abstraction In Transition-dependent Dec-POMDP Agents
Knowledge transfer is a powerful approach to solve Markov Decision Processes. In this paper, we present approaches that use bisimulation-style metrics (Ferns, Panangaden & Precup, 2004) to compute the similarity of states in a large problem to states in smaller problems, which might have already been solved. We propose algorithms that decide what actions to transfer form the small to the large problem, given this information. We also show that this approach can be used, even more successfully, when using temporally extended actions (Sutton, Precup & Singh, 1999). We present theoretical guarantees on the quality of the transferred policy, as well as promising empirical results. Using bisimulation for policy transfer in MDPs
We argue that taking surprise into account in the artificialagents's reasoning may have advantageous implications invarious situations. Relying on theoretical and empirical evidence, our arguments are supported by the application ofsurprise-based agents to three different domains, namely exploration of unknown environments, divergent productionand evaluation of creative products, and selective attentionto travel information. The Practical Advantage of Surprise-based Agents
The paper develops a new approach to bounded model checking fora logic of knowledge and branching time. Experimental results arepresented that demonstrate improved model checking performance,compared with previous approaches, on a range of examples. Improved Bounded Model Checking for a Fair Branching-Time Temporal Epistemic Logic
While much work has focused on the creation of norm aware agents,much less has been concerned with aiding a system's designers inunderstanding the effects of norms on a system. However, sincenorms are generally pre-determined by designers, providing suchsupport can be critical in enabling norm refinement for more effective or efficient system regulation. In this paper, we address just thisproblem by providing explanations as to why some norm is applicable, violated, or in some other state. We make use of conceptualgraph based semantics to provide an easily interpretable graphicalrepresentation of the norms within a system. Such a representationallows for visual explanation of the state of norms, showing forexample why they may have been activated or violated. These explanations then enables easy understanding of the system operationwithout needing to follow the system's underlying logic. Graphically Explaining Norms
We present an agent-based coordination and planning methodfor autonomous aerial surveillance of multiple urban areasusing a group of fixed-wing unmanned aerial vehicles (UAVs).The goal of the surveillance is to observe a set of groundpoints of interest within the target areas as often as possible. The method differs from the existing work by explicitconsideration of sensor occlusions that can occur due to highbuildings and/or other obstacles in the target area. The solution employs a decomposition of the problem in two subproblems: the problem of single-area surveillance and theproblem of allocating UAVs to multiple areas. The overallmethod is evaluated empirically on a realistic simulation ofaerial surveillance built using the AgentFly framework. Occlusion-aware Multi-UAV Surveillance
This paper addresses the problem of identifying player coordination patterns in multi-player adversarial games. In the Rush 2008football simulator, we observe that each play relies on the effortsof different subgroups within the main team to score team touchdowns. We present a method to automatically identify these subgroups from historical play data based on: 1) mutual informationbetween the offensive player, defensive blocker, and ball location2) the observed ball work flow. After extracting these subgroups,we demonstrate how subgroups can be used to create new playsby performing play adaptations of existing offensive plays tuned tocounter specific defensive plays. Identifying and Utilizing Subgroup Coordination Patterns in Team Adversarial Games
Determining the optimal coalition structure is a central problem in multi-agent systems. Two popular techniques includedynamic programming and anytime search algorithms. Dynamic programming algorithms guarantee an optimal solution and have the best worst case running time. Anytime algorithms are flexible as they can terminate before the searchhas completed, but have a significantly poorer worst caseruntime. This paper provides an anytime dynamic programming algorithm with the worst case runtime of dynamic programming and the flexibility of anytime search. Anytime Dynamic Programming for Coalition Structure Generation
In this paper we address efficient decentralised coordination for cooperative multi-agent systems (framed as DCOPs) by taking intoaccount the communication and computational resources of the system. We focus on techniques that exploit structural independenceamong agents' actions to provide optimal solutions to the coordination problem, and, in particular, we use the Generalized DistributedLaw (GDL) algorithm. In this settings, we propose a novel resourceaware heuristic to build junction trees and to schedule GDL computations across the agents. Our approach aims at minimising directlythe total running time of the coordination process, rather than thetheoretical complexity of the computation, by considering computational and communication capabilities of agents. Efficient Multi-Agent Coordination Using Resource-Aware Junction Trees
In this paper we demonstrate how decentralized multi-projectscheduling problems can be solved efficiently by a group ofproject manager agents playing a simple sequence learninggame. In the multi-project scheduling problem, multipleprojects, each having a number of activities, must be scheduled. A set of local and global resources are available forcarrying out the activities of the projects.It is shown that the sequence learning game improves thebest objective function value found (minimal average project delay). In fact, the combination of local reinforcementlearning, the sequence learning game and a smart forward-backward implementation of the serial scheduler realizes, onaverage over all MPSPLIB benchmark instances, a 25\% improvement on the best published results. A Game Theoretic Approach to Decentralized Multi-Project Scheduling
The Distributed Stochastic Algorithm (DSA) is a distributedhill-climbing technique for solving large Distributed Constraint Optimization Problems (DCOPs) such as distributedscheduling, resource allocation, and distributed route planning. The best known version of DSA, DSA-B, works byhaving agents change their assignments with probability $p$when making that change will improve their solution (ahill-climbing move). To escape local minima, DSA-B performs a lateral escape move by switching to another equallygood value with the same probability $p$. It is unclear whyhill climbing and escape moves are chosen with the sameprobability. We investigate the performance effects of making these moves with different probabilities, $p_H$ and $p_L$.Through empirical evaluation, we discover that the efficiencyof DSA can not only be considerably improved, but canbe more specifically tuned to a particular domain or user'sneeds when these two move types are considered separately.Our work also shows that DSA can outperform both DBAand DPP when it is properly tuned. Improving the Efficiency of the Distributed Stochastic Algorithm
The problem of multiagent Gaussian inference in a dynamicenvironment, also known as distributed Kalman filtering,is formulated into the framework of message passing algorithms. Upon generalizing the derivation of the standardKalman filter to the distributed case, we propose novel solutions that outperform current state of the art techniques. Collaborative Multiagent Gaussian Inference in a Dynamic Environment Using Belief Propagation
In this paper, we describe the how a multiagent Simple TemporalProblem can be partitioned into private and shared componentswith important implications for privacy and concurrency. Partitioning the Multiagent Simple Temporal Problem for Concurrency and Privacy
This paper proposes Market-based Iterative Risk Allocation (MIRA),a new market-based decentralized optimization algorithm for multi-agent systems under stochastic uncertainty, with a focus on problems with continuous action and state space. In large coordinationproblems, from power grid management to multi-vehicle missions,multiple agents act collectively in order to maximize the performance of the system, while satisfying mission constraints. Theseoptimal action plans are particularly susceptible to risk when uncertainty is introduced. We present a decentralized optimization algorithm that minimizes the system cost while ensuring that the probability of violating mission constraints is below a user-specified upper bound.We build upon the paradigm of risk allocation, in which theplanner optimizes not only the sequence of actions, but also its allocation of risk among state constraints. We extend the concept ofrisk allocation to multi-agent systems by highlighting risk as a resource that is traded in a computational market. The equilibriumprice of risk that balances the supply and demand is found by aniterative price adjustment process called t\^{a}tonnement (also knownas Walrasian auction). Our work is distinct from the classical t\^{a}tonnement approach in that we use Brent's method to provide fastguaranteed convergence to the equilibrium price. The simulationresults demonstrate the efficiency and optimality of the proposeddecentralized optimization algorithm. Market-based Risk Allocation for Multi-agent Systems
A new general framework for agent cooperation and coordinationin solving distributed constraint satisfaction problems (DCSPs) ispresented. The Asynchronous Partitioning Framework (APF) firstpartitions agents into groups of agents, based on some heuristic,prior to any search being conducted. During the partitioning phaseone of the agents in each group is assigned the role of a groupleader. Next, two distinct types of search processes among theagents are performed concurrently. The first type of search isconducted within each group, in parallel and asynchronously toall searches in other groups. The second type of search, the globalsearch, is conducted between the groups, and treats each group asif it is a single agent represented by its group leader. The structureof the groups remains static throughout the search processes. Twodistinct algorithms implementing APF are presented, and theadvantages of APF are evaluated experimentally. Asynchronous Partitioning Framework
We model the multi-level alliance forming ability of malebottlenose dolphins to develop a decentralized multi-levelcoalition formation algorithm for a multi-agent system. Thegoal is to produce a model that is rich enough to capturethe biological phenomenon of forming alliances, yet remainsimple so that it can be implemented on engineered systems,such as network of unmanned vehicles. Biologically Inspired Coalition Formation of Multi-Agent Systems
Most clustering methods rely on central data structures and/or cannot cope with dynamically changing settings. However, issues related to the current use of Internet resources (distribution of data,privacy, etc.) require new ways of dealing with data clustering. Inmultiagent systems this is also becoming an issue as one wishesto group agents in an efficient way and according to some featuresof the environment. In this paper we briefly discuss how a distributed clustering algorithm that is inspired by swarm intelligencetechniques is used in problems of task allocation. Distributed Clustering for Group Formation and Task Allocation
In this paper we identify the spreading components requiredto design convention emergence mechanisms. We have empirically tested the effectiveness of our approach to emergeconventions in scenarios where the space of conventions islarge. Moreover, we identify their shortcomings when usedfor the emergence of stable, global conventions despite unreliable communications (be them because of noise, maliciousness, or errors). Therefore, in order to guarantee the robustemergence of conventions we propose to endow agents witha self-protection component. Convention Emergence Through Spreading Mechanisms
This paper contributes to the field of humoroids — humor-equipped conversational agents. We present our multi-agentsystem (multi-humoroid), which tells jokes according to users'emotions, in order to make them feel better. We briefly describethe components and the design of the system, and present resultsof two experiments showing that the system with humor wasevaluated as generally better than a baseline dialogue agent. Multi-humoroid: Joking System That Reacts With Humor To Humans' Bad Moods
As robots become more commonplace, the tools to facilitate knowledge transfer from human to robot will be vital, especially for non-technical users. While some ongoingwork considers the role of human reinforcement in intelligent algorithms, the burden of learning is often placed solelyon the computer. These approaches neglect the expressive capabilities of humans, especially regarding our abilityto quickly refine motor skills. Thus, when designing autonomous robots that interact with humans, not only is itimportant to leverage machine learning, but it is also veryuseful to have the tools in place to facilitate the transfer ofknowledge between man and machine. We introduce such atool for enabling a human to transfer motion learning capabilities to a robot.In this paper, we propose a general framework for MotionAcquisition in Robots through Iterative Online EvaluativeTraining (MARIOnET). Specifically, MARIOnET represents a direct and real-time interface between a human ina motion-capture suit and a robot, with a training processthat provides a convenient human interface and requires notechnical knowledge. In our framework, the learning happens exclusively by the human - not the robot. However,the robot provides a natural interface for interaction, andis able to store and reuse trained behaviors autonomouslyin the future. Our approach exploits the ability at whichhumans are able to learn and refine fine-motor skills.Implemented on two robots (one quadruped and one biped),our results indicate that both technical and non-technicalusers are able to harness MARIOnET to quickly improve arobot's performance of a task requiring fine-motor skills. MARIOnET: Motion Acquisition for Robots through Iterative Online Evaluative Training
This paper describes a model, ASKNEXT, for connecting agentsusing social networks for knowledge exchanges using email. Itproposes a protocol and a mathematical model to understand howemails spread throughout social networks of agents and people,which can be used to predict the scalability of agents exchangingemails to find answers to questions. Automation of social networks with QA agents
This paper provides a novel approach to multi-agent coordination in general sum Markov games. Contrary to whatis common in multi-agent learning, our approach does notfocus on reaching a particular equilibrium between agentpolicies. Instead, it learns a basis set of special joint agentpolicies, over which it can randomize to build different solutions.The main idea is to tackle a Markov game by decomposingit into a set of multi-agent common interest problems; eachreflecting one agent's preferences in the system. With onlya minimum of coordination, simple reinforcement learningagents using Parameterised Learning Automata are able tosolve this set of common interest problems in parallel. Asa result, a team of simple learning agents becomes able toswitch play between desired joint policies rather than mixingindividual policies. Taking Turns in General Sum Markov Games
Coordinating multiagent systems to maximize global information collection both presents scientific challenges (whatshould each agent aim to achieve?) and provides application opportunities (planetary exploration, search and rescue). In particular, in many domains where communicationis expensive (for example, because of limited power or computation), the coordination must be achieved in a passivemanner, without agents explicitly informing other agents oftheir states and/or intended actions. In this work, we extend results on such multiagent coordination algorithms todomains where the agents cannot achieve the required taskswithout forming teams. Robot Coordination with Ad-hoc Team Formation
MASQ (Multi-Agent Systems based on Quadrants) is a generic meta-model that proposes an integral view of interaction in a multi-agent system. It defines four perspectivesaccording to two axes: internal/external and individual/collective.This paper identifies a set of MASQ principles and analyzes how a formal model can be built based on them. Weoutline the most important design choices that have to bemade and their impact on the architecture of the resultingmulti-agent system. A formal approach to MASQ
Psychologists note that humans regularly use categories tosimplify and speed the process of person perception. Theinfluence of categorical thinking on interpersonal expectations iscommonly referred to as a stereotype. This research explores theconstruction and use of stereotypes in human-robot interaction.We present a novel algorithm that creates generalized models of arobot's interactive partner. The results of this work have potentialimplications for social robotics, autonomous agents, and possiblypsychology. Using Stereotypes to Understand One's Interactive Partner
We develop an epistemic logic to specify and reason about information flow and its underlying communication channels. By combining ideas from Dynamic Epistemic Logic (DEL) and InterpretedSystems (IS), our semantics offers a natural and neat way of modelling multi-agent communication scenarios with different assumptions about the observational power of agents. Logic of Information Flow on Communication Channels
We investigate the problem of locally monitoring contractregulated behaviours agent-based in web services. We encode contract clauses in service specifications by using extended timed automata. We propose a non intrusive localmonitoring framework along with an API to monitor thefulfilment (or violation) of contractual obligations. We illustrate our methodology by monitoring a service compositionscenario from the vehicle repair domain, and report on theexperimental results. Runtime monitoring of contract regulated web services
This paper investigates search techniques for multi-agentsettings in which the most suitable agent needs to be foundand the goal is to minimize the expected cost of search.Given the ability to vary the extent of search, a search strategy is a sequence of search iterations of varying extent. Iterative Expanding Search in Multi-Agent Systems
We apply game-theoretic techniques to urban security deploymentand propose new algorithms to efficiently solve real-world domainsthat are intractable with existing algorithms. How to Protect a City: Strategic Security Placement in Graph-Based Domains
We study a problem where a group of agents has to decidehow some fixed value should be shared among them. Weare interested in settings where the share that each agentreceives is based on how that agent is evaluated by othermembers of the group, where highly regarded agents receive a greater share compared to agents that are not wellregarded. We introduce two mechanisms for determiningagents' shares: the peer-evaluation mechanism, where eachagent gives a direct evaluation for every other member ofthe group, and the peer-prediction mechanism, where eachagent is asked to report how they believe group members willevaluate a particular agent. The sharing is based on the provided information. While both mechanisms are individuallyrational, the first mechanism is strategy-proof and budget-balanced, but it can be collusion-prone. Further, the secondmechanism is collusion-resistant and incentive-compatible. Sharing a Reward Based on Peer Evaluations
Game-theoretic analyses of multi-agent systems typically assume that all agents have full knowledge of everyone's possible moves, information sets and utilities for each outcome.Bayesian games relax this assumption by allowing agents tohave different "types," representing different beliefs aboutthe game being played, and to have uncertainty over otheragents' types. However, applications of Bayesian games almost universally assume that all agents share a commonprior distribution over everyone's type. We argue, in concordwith certain economists, that such games fail to accuratelyrepresent many situations. However, when the commonprior assumption is abandoned, several modeling challengesarise, one of which is the emergence of complex belief hierarchies. In these cases it is necessary to specify which partsof other agents' beliefs are relevant to an agent's decision-making (or need be known by that agent). We address thisissue by suggesting a concise way of representing Bayesiangames with uncommon priors. Our representation centersaround the concept of a block, which groups agents' view of(a) the game being played and (b) their posterior beliefs.This allows us to construct the belief graph, a graphicalstructure that allows agents' knowledge of other agents' beliefs to be carefully specified. Furthermore, when agents'views of the world are represented by extensive form games,our block structure places useful semantic constraints onthe extensive form trees. Our representation can be usedto naturally represent games with rich belief structures andinteresting predicted behavior. Representing Bayesian Games Without a Common Prior
The concept of centrality plays an important role in network analysis. Game theoretic centrality measures have been recently proposed, which are based on computing the Shapley Value (SV) ofeach node (agent) in a suitably constructed co-operative networkgame. However, the naive method of exactcomputation of SVs takes exponential time in the number of nodes.In this paper, we develop analytical formulas for computing SVsof nodes for various kinds of centrality-related co-operative gamesplayed on both weighted and unweighted networks. These formulas not only provide an efficient and error-free way of computingnode centralities, but their surprisingly simple closed form expressions also offer intuition into why certain nodes are relatively moreimportant to a network. Game Theoretic Network Centrality: Exact Formulas and Efficient Algorithms
In this paper we extend the class of Bayesian coordination games toinclude explicit observation and communication. This general classof problems includes the canonical multi-door multi-agent Tigerproblem. We argue that this class of games is appropriate for situations where the agents observation, communication and payoff-earning actions are limited by some common resource, without introducing arbitrary penalties for communicating (unlike most existing approaches). Valuing Search and Communication in Partially-Observable Coordination Problems
First-order (i.e., gradient-based) methods for solving two-personzero-sum sequential games of imperfect information have recentlybecome important tools in the construction of game theory-basedagents. The computation time per iteration is typically dominatedby matrix-vector product operations involving the payoff matrix$A$. In this paper, we describe two techniques for scaling up thisoperation. The first is a randomized sampling technique that approximates $A$ with a sparser matrix $\tilde{A}$. Then an approximate equilibrium for the original game is found by finding an approximateequilibrium of the sampled game. The second technique involvesthe development of an algorithm and system for performing thematrix-vector product on a cache-coherent Non-Uniform MemoryAccess (ccNUMA) architecture. The two techniques can be appliedtogether or separately, and they each lead to an algorithm that significantly outperforms the fastest prior gradient-based method. Speeding Up Gradient-Based Algorithms for Sequential Games
We propose a new equilibrium concept, perfect cooperativeequilibrium (PCE), which may help explain players' behaviorin games where cooperation is observed in practice. We alsoconsider a few related equilibrium concepts that take intoaccount the degree of cooperation. Cooperative Equilibrium
Recent work has applied game-theoretic models to real-world security problems at the Los Angeles International Airport (LAX)and Federal Air Marshals Service (FAMS). The analysis of thesedomains is based on input from domain experts intended to capture the best available intelligence information about potential terrorist activities and possible security countermeasures. Nevertheless, these models are subject to significant uncertainty — especiallyin security domains where intelligence about adversary capabilities and preferences is very difficult to gather. This uncertaintypresents significant challenges for applying game-theoretic analysis in these domains. Our experimental results show that standard solution methods based on perfect information assumptionsare very sensitive to payoff uncertainty, resulting in low payoffs forthe defender. We describe a model of Bayesian Stackelberg gamesthat allows for general distributional uncertainty over the attacker'spayoffs. We conduct an experimental analysis of two algorithms forapproximating equilibria of these games, and show that the resulting solutions give much better results than the standard approachwhen there is payoff uncertainty. Robust Bayesian Methods for Stackelberg Security Games
The computation of a Nash equilibrium in a game is a challenging problem in artificial intelligence. This is because thecomputational time of the algorithms provided by the literature is, in the worst case, exponential in the size of the game.To deal with this problem, it is common the resort to concepts of approximate equilibrium. In this paper, we follow adifferent route, presenting, to the best of our knowledge, thefirst algorithm based on the combination of support enumeration methods and local search techniques to find an exactNash equilibrium in two-player general-sum games and, inthe case no equilibrium is found within a given deadline, toprovide an approximate equilibrium. We design some dimensions for our algorithm and we experimentally evaluatethem with games that are unsolvable with the algorithmsknown in the literature within a reasonable time. Our preliminary results are promising, showing that our techniquescan allow one to solve hard games in a short time. Local Search Techniques for Computing Equilibria in Two-Player General-Sum Strategic-Form Games
Although combinatorial auctions have received a great deal of attention from the computer science community over the past decade,research in this domain has focussed on settings in which a bidderonly has preferences over the bundles of goods they themselvesreceive, and is indifferent about how other goods are allocated toother bidders. In general, however, bidders in combinatorial auctions will be subject to externalities: they care about how the goodsthey are not themselves allocated are allocated to others. Our aimin the present work is to study such combinatorial auctions withexternalities from a computational perspective. We first presentour formal model, and then develop a classification scheme for thetypes of externalities that may be exhibited in a bidder's valuationfunction. We develop a bidding language for combinatorial auctions with externalities, which uses weighted logical formulae torepresent bidder valuation functions. We then investigate the properties of this representation: we study the complexity of the winnerdetermination problem, and characterise the complexity of classifying the properties of valuation functions. Finally, we considerapproximation methods for winner determination. Combinatorial Auctions with Externalities
This paper presents an approach to automated mechanism design inthe domain of double auctions. We describe a novel parameterizedspace of double auctions, and then introduce an evolutionary searchmethod that searches this space of parameters. The approach evaluates auction mechanisms using the framework of the TAC MarketDesign Game and relates the performance of the markets in thatgame to their constituent parts using reinforcement learning. Experiments show that the strongest mechanisms we found using thisapproach are able to win the Market Design Game against known,strong opponents. A Grey-Box Approach to Automated Mechanism Design
We study a more powerful variant of false-name manipulation in Internet auctions: an agent can submit multiple false-name bids, butthen, once the allocation and payments have been decided, withdraw some of her false-name identities (have some of her false-name identities refuse to pay). While these withdrawn identitieswill not obtain the items they won, their initial presence may havebeen beneficial to the agent's other identities. We define a mechanism to be false-name-proof with withdrawal (FNPW) if the aforementioned manipulation is never beneficial. FNPW is a strongercondition than false-name-proofness (FNP).We discuss the relation between FNP and FNPW in general combinatorial auction settings. We also propose the maximum marginalvalue item pricing (MMVIP) mechanism, which we prove is FNPW.(The full version contains a number of other results.) False-Name-Proofness with Bid Withdrawal
We address the problem of designing efficient mechanismsthat never yield revenue, instead requiring small subsidies.Such mechanisms will be pertinent for settings in whichtaxing agents is undesirable or impractical — imagine, e.g.,a government or private philanthropist that seeks to makea minimal monetary contribution that will allow a groupof individuals to reach an efficient decision without beingstripped of any of the surplus. Our approach is a close analog of Cavallo (2006), where structure in agent valuations is used to arrive at agent-independent revenue lower bounds for the VCGmechanism that form the basis for redistributing VCG revenue back to the agents without distorting incentives; in thecurrent paper we use valuation structure to obtain revenueupper bounds that form the basis for a second stage of redistribution, returning the revenue of the redistribution mechanism of Cavallo (2006) such that revenue is non-positive but still close tozero. The mechanism we propose is applicable to arbitrarydecision problems, always achieving dominant strategy efficiency, ex post individual rationality, and no-revenue. Insingle-item allocation settings it is asymptotically stronglybudget-balanced as the population size grows; we show empirically that for standard distributions over valuations itrequires subsidies that are less than 5\% of social value, inexpectation, for groups of more than 5 agents. Efficient Mechanisms with Small Subsidies
We present a series of results providing evidence that the incentive problem with approximate VCG-based mechanismsis often not very severe. Our first result uses average-caseanalysis to show that if an algorithm can solve the allocationproblem well for a large proportion of instances, incentivesto lie essentially disappear. We next show that even if suchincentives exist, a simple enhancement of the mechanismmakes it unlikely that any player will find an improving deviation. Additionally, we offer a simulation-based techniqueto verify empirically the incentive properties of an arbitraryapproximation algorithm and demonstrate it in a specificinstance using combinatorial auction data. Incentive Analysis of Approximately Efficient Allocation Algorithms
Combinatorial auctions (CAs) are an important mechanismfor allocating multiple goods while allowing self-interestedagents to specify preferences over bundles of items. Winner determination for a CA is known to be NP-complete.However, restricting the problem can allow us to solve winner determination in polynomial time. These restrictionssometimes apply to the CA's representation. There are twocommonly studied, and structurally different graph representations of a CA: bid graphs and item graphs. We studythe relationship between these two representations.We show that for a given combinatorial auction, if a graphwith maximum cycle length three is a valid item graph forthe auction, then its bid graph representation is a chordalgraph. Next, we present a new technique for constructingitem graphs using a novel definition of equivalence amongcombinatorial auctions. The solution to the WDP for a givenCA can easily be translated to a solution on an equivalentCA. We use our technique to simplify item graphs, and showthat if a CA's bid graph is chordal, then there exists anequivalent CA with a valid item graph of treewidth one, forwhich a solution to the WDP is known to be efficient. Thisresult demonstrates how CA equivalence can simplify thestructure of item graphs and lead to more efficient solutionsto the WDP, which are also a solutions to the WDP for theoriginal auctions. An Investigation of Representations of Combinatorial Auctions
Combinatorial auctions (CAs) have been studied by the multiagent systems community for some time, since these auctions are an effective mechanism for resource allocation whenagents are self-interested. One challenge, however, is thatthe winner-determination problem (WDP) for combinatorialauctions is NP-hard in the general case. However, there areways to leverage meaningful structure in the auction so as toachieve a polynomial-time algorithm for the WDP. In thispaper, using the formal scope of parameterized complexitytheory, we systematically investigate alternative parameterizations of the bids made by the agents (i.e. the input tothe WDP for combinatorial auctions) and are able to determine when a parameterization reduces the complexity ofthe WDP (fixed-parameter tractable), and when a particularparameterization results in the WDP remaining hard (fixed-parameter intractable). Our results are relevant to auctiondesigners since they provide information as to what types ofbidding-restrictions are effective for simplifying the winnerdetermination problem, and which would simply limit theexpressiveness of the agents while not providing any additional computational gains. Parameterizing the Winner Determination Problem for Combinatorial Auctions
In this work, we propose a novel option pricing mechanism for reducing the exposure problem encountered by bidders with complementary valuations when participating in sequential, second-priceauction markets. In our mechanism, both the option and the exercise price are determined dynamically, by the bids received in eachauction. We show that our flexible options model can achieve bettermarket allocation efficiency, at an only marginal cost to seller revenues compared to existing state of the art option pricing models. Flexibly Priced Options: A New Mechanism for Sequential Auction Markets with Complementary Goods
We extend the framework of mixed multi-unit combinatorialauctions, which deals with transformations of goods ratherthan only with atomic goods, by allowing time constraintsin the bids offering these transformations. This way, bidderscan express their scheduling preferences, while previously theauctioneer alone could decide the order of transformations. Time Constraints in Mixed Multi-unit Combinatorial Auctions
Bilateral bargaining is the most common economic transaction. Customarily, it is formulated as a non-cooperativegame with uncertain-information and infinite actions (offersare real-value). Its automation is a long-standing open problem in artificial intelligence and no algorithmic methodologyemployable regardless of the kind of uncertainty is provided.In this paper, we provide the first step (with one-sided uncertainty) of an algorithmic game theory framework to solvebargaining with any kind of uncertainty. The idea behindour framework is to reduce, by analytical tools, a bargainingproblem to a finite game and then to compute, by algorithmic tools, an equilibrium in this game. An Algorithmic Game Theory Framework for Bilateral Bargaining with Uncertainty
In a representative agent model, the behavior of a socialsystem is described in terms of a single aggregate decisionmaker. Such models are popular in economic and financeresearch, largely due to their analytic tractability, but failto account for real-world agent heterogeneity. Agent-basedmodels naturally incorporate heterogeneity, but are seen ashard to generalize. We propose an empirical game-theoreticapproach to address this concern, and provide a case studyto demonstrate this approach. Agent-Based Analysis of Asset Pricing under Ambiguous Information
In this paper, a decentralised self-organisation mechanism inan agent network is proposed. The aim of this mechanism isto achieve efficient task allocation in the agent network viadynamically altering the structural relations among agents,i.e. changing the underlying network structure. The mechanism enables agents in the network to reason with whomto adapt relations and to learn how to adapt relations byusing only local information. The local information is accumulated from agents’ historical interactions with others. Self-organisation in an Agent Network via Learning
In this paper, we propose a coherence-driven approach to actionselection in agents. The mechanism is inspired by the cognitivetheory of coherence as proposed by Thagard. Based on a proposalto extend BDI agents with coherence, we interpret, how action selection can be viewed as a coherence-maximising problem. Contrasted against the classical BDI approach to action selection whereactions are selected against a pre-determined set of beliefs and desires, this method provides us with a reasoning formalism that incorporates uncertainty and dynamism in the world model withoutloosing the type of formal qualities that make BDI-like architectures so attractive for testability and reliability reasons. A Coherence-Driven Action Selection in Dynamic Environments
This paper presents an approach to explore an unknownindoor environment using vision as the sensing modality,thereby building a topological map of images. The contribution of this paper is in a new approach that identifiesthe next best place to move from a node in the topologicalgraph. This decision is taken locally at a node by choosingthe next best direction, when there are open spaces beforethe robot, and globally by choosing the next best node tobranch off a new exploration, when there are no open spacesbefore the robot. We propose a method to assign weightsto nodes for this purpose. Weight is defined as a function ofthe depth of local descriptors of images, and the number oftimes they were seen across different nodes. The efficacy ofthe approach to explore office like environments is verifiedthrough several experiments on a P3DX robot. Image Based Exploration for Indoor Environments using Local Features
We address the problem of multi-robot area coverage andpresent a new approach in the case where the map of thearea and its static obstacles are known and the robots havea limited visibility range. The proposed method starts bylocating a set of static guards on the map of the target areaand then builds a graph called Reduced-CDT, a new environment representation method based on the ConstrainedDelaunay Triangulation (CDT). Multi-Prim’s is used to decompose the resultant graph into a forest of partial spanningtrees (PSTs). Each PST is modified through a mechanismcalled Constrained Spanning Tour (CST) to build a cyclewhich is then assigned to a covering robot. Subsequently,robots start navigating the cycles and consequently coverthe whole area. The proposed approach is complete provided that at least one of the robots operates correctly. Multi-Robot Area Coverage with Limited Visibility
A team of networked UAVs are deployed in an unknown region to search and destroy targets. To successfully destroya target, a coalition of UAVs with sufficient cumulative resources needs to be assigned. Forming coalitions under networks with dynamic topology is difficult and the type ofcoalition formation strategy adopted affects the mission performance. In this paper, we determine a mechanism to formcoalitions in dynamically changing networks and investigatedifferent coalition formation strategies. Multiple UAV Coalition Formation Strategies
A high-quality human-robot interface is essential for thesuccess of search and rescue operations in urban environments that are too challenging for fully-autonomous operation. Teleoperating multiple robots greatly increases thecomplexity of the human’s cognitive task, since the operator’s concentration is divided among multiple robots. Thus,simply adding more robots to the system does not necessarily expand the effective coverage region nor increase the rateat which the operator can search. We present CoOperator,an agent-based human-robot interface that infers operatordistraction and identifies any robots that are not currentlybeing effectively managed. A CoOperator agent assumescontrol of such a robot and moves it along a search paththat complements the operator’s explicit teleoperation. TheCoOperator agent seamlessly cedes control to the user whenever direct commands are given and resumes directing therobot if the operator’s attention shifts. We demonstrate thatour agents significantly improve multi-robot teleoperationthrough user studies on four urban search and rescue scenarios with a team of three simulated Pioneer 3DX robots. Improving Multi-Robot Teleoperation by Inferring Operator Distraction
In this paper, we demonstrate the benefit of role allocationin a collective of autonomous robots performing a simpletransport task. We demonstrate that, under certain conditions, the performance of the collective can be improvedwhen a subset of the robots assume institutional roles as traffic regulators. The concept of institutional roles is part of ahigh-level approach to the control of multi-robot collectivescalled Institutional Robotics. We compare the institutionalrobotics approach to a swarm robotics approach. Based onresults of experiments in simulation, we conclude that thecoordination provided by the traffic regulating robots improves performance for large collectives, but for small collectives the performance is higher when all robots are directlyinvolved in carrying out the task. Coordination Through Institutional Roles in Robot Collectives
We formally present the Mutual State Capability-Based RoleAssignment (MuSCRA) model, as we introduce that an agent,acting in a team, has capabilities that depend not only onits own individual skills, but also on its teammates and theirmutual state. The MuSCRA model includes a descriptionof roles in terms of its association value with states and actions. Role assignment policies are evaluated with a utilityaccounting for the match between the new mutual state capabilities and the desired roles, weighted by a risk factor. Mutual State Capability-Based Role Assignment Model
In this paper we present a method for navigating a multi-robot system through an environment while additionallymaintaining a set of constraints. Our approach is basedon graph structures that model movements and constraintsseparately, in order to cover different robots and a large classof possible constraints. Additionally, the partition of movement and constraint graph allows us to use known graphalgorithms like Steiner trees to solve the problem of findinga target configuration for the robots. We construct so calledseparated distance graphs from the Steiner tree and the underlying roadmap graph, which allow to assemble valid navigation plans fast. Coordinated Navigation for Multi-Robot Systems with Additional Constraints
Immediate rewards play a key role in a reinforcement learning (RL) scenario as they help the system deal with the credit assignment problem. Therefore, reward function definition has a drastic effect on both how fast the system learns and to what policy it converges. It becomes even more important in case of multi-agent learning, where the state space usually gets even bigger. This paper proposes a Genetic Algorithms (GA) based reward function shaping method for multi-robot learning problems and evaluates its performance in a robot soccer case study. A set of metrics calculated from the positions of the players and the ball on the field are used as the primitive building blocks of an immediate reward function consisting of a weighted combination of these metrics yielding a significantly better soccer playing performance is obtained using GA. A Reward Function Generation Method Using Genetic Algorithms: A Robot Soccer Case Study
We present an algorithm for multi robotic exploration of anunknown terrain where the robots are also required to servethe role of hops or nodes in a communication link maintainedbetween a fixed base station and the last robot (end effector robot) in the chain. A baseline algorithm is presentedas a tree traversal mechanism akin to a depth first strategy, further embellished by an adaptive rule that decidesthe number of children based on the local obstacle configuration at a node and avoidance of redundancy in traversal through a look-ahead method that decides the utility ofspreading the tree from the current robotic hop node. Thissystem finds immense utility in arenas such as planetary exploration, search and rescue scenarios and scenarios wherethe robots have limited on-board computing capabilities andneed to continuously preserve the link with a fixed base station for receiving instructions or transfer of data. Multi Robotic Exploration with Communication Requirement to a Fixed Base Station
We present the first real-world multi-robot system that can autonomously self-assemble (and dis-assemble) to form different morphologies capable of solving tasks that appear in an a priori unknown order. Robots Autonomously Self-Assemble into Dedicated Morphologies to Solve Different Tasks
We develop a heuristic approach, called ESP, that solves large pursuit-evasionproblems on series-parallel (that is, treewidth-2) graphs quickly and withsmall costs. It exploits their topology by performing dynamic programming ontheir decomposition graphs. We show that ESP scales up to much larger graphsthan a strawman approach based on previous results from the literature. ESP: Pursuit Evasion on Series-Parallel Graphs
The execution of tasks for a mobile robot embedded in a dynamicenvironment brings about several challenges, due to the dynamicchanges of the environment and the inaccurate perception of therobot. This paper tackles the problem of on-line execution monitoring when the agent has different tasks and several plan to accomplish them, as in the BDI framework. Our method considersuncertainty in the duration of actions with a probabilistic modelof action duration, and evaluates the cost of each possible plan atrun-time in terms of probability of successful termination within adesired expected time. On-line robot execution monitoring using probabilistic action duration
A robot moving in the presence of humans is highly constrained by the dynamic environment and the need to comply with human safety, preferences, cognition and actions. The dynamic nature of the environment should be taken into account at planning time and must be allowed for by flexible plan execution, to produce safe, efficient and legible robot behavior. Moreover, when the space in which humans and robots interact is small, additional measures must be taken to allow efficient robot movement. This paper extends an existing human-aware navigation planner with mechanisms to account for the dynamic movement of humans in confined areas. Dynamic Generation and Execution of Human Aware Navigation Plans
We present results of semi-automated annotation of dialogue data collected from 5,200 gameplay logs from The Restaurant Game (http://www.theRestaurantGame.net), and show that classifieddialogue acts can function effectively as a common currency formodeling behavior with interleaved physical actions and words. Toward an Interleaved Model of Actions and Words in Social Simulation
Creating agents that act reasonably in uncertain environments is a primary goal of agent-based research. In this workwe explore the theory that wishful thinking can be an ective strategy in uncertain and competitive decision scenarios.Specifically, we present the constraints necessary for wishful thinking to outperform Expected Utility Maximizationand take instances of popular games from Game-Theoreticliterature showing how they relate to our constraints andwhether they can benefit from wishful-thinking. Wishful Thinking In Effective Decision Making
We describe how, by modelling plot generation as a Continual Multiagent Planning process, dynamic stories can be generated in whichcharacters not only inteleave perception, action and interaction, butin which also beliefs and motivations may change repeatedly, thusdriving the plot forward. Dynamic Plot Generation by Continual Multiagent Planning
In this paper we introduce the notion of character’s values tomediate between agents and story direction in storytellingsystems. By relating characters’ goals with their values, theactivation of goals depends on the values that are put atstake by the story incidents. Based on this framework, wepropose a reference architecture for value–based interactivestorytelling. Directing value-driven artificial characters
EEMML (Emotional Eye Movement Markup Language) is a scripting toolthat enables authors to describe and generate emotional eye movementin virtual agents. The EEMML is capable of describing and generatingboth basic eye movement and emotional eye movement, includingprimary (joy, sadness, anger, fear, disgust and surprise) andintermediate (emotions that can be represented as the mixture of twoprimary emotions) emotions for virtual agents. The emotional eyemovement generation framework is based upon the MPEG-4 FAP (facialanimation parameters), and the animations are driven by parameterspicked from the Cohn-Kanade AU-Coded facial expression database aswell as real-time eye movement data(pupil size, blink rate andsaccade). Emotional Eye Movement Markup Language for Virtual Agents
A number of interactive storytelling (IS) systems offer the user the possibility to input natural language input which determines how a story progresses. In this paper, we present an approach where the user's affective and attentive state mainly drives the evolution of the narrative. We introduce a framework for real-time signal processing (SSI) to analyze the users' state which then influences the feelings of the story characters and their actions in the story. SSI is not only used to enable more natural character responses that are sensitive to the user's state at runtime. In addition SSI offers the possibility to collect a large variety of synchronized user data which can be used to analyze the user's experience in offline mode. The underlying narrative in which the approach was tested is based on a classical XIX$^\mathrm{th}$ century psychological novel: Madame Bovary, by Flaubert. Multimodal Interaction with a Virtual Character in Interactive Storytelling
In this paper we investigate a concrete epistemic situation:there are agents (humans, robots, cameras,...) and propositions (lamps on or off, obstacles dangerous or not,...) located in Lineland. We express properties with the standardepistemic logic language like “Agent A knows that agent Bknows that lamp L is on”. We give some words about model-checking, satisfiability problem and common knowledge. Knowledge in LineLand
The use of virtual agents in training requires them to haveseveral human-like characteristics; one of these is the ability to appear deceptive. We take work from the psychologyliterature on cues to deception, with a focus on language-related cues, and examine whether it is possible to use resources from the field of Language Technology to constructscenarios with agents showing cues to deception detectableby human judges, a task that has been shown in a text-onlycontext to be difficult. We show that this detection is infact possible in the context of virtual agents, and that thereare interesting results for individual cues, in particular fordialogue- versus lexical-level cues, and a ‘placebo’ effect. Deceptive Agents and Language
Governing the behavior of autonomous agents in multi-agent systems to reach overall system benefit has long been an active area of research. One approach of recent prevalence is to provide agents with explicit specifications of what they should, should not or may do within the system, i.e. normative statements or norms. In a business setting, these norms exactly mirror the contractual agreements made between business organizations. As such, agent-based normative systems offer the potential for a business to model, understand the consequences of, and then refine contracts to improve the outcomes for that business. However, languages and tools for specifying norms do not by themselves provide understanding of the emergent behavior in a complex domain. In this paper, we combine a simulation technique designed for investigating and tuning emergent behavior in multi-agent systems with an approach to modeling norms of the complexity found in business contracts. We show, using an aerospace case study, that our approach can aid in the refinement of such contracts by exposing the consequences of contract variations. A Simulation Approach to Design Contracts that Govern Emergent Multi-Agent Systems
Recently, with the advance of information technologies, theamount of available information has increased exponentially.This has a great influence on individual acts because information is quite important for decision making and valuejudgment. In this paper, we propose a model of informationdiffusion considering information diversification and investigating its impact. In particular, we focus on the influenceof innovators. Contrary to the results of preceding studies,our results reveal that a new phenomenon called "reversal ofinfluence" emerges under information diversification. Reversal of Influence: Decrease of Innovator's Influence under Information Diversification
We create an agent-based simulation to explore consumerlock-in in a duopoly of experience goods (goods with characteristics that are difficult to determine in advance, butascertained upon consumption). We model heterogeneousagents using simple assumptions, where agents choose between products based upon personal experience and neighbours' decisions. We test strategies to break a lock-in througha free give-away and advertising. We find that, under ourassumptions, breaking a lock-in required the formation ofregions where the competitor product was adopted, likenedto a niche in the market. An agent-based simulation of lock-in dynamics in a duopoly
Most Multi-Agent System designers use several notions — like "agent", "artifact", "object", etc. – to classify the entities involved in simulations. These notions require different methodologies, data structures and algorithms. In thispaper, we show that the representation of entities can befavorably unified. As a consequence, the design and implementation process are made easier, since the designer hasno longer to assign a fixed type to each entity during modelconstruction. The implementation handles entities throughan unified data structure and algorithm, and is thereforelightweight and more maintainable. Such an unification isperformed without efficiency loss in a concrete simulationmethodology called Ioda. According to common sense, wepropose to call such an unified entity simply "agent"! Everything can be Agent!
Most agent-based modeling techniques generate one trajectory perrun, greatly under-sampling the space of possible trajectories.Swarming agents that interact through digital pheromones canexplore many alternative futures in parallel, based on an interpretation of pheromone fields as probability fields that yields moreinformation from them than swarming models usually yield, andalso facilitates integration with probability-based AI mechanisms. Generation and Analysis of Multiple Futures with Swarming Agents
Crowd behavior simulation has been an active field of research because of its utility in several applications such asemergency planning and evacuations, designing and planning pedestrian areas, subway or rail-road stations, besidesin education, training and entertainment. The most advanced and realistic simulation systems employ intelligentautonomous agents with a balance between individual andgroup intelligence for scalability of the architectures. Although several systems have even been commercialized, little attention has been accorded to the problem of validatingthe outcomes of these simulations in a generalized manner,against reality. The extent of validation fails to go muchbeyond visual matching between the simulation and the actual scenarios (with recordings of human crowds), which canlead to highly subjective and often questionable conclusions.The existing numerical measures often rely on ad-hoc applications, e.g., local crowd densities are measured to verifypatterns, without a systematic procedure to identify at whattimes in the simulation and the scenarios can the densities becompared. Furthermore, if there are multiple systems thatsimulate crowd behavior in the same scenario in the samevirtual environment, then no technique is currently known toquantitatively compare these systems in terms of realism. Inthis abstract, we present the first (to the best of our knowledge) principled, unified, and automated technique to quantitatively validate and compare the global performance ofcrowd egress simulation systems. We also evaluate a multiagent based crowd egress simulation system (that we haverecently developed, but we do not discuss this system here)using our technique and demonstrate a high degree of validity of that system as well. Validation of Agent Based Crowd Egress Simulation
We present a multiagent organization for data interpretation and fusion in which each agent uses an encapsulatedBayesian network for knowledge representation, and agentscommunicate by exchanging beliefs (marginal posterior probabilities) on shared variables. We call this organizationan Agent-Encapsulated Bayesian Network (AEBN) system.Communication of probabilities among agents leads to rumors, i.e. potential double counting of information. Wepropose a new and correct method to compensate for rumors in AEBN systems by passing extended messages thatcontain joint probabilities. Agent-Encapsulated Bayesian Networks and the Rumor Problem
In this paper, we address the issue of the specification andverification of commitment protocols having a social semantics. We begin with developing a new language to formallyspecify these protocols and desirable properties by enhancing $CTL^*$ logic with modalities of commitments and actions on these commitments. We also present a symbolicmodel checking algorithm for commitments and their actions based on OBDDs. Finally, we present an implementation and experimental results of the proposed protocol usingthe NuSMV and MCMAS symbolic model checkers. Symbolic Model Checking for Agent Interactions
This paper proposes an agent communication protocol specified asa set of dialogue rules for resolving conflicts using assumption-based argumentation. Arguments are built from a set of rules andassumptions using backward deduction. Beyond arguments agentscan handle, we propose the notion of partial arguments along withpartial acceptability. This protocol merges inquiry and persuasion stages and by building and reasoning about partial arguments,agents can jointly find arguments supporting a new solution fortheir conflict, which is not known by any of them individually. An Agent Communication Protocol for Resolving Conflicts
The object of our research is resource allocation which considers contractual dependencies across service chain tiers toavoid overcommitment and overpurchasing. We propose amulti-tier negotiation protocol for solving this problem. Theproposed artifact is developed from an interaction protocolengineering perspective and a protocol specification is given.Besides basic safety properties like the absence of deadlock,we formally verify that the protocol prevents overcommitments and overpurchasing by means of the model checkerSpin. Towards Model Checking & Simulation of a Multi-Tier Negotiation Protocol for Service Chains
In this paper, we propose to combine Semantic Web technologiesand multiagent systems in a novel way to enable users to locate andshare URLs relevant to their search interests. Distributed Semantic Search for the Web: A Multiagent Approach
We provide a defence against whitewashing for trust assessment mechanisms (TAM) by using an underlying social network in MAS and P2P. Since interaction requests are routedthrough the social network, routers can block requests fromportions of the network known for whitewashing. Furthermore, by limiting feedback spread to the interaction routers,the trust assessment can be done without querying for feedback with a small loss in efficiency. A Social-Network Defense against Whitewashing
This paper introduces a new conceptual model representing the stock market dynamics. This model is essentially based on cognitive behavior of the investors. In order to validate our model, we build an artificial stock market simulation based on agent-oriented methodologies. The proposed simulator is composed of market supervisor agent essentially responsible for managing transactions via an order book and various kinds of investor agents depending to their profile. The purpose of this simulation is to understand the influence of psychological character of an investor and its neighborhood on its decision-making and their impact on the market in terms of price fluctuations. Interactions between investors and information exchange during a transaction reproduce the market dynamics and organize the multi-agent based pricing. Towards a new cognitive modeling approach for multi-agent based simulation of stock market dynamics
The decision to grant trust in virtual societies is often an evidencebased process. The evidence for such decision derives from acomplex set, where mutual relationships and contradictions arelikely, rather than form a list of isolated objects. This papercompares the basic and widely used aggregation strategy, whereconflicts and mutual relationships among evidence are ignored,with an argumentation-based strategy. Argumentation vs. Aggregation of Trust Evidence
Collaborative filtering systems have been developed to manage information overload in online communities. In thesesystems, users rank content provided by other users on thevalidity or usefulness within their particular context. Slashdot is an example of such a community where peers rate eachothers’ comments based on their relevance to the post. Thiswork extracts a wide variety of features from the Slashdotmetadata and posts’ linguistic contents to identify featuresthat can predict the community rating. We find that author reputation, use of pronouns, and author sentiment aresalient. We achieve 76\% accuracy at predicting the community’s rating of the post as good, neutral, or bad. Using Machine Learning to Augment Collaborative Filtering of Community Discussions
In open multi-agent systems trust models are an importanttool for agents to achieve effective interactions. However,the agents do not necessarily use similar trust models, leading to semantic differences between trust evaluations in thedifferent agents. We show how to form a trust alignmentby considering the interactions agents share. We describe amethod, using inductive learning algorithms, to accomplishthis alignment. Inductively Generated Trust Alignments Based on Shared Interactions
Trust has been viewed as an integral component of agent decision making in the context of multiagent systems (MASs).Various formal and semi-formal trust schemes, motivatedby diverse considerations and influenced by various fields ofstudy, have been proposed, implemented, and evaluated. Webelieve that there still exists a pressing need for developinga comprehensive trust management scheme that addressesmost, if not all, issues underlying trust development, maintenance, and use. To facilitate the discussion of a generaland comprehensive trust management scheme, we provideour own operational definition of trust motivated by uncertainty management and utility optimization. We identifythe various components required of a comprehensive trustmanagement scheme and their roles in determining agentperformance in a competitive, open MAS. Comprehensive Trust Management
In online social networks, there are usually many social trust paths between agents.Thus, a challenging problem is which social trust path is the optimal one that can yield the most trustworthy evaluation result. In this paper, we present a new complex social network structure and propose a new concept, Quality of Trust (QoT), for social trust path selection in complex social networks. Quality of Trust for Social Trust Path Selection in Complex Social Networks
The problem of unfair testimonies remains an open issuein reputation systems for online trading communities. Acommon attempt is to use binary ratings to model sellers’reputation. However, this attempt leads to that the researchof tackling unfair testimonies also focuses on reputation systems using binary ratings. In this extended abstract, wepropose a two-stage clustering approach to filter unfair testimonies for reputation systems using multi-nominal ratings.The proposed approach uses clustering to identify unfair testimonies and further contributes to providing buyers a moreaccurate reputation evaluation regarding the target seller. A Clustering Approach to Filtering Unfair Testimonies for Reputation Systems
Optimal seeding in balanced knockout tournaments has only been studied in very limited settings, for example, maximizing predictive power for up to 8 players using only the relative ranking of the players (ordinal information). We broaden the scope of the analysis along several dimensions: tournaments of size up to 128, different player models, ordinal as well as cardinal solutions, and two additional objective functions. Optimal Seeding in Knockout Tournaments
Organization-based multi-agent systems (MAS) can be represented by means of 3D virtual worlds facilitating then theinteraction among participants, i.e humans and agents. Inthis paper we propose a system that can automatically generate a 3D virtual world from formal specifications of botha MAS and a design visual style (i.e a shape grammar).We propose an extension of shape grammars in combinationwith virtual world paradigms, called Virtual World Grammar (VWG), to support the design generation process. Virtual World Grammar includes semantic information aboutboth MAS specification which establishes the activities participants can engage on and shape grammar elements. Thisinformation, along with heuristics and validations, guidesthe generation process and allows us to produce functionaldesigns. The Virtual World Builder Toolkit, integrated inour Shape Grammar Interpreter tool, supports both the definition and execution of the Virtual World Grammar. Virtual World Grammar
Consider, for example, the well-known game of Roshambo, or rock-paper-scissors, in which two players select one of three actions simultaneously. One may know thatthe adversary will base its next action on some bounded sequence of the past joint actions, but may be unaware of itsexact strategy. For example, one may notice that every timeit selects $P$ , the adversary selects $S$ in the next step; or perhaps whenever it selects $R$ in three of the last four steps,the adversary selects $P$ $90\%$ of the time in the next step.The challenge is that to begin with, neither the adversaryfunction that maps action histories to future actions (maybe stochastic), nor even how far back it looks back in theaction history (other than an upper bound) may be known.At a high level, this paper is concerned with automaticallybuilding such predictive models of an adversary’s future actions as a function of past interactions. Online Model Learning in Adversarial Markov Decision Processes
The design of reinforcement learning solutions to many problemsartificially constrain the action set available to an agent, in orderto limit the exploration/sample complexity. While exploring, if anagent can discover new actions that can break through the constraintsof its basic/atomic action set, then the quality of the learneddecision policy could improve. On the flipside, considering allpossible non-atomic actions might explode the explorationcomplexity. We present a potential based solution to this dilemma, andempirically evaluate it in grid navigation tasks. In particular, weshow that both the solution quality and the sample complexity improvesignificantly when basic reinforcement learning is coupled with actiondiscovery. Our approach relies on reducing the number of decisionspoints, which is particularly suited for multiagent coordinationlearning, since agents tend to learn more easily with fewercoordination problems (CPs). To demonstrate this we extend actiondiscovery to multi-agent reinforcement learning. We show that JointAction Learners (JALs) indeed learn coordination policies of higherquality with lower sample complexity when coupled with actiondiscovery, in a multi-agent box-pushing task. Action Discovery for Reinforcement Learning
In this paper we develop a Bayesian policy search approachfor Multi-Agent RL (MARL), which is model-free and allowsfor priors on policy parameters. We present a novel optimization algorithm based on hybrid MCMC, which leveragesboth the prior and gradient information estimated from trajectories. Our experiments demonstrate the automatic discovery of roles through reinforcement learning in a real-timestrategy game. Bayesian Role Discovery for Multi-Agent Reinforcement Learning
Scaling Reinforcement Learning (RL) to real-world problemswith continuous state and action spaces remains a challenge.This is partly due to the reason that the optimal value function can become quite complex in continuous domains. Inthis paper, we propose to avoid learning the optimal valuefunction at all but to use direct policy search methods incombination with model-based RL instead. Model-based Direct Policy Search
Understanding how cooperative behaviour emerges within a population of individuals has been the focus of a great deal of research in the multi-agent systems community. In this paper, we examine the effectiveness of two different learning mechanisms — an evolutionary-based technique and a social imitation technique — in promoting and maintaining cooperation in the spatial N-player Iterated Prisoner's Dilemma (NIPD) game. Comprehensive Monte Carlo simulation experiments show that both mechanisms are able to evolve high levels of cooperation in the NIPD despite the diminished impact of direct reciprocation. However, the performance of evolutionary learning is significantly better than social learning, especially for larger population sizes. Our conclusion implies that when designing autonomous agents situated in complex environments, the use of evolutionary-based adaptation mechanisms will help realising efficient collective actions. Co-evolution of Agent Strategies in N-player Dilemmas
As agent-human teams get increasingly deployed in the real-world,agent designers need to take into account that humans and agentshave different abilities to specify preferences. In this paper, we focus on how human biases in specifying preferences for resourcesimpacts the performance of large, heterogeneous teams. In particular, we model the inclination of humans to simplify their preference functions and to exaggerate their utility for desired resources.We then study the effect of these biases on two different problems,which are representative of most resource allocation problems addressed in literature. Analyzing the impact of human bias on human-agent teams in resource allocation domains
In this paper, we attempt to formalize an approach to the syncretic In this paper, we attempt to formalize a novel approach to the syncretic argumentation, which allows agents with different epistemology to engage in argumentation, taking into account the Golden Rule in the ethics of reciprocity and Confucius' Golden Rule. We address this new argumentation framework in two ways. One is by introducing the lattice homomorphism on truth-values (epistemic states) of propositions, and the new definitions of arguments justified under syncretized knowledge base. For the other, we devise the lattice fusion, which is induced through the lattice product. Syncretic Argumentation by Lattice Homomorphism and Fusion
We present an efficient approach for identifying, learningand modeling the policies of others during collaborative activities. In a set of experiments, we demonstrate that moreaccurate models of others’ policies (or norms) can be developed more rapidly using various forms of evidence fromargumentation-based dialogue. Learning Policies through Argumentation-derived Evidence
We introduce the flexible approach for determining agents’orientation on ontology mappings (FDO), which provides aflexible mechanism for agents to decide whether or not theysupport an argument about a mapping. Whilst this resultsin agents relaxing some preferences over suitable mappings,it produces a larger consensus of possible mappings due tothe generation of a greater number of arguments in favourof the candidate mappings (compared to Laera {\em et al.}’s MbAapproach), and better reflects the agents preferences thanwhen only a single threshold and preference value are used. Flexible agreement mechanism for dynamic meaning negotiation
In this paper, a non-mediated multi-issue bilateral bargaining model for complex utility functions is presented. Beforethe negotiation process, a genetic algorithm (GA) is used tosample one’s own utility function. During the negotiationprocess, genetic operators are applied over the opponent’sand one’s own proposals in order to sample new proposalsthat are interesting for both parties. Genetic-Aided Multi-Issue Bilateral Bargaining for Complex Utility Functions
This paper presents the effect of adaptively introducing appropriate strategies into the award phase of the contract netprotocol (CNP) in a massively multi-agent system (MMAS). Effect of Probabilistic Task Allocation Based on Statistical Analysis of Bid Values
Users’ preferences play a key role in automated negotiation sincethey dictate how an agent will act on behalf of its user. However,elicitation of these preferences from the user is difficult when thereare dependencies between preferences. In many settings, expectinga user to provide a total ordering of her preferences is unrealistic.Thus, it is essential to build agents that can negotiate with onlypartial preference information. In order to achieve this goal, wedevelop negotiation strategies that work on qualitative preferencerepresentations, such as CP-nets that require only partial preferenceinformation. Effective Negotiation with Partial Preference Information
The problem of finding agents' rational strategies in bargaining with incomplete information is well known to be challenging. The literature provides a collection of results for very narrow uncertainty settings, but no generally applicable algorithm. In this paper, we focus on the alternating-offers finite horizon bargaining protocol with one-sided uncertainty regarding agents' reserve prices. We provide an algorithm based on the combination of game theoretic analysis and search techniques which finds agents' equilibrium in pure strategies when they exist. Our approach is sound, complete and, in principle, can be applied to other uncertainty settings. Searching for Pure Strategy Equilibria in Bilateral Bargaining With One-sided Uncertainty
We present an initial comparative evaluation between monotonic mixing and the traditional linear weighted combination of tactics in a multi-issue negotiation scenario. As thetraditional mixing method may produce a non-monotonicsequence of utilities of proposed offers in case imitative andnon-imitative tactics are mixed together (even when weightsare static) we demonstrate that both agents can gain higherutilities in many scenarios when using monotonic mixing. On Monotonic Mixed Tactics and Strategies for Bilateral Multi-Issue Negotiations
We present a framework for non-mediated bilateral multi-issue negotiation under non-monotonic preference spaces.The framework is based on a region-based recursive bargaining mechanism. Preliminary experimental evaluationshows that our approach may obtain approximate Pareto-optimal results in acceptable negotiation time with a lowfailure rate. A Multi-issue Negotiation Framework for Non-monotonic Preference Spaces