Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Journal of Economic Dynamics and Control 14 (1990) 329-373

Money as a Medium of Exchange in an Economy with Artificially Intelligent Agents

Authors
Affiliations
University of Minnesota, Minneapolis, MN 55455, USA
Duke University, Durham, NC 27706, USA
Hoover Institution, Stanford, CA 94305, USA

Abstract

We study the exchange economies of Kiyotaki & Wright (1989) in which agents must use a commodity or fiat money as a medium of exchange if trade is to occur. Our agents are artificially intelligent and are modeled as using classifier systems to make decisions. In the assignment of credit within the classifier systems, we introduce some innovations designed to study sequential decision problems in a multi-agent environment. For most economies that we have simulated, trading and consumption patterns converge to a stationary Nash equilibrium even if agents start with random rules. In economies with multiple equilibria, the only equilibrium that emerges in our simulations is the one in which goods with low storage costs play the role of medium of exchange (i.e., the ‘fundamental equilibrium’ of Kiyotaki and Wright).

Keywords:genetic algorithmsclassifier systemscommodity moneyagent-based simulationKiyotaki-Wright

1. Introduction

Kiyotaki & Wright (1989) studied how three classes of rational agents would interact within a Nash-Markov equilibrium in a world of repeated ‘Wicksell triangles’. This paper studies how a collection of artificially intelligent agents would learn to coordinate their activities if they were thrust into the economic environment described by Kiyotaki and Wright. Our agents’ behavior is determined by Holland (1975) classifier systems.[1]

Thus, while Kiyotaki and Wright studied stationary equilibria in which beliefs about ‘media of exchange’ are consistent with trading patterns, we study economies in which particular commodities emerge as media of exchange. We also study an economy (Economy C) in which a good from which no agent derives utility emerges as fiat money. We want to learn whether our artificially intelligent agents can learn to play a Markovian Nash equilibrium of the Kiyotaki-Wright model. When there are multiple Nash equilibria (e.g., the ‘fundamental’ and ‘speculative’ equilibria of Kiyotaki and Wright), we want to know whether the system might converge to some but not others of these equilibria. In addition, if classifier systems do converge to Nash-Markov equilibria, they might be used to compute equilibria in other economies for which it is difficult to obtain analytic solutions. To this end, we also study enlarged versions of the Kiyotaki-Wright model.

A classifier system is a collection of potential decision rules, together with an accounting system for selecting which rules to use. The accounting system credits rules generating good outcomes and debits rules generating bad outcomes. The system is designed to select a ‘co-adaptive’ set of rules that work well over the range of situations that an agent typically encounters. In a multiple-agent environment like Kiyotaki and Wright’s, the range of situations encountered by one agent depends on the actions taken by other agents. This means that the collections of rules used by the collection of agents must co-adapt jointly.

We study two sorts of classifier systems, with the distinction between them being induced by the fact that for many problems enumerating all possible rules would require a very long list. The first kind of classifier system is one in which a complete enumeration of all possible rules is carried along. For many problems, it is not feasible to use a complete enumeration classifier system because the state and action spaces are so large that the set of all possible rules is much too big. For us, a complete enumeration is tractable because the Kiyotaki-Wright model has low-dimensional state and action spaces.

The second kind of classifier system is designed for situations in which it is not efficient to carry along a complete enumeration of decision rules. In this case, a modified version of the genetic algorithm of Holland is used as a device for periodically eliminating some rules and injecting new rules into the population of rules to be operated on by the accounting system. The genetic-algorithm version of the classifier models learning in the face of unforeseen contingencies. When an unprecedented state arises which is not covered by the existing set of rules, the system contains a procedure for manufacturing and experimenting with new rules that apply in the new situation. Even though it is feasible for us to work with a complete enumeration of rules, we also study the genetic-algorithm version of the classifier.[2]

In this paper, we set up classifier systems for the Kiyotaki-Wright environment and simulate them on a computer. We also formulate definitions of equilibrium and stability for classifier systems, and pose some convergence questions. Although we suggest possible convergence theorems, we prove no theorems in this paper. 2. The Kiyotaki-Wright Environment describes the Kiyotaki-Wright environment. 3. Classifier Systems for the Kiyotaki-Wright Environment describes the classifier systems without genetics. 4. Classifier Systems for Supporting Kiyotaki and Wright’s Stationary Equilibria describes behavior in the stationary equilibria of Kiyotaki and Wright and also the types of classifier systems that could support that behavior. 5. Concepts of Convergence for Classifier Systems Playing Games defines stability criteria for sets of classifier systems, and discusses how these definitions can be used to formulate questions about whether systems of classifier systems converge to a stationary equilibrium. 6. Incomplete Enumeration Classifiers and the ‘Genetic Algorithm’ describes classifier systems operating with genetics. 7. Simulation Results describes our simulations of systems of classifier systems. In addition to the economies studied by Kiyotaki and Wright (including one with fiat money), we study an economy with five types of agents and five goods. 8. Conclusions concludes the paper.

2. The Kiyotaki-Wright Environment

There are three types of agents, with types being indexed by i=1,2,3i = 1, 2, 3. Type ii agents get utility only from consuming type ii good. Type ii agent has access to a technology for producing type ii^* good, where iii^* \neq i. We initially specify (i,i)(i, i^*) according to Kiyotaki-Wright’s ‘model A’, namely, as follows:

iiii^*
12
23
31

This specification assumes no ‘double coincidence of wants’ and seems to call for a multilateral trading arrangement. All goods are indivisible. Each agent can store one and only one unit of only one good from one period to the next. When an agent of type ii consumes good ii at time tt, he immediately produces good ii^*, which he carries over to the next period. The net utility to an agent of type ii of consuming good yy and producing good ii^* is given by ui(y)u_i(y). We assume that an agent of type ii does not know his utility function, but does recognize utility when he experiences it. The goods are costly to store. Storing good kk (k=1,2,3k = 1, 2, 3) from tt to t+1t+1 imposes costs at tt of sks_k. Following Kiyotaki and Wright, we assume that s3>s2>s1>0s_3 > s_2 > s_1 > 0. We summarize this cost function by saying that s(y)s(y) is the one-period cost of storing one unit of good yy. We assume that individuals do not know this cost function, but that they do recognize costs when they bear them.

The economy and each agent within it live forever. There are large and equal numbers of agents of each type. We modify Kiyotaki and Wright’s model by assuming that each agent cares about his long-run average level of utility. (Kiyotaki and Wright assumed that agents have preferences ordered by expected discounted future utilities.) Each period, there is a random matching process that assigns each and every agent in the economy to a pair with one and only one other agent in the economy. The random matching technology matches agents without regard to type. Only pairs of agents matched together can trade at a point in time.

The economy begins at t=0t = 0 with agents being endowed with an arbitrary and randomly generated initial distribution of holdings of goods. At each date t0t \geq 0, each agent ii has to make two decisions sequentially. First, given the good he is holding and given the good held by the partner with whom he is matched at tt, he must decide whether to propose to trade. Trade occurs only if both parties propose to do so. Second, the agent must decide whether or not to consume the good with which he exits the trading process. If he doesn’t consume, he simply carries the good into period t+1t + 1, experiencing cost sks_k. If an agent of type ii does consume good yy at tt, he experiences net utility ui(yt)u_i(y_t), produces good ii^*, and experiences carrying costs sis_{i^*}. We assume that ui(k)=0u_i(k) = 0 if kik \neq i and ui(i)=ui>0u_i(i) = u_i > 0. We follow Kiyotaki and Wright both in adopting their specification of the physical environment and in considering only Markov strategies, but we drop the assumption of rational agents. Instead, our agents are assumed to be ‘artificially intelligent’, in the sense that they use versions of the classifier system introduced by Holland (1986).

Before describing the classifier systems used by our agents, we introduce some notation for naming agents and for describing the state of each individual agent in the economy. There is a collection of agents A={1,2,,A}\mathscr{A} = \{1, 2, \ldots, A\}. A typical element of A\mathscr{A} is denoted aa. Kiyotaki and Wright assumed that there was a continuum of each type of agent, while we assume that there is a finite number of each type. The first A1A_1 agents are of type I, the next A2A_2 are of type II, and the following A3A_3 are of type III. Here A=3AiA = 3A_i.

At the beginning of time tt, agent aa is carrying good xatx_{at}. The variable xatx_{at} characterizes the pre-match state of agent aa at time tt. There is a random matching process which each period matches each agent aAa \in \mathscr{A} with a distinct agent ρt(a)A\rho_t(a) \in \mathscr{A}. For each agent in each period, the matching process induces a function ρt(a):AA\rho_t(a): \mathscr{A} \rightarrow \mathscr{A}. After the matching process, the pre-trade state of agent aa is (xat,xρt(a)t)(x_{at}, x_{\rho_t(a)t}). The pair (xat,xρt(a)t)=zat(x_{at}, x_{\rho_t(a)t}) = z_{at} records what agent aa is carrying and what the agent ρt(a)\rho_t(a) with whom aa is matched at tt is carrying.

At tt, after being matched with agent ρt(a)\rho_t(a), agent aa decides whether or not to propose to trade. We let λat\lambda_{at} denote the trading decision of agent aa at time tt, where

λat={1if a proposes to trade xat for xρt(a)t0if a refuses to trade\lambda_{at} = \begin{cases} 1 & \text{if } a \text{ proposes to trade } x_{at} \text{ for } x_{\rho_t(a)t} \\ 0 & \text{if } a \text{ refuses to trade} \end{cases}

Similarly, λρt(a)t\lambda_{\rho_t(a)t} summarizes the trading decision of the agent ρt(a)\rho_t(a) with whom aa is paired at tt. Trade takes place if and only if λatλρt(a)t=1\lambda_{at} \cdot \lambda_{\rho_t(a)t} = 1.

Let xat+x_{at}^+ denote the post-trade (but pre-consuming decision) holdings of agent aa at tt. We then have that

xat+=(1λatλρt(a)t)xat+λatλρt(a)txρt(a)tx_{at}^+ = (1 - \lambda_{at} \cdot \lambda_{\rho_t(a)t}) x_{at} + \lambda_{at} \lambda_{\rho_t(a)t} x_{\rho_t(a)t}

After leaving the trading process with holdings xat+x_{at}^+, agent aa must decide whether to consume xat+x_{at}^+ or to carry it into the next period. We let γat\gamma_{at} denote the consumption decision of agent aa at tt where

γat={1if a decides to consume xat+0if a decides not to consume\gamma_{at} = \begin{cases} 1 & \text{if } a \text{ decides to consume } x_{at}^+ \\ 0 & \text{if } a \text{ decides not to consume} \end{cases}

If agent aa decides to consume, he automatically produces good f(a)f(a), which he carries into (t+1)(t + 1). From the specification of ii^* as a function of ii described above for Kiyotaki and Wright’s model A, we have that f(a)f(a) is a good of type 2 if aa is a type I agent, f(a)f(a) is a good of type 3 if aa is a type II agent, and f(a)f(a) is a good of type 1 if aa is a type III agent. It follows that beginning-of-period holdings of aa at time t+1t + 1 are described by[3]

xa,t+1=γatf(a)+(1γat)((1λatλρt(a)t)xat+λatλρt(a)txρt(a)t)x_{a,t+1} = \gamma_{at} f(a) + (1 - \gamma_{at}) \left( (1 - \lambda_{at} \cdot \lambda_{\rho_t(a)t}) x_{at} + \lambda_{at} \cdot \lambda_{\rho_t(a)t} x_{\rho_t(a)t} \right)

3. Classifier Systems for the Kiyotaki-Wright Environment

We now describe the classifier systems that agents use to make their trading and consumption decisions.[4] Agents sequentially use two interconnected classifier systems each period. A first trading classifier system receives input in the form of the pre-trade state zat=(xat,xρt(a)t)z_{at} = (x_{at}, x_{\rho_t(a)t}). This classifier system determines the trading decision λat\lambda_{at}, which interacts with the trading decision λρt(a)t\lambda_{\rho_t(a)t} of the agent ρt(a)\rho_t(a) with whom aa is paired at tt to determine xat+x_{at}^+ [see eq. (2)]. A second consumption classifier system takes xat+x_{at}^+ as input and determines the consumption decision γat\gamma_{at}. The two classifier systems are used sequentially in this way each period, and their accounting systems are interconnected in ways to be described below.

A classifier system consists of the following objects:

  1. A collection of trinary strings, called ‘classifiers’.

  2. A system for interpreting or decoding the strings or classifiers as instructions mapping states into decisions. The first part of a string encodes a particular state or condition, while the second part encodes a particular action. Thus, an individual classifier or string is just an encoding of a single (state, action) pair. For a given state, there can be many classifiers present in a classifier system.

  3. A list of ‘strengths’ attached to each classifier at each point in time t=0,1,t = 0, 1, \ldots

  4. A system for reading in external messages or ‘states’ and for determining the set of classifiers that are pertinent, i.e., the set of classifiers whose ‘conditions’ are satisfied or ‘matched’ at that state. For a given state, the system can sometimes contain several classifiers with distinct actions.

  5. An ‘auction system’ for determining which of the pertinent classifiers is allowed to make a decision at tt. Among all classifiers whose condition part matches the actual state at tt, the ‘highest bidder’ in the auction actually ‘makes the decision’ in real time.[5]

  6. An accounting system for updating the strengths of the collection of classifiers in response to the net rewards that flow into the system as a result of the decisions that are made.

And possibly:

  1. A genetic algorithm for introducing new strings and extinguishing old strings. This algorithm will be called whenever states are encountered for which no existing classifiers have their conditions met. The algorithm may also be applied from time to time as a device to promote experimentation.

We now describe each of these elements as applied to the Kiyotaki-Wright environment. We first consider an exchange classifier system, which maps the pre-trade state zatz_{at} into a trading decision. This classifier system consists of a list of trinary strings of length 7. The first three digits represent own storage, the next three represent trading partner’s storage, and the seventh represents the trading decision.

Table 1:Encoding of goods in classifiers

Code

Meaning

1 0

Good 1

0 1

Good 2

0 0

Good 3

0 #

Not good 1

0

Not good 2

Not good 3

The coding is written in the trinary alphabet (1,0,#)(1, 0, \text{\#}), where #\text{\#} means ‘don’t care’. For the trading decision, the code is in the binary alphabet (1,0)(1, 0), where 1 means trade, while 0 means don’t trade.

To illustrate how the codes in Table 1 are applied to encode particular trading classifiers, we consider the following two classifiers:

Own storagePartner storageTrading decision
1 0 00 0 11
1 0 0# # 00

The first classifier instructs an agent who is carrying good 1 and who is matched with someone carrying good 3 to offer to trade. The second classifier instructs an agent who is carrying good 1 not to trade if he is matched with someone who is not carrying good 3.

The exchange classifier system of agent aa consists of a list of such classifiers. Let e=1,2,,Eae = 1, 2, \ldots, E_a index this collection of classifiers. A given classifier system has a fixed number of classifiers. For the exchange classifier system, how many distinct classifiers are possible? Evidently, 33×33×2=14583^3 \times 3^3 \times 2 = 1458. However, most of these classifiers are redundant in the sense that only a subset of them is needed to represent all possible trading decision rules defined on the state space zat=(xat,xρt(a)t)z_{at} = (x_{at}, x_{\rho_t(a)t}) when there are three goods. All rules can be written in terms of pairs of conditions drawn from Table 1. Thus, only 6×6×2=726 \times 6 \times 2 = 72 strings are required to represent the complete set of possible rules for this system.

Assigned to each exchange classifier e{1,2,,Ea}e \in \{1, 2, \ldots, E_a\} is a strength, denoted Sea(t)S_e^a(t). The strength Sea(t)S_e^a(t) evolves over time in a way determined by the accounting system. The strengths attached to classifiers are used to determine the decisions made by the classifier system at tt. For a given state or ‘condition’ zat=(xat,xρt(a)t)z_{at} = (x_{at}, x_{\rho_t(a)t}), there is a collection of classifiers within the classifier system whose conditions are satisfied. We denote the set of such classifiers by Me(zat)M_e(z_{at}), defined as

Me(zat)={e:zat matches the condition part of classifier e}M_e(z_{at}) = \{ e : z_{at} \text{ matches the condition part of classifier } e \}

The members of Me(zat)M_e(z_{at}) form a class of potential ‘bidders’ in an ‘auction’ whose purpose is to determine which classifier makes the decision of agent aa at time tt. When state zatz_{at} is encountered at time tt by agent aa, the classifier belonging to Me(zat)M_e(z_{at}) that has the highest strength makes the decision. Let et(zat)e_t(z_{at}) denote the index of the classifier to be used in deciding whether to trade at tt. Then,

et(zat)=argmax{Sea(t):eMe(zat)}e_t(z_{at}) = \arg\max\{S_e^a(t): e \in M_e(z_{at})\}

We denote the action (trade or no trade) taken by classifier et(zat)e_t(z_{at}) as λat\lambda_{at}. Equations (5) and (6) describe the ‘auction system’ by which the highest strength rule that applies in a given state is given the right to decide for agent aa at tt.

Because the trade and consumption classifier systems will be making payments to one another, we have to describe the consumption classifier system before describing the accounting system that updates strengths of the trade classifier system. The consumption classifier system is a collection of trinary strings of length 4. The first three positions encode xat+x_{at}^+ using the same code that was described in Table 1. The condition part of the strings is written in terms of the trinary alphabet (0,1,#)(0, 1, \text{\#}). The fourth position of a consumption string is the ‘action’ part, taking values of 1 (meaning consume) and 0 (meaning don’t consume).

We let consumption classifier strings be indexed by c{1,2,,Ca}c \in \{1, 2, \ldots, C_a\}. The strength assigned to classifier cc of agent aa is Sca(t)S_c^a(t). By virtue of eq. (2), the state xat+x_{at}^+ can be expressed in terms of zat=(xat,xρt(a)t)z_{at} = (x_{at}, x_{\rho_t(a)t}). Therefore it suffices to denote the set of ‘matched’ classifiers by Mc(zat)M_c(z_{at}), where

Mc(zat)={c:xat+ matches the condition part of classifier c}M_c(z_{at}) = \{c: x_{at}^+ \text{ matches the condition part of classifier } c\}

Let ct(zat)c_t(z_{at}) denote the classifier that makes the consumption decision at tt. The highest-strength classifier gets to make the decision:

ct(zat)=argmax{Sca(t):cMc(zat)}c_t(z_{at}) = \arg\max\{S_c^a(t): c \in M_c(z_{at})\}

Given the decisions determined by (6) and (8), xat+x_{at}^+ evolves according to the law of motion (4).

3.1 Counters and Strength Evolution

We attach to each exchange classifier ee a ‘counter’ τea(t)\tau_e^a(t) which records the cumulative number of times that classifier ee has won the auction as of date tt. We shall change the strength of classifier ee only when it actually wins the auction and thereby gets to make the exchange decision, which is why we need the counter τea(t)\tau_e^a(t). The counter τea(t)\tau_e^a(t) for classifier ee is defined recursively in terms of the indicator Iea(t)I_e^a(t), which records whether classifier ee wins the auction:

Iea(t)={1if e wins the auction (unless classifier e sets λat=1while λρt(a)t=0, so that the offer to trade is not reciprocated)0otherwiseI_e^a(t) = \begin{cases} 1 & \text{if } e \text{ wins the auction (unless classifier } e \text{ sets } \lambda_{at} = 1 \\ & \text{while } \lambda_{\rho_t(a)t} = 0, \text{ so that the offer to trade is not reciprocated)} \\ 0 & \text{otherwise} \end{cases}
τea(t)=s=0tIea(s)+1\tau_e^a(t) = \sum_{s=0}^{t} I_e^a(s) + 1

Notice that we initialize the counter of each classifier at unity. The strength of classifier ee at tt will be represented as Sea(t)=Seτea(t)aS_e^a(t) = S_{e\tau_e^a(t)}^a.

Similarly, we attach to each consumption classifier cc a ‘counter’ τca(t)\tau_c^a(t) which records the cumulative number of times that classifier cc has won the auction as of date tt. The counter τca(t)\tau_c^a(t) is defined by

Ica(t)={1if c wins the auction0otherwiseI_c^a(t) = \begin{cases} 1 & \text{if } c \text{ wins the auction} \\ 0 & \text{otherwise} \end{cases}
τca(t)=s=0tIca(s)+1\tau_c^a(t) = \sum_{s=0}^t I_c^a(s) + 1

The strength of classifier cc at tt will be represented as Sca(t)=Scτca(t)aS_c^a(t) = S_{c\tau_c^a(t)}^a.

3.2 Bid Functions and Strength Updates

The counters τea(t)\tau_e^a(t) and τca(t)\tau_c^a(t) induce a transformation of time in terms of which the strengths of classifiers cc and ee are updated. At date tt, a strength Scτca(t)aS_{c\tau_c^a(t)}^a is attached to classifier cc, while a strength Seτea(t)aS_{e\tau_e^a(t)}^a is attached to classifier ee of agent aa. At date tt, if classifier ee’s condition is matched [i.e., if eMe(zat)e \in M_e(z_{at})], then classifier ee makes a bid of b1(e)Sea(t)b_1(e)S_e^a(t), where b1(e)b_1(e) is a positive fraction that can depend on ee. If classifier ee wins the auction, its bid will be deducted from its strength. The winning bid will be allocated to augment the strength of other classifiers whose actions drove the system to the state that satisfied ee’s condition. We choose the particular bid function

b1(e)=b11+b12σeb2(c)=b21+b22σc\begin{aligned} b_1(e) &= b_{11} + b_{12}\sigma_e \\ b_2(c) &= b_{21} + b_{22}\sigma_c \end{aligned}

where b11b_{11} and b12b_{12} are positive constants adding up to less than one, and σe\sigma_e is a fraction which is proportional to the specificity of a particular classifier. In particular, we choose σe=1/(1+number of #’s in the string)\sigma_e = 1/(1 + \text{number of \#'s in the string}). Similarly, we define a function b2(c)b_2(c) with σc=1/(1+number of #’s in the string)\sigma_c = 1/(1 + \text{number of \#'s in the string}). By the above choices of b1(e)b_1(e) and b2(c)b_2(c), we favor specific rules over more general rules that can be activated by a particular state. When cMc(zat)c \in M_c(z_{at}), classifier cc makes a bid of b2(c)Sca(t)b_2(c)S_c^a(t).

Only winning classifiers pay their bids by having them deducted from their strengths. The bid of the winning exchange classifier at tt is paid to the winning consumption classifier at t1t - 1, which is the classifier that is to be credited with setting the time tt state to zatz_{at}. The bid of the winning consumption classifier at tt is paid to the winning exchange classifier at time tt, which is to be credited with setting the post-exchange state at tt at xat+x_{at}^+, thereby giving the winning consumption classifier a chance to bid.

We represent these payments in terms of the following difference equations:

Scτca(t)a=Scτca(t)1a1(τca(t)1)[(1+b2(c))Scτca(t)1aeIea(t)b1(e)Seτea(t)aUa(γcta)]S_{c\tau_c^a(t)}^a = S_{c\tau_c^a(t)-1}^a - \frac{1}{(\tau_c^a(t)-1)} \left[ (1+b_2(c)) S_{c\tau_c^a(t)-1}^a - \sum_e I_e^a(t) b_1(e) S_{e\tau_e^a(t)}^a - U_a(\gamma_{ct}^a) \right]
Seτea(t)+1a=Seτea(t)a1τea(t)[(1+b1(e))Seτea(t)acIca(t)b2(c)Scτca(t)a]S_{e\tau_e^a(t)+1}^a = S_{e\tau_e^a(t)}^a - \frac{1}{\tau_e^a(t)} \left[ (1+b_1(e)) S_{e\tau_e^a(t)}^a - \sum_c I_c^a(t) b_2(c) S_{c\tau_c^a(t)}^a \right]

In (14), Ua(γcta)U_a(\gamma_{ct}^a) is the external payoff when the winning consumption classifier cc makes final consumption decision γcta\gamma_{ct}^a. If the post-exchange state at tt is xat+x_{at}^+, then we have

Ua(γcta)=γcta[ui(xat+)s(f(a))]+(1γcta)s(xat+)U_a(\gamma_{ct}^a) = \gamma_{ct}^a \left[ u_i(x_{at}^+) - s(f(a)) \right] + (1 - \gamma_{ct}^a) s(x_{at}^+)

There are several features of (14) and (15) that bear emphasizing. First, these are recursive formulas that make Scτca(t)aS_{c\tau_c^a(t)}^a and Seτea(t)aS_{e\tau_e^a(t)}^a averages of past payoffs (external rewards plus bids received from other classifiers) minus payments (bids made to other classifiers). Use of cumulative average net payoffs in this way (as opposed to total payoffs) is a departure from the existing literature on classifiers. Second, notice how the term eIea(t)b1(e)Seτea(t)a\sum_e I_e^a(t) b_1(e) S_{e\tau_e^a(t)}^a expresses the condition that only the winning exchange classifier at tt pays the winning consumption classifier at t1t - 1. Third, notice how the use of the counters τea(t)\tau_e^a(t) and τca(t)\tau_c^a(t) causes changes to be made only to the strengths attached to the winning classifiers.

Example of flow of payments in classifier systems for type I agent: transfer payments denoted in bold lines, decision flows denoted in thin lines.

Figure 1:Example of flow of payments in classifier systems for type I agent: transfer payments denoted in bold lines, decision flows denoted in thin lines.

Figure 1 illustrates how the exchange and consumption classifier systems interact for an individual of type I. Suppose that the state at time tt is encoded 010 100. This means that the type I individual is storing good 2 and has been matched with someone who is storing good 1. The individual has two classifiers whose conditions are matched, one with strength 50, the other with strength 10. The strength 50 classifier makes the decision, which is ‘1’, namely, offer to trade. If the individual’s trading partner also offers to trade, the individual exits the trading process with state 100, which is the input into type I’s consumption classifier. With this input, there are two consumption classifiers whose conditions are matched, one with strength 80 that says ‘consume’, the other with strength -1 that says ‘don’t consume’. The individual consumes, initiating an external payoff. The bold lines indicate the flows of payoffs. In addition to the external payoff to the winning consumption classifier, there is a payment from the winning consumption classifier at tt to the winning exchange classifier at tt, and a payment from the winning exchange classifier at tt to the winning consumption classifier at t1t - 1.

With (14), (15), and (16), we have completed our description of the classifier system for agent aa when that classifier system contains a complete enumeration of possible rules. Our version of the Kiyotaki-Wright model is formed by assuming that each agent aa of type ii uses the same classifier system to make decisions.[6] Thus, we can replace aa superscripts and subscripts in the descriptions of strengths by ii’s. We start with a set of initial strengths Si(0)S^i(0) for i=1,2,3i = 1, 2, 3. We temporarily introduce the index TT to index calendar time, while tt now temporarily indexes the cumulative number of matches which have occurred rather than calendar time. Our artificially intelligent agents then interact as follows for each T=1,2,,LT = 1, 2, \ldots, L, where LL is the length of our simulation. All AA agents are randomly matched into A/2A/2 pairs. For each pair (a,a)(a, a'), the exchange and classifier systems for agents whose types are aa and aa', respectively, are used mutually to determine the trading and consumption decisions of agents aa and aa'. After the consumption decisions have been made for the pair, the counters and the strengths of the classifier systems for both types of agents are updated according to (10), (12), (14), and (15). Also, after each pair has played, the synthetic time counter tt is augmented by one. After all A/2A/2 pairs have played, the counter TT is augmented by one, and the entire process is repeated. The counter TT records the passage of physical time, while the counter tt records the cumulative number of matches that have occurred. The outcomes of all matches t[(T1)(A/2)+1,T(A/2)]t \in [(T-1)(A/2) + 1, T(A/2)] are interpreted as having occurred during period TT.

In 7. Simulation Results, we shall be interested in recording various frequency distributions across agents as a function of calendar time. That is, there will be distinct frequency distributions of, e.g., holdings of goods across agents for each date in calendar time. At the risk of confusion, we choose to index calendar time by tt (rather than TT) in subsequent sections.

4. Classifier Systems for Supporting Kiyotaki and Wright’s Stationary Equilibria

4.1 The Fundamental Equilibrium

For their model A, Kiyotaki and Wright define as the fundamental equilibrium the trading pattern defined by the triangle depicted in Figure 2. In this trading pattern, good 1, which has the lowest storage cost, serves as the general medium of exchange. In terms of storage, agents of type I only store good 2, which they exchange only for good 1; agents of type III only store good 1, which they exchange only for good 3; and agents of type II half of the time store good 1, which they exchange for good 2, and half of the time good 3, which they exchange for good 1.

Exchange patterns in fundamental equilibrium of model A.

Figure 2:Exchange patterns in fundamental equilibrium of model A.

We can characterize equilibria in terms of a set of probabilities that describe holding and exchanging patterns. We define

πith(k)=probability that a type i agent is holding k at tπth(k)=probability that a randomly selected agent is holding k at tπite(kj)=probability that a type i agent exchanges k for j at tπite(kjk)=probability that a type i agent exchanges k for j given that i is holding kπitc(k)=probability that a type i agent consumes k at tπitc(kk)=probability that a type i agent consumes k given that he holds k after tradingπt(ik)=probability that an agent is of type i given that he holds k at t\begin{aligned} \pi_{it}^h(k) &= \text{probability that a type } i \text{ agent is holding } k \text{ at } t \\ \pi_t^h(k) &= \text{probability that a randomly selected agent is holding } k \text{ at } t \\ \pi_{it}^e(kj) &= \text{probability that a type } i \text{ agent exchanges } k \text{ for } j \text{ at } t \\ \pi_{it}^e(kj|k) &= \text{probability that a type } i \text{ agent exchanges } k \text{ for } j \text{ given that } i \text{ is holding } k \\ \pi_{it}^c(k) &= \text{probability that a type } i \text{ agent consumes } k \text{ at } t \\ \pi_{it}^c(k|k) &= \text{probability that a type } i \text{ agent consumes } k \text{ given that he holds } k \text{ after trading} \\ \pi_t(i|k) &= \text{probability that an agent is of type } i \text{ given that he holds } k \text{ at } t \end{aligned}

We denote by Πt\Pi_t the entire set of probabilities defined in (17) for all i,k,ji, k, j. Associated with Kiyotaki and Wright’s stationary equilibrium is a time-invariant Πt=Π\Pi_t = \Pi. In Table 2 we display the Π\Pi’s for the fundamental equilibrium of Kiyotaki and Wright’s model A.

Table 2:Equilibrium probabilities for Economy A1

(a) Probability that ii holds jj, πih(j)\pi_i^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

0

1

0

i=2i=2

0.5

0

0.5

i=3i=3

1

0

0

Table 3:Joint exchange probabilities πie(jk)\pi_i^e(jk) in the stationary equilibrium of Economy A1

πie(jk)\pi_i^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(0,0,0)(0, 0, 0)

(0.167,[0,0.333],0)(0.167, [0,0.333], 0)

(0,0,0)(0, 0, 0)

i=2i=2

([0,0.25],0.167,0)([0,0.25], 0.167, 0)

(0,0,0)(0, 0, 0)

(0.167,0,[0,0.083])(0.167, 0, [0,0.083])

i=3i=3

([0,0.5],0,0.167)([0,0.5], 0, 0.167)

(0,0,0)(0, 0, 0)

(0,0,0)(0, 0, 0)

The (i,j)(i,j) entry in Table 3 is the triple (πie(j1),πie(j2),πie(j3))(\pi_i^e(j1), \pi_i^e(j2), \pi_i^e(j3)), representing the probability that a type ii agent holds jj, meets an agent with kk, and trades. Note that πie(jk)\pi_i^e(jk) here denotes the joint holding-meeting-trading probability, where jj indexes the good held and kk the good of the trading partner; this differs from the conditional exchange probability πite(kj)\pi_{it}^e(kj) defined in (17), which conditions on holding kk and denotes exchanging kk for jj.

Table 4:Equilibrium exchange strategies for Economy A1

π~ie(jkj)\tilde{\pi}_i^e(jk|j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

([0,1],0,0)?([0,1],0,0)?

(1,[0,1],0)(1,[0,1],0)

(1,1,[0,1])?(1,1,[0,1])?

i=2i=2

([0,1],1,0)([0,1],1,0)

(0,[0,1],0)?(0,[0,1],0)?

(1,1,[0,1])(1,1,[0,1])

i=3i=3

([0,1],0,1)([0,1],0,1)

(1,[0,1],1)?(1,[0,1],1)?

(0,0,[0,1])?(0,0,[0,1])?

Note: A question mark denotes sequential equilibrium strategies for events of zero probability. The (i,j)(i,j) entry in the table is the triple (π~ie(j1j),π~ie(j2j),π~ie(j3j))(\tilde{\pi}_i^e(j1|j), \tilde{\pi}_i^e(j2|j), \tilde{\pi}_i^e(j3|j)), representing the probability that a type ii agent is willing to exchange good jj for goods 1, 2, or 3 respectively, given that ii is holding jj.

Table 5:Consumption probabilities for Economy A1

πic(jj)\pi_i^c(j|j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

1

0

0

i=2i=2

0

1

0

i=3i=3

0

0

1

Table 5 shows that in any stationary equilibrium, each type of agent consumes only its own desired good.

We now describe how the strategies that support Kiyotaki and Wright’s ‘fundamental equilibrium’ can be represented as a classifier system. For the classifiers we introduce the following additional notation. Let ek,j,die^i_{k,j,d} denote the exchange classifier for a type ii agent that reads ‘If I am storing good kk and I meet an agent with good jj, then my exchange decision is dd’; d{0,1}d \in \{0, 1\}. Similarly, specific consumption classifiers for type ii agents are of the form ck,dic^i_{k,d} for ‘If, at the end of the exchange subperiod, I am storing good kk, then my consumption action is dd’. For more general rules, we use the subindex j-j to denote ‘not good jj’ and #\text{\#} to denote ‘any good’. That is, e2,1,01e^1_{2,-1,0} is the classifier for type I agents that reads ‘If I am storing good 2 and I meet an agent not carrying good 1, then do not trade’, and c#,01c^1_{\text{\#},0} is a classifier for type I that reads ‘No matter what I am storing at the end of the exchange subperiod, do not consume’.

We can simplify the study of classifier systems by considering only a small complete set of rules. For type I agents these rules are: e2,1,11e^1_{2,1,1}; e2,1,01e^1_{2,-1,0}; e3,1,11e^1_{3,1,1}; e3,1,01e^1_{3,-1,0}; c1,11c^1_{1,1}; c1,01c^1_{-1,0}. If {e2,1,11;e2,1,01;c1,11;c1,01}\{e^1_{2,1,1}; e^1_{2,-1,0}; c^1_{1,1}; c^1_{-1,0}\} are the only classifiers being used by agents of type I, we can say that type I agents support the fundamental equilibrium. We can also describe classifiers for type II and type III agents that support the fundamental equilibrium. These are, for type II: e3,1,12e^2_{3,1,1}; e3,2,12e^2_{3,2,1}; e3,1,02e^2_{3,-1,0}; e1,2,02e^2_{1,-2,0}; c2,12c^2_{2,1}; c2,02c^2_{-2,0}; and for type III: e1,3,13e^3_{1,3,1}; e1,3,03e^3_{1,-3,0}; c3,13c^3_{3,1}; c3,03c^3_{-3,0}.

It is useful to define formally the sense in which these fixed lists of classifiers support Kiyotaki and Wright’s fundamental equilibrium. Let DaD_a be a fixed set of classifiers (exchange and consumption classifiers) for agent aa. Let the set DaD_a have no redundancies in the sense that for every state zatz_{at}, DaD_a uniquely determines actions. Then the fixed set of classifiers DaD_a represents a strategy for agent aa defined on zatz_{at}. We use the following definition of optimality for a fixed set of classifiers:

Definition. Given the probability distributions Π\Pi and fixed sets of classifiers of the other agents DaD_{a'}, for all aaa' \neq a, a fixed set of classifiers DaD_a is said to be optimal for agent aa if there exists no other set D~a\tilde{D}_a which yields higher long-run average utility for agent aa.

A stationary Nash equilibrium can then be defined as follows:

Definition. A stationary Nash equilibrium is a set of probability distributions Π\Pi and fixed sets of classifiers DaD_a, a=1,,Aa = 1, \ldots, A, such that:

  1. Given Π\Pi and DaD_{a'}, for aaa' \neq a, DaD_a is optimal for agent aa.

  2. {Da,a=1,,A}\{D_a, a = 1, \ldots, A\} and the random matching technology imply that Πt=Π\Pi_t = \Pi for all tt.

This definition is just an alternative representation of Kiyotaki and Wright’s definition in terms of fixed lists of classifiers.

Table 6 illustrates the sequence of events under the circumstance that the only active (i.e., auction-winning) classifiers are those that support the fundamental equilibrium for agents of type I. A natural question is whether the classifier systems of the three types of agents will actually converge to a situation in which the fundamental equilibrium of Kiyotaki and Wright is supported. Before introducing some concepts that will help us think about this question, we shall briefly describe another kind of equilibrium of Kiyotaki and Wright’s model and the kind of classifier system that could support it.

Table 6:Behavior of type I agent in a fundamental equilibrium

State

Action

Next State

Holding good 2

Meet agent with good 1 → Trade

Holding good 1

Holding good 1

Consume → Produce good 2

Holding good 2

4.2 Kiyotaki and Wright’s ‘Speculative Equilibrium’

Kiyotaki and Wright show that for their model A the trading pattern associated with a fundamental equilibrium is not the only possible one in equilibrium. They show that there can occur a speculative equilibrium with a trading pattern depicted in Figure 3.

Exchange patterns in speculative equilibrium of Economy A.

Figure 3:Exchange patterns in speculative equilibrium of Economy A.

Kiyotaki and Wright establish a condition under which the unique stationary rational-expectations equilibrium is a fundamental equilibrium. The condition, for the limiting case as the discount rate converges to zero, is that the following inequality is satisfied:[7]

s3s2>(π1h(3)π1h(2))13u1s_3 - s_2 > \left(\pi_1^h(3) - \pi_1^h(2)\right) \frac{1}{3} u_1

In the speculative equilibrium, Table 7 depicts the flow of events in the stationary rational-expectations equilibrium for a classifier system of a type I agent. Notice that the flows depicted reduce to a set of fundamental classifiers if the dotted link is removed, i.e., if type I agents decide not to trade good 2 for good 3.

Table 7:Behavior of type I agent in a speculative equilibrium

State

Action

Next State

Holding good 2

Meet agent with good 1 → Trade

Holding good 1

Holding good 2

Meet agent with good 3 → Trade*

Holding good 3

Holding good 1

Consume → Produce good 2

Holding good 2

Holding good 3

Meet agent with good 1 → Trade

Holding good 1

Note: The asterisk (*) denotes the speculative move that distinguishes this from the fundamental equilibrium.

In 7. Simulation Results we shall provide an example of a simulated classifier economy where (18) is not satisfied (u1u_1 is too high), but where nevertheless when agents start with a complete set of classifiers (with homogeneous strengths), the classifier systems converge to the fundamental equilibrium and not to the speculative equilibrium. Before turning to these simulations, we define some notions of stability that are designed to distinguish alternative senses in which a set of classifier systems may be said to converge.

5. Concepts of Convergence for Classifier Systems Playing Games

Given the decision rules being employed by other agents, (14)-(15) for agent aa forms a system of stochastic difference equations in the strengths Sa(t)S^a(t). Since the classifier systems of the three types of agents i=1,2,3i = 1, 2, 3 are operating simultaneously, the behavior of the entire system is determined by the system of difference equations (14)-(15) for agent types i=1,2,3i = 1, 2, 3. Let Si(t)S^i(t) denote the strengths for an agent of type ii. It is natural to inquire whether the system formed by (14)-(15) for a=i=1,2,3a = i = 1, 2, 3 converges to a stationary point in the space of strengths, and if it does so, whether that stationary point supports the fundamental equilibrium of Kiyotaki and Wright.

Although we have not yet carried out an analysis of the convergence of this system, it seems useful at this point to report our ideas about how such an analysis could be structured. In this section we indicate how ideas from the stochastic-approximation literature might be used to study convergence in the space of strengths.[8]

From the stochastic-approximation literature,[9] we know that any limit points (Sc,Se)(S_c^*, S_e^*) must satisfy

E[(1+b2(c))ScaeIea(t)b1(e)SeaUa(γca)]=0E\left[ (1 + b_2(c)) S_c^a - \sum_e I_e^a(t) b_1(e) S_e^a - U_a(\gamma_c^a) \right] = 0
E[(1+b1(e))SeacIca(t)b2(c)Sca]=0E\left[ (1 + b_1(e)) S_e^a - \sum_c I_c^a(t) b_2(c) S_c^a \right] = 0

for all cc and ee. For a given aa we define the solutions (Sc,Se)(S_c^*, S_e^*) of (20) as a stationary set of strengths for agent aa. Given a stationary set of strengths for agent aa, one can determine the set of classifiers {e,c}=Wa(zat)\{e, c\} = W_a(z_{at}) that would win the auction for each zatz_{at}. If aa has the fixed set of classifiers Wa(zat)=DaW_a(z_{at}) = D_a, where DaD_a is the fixed set of classifiers that supports aa’s behavior in a Nash equilibrium, then the stationary set of strengths associated with Wa(zat)W_a(z_{at}) can be said to support the stationary Nash equilibrium behavior of agent aa.

These definitions inspire several questions. First, given stationary sets of strengths for agents of types 1, 2, 3, is it true that Wa(zat)=DaW_a(z_{at}) = D_a for a type ii agent, for each i=1,2,3i = 1, 2, 3? That is, does a stationary set of strengths for the classifier systems for agents of type i=1,2,3i = 1, 2, 3 support a stationary Nash equilibrium?

Second, given fixed sets of classifiers DaD_a for aa of type i=1,2,3i = 1, 2, 3 that support a stationary Nash equilibrium, do these sets of classifiers imply the existence of a set of strengths SaS_a for aa of type i=1,2,3i = 1, 2, 3 that solve (20)? That is, is every Nash equilibrium supported as a stationary point in strengths?

Third, if we let the classifier systems of agents of type i=1,2,3i = 1, 2, 3 run in real time, does the joint system formed by (14)-(15) for agents of each type converge almost surely, and if so, to which solution of (20) does it converge? We know from the theory of stochastic approximation that if it converges, it must converge to solutions of (20).

Associated with the third of these questions is the notion of global asymptotic stability. In future work, we plan to apply results from the stochastic approximation literature to study the global asymptotic stability of versions of our interacting classifier systems.

We also intend to study a form of stability that is distinct from and weaker than global asymptotic stability. This form of stability is summarized in the following:

Nash-like stability test. Take a stationary equilibrium fixed by Π\Pi. Place a single agent of a given type ii in the environment formed by the fixed probabilities Π\Pi. Let a classifier system for that single agent operate until it converges. Check to see whether it converges to imply the collection of classifiers DaD_a for an agent aa of type ii which are required to support the equilibrium probabilities. In this experiment, in running the classifier system for the single agent, the decision rules of all other agents of his type are being held fixed at their equilibrium values. This experiment is to be repeated for a single agent of each type.

In the remainder of this paper, we shall not pursue the formal analysis of the stability and convergence questions formulated above.[10] Instead, we shall report computer simulations. In these computer simulations, convergence can be based either on the limiting behavior of the strengths, as indicated above, or on the convergence of key aspects of the empirical frequency distribution of holdings Πt\Pi_t, which we defined in 4. Classifier Systems for Supporting Kiyotaki and Wright’s Stationary Equilibria.

6. Incomplete Enumeration Classifiers and the ‘Genetic Algorithm’

We now describe how the classifier system is modified when we deal with systems in which there is not a complete enumeration of rules. In order to complete the system, two aspects must be added to the above description of the classifier system under complete enumeration. The first is a way of generating an initial population of rules. The second is a way of deleting old rules from the system and adding new rules to the system. To generate the initial population of rules, we simply use a random number generator to generate bit strings by choosing successive bits independently.

The process of adding new rules and deleting old ones is accomplished by adding four major operations to the previous description of the workings of the classifier system. We call these steps ‘creation’, ‘diversification’, ‘specialization’, and ‘generalization’, respectively.

Creation: The creation operation is activated when there is no classifier matching the current state zatz_{at}, i.e., Me(zat)M_e(z_{at}) is empty. In this case, a new classifier is created with its condition part defined by the current state and its action part randomly selected. A ‘weak’ classifier from the set of all classifiers is deleted to keep the number of classifiers constant.

Diversification: The diversification operation is designed to inject sufficient diversity into the range of actions called for by different classifiers in a given situation. For the exchange classifier, this step occurs at the stage at which zatz_{at} has been observed and the set Me(zat)M_e(z_{at}) has been constructed. Recall that Me(zat)M_e(z_{at}) is the set of exchange classifiers whose condition parts match zatz_{at}. If all eMe(zat)e \in M_e(z_{at}) have the same action (0 or 1), the diversification operation creates a classifier encoding the specific state zatz_{at} and an action opposite to that taken by the other classifiers in Me(zat)M_e(z_{at}). The strength of the winning classifier is assigned to this new classifier. This new classifier is added to the system, while simultaneously a ‘weak’ classifier from the set Me(zat)M_e(z_{at}) is deleted from the system. The weakness of a classifier is measured in terms of a combination of strength and cumulated number of times the classifier has won the bidding in the past. The diversification operation is used each time the classifier system is called upon. Notice that if this step were to be added to a complete enumeration system, it would have no effect on the population of rules because sufficient diversity is present from the beginning.

Specialization: The specialization operation is called randomly with a probability that we choose to be diminishing over time. The random variable governing specialization calls this step into action just after the winning bid has been determined. The winning classifier is checked to determine whether there are any #\text{\#}'s in its condition part. A new classifier is synthesized by exposing each #\text{\#} in the condition part to a probability of being switched to a 0 or a 1, whichever one specifically encodes the particular zatz_{at} that was just observed or matched. This new and (probably) more specific rule is then used to replace a weak rule from the set eMe(zat)e \in M_e(z_{at}) [or cMc(zat)c \in M_c(z_{at}) in the case of the consumption classifier], where weakness is measured by a combination of strengths and cumulative number of times the auction has been won.

Generalization: The fourth step, which we call generalization, is a version of a ‘genetic algorithm’. This step is called randomly, according to a probability that we specify to be declining through time. At a point when the step is called, the process is initiated by generating two distinct populations of classifiers: ‘potential parents’ and ‘potential exterminants’. The potential parents are chosen to be a population of classifiers of specified size. They are chosen according to a ‘fitness criterion’ that weights strength and cumulative number of times that the classifier won the auction. The potential exterminants are chosen to be a population of specified size, whose members are of low fitness as measured by a fitness criterion weighting strengths and cumulative victories in past bidding.

From the population of potential parents, a specified number of pairs of parents are randomly drawn to engage in ‘mating’ for the purpose of creating children. How a pair of parents mates can be illustrated for two exchange classifiers, which are bit strings of length 7, as depicted in Figure 4.

The pair mates in two steps. First the pair draws two random integers uniformly distributed on [1,7][1, 7]. The two integers signify the positions between bits depicted in Figure 4. Next, a Bernoulli random variable with probability 0.5 is drawn, assuming values of ‘in’ or ‘out’. If ‘in’ is drawn, we focus on the bits between the two lines, i.e., the bits inside the randomly drawn interval. If ‘out’ is drawn, we focus on the bits outside the randomly drawn interval. Within the interval of focus, we complete the mating process via the following version of genetic crossover. For each of the parent strings in the area of focus, we change to a #\text{\#} any bits for which the parent strings disagree. (Here, #\text{\#} agrees with either 0 or 1, while 0 disagrees with 1). This process produces two children. The children are each assigned strengths that are the average of their parents’ strengths. The children are introduced into the population of classifiers. For each child added to the population, a randomly selected individual from the population of ‘potential exterminants’ is deleted from the population.

This completes our description of the algorithm to be used for updating the classifier system when we deal with an incomplete enumeration classifier system. In the simulations to be reported in 7. Simulation Results, we shall employ both complete and incomplete enumeration classifier systems.

The mating process for exchange classifiers who have drawn ‘3,6’ and ‘in’.

Figure 4:The mating process for exchange classifiers who have drawn ‘3,6’ and ‘in’.

7. Simulation Results

The sets of parameters defining the economies under study are summarized in Table 8.

Table 8:Description of the economies

Economy

Production

Storage costs

Utility

Initial CS

Equil. type

A1.1

2

s1=0.1,s2=1,s3=20s_1=0.1, s_2=1, s_3=20

ui=100u_i=100

F

F

A1.2

2

s1=0.1,s2=1,s3=20s_1=0.1, s_2=1, s_3=20

ui=100u_i=100

R

F

A2.1

2

s1=0.1,s2=1,s3=20s_1=0.1, s_2=1, s_3=20

ui=500u_i=500

F

S

A2.2

2

s1=0.1,s2=1,s3=20s_1=0.1, s_2=1, s_3=20

ui=500u_i=500

R

S

B.1

3

s1=1,s2=4,s3=9s_1=1, s_2=4, s_3=9

ui=100u_i=100

F

F/S

B.2

3

s1=1,s2=4,s3=9s_1=1, s_2=4, s_3=9

ui=100u_i=100

R

F/S

C

2

s1=9,s2=14,s3=29,s0=0s_1=9, s_2=14, s_3=29, s_0=0

ui=100u_i=100

R

F

D

3

s1=1,s2=4,s3=9,s4=16,s5=30s_1=1, s_2=4, s_3=9, s_4=16, s_5=30

ui=200u_i=200

R

Notes: Utility levels uiu_i are set equal for i=1,2,3i = 1, 2, 3. CS denotes ‘classifier system’. ‘F’ implies fixed enumeration and ‘R’ implies randomly generated rules.

7.1 Economy A1

Table 9:Parameter values used for Economy A1.1

Parameter

Value

No. of agents

Ai=50A_i = 50

No. of classifiers

Ea=72,Ca=12E_a = 72, C_a = 12

Storage costs

s1=0.1,s2=1,s3=20s_1 = 0.1, s_2 = 1, s_3 = 20

Utility

ui=100,iu_i = 100, \forall i

Initial strengths

Seτea(0)a=0,Scτca(0)a=0S_{e\tau_e^a(0)}^a = 0, S_{c\tau_c^a(0)}^a = 0

Bids

b11=0.025,b12=0.025,b21=0.25,b22=0.25b_{11} = 0.025, b_{12} = 0.025, b_{21} = 0.25, b_{22} = 0.25

Economy A1.1 (Economy A1 with complete enumeration of classifiers) shows that the distribution of holdings rapidly converges to the stationary equilibrium distributions. Similarly, the exchange and consumption strategies implemented by the system of winning classifiers virtually coincide with the equilibrium strategies. In Table 10, and in all subsequent tables of empirical frequencies, the frequencies reported are ten-period moving averages ending at the indicated time periods.[11]

Table 10:Frequency with which ii holds jj at t=500t=500 and t=1000t=1000 for Economy A1.1

πith(j)\pi_{it}^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=500t=500)

0

1

0

i=2i=2 (t=500t=500)

0.502

0

0.498

i=3i=3 (t=500t=500)

1

0

0

i=1i=1 (t=1000t=1000)

0

1

0

i=2i=2 (t=1000t=1000)

0.506

0

0.494

i=3i=3 (t=1000t=1000)

1

0

0

Table 11:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=500t=500 and t=1000t=1000 for Economy A1.1

πite(jk)\pi_{it}^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=500t=500)

(0,0,0)(0,0,0)

(0.16,0,0)(0.16,0,0)

(0,0,0)(0,0,0)

i=2i=2 (t=500t=500)

(0.26,0.16,0)(0.26,0.16,0)

(0,0,0)(0,0,0)

(0.17,0,0)(0.17,0,0)

i=3i=3 (t=500t=500)

(0.52,0,0.17)(0.52,0,0.17)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

i=1i=1 (t=1000t=1000)

(0,0,0)(0,0,0)

(0.17,0,0)(0.17,0,0)

(0,0,0)(0,0,0)

i=2i=2 (t=1000t=1000)

(0.27,0.17,0)(0.27,0.17,0)

(0,0,0)(0,0,0)

(0.19,0,0)(0.19,0,0)

i=3i=3 (t=1000t=1000)

(0.51,0,0.19)(0.51,0,0.19)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

Table 12:Winning classifier actions π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) at t=1000t=1000 for Economy A1.1

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(1,0,0)(1,0,0)

i=2i=2

(1,1,0)(1,1,0)

(1,1,0)(1,1,0)

i=3i=3

(1,0,1)(1,0,1)

Table 13:Highest-strength consumption classifiers for type I at t=1000t=1000, Economy A1.1

Iteration

Classifier

Strength

1000

1 0 0 1

45.8

# # 0 0

−0.47

0 # # 1

−9.16

Table 14:Highest-strength exchange classifiers for type I at t=1000t=1000, Economy A1.1

Iteration

Classifier

Strength

1000

0 # # 1 0 0 1

11.22

0 # # 0 1 0 0

−0.08

# # 0 0 0 1 0

−0.08

Table 15:Highest-strength consumption classifiers for type II at t=1000t=1000, Economy A1.1

Iteration

Classifier

Strength

1000

1 0 0 0

0.0089

0 1 0 1

49.04

# 0 # 0

−10.90

Table 16:Highest-strength exchange classifiers for type II at t=1000t=1000, Economy A1.1

Iteration

Classifier

Strength

1000

# # 0 # # 0 1

4.86

# # 0 # 0 # 0

0.002

0 0 1 1 0 0 1

0.002

0 # # 0 1 0 1

12.72

0 # # 0 # # 0

−1.79

Table 17:Highest-strength consumption classifiers for type III at t=1000t=1000, Economy A1.1

Iteration

Classifier

Strength

1000

# 0 # 0

−0.024

# # 0 0

−0.412

0 0 1 1

31.62

Table 18:Highest-strength exchange classifiers for type III at t=1000t=1000, Economy A1.1

Iteration

Classifier

Strength

1000

# 0 # # 0 # 1

−0.01

# 0 # 0 1 0 0

−0.01

# 0 # 0 0 1 1

7.78

Distribution of holdings for Economy A1.1.

Figure 5:Distribution of holdings for Economy A1.1.

Table 19:Parameter values used for Economy A1.2

Parameter

Value

No. of agents

Ai=50A_i = 50

No. of classifiers

Ea=72,Ca=12E_a = 72, C_a = 12

Storage costs

s1=0.1,s2=1,s3=20s_1 = 0.1, s_2 = 1, s_3 = 20

Utility

ui=100,iu_i = 100, \forall i

Initial strengths

Seτea(0)a=0,Scτca(0)a=0S_{e\tau_e^a(0)}^a = 0, S_{c\tau_c^a(0)}^a = 0

Bids

b11=0.025,b12=0.025,b21=0.25,b22=0.25b_{11} = 0.025, b_{12} = 0.025, b_{21} = 0.25, b_{22} = 0.25

Specialization

fs(t)=1/2t,ps=0.01f_s(t) = 1/2\sqrt{t}, p_s = 0.01

Generalization

fg(t)=1/2t,p1=0.2,p2=0.7,p3=0.2,p4=0.5,S=0,Ne=8,Nc=4f_g(t) = 1/2\sqrt{t}, p_1 = 0.2, p_2 = 0.7, p_3 = 0.2, p_4 = 0.5, S = 0, N_e = 8, N_c = 4

Economy A1.2 is identical to Economy A1.1, except that we use a randomly generated list of rules initially, and rely on a genetic algorithm to inject new rules into the classifier’s system.[12]

Table 20:Frequency with which ii holds jj at t=1000t=1000 and t=2000t=2000 for Economy A1.2

πith(j)\pi_{it}^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=1000t=1000)

0

0.992

0.008

i=2i=2 (t=1000t=1000)

0.226

0

0.774

i=3i=3 (t=1000t=1000)

1

0

0

i=1i=1 (t=2000t=2000)

0

0.98

0.02

i=2i=2 (t=2000t=2000)

0.318

0

0.682

i=3i=3 (t=2000t=2000)

1

0

0

Table 21:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=1000t=1000 and t=2000t=2000 for Economy A1.2

πite(jk)\pi_{it}^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=1000t=1000)

(0,0,0)(0,0,0)

(0.08,0.14,0.01)(0.08,0.14,0.01)

(0,0,0)(0,0,0)

i=2i=2 (t=1000t=1000)

(0,0.08,0)(0,0.08,0)

(0,0,0)(0,0,0)

(0.28,0.01,0.09)(0.28,0.01,0.09)

i=3i=3 (t=1000t=1000)

(0,0,0.28)(0,0,0.28)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

i=1i=1 (t=2000t=2000)

(0,0,0)(0,0,0)

(0.09,0.19,0.03)(0.09,0.19,0.03)

(0,0,0)(0,0,0)

i=2i=2 (t=2000t=2000)

(0,0.08,0)(0,0.08,0)

(0,0,0)(0,0,0)

(0.2,0.03,0.06)(0.2,0.03,0.06)

i=3i=3 (t=2000t=2000)

(0,0.01,0.2)(0,0.01,0.2)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

Table 22:Winning classifier actions π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) at t=1000t=1000 and t=2000t=2000 for Economy A1.2

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=1000t=1000)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(0,,)(0,\text{—},\text{—})

(1,1,0)(1,1,0)

(1,1,0)(1,1,0)

i=2i=2

(0,1,0)(0,1,0)

(,,)(\text{—},\text{—},\text{—})

(1,0,1)(1,0,1)

i=3i=3

(0,0,1)(0,0,1)

(,,1)(\text{—},\text{—},1)

(,,1)(\text{—},\text{—},1)

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=2000t=2000)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(,,)(\text{—},\text{—},\text{—})

(1,0,1)(1,0,1)

(1,1,1)(1,1,1)

i=2i=2

(0,1,1)(0,1,1)

(,,)(\text{—},\text{—},\text{—})

(1,1,1)(1,1,1)

i=3i=3

(0,1,1)(0,1,1)

(,,1)(\text{—},\text{—},1)

Table 23:Consumption frequency πitc(jj)\pi_{it}^c(j|j) at t=1000t=1000 and t=2000t=2000 for Economy A1.2

πitc(jj)\pi_{it}^c(j|j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=1000t=1000)

1

0.011

0.049

i=2i=2 (t=1000t=1000)

0.45

1

0.366

i=3i=3 (t=1000t=1000)

0.033

1

i=1i=1 (t=2000t=2000)

1

0

0.69

i=2i=2 (t=2000t=2000)

0.23

1

0.641

i=3i=3 (t=2000t=2000)

0.84

1

1

Table 24:Highest-strength consumption classifiers for type I, Economy A1.2

Iteration

Classifier

Strength

1000

1 0 0 1

38.11

0 1 # 0

−0.44

0 0 1 1

−4.26

2000

1 0 0 1

41.65

0 1 0 0

−0.43

0 0 1 0

−8.86

Table 25:Highest-strength exchange classifiers for type I, Economy A1.2

Iteration

Classifier

Strength

1000

0 1 0 1 0 0 1

8.27

0 1 0 0 1 0 1

−0.09

0 1 0 0 0 1 0

−0.10

2000

0 1 0 1 0 0 1

9.05

0 1 0 0 1 0 0

−0.10

0 1 0 0 0 1 1

−0.10

Table 26:Highest-strength consumption classifiers for type II, Economy A1.2

Iteration

Classifier

Strength

1000

# # 0 0

1.88

0 # 0 1

11.48

0 0 1 1

−9.29

2000

# # 0 0

8.05

0 # 0 1

38.40

0 0 1 1

−9.01

Table 27:Highest-strength exchange classifiers for type II, Economy A1.2

Iteration

Classifier

Strength

1000

1 0 0 1 0 0 0

0.29

1 0 0 0 1 0 1

1.67

1 0 0 0 0 1 0

0.28

0 0 1 # # 0 1

2.23

0 # 1 0 1 0 0

4.08

0 0 1 0 0 1 1

−2.08

2000

1 0 0 1 0 0 0

1.32

1 0 0 0 1 0 1

6.15

1 0 0 0 0 1 0

1.25

0 0 1 1 0 0 1

1.34

0 # 1 0 1 0 1

2.98

0 0 1 0 0 1 1

−1.99

Table 28:Highest-strength consumption classifiers for type III, Economy A1.2

Iteration

Classifier

Strength

1000

1 0 0 0

0.00

0 0 1 1

30.94

2000

1 0 0 1

8.97

0 0 1 1

30.55

Table 29:Highest-strength exchange classifiers for type III, Economy A1.2

Iteration

Classifier

Strength

1000

1 0 0 # # # 0

0.28

1 0 0 0 # # 0

1.77

1 # # 0 # 1 1

7.39

2000

1 0 0 1 # 0 0

0.14

1 0 0 0 1 0 1

0.42

1 # # 0 # 1 1

7.25

As we can see from Figure 6 and the above tables, the distribution of holdings for agents of type I and III rapidly converge to their equilibrium levels. Type II agents only hold goods 1 and 3 as predicted but it takes longer to converge to the equilibrium values. The exchange decisions are the correct ones (even in period 1000), except for a couple of events of very small probability (type I agents trade good 2 for 3 and type III agents trade good 1 for 2).

Consumption decisions are basically correct, except that II consumes 3 and III consumes 1. Overeating has been a minor problem in several of our economies. In most such cases that we have encountered, such wrong behavior has been associated with low-frequency states for which it is difficult for the classifier system to acquire the experience required to reward better classifiers and ‘right’ actions.

In summary, both with complete enumeration and without it, our simulations of Economy A1 seem to be converging to the stationary equilibrium, although this convergence is slower when initial classifiers are randomly generated and new classifiers are synthesized.

Distribution of holdings for Economy A1.2.

Figure 6:Distribution of holdings for Economy A1.2.

7.2 Economy A2

Table 30:Parameter values used for Economy A2.1

Parameter

Value

No. of agents

Ai=50A_i = 50

No. of classifiers

Ea=72,Ca=12E_a = 72, C_a = 12

Storage costs

s1=0.1,s2=1,s3=20s_1 = 0.1, s_2 = 1, s_3 = 20

Utility

ui=500,iu_i = 500, \forall i

Initial strengths

Seτea(0)a=0,Scτca(0)a=0S_{e\tau_e^a(0)}^a = 0, S_{c\tau_c^a(0)}^a = 0

Bids

b11=0.025,b12=0.025,b21=0.25,b22=0.25b_{11} = 0.025, b_{12} = 0.025, b_{21} = 0.25, b_{22} = 0.25

Economy A2 differs from Economy A1 only in that agents receive 500 utils from consuming their desired good instead of 100. With this change of parameters, for high enough discount factors the unique stationary equilibrium of Kiyotaki and Wright’s economy is the so-called speculative equilibrium. The probabilities that characterize the speculative equilibrium are summarized in Table 31.

Table 31:Speculative equilibrium: probability that ii holds jj, πih(j)\pi_i^h(j), for Economy A2

πih(j)\pi_i^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

0

0.707

0.293

i=2i=2

0.586

0

0.414

i=3i=3

1

0

0

Table 32:Speculative equilibrium: joint exchange probability πie(jk)\pi_i^e(jk) for Economy A2

πie(jk)\pi_i^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(0,0,0)(0,0,0)

(0.138,[0,0.167],0.097)(0.138,[0,0.167],0.097)

(0.097,0,[0,0.029])(0.097,0,[0,0.029])

i=2i=2

([0,0.269],0.138,0.057)([0,0.269],0.138,0.057)

(0,0,0)(0,0,0)

(0.138,0.098,[0,0.098])(0.138,0.098,[0,0.098])

i=3i=3

([0,0.528],0,0.235)([0,0.528],0,0.235)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

Table 33:Speculative equilibrium exchange strategies π~ie(jkj)\tilde{\pi}_i^e(jk|j) for Economy A2

π~ie(jkj)\tilde{\pi}_i^e(jk|j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

([0,1],0,0)?([0,1],0,0)?

(1,[0,1],1)(1,[0,1],1)

(1,0,[0,1])(1,0,[0,1])

i=2i=2

([0,1],1,0)([0,1],1,0)

(0,[0,1],0)?(0,[0,1],0)?

(1,1,[0,1])(1,1,[0,1])

i=3i=3

([0,1],0,1)([0,1],0,1)

(1,[0,1],1)?(1,[0,1],1)?

(0,0,[0,1])?(0,0,[0,1])?

Table 34:Frequency with which ii holds jj at t=500t=500 and t=1000t=1000 for Economy A2.1

πith(j)\pi_{it}^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=500t=500)

0

1

0

i=2i=2 (t=500t=500)

0.504

0

0.496

i=3i=3 (t=500t=500)

1

0

0

i=1i=1 (t=1000t=1000)

0

1

0

i=2i=2 (t=1000t=1000)

0.466

0

0.534

i=3i=3 (t=1000t=1000)

1

0

0

Table 35:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=500t=500 and t=1000t=1000 for Economy A2.1

πite(jk)\pi_{it}^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=500t=500)

(0,0,0)(0,0,0)

(0.16,0.17,0.15)(0.16,0.17,0.15)

(0,0,0)(0,0,0)

i=2i=2 (t=500t=500)

(0,0.16,0)(0,0.16,0)

(0,0,0)(0,0,0)

(0.17,0.15,0.04)(0.17,0.15,0.04)

i=3i=3 (t=500t=500)

(0.34,0,0.17)(0.34,0,0.17)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

i=1i=1 (t=1000t=1000)

(0,0,0)(0,0,0)

(0.18,0.13,0.19)(0.18,0.13,0.19)

(0,0,0)(0,0,0)

i=2i=2 (t=1000t=1000)

(0,0.18,0)(0,0.18,0)

(0,0,0)(0,0,0)

(0.18,0.19,0.04)(0.18,0.19,0.04)

i=3i=3 (t=1000t=1000)

(0.34,0,0.18)(0.34,0,0.18)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

Table 36:Winning classifier actions π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) at t=500t=500 and t=1000t=1000 for Economy A2.1

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=500t=500)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(0,0,0)(0,0,0)

(1,0,1)(1,0,1)

(0,0,0)(0,0,0)

i=2i=2

(0,1,0)(0,1,0)

(0,1,0)(0,1,0)

(1,1,0)(1,1,0)

i=3i=3

(1,0,1)(1,0,1)

(1,1,1)(1,1,1)

(1,0,1)(1,0,1)

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=1000t=1000)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(0,0,0)(0,0,0)

(1,0,1)(1,0,1)

(0,0,0)(0,0,0)

i=2i=2

(0,1,0)(0,1,0)

(0,1,0)(0,1,0)

(1,1,0)(1,1,0)

i=3i=3

(1,0,1)(1,0,1)

(1,1,1)(1,1,1)

(1,0,1)(1,0,1)

Table 34 depicts a pattern of holdings characteristic of a fundamental equilibrium, not a speculative one. Compare with the theoretical predictions of Table 31Table 32. Recall that a crucial difference between a speculative and a fundamental equilibrium is that in the former, type I agents are willing to exchange good 2 for good 3, while in the latter they are not. Table 36 shows that type I agents are willing to exchange good 2 for 3, i.e., π~ite(jk2)=(1,0,1)\tilde{\pi}_{it}^e(jk|2) = (1,0,1) for i=Ii=\text{I}. Then how is it that the distribution of holding patterns fails to support a speculative equilibrium? The answer is to be found in the consumption classifiers of type I agents, which are too general (have too many #\text{\#} signs) to distinguish among goods stored. It happens that the winning classifiers embody levels of generality that result in overconsumption of good 3 and a failure to transmit the information from the consumption classifier to the exchange classifier that would be required to support a speculative equilibrium.[13]

We shall only briefly summarize the results for Economy A2.2 with random generation of initial classifiers, and refer the reader to the working paper for tables giving the numerical results. After 1000 iterations the economy had not converged to a steady state. Trading patterns were closer to the fundamental equilibrium than to the speculative equilibrium in period 1000. (The reverse was true in period 500.) However, the classifier system of type I agents did distinguish between storage of different goods for consumption, and the classifier c#,01c^1_{\text{\#},0} was slowly gaining strength over time. Nevertheless, our simulations provided no support for a hope that the economy would converge to the speculative equilibrium if it were to last longer.

While the results for Economy A2 are fairly inconclusive, they illustrate two interesting points. First, they expose some deficiencies of the genetic algorithm which we have used (we return to this in 8. Conclusions). Second, they raise the issue of whether our artificially intelligent agents are too impatient.

Patience requires experience. The transfer system inside the classifier system is designed to converge to a set of long-run average strengths. In the limit, the artificially intelligent agents should behave as long-run average payoff maximizers, since the steady-state strengths weight payoffs by their relative frequencies. It takes time, however, for optimal rules to achieve the desired strengths. The behavior of our artificially intelligent agents can be very myopic at the beginning. In economies, such as Economy A2, in which the nature of the equilibrium changes with the discount rate, this early myopia might have a perverse effect in diverting the economy towards a low discount-factor stationary equilibrium, such as the fundamental equilibrium. The underlying algorithm has to provide enough experimentation to avoid an early ‘lock in’.[14][15] The present algorithm seems defective in that it has too little experimentation to support the speculative equilibrium even in the long simulations we have run.

7.3 Economy B

Table 37:Parameter values used for Economy B.1

Parameter

Value

No. of agents

Ai=50A_i = 50

No. of classifiers

Ea=72,Ca=12E_a = 72, C_a = 12

Storage costs

s1=1,s2=4,s3=9s_1 = 1, s_2 = 4, s_3 = 9

Utility

ui=100,iu_i = 100, \forall i

Initial strengths

Seτea(0)a=0,Scτca(0)a=0S_{e\tau_e^a(0)}^a = 0, S_{c\tau_c^a(0)}^a = 0

Bids

b11=0.25,b12=0.25,b21=0.25,b22=0.25b_{11} = 0.25, b_{12} = 0.25, b_{21} = 0.25, b_{22} = 0.25

Economy B has a different production pattern than Economy A (I produces 3, II produces 1, and III produces 2). For the specified parameters two stationary equilibria are possible: fundamental and speculative. Stability tests based on classifier systems are potentially useful as selection criteria in economies with multiple equilibria. Table 38 and Table 41 summarize the trading patterns for the fundamental and the speculative equilibrium, respectively.

Exchange patterns in Economy B: Fundamental Equilibrium (left) and Speculative Equilibrium (right).

Figure 7:Exchange patterns in Economy B: Fundamental Equilibrium (left) and Speculative Equilibrium (right).

Table 38:Fundamental equilibrium: probability that ii holds jj, πih(j)\pi_i^h(j), for Economy B

πih(j)\pi_i^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

0

0.293

0.707

i=2i=2

1

0

0

i=3i=3

0.586

0.414

0

Table 39:Fundamental equilibrium: joint exchange probability πie(jk)\pi_i^e(jk) for Economy B

πie(jk)\pi_i^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(0,0,0)(0,0,0)

(0.098,[0,0.069],0)(0.098,[0,0.069],0)

(0.138,0.098,[0,0.167])(0.138,0.098,[0,0.167])

i=2i=2

([0,0.529],0.236,0)([0,0.529],0.236,0)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

i=3i=3

([0,0.310],0,0.138)([0,0.310],0,0.138)

(0.138,[0,0.098],0.098)(0.138,[0,0.098],0.098)

(0,0,0)(0,0,0)

Table 40:Fundamental equilibrium exchange strategies π~ie(jkj)\tilde{\pi}_i^e(jk|j) for Economy B

π~ie(jkj)\tilde{\pi}_i^e(jk|j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

([0,1],0,0)?([0,1],0,0)?

(1,[0,1],0)(1,[0,1],0)

(1,1,[0,1])(1,1,[0,1])

i=2i=2

([0,1],1,0)([0,1],1,0)

(0,[0,1],0)?(0,[0,1],0)?

(1,1,[0,1])?(1,1,[0,1])?

i=3i=3

([0,1],0,1)([0,1],0,1)

(1,[0,1],1)(1,[0,1],1)

(0,0,[0,1])?(0,0,[0,1])?

Table 41:Speculative equilibrium: probability that ii holds jj, πih(j)\pi_i^h(j), for Economy B

πih(j)\pi_i^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

0

0.586

0.414

i=2i=2

0.707

0

0.293

i=3i=3

0

1

0

Table 42:Speculative equilibrium: joint exchange probability πie(jk)\pi_i^e(jk) for Economy B

πie(jk)\pi_i^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(0,0,0)(0,0,0)

(0.138,[0,0.310],0)(0.138,[0,0.310],0)

(0.098,0.138,[0,0.098])(0.098,0.138,[0,0.098])

i=2i=2

([0,0.167],0.138,0.098)([0,0.167],0.138,0.098)

(0,0,0)(0,0,0)

(0,0.098,[0,0.069])(0,0.098,[0,0.069])

i=3i=3

(0,0,0)(0,0,0)

(0,[0,0.529],0.138)(0,[0,0.529],0.138)

(0,0,0)(0,0,0)

Table 43:Speculative equilibrium exchange strategies π~ie(jkj)\tilde{\pi}_i^e(jk|j) for Economy B

π~ie(jkj)\tilde{\pi}_i^e(jk|j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

([0,1],0,0)?([0,1],0,0)?

(1,[0,1],0)(1,[0,1],0)

(1,1,[0,1])(1,1,[0,1])

i=2i=2

([0,1],1,1)([0,1],1,1)

(0,[0,1],0)?(0,[0,1],0)?

(0,1,[0,1])(0,1,[0,1])

i=3i=3

([0,1],0,1)?([0,1],0,1)?

(0,[0,1],1)(0,[0,1],1)

(0,0,[0,1])?(0,0,[0,1])?

Economy B.1 displays an interesting pattern of evolution. At iteration 500 the distribution of holdings and, especially, the trading patterns correspond to the speculative equilibrium. However, the economy moves away from this state and by iteration 1000 has practically converged to the fundamental equilibrium.

Table 44:Frequency with which ii holds jj at t=500t=500 and t=1000t=1000 for Economy B.1

πith(j)\pi_{it}^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=500t=500)

0

0.464

0.536

i=2i=2 (t=500t=500)

0.994

0

0.006

i=3i=3 (t=500t=500)

0

1

0

i=1i=1 (t=1000t=1000)

0

0.28

0.72

i=2i=2 (t=1000t=1000)

0.994

0

0.006

i=3i=3 (t=1000t=1000)

0.526

0.474

0

Table 45:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=500t=500 and t=1000t=1000 for Economy B.1

πite(jk)\pi_{it}^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=500t=500)

(0,0,0)(0,0,0)

(0.16,0.05,0.01)(0.16,0.05,0.01)

(0.01,0.19,0.04)(0.01,0.19,0.04)

i=2i=2 (t=500t=500)

(0,0.16,0.01)(0,0.16,0.01)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

i=3i=3 (t=500t=500)

(0,0,0)(0,0,0)

(0,0,0.18)(0,0,0.18)

(0,0,0)(0,0,0)

i=1i=1 (t=1000t=1000)

(0,0,0)(0,0,0)

(0.08,0.02,0.01)(0.08,0.02,0.01)

(0.15,0.11,0.07)(0.15,0.11,0.07)

i=2i=2 (t=1000t=1000)

(0.41,0.25,0.01)(0.41,0.25,0.01)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

i=3i=3 (t=1000t=1000)

(0.11,0.01,0.15)(0.11,0.01,0.15)

(0.18,0.03,0.10)(0.18,0.03,0.10)

(0,0,0)(0,0,0)

Table 46:Winning classifier actions π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) at t=500t=500 and t=1000t=1000 for Economy B.1

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=500t=500)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(1,0,0)(1,0,0)

(1,1,1)(1,1,1)

(1,1,1)(1,1,1)

i=2i=2

(0,1,0)(0,1,0)

(1,1,0)(1,1,0)

(1,1,1)(1,1,1)

i=3i=3

(0,0,1)(0,0,1)

(0,0,1)(0,0,1)

(0,0,0)(0,0,0)

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=1000t=1000)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(1,0,0)(1,0,0)

(1,0,0)(1,0,0)

(1,1,1)(1,1,1)

i=2i=2

(1,1,1)(1,1,1)

(1,1,0)(1,1,0)

(1,1,0)(1,1,0)

i=3i=3

(1,0,1)(1,0,1)

(1,1,1)(1,1,1)

(0,0,0)(0,0,0)

Table 47:Consumption frequency πitc(jj)\pi_{it}^c(j|j) at t=500t=500 for Economy B.1

πitc(jj)\pi_{it}^c(j|j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

1

0.07

0.508

i=2i=2

0.009

1

0.998

i=3i=3

0.050

0.632

1

Economy B.2 with random initial classifiers had not converged after 2000 periods. Nevertheless, the economy seems to be moving towards the fundamental equilibrium.

Table 48:Parameter values used for Economy B.2

Parameter

Value

No. of agents

Ai=50A_i = 50

No. of classifiers

Ea=72,Ca=12E_a = 72, C_a = 12

Storage costs

s1=1,s2=4,s3=9s_1 = 1, s_2 = 4, s_3 = 9

Utility

ui=100,iu_i = 100, \forall i

Initial strengths

Seτea(0)a=0,Scτca(0)a=0S_{e\tau_e^a(0)}^a = 0, S_{c\tau_c^a(0)}^a = 0

Bids

b11=0.025,b12=0.025,b21=0.25,b22=0.25b_{11} = 0.025, b_{12} = 0.025, b_{21} = 0.25, b_{22} = 0.25

Specialization

fs(t)=1/2t,ps=0.01f_s(t) = 1/2\sqrt{t}, p_s = 0.01

Generalization

fg(t)=1/2t,p1=0.2,p2=0.7,p3=0.2,p4=0.5,S=0,Ne=8,Nc=4f_g(t) = 1/2\sqrt{t}, p_1 = 0.2, p_2 = 0.7, p_3 = 0.2, p_4 = 0.5, S = 0, N_e = 8, N_c = 4

Table 49:Frequency with which ii holds jj at t=1000t=1000 and t=2000t=2000 for Economy B.2

πith(j)\pi_{it}^h(j)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=1000t=1000)

0

0.434

0.566

i=2i=2 (t=1000t=1000)

0.89

0

0.11

i=3i=3 (t=1000t=1000)

0.088

0.912

0

i=1i=1 (t=2000t=2000)

0

0.354

0.646

i=2i=2 (t=2000t=2000)

0.996

0

0.004

i=3i=3 (t=2000t=2000)

0.268

0.732

0

Table 50:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=1000t=1000 and t=2000t=2000 for Economy B.2

πite(jk)\pi_{it}^e(jk)

j=1j=1

j=2j=2

j=3j=3

i=1i=1 (t=1000t=1000)

(0,0,0)(0,0,0)

(0.14,0.08,0.02)(0.14,0.08,0.02)

(0.18,0.18,0.05)(0.18,0.18,0.05)

i=2i=2 (t=1000t=1000)

(0.14,0.41,0.19)(0.14,0.41,0.19)

(0,0,0)(0,0,0)

(0.03,0.05,0.1)(0.03,0.05,0.1)

i=3i=3 (t=1000t=1000)

(0.01,0,0.02)(0.01,0,0.02)

(0.26,0.16,0.20)(0.26,0.16,0.20)

(0,0,0)(0,0,0)

i=1i=1 (t=2000t=2000)

(0,0,0)(0,0,0)

(0.14,0.04,0.01)(0.14,0.04,0.01)

(0.06,0.16,0.07)(0.06,0.16,0.07)

i=2i=2 (t=2000t=2000)

(0.13,0.36,0)(0.13,0.36,0)

(0,0,0)(0,0,0)

(0,0,0)(0,0,0)

i=3i=3 (t=2000t=2000)

(0.04,0,0.06)(0.04,0,0.06)

(0.22,0.12,0.16)(0.22,0.12,0.16)

(0,0,0)(0,0,0)

Table 51:Winning classifier actions π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) at t=1000t=1000 and t=2000t=2000 for Economy B.2

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=1000t=1000)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(,0,)?(\text{—},0,\text{—})?

(1,1,1)(1,1,1)

(1,1,0)(1,1,0)

i=2i=2

(1,1,1)(1,1,1)

(,,)(\text{—},\text{—},\text{—})

(1,1,1)(1,1,1)

i=3i=3

(0,1,1)(0,1,1)

(,,1)(\text{—},\text{—},1)

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=2000t=2000)

j=1j=1

j=2j=2

j=3j=3

i=1i=1

(,1,)?(\text{—},1,\text{—})?

(1,1,0)(1,1,0)

(1,1,0)(1,1,0)

i=2i=2

(1,0,1)(1,0,1)

(,,)(\text{—},\text{—},\text{—})

(1,1,1)(1,1,1)

i=3i=3

(1,0,1)(1,0,1)

(1,,1)(1,\text{—},1)

In particular, with the exception of a zero-frequency event (II is willing to trade 1 for 3) the trading patterns in Economy B.2 correspond to the fundamental equilibrium.

These two simulations for Economy B provide examples in which the classifier systems seem to select the fundamental equilibrium over the speculative equilibrium.[16] Furthermore, the results for Economy B.1 indicate that this is not the result of myopic behavior.

7.4 Economy C (Fiat Money)

Table 52:Parameter values used for Economy C

Parameter

Value

No. of agents

Ai=50A_i = 50

No. of classifiers

Ea=150,Ca=20E_a = 150, C_a = 20

Storage costs

s1=9,s2=14,s3=29,s0=0s_1 = 9, s_2 = 14, s_3 = 29, s_0 = 0

Utility

ui=100,iu_i = 100, \forall i

Initial strengths

Seτea(0)a=0,Scτca(0)a=0S_{e\tau_e^a(0)}^a = 0, S_{c\tau_c^a(0)}^a = 0

Bids

b11=0.025,b12=0.025,b21=0.025,b22=0.25b_{11} = 0.025, b_{12} = 0.025, b_{21} = 0.025, b_{22} = 0.25

Specialization

fs(t)=1/2t,ps=0.01f_s(t) = 1/2\sqrt{t}, p_s = 0.01

Generalization

fg(t)=1/2t,p1=0.2,p2=0.7,p3=0.2,p4=0.5,S=0,Ne=8,Nc=4f_g(t) = 1/2\sqrt{t}, p_1 = 0.2, p_2 = 0.7, p_3 = 0.2, p_4 = 0.5, S = 0, N_e = 8, N_c = 4

In Economy C a new good, good 0, with the characteristics of ‘fiat money’ is introduced. No agent derives utility from consuming good 0. There are no storage costs associated with good 0.

Fiat money is introduced into the system by forcing some agents to store good 0 in period 0. In particular, 48 units of good 0 are randomly allocated to 48 agents. To avoid fluctuations in the quantity of money, agents are not allowed to consume good 0. A modified setup, not pursued here, would give agents the opportunity to learn not to eat money.

The production technologies are as in Economy A (see Table 8). Table 53 provides the probabilities that characterize the unique (fundamental) equilibrium of Economy C. The equilibrium trading patterns are depicted in Figure 8.

Table 53:Fundamental equilibrium: probability that ii holds jj, πih(j)\pi_i^h(j), for Economy C

πih(j)\pi_i^h(j)

i=1i=1

i=2i=2

i=3i=3

j=1j=1

0

0.26

0.62

j=2j=2

0.74

0

0

j=3j=3

0

0.42

0

j=0j=0

0.26

0.32

0.38

Table 54:Fundamental equilibrium: joint exchange probability πie(jk)\pi_i^e(jk) for Economy C

πie(jk)\pi_i^e(jk)

i=1i=1

i=2i=2

i=3i=3

j=1j=1

(0,0,0,0)(0,0,0,0)

([0,0.08],0.064,0,0.023)([0,0.08],0.064,0,0.023)

([0,0.18],0,0.087,0.034)([0,0.18],0,0.087,0.034)

j=2j=2

(0.064,[0,0.08],0,0.079)(0.064,[0,0.08],0,0.079)

(0,0,0,0)(0,0,0,0)

(0,0,0,0)(0,0,0,0)

j=3j=3

(0,0,0,0)(0,0,0,0)

(0.087,0,[0,0.06],0.053)(0.087,0,[0,0.06],0.053)

(0,0,0,0)(0,0,0,0)

j=0j=0

(0.076,0,0,[0,0.08])(0.076,0,0,[0,0.08])

(0,0.079,0,[0,0.1])(0,0.079,0,[0,0.1])

(0,0,0.053,[0,0.12])(0,0,0.053,[0,0.12])

Table 55:Fundamental equilibrium exchange strategies π~ie(jkj)\tilde{\pi}_i^e(jk|j) for Economy C

π~ie(jkj)\tilde{\pi}_i^e(jk|j)

i=1i=1

i=2i=2

i=3i=3

j=1j=1

(,0,0,0)?(\text{—},0,0,0)?

(,1,0,1)(\text{—},1,0,1)

(,0,1,1)(\text{—},0,1,1)

j=2j=2

(1,,0,1)(1,\text{—},0,1)

(0,,0,0)?(0,\text{—},0,0)?

(1,,1,1)?(1,\text{—},1,1)?

j=3j=3

(1,1,,1)?(1,1,\text{—},1)?

(1,1,,1)(1,1,\text{—},1)

(0,0,,0)?(0,0,\text{—},0)?

j=0j=0

(1,0,0,)(1,0,0,\text{—})

(0,1,0,)(0,1,0,\text{—})

(0,0,1,)(0,0,1,\text{—})

Exchange pattern in Economy C.

Figure 8:Exchange pattern in Economy C.

Table 56:Frequency with which ii holds jj at t=750t=750 and t=1250t=1250 for Economy C

πith(j)\pi_{it}^h(j)

i=1i=1

i=2i=2

i=3i=3

j=1j=1 (t=750t=750)

0

0.18

0.58

j=2j=2 (t=750t=750)

0.68

0

0.01

j=3j=3 (t=750t=750)

0

0.58

0

j=0j=0 (t=750t=750)

0.32

0.23

0.41

j=1j=1 (t=1250t=1250)

0

0.18

0.77

j=2j=2 (t=1250t=1250)

0.54

0

0.01

j=3j=3 (t=1250t=1250)

0

0.53

0

j=0j=0 (t=1250t=1250)

0.46

0.28

0.21

Table 57:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=750t=750 for Economy C

πite(jk)\pi_{it}^e(jk)

i=1i=1

i=2i=2

i=3i=3

j=1j=1

(0,0,0,0)(0,0,0,0)

(0,0.052,0.004,0.008)(0,0.052,0.004,0.008)

(0.008,0.008,0.112,0.06)(0.008,0.008,0.112,0.06)

j=2j=2

(0.06,0,0,0.062)(0.06,0,0,0.062)

(0,0,0,0)(0,0,0,0)

(0,0,0.002,0.004)(0,0,0.002,0.004)

j=3j=3

(0,0,0,0)(0,0,0,0)

(0.116,0.002,0.052,0.048)(0.116,0.002,0.052,0.048)

(0,0,0,0)(0,0,0,0)

j=0j=0

(0.064,0,0,0)(0.064,0,0,0)

(0,0.058,0,0)(0,0.058,0,0)

(0.004,0.008,0.048,0.044)(0.004,0.008,0.048,0.044)

Table 58:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=1250t=1250 for Economy C

πite(jk)\pi_{it}^e(jk)

i=1i=1

i=2i=2

i=3i=3

j=1j=1

(0,0,0,0)(0,0,0,0)

(0.016,0.028,0.002,0.028)(0.016,0.028,0.002,0.028)

(0,0.016,0.028,0.002)(0,0.016,0.028,0.002)

j=2j=2

(0.044,0,0,0.066)(0.044,0,0,0.066)

(0,0,0,0)(0,0,0,0)

(0,0,0,0)(0,0,0,0)

j=3j=3

(0,0,0,0)(0,0,0,0)

(0.014,0,0.044,0.044)(0.014,0,0.044,0.044)

(0,0,0,0)(0,0,0,0)

j=0j=0

(0.06,0,0,0)(0.06,0,0,0)

(0,0.066,0,0)(0,0.066,0,0)

(0,0,0.044,0)(0,0,0.044,0)

Table 59:Winning classifier actions π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) at t=750t=750 and t=1250t=1250 for Economy C

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=750t=750)

i=1i=1

i=2i=2

i=3i=3

j=1j=1

(,,,)(\text{—},\text{—},\text{—},\text{—})

(1,1,0,1)(1,1,0,1)

(0,0,1,0)(0,0,1,0)

j=2j=2

(1,0,0,1)(1,0,0,1)

(,,,)(\text{—},\text{—},\text{—},\text{—})

(0,0,1,1)(0,0,1,1)

j=3j=3

(0,0,0,0)(0,0,0,0)

(1,1,0,1)(1,1,0,1)

(0,0,0,1)(0,0,0,1)

j=0j=0

(1,0,0,0)(1,0,0,0)

(0,1,0,0)(0,1,0,0)

(0,0,1,1)(0,0,1,1)

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) (t=1250t=1250)

i=1i=1

i=2i=2

i=3i=3

j=1j=1

(,,,)(\text{—},\text{—},\text{—},\text{—})

(1,1,1,1)(1,1,1,1)

(0,0,1,0)(0,0,1,0)

j=2j=2

(1,0,0,1)(1,0,0,1)

(,,,)(\text{—},\text{—},\text{—},\text{—})

(0,0,0,0)(0,0,0,0)

j=3j=3

(0,1,0,0)(0,1,0,0)

(1,1,0,1)(1,1,0,1)

(,,,)(\text{—},\text{—},\text{—},\text{—})

j=0j=0

(1,0,0,0)(1,0,0,0)

(0,1,0,0)(0,1,0,0)

(0,0,1,0)(0,0,1,0)

The economy converges remarkably fast to the fundamental equilibrium. Given the complexity of the exchange patterns with fiat money, the results for Economy C indicate the ability of our artificially intelligent agents to set up complex social arrangements like fiat money.[17]

7.5 Economy D (Five Goods, Five Types)

Table 60:Parameter values used for Economy D

Parameter

Value

No. of agents

Ai=50A_i = 50

No. of classifiers

Ea=180,Ca=20E_a = 180, C_a = 20

Storage costs

s1=1,s2=4,s3=9,s4=16,s5=30s_1 = 1, s_2 = 4, s_3 = 9, s_4 = 16, s_5 = 30

Utility

ui=200,iu_i = 200, \forall i

Initial strengths

Seτea(0)a=0,Scτca(0)a=0S_{e\tau_e^a(0)}^a = 0, S_{c\tau_c^a(0)}^a = 0

Bids

b11=0.025,b12=0.025,b21=0.25,b22=0.25b_{11} = 0.025, b_{12} = 0.025, b_{21} = 0.25, b_{22} = 0.25

Specialization

fs(t)=1/2t,ps=0.01f_s(t) = 1/2\sqrt{t}, p_s = 0.01

Generalization

fg(t)=1/2t,p1=0.2,p2=0.7,p3=0.2,p4=0.5,S=0,Ne=8,Nc=4f_g(t) = 1/2\sqrt{t}, p_1 = 0.2, p_2 = 0.7, p_3 = 0.2, p_4 = 0.5, S = 0, N_e = 8, N_c = 4

In Economy D we enhance complexity by considering five goods and five types. The production technologies are described by Figure 9. That is, type I produces good 3, type II produces good 4, type III produces good 5, type IV produces good 1, and type V produces good 2. As before, each type of agent only derives utility from consuming the good of the same number. Storage costs are ranked in increasing order. For this economy we do not start with any characterization of a stationary equilibrium. With enough work, one could obtain an analytic solution for the fundamental equilibrium for such an economy, but here our purpose is to let our artificially intelligent agents suggest to us what an equilibrium might be. Given this suggestion, we could then presumably verify whether it is an equilibrium.

Production patterns in Economy D; an arrow shows that the agent at the origin of the arrow produces the good at the point of the arrow.

Figure 9:Production patterns in Economy D; an arrow shows that the agent at the origin of the arrow produces the good at the point of the arrow.

Table 61:Frequency with which ii holds jj at t=500t=500 for Economy D

πith(j)\pi_{it}^h(j)

j=1j=1

j=2j=2

j=3j=3

j=4j=4

j=5j=5

i=1i=1

0

0.336

0.626

0.032

0.006

i=2i=2

0.322

0

0.038

0.614

0.026

i=3i=3

0.118

0.180

0

0.054

0.648

i=4i=4

0.922

0.044

0.002

0

0.032

i=5i=5

0.210

0.720

0.038

0.032

0

Table 62:Frequency with which ii holds jj at t=1750t=1750 for Economy D

πith(j)\pi_{it}^h(j)

j=1j=1

j=2j=2

j=3j=3

j=4j=4

j=5j=5

i=1i=1

0

0.832

0.158

0.004

0.006

i=2i=2

0.096

0

0.004

0.870

0.030

i=3i=3

0.048

0.418

0

0.052

0.482

i=4i=4

0.988

0.008

0.002

0

0.002

i=5i=5

0

0.974

0.024

0.002

0

Table 63:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=500t=500 for Economy D

πite(jk)\pi_{it}^e(jk)

j=1j=1

j=2j=2

j=3j=3

j=4j=4

j=5j=5

i=1i=1

(0,0,0,0,0)(0,0,0,0,0)

(0.03,0.02,0.01,0.01,0)(0.03,0.02,0.01,0.01,0)

(0.06,0.05,0.03,0,0)(0.06,0.05,0.03,0,0)

(0.01,0,0,0,0)(0.01,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

i=2i=2

(0.08,0.11,0.04,0.01,0.02)(0.08,0.11,0.04,0.01,0.02)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

(0.17,0.04,0.01,0.03,0.03)(0.17,0.04,0.01,0.03,0.03)

(0,0,0,0,0)(0,0,0,0,0)

i=3i=3

(0.04,0,0.02,0.02,0)(0.04,0,0.02,0.02,0)

(0.03,0,0.01,0.02,0.01)(0.03,0,0.01,0.02,0.01)

(0,0,0,0,0)(0,0,0,0,0)

(0.01,0,0,0.01,0)(0.01,0,0,0.01,0)

(0.09,0.09,0.01,0.03,0.03)(0.09,0.09,0.01,0.03,0.03)

i=4i=4

(0.24,0.04,0,0.14,0.03)(0.24,0.04,0,0.14,0.03)

(0,0,0,0,0.01)(0,0,0,0,0.01)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

i=5i=5

(0.03,0,0,0.03,0.03)(0.03,0,0,0.03,0.03)

(0.09,0.02,0.03,0,0.08)(0.09,0.02,0.03,0,0.08)

(0,0,0,0.01,0.01)(0,0,0,0.01,0.01)

(0.01,0,0,0,0)(0.01,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

Table 64:Exchange frequency πite(jk)\pi_{it}^e(jk) at t=1750t=1750 for Economy D

πite(jk)\pi_{it}^e(jk)

j=1j=1

j=2j=2

j=3j=3

j=4j=4

j=5j=5

i=1i=1

(0,0,0,0,0)(0,0,0,0,0)

(0.03,0.17,0.01,0.01,0)(0.03,0.17,0.01,0.01,0)

(0,0.05,0,0,0)(0,0.05,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

i=2i=2

(0.01,0.04,0,0,0)(0.01,0.04,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

(0.18,0.04,0,0.10,0.03)(0.18,0.04,0,0.10,0.03)

(0,0,0,0,0)(0,0,0,0,0)

i=3i=3

(0.01,0.01,0,0.01,0)(0.01,0.01,0,0.01,0)

(0.01,0.05,0.02,0.04,0.02)(0.01,0.05,0.02,0.04,0.02)

(0,0,0,0,0)(0,0,0,0,0)

(0.01,0,0,0,0)(0.01,0,0,0,0)

(0,0.1,0,0.03,0.03)(0,0.1,0,0.03,0.03)

i=4i=4

(0.09,0.03,0,0.19,0)(0.09,0.03,0,0.19,0)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

i=5i=5

(0,0,0,0,0)(0,0,0,0,0)

(0.03,0.26,0.04,0,0.07)(0.03,0.26,0.04,0,0.07)

(0.01,0,0,0,0)(0.01,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

(0,0,0,0,0)(0,0,0,0,0)

Table 65:Winning classifier actions π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) at t=500t=500 for Economy D

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j)

j=1j=1

j=2j=2

j=3j=3

j=4j=4

j=5j=5

i=1i=1

(1,,,,0)(1,\text{—},\text{—},\text{—},0)

(1,1,1,0,0)(1,1,1,0,0)

(1,0,1,0,0)(1,0,1,0,0)

(1,1,1,1,)(1,1,1,1,\text{—})

(0,,,0,1)(0,\text{—},\text{—},0,1)

i=2i=2

(1,1,1,1,0)(1,1,1,1,0)

(1,1,1,,)(1,1,1,\text{—},\text{—})

(1,1,1,0,0)(1,1,1,0,0)

(1,1,1,0,0)(1,1,1,0,0)

(1,1,1,1,)(1,1,1,1,\text{—})

i=3i=3

(1,0,1,1,0)(1,0,1,1,0)

(1,0,1,1,1)(1,0,1,1,1)

(1,,1,1,)(1,\text{—},1,1,\text{—})

(1,,1,1,1)(1,\text{—},1,1,1)

(1,1,1,1,0)(1,1,1,1,0)

i=4i=4

(1,0,0,1,1)(1,0,0,1,1)

(1,,0,1,1)(1,\text{—},0,1,1)

(1,,1,1,)(1,\text{—},1,1,\text{—})

(1,,1,1,)(1,\text{—},1,1,\text{—})

(0,1,1,1,)(0,1,1,1,\text{—})

i=5i=5

(1,0,0,1,1)(1,0,0,1,1)

(1,0,0,0,1)(1,0,0,0,1)

(1,0,0,1,1)(1,0,0,1,1)

(1,1,1,1,1)(1,1,1,1,1)

(,,,,1)(\text{—},\text{—},\text{—},\text{—},1)

Table 66:Winning classifier actions π~ite(jkj)\tilde{\pi}_{it}^e(jk|j) at t=1750t=1750 for Economy D

π~ite(jkj)\tilde{\pi}_{it}^e(jk|j)

j=1j=1

j=2j=2

j=3j=3

j=4j=4

j=5j=5

i=1i=1

(1,,,,0)(1,\text{—},\text{—},\text{—},0)

(1,1,0,1,0)(1,1,0,1,0)

(1,1,0,0,0)(1,1,0,0,0)

(1,1,1,1,1)(1,1,1,1,1)

(0,1,1,1,0)(0,1,1,1,0)

i=2i=2

(1,1,1,0,0)(1,1,1,0,0)

(,,,,)(\text{—},\text{—},\text{—},\text{—},\text{—})

(1,1,0,,1)(1,1,0,\text{—},1)

(1,1,1,0,0)(1,1,1,0,0)

(1,1,1,1,1)(1,1,1,1,1)

i=3i=3

(1,0,1,1,)(1,0,1,1,\text{—})

(0,1,1,0,0)(0,1,1,0,0)

(,,,,)(\text{—},\text{—},\text{—},\text{—},\text{—})

(0,0,1,0,)(0,0,1,0,\text{—})

(1,1,1,1,0)(1,1,1,1,0)

i=4i=4

(0,0,0,1,0)(0,0,0,1,0)

(1,0,1,1,0)(1,0,1,1,0)

(0,0,0,1,0)(0,0,0,1,0)

(,,,,)(\text{—},\text{—},\text{—},\text{—},\text{—})

(1,1,1,1,1)(1,1,1,1,1)

i=5i=5

(1,0,,0,1)(1,0,\text{—},0,1)

(1,1,1,0,1)(1,1,1,0,1)

(0,1,0,1,1)(0,1,0,1,1)

(1,0,,1,1)(1,0,\text{—},1,1)

(,,,,)(\text{—},\text{—},\text{—},\text{—},\text{—})

From the simulation results, we can see that the trading patterns nearly seem to describe a fundamental equilibrium in which agents are only willing to trade for commodities of lower cost than the one currently in storage, except that they always accept the commodity of their type. These fundamental trading patterns are shown in Figure 10. Some speculative moves can be detected. An example of this is that agents of type II accept (from I) good 3 for good 1 in order to trade 3 for 2 with type III. We do not pursue here a full analysis of partially speculative equilibria for this economy.

Exchange patterns in Economy D exhibited by classifier systems.

Figure 10:Exchange patterns in Economy D exhibited by classifier systems.

8. Conclusions

The work described above is presented as the first steps of our project to use the classifier systems of Holland to model learning in multi-agent environments. Classifier systems have previously been applied to solve some complex optimization problems [see Machine Learning (1988)]. Our application has involved some extensions to handle the fact that ours is a multi-agent environment in which agents are solving Markovian dynamic decision problems.

The work described in this paper has accomplished two objectives that are parts of our broader project. First, our simulations have demonstrated by example that multi-agent systems of classifiers can exhibit interesting behavior, and can eventually learn to play Nash-Markov equilibria. Second, the process of making classifier systems cope with the Kiyotaki-Wright environment has prompted us to formalize a variety of concepts and definitions that our subsequent work shall build on. These definitions will facilitate formal analyses of systems of multi-agent classifier systems, which we intend to pursue.

Among the unfinished aspects which we leave for future papers, two interrelated ones are quite important. First, our experiments with the genetic-type algorithms in our incomplete-enumeration classifier systems have convinced us that existing algorithms can be improved. These improvements, which will increase the amount of experimentation that is done at the ‘right times’, will have the effect of improving the capacity of the classifier systems to settle upon optimal outcomes more rapidly, and to avoid becoming locked in to suboptimal patterns of interaction. The improved algorithms will also help with a second major piece of unfinished business, which is to develop analytical results on the convergence of classifier systems. The new algorithms will be set up in such a way that we can apply the stochastic approximation approach to convergence analysis which we alluded to in 5. Concepts of Convergence for Classifier Systems Playing Games.

References

Acknowledgments

This research began with visits by Marimon and Sargent to the Santa Fe Institute. We thank Brian Arthur and John Holland for several helpful discussions about genetic algorithms at the Santa Fe Institute. We also thank Randall Wright and Nancy Stokey for helpful comments on an earlier draft. Sargent’s research was supported by a grant from the National Science Foundation to the National Bureau of Economic Research. Marimon’s research was supported by a grant from the National Science Foundation and by the National Fellows Program at the Hoover Institution. This paper is an abbreviated version of a Hoover Institution working paper with the same title, which is available from the authors upon request.

Footnotes
  1. The way in which we model agents as searching for rewarding decision rules attributes much less knowledge and rationality to them than is typical in the literature on least-squares learning in the context of linear rational-expectations models. See Bray (1982), Bray & Savin (1986), Fourgeaud et al. (1986), and Marcet & Sargent (1989). In the least-squares learning literature, agents are posited to be almost fully rational and knowledgeable about the system they operate within, lacking knowledge only about particular parameters in laws of motion of a set of variables exogenous to their own decisions, which they learn about through the recursive application of least squares. But given their estimates, agents are supposed to compute optimal decision rules by using dynamic programming or the calculus of variations.

  2. When the state and action spaces are small, as in the Kiyotaki-Wright model, the list of all possible rules mapping states into actions is not too long. In such a situation, one can follow Axelrod (1987) and model an agent’s strategy as a single binary bit string. A pure genetic algorithm can then be applied, as in Axelrod (1987). This approach has been applied to the Kiyotaki-Wright model by Marimon and Miller (1989) and by Knez and Litterman (1989). While much easier to implement than classifier systems, this pure genetic approach also has some drawbacks. First, it is difficult to extend to situations involving ‘unforeseen contingencies’ and in which it is not practical to enumerate all of the histories that might be encountered in the course of play. Second, relative to the classifier, the pure genetic algorithm encodes information wastefully in the sense that an entire strategy is encoded, while the experience of play will typically throw up observations only on a small subset of the possible states. While the classifier system concentrates ‘effort’ on learning rules to use in frequently encountered states, the pure genetic algorithm purports to be learning about entire strategies. Third, classifiers can learn general rules that apply to several states and that offer potential informational economies.

  3. A pure genetic algorithm could be applied to the trading decision as follows. For each agent, there are 3×3=93 \times 3 = 9 possible values that his current state prior to trade can attain. There are two possible actions: trade or don’t trade. Letting 1 mean trade and 0 mean don’t trade, a binary string of length 9 can be used to encode a Markovian strategy. This is done by numbering the set of possible states j=1,,9j = 1, \ldots, 9 and then letting the jjth entry in the bit string denote the decision to be made when the jjth state is encountered. The genetic algorithm is then applied to select the best binary string (i.e., the best strategy). This approach, which has been applied to the Kiyotaki-Wright model [see Marimon and Miller (1989) and Knez and Litterman (1989)], has shortcomings as a starting point for studying more complex systems. Because the optimization is done over entire strategies, which include descriptions of decisions to be made in all possible states, even rarely visited ones, the encoding of information is wasteful and does not facilitate more intensive learning about frequently visited states.

  4. See Goldberg (1989) for a description of classifier systems and a survey of some of their applications.

  5. This rule could easily be modified to permit a ‘stochastic auction’ in which the probability of winning the auction varies directly with strength.

  6. We make this specification in order to economize on computer time and space. It is possible that this specification speeds convergence relative to a setting in which each agent aAa \in \mathscr{A} uses his own classifier system. However, relative to a setup in which each agent uses his own classifier system and so can experiment individually, the common classifier setup causes all type ii agents to experiment simultaneously. This feature might delay convergence.

  7. Notice that this condition involves both parameter values (s3s_3, s2s_2, u1u_1) and the quantities (π1h(3)\pi_1^h(3), π1h(2)\pi_1^h(2)) which are endogenous variables that are determined in equilibrium. Kiyotaki and Wright show that when the above inequality is reversed, then the unique stationary rational-expectations equilibrium is a speculative equilibrium. Vis-à-vis the fundamental equilibrium, the only change in trading patterns is that agents of type I ‘speculate’ by holding good 3, which is not too costly because in equilibrium inequality (18) is reversed.

  8. It was considerations from the stochastic-approximation literature that led us to alter Holland’s specification of the laws of motion of strengths so that they measure cumulative average past net rewards, not total rewards. The use of average rewards makes it possible for strengths to converge. Arthur (1989) uses average rewards for strengths for the same reason.

  9. Ljung & Söderström (1983) is a useful reference on stochastic approximation and on the ‘ordinary differential-equations approach’ to proving almost sure convergence. See Marcet & Sargent (1989) and Marcet & Sargent (1989) for some applications in economics.

  10. Brian Arthur and Carl Simon have used stochastic approximation arguments to establish convergence of strengths for a fixed system of two classifiers playing a two-armed bandit. Their arguments do not apply to our system because their system is one in which there are competing classifiers with distinct actions, whose condition parts are identical and are always met.

  11. When ‘—’ appears in the table, it means that the event in question occurred so rarely in our simulation that no reliable frequency can be computed.

  12. More standard Holland algorithms were first tried without much success, prompting us to produce the modified algorithm described in 6. Incomplete Enumeration Classifiers and the ‘Genetic Algorithm’.

  13. To conserve space, these classifiers are not reported here. They are contained in a longer version of this paper which is available from the authors as a working paper.

  14. A fundamental equilibrium exists for Economy A2 only if the discount factor is sufficiently low.

  15. Marimon and Miller (1989) report that in their experiments with the genetic algorithm 25 out of 30 runs of Economy A2 converged to the speculative equilibrium.

  16. In Marimon and Miller (1989) 30 out of 30 experiments resulted in convergence to the fundamental equilibrium.

  17. A version of Economy C with the storage costs of Economy A did not converge, probably because the low cost of storing good 1 (0.1) made this good too good a substitute for fiat money. These results for Economy C have been successfully replicated in Marimon and Miller (1989).

References
  1. Kiyotaki, N., & Wright, R. (1989). On Money as a Medium of Exchange. Journal of Political Economy, 97(4), 927–954.
  2. Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press.
  3. Holland, J. H. (1986). Escaping Brittleness: The Possibilities of General-Purpose Learning Algorithms Applied to Parallel Rule-Based Systems. In R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.), Machine Learning: An Artificial Intelligence Approach (Vol. 2). Morgan Kaufmann.
  4. Machine Learning. (1988). Special Issue on Genetic Algorithms. Machine Learning, 3(2/3).
  5. Bray, M. (1982). Learning, Estimation, and Stability of Rational Expectations. Journal of Economic Theory, 26, 318–339.
  6. Bray, M. M., & Savin, N. E. (1986). Rational Expectations Equilibria, Learning and Model Specification. Econometrica, 54, 1129–1160.
  7. Fourgeaud, C., Gourieroux, C., & Pradel, J. (1986). Learning Procedure and Convergence to Rationality. Econometrica, 54, 845–868.
  8. Marcet, A., & Sargent, T. J. (1989). Convergence of Least Squares Learning Mechanisms in Self Referential Linear Stochastic Models. Journal of Economic Theory, 48, 337–368.
  9. Axelrod, R. (1987). The Evolution of Strategies in the Iterated Prisoner’s Dilemma. In L. Davis (Ed.), Genetic Algorithms and Simulated Annealing. Morgan Kaufmann.
  10. Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley.
  11. Arthur, B. (1989). The Dynamics of Classifier Competitions.
  12. Ljung, L., & Söderström, T. (1983). Theory and Practice of Recursive Identification. MIT Press.
  13. Marcet, A., & Sargent, T. J. (1989). Convergence of Least Squares Learning in Environments with Private Information and Hidden State Variables. Journal of Political Economy, 97, 1306–1322.