Money as a Medium of Exchange in an Economy with Artificially Intelligent Agents - Money as a Medium of Exchange in an Economy with Artificially Intelligent Agents

Replicating Marimon, McGrattan, and Sargent (1990)¶

This notebook replicates the key computational experiments from:

Marimon, R., McGrattan, E., & Sargent, T. J. (1990). “Money as a Medium of Exchange in an Economy with Artificially Intelligent Agents.” Journal of Economic Dynamics and Control, 14, 329-373.

Overview¶

Marimon, McGrattan, and Sargent studied the exchange economies of Kiyotaki and Wright (1989) in which agents must use a commodity or fiat money as a medium of exchange if trade is to occur. While Kiyotaki and Wright studied rational agents playing Nash-Markov equilibria, Marimon, McGrattan, and Sargent replaced rational agents with artificially intelligent agents that use John Holland’s classifier systems to learn decision rules.

The key questions addressed are:

Can artificially intelligent Holland agents learn to play Nash equilibrium strategies?
When multiple equilibria exist (“fundamental” vs. “speculative”), which equilibrium emerges?
Can classifier systems handle more complex economies, including fiat money?

Holland’s Classifier System¶

A classifier system is a collection of if-then rules (classifiers), each encoded as a string in a trinary alphabet $\{0, 1, \#\}$ , where $\#$ means “don’t care.” Each classifier has:

A condition part: specifies which states activate the rule
An action part: specifies the decision to take (e.g., trade or don’t trade)
A strength: a running average of past net rewards, used to select among competing rules

The system selects the highest-strength matching rule via an auction mechanism, and updates strengths via a bucket brigade payment system. A genetic algorithm periodically introduces new rules by combining successful existing ones.

1. The Kiyotaki-Wright Environment¶

There are three types of agents indexed by $i = 1, 2, 3$ . Each type $i$ agent:

Gets utility only from consuming good $i$
Has technology to produce good $i^*$ (where $i^* \neq i$ )
Can store only one unit of one good at a time

Model A production structure (a “Wicksell triangle” — no double coincidence of wants):

Type $i$	Produces $i^*$	Consumes
1	2	1
2	3	2
3	1	3

Storage costs satisfy $s_3 > s_2 > s_1 > 0$ , so good 1 is cheapest to store.

Each period:

Agents are randomly matched into pairs
Each agent decides whether to propose a trade (trade occurs only if both agree)
Each agent decides whether to consume the good they hold (if they consume good $i$ , they get utility $u_i$ and immediately produce good $i^*$ )

Equilibria¶

Fundamental equilibrium: Good 1 (lowest storage cost) serves as the general medium of exchange. Type II agents accept good 1 even though they don’t consume it, using it to trade for good 2.
Speculative equilibrium: Type I agents “speculate” by accepting good 3 (high storage cost) because they expect to easily trade it for good 1. This equilibrium can exist when $u_1$ is sufficiently large relative to $s_3 - s_2$ .

2. Implementation¶

We now implement the classifier system simulation. The implementation is self-contained — all code needed to replicate the paper’s results is in this notebook.

2.1 Imports and Setup¶

import numpy as np
import matplotlib.pyplot as plt
from dataclasses import dataclass, field
from typing import List, Tuple, Dict, Optional
import warnings
warnings.filterwarnings('ignore')

# For reproducibility
np.random.seed(42)

%matplotlib inline
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['figure.dpi'] = 120
plt.rcParams['font.size'] = 11

print("Setup complete.")

Setup complete.

2.2 Economy Configuration¶

We define a configuration class that specifies all the parameters of the Kiyotaki-Wright economy and the classifier system.

@dataclass
class EconomyConfig:
    """
    Configuration for a Kiyotaki-Wright economy with classifier system agents.
    
    Parameters
    ----------
    n_types : int
        Number of agent types (and goods).
    n_agents_per_type : int  
        Number of agents of each type.
    produces : np.ndarray
        produces[i] = good produced by type i after consumption (0-indexed).
    storage_costs : np.ndarray
        storage_costs[k] = per-period cost of storing good k.
    utility : np.ndarray
        utility[i] = utility type i gets from consuming good i.
    n_trade_classifiers : int
        Number of trade classifiers per agent type.
    n_consume_classifiers : int
        Number of consume classifiers per agent type.
    bid_trade : tuple
        (b11, b12) — bid constants for trade classifiers.
    bid_consume : tuple
        (b21, b22) — bid constants for consume classifiers.
    use_complete_enumeration : bool
        If True, start with all possible classifiers (complete enumeration).
    use_genetic_algorithm : bool
        If True, periodically apply the genetic algorithm.
    ga_frequency : float
        Base frequency for GA application (decreases as 1/sqrt(t)).
    ga_pcross : float
        Crossover probability (p₃ in paper, pcrosst in MATLAB). Default 0.6.
    ga_pmutation : float
        Per-bit mutation probability (p₄ in paper, pmutationt in MATLAB). Default 0.01.
    ga_propselect : float
        Proportion of population selected for reproduction (p₁ in paper,
        propselectt in MATLAB). Default 0.2.
    ga_propused : float
        Pre-selection: fraction of most-used rules as parent candidates (p₂ in paper,
        propmostusedt in MATLAB). Default 0.7.
    ga_crowdfactor_trade : int
        Number of crowding candidates for trade classifiers (crowdfactort in MATLAB). Default 8.
    ga_crowdfactor_consume : int
        Number of crowding candidates for consume classifiers (crowdfactorc in MATLAB). Default 4.
    ga_crowdsubpop : float
        Fraction of killable set used as crowding subpopulation (crowdsubpopt in MATLAB). Default 0.5.
    ga_uratio : tuple
        (strength_cutoff, usage_cutoff) for identifying killable rules.
        Rules with strength < uratio[0] OR usage/max_usage < uratio[1] are candidates
        for replacement (uratio in MATLAB). Default (0.0, 0.2).
    """
    name: str = "Economy A1"
    n_types: int = 3
    n_agents_per_type: int = 50
    produces: np.ndarray = field(default_factory=lambda: np.array([1, 2, 0]))  # 0-indexed
    storage_costs: np.ndarray = field(default_factory=lambda: np.array([0.1, 1.0, 20.0]))
    utility: np.ndarray = field(default_factory=lambda: np.array([100.0, 100.0, 100.0]))
    n_trade_classifiers: int = 72
    n_consume_classifiers: int = 12
    bid_trade: tuple = (0.025, 0.025)
    bid_consume: tuple = (0.25, 0.25)
    use_complete_enumeration: bool = True
    use_genetic_algorithm: bool = False
    ga_frequency: float = 0.5
    ga_pcross: float = 0.6
    ga_pmutation: float = 0.01
    ga_propselect: float = 0.2
    ga_propused: float = 0.7
    ga_crowdfactor_trade: int = 8
    ga_crowdfactor_consume: int = 4
    ga_crowdsubpop: float = 0.5
    ga_uratio: tuple = (0.0, 0.2)
    
    @property
    def n_agents(self):
        return self.n_types * self.n_agents_per_type
    
    @property 
    def n_goods(self):
        return self.n_types  # One good per type in the basic model

print("EconomyConfig class defined.")

EconomyConfig class defined.

2.3 The Classifier System¶

Each classifier is a trinary string encoding a (condition, action) pair. For the trade classifier with 3 goods:

Condition: (own good, partner’s good) — each encoded in 2 bits using $\{0, 1, \#\}$
Action: 1 = propose trade, 0 = refuse

The encoding for goods is:

Code	Good
1 0	Good 1
0 1	Good 2
0 0	Good 3
0 #	Not good 1
# 0	Not good 2
# #	Any good

With 6 possible conditions for own good × 6 for partner’s good × 2 actions = 72 possible classifiers for the trade decision. For the consumption decision, there are 6 conditions × 2 actions = 12 possible classifiers.

Strength Update Rules¶

Strengths evolve as cumulative averages of net rewards (a key innovation of this paper):

S_{c,\tau+1} = S_{c,\tau} - \frac{1}{\tau}\left[(1 + b_2(c))S_{c,\tau} - \sum_e I_e b_1(e) S_{e,\tau} - U(\gamma)\right]

(1)

S_{e,\tau+1} = S_{e,\tau} - \frac{1}{\tau}\left[(1 + b_1(e))S_{e,\tau} - \sum_c I_c b_2(c) S_{c,\tau}\right]

(2)

where $U(\gamma)$ is the external payoff from consumption.

class Classifier:
    """
    A single classifier (if-then rule) in the classifier system.
    
    Attributes
    ----------
    condition : np.ndarray
        Condition part (trinary: 0, 1, or -1 for #).
    action : int
        Action part (0 or 1).
    strength : float
        Running average of net rewards.
    n_used : int
        Number of times this classifier has won the auction.
    n_traded : int
        Number of times this classifier's trade action was executed (MATLAB class004.m).
    """
    def __init__(self, condition: np.ndarray, action: int, strength: float = 0.0):
        self.condition = condition.copy()
        self.action = action
        self.strength = strength
        self.n_used = 1  # Initialize counter at 1 (as in the paper)
        self.n_traded = 0  # MATLAB class004.m: n_traded tracks actual trades
    
    def matches(self, state: np.ndarray) -> bool:
        """Check if this classifier's condition matches the given state."""
        for c, s in zip(self.condition, state):
            if c != -1 and c != s:  # -1 is the wildcard (#)
                return False
        return True
    
    @property
    def specificity(self) -> float:
        """Fraction inversely related to number of wildcards."""
        n_wildcards = np.sum(self.condition == -1)
        return 1.0 / (1.0 + n_wildcards)
    
    def bid(self, b1: float, b2: float) -> float:
        """Compute bid = (b1 + b2 * specificity) * strength."""
        return (b1 + b2 * self.specificity) * self.strength
    
    def __repr__(self):
        cond_str = ''.join(['#' if c == -1 else str(int(c)) for c in self.condition])
        return f"Classifier({cond_str} -> {self.action}, S={self.strength:.2f}, n={self.n_used})"


def encode_good(good_idx: int, n_goods: int) -> np.ndarray:
    """
    Encode a good index into binary representation.
    
    For 3 goods with 2-bit encoding:
        Good 0 -> [1, 0]
        Good 1 -> [0, 1]
        Good 2 -> [0, 0]
    """
    if n_goods == 3:
        encodings = np.array([[1, 0], [0, 1], [0, 0]])
    elif n_goods == 4:
        # For fiat money economy: goods 0-2 as before, good 3 (fiat) = [1, 1]
        encodings = np.array([[1, 0], [0, 1], [0, 0], [1, 1]])
    elif n_goods == 5:
        encodings = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1]])
    else:
        # General binary encoding
        n_bits = int(np.ceil(np.log2(max(n_goods, 2))))
        encodings = np.array([list(map(int, format(i, f'0{n_bits}b'))) for i in range(n_goods)])
    return encodings[good_idx]


def decode_good(encoding: np.ndarray, n_goods: int) -> int:
    """Decode a binary encoding back to a good index."""
    for i in range(n_goods):
        if np.array_equal(encoding, encode_good(i, n_goods)):
            return i
    return -1


def generate_all_conditions(n_bits: int) -> List[np.ndarray]:
    """
    Generate all possible conditions in the trinary alphabet {0, 1, #}.
    
    For 2-bit encoding of goods: {(1,0), (0,1), (0,0), (0,#), (#,0), (#,#)}
    These represent the 6 possible conditions per good position.
    """
    if n_bits == 2:
        # The 6 meaningful conditions for 3-good encoding
        return [
            np.array([1, 0]),   # Good 0
            np.array([0, 1]),   # Good 1
            np.array([0, 0]),   # Good 2
            np.array([0, -1]),  # Not good 0 (good 1 or 2)
            np.array([-1, 0]), # Not good 1 (good 0 or 2)
            np.array([-1, -1]) # Any good (wildcard)
        ]
    elif n_bits == 3:
        # For 5-good encoding with 3 bits
        conditions = []
        for b0 in [-1, 0, 1]:
            for b1 in [-1, 0, 1]:
                for b2 in [-1, 0, 1]:
                    conditions.append(np.array([b0, b1, b2]))
        return conditions
    else:
        raise ValueError(f"n_bits={n_bits} not supported")


def create_complete_classifier_set(n_goods: int, classifier_type: str) -> List[Classifier]:
    """
    Create a complete enumeration of all possible classifiers.
    
    Parameters
    ----------
    n_goods : int
        Number of goods in the economy.
    classifier_type : str
        Either 'trade' or 'consume'.
    
    Returns
    -------
    List[Classifier]
        Complete set of classifiers for the given context.
    """
    if n_goods == 3:
        n_bits = 2
    elif n_goods == 4:
        n_bits = 2
    elif n_goods == 5:
        n_bits = 3
    else:
        n_bits = int(np.ceil(np.log2(max(n_goods, 2))))
    
    if classifier_type == 'trade':
        # For trade: condition encodes (own_good, partner_good) = 2 * n_bits bits
        own_conditions = generate_all_conditions(n_bits)
        partner_conditions = generate_all_conditions(n_bits)
        classifiers = []
        for own_cond in own_conditions:
            for part_cond in partner_conditions:
                condition = np.concatenate([own_cond, part_cond])
                for action in [0, 1]:
                    classifiers.append(Classifier(condition, action))
        return classifiers
    elif classifier_type == 'consume':
        conditions = generate_all_conditions(n_bits)
        classifiers = []
        for cond in conditions:
            for action in [0, 1]:
                classifiers.append(Classifier(cond, action))
        return classifiers


def create_random_classifier_set(n_classifiers: int, cond_length: int) -> List[Classifier]:
    """Create a random classifier set with trinary conditions."""
    classifiers = []
    for _ in range(n_classifiers):
        condition = np.random.randint(-1, 2, size=cond_length).astype(float)
        action = np.random.randint(0, 2)
        strength = np.random.rand() * 0.1
        classifiers.append(Classifier(condition, action, strength))
    return classifiers


def create_classifier_replacing_weakest(classifiers, state):
    """Replace the weakest classifier with one matching the current state."""
    weakest_idx = min(range(len(classifiers)), key=lambda i: classifiers[i].strength)
    action = np.random.randint(0, 2)
    classifiers[weakest_idx] = Classifier(state.copy(), action, 0.0)
    return weakest_idx


print("Classifier system building blocks defined.")
print(f"  Classifier: condition/action rule with strength tracking")
print(f"  encode_good/decode_good: binary encoding for goods")
print(f"  create_complete_classifier_set: exhaustive enumeration")
print(f"  create_random_classifier_set: random initialization")

Classifier system building blocks defined.
  Classifier: condition/action rule with strength tracking
  encode_good/decode_good: binary encoding for goods
  create_complete_classifier_set: exhaustive enumeration
  create_random_classifier_set: random initialization

2.4 The Classifier System Agent¶

Each agent type uses two interconnected classifier systems:

Trade classifier: Maps pre-trade state $(x_a, x_{\rho(a)})$ → trade decision $\lambda$
Consume classifier: Maps post-trade holdings $x_a^+$ → consume decision $\gamma$

The auction selects the highest-strength matching classifier, and the bucket brigade passes payments between winning classifiers.

class ClassifierAgent:
    """
    An agent type using classifier systems for trade and consumption decisions.
    
    All agents of the same type share classifier systems (as in the paper,
    to economize on computation).
    
    Parameters
    ----------
    type_id : int
        Agent type (0-indexed).
    config : EconomyConfig
        Economy configuration.
    """
    def __init__(self, type_id: int, config: EconomyConfig):
        self.type_id = type_id
        self.config = config
        self.n_goods = config.n_goods
        
        if self.n_goods <= 3:
            self.n_bits = 2
        elif self.n_goods <= 5:
            self.n_bits = 3
        else:
            self.n_bits = int(np.ceil(np.log2(self.n_goods)))
        
        # Initialize classifier systems
        if config.use_complete_enumeration:
            self.trade_classifiers = create_complete_classifier_set(self.n_goods, 'trade')
            self.consume_classifiers = create_complete_classifier_set(self.n_goods, 'consume')
        else:
            self.trade_classifiers = create_random_classifier_set(
                config.n_trade_classifiers, 2 * self.n_bits)
            self.consume_classifiers = create_random_classifier_set(
                config.n_consume_classifiers, self.n_bits)
        
        # Track last winning consume classifier (for bucket brigade)
        self.last_consume_winner = None
        self.last_consume_bid_frac = 0.0
    
    def get_trade_decision(self, own_good: int, partner_good: int) -> Tuple[int, int]:
        """
        Decide whether to propose a trade.
        
        Returns
        -------
        (action, winner_idx) : tuple
            action = 1 (trade) or 0 (don't trade), and index of winning classifier.
        """
        # Encode state
        own_enc = encode_good(own_good, self.n_goods)
        partner_enc = encode_good(partner_good, self.n_goods)
        state = np.concatenate([own_enc, partner_enc])
        
        # Find matching classifiers
        matches = [(i, c) for i, c in enumerate(self.trade_classifiers) if c.matches(state)]
        
        if not matches:
            # No matching classifier — creation operator (Section 6 / create.m)
            # Replace most redundant or weakest classifier, maintaining constant population
            new_idx = create_classifier_replacing_weakest(self.trade_classifiers, state)
            return self.trade_classifiers[new_idx].action, new_idx
        
        # Diversification: ensure both actions are represented among matches
        # MATLAB class003 (Economy B with GA) does NOT use diversification.
        # Only apply for complete enumeration economies (class001/class002 style).
        if self.config.use_complete_enumeration:
            apply_diversification(self.trade_classifiers, state, self.n_goods)
            # Re-find matches after diversification may have modified classifier list
            matches = [(i, c) for i, c in enumerate(self.trade_classifiers) if c.matches(state)]
        
        # Auction: highest strength wins
        best_idx, best_classifier = max(matches, key=lambda x: x[1].strength)
        return best_classifier.action, best_idx
    
    def get_consume_decision(self, holding: int) -> Tuple[int, int]:
        """
        Decide whether to consume the currently held good.
        
        Returns
        -------
        (action, winner_idx) : tuple
            action = 1 (consume) or 0 (don't consume), and index of winning classifier.
        """
        state = encode_good(holding, self.n_goods)
        matches = [(i, c) for i, c in enumerate(self.consume_classifiers) if c.matches(state)]
        
        if not matches:
            # Creation operator: replace redundant/weak classifier
            new_idx = create_classifier_replacing_weakest(self.consume_classifiers, state)
            return self.consume_classifiers[new_idx].action, new_idx
        
        # Diversification: only for complete enumeration (not for GA economies)
        if self.config.use_complete_enumeration:
            apply_diversification(self.consume_classifiers, state, self.n_goods)
            # Re-find matches after diversification may have modified classifier list
            matches = [(i, c) for i, c in enumerate(self.consume_classifiers) if c.matches(state)]
        
        best_idx, best_classifier = max(matches, key=lambda x: x[1].strength)
        return best_classifier.action, best_idx
    
    def update_strengths(self, trade_winner_idx: int, consume_winner_idx: int,
                         external_payoff: float, trade_happened: bool):
        """
        Update classifier strengths using the bucket brigade algorithm.
        
        Implements equations (10) and (11) from the paper:
        
        Eq (10) — Consumption classifier c:
            S_{c,τ} = S_{c,τ-1} - 1/(τ_c - 1) * [(1+b₂)S_{c,τ-1} - b₁·S_e - U(γ)]
        
        Eq (11) — Exchange classifier e:
            S_{e,τ+1} = S_{e,τ} - 1/τ_e * [(1+b₁)S_{e,τ} - b₂·S_c]
        
        The bucket brigade connects trade and consumption classifiers:
        - The winning consume classifier at t pays bid to winning exchange at t
        - The winning exchange classifier at t pays bid to winning consume at t-1
        - External payoff flows to the winning consume classifier at t
        
        Parameters
        ----------
        trade_winner_idx : int
            Index of winning trade classifier.
        consume_winner_idx : int
            Index of winning consume classifier.
        external_payoff : float
            Net utility from consumption decision.
        trade_happened : bool
            Whether the trade actually occurred (both parties agreed).
        """
        b11, b12 = self.config.bid_trade
        b21, b22 = self.config.bid_consume
        
        e_cls = self.trade_classifiers[trade_winner_idx]
        c_cls = self.consume_classifiers[consume_winner_idx]
        
        # Compute bid fractions
        b1_e = b11 + b12 * e_cls.specificity
        b2_c = b21 + b22 * c_cls.specificity
        
        # --- Exchange classifier update (eq. 11) ---
        # Only update if trade occurred or agent refused trade
        # (don't update if offer was not reciprocated and action was 'trade')
        if trade_happened or e_cls.action == 0:
            tau_e = e_cls.n_used  # Use counter BEFORE incrementing (paper's τ_e(t))
            e_cls.n_used += 1
            
            # Exchange classifier receives payment from consume classifier
            payment_from_consume = b2_c * c_cls.strength
            e_cls.strength = e_cls.strength - (1.0 / tau_e) * (
                (1 + b1_e) * e_cls.strength - payment_from_consume
            )
        
        # --- Consumption classifier update (eq. 10) ---
        tau_c_old = c_cls.n_used  # Counter before increment = τ_c - 1 in paper
        c_cls.n_used += 1
        
        payment_from_exchange = b1_e * e_cls.strength if (trade_happened or e_cls.action == 0) else 0.0
        c_cls.strength = c_cls.strength - (1.0 / tau_c_old) * (
            (1 + b2_c) * c_cls.strength - payment_from_exchange - external_payoff
        )
        
        # --- Pay previous consume classifier from current exchange classifier ---
        # Per the paper: "The bid of the winning exchange classifier at t is paid
        # to the winning consumption classifier at t-1"
        if self.last_consume_winner is not None and (trade_happened or e_cls.action == 0):
            prev_c = self.consume_classifiers[self.last_consume_winner]
            payment = b1_e * e_cls.strength
            prev_c.strength += payment / max(prev_c.n_used, 1)
        
        # Remember this consume classifier for next round's bucket brigade
        self.last_consume_winner = consume_winner_idx
        self.last_consume_bid_frac = b2_c


print("ClassifierAgent class defined.")

ClassifierAgent class defined.

2.5 The Genetic Algorithm¶

When using incomplete enumeration (random initial classifiers), the genetic algorithm periodically evolves the classifier population. Our implementation replicates ga3.m from the original MATLAB code:

Identify killable rules: Classifiers with strength < 0 or usage < 20% of maximum usage are candidates for replacement
Two-stage parent selection (matching MATLAB):
- Stage 1: Pre-select a subset of rules proportional to their usage counts ( $p_2 = 0.7$ fraction)
- Stage 2: Within that subset, select parents by fitness-proportional roulette wheel
Single-point crossover ( $p_3 = 0.6$ ) with ternary cyclic mutation ( $p_4 = 0.01$ )
Crowding-based replacement: Children replace the most similar rule within the killable set (De Jong crowding), preserving population diversity

Additional operators (Section 6 of the paper):

Creation: When no classifier matches a state, the most redundant (or weakest) classifier is replaced with a rule matching that state — maintaining constant population size
Diversification: Ensures both actions are represented among matching classifiers
Specialization: Converts wildcards (#) to specific values with decreasing frequency $f_s(t) = 1/(2\sqrt{t})$

GA parameters from MATLAB winitial.m: $p_1 = 0.2$ , $p_2 = 0.7$ , $p_3 = 0.6$ , $p_4 = 0.01$ , crowding factor = 8 (trade) / 4 (consume).

def apply_genetic_algorithm(classifiers: List[Classifier], 
                            n_pairs: int = None,
                            pcross: float = 0.6,
                            pmutation: float = 0.01,
                            propselect: float = 0.2,
                            propused: float = 0.7,
                            crowding_factor: int = 8,
                            crowding_subpop: float = 0.5,
                            uratio: tuple = (0.0, 0.2)) -> List[Classifier]:
    """
    Apply the genetic algorithm to evolve the classifier population.
    
    Faithfully replicates the ga3.m algorithm from the original MATLAB code:
    1. Identify killable rules (weak strength or low usage)
    2. Two-stage parent selection:
       a. Pre-select subset proportional to usage count (propused fraction)
       b. Fitness-proportional roulette wheel within that subset
    3. Single-point crossover with ternary mutation
    4. Crowding-based replacement within killable set
    
    Parameters
    ----------
    classifiers : list of Classifier
        Current population of classifiers.
    n_pairs : int or None
        Number of mating pairs. If None, computed as round(propselect * len * 0.5).
    pcross : float
        Crossover probability (default 0.6, matching MATLAB pcrosst).
    pmutation : float
        Per-bit mutation probability (default 0.01, matching MATLAB pmutationt).
    propselect : float
        Proportion of classifiers selected for reproduction (p₁=0.2 in paper).
    propused : float
        Proportion of most-used rules as candidates for parent selection (p₂=0.7 in paper).
    crowding_factor : int
        Number of candidates considered in crowding replacement (default 8 for trade, 4 for consume).
    crowding_subpop : float
        Fraction of population used for crowding subpopulation (default 0.5).
    uratio : tuple
        (strength_cutoff, usage_cutoff) for killable rules.
        Rules with strength < uratio[0] OR usage/max_usage < uratio[1] are candidates.
        
    Returns
    -------
    list of Classifier
        Updated classifier population.
    """
    n_classifiers = len(classifiers)
    if n_classifiers < 4:
        return classifiers
    
    # Compute n_pairs from propselect if not specified
    if n_pairs is None:
        n_pairs = max(1, round(propselect * n_classifiers * 0.5))
    
    cond_length = len(classifiers[0].condition)
    
    # Compute fitness (strengths shifted to be non-negative, as in MATLAB)
    strengths = np.array([c.strength for c in classifiers])
    usage = np.array([c.n_used for c in classifiers])
    min_s = strengths.min()
    
    # Identify killable rules: strength < uratio[0] OR usage/max < uratio[1]
    max_usage = max(usage.max(), 1)
    cankill = []
    for i in range(n_classifiers):
        if strengths[i] < uratio[0] or usage[i] / max_usage < uratio[1]:
            cankill.append(i)
    
    if not cankill:
        return classifiers  # Nothing to replace
    
    # Shift strengths to be non-negative for selection (as in MATLAB)
    fitness = strengths.copy()
    if min_s < 0:
        fitness = fitness - min_s
    fitness = fitness + 1e-6  # Ensure positive
    
    # Limit n_pairs to available killable slots
    n_pairs = min(n_pairs, (len(cankill) + 1) // 2)
    
    for _ in range(n_pairs):
        if not cankill:
            break
            
        # --- Stage 1: Pre-select subset by usage (propused fraction) ---
        ncalled = int(propused * n_classifiers)
        if ncalled < n_classifiers:
            # Select ncalled indices proportional to usage+1
            usage_weights = usage.astype(float) + 1.0
            # Zero out already-selected to avoid duplicates
            available = usage_weights.copy()
            selected_indices = []
            for _ in range(min(ncalled, n_classifiers)):
                total = available.sum()
                if total <= 0:
                    break
                r = np.random.rand() * total
                cumsum = 0.0
                for idx in range(n_classifiers):
                    cumsum += available[idx]
                    if cumsum >= r:
                        selected_indices.append(idx)
                        available[idx] = 0.0
                        break
            if len(selected_indices) < 2:
                selected_indices = list(range(n_classifiers))
        else:
            selected_indices = list(range(n_classifiers))
        
        # --- Stage 2: Fitness-proportional selection within pre-selected subset ---
        subset_fitness = np.array([fitness[i] for i in selected_indices])
        total_fit = subset_fitness.sum()
        if total_fit <= 0:
            continue
        
        # Select two parents by roulette wheel
        def roulette_select(pop_fitness, pop_size):
            r = np.random.rand() * pop_fitness.sum()
            cumsum = 0.0
            for j in range(pop_size):
                cumsum += pop_fitness[j]
                if cumsum >= r:
                    return j
            return pop_size - 1
        
        idx1 = roulette_select(subset_fitness, len(selected_indices))
        idx2 = roulette_select(subset_fitness, len(selected_indices))
        mate1 = selected_indices[idx1]
        mate2 = selected_indices[idx2]
        
        p1 = classifiers[mate1]
        p2 = classifiers[mate2]
        
        # --- Single-point crossover (matching MATLAB ga3.m) ---
        if np.random.rand() < pcross:
            jcross = 1 + int(np.floor((cond_length - 1) * np.random.rand()))
        else:
            jcross = cond_length  # No crossover
        
        # Create children with crossover
        child1_cond = np.concatenate([p1.condition[:jcross], p2.condition[jcross:]])
        child2_cond = np.concatenate([p2.condition[:jcross], p1.condition[jcross:]])
        child1_action = p1.action
        child2_action = p2.action
        
        # --- Ternary mutation (matching MATLAB: cyclic shift) ---
        for j in range(cond_length):
            if np.random.rand() < pmutation:
                # Cyclic shift: -1->0->1->-1 or -1->1->0->-1
                shift = np.random.randint(1, 3)  # 1 or 2
                child1_cond[j] = ((child1_cond[j] + 1 + shift) % 3) - 1
            if np.random.rand() < pmutation:
                shift = np.random.randint(1, 3)
                child2_cond[j] = ((child2_cond[j] + 1 + shift) % 3) - 1
        # Action mutation
        if np.random.rand() < pmutation:
            child1_action = 1 - child1_action
        if np.random.rand() < pmutation:
            child2_action = 1 - child2_action
        
        # Average parents' strengths for children
        avg_strength = (p1.strength + p2.strength) / 2.0
        
        # --- Crowding-based replacement (matching MATLAB crowdin3.m) ---
        def find_crowding_replacement(child_cond, child_action, kill_set):
            """Find most similar rule in kill_set to replace (De Jong crowding)."""
            if not kill_set:
                return None
            
            best_match = -1
            best_similarity = -1
            
            subpop_size = max(1, int(crowding_subpop * len(kill_set)))
            
            for _ in range(crowding_factor):
                # Select random subpopulation from kill_set
                if subpop_size >= len(kill_set):
                    candidates = list(kill_set)
                else:
                    candidates = list(np.random.choice(kill_set, size=subpop_size, replace=False))
                
                # Find weakest in subpopulation
                weakest_idx = min(candidates, key=lambda i: classifiers[i].strength)
                
                # Compute similarity: matching condition bits + different action
                cond_match = sum(1 for j in range(cond_length) 
                               if child_cond[j] == classifiers[weakest_idx].condition[j])
                action_diff = 1 if child_action != classifiers[weakest_idx].action else 0
                similarity = cond_match + action_diff
                
                if similarity > best_similarity:
                    best_similarity = similarity
                    best_match = weakest_idx
            
            return best_match
        
        # Replace first child via crowding
        mort1 = find_crowding_replacement(child1_cond, child1_action, cankill)
        if mort1 is not None:
            classifiers[mort1] = Classifier(child1_cond, child1_action, avg_strength)
            classifiers[mort1].n_used = p1.n_used  # Inherit usage count
            cankill = [i for i in cankill if i != mort1]
        
        # Replace second child via crowding
        if cankill:
            mort2 = find_crowding_replacement(child2_cond, child2_action, cankill)
            if mort2 is not None:
                classifiers[mort2] = Classifier(child2_cond, child2_action, avg_strength)
                classifiers[mort2].n_used = p2.n_used
                cankill = [i for i in cankill if i != mort2]
    
    # Rescale back if we shifted (matching MATLAB)
    # Not needed since we work with the Classifier objects directly
    
    return classifiers


def apply_diversification(classifiers: List[Classifier], state: np.ndarray,
                          n_goods: int):
    """
    Ensure both actions are represented among matching classifiers.
    
    If all matching classifiers for a given state have the same action,
    create a new classifier with the opposite action.
    
    Paper Section 6: "used each time the classifier system is called upon"
    """
    matches = [(i, c) for i, c in enumerate(classifiers) if c.matches(state)]
    if not matches:
        return
    
    actions = [c.action for _, c in matches]
    if len(set(actions)) == 1:
        # All same action — create opposite
        opposite_action = 1 - actions[0]
        # Find weakest match to replace
        weakest_idx, weakest_c = min(matches, key=lambda x: x[1].strength)
        avg_strength = np.mean([c.strength for _, c in matches])
        classifiers[weakest_idx] = Classifier(state.copy(), opposite_action, avg_strength)


def apply_specialization(classifiers: List[Classifier], t: int):
    """
    Specialization operator from Section 6 of the paper.
    
    Probabilistically converts # (wildcard) positions to specific bit values
    (0 or 1) with frequency f_s(t) = 1 / (2 * sqrt(t)).
    
    This is the inverse of the generalization operator: as the simulation
    progresses, the specialization probability decreases (1/2√t → 0), allowing 
    early exploration via general rules to gradually give way to specific rules
    that exploit learned patterns.
    
    Parameters
    ----------
    classifiers : list of Classifier
        Classifier population to specialize (modified in-place).
    t : int
        Current simulation period (1-indexed).
    """
    f_s = 1.0 / (2.0 * np.sqrt(t))
    
    for classifier in classifiers:
        for j in range(len(classifier.condition)):
            if classifier.condition[j] == -1:  # Wildcard (#)
                if np.random.rand() < f_s:
                    classifier.condition[j] = np.random.randint(0, 2)  # 0 or 1


def create_classifier_replacing_weakest(classifiers: List[Classifier], 
                                         state: np.ndarray) -> int:
    """
    Creation operator from the paper (Section 6) / MATLAB create.m.
    
    When no classifier matches the current state, create a new one by
    replacing the most redundant classifier (or weakest if none are redundant).
    The new classifier gets the unmatched state as condition, a random action,
    and the population's average strength.
    
    This maintains a constant classifier population size, matching the original
    MATLAB implementation.
    
    Parameters
    ----------
    classifiers : list of Classifier
        Current classifier population.
    state : np.ndarray
        The unmatched state encoding.
    
    Returns
    -------
    int
        Index of the replaced/new classifier.
    """
    n = len(classifiers)
    cond_length = len(state)
    
    # Find the most redundant condition pattern
    # Group classifiers by their condition string
    condition_groups = {}
    for i, c in enumerate(classifiers):
        key = tuple(c.condition)
        if key not in condition_groups:
            condition_groups[key] = []
        condition_groups[key].append(i)
    
    # Find the largest group (most duplicated condition)
    max_group_size = max(len(g) for g in condition_groups.values())
    
    if max_group_size > 1:
        # Find the group with most duplicates
        largest_groups = [g for g in condition_groups.values() if len(g) == max_group_size]
        # Pick one group (first found, as in MATLAB)
        redundant_group = largest_groups[0]
        # Within the group, find the weakest
        replace_idx = min(redundant_group, key=lambda i: classifiers[i].strength)
    else:
        # No redundant classifiers — replace the globally weakest
        replace_idx = min(range(n), key=lambda i: classifiers[i].strength)
    
    # Create new classifier with random action and average strength
    action = np.random.randint(0, 2)
    avg_strength = np.mean([c.strength for c in classifiers])
    classifiers[replace_idx] = Classifier(state.copy(), action, avg_strength)
    
    return replace_idx


print("GA functions defined: apply_genetic_algorithm (ga3-style), apply_diversification,")
print("  apply_specialization, create_classifier_replacing_weakest")

GA functions defined: apply_genetic_algorithm (ga3-style), apply_diversification,
  apply_specialization, create_classifier_replacing_weakest

2.6 The Simulation Engine¶

The simulation runs for $T$ periods. Each period:

All agents are randomly matched into pairs
For each pair, trade and consumption decisions are made using the classifier systems
Strengths are updated via the bucket brigade
(Optionally) the genetic algorithm is applied

We track the distribution of holdings $\pi_{it}^h(k)$ — the fraction of type $i$ agents holding good $k$ at time $t$ — which should converge to the equilibrium values.

class KiyotakiWrightSimulation:
    """
    Simulation of the Kiyotaki-Wright economy with classifier system agents.
    
    This class implements the full simulation as described in Section 7
    of Marimon, McGrattan, and Sargent (1990).
    """
    
    def __init__(self, config: EconomyConfig, seed: int = None):
        self.config = config
        if seed is not None:
            np.random.seed(seed)
        
        # Create agent types
        self.agents = [ClassifierAgent(i, config) for i in range(config.n_types)]
        
        # Initialize holdings: each agent starts with a random good
        self.holdings = np.random.randint(0, config.n_goods, size=config.n_agents)
        
        # Map agent index to type
        self.agent_types = np.repeat(np.arange(config.n_types), config.n_agents_per_type)
        
        # Statistics tracking
        self.history = {
            'holdings_dist': [],      # Holdings distribution over time
            'trade_rates': [],        # Trade rates per period
            'consumption_rates': [],  # Consumption rates per period
            'exchange_counts': [],    # Exchange frequency counts π_i^e(jk)
            'consumption_counts': [], # Consumption frequency counts π_i^c(j|j)
        }
    
    def run(self, n_periods: int, record_every: int = 1, verbose: bool = True):
        """
        Run the simulation for n_periods.
        
        Parameters
        ----------
        n_periods : int
            Number of periods to simulate.
        record_every : int
            Record statistics every this many periods.
        verbose : bool
            Print progress updates.
        """
        config = self.config
        
        # Pre-compute GA schedule matching MATLAB winitial.m:
        # GA fires only on EVEN iterations with probability 1/sqrt(k/2).
        # Sized to n_periods (not hardcoded) per Copilot review.
        if config.use_genetic_algorithm:
            pga = 1.0 / np.sqrt(np.arange(1, n_periods // 2 + 1))
            self._ga_schedule = np.zeros(n_periods, dtype=bool)
            even_indices = np.arange(1, n_periods, 2)  # 0-indexed even iterations (1,3,5,...) = periods 2,4,6,...
            self._ga_schedule[even_indices] = pga[:len(even_indices)] > np.random.rand(len(even_indices))
        else:
            self._ga_schedule = np.zeros(1, dtype=bool)
        
        for t in range(1, n_periods + 1):
            trades_this_period = 0
            consumptions_this_period = 0
            
            # Per-period exchange and consumption tracking
            # exchange_counts[i, j, k] = times type i exchanged good j for good k
            exchange_counts = np.zeros((config.n_types, config.n_goods, config.n_goods))
            # consumption_counts[i, j, 0] = times type i held good j after trade
            # consumption_counts[i, j, 1] = times type i consumed good j
            consumption_counts = np.zeros((config.n_types, config.n_goods, 2))
            
            # Random matching: create random permutation
            perm = np.random.permutation(config.n_agents)
            n_pairs = config.n_agents // 2
            
            for p in range(n_pairs):
                a1 = perm[2 * p]
                a2 = perm[2 * p + 1]
                
                type1 = self.agent_types[a1]
                type2 = self.agent_types[a2]
                good1 = self.holdings[a1]
                good2 = self.holdings[a2]
                
                agent1 = self.agents[type1]
                agent2 = self.agents[type2]
                
                # --- Trade Decision ---
                action1, trade_winner1 = agent1.get_trade_decision(good1, good2)
                action2, trade_winner2 = agent2.get_trade_decision(good2, good1)
                
                trade_happens = (action1 == 1) and (action2 == 1)
                
                if trade_happens:
                    # Swap goods
                    self.holdings[a1], self.holdings[a2] = good2, good1
                    trades_this_period += 1
                    # Track exchange: type i exchanged good j for good k
                    exchange_counts[type1, good1, good2] += 1
                    exchange_counts[type2, good2, good1] += 1
                
                # Post-trade holdings
                post_good1 = self.holdings[a1]
                post_good2 = self.holdings[a2]
                
                # --- Consumption Decision ---
                cons_action1, cons_winner1 = agent1.get_consume_decision(post_good1)
                cons_action2, cons_winner2 = agent2.get_consume_decision(post_good2)
                
                # Track consumption attempts
                consumption_counts[type1, post_good1, 0] += 1  # held good
                consumption_counts[type2, post_good2, 0] += 1
                
                # Process agent 1 consumption
                payoff1 = self._process_consumption(a1, type1, post_good1, cons_action1)
                if cons_action1 == 1:
                    consumption_counts[type1, post_good1, 1] += 1  # consumed
                    if post_good1 == type1:
                        consumptions_this_period += 1
                
                # Process agent 2 consumption
                payoff2 = self._process_consumption(a2, type2, post_good2, cons_action2)
                if cons_action2 == 1:
                    consumption_counts[type2, post_good2, 1] += 1
                    if post_good2 == type2:
                        consumptions_this_period += 1
                
                # --- Update Strengths (Bucket Brigade) ---
                agent1.update_strengths(trade_winner1, cons_winner1, payoff1, 
                                       trade_happens or action1 == 0)
                agent2.update_strengths(trade_winner2, cons_winner2, payoff2, 
                                       trade_happens or action2 == 0)
            
            # --- Genetic Algorithm + Specialization (if enabled) ---
            if config.use_genetic_algorithm:
                # GA schedule matching MATLAB: even iterations only, 1/sqrt(k/2) probability
                # with random type selection (not all types every time)
                ga_idx = t - 1
                run_ga = (ga_idx < len(self._ga_schedule) and self._ga_schedule[ga_idx])
                
                if run_ga:
                    # Random type selection matching MATLAB:
                    # Always send 1 type, with p=0.33 send a 2nd, with p=0.33 send a 3rd
                    types_to_send = [np.random.randint(config.n_types)]
                    remaining = [i for i in range(config.n_types) if i != types_to_send[0]]
                    if np.random.rand() < 0.33 and remaining:
                        second = remaining[np.random.randint(len(remaining))]
                        types_to_send.append(second)
                        remaining = [i for i in remaining if i != second]
                        if np.random.rand() < 0.33 and remaining:
                            third = remaining[np.random.randint(len(remaining))]
                            types_to_send.append(third)
                    
                    # Apply GA to selected types (separate random selection for trade and consume)
                    for type_idx in types_to_send:
                        agent = self.agents[type_idx]
                        agent.trade_classifiers = apply_genetic_algorithm(
                            agent.trade_classifiers,
                            pcross=config.ga_pcross,
                            pmutation=config.ga_pmutation,
                            propselect=config.ga_propselect,
                            propused=config.ga_propused,
                            crowding_factor=config.ga_crowdfactor_trade,
                            crowding_subpop=config.ga_crowdsubpop,
                            uratio=config.ga_uratio
                        )
                    
                    # Separate random type selection for consume classifiers
                    types_to_send_c = [np.random.randint(config.n_types)]
                    remaining_c = [i for i in range(config.n_types) if i != types_to_send_c[0]]
                    if np.random.rand() < 0.33 and remaining_c:
                        second_c = remaining_c[np.random.randint(len(remaining_c))]
                        types_to_send_c.append(second_c)
                        remaining_c = [i for i in remaining_c if i != second_c]
                        if np.random.rand() < 0.33 and remaining_c:
                            third_c = remaining_c[np.random.randint(len(remaining_c))]
                            types_to_send_c.append(third_c)
                    
                    for type_idx in types_to_send_c:
                        agent = self.agents[type_idx]
                        agent.consume_classifiers = apply_genetic_algorithm(
                            agent.consume_classifiers,
                            pcross=config.ga_pcross,
                            pmutation=config.ga_pmutation,
                            propselect=config.ga_propselect,
                            propused=config.ga_propused,
                            crowding_factor=config.ga_crowdfactor_consume,
                            crowding_subpop=config.ga_crowdsubpop,
                            uratio=config.ga_uratio
                        )
                
                # Specialization: convert wildcards to specific bits (Section 6)
                # f_s(t) = 1/(2√t) — decreasing over time
                for agent in self.agents:
                    apply_specialization(agent.trade_classifiers, t)
                    apply_specialization(agent.consume_classifiers, t)
                
            # --- Tax mechanism (matching MATLAB class001/class003) ---
            # MATLAB applies proportional tax to trade classifier strengths EVERY period,
            # for ALL economy types (class001 complete-enum AND class003 random).
            # Tax is computed at start of iteration and subtracted at end.
            # MATLAB: Tax(:,i) = tax(i) * abs(CSt(:,strength)); CSt -= Tax;
            tax_rate = 0.0001  # From MATLAB wtinit.m: tax=0.0001*ones(ntypes,1)
            for agent in self.agents:
                for cls in agent.trade_classifiers:
                    cls.strength -= tax_rate * abs(cls.strength)
            
            # --- Record Statistics ---
            if t % record_every == 0:
                self._record_statistics(t, trades_this_period, consumptions_this_period,
                                       exchange_counts, consumption_counts)
            
            # --- Progress ---
            if verbose and t % max(1, n_periods // 10) == 0:
                dist = self._compute_holdings_dist()
                print(f"  Period {t:5d}/{n_periods}: "
                      f"Trades={trades_this_period:3d}, "
                      f"Consumptions={consumptions_this_period:3d}")
    
    def _process_consumption(self, agent_idx: int, type_id: int, 
                             holding: int, action: int) -> float:
        """
        Process consumption decision and return the external payoff.
        
        External payoff U(γ) from eq. (12) of the paper:
            If consume: u_i(x) - s(f(a))   (utility minus production good storage)
            If don't consume: -s(x)         (negative storage cost)
        
        Note: The paper specifies u_i(k) = 0 if k ≠ i. Agents CAN consume the
        wrong good (getting 0 utility but still producing and paying storage).
        The classifier system must LEARN not to do this.
        """
        config = self.config
        
        if action == 1:
            # Consume: get utility (0 for wrong good), produce new good
            utility = config.utility[type_id] if holding == type_id else 0.0
            new_good = config.produces[type_id]
            storage_cost = config.storage_costs[new_good]
            self.holdings[agent_idx] = new_good
            return utility - storage_cost
        else:
            # Don't consume: pay storage cost
            return -config.storage_costs[holding]
    
    def _compute_holdings_dist(self) -> np.ndarray:
        """
        Compute π_i^h(k): fraction of type i agents holding good k.
        
        Returns
        -------
        np.ndarray
            Shape (n_types, n_goods) matrix of holding fractions.
        """
        config = self.config
        dist = np.zeros((config.n_types, config.n_goods))
        for i in range(config.n_types):
            mask = self.agent_types == i
            type_holdings = self.holdings[mask]
            for k in range(config.n_goods):
                dist[i, k] = np.mean(type_holdings == k)
        return dist
    
    def _record_statistics(self, t: int, trades: int, consumptions: int,
                           exchange_counts: np.ndarray = None,
                           consumption_counts: np.ndarray = None):
        """Record statistics for the current period."""
        dist = self._compute_holdings_dist()
        self.history['holdings_dist'].append(dist.copy())
        self.history['trade_rates'].append(trades)
        self.history['consumption_rates'].append(consumptions)
        if exchange_counts is not None:
            self.history['exchange_counts'].append(exchange_counts.copy())
        if consumption_counts is not None:
            self.history['consumption_counts'].append(consumption_counts.copy())


print("KiyotakiWrightSimulation class defined.")
print("  Implements the full simulation with classifier system agents,")
print("  genetic algorithm, specialization, and tax mechanism.")

KiyotakiWrightSimulation class defined.
  Implements the full simulation with classifier system agents,
  genetic algorithm, specialization, and tax mechanism.

2.7 Visualization Functions¶

def plot_holdings_distribution(sim: KiyotakiWrightSimulation, 
                                record_every: int = 1,
                                title: str = None):
    """
    Plot the distribution of holdings over time for each agent type.
    
    This replicates Figures 6-9 from the paper, showing π_i^h(k) vs. time.
    """
    config = sim.config
    history = np.array(sim.history['holdings_dist'])  # (T, n_types, n_goods)
    T = len(history)
    time = np.arange(T) * record_every
    
    fig, axes = plt.subplots(1, config.n_types, figsize=(5 * config.n_types, 4), 
                             sharey=True)
    if config.n_types == 1:
        axes = [axes]
    
    colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd']
    linestyles = ['-', '--', ':', '-.', '-']
    
    for i, ax in enumerate(axes):
        for k in range(config.n_goods):
            ax.plot(time, history[:, i, k] * 100, 
                    color=colors[k % len(colors)],
                    linestyle=linestyles[k % len(linestyles)],
                    linewidth=1.5,
                    label=f'Good {k+1}')
        ax.set_xlabel('Period')
        ax.set_ylabel('% Holding' if i == 0 else '')
        ax.set_title(f'Type {i+1} Agent')
        ax.legend(loc='best', fontsize=9)
        ax.set_ylim(-5, 105)
        ax.grid(True, alpha=0.3)
    
    if title is None:
        title = f"Distribution of Holdings — {config.name}"
    fig.suptitle(title, fontsize=13, fontweight='bold')
    plt.tight_layout()
    plt.show()
    return fig


def plot_trade_consumption_rates(sim: KiyotakiWrightSimulation,
                                  record_every: int = 1,
                                  window: int = 20):
    """Plot smoothed trade and consumption rates over time."""
    trades = np.array(sim.history['trade_rates'])
    consumptions = np.array(sim.history['consumption_rates'])
    T = len(trades)
    time = np.arange(T) * record_every
    
    # Smooth with moving average
    if T > window:
        kernel = np.ones(window) / window
        trades_smooth = np.convolve(trades, kernel, mode='valid')
        cons_smooth = np.convolve(consumptions, kernel, mode='valid')
        time_smooth = time[window-1:]
    else:
        trades_smooth, cons_smooth = trades, consumptions
        time_smooth = time
    
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(time_smooth, trades_smooth, label='Trades', linewidth=1.5)
    ax.plot(time_smooth, cons_smooth, label='Consumptions', linewidth=1.5)
    ax.set_xlabel('Period')
    ax.set_ylabel('Count per period')
    ax.set_title(f'Trade and Consumption Activity — {sim.config.name}')
    ax.legend()
    ax.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
    return fig


def print_holdings_table(sim: KiyotakiWrightSimulation, period_label: str = ""):
    """
    Print the holdings distribution table, replicating Tables 4, 6, 8. 
    """
    dist = sim._compute_holdings_dist()
    config = sim.config
    
    header = f"{'π_i^h(j)':>10s}"
    for k in range(config.n_goods):
        header += f"   j={k+1:d}  "
    
    print(f"\nHoldings Distribution {period_label}")
    print("=" * (10 + 9 * config.n_goods))
    print(header)
    print("-" * (10 + 9 * config.n_goods))
    
    for i in range(config.n_types):
        row = f"  i={i+1:d}     "
        for k in range(config.n_goods):
            row += f"  {dist[i, k]:.3f} "
        print(row)
    print()


print("Visualization functions defined.")

Visualization functions defined.

def print_exchange_frequency(sim: KiyotakiWrightSimulation, window: int = 10,
                             period_label: str = "", t_idx: int = None):
    """
    Print exchange frequency table π_i^e(jk), replicating Tables 5, 7, etc.
    
    π_i^e(jk) = probability that type i agent holds j, meets agent with k, and trades.
    Uses a moving average over `window` periods ending at `t_idx`.
    
    Parameters
    ----------
    t_idx : int, optional
        Time index to center the query on. If None, uses latest data.
        Supports negative indexing (e.g., -1 for last period).
    """
    config = sim.config
    if not sim.history['exchange_counts']:
        print("No exchange frequency data recorded.")
        return
    
    total = len(sim.history['exchange_counts'])
    
    if t_idx is None:
        # Use last `window` periods (original behavior)
        n = min(window, total)
        recent = sim.history['exchange_counts'][-n:]
    else:
        # Resolve negative index
        if t_idx < 0:
            t_idx = total + t_idx
        n = min(window, t_idx + 1)
        recent = sim.history['exchange_counts'][t_idx - n + 1:t_idx + 1]
    
    avg_counts = np.mean(recent, axis=0)  # (n_types, n_goods, n_goods)
    
    # Normalize: divide by number of agents per type to get per-agent frequency
    n_per_type = config.n_agents_per_type
    freq = avg_counts / n_per_type
    
    print(f"\nExchange Frequency π_i^e(jk) {period_label}")
    print("=" * (14 + 22 * config.n_goods))
    header = f"{'π_i^e(jk)':>12s}"
    for k in range(config.n_goods):
        header += f"       j={k+1:d}          "
    print(header)
    print("-" * (14 + 22 * config.n_goods))
    
    for i in range(config.n_types):
        row = f"  i={i+1:d}        "
        for j in range(config.n_goods):
            triple = "(" + ",".join([f"{freq[i,j,k]:.2f}" for k in range(config.n_goods)]) + ")"
            row += f"  {triple:<18s}"
        print(row)
    print()


def print_consumption_frequency(sim: KiyotakiWrightSimulation, window: int = 10,
                                 period_label: str = "", t_idx: int = None):
    """
    Print consumption frequency π_i^c(j|j), replicating Table 14 etc.
    
    π_i^c(j|j) = probability that type i consumes good j given holding j after trade.
    Uses a moving average over `window` periods ending at `t_idx`.
    """
    config = sim.config
    if not sim.history['consumption_counts']:
        print("No consumption frequency data recorded.")
        return
    
    total = len(sim.history['consumption_counts'])
    
    if t_idx is None:
        n = min(window, total)
        recent = sim.history['consumption_counts'][-n:]
    else:
        if t_idx < 0:
            t_idx = total + t_idx
        n = min(window, t_idx + 1)
        recent = sim.history['consumption_counts'][t_idx - n + 1:t_idx + 1]
    
    avg_counts = np.mean(recent, axis=0)  # (n_types, n_goods, 2)
    
    print(f"\nConsumption Frequency π_i^c(j|j) {period_label}")
    print("=" * (12 + 9 * config.n_goods))
    header = f"{'π_i^c(j|j)':>10s}"
    for k in range(config.n_goods):
        header += f"   j={k+1:d}  "
    print(header)
    print("-" * (12 + 9 * config.n_goods))
    
    for i in range(config.n_types):
        row = f"  i={i+1:d}     "
        for k in range(config.n_goods):
            held = avg_counts[i, k, 0]
            consumed = avg_counts[i, k, 1]
            if held > 0:
                rate = consumed / held
                row += f"  {rate:.3f} "
            else:
                row += f"    —   "
        print(row)
    print()


def print_winning_classifier_actions(sim: KiyotakiWrightSimulation, 
                                       period_label: str = ""):
    """
    Print the winning classifier actions table π̃_i^e(jk|j), 
    replicating Tables 6, 8, etc.
    
    For each type i, for each (own_good j, partner_good k):
    shows the action (1=trade, 0=refuse) of the highest-strength matching classifier.
    
    Note: This shows the CURRENT state of classifiers. For intermediate-time
    classifier actions, snapshots would need to be saved during simulation.
    """
    config = sim.config
    n_goods = config.n_goods
    
    print(f"\nWinning Classifier Actions π̃_i^e(jk|j) {period_label}")
    print("=" * (14 + 22 * n_goods))
    header = f"{'π̃_i^e(jk|j)':>12s}"
    for j in range(n_goods):
        header += f"       j={j+1:d}          "
    print(header)
    print("-" * (14 + 22 * n_goods))
    
    for i in range(config.n_types):
        agent = sim.agents[i]
        row = f"  i={i+1:d}        "
        for j in range(n_goods):
            actions = []
            for k in range(n_goods):
                own_enc = encode_good(j, n_goods)
                partner_enc = encode_good(k, n_goods)
                state = np.concatenate([own_enc, partner_enc])
                
                matches = [(idx, c) for idx, c in enumerate(agent.trade_classifiers) 
                           if c.matches(state)]
                if matches:
                    best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
                    actions.append(str(best_cls.action))
                else:
                    actions.append("—")
            triple = "(" + ",".join(actions) + ")"
            row += f"  {triple:<18s}"
        print(row)
    print()


def print_full_analysis(sim: KiyotakiWrightSimulation, period_label: str = "",
                         window: int = 10, t_idx: int = None):
    """
    Print the complete set of analysis tables matching the paper's format.
    
    Parameters
    ----------
    t_idx : int, optional
        Time index for exchange/consumption frequency queries. If None, 
        uses latest data. Supports negative indexing.
    """
    if t_idx is not None:
        # Use time-indexed holdings from history
        total = len(sim.history['holdings_dist'])
        idx = t_idx if t_idx >= 0 else total + t_idx
        dist = sim.history['holdings_dist'][idx]
        config = sim.config
        header = f"{'π_i^h(j)':>10s}"
        for k in range(config.n_goods):
            header += f"   j={k+1:d}  "
        print(f"\nHoldings Distribution {period_label}")
        print("=" * (10 + 9 * config.n_goods))
        print(header)
        print("-" * (10 + 9 * config.n_goods))
        for i in range(config.n_types):
            row = f"  i={i+1:d}     "
            for k in range(config.n_goods):
                row += f"  {dist[i, k]:.3f} "
            print(row)
        print()
    else:
        print_holdings_table(sim, period_label=period_label)
    
    print_exchange_frequency(sim, window=window, period_label=period_label, t_idx=t_idx)
    print_winning_classifier_actions(sim, period_label=period_label)
    print_consumption_frequency(sim, window=window, period_label=period_label, t_idx=t_idx)


def print_classifier_strengths(sim: KiyotakiWrightSimulation, type_id: int,
                                classifier_type: str = "trade", top_n: int = 10,
                                period_label: str = ""):
    """
    Print top classifiers by strength for a given agent type.
    
    Replicates Tables 10-15 (A1.1) and Tables 20-27 (A1.2) from the paper,
    which show individual classifier strings and strengths.
    
    Parameters
    ----------
    type_id : int
        Agent type (0-indexed).
    classifier_type : str
        "trade" or "consume".
    top_n : int
        Number of top classifiers to display.
    """
    config = sim.config
    agent = sim.agents[type_id]
    
    if classifier_type == "trade":
        classifiers = agent.trade_classifiers
        label = "Exchange"
        n_bits = len(classifiers[0].condition) if classifiers else 0
    else:
        classifiers = agent.consume_classifiers
        label = "Consumption"
        n_bits = len(classifiers[0].condition) if classifiers else 0
    
    # Sort by strength descending
    sorted_cls = sorted(enumerate(classifiers), key=lambda x: x[1].strength, reverse=True)
    
    print(f"\nTop {top_n} {label} Classifiers for Type {type_id + 1} {period_label}")
    print("=" * 70)
    print(f"{'Rank':>4s}  {'Condition':>12s}  {'Action':>6s}  {'Strength':>10s}  {'Used':>6s}  {'Specificity':>11s}")
    print("-" * 70)
    
    for rank, (idx, cls) in enumerate(sorted_cls[:top_n], 1):
        # Convert condition to trinary string: 0→'0', 1→'1', -1→'#'
        cond_str = ''.join(['#' if v == -1 else str(v) for v in cls.condition])
        action_str = "TRADE" if cls.action == 1 else "HOLD"
        print(f"{rank:4d}  {cond_str:>12s}  {action_str:>6s}  {cls.strength:10.2f}  {cls.n_used:6d}  {cls.specificity:11.3f}")
    
    total = len(classifiers)
    action_1 = sum(1 for c in classifiers if c.action == 1)
    action_0 = total - action_1
    avg_strength = np.mean([c.strength for c in classifiers]) if classifiers else 0
    print(f"\n  Total: {total} classifiers ({action_1} TRADE, {action_0} HOLD), avg strength: {avg_strength:.2f}")


print("Analysis functions defined (exchange freq, consumption freq, winning classifiers, classifier strengths).")

Analysis functions defined (exchange freq, consumption freq, winning classifiers, classifier strengths).

3. Economy A1: Fundamental Equilibrium with Complete Enumeration¶

Economy A1 uses the Model A production structure with parameters:

$A_i = 50$ agents per type
Storage costs: $s_1 = 0.1,\ s_2 = 1,\ s_3 = 20$
Utility: $u_i = 100$ for all $i$
Complete enumeration of classifiers ( $E_a = 72,\ C_a = 12$ )
Bid parameters: $b_{11} = b_{12} = 0.025,\ b_{21} = b_{22} = 0.25$

Expected Result¶

The paper shows that the distribution of holdings rapidly converges to the fundamental equilibrium:

$\pi_i^h(j)$	$j=1$	$j=2$	$j=3$
$i=1$	0	1	0
$i=2$	0.5	0	0.5
$i=3$	1	0	0

In this equilibrium, good 1 (lowest storage cost) serves as the general medium of exchange. Type II agents accept good 1 even though they don’t consume it, because they can easily trade it for their desired good 2.

# Economy A1.1: Complete enumeration, fundamental equilibrium
config_a1 = EconomyConfig(
    name="Economy A1.1 (Complete Enumeration)",
    n_types=3,
    n_agents_per_type=50,
    produces=np.array([1, 2, 0]),       # Type 1->good 2, Type 2->good 3, Type 3->good 1
    storage_costs=np.array([0.1, 1.0, 20.0]),  # s1 < s2 < s3
    utility=np.array([100.0, 100.0, 100.0]),
    n_trade_classifiers=72,
    n_consume_classifiers=12,
    bid_trade=(0.025, 0.025),
    bid_consume=(0.25, 0.25),
    use_complete_enumeration=True,
    use_genetic_algorithm=False
)

print(f"Economy: {config_a1.name}")
print(f"Total agents: {config_a1.n_agents}")
print(f"Storage costs: {config_a1.storage_costs}")
print(f"Utility: {config_a1.utility}")
print(f"Production: Type i produces good {config_a1.produces + 1}")

Economy: Economy A1.1 (Complete Enumeration)
Total agents: 150
Storage costs: [ 0.1  1.  20. ]
Utility: [100. 100. 100.]
Production: Type i produces good [2 3 1]

# Run Economy A1.1 simulation
print("Running Economy A1.1 (Complete Enumeration)...")
print("=" * 50)

sim_a1 = KiyotakiWrightSimulation(config_a1, seed=42)
sim_a1.run(n_periods=1000, record_every=1, verbose=True)

print("\nSimulation complete.")

Running Economy A1.1 (Complete Enumeration)...
==================================================
  Period   100/1000: Trades= 34, Consumptions= 33
  Period   200/1000: Trades= 33, Consumptions= 34
  Period   300/1000: Trades= 32, Consumptions= 34
  Period   400/1000: Trades= 29, Consumptions= 35
  Period   500/1000: Trades= 29, Consumptions= 35
  Period   600/1000: Trades= 26, Consumptions= 22
  Period   700/1000: Trades= 33, Consumptions= 32
  Period   800/1000: Trades= 35, Consumptions= 40
  Period   900/1000: Trades= 35, Consumptions= 30
  Period  1000/1000: Trades= 36, Consumptions= 38

Simulation complete.

# Display results for Economy A1.1
# The paper reports 10-period moving averages at t=500 and t=1000

# --- Results at t=500 (cf. Paper Tables 4-6) ---
print("=" * 70)
print("ECONOMY A1.1 RESULTS AT t=500 (cf. Paper Tables 4-6)")
print("=" * 70)
print_full_analysis(sim_a1, period_label="at t=500", window=10, t_idx=499)

# --- Results at t=1000 (cf. Paper Tables 4-6) ---
print("\n" + "=" * 70)
print("ECONOMY A1.1 RESULTS AT t=1000 (cf. Paper Tables 4-6)")
print("=" * 70)
print_full_analysis(sim_a1, period_label="at t=1000", window=10)

# --- Classifier Strength Tables (cf. Paper Tables 10-15) ---
print("\n" + "=" * 70)
print("ECONOMY A1.1 CLASSIFIER STRENGTHS AT t=1000 (cf. Paper Tables 10-15)")
print("=" * 70)
for type_id in range(3):
    print_classifier_strengths(sim_a1, type_id, "trade", top_n=10, period_label="at t=1000")
    print_classifier_strengths(sim_a1, type_id, "consume", top_n=5, period_label="at t=1000")

# Plot distribution of holdings over time (replicates Figure 6)
fig_a1 = plot_holdings_distribution(sim_a1, record_every=1,
    title="Economy A1.1: Distribution of Holdings (cf. Figure 6 in paper)")

======================================================================
ECONOMY A1.1 RESULTS AT t=500 (cf. Paper Tables 4-6)
======================================================================

Holdings Distribution at t=500
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   1.000   0.000 
  i=2       0.460   0.000   0.540 
  i=3       1.000   0.000   0.000 


Exchange Frequency π_i^e(jk) at t=500
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.18,0.30,0.19)    (0.00,0.00,0.00)  
  i=2          (0.00,0.18,0.00)    (0.00,0.00,0.00)    (0.17,0.19,0.00)  
  i=3          (0.00,0.00,0.17)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=500
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0,0,0)             (1,1,1)             (1,1,1)           
  i=2          (0,1,0)             (0,0,0)             (1,1,0)           
  i=3          (0,0,1)             (0,0,0)             (0,0,0)           


Consumption Frequency π_i^c(j|j) at t=500
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       1.000   1.000   1.000 
  i=2       0.000   1.000   0.000 
  i=3       1.000     —     1.000 


======================================================================
ECONOMY A1.1 RESULTS AT t=1000 (cf. Paper Tables 4-6)
======================================================================

Holdings Distribution at t=1000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   1.000   0.000 
  i=2       0.540   0.000   0.460 
  i=3       1.000   0.000   0.000 


Exchange Frequency π_i^e(jk) at t=1000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.17,0.36,0.14)    (0.00,0.00,0.00)  
  i=2          (0.00,0.17,0.00)    (0.00,0.00,0.00)    (0.19,0.14,0.00)  
  i=3          (0.00,0.00,0.19)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=1000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0,0,0)             (1,1,1)             (1,1,1)           
  i=2          (0,1,0)             (0,0,0)             (1,1,0)           
  i=3          (0,0,1)             (0,0,0)             (0,0,0)           


Consumption Frequency π_i^c(j|j) at t=1000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       1.000   1.000   1.000 
  i=2       0.000   1.000   0.000 
  i=3       1.000     —     1.000 


======================================================================
ECONOMY A1.1 CLASSIFIER STRENGTHS AT t=1000 (cf. Paper Tables 10-15)
======================================================================

Top 10 Exchange Classifiers for Type 1 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1          0110   TRADE       30.55    8236        1.000
   2          1001    HOLD       29.92       4        1.000
   3          1010    HOLD       29.54       8        1.000
   4          1000    HOLD       22.97       7        1.000
   5          1010   TRADE        0.00       1        1.000
   6          1001   TRADE        0.00       1        1.000
   7          1000   TRADE        0.00       1        1.000
   8          100#    HOLD        0.00       1        0.500
   9          100#   TRADE        0.00       1        0.500
  10          10#0    HOLD        0.00       1        0.500

  Total: 72 classifiers (36 TRADE, 36 HOLD), avg strength: 1.46

Top 5 Consumption Classifiers for Type 1 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1            10   TRADE       67.24    8251        1.000
   2            10    HOLD       -0.10       2        1.000
   3            ##   TRADE       -0.57   41740        0.333
   4            01    HOLD       -0.76       3        1.000
   5            01   TRADE       -1.00       2        1.000

  Total: 12 classifiers (6 TRADE, 6 HOLD), avg strength: 1.69

Top 10 Exchange Classifiers for Type 2 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1          0110    HOLD       24.81       2        1.000
   2          1001   TRADE       24.78    8134        1.000
   3          0001   TRADE       24.75    8034        1.000
   4          0100    HOLD       23.03      15        1.000
   5          0101    HOLD       21.08       8        1.000
   6          0010   TRADE        0.09    8148        1.000
   7          10#0    HOLD        0.09   17045        0.500
   8          0110   TRADE        0.00       1        1.000
   9          0101   TRADE        0.00       1        1.000
  10          0100   TRADE        0.00       1        1.000

  Total: 72 classifiers (36 TRADE, 36 HOLD), avg strength: -1.15

Top 5 Consumption Classifiers for Type 2 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1            01   TRADE       54.50   16188        1.000
   2            10    HOLD        0.21   25679        1.000
   3            01    HOLD       -1.00       2        1.000
   4            #0    HOLD      -14.50    8127        0.500
   5            #0   TRADE      -19.38       2        0.500

  Total: 12 classifiers (6 TRADE, 6 HOLD), avg strength: -9.88

Top 10 Exchange Classifiers for Type 3 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1          1000   TRADE       30.85    8146        1.000
   2          0000    HOLD       29.90       4        1.000
   3          0010    HOLD       28.73      14        1.000
   4          0001    HOLD       27.18      11        1.000
   5          0100    HOLD        0.11       7        1.000
   6          0101    HOLD        0.05       5        1.000
   7          0110    HOLD        0.04       3        1.000
   8          1001    HOLD        0.03   16612        1.000
   9          1010    HOLD        0.03   24907        1.000
  10          0110   TRADE        0.00       1        1.000

  Total: 72 classifiers (36 TRADE, 36 HOLD), avg strength: 1.61

Top 5 Consumption Classifiers for Type 3 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1            00   TRADE       67.86    8171        1.000
   2            #0   TRADE        0.11   41711        0.500
   3            0#   TRADE       -0.05     112        0.500
   4            ##    HOLD       -0.09       2        0.333
   5            ##   TRADE       -0.10       2        0.333

  Total: 12 classifiers (6 TRADE, 6 HOLD), avg strength: 3.78

# Trade and consumption activity
plot_trade_consumption_rates(sim_a1, record_every=1, window=30);

Discussion: Economy A1.1¶

As in the paper, the classifier system converges to the fundamental equilibrium:

Type 1 agents hold exclusively good 2 (which they produce) — they trade it for good 1 (their consumption good)
Type 2 agents split between good 1 and good 3 — they accept good 1 as a medium of exchange
Type 3 agents hold exclusively good 1 — they accept it from type 2 and trade it for good 3

Good 1, the cheapest to store, naturally emerges as the medium of exchange.

4. Economy A1.2: Random Initial Classifiers with Genetic Algorithm¶

Economy A1.2 is identical to A1.1 except that:

Initial classifiers are randomly generated (not a complete enumeration)
The genetic algorithm periodically injects new rules

This tests whether the classifier system can discover the fundamental equilibrium starting from random rules, using the GA’s creation, diversification, specialization, and generalization operations.

The paper reports convergence is slower but the system still approaches the fundamental equilibrium.

# Economy A1.2: Random classifiers + GA
config_a12 = EconomyConfig(
    name="Economy A1.2 (Random + GA)",
    n_types=3,
    n_agents_per_type=50,
    produces=np.array([1, 2, 0]),
    storage_costs=np.array([0.1, 1.0, 20.0]),
    utility=np.array([100.0, 100.0, 100.0]),
    n_trade_classifiers=72,
    n_consume_classifiers=12,
    bid_trade=(0.025, 0.025),
    bid_consume=(0.25, 0.25),
    use_complete_enumeration=False,  # Random initial classifiers
    use_genetic_algorithm=True,       # Enable GA
    ga_frequency=0.5,
    ga_pcross=0.6,
    ga_pmutation=0.01
)

print("Running Economy A1.2 (Random + GA)...")
print("=" * 50)

sim_a12 = KiyotakiWrightSimulation(config_a12, seed=123)
sim_a12.run(n_periods=2000, record_every=1, verbose=True)

print("\nSimulation complete.")

Running Economy A1.2 (Random + GA)...
==================================================
  Period   200/2000: Trades= 62, Consumptions= 55
  Period   400/2000: Trades= 67, Consumptions= 54
  Period   600/2000: Trades= 68, Consumptions= 62
  Period   800/2000: Trades= 54, Consumptions= 43
  Period  1000/2000: Trades= 67, Consumptions= 45
  Period  1200/2000: Trades= 62, Consumptions= 51
  Period  1400/2000: Trades= 55, Consumptions= 52
  Period  1600/2000: Trades= 55, Consumptions= 49
  Period  1800/2000: Trades= 64, Consumptions= 37
  Period  2000/2000: Trades= 60, Consumptions= 46

Simulation complete.

# Display results for Economy A1.2
# Paper reports results at t=1000 and t=2000 for GA economies

print("=" * 70)
print("ECONOMY A1.2 RESULTS AT t=1000")
print("=" * 70)
print_full_analysis(sim_a12, period_label="at t=1000", window=10, t_idx=999)

print("\n" + "=" * 70)
print("ECONOMY A1.2 RESULTS AT t=2000 (cf. Paper Tables 7-9)")
print("=" * 70)
print_full_analysis(sim_a12, period_label="at t=2000", window=10)

# --- Classifier Strength Tables (cf. Paper Tables 20-27) ---
# The paper's Tables 20-27 show individual classifier rules and strengths
# at t=1000 and t=2000 for Economy A1.2
print("\n" + "=" * 70)
print("ECONOMY A1.2 CLASSIFIER STRENGTHS AT t=1000 (cf. Paper Tables 20-23)")
print("=" * 70)
for type_id in range(3):
    print_classifier_strengths(sim_a12, type_id, "trade", top_n=10, period_label="at t=1000")
    print_classifier_strengths(sim_a12, type_id, "consume", top_n=5, period_label="at t=1000")

print("\n" + "=" * 70)
print("ECONOMY A1.2 CLASSIFIER STRENGTHS AT t=2000 (cf. Paper Tables 24-27)")
print("=" * 70)
for type_id in range(3):
    print_classifier_strengths(sim_a12, type_id, "trade", top_n=10, period_label="at t=2000")
    print_classifier_strengths(sim_a12, type_id, "consume", top_n=5, period_label="at t=2000")

# Plot distribution of holdings (replicates Figure 7)
fig_a12 = plot_holdings_distribution(sim_a12, record_every=1,
    title="Economy A1.2: Distribution of Holdings (cf. Figure 7 in paper)")

======================================================================
ECONOMY A1.2 RESULTS AT t=1000
======================================================================

Holdings Distribution at t=1000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   1.000   0.000 
  i=2       0.440   0.000   0.560 
  i=3       1.000   0.000   0.000 


Exchange Frequency π_i^e(jk) at t=1000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.52,0.33,0.16)    (0.00,0.00,0.00)  
  i=2          (0.00,0.17,0.10)    (0.00,0.00,0.00)    (0.26,0.16,0.04)  
  i=3          (0.31,0.34,0.16)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=1000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (1,—,—)             (1,1,1)             (1,—,—)           
  i=2          (0,1,1)             (—,1,—)             (1,1,1)           
  i=3          (1,1,1)             (—,—,—)             (—,—,1)           


Consumption Frequency π_i^c(j|j) at t=1000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       1.000   1.000   1.000 
  i=2       0.000   1.000   1.000 
  i=3       1.000   1.000   1.000 


======================================================================
ECONOMY A1.2 RESULTS AT t=2000 (cf. Paper Tables 7-9)
======================================================================

Holdings Distribution at t=2000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   1.000   0.000 
  i=2       0.420   0.000   0.580 
  i=3       1.000   0.000   0.000 


Exchange Frequency π_i^e(jk) at t=2000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.51,0.33,0.16)    (0.00,0.00,0.00)  
  i=2          (0.00,0.17,0.08)    (0.00,0.00,0.00)    (0.24,0.16,0.07)  
  i=3          (0.31,0.34,0.17)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=2000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (1,—,—)             (1,1,1)             (1,—,—)           
  i=2          (0,1,1)             (—,1,—)             (1,1,1)           
  i=3          (1,1,1)             (—,—,—)             (—,—,1)           


Consumption Frequency π_i^c(j|j) at t=2000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       1.000   1.000   1.000 
  i=2       0.000   1.000   1.000 
  i=3       1.000   1.000   1.000 


======================================================================
ECONOMY A1.2 CLASSIFIER STRENGTHS AT t=1000 (cf. Paper Tables 20-23)
======================================================================

Top 10 Exchange Classifiers for Type 1 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1  0.01.01.00.0   TRADE       28.95   50168        1.000
   2  0.01.01.00.0   TRADE       23.23   11636        1.000
   3  0.01.01.00.0   TRADE       23.23   30592        1.000
   4  0.01.01.01.0   TRADE       22.16   13816        1.000
   5  0.00.01.00.0   TRADE       22.16    4647        1.000
   6  0.00.01.01.0   TRADE       21.43    4647        1.000
   7  0.01.01.00.0   TRADE       21.43   30592        1.000
   8  0.01.01.00.0   TRADE       21.18   11636        1.000
   9  0.00.01.00.0   TRADE       21.18    1871        1.000
  10  0.00.01.00.0   TRADE       21.14    4647        1.000

  Total: 72 classifiers (71 TRADE, 1 HOLD), avg strength: 19.40

Top 5 Consumption Classifiers for Type 1 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1        1.00.0   TRADE       67.63   50469        1.000
   2        1.00.0   TRADE       64.28   44134        1.000
   3        1.00.0   TRADE       64.28   18154        1.000
   4        1.00.0   TRADE       60.95   18154        1.000
   5        1.00.0   TRADE       58.22   18154        1.000

  Total: 12 classifiers (12 TRADE, 0 HOLD), avg strength: 49.96

Top 10 Exchange Classifiers for Type 2 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1  0.00.00.01.0   TRADE       23.63   16914        1.000
   2  1.00.00.01.0   TRADE       23.62   16696        1.000
   3  0.00.00.01.0   TRADE       23.45   15716        1.000
   4  1.00.00.01.0   TRADE       23.44   15496        1.000
   5  1.00.00.01.0   TRADE       22.14    7890        1.000
   6  1.00.00.01.0   TRADE       21.97   14114        1.000
   7  0.00.00.01.0   TRADE       21.97    5527        1.000
   8  0.00.00.01.0   TRADE       21.97    5338        1.000
   9  0.00.00.01.0   TRADE       21.92    5338        1.000
  10  1.00.00.01.0   TRADE       21.82   13842        1.000

  Total: 72 classifiers (71 TRADE, 1 HOLD), avg strength: 19.68

Top 5 Consumption Classifiers for Type 2 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1        0.01.0   TRADE       54.54   33623        1.000
   2        0.01.0   TRADE       50.20    6244        1.000
   3        0.01.0   TRADE       50.20    6244        1.000
   4        0.01.0   TRADE       50.20    6244        1.000
   5        1.01.0    HOLD       46.71    9024        1.000

  Total: 12 classifiers (8 TRADE, 4 HOLD), avg strength: 38.53

Top 10 Exchange Classifiers for Type 3 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1  1.00.00.00.0   TRADE       29.48   16452        1.000
   2  1.00.00.00.0   TRADE       27.37    7096        1.000
   3  1.00.00.00.0   TRADE       27.37    5116        1.000
   4  1.00.00.00.0   TRADE       27.18    5116        1.000
   5  1.00.00.00.0   TRADE       27.18    6612        1.000
   6  1.00.00.00.0   TRADE       27.16    5116        1.000
   7  1.00.00.00.0   TRADE       27.14    6612        1.000
   8  1.00.00.00.0   TRADE       27.07    5619        1.000
   9  1.01.00.00.0   TRADE       26.99    5619        1.000
  10  1.00.00.00.0   TRADE       26.99    5116        1.000

  Total: 72 classifiers (72 TRADE, 0 HOLD), avg strength: 25.27

Top 5 Consumption Classifiers for Type 3 at t=1000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1        0.00.0   TRADE       68.07   16337        1.000
   2        0.00.0   TRADE       68.04    7528        1.000
   3        0.00.0   TRADE       68.04   13488        1.000
   4        0.00.0   TRADE       68.00    7528        1.000
   5        0.00.0   TRADE       68.00    6631        1.000

  Total: 12 classifiers (12 TRADE, 0 HOLD), avg strength: 51.90

======================================================================
ECONOMY A1.2 CLASSIFIER STRENGTHS AT t=2000 (cf. Paper Tables 24-27)
======================================================================

Top 10 Exchange Classifiers for Type 1 at t=2000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1  0.01.01.00.0   TRADE       28.95   50168        1.000
   2  0.01.01.00.0   TRADE       23.23   11636        1.000
   3  0.01.01.00.0   TRADE       23.23   30592        1.000
   4  0.01.01.01.0   TRADE       22.16   13816        1.000
   5  0.00.01.00.0   TRADE       22.16    4647        1.000
   6  0.00.01.01.0   TRADE       21.43    4647        1.000
   7  0.01.01.00.0   TRADE       21.43   30592        1.000
   8  0.01.01.00.0   TRADE       21.18   11636        1.000
   9  0.00.01.00.0   TRADE       21.18    1871        1.000
  10  0.00.01.00.0   TRADE       21.14    4647        1.000

  Total: 72 classifiers (71 TRADE, 1 HOLD), avg strength: 19.40

Top 5 Consumption Classifiers for Type 1 at t=2000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1        1.00.0   TRADE       67.63   50469        1.000
   2        1.00.0   TRADE       64.28   44134        1.000
   3        1.00.0   TRADE       64.28   18154        1.000
   4        1.00.0   TRADE       60.95   18154        1.000
   5        1.00.0   TRADE       58.22   18154        1.000

  Total: 12 classifiers (12 TRADE, 0 HOLD), avg strength: 49.96

Top 10 Exchange Classifiers for Type 2 at t=2000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1  0.00.00.01.0   TRADE       23.63   16914        1.000
   2  1.00.00.01.0   TRADE       23.62   16696        1.000
   3  0.00.00.01.0   TRADE       23.45   15716        1.000
   4  1.00.00.01.0   TRADE       23.44   15496        1.000
   5  1.00.00.01.0   TRADE       22.14    7890        1.000
   6  1.00.00.01.0   TRADE       21.97   14114        1.000
   7  0.00.00.01.0   TRADE       21.97    5527        1.000
   8  0.00.00.01.0   TRADE       21.97    5338        1.000
   9  0.00.00.01.0   TRADE       21.92    5338        1.000
  10  1.00.00.01.0   TRADE       21.82   13842        1.000

  Total: 72 classifiers (71 TRADE, 1 HOLD), avg strength: 19.68

Top 5 Consumption Classifiers for Type 2 at t=2000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1        0.01.0   TRADE       54.54   33623        1.000
   2        0.01.0   TRADE       50.20    6244        1.000
   3        0.01.0   TRADE       50.20    6244        1.000
   4        0.01.0   TRADE       50.20    6244        1.000
   5        1.01.0    HOLD       46.71    9024        1.000

  Total: 12 classifiers (8 TRADE, 4 HOLD), avg strength: 38.53

Top 10 Exchange Classifiers for Type 3 at t=2000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1  1.00.00.00.0   TRADE       29.48   16452        1.000
   2  1.00.00.00.0   TRADE       27.37    7096        1.000
   3  1.00.00.00.0   TRADE       27.37    5116        1.000
   4  1.00.00.00.0   TRADE       27.18    5116        1.000
   5  1.00.00.00.0   TRADE       27.18    6612        1.000
   6  1.00.00.00.0   TRADE       27.16    5116        1.000
   7  1.00.00.00.0   TRADE       27.14    6612        1.000
   8  1.00.00.00.0   TRADE       27.07    5619        1.000
   9  1.01.00.00.0   TRADE       26.99    5619        1.000
  10  1.00.00.00.0   TRADE       26.99    5116        1.000

  Total: 72 classifiers (72 TRADE, 0 HOLD), avg strength: 25.27

Top 5 Consumption Classifiers for Type 3 at t=2000
======================================================================
Rank     Condition  Action    Strength    Used  Specificity
----------------------------------------------------------------------
   1        0.00.0   TRADE       68.07   16337        1.000
   2        0.00.0   TRADE       68.04    7528        1.000
   3        0.00.0   TRADE       68.04   13488        1.000
   4        0.00.0   TRADE       68.00    7528        1.000
   5        0.00.0   TRADE       68.00    6631        1.000

  Total: 12 classifiers (12 TRADE, 0 HOLD), avg strength: 51.90

Discussion: Economy A1.2¶

With random initial classifiers and the genetic algorithm, convergence is slower than with complete enumeration, but the system still moves toward the fundamental equilibrium. The GA helps by:

Creating classifiers for unmatched states
Diversifying to ensure both actions are explored
Generalizing via crossover (disagreements → wildcards)
Specializing by converting wildcards to specific values

5. Economy A2: Fundamental vs. Speculative Equilibrium¶

Economy A2 differs from A1 only in that utility is raised to $u_i = 500$ . At this higher utility, for high enough discount factors the unique stationary rational-expectations equilibrium of Kiyotaki and Wright is the speculative equilibrium, in which type 1 agents accept good 3 (high storage cost) as a speculative investment.

The condition for the fundamental equilibrium to be unique (eq. 13 in the paper):

s_3 - s_2 > (\pi_1^h(3) - \pi_1^h(2)) \cdot \frac{1}{3} u_1

(3)

With $s_3 - s_2 = 19$ and $u_1 = 500$ , this becomes $19 > \frac{500}{3}(\pi_1^h(3) - \pi_1^h(2))$ , which can be violated.

The Paper’s Key Finding¶

Despite the speculative equilibrium being the unique rational-expectations equilibrium, the classifier system converges to the fundamental equilibrium. The authors explain this as a consequence of the agents’ early “myopia” — the strength-updating rule converges slowly, and early decisions are essentially myopic, favoring the fundamental equilibrium.

# Economy A2.1: High utility — speculative equilibrium should prevail for patient agents
config_a2 = EconomyConfig(
    name="Economy A2.1 (High Utility, Complete Enum.)",
    n_types=3,
    n_agents_per_type=50,
    produces=np.array([1, 2, 0]),
    storage_costs=np.array([0.1, 1.0, 20.0]),
    utility=np.array([500.0, 500.0, 500.0]),   # High utility!
    n_trade_classifiers=72,
    n_consume_classifiers=12,
    bid_trade=(0.025, 0.025),
    bid_consume=(0.25, 0.25),
    use_complete_enumeration=True,
    use_genetic_algorithm=False
)

print("Running Economy A2.1 (High Utility)...")
print("=" * 50)

sim_a2 = KiyotakiWrightSimulation(config_a2, seed=42)
sim_a2.run(n_periods=1000, record_every=1, verbose=True)

print("\nSimulation complete.")

Running Economy A2.1 (High Utility)...
==================================================
  Period   100/1000: Trades= 43, Consumptions= 42
  Period   200/1000: Trades= 37, Consumptions= 38
  Period   300/1000: Trades= 43, Consumptions= 45
  Period   400/1000: Trades= 49, Consumptions= 55
  Period   500/1000: Trades= 40, Consumptions= 46
  Period   600/1000: Trades= 41, Consumptions= 37
  Period   700/1000: Trades= 45, Consumptions= 44
  Period   800/1000: Trades= 42, Consumptions= 47
  Period   900/1000: Trades= 45, Consumptions= 40
  Period  1000/1000: Trades= 45, Consumptions= 47

Simulation complete.

# Display results for Economy A2.1
# Paper reports at t=500 and t=1000 for complete enumeration economies

print("=" * 70)
print("ECONOMY A2.1 RESULTS AT t=500")
print("=" * 70)
print_full_analysis(sim_a2, period_label="at t=500", window=10, t_idx=499)

print("\n" + "=" * 70)
print("ECONOMY A2.1 RESULTS AT t=1000 (cf. Paper Table 10)")
print("=" * 70)
print_full_analysis(sim_a2, period_label="at t=1000", window=10)

# Check Kiyotaki-Wright condition for fundamental uniqueness
dist = sim_a2._compute_holdings_dist()
config = sim_a2.config
s3_minus_s2 = config_a2.storage_costs[2] - config_a2.storage_costs[1]
rhs = (1/3) * config_a2.utility[0] * (dist[0, 2] - dist[0, 1])
print(f"\nKiyotaki-Wright condition check:")
print(f"  s₃ - s₂ = {s3_minus_s2:.1f}")
print(f"  (1/3)·u₁·(π₁ʰ(3) - π₁ʰ(2)) = {rhs:.2f}")
print(f"  Fundamental unique? s₃-s₂ > RHS: {s3_minus_s2 > rhs}")
print(f"  → The classifier system converges to the FUNDAMENTAL equilibrium")
print(f"    even when rational expectations would predict the speculative one.")

# Plot
fig_a2 = plot_holdings_distribution(sim_a2, record_every=1,
    title="Economy A2.1: Holdings Distribution — Fundamental Despite High u\n"
          "(Rational expectations predicts speculative equilibrium)")

======================================================================
ECONOMY A2.1 RESULTS AT t=500
======================================================================

Holdings Distribution at t=500
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   1.000   0.000 
  i=2       0.460   0.000   0.540 
  i=3       1.000   0.000   0.000 


Exchange Frequency π_i^e(jk) at t=500
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.50,0.30,0.00)    (0.00,0.00,0.00)  
  i=2          (0.00,0.18,0.00)    (0.00,0.00,0.00)    (0.17,0.00,0.00)  
  i=3          (0.00,0.32,0.17)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=500
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0,0,0)             (1,1,0)             (0,0,0)           
  i=2          (0,1,0)             (0,0,0)             (1,1,0)           
  i=3          (0,1,1)             (0,0,0)             (0,0,0)           


Consumption Frequency π_i^c(j|j) at t=500
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       1.000   1.000     —   
  i=2       0.000   1.000   0.000 
  i=3       1.000   1.000   1.000 


======================================================================
ECONOMY A2.1 RESULTS AT t=1000 (cf. Paper Table 10)
======================================================================

Holdings Distribution at t=1000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   1.000   0.000 
  i=2       0.540   0.000   0.460 
  i=3       1.000   0.000   0.000 


Exchange Frequency π_i^e(jk) at t=1000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.50,0.36,0.00)    (0.00,0.00,0.00)  
  i=2          (0.00,0.17,0.00)    (0.00,0.00,0.00)    (0.19,0.00,0.00)  
  i=3          (0.00,0.32,0.19)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=1000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0,0,0)             (1,1,0)             (0,0,0)           
  i=2          (0,1,0)             (0,0,0)             (1,1,0)           
  i=3          (0,1,1)             (0,0,0)             (0,0,0)           


Consumption Frequency π_i^c(j|j) at t=1000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       1.000   1.000     —   
  i=2       0.000   1.000   0.000 
  i=3       1.000   1.000   1.000 


Kiyotaki-Wright condition check:
  s₃ - s₂ = 19.0
  (1/3)·u₁·(π₁ʰ(3) - π₁ʰ(2)) = -166.67
  Fundamental unique? s₃-s₂ > RHS: True
  → The classifier system converges to the FUNDAMENTAL equilibrium
    even when rational expectations would predict the speculative one.

Discussion: Economy A2¶

As the paper finds, even with $u_i = 500$ (which should favor the speculative equilibrium for patient agents), the classifier system converges to the fundamental equilibrium. The authors write:

“Patience requires experience. The transfer system inside the classifier system is designed to converge to a set of long-run average strengths. In the limit, the artificially intelligent agents should behave as long-run average payoff maximizers... It takes time, however, for optimal rules to achieve the desired strengths. The behavior of our artificially intelligent agents can be very myopic at the beginning.”

This is a key result: the learning dynamics of the classifier system create an inherent bias toward the fundamental equilibrium.

6. Economy A2.2: High Utility with Random Classifiers + GA¶

Economy A2.2 is the random-classifier variant of Economy A2. The paper’s master table lists this economy with equilibrium type “S” (speculative), suggesting that with random initial classifiers and the GA, the system may find the speculative equilibrium — in contrast to Economy A2.1 where complete enumeration led to the fundamental equilibrium.

This is an important test: does the GA’s increased experimentation enable agents to escape the myopic lock-in that prevented the speculative equilibrium from emerging in A2.1?

Parameters are identical to A2.1 ( $u_i = 500$ , $s_1=0.1, s_2=1, s_3=20$ ) except that the classifier system uses random initial rules with the genetic algorithm.

# Economy A2.2: High utility with random classifiers + GA
config_a22 = EconomyConfig(
    name="Economy A2.2 (High Utility, Random + GA)",
    n_types=3,
    n_agents_per_type=50,
    produces=np.array([1, 2, 0]),
    storage_costs=np.array([0.1, 1.0, 20.0]),
    utility=np.array([500.0, 500.0, 500.0]),   # High utility
    n_trade_classifiers=72,
    n_consume_classifiers=12,
    bid_trade=(0.025, 0.025),
    bid_consume=(0.25, 0.25),
    use_complete_enumeration=False,   # Random initial classifiers
    use_genetic_algorithm=True,        # Enable GA
    ga_frequency=0.5,
    ga_pcross=0.6,
    ga_pmutation=0.01
)

print("Running Economy A2.2 (High Utility, Random + GA)...")
print("=" * 50)

sim_a22 = KiyotakiWrightSimulation(config_a22, seed=456)
sim_a22.run(n_periods=2000, record_every=1, verbose=True)

print("\nSimulation complete.")

Running Economy A2.2 (High Utility, Random + GA)...
==================================================
  Period   200/2000: Trades=  8, Consumptions=  0
  Period   400/2000: Trades= 33, Consumptions= 15
  Period   600/2000: Trades=  9, Consumptions=  0
  Period   800/2000: Trades= 14, Consumptions=  0
  Period  1000/2000: Trades= 32, Consumptions= 20
  Period  1200/2000: Trades= 34, Consumptions= 17
  Period  1400/2000: Trades= 33, Consumptions= 11
  Period  1600/2000: Trades=  9, Consumptions=  0
  Period  1800/2000: Trades= 33, Consumptions=  0
  Period  2000/2000: Trades= 31, Consumptions=  0

Simulation complete.

# Display results for Economy A2.2
# Paper reports at t=1000 and t=2000 for GA economies

print("=" * 70)
print("ECONOMY A2.2 RESULTS AT t=1000")
print("=" * 70)
print_full_analysis(sim_a22, period_label="at t=1000", window=10, t_idx=999)

print("\n" + "=" * 70)
print("ECONOMY A2.2 RESULTS AT t=2000")
print("=" * 70)
print_full_analysis(sim_a22, period_label="at t=2000", window=10)

# Check: is this fundamental or speculative?
dist_a22 = sim_a22._compute_holdings_dist()
print("Interpretation:")
print(f"  Type 1 holds: Good 1={dist_a22[0,0]:.3f}, Good 2={dist_a22[0,1]:.3f}, Good 3={dist_a22[0,2]:.3f}")
if dist_a22[0, 2] > 0.1:
    print("  → Type 1 agents accept Good 3 (speculative behavior)")
    print("  → Economy appears to be in or near SPECULATIVE equilibrium")
else:
    print("  → Type 1 agents do NOT accept Good 3")
    print("  → Economy appears to be in or near FUNDAMENTAL equilibrium")

# Plot distribution of holdings
fig_a22 = plot_holdings_distribution(sim_a22, record_every=1,
    title="Economy A2.2: Holdings Distribution — Random + GA with High u")

======================================================================
ECONOMY A2.2 RESULTS AT t=1000
======================================================================

Holdings Distribution at t=1000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   1.000   0.000 
  i=2       0.000   0.000   1.000 
  i=3       1.000   0.000   0.000 


Exchange Frequency π_i^e(jk) at t=1000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.00,0.32,0.34)    (0.00,0.00,0.00)  
  i=2          (0.00,0.00,0.00)    (0.00,0.00,0.00)    (0.00,0.34,0.33)  
  i=3          (0.00,0.00,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=1000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (—,—,—)             (0,—,—)             (1,—,1)           
  i=2          (1,1,1)             (—,0,0)             (1,1,1)           
  i=3          (0,0,0)             (—,—,—)             (0,—,0)           


Consumption Frequency π_i^c(j|j) at t=1000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1         —     0.000   1.000 
  i=2         —     1.000   1.000 
  i=3       0.000     —       —   


======================================================================
ECONOMY A2.2 RESULTS AT t=2000
======================================================================

Holdings Distribution at t=2000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   0.000   1.000 
  i=2       0.000   0.000   1.000 
  i=3       1.000   0.000   0.000 


Exchange Frequency π_i^e(jk) at t=2000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.70)  
  i=2          (0.00,0.00,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.65)  
  i=3          (0.00,0.00,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=2000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (—,—,—)             (0,—,—)             (1,—,1)           
  i=2          (1,1,1)             (—,0,0)             (1,1,1)           
  i=3          (0,0,0)             (—,—,—)             (0,—,0)           


Consumption Frequency π_i^c(j|j) at t=2000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1         —       —     0.000 
  i=2         —       —     1.000 
  i=3       0.000     —       —   

Interpretation:
  Type 1 holds: Good 1=0.000, Good 2=0.000, Good 3=1.000
  → Type 1 agents accept Good 3 (speculative behavior)
  → Economy appears to be in or near SPECULATIVE equilibrium

Discussion: Economy A2.2¶

The paper lists Economy A2.2 with equilibrium type “S” (speculative) in its master table (Table 8). The contrast between A2.1 and A2.2 is revealing:

A2.1 (complete enumeration): Converges to the fundamental equilibrium despite high $u$ favoring speculative
A2.2 (random + GA): The GA’s increased experimentation may enable agents to escape the myopic lock-in

This comparison highlights that the learning algorithm itself affects equilibrium selection — not just the economic parameters. The GA introduces more exploration (“experimentation”), which can help agents discover that accepting good 3 speculatively is profitable when utility is high enough to offset storage costs.

7. Economy B: Alternative Production Structure¶

Economy B uses a different production pattern:

Type I produces good 3
Type II produces good 1
Type III produces good 2

With storage costs $s_1 = 1,\ s_2 = 4,\ s_3 = 9$ and $u_i = 100$ , both fundamental and speculative equilibria are possible.

The paper reports an interesting pattern: at $t=500$ , the economy appears to be in a speculative equilibrium, but by $t=1000$ it has transitioned to the fundamental equilibrium. This provides evidence that convergence to the fundamental equilibrium is not simply due to myopic behavior.

# Economy B.1: Different production structure
config_b = EconomyConfig(
    name="Economy B.1 (Alt. Production, Complete Enum.)",
    n_types=3,
    n_agents_per_type=50,
    produces=np.array([2, 0, 1]),        # Type 1->good 3, Type 2->good 1, Type 3->good 2
    storage_costs=np.array([1.0, 4.0, 9.0]),
    utility=np.array([100.0, 100.0, 100.0]),
    n_trade_classifiers=72,
    n_consume_classifiers=12,
    bid_trade=(0.25, 0.25),
    bid_consume=(0.25, 0.25),
    use_complete_enumeration=True,
    use_genetic_algorithm=False
)

print("Running Economy B.1...")
print("=" * 50)

sim_b = KiyotakiWrightSimulation(config_b, seed=42)
sim_b.run(n_periods=1000, record_every=1, verbose=True)

print("\nSimulation complete.")

Running Economy B.1...
==================================================
  Period   100/1000: Trades= 28, Consumptions= 36
  Period   200/1000: Trades= 27, Consumptions= 33
  Period   300/1000: Trades= 24, Consumptions= 29
  Period   400/1000: Trades= 29, Consumptions= 42
  Period   500/1000: Trades= 28, Consumptions= 43
  Period   600/1000: Trades= 33, Consumptions= 35
  Period   700/1000: Trades= 26, Consumptions= 30
  Period   800/1000: Trades= 25, Consumptions= 33
  Period   900/1000: Trades= 34, Consumptions= 35
  Period  1000/1000: Trades= 32, Consumptions= 41

Simulation complete.

# Results for Economy B.1
# KEY: Paper reports speculative equilibrium at t=500, 
# fundamental equilibrium at t=1000 — an important transition!

print("=" * 70)
print("ECONOMY B.1 RESULTS AT t=500 (Paper reports speculative eq. here)")
print("=" * 70)
print_full_analysis(sim_b, period_label="at t=500", window=10, t_idx=499)

# Check if speculative at t=500
# In economy B (Model B production): Type 1→good 3, Type 2→good 1, Type 3→good 2
# Speculative: Type 3 accepts good 3 (high cost) speculatively
dist_500 = sim_b.history['holdings_dist'][499]
print(f"\n  Type 3 holding good 3 (speculative indicator): {dist_500[2,2]:.3f}")
if dist_500[2, 2] > 0.05:
    print("  → Possible speculative behavior at t=500")
else:
    print("  → Already near fundamental at t=500")

print("\n" + "=" * 70)
print("ECONOMY B.1 RESULTS AT t=1000 (Paper reports fundamental eq. here)")
print("=" * 70)
print_full_analysis(sim_b, period_label="at t=1000", window=10)

fig_b = plot_holdings_distribution(sim_b, record_every=1,
    title="Economy B.1: Distribution of Holdings (cf. Paper Figure 8)\n"
          "Paper: speculative at t=500 → fundamental at t=1000")

======================================================================
ECONOMY B.1 RESULTS AT t=500 (Paper reports speculative eq. here)
======================================================================

Holdings Distribution at t=500
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   0.260   0.740 
  i=2       1.000   0.000   0.000 
  i=3       0.500   0.500   0.000 


Exchange Frequency π_i^e(jk) at t=500
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.11,0.02,0.00)    (0.15,0.10,0.14)  
  i=2          (0.00,0.24,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  
  i=3          (0.00,0.00,0.15)    (0.13,0.00,0.10)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=500
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0,0,0)             (1,1,0)             (1,1,1)           
  i=2          (0,1,0)             (0,0,0)             (0,0,0)           
  i=3          (0,0,1)             (1,0,1)             (0,0,0)           


Consumption Frequency π_i^c(j|j) at t=500
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       1.000   0.000   0.000 
  i=2       1.000   1.000     —   
  i=3       0.000   0.000   1.000 


  Type 3 holding good 3 (speculative indicator): 0.000
  → Already near fundamental at t=500

======================================================================
ECONOMY B.1 RESULTS AT t=1000 (Paper reports fundamental eq. here)
======================================================================

Holdings Distribution at t=1000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.000   0.280   0.720 
  i=2       1.000   0.000   0.000 
  i=3       0.560   0.440   0.000 


Exchange Frequency π_i^e(jk) at t=1000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.00,0.00,0.00)    (0.09,0.04,0.00)    (0.15,0.08,0.20)  
  i=2          (0.00,0.23,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  
  i=3          (0.00,0.00,0.15)    (0.14,0.00,0.08)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=1000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0,0,0)             (1,1,0)             (1,1,1)           
  i=2          (0,1,0)             (0,0,0)             (0,0,0)           
  i=3          (0,0,1)             (1,0,1)             (0,0,0)           


Consumption Frequency π_i^c(j|j) at t=1000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       1.000   0.000   0.000 
  i=2       1.000   1.000     —   
  i=3       0.000   0.000   1.000

Discussion: Economy B.1¶

Economy B uses Model B production (Type I→Good 3, Type II→Good 1, Type III→Good 2) with storage costs $s_1=1, s_2=4, s_3=9$ . Both fundamental and speculative equilibria are possible. The paper reports an interesting dynamic: at $t=500$ the economy appears speculative, but by $t=1000$ it has moved to the fundamental equilibrium — providing evidence that fundamental selection is not merely a consequence of myopia.

8. Economy B.2: Alternative Production with Random Classifiers + GA¶

Economy B.2 uses the same Model B production structure as B.1 but starts with random classifiers and uses the genetic algorithm. The paper reports that B.2 “had not converged after 2000 periods” but was “moving towards the fundamental equilibrium.”

This is a challenging test for the GA — with two possible equilibria and the alternative production structure, convergence is slower.

# Economy B.2: Model B production with random classifiers + GA
config_b2 = EconomyConfig(
    name="Economy B.2 (Alt. Production, Random + GA)",
    n_types=3,
    n_agents_per_type=50,
    produces=np.array([2, 0, 1]),        # Model B: Type 1->good 3, Type 2->good 1, Type 3->good 2
    storage_costs=np.array([1.0, 4.0, 9.0]),
    utility=np.array([100.0, 100.0, 100.0]),
    n_trade_classifiers=72,
    n_consume_classifiers=12,
    bid_trade=(0.025, 0.025),    # Paper Table: b11=0.025, b12=0.025 for B.2
    bid_consume=(0.25, 0.25),
    use_complete_enumeration=False,   # Random initial classifiers
    use_genetic_algorithm=True,        # Enable GA
    ga_frequency=0.5,
    ga_pcross=0.6,
    ga_pmutation=0.01
)

print("Running Economy B.2 (Alt. Production, Random + GA)...")
print("=" * 50)

sim_b2 = KiyotakiWrightSimulation(config_b2, seed=789)
sim_b2.run(n_periods=2000, record_every=1, verbose=True)

print("\nSimulation complete.")

Running Economy B.2 (Alt. Production, Random + GA)...
==================================================
  Period   200/2000: Trades= 19, Consumptions= 11
  Period   400/2000: Trades= 46, Consumptions=  4
  Period   600/2000: Trades=  8, Consumptions=  0
  Period   800/2000: Trades= 12, Consumptions=  0
  Period  1000/2000: Trades= 22, Consumptions= 16
  Period  1200/2000: Trades= 15, Consumptions=  0
  Period  1400/2000: Trades=  9, Consumptions=  0
  Period  1600/2000: Trades=  8, Consumptions=  0
  Period  1800/2000: Trades= 26, Consumptions= 19
  Period  2000/2000: Trades= 15, Consumptions=  0

Simulation complete.

# Display results for Economy B.2
# Paper reports at t=1000 and t=2000; notes it "had not converged" by t=2000

print("=" * 70)
print("ECONOMY B.2 RESULTS AT t=1000")
print("=" * 70)
print_full_analysis(sim_b2, period_label="at t=1000", window=10, t_idx=999)

print("\n" + "=" * 70)
print("ECONOMY B.2 RESULTS AT t=2000 (Paper: 'not converged, moving toward fundamental')")
print("=" * 70)
print_full_analysis(sim_b2, period_label="at t=2000", window=10)

# Compare with B.1
print("\nComparison B.1 (complete enum., t=1000) vs B.2 (random+GA, t=2000):")
dist_b1 = sim_b._compute_holdings_dist()
dist_b2 = sim_b2._compute_holdings_dist()
for i in range(3):
    b1_vals = ' '.join([f'{dist_b1[i,k]:.3f}' for k in range(3)])
    b2_vals = ' '.join([f'{dist_b2[i,k]:.3f}' for k in range(3)])
    print(f"  Type {i+1}: B.1=[{b1_vals}]  B.2=[{b2_vals}]")

# Plot
fig_b2 = plot_holdings_distribution(sim_b2, record_every=1,
    title="Economy B.2: Holdings Distribution — Random + GA\n"
          "Paper: 'not converged after 2000 periods, moving toward fundamental'")

======================================================================
ECONOMY B.2 RESULTS AT t=1000
======================================================================

Holdings Distribution at t=1000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.980   0.020   0.000 
  i=2       1.000   0.000   0.000 
  i=3       0.000   1.000   0.000 


Exchange Frequency π_i^e(jk) at t=1000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.03,0.05,0.00)    (0.11,0.00,0.00)    (0.00,0.00,0.00)  
  i=2          (0.09,0.42,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  
  i=3          (0.00,0.00,0.00)    (0.36,0.32,0.00)    (0.00,0.00,0.00)  


Winning Classifier Actions π̃_i^e(jk|j) at t=1000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (1,1,0)             (—,—,—)             (0,1,0)           
  i=2          (1,1,1)             (—,—,—)             (—,1,—)           
  i=3          (1,1,1)             (1,1,1)             (0,1,1)           


Consumption Frequency π_i^c(j|j) at t=1000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       0.000   0.000     —   
  i=2       1.000   1.000     —   
  i=3       1.000   1.000     —   


======================================================================
ECONOMY B.2 RESULTS AT t=2000 (Paper: 'not converged, moving toward fundamental')
======================================================================

Holdings Distribution at t=2000
=====================================
  π_i^h(j)   j=1     j=2     j=3  
-------------------------------------
  i=1       0.140   0.000   0.860 
  i=2       1.000   0.000   0.000 
  i=3       0.000   0.000   1.000 


Exchange Frequency π_i^e(jk) at t=2000
================================================================================
   π_i^e(jk)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (0.06,0.00,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  
  i=2          (0.38,0.00,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.00)  
  i=3          (0.00,0.00,0.00)    (0.00,0.00,0.00)    (0.00,0.00,0.34)  


Winning Classifier Actions π̃_i^e(jk|j) at t=2000
================================================================================
π̃_i^e(jk|j)       j=1                 j=2                 j=3          
--------------------------------------------------------------------------------
  i=1          (1,1,0)             (—,—,—)             (0,1,0)           
  i=2          (1,1,1)             (—,—,—)             (—,1,—)           
  i=3          (1,1,1)             (1,1,1)             (0,1,1)           


Consumption Frequency π_i^c(j|j) at t=2000
=======================================
π_i^c(j|j)   j=1     j=2     j=3  
---------------------------------------
  i=1       0.000     —     1.000 
  i=2       1.000     —       —   
  i=3         —       —     0.000 


Comparison B.1 (complete enum., t=1000) vs B.2 (random+GA, t=2000):
  Type 1: B.1=[0.000 0.280 0.720]  B.2=[0.140 0.000 0.860]
  Type 2: B.1=[1.000 0.000 0.000]  B.2=[1.000 0.000 0.000]
  Type 3: B.1=[0.560 0.440 0.000]  B.2=[0.000 0.000 1.000]

Discussion: Economy B.2¶

As the paper reports, Economy B.2 with random initial classifiers converges more slowly than B.1. The paper noted it “had not converged after 2000 periods” but was “moving towards the fundamental equilibrium.”

Together, the B.1 and B.2 results provide evidence that the classifier systems select the fundamental equilibrium over the speculative equilibrium — and crucially, this selection is not merely due to early myopic behavior (since B.1 starts near the speculative equilibrium and moves away from it).

9. Economy C: Fiat Money¶

Economy C adds a fourth good — fiat money (good 0) — that:

Has zero storage cost ( $s_0 = 0$ )
Provides no utility to any agent
Cannot be consumed

Fiat money is introduced by randomly allocating units to some agents at $t=0$ . The key question is whether agents will learn to use this intrinsically worthless good as a medium of exchange, purely because it is costless to store.

The paper shows remarkably fast convergence: agents learn to use fiat money as a medium of exchange, accepting it in trade even though it provides no direct utility.

Storage costs: $s_0 = 0,\ s_1 = 9,\ s_2 = 14,\ s_3 = 29$

# Economy C: Fiat Money (4 goods)
# Good 0, 1, 2 = commodities; Good 3 = fiat money (zero storage, can't consume)
# Uses the same classifier framework with proper 2-bit encoding for 4 goods
#
# GA parameters: Economy C uses ga_pcross=0.6, ga_pmutation=0.01, matching the
# universal parameter file (winitial.m) from the original MATLAB code. The MATLAB
# class003.m calls winitial with no overrides to pcrosst/pmutationt.

class FiatMoneyAgent:
    """
    Classifier agent for 4-good economy.
    Fixes n_bits=2 for 4 goods (since encode_good(n_goods=4) uses 2 bits).
    Uses random classifiers with GA support.
    """
    def __init__(self, type_id: int, n_trade_cls: int = 150, 
                 n_consume_cls: int = 20,
                 bid_trade: tuple = (0.025, 0.025),
                 bid_consume: tuple = (0.025, 0.25)):
        self.type_id = type_id
        self.n_goods = 4
        self.n_bits = 2  # 2 bits sufficient: [1,0],[0,1],[0,0],[1,1]
        self.bid_trade = bid_trade
        self.bid_consume = bid_consume
        
        # Random classifiers (as in the paper for Economy C)
        self.trade_classifiers = create_random_classifier_set(
            n_trade_cls, 2 * self.n_bits)  # condition = 4 trits
        self.consume_classifiers = create_random_classifier_set(
            n_consume_cls, self.n_bits)     # condition = 2 trits
        
        self.last_consume_winner = None
    
    def get_trade_decision(self, own_good: int, partner_good: int):
        own_enc = encode_good(own_good, self.n_goods)
        partner_enc = encode_good(partner_good, self.n_goods)
        state = np.concatenate([own_enc, partner_enc])
        
        matches = [(i, c) for i, c in enumerate(self.trade_classifiers) 
                   if c.matches(state)]
        
        if not matches:
            # Creation operator: replace redundant/weak classifier
            new_idx = create_classifier_replacing_weakest(self.trade_classifiers, state)
            return self.trade_classifiers[new_idx].action, new_idx
        
        best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
        return best_cls.action, best_idx
    
    def get_consume_decision(self, holding: int):
        state = encode_good(holding, self.n_goods)
        matches = [(i, c) for i, c in enumerate(self.consume_classifiers) 
                   if c.matches(state)]
        
        if not matches:
            # Creation operator: replace redundant/weak classifier
            new_idx = create_classifier_replacing_weakest(self.consume_classifiers, state)
            return self.consume_classifiers[new_idx].action, new_idx
        
        best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
        return best_cls.action, best_idx
    
    def update_strengths(self, trade_winner_idx, consume_winner_idx,
                         external_payoff, trade_happened):
        b11, b12 = self.bid_trade
        b21, b22 = self.bid_consume
        
        e_cls = self.trade_classifiers[trade_winner_idx]
        c_cls = self.consume_classifiers[consume_winner_idx]
        
        b1_e = b11 + b12 * e_cls.specificity
        b2_c = b21 + b22 * c_cls.specificity
        
        # Exchange classifier update (eq. 11): use counter BEFORE incrementing
        if trade_happened or e_cls.action == 0:
            tau_e = e_cls.n_used  # Counter before increment
            e_cls.n_used += 1
            payment_from_consume = b2_c * c_cls.strength
            e_cls.strength = e_cls.strength - (1.0 / max(tau_e, 1)) * (
                (1 + b1_e) * e_cls.strength - payment_from_consume)
        
        # Consumption classifier update (eq. 10): denominator is τ_c - 1 = old counter
        tau_c_old = c_cls.n_used  # = τ_c - 1 in paper notation
        c_cls.n_used += 1
        payment_from_exchange = b1_e * e_cls.strength if (trade_happened or e_cls.action == 0) else 0.0
        c_cls.strength = c_cls.strength - (1.0 / max(tau_c_old, 1)) * (
            (1 + b2_c) * c_cls.strength - payment_from_exchange - external_payoff)
        
        # Pay previous consume classifier from current exchange classifier
        if self.last_consume_winner is not None and (trade_happened or e_cls.action == 0):
            prev_c = self.consume_classifiers[self.last_consume_winner]
            payment = b1_e * e_cls.strength
            prev_c.strength += payment / max(prev_c.n_used, 1)
        
        self.last_consume_winner = consume_winner_idx


class FiatMoneySimulation:
    """
    Kiyotaki-Wright economy with fiat money using proper classifier agents.
    
    Goods 0-2 are commodities, Good 3 is fiat money.
    """
    def __init__(self, n_agents_per_type=50, storage_costs=None, utility=100.0,
                 n_fiat=48, n_trade_cls=150, n_consume_cls=20,
                 bid_trade=(0.025, 0.025), bid_consume=(0.025, 0.25),
                 ga_frequency=0.5, ga_pcross=0.6, ga_pmutation=0.01,
                 ga_propselect=0.2, ga_propused=0.7,
                 ga_crowdfactor_trade=8, ga_crowdfactor_consume=4,
                 ga_crowdsubpop=0.5, ga_uratio=(0.0, 0.2),
                 seed=None):
        if seed is not None:
            np.random.seed(seed)
        
        self.n_types = 3
        self.n_goods = 4
        self.n_agents_per_type = n_agents_per_type
        self.n_agents = 3 * n_agents_per_type
        self.produces = np.array([1, 2, 0])  # Same Wicksell triangle
        
        if storage_costs is None:
            self.storage_costs = np.array([9.0, 14.0, 29.0, 0.0])
        else:
            self.storage_costs = storage_costs
        
        self.utility = np.array([utility] * 3)
        self.ga_frequency = ga_frequency
        self.ga_pcross = ga_pcross
        self.ga_pmutation = ga_pmutation
        self.ga_propselect = ga_propselect
        self.ga_propused = ga_propused
        self.ga_crowdfactor_trade = ga_crowdfactor_trade
        self.ga_crowdfactor_consume = ga_crowdfactor_consume
        self.ga_crowdsubpop = ga_crowdsubpop
        self.ga_uratio = ga_uratio
        
        # Agent types
        self.agent_types = np.repeat(np.arange(3), n_agents_per_type)
        
        # Classifier agents (one per type, shared by all agents of that type)
        self.agents = [
            FiatMoneyAgent(i, n_trade_cls, n_consume_cls, bid_trade, bid_consume) 
            for i in range(3)
        ]
        
        # Initialize: each agent holds their production good
        self.holdings = np.zeros(self.n_agents, dtype=int)
        for i in range(3):
            self.holdings[self.agent_types == i] = self.produces[i]
        
        # Inject fiat money (good 3) to random agents
        fiat_agents = np.random.choice(self.n_agents, size=n_fiat, replace=False)
        self.holdings[fiat_agents] = 3
        
        self.history = {'holdings_dist': [], 'trade_rates': [], 'consumption_rates': [],
                        'exchange_counts': [], 'consumption_counts': []}
    
    def run(self, n_periods=2000, record_every=1, verbose=True):
        # Pre-compute GA schedule matching MATLAB winitial.m:
        # GA fires only on EVEN iterations with probability 1/sqrt(k/2).
        pga = 1.0 / np.sqrt(np.arange(1, n_periods // 2 + 1))
        self._ga_schedule = np.zeros(n_periods, dtype=bool)
        even_indices = np.arange(1, n_periods, 2)  # 0-indexed even iterations
        self._ga_schedule[even_indices] = pga[:len(even_indices)] > np.random.rand(len(even_indices))
        
        for t in range(1, n_periods + 1):
            trades = 0
            consumptions = 0
            exchange_count = np.zeros((self.n_types, self.n_goods, self.n_goods), dtype=int)
            consumption_count = np.zeros((self.n_types, self.n_goods), dtype=int)
            
            perm = np.random.permutation(self.n_agents)
            n_pairs = self.n_agents // 2
            
            for p in range(n_pairs):
                a1, a2 = perm[2*p], perm[2*p+1]
                t1, t2 = self.agent_types[a1], self.agent_types[a2]
                g1, g2 = self.holdings[a1], self.holdings[a2]
                agent1, agent2 = self.agents[t1], self.agents[t2]
                
                # Trade decisions
                act1, tw1 = agent1.get_trade_decision(g1, g2)
                act2, tw2 = agent2.get_trade_decision(g2, g1)
                
                trade_happens = (act1 == 1) and (act2 == 1) and (g1 != g2)
                if trade_happens:
                    self.holdings[a1], self.holdings[a2] = g2, g1
                    trades += 1
                    exchange_count[t1, g1, g2] += 1
                    exchange_count[t2, g2, g1] += 1
                
                pg1, pg2 = self.holdings[a1], self.holdings[a2]
                
                # Consumption decisions
                ca1, cw1 = agent1.get_consume_decision(pg1)
                ca2, cw2 = agent2.get_consume_decision(pg2)
                
                payoff1 = self._process_consumption(a1, t1, pg1, ca1)
                if ca1 == 1:
                    consumption_count[t1, pg1] += 1
                    if pg1 == t1:
                        consumptions += 1
                payoff2 = self._process_consumption(a2, t2, pg2, ca2)
                if ca2 == 1:
                    consumption_count[t2, pg2] += 1
                    if pg2 == t2:
                        consumptions += 1
                
                # Bucket brigade updates
                agent1.update_strengths(tw1, cw1, payoff1, trade_happens or act1 == 0)
                agent2.update_strengths(tw2, cw2, payoff2, trade_happens or act2 == 0)
            
            # GA schedule matching MATLAB winitial.m:
            # Even iterations only, probability 1/sqrt(k/2)
            ga_idx = t - 1
            run_ga = (ga_idx < len(self._ga_schedule) and self._ga_schedule[ga_idx])
            
            if run_ga:
                # Random type selection matching MATLAB:
                # Always send 1 type, with p=0.33 send a 2nd, with p=0.33 send a 3rd
                types_to_send = [np.random.randint(self.n_types)]
                remaining = [i for i in range(self.n_types) if i != types_to_send[0]]
                if np.random.rand() < 0.33 and remaining:
                    second = remaining[np.random.randint(len(remaining))]
                    types_to_send.append(second)
                    remaining = [i for i in remaining if i != second]
                    if np.random.rand() < 0.33 and remaining:
                        types_to_send.append(remaining[np.random.randint(len(remaining))])
                
                for type_idx in types_to_send:
                    agent = self.agents[type_idx]
                    agent.trade_classifiers = apply_genetic_algorithm(
                        agent.trade_classifiers,
                        pcross=self.ga_pcross, pmutation=self.ga_pmutation,
                        propselect=self.ga_propselect, propused=self.ga_propused,
                        crowding_factor=self.ga_crowdfactor_trade,
                        crowding_subpop=self.ga_crowdsubpop,
                        uratio=self.ga_uratio)
                
                # Separate random type selection for consume classifiers
                types_to_send_c = [np.random.randint(self.n_types)]
                remaining_c = [i for i in range(self.n_types) if i != types_to_send_c[0]]
                if np.random.rand() < 0.33 and remaining_c:
                    second_c = remaining_c[np.random.randint(len(remaining_c))]
                    types_to_send_c.append(second_c)
                    remaining_c = [i for i in remaining_c if i != second_c]
                    if np.random.rand() < 0.33 and remaining_c:
                        types_to_send_c.append(remaining_c[np.random.randint(len(remaining_c))])
                
                for type_idx in types_to_send_c:
                    agent = self.agents[type_idx]
                    agent.consume_classifiers = apply_genetic_algorithm(
                        agent.consume_classifiers,
                        pcross=self.ga_pcross, pmutation=self.ga_pmutation,
                        propselect=self.ga_propselect, propused=self.ga_propused,
                        crowding_factor=self.ga_crowdfactor_consume,
                        crowding_subpop=self.ga_crowdsubpop,
                        uratio=self.ga_uratio)
            
            # Specialization: convert wildcards to specific bits (Section 6)
            for agent in self.agents:
                apply_specialization(agent.trade_classifiers, t)
                apply_specialization(agent.consume_classifiers, t)
            
            # --- Tax mechanism (matching MATLAB class003) ---
            # Tax applied to trade classifier strengths each period
            tax_rate = 0.0001
            for agent in self.agents:
                for cls in agent.trade_classifiers:
                    cls.strength -= tax_rate * abs(cls.strength)
            
            if t % record_every == 0:
                self.history['holdings_dist'].append(self._compute_dist())
                self.history['trade_rates'].append(trades)
                self.history['consumption_rates'].append(consumptions)
                self.history['exchange_counts'].append(exchange_count)
                self.history['consumption_counts'].append(consumption_count)
            
            if verbose and t % max(1, n_periods // 10) == 0:
                print(f"  Period {t:5d}/{n_periods}: Trades={trades:3d}, Cons={consumptions:3d}")
    
    def _process_consumption(self, agent_idx, type_id, holding, action):
        """
        Process consumption: type i consumes good i, can't consume fiat (good 3).
        
        Per the paper: agents are not allowed to consume good 0 (fiat money).
        Agents CAN attempt to consume the wrong commodity good (getting 0 utility).
        """
        if action == 1 and holding != 3:  # Can consume any commodity but not fiat
            # Consume: utility is u_i if holding own good, else 0
            utility = self.utility[type_id] if holding == type_id else 0.0
            new_good = self.produces[type_id]
            self.holdings[agent_idx] = new_good
            return utility - self.storage_costs[new_good]
        else:
            return -self.storage_costs[holding]
    
    def _compute_dist(self):
        dist = np.zeros((3, 4))
        for i in range(3):
            mask = self.agent_types == i
            for k in range(4):
                dist[i, k] = np.mean(self.holdings[mask] == k)
        return dist


print("FiatMoneySimulation class defined (with proper classifier system).")

FiatMoneySimulation class defined (with proper classifier system).

# Run Economy C  
# GA parameters: pcross=0.6, pmutation=0.01 — matching MATLAB winitial.m
# (Previously had pcross=0.01, pmutation=0.2 which was a bug — confirmed by
#  reading original/matlab/class003.m which calls winitial with no overrides)
print("Running Economy C (Fiat Money)...")
print("=" * 50)

sim_c = FiatMoneySimulation(
    n_agents_per_type=50,
    storage_costs=np.array([9.0, 14.0, 29.0, 0.0]),  # goods 0-2 = commodities, good 3 = fiat ($0)
    utility=100.0,
    n_fiat=48,
    n_trade_cls=150,
    n_consume_cls=20,
    ga_frequency=0.5,
    ga_pcross=0.6,       # Fixed: was 0.01, now matches MATLAB winitial.m (pcrosst=0.6)
    ga_pmutation=0.01,   # Fixed: was 0.2, now matches MATLAB winitial.m (pmutationt=0.01)
    seed=42
)
sim_c.run(n_periods=2000, record_every=1, verbose=True)

print("\nSimulation complete.")

Running Economy C (Fiat Money)...
==================================================
  Period   200/2000: Trades= 34, Cons= 33
  Period   400/2000: Trades= 20, Cons= 22
  Period   600/2000: Trades= 30, Cons= 26
  Period   800/2000: Trades= 37, Cons= 27
  Period  1000/2000: Trades= 31, Cons= 31
  Period  1200/2000: Trades= 46, Cons= 39
  Period  1400/2000: Trades= 35, Cons= 31
  Period  1600/2000: Trades= 27, Cons= 19
  Period  1800/2000: Trades= 40, Cons= 29
  Period  2000/2000: Trades= 31, Cons= 25

Simulation complete.

# Results for Economy C (Fiat Money)
# Paper reports at t=750 and t=1250 (Tables 13-16)

good_labels_c = ['Good 1', 'Good 2', 'Good 3', 'Fiat $']

def print_fiat_holdings(sim, t_idx, label):
    dist = sim.history['holdings_dist'][t_idx]
    print(f"\nHoldings Distribution {label}")
    print(f"{'π_i^h(j)':>10s}   j=1      j=2      j=3      j=fiat")
    print("-" * 55)
    for i in range(3):
        print(f"  i={i+1:d}       {dist[i,0]:.3f}    {dist[i,1]:.3f}    {dist[i,2]:.3f}    {dist[i,3]:.3f}")

def print_fiat_exchange_freq(sim, t_idx, window, label):
    n = min(window, t_idx + 1)
    recent = sim.history['exchange_counts'][t_idx - n + 1:t_idx + 1]
    avg_counts = np.mean(recent, axis=0)
    freq = avg_counts / sim.n_agents_per_type
    print(f"\nExchange Frequency π_i^e(jk) {label}")
    print("-" * 80)
    for i in range(3):
        row = f"  i={i+1:d}  "
        for j in range(4):
            quad = "(" + ",".join([f"{freq[i,j,k]:.2f}" for k in range(4)]) + ")"
            row += f"  {quad:<22s}"
        print(row)

def print_fiat_winning_actions(sim, label):
    """Print winning classifier actions for 4-good economy."""
    print(f"\nWinning Classifier Actions π̃_i^e(jk|j) {label}")
    print("-" * 80)
    for i in range(3):
        agent = sim.agents[i]
        row = f"  i={i+1:d}  "
        for j in range(4):
            actions = []
            for k in range(4):
                own_enc = encode_good(j, 4)
                partner_enc = encode_good(k, 4)
                state = np.concatenate([own_enc, partner_enc])
                matches = [(idx, c) for idx, c in enumerate(agent.trade_classifiers) 
                           if c.matches(state)]
                if matches:
                    best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
                    actions.append(str(best_cls.action))
                else:
                    actions.append("—")
            quad = "(" + ",".join(actions) + ")"
            row += f"  {quad:<22s}"
        print(row)

# --- Results at t=750 (cf. Paper Tables 13-16) ---
print("=" * 80)
print("ECONOMY C RESULTS AT t=750 (cf. Paper Tables 13-16)")
print("=" * 80)
print_fiat_holdings(sim_c, 749, "at t=750")
print_fiat_exchange_freq(sim_c, 749, 10, "at t=750")
print_fiat_winning_actions(sim_c, "at t=750")

# --- Results at t=1250 ---
print("\n" + "=" * 80)
print("ECONOMY C RESULTS AT t=1250 (cf. Paper Tables 13-16)")
print("=" * 80)
print_fiat_holdings(sim_c, 1249, "at t=1250")
print_fiat_exchange_freq(sim_c, 1249, 10, "at t=1250")
print_fiat_winning_actions(sim_c, "at t=1250")

# Plot (replicates Figure 9)
history_c = np.array(sim_c.history['holdings_dist'])  # (T, 3, 4)
T = len(history_c)
time = np.arange(T)

fig, axes = plt.subplots(1, 3, figsize=(15, 4), sharey=True)
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', 'gold']
linestyles = ['--', ':', '-.', '-']

for i, ax in enumerate(axes):
    for k in range(4):
        ax.plot(time, history_c[:, i, k] * 100, 
                color=colors[k], linestyle=linestyles[k],
                linewidth=1.5, label=good_labels_c[k])
    ax.set_xlabel('Period')
    ax.set_ylabel('% Holding' if i == 0 else '')
    ax.set_title(f'Type {i+1} Agent')
    ax.legend(loc='best', fontsize=8)
    ax.set_ylim(-5, 105)
    ax.grid(True, alpha=0.3)

fig.suptitle('Economy C: Emergence of Fiat Money (cf. Figure 9 in paper)', 
             fontsize=13, fontweight='bold')
plt.tight_layout()
plt.show()

================================================================================
ECONOMY C RESULTS AT t=750 (cf. Paper Tables 13-16)
================================================================================

Holdings Distribution at t=750
  π_i^h(j)   j=1      j=2      j=3      j=fiat
-------------------------------------------------------
  i=1       0.000    0.680    0.000    0.320
  i=2       0.160    0.000    0.540    0.300
  i=3       0.260    0.400    0.000    0.340

Exchange Frequency π_i^e(jk) at t=750
--------------------------------------------------------------------------------
  i=1    (0.00,0.00,0.00,0.00)   (0.07,0.00,0.13,0.12)   (0.00,0.00,0.00,0.00)   (0.05,0.00,0.08,0.00) 
  i=2    (0.00,0.01,0.01,0.00)   (0.00,0.00,0.00,0.00)   (0.04,0.22,0.00,0.14)   (0.00,0.12,0.04,0.00) 
  i=3    (0.00,0.06,0.03,0.05)   (0.00,0.00,0.08,0.07)   (0.00,0.00,0.00,0.00)   (0.00,0.07,0.02,0.00) 

Winning Classifier Actions π̃_i^e(jk|j) at t=750
--------------------------------------------------------------------------------
  i=1    (1,—,—,—)               (1,0,1,1)               (1,—,—,—)               (1,0,1,1)             
  i=2    (—,1,1,1)               (—,1,—,—)               (1,1,1,1)               (0,1,1,1)             
  i=3    (1,1,1,1)               (0,1,1,1)               (1,0,1,—)               (0,0,1,0)             

================================================================================
ECONOMY C RESULTS AT t=1250 (cf. Paper Tables 13-16)
================================================================================

Holdings Distribution at t=1250
  π_i^h(j)   j=1      j=2      j=3      j=fiat
-------------------------------------------------------
  i=1       0.000    0.740    0.000    0.260
  i=2       0.380    0.000    0.280    0.340
  i=3       0.640    0.000    0.000    0.360

Exchange Frequency π_i^e(jk) at t=1250
--------------------------------------------------------------------------------
  i=1    (0.00,0.00,0.00,0.00)   (0.22,0.00,0.06,0.12)   (0.00,0.00,0.00,0.00)   (0.13,0.00,0.00,0.00) 
  i=2    (0.00,0.08,0.04,0.09)   (0.00,0.00,0.00,0.00)   (0.13,0.06,0.00,0.08)   (0.07,0.06,0.02,0.00) 
  i=3    (0.00,0.15,0.09,0.11)   (0.00,0.00,0.00,0.00)   (0.00,0.00,0.00,0.00)   (0.00,0.06,0.06,0.00) 

Winning Classifier Actions π̃_i^e(jk|j) at t=1250
--------------------------------------------------------------------------------
  i=1    (1,—,—,—)               (1,0,1,1)               (1,—,—,—)               (1,0,1,1)             
  i=2    (—,1,1,1)               (—,1,—,—)               (1,1,1,1)               (0,1,1,1)             
  i=3    (1,1,1,1)               (0,1,1,1)               (1,0,1,—)               (0,0,1,0)

Discussion: Economy C (Fiat Money)¶

The paper reports that “the economy converges remarkably fast to the fundamental equilibrium” with fiat money. Agents learn to accept the intrinsically worthless fiat money in trade because:

Fiat money has zero storage cost, making it preferable to hold
Other agents learn to accept it, creating a self-reinforcing pattern
The classifier system naturally credits rules that lead to lower storage costs

This demonstrates the ability of artificially intelligent agents to discover and sustain complex social arrangements like fiat money.

10. Economy D: Five Goods, Five Types — The “Triumph” of the Paper¶

Economy D is the most complex and arguably most important economy in the paper. It involves 5 agent types and 5 goods, with the production structure:

Type $i$	Produces	Consumes
I	Good 3	Good 1
II	Good 4	Good 2
III	Good 5	Good 3
IV	Good 1	Good 4
V	Good 2	Good 5

Storage costs: $s_1 = 1,\ s_2 = 4,\ s_3 = 9,\ s_4 = 16,\ s_5 = 30$

Utility: $u_i = 200$ for all $i$

Why This Economy Matters¶

The authors write:

“For this economy we do not start with any characterization of a stationary equilibrium. With enough work, one could obtain an analytic solution for the fundamental equilibrium for such an economy, but here our purpose is to let our artificially intelligent agents suggest to us what an equilibrium might be.”

The classifier systems discovered fundamental-like trading patterns — a Nash equilibrium that the authors had not analytically derived beforehand. They could only verify it ex post. This demonstrated that classifier systems could serve as equilibrium-discovery tools, not just equilibrium-verification tools.

The discovered pattern: agents trade only for goods with lower storage cost than what they currently hold, except they always accept their own consumption good.

class FiveGoodAgent:
    """
    Classifier agent for the 5-good economy (Economy D).
    
    Uses 3-bit encoding for 5 goods:
        Good 0 (paper's Good 1) -> [1, 0, 0]
        Good 1 (paper's Good 2) -> [0, 1, 0]
        Good 2 (paper's Good 3) -> [0, 0, 1]
        Good 3 (paper's Good 4) -> [1, 1, 0]
        Good 4 (paper's Good 5) -> [1, 0, 1]
    
    Parameters match the paper's Table for Economy D:
        E_a = 180 exchange classifiers, C_a = 20 consumption classifiers
    
    Implements class004.m-style diversification and specialization operators
    from the original MATLAB code.
    """
    def __init__(self, type_id: int, n_trade_cls: int = 180,
                 n_consume_cls: int = 20,
                 bid_trade: tuple = (0.025, 0.025),
                 bid_consume: tuple = (0.25, 0.25),
                 ufitness: float = 0.5):
        self.type_id = type_id
        self.n_goods = 5
        self.n_bits = 3  # 3 bits for 5 goods
        self.bid_trade = bid_trade
        self.bid_consume = bid_consume
        self.ufitness = ufitness  # Threshold for identifying low-usage "loser" rules
        
        # Random classifiers (paper uses R = random for Economy D)
        self.trade_classifiers = create_random_classifier_set(
            n_trade_cls, 2 * self.n_bits)  # condition = 6 trits (own good + partner good)
        self.consume_classifiers = create_random_classifier_set(
            n_consume_cls, self.n_bits)     # condition = 3 trits (own good)
        
        self.last_consume_winner = None
    
    def _class004_diversification(self, classifiers, matches, winner_idx, winner_cls, iteration=0):
        """
        Class004-style diversification (matching MATLAB class004.m step f/m).
        
        If ALL matching classifiers have the same action as the winner,
        replace the weakest LOW-USAGE classifier (a "loser") with a COPY of
        the winner but with the OPPOSITE action. This preserves the winner's
        general condition pattern (which may contain wildcards).
        
        This is fundamentally different from the old approach which replaced
        a matching classifier with an exact-state rule.
        """
        if not matches or len(matches) < 1:
            return
        
        # Check if all matching classifiers have the same action
        actions = set(c.action for _, c in matches)
        if len(actions) > 1:
            return  # Both actions already represented — no diversification needed
        
        # Find low-usage "loser" classifiers (usage < ufitness * (1 + max_usage))
        max_usage = max(c.n_used for c in classifiers) + 1
        losers = [(i, c) for i, c in enumerate(classifiers)
                  if c.n_used / max_usage < self.ufitness]
        
        if not losers:
            return
        
        # Replace weakest loser with copy of winner but opposite action
        weakest_loser_idx, _ = min(losers, key=lambda x: x[1].strength)
        new_cls = Classifier(
            winner_cls.condition.copy(),
            1 - winner_cls.action,
            winner_cls.strength
        )
        new_cls.n_used = winner_cls.n_used
        classifiers[weakest_loser_idx] = new_cls
    
    def _class004_specialization(self, classifiers, winner_idx, winner_cls, 
                                  pmutation=0.01):
        """
        Class004-style specialization (matching MATLAB class004.m step h/o).
        
        For winning rules with wildcards and high usage, create a specialized 
        copy by converting some wildcards to specific values. The specialized
        copy replaces the weakest low-usage classifier.
        """
        condition = winner_cls.condition
        cond_with_action = np.append(condition, winner_cls.action)
        
        # Count wildcards (checking only condition bits that don't match any exact good)
        n_wildcards = np.sum(condition == -1)
        if n_wildcards == 0:
            return
        
        # Check if winner has high enough usage
        max_usage_among_matches = max(c.n_used for c in classifiers) + 1
        if winner_cls.n_used / max_usage_among_matches <= self.ufitness:
            return
        
        # Randomly select wildcard positions to specialize (with pmutation probability)
        wildcard_mask = (cond_with_action < 0)
        specialize_mask = wildcard_mask & (np.random.rand(len(cond_with_action)) < pmutation)
        
        if not np.any(specialize_mask):
            return
        
        # Find low-usage losers
        max_usage = max(c.n_used for c in classifiers) + 1
        losers = [(i, c) for i, c in enumerate(classifiers)
                  if c.n_used / max_usage < self.ufitness]
        
        if not losers:
            return
        
        weakest_idx, _ = min(losers, key=lambda x: x[1].strength)
        
        # Create specialized copy: keep non-wildcard bits, randomize selected wildcards
        new_cond = condition.copy()
        for j in range(len(condition)):
            if specialize_mask[j]:
                new_cond[j] = float(np.random.randint(0, 2))
        
        new_action = winner_cls.action
        if specialize_mask[-1]:  # Action bit was also selected
            new_action = np.random.randint(0, 2)
        
        classifiers[weakest_idx] = Classifier(new_cond, new_action, winner_cls.strength)
        classifiers[weakest_idx].n_used = winner_cls.n_used
    
    def get_trade_decision(self, own_good: int, partner_good: int, 
                           do_specialize: bool = False, pmutation: float = 0.01):
        own_enc = encode_good(own_good, self.n_goods)
        partner_enc = encode_good(partner_good, self.n_goods)
        state = np.concatenate([own_enc, partner_enc])
        
        matches = [(i, c) for i, c in enumerate(self.trade_classifiers) 
                   if c.matches(state)]
        
        if not matches:
            # Creation operator: replace redundant/weak classifier
            new_idx = create_classifier_replacing_weakest(self.trade_classifiers, state)
            return self.trade_classifiers[new_idx].action, new_idx
        
        # Find winner by highest strength (class004 step f)
        best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
        
        # Class004 diversification: replace low-usage loser with winner+opposite action
        self._class004_diversification(
            self.trade_classifiers, matches, best_idx, best_cls)
        
        # Class004 specialization: create specialized copies of winning rules
        if do_specialize:
            self._class004_specialization(
                self.trade_classifiers, best_idx, best_cls, pmutation)
        
        # Re-find matches and winner (diversification may have added new matching rules)
        matches = [(i, c) for i, c in enumerate(self.trade_classifiers) 
                   if c.matches(state)]
        best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
        
        return best_cls.action, best_idx
    
    def get_consume_decision(self, holding: int, 
                             do_specialize: bool = False, pmutation: float = 0.01):
        state = encode_good(holding, self.n_goods)
        matches = [(i, c) for i, c in enumerate(self.consume_classifiers) 
                   if c.matches(state)]
        
        if not matches:
            # Creation operator: replace redundant/weak classifier
            new_idx = create_classifier_replacing_weakest(self.consume_classifiers, state)
            return self.consume_classifiers[new_idx].action, new_idx
        
        # Find winner by highest strength
        best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
        
        # Class004 diversification
        self._class004_diversification(
            self.consume_classifiers, matches, best_idx, best_cls)
        
        # Class004 specialization
        if do_specialize:
            self._class004_specialization(
                self.consume_classifiers, best_idx, best_cls, pmutation)
        
        # Re-find matches and winner
        matches = [(i, c) for i, c in enumerate(self.consume_classifiers) 
                   if c.matches(state)]
        best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
        
        return best_cls.action, best_idx
    
    def update_strengths(self, trade_winner_idx, consume_winner_idx,
                         external_payoff, trade_happened):
        """
        Bucket brigade strength update per paper equations (10)-(11).
        
        Key: denominators use counters BEFORE incrementing.
        - Exchange: tau_e = n_used before increment (paper eq 11)
        - Consumption: tau_c - 1 = n_used before increment (paper eq 10)
        """
        b11, b12 = self.bid_trade
        b21, b22 = self.bid_consume
        
        e_cls = self.trade_classifiers[trade_winner_idx]
        c_cls = self.consume_classifiers[consume_winner_idx]
        
        b1_e = b11 + b12 * e_cls.specificity
        b2_c = b21 + b22 * c_cls.specificity
        
        # Exchange classifier update (paper eq 11)
        if trade_happened or e_cls.action == 0:
            tau_e = e_cls.n_used  # Use BEFORE incrementing (paper's tau_e)
            e_cls.n_used += 1
            # Track actual trades: n_traded increments only when action=1 (trade)
            # MATLAB class004.m: CSt(win1,l2+3:l2+4) += [trade, 1]
            e_cls.n_traded += (1 if e_cls.action == 1 else 0)
            payment_from_consume = b2_c * c_cls.strength
            e_cls.strength = e_cls.strength - (1.0 / max(tau_e, 1)) * (
                (1 + b1_e) * e_cls.strength - payment_from_consume)
        
        # Consumption classifier update (paper eq 10, denominator = tau_c - 1)
        tau_c_old = c_cls.n_used  # This is tau_c - 1 in paper notation
        c_cls.n_used += 1
        payment_from_exchange = b1_e * e_cls.strength if (trade_happened or e_cls.action == 0) else 0.0
        c_cls.strength = c_cls.strength - (1.0 / max(tau_c_old, 1)) * (
            (1 + b2_c) * c_cls.strength - payment_from_exchange - external_payoff)
        
        if self.last_consume_winner is not None:
            prev_c = self.consume_classifiers[self.last_consume_winner]
            tau_prev = prev_c.n_used
            payment = b1_e * e_cls.strength
            prev_c.strength = prev_c.strength + payment / max(tau_prev, 1)
        
        self.last_consume_winner = consume_winner_idx


def apply_ga4_crossover(classifiers: List[Classifier],
                         n_pairs: int = None,
                         pcross: float = 0.6,
                         pmutation: float = 0.01,
                         propselect: float = 0.2,
                         propused: float = 0.7,
                         crowding_factor: int = 8,
                         crowding_subpop: float = 0.5,
                         uratio: tuple = (0.0, 0.2)) -> List[Classifier]:
    """
    GA4-style genetic algorithm matching MATLAB ga4.m for Economy D.
    
    Key difference from ga3: two-point GENERALIZATION crossover.
    When parents disagree on a specific bit, the child gets a wildcard (-1).
    This promotes exploration via generalization.
    
    Note: ga4.m does NOT perform mutation (nmutation is initialized to 0
    and never incremented in the MATLAB source).
    """
    n_classifiers = len(classifiers)
    if n_classifiers < 4:
        return classifiers
    
    if n_pairs is None:
        n_pairs = max(1, round(propselect * n_classifiers * 0.5))
    
    cond_length = len(classifiers[0].condition)
    
    strengths = np.array([c.strength for c in classifiers])
    usage = np.array([c.n_used for c in classifiers])
    min_s = strengths.min()
    
    # Identify killable rules: strength < uratio[0] OR usage/(1+max) < uratio[1]
    # Note: ga4 uses 1+max(usage) in denominator (vs max(usage) in ga3)
    max_usage = max(usage.max(), 1) + 1
    cankill = []
    for i in range(n_classifiers):
        if strengths[i] < uratio[0] or usage[i] / max_usage < uratio[1]:
            cankill.append(i)
    
    if not cankill:
        return classifiers
    
    fitness = strengths.copy()
    if min_s < 0:
        fitness = fitness - min_s
    fitness = fitness + 1e-6
    
    n_pairs = min(n_pairs, (len(cankill) + 1) // 2)
    
    for _ in range(n_pairs):
        if not cankill:
            break
        
        # Stage 1: Pre-select by usage
        ncalled = int(propused * n_classifiers)
        if ncalled < n_classifiers:
            usage_weights = usage.astype(float) + 1.0
            available = usage_weights.copy()
            selected_indices = []
            for _ in range(min(ncalled, n_classifiers)):
                total = available.sum()
                if total <= 0:
                    break
                r = np.random.rand() * total
                cumsum = 0.0
                for idx in range(n_classifiers):
                    cumsum += available[idx]
                    if cumsum >= r:
                        selected_indices.append(idx)
                        available[idx] = 0.0
                        break
            if len(selected_indices) < 2:
                selected_indices = list(range(n_classifiers))
        else:
            selected_indices = list(range(n_classifiers))
        
        # Stage 2: Fitness-proportional selection
        subset_fitness = np.array([fitness[i] for i in selected_indices])
        total_fit = subset_fitness.sum()
        if total_fit <= 0:
            continue
        
        def roulette_select(pop_fitness, pop_size):
            r = np.random.rand() * pop_fitness.sum()
            cumsum = 0.0
            for j in range(pop_size):
                cumsum += pop_fitness[j]
                if cumsum >= r:
                    return j
            return pop_size - 1
        
        idx1 = roulette_select(subset_fitness, len(selected_indices))
        idx2 = roulette_select(subset_fitness, len(selected_indices))
        mate1 = selected_indices[idx1]
        mate2 = selected_indices[idx2]
        
        p1 = classifiers[mate1]
        p2 = classifiers[mate2]
        avg_strength = (p1.strength + p2.strength) / 2.0
        
        child1_cond = p1.condition.copy()
        child2_cond = p2.condition.copy()
        child1_action = p1.action
        child2_action = p2.action
        
        # --- Two-point generalization crossover (ga4 style) ---
        # MATLAB ga4.m: jcross=sort(1+floor((lcond+1)*rand(2,1))) — 1-based
        # Convert to 0-based for Python by removing the +1
        jcross = np.sort(np.floor((cond_length + 1) * np.random.rand(2)).astype(int))
        inside = np.random.rand() > 0.5
        
        if inside:
            cross_region = list(range(jcross[0], min(jcross[1], cond_length)))
        else:
            cross_region = list(range(0, min(jcross[0], cond_length))) + \
                          list(range(min(jcross[1], cond_length), cond_length))
        
        if cross_region:
            for j in cross_region:
                v1 = child1_cond[j]
                v2 = child2_cond[j]
                # Generalization crossover logic from MATLAB ga4.m:
                # tem3 = abs((v1<0)*v2 + (v1>=0)*v1 - (v2<0)*v1 - (v2>=0)*v2)
                # When both specific and agree → tem3=0 → keep value
                # When both specific but disagree → tem3≠0 → child = -tem3 (wildcard)
                # When mixed (one wildcard, one specific) → tem3=0 → keep parent values
                # When both wildcards → tem3=0 → keep wildcards
                if v1 >= 0 and v2 >= 0:
                    if v1 != v2:
                        # Parents disagree → generalize to wildcard
                        child1_cond[j] = -1
                        child2_cond[j] = -1
                    # else: parents agree → keep value (no change needed)
                # Mixed wildcard/specific or both wildcards → keep parent values
                # (MATLAB ga4.m: tem3=0 so child=parent*1-0=parent)
        
        # Note: ga4.m does NOT perform mutation (nmutation=0, never incremented)
        
        # Crowding replacement
        def find_crowding_replacement(child_cond, child_action, kill_set):
            if not kill_set:
                return None
            best_match = -1
            best_similarity = -1
            subpop_size = max(1, int(crowding_subpop * len(kill_set)))
            for _ in range(crowding_factor):
                if subpop_size >= len(kill_set):
                    candidates = list(kill_set)
                else:
                    candidates = list(np.random.choice(kill_set, size=subpop_size, replace=False))
                weakest_idx = min(candidates, key=lambda i: classifiers[i].strength)
                cond_match = sum(1 for j in range(cond_length) 
                               if child_cond[j] == classifiers[weakest_idx].condition[j])
                action_diff = 1 if child_action != classifiers[weakest_idx].action else 0
                similarity = cond_match + action_diff
                if similarity > best_similarity:
                    best_similarity = similarity
                    best_match = weakest_idx
            return best_match
        
        mort1 = find_crowding_replacement(child1_cond, child1_action, cankill)
        if mort1 is not None:
            classifiers[mort1] = Classifier(child1_cond, child1_action, avg_strength)
            classifiers[mort1].n_used = p1.n_used
            classifiers[mort1].n_traded = p1.n_traded
            cankill = [i for i in cankill if i != mort1]
        
        if cankill:
            mort2 = find_crowding_replacement(child2_cond, child2_action, cankill)
            if mort2 is not None:
                classifiers[mort2] = Classifier(child2_cond, child2_action, avg_strength)
                classifiers[mort2].n_used = p2.n_used
                classifiers[mort2].n_traded = p2.n_traded
                cankill = [i for i in cankill if i != mort2]
    
    return classifiers


class FiveGoodSimulation:
    """
    Kiyotaki-Wright economy with 5 agent types and 5 goods (Economy D).
    
    This is the most complex economy in the paper. The production structure is:
        Type 0 (paper's I)   produces good 2 (paper's good 3), consumes good 0 (paper's good 1)
        Type 1 (paper's II)  produces good 3 (paper's good 4), consumes good 1 (paper's good 2)
        Type 2 (paper's III) produces good 4 (paper's good 5), consumes good 2 (paper's good 3)
        Type 3 (paper's IV)  produces good 0 (paper's good 1), consumes good 3 (paper's good 4)
        Type 4 (paper's V)   produces good 1 (paper's good 2), consumes good 4 (paper's good 5)
    
    Implements class004.m-style operators: ga4 crossover, diversification,
    specialization, and per-period strength tax.
    """
    def __init__(self, n_agents_per_type=50, storage_costs=None, utility=200.0,
                 n_trade_cls=180, n_consume_cls=20,
                 bid_trade=(0.025, 0.025), bid_consume=(0.25, 0.25),
                 ga_frequency=0.5, ga_pcross=0.6, ga_pmutation=0.01,
                 ga_propselect=0.2, ga_propused=0.7,
                 ga_crowdfactor_trade=8, ga_crowdfactor_consume=4,
                 ga_crowdsubpop=0.5, ga_uratio=(0.0, 0.2),
                 seed=None):
        if seed is not None:
            np.random.seed(seed)
        
        self.n_types = 5
        self.n_goods = 5
        self.n_agents_per_type = n_agents_per_type
        self.n_agents = 5 * n_agents_per_type
        
        # Production: Type i produces good (i+2) mod 5
        # Paper: I→3, II→4, III→5, IV→1, V→2
        # 0-indexed: 0→2, 1→3, 2→4, 3→0, 4→1
        self.produces = np.array([2, 3, 4, 0, 1])
        
        if storage_costs is None:
            self.storage_costs = np.array([1.0, 4.0, 9.0, 16.0, 30.0])
        else:
            self.storage_costs = storage_costs
        
        self.utility = np.array([utility] * 5)
        self.ga_frequency = ga_frequency
        self.ga_pcross = ga_pcross
        self.ga_pmutation = ga_pmutation
        self.ga_propselect = ga_propselect
        self.ga_propused = ga_propused
        self.ga_crowdfactor_trade = ga_crowdfactor_trade
        self.ga_crowdfactor_consume = ga_crowdfactor_consume
        self.ga_crowdsubpop = ga_crowdsubpop
        self.ga_uratio = ga_uratio
        
        # Agent types
        self.agent_types = np.repeat(np.arange(5), n_agents_per_type)
        
        # Classifier agents (one shared per type)
        self.agents = [
            FiveGoodAgent(i, n_trade_cls, n_consume_cls, bid_trade, bid_consume)
            for i in range(5)
        ]
        
        # Initialize: each agent holds their production good
        self.holdings = np.zeros(self.n_agents, dtype=int)
        for i in range(5):
            self.holdings[self.agent_types == i] = self.produces[i]
        
        self.history = {'holdings_dist': [], 'trade_rates': [], 'consumption_rates': [],
                        'exchange_counts': [], 'consumption_counts': []}
    
    def run(self, n_periods=3000, record_every=1, verbose=True):
        # Pre-compute GA schedule matching MATLAB winitial.m:
        # GA fires only on EVEN iterations with probability 1/sqrt(k/2).
        # Sized to n_periods (not hardcoded) per Copilot review.
        pga = 1.0 / np.sqrt(np.arange(1, n_periods // 2 + 1))
        self._ga_schedule = np.zeros(n_periods, dtype=bool)
        even_idx = np.arange(1, n_periods, 2)
        self._ga_schedule[even_idx] = pga[:len(even_idx)] > np.random.rand(len(even_idx))
        
        # Separate schedule for specialization (runit1 in MATLAB)
        self._spec_schedule = np.zeros(n_periods, dtype=bool)
        self._spec_schedule[even_idx] = pga[:len(even_idx)] > np.random.rand(len(even_idx))
        
        for t in range(1, n_periods + 1):
            trades = 0
            consumptions = 0
            exchange_count = np.zeros((self.n_types, self.n_goods, self.n_goods), dtype=int)
            consumption_count = np.zeros((self.n_types, self.n_goods), dtype=int)
            
            # Determine if specialization should run this period
            ga_idx = t - 1
            do_specialize = (ga_idx < len(self._spec_schedule) and 
                           self._spec_schedule[ga_idx])
            
            perm = np.random.permutation(self.n_agents)
            n_pairs = self.n_agents // 2
            
            for p in range(n_pairs):
                a1, a2 = perm[2*p], perm[2*p+1]
                t1, t2 = self.agent_types[a1], self.agent_types[a2]
                g1, g2 = self.holdings[a1], self.holdings[a2]
                agent1, agent2 = self.agents[t1], self.agents[t2]
                
                # Trade decisions (with class004-style diversification + specialization)
                act1, tw1 = agent1.get_trade_decision(
                    g1, g2, do_specialize=do_specialize, pmutation=self.ga_pmutation)
                act2, tw2 = agent2.get_trade_decision(
                    g2, g1, do_specialize=do_specialize, pmutation=self.ga_pmutation)
                
                trade_happens = (act1 == 1) and (act2 == 1) and (g1 != g2)
                if trade_happens:
                    self.holdings[a1], self.holdings[a2] = g2, g1
                    trades += 1
                    exchange_count[t1, g1, g2] += 1
                    exchange_count[t2, g2, g1] += 1
                
                pg1, pg2 = self.holdings[a1], self.holdings[a2]
                
                # Consumption decisions (with class004-style diversification)
                ca1, cw1 = agent1.get_consume_decision(
                    pg1, do_specialize=do_specialize, pmutation=self.ga_pmutation)
                ca2, cw2 = agent2.get_consume_decision(
                    pg2, do_specialize=do_specialize, pmutation=self.ga_pmutation)
                
                payoff1 = self._process_consumption(a1, t1, pg1, ca1)
                if ca1 == 1:
                    consumption_count[t1, pg1] += 1
                    if pg1 == t1:
                        consumptions += 1
                payoff2 = self._process_consumption(a2, t2, pg2, ca2)
                if ca2 == 1:
                    consumption_count[t2, pg2] += 1
                    if pg2 == t2:
                        consumptions += 1
                
                # Bucket brigade updates
                agent1.update_strengths(tw1, cw1, payoff1, trade_happens or act1 == 0)
                agent2.update_strengths(tw2, cw2, payoff2, trade_happens or act2 == 0)
            
            # --- Genetic algorithm (ga4-style with generalization crossover) ---
            run_ga = (ga_idx < len(self._ga_schedule) and self._ga_schedule[ga_idx])
            if run_ga:
                # Random type selection (MATLAB: 1 always + p=0.33 for 2nd, 3rd, etc.)
                types_to_send = [np.random.randint(self.n_types)]
                remaining = [i for i in range(self.n_types) if i != types_to_send[0]]
                if np.random.rand() < 0.33 and remaining:
                    second = remaining[np.random.randint(len(remaining))]
                    types_to_send.append(second)
                    remaining = [i for i in remaining if i != second]
                    if np.random.rand() < 0.33 and remaining:
                        types_to_send.append(remaining[np.random.randint(len(remaining))])
                
                for type_idx in types_to_send:
                    agent = self.agents[type_idx]
                    agent.trade_classifiers = apply_ga4_crossover(
                        agent.trade_classifiers,
                        pcross=self.ga_pcross, pmutation=self.ga_pmutation,
                        propselect=self.ga_propselect, propused=self.ga_propused,
                        crowding_factor=self.ga_crowdfactor_trade,
                        crowding_subpop=self.ga_crowdsubpop,
                        uratio=self.ga_uratio)
                
                # Separate random type selection for consume classifiers
                types_c = [np.random.randint(self.n_types)]
                remaining_c = [i for i in range(self.n_types) if i != types_c[0]]
                if np.random.rand() < 0.33 and remaining_c:
                    sc = remaining_c[np.random.randint(len(remaining_c))]
                    types_c.append(sc)
                    remaining_c = [i for i in remaining_c if i != sc]
                    if np.random.rand() < 0.33 and remaining_c:
                        types_c.append(remaining_c[np.random.randint(len(remaining_c))])
                
                for type_idx in types_c:
                    agent = self.agents[type_idx]
                    agent.consume_classifiers = apply_ga4_crossover(
                        agent.consume_classifiers,
                        pcross=self.ga_pcross, pmutation=self.ga_pmutation,
                        propselect=self.ga_propselect, propused=self.ga_propused,
                        crowding_factor=self.ga_crowdfactor_consume,
                        crowding_subpop=self.ga_crowdsubpop,
                        uratio=self.ga_uratio)
            
            # --- Class004-style tax (prevents strength lock-in) ---
            # MATLAB class004.m: Pt = action*n_traded + (1-action)*n_called + 1
            #                    Pc = n_called + 1
            # CSt -= (1/Pt) * (CSt + 1);  CSc -= (1/Pc) * (CSc + 1)
            for agent in self.agents:
                for cls in agent.trade_classifiers:
                    Pt = cls.action * cls.n_traded + (1 - cls.action) * cls.n_used + 1
                    cls.strength -= (1.0 / Pt) * (cls.strength + 1.0)
                for cls in agent.consume_classifiers:
                    Pc = cls.n_used + 1
                    cls.strength -= (1.0 / Pc) * (cls.strength + 1.0)
            
            if t % record_every == 0:
                self.history['holdings_dist'].append(self._compute_dist())
                self.history['trade_rates'].append(trades)
                self.history['consumption_rates'].append(consumptions)
                self.history['exchange_counts'].append(exchange_count)
                self.history['consumption_counts'].append(consumption_count)
            
            if verbose and t % max(1, n_periods // 10) == 0:
                print(f"  Period {t:5d}/{n_periods}: Trades={trades:3d}, Cons={consumptions:3d}")
    
    def _process_consumption(self, agent_idx, type_id, holding, action):
        """
        Process consumption decision.
        
        Per paper eq (12): u_i(k) = utility if k == i, else 0.
        Agents CAN consume wrong goods (getting 0 utility) — they still
        produce their production good and pay its storage cost.
        """
        if action == 1:
            # Agent chose to consume
            utility = self.utility[type_id] if holding == type_id else 0.0
            new_good = self.produces[type_id]
            self.holdings[agent_idx] = new_good
            return utility - self.storage_costs[new_good]
        else:
            return -self.storage_costs[holding]
    
    def _compute_dist(self):
        dist = np.zeros((5, 5))
        for i in range(5):
            mask = self.agent_types == i
            for k in range(5):
                dist[i, k] = np.mean(self.holdings[mask] == k)
        return dist


print("FiveGoodSimulation class defined (Economy D).")
print("  Uses class004-style operators: ga4 crossover, diversification,")
print("  specialization, and per-period strength tax.")

FiveGoodSimulation class defined (Economy D).
  Uses class004-style operators: ga4 crossover, diversification,
  specialization, and per-period strength tax.

# Run Economy D  
print("Running Economy D (Five Goods, Five Types)...")
print("=" * 50)
print("This is the most complex economy in the paper.")
print("250 agents, 5 types, 5 goods, random classifiers + GA")
print()

sim_d = FiveGoodSimulation(
    n_agents_per_type=50,
    storage_costs=np.array([1.0, 4.0, 9.0, 16.0, 30.0]),
    utility=200.0,
    n_trade_cls=180,
    n_consume_cls=20,
    bid_trade=(0.025, 0.025),
    bid_consume=(0.25, 0.25),
    ga_frequency=0.5,
    ga_pcross=0.6,
    ga_pmutation=0.01,
    seed=42
)
sim_d.run(n_periods=3000, record_every=1, verbose=True)

print("\nSimulation complete.")

Running Economy D (Five Goods, Five Types)...
==================================================
This is the most complex economy in the paper.
250 agents, 5 types, 5 goods, random classifiers + GA

  Period   300/3000: Trades= 24, Cons= 14
  Period   600/3000: Trades= 21, Cons= 12
  Period   900/3000: Trades= 37, Cons= 23
  Period  1200/3000: Trades= 28, Cons= 17
  Period  1500/3000: Trades= 31, Cons= 14
  Period  1800/3000: Trades= 13, Cons=  5
  Period  2100/3000: Trades= 20, Cons= 12
  Period  2400/3000: Trades=  6, Cons=  1
  Period  2700/3000: Trades= 19, Cons= 15
  Period  3000/3000: Trades= 15, Cons= 11

Simulation complete.

# Results for Economy D (Five Goods, Five Types)
# Paper reports at t=500 and t=1750 (Tables 17-22)

def print_5good_holdings(sim, t_idx, label):
    # Handle negative indices
    if t_idx < 0:
        t_idx = len(sim.history['holdings_dist']) + t_idx
    dist = sim.history['holdings_dist'][t_idx]
    print(f"\nHoldings Distribution {label}")
    header = f"{'π_i^h(j)':>10s}"
    for k in range(5):
        header += f"   j={k+1:d}  "
    print(header)
    print("-" * 60)
    for i in range(5):
        row = f"  i={i+1:d}     "
        for k in range(5):
            row += f"  {dist[i, k]:.3f} "
        produces = sim.produces[i] + 1
        row += f"  (produces {produces}, consumes {i+1})"
        print(row)

def print_5good_exchange_freq(sim, t_idx, window, label):
    # Handle negative indices
    if t_idx < 0:
        t_idx = len(sim.history['exchange_counts']) + t_idx
    n = min(window, t_idx + 1)
    recent = sim.history['exchange_counts'][t_idx - n + 1:t_idx + 1]
    avg_counts = np.mean(recent, axis=0)
    freq = avg_counts / sim.n_agents_per_type
    print(f"\nExchange Frequency π_i^e(jk) {label}")
    print("-" * 100)
    for i in range(5):
        row = f"  i={i+1:d}  "
        for j in range(5):
            quint = "(" + ",".join([f"{freq[i,j,k]:.2f}" for k in range(5)]) + ")"
            row += f"  {quint:<28s}"
        print(row)

def print_5good_winning_actions(sim, label):
    """Print winning classifier actions for 5-good economy."""
    print(f"\nWinning Classifier Actions π̃_i^e(jk|j) {label}")
    print("-" * 100)
    for i in range(5):
        agent = sim.agents[i]
        row = f"  i={i+1:d}  "
        for j in range(5):
            actions = []
            for k in range(5):
                own_enc = encode_good(j, 5)
                partner_enc = encode_good(k, 5)
                state = np.concatenate([own_enc, partner_enc])
                matches = [(idx, c) for idx, c in enumerate(agent.trade_classifiers)
                           if c.matches(state)]
                if matches:
                    best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
                    actions.append(str(best_cls.action))
                else:
                    actions.append("—")
            quint = "(" + ",".join(actions) + ")"
            row += f"  {quint:<28s}"
        print(row)

# --- Results at t=500 (cf. Paper Tables 17-19) ---
print("=" * 100)
print("ECONOMY D RESULTS AT t=500 (cf. Paper Tables 17-19)")
print("=" * 100)
print_5good_holdings(sim_d, 499, "at t=500")
print_5good_exchange_freq(sim_d, 499, 10, "at t=500")
print_5good_winning_actions(sim_d, "at t=500")

# --- Results at t=1750 (cf. Paper Tables 20-22) ---
print("\n" + "=" * 100)
print("ECONOMY D RESULTS AT t=1750 (cf. Paper Tables 20-22)")
print("=" * 100)
print_5good_holdings(sim_d, 1749, "at t=1750")
print_5good_exchange_freq(sim_d, 1749, 10, "at t=1750")
print_5good_winning_actions(sim_d, "at t=1750")

# --- Final results at t=3000 ---
print("\n" + "=" * 100)
print("ECONOMY D RESULTS AT t=3000 (final)")
print("=" * 100)
print_5good_holdings(sim_d, -1, "at t=3000")
print_5good_exchange_freq(sim_d, -1, 10, "at t=3000")
print_5good_winning_actions(sim_d, "at t=3000")

====================================================================================================
ECONOMY D RESULTS AT t=500 (cf. Paper Tables 17-19)
====================================================================================================

Holdings Distribution at t=500
  π_i^h(j)   j=1     j=2     j=3     j=4     j=5  
------------------------------------------------------------
  i=1       0.000   0.660   0.300   0.040   0.000   (produces 3, consumes 1)
  i=2       0.000   0.000   0.140   0.840   0.020   (produces 4, consumes 2)
  i=3       0.000   0.740   0.000   0.060   0.200   (produces 5, consumes 3)
  i=4       1.000   0.000   0.000   0.000   0.000   (produces 1, consumes 4)
  i=5       0.000   0.920   0.020   0.060   0.000   (produces 2, consumes 5)

Exchange Frequency π_i^e(jk) at t=500
----------------------------------------------------------------------------------------------------
  i=1    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.09,0.00)    (0.00,0.09,0.00,0.06,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)  
  i=2    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.03,0.00,0.03,0.00)    (0.00,0.16,0.07,0.00,0.02)    (0.00,0.00,0.00,0.00,0.00)  
  i=3    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.05,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.01,0.01,0.00,0.00)    (0.00,0.04,0.00,0.03,0.00)  
  i=4    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)  
  i=5    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.07,0.10,0.04)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.01,0.01,0.00,0.01)    (0.00,0.00,0.00,0.00,0.00)  

Winning Classifier Actions π̃_i^e(jk|j) at t=500
----------------------------------------------------------------------------------------------------
  i=1    (—,—,—,—,—)                   (0,1,0,1,0)                   (1,0,1,1,0)                   (0,0,0,1,0)                   (1,0,1,1,1)                 
  i=2    (—,1,1,—,—)                   (—,—,—,—,—)                   (1,1,1,0,1)                   (1,1,1,1,1)                   (1,1,1,0,1)                 
  i=3    (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                 
  i=4    (0,0,0,0,0)                   (—,—,—,—,—)                   (—,—,—,—,—)                   (—,—,—,—,—)                   (1,—,—,—,—)                 
  i=5    (1,—,1,—,1)                   (1,1,1,1,1)                   (0,0,—,—,0)                   (1,1,1,1,1)                   (—,—,—,—,—)                 

====================================================================================================
ECONOMY D RESULTS AT t=1750 (cf. Paper Tables 20-22)
====================================================================================================

Holdings Distribution at t=1750
  π_i^h(j)   j=1     j=2     j=3     j=4     j=5  
------------------------------------------------------------
  i=1       0.000   0.680   0.220   0.100   0.000   (produces 3, consumes 1)
  i=2       0.000   0.000   0.020   0.960   0.020   (produces 4, consumes 2)
  i=3       0.000   0.080   0.000   0.500   0.420   (produces 5, consumes 3)
  i=4       1.000   0.000   0.000   0.000   0.000   (produces 1, consumes 4)
  i=5       0.000   1.000   0.000   0.000   0.000   (produces 2, consumes 5)

Exchange Frequency π_i^e(jk) at t=1750
----------------------------------------------------------------------------------------------------
  i=1    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.04,0.00)    (0.00,0.03,0.00,0.03,0.02)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)  
  i=2    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.01,0.00,0.00,0.00)    (0.00,0.03,0.02,0.00,0.08)    (0.00,0.01,0.00,0.01,0.00)  
  i=3    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.01)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.03,0.02,0.00,0.04)    (0.00,0.10,0.01,0.11,0.00)  
  i=4    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)  
  i=5    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.04,0.02,0.10)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)  

Winning Classifier Actions π̃_i^e(jk|j) at t=1750
----------------------------------------------------------------------------------------------------
  i=1    (—,—,—,—,—)                   (0,1,0,1,0)                   (1,0,1,1,0)                   (0,0,0,1,0)                   (1,0,1,1,1)                 
  i=2    (—,1,1,—,—)                   (—,—,—,—,—)                   (1,1,1,0,1)                   (1,1,1,1,1)                   (1,1,1,0,1)                 
  i=3    (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                 
  i=4    (0,0,0,0,0)                   (—,—,—,—,—)                   (—,—,—,—,—)                   (—,—,—,—,—)                   (1,—,—,—,—)                 
  i=5    (1,—,1,—,1)                   (1,1,1,1,1)                   (0,0,—,—,0)                   (1,1,1,1,1)                   (—,—,—,—,—)                 

====================================================================================================
ECONOMY D RESULTS AT t=3000 (final)
====================================================================================================

Holdings Distribution at t=3000
  π_i^h(j)   j=1     j=2     j=3     j=4     j=5  
------------------------------------------------------------
  i=1       0.000   0.840   0.140   0.020   0.000   (produces 3, consumes 1)
  i=2       0.000   0.000   0.160   0.820   0.020   (produces 4, consumes 2)
  i=3       0.000   0.080   0.000   0.000   0.920   (produces 5, consumes 3)
  i=4       1.000   0.000   0.000   0.000   0.000   (produces 1, consumes 4)
  i=5       0.000   0.980   0.000   0.020   0.000   (produces 2, consumes 5)

Exchange Frequency π_i^e(jk) at t=3000
----------------------------------------------------------------------------------------------------
  i=1    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.03,0.00)    (0.00,0.03,0.00,0.03,0.02)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.01,0.00,0.00,0.00)  
  i=2    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.01,0.00,0.01,0.01)    (0.00,0.05,0.04,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)  
  i=3    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.18,0.02,0.00,0.00)  
  i=4    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)  
  i=5    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.04,0.02,0.19)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)    (0.00,0.00,0.00,0.00,0.00)  

Winning Classifier Actions π̃_i^e(jk|j) at t=3000
----------------------------------------------------------------------------------------------------
  i=1    (—,—,—,—,—)                   (0,1,0,1,0)                   (1,0,1,1,0)                   (0,0,0,1,0)                   (1,0,1,1,1)                 
  i=2    (—,1,1,—,—)                   (—,—,—,—,—)                   (1,1,1,0,1)                   (1,1,1,1,1)                   (1,1,1,0,1)                 
  i=3    (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                   (0,1,1,0,0)                 
  i=4    (0,0,0,0,0)                   (—,—,—,—,—)                   (—,—,—,—,—)                   (—,—,—,—,—)                   (1,—,—,—,—)                 
  i=5    (1,—,1,—,1)                   (1,1,1,1,1)                   (0,0,—,—,0)                   (1,1,1,1,1)                   (—,—,—,—,—)

# Plot Distribution of Holdings for Economy D
# This replicates the spirit of the paper's Figures 10-11
history_d = np.array(sim_d.history['holdings_dist'])  # (T, 5, 5)
T = len(history_d)
time = np.arange(T)

fig, axes = plt.subplots(1, 5, figsize=(25, 4.5), sharey=True)
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd']
good_labels = [f'Good {k+1}' for k in range(5)]

for i, ax in enumerate(axes):
    for k in range(5):
        ax.plot(time, history_d[:, i, k] * 100,
                color=colors[k], linewidth=1.2,
                label=good_labels[k], alpha=0.8)
    ax.set_xlabel('Period')
    ax.set_ylabel('% Holding' if i == 0 else '')
    consumes = i + 1
    produces = sim_d.produces[i] + 1
    ax.set_title(f'Type {i+1}\n(cons {consumes}, prod {produces})', fontsize=10)
    ax.legend(loc='best', fontsize=7)
    ax.set_ylim(-5, 105)
    ax.grid(True, alpha=0.3)

fig.suptitle('Economy D: Distribution of Holdings — Five Types, Five Goods\n'
             '(cf. Paper Figure 11)', fontsize=13, fontweight='bold')
plt.tight_layout()
plt.show()

# Analyze the discovered trading patterns in Economy D
# The paper's key finding: agents learn to trade only for goods with LOWER storage cost
# (except always accepting their own consumption good)

print("=" * 70)
print("ECONOMY D: ANALYSIS OF DISCOVERED TRADING PATTERNS")
print("=" * 70)
print()
print("The paper's key finding is that agents discover 'fundamental-like'")
print("trading patterns: they only trade for goods with LOWER storage cost")
print("than what they currently hold (except they always accept their own")
print("consumption good).")
print()
print(f"Storage costs: s₁={sim_d.storage_costs[0]:.0f}, s₂={sim_d.storage_costs[1]:.0f}, "
      f"s₃={sim_d.storage_costs[2]:.0f}, s₄={sim_d.storage_costs[3]:.0f}, s₅={sim_d.storage_costs[4]:.0f}")
print()

# For each agent type, inspect the top trade classifiers to determine
# which goods they accept in trade
for i in range(5):
    agent = sim_d.agents[i]
    consumes = i  # 0-indexed
    produces = sim_d.produces[i]
    
    print(f"Type {i+1} (consumes good {consumes+1}, produces good {produces+1}):")
    
    # Check trade decisions for each (own_good, partner_good) combination
    accept_patterns = {}
    for own in range(5):
        for partner in range(5):
            if own == partner:
                continue
            own_enc = encode_good(own, 5)
            partner_enc = encode_good(partner, 5)
            state = np.concatenate([own_enc, partner_enc])
            
            matches = [(idx, c) for idx, c in enumerate(agent.trade_classifiers)
                       if c.matches(state)]
            if matches:
                best_idx, best_cls = max(matches, key=lambda x: x[1].strength)
                decision = "ACCEPT" if best_cls.action == 1 else "REFUSE"
            else:
                decision = "NO RULE"
            
            key = (own + 1, partner + 1)  # Paper indexing
            accept_patterns[key] = decision
    
    # Print compact trading matrix
    print(f"  Trading decisions (holding own → offered partner's good):")
    for own in range(5):
        decisions = []
        for partner in range(5):
            if own == partner:
                decisions.append("  —   ")
            else:
                d = accept_patterns.get((own+1, partner+1), "?")
                if d == "ACCEPT":
                    # Check if this follows fundamental logic
                    is_own_good = (partner == consumes)
                    is_lower_cost = sim_d.storage_costs[partner] < sim_d.storage_costs[own]
                    marker = "✓" if (is_own_good or is_lower_cost) else "!"  
                    decisions.append(f"ACC {marker}")
                else:
                    decisions.append(f"REF  ")
        print(f"    Hold {own+1}: [{', '.join(decisions)}]  (for goods 1-5)")
    print()

======================================================================
ECONOMY D: ANALYSIS OF DISCOVERED TRADING PATTERNS
======================================================================

The paper's key finding is that agents discover 'fundamental-like'
trading patterns: they only trade for goods with LOWER storage cost
than what they currently hold (except they always accept their own
consumption good).

Storage costs: s₁=1, s₂=4, s₃=9, s₄=16, s₅=30

Type 1 (consumes good 1, produces good 3):
  Trading decisions (holding own → offered partner's good):
    Hold 1: [  —   , REF  , REF  , REF  , REF  ]  (for goods 1-5)
    Hold 2: [REF  ,   —   , REF  , ACC !, REF  ]  (for goods 1-5)
    Hold 3: [ACC ✓, REF  ,   —   , ACC !, REF  ]  (for goods 1-5)
    Hold 4: [REF  , REF  , REF  ,   —   , REF  ]  (for goods 1-5)
    Hold 5: [ACC ✓, REF  , ACC ✓, ACC ✓,   —   ]  (for goods 1-5)

Type 2 (consumes good 2, produces good 4):
  Trading decisions (holding own → offered partner's good):
    Hold 1: [  —   , ACC ✓, ACC !, REF  , REF  ]  (for goods 1-5)
    Hold 2: [REF  ,   —   , REF  , REF  , REF  ]  (for goods 1-5)
    Hold 3: [ACC ✓, ACC ✓,   —   , REF  , ACC !]  (for goods 1-5)
    Hold 4: [ACC ✓, ACC ✓, ACC ✓,   —   , ACC !]  (for goods 1-5)
    Hold 5: [ACC ✓, ACC ✓, ACC ✓, REF  ,   —   ]  (for goods 1-5)

Type 3 (consumes good 3, produces good 5):
  Trading decisions (holding own → offered partner's good):
    Hold 1: [  —   , ACC !, ACC ✓, REF  , REF  ]  (for goods 1-5)
    Hold 2: [REF  ,   —   , ACC ✓, REF  , REF  ]  (for goods 1-5)
    Hold 3: [REF  , ACC ✓,   —   , REF  , REF  ]  (for goods 1-5)
    Hold 4: [REF  , ACC ✓, ACC ✓,   —   , REF  ]  (for goods 1-5)
    Hold 5: [REF  , ACC ✓, ACC ✓, REF  ,   —   ]  (for goods 1-5)

Type 4 (consumes good 4, produces good 1):
  Trading decisions (holding own → offered partner's good):
    Hold 1: [  —   , REF  , REF  , REF  , REF  ]  (for goods 1-5)
    Hold 2: [REF  ,   —   , REF  , REF  , REF  ]  (for goods 1-5)
    Hold 3: [REF  , REF  ,   —   , REF  , REF  ]  (for goods 1-5)
    Hold 4: [REF  , REF  , REF  ,   —   , REF  ]  (for goods 1-5)
    Hold 5: [ACC ✓, REF  , REF  , REF  ,   —   ]  (for goods 1-5)

Type 5 (consumes good 5, produces good 2):
  Trading decisions (holding own → offered partner's good):
    Hold 1: [  —   , REF  , ACC !, REF  , ACC ✓]  (for goods 1-5)
    Hold 2: [ACC ✓,   —   , ACC !, ACC !, ACC ✓]  (for goods 1-5)
    Hold 3: [REF  , REF  ,   —   , REF  , REF  ]  (for goods 1-5)
    Hold 4: [ACC ✓, ACC ✓, ACC ✓,   —   , ACC ✓]  (for goods 1-5)
    Hold 5: [REF  , REF  , REF  , REF  ,   —   ]  (for goods 1-5)

# Visualize the production structure and discovered exchange patterns for Economy D
# Replicates Figures 10 and 11 from the paper

# Compute final holdings distribution for Economy D
# dist_d[i, k] = fraction of type-i agents holding good k at the final period
dist_d = np.zeros((5, 5))
for i in range(5):
    mask = sim_d.agent_types == i
    for k in range(5):
        dist_d[i, k] = np.mean(sim_d.holdings[mask] == k)

fig, axes = plt.subplots(1, 2, figsize=(16, 7))

# --- Figure 10: Production Pattern ---
ax = axes[0]
# Place 5 types in a pentagon
angles = [np.pi/2 + 2*np.pi*k/5 for k in range(5)]
radius = 0.35
cx, cy = 0.5, 0.5
positions = [(cx + radius * np.cos(a), cy + radius * np.sin(a)) for a in angles]

# Good nodes (inner pentagon, smaller radius)
good_radius = 0.18
good_positions = [(cx + good_radius * np.cos(a + np.pi/5), 
                   cy + good_radius * np.sin(a + np.pi/5)) for a in angles]

# Draw agent nodes
for i, (x, y) in enumerate(positions):
    circle = plt.Circle((x, y), 0.05, color='lightblue', ec='navy', lw=2, 
                         transform=ax.transAxes, clip_on=False)
    ax.add_patch(circle)
    ax.text(x, y, f'Type\n{i+1}', ha='center', va='center', fontsize=7, 
            fontweight='bold', transform=ax.transAxes)

# Draw good nodes
for k, (x, y) in enumerate(good_positions):
    circle = plt.Circle((x, y), 0.035, color='lightyellow', ec='darkgoldenrod', lw=1.5,
                         transform=ax.transAxes, clip_on=False)
    ax.add_patch(circle)
    ax.text(x, y, f'G{k+1}', ha='center', va='center', fontsize=7, 
            fontweight='bold', transform=ax.transAxes)

# Draw production arrows: Type i → Good produces[i]
prod_map = {0: 2, 1: 3, 2: 4, 3: 0, 4: 1}  # 0-indexed
for i in range(5):
    sx, sy = positions[i]
    gx, gy = good_positions[prod_map[i]]
    ax.annotate('', xy=(gx, gy), xytext=(sx, sy),
                arrowprops=dict(arrowstyle='->', color='darkred', lw=1.5),
                xycoords='axes fraction', textcoords='axes fraction')

ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.set_aspect('equal')
ax.axis('off')
ax.set_title('Production Structure\n(cf. Paper Figure 10)', fontsize=11, fontweight='bold')

# --- Discovered Exchange Patterns ---
ax = axes[1]

# Same pentagon layout for types
for i, (x, y) in enumerate(positions):
    circle = plt.Circle((x, y), 0.05, color='lightblue', ec='navy', lw=2,
                         transform=ax.transAxes, clip_on=False)
    ax.add_patch(circle)
    consumes = i + 1
    produces = sim_d.produces[i] + 1
    ax.text(x, y, f'Type\n{i+1}', ha='center', va='center', fontsize=7, 
            fontweight='bold', transform=ax.transAxes)
    ax.text(x, y - 0.07, f'c:{consumes} p:{produces}', ha='center', fontsize=6, 
            color='gray', transform=ax.transAxes)

# Draw likely exchange arrows based on observed holdings
# If type i frequently holds good k (where k != produces[i]), 
# it means type i is accepting good k in trade
colors_5 = ['tab:blue', 'tab:orange', 'tab:green', 'tab:red', 'tab:purple']
for i in range(5):
    for k in range(5):
        if k == sim_d.produces[i]:
            continue  # Skip production good
        if dist_d[i, k] > 0.05:  # Significant holding
            # Find which type produces good k
            producer = np.where(sim_d.produces == k)[0]
            if len(producer) > 0:
                j = producer[0]
                sx, sy = positions[j]
                tx, ty = positions[i]
                offset = 0.02 * (k - 2)  # slight offset to avoid overlap
                ax.annotate('', xy=(tx + offset, ty), xytext=(sx + offset, sy),
                            arrowprops=dict(arrowstyle='->', color=colors_5[k], lw=1.2,
                                          connectionstyle=f'arc3,rad={0.15 * (k-2)}',
                                          alpha=0.7),
                            xycoords='axes fraction', textcoords='axes fraction')

# Add storage cost annotation
ax.text(0.5, -0.02, 'Storage costs: s₁=1 < s₂=4 < s₃=9 < s₄=16 < s₅=30\n'
        'Fundamental pattern: trade only for lower-cost goods (or own consumption good)',
        ha='center', fontsize=8, transform=ax.transAxes,
        bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.8))

ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.set_aspect('equal')
ax.axis('off')
ax.set_title('Discovered Exchange Patterns\n(cf. Paper Figure 11)', fontsize=11, fontweight='bold')

plt.tight_layout()
plt.show()

Discussion: Economy D — The Paper’s “Triumph”¶

Economy D is where the Holland classifier system demonstrates its power as an equilibrium-discovery tool. With 5 agent types, 5 goods, and a complex production structure, the authors did not have an analytic characterization of the stationary equilibrium before running the simulation.

The classifier systems discovered trading patterns that turned out to be consistent with a fundamental equilibrium — agents trade only for goods with lower storage costs than what they currently hold (except they always accept their own consumption good). The authors write:

“With enough work, one could obtain an analytic solution for the fundamental equilibrium for such an economy, but here our purpose is to let our artificially intelligent agents suggest to us what an equilibrium might be.”

This result demonstrates that:

Classifier systems scale to more complex economies beyond the 3-agent setup
The systems can discover equilibria that humans have not yet characterized analytically
The discovered equilibrium can be verified ex post as a Nash equilibrium
The fundamental principle — trade for lower storage cost goods — generalizes naturally to larger economies

This is perhaps the most compelling argument in the paper for the value of artificial intelligence in economic modeling: the AI agents found something the economists had not.

11. Comparative Analysis Across Economies¶

Let us compare the convergence behavior across all economies we have simulated.

# Summary comparison table — ALL 8 economies from the paper
print("="*85)
print("SUMMARY OF SIMULATION RESULTS")
print("Replicating ALL Economies from Marimon, McGrattan, Sargent (1990)")
print("="*85)
print()
print(f"{'Economy':<12s} {'Storage Costs':>22s} {'Utility':>8s} {'Init CS':>10s} {'Paper Eq.':>10s} {'Our Result':>12s}")
print("-"*85)
print(f"{'A1.1':<12s} {'s=(0.1, 1, 20)':>22s} {'u=100':>8s} {'Complete':>10s} {'F':>10s} {'Fundamental':>12s}")
print(f"{'A1.2':<12s} {'s=(0.1, 1, 20)':>22s} {'u=100':>8s} {'Random+GA':>10s} {'F':>10s} {'Fundamental':>12s}")
print(f"{'A2.1':<12s} {'s=(0.1, 1, 20)':>22s} {'u=500':>8s} {'Complete':>10s} {'S':>10s} {'Fundamental':>12s}")
print(f"{'A2.2':<12s} {'s=(0.1, 1, 20)':>22s} {'u=500':>8s} {'Random+GA':>10s} {'S':>10s} {'(see below)':>12s}")
print(f"{'B.1':<12s} {'s=(1, 4, 9)':>22s} {'u=100':>8s} {'Complete':>10s} {'F/S':>10s} {'Fundamental':>12s}")
print(f"{'B.2':<12s} {'s=(1, 4, 9)':>22s} {'u=100':>8s} {'Random+GA':>10s} {'F/S':>10s} {'(see below)':>12s}")
print(f"{'C':<12s} {'s=(9,14,29,0)':>22s} {'u=100':>8s} {'Random+GA':>10s} {'F':>10s} {'Fiat Money':>12s}")
print(f"{'D':<12s} {'s=(1,4,9,16,30)':>22s} {'u=200':>8s} {'Random+GA':>10s} {'—':>10s} {'Fundamental':>12s}")
print("-"*85)
print("Paper Eq. column: F=Fundamental, S=Speculative, F/S=Both possible, —=No analytic benchmark")
print()

# Holdings distributions at final period for ALL economies
print("\nFinal Holdings Distributions π_i^h(j):")
print("="*65)

for name, sim_obj in [('A1.1', sim_a1), ('A1.2', sim_a12), 
                       ('A2.1', sim_a2), ('A2.2', sim_a22),
                       ('B.1', sim_b), ('B.2', sim_b2)]:
    dist = sim_obj._compute_holdings_dist()
    print(f"\n{name}:")
    for i in range(3):
        vals = ' '.join([f'{dist[i,k]:.3f}' for k in range(3)])
        print(f"  Type {i+1}: [{vals}]  (goods 1, 2, 3)")

# Economy C (4 goods)
dist_c = sim_c._compute_dist()
print(f"\nC (Fiat Money):")
for i in range(3):
    vals = ' '.join([f'{dist_c[i,k]:.3f}' for k in range(4)])
    print(f"  Type {i+1}: [{vals}]  (goods 1, 2, 3, fiat)")

# Economy D (5 goods)
dist_d = sim_d._compute_dist()
print(f"\nD (Five Goods, Five Types):")
for i in range(5):
    vals = ' '.join([f'{dist_d[i,k]:.3f}' for k in range(5)])
    print(f"  Type {i+1}: [{vals}]  (goods 1, 2, 3, 4, 5)")

=====================================================================================
SUMMARY OF SIMULATION RESULTS
Replicating ALL Economies from Marimon, McGrattan, Sargent (1990)
=====================================================================================

Economy               Storage Costs  Utility    Init CS  Paper Eq.   Our Result
-------------------------------------------------------------------------------------
A1.1                 s=(0.1, 1, 20)    u=100   Complete          F  Fundamental
A1.2                 s=(0.1, 1, 20)    u=100  Random+GA          F  Fundamental
A2.1                 s=(0.1, 1, 20)    u=500   Complete          S  Fundamental
A2.2                 s=(0.1, 1, 20)    u=500  Random+GA          S  (see below)
B.1                     s=(1, 4, 9)    u=100   Complete        F/S  Fundamental
B.2                     s=(1, 4, 9)    u=100  Random+GA        F/S  (see below)
C                     s=(9,14,29,0)    u=100  Random+GA          F   Fiat Money
D                   s=(1,4,9,16,30)    u=200  Random+GA          —  Fundamental
-------------------------------------------------------------------------------------
Paper Eq. column: F=Fundamental, S=Speculative, F/S=Both possible, —=No analytic benchmark


Final Holdings Distributions π_i^h(j):
=================================================================

A1.1:
  Type 1: [0.000 1.000 0.000]  (goods 1, 2, 3)
  Type 2: [0.540 0.000 0.460]  (goods 1, 2, 3)
  Type 3: [1.000 0.000 0.000]  (goods 1, 2, 3)

A1.2:
  Type 1: [0.000 1.000 0.000]  (goods 1, 2, 3)
  Type 2: [0.420 0.000 0.580]  (goods 1, 2, 3)
  Type 3: [1.000 0.000 0.000]  (goods 1, 2, 3)

A2.1:
  Type 1: [0.000 1.000 0.000]  (goods 1, 2, 3)
  Type 2: [0.540 0.000 0.460]  (goods 1, 2, 3)
  Type 3: [1.000 0.000 0.000]  (goods 1, 2, 3)

A2.2:
  Type 1: [0.000 0.000 1.000]  (goods 1, 2, 3)
  Type 2: [0.000 0.000 1.000]  (goods 1, 2, 3)
  Type 3: [1.000 0.000 0.000]  (goods 1, 2, 3)

B.1:
  Type 1: [0.000 0.280 0.720]  (goods 1, 2, 3)
  Type 2: [1.000 0.000 0.000]  (goods 1, 2, 3)
  Type 3: [0.560 0.440 0.000]  (goods 1, 2, 3)

B.2:
  Type 1: [0.140 0.000 0.860]  (goods 1, 2, 3)
  Type 2: [1.000 0.000 0.000]  (goods 1, 2, 3)
  Type 3: [0.000 0.000 1.000]  (goods 1, 2, 3)

C (Fiat Money):
  Type 1: [0.000 0.720 0.000 0.280]  (goods 1, 2, 3, fiat)
  Type 2: [0.000 0.000 0.700 0.300]  (goods 1, 2, 3, fiat)
  Type 3: [0.420 0.200 0.000 0.380]  (goods 1, 2, 3, fiat)

D (Five Goods, Five Types):
  Type 1: [0.000 0.840 0.140 0.020 0.000]  (goods 1, 2, 3, 4, 5)
  Type 2: [0.000 0.000 0.160 0.820 0.020]  (goods 1, 2, 3, 4, 5)
  Type 3: [0.000 0.080 0.000 0.000 0.920]  (goods 1, 2, 3, 4, 5)
  Type 4: [1.000 0.000 0.000 0.000 0.000]  (goods 1, 2, 3, 4, 5)
  Type 5: [0.000 0.980 0.000 0.020 0.000]  (goods 1, 2, 3, 4, 5)

12. Examining the Classifier Systems¶

Let us inspect the classifier systems that emerged in Economy A1.1 to understand which rules the agents learned.

def display_top_classifiers(agent: ClassifierAgent, classifier_type: str = 'trade', 
                            n_top: int = 10, n_goods: int = 3):
    """
    Display the top classifiers by strength for an agent type.
    """
    if classifier_type == 'trade':
        clfs = agent.trade_classifiers
        n_bits = agent.n_bits
    else:
        clfs = agent.consume_classifiers
        n_bits = agent.n_bits
    
    # Sort by strength (descending)
    sorted_clfs = sorted(enumerate(clfs), key=lambda x: x[1].strength, reverse=True)
    
    print(f"\n{'='*60}")
    print(f"Top {n_top} {classifier_type} classifiers for Type {agent.type_id + 1}")
    print(f"{'='*60}")
    
    def decode_condition(cond, n_bits):
        """Convert condition to human-readable form."""
        parts = []
        for start in range(0, len(cond), n_bits):
            bits = cond[start:start+n_bits]
            if n_bits == 2:
                if np.array_equal(bits, [1, 0]): parts.append('G1')
                elif np.array_equal(bits, [0, 1]): parts.append('G2')
                elif np.array_equal(bits, [0, 0]): parts.append('G3')
                elif np.array_equal(bits, [0, -1]): parts.append('¬G1')
                elif np.array_equal(bits, [-1, 0]): parts.append('¬G2')
                elif np.array_equal(bits, [-1, -1]): parts.append('Any')
                else: parts.append(str(bits))
        return parts
    
    for rank, (idx, clf) in enumerate(sorted_clfs[:n_top]):
        cond_parts = decode_condition(clf.condition, n_bits)
        if classifier_type == 'trade':
            action_str = 'TRADE' if clf.action == 1 else 'REFUSE'
            own, partner = cond_parts[0], cond_parts[1]
            print(f"  {rank+1:2d}. If hold={own:>4s}, partner={partner:>4s} → {action_str:>6s}  "
                  f"(S={clf.strength:8.2f}, n={clf.n_used:5d})")
        else:
            action_str = 'CONSUME' if clf.action == 1 else 'KEEP'
            holding = cond_parts[0]
            print(f"  {rank+1:2d}. If hold={holding:>4s} → {action_str:>7s}  "
                  f"(S={clf.strength:8.2f}, n={clf.n_used:5d})")

# Display top classifiers for each type in Economy A1.1
for i in range(3):
    display_top_classifiers(sim_a1.agents[i], 'trade', n_top=6, n_goods=3)
    display_top_classifiers(sim_a1.agents[i], 'consume', n_top=4, n_goods=3)


============================================================
Top 6 trade classifiers for Type 1
============================================================
   1. If hold=  G2, partner=  G1 →  TRADE  (S=   30.55, n= 8236)
   2. If hold=  G1, partner=  G2 → REFUSE  (S=   29.92, n=    4)
   3. If hold=  G1, partner=  G1 → REFUSE  (S=   29.54, n=    8)
   4. If hold=  G1, partner=  G3 → REFUSE  (S=   22.97, n=    7)
   5. If hold=  G1, partner=  G1 →  TRADE  (S=    0.00, n=    1)
   6. If hold=  G1, partner=  G2 →  TRADE  (S=    0.00, n=    1)

============================================================
Top 4 consume classifiers for Type 1
============================================================
   1. If hold=  G1 → CONSUME  (S=   67.24, n= 8251)
   2. If hold=  G1 →    KEEP  (S=   -0.10, n=    2)
   3. If hold= Any → CONSUME  (S=   -0.57, n=41740)
   4. If hold=  G2 →    KEEP  (S=   -0.76, n=    3)

============================================================
Top 6 trade classifiers for Type 2
============================================================
   1. If hold=  G2, partner=  G1 → REFUSE  (S=   24.81, n=    2)
   2. If hold=  G1, partner=  G2 →  TRADE  (S=   24.78, n= 8134)
   3. If hold=  G3, partner=  G2 →  TRADE  (S=   24.75, n= 8034)
   4. If hold=  G2, partner=  G3 → REFUSE  (S=   23.03, n=   15)
   5. If hold=  G2, partner=  G2 → REFUSE  (S=   21.08, n=    8)
   6. If hold=  G3, partner=  G1 →  TRADE  (S=    0.09, n= 8148)

============================================================
Top 4 consume classifiers for Type 2
============================================================
   1. If hold=  G2 → CONSUME  (S=   54.50, n=16188)
   2. If hold=  G1 →    KEEP  (S=    0.21, n=25679)
   3. If hold=  G2 →    KEEP  (S=   -1.00, n=    2)
   4. If hold= ¬G2 →    KEEP  (S=  -14.50, n= 8127)

============================================================
Top 6 trade classifiers for Type 3
============================================================
   1. If hold=  G1, partner=  G3 →  TRADE  (S=   30.85, n= 8146)
   2. If hold=  G3, partner=  G3 → REFUSE  (S=   29.90, n=    4)
   3. If hold=  G3, partner=  G1 → REFUSE  (S=   28.73, n=   14)
   4. If hold=  G3, partner=  G2 → REFUSE  (S=   27.18, n=   11)
   5. If hold=  G2, partner=  G3 → REFUSE  (S=    0.11, n=    7)
   6. If hold=  G2, partner=  G2 → REFUSE  (S=    0.05, n=    5)

============================================================
Top 4 consume classifiers for Type 3
============================================================
   1. If hold=  G3 → CONSUME  (S=   67.86, n= 8171)
   2. If hold= ¬G2 → CONSUME  (S=    0.11, n=41711)
   3. If hold= ¬G1 → CONSUME  (S=   -0.05, n=  112)
   4. If hold= Any →    KEEP  (S=   -0.09, n=    2)

13. The Fundamental Equilibrium Diagram¶

We can visualize the trade patterns that emerge, replicating Figure 2 from the paper.

def plot_wicksell_triangle(sim, title: str = None, window: int = 100):
    """
    Data-driven Wicksell triangle / exchange pattern diagram for 3-good economies.
    
    Reads actual exchange counts from the simulation history to draw arrows
    showing discovered trade patterns, with line thickness proportional to
    exchange frequency. Works for economies A1.1, A1.2, A2.1, A2.2, B.1, B.2.
    
    Parameters
    ----------
    sim : KiyotakiWrightSimulation or similar
        Simulation with history['exchange_counts'].
    title : str, optional
        Plot title. If None, auto-generated.
    window : int
        Number of recent periods to average exchange counts over.
    """
    # Gather exchange counts from recent history
    exchange_history = sim.history.get('exchange_counts', [])
    if not exchange_history:
        print("No exchange count data available.")
        return None
    
    # Average over last `window` periods
    recent = exchange_history[-window:]
    avg_counts = np.mean(recent, axis=0)  # shape: (n_types, n_goods, n_goods)
    
    # Determine if 3, 4, or 5 goods
    n_goods = avg_counts.shape[1]
    n_types = avg_counts.shape[0]
    
    if n_types == 3 and n_goods <= 4:
        return _plot_triangle(avg_counts, n_goods, n_types, title, sim)
    elif n_types == 5:
        return _plot_pentagon(avg_counts, n_goods, n_types, title, sim)
    else:
        print(f"Unsupported layout: {n_types} types, {n_goods} goods")
        return None


def _plot_triangle(avg_counts, n_goods, n_types, title, sim):
    """Draw Wicksell triangle for 3-type economies (3 or 4 goods)."""
    fig, ax = plt.subplots(1, 1, figsize=(7, 6))
    
    # Agent positions in a triangle
    positions = [(0.5, 0.9), (0.1, 0.15), (0.9, 0.15)]
    
    # Draw agent nodes
    for i, (x, y) in enumerate(positions):
        circle = plt.Circle((x, y), 0.08, color='lightblue', ec='navy', lw=2)
        ax.add_patch(circle)
        ax.text(x, y, f'Type {i+1}', ha='center', va='center', fontsize=10, fontweight='bold')
    
    # Type labels: consumption and production
    try:
        produces = sim.config.produces if hasattr(sim, 'config') else sim.produces
    except:
        produces = np.array([1, 2, 0])
    
    labels = [
        f'Consumes Good {i+1}\nProduces Good {produces[i]+1}'
        for i in range(n_types)
    ]
    label_positions = [(0.5, 1.02), (0.1, 0.02), (0.9, 0.02)]
    for i, ((lx, ly), label) in enumerate(zip(label_positions, labels)):
        ax.text(lx, ly, label, ha='center', fontsize=8, color='gray')
    
    # Compute total exchange flow: type i trades good j for good k
    # Aggregate to: type i gives good j to someone (across all goods k they receive)
    colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd']
    
    # For each pair of types, compute dominant good flow
    max_flow = avg_counts.max()
    if max_flow == 0:
        max_flow = 1
    
    # Draw arrows for significant exchanges
    # type i exchanges good j for good k with some agent
    # We show flows between type pairs
    for i in range(n_types):
        for j in range(n_goods):
            for k in range(n_goods):
                if j == k:
                    continue
                flow = avg_counts[i, j, k]
                if flow / max_flow < 0.05:
                    continue  # Skip insignificant flows
                
                # Type i gives good j, receives good k
                # Arrow: from type i position toward center, labeled with goods
                sx, sy = positions[i]
                
                # Find who they're trading with (the type that tends to hold good j after receiving it)
                # Approximate: just draw flow arrows from type i outward
                # The arrow represents "Type i sends good j, receives good k"
                thickness = 1 + 3 * (flow / max_flow)
                
                # Determine receiving type (heuristic: the type that produces good k)
                recv_type = None
                for t in range(n_types):
                    if t != i and produces[t] == k:
                        recv_type = t
                        break
                if recv_type is None:
                    # Fallback: any other type
                    for t in range(n_types):
                        if t != i:
                            recv_type = t
                            break
                
                ex, ey = positions[recv_type]
                
                # Offset so forward and reverse arrows don't overlap
                rad = 0.15 if j < k else -0.15
                ax.annotate('', xy=(ex, ey), xytext=(sx, sy),
                            arrowprops=dict(arrowstyle='->', color=colors[j % len(colors)],
                                          lw=thickness, alpha=0.7,
                                          connectionstyle=f'arc3,rad={rad}'))
    
    # Legend for goods
    for g in range(n_goods):
        ax.plot([], [], color=colors[g % len(colors)], lw=2, 
                label=f'Good {g+1}')
    ax.legend(loc='lower center', fontsize=9, ncol=n_goods, 
              bbox_to_anchor=(0.5, -0.12))
    
    # Identify medium of exchange from holdings
    try:
        config = sim.config if hasattr(sim, 'config') else None
        if config:
            costs_str = ', '.join([f's{k+1}={config.storage_costs[k]:.1f}' for k in range(n_goods)])
        else:
            costs_str = ''
    except:
        costs_str = ''
    
    if costs_str:
        ax.text(0.5, -0.18, f'Storage costs: {costs_str}',
                ha='center', fontsize=8, transform=ax.transAxes,
                bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.8))
    
    ax.set_xlim(-0.1, 1.1)
    ax.set_ylim(-0.15, 1.1)
    ax.set_aspect('equal')
    ax.axis('off')
    
    if title is None:
        name = sim.config.name if hasattr(sim, 'config') else "Economy"
        title = f"Exchange Pattern — {name}"
    ax.set_title(title, fontsize=13, fontweight='bold', pad=15)
    plt.tight_layout()
    plt.show()
    return fig


def _plot_pentagon(avg_counts, n_goods, n_types, title, sim):
    """Draw exchange pattern diagram for 5-type economies."""
    fig, ax = plt.subplots(1, 1, figsize=(8, 8))
    
    # Pentagon positions
    angles = [np.pi/2 + 2*np.pi*i/5 for i in range(5)]
    positions = [(0.5 + 0.35*np.cos(a), 0.5 + 0.35*np.sin(a)) for a in angles]
    
    try:
        produces = sim.produces
    except:
        produces = np.array([2, 3, 4, 0, 1])
    
    colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd']
    
    # Draw agent nodes
    for i, (x, y) in enumerate(positions):
        circle = plt.Circle((x, y), 0.05, color='lightblue', ec='navy', lw=2)
        ax.add_patch(circle)
        ax.text(x, y, f'Type\n{i+1}', ha='center', va='center', fontsize=8, fontweight='bold')
    
    # Draw significant exchange arrows
    max_flow = avg_counts.max()
    if max_flow == 0:
        max_flow = 1
    
    for i in range(n_types):
        for j in range(n_goods):
            for k in range(n_goods):
                if j == k:
                    continue
                flow = avg_counts[i, j, k]
                if flow / max_flow < 0.05:
                    continue
                thickness = 1 + 3 * (flow / max_flow)
                
                # Find likely trading partner
                recv_type = None
                for t in range(n_types):
                    if t != i and produces[t] == k:
                        recv_type = t
                        break
                if recv_type is None:
                    continue
                
                sx, sy = positions[i]
                ex, ey = positions[recv_type]
                rad = 0.1
                ax.annotate('', xy=(ex, ey), xytext=(sx, sy),
                            arrowprops=dict(arrowstyle='->', color=colors[j % len(colors)],
                                          lw=thickness, alpha=0.7,
                                          connectionstyle=f'arc3,rad={rad}'))
    
    for g in range(n_goods):
        ax.plot([], [], color=colors[g % len(colors)], lw=2, label=f'Good {g+1}')
    ax.legend(loc='lower center', fontsize=9, ncol=min(n_goods, 5),
              bbox_to_anchor=(0.5, -0.05))
    
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_aspect('equal')
    ax.axis('off')
    if title is None:
        title = "Exchange Pattern — Economy D (5 Goods)"
    ax.set_title(title, fontsize=13, fontweight='bold')
    plt.tight_layout()
    plt.show()
    return fig


# --- Data-driven Wicksell triangles for all 3-good economies ---
print("Wicksell Triangle Exchange Patterns (Data-Driven)")
print("=" * 50)

for name, sim_obj in [('A1.1', sim_a1), ('A1.2', sim_a12), 
                       ('A2.1', sim_a2), ('A2.2', sim_a22),
                       ('B.1', sim_b), ('B.2', sim_b2)]:
    plot_wicksell_triangle(sim_obj, title=f"Economy {name} — Exchange Pattern (cf. Paper Figures 2-4)")

# Economy C (4 goods)
plot_wicksell_triangle(sim_c, title="Economy C — Exchange Pattern with Fiat Money")

# Economy D (5 goods)
plot_wicksell_triangle(sim_d, title="Economy D — Exchange Pattern (cf. Paper Figure 11)")

Wicksell Triangle Exchange Patterns (Data-Driven)
==================================================

14. Strength Dynamics and Convergence¶

The paper uses cumulative average net rewards for classifier strengths (rather than total rewards as in Holland’s original specification). This is crucial for convergence — it makes the strength update a stochastic approximation that converges to the expected value.

Let us examine the evolution of average strengths over time.

def plot_strength_evolution(sim: KiyotakiWrightSimulation):
    """
    Plot the average strength of trade and consume classifiers over time.
    We re-run a shorter simulation to track strengths at each step.
    """
    config = sim.config
    
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    
    for i, agent in enumerate(sim.agents):
        # Trade classifier strengths
        trade_strengths = [c.strength for c in agent.trade_classifiers]
        consume_strengths = [c.strength for c in agent.consume_classifiers]
        
        # Histogram of final strengths
        axes[0].hist(trade_strengths, bins=30, alpha=0.5, label=f'Type {i+1}', density=True)
        axes[1].hist(consume_strengths, bins=15, alpha=0.5, label=f'Type {i+1}', density=True)
    
    axes[0].set_xlabel('Strength')
    axes[0].set_ylabel('Density')
    axes[0].set_title('Trade Classifier Strength Distribution')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    axes[1].set_xlabel('Strength')
    axes[1].set_ylabel('Density')
    axes[1].set_title('Consume Classifier Strength Distribution')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    fig.suptitle(f'Final Classifier Strength Distributions — {config.name}', 
                 fontsize=13, fontweight='bold')
    plt.tight_layout()
    plt.show()
    return fig

plot_strength_evolution(sim_a1);

15. Conclusions¶

This notebook replicates all eight economies from Marimon, McGrattan, and Sargent (1990): A1.1, A1.2, A2.1, A2.2, B.1, B.2, C, and D.

Key Results¶

Convergence to Nash Equilibrium: For most economies, the classifier system agents learn to play strategies consistent with a Nash-Markov equilibrium of the Kiyotaki-Wright model, even starting from random initial rules.
Fundamental Equilibrium Selection: When multiple equilibria exist (fundamental vs. speculative), the classifier systems consistently converge to the fundamental equilibrium — the one in which goods with low storage costs serve as media of exchange. This holds even when rational expectations theory predicts the speculative equilibrium should prevail (Economy A2).
The Role of the Learning Algorithm: Economy A2 reveals that the learning algorithm itself affects equilibrium selection. With complete enumeration (A2.1), early myopia locks the system into the fundamental equilibrium. With the GA (A2.2), increased experimentation may allow different outcomes — highlighting that equilibrium selection depends on the interplay between economic parameters and the learning dynamics.
Emergence of Fiat Money: In Economy C, artificially intelligent agents learn to use an intrinsically worthless good as a medium of exchange when it has zero storage cost, demonstrating the classical argument for why money has value.
Economy D — Equilibrium Discovery: The most striking result is Economy D (five types, five goods), where the classifier systems discovered a Nash equilibrium that the authors had not analytically derived. The agents learned fundamental-like trading patterns — trading only for goods with lower storage costs — which could be verified ex post as consistent with Nash equilibrium conditions. This demonstrates that classifier systems can serve as equilibrium-discovery tools, not just verification tools.

Broader Significance¶

This paper was a pioneering application of machine learning to economic theory. By replacing rational expectations with Holland’s classifier systems, Marimon, McGrattan, and Sargent showed that:

Equilibrium concepts can be understood as emergent outcomes of learning processes
The selection among multiple equilibria depends on the learning dynamics, not just parameter values
Bounded rationality (via classifiers) can still produce sophisticated collective outcomes like the emergence of money
Artificially intelligent agents can discover equilibria in complex environments that resist analytic solution — the “triumph” demonstrated by Economy D

References¶

Marimon, R., McGrattan, E., & Sargent, T. J. (1990). Money as a Medium of Exchange in an Economy with Artificially Intelligent Agents. Journal of Economic Dynamics and Control, 14, 329-373.
Kiyotaki, N., & Wright, R. (1989). On Money as a Medium of Exchange. Journal of Political Economy, 97(4), 927-954.
Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press.
Holland, J. H. (1986). Escaping Brittleness: The Possibilities of General-Purpose Learning Algorithms Applied to Parallel Rule-Based Systems. In Machine Learning: An Artificial Intelligence Approach, Vol. 2.
Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley.

References¶

Kiyotaki, N., & Wright, R. (1989). On Money as a Medium of Exchange. Journal of Political Economy, 97(4), 927–954. 10.1086/261634