Having established the DEQN framework on the one-dimensional Brock--Mirman model in Chapter 2, we now scale it to the multi-country international real business cycle (IRBC) model of Backus et al. (1992). This model features $N$ countries with heterogeneous productivity, complete markets, irreversible investment, and convex capital adjustment costs. It is the standard testbed for high-dimensional solution methods in macroeconomics, and applying DEQNs to it illustrates how the framework handles high-dimensional state spaces, multiple equilibrium conditions, and complementarity constraints.
3.1 Why IRBC for Macro-Finance Research?
Beyond its computational-testbed role, the IRBC model is the workhorse framework for open-economy asset pricing and international risk sharing. Several first-order questions in macro-finance can be posed sharply within it:
- International risk sharing. Under complete markets and homogeneous preferences, the planner's allocation implies perfectly correlated consumption growth across countries, whereas the data show consumption correlations that are lower than output correlations (the consumption-correlation puzzle of Backus et al. (1992)). The benchmark BKK model predicts a cross-country consumption correlation close to 1, while the empirical correlation between the US and a typical industrial country is in the range 0.3--0.5 and is systematically below the corresponding output correlation. IRBC extensions with incomplete markets, frictions, or heterogeneous preferences are designed precisely to close this gap.
- Asset-market structure and welfare. Heathcote & Perri (2002) use an IRBC-style setup to quantify the cost of moving from complete markets to more restricted asset structures (a one-bond economy and financial autarky), obtaining a welfare cost of the same order of magnitude as the business cycle itself. More generally, variants of this model are the standard laboratory for comparing complete and incomplete market structures.
- Capital flows and current-account dynamics. With intertemporal savings and heterogeneous productivity, the IRBC delivers persistent current-account imbalances as an equilibrium outcome rather than as a reduced-form residual. This is the starting point for the modern open-economy DSGE literature.
- Home bias. The frictionless benchmark of near-unit consumption correlation is the anchor against which observed portfolio home bias must be explained; Heathcote & Perri (2013) show that accounting for nontraded goods and labor-income hedging substantially narrows the gap between theory and observed portfolios.
- Macro-financial transmission. Adjustment costs, borrowing constraints, and Pareto weights become levers for studying how financial frictions propagate across borders. This is the class of extensions that motivates the DEQN treatment: once frictions are added, the policy functions acquire kinks and nonlinearities that are hard to handle with traditional grid-based methods.
The IRBC model is therefore an interesting substantive object, not merely a scaling test. Its combination of a clean complete-markets benchmark and rich, realistic frictions makes it a natural next step after the one-country Brock--Mirman benchmark of Chapter 2.
3.1.0.1 A calibration caveat for the puzzles above.
The shock decomposition of Section 3.2 below, $\sigma_e(\epsilon_t^j + \epsilon_t^{\text{agg}})$, hard-wires a cross-country innovation correlation of exactly $1/2$ for any number of countries $N$. The consumption-correlation puzzle cited in the bullets above, and the Backus--Smith puzzle discussed in Section 3.6, should therefore be read as statements about this specific calibration: richer correlation structures, country-specific factor loadings, or fewer aggregate factors would change the quantitative bite of the puzzles in this model. This is calibration, not theory.
3.2 Model Setup
Table 3.1: Symbol cheat-sheet for the IRBC model. Note the IES-vs-CRRA convention: here $\gamma_j$ is the intertemporal elasticity, and the implied risk aversion is $1/\gamma_j$; later chapters on continuous-time HA models and climate use $\gamma$ for CRRA and a separate symbol for the IES.
| Symbol | Role | Range / sign | Calibration |
|---|---|---|---|
| $\gamma_j$ | IES of country $j$ (not CRRA) | $[0.25, 1]$ | linearly spaced |
| $\tau^j$ | Pareto weight on country $j$ | $> 0$ | Eq. (3.7) |
| $\lambda_t$ | Aggregate resource-constraint multiplier | $> 0$ | |
| $\mu_t^j$ | Irreversibility KKT multiplier on $I_t^j \ge 0$ | $\ge 0$ | 0 in slack regime |
| $A_{\text{tfp}}$ | TFP normalization constant | $> 0$ | Eq. (3.3) |
| $\zeta$ | Capital share in Cobb--Douglas | $(0, 1)$ | 0.36 |
| $\Gamma^j$ | Quadratic adjustment-cost level | $\ge 0$ | |
| $\rho_z$ | TFP persistence | $\lvert\rho_z\rvert < 1$ | 0.95 |
| $\sigma_e$ | Innovation s.d. per component | $> 0$ | 0.01 |
| $\epsilon^j$ | Idiosyncratic innovation | | i.i.d. across $j$ |
| $\epsilon^{\text{agg}}$ | Aggregate innovation | | common factor |
| $\phi$ | Adjustment-cost intensity | $\ge 0$ | 0.50 |
The international real business cycle (IRBC) model, introduced by Backus et al. (1992), extends the single-country growth model to $N$ heterogeneous countries, each endowed with country-specific capital $k_t^j$ and total factor productivity $a_t^j$. The model features complete markets, irreversible investment, and convex capital adjustment costs, and serves as the workhorse test case for high-dimensional solution methods (Brumm & Scheidegger, 2017). Here, we apply the DEQN methodology of Azinovic et al. (2022) to this setting.
3.2.0.1 Preferences.
Each country $j$ has CRRA utility

$$u^j(c) = \frac{c^{\,1 - 1/\gamma_j}}{1 - 1/\gamma_j}, \tag{3.1}$$

where the intertemporal elasticity of substitution (IES) $\gamma_j$ is heterogeneous across countries; risk aversion under this CRRA specification equals $1/\gamma_j$. Notation warning: this chapter uses $\gamma_j$ for the IES, while later chapters on continuous-time HA models and climate use $\gamma$ for CRRA risk aversion and a separate symbol for the IES. The convention is stated explicitly at the start of each chapter. A social planner maximizes

$$\mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t \sum_{j=1}^{N} \tau^j\, u^j(c_t^j), \tag{3.2}$$

with Pareto weights $\tau^j > 0$.
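3.2.0.2 Production.
Country $j$ produces $y_t^j = A_{\text{tfp}}\, a_t^j\, (k_t^j)^{\zeta}$, where the total factor productivity constant $A_{\text{tfp}}$ is calibrated to normalize the steady-state capital stock to unity. In steady state (where $a^j = 1$, $k^j = 1$, and $\Gamma^j = 0$), the Euler equation implies:

$$1 = \beta\left[\zeta A_{\text{tfp}} + 1 - \delta\right] \quad\Longrightarrow\quad A_{\text{tfp}} = \frac{1 - \beta(1-\delta)}{\zeta\beta}. \tag{3.3}$$

This normalization ensures that the deterministic steady state lies at $k^j = 1$ for all countries, which simplifies the network's learning task and provides a natural center for the training distribution.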
3.2.0.3 TFP process.
Log productivity follows an AR(1) with common and idiosyncratic shocks:

$$\ln a_t^j = \rho_z \ln a_{t-1}^j + \sigma_e\left(\epsilon_t^j + \epsilon_t^{\text{agg}}\right), \qquad \epsilon_t^j,\ \epsilon_t^{\text{agg}} \sim \text{i.i.d. } \mathcal{N}(0, 1). \tag{3.4}$$

The persistence restriction $\lvert\rho_z\rvert < 1$ guarantees stationarity of the TFP process, which in turn underlies the existence of an ergodic distribution on which DEQN training samples (Section 2.3). Here $\sigma_e$ is the per-component standard deviation, so the marginal innovation variance for country $j$ is $2\sigma_e^2$ and the cross-country innovation covariance is $\sigma_e^2$. These two facts imply a fixed cross-country innovation correlation of $1/2$ regardless of $N$, a direct consequence of the equal-weighted aggregate-shock decomposition $\sigma_e(\epsilon_t^j + \epsilon_t^{\text{agg}})$. Asset-pricing implications (in particular the international consumption-correlation puzzle and the cyclicality of trade balances) inherit this hard-wired common-factor structure: results below should be interpreted with that calibration choice in mind. If a desired total innovation scale $\sigma_{\text{tot}}$ is targeted instead, set $\sigma_e = \sigma_{\text{tot}}/\sqrt{2}$.
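The variance, covariance, and correlation statements above can be checked numerically. The following is a minimal sketch, assuming standard-normal innovations and the equal-weighted decomposition described in the text; the function names are illustrative and do not come from the companion notebooks.

```python
import math
import random


def innovation_moments(sigma_e):
    """Analytic moments of u^j = sigma_e * (eps_j + eps_agg), eps ~ iid N(0, 1)."""
    var = 2.0 * sigma_e ** 2   # idiosyncratic plus aggregate component
    cov = sigma_e ** 2         # only the aggregate component is shared
    return var, cov, cov / var


def sigma_e_for_total(sigma_total):
    """Per-component s.d. that delivers a target total innovation s.d."""
    return sigma_total / math.sqrt(2.0)


def simulated_correlation(sigma_e, n_draws=20000, seed=0):
    """Monte Carlo estimate of the innovation correlation between two countries."""
    rng = random.Random(seed)
    u1, u2 = [], []
    for _ in range(n_draws):
        agg = rng.gauss(0.0, 1.0)  # common factor hits both countries
        u1.append(sigma_e * (rng.gauss(0.0, 1.0) + agg))
        u2.append(sigma_e * (rng.gauss(0.0, 1.0) + agg))
    m1, m2 = sum(u1) / n_draws, sum(u2) / n_draws
    cov = sum((a - m1) * (b - m2) for a, b in zip(u1, u2)) / n_draws
    v1 = sum((a - m1) ** 2 for a in u1) / n_draws
    v2 = sum((b - m2) ** 2 for b in u2) / n_draws
    return cov / math.sqrt(v1 * v2)
```

The analytic correlation does not depend on `sigma_e` or on the number of countries, which is the sense in which the common-factor structure is hard-wired.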
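3.2.0.4 Adjustment costs and irreversibility.
Changing the capital stock incurs a quadratic adjustment cost:

$$\Gamma_t^j = \frac{\phi}{2}\, k_t^j \left(\frac{k_{t+1}^j}{k_t^j} - 1\right)^2, \tag{3.5}$$

with marginal derivatives that appear in the Euler equations:

$$\frac{\partial \Gamma_t^j}{\partial k_{t+1}^j} = \phi\left(\frac{k_{t+1}^j}{k_t^j} - 1\right), \qquad \frac{\partial \Gamma_t^j}{\partial k_t^j} = \frac{\phi}{2}\left[1 - \left(\frac{k_{t+1}^j}{k_t^j}\right)^2\right]. \tag{3.6}$$

Note that $\partial \Gamma_t^j / \partial k_t^j$ is negative whenever $k_{t+1}^j > k_t^j$, i.e. in expanding states. Consequently the term $-\partial \Gamma_{t+1}^j / \partial k_{t+1}^j$ that appears in the marginal product of capital below (3.14) raises MPK in expansion phases; a reader who plugs in $+\partial \Gamma_{t+1}^j / \partial k_{t+1}^j$ here will introduce a sign error. Investment is irreversible: $I_t^j = k_{t+1}^j - (1-\delta)\, k_t^j \ge 0$.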
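The sign warning above is easy to verify numerically. The sketch below assumes the standard quadratic form $\Gamma = \tfrac{\phi}{2}\,k\,(k'/k - 1)^2$ with intensity `phi = 0.5`; both the symbol `phi` and the function names are illustrative assumptions. It compares the closed-form derivatives against central finite differences.

```python
def adjustment_cost(k, k_next, phi=0.5):
    """Quadratic adjustment cost Gamma(k, k') = (phi / 2) * k * (k'/k - 1)^2."""
    return 0.5 * phi * k * (k_next / k - 1.0) ** 2


def dgamma_dk_next(k, k_next, phi=0.5):
    """Derivative with respect to next-period capital k'."""
    return phi * (k_next / k - 1.0)


def dgamma_dk(k, k_next, phi=0.5):
    """Derivative with respect to current capital k (negative when k' > k)."""
    return 0.5 * phi * (1.0 - (k_next / k) ** 2)


def central_difference(f, x, h=1e-6):
    """Two-sided finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Expanding state: k' > k, so dGamma/dk < 0 and the MPK term -dGamma/dk is positive.
k, k_next = 1.0, 1.1
fd_k = central_difference(lambda x: adjustment_cost(x, k_next), k)
fd_k_next = central_difference(lambda x: adjustment_cost(k, x), k_next)
```

In the expanding state used here, `dgamma_dk(k, k_next)` equals `-0.0525`, so subtracting it in the MPK adds a positive amount.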
3.2.0.5 Pareto-weight calibration.
With heterogeneous IES $\gamma_j$, a symmetric deterministic steady state is most easily obtained by choosing the Pareto weights as

$$\tau^j = \left(A_{\text{tfp}} - \delta\right)^{1/\gamma_j}. \tag{3.7}$$

The derivation is a two-step inversion of the planner's first-order condition. The consumption-sharing condition (3.12) (derived in the next section from the FOC for $c_t^j$) reads $\tau^j (c_t^j)^{-1/\gamma_j} = \lambda_t$, so $c_t^j = (\lambda_t/\tau^j)^{-\gamma_j}$. In the deterministic steady state with the normalizations $k^j = 1$ and $\lambda = 1$ we want every country to consume the same amount $\bar c$ implied by the resource constraint $\bar c = A_{\text{tfp}} - \delta$. Setting $c^j = \bar c$ and solving for $\tau^j$ gives Eq. (3.7). The symmetric steady state thus serves as a natural anchor for training: the network's initial predictions need only match this point to avoid infeasible economies during the early simulated trajectories.
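The two-step inversion can be sketched in a few lines. The example below assumes the steady-state shadow price is normalized to one and that common consumption equals output minus depreciation at unit capital; the symbol names (`tau`, `zeta`) and the helper functions are illustrative choices, not identifiers from the companion notebooks.

```python
def linspace(lo, hi, n):
    """n evenly spaced points from lo to hi (inclusive)."""
    if n == 1:
        return [lo]
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def pareto_weights(gammas, c_bar):
    """Weights tau_j such that tau_j * c_bar**(-1/gamma_j) = 1 for every country."""
    return [c_bar ** (1.0 / g) for g in gammas]


def consumption(lam, tau, gamma):
    """Consumption-sharing rule c = (lam / tau)**(-gamma)."""
    return (lam / tau) ** (-gamma)


beta, delta, zeta = 0.99, 0.01, 0.36
a_tfp = (1.0 - beta * (1.0 - delta)) / (zeta * beta)  # TFP normalization
c_bar = a_tfp - delta                                 # steady-state consumption
gammas = linspace(0.25, 1.0, 4)                       # heterogeneous IES
taus = pareto_weights(gammas, c_bar)

# With lam = 1, every country consumes c_bar despite heterogeneous IES.
steady_state_c = [consumption(1.0, t, g) for t, g in zip(taus, gammas)]
```

Raising `lam` above one lowers consumption in every country, and more strongly where `gamma` is larger, which is the elasticity statement made for the sharing rule.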
3.2.0.6 Reference calibration.
Throughout the companion notebooks lecture_04_01_IRBC_DEQN_smooth.ipynb and lecture_04_02_IRBC_DEQN_irreversible.ipynb, we use the quarterly calibration summarized in Table 3.2. The implied total factor productivity and deterministic steady-state quantities can then be computed analytically.
Table 3.2: Reference IRBC calibration used in the companion notebooks. Countries' IES values are linearly spaced in $[0.25, 1]$. Pareto weights are computed from (3.7).
| Symbol | Name | Value | Description |
|---|---|---|---|
| $\beta$ | Discount factor | 0.99 | Quarterly |
| $\zeta$ | Capital share | 0.36 | Cobb--Douglas |
| $\delta$ | Depreciation | 0.01 | Low quarterly rate |
| $\rho_z$ | TFP persistence | 0.95 | Highly persistent |
| $\sigma_e$ | Shock std. dev. | 0.01 | Small innovations |
| $\phi$ | Adjustment-cost intensity | 0.50 | Moderate frictions |
| $\gamma_{\min}$ | Min IES | 0.25 | Risk aversion $1/\gamma = 4$ |
| $\gamma_{\max}$ | Max IES | 1.00 | Log utility |
| $k_{ss}$ | Steady-state capital | 1.00 | Normalization |
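3.2.0.7 Worked steady state.
Equation (3.3) is most compactly written as $A_{\text{tfp}} = \left(1/\beta - (1-\delta)\right)/\zeta$; multiplying numerator and denominator by $\beta$ gives the algebraically equivalent form $A_{\text{tfp}} = (1 - \beta(1-\delta))/(\zeta\beta)$ used below. Substituting the reference values:

$$A_{\text{tfp}} = \frac{1 - 0.99 \times 0.99}{0.36 \times 0.99} = \frac{0.0199}{0.3564} \approx 0.0558, \qquad y_{ss} = A_{\text{tfp}} \approx 0.0558, \qquad I_{ss} = \delta\, k_{ss} = 0.01, \qquad c_{ss} = A_{\text{tfp}} - \delta \approx 0.0458. \tag{3.8}$$

The aggregate resource constraint (3.17) is then satisfied country by country, $c_{ss} + I_{ss} = y_{ss}$, as a trivial check. These numbers provide a baseline against which the trained network's predictions on an out-of-sample simulation can be compared.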
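The arithmetic of the worked steady state can be reproduced directly. This is a small sketch under the reference calibration (discount factor 0.99, depreciation 0.01, capital share 0.36, unit steady-state capital and productivity); the function and variable names are illustrative.

```python
def tfp_constant_compact(beta, delta, zeta):
    """Compact form: (1/beta - (1 - delta)) / zeta."""
    return (1.0 / beta - (1.0 - delta)) / zeta


def tfp_constant_expanded(beta, delta, zeta):
    """Equivalent form after multiplying numerator and denominator by beta."""
    return (1.0 - beta * (1.0 - delta)) / (zeta * beta)


beta, delta, zeta = 0.99, 0.01, 0.36
k_ss, a_ss = 1.0, 1.0

a_tfp = tfp_constant_expanded(beta, delta, zeta)
y_ss = a_tfp * a_ss * k_ss ** zeta   # output at k = a = 1
i_ss = delta * k_ss                  # investment that keeps capital constant
c_ss = y_ss - i_ss                   # adjustment costs vanish when k' = k

# Steady-state Euler check: beta * (marginal product + undepreciated capital) = 1.
euler_check = beta * (zeta * a_tfp * k_ss ** (zeta - 1.0) + 1.0 - delta)
```

Both forms of the TFP constant agree to machine precision, and `euler_check` returns one up to rounding, which confirms that unit capital is the deterministic steady state under this normalization.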
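3.3 The Planner's Problem and Equilibrium Conditions
3.3.0.1 The planner's problem.
The social planner maximizes the weighted sum of utilities across all countries, subject to the aggregate resource constraint (3.17), the irreversibility constraints, the production technology, and the TFP process (3.4):

$$\max_{\{c_t^j,\, k_{t+1}^j\}} \ \mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t \sum_{j=1}^{N} \tau^j u^j(c_t^j) \quad \text{s.t.} \quad \sum_{j=1}^{N}\left[c_t^j + k_{t+1}^j - (1-\delta)k_t^j + \Gamma_t^j\right] \le \sum_{j=1}^{N} A_{\text{tfp}}\, a_t^j (k_t^j)^{\zeta}, \qquad k_{t+1}^j - (1-\delta)k_t^j \ge 0 \ \ \forall j, \tag{3.9}$$

with Pareto weights $\tau^j$ and discount factor $\beta \in (0, 1)$.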
3.3.0.2 The Lagrangian.
Following the same approach as in Section 2.4 for the Brock--Mirman model, we form the Lagrangian by attaching discounted multipliers to each constraint. Let $\lambda_t$ be the multiplier on the aggregate resource constraint at date $t$, and $\mu_t^j$ the multiplier on the irreversibility constraint for country $j$ at date $t$. The Lagrangian is:

$$\mathcal{L} = \mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t \left\{ \sum_{j=1}^{N} \tau^j u^j(c_t^j) + \lambda_t \sum_{j=1}^{N}\left[A_{\text{tfp}}\, a_t^j (k_t^j)^{\zeta} + (1-\delta)k_t^j - k_{t+1}^j - \Gamma_t^j - c_t^j\right] + \sum_{j=1}^{N} \mu_t^j\left[k_{t+1}^j - (1-\delta)k_t^j\right] \right\}. \tag{3.10}$$

The planner chooses $c_t^j$ and $k_{t+1}^j$ for each country $j$ and each date $t$. The complementary slackness conditions require $\mu_t^j \ge 0$, $I_t^j \ge 0$, and $\mu_t^j I_t^j = 0$. Two notation reminders before we differentiate. First, the irreversibility multiplier is $\mu_t^j$, not the resource-constraint multiplier $\lambda_t$; the two play different roles ($\lambda_t$ shadow-prices the aggregate goods market; $\mu_t^j$ shadow-prices country $j$'s individual investment floor) and they enter the FOCs through entirely different channels. Second, $\mu_t^j \ge 0$ is the standard KKT sign: the multiplier on a $\ge$-constraint is non-negative at the optimum, and the Fischer--Burmeister residual constructed below packages this sign restriction together with the slackness condition into a single smooth squared term that is compatible with SGD.
3.3.0.3 FOC w.r.t. $c_t^j$:
Differentiating the Lagrangian with respect to $c_t^j$:

$$\tau^j (c_t^j)^{-1/\gamma_j} - \lambda_t = 0. \tag{3.11}$$

This is the consumption-sharing condition: the planner equates the Pareto-weighted marginal utility of consumption across all countries to a common shadow price $\lambda_t$. Solving (3.11) for $c_t^j$:

$$c_t^j = \left(\frac{\lambda_t}{\tau^j}\right)^{-\gamma_j}. \tag{3.12}$$

This shows that all consumption levels are determined by the single variable $\lambda_t$: a higher shadow price (resources are scarcer) lowers consumption in every country. Countries with a higher IES $\gamma_j$ respond more elastically to changes in $\lambda_t$.
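3.3.0.4 FOC w.r.t. $k_{t+1}^j$:
The variable $k_{t+1}^j$ appears in three places in the Lagrangian: (i) the date-$t$ resource constraint with coefficient $-\lambda_t\left(1 + \partial\Gamma_t^j/\partial k_{t+1}^j\right)$, (ii) the date-$t$ irreversibility constraint with coefficient $+\mu_t^j$, and (iii) the date-$(t+1)$ terms via output $y_{t+1}^j$, depreciated capital $(1-\delta)k_{t+1}^j$, adjustment costs $\Gamma_{t+1}^j$, and the irreversibility constraint. Differentiating and collecting terms:

$$\beta^t\left[-\lambda_t\left(1 + \frac{\partial \Gamma_t^j}{\partial k_{t+1}^j}\right) + \mu_t^j\right] + \beta^{t+1}\, \mathbb{E}_t\left[\lambda_{t+1}\left(\zeta A_{\text{tfp}}\, a_{t+1}^j (k_{t+1}^j)^{\zeta-1} + 1 - \delta - \frac{\partial \Gamma_{t+1}^j}{\partial k_{t+1}^j}\right) - (1-\delta)\,\mu_{t+1}^j\right] = 0. \tag{3.13}$$

Now define the marginal product of capital (inclusive of depreciation and adjustment cost effects):

$$\text{MPK}_{t+1}^j = \zeta A_{\text{tfp}}\, a_{t+1}^j (k_{t+1}^j)^{\zeta-1} + (1-\delta) - \frac{\partial \Gamma_{t+1}^j}{\partial k_{t+1}^j}, \tag{3.14}$$

and note from (3.6) that $\partial \Gamma_t^j/\partial k_{t+1}^j = \phi\left(k_{t+1}^j/k_t^j - 1\right)$. Dividing (3.13) by $\beta^t$ and substituting the MPK definition:

$$\lambda_t\left[1 + \phi\left(\frac{k_{t+1}^j}{k_t^j} - 1\right)\right] - \mu_t^j = \beta\, \mathbb{E}_t\left[\lambda_{t+1}\, \text{MPK}_{t+1}^j - (1-\delta)\,\mu_{t+1}^j\right]. \tag{3.15}$$

This is the Euler equation for country $j$. The left-hand side is the cost of investing one more unit in country $j$'s capital: the shadow price of the resources used (scaled by the marginal adjustment cost) minus the value of relaxing the irreversibility constraint. The right-hand side is the expected discounted benefit: next period's shadow price times the marginal product of capital, minus the option-value loss from tightening next period's irreversibility constraint.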
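3.3.0.5 Relative error form.
For numerical purposes, regroup (3.15) so that the cost-of-investment term stands alone on the left, $\lambda_t\left[1 + \phi\left(k_{t+1}^j/k_t^j - 1\right)\right] = \mu_t^j + \beta\, \mathbb{E}_t\left[\lambda_{t+1}\text{MPK}_{t+1}^j - (1-\delta)\mu_{t+1}^j\right]$, and divide through by it. This gives a scale-free formulation:

$$e_t^{\text{Euler},j} = \frac{\mu_t^j + \beta\, \mathbb{E}_t\left[\lambda_{t+1}\, \text{MPK}_{t+1}^j - (1-\delta)\,\mu_{t+1}^j\right]}{\lambda_t\left[1 + \phi\left(k_{t+1}^j/k_t^j - 1\right)\right]} - 1 = 0. \tag{3.16}$$

This ensures that all Euler equations are dimensionless and residuals can be interpreted directly as percentage deviations from optimality.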
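To make the relative-error form concrete, the sketch below evaluates it for one country with the conditional expectation replaced by a weighted sum over quadrature nodes. It assumes the quadratic adjustment cost with intensity `phi` and a Cobb--Douglas technology with capital share `zeta`; the argument layout (a list of weight and next-period tuples) is a hypothetical interface chosen for illustration.

```python
def mpk_next(a_next, k_next, k_next2, a_tfp, zeta, delta, phi):
    """Marginal product of capital net of depreciation and adjustment-cost effects."""
    adj = 0.5 * phi * ((k_next2 / k_next) ** 2 - 1.0)  # equals -dGamma_{t+1}/dk_{t+1}
    return zeta * a_tfp * a_next * k_next ** (zeta - 1.0) + (1.0 - delta) + adj


def euler_relative_residual(lam, mu, k, k_next, nodes, beta, delta, phi):
    """Relative Euler residual; nodes = [(weight, lam_next, mu_next, mpk_next), ...]."""
    expectation = sum(w * (lam_n * mpk_n - (1.0 - delta) * mu_n)
                      for w, lam_n, mu_n, mpk_n in nodes)
    benefit = mu + beta * expectation
    cost = lam * (1.0 + phi * (k_next / k - 1.0))
    return benefit / cost - 1.0


# Deterministic steady state: k = a = 1, lam = 1, mu = 0, a single node of weight 1.
beta, delta, zeta, phi = 0.99, 0.01, 0.36, 0.5
a_tfp = (1.0 - beta * (1.0 - delta)) / (zeta * beta)
mpk_ss = mpk_next(1.0, 1.0, 1.0, a_tfp, zeta, delta, phi)
res_ss = euler_relative_residual(1.0, 0.0, 1.0, 1.0,
                                 [(1.0, 1.0, 0.0, mpk_ss)], beta, delta, phi)
```

At the steady state the residual is zero up to rounding; perturbing `k_next` away from one while holding the nodes fixed produces a nonzero value that reads as a fractional violation of optimality.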
3.3.0.6 Aggregate resource constraint.
All output is allocated to consumption, investment, and adjustment costs:

$$\sum_{j=1}^{N}\left[A_{\text{tfp}}\, a_t^j (k_t^j)^{\zeta} + (1-\delta)k_t^j - k_{t+1}^j - \Gamma_t^j - c_t^j\right] = 0. \tag{3.17}$$

3.3.0.7 Summary of equilibrium conditions.
The complete system consists of three blocks:
- Consumption sharing (3.12): determines all consumption levels from $\lambda_t$.
- Euler equations (3.16): intertemporal optimality conditions, one per country.
- Aggregate resource constraint (3.17): closes the model by equating world supply and demand.

In addition, the irreversibility constraints $I_t^j \ge 0$ are enforced via complementary slackness ($\mu_t^j \ge 0$, $I_t^j \ge 0$, $\mu_t^j I_t^j = 0$).
3.3.0.8 Fischer--Burmeister complementarity.
The irreversibility constraint is enforced via a smoothed Fischer--Burmeister residual:

$$\psi_\varepsilon(\mu_t^j, I_t^j) = \mu_t^j + I_t^j - \sqrt{(\mu_t^j)^2 + (I_t^j)^2 + \varepsilon^2} = 0. \tag{3.18}$$

The exact Fischer--Burmeister map is the limiting case $\varepsilon \to 0$. Its zero set coincides with the positive axes in the $(I, \mu)$-plane, ensuring $\mu \ge 0$, $I \ge 0$, and $\mu I = 0$ (Figure 3.1). The smoothed version with $\varepsilon > 0$ rounds the corner at the origin and is differentiable there, improving numerical conditioning at the cost of a slight relaxation of exact complementarity. The companion notebooks use $\varepsilon = 10^{-4}$ as the default; tighter values ($10^{-6}$--$10^{-5}$) are sometimes preferred when complementarity must hold to higher accuracy, at the cost of stiffer gradients near the origin.
Figure 3.1: The Fischer--Burmeister complementarity function, drawn in the investment--multiplier plane: investment $I$ on the horizontal axis, the irreversibility multiplier $\mu$ on the vertical axis. The exact map $\psi(\mu, I) = \mu + I - \sqrt{\mu^2 + I^2}$ packs the three Karush--Kuhn--Tucker conditions $\mu \ge 0$, $I \ge 0$, $\mu I = 0$ into a single equation: $\psi = 0$ holds exactly on the two heavy blue half-axes and nowhere else. The horizontal half-axis ($I > 0$, $\mu = 0$) is the investing regime, where the country invests a strictly positive amount, the irreversibility constraint is slack, and its shadow price is therefore zero. The vertical half-axis ($I = 0$, $\mu > 0$) is the constrained regime, where the constraint binds, investment is pinned at zero, and $\mu$ measures how much the planner would pay to relax it; the origin is the knife-edge where both hold with equality. The open interior of the first quadrant ($\mu > 0$ and $I > 0$ together) is infeasible because it violates complementarity, and there $\psi > 0$ strictly (since $\mu + I > \sqrt{\mu^2 + I^2}$ whenever both are positive). This is exactly what makes the function useful as a loss term: when the network's predicted $(I, \mu)$ lands in that forbidden region, the squared residual $\psi^2$ is positive and its negative gradient (green arrow) pushes the iterate back toward the nearest feasible half-axis, so the network learns which regime applies at each state without any explicit regime switch. The exact map has a single kink, at the origin; the smoothed version $\psi_\varepsilon$ actually used in the code rounds that corner, restoring differentiability everywhere at the price of an $O(\varepsilon)$ relaxation of exact complementarity.
The complementarity conditions $\mu \ge 0$, $I \ge 0$, $\mu I = 0$ have a natural economic interpretation: when investment is strictly positive ($I > 0$), the irreversibility constraint is slack and the multiplier is zero ($\mu = 0$); conversely, when the constraint binds ($I = 0$), the multiplier is positive, reflecting the shadow value of the binding constraint. The FB function smoothly encodes both regimes, allowing the neural network to learn which regime applies for each state without explicit regime switching.
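The geometry described above can be checked with a scalar version of the residual. This is a minimal standard-library sketch of the function $\mu + I - \sqrt{\mu^2 + I^2 + \varepsilon^2}$; the name `fb_residual` is an illustrative choice and is separate from the TensorFlow helper used in the notebooks.

```python
import math


def fb_residual(mu, inv, eps=0.0):
    """Fischer--Burmeister residual; eps = 0 gives the exact (kinked) map."""
    return mu + inv - math.sqrt(mu * mu + inv * inv + eps * eps)


# Exact map: zero on both feasible half-axes.
on_investing_axis = fb_residual(0.0, 0.3)     # I > 0, mu = 0
on_constrained_axis = fb_residual(0.7, 0.0)   # I = 0, mu > 0

# Strictly positive in the open first quadrant (complementarity violated).
in_forbidden_region = fb_residual(1.0, 1.0)   # 2 - sqrt(2)

# Negative as soon as one argument is negative (sign restriction violated).
negative_multiplier = fb_residual(-0.1, 0.5)

# Smoothed map: at the origin the residual is -eps, and the zero set is the
# hyperbola mu * I = eps**2 / 2 instead of the two half-axes.
eps = 1e-2
at_origin = fb_residual(0.0, 0.0, eps)
on_hyperbola = fb_residual(0.5, eps ** 2 / (2.0 * 0.5), eps)
```

The last two lines quantify the relaxation: the smoothed zero set satisfies $\mu I = \varepsilon^2/2$, so the product of multiplier and investment is small but not exactly zero.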
3.4 DEQN Formulation
3.4.0.1 From Brock--Mirman to IRBC.
It is useful to see the IRBC as the natural extension of the one-country benchmark of Chapter 2. Table 3.3 summarizes what changes.
Table 3.3: The DEQN template is the same in both cases; only the input/output dimensions, the number of loss terms, and the presence of complementarity constraints change.
| | Brock--Mirman (Ch. 2) | IRBC (this chapter) |
|---|---|---|
| Countries | 1 | $N$ |
| States | | $2N$: $(k_t^1, \dots, k_t^N, a_t^1, \dots, a_t^N)$ |
| Policies | | $2N+1$: $(k_{t+1}^1, \dots, k_{t+1}^N, \lambda_t, \mu_t^1, \dots, \mu_t^N)$ |
| Loss terms | 1 Euler | $N$ Euler + 1 ARC + $N$ Fischer--Burmeister |
| Constraints | none | irreversibility, convex adjustment costs |
| Shocks per period | 1 | $N+1$ (one idiosyncratic per country + one aggregate) |
| Output activation | softplus or sigmoid | softplus |
| Analytical solution | yes (log utility, $\delta = 1$) | no |
The full system of equations comprises $N$ Euler equations, $N$ Fischer--Burmeister conditions, and 1 aggregate resource constraint, totaling $2N+1$ equations. Table 3.4 summarizes how the problem dimensions scale with $N$.
Table 3.4: Scaling of the IRBC state, policy, equation, and quadrature dimensions with the number of countries $N$. The state, policy, and equation counts grow linearly. Tensor-product Gauss--Hermite quadrature grows as $3^{N+1}$, while the Stroud-3 monomial rule uses only $2(N+1)$ nodes; this is why the notebook uses Gauss--Hermite only for the two-country classroom case and switches to monomial or QMC rules in larger IRBC applications.
| $N$ | States | Policies | Equations | Shock dim. | GH nodes ($3^{N+1}$) | Stroud-3 nodes |
|---|---|---|---|---|---|---|
| 2 | 4 | 5 | 5 | 3 | 27 | 6 |
| 5 | 10 | 11 | 11 | 6 | 729 | 12 |
| 10 | 20 | 21 | 21 | 11 | 177,147 | 22 |
| 50 | 100 | 101 | 101 | 51 | $\approx 2.2 \times 10^{24}$ | 102 |
| 100 | 200 | 201 | 201 | 101 | $\approx 1.5 \times 10^{48}$ | 202 |
Figure 3.2: Quadrature-cost crossover for the IRBC model as a function of the number of countries $N$. Tensor-product Gauss--Hermite (red) grows exponentially in $N$ and quickly becomes infeasible; the Stroud-3 monomial rule (blue) grows linearly and stays well under $10^3$ nodes even at $N = 100$. This is the operational reason every IRBC application beyond the classroom case uses monomial or QMC integration.
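The dimension counts and node counts in the scaling table follow from simple formulas, and the degree-3 monomial rule itself is short enough to write out. The sketch below assumes three Gauss--Hermite points per shock dimension and the standard $2d$-node monomial rule for a $d$-dimensional standard normal (nodes at $\pm\sqrt{d}$ along each axis, equal weights $1/(2d)$); function names are illustrative.

```python
import math


def irbc_dimensions(n_countries, gh_points=3):
    """Problem sizes for an N-country IRBC with N idiosyncratic + 1 aggregate shock."""
    d = n_countries + 1
    return {
        "states": 2 * n_countries,
        "policies": 2 * n_countries + 1,
        "equations": 2 * n_countries + 1,
        "shock_dim": d,
        "gh_nodes": gh_points ** d,
        "stroud3_nodes": 2 * d,
    }


def monomial_degree3_nodes(d):
    """2d nodes and the common weight of the degree-3 rule for N(0, I_d)."""
    radius = math.sqrt(d)
    nodes = []
    for axis in range(d):
        for sign in (1.0, -1.0):
            point = [0.0] * d
            point[axis] = sign * radius
            nodes.append(point)
    return nodes, 1.0 / (2 * d)


# Two countries: shocks are (eps_1, eps_2, eps_agg), so d = 3 and there are 6 nodes.
nodes, weight = monomial_degree3_nodes(3)
# Country 1's combined innovation eps_1 + eps_agg has variance 2; the rule is exact.
var_country_1 = sum(weight * (x[0] + x[2]) ** 2 for x in nodes)
```

The rule integrates all polynomials up to degree three exactly, which is why the variance of the combined innovation is recovered without error from only six nodes.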
The neural network maps the full state vector to all policy variables simultaneously through the small Swish--softplus network in Figure 3.3.
Figure 3.3: Reference network architecture used for the $N$-country IRBC model. The diagram shows the irreversible companion notebook (lecture_04_02_IRBC_DEQN_irreversible.ipynb): two hidden layers of 64 Swish units mapping the $2N$-dimensional state to a $(2N+1)$-dimensional output ($N$ capital choices, the resource-constraint multiplier $\lambda_t$, and the $N$ irreversibility multipliers $\mu_t^j$); softplus on the $\lambda$ and $\mu$ heads enforces non-negativity, and capital choices use the bounded growth head described below. The smooth-benchmark companion (lecture_04_01_IRBC_DEQN_smooth.ipynb) drops the $\mu$ block, leaving an $(N+1)$-dimensional output head and no Fischer--Burmeister residual; in both notebooks the capital head is parameterized as a bounded log-growth term (smooth) or the additive form $k_{t+1}^j = (1-\delta)k_t^j + I_t^j$ with $I_t^j \ge 0$ (irreversible), both of which keep $k_{t+1}^j > 0$ by construction.
The hidden layers use the Swish activation $\text{swish}(x) = x \cdot \text{sigmoid}(x)$, while the output layer employs the softplus function $\text{softplus}(x) = \ln(1 + e^x)$ to keep the multipliers and capital choice positive. Two approximation caveats deserve emphasis. First, $\text{softplus}(x) > 0$ for all $x$, so the multipliers are strictly positive rather than exactly zero when the constraint is slack; complementarity is enforced only approximately. Second, irreversibility requires $k_{t+1}^j \ge (1-\delta)k_t^j$; a softplus on $k_{t+1}^j$ alone does not enforce this, since the network can output a positive $k_{t+1}^j$ that nonetheless implies negative investment. A cleaner alternative is to output investment directly via $I_t^j = \text{softplus}(\cdot)$ and set $k_{t+1}^j = (1-\delta)k_t^j + I_t^j$, which hard-enforces the constraint by construction.
The total DEQN loss aggregates the equilibrium conditions. In the smooth benchmark (companion notebook lecture_04_01_IRBC_DEQN_smooth.ipynb) only the Euler and aggregate-resource-constraint residuals appear:

$$\mathcal{L}_{\text{smooth}} = \frac{1}{M}\sum_{m=1}^{M}\left[\frac{1}{N}\sum_{j=1}^{N}\left(e_m^{\text{Euler},j}\right)^2 + \left(e_m^{\text{ARC}}\right)^2\right]. \tag{3.19}$$

The irreversibility extension (companion notebook lecture_04_02_IRBC_DEQN_irreversible.ipynb) augments (3.19) with the Fischer--Burmeister complementarity block:

$$\mathcal{L}_{\text{irr}} = \mathcal{L}_{\text{smooth}} + \frac{1}{M}\sum_{m=1}^{M}\frac{1}{N}\sum_{j=1}^{N}\psi_\varepsilon\!\left(\mu_m^j, I_m^j\right)^2, \tag{3.20}$$

where $M$ is the number of training states. When the individual loss components differ in magnitude across countries (which is typical when countries differ in size or calibration), an adaptive loss-balancing scheme from Chapter 4 (e.g., ReLoBRaLo, SoftAdapt, GradNorm) can be applied to reweight the components during training.
3.4.0.2 Representative implementation.
The architecture is a 2-hidden-layer Swish network with a softplus output head. In the smooth benchmark the head has dimension $N+1$ (the $N$ capital choices and the resource-constraint multiplier $\lambda_t$); in the irreversible extension the head expands to $2N+1$, adding the $N$ irreversibility multipliers $\mu_t^j$ (softplus enforces non-negativity by construction). Only the irreversible loss carries a non-textbook line, the Fischer--Burmeister smoothing of the complementarity $\mu_t^j \ge 0$, $I_t^j \ge 0$, $\mu_t^j I_t^j = 0$:

```python
import tensorflow as tf

def fischer_burmeister(mu, I, eps=1e-4):
    # Smoothed FB residual: eps > 0 rounds the kink at the origin.
    return mu + I - tf.sqrt(mu**2 + I**2 + eps**2)
```

Program 1: Fischer--Burmeister smoothing of $\mu_t^j I_t^j = 0$ (irreversible companion notebook only).
This residual is squared elementwise and averaged across the mini-batch and across the $N$ countries, in line with the squared-residual treatment of the Euler and ARC blocks; the elementwise square is what makes the gradient field push iterates toward the complementarity axes (see Figure 3.1). Inside the per-batch cost function of the irreversible notebook, it enters alongside the Euler-equation residual (whose conditional expectation is handled by the Stroud-3 monomial rule of Section 2.6.3, with $2(N+1)$ nodes for the $N$ idiosyncratic shocks and one aggregate shock) and the aggregate-resource-constraint residual. The smooth companion implements the same compute_cost pipeline with the $\mu$ outputs and the FB block removed.
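The aggregation step can be illustrated without any deep-learning library. The following is a sketch under the assumption that the three blocks enter with equal, unit weights and that each block is a plain mean of squared residuals; the actual weighting inside the notebooks' `compute_cost` is not shown in the text, so the function below is a hypothetical stand-in.

```python
def mean_square(values):
    """Mean of squared entries of a flat list."""
    return sum(v * v for v in values) / len(values)


def flatten(rows):
    """Flatten a batch-by-country nested list into a flat list."""
    return [v for row in rows for v in row]


def deqn_loss(euler, arc, fb=None):
    """Equal-weight sum of mean-squared residual blocks.

    euler: list over batch of per-country Euler residuals
    arc:   list over batch of aggregate-resource-constraint residuals
    fb:    optional list over batch of per-country FB residuals
    """
    loss = mean_square(flatten(euler)) + mean_square(arc)
    if fb is not None:  # irreversible variant only
        loss += mean_square(flatten(fb))
    return loss


# Two states in the batch, two countries.
euler = [[0.1, -0.1], [0.0, 0.2]]
arc = [0.0, 0.1]
fb = [[0.0, 0.0], [0.2, 0.0]]
smooth_loss = deqn_loss(euler, arc)
irreversible_loss = deqn_loss(euler, arc, fb)
```

Passing `fb=None` reproduces the smooth benchmark, so a single function covers both companion setups in this sketch.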
3.5 Persistent-Simulation Training
The companion notebooks train the IRBC DEQN with a single training pipeline: a continuing ensemble of stochastic trajectories that evolves alongside the policy network. There is no Phase 1 / Phase 2 switch and no reset to the steady state between training segments.
What makes the single-pipeline approach feasible is that both companion notebooks parameterize the policy so that capital cannot leave the feasible set, even at random initialization. In the smooth notebook the network outputs a bounded log-growth term, which keeps $k_{t+1}^j$ strictly positive and per-period capital growth bounded. In the irreversible notebook the policy network outputs an investment fraction shaped by a sigmoid head and the law of motion is hard-coded with $k_{t+1}^j = (1-\delta)k_t^j + I_t^j$, $I_t^j \ge 0$. Either choice removes the reason historical implementations needed a uniform-sampling burn-in: the simulation cannot diverge.
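Both feasibility-preserving parameterizations can be sketched in scalar form. The exact functional forms and bounds used in the notebooks are not reproduced in the text, so everything below is an assumption made for illustration: a `tanh`-bounded log-growth head with a hypothetical cap `g_max`, and an investment head that takes a sigmoid fraction of current output.

```python
import math


def sigmoid(x):
    """Logistic function mapping the real line into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def next_capital_log_growth(k, raw, g_max=0.1):
    """Smooth variant (assumed form): log(k'/k) = g_max * tanh(raw), so |growth| <= g_max."""
    return k * math.exp(g_max * math.tanh(raw))


def next_capital_investment_fraction(k, y, raw, delta=0.01):
    """Irreversible variant (assumed form): invest a sigmoid share of output y."""
    investment = sigmoid(raw) * y              # always in (0, y), hence non-negative
    return (1.0 - delta) * k + investment, investment


# Even extreme raw network outputs keep capital feasible.
k, y = 1.0, 0.0558
candidates = [next_capital_log_growth(k, raw) for raw in (-50.0, 0.0, 50.0)]
k_irr, inv = next_capital_investment_fraction(k, y, -50.0)
```

Because the bounds hold for any raw output, an untrained network already generates admissible trajectories, which is the property the single-pipeline training relies on.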
A SAMPLING_MODE switch (simulation vs exogenous) is exposed for ablation studies and debugging: exogenous sampling on a wide box can be useful to confirm that a finding is not an artefact of the ergodic set. The default simulation mode nevertheless runs for the entire training horizon without a phase change.
A typical schedule on the two-country benchmark advances an ensemble of trajectories by a fixed number of periods per segment, uses a batch size of 256 and one or a small number of optimizer passes per segment, and trains with Adam under a cosine learning-rate decay; convergence is read off the diagnostics of the next section rather than off a phase-transition criterion. As a budgeting reference, the companion notebooks typically run on the order of 200--500 training segments before mean Euler errors drop below $10^{-3}$ on a held-out trajectory.
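A cosine learning-rate decay of the kind mentioned above can be written in a few lines. The initial rate `1e-3`, the floor `lr_min`, and the horizon of 400 segments are hypothetical values chosen for illustration; the text does not fix them.

```python
import math


def cosine_decay_lr(step, total_steps, lr0, lr_min=0.0):
    """Learning rate that falls from lr0 to lr_min along half a cosine wave."""
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return lr_min + 0.5 * (lr0 - lr_min) * (1.0 + math.cos(math.pi * frac))


# Hypothetical schedule: 400 training segments, one learning rate per segment.
total_segments = 400
schedule = [cosine_decay_lr(s, total_segments, lr0=1e-3, lr_min=1e-5)
            for s in range(total_segments + 1)]
```

The schedule starts at `lr0`, passes through the midpoint of `lr0` and `lr_min` halfway through training, and ends at `lr_min`.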
3.6 Results and Scalability
The DEQN approach has been successfully applied to IRBC models with up to $N = 100$ countries (200 state variables, 201 policy outputs), producing equilibrium errors below $10^{-3}$ in all Euler equations, a level comparable to the best existing solution methods at a fraction of the computational cost, while substantially mitigating curse-of-dimensionality effects in practice.
3.6.0.1 Convergence diagnostics.
The quality of the DEQN solution is assessed using several complementary diagnostics:
- Euler equation errors: For each country $j$, compute the absolute value of the relative Euler residual (3.16). Errors below $10^{-3}$ indicate that the optimality condition is violated by less than 0.1% of consumption, an acceptable tolerance for most applications.
- Resource constraint residual: Verify that the residual of (3.17) is close to zero on the test set.
- Complementarity check (irreversible companion only): Confirm that $\mu_t^j I_t^j \approx 0$ and that the multiplier is positive only when investment is at its lower bound.
- Economic diagnostics: Verify that the ergodic distribution of capital, output, and consumption has sensible properties (e.g., positive trade balances for productive countries, capital flowing to high-productivity states).
- Policy-drift / time-invariance check: Evaluate the policy on a fixed anchor cloud `X_anchor` after each monitoring interval and report `policy_drift_rms` and `policy_drift_max`. The architecture has no calendar-time input, so any fixed weight vector is a stationary recursive policy by construction; the empirical question is whether SGD has stopped moving the policy function. The run is treated as time-invariant once both drift statistics fall below the prescribed tolerances `TIME_INVARIANCE_TOL_RMS` and `TIME_INVARIANCE_TOL_MAX`.
- Zero-shock stochastic steady state (SSS): Iterate the learned policy from `ZERO_SHOCK_N_STARTS` dispersed feasible starts with all shocks set to zero. A well-trained policy converges to a common point and, in the irreversible case, to $\mu^j \approx 0$; the SSS is a fixed point of the learned stochastic policy that is not imposed during training.
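The policy-drift statistics in the list above reduce to two norms of the change in policy outputs on the anchor cloud. The sketch below assumes the policy outputs at two consecutive monitoring points are available as nested lists (anchor points by outputs); the function names and the tolerance values in the usage lines are illustrative, not the notebooks' defaults.

```python
import math


def policy_drift(prev_outputs, curr_outputs):
    """RMS and max absolute change of policy outputs on a fixed anchor cloud."""
    diffs = [c - p
             for row_p, row_c in zip(prev_outputs, curr_outputs)
             for p, c in zip(row_p, row_c)]
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return rms, max(abs(d) for d in diffs)


def is_time_invariant(rms, max_abs, tol_rms, tol_max):
    """Both drift statistics must fall below their tolerances."""
    return rms < tol_rms and max_abs < tol_max


# Two anchor points, two policy outputs each; only one entry moved by 0.2.
prev = [[1.0, 2.0], [3.0, 4.0]]
curr = [[1.0, 2.0], [3.0, 4.2]]
drift_rms, drift_max = policy_drift(prev, curr)
stable = is_time_invariant(drift_rms, drift_max, tol_rms=1e-3, tol_max=1e-2)
```

Reporting both statistics matters because a small RMS can hide a large change at a single anchor point, which the max statistic exposes.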
3.6.1 Validation Protocol
To keep the manuscript self-contained, we summarize here the validation diagnostics used for the IRBC model:
- Held-out residual table. Evaluate mean and max absolute residuals on an out-of-sample test set for each equation block (Euler and ARC always; FB only in the irreversible companion). In the two-country benchmark, the FB residuals are typically smaller than the Euler/ARC residuals.
- Euler-side comparison. Compare left and right sides of the Euler equation directly on the test set (scatter around the 45-degree line). Target thresholds are mean relative error below $10^{-3}$ and max relative error below $10^{-2}$.
- Constraint diagnostics (irreversible companion only). Verify $I_t^j \ge 0$ everywhere and that $(I_t^j, \mu_t^j)$ lies close to the complementarity axes ($\mu_t^j \approx 0$ when $I_t^j > 0$).
- Economic sanity checks. Confirm market-wide accounting identities (e.g., trade balances summing to zero), sensible consumption-sharing behavior, and stable ergodic state distributions around economically plausible regions.
- Policy-drift / time-invariance check. Track `policy_drift_rms` and `policy_drift_max` on a fixed anchor cloud across training segments; flag the run as time-invariant once both drop below the prescribed tolerances. This check distinguishes "the policy has stabilized" from "the residuals are small"; both are needed for a trustworthy recursive solution.
- Zero-shock stochastic steady state. Simulate the learned policy with all shocks set to zero from several dispersed feasible initial states. Convergence to a single point (with $\mu^j \approx 0$ in the irreversible case) is a coordinate-free sanity check that complements the held-out residual table.
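The zero-shock check is a fixed-point iteration on the learned policy. The sketch below shows the mechanics with a hypothetical stand-in policy (`toy_policy`, a linear contraction toward unit capital); in practice the callable would wrap the trained network evaluated at productivity held at its zero-shock path.

```python
def zero_shock_fixed_point(policy, k0, tol=1e-10, max_iter=10000):
    """Iterate k' = policy(k) until successive iterates differ by less than tol."""
    k = list(k0)
    for it in range(1, max_iter + 1):
        k_next = policy(k)
        if max(abs(a - b) for a, b in zip(k_next, k)) < tol:
            return k_next, it
        k = k_next
    raise RuntimeError("no convergence within max_iter iterations")


def toy_policy(k):
    """Hypothetical stand-in for a trained policy: contraction toward k = 1."""
    return [0.9 * ki + 0.1 for ki in k]


# Dispersed feasible starts should all land on the same point.
starts = [[0.5, 1.5], [2.0, 0.8], [1.2, 1.2]]
endpoints = [zero_shock_fixed_point(toy_policy, s)[0] for s in starts]
spread = max(abs(a - b)
             for e1 in endpoints for e2 in endpoints
             for a, b in zip(e1, e2))
```

A small `spread` across starts indicates a unique attracting point; a large one would flag either multiple fixed points or a policy that has not yet stabilized.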
This protocol makes solution quality auditable and comparable across model sizes and network configurations.
3.6.1.1 Policy function properties.
The learned policy functions exhibit the expected economic properties. Consumption sharing follows the Pareto-weight and IES structure in (3.12): holding the common shadow price $\lambda_t$ fixed, a higher Pareto weight $\tau^j$ raises country $j$'s consumption, and with heterogeneous IES the consumption ratio varies with $\lambda_t$. Productivity affects consumption only indirectly through the equilibrium shadow price and the resource constraint, not through a mechanical bilateral ratio $a_t^i/a_t^j$. This is the textbook complete-markets prediction: the cross-country consumption ratio depends on the Pareto weights and the IES gap, not on the productivity differential. The empirical failure of this prediction is the consumption-correlation puzzle introduced in Section 3.1; a closely related but distinct failure is the Backus--Smith puzzle, which concerns the correlation between relative consumption growth and the real exchange rate, predicted to be near one under complete markets but empirically near zero or even negative. Any model that aims to reproduce either puzzle has to break some of the assumptions used here (e.g. by restricting the asset menu, Heathcote & Perri (2002), or adding non-traded goods, Heathcote & Perri (2013)). Investment responds procyclically to productivity shocks: a high realization of $a_t^j$ raises the marginal product of capital in country $j$, triggering increased investment. When the irreversibility constraint binds ($I_t^j = 0$), capital cannot be disinvested and the multiplier $\mu_t^j$ becomes positive; the network learns this regime-switching behavior smoothly through the Fischer--Burmeister loss. Trade balances adjust to channel resources toward productive countries: positive trade balances (net exports of goods) correspond to countries whose current productivity exceeds the average, and the implied capital flows are consistent with standard international macroeconomic theory.
The key advantage of the DEQN approach is its scaling behavior: while traditional Cartesian grid-based methods (Judd, 1998) exhibit exponential growth in computation time as $N$ increases, and even adaptive sparse grid methods (Brumm & Scheidegger, 2017), which significantly mitigate the curse of dimensionality, become computationally demanding for large $N$, DEQN runtimes in our implementations are reported close to linear in $N$ over a broad range of model sizes (see Azinovic et al. (2022), Table 2 and surrounding discussion, for timings across model sizes). This favorable empirical scaling arises because the network's parameter count grows roughly linearly (more input/output neurons), while each SGD step avoids state-space grids. The companion notebooks (lecture_04_01_IRBC_DEQN_smooth.ipynb and lecture_04_02_IRBC_DEQN_irreversible.ipynb) only run the $N = 2$ case, so the linear-scaling claim cannot be reproduced from the in-class material; readers who wish to verify it directly should consult the published timings or replicate the larger-$N$ runs from the Azinovic et al. codebase.
3.6.1.2 Comparison with adaptive sparse grids.
The approach of Brumm & Scheidegger (2017) handles kinks in the policy function (e.g., those induced by the irreversibility constraint) by refining the grid locally around the kink using hierarchical surplus indicators. This keeps the method accurate but the grid remains anchored to a hypercube, so computation still scales poorly once the number of active kinks or the dimensionality grows. DEQNs do not represent kinks by grid refinement; instead, they fit a smooth approximator (Swish/softplus network) to the Fischer--Burmeister-regularized problem, which produces a globally smooth policy that tracks the true piecewise structure without needing localized grid points. The two methods are therefore complementary: adaptive sparse grids give deterministic error bounds on a hypercube; DEQNs give simulation-based error bounds on the ergodic set with no grid at all. From a theoretical perspective, Montanelli & Du (2019) establish error bounds showing that deep ReLU networks can approximate functions on sparse grids without the exponential growth in parameters that afflicts classical polynomial methods, providing formal underpinning for why deep learning can mitigate (though not eliminate) the high-dimensional approximation cost. Exact runtimes depend on architectural choices, quadrature design, and hardware; the robust finding is that the DEQN formulation avoids explicit tensor-product state grids and remains computationally viable in dimensions where standard methods become prohibitively expensive.
Beyond the IRBC setting, closely related neural-equilibrium methods have been applied to other policy-relevant problems. Nuño et al. (2024) use DEQNs to compute optimal monetary policy rules under persistent supply shocks, replacing the linearization step around steady state with a globally trained policy network. Bretscher et al. (2022) apply DEQN to multi-country international real business cycles with comparative advantage. Most recently, Azinovic-Yang & Žemlička (2025) replace the endogenous cross-sectional state with a truncated history of exogenous aggregate shocks (the sequence-space representation), so that the network's input dimension scales with the truncation horizon rather than with the number of agents, which is the heterogeneous-agent extension developed in Chapter 6.
3.7 Further Reading
- Brumm & Scheidegger (2017), adaptive sparse grids for IRBC, the classical-method benchmark this chapter contrasts with.
- Pichler (2011), an IRBC-specific application of the monomial rule of Section 2.6.3, useful as a sanity check for the multi-country setting.
- Niederreiter (1992), the standard reference for quasi-Monte Carlo and low-discrepancy sequences for the high-dimensional integrals encountered at large $N$.
- Nuño et al. (2024), a recent DEQN application to optimal monetary policy.

3.8 Exercises
Worked solutions and guidance for these exercises appear in Appendix F.
- Backus, D. K., Kehoe, P. J., & Kydland, F. E. (1992). International real business cycles. Journal of Political Economy, 100(4), 745–775.
- Heathcote, J., & Perri, F. (2002). Financial Autarky and International Business Cycles. Journal of Monetary Economics, 49(3), 601–627.
- Heathcote, J., & Perri, F. (2013). The International Diversification Puzzle Is Not As Bad As You Think. Journal of Political Economy, 121(6), 1108–1159.
- Brumm, J., & Scheidegger, S. (2017). Using Adaptive Sparse Grids to Solve High-Dimensional Dynamic Models. Econometrica, 85(5), 1575–1612. 10.3982/ECTA12216
- Azinovic, M., Gaegauf, L., & Scheidegger, S. (2022). Deep Equilibrium Nets. International Economic Review, 63(4), 1471–1525. 10.1111/iere.12575
- Judd, K. L. (1998). Numerical Methods in Economics. MIT Press.
- Montanelli, H., & Du, Q. (2019). New Error Bounds for Deep ReLU Networks Using Sparse Grids. SIAM Journal on Mathematics of Data Science, 1(1), 78–92.
- Nuño, G., Renner, P., & Scheidegger, S. (2024). Monetary policy with persistent supply shocks [Techreport]. CESifo Working Paper Series.
- Bretscher, L., Fernández-Villaverde, J., & Scheidegger, S. (2022). Ricardian Business Cycles [SSRN Scholarly Paper]. 10.2139/ssrn.4278274
- Azinovic-Yang, M., & Žemlička, J. (2025). Deep Learning in the Sequence Space. 10.48550/arXiv.2509.13623
- Pichler, P. (2011). Solving the multi-country real business cycle model using a monomial rule Galerkin method. Journal of Economic Dynamics and Control, 35(2), 240–251.
- Niederreiter, H. (1992). Random Number Generation and Quasi-Monte Carlo Methods (Vol. 63). SIAM.