11 Climate Economics and Deep Uncertainty Quantification

University of Lausanne

This chapter brings together the methods developed throughout this script and applies them to one of the most consequential computational challenges in economics: climate change policy. Integrated assessment models (IAMs) couple economic growth with the carbon cycle, temperature dynamics, and climate damages, creating high-dimensional nonstationary dynamic programming problems that are ideal candidates for the DEQN and surrogate methods we have developed. We present the CDICE model of Folini et al. (2025), solve it with DEQNs, and then use GP surrogates and Bayesian active learning, first for deep uncertainty quantification (Friedl et al., 2023), and then, applying the same surrogate-then-optimize machinery to a different OLG model and a different surrogate, to search over policy parameters and derive constrained Pareto-improving carbon tax rules in an OLG--IAM with deep uncertainty (Kübler et al., 2026). This last step illustrates a general use of surrogates that goes beyond estimation and UQ: once the structural model has been mapped into a fast, differentiable surrogate, the costly outer loop of an optimal-policy search in a dynamic stochastic heterogeneous-agent economy collapses into a small optimization on the surrogate. For broader overviews of climate economics and IAMs, see Hassler et al. (2016) on environmental macroeconomics, Dietz (2024) on IAMs, Fernández-Villaverde et al. (2025) on the intersection of climate economics and deep learning, and Ploeg & Rezai (2026) on the macroeconomics of climate policy.

11.1 The Macroeconomics of Climate Change

Climate change is a global externality: the emissions of each agent affect the welfare of all agents, including future generations that have no say in current decisions. Unlike standard market failures, the climate externality operates across centuries, involves deep scientific uncertainty, and couples the macroeconomy with the earth system in both directions. Recent overviews include Hassler et al. (2016) on environmental macroeconomics, Dietz (2024) on IAMs, Fernández-Villaverde et al. (2025) on climate economics and deep learning, and Ploeg & Rezai (2026) on the macroeconomics of climate policy.

11.1.0.1 Integrated assessment models.

Integrated assessment models (IAMs) formalize this coupling. The economy produces output and emissions; emissions accumulate in the atmosphere and raise global temperature; temperature increases cause damages that reduce output. The feedback loop is closed (Figure 11.1):

Figure 11.1: The integrated-assessment feedback loop. The economy produces output and CO$_2$ emissions; emissions accumulate in the atmosphere and raise global mean temperature ($\Delta T$); higher temperatures generate damages that reduce output and consumption, which in turn shape the path of future emissions. An IAM closes this loop and uses it to quantify the welfare cost of additional emissions, summarized by the social cost of carbon (11.1).

The central output of an IAM is the social cost of carbon (SCC): the marginal welfare cost of one additional unit of CO$_2$ emissions, measured in consumption-equivalent units. When emissions are measured in GtC (gigatons of carbon), the SCC has units of consumption per GtC. Conversion to USD per tCO$_2$ requires first applying the consumption-to-USD numeraire and then converting the carbon mass unit: one tCO$_2$ contains $12/44$ tons of carbon, so a price expressed per ton of carbon is divided by $44/12$ to obtain the corresponding price per ton of CO$_2$ (and a GtC price is also divided by $10^9$). Formally,

$$\mathrm{SCC}_t = -\frac{\partial V_t / \partial E_t}{\partial V_t / \partial C_t}, \tag{11.1}$$

where $V_t$ is the value function, $E_t$ is contemporaneous emissions, and $C_t$ is consumption. The flow form is linked to the stock-based form $\mathrm{SCC}^M_t = -(\partial V_t/\partial M^{\mathrm{AT}}_t)/(\partial V_t/\partial C_t)$, derived in Section 11.6, by the chain rule

$$\frac{\partial V_t}{\partial E_t} \;=\; \frac{\partial V_{t+1}}{\partial M^{\mathrm{AT}}_{t+1}}\,\frac{\partial M^{\mathrm{AT}}_{t+1}}{\partial E_t} \tag{11.2}$$

together with the carbon-to-CO$_2$ unit conversion noted above. In a first-best allocation, the optimal carbon tax equals the SCC (Golosov et al., 2014). The SCC is high when climate damages are steep, the climate response is strong, discounting is low, and tipping risks are material (Cai & Lontzek, 2019; Dietz, 2024).
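The mass-unit part of the conversion described above is easy to get wrong by a factor of $44/12$, so a minimal sketch may help. It assumes the SCC has already been converted from consumption units to USD per GtC; the function name and the sample input are hypothetical and not taken from the CDICE code base.

```python
# Illustrative sketch of the SCC mass-unit conversion (names are hypothetical).
C_TO_CO2 = 44.0 / 12.0   # tons of CO2 per ton of carbon
TONS_PER_GT = 1e9        # tons per gigaton


def scc_usd_per_tco2(scc_usd_per_gtc: float) -> float:
    """Convert a carbon price from USD per GtC to USD per tCO2."""
    usd_per_tc = scc_usd_per_gtc / TONS_PER_GT   # GtC -> tC
    return usd_per_tc / C_TO_CO2                 # tC  -> tCO2


# Hypothetical SCC, assumed to be expressed in USD per GtC already.
print(scc_usd_per_tco2(3.0e11))
```

Both steps are divisions: a ton of CO$_2$ contains less carbon than a ton of carbon, so the price per ton of CO$_2$ is lower than the price per ton of carbon.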

11.1.0.2 From surrogates to climate IAMs.

Chapter 9 introduced surrogates and Bayesian active learning as fast approximators for repeated model evaluations. Climate IAMs are the natural application: each parameter configuration (climate sensitivity, damage curvature, discount rate) is expensive to solve, yet policy questions require evaluating thousands of configurations to map out tail risks and Pareto-improving rules. The DEQN approach of Chapters 2--3, combined with the GP and active-learning toolkit of Chapter 9, is therefore the natural workhorse for climate-policy uncertainty quantification.

11.1.0.3 Why computation matters.

Solving IAMs globally, as opposed to linearization or certainty equivalence, is computationally demanding for several reasons:

The deep learning toolkit developed in this course (DEQNs, deep surrogates, and GP-based uncertainty quantification) is therefore particularly well suited to climate economics.

11.1.0.4 The three movements of this chapter.

The remainder of this chapter has three movements. Movement 1 (Section 11.3--Section 11.5) makes precise what changes when we ask the Deep Equilibrium Network of Chapter 2 to solve a non-stationary IAM, and presents the modified training algorithm in one labeled box. Movement 2 (Paragraph--Section 11.8) puts that algorithm to work on a concrete stochastic DICE economy. Movement 3 (Section 11.9--Section 11.12) sketches the four extensions that matter for serious climate-finance research: Bayesian learning on the climate sensitivity, recursive Epstein--Zin preferences, global uncertainty quantification of the social cost of carbon, and constrained Pareto-improving carbon-tax design in a heterogeneous-agent IAM.

11.2 The DICE Model

11.2.1 The IAM Landscape

DICE is the workhorse of this chapter, but it is one of several integrated assessment models in active use. The list below summarizes the active landscape; each model trades global parsimony for regional or sectoral granularity, and computational tractability for fidelity of the climate physics.

DICE.
Global aggregate; 3-box carbon cycle and 2-layer energy-balance model; one-sector Ramsey planner. The standard benchmark for SCC and integrated policy analysis (Nordhaus, 2017).
RICE.
Twelve-region extension of DICE with trade (Nordhaus & Yang, 1996). Used for regional SCC and equity questions.
CDICE.
A global DICE-2016 recalibration tailored to deep-learning solution methods, with Epstein--Zin preferences and OLG variants. The model used in Section 11.6 below (Folini et al., 2025).
ACE.
Analytic Climate Economy: log-linear approximations to the carbon cycle, temperature dynamics, and damages yield a closed-form optimal carbon tax (Traeger, 2023). Acts as an analytic benchmark for the numerical SCC computed below.
FaIR / MAGICC.
Reduced-complexity climate emulators that take emissions as input and produce temperature responses; widely used to translate IPCC scenarios to economic models.
WITCH / REMIND.
Multi-region IAMs with full energy-system modules; standard for mitigation-pathway and technology-portfolio studies. Outside the scope of this script.

CDICE is the model we solve in this chapter. ACE provides a useful analytic shadow for it, in particular a closed-form SCC that decomposes transparently into structural parameters; we do not derive that closed form here, but Exercise 11.3 asks the reader to compute it from Traeger (2023) and compare against the DEQN-trained CDICE solution as an external sanity check.

The Dynamic Integrated model of Climate and the Economy (DICE), developed by Nordhaus (1994), is the most influential IAM; in this chapter we follow the variant of Nordhaus (2017) as recalibrated by Folini et al. (2025). It couples a neoclassical growth model with a reduced-form climate module in a single global framework. The remainder of this section builds the model up block by block, in increasing complexity: first the macro-economic backbone (Section 11.2.2), then the emissions and abatement technology (Section 11.2.3), then the climate physics (Section 11.2.5--Section 11.2.6), and finally the damage feedback (Section 11.2.7) that closes the loop. A consolidated calibration is given in Table 11.2.

11.2.1.1 Time-step convention.

Following Folini et al. (2025) we calibrate CDICE on an annual time step, $\Delta_t = 1$ year, so that all rates in Table 11.2 (capital depreciation $\delta$, pure rate of time preference $\rho$, the decay rates $g^{\sigma}_0, \delta^{\sigma}, g^{\mathrm{back}}, \delta^{\mathrm{Land}}$, the carbon-cycle transfer rates $b_{12}, b_{23}$, and the temperature-block coefficients $c_1, c_3, c_4$) are read directly as annual values; the original DICE-2016 calibration of Nordhaus (2017), by contrast, hard-wires a 5-year time step into its coefficients. Growth rates of TFP and population are written as annual log changes, $g^A_t := \ln(A_{t+1}/A_t)$ and $g^L_t := \ln(L_{t+1}/L_t)$, and the dynamics (11.14), (11.16)--(11.17) and the FOC residuals of Section 11.6 therefore carry no $\Delta_t$ multipliers; emissions $E_t$ entering (11.14) are the annual total. Switching to a non-annual $\Delta_t$ amounts to reinserting the multiplications $\Delta_t \cdot \{g^{\sigma}_0, \delta^{\sigma}, g^{\mathrm{back}}, \delta^{\mathrm{Land}}, b_{12}, b_{23}, c_1, c_3, c_4\}$ in the obvious places, the time-step-generic form discussed in Online Appendix D of Folini et al. (2025).

11.2.2 Production and the Ramsey--Cass--Koopmans backbone

Strip away the climate block and DICE is just a neoclassical growth model with population and TFP growth. A single representative firm produces gross output with Cobb--Douglas technology in capital and effective labor,

$$Y^{\mathrm{gross}}_t \;=\; K_t^{\alpha}\,(A_t L_t)^{1-\alpha}, \tag{11.3}$$

where $\alpha\in(0,1)$ is the capital share, $A_t$ is total factor productivity, and $L_t$ is population. Both $A_t$ and $L_t$ follow deterministic but time-varying paths: $A_t$ trends because of exogenous productivity growth, and $L_t$ follows the calibrated demographic projection of Nordhaus (2017). The capital stock evolves under the standard accumulation law

$$K_{t+1} \;=\; (1-\delta) K_t + I_t, \tag{11.4}$$

with depreciation rate $\delta$ and gross investment $I_t$. The economy’s resource constraint, written in terms of net (after-damages, after-abatement) output that we develop in Section 11.2.3--Section 11.2.7, is $C_t + I_t \le Y^{\mathrm{net}}_t$, where $C_t$ is aggregate consumption.

A benevolent planner picks $(C_t,\, I_t,\, \mu_t)_{t\ge 0}$ to maximize a discounted CRRA-IES felicity sum,

$$V_0 \;=\; \sum_{t=0}^{\infty} \beta_t\, L_t\,\frac{(C_t/L_t)^{1-1/\psi}-1}{1-1/\psi}, \qquad \beta_t \;=\; \exp(-\rho\,\Delta_t \cdot t), \tag{11.5}$$

with intertemporal-elasticity-of-substitution parameter $\psi>0$ and pure rate of time preference $\rho$. This is the time-additive aggregator of the standard Ramsey--Cass--Koopmans growth model; we replace it with the recursive Epstein--Zin form once stochastic risk enters the picture (Section 11.10). The planner controls $\mu_t$, the emissions abatement rate, in addition to the savings--consumption split; we develop the cost of abatement next.
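The three building blocks of this subsection (Cobb--Douglas output, capital accumulation, and the discounted felicity sum) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CDICE implementation: the parameter values are the economy-block entries of Table 11.2, the infinite sum is truncated to the length of the supplied path, and the state values and the constant savings rate at the bottom are hypothetical.

```python
import math

# Economy block of Table 11.2 (annual time step).
ALPHA, DELTA, RHO, PSI = 0.30, 0.10, 0.015, 0.69


def gross_output(K, A, L, alpha=ALPHA):
    """Cobb-Douglas gross output in capital and effective labor."""
    return K**alpha * (A * L) ** (1.0 - alpha)


def next_capital(K, I, delta=DELTA):
    """Capital accumulation with depreciation rate delta."""
    return (1.0 - delta) * K + I


def welfare(C_path, L_path, rho=RHO, psi=PSI, dt=1.0):
    """Discounted felicity sum, truncated to the length of the paths."""
    expo = 1.0 - 1.0 / psi
    total = 0.0
    for t, (C, L) in enumerate(zip(C_path, L_path)):
        beta_t = math.exp(-rho * dt * t)          # discount factor
        total += beta_t * L * ((C / L) ** expo - 1.0) / expo
    return total


# Hypothetical state and constant savings rate, for illustration only.
K, A, L, s = 200.0, 5.0, 7.0, 0.25
Y = gross_output(K, A, L)
K_next = next_capital(K, s * Y)
print(Y, K_next, welfare([(1.0 - s) * Y], [L]))
```

In this sketch climate damages and abatement costs are switched off, so the whole of gross output is split between consumption and investment.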

11.2.3 Industrial emissions, abatement, and the backstop technology

Industrial production is a CO$_2$-emitting activity. Let $\sigma_t$ denote the carbon intensity of gross output, expressed in CDICE’s working units of $10^3$ GtC of emissions per unit of gross output (a $10^3$ GtC normalization on the carbon stocks improves the conditioning of the climate side; see Table 11.2). Industrial emissions are then $\sigma_t Y^{\mathrm{gross}}_t$ before any mitigation effort; with abatement rate $\mu_t \in [0,1]$ the planner can scale these emissions down,

$$E_{\mathrm{ind},t} \;=\; (1-\mu_t)\,\sigma_t\, Y^{\mathrm{gross}}_t \;=\; (1-\mu_t)\,\sigma_t\, K_t^{\alpha}(A_t L_t)^{1-\alpha}. \tag{11.6}$$

Carbon intensity is itself an exogenous decreasing time path. DICE-2016 calibrates a closed-form decay,

$$\sigma_t \;=\; \sigma_0\,\exp\!\left[\frac{g^{\sigma}_0}{\log(1+\delta^{\sigma})}\bigl((1+\delta^{\sigma})^{t}-1\bigr)\right], \tag{11.7}$$

with initial intensity $\sigma_0$, initial growth rate $g^{\sigma}_0<0$ (so emissions per dollar of output fall over time), and second-derivative parameter $\delta^{\sigma}>0$ that bends the path further down at long horizons. Equation (11.7) captures the steady decarbonization that even unabated world output undergoes through ongoing technological change; the planner’s $\mu_t$ is the additional mitigation effort on top of that baseline.
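To see what the closed-form decay (11.7) implies numerically, the sketch below evaluates it with the $\sigma_0$, $g^{\sigma}_0$, and $\delta^{\sigma}$ entries of Table 11.2. The function name is a hypothetical choice; $t$ is measured in years since the initial period.

```python
import math

# Emissions block of Table 11.2 (sigma_0 in 10^3 GtC per unit of output).
SIGMA0, G0_SIGMA, DELTA_SIGMA = 9.556e-5, -0.0152, 0.001


def carbon_intensity(t):
    """Exogenous carbon intensity sigma_t after t years, following (11.7)."""
    exponent = (G0_SIGMA / math.log(1.0 + DELTA_SIGMA)
                * ((1.0 + DELTA_SIGMA) ** t - 1.0))
    return SIGMA0 * math.exp(exponent)


# Fraction of the initial carbon intensity that remains at selected horizons.
for year in (0, 50, 100):
    print(year, carbon_intensity(year) / SIGMA0)
```

The one-year log change of the path at $t=0$ is $g^{\sigma}_0\,\delta^{\sigma}/\log(1+\delta^{\sigma}) \approx g^{\sigma}_0$, which is why $g^{\sigma}_0$ can be read as the initial annual growth rate.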

Abatement is not free. In the spirit of an aggregate marginal-abatement-cost curve, DICE assumes the abatement-cost share of gross output is a power function of $\mu_t$,

$$\Theta(\mu_t) \;=\; \theta_{1,t}\,\mu_t^{\theta_2}, \tag{11.8}$$

with curvature parameter $\theta_2>1$ (a typical calibration is $\theta_2=2.6$). The level coefficient $\theta_{1,t}$ is not a free parameter: it is pinned down by the cost of the backstop technology, the cleanest large-scale abatement technology available at any given time (e.g. direct air capture). Let $p^{\mathrm{back}}_t$ denote the cost per unit of CO$_2$ avoided when the backstop is fully deployed, and assume an exogenous declining path,

$$p^{\mathrm{back}}_t \;=\; p^{\mathrm{back}}_0\,\exp(-g^{\mathrm{back}}\,t), \tag{11.9}$$

reflecting steady cost reductions in clean technologies. Setting the marginal abatement cost at $\mu_t=1$ equal to the backstop price (multiplied by carbon-to-CO$_2$ conversion $\mathrm{c2co2}$ to keep mass units consistent, and by $10^3$ to convert $\sigma_t$ from $10^3$ GtC working units back to GtC) yields the calibration identity

$$\theta_{1,t} \;=\; \frac{p^{\mathrm{back}}_t \cdot 10^3 \cdot \mathrm{c2co2} \cdot \sigma_t}{\theta_2}. \tag{11.10}$$

Equation (11.10) is what makes $\Theta(\mu)$ economically meaningful rather than a fitted polynomial: the abatement-cost function inherits its level from the backstop price and its curvature from the assumption $\theta_2=2.6$. The $10^3$ factor matches Equation (11) of Online Appendix D of Folini et al. (2025) and the corresponding factor of 1000 in the companion implementation. As the backstop becomes cheaper ($p^{\mathrm{back}}_t \downarrow$), full mitigation becomes cheaper too, which is one of the channels that makes the deterministic optimal $\mu_t$ rise toward 1 over the 21st century.
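The calibration identity (11.10) can be checked numerically. The sketch below is a minimal illustration, assuming the backstop and curvature entries of Table 11.2 and taking $\sigma_t$ as an input; the function names are hypothetical and do not refer to the companion implementation.

```python
import math

# Abatement entries of Table 11.2.
P_BACK0 = 0.55     # initial backstop price, thUSD per tCO2
G_BACK = 0.005     # annual decay rate of the backstop price
THETA2 = 2.6       # curvature of the abatement-cost function
C2CO2 = 3.666      # carbon-to-CO2 mass conversion
SIGMA0 = 9.556e-5  # initial carbon intensity, 10^3 GtC per unit of output


def backstop_price(t):
    """Exogenously declining backstop price after t years."""
    return P_BACK0 * math.exp(-G_BACK * t)


def theta1(t, sigma_t):
    """Level coefficient pinned down by the backstop price, as in (11.10)."""
    return backstop_price(t) * 1e3 * C2CO2 * sigma_t / THETA2


def abatement_cost_share(mu, t, sigma_t):
    """Share of gross output spent on abatement at abatement rate mu."""
    return theta1(t, sigma_t) * mu**THETA2


# Cost share of full versus half abatement in the initial period.
print(abatement_cost_share(1.0, 0, SIGMA0), abatement_cost_share(0.5, 0, SIGMA0))
```

Because $\theta_2 > 1$, the cost share is convex in $\mu$: in this sketch halving the abatement rate cuts the cost share by a factor of $2^{2.6} \approx 6$, so the first units of abatement are cheap and the last ones are expensive.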

The bound $\mu_t \in [0,1]$ deserves a comment. $\mu_t = 0$ means business-as-usual emissions; $\mu_t = 1$ means full deployment of the backstop, eliminating all industrial emissions. Values $\mu_t > 1$ would correspond to net-negative industrial emissions (e.g. aggressive direct air capture beyond the firm’s own footprint), which DICE forbids; we will impose the upper bound as a Kuhn--Tucker constraint, smoothed by a Fischer--Burmeister term, in Paragraph.

11.2.4 Land-use emissions and net output

The atmosphere does not distinguish between an industrial flow and a non-industrial flow of carbon. In DICE, total emissions therefore comprise an industrial component (11.6) and an exogenous land-use-change component,

$$E_{\mathrm{Land},t} \;=\; E_{\mathrm{Land},0}\,\exp(-\delta^{\mathrm{Land}}\,t), \tag{11.11}$$

which decays smoothly toward zero as deforestation slows. Total emissions feeding the atmosphere are

$$E_t \;=\; E_{\mathrm{ind},t} + E_{\mathrm{Land},t}. \tag{11.12}$$

Closing the production block requires accounting for two additional drains on gross output: climate damages, governed by atmospheric temperature $T^{\mathrm{AT}}_t$ via a damage fraction $\Omega(T^{\mathrm{AT}}_t)$ developed in Section 11.2.7, and abatement spending (11.8). Net output is therefore

$$Y^{\mathrm{net}}_t \;=\; \bigl(1 - \Omega(T^{\mathrm{AT}}_t) - \Theta(\mu_t)\bigr)\,Y^{\mathrm{gross}}_t, \tag{11.13}$$

which is what is available for consumption and investment. The additive form is the convention adopted in CDICE and used by the production-grade DEQN library port; an alternative multiplicative form $(1-\Omega^{\mathrm{ret}})(1-\Theta)$ with retained-output factor $\Omega^{\mathrm{ret}}$ appears in Nordhaus (2008).
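A short sketch shows how the emissions and net-output accounting fits together. It assumes the land-use entries of Table 11.2 and takes carbon intensity, gross output, and the damage and abatement shares as inputs; all function names and the sample numbers at the bottom are hypothetical.

```python
import math

# Land-use entries of Table 11.2 (emissions in 10^3 GtC per year).
E_LAND0, DELTA_LAND = 7.09e-4, 0.023


def land_emissions(t):
    """Exogenous land-use emissions after t years."""
    return E_LAND0 * math.exp(-DELTA_LAND * t)


def total_emissions(mu, sigma_t, y_gross, t):
    """Industrial emissions after abatement plus land-use emissions."""
    e_ind = (1.0 - mu) * sigma_t * y_gross
    return e_ind + land_emissions(t)


def net_output(y_gross, damage_share, abatement_share):
    """Additive CDICE convention: both shares are deducted from gross output."""
    return (1.0 - damage_share - abatement_share) * y_gross


# Hypothetical inputs: gross output 100, 2% damages, 1% abatement cost.
print(total_emissions(0.03, 9.556e-5, 100.0, 0), net_output(100.0, 0.02, 0.01))
```

With $\mu_t = 1$ the industrial component vanishes and only the exogenous land-use flow reaches the atmosphere.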

The planner’s controls and exogenous trends are now all named. The endogenous economic state is the capital stock $K_t$. The exogenous trends are TFP $A_t$, population $L_t$, carbon intensity $\sigma_t$, land-use emissions $E_{\mathrm{Land},t}$, and (added below) the non-CO$_2$ component of radiative forcing $F^{\mathrm{EX}}_t$. The planner controls the consumption--investment split (equivalently, the savings rate $s_t$) and the abatement rate $\mu_t \in [0,1]$. All that remains is the climate side: the carbon cycle that turns total emissions $E_t$ into atmospheric concentration, the energy balance that turns concentration into temperature, and the damage function that turns temperature back into output loss.

11.2.5 Carbon cycle

DICE represents the global carbon cycle as a three-reservoir linear system: an atmospheric box, an upper (mixed-layer) ocean box, and a lower (deep) ocean box. Carbon flows between reservoirs at calibrated rates, and total emissions $E_t$ from (11.12) enter directly into the atmospheric reservoir. Stacking concentrations as $M_t = (M^{\mathrm{AT}}_t,\, M^{\mathrm{UO}}_t,\, M^{\mathrm{LO}}_t)^\top$, the transition is

$$M_{t+1} \;=\; (I + B)\, M_t \;+\; \bm{e}_1\,E_t, \tag{11.14}$$

where $\bm{e}_1 = (1,0,0)^\top$ injects emissions into the atmosphere alone, $E_t$ is the per-period emissions total, and the transfer matrix

$$B \;=\; \begin{pmatrix} -b_{12} & b_{12}\,M^{\mathrm{AT}}_{\mathrm{eq}}/M^{\mathrm{UO}}_{\mathrm{eq}} & 0 \\ b_{12} & -b_{12}\,M^{\mathrm{AT}}_{\mathrm{eq}}/M^{\mathrm{UO}}_{\mathrm{eq}} - b_{23} & b_{23}\,M^{\mathrm{UO}}_{\mathrm{eq}}/M^{\mathrm{LO}}_{\mathrm{eq}} \\ 0 & b_{23} & -b_{23}\,M^{\mathrm{UO}}_{\mathrm{eq}}/M^{\mathrm{LO}}_{\mathrm{eq}} \end{pmatrix} \tag{11.15}$$

encodes the two atmosphere--upper-ocean exchange rates ($b_{12}$ in either direction) and the two upper-ocean--lower-ocean exchange rates ($b_{23}$ in either direction). The off-diagonal scaling by the equilibrium-mass ratios $M^{\mathrm{AT}}_{\mathrm{eq}}/M^{\mathrm{UO}}_{\mathrm{eq}}$ and $M^{\mathrm{UO}}_{\mathrm{eq}}/M^{\mathrm{LO}}_{\mathrm{eq}}$ guarantees that, under zero net emissions, the system relaxes to the calibrated pre-industrial equilibrium $M_{\mathrm{eq}} = (M^{\mathrm{AT}}_{\mathrm{eq}},\, M^{\mathrm{UO}}_{\mathrm{eq}},\, M^{\mathrm{LO}}_{\mathrm{eq}})^\top$. Calibrated values for $b_{12}, b_{23}$, and $M_{\mathrm{eq}}$ in CDICE are listed in Table 11.2. The lecture slides for this chapter sometimes write the same transition with four directional rates $\phi_{12},\phi_{21},\phi_{23},\phi_{32}$ in place of $b_{12}$ and $b_{23}$; the two parameterizations are identical under $\phi_{12} = b_{12}$, $\phi_{21} = b_{12}\,M^{\mathrm{AT}}_{\mathrm{eq}}/M^{\mathrm{UO}}_{\mathrm{eq}}$, $\phi_{23} = b_{23}$, $\phi_{32} = b_{23}\,M^{\mathrm{UO}}_{\mathrm{eq}}/M^{\mathrm{LO}}_{\mathrm{eq}}$, i.e. the slide form makes the equilibrium-mass scaling absorbed into $B$ explicit at the cost of two extra symbols.

Equation (11.14) is a pulse-and-decay system: a unit pulse of emissions raises atmospheric carbon by one unit, and that anomaly then bleeds into the upper ocean over decades and into the deep ocean over centuries. Figure 11.2 shows the implied BAU emissions trajectory under nine alternative climate-module calibrations; the curves nearly coincide, and what little spread remains is driven mostly by the equilibrium climate sensitivity (developed in Section 11.2.6), not by the carbon cycle, which is tightly disciplined by the pulse and step tests of Section 11.2.8.
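The pulse-and-decay behavior of (11.14) can be reproduced with a few lines of plain Python. The sketch below is illustrative only: it uses the carbon-cycle entries of Table 11.2, starts from the pre-industrial equilibrium, adds a 100 GtC pulse (0.1 in $10^3$ GtC working units), and sets emissions to zero afterwards; the function names are hypothetical.

```python
# Carbon-cycle entries of Table 11.2 (masses in 10^3 GtC).
B12, B23 = 0.054, 0.0082
M_EQ = [0.607, 0.489, 1.281]


def transfer_matrix(b12=B12, b23=B23, m_eq=M_EQ):
    """Transfer matrix B with equilibrium-mass scaling of the return flows."""
    r_au = m_eq[0] / m_eq[1]   # atmosphere / upper-ocean equilibrium ratio
    r_ul = m_eq[1] / m_eq[2]   # upper-ocean / lower-ocean equilibrium ratio
    return [
        [-b12, b12 * r_au, 0.0],
        [b12, -b12 * r_au - b23, b23 * r_ul],
        [0.0, b23, -b23 * r_ul],
    ]


def carbon_step(M, E, B):
    """One annual step M' = (I + B) M + e_1 E."""
    nxt = [M[i] + sum(B[i][j] * M[j] for j in range(3)) for i in range(3)]
    nxt[0] += E   # emissions enter the atmospheric box only
    return nxt


B = transfer_matrix()
M = [M_EQ[0] + 0.1, M_EQ[1], M_EQ[2]]   # 100 GtC pulse on top of equilibrium
for _ in range(100):
    M = carbon_step(M, 0.0, B)
anomaly = M[0] - M_EQ[0]
print(anomaly)   # atmospheric anomaly remaining after 100 years
```

Two properties are worth checking in any implementation: the columns of $B$ sum to zero, so total carbon changes only through emissions, and $B M_{\mathrm{eq}} = 0$, so the equilibrium is a fixed point when emissions are zero.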

Figure 11.2: Business-as-usual industrial emissions in CDICE (in GtCO$_2$/yr) under the nine combinations of three carbon-cycle calibrations (MMM, MESMO, LOVECLIM) and three temperature calibrations (MMM, HadGEM2-ES, GISS-E2-R); the thin CDICE curves overlap visually, confirming that the BAU emissions path is essentially insensitive to the climate-module calibration because $\sigma_t$ and $A_t$ are exogenous. The thick red and orange curves are the RCP 8.5 and RCP 6.0 scenarios, included as climate-policy reference paths. Reproduced from Folini et al. (2025), Figure 11(a).

11.2.6 Two-layer energy balance and radiative forcing

A two-layer energy balance model links carbon concentrations to temperature:

$$T^{\mathrm{AT}}_{t+1} = T^{\mathrm{AT}}_t + c_1 \bigl(F_t - \lambda\, T^{\mathrm{AT}}_t - c_3(T^{\mathrm{AT}}_t - T^{\mathrm{OC}}_t)\bigr) \tag{11.16}$$

$$T^{\mathrm{OC}}_{t+1} = T^{\mathrm{OC}}_t + c_4 \bigl(T^{\mathrm{AT}}_t - T^{\mathrm{OC}}_t\bigr) \tag{11.17}$$

where radiative forcing is

$$F_t = F_{\mathrm{2\times CO_2}} \frac{\log(M^{\mathrm{AT}}_t / M^{\mathrm{AT}}_{\mathrm{PI}})}{\log 2} + F^{\mathrm{EX}}_t. \tag{11.18}$$

Figure 11.3 summarizes the full topology of the climate side: industrial emissions enter the atmospheric carbon stock, leak into the upper and lower ocean reservoirs at calibrated rates, raise radiative forcing through the logarithmic CO$_2$ term, and warm the atmospheric and ocean temperature layers through the two-layer energy balance.

Figure 11.3: Topology of the CDICE climate side. Total emissions $E_t$ enter the atmospheric carbon box $M^{\mathrm{AT}}_t$, leak into the upper- and lower-ocean carbon boxes at exchange rates $b_{12}$ and $b_{23}$, and drive radiative forcing $F_t$ through the logarithmic CO$_2$ relation. The two-layer energy balance maps $F_t$ into the atmospheric temperature $T^{\mathrm{AT}}_t$ via $c_1$, with $c_3, c_4$ governing the heat exchange between atmosphere and ocean. The dashed arrow closes the loop through the damage function back into output (developed in Section 11.2.7). Five climate states $(M^{\mathrm{AT}}, M^{\mathrm{UO}}, M^{\mathrm{LO}}, T^{\mathrm{AT}}, T^{\mathrm{OC}})$ form the climate-side block of the DEQN state vector (11.25).

The parameter $\lambda = F_{\mathrm{2\times CO_2}} / \Delta T_{\mathrm{AT},\times 2}$ is determined by the equilibrium climate sensitivity (ECS), defined as the long-run atmospheric warming from a doubling of CO$_2$ concentration. We treat $\lambda$ as a deterministic constant in the baseline model; Section 11.9 promotes it to a learnable Gaussian parameter, with the additive feedback term $\varphi_{1C}\tilde f_{t+1} T^{\mathrm{AT}}_t$ entering the right-hand side of (11.16) and the coefficient $\varphi_{1C}$ defined in that subsection. ECS is one of the most consequential and uncertain parameters in climate science (Roe & Baker, 2007; Knutti et al., 2017). Observational and model-based estimates place ECS in a likely (66 %) range of 2.5°C--4°C and a very likely (90 %) range of 2°C--5°C, with a best estimate of approximately 3°C (Calvin et al., 2023); ECS uncertainty is one of the largest single sources of variance in the SCC.
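The link between $\lambda$ and ECS can be verified by simulating the two-layer energy balance under a sustained CO$_2$ doubling. The sketch below is a minimal illustration under stated assumptions: $c_1, c_3, c_4$ are the MMM entries of Table 11.2, ECS is set to the 3°C best estimate quoted above, while the forcing per doubling `F2XCO2 = 3.45` and the identification of $M^{\mathrm{AT}}_{\mathrm{PI}}$ with $M^{\mathrm{AT}}_{\mathrm{eq}}$ are illustrative assumptions rather than values stated in this section.

```python
import math

C1, C3, C4 = 0.137, 0.73, 0.00689   # MMM temperature block, Table 11.2
F2XCO2 = 3.45                       # ASSUMED forcing per CO2 doubling
ECS = 3.0                           # best-estimate ECS quoted in the text
M_AT_PI = 0.607                     # ASSUMED equal to M_eq^AT (10^3 GtC)
LAMBDA = F2XCO2 / ECS               # climate feedback parameter


def forcing(m_at, f_ex=0.0):
    """Radiative forcing from atmospheric carbon plus exogenous forcing."""
    return F2XCO2 * math.log(m_at / M_AT_PI) / math.log(2.0) + f_ex


def temperature_step(t_at, t_oc, f, lam=LAMBDA):
    """One annual step of the two-layer energy balance."""
    t_at_next = t_at + C1 * (f - lam * t_at - C3 * (t_at - t_oc))
    t_oc_next = t_oc + C4 * (t_at - t_oc)
    return t_at_next, t_oc_next


# Hold atmospheric carbon at twice its pre-industrial level for 3000 years.
f_double = forcing(2.0 * M_AT_PI)
t_at, t_oc = 0.0, 0.0
for _ in range(3000):
    t_at, t_oc = temperature_step(t_at, t_oc, f_double)
print(t_at, t_oc)   # both layers approach ECS in the long run
```

In the steady state both layers share the same temperature and $F = \lambda T^{\mathrm{AT}}$, so a doubling of CO$_2$ yields $T^{\mathrm{AT}} = F_{2\times\mathrm{CO}_2}/\lambda$, which equals ECS by construction. The slow ocean coefficient $c_4$ is what stretches the approach over centuries.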

11.2.7 Damage function: closing the climate--economy loop

The damage function is what turns a temperature anomaly back into an output loss, and so it is what closes the economy--climate--damages feedback loop drawn schematically in Figure 11.1. Following the convention in Folini et al. (2025), Online Appendix D, we treat $\Omega(T_{\mathrm{AT}})$ as the damage fraction of gross output (the fraction lost to climate damages, increasing in $T_{\mathrm{AT}}$), and the abatement-cost fraction $\Theta(\mu)$ from (11.8) as a separate output drain. The two enter additively in net output (11.13); an alternative multiplicative form $(1-\Omega^{\mathrm{ret}})(1-\Theta)$ with retained-output factor $\Omega^{\mathrm{ret}}$ is used by Nordhaus (2008).

The workhorse specification is Nordhaus (2008)’s quadratic,

$$\Omega^N(T_{\mathrm{AT}}) \;=\; \pi_1\, T_{\mathrm{AT}} + \pi_2\, T_{\mathrm{AT}}^2, \tag{11.19}$$

which is relatively benign for moderate warming and is what we use in the deterministic CDICE solve below. Calibrated values $(\pi_1, \pi_2)$ are listed in Table 11.2. The damage function (11.19) is the most contested object in the IAM literature: at $T_{\mathrm{AT}}=3\,{}^\circ\mathrm{C}$ above pre-industrial, Nordhaus--quadratic damages amount to roughly $2\%$ of gross output, which several recent empirical literatures argue is far below realistic central estimates. We therefore treat the damage curvature $\pi_2$ as one of the two key uncertain parameters in the deep-UQ analysis of Section 11.11 (the other being the equilibrium climate sensitivity).
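The "roughly 2% at 3°C" figure is a one-line calculation. The sketch below assumes the DICE-2016 coefficients $\pi_1 = 0$ and $\pi_2 = 0.00236$; these are stated here as an assumption for illustration and should be replaced by the entries of Table 11.2 in any actual computation.

```python
# ASSUMED DICE-2016 damage coefficients; check against Table 11.2 before use.
PI1, PI2 = 0.0, 0.00236


def damage_share(t_at, pi1=PI1, pi2=PI2):
    """Quadratic damage fraction of gross output at warming t_at (deg C)."""
    return pi1 * t_at + pi2 * t_at**2


# Damage share at selected warming levels, and with doubled curvature at 3 C.
for warming in (1.0, 2.0, 3.0, 4.0):
    print(warming, damage_share(warming))
print(damage_share(3.0, pi2=2.0 * PI2))
```

Because the linear term is zero under this assumption, the damage share scales one-for-one with $\pi_2$ at any fixed temperature, which is part of what makes the curvature a natural uncertain parameter for the deep-UQ analysis.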

For the tipping-point branch of the literature, Weitzman (2012) argued that catastrophic thresholds require a steeper damage function,

$$\Omega^W(T_{\mathrm{AT}}) \;=\; 1 \;-\; \frac{1}{1 + \bigl(\tfrac{1}{\psi_1} T_{\mathrm{AT}}\bigr)^2 + \bigl(\tfrac{1}{2\, TP} T_{\mathrm{AT}}\bigr)^{6.754}}, \tag{11.20}$$

where $TP$ is a stochastic tipping-point threshold. We do not solve a Weitzman damage variant in the baseline CDICE-DEQN, but the OLG-IAM of Section 11.12 introduces a stylized tipping risk in the same spirit; the degree of convexity of the damage function is one of the most important determinants of the optimal carbon tax.

11.2.8 CDICE: recalibration of the climate module

A key contribution of Folini et al. (2025) is a systematic recalibration of the DICE climate module against benchmarks from climate science model archives (CMIP). Their CDICE framework retains the same functional forms as DICE but fits parameters to the four-test protocol summarized in Table 11.1.

Table 11.1: CDICE climate-module calibration protocol. The first two tests discipline the carbon-cycle and temperature-response blocks directly; the last two check whether the calibrated reduced-form module remains accurate on out-of-sample and historically realistic forcing paths.

| Test | Target | Use |
|---|---|---|
| 1. Carbon pulse (100 GtC) | Atmospheric retention path | Calibrate carbon cycle |
| 2. $4\times$CO$_2$ step | Temperature impulse response | Calibrate temperature block |
| 3. 1% CO$_2$/year | Transient climate response | Out-of-sample validation |
| 4. Historical + RCP | Realistic forcing paths | End-to-end validation |

This calibration ensures that the reduced-form climate module is consistent with state-of-the-art earth system models. CDICE also introduces a transparent time-step formulation, $X_{t+\Delta t} = X_t + \Delta t \cdot f(X_t, u_t; \theta)$, that allows coherent implementation at annual, 5-year, or 10-year resolution within a single generic framework. Figure 11.4 illustrates how much the climate-cycle calibration matters even before the planner makes any decision: under business-as-usual, DICE-2016 and CDICE produce visibly different atmospheric carbon trajectories, and the gap propagates into temperature, damages, and ultimately the SCC.
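The time-step-generic update $X_{t+\Delta t} = X_t + \Delta t \cdot f(X_t, u_t; \theta)$ amounts to an explicit Euler step with annual rates. The sketch below is illustrative only: the function names are hypothetical, and the one-dimensional drift (decay at 2.3% per year, the size of $\delta^{\mathrm{Land}}$ in Table 11.2) is a toy stand-in for the climate module.

```python
def euler_step(x, u, f, dt=1.0):
    """One explicit step X_{t+dt} = X_t + dt * f(X_t, u_t)."""
    return [xi + dt * fi for xi, fi in zip(x, f(x, u))]


def simulate(x0, f, horizon, dt):
    """Advance the state over `horizon` years in steps of length dt."""
    x = list(x0)
    for _ in range(round(horizon / dt)):
        x = euler_step(x, None, f, dt)
    return x


def decay(x, u):
    """Toy drift expressed as an annual rate: 2.3% decay per year."""
    return [-0.023 * x[0]]


annual = simulate([1.0], decay, horizon=10, dt=1.0)
five_year = simulate([1.0], decay, horizon=10, dt=5.0)
print(annual[0], five_year[0])
```

The same drift function serves every resolution because its rates are annual; only `dt` changes. The two paths differ slightly, which is the usual discretization error of a coarser explicit step.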

Figure 11.4: Atmospheric carbon $M^{\mathrm{AT}}_t$ along the BAU path (in GtC, over 200 years from 2015) under the three CDICE carbon-cycle calibrations (CDICE = MMM, CDICE-MESMO, CDICE-LOVECLIM) and the legacy DICE-2016 carbon cycle. Only the carbon-cycle block is varied here; the temperature block is held at the CDICE MMM calibration, since the BAU carbon-stock path does not depend on the temperature calibration to first order. The DICE-2016 path lies systematically above the CMIP-disciplined paths, reflecting that the original DICE carbon cycle overstates atmospheric retention; CDICE-MESMO and CDICE-LOVECLIM bracket the CDICE baseline on the slow-removal and fast-removal sides, respectively. Reproduced from Folini et al. (2025), Figure 15(a).

11.2.9Calibration and initial conditions, in one place

The block-by-block model description above introduces a fairly large set of parameters. Table Table 11.2 consolidates the calibration we use throughout the rest of the chapter, lifted from the Online Appendix of Folini et al. (2025). Two CMIP5 alternatives (HadGEM2-ES and GISS-E2-R) are shown alongside the multi-model mean (MMM) so that the deep-UQ analysis of Section 11.11 has a concrete distribution to draw from. We follow the CDICE convention of expressing all carbon quantities in 103 GtC working units: equilibrium and initial carbon stocks MeqM_{\mathrm{eq}} and M0M_0, the initial carbon intensity σ0\sigma_0, and the initial land-use emissions ELand,0E_{\mathrm{Land},0} are all on the same scale, which keeps the numerical conditioning of the carbon-cycle and emissions states under control. The factor 103 appears explicitly in the abatement-cost calibration (11.10) to convert σt\sigma_t back to GtC when it is multiplied by the backstop price; a reader comparing values against raw DICE-2016 numbers (e.g. 2.6\sim 2.6 GtC/yr land-use emissions, 851\sim 851 GtC atmospheric carbon in 2015) should multiply the table entries by 103 first.

Table 11.2: CDICE baseline calibration used in the deterministic CDICE-DEQN solve. Parameter values follow the Online Appendix of Folini et al. (2025) and are stated on an annual time step ($\Delta_t = 1$ yr). All carbon quantities ($M_{\mathrm{eq}}, M_0, \sigma_0, E_{\mathrm{Land},0}$) are in CDICE’s $10^3$ GtC working units; multiply by $10^3$ to recover GtC. Two alternative climate calibrations (CDICE-HadGEM2-ES, CDICE-GISS-E2-R) are listed in the temperature block, with their full free-parameter sets $\{c_1, c_3, c_4, F_{\mathrm{2\times CO_2}}, \lambda\}$ and corresponding ECS, since simply varying ECS while holding the rest of the temperature block fixed is not equivalent to using the full CMIP5 calibration (Folini et al., 2025). The initial state is for the year 2015.

| Block | Parameter | Value | Meaning |
|---|---|---|---|
| Economy | $\alpha$ | 0.30 | Capital share in Cobb--Douglas output |
| | $\delta$ | 0.10/yr | Capital depreciation rate |
| | $\rho$ | 0.015/yr | Pure rate of time preference |
| | $\psi$ | 0.69 | Intertemporal elasticity of substitution |
| Emissions & abatement | $\sigma_0$ | $9.556\times 10^{-5}$ ($10^3$ GtC)/USD | Initial carbon intensity |
| | $g^{\sigma}_0$ | $-0.0152$/yr | Initial decay rate of $\sigma_t$ |
| | $\delta^{\sigma}$ | 0.001/yr | Curvature of $\sigma_t$ decay |
| | $p^{\mathrm{back}}_0$ | 0.55 thUSD/tCO$_2$ | Initial backstop price |
| | $g^{\mathrm{back}}$ | 0.005/yr | Decay rate of backstop price |
| | $\theta_2$ | 2.6 | Curvature of $\Theta(\mu)$ |
| | $\mathrm{c2co2}$ | 3.666 | Carbon-to-CO$_2$ mass conversion |
| Land use | $E_{\mathrm{Land},0}$ | $7.09\times 10^{-4}$ ($10^3$ GtC)/yr | Initial land-use emissions |
| | $\delta^{\mathrm{Land}}$ | 0.023/yr | Decay rate of $E_{\mathrm{Land},t}$ |
| Carbon cycle | $b_{12}$ | 0.054/yr | Atm.--upper-ocean transfer rate |
| | $b_{23}$ | 0.0082/yr | Upper-ocean--lower-ocean transfer rate |
| | $M_{\mathrm{eq}}$ | $(0.607, 0.489, 1.281)$ ($10^3$ GtC) | Pre-industrial equilibrium masses |
| Temperature | $c_1$ (MMM) | 0.137/yr | Atmospheric heat-capacity inverse |
| | $c_3$ (MMM) | 0.73/yr | Atm.--ocean coupling |
| | $c_4$ (MMM) | 0.00689/yr | Ocean heat-capacity inverse |
| | $F_{\mathrm{2\times CO_2}}$ (MMM) | 3.45 W/m$^2$ | Forcing from CO$_2$ doubling |
| | $\lambda$ (MMM) | 1.06 W/m$^2$/K | Climate feedback parameter |
| | ECS (MMM) | $\approx 3.25\,{}^\circ$C | Equilibrium climate sensitivity |
| | HadGEM2-ES | $(c_1,c_3,c_4)=(0.154,0.55,0.00671)$/yr; $F_{\mathrm{2\times CO_2}}=2.95$, $\lambda=0.65$, ECS $\approx 4.55\,{}^\circ$C | High-end CMIP5 calibration |
| | GISS-E2-R | $(c_1,c_3,c_4)=(0.213,1.16,0.00921)$/yr; $F_{\mathrm{2\times CO_2}}=3.65$, $\lambda=1.70$, ECS $\approx 2.15\,{}^\circ$C | Low-end CMIP5 calibration |
| Damages | $\pi_1$ | 0.0 | Linear damage coefficient |
| | $\pi_2$ | 0.00236 | Quadratic damage coefficient |
| Initial state | $K_0$ | 223 T USD | Capital, year 2015 |
| | $M_0$ | $(0.851, 0.628, 1.323)$ ($10^3$ GtC) | Atm./upper/lower carbon, 2015 |
| | $T_0$ | $(1.10, 0.27)\,{}^\circ$C | Atm./ocean temp. above pre-industrial, 2015 |
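The unit convention of Table 11.2 is easy to get wrong, so a short numerical check may help. The sketch below converts two table entries from $10^3$ GtC working units to GtC and GtCO$_2$, and recomputes the ECS column. The helper names are ours, and the identity $\mathrm{ECS} = F_{\mathrm{2\times CO_2}}/\lambda$ is the standard energy-balance relation, which we assume here because it reproduces the tabulated ECS values.

```python
# Values copied from Table 11.2 (carbon in CDICE working units of 10^3 GtC).
KILO = 1.0e3      # 10^3 GtC working unit -> GtC
C2CO2 = 3.666     # carbon-to-CO2 mass conversion

M0 = (0.851, 0.628, 1.323)   # atmosphere, upper ocean, lower ocean (2015)
E_LAND_0 = 7.09e-4           # initial land-use emissions per year


def to_gtc(x):
    """Convert a quantity in 10^3 GtC working units to GtC."""
    return x * KILO


def to_gtco2(x):
    """Convert a quantity in 10^3 GtC working units to GtCO2."""
    return to_gtc(x) * C2CO2


m_at_2015_gtc = to_gtc(M0[0])        # atmospheric carbon in GtC
e_land_gtc = to_gtc(E_LAND_0)        # land-use emissions in GtC/yr
e_land_gtco2 = to_gtco2(E_LAND_0)    # land-use emissions in GtCO2/yr

# Assumed identity ECS = F_2xCO2 / lambda, applied to the three calibrations.
calibrations = {
    "MMM": (3.45, 1.06),
    "HadGEM2-ES": (2.95, 0.65),
    "GISS-E2-R": (3.65, 1.70),
}
ecs = {name: f2x / lam for name, (f2x, lam) in calibrations.items()}

print(m_at_2015_gtc, e_land_gtc, e_land_gtco2)
print(ecs)
```

The atmospheric stock comes out at about 851 GtC, and the land-use entry at about 0.71 GtC/yr, or about 2.6 GtCO$_2$/yr.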

11.2.10The full IAM, summarized

Pulling the previous subsections together, CDICE is a deterministic dynamical system on a finite-dimensional state vector that the planner steers with two controls. The endogenous state at date $t$ is the sextuple

$$\bm{X}^{\mathrm{end}}_t \;=\; \bigl(K_t,\; M^{\mathrm{AT}}_t,\; M^{\mathrm{UO}}_t,\; M^{\mathrm{LO}}_t,\; T^{\mathrm{AT}}_t,\; T^{\mathrm{OC}}_t\bigr),$$

the exogenous-trend vector is

$$\bm{X}^{\mathrm{exo}}_t \;=\; \bigl(A_t,\; L_t,\; \sigma_t,\; E_{\mathrm{Land},t},\; F^{\mathrm{EX}}_t\bigr),$$

and the planner’s controls are $(C_t,\, \mu_t)$ (equivalently $(K_{t+1},\, \mu_t)$, since investment is determined by the resource constraint $C_t + I_t = Y^{\mathrm{net}}_t$ together with (11.4)). The transitions are: capital from (11.4) with $I_t = Y^{\mathrm{net}}_t - C_t$; total emissions from (11.12), fed into the carbon cycle (11.14); temperature from (11.16)--(11.17) with forcing (11.18); and net output, hence the resource constraint, from (11.13). The objective is the discounted CRRA-IES felicity sum (11.5) subject to $\mu_t \in [0,1]$.

That is the entire deterministic IAM. Every primitive named above has a closed-form expression and a calibrated parameter (Table 11.2); the only thing left is to find the optimal policy $(C_t, \mu_t)_{t\ge 0}$. The model is intrinsically non-stationary. Section 11.3 makes that observation precise; the stationary DEQN of Chapter 2 needs to be amended before we can solve this system.
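To make the climate block concrete, here is a minimal sketch of a one-year transition for the three carbon reservoirs and the two temperature layers at the MMM calibration of Table 11.2. The functional forms are our assumption: a linear three-box carbon cycle and a two-layer energy balance whose partial derivatives match the coefficients appearing in the envelope conditions of Section 11.6. The timing of the forcing term (current-period $M^{\mathrm{AT}}_t$) and the function name `climate_step` are likewise assumptions; the authoritative statement is (11.14) and (11.16)--(11.18).

```python
import math

# Table 11.2, MMM calibration (carbon in 10^3 GtC, annual time step).
B12, B23 = 0.054, 0.0082
M_EQ = (0.607, 0.489, 1.281)
C1, C3, C4 = 0.137, 0.73, 0.00689
F2X, LAM = 3.45, 1.06
M0 = (0.851, 0.628, 1.323)
T0 = (1.10, 0.27)


def climate_step(m, temp, emissions, f_ex=0.0):
    """One-year update of (M_AT, M_UO, M_LO) and (T_AT, T_OC).

    Assumed forms, not copied from the source equations.
    """
    m_at, m_uo, m_lo = m
    t_at, t_oc = temp
    # Reverse transfer rates implied by the equilibrium mass ratios.
    b21 = B12 * M_EQ[0] / M_EQ[1]
    b32 = B23 * M_EQ[1] / M_EQ[2]
    m_next = (
        (1.0 - B12) * m_at + b21 * m_uo + emissions,
        B12 * m_at + (1.0 - b21 - B23) * m_uo + b32 * m_lo,
        B23 * m_uo + (1.0 - b32) * m_lo,
    )
    # Logarithmic CO2 forcing plus exogenous non-CO2 forcing.
    forcing = F2X * math.log2(m_at / M_EQ[0]) + f_ex
    t_next = (
        t_at + C1 * (forcing - LAM * t_at - C3 * (t_at - t_oc)),
        t_oc + C4 * (t_at - t_oc),
    )
    return m_next, t_next


m1, t1 = climate_step(M0, T0, emissions=0.0)
print(m1, t1)
```

Two sanity properties follow from this form: with zero emissions the total carbon mass is conserved, and the pre-industrial equilibrium $(M_{\mathrm{eq}}, T = 0)$ is a fixed point.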

11.3Why DICE Breaks the Stationary DEQN

This is the technical pivot of the chapter. The stationary DEQN of Chapter 2 was designed for models whose policy function is a fixed point of a Bellman operator on an ergodic state space. IAMs satisfy neither premise. Three structural features break the stationarity assumption simultaneously, and each must be addressed before the DEQN can be trained at all.

11.3.0.1Time-varying state distributions with no ergodic limit.

The endogenous state of a stationary DSGE is the projection of a recurrent Markov chain onto a finite-dimensional vector; the policy function lives on its stationary distribution. In an IAM the analogous object does not exist within the planning horizon. Atmospheric carbon $M^{\mathrm{AT}}_t$ rises from a pre-industrial baseline of $\sim 600$ GtC to a peak of $\sim 1500$ GtC over a century, then decays over millennia; atmospheric temperature $T^{\mathrm{AT}}_t$ follows with a multi-decade lag and a multi-century relaxation. Within the 300 years the planner cares about, neither variable ever returns to a state it has been in before. The state visited at $t = 100$ is therefore not exchangeable with the state visited at $t = 200$, and a time-invariant policy function $\bm p(\bm X_t)$ that depends only on the endogenous state misses the whole point of the exercise: the optimal mitigation effort at a given $(M^{\mathrm{AT}}, T^{\mathrm{AT}})$ depends on whether that state was reached on the way up or on the way down. Compare the curse-of-dimensionality discussion in Section 2.1: it is not the size of the state space that breaks the DEQN here, it is the lack of recurrence.

Even setting the carbon and temperature stocks aside, the IAM is drifting deterministically. Total factor productivity $A_t$ trends up at a calibrated, time-varying rate; population $L_t$ follows the demographic projection of Nordhaus (2017); carbon intensity $\sigma_t$ falls along the closed-form decay (11.7); land-use emissions $E_{\mathrm{Land},t}$ decay smoothly (11.11); the backstop price $p^{\mathrm{back}}_t$ falls (11.9); the exogenous non-CO$_2$ forcing $F^{\mathrm{EX}}_t$ follows a fitted RCP trajectory; and the abatement-cost level $\theta_{1,t}$ inherits the time dependence of $\sigma_t$ and $p^{\mathrm{back}}_t$ through (11.10). Seven exogenous trends drive the model even before a shock is introduced. A time-invariant policy can never see them, and replacing them with their long-run averages is exactly the certainty-equivalence move that defeats the purpose of solving the model globally.

11.3.0.3Finite calendar-time horizon.

A stationary DEQN trains under a transversality condition: as $t \to \infty$, the discounted shadow price of capital goes to zero, and the iterative-projection loss inherits that fixed-point structure for free. An IAM is not solved on $[0, \infty)$. The planning horizon is a finite calendar date (the notebooks of Section 11.7 run roughly three centuries from a 2015 start), so transversality is not available and the policy is solved over a finite forward sweep instead.

11.3.0.4Putting it together.

These features compound and explain why the time-invariant DEQN of Chapter 2 cannot be used here without modification. The next two sections operationalize the response: Section 11.4 reorganizes the network inputs to include calendar time, and Section 11.5 states the resulting training algorithm as a labeled diff against the stationary DEQN box of Section 2.3.

11.4What Changes in the DEQN Setup

We now translate this into one concrete design choice for the network inputs. The autodiff machinery, the squared-residual structure, and the rest of the training loop of Chapter 2 carry over unchanged; this is a refactor of what the network sees, not a new solver.

11.4.1Time and trends as states

Calendar time itself enters as a state. Because neural networks prefer bounded inputs, we use the monotone time rescaling $\tau_t = 1 - \exp(-\vartheta\, t) \in [0, 1)$ of Eq. (11.24). Every exogenous trend ($A_t, L_t, \sigma_t, E_{\mathrm{Land},t}, F^{\mathrm{EX}}_t, p^{\mathrm{back}}_t, \theta_{1,t}$) is then a deterministic function of $\tau_t$, so passing $\tau_t$ to the network is informationally equivalent to passing the entire trend bundle. Training trajectories begin from the calibrated 2015 state and run forward over the planner’s horizon.
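The rescaling and its inverse, obtained by solving $\tau = 1 - \exp(-\vartheta t)$ for $t$, can be sketched in a few lines. The value of $\vartheta$ below is purely illustrative (the text only requires $\vartheta > 0$), and the function names are ours.

```python
import math

VARTHETA = 0.015  # hypothetical compression parameter, chosen for illustration


def tau_of_t(t, vartheta=VARTHETA):
    """Bounded time index tau = 1 - exp(-vartheta * t), in [0, 1)."""
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    return -math.expm1(-vartheta * t)


def t_of_tau(tau, vartheta=VARTHETA):
    """Inverse map t = -ln(1 - tau) / vartheta."""
    return -math.log1p(-tau) / vartheta


for year in (0, 100, 300):
    tau = tau_of_t(year)
    print(year, tau, t_of_tau(tau))
```

Because the map is strictly increasing, any exogenous trend that is tabulated against calendar years can be evaluated inside the loss by first recovering $t$ from the network input $\tau_t$.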

11.5The Non-Stationary DEQN Algorithm

The design choice of Section 11.4 translates into a single training algorithm. The body below is a literal diff against the stationary DEQN of Section 2.3: unchanged lines are grayed, new or modified lines are bolded.

One delta against the stationary DEQN box. The simulation step starts from a calibrated initial state $\bm x_0$ and integrates $K$ trajectories forward through calendar time, so the pool $\mathcal D$ contains time-stamped states $(\tau_t, \bm x_t)$ along finite-horizon trajectories rather than draws from an ergodic distribution. With $\tau_t$ in the input the network learns a time-dependent policy; every other line of the box is the stationary DEQN of Section 2.3 unchanged.
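The pool construction can be illustrated with a toy example. The sketch below is not the CDICE simulator: `toy_step` is a placeholder scalar transition with a small noise term that we add only so that the trajectories differ, and `build_pool`, `vartheta`, and the batch size are hypothetical names and values. What it does show is the bookkeeping: every trajectory starts at the same $\bm x_0$, each state is stored together with its time stamp, and mini-batches are drawn uniformly from the pooled states.

```python
import math
import random


def build_pool(x0, step, n_traj, t_max, vartheta, rng):
    """Collect time-stamped states (t, tau_t, x_t) along n_traj forward paths."""
    pool = []
    for _ in range(n_traj):
        x = x0  # every path starts from the same calibrated initial state
        for t in range(t_max + 1):
            tau = 1.0 - math.exp(-vartheta * t)
            pool.append((t, tau, x))
            x = step(t, x, rng)
    return pool


def toy_step(t, x, rng):
    # Placeholder dynamics, NOT CDICE: slow drift plus small noise.
    return x + 0.01 * (1.0 - x) + 0.001 * rng.gauss(0.0, 1.0)


rng = random.Random(0)
pool = build_pool(x0=0.0, step=toy_step, n_traj=8, t_max=300,
                  vartheta=0.015, rng=rng)
batch = rng.sample(pool, 64)  # uniform mini-batch over the time-stamped pool

print(len(pool), len(batch))
```

Since each calendar date contributes exactly one state per trajectory, a uniform draw from the pool weights all dates equally.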

11.5.0.1What replaces transversality.

Because the pool $\mathcal D$ is built from $K$ forward simulations of length $T_{\max}$ that all start at the same $\bm x_0$, every trajectory visits the full calendar window $[0, T_{\max}]$ and a uniform mini-batch draw from $\mathcal D$ is therefore stratified across calendar time by construction. The missing transversality condition of Section 11.3 is absorbed numerically by choosing the horizon long enough that the discounted contribution of the terminal state falls below the training-noise floor: at the CDICE calibration $\rho = 0.015$/yr and the notebooks’ default $T_{\max} = 300$ years, $\hat\beta_t^{\,T_{\max}} \approx \exp(-\rho\,T_{\max}) \approx 0.011$, which is one to two orders of magnitude below the achievable residual root-mean-square at convergence. When the horizon must be short (e.g., the 1D toy of Exercise 11.10), one instead adds an explicit terminal residual $\lambda_T\,\|\bm x_{T_{\max}} - \bm x^{\mathrm{ref}}_{T_{\max}}\|^2$ to the loss; both options are standard in the finite-horizon DEQN literature.
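The terminal weight quoted above is a one-line computation, and the same formula can be inverted to ask how long a horizon a given tolerance would require. The sketch uses the pure discount rate only, as in the approximation in the text; the helper name `horizon_for_weight` and the tolerance $10^{-3}$ are illustrative assumptions.

```python
import math

RHO = 0.015     # pure rate of time preference per year (Table 11.2)
T_MAX = 300     # notebooks' default horizon in years

# Discount weight on the terminal state, exp(-rho * T_max).
terminal_weight = math.exp(-RHO * T_MAX)


def horizon_for_weight(target, rho=RHO):
    """Smallest horizon T with exp(-rho * T) <= target."""
    return math.log(1.0 / target) / rho


print(terminal_weight)            # about 0.011
print(horizon_for_weight(1e-3))   # horizon in years for a 1e-3 terminal weight
```

At $\rho = 0.015$/yr, pushing the terminal weight down to $10^{-3}$ would take roughly 460 years, which gives a sense of how the choice of $T_{\max}$ trades off against simulation cost.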

11.6The Planner’s Lagrangian and FOCs

Movement 2 puts the non-stationary DEQN of Section 11.5 to work on the deterministic CDICE economy of Section 11.2. Solving this system with the algorithm of Section 11.5 amounts to writing the planner’s Lagrangian, deriving the first-order and envelope conditions, normalizing them, treating each FOC as a residual, and minimizing the sum of squared residuals on the time-stamped state pool generated by the forward simulation of Section 11.5. This section follows Friedl et al. (2023) and Online Appendix D of Folini et al. (2025).

11.6.0.1Detrending and state vector, in compact form.

The model-rendering choices already named in Section 11.4.1 carry over verbatim. Variables that grow with the productivity--population product $A_t L_t$ are rescaled to per-effective-capita units:

$$c_t \;:=\; \frac{C_t}{A_t\,L_t}, \qquad k_t \;:=\; \frac{K_t}{A_t\,L_t}. \tag{11.23}$$

Calendar time enters through the bounded rescaling (compatible with the dynamic-programming convention of Traeger (2014)),

$$\tau \;=\; 1 - \exp(-\vartheta\, t) \;\in\; [0,1), \qquad\text{with inverse}\quad t \;=\; -\frac{\ln(1-\tau)}{\vartheta}, \tag{11.24}$$

with compression parameter $\vartheta > 0$. The full DEQN state vector then collects the detrended endogenous CDICE states, the bounded time index, the Bayesian-belief states $(\mu_{f,t}, S_{f,t})$ used in Section 11.9, and a slot for pseudo-state parameters $\theta$ used in the UQ analysis of Section 11.11:

$$\bm{X}_t \;=\; \bigl[\underbrace{k_t,\, M^{\mathrm{AT}}_t,\, M^{\mathrm{UO}}_t,\, M^{\mathrm{LO}}_t,\, T^{\mathrm{AT}}_t,\, T^{\mathrm{OC}}_t,\, \mu_{f,t},\, S_{f,t},\, \tau_t}_{9\text{ endogenous, exogenous, and time states}};\; \underbrace{\theta}_{N\text{ pseudo-state parameters}}\bigr].$$

In the deterministic core developed in this and the next section, only the six endogenous-state entries plus $\tau_t$ are active, i.e. a seven-dimensional input vector; $(\mu_{f,t}, S_{f,t})$ and $\theta$ are appended only in the extensions of Movement 3.

11.6.0.2The Lagrangian.

We now derive the equilibrium conditions that the DEQN will be trained against. The derivation follows the standard Lagrangian approach in CRRA-IES form, working directly with the deterministic CDICE primitives of Section 11.2 (the recursive Epstein--Zin refinement is layered on in Section 11.10). Write the Lagrangian with multiplier $\lambda_t$ for the budget constraint $C_t + I_t = Y^{\mathrm{net}}_t$, multipliers $\nu^{\mathrm{AT}}_t,\nu^{\mathrm{UO}}_t,\nu^{\mathrm{LO}}_t$ for the three carbon-reservoir transitions (11.14), multipliers $\eta^{\mathrm{AT}}_t,\eta^{\mathrm{OC}}_t$ for the temperature dynamics (11.16)--(11.17), and KKT multiplier $\lambda^\mu_t \ge 0$ for the abatement bound $\mu_t \le 1$. The derivation produces ten equilibrium conditions: a consumption FOC, an abatement FOC, the capital Euler equation, the budget/resource constraint, three carbon-stock envelope conditions, two temperature-envelope conditions, and the abatement upper-bound complementarity. Two of these are enforced algebraically (the static consumption and abatement FOCs); the remaining eight become the DEQN residuals of Section 11.7.

11.6.0.3Qualitative overview.

Taking derivatives of the Lagrangian with respect to the controls yields:

11.6.0.4Envelope theorem.

Since the FOC for $K_{t+1}$ involves $\partial V/\partial K_{t+1}$, which cannot be computed analytically, we apply the envelope theorem. It provides derivatives of the value function with respect to current states, in particular $\partial V/\partial k_t$, $\partial V/\partial M_{\mathrm{AT},t}$, and $\partial V/\partial T^{\mathrm{AT}}_t$, which are then shifted forward one period and substituted back into the FOCs.

11.6.0.5Capital Euler equation.

Combining the FOCs and envelope conditions yields:

$$1 = e^{-\rho}\,\mathbb{E}_t\!\left[\left(\frac{V_{t+1}}{\bigl(\mathbb{E}_t[V_{t+1}^{1-\gamma}]\bigr)^{1/(1-\gamma)}}\right)^{1/\psi - \gamma} \cdot \frac{(C_{t+1}/L_{t+1})^{-1/\psi}}{(C_t/L_t)^{-1/\psi}} \cdot R^K_{t+1}\right],$$

where $R^K_{t+1}$ is the return on capital inclusive of climate damages. The SCC also appears through the shadow price of atmospheric carbon:

$$\mathrm{SCC}^{M}_t = -\frac{\partial V_t / \partial M_{\mathrm{AT},t}}{\partial V_t / \partial C_t}.$$

This is a shadow value per unit of atmospheric carbon stock. The emissions-based SCC in (11.1) additionally includes the marginal loading of a unit of emissions into $M_{\mathrm{AT},t}$ and the carbon-to-CO$_2$ unit conversion. At the optimum, the marginal abatement cost equals the carbon tax, which in turn equals the emissions SCC after these conversions.

11.6.0.6Normalization of multipliers.

Over a 300-year horizon, $A_t$ and $L_t$ can move the natural scale of marginal utilities and multipliers substantially, with the direction and magnitude depending on the IES through $A_t^{1-1/\psi}L_t$. Such scale drift makes network outputs and gradients harder to optimize stably. Following the detrending logic of (11.23), all multipliers (the budget multiplier, the abatement-bound multiplier, and the five climate envelope multipliers alike) are divided by $A_t^{1-1/\psi}\,L_t$. The argument for the climate multipliers tracks the budget-multiplier case via the envelope conditions derived later in this section and is spelled out in Online Appendix D of Folini et al. (2025); we adopt the result here:

$$\hat{\lambda}_t := \frac{\lambda_t}{A_t^{1-1/\psi}\,L_t},\quad \hat{\lambda}^\mu_t := \frac{\lambda^\mu_t}{A_t^{1-1/\psi}\,L_t},\quad \hat{\nu}^{\mathrm{AT}}_t := \frac{\nu^{\mathrm{AT}}_t}{A_t^{1-1/\psi}\,L_t},\quad \hat{\nu}^{\mathrm{UO}}_t := \frac{\nu^{\mathrm{UO}}_t}{A_t^{1-1/\psi}\,L_t},\quad \ldots$$

and analogously for the remaining multipliers $\hat{\nu}^{\mathrm{LO}}_t$, $\hat{\eta}^{\mathrm{AT}}_t$, and $\hat{\eta}^{\mathrm{OC}}_t$. The normalization induces an effective discount factor that absorbs the trend growth in the per-effective-capita Euler equation,

$$\hat{\beta}_t \;:=\; \exp\!\left(-\rho + \left(1-\frac{1}{\psi}\right) g^A_t + g^L_t\right), \tag{11.29}$$

where $g^A_t := \ln(A_{t+1}/A_t)$ and $g^L_t := \ln(L_{t+1}/L_t)$ are annual log changes. Equation (11.29) mirrors Equation (38) of Online Appendix D of Folini et al. (2025): the population term enters linearly because $L_t$ enters the felicity weight $L_t (C_t/L_t)^{1-1/\psi}$ linearly, while the productivity term inherits the $1-1/\psi$ exponent from the per-effective-capita rescaling of consumption. All intertemporal equations below use $\hat{\beta}_t$ in place of $e^{-\rho}$. For a non-annual time step, replace $\rho$ by $\rho\Delta_t$ and $g^A_t, g^L_t$ by their per-period analogues.
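A short sketch of the effective discount factor may help fix signs. It implements $\hat\beta_t = \exp(-\rho + (1 - 1/\psi) g^A_t + g^L_t)$ with $\rho$ and $\psi$ from Table 11.2; the productivity and population levels passed in are made-up illustrative numbers, and the function name is ours.

```python
import math

RHO = 0.015   # pure rate of time preference per year (Table 11.2)
PSI = 0.69    # intertemporal elasticity of substitution (Table 11.2)


def beta_hat(a_t, a_next, l_t, l_next, rho=RHO, psi=PSI):
    """Effective discount factor exp(-rho + (1 - 1/psi) * g_A + g_L)."""
    g_a = math.log(a_next / a_t)   # annual log change in productivity
    g_l = math.log(l_next / l_t)   # annual log change in population
    return math.exp(-rho + (1.0 - 1.0 / psi) * g_a + g_l)


# Illustrative (hypothetical) growth of 1.5% in A and 0.5% in L.
print(beta_hat(1.0, 1.015, 1.0, 1.005))
print(beta_hat(1.0, 1.0, 1.0, 1.0), math.exp(-RHO))  # no growth: equals exp(-rho)
```

With $\psi = 0.69 < 1$ the coefficient $1 - 1/\psi$ is negative, so productivity growth lowers $\hat\beta_t$ while population growth raises it.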

11.6.0.7Sign convention for the climate multipliers.

We adopt the value-derivative convention throughout the script: each climate multiplier $\hat{\nu}^{\mathrm{AT}}_t,\, \hat{\nu}^{\mathrm{UO}}_t,\, \hat{\nu}^{\mathrm{LO}}_t,\, \hat{\eta}^{\mathrm{AT}}_t,\, \hat{\eta}^{\mathrm{OC}}_t$ is the (normalized) partial derivative of the value function with respect to the corresponding climate state. Because extra atmospheric carbon lowers welfare, $\hat{\nu}^{\mathrm{AT}}_t$ is non-positive at the optimum, which is why the stock SCC carries a minus sign, $\mathrm{SCC}^M_t = -\hat{\nu}^{\mathrm{AT}}_t/\hat{\lambda}_t$. The companion implementation in dice_2p_surrogate_lib.py stores the positive marginal damage $-\hat{\nu}^{\mathrm{AT}}_t$ as a network output for numerical conditioning and flips the sign explicitly inside each residual; the algebra below uses the script convention, so the reader who compares the equations to the code will see one extra sign flip per carbon-multiplier term.

11.6.0.8Symbol cheat-sheet for the multipliers.

Before writing the FOCs and the loss, Table 11.3 collects the multipliers that the DEQN learns and their role; subsequent equations use the hat-normalized form throughout.

Table 11.3: Normalized Lagrange multipliers in the CDICE--DEQN. All values are divided by $A_t^{1-1/\psi}\,L_t$ relative to the raw multipliers, so the hatted versions inherit the per-effective-capita scale that the network outputs see. The atmospheric carbon multiplier carries the SCC up to the marginal-utility denominator: $\mathrm{SCC}^M_t = -\hat\nu^{\mathrm{AT}}_t / \hat\lambda_t$.

| Symbol | Multiplier on | Sign at optimum | Network output? |
|---|---|---|---|
| $\hat{\lambda}_t$ | Budget constraint $C_t + I_t = Y^{\mathrm{net}}_t$ | $> 0$ | yes (softplus) |
| $\hat{\lambda}^\mu_t$ | Abatement upper bound $\mu_t \le 1$ | $\ge 0$ | no (implied, Eq. (11.39)) |
| $\hat{\nu}^{\mathrm{AT}}_t$ | Atmospheric carbon transition $M^{\mathrm{AT}}_{t+1}=\ldots$ | $\le 0$ | yes (stored as $-\hat\nu^{\mathrm{AT}}_t > 0$ via softplus) |
| $\hat{\nu}^{\mathrm{UO}}_t$ | Upper-ocean carbon transition | $\le 0$ | yes (linear) |
| $\hat{\nu}^{\mathrm{LO}}_t$ | Lower-ocean carbon transition | $\le 0$ | yes (linear) |
| $\hat{\eta}^{\mathrm{AT}}_t$ | Atmospheric temperature transition | $\le 0$ | yes (linear) |
| $\hat{\eta}^{\mathrm{OC}}_t$ | Ocean temperature transition | $\le 0$ | yes (linear) |

11.6.0.9Key FOCs in normalized form.

After normalization, the most important first-order conditions become (see Online Appendix D of Folini et al. (2025) for the complete set of 14 equations):

$$\frac{\partial \mathcal{L}}{\partial c_t} = 0 \; \Leftrightarrow\; c_t^{-1/\psi}\,A_t^{1-1/\psi}\,L_t - \hat{\lambda}_t = 0 \tag{11.30}$$

$$\begin{aligned}
\frac{\partial \mathcal{L}}{\partial k_{t+1}} = 0 \; \Leftrightarrow\; & \exp\!\bigl(g^A_t + g^L_t\bigr)\,\hat{\lambda}_t - \hat{\beta}_t\Bigl\{\hat{\lambda}_{t+1}\bigl[\bigl(1-\Omega(T_{\mathrm{AT},t+1}) - \Theta(\mu_{t+1})\bigr)\alpha k_{t+1}^{\alpha-1} + (1-\delta)\bigr] \\
& \quad + \hat{\nu}^{\mathrm{AT}}_{t+1}\,\sigma_{t+1}(1-\mu_{t+1})A_{t+1}L_{t+1}\alpha k_{t+1}^{\alpha-1}\Bigr\} = 0
\end{aligned} \tag{11.32}$$

$$\frac{\partial \mathcal{L}}{\partial \mu_t} = 0 \; \Leftrightarrow\; \hat{\lambda}_t\,\Theta'(\mu_t)\,k_t^\alpha + \hat{\lambda}^\mu_t + \hat{\nu}^{\mathrm{AT}}_t\,\sigma_t\,A_t\,L_t\,k_t^\alpha = 0. \tag{11.33}$$

Equation (11.32) is the capital Euler equation: it equates the marginal cost of saving one additional unit today (the first term) to the discounted marginal benefit tomorrow (the term in braces), which now includes a contribution from the atmospheric carbon envelope ($\hat{\nu}^{\mathrm{AT}}_{t+1}$) because higher capital increases output and hence emissions.

11.6.0.10Envelope conditions.

Convention reminder. As stated in Section 11.2.1, CDICE is calibrated on an annual time step and the coefficients $b_{12}, b_{23}, c_1, c_3, c_4$ in Table 11.2 are annual rates; consequently no $\Delta_t$ multipliers appear in either the dynamics (11.14), (11.16)--(11.17) or in the FOC residuals below.

Differentiating the Lagrangian with respect to state variables and shifting forward one period yields the shadow prices of the climate stocks. For example, the atmospheric carbon envelope is:

$$\frac{\partial \mathcal{L}}{\partial M_{\mathrm{AT},t+1}} = 0 \;\Leftrightarrow\; \hat{\nu}^{\mathrm{AT}}_t - \hat{\beta}_t\!\left[\hat{\nu}^{\mathrm{AT}}_{t+1}(1-b_{12}) + \hat{\nu}^{\mathrm{UO}}_{t+1}\,b_{12} + \hat{\eta}^{\mathrm{AT}}_{t+1}\,c_1\,F_{\mathrm{2\times CO_2}}\,\frac{1}{\ln 2\,M_{\mathrm{AT},t+1}}\right] = 0. \tag{11.34}$$

This equation says that the current shadow price of atmospheric carbon ($\hat{\nu}^{\mathrm{AT}}_t$) must equal the discounted future effects through three channels: persistence in the atmosphere (the $(1-b_{12})$ term), diffusion into the upper ocean (the $\hat{\nu}^{\mathrm{UO}}_{t+1}$ term), and radiative forcing on temperature (the $\hat{\eta}^{\mathrm{AT}}_{t+1}$ term). It is the existence of these climate multipliers that distinguishes the IAM from the purely economic models of Chapters 2--4.
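As a minimal sketch of how such an envelope condition turns into a residual, the function below evaluates the left-hand side of the atmospheric carbon envelope for given current and next-period multipliers. Parameter defaults are the MMM values of Table 11.2; the multiplier values and $\hat\beta_t$ in the usage lines are invented for illustration, and the function name is ours.

```python
import math


def atm_carbon_envelope_residual(nu_at, nu_at_next, nu_uo_next, eta_at_next,
                                 m_at_next, beta_hat,
                                 b12=0.054, c1=0.137, f2x=3.45):
    """Residual of the atmospheric carbon envelope condition.

    nu_* and eta_* are normalized multipliers in the script's
    value-derivative convention (non-positive at the optimum).
    """
    # Marginal forcing of carbon: d/dM [F2x * log2(M / M_eq)] = F2x / (ln 2 * M).
    forcing_slope = f2x / (math.log(2.0) * m_at_next)
    discounted = beta_hat * (
        nu_at_next * (1.0 - b12)                 # persistence in the atmosphere
        + nu_uo_next * b12                       # diffusion into the upper ocean
        + eta_at_next * c1 * forcing_slope       # radiative forcing channel
    )
    return nu_at - discounted


# Illustrative (made-up) next-period multipliers and state.
args = dict(nu_at_next=-0.02, nu_uo_next=-0.01, eta_at_next=-0.5,
            m_at_next=0.9, beta_hat=0.98)
# The nu_at that zeroes the residual is the value of the discounted bracket.
nu_at_consistent = -atm_carbon_envelope_residual(0.0, **args)
print(nu_at_consistent, atm_carbon_envelope_residual(nu_at_consistent, **args))
```

With all next-period multipliers non-positive, the consistent current shadow price is negative as well, in line with the sign column of Table 11.3.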

11.6.0.11Fischer--Burmeister complementarity for abatement.

The abatement rate is bounded above by 1 (full abatement), giving the KKT condition:

$$1 - \mu_t \;\geq\; 0 \quad\perp\quad \hat{\lambda}^\mu_t \;\geq\; 0,$$

which is non-smooth at $\mu_t = 1$. As in the borrowing-constraint treatment of Chapter 5 (Section 5.4), we replace it with the Fischer--Burmeister smooth approximation:

$$\Psi^{\mathrm{FB}}\!\bigl(\hat{\lambda}^\mu_t,\; 1-\mu_t\bigr) \;=\; \hat{\lambda}^\mu_t + (1-\mu_t) - \sqrt{(\hat{\lambda}^\mu_t)^2 + (1-\mu_t)^2 + \varepsilon_{\mathrm{FB}}} \;=\; 0, \tag{11.36}$$

with the same regularization parameter $\varepsilon_{\mathrm{FB}} \geq 0$ used in Chapters 3--5. In CDICE-DEQN we take $\varepsilon_{\mathrm{FB}} = 10^{-6}$, equivalent to the IRBC chapter’s $\varepsilon = 10^{-3}$ under its $\varepsilon^2$ convention; the trained policy is insensitive to the choice in the range $10^{-10}$ to $10^{-4}$. At $\varepsilon_{\mathrm{FB}} = 0$ the zero set of $\Psi^{\mathrm{FB}}$ coincides with the positive axes in the $(\hat{\lambda}^\mu_t,\, 1-\mu_t)$-plane, enforcing the original complementarity exactly, but the function is non-differentiable at the origin; with $\varepsilon_{\mathrm{FB}} > 0$ the function is differentiable everywhere at the cost of a slight relaxation of exact complementarity.
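The behavior of the smoothed function is easy to inspect numerically. The sketch below implements $\Psi^{\mathrm{FB}}(a, b) = a + b - \sqrt{a^2 + b^2 + \varepsilon_{\mathrm{FB}}}$ with $a = \hat\lambda^\mu_t$ and $b = 1 - \mu_t$; the function name and the test points are ours.

```python
import math


def fischer_burmeister(a, b, eps=1e-6):
    """Smoothed Fischer-Burmeister function a + b - sqrt(a^2 + b^2 + eps)."""
    return a + b - math.sqrt(a * a + b * b + eps)


# eps = 0: zero exactly on the non-negative axes (a = 0, b >= 0 or b = 0, a >= 0).
print(fischer_burmeister(0.0, 0.3, eps=0.0))   # interior abatement, multiplier zero
print(fischer_burmeister(2.0, 0.0, eps=0.0))   # bound binds, multiplier positive
print(fischer_burmeister(1.0, 1.0, eps=0.0))   # both positive: complementarity violated
print(fischer_burmeister(-0.1, 0.5, eps=0.0))  # negative multiplier: infeasible

# eps > 0: smooth everywhere, with a small offset that is largest at the origin.
print(fischer_burmeister(0.0, 0.3))
print(fischer_burmeister(0.0, 0.0))            # equals -sqrt(eps)
```

The last line shows the size of the relaxation: at the kink the smoothed residual equals $-\sqrt{\varepsilon_{\mathrm{FB}}}$, i.e. $-10^{-3}$ for $\varepsilon_{\mathrm{FB}} = 10^{-6}$, and it shrinks quickly away from the origin.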

11.7From FOCs to a Single Loss

The Lagrangian of Section 11.6 produces ten equilibrium conditions: the consumption FOC (11.30), the capital Euler (11.32), the abatement FOC (11.33), the budget/resource constraint $C_t + I_t = Y^{\mathrm{net}}_t$, the three carbon-stock envelopes (one of which is the atmospheric-carbon envelope (11.34)), the two temperature-layer envelopes, and the Fischer--Burmeister abatement complementarity (11.36). In the DEQN solver two of these ten are enforced exactly by algebraic recovery rather than as squared residuals: the consumption FOC is inverted to yield $c_t$ from $\hat{\lambda}_t$, and the abatement FOC is solved for $\hat{\lambda}^\mu_t$, with the resulting implied multiplier fed straight into the Fischer--Burmeister condition. What remains is an eight-residual sum-of-squares loss with eight network outputs, structurally identical to the stationary DEQN of Chapters 2--3. The only substantive difference is that the network must learn the shadow prices of all five climate state variables (three carbon stocks and two temperature layers) in addition to the economic choices, so that the planner has a gradient signal for how today’s decisions propagate through the carbon cycle and the energy balance into future damages.

11.7.0.1Policy network specification.

The policy function approximated by the neural network outputs an eight-dimensional vector,

$$\mathcal{N}_\rho(\bm{x}_t) \;\in\; \mathbb{R}^{8} \;:=\; \bigl(k_{t+1},\; \mu_t,\; \hat{\lambda}_t,\; \hat{\nu}^{\mathrm{AT}}_t,\; \hat{\nu}^{\mathrm{UO}}_t,\; \hat{\nu}^{\mathrm{LO}}_t,\; \hat{\eta}^{\mathrm{AT}}_t,\; \hat{\eta}^{\mathrm{OC}}_t\bigr),$$

comprising two choice variables ($k_{t+1}$, $\mu_t$), the consumption shadow price $\hat{\lambda}_t$, and the five normalized climate multipliers. Note that the abatement KKT multiplier $\hat{\lambda}^\mu_t$ is not a network output: it is recovered algebraically inside the loss (see below). A key difference from the stationary DEQN of Chapters 2--3 is that the network must learn the shadow prices of all climate constraints, not just the economic choices. Without the climate multipliers, the planner would have no gradient signal about how today’s decisions propagate through the carbon cycle and temperature dynamics into future damages.

11.7.0.2Bounds and positivity.

The output activations of $\mathcal{N}_\rho$ are chosen so that the bound and positivity constraints of the model hold for every input, eliminating the need for additional residuals. The capital level $k_{t+1}$, the consumption shadow $\hat{\lambda}_t$, and the abatement rate $\mu_t$ are each passed through a softplus, which guarantees $k_{t+1} > 0$, $\hat{\lambda}_t > 0$ (so consumption recovered via (11.38) is positive), and $\mu_t \ge 0$ exactly. The upper bound $\mu_t \le 1$ is enforced jointly by the Fischer--Burmeister condition $l_8$ at the implied multiplier (11.39) and by a small quadratic upper-bound penalty $\propto \mathbb{E}[\max(\mu_t - 1, 0)^2]$ added to the training loss. The atmospheric-carbon shadow $\hat{\nu}^{\mathrm{AT}}_t$ is stored in the implementation as a positive marginal damage (see the sign-convention note in Section 11.6) and is output through a softplus; the remaining climate multipliers $\hat{\nu}^{\mathrm{UO}}_t, \hat{\nu}^{\mathrm{LO}}_t, \hat{\eta}^{\mathrm{AT}}_t, \hat{\eta}^{\mathrm{OC}}_t$ are unconstrained and use linear output activations.
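The activation layout described above can be sketched without any deep-learning framework. The code below maps a raw eight-dimensional pre-activation vector to model quantities, assuming the output ordering of the network specification; the dictionary keys, function names, and the convention of negating the stored marginal damage are our illustration, not a transcription of dice_2p_surrogate_lib.py.

```python
import math


def softplus(x):
    """Numerically stable softplus log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def apply_output_activations(raw):
    """Map 8 raw network outputs to (k', mu, lambda_hat, 5 climate multipliers)."""
    k_raw, mu_raw, lam_raw, dmg_raw, nu_uo, nu_lo, eta_at, eta_oc = raw
    return {
        "k_next": softplus(k_raw),        # capital is positive
        "mu": softplus(mu_raw),           # abatement is non-negative
        "lambda_hat": softplus(lam_raw),  # budget multiplier is positive
        # Stored as a positive marginal damage; negate to recover the
        # script's value-derivative convention (nu_at_hat <= 0).
        "nu_at_hat": -softplus(dmg_raw),
        "nu_uo_hat": nu_uo,               # linear outputs
        "nu_lo_hat": nu_lo,
        "eta_at_hat": eta_at,
        "eta_oc_hat": eta_oc,
    }


def upper_bound_penalty(mus):
    """Batch mean of max(mu - 1, 0)^2, the quadratic penalty on mu > 1."""
    return sum(max(mu - 1.0, 0.0) ** 2 for mu in mus) / len(mus)


out = apply_output_activations([-30.0, 0.2, -3.0, 1.5, -0.7, 0.1, -2.0, 0.4])
print(out)
print(upper_bound_penalty([0.5, 1.2]))
```

Note that softplus alone leaves $\mu_t$ unbounded above, which is why the penalty and the Fischer--Burmeister residual are both needed for the upper bound.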

11.7.0.3How is consumption $c_t$ determined?

The consumption FOC (11.30) is enforced exactly by inversion rather than as a residual: given the network’s prediction of $\hat{\lambda}_t$, consumption is recovered algebraically as

$$c_t \;=\; \bigl(\hat{\lambda}_t \cdot A_t^{1/\psi - 1}\,L_t^{-1}\bigr)^{-\psi}, \tag{11.38}$$

so $c_t$ is not itself a network output. Positivity of $c_t$ is guaranteed because the implementation passes $\hat{\lambda}_t$ through a softplus activation, so $\hat{\lambda}_t > 0$ for every input.
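The inversion can be checked with a round trip through the consumption FOC (11.30) exactly as it is written in Section 11.6. The numerical values of $c_t$, $A_t$, and $L_t$ below are arbitrary illustrations and the function names are ours.

```python
PSI = 0.69  # intertemporal elasticity of substitution (Table 11.2)


def lambda_hat_from_c(c, a, l, psi=PSI):
    """Consumption FOC (11.30): lambda_hat = c^(-1/psi) * A^(1 - 1/psi) * L."""
    return c ** (-1.0 / psi) * a ** (1.0 - 1.0 / psi) * l


def c_from_lambda_hat(lam_hat, a, l, psi=PSI):
    """Inversion (11.38): c = (lambda_hat * A^(1/psi - 1) / L)^(-psi)."""
    return (lam_hat * a ** (1.0 / psi - 1.0) / l) ** (-psi)


c, a, l = 0.5, 2.0, 7.4          # arbitrary positive test values
lam_hat = lambda_hat_from_c(c, a, l)
c_back = c_from_lambda_hat(lam_hat, a, l)
print(lam_hat, c_back)           # c_back reproduces c up to rounding
```

Because the map from $\hat\lambda_t$ to $c_t$ is strictly decreasing and defined for every $\hat\lambda_t > 0$, the softplus on $\hat\lambda_t$ is all that is needed to keep consumption positive.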

11.7.0.4How is the abatement multiplier $\hat{\lambda}^\mu_t$ determined?

The same trick handles the abatement FOC (11.33): rather than have the network output $\hat{\lambda}^\mu_t$ and impose the FOC as a separate residual, we solve the FOC for $\hat{\lambda}^\mu_t$ and treat the resulting implied multiplier as a deterministic function of the other network outputs. Setting $\partial\mathcal{L}/\partial\mu_t = 0$ in (11.33) yields

$$\hat{\lambda}^{\mu,\mathrm{impl}}_t \;=\; -\hat{\lambda}_t\,\Theta'(\mu_t)\,k_t^{\alpha} \;-\; \hat{\nu}^{\mathrm{AT}}_t\,\sigma_t\, A_t\, L_t\,k_t^{\alpha}. \tag{11.39}$$

Plugged into the Fischer--Burmeister condition (11.36), this is the residual $l_8$ below. Two facts come for free. First, whenever $l_8 = 0$ holds and the smoothing parameter $\varepsilon_{\mathrm{FB}}$ is small, the abatement FOC also holds automatically, because $l_8$ couples the implied multiplier to the slack $1-\mu_t$. Second, the network output dimension drops from nine to eight, which improves training stability: the network no longer has to discover that $\hat{\lambda}^\mu_t$ is exactly the right algebraic combination of $\hat{\lambda}_t,\, \mu_t,\, \hat{\nu}^{\mathrm{AT}}_t$.
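The two-step recovery (implied multiplier, then Fischer--Burmeister residual) can be sketched as follows. To avoid committing to a functional form for $\Theta$, the sketch takes $\Theta'(\mu_t)$ as an input; all numerical values are invented for illustration, and the function names are ours.

```python
import math

ALPHA = 0.30  # capital share (Table 11.2)


def implied_abatement_multiplier(lam_hat, theta_prime, nu_at_hat,
                                 sigma, a, l, k, alpha=ALPHA):
    """Implied multiplier (11.39), obtained by solving the abatement FOC."""
    k_alpha = k ** alpha
    return (-lam_hat * theta_prime * k_alpha
            - nu_at_hat * sigma * a * l * k_alpha)


def residual_l8(lam_mu_impl, mu, eps_fb=1e-6):
    """Fischer-Burmeister residual evaluated at the implied multiplier."""
    slack = 1.0 - mu
    return lam_mu_impl + slack - math.sqrt(lam_mu_impl ** 2 + slack ** 2 + eps_fb)


# Made-up inputs: marginal cost lam_hat * theta_prime, marginal benefit -nu * sigma * A * L.
lam_hat, theta_prime, sigma, a, l, k = 2.0, 0.05, 0.1, 1.5, 7.0, 3.0

# Interior case: carbon shadow price chosen so that cost and benefit balance.
nu_interior = -lam_hat * theta_prime / (sigma * a * l)
lm_interior = implied_abatement_multiplier(lam_hat, theta_prime, nu_interior,
                                           sigma, a, l, k)
# Corner case: a larger marginal damage makes the implied multiplier positive.
lm_corner = implied_abatement_multiplier(lam_hat, theta_prime, -0.2,
                                         sigma, a, l, k)

print(lm_interior, residual_l8(lm_interior, mu=0.4))  # both close to zero
print(lm_corner, residual_l8(lm_corner, mu=1.0))      # multiplier > 0, residual close to zero
print(residual_l8(lm_corner, mu=0.4))                 # clearly non-zero: mu should rise
```

The third case is the one that produces a training signal: a positive implied multiplier with slack in the bound means the marginal benefit of abatement exceeds its marginal cost, so the residual pushes $\mu_t$ upward.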

The network architecture uses two hidden layers with 1024 units each, SELU activation, and the Adam optimizer with learning rate $10^{-5}$. Training alternates between broad sampling (Phase 1) and endogenous simulation (Phase 2), as described in Chapter 3.

11.7.0.5The 8 loss components.

Each remaining equilibrium condition from Paragraph becomes a residual lm=0l_m = 0, and the network is asked to drive every lml_m to zero simultaneously along simulated paths. The mapping is one-for-one: l1l_1 is the capital-Euler FOC (11.32); l2l_2 is the budget constraint that closes (11.4); l3l_3, l4l_4, l5l_5 are the three carbon-reservoir envelope conditions, of which l3l_3 is (11.34); l6l_6 and l7l_7 are the two temperature-layer envelopes; and l8l_8 is the Fischer--Burmeister smoothing (11.36) of the KKT slack on μt1\mu_t \le 1, evaluated at the implied multiplier (11.39). The consumption FOC (11.30) and the abatement FOC (11.33) are enforced exactly via the inversions in (11.38) and (11.39), which is why the loss list contains eight entries instead of nine. Written out, the eight components are:

$$
\begin{aligned}
l_1 &:= \exp\!\bigl(g^A_t + g^L_t\bigr)\,\hat{\lambda}_t - \hat{\beta}_t\Bigl\{\hat{\lambda}_{t+1}\bigl[\bigl(1-\Omega(T_{\mathrm{AT},t+1}) - \Theta(\mu_{t+1})\bigr)\alpha k_{t+1}^{\alpha-1} + (1-\delta)\bigr] \\
&\quad + \hat{\nu}^{\mathrm{AT}}_{t+1}\,\sigma_{t+1}(1-\mu_{t+1})A_{t+1}L_{t+1}\alpha k_{t+1}^{\alpha-1}\Bigr\} && \text{(capital Euler)} \\
l_2 &:= \bigl(1-\Omega(T_{\mathrm{AT},t}) - \Theta(\mu_t)\bigr)\,k_t^\alpha + (1-\delta)\,k_t - c_t - \exp\!\bigl(g^A_t + g^L_t\bigr)\,k_{t+1} && \text{(budget)} \\
l_3 &:= \hat{\nu}^{\mathrm{AT}}_t - \hat{\beta}_t\!\left[\hat{\nu}^{\mathrm{AT}}_{t+1}(1-b_{12}) + \hat{\nu}^{\mathrm{UO}}_{t+1}\,b_{12} + \hat{\eta}^{\mathrm{AT}}_{t+1}\,c_1\,F_{\mathrm{2\times CO_2}}\,\tfrac{1}{\ln 2\,M_{\mathrm{AT},t+1}}\right] && \text{(atm.\ carbon)} \\
l_4 &:= \hat{\nu}^{\mathrm{UO}}_t - \hat{\beta}_t\!\Bigl[\hat{\nu}^{\mathrm{AT}}_{t+1}\,b_{12}\,\tfrac{M^{\mathrm{AT}}_{\mathrm{EQ}}}{M^{\mathrm{UO}}_{\mathrm{EQ}}} + \hat{\nu}^{\mathrm{UO}}_{t+1}\!\Bigl(1-b_{12}\tfrac{M^{\mathrm{AT}}_{\mathrm{EQ}}}{M^{\mathrm{UO}}_{\mathrm{EQ}}}-b_{23}\Bigr) + \hat{\nu}^{\mathrm{LO}}_{t+1}\,b_{23}\Bigr] && \text{(upper ocean C)} \\
l_5 &:= \hat{\nu}^{\mathrm{LO}}_t - \hat{\beta}_t\!\Bigl[\hat{\nu}^{\mathrm{UO}}_{t+1}\,b_{23}\,\tfrac{M^{\mathrm{UO}}_{\mathrm{EQ}}}{M^{\mathrm{LO}}_{\mathrm{EQ}}} + \hat{\nu}^{\mathrm{LO}}_{t+1}\!\Bigl(1-b_{23}\,\tfrac{M^{\mathrm{UO}}_{\mathrm{EQ}}}{M^{\mathrm{LO}}_{\mathrm{EQ}}}\Bigr)\Bigr] && \text{(lower ocean C)} \\
l_6 &:= \hat{\eta}^{\mathrm{AT}}_t - \hat{\beta}_t\!\Bigl[-\hat{\lambda}_{t+1}\,\Omega'(T_{\mathrm{AT},t+1})\,k_{t+1}^\alpha + \hat{\eta}^{\mathrm{AT}}_{t+1}\!\Bigl(1-c_1\,\tfrac{F_{\mathrm{2\times CO_2}}}{\Delta T_{\mathrm{AT},\times 2}}-c_1 c_3\Bigr) + \hat{\eta}^{\mathrm{OC}}_{t+1}\,c_4\Bigr] && \text{(atm.\ temp.)} \\
l_7 &:= \hat{\eta}^{\mathrm{OC}}_t - \hat{\beta}_t\!\left[\hat{\eta}^{\mathrm{AT}}_{t+1}\,c_1 c_3 + \hat{\eta}^{\mathrm{OC}}_{t+1}(1-c_4)\right] && \text{(ocean temp.)} \\
l_8 &:= \hat{\lambda}^{\mu,\mathrm{impl}}_t + (1-\mu_t) - \sqrt{(\hat{\lambda}^{\mu,\mathrm{impl}}_t)^2 + (1-\mu_t)^2 + \varepsilon_{\mathrm{FB}}} && \text{(Fischer--Burmeister, implied multiplier)}
\end{aligned}
$$

Loss components $l_1$--$l_2$ enforce intertemporal optimality and feasibility, $l_3$--$l_7$ are the envelope conditions that price the five climate state variables, and $l_8$ jointly enforces the abatement FOC (via the implied multiplier) and the upper-bound complementarity $\mu_t \le 1$.
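To make the role of $l_8$ concrete, the following is a minimal NumPy sketch of the smoothed Fischer--Burmeister residual. The function name and the value of the smoothing constant are illustrative assumptions and are not taken from the CDICE code base.

```python
import numpy as np

def fischer_burmeister(lam, slack, eps_fb=1e-8):
    """Smoothed Fischer-Burmeister residual.

    lam    : multiplier on the constraint (here: the implied multiplier on mu_t <= 1)
    slack  : constraint slack (here: 1 - mu_t)
    eps_fb : smoothing constant (assumed value, for illustration only)
    """
    return lam + slack - np.sqrt(lam**2 + slack**2 + eps_fb)

# Interior abatement (mu_t = 0.6) with a zero multiplier: complementarity holds.
r_interior = fischer_burmeister(0.0, 1.0 - 0.6)
# Binding cap (mu_t = 1) with a positive multiplier: complementarity holds.
r_binding = fischer_burmeister(0.3, 0.0)
# Positive multiplier AND positive slack: complementarity is violated.
r_violation = fischer_burmeister(0.3, 0.4)
```

The residual is approximately zero whenever at least one of the multiplier and the slack is zero and both are non-negative, and it is bounded away from zero otherwise, which is why a single smooth residual can stand in for the KKT case distinction.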

11.7.0.6Total loss.

The DEQN loss aggregates all residuals along a simulated path:

$$
\ell_\rho \;:=\; \frac{1}{N_{\text{path}}} \sum_{\bm{x}_t\,\text{on sim.\ path}} \;\sum_{m=1}^{8}\; \bigl(l_m(\bm{x}_t,\, \mathcal{N}_\rho(\bm{x}_t))\bigr)^2.
$$

This is the same sum-of-squared-residuals structure as the $N$-country IRBC model of Chapter 3, but with 8 equations per time step instead of the IRBC’s $2N+1$ ($N$ Euler equations, $N$ Fischer--Burmeister conditions, and one aggregate resource constraint).
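As a sketch of the aggregation only, suppose the eight residuals have already been evaluated at every state on a simulated path and stacked into an array of shape $(N_{\text{path}}, 8)$; that layout and the function name are assumptions made for this illustration.

```python
import numpy as np

def deqn_loss(residuals):
    """Mean over path states of the sum of squared equilibrium residuals.

    residuals : array of shape (N_path, n_equations); n_equations = 8 for CDICE.
    """
    residuals = np.asarray(residuals, dtype=float)
    return np.mean(np.sum(residuals**2, axis=1))

# Stand-in residuals: a real run would obtain these from the network output.
rng = np.random.default_rng(0)
fake_residuals = rng.normal(scale=1e-3, size=(1000, 8))
loss = deqn_loss(fake_residuals)
```

In an actual training loop the same reduction would be written in an autodiff framework so that the gradient with respect to the network parameters is available.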

11.7.0.7State evolution.

To evaluate the loss along a simulated path, the full state vector (11.25) must be propagated forward. In CDICE the next-period state is:

$$
\bm{x}_{t+1} = \bigl(k_{t+1},\; M^{\mathrm{AT}}_{t+1},\; M^{\mathrm{UO}}_{t+1},\; M^{\mathrm{LO}}_{t+1},\; T^{\mathrm{AT}}_{t+1},\; T^{\mathrm{OC}}_{t+1},\; \mu_{f,t+1},\; S_{f,t+1},\; \tau_{t+1};\; \theta\bigr)^T,
$$

where:

All deterministic transitions are differentiable; stochastic shock draws are handled via reparameterization / common random numbers, so the simulate-then-backpropagate loop can be executed end-to-end with automatic differentiation.

This is the deterministic CDICE-DEQN solver in its entirety. Companion notebook 02_DICE_DEQN_Library_Port.ipynb trains it against the CDICE library reference solution of Folini et al. (2025); the verification gate inside that notebook is the natural stopping point for a reader who wants only the deterministic core.

11.8From CDICE to Stochastic IAMs

The deterministic CDICE-DEQN of Section 11.7, together with the AR(1) productivity extension developed in the remarkbox below, is the right pedagogical anchor because it contains every mechanical component of an integrated assessment model: capital accumulation, emissions, carbon diffusion, temperature dynamics, damages, abatement costs, and the SCC as a shadow price. It is not yet the object one wants for quantitative climate-policy research. Three features are still missing.

First, climate policy is an intrinsically stochastic problem. Productivity, carbon intensity, damages, climate feedbacks, and tipping thresholds are not known constants. Once they are stochastic, a carbon tax is not a path but a state-contingent policy. Second, long-run climate risk makes time-additive CRRA preferences too restrictive: the intertemporal elasticity of substitution and risk aversion should be separate parameters. Third, climate policy is distributional. The representative-agent SCC answers a marginal pricing question, but an implementable policy also asks which cohorts pay the tax and which cohorts receive the transfers. This is the point at which the chapter moves from representative-agent DICE to stochastic overlapping-generations IAMs.

The transition is smooth if one keeps the computational object fixed. In every case the neural network approximates a policy map

$$
\begin{aligned} u_t &= \mathcal N_\rho(\tilde{\bm x}_t), \\ \tilde{\bm x}_t &= (\text{economic states},\ \text{climate states},\ \text{beliefs},\ \text{parameters},\ \text{policy-rule coefficients}), \end{aligned}
$$

and the loss is still a sum of normalized equilibrium residuals. The only changes are the variables appended to $\tilde{\bm x}_t$ and the conditional expectations appearing in the residuals. Table 11.4 summarizes the sequence.

Table 11.4: The layers of the climate-economy pipeline used in the remainder of the chapter. Each layer is a small extension of the previous one; no new numerical paradigm is introduced after the deterministic CDICE-DEQN.

| Layer | Economic question | Computational change |
|---|---|---|
| Deterministic CDICE (Section 11.7) | What is the globally optimal abatement path and SCC at the baseline calibration? | Time-stamped DEQN; eight residuals; horizon $T_{\max}$ chosen so discounting absorbs transversality. |
| Stochastic DICE (AR(1) + GH quadrature, see the productivity-shock remarkbox below) | How do shocks alter the SCC distribution? | Add shock states; replace future terms by Gauss--Hermite expectations. |
| Bayesian learning on ECS (Section 11.9) | How does learning about climate sensitivity alter the SCC distribution? | Add belief mean and belief variance as states; one signal equation; conjugate Gaussian update. |
| Epstein--Zin DICE (Section 11.10) | How do risk aversion and IES separately price climate tails? | Add the value level as a network output; add one recursion residual and an EZ continuation-value weight. |
| Deep UQ (Section 11.11) | Which uncertain parameters drive SCC variation? | Treat parameters as pseudo-states; fit a GP surrogate for the QoI; compute Sobol, Shapley, and univariate effects. |
| Stochastic OLG-IAM (Section 11.12) | Can carbon taxes be welfare improving and Pareto improving across cohorts? | Treat tax coefficients and transfer shares as pseudo-states; fit GP surrogates for cohort welfare; solve constrained policy design on the surrogate. |

11.9Bayesian Learning About Climate Sensitivity

11.9.0.1Why ECS is the natural learning state.

The equilibrium climate sensitivity (ECS), defined as the long-run atmospheric warming from a doubling of CO$_2$, is the single most consequential and most uncertain parameter in the climate side of an IAM. Observational, paleoclimate, and model-based estimates place ECS in a likely (66%) range of roughly 2.5--4°C and a very-likely (90%) range of 2--5°C (Sherwood et al., 2020; Knutti et al., 2017; Roe & Baker, 2007), and ECS uncertainty is the largest single contributor to SCC dispersion across model variants. Crucially, ECS is partially identified from temperature realizations conditional on emissions and forcing: a Bayesian planner who observes temperature paths can therefore update her posterior period by period, and the policy that maximizes ex-ante welfare conditions on the current posterior rather than on a fixed point estimate.

11.9.0.2How learning enters the state.

Promote the climate-feedback parameter $\lambda$ in (11.16) to a stochastic object by adding the feedback term $\varphi_{1C}\,\tilde f_{t+1}\,T^{\mathrm{AT}}_t$ to the right-hand side, where $\varphi_{1C}$ is a calibrated coupling coefficient (taken from Friedl et al. (2023)) and $\tilde f_{t+1} \sim \mathcal N(\mu_{f,t}, S_{f,t})$ is a per-period draw under the planner’s posterior over the unknown climate-feedback deviation. The unknown itself is time-invariant; the subscript $t{+}1$ indexes the period in which the subjective draw enters the temperature equation, and the planner’s posterior moments $(\mu_{f,t}, S_{f,t})$ shift over time as new temperature observations arrive. The planner observes the temperature-residual signal

$$
y_{t+1} \;:=\; \varphi_{1C}\,T^{\mathrm{AT}}_t\,\tilde f_{t+1} \;+\; \tilde\epsilon_{T,t+1},\qquad \tilde\epsilon_{T,t+1} \sim \mathcal N(0, S_{\epsilon_T}),
$$

and conjugate Gaussian--Gaussian updating delivers the posterior

$$
\mu_{f,t+1} = \frac{S_{\epsilon_T}\,\mu_{f,t} + \varphi_{1C}\,T^{\mathrm{AT}}_t\,S_{f,t}\,y_{t+1}}{S_{\epsilon_T} + (\varphi_{1C}\,T^{\mathrm{AT}}_t)^2\,S_{f,t}}
$$

$$
S_{f,t+1} = \frac{S_{\epsilon_T} \cdot S_{f,t}}{S_{\epsilon_T} + (\varphi_{1C}\,T^{\mathrm{AT}}_t)^2\,S_{f,t}}
$$

which the planner takes as two additional laws of motion for the belief states $(\mu_{f,t}, S_{f,t})$. These two states occupy the slots already reserved in the augmented state vector (11.25). Equations (11.58)--(11.59) are the Kalman update for a scalar linear-Gaussian state-space model with observation gain $\varphi_{1C}\,T^{\mathrm{AT}}_t$ and noise variance $S_{\epsilon_T}$; cf. Bishop (2006) [§ 13.3] for the generic derivation. The DEQN algorithm of Section 11.5 is unchanged: the network simply receives two more inputs and learns a richer policy.
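The belief update can be sketched in a few lines of NumPy. All numerical values below (coupling coefficient, noise variance, temperature, the "true" feedback deviation) are placeholders chosen for illustration, not the calibration of Friedl et al. (2023), and for simplicity the signal is generated from a fixed true value with temperature held constant.

```python
import numpy as np

def belief_update(mu_f, S_f, y, T_at, phi_1c, S_eps):
    """One conjugate Gaussian update of the belief mean and variance."""
    h = phi_1c * T_at                      # observation gain
    denom = S_eps + h**2 * S_f
    mu_next = (S_eps * mu_f + h * S_f * y) / denom
    S_next = S_eps * S_f / denom
    return mu_next, S_next

rng = np.random.default_rng(0)
f_true = 0.8                               # hypothetical true feedback deviation
phi_1c, S_eps, T_at = 0.5, 0.04, 1.2       # placeholder values
mu, S = 0.0, 1.0                           # prior mean and variance
variances = [S]
for _ in range(50):
    y = phi_1c * T_at * f_true + rng.normal(scale=np.sqrt(S_eps))
    mu, S = belief_update(mu, S, y, T_at, phi_1c, S_eps)
    variances.append(S)
```

With a constant gain the posterior precision grows linearly in the number of observations, $1/S_{f,t} = 1/S_{f,0} + t\,(\varphi_{1C} T^{\mathrm{AT}})^2 / S_{\epsilon_T}$, so a warmer atmosphere (larger gain) makes each temperature observation more informative.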

11.9.0.3Where this sits in the literature.

Bayesian learning about climate parameters in an integrated assessment frame has a long pedigree. Kelly & Kolstad (1999) and Kelly & Tan (2015) establish the basic Kelly--Kolstad result that learning takes decades to centuries in calibrated DICE-like settings, and that the tradeoff between mitigation (which lowers temperature variance) and information (which requires informative temperature paths) is sharp. Leach (2007) and Webster et al. (2008) sharpen the slow-learning result and quantify the policy errors induced by treating uncertainty as resolved too quickly. On the dynamic-programming side, Cai & Lontzek (2019) solve a stochastic-DICE variant with tipping-point hazards and recursive preferences. The robust-control program of Anderson et al. (2014), Barnett (2023), Barnett et al. (2023), and Barnett et al. (2020) addresses a complementary question (planner ambiguity over the data-generating process), and modern deep-learning solutions are the natural computational companion because tensor-product grids over belief states are infeasible at realistic state-vector sizes.

11.9.0.4Headline result from the UQ literature.

Friedl et al. (2023) solve the joint stochastic-DICE--Bayesian-learning DEQN with the methodology of this chapter and find two qualitative features that survive across the calibration cloud.[1] First, ECS uncertainty is largely resolved within roughly ten years of optimal policy: the posterior variance $S_{f,t}$ shrinks by an order of magnitude over the first decade of the planner’s horizon, even though the absolute posterior mean takes longer to settle. Second, the SCC under learning is roughly half the no-learning SCC for moderate true ECS values, and roughly the same as the no-learning SCC at the upper tail of the ECS distribution; learning is a strong substitute for precautionary mitigation in the moderate-ECS regime, and a weak substitute in the tail-ECS regime. The asymmetry is policy-relevant: the value of waiting to learn falls sharply once the planner suspects she is in the tail. The broader teaching point is that uncertainty is not automatically a reason to abate more: its policy effect depends on whether the uncertainty is static, learnable, or associated with irreversible tail risk. Figure 11.5 illustrates the two qualitative features.

Figure 11.5: Schematic of the two qualitative features reported by Friedl et al. (2023). Left: posterior variance $S_{f,t}$ relative to its prior value, on a logarithmic scale. The variance falls by roughly an order of magnitude over the first decade, mirroring the Kelly--Kolstad slow-learning result but accelerated by the higher signal-to-noise ratio of the modern climate calibration. Right: $\mathrm{SCC}_0$ as a function of the true ECS, with and without Bayesian learning. Learning approximately halves the SCC at moderate ECS values where uncertainty is the dominant driver of precautionary abatement, but converges to the no-learning curve at the upper tail where the underlying physical damage dominates. Curves are illustrative; the magnitudes are those quoted in the body text.

11.10Epstein--Zin Preferences

11.10.0.1Why recursive preferences for climate.

The time-additive CRRA-IES aggregator (11.5) ties risk aversion and intertemporal substitution together. Climate policy is exactly the environment in which this restriction is least attractive. A planner may want a high IES $\psi$ to govern intertemporal substitution across long horizons, and a separate high coefficient of relative risk aversion $\gamma_u$ to price low-probability climate disasters. Recursive Kreps--Porteus preferences, following Epstein & Zin (1989) and Weil (1989), implement this separation.[2]

Working with the normalized per-capita value $v_t$ and per-capita consumption $c_t = C_t/L_t$, and writing $\beta^{\mathrm{EZ}}_t := \exp(-\rho\,\Delta_t)$ for the one-period Epstein--Zin discount factor, the recursion is

$$
v_t = \left[ (1-\beta^{\mathrm{EZ}}_t)\, c_t^{1-1/\psi} + \beta^{\mathrm{EZ}}_t \left(\mathbb E_t\!\left[v_{t+1}^{1-\gamma_u}\right]\right)^{\frac{1-1/\psi}{1-\gamma_u}} \right]^{\frac{1}{1-1/\psi}},
$$

with the usual logarithmic limits when $\psi = 1$ or $\gamma_u = 1$, subject to the same budget constraint, capital-accumulation law, and climate dynamics as before.

11.10.0.2What changes in the DEQN loss.

The value level $v_t$ becomes an additional network output, paired with a ninth residual that enforces the recursion (11.60):

$$
\mathcal R^{\mathrm{EZ}}_t = v_t - \left[ (1-\beta^{\mathrm{EZ}}_t)\, c_t^{1-1/\psi} + \beta^{\mathrm{EZ}}_t \left(\mathbb E_t\!\left[v_{t+1}^{1-\gamma_u}\right]\right)^{\frac{1-1/\psi}{1-\gamma_u}} \right]^{\frac{1}{1-1/\psi}}.
$$

In the deterministic CRRA-IES core of Section 11.7, $v_t$ never appears explicitly, which is why eight residuals suffice there. The Euler and costate residuals keep their deterministic form but receive a Bansal--Yaron certainty-equivalent weighting inside each conditional expectation. It is convenient to write the one-step recursive-pricing kernel as

$$
\mathcal M^{\mathrm{EZ}}_{t,t+1} = \hat\beta_t \left( \frac{v_{t+1}}{\left(\mathbb E_t[v_{t+1}^{1-\gamma_u}]\right)^{1/(1-\gamma_u)}} \right)^{1/\psi - \gamma_u} \left(\frac{c_{t+1}}{c_t}\right)^{-1/\psi},
$$

where $\hat\beta_t$ inherits the deterministic growth normalization of (11.29). In the code, (11.62) is just a multiplicative weight on next-period marginal-value terms; the certainty-equivalent operator inside the Kreps--Porteus aggregator becomes a second nested expectation. The DEQN loss inherits one extra Gauss--Hermite quadrature step and one extra network output, but no new algorithmic ingredient.
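The recursion and the pricing kernel can be sketched with Gauss--Hermite quadrature over a single standard-normal shock. The lognormal continuation values and all parameter values are hypothetical, and the function names are not taken from any companion code.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gauss_hermite_normal(n_nodes):
    """Nodes and probability weights for E[g(z)] with z ~ N(0, 1)."""
    nodes, weights = hermegauss(n_nodes)
    return nodes, weights / weights.sum()

def certainty_equivalent(v_next, weights, gamma_u):
    """(E[v_{t+1}^{1-gamma_u}])^{1/(1-gamma_u)}, valid for gamma_u != 1."""
    return (weights @ v_next ** (1.0 - gamma_u)) ** (1.0 / (1.0 - gamma_u))

def ez_value(c, v_next, weights, beta, psi, gamma_u):
    """Right-hand side of the Epstein-Zin recursion, valid for psi != 1."""
    ce = certainty_equivalent(v_next, weights, gamma_u)
    r = 1.0 - 1.0 / psi
    return ((1.0 - beta) * c**r + beta * ce**r) ** (1.0 / r)

def ez_kernel(c, c_next, v_next, weights, beta_hat, psi, gamma_u):
    """One-step recursive pricing kernel, evaluated at each quadrature node."""
    ce = certainty_equivalent(v_next, weights, gamma_u)
    return beta_hat * (v_next / ce) ** (1.0 / psi - gamma_u) * (c_next / c) ** (-1.0 / psi)

nodes, w = gauss_hermite_normal(7)
v_next = np.exp(0.1 * nodes)        # hypothetical continuation values at the nodes
v_low_ra = ez_value(1.0, v_next, w, beta=0.95, psi=1.5, gamma_u=2.0)
v_high_ra = ez_value(1.0, v_next, w, beta=0.95, psi=1.5, gamma_u=10.0)
# The ninth residual: network output for v_t minus the recursion's right-hand side.
v_network = v_low_ra                # stand-in for the network's value output
residual_ez = v_network - ez_value(1.0, v_next, w, beta=0.95, psi=1.5, gamma_u=2.0)
```

Raising $\gamma_u$ lowers the certainty equivalent of the same risky continuation value and therefore lowers $v_t$; when $\gamma_u = 1/\psi$ the value ratio in the kernel has exponent zero and the kernel collapses to the time-additive CRRA form.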

11.10.0.3Interpretation for the SCC.

Crost & Traeger (2013) and Crost & Traeger (2014) establish the analytic baseline: decoupling risk aversion from the IES leaves the optimal carbon tax of a deterministic IAM unchanged and matters only when stochastic risk is present, but the change can be quantitatively large once it is. Jensen & Traeger (2014), Traeger (2023), and Traeger (2021) extend the result to closed-form ACE-class settings and show that for reasonable risk aversion above $1/\psi$, the SCC roughly doubles relative to CRRA; Cai & Lontzek (2019) reach the same conclusion in a fully stochastic DICE variant. Intuitively, recursive preferences change the SCC because carbon emissions affect the distribution of long-run consumption, not only its mean: if damages create low-consumption tail states, a high $\gamma_u$ raises the SCC through the disaster-insurance channel. The sign of the IES effect depends on which shock dominates: in TFP-driven economies higher $\psi$ dampens the SCC because consumption smoothing absorbs the productivity risk, whereas in temperature-driven economies higher $\psi$ amplifies the SCC because the planner cares more about late-horizon consumption losses. Bansal et al. (2016) make the asset-pricing case for the same channel: long-run temperature shifts price into expected returns through the EZ aggregator, and ignoring them understates the welfare cost of carbon emissions. This is why stochastic DICE with Epstein--Zin preferences is a better teaching object than deterministic DICE for climate-finance questions: it connects welfare, tail risk, and asset-pricing logic in a single equilibrium loss.

11.11Deep Uncertainty Quantification via Surrogates

Deep UQ answers a different question from solving one stochastic IAM. The object is now a scalar quantity of interest,

$$
q(\theta) = \mathrm{SCC}_{2100}(\theta), \qquad \theta\in\Theta\subset\mathbb R^{d_\theta},
$$

where $\theta$ collects uncertain structural parameters: the ECS or its prior mean $\mu_{f,0}$, the prior variance $S_{f,0}$, the pure rate of time preference $\rho$, the IES $\psi$, risk aversion $\gamma_u$, the damage curvature $\pi_2$, and any tipping parameters included in the experiment. Direct global sensitivity analysis would require solving the IAM thousands of times. Deep UQ replaces this infeasible outer loop by two amortizations.

11.11.0.1Amortization 1: parameters as pseudo-states.

The pseudo-state trick of Friedl et al. (2023) collapses the outer loop into a single DEQN training pass. Uncertain parameters $\theta$ are appended to the network’s input,

$$
\tilde{\bm{x}}_t = \bigl(\underbrace{\bm x_t}_{\text{economic + climate states}},\; \underbrace{\theta}_{\text{uncertain parameters}}\bigr),\qquad u_t = \mathcal N_\rho(\tilde{\bm x}_t),
$$

held fixed within each simulation episode and re-sampled across episodes from a design distribution $\mathcal D_\theta$. One trained network therefore approximates the policy function for every $\theta$ in the support of $\mathcal D_\theta$; evaluating any new $\theta$ requires only a forward pass. For very large pseudo-state dimensions the active-subspace methods of Section 9.5 compress $\theta$ before the next step. This is the same idea as the parameterized policy networks in Chapter 10; here the target is not an SMM criterion but an SCC distribution.
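A minimal sketch of the bookkeeping, assuming a uniform design distribution over a box and random numbers standing in for the simulated economic and climate states; the bounds, dimensions, and function name are illustrative only.

```python
import numpy as np

def augment_with_pseudo_states(x_path, theta):
    """Append a fixed parameter vector to every state on one simulated path.

    x_path : (T, d_x) economic + climate states along one episode
    theta  : (d_theta,) parameters held fixed within the episode
    """
    theta_tiled = np.broadcast_to(theta, (x_path.shape[0], theta.shape[0]))
    return np.concatenate([x_path, theta_tiled], axis=1)

rng = np.random.default_rng(0)
theta_low = np.array([2.0, 0.5])    # hypothetical lower bounds (e.g. ECS, damage curvature)
theta_high = np.array([5.0, 2.0])   # hypothetical upper bounds

episodes = []
for _ in range(4):
    theta = rng.uniform(theta_low, theta_high)   # re-sampled once per episode
    x_path = rng.normal(size=(10, 6))            # stand-in for simulated states
    episodes.append(augment_with_pseudo_states(x_path, theta))

batch = np.concatenate(episodes, axis=0)         # network input, shape (40, 8)
```

Within an episode the last two columns are constant, while across episodes they vary, so the network sees the parameter space as additional input dimensions without any change to the loss.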

11.11.0.2Amortization 2: a GP for the quantity of interest.

After training, the DEQN is evaluated at a design set $\{\theta_i\}_{i=1}^n$ and the corresponding QoI values $q_i = q(\theta_i)$ are computed by forward simulation. Fit a Gaussian-process surrogate

$$
q(\theta) = m(\theta) + \varepsilon(\theta), \qquad m(\theta)\mid \{(\theta_i,q_i)\}_{i=1}^n \sim \mathcal{GP}\bigl(\mu_n(\theta),\, k_n(\theta,\theta')\bigr).
$$

The GP is cheap enough to evaluate millions of times, so the expensive IAM is no longer called inside Sobol, Shapley, or univariate-effect estimators. Bayesian active learning improves the design by adding points where the GP posterior uncertainty is largest or where integrated posterior variance is most reduced, following the toolkit of Chapter 9 (see Figure 10.1 and Table 10.1).
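The surrogate step can be sketched with a zero-mean GP and a squared-exponential kernel written directly in NumPy. The kernel choice, the fixed hyperparameters, and the toy function standing in for $q(\theta)$ are all assumptions made for this illustration; a production pipeline would typically estimate the hyperparameters and use a dedicated GP library.

```python
import numpy as np

def rbf(a, b, length=0.3, signal_var=1.0):
    """Squared-exponential kernel matrix between rows of a and rows of b."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length**2)

def gp_posterior(theta_train, q_train, theta_test, noise=1e-6, signal_var=1.0):
    """Posterior mean and pointwise variance of a zero-mean GP."""
    K = rbf(theta_train, theta_train) + noise * np.eye(len(theta_train))
    Ks = rbf(theta_test, theta_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, q_train))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.full(len(theta_test), signal_var) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)   # clip tiny negative round-off

def toy_qoi(theta):
    """Stand-in for the expensive map theta -> SCC; NOT an IAM."""
    return np.sin(3.0 * theta[:, 0]) + theta[:, 1] ** 2

rng = np.random.default_rng(0)
theta_train = rng.uniform(size=(40, 2))
q_train = toy_qoi(theta_train)

# Active-learning step: among random candidates, pick the most uncertain one.
theta_cand = rng.uniform(size=(500, 2))
mean_cand, var_cand = gp_posterior(theta_train, q_train, theta_cand)
next_point = theta_cand[np.argmax(var_cand)]
```

The maximum-variance rule shown here is the simplest of the acquisition criteria mentioned above; the integrated-variance criterion replaces the `argmax` over pointwise variance with a search over the expected reduction in average posterior variance.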

11.11.0.3Sobol, Shapley, univariate effects.

Three complementary global sensitivity indices answer different questions about how $\theta$ drives the SCC. The first-order Sobol index $S_i$ of Sobol (2001) measures the share of output variance explained by $\theta_i$ alone,

$$
S_i = \frac{\mathrm{Var}\bigl(\mathbb E[q(\theta)\mid\theta_i]\bigr)}{\mathrm{Var}(q(\theta))},
$$

and the total-effect index captures both direct and interaction effects,

$$
S_i^{\mathrm{tot}} = 1 - \frac{\mathrm{Var}\bigl(\mathbb E[q(\theta)\mid\theta_{-i}]\bigr)}{\mathrm{Var}(q(\theta))}.
$$

For independent inputs the $\{S_i\}$ sum to at most one, while the $\{S_i^{\mathrm{tot}}\}$ sum to at least one and exceed it in the presence of interactions; equality $\sum_i S_i^{\mathrm{tot}} = 1$ characterizes additive models. Shapley effects, introduced into sensitivity analysis by Owen (2014) and developed further by Song et al. (2016) and Iooss & Prieur (2019), allocate $\mathrm{Var}(q)$ across parameters via cooperative-game averaging over all subsets of other parameters (Shapley, 1953), sum exactly to $\mathrm{Var}(q)$ (raw) or one (normalized), and handle correlated inputs cleanly. Univariate-effect plots show the conditional mean $\mathbb E[q(\theta)\mid\theta_i]$ as $\theta_i$ varies and capture the directional response that Sobol indices average over. Saltelli & D'Hombres (2010) and Saltelli et al. (2008) give the standard estimators and best-practice warnings.

The reason this pipeline is the only feasible route is computational: direct Monte Carlo on Sobol or Shapley indices requires $O(10^4)$ to $O(10^6)$ evaluations of the structural model at fresh $\theta$ draws. Even at one DEQN solve per parameter vector, that price tag is several core-decades. The DEQN-with-pseudo-states amortizes one loop, and the GP surrogate amortizes the other; the sensitivity indices are then computed on the GP rather than on the IAM.
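Once a cheap surrogate is available, first-order and total-effect indices can be estimated with standard pick-freeze Monte Carlo. The sketch below assumes independent uniform inputs on the unit cube and uses a toy product function in place of the GP mean, chosen because its indices are known in closed form ($S_1 = 3/7$, $S_1^{\mathrm{tot}} = 4/7$, and zero for the inert third input).

```python
import numpy as np

def sobol_indices(f, d, n, rng):
    """Pick-freeze estimates of first-order and total-effect Sobol indices.

    Assumes independent U(0, 1) inputs; f maps an (n, d) array to (n,) outputs.
    """
    A = rng.uniform(size=(n, d))
    B = rng.uniform(size=(n, d))
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    S, S_tot = np.empty(d), np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]                      # column i from B, the rest from A
        fAB = f(AB)
        S[i] = np.mean(fB * (fAB - fA)) / var           # first-order estimator
        S_tot[i] = 0.5 * np.mean((fA - fAB) ** 2) / var  # total-effect estimator
    return S, S_tot

def toy_surrogate(theta):
    """Stand-in for the GP mean: pure interaction between inputs 0 and 1."""
    return theta[:, 0] * theta[:, 1]

rng = np.random.default_rng(0)
S, S_tot = sobol_indices(toy_surrogate, d=3, n=200_000, rng=rng)
```

Each call evaluates the surrogate $n(d+2)$ times, which is trivial on a GP but would be prohibitive if every evaluation were a structural solve. In this example the first-order indices sum to less than one and the total-effect indices to more than one, which is the signature of an interaction.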

11.11.0.4Empirical headline.

Friedl et al. (2023) apply the pipeline to a stochastic DICE variant with Epstein--Zin preferences and Bayesian learning, and find that two ingredients dominate the SCC variance across 2020--2100: the mean of the ECS belief (roughly 50--70% of the total-effect Sobol share) and the curvature parameter of the damage function (roughly 15--25%).[3] Together these account for 70--90% of the SCC variance. Risk aversion contributes a few percentage points; the pure rate of time preference and the IES contribute negligibly once damage curvature is conditioned on. The policy lesson is that under deep uncertainty the SCC should be reported as a distribution, not a point estimate, and that climate-policy design should target tail insurance against the upper ECS--damage corner rather than precision over the central calibration. Figure 11.6 sketches the resulting variance decomposition.

Figure 11.6: Schematic of the total-effect Sobol shares of $\mathrm{SCC}_{2100}$ variance reported by Friedl et al. (2023). Midpoints reflect the ranges quoted in the text (ECS mean 50--70%, damage curvature 15--25%), with horizontal error bars on the two leading parameters indicating the spread across horizon dates and damage-function specifications. The shape, two parameters carrying almost the entire variance, is what motivates the tail-insurance framing in the closing paragraph.

11.12Constrained Pareto-Improving Carbon Tax in OLG-IAMs

The SCC analysis of Section 11.11 is still the marginal welfare cost of one extra ton of carbon to a representative agent. Climate policy, however, redistributes welfare across cohorts: today’s workers pay abatement costs while tomorrow’s households inherit a cooler planet. A Pareto-improving carbon tax must transfer enough revenue back to current cohorts that no generation is worse off than under business-as-usual. This section closes Movement 3 by walking through the constrained-Pareto OLG-IAM of Kübler et al. (2026), reusing the DEQN-with-pseudo-states machinery of Section 11.11 and the GP surrogate of Chapter 9. The Pareto-improvement criterion is closely related to the social-security reform literature (Krueger & Kubler, 2006), to recent work on intergenerational climate policy (Karp et al., 2024; Kotlikoff et al., 2021), and to the constrained-optimal-tax frontier of Douenne et al. (2024).

11.12.0.1Notation reset for this section.

The OLG-IAM uses different conventions than the representative-agent CDICE block of Section 11.2, following Kübler et al. (2026), and we summarize the differences here so the reader is not surprised. $\Omega_t(T_t)$ now denotes the retained-output factor, so net output is $\Omega_t \Phi K^\alpha L^{1-\alpha}$ rather than $(1-\Omega-\Theta)Y^{\mathrm{gross}}$. $p^{\mathrm{tax}}_t$ is the carbon tax (a per-tCO$_2$ price); to avoid clashing with the transformed-time variable $\tau_t$ of Section 11.3, this section uses $p^{\mathrm{tax}}_t$ for the tax throughout, in line with the price-level interpretation. $e_t$ denotes the per-period emissions flow (in GtC), and $E_t = \sum_{s\le t} e_s$ is cumulative emissions through date $t$; this is the convention of the climate-emulator literature (Dietz & Venmans, 2019) and of the companion paper, and it is the source of the section’s frequent “cumulative-emissions tax” phrasing. Finally, the policy vector that the planner ultimately optimizes over is $\vartheta = (\vartheta_{\mathrm{tax}}, \omega)$, the joint vector of tax-rule coefficients and cohort transfer shares defined in Step 1 below.

11.12.0.2From CDICE to a TCRE emulator.

The OLG-IAM uses a much simpler climate side than the 5-state CDICE module of Section 11.2 (three carbon stocks plus two temperature layers). Once the planner’s horizon is converted to cumulative-emissions form $E_t = \sum_{s\le t} e_s$, the linear Transient Climate Response to cumulative carbon Emissions (TCRE) approximation collapses the carbon-cycle and energy-balance machinery to a single algebraic relation $T^{\mathrm{AT}}_t \approx \sigma_{\mathrm{CCR}}\,E_t$ (Dietz & Venmans, 2019), which removes five climate states from the planner’s optimization. The simplification is essential: it is what makes the OLG state space (12 cohort assets + 5 climate / shock states + $\vartheta$ pseudo-states) tractable end-to-end on a GPU. The reader who finds the change abrupt should treat the TCRE relation as a reduced-form summary of the same physics that drove Sections 11.2.5--11.2.6, fitted directly to long-run paths rather than block-by-block. Figure 11.7 contrasts the two climate sides.
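In code the TCRE climate side reduces to a cumulative sum and a multiplication. The sketch below uses a placeholder value for $\sigma_{\mathrm{CCR}}$ and a flat emissions path; neither is the calibration of Kübler et al. (2026).

```python
import numpy as np

def tcre_temperature(emissions, sigma_ccr, E_initial=0.0):
    """Atmospheric temperature from cumulative emissions under the TCRE relation.

    emissions : per-period emissions flows e_t (GtC per period)
    sigma_ccr : warming per unit of cumulative emissions (placeholder value below)
    E_initial : cumulative emissions before the first period
    """
    E = E_initial + np.cumsum(emissions)     # E_t = sum_{s <= t} e_s
    return sigma_ccr * E, E

emissions = np.full(3, 10.0)                  # hypothetical flat path
T_at, E = tcre_temperature(emissions, sigma_ccr=0.002)
```

The only climate state the planner needs to carry is $E_t$ itself, which is what frees up state-space budget for the twelve cohort asset positions.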

Figure 11.7: Climate side of CDICE versus TCRE. The 5-state CDICE module on the left, in which atmospheric carbon, two ocean carbon reservoirs, atmospheric temperature, and ocean temperature all enter the planner’s state, is collapsed in the OLG-IAM to a single algebraic relation between cumulative emissions and atmospheric temperature, $T^{\mathrm{AT}}_t \approx \sigma_{\mathrm{CCR}}\,E_t$. The simplification trades fidelity to short-run climate dynamics for tractability of the 12-cohort heterogeneous-agent state space and is what makes the bilevel policy search of Section 11.12 end-to-end feasible.

11.12.1The OLG-IAM Model

The model features $A=12$ overlapping generations of selfish agents (ages 20--80 in 5-year periods), a competitive firm, and a simplified, cumulative-emissions climate module in the spirit of Dietz & Venmans (2019):

The household Euler equation takes the standard form $C_{t,j}^{-\sigma_u} = \beta\,\mathbb{E}_t[(1+r_{t+1})\,C_{t+1,j+1}^{-\sigma_u}]$ for $j = 1,\ldots,A-1$, and market clearing requires that aggregate savings equal the capital stock: $\sum_j a_{t,j} = K_t$. Figure 11.8 simulates this model without policy intervention; it fixes the business-as-usual (BAU) baseline against which every Pareto-improving policy below is benchmarked, and supplies the cohort-by-cohort participation constraints for the constrained policy search.

Figure 11.8: Business-as-usual baseline for the 12-cohort stochastic OLG-IAM of Kübler et al. (2026). Without policy intervention the median warming reaches roughly 3°C over the 150-year horizon, and the upper tail of damages is substantially larger than the mean. Every Pareto-improving policy below is benchmarked against this baseline, which also supplies the participation constraints for the constrained policy search. Figure extracted from Kübler et al. (2026).

11.12.2The 3-Step ML Pipeline

Finding an optimal carbon tax rule in this OLG economy is a bilevel optimization problem: the outer level searches over tax parameters, and the inner level solves the full stochastic general equilibrium for each candidate tax. Kübler et al. (2026) decompose this into three steps, summarized in Figure 11.9:

Figure 11.9: Three-step machine-learning pipeline for constrained carbon-tax design. The DEQN amortizes equilibrium solution across tax parameters, the GP surrogate maps policy parameters to welfare and cohort utilities, and the final optimization imposes the Pareto constraints on the surrogate.

11.12.2.1 Step 1: DEQN with pseudo-states.

The tax-rule coefficients $\vartheta_{\mathrm{tax}}$ and the $A=12$ transfer shares $\omega = (\omega_1,\ldots,\omega_{12})$ are appended to the state of the neural network as pseudo-states. The transfer shares are non-negative weights satisfying $\sum_{j=1}^{A} \omega_j = 1$, with cohort $j$’s lump-sum transfer given by $\mathbb{T}_{t,j} = \omega_j\,p^{\mathrm{tax}}_t\,e_t$ from the government’s resource constraint $\sum_j \mathbb{T}_{t,j} = p^{\mathrm{tax}}_t\,e_t$. The simplex constraint $\omega \in \Delta^{A-1}$ is enforced by sampling unconstrained logits and applying a softmax before feeding $\omega$ into the network, so the DEQN never sees an infeasible transfer profile. All cohorts alive at $t$, including the newborn cohort, receive a transfer. The number of tax parameters depends on the rule: a simple linear rule on cumulative emissions has $\vartheta_{\mathrm{tax}} = (\vartheta_0,\vartheta_E) \in \mathbb{R}^2$ (so a 14-dimensional pseudo-state vector together with the 12 transfer shares), and a richer rule that adds dependence on carbon intensity and tipping has $\vartheta_{\mathrm{tax}} \in \mathbb{R}^4$ (a 16-dimensional pseudo-state vector with the 12 shares). The DEQN learns the equilibrium for all candidate tax-and-transfer configurations at once, so that simulating any $(\vartheta_{\mathrm{tax}}, \omega)$ requires only a forward pass. The network architecture, optimizer schedule, and training-pool design follow Kübler et al. (2026) verbatim; the exact configuration is documented in the companion repository linked at the end of this section.
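The softmax construction of the transfer shares can be sketched in a few lines. This is an illustrative sketch only: the sampling bounds for the tax coefficients, the Gaussian logits, and the tax and emissions levels are hypothetical and are not taken from Kübler et al. (2026).

```python
import numpy as np

A = 12  # cohorts alive in each period (from the text)

def softmax(z):
    """Map unconstrained logits to the probability simplex, row by row."""
    z = z - z.max(axis=-1, keepdims=True)   # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sample_pseudo_states(n, tax_lo, tax_hi, rng):
    """Draw n pseudo-state vectors (theta_tax, omega).

    tax_lo and tax_hi are HYPOTHETICAL sampling bounds for the tax coefficients.
    """
    tax_lo, tax_hi = np.asarray(tax_lo), np.asarray(tax_hi)
    theta_tax = rng.uniform(tax_lo, tax_hi, size=(n, tax_lo.size))
    omega = softmax(rng.normal(size=(n, A)))   # on the simplex by construction
    return np.concatenate([theta_tax, omega], axis=1), omega

rng = np.random.default_rng(0)
pseudo, omega = sample_pseudo_states(4, tax_lo=[-0.5, 0.0], tax_hi=[0.5, 0.5], rng=rng)

# Lump-sum transfers T_{t,j} = omega_j * p_tax_t * e_t at illustrative values.
p_tax, e = 0.1, 10.0                           # hypothetical tax level and emissions
transfers = omega * p_tax * e
print(pseudo.shape)                            # (4, 14) for the linear rule
```

Because the softmax output is strictly positive and sums to one, every sampled row satisfies the government's resource constraint exactly, with no projection or penalty term needed.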

11.12.2.2 Step 2: GP surrogate.

At each design point $\vartheta = (\vartheta_{\mathrm{tax}}, \omega)$, the trained DEQN is simulated to obtain Monte-Carlo estimates of expected lifetime utility for the 40 tracked cohorts (12 alive at $t=0$ plus 28 future cohorts born during the planner’s 150-year horizon). Independent GPs are then fitted to map $\vartheta$ to expected aggregate welfare $\mathcal{W}(\vartheta)$ and to each of the 40 cohort welfares $\tilde{U}_t(\vartheta)$. The design itself uses Latin-hypercube sampling augmented with Bayesian active learning: the size scales with the dimension of $\vartheta$, with roughly 500 points sufficient for the 14-dimensional “linear-in-$E$ + transfers” specification (Section 5.3 of Kübler et al. (2026)) and roughly 800 points for the 16-dimensional “richer rule + transfers” specification (Section 5.4). Figure 11.10 shows the resulting welfare surface for the two-parameter linear-in-cumulative-emissions rule, with transfer shares held at the Pareto-optimal solution: the contour exposes the low-dimensional ridge along which intercept and slope trade off cleanly, and on which the Step-3 optimizer searches.
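To make the design-then-fit loop concrete, here is a small NumPy-only sketch of a Latin-hypercube design, a GP fit, and one active-learning step. Several pieces are assumptions made for illustration: the welfare function is a toy stand-in for the DEQN simulation, the GP hyperparameters are fixed by hand rather than estimated, and the acquisition rule (pick the candidate with the largest posterior variance) is a simple choice that need not match the criterion used in the paper.

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """n points in [0, 1]^d with exactly one point per stratum in each coordinate."""
    u = (rng.random((n, d)) + np.arange(n)[:, None]) / n
    for k in range(d):
        u[:, k] = rng.permutation(u[:, k])   # decouple the strata across coordinates
    return u

def rbf(X1, X2, length, amp):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return amp**2 * np.exp(-0.5 * d2 / length**2)

class TinyGP:
    """Zero-mean GP regression with fixed (hand-picked) hyperparameters."""
    def __init__(self, length=0.3, amp=1.0, noise=1e-6):
        self.length, self.amp, self.noise = length, amp, noise

    def fit(self, X, y):
        self.X = X
        K = rbf(X, X, self.length, self.amp) + self.noise * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))
        return self

    def predict(self, Xs):
        Ks = rbf(Xs, self.X, self.length, self.amp)
        v = np.linalg.solve(self.L, Ks.T)
        var = self.amp**2 - (v**2).sum(0)
        return Ks @ self.alpha, np.maximum(var, 0.0)

def simulated_welfare(theta):
    """HYPOTHETICAL single-peaked surface standing in for DEQN Monte-Carlo output."""
    d2 = (theta[:, 0] - 0.3) ** 2 + (theta[:, 1] - 0.6) ** 2
    return np.exp(-0.5 * d2 / 0.3**2)

rng = np.random.default_rng(0)
X = latin_hypercube(80, 2, rng)
gp = TinyGP().fit(X, simulated_welfare(X))

# One active-learning step: add the candidate where the surrogate is least certain.
candidates = rng.random((500, 2))
_, var = gp.predict(candidates)
X = np.vstack([X, candidates[np.argmax(var)]])
gp = TinyGP().fit(X, simulated_welfare(X))

X_test = rng.random((200, 2))
mean_test, _ = gp.predict(X_test)
max_err = np.abs(mean_test - simulated_welfare(X_test)).max()
```

In the setting of the text, one such GP would be fitted per output (aggregate welfare and each tracked cohort), all on the same design.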

Figure 11.10: Gaussian-process welfare surrogate over the two-dimensional tax-parameter slice $(\vartheta_0, \vartheta_E)$ of the linear-in-cumulative-emissions rule, with transfer shares $\omega$ held at the Pareto-optimal solution. The contour exposes the low-dimensional welfare surface on which the constrained optimizer of Eq. (11.68) searches once the DEQN has amortized the equilibrium solve. Figure extracted from Kübler et al. (2026).

11.12.2.3 Step 3: Constrained optimization.

The planner solves

$$
\vartheta^* = \argmax_{\vartheta = (\vartheta_{\mathrm{tax}}, \omega)}\;\mathcal{W}(\vartheta) \qquad \text{s.t.}\quad \tilde{U}_t(\vartheta) \geq U_t \;\;\forall\, t,\;\; \omega \in \Delta^{A-1}, \tag{11.68}
$$

where $U_t$ is the business-as-usual (BAU) welfare of cohort $t$ and $\Delta^{A-1}$ is the standard simplex on $A=12$ shares. The Pareto constraint ensures that no generation is worse off; whenever the welfare-maximizing $\vartheta^\ast$ lies strictly inside the feasible set (which is the case in every scenario reported below) it is also strictly Pareto-improving for at least one cohort, so the weak constraint $\tilde U_t \ge U_t$ and the textbook strict-improvement requirement coincide at the optimum. Because each evaluation of $\mathcal{W}$ and $\tilde U_t$ is a cheap prediction from the fitted GP rather than a fresh DEQN simulation, the constrained search reduces to a sequence of small SLSQP problems (the paper uses 500 random restarts of scipy.optimize.minimize) that complete in seconds. By contrast, replacing the surrogate with brute-force re-solves of the full SOLG IAM at every candidate $\vartheta$ would require on the order of tens of thousands of core-hours per candidate, which is the comparison the paper draws against traditional methods.
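The structure of this multi-start SLSQP search can be illustrated on a deliberately tiny problem. Everything model-specific below is hypothetical: the closed-form `cohort_gain` function stands in for the fitted GP predictions, there are only three cohorts and one tax coefficient, and 20 restarts replace the 500 used in the paper. Only the call pattern of `scipy.optimize.minimize` with `method="SLSQP"` mirrors the text.

```python
import numpy as np
from scipy.optimize import minimize

# HYPOTHETICAL toy surrogates standing in for the fitted GP posterior means.
# theta = (tau, omega_1, omega_2, omega_3): one tax coefficient, three shares.
g = np.array([0.0, 0.5, 2.0])   # climate benefit per unit of tax, by cohort
c = np.array([3.0, 1.0, 0.5])   # abatement-cost burden, by cohort

def cohort_gain(theta):
    """Toy welfare gain of each cohort relative to BAU (must be >= 0)."""
    tau, omega = theta[0], theta[1:]
    return tau * (g - c * tau + omega)   # transfer to cohort j is omega_j * tau

def welfare(theta):
    return cohort_gain(theta).sum()

constraints = [
    {"type": "ineq", "fun": cohort_gain},                   # Pareto constraints
    {"type": "eq", "fun": lambda th: th[1:].sum() - 1.0},   # shares sum to one
]
bounds = [(0.0, 1.0)] * 4                                   # tau and shares in [0, 1]

def solve(n_restarts=20, seed=0):
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_restarts):
        x0 = np.concatenate([rng.uniform(0.05, 1.0, 1), rng.dirichlet(np.ones(3))])
        res = minimize(lambda th: -welfare(th), x0, method="SLSQP",
                       bounds=bounds, constraints=constraints)
        feasible = cohort_gain(res.x).min() >= -1e-6
        if res.success and feasible and (best is None or res.fun < best.fun):
            best = res                                      # keep the best feasible run
    return best

best = solve()
tau_unconstrained = (g.sum() + 1.0) / (2.0 * c.sum())       # maximizer ignoring Pareto
```

In this toy problem the Pareto constraint of the most heavily burdened cohort binds, so the constrained tax coefficient ends up below the unconstrained welfare maximizer, which is qualitatively the pattern discussed in the results below.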

11.12.3 Results: Why Transfers Matter

The unconstrained welfare-maximizing cumulative-emissions tax is the natural benchmark. With a linear rule $p^{\mathrm{tax}}_t = \vartheta_0 + \vartheta_E\,E_t$ and a fixed declining transfer scheme $\omega = \bar\omega$, the policy cuts emissions aggressively, stabilizes mean warming around $2.7\,{}^{\circ}\mathrm{C}$, and raises aggregate social welfare by about $1.6\%$ in consumption-equivalent terms. But it imposes losses of up to roughly $5\%$ on initial generations: it is therefore welfare-improving in the social-welfare-function sense, but not Pareto-improving. Figure 11.11 shows the failure: the welfare-gains panel records the losses for transition generations that the social-welfare-function aggregate hides.

Figure 11.11: Welfare-improving but not Pareto-improving cumulative-emissions tax with a fixed exogenous transfer scheme. The policy strongly reduces climate risk and raises aggregate welfare by about $1.6\%$ in consumption-equivalent terms, but the welfare-gains panel shows losses for transition generations. Figure extracted from Kübler et al. (2026).

Endogenizing the transfer shares changes the conclusion. With the same simple tax base and an optimized transfer simplex, Kübler et al. (2026) report the optimized coefficients[4]

$$
p^{\mathrm{tax}}_t = \vartheta_0 + \vartheta_E\,E_t, \qquad (\vartheta_0,\, \vartheta_E) = (-0.186,\, 0.225), \tag{11.69}
$$

together with transfer shares

$$
\omega = (0.128,\, 0.051,\, 0.058,\, 0.089,\, 0.149,\, 0.090,\, 0.066,\, 0.143,\, 0.076,\, 0.048,\, 0.039,\, 0.061), \tag{11.70}
$$

which sum to one up to rounding. Figure 11.13 plots this transfer profile against cohort index; the non-monotone shape is what allows a less aggressive cumulative-emissions tax to satisfy the Pareto constraint at every age, and it is the single most informative graphical summary of the constrained-optimal-policy step. The negative intercept $\vartheta_0 = -0.186$ is not a subsidy in practice: the planner’s horizon starts well into the industrial era at a strictly positive cumulative-emissions stock $E_0 > 0$, so the effective tax $\vartheta_0 + \vartheta_E\,E_t$ is positive for every relevant $E_t$ along the optimum. The negative intercept simply registers that the linear-in-$E$ rule undershoots a constant carbon price near $E = 0$ and ramps up roughly proportionally to cumulative emissions thereafter. The combined policy makes every tracked cohort weakly better off than under BAU. The aggregate welfare gain is more modest than under the unconstrained optimum, at about $0.42\%$ in consumption-equivalent terms, but the right tail of damages is truncated: the 99th percentile of damages falls to roughly $7\%$ of output rather than about $9\%$ under BAU. Figure 11.12 reports the full result. Comparing its welfare-gains panel with that of Figure 11.11 is the section’s headline: a lower, simpler tax combined with an optimized transfer system shifts every cohort weakly into the gains region.
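The reported numbers can be checked with a few lines of arithmetic. The sketch below uses only the coefficients and shares quoted above; the break-even level is expressed in whatever units the model uses for cumulative emissions, which this sketch does not assume.

```python
import numpy as np

theta_0, theta_E = -0.186, 0.225            # reported coefficients of the linear rule
omega = np.array([0.128, 0.051, 0.058, 0.089, 0.149, 0.090,
                  0.066, 0.143, 0.076, 0.048, 0.039, 0.061])

share_sum = omega.sum()                     # one up to rounding of the reported digits
top3 = np.sort(np.argsort(omega)[-3:] + 1)  # 1-based indices of the three largest shares
E_breakeven = -theta_0 / theta_E            # the tax is positive iff E_t exceeds this level

def carbon_tax(E):
    """Linear-in-cumulative-emissions rule at the reported coefficients."""
    return theta_0 + theta_E * E
```

The shares add up to 0.998, the three largest belong to cohorts 1, 5, and 8, and the rule turns positive once $E_t$ exceeds $0.186/0.225 \approx 0.83$, so the negative intercept matters only for cumulative-emissions stocks below that level.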

Figure 11.12: Pareto-improving cumulative-emissions tax with optimized intergenerational transfers, at the coefficients of (11.69)–(11.70). The tax is less aggressive than the unconstrained rule, but the optimized transfer system shields current cohorts while preserving climate-risk reduction for future cohorts. Aggregate welfare rises by about $0.42\%$. Figure extracted from Kübler et al. (2026).

Figure 11.13: Optimized transfer-share profile $\omega_j$ across the 12 cohorts alive at $t = 0$, drawn directly from (11.70). The profile is decidedly non-monotone: the largest shares go to cohorts 1 (oldest), 5, and 8, which are precisely the cohorts for which the participation constraint $\tilde U_t \ge U_t$ binds most tightly under the fixed-transfer tax of Figure 11.11. The non-monotone shape is what allows a less aggressive cumulative-emissions tax to satisfy Pareto improvement at every age.

The richer rule of Section 11.12 adds carbon intensity and a tipping-state statistic,

$$
p^{\mathrm{tax}}_t = \vartheta_0 + \vartheta_E\,E_t + \vartheta_\kappa\,\kappa_t + \vartheta_{TP}(1-D_t),
$$

where $\kappa_t$ is carbon intensity and $D_t$ is the climate-tipping state of the model (built from the proximity of $T^{\mathrm{AT}}_t$ to the stochastic threshold $TP_t$ and the absorbed-tipping flag). Its optimized coefficients are

$$
(\vartheta_0,\, \vartheta_E,\, \vartheta_\kappa,\, \vartheta_{TP}) = (-0.237,\, 0.203,\, 0.037,\, 0.012),
$$

with the associated aggregate welfare gain rising only from about $0.42\%$ to about $0.45\%$. The cohort-by-cohort welfare profile (not plotted; see Kübler et al. (2026) for the figure) again keeps every cohort weakly above its BAU baseline, and the marginal welfare improvement from the extra two policy-state coefficients is small. This is the substantive headline of Kübler et al. (2026): once intergenerational transfers are optimized, the simple cumulative-emissions tax captures most of the feasible Pareto-improving welfare gain (roughly $0.42/0.45 \approx 93\%$ of the gain achieved by the four-coefficient rule). More policy-state variables improve the fit to climate risk, but the participation constraints bind tightly enough that the marginal welfare benefit of policy complexity is small. $D_t$ is a deterministic function of variables already in the SOLG state, so it can be evaluated inside each forward pass; the exact functional form is in the paper.

11.12.3.1 Runtime in numbers.

On a standard laptop (Apple M1), the OLG DEQN trains in roughly four wall-clock hours; on a high-end accelerator such as an NVIDIA GH200, training drops to the order of minutes (Kübler et al., 2026). Adding the GP fits over 500 (resp. 800) design points and the constrained Step-3 optimization keeps the entire pipeline within the same order of magnitude, while the comparable brute-force re-solve of the SOLG model at every candidate $\vartheta$ would dominate by orders of magnitude (the paper reports tens of thousands of core-hours for one fixed-parameter calibration, which would have to be repeated for every Step-3 candidate vector).

11.12.3.2 Companion code.

The full production OLG-IAM solver, including the DEQN training loop with $(\vartheta_{\mathrm{tax}}, \omega)$ pseudo-states and the bilevel policy search, is hosted in the companion repository sischei/JPE_Macro_Using_ML_to_compute_constrained_optimal_carbon_tax_rules, which accompanies Kübler et al. (2026). The classroom notebook in Lecture 17 of this course exposes a reduced surrogate-only version that loads pre-trained GP surrogates and reproduces the constrained-optimization step (Step 3) interactively, but does not retrain the OLG DEQN end-to-end; readers who want the full pipeline should clone the companion repository.

11.13 Discussion and Outlook

The combination of DEQNs, pseudo-states, and GP surrogates provides a scalable and transparent framework for climate economics that overcomes key limitations of traditional methods.

11.13.0.1 Comparison with traditional IAM solutions.

Standard IAMs (such as the GAMS implementation of DICE) rely on shooting methods or nonlinear programming solvers that find deterministic optimal paths. These approaches struggle with stochastic extensions: Monte Carlo integration over shocks is expensive, and certainty equivalence (replacing random variables with their means) misses the welfare cost of tail risks. The DEQN approach approximates the stochastic recursive solution over the chosen training distribution and state/pseudo-state domain (with Bayesian learning and recursive Epstein–Zin utility) in a single training run.
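Why certainty equivalence understates the cost of risk can be seen in a two-line calculation: with a convex damage function, expected damages exceed damages evaluated at expected warming (Jensen's inequality). The sketch below is purely illustrative; the quadratic damage function and the normal distribution for warming are hypothetical and are not the CDICE calibration. The expectation is computed with Gauss–Hermite quadrature, which is exact for this quadratic integrand.

```python
import numpy as np

def expected_value(f, mu, sigma, n_nodes=7):
    """E[f(T)] for T ~ N(mu, sigma^2) via Gauss-Hermite quadrature."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)   # nodes/weights for exp(-x^2)
    return (w * f(mu + np.sqrt(2.0) * sigma * x)).sum() / np.sqrt(np.pi)

def damage(T):
    """HYPOTHETICAL convex damage function: share of output lost at warming T."""
    return 0.01 * T**2

mu, sigma = 3.0, 1.0                                  # HYPOTHETICAL warming distribution
certainty_equivalent = damage(mu)                     # damages at mean warming
expected = expected_value(damage, mu, sigma)          # mean damages over the distribution
gap = expected - certainty_equivalent                 # equals 0.01 * sigma**2 here
```

For the quadratic case the gap is exactly $0.01\,\sigma^2$, so it grows with the variance of warming; with fatter tails or steeper damage functions the understatement from plugging in the mean becomes larger.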

11.13.0.2 Limitations.

Several limitations should be noted. First, the CDICE climate module, while calibrated to CMIP benchmarks, remains a reduced-form emulator and cannot capture spatial heterogeneity or regional climate impacts. Second, the OLG-IAM treats all households within a cohort as identical; within-cohort heterogeneity (e.g., geographic exposure to climate damages) would require further extensions along the lines of Chapter 6. Third, the linear tax rules are interpretable and implementable but may leave welfare gains on the table relative to fully nonlinear rules.

11.13.0.3 Extensions.

Active research frontiers include: multi-region IAMs with trade and carbon leakage (Nordhaus & Yang, 1996); richer damage specifications including tipping cascades; endogenous technical change in abatement technology; and embedding climate modules in continuous-time heterogeneous-agent models (Chapter 8) to study the joint dynamics of climate risk and wealth inequality. The methodological toolkit developed in this course (DEQNs for equilibrium computation, PINNs for continuous-time PDEs, deep surrogates for uncertainty quantification, and Young’s method for distribution tracking) provides the computational infrastructure for these extensions.

11.13.0.4 The three movements, in one synthesis.

Movement 1 established that solving an IAM by DEQN requires three modifications relative to the stationary toolkit of Chapter 2: time enters as a state, the training pool is built by simulating $K$ forward trajectories from a calibrated initial state rather than by sampling an ergodic distribution, and the missing transversality condition is absorbed numerically by choosing the horizon $T_{\max}$ long enough that discounting suffices (or, on short horizons, by adding an explicit terminal residual). Movement 2 put that algorithm to work on a worked stochastic DICE economy, producing the eight-residual loss whose minimization delivers the deterministic policy and, with one extra Gauss–Hermite layer, the AR(1) SCC fan chart. Movement 3 layered four extensions onto the same spine: Bayesian learning over the climate sensitivity, recursive Epstein–Zin preferences, global UQ of the SCC via pseudo-states and GP surrogates, and constrained Pareto-improving carbon-tax design in a heterogeneous-agent OLG-IAM. Chapter 12 threads these into the broader synthesis with the rest of the course.

11.14 Further Reading

11.15 Exercises

Worked solutions and guidance for these exercises appear in Appendix F.

Footnotes
  1. The numerical claims in this paragraph quote the headline results of Friedl et al. (2023); consult that paper for the precise figures and the underlying calibration grid.

  2. Notation note. We use $\psi$ for the IES and $\gamma_u$ for risk aversion throughout this section, following Friedl et al. (2023). The IRBC chapter (Chapter 3) used $\gamma$ for the IES under the bundled CRRA-IES convention; here in the Epstein–Zin block we deliberately decouple the two parameters, so the symbol switch is intentional. The CRRA limit is recovered at $\gamma_u = 1/\psi$.

  3. Variance-share ranges quoted from Friedl et al. (2023); the spread reflects different points along the planner’s horizon and different damage-function specifications.

  4. All numerical coefficients, welfare gains, and damage-quantile values in this subsection are quoted from Kübler et al. (2026); consult the paper for the source tables and figures.

References
  1. Folini, D., Friedl, A., Kübler, F., & Scheidegger, S. (2025). The Climate in Climate Economics. The Review of Economic Studies, 92(1), 299–338. https://doi.org/10.1093/restud/rdae011
  2. Friedl, A., Kübler, F., Scheidegger, S., & Usui, T. (2023). Deep Uncertainty Quantification: With an Application to Integrated Assessment Models.
  3. Kübler, F., Scheidegger, S., & Surbek, O. (2026). Using Machine Learning to Compute Constrained Optimal Carbon Tax Rules. Journal of Political Economy: Macroeconomics.
  4. Hassler, J., Krusell, P., & Smith Jr, A. A. (2016). Environmental macroeconomics. In Handbook of macroeconomics (Vol. 2, pp. 1893–2008). Elsevier.
  5. Dietz, S. (2024). Chapter 1 - Introduction to integrated assessment modeling of climate change (L. Barrage & S. Hsiang, Eds.; Vol. 1, pp. 1–51). North-Holland. https://doi.org/10.1016/bs.hesecc.2024.10.002
  6. Fernández-Villaverde, J., Gillingham, K. T., & Scheidegger, S. (2025). Climate Change Through the Lens of Macroeconomic Modeling. Annual Review of Economics, 17, 125–150. https://doi.org/10.1146/annurev-economics-091124-045357
  7. van der Ploeg, F., & Rezai, A. (2026). Climate Change, Climate Policy, and the Macroeconomy (CEPR Discussion Paper No. 21153). CEPR Press. https://cepr.org/publications/dp21153
  8. Golosov, M., Hassler, J., Krusell, P., & Tsyvinski, A. (2014). Optimal taxes on fossil fuel in general equilibrium. Econometrica, 82(1), 41–88.
  9. Cai, Y., & Lontzek, T. S. (2019). The Social Cost of Carbon with Economic and Climate Risks. Journal of Political Economy, 127(6), 2684–2734. https://doi.org/10.1086/701890
  10. Nordhaus, W. D. (2017). Revisiting the Social Cost of Carbon. Proceedings of the National Academy of Sciences, 114(7), 1518–1523. https://doi.org/10.1073/pnas.1609244114
  11. Nordhaus, W. D., & Yang, Z. (1996). A regional dynamic general-equilibrium model of alternative climate-change strategies. The American Economic Review, 741–765.
  12. Traeger, C. P. (2023). ACE — Analytic Climate Economy. American Economic Journal: Economic Policy, 15(3), 372–406. https://doi.org/10.1257/pol.20210297
  13. Nordhaus, W. D. (1994). Managing the global commons: the economics of climate change. MIT Press, Cambridge, MA.
  14. Nordhaus, W. D. (2008). A Question of Balance: Weighing the Options on Global Warming Policies. Yale University Press, New Haven, CT.
  15. Roe, G. H., & Baker, M. B. (2007). Why is climate sensitivity so unpredictable? Science, 318(5850), 629–632.