arXiv:2510.12272 · macroeconomic agent-based modelling

Heterogeneous RBCs via Deep Multi-Agent Reinforcement Learning

Federico Gabriele, Aldo Glielmo, Marco Taboga

method: agent · tier: T2 · vs. heterogeneous-agents · business 0.30 · local-repro 0.60 · arXiv abstract →

Abstract

Current macroeconomic models with agent heterogeneity can be broadly divided into two main groups. Heterogeneous-agent general equilibrium (GE) models, such as those based on Heterogeneous Agent New Keynesian (HANK) or Krusell-Smith (KS) approaches, rely on GE and 'rational expectations', somewhat unrealistic assumptions that make the models very computationally cumbersome, which in turn limits the amount of heterogeneity that can be modelled. In contrast, agent-based models (ABMs) can flexibly encompass a large number of arbitrarily heterogeneous agents, but typically require the specification of explicit behavioural rules, which can lead to a lengthy trial-and-error model-development process. To address these limitations, we introduce MARL-BC, a framework that integrates deep multi-agent reinforcement learning (MARL) with real business cycle (RBC) models. We demonstrate that MARL-BC can: (1) recover textbook RBC results when using a single agent; (2) recover the results of the mean-field KS model using a large number of identical agents; and (3) effectively simulate rich heterogeneity among agents, a hard task for traditional GE approaches. Our framework can be thought of as an ABM if used with a variety of heterogeneous interacting agents, and can reproduce GE results in limit cases. As such, it is a step towards a synthesis of these often opposed modelling paradigms.

Extracted equations

  • Y_t = A_t * K_t^alpha * L_t^(1-alpha)
  • K_t = sum_i k_i_t
  • L_t = sum_i ell_i_t
  • a_i_t = (1 - delta) * k_i_{t-1} + w_i_t * ell_i_t + r_i_t * k_i_{t-1}
  • r_i_t = alpha * A_t * (K_t / L_t)^(alpha - 1) * kappa_i
  • w_i_t = (1 - alpha) * A_t * (K_t / L_t)^(-alpha) * lambda_i
  • c_i_t = chat_i_t * a_i_t
  • k_i_{t+1} = a_i_t - c_i_t
  • R_i_t = log(c_i_t) - chi * ell_i_t
  • A_t = rho * A_{t-1} + sigma_A * epsilon_t
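The extracted equations above can be wired together into a minimal simulation loop. This is a sketch, not the paper's method: the learned MARL policy outputs (the consumption share chat_i_t and labour ell_i_t) are replaced by fixed placeholder values, and the TFP process is interpreted as an AR(1) in logs (a common RBC convention; the extracted line does not say so explicitly). Parameter values are taken from the Parameters table below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameters from the run (see Parameters section).
alpha, delta, chi = 0.33, 0.025, 1.0
rho, sigma_A = 0.95, 0.01
n_agents, T = 20, 200

# Per-agent heterogeneity wedges kappa_i, lambda_i ~ U[0.98, 1.02].
kappa = rng.uniform(0.98, 1.02, n_agents)
lam = rng.uniform(0.98, 1.02, n_agents)

k = np.full(n_agents, 1.0)    # capital k_i_{t-1}
ell = np.full(n_agents, 1.0)  # labour: placeholder for the learned policy
chat = 0.3                    # consumption share: placeholder for the learned policy

z = 0.0                       # log-TFP state (assumption: AR(1) is in logs)
rewards = np.zeros(n_agents)
for t in range(T):
    z = rho * z + sigma_A * rng.standard_normal()
    A = np.exp(z)
    K, L = k.sum(), ell.sum()
    # Factor prices with per-agent wedges.
    r = alpha * A * (K / L) ** (alpha - 1) * kappa
    w = (1 - alpha) * A * (K / L) ** (-alpha) * lam
    # Cash on hand, consumption, next-period capital, per-period reward.
    a = (1 - delta) * k + w * ell + r * k
    c = chat * a
    k = a - c
    rewards += np.log(c) - chi * ell

print(f"mean capital: {k.mean():.3f}, mean cumulative reward: {rewards.mean():.3f}")
```

With the fixed policy placeholders the economy settles into a deterministic-looking steady state perturbed by the TFP shocks; in MARL-BC the placeholders would instead be actions drawn from each agent's trained policy.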

Simulation outputs

plot: /research-assets/2510.12272/agent_timeseries_baseline.png
plot: /research-assets/2510.12272/agent_timeseries_heterogeneous-agents.png

Baseline vs. variant

Variant arm: heterogeneous-agents

Metric               Baseline    Variant     Δ (variant − baseline)
final_mean_energy    100.0000    100.0000    0.0000
final_alive_count    20.0000     20.0000     0.0000
final_dispersion     0.0000      0.0000      0.0000
steps_run            1.000e+4    1.000e+4    0.0000

Paper claims vs. our run

  • MARL-BC recovers textbook RBC results with n=1 agent — not testable; no fidelity score recorded
  • MARL-BC converges to mean-field Krusell-Smith equilibrium as n increases — not testable; no fidelity score recorded
  • MARL-BC can simulate rich heterogeneity in agent productivities — not testable; no fidelity score recorded
  • SAC algorithm achieves stable learning across population sizes n=10 to n=500 — not testable; no fidelity score recorded
  • Aggregate behaviour converges to well-defined equilibrium for large n — not testable; no fidelity score recorded
  • Training is computationally feasible on modest hardware (single CPU) — not testable; no fidelity score recorded

Parameters

alpha          = 0.33
delta          = 0.025
beta           = 0.99
chi            = 1
rho            = 0.95
sigma_A        = 0.01
n_agents       = 20
kappa_i_range  = [0.98, 1.02]
lambda_i_range = [0.98, 1.02]
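For reproduction, the table above maps directly onto a configuration dict. The dict structure and key names are our own convention (they mirror the table labels, not any API from the paper's code):

```python
# Run configuration for the heterogeneous-agents arm; keys mirror the
# Parameters table. Ranges are (low, high) bounds for per-agent draws.
params = {
    "alpha": 0.33,        # capital share in Y = A * K^alpha * L^(1-alpha)
    "delta": 0.025,       # depreciation rate
    "beta": 0.99,         # discount factor
    "chi": 1.0,           # disutility of labour in R = log(c) - chi * ell
    "rho": 0.95,          # TFP persistence
    "sigma_A": 0.01,      # TFP shock std. dev.
    "n_agents": 20,
    "kappa_i_range": (0.98, 1.02),   # per-agent return wedge bounds
    "lambda_i_range": (0.98, 1.02),  # per-agent wage wedge bounds
}

# Basic sanity checks on the configuration.
assert 0 < params["alpha"] < 1 and 0 < params["beta"] < 1
assert params["kappa_i_range"][0] <= params["kappa_i_range"][1]
```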

Run notes

model_type=generic; ran baseline + heterogeneous-agents