Outputs and Post-Processing¶

This page describes the artefacts a simulation produces: in-memory traces collected during the run, plot files written by SteelPlotter, and the post-processed CSVs assembled at the end. The model also writes geospatial statistics (LCOE / LCOH / overbuild factors) per year; the LCOE and LCOH files are additionally aggregated across years at the end of the run.

For where individual costs and emissions originate, see Cost Calculation Functions. For how the trade LP feeds these traces, see Trade Model Overview and TM-PAM Connector.

DataCollector traces¶

DataCollector (src/steelo/domain/datacollector.py) is invoked at the end of every simulation year and aggregates the state of each furnace group (FG) into in-memory traces consumed by SteelPlotter and the post-processor.

Attribute	Shape	Source
`trace_capacity`	`{year: {tech: capacity_t}}`	All active FGs
`trace_price`	`{year: {product: price_$/t}}` (incl. `steel`, `iron`, optional `scrap`, `iron_weighted_avg`)	`Environment.cost_curve`
`trace_production`	`{year: total_t}`	All active FGs
`trace_production_by_product`	`{year: {iron|steel: tonnes}}`	Active FGs (accumulated in the same pass as `trace_emissions`)
`trace_utilisation_rate`	`{year: {fg_id: rate}}`	Active FGs
`trace_capex`	`{year: {tech: {iso3: capex_usd}}}`	New FGs created that year
`trace_emissions`	`{boundary: {year: {tech: {scope: tCO2e}}}}`	Active FGs, all available boundaries, scopes `direct_ghg`, `direct_with_biomass_ghg`, `indirect_ghg`
`trace_iron_ore`	`{year: {quality: tonnes}}`	Iron-ore allocations
`trace_metallic_charges`	`{year: {charge_type: tonnes}}`	Iron-bearing inputs to steelmaking
`trace_international_iron_trade`	`{year: {iron_product: tonnes}}`	Cross-ISO3 flows of `IRON_PRODUCTS` from `Allocations.allocations`

Emissions reshape and overcounting fix¶

trace_emissions was previously a flat {year: {tech: total}} that summed direct_ghg + direct_with_biomass_ghg + indirect_ghg for the configured carbon-cost boundary only. That sum double-counted direct emissions — the two direct views are alternatives, not separate scopes — and inflated 2025 totals by ~64%. The current shape stores every available boundary and keeps each scope separate, so:

charts can be produced per boundary without re-walking the plant graph;
the chart layer chooses which scopes to combine (e.g. direct_ghg + indirect_ghg) without forcing a global decision at collection time;
the same iteration accumulates trace_production_by_product for free, used as the denominator for intensity charts.
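The difference between the old flat sum and a chart-side scope selection can be illustrated with a small sketch. The sample numbers, the boundary name `boundary_a`, and the helper names `total_emissions` and `steel_intensity` are assumptions made for illustration; only the nested shapes follow the trace table above.

```python
# Hypothetical sample in the documented shape:
# {boundary: {year: {tech: {scope: tCO2e}}}}
trace_emissions = {
    "boundary_a": {  # boundary name is a placeholder
        2025: {
            "BF-BOF": {"direct_ghg": 100.0, "direct_with_biomass_ghg": 110.0, "indirect_ghg": 20.0},
            "EAF": {"direct_ghg": 10.0, "direct_with_biomass_ghg": 10.0, "indirect_ghg": 30.0},
        }
    }
}
# {year: {product: tonnes}}
trace_production_by_product = {2025: {"iron": 40.0, "steel": 80.0}}


def total_emissions(trace, boundary, year, scopes):
    """Sum only the requested scopes across technologies for one boundary and year."""
    by_tech = trace.get(boundary, {}).get(year, {})
    return sum(values.get(scope, 0.0) for values in by_tech.values() for scope in scopes)


def steel_intensity(trace, production, boundary, year, scopes):
    """Emissions per tonne of steel, or None when no steel was produced."""
    steel_t = production.get(year, {}).get("steel", 0.0)
    if steel_t <= 0:
        return None
    return total_emissions(trace, boundary, year, scopes) / steel_t


# Chart-side choice: one direct view plus indirect.
combined = total_emissions(trace_emissions, "boundary_a", 2025, ("direct_ghg", "indirect_ghg"))
# Old behaviour: both direct views added together, which overcounts.
naive = total_emissions(
    trace_emissions, "boundary_a", 2025,
    ("direct_ghg", "direct_with_biomass_ghg", "indirect_ghg"),
)
intensity = steel_intensity(
    trace_emissions, trace_production_by_product, "boundary_a", 2025,
    ("direct_ghg", "indirect_ghg"),
)
```

Because each scope is stored separately, the caller decides which scopes to add, and the same production trace serves as the intensity denominator.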

International iron trade¶

collect_international_iron_trade(year, trade_allocations) walks the LP allocations, filters to commodities in IRON_PRODUCTS, drops intra-country flows (from_iso3 == to_iso3), and accumulates per-product cross-border tonnes. The per-product totals are logged at info level, and the resulting trace is consumed by SteelPlotter.plot_international_iron_trade().
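The filter-and-accumulate logic can be sketched as follows. This is not the project's implementation: the flat `{(from_iso3, to_iso3, commodity): tonnes}` input, the membership of `IRON_PRODUCTS`, and the function name `sketch_collect_iron_trade` are all assumptions chosen to keep the example self-contained.

```python
from collections import defaultdict

# Placeholder membership; the real IRON_PRODUCTS constant may differ.
IRON_PRODUCTS = {"pig_iron", "dri", "hbi"}


def sketch_collect_iron_trade(year, flows, trace):
    """Accumulate cross-border tonnes per iron product into trace[year].

    flows is assumed to map (from_iso3, to_iso3, commodity) -> tonnes.
    """
    totals = defaultdict(float)
    for (from_iso3, to_iso3, commodity), tonnes in flows.items():
        if commodity not in IRON_PRODUCTS:
            continue  # not an iron product
        if from_iso3 == to_iso3:
            continue  # intra-country flow, not international trade
        totals[commodity] += tonnes
    trace[year] = dict(totals)
    return trace[year]


flows_2025 = {
    ("BRA", "DEU", "dri"): 5.0,
    ("BRA", "BRA", "dri"): 7.0,       # dropped: same country
    ("AUS", "JPN", "pig_iron"): 3.0,
    ("AUS", "CHN", "steel"): 9.0,     # dropped: not an iron product
    ("SWE", "DEU", "dri"): 1.5,
}
trace_international_iron_trade = {}
result = sketch_collect_iron_trade(2025, flows_2025, trace_international_iron_trade)
```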

SteelPlotter¶

SteelPlotter (src/steelo/utilities/steeliq_plotter.py) is the unified plotting class that has progressively replaced the standalone functions in src/steelo/utilities/plotting.py. It centralises styling (footers, legends, color schemes), output-path resolution via PlotPaths, and per-chart CSV export.

Class-level conventions:

Each plot method takes a trace_* dict and an optional iso3_filter / region grouping argument.
Methods return the saved Path, or None when there is no data.
export_csv=True (the default on most plots) writes a sibling .csv alongside the .png via _save_chart_data_to_csv(). The CSV has the same data the plot was built from — every furnace, every year — so capacity inventory and breakdowns survive even when the chart truncates or aggregates.
Plot output subdirectory is selected per call (subdir="plots_dir", "pam_plots_dir", etc.) via _save_figure().
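The return-value and sibling-CSV conventions above can be illustrated with a standard-library sketch. The function `save_chart`, its `render` callback, and the CSV column names are hypothetical; the real methods draw with matplotlib and delegate to `_save_figure()` and `_save_chart_data_to_csv()`, whose signatures are not shown here.

```python
import csv
import tempfile
from pathlib import Path


def save_chart(trace, png_path, render, export_csv=True):
    """Return the saved figure path, or None when the trace holds no data."""
    rows = [
        (year, key, value)
        for year, by_key in sorted(trace.items())
        for key, value in sorted(by_key.items())
    ]
    if not rows:
        return None
    png_path = Path(png_path)
    png_path.parent.mkdir(parents=True, exist_ok=True)
    render(rows, png_path)  # stand-in for the real figure-drawing code
    if export_csv:
        # Sibling file: same stem, .csv suffix, every row the chart was built from.
        with png_path.with_suffix(".csv").open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["year", "key", "value"])
            writer.writerows(rows)
    return png_path


def fake_render(rows, path):
    path.write_bytes(b"")  # placeholder so the sketch runs without matplotlib


out_dir = Path(tempfile.mkdtemp())
saved = save_chart({2025: {"EAF": 1.0, "BF-BOF": 2.0}}, out_dir / "capacity.png", fake_render)
empty = save_chart({}, out_dir / "empty.png", fake_render)
```

Returning None for an empty trace lets callers skip missing charts without wrapping each call in a try/except.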

Plot catalogue:

Method	Trace consumed	Output subdir
`plot_capex_by_technology`	`trace_capex`	`pam_plots_dir`
`plot_emissions_by_technology`	`trace_emissions` + `trace_production_by_product`	`EMISSIONS_SUBDIR` (`plots/emissions`)
`plot_iron_ore_by_quality`	`trace_iron_ore`	`pam_plots_dir`
`plot_metallic_charges`	`trace_metallic_charges`	`pam_plots_dir`
`plot_international_iron_trade`	`trace_international_iron_trade`	`pam_plots_dir`
`plot_steel_cost_curve` / `plot_cost_curve_per_region` / `plot_cost_curve_for_commodity` / `plot_cost_curve_with_breakdown` / `plot_cost_curve_step`	`Environment.cost_curve`	`COST_CURVES_SUBDIR` (`plots/cost_curves`)
`plot_capacity_development_by_technology` / `plot_area_chart_by_region_or_technology`	`trace_capacity`	`pam_plots_dir`

Plot folder layout¶

Emissions and cost-curve plots are written to their own folders directly under plots/, as siblings of PAM/, rather than under plots/PAM/…:

output/
  plots/
    PAM/        # plant-agent plots (capacity, capex, charges, prices)
    GEO/        # geospatial / new-plant plots
    TM/         # trade-model plots
    emissions/  # SteelPlotter.plot_emissions_by_technology
    cost_curves/  # SteelPlotter cost-curve methods

Cost-curve filenames follow cost_curve_{product}_by_{aggregation}_{year}.png (e.g. cost_curve_iron_by_region_2025.png); the legacy ordering {product}_cost_curve_by_{aggregation}_{year}.png is no longer produced by the new methods. plot_cost_curve_with_breakdown retains its previous filename convention.
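As a quick illustration of the naming pattern, the helper below builds a filename and places it in the cost-curve folder from the layout above. The function name `cost_curve_filename` is hypothetical; only the pattern itself comes from this page.

```python
from pathlib import Path


def cost_curve_filename(product, aggregation, year):
    """Build a filename following cost_curve_{product}_by_{aggregation}_{year}.png."""
    return f"cost_curve_{product}_by_{aggregation}_{year}.png"


name = cost_curve_filename("iron", "region", 2025)
target = Path("output") / "plots" / "cost_curves" / name
```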

Post-processed CSV columns¶

extract_and_process_stored_dataCollection() in src/steelo/adapters/dataprocessing/postprocessing/post_process_datacollection.py assembles a per-FG-per-year DataFrame from the stored pickle data. The column set is no longer hardcoded — keys come from runtime arguments computed once at simulation start.

Header columns (deterministic order)¶

Column	Source
`iso3`	`Plant.location.iso3`
`country`	`iso3_to_country_map[iso3]`
`region`	`iso3_to_region_map[iso3]`
`year`, `commands`, `materials`, `energy`, `cost_breakdown`	Per-FG state

Headers are placed first via explicit reordering so downstream consumers can rely on a stable schema.

Dynamic feedstock / carrier columns¶

Wide-form columns are emitted for each canonical feedstock or carrier key. Three families share this pattern:

Family	Key source	Example column
`cost_breakdown - <key>`	`Environment.cost_breakdown_keys` (built from dynamic feedstocks via `normalize_energy_key`)	`cost_breakdown - hydrogen`
`carbon_breakdown - <feedstock>`	`Environment.carbon_breakdown_columns`	`carbon_breakdown - coal`
`unit_subsidy_<carrier>`	`plant.columns[startswith("unit_subsidy_")]` plus a fixed-order header set, then `unit_subsidy_total`	`unit_subsidy_hydrogen`

The previous hardcoded STANDARD_COST_BREAKDOWN_COLUMNS list and the fluxes / lime → burnt lime rename map have been removed; new energy carriers and feedstocks now appear in the post-processed CSV automatically without code edits. Missing keys are filled with zeros so the schema is identical across runs.
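The wide-form expansion with zero fill can be sketched in plain Python. The record layout, the reduced `HEADER` list, and the function `to_wide_row` are illustrative assumptions; the post-processor itself works on a DataFrame and receives its key list from `Environment.cost_breakdown_keys`.

```python
# Subset of the documented header columns, kept short for the example.
HEADER = ["iso3", "country", "region", "year"]


def to_wide_row(record, cost_breakdown_keys):
    """Header columns first, then one 'cost_breakdown - <key>' column per key."""
    row = {name: record.get(name) for name in HEADER}
    breakdown = record.get("cost_breakdown", {})
    for key in cost_breakdown_keys:
        # Keys missing for this furnace group are filled with zero.
        row[f"cost_breakdown - {key}"] = breakdown.get(key, 0.0)
    return row


keys = ["coal", "electricity", "hydrogen"]  # assumed runtime key list
records = [
    {"iso3": "DEU", "country": "Germany", "region": "Europe", "year": 2025,
     "cost_breakdown": {"coal": 120.0, "electricity": 35.0}},
    {"iso3": "SWE", "country": "Sweden", "region": "Europe", "year": 2025,
     "cost_breakdown": {"electricity": 50.0, "hydrogen": 210.0}},
]
rows = [to_wide_row(record, keys) for record in records]
```

Both rows end up with the same columns in the same order, which is what keeps the schema stable when a run introduces or omits a carrier.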

Optional columns¶

Column	Present when
`unit_secondary_output_costs`	FG records secondary-output cost adjustments
`unit_carbon_cost`	Carbon-cost calculation produced a value
`unit_carbon_cost_contribution - co2_slip`	FG has a non-zero `co2_slip × carbon_price` contribution (calculated on `FurnaceGroup`, collected each year)
`emissions_<boundary>_<scope>`	Wide-form columns expanded from per-FG `emissions[boundary][scope]` for every available boundary/scope pair
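The `emissions_<boundary>_<scope>` expansion amounts to flattening a two-level mapping. A minimal sketch, assuming a per-FG `{boundary: {scope: value}}` dict and a placeholder boundary name:

```python
def emissions_columns(emissions):
    """Flatten {boundary: {scope: value}} into emissions_<boundary>_<scope> columns."""
    return {
        f"emissions_{boundary}_{scope}": value
        for boundary, by_scope in emissions.items()
        for scope, value in by_scope.items()
    }


# "boundary_a" is a placeholder; real boundary names come from the model configuration.
fg_emissions = {"boundary_a": {"direct_ghg": 1.8, "indirect_ghg": 0.4}}
wide = emissions_columns(fg_emissions)
```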

Per-carrier subsidy tracking¶

The data collector records each FG’s per-carrier subsidy (unit_subsidy_<carrier>) and a roll-up unit_subsidy_total. The post-processor picks these up by prefix, places the fixed-order set first, then appends any additional carriers found at runtime, then unit_subsidy_total. This keeps existing dashboards stable while letting new carriers flow through.
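The ordering rule can be sketched as below. The contents of `FIXED_ORDER` are invented for the example, and the sketch assumes that the fixed set and `unit_subsidy_total` are always emitted, which this page implies but does not state outright.

```python
FIXED_ORDER = ["unit_subsidy_electricity", "unit_subsidy_hydrogen"]  # hypothetical fixed set
TOTAL = "unit_subsidy_total"


def order_subsidy_columns(columns):
    """Fixed-order set first, then other carriers in order of appearance, then the total."""
    extras = [
        name
        for name in columns
        if name.startswith("unit_subsidy_") and name != TOTAL and name not in FIXED_ORDER
    ]
    return FIXED_ORDER + extras + [TOTAL]


found = ["iso3", "unit_subsidy_total", "unit_subsidy_biomass", "unit_subsidy_hydrogen"]
ordered = order_subsidy_columns(found)
```

A carrier that first appears in a later run, such as `unit_subsidy_biomass` here, lands after the fixed set, so existing column positions do not shift.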

Tabular price outputs¶

SimulationRunner writes a per-year price CSV at the end of the run, keyed off data_collector.trace_price:

output/data/steel_iron_prices.csv

Columns: year, steel_price_usd_per_t, iron_price_usd_per_t, optional scrap_price_usd_per_t, optional iron_weighted_avg_cost_usd_per_t.

A matching matplotlib chart is also produced.
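The mapping from `trace_price` to the CSV columns listed above might look like the following sketch. It assumes an optional column is emitted whenever any year carries the corresponding key and that missing values are left blank; neither detail is confirmed by this page, and the function names are hypothetical.

```python
import csv
import io

BASE = [("steel", "steel_price_usd_per_t"), ("iron", "iron_price_usd_per_t")]
OPTIONAL = [
    ("scrap", "scrap_price_usd_per_t"),
    ("iron_weighted_avg", "iron_weighted_avg_cost_usd_per_t"),
]


def price_table(trace_price):
    """Return (header, rows) for a {year: {product: price}} trace."""
    present = [pair for pair in OPTIONAL if any(pair[0] in p for p in trace_price.values())]
    columns = BASE + present
    header = ["year"] + [name for _, name in columns]
    rows = [
        [year] + [trace_price[year].get(key, "") for key, _ in columns]
        for year in sorted(trace_price)
    ]
    return header, rows


def to_csv_text(trace_price):
    """Serialise the price table to CSV text."""
    header, rows = price_table(trace_price)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


sample = {
    2026: {"steel": 610.0, "iron": 400.0, "scrap": 350.0},
    2025: {"steel": 600.0, "iron": 390.0},
}
csv_text = to_csv_text(sample)
```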

Geospatial statistics: LCOE, LCOH, overbuild factors¶

src/steelo/adapters/geospatial/geospatial_statistics.py exports per-country statistics for every modelled geospatial year (typically every 5 years, when the geospatial pipeline runs):

output/data/LCOE/lcoe_stats_{year}.csv: average / min / max / p10 / p20 / p25 / p50 LCOE in USD/MWh, plus n_grid_points. The source LCOE is the power_price variable in energy_prices (stored in USD/kWh and multiplied by 1000 on export).
output/data/LCOH/lcoh_stats_{year}.csv — same statistics in USD/kg for the capped_lcoh variable, plus a hydrogen_ceiling_pct column reflecting the configured GeoConfig percentile cap.
output/data/{factor}_factors/{factor}_stats_{year}.csv — overbuild-factor statistics (e.g. solar_factor, wind_factor, battery_factor) at LCOE percentile points (avg, min, max, p10, p20, p25, p50). Each value is the mean factor across grid points within ±5 % of the LCOE percentile in that country (with a closest-point fallback when the band is empty).
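The band-averaging rule for overbuild factors can be made concrete with a small sketch. It reads "within ±5 % of the LCOE percentile" as ±5 % of the LCOE value at that percentile and uses linear interpolation for the percentile; both are assumptions, as are the function names and the sample grid points.

```python
def percentile(values, pct):
    """Linear-interpolation percentile for pct in [0, 100]."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * pct / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def factor_at_lcoe_percentile(points, pct, band=0.05):
    """Mean factor over grid points whose LCOE lies within the band around the percentile.

    points is a list of (lcoe, factor) pairs for one country.
    """
    target = percentile([lcoe for lcoe, _ in points], pct)
    in_band = [factor for lcoe, factor in points if abs(lcoe - target) <= band * abs(target)]
    if in_band:
        return sum(in_band) / len(in_band)
    # Fallback: the band is empty, so use the grid point closest to the target LCOE.
    closest = min(points, key=lambda point: abs(point[0] - target))
    return closest[1]


dense = [(40.0, 1.0), (41.0, 1.5), (60.0, 2.0), (61.0, 3.0), (100.0, 4.0)]
sparse = [(10.0, 1.0), (100.0, 4.0)]

p50_factor = factor_at_lcoe_percentile(dense, 50)       # band holds the 60.0 and 61.0 points
p10_factor = factor_at_lcoe_percentile(dense, 10)       # band holds the 40.0 and 41.0 points
fallback_factor = factor_at_lcoe_percentile(sparse, 25)  # empty band, closest point wins
```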

At end-of-run, aggregate_lcoe_lcoh_statistics(output_dir, start_year, end_year) concatenates the per-year files into:

output/data/LCOE/lcoe_stats_{start_year}_{end_year}.csv
output/data/LCOH/lcoh_stats_{start_year}_{end_year}.csv

The aggregated files are sorted by (year, country) and formatted to 4 decimals. Per-year files are kept; aggregated files are recognised and excluded from the next aggregation pass via the _<digits> filename suffix check.
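One way to picture the aggregation pass is the sketch below, which handles LCOE files only. The regular expression used to tell per-year files from aggregated ones, the column names `country` and `average`, and the function name `aggregate_lcoe` are assumptions; the real `aggregate_lcoe_lcoh_statistics` may implement the suffix check differently.

```python
import csv
import re
import tempfile
from pathlib import Path

# Matches lcoe_stats_2025.csv but not lcoe_stats_2025_2030.csv.
PER_YEAR = re.compile(r"^lcoe_stats_(\d{4})\.csv$")


def aggregate_lcoe(folder, start_year, end_year):
    """Concatenate per-year files, skipping earlier aggregated outputs."""
    rows = []
    for path in sorted(folder.glob("lcoe_stats_*.csv")):
        match = PER_YEAR.match(path.name)
        if not match:
            continue  # aggregated file from a previous pass
        year = int(match.group(1))
        if not start_year <= year <= end_year:
            continue
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                rows.append({
                    "year": year,
                    "country": row["country"],
                    "average": f"{float(row['average']):.4f}",  # 4 decimals
                })
    rows.sort(key=lambda row: (row["year"], row["country"]))
    out = folder / f"lcoe_stats_{start_year}_{end_year}.csv"
    with out.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["year", "country", "average"])
        writer.writeheader()
        writer.writerows(rows)
    return out


folder = Path(tempfile.mkdtemp())
(folder / "lcoe_stats_2025.csv").write_text("country,average\nDEU,50.123456\nBRA,40.5\n")
(folder / "lcoe_stats_2030.csv").write_text("country,average\nBRA,39\n")
(folder / "lcoe_stats_2025_2030.csv").write_text("year,country,average\n1999,OLD,0\n")
aggregated = aggregate_lcoe(folder, 2025, 2030)
```

The stale aggregated file in the example is ignored on input and then overwritten, while the per-year files stay in place.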

Per-app-run config artefacts¶

When a simulation is launched from the Django/Electron app, SimulationRunner mirrors the CLI output layout by writing two artefacts into the run’s output directory:

simulation_config.json — the resolved SimulationConfig (after defaults, parameter merges and validation).
preparation_metadata.json — metadata about the data preparation that fed this run (cache hash, source master-input version, etc.).

This makes app runs and CLI runs leave the same on-disk shape, simplifying downstream analysis tooling that walks output directories regardless of how the run was started.