Outputs and Post-Processing¶
This page describes the artefacts a simulation produces: in-memory traces during the run, plot files written by SteelPlotter, and the post-processed CSVs assembled at the end. The model also writes geospatial statistics (LCOE / LCOH / overbuild factors), both per year and aggregated across years.
For where individual costs and emissions originate, see Cost Calculation Functions. For how the trade LP feeds these traces, see Trade Model Overview and TM-PAM Connector.
DataCollector traces¶
DataCollector (src/steelo/domain/datacollector.py) is invoked at the end of every simulation year and aggregates per-FG state into in-memory traces consumed by SteelPlotter and the post-processor.
| Attribute | Shape | Source |
|---|---|---|
| | | All active FGs |
| | | |
| | | All active FGs |
| | `{year: {iron\|steel: tonnes}}` | |
| | | Active FGs |
| | | New FGs created that year |
| | | Active FGs, all available boundaries, scopes |
| | | Iron-ore allocations |
| | | Iron-bearing inputs to steelmaking |
| | | Cross-ISO3 flows of |
Emissions reshape and overcounting fix¶
trace_emissions was previously a flat {year: {tech: total}} that summed direct_ghg + direct_with_biomass_ghg + indirect_ghg for the configured carbon-cost boundary only. That sum double-counted direct emissions — the two direct views are alternatives, not separate scopes — and inflated 2025 totals by ~64%. The current shape stores every available boundary and keeps each scope separate, so:
- charts can be produced per boundary without re-walking the plant graph;
- the chart layer chooses which scopes to combine (e.g. direct_ghg + indirect_ghg) without forcing a global decision at collection time;
- the same iteration accumulates trace_production_by_product for free, used as the denominator for intensity charts.
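A minimal sketch of how a chart layer might combine scopes from the reshaped trace. The nesting order (year, technology, boundary, scope), the boundary name, the technology label and all numbers are assumptions for illustration; only the three scope names come from the text above.

```python
# Hypothetical layout: {year: {tech: {boundary: {scope: emissions}}}}.
# Nesting order, boundary name and values are illustrative assumptions.
trace_emissions = {
    2025: {
        "BF-BOF": {
            "example_boundary": {
                "direct_ghg": 100.0,
                "direct_with_biomass_ghg": 110.0,
                "indirect_ghg": 20.0,
            }
        }
    }
}


def combine_scopes(trace, year, boundary, scopes):
    """Sum the requested scopes per technology for one year and boundary."""
    totals = {}
    for tech, by_boundary in trace.get(year, {}).items():
        scope_values = by_boundary.get(boundary, {})
        totals[tech] = sum(scope_values.get(scope, 0.0) for scope in scopes)
    return totals


# Chart-time choice: one direct view plus indirect.
chart_total = combine_scopes(
    trace_emissions, 2025, "example_boundary", ("direct_ghg", "indirect_ghg")
)

# Summing all three scopes, as the old flat trace did, counts direct twice.
legacy_total = combine_scopes(
    trace_emissions,
    2025,
    "example_boundary",
    ("direct_ghg", "direct_with_biomass_ghg", "indirect_ghg"),
)

print(chart_total, legacy_total)
```

Because the scopes stay separate in the trace, switching a chart to the biomass-inclusive view is a change to the `scopes` argument rather than a change to collection.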
International iron trade¶
collect_international_iron_trade(year, trade_allocations) walks the LP allocations, filters to commodities in IRON_PRODUCTS, drops intra-country flows (from_iso3 == to_iso3), and accumulates per-product cross-border tonnes. The per-product totals are logged at info level and consumed by SteelPlotter.plot_international_iron_trade().
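The filtering steps can be sketched as follows. The `Allocation` record, the contents of `IRON_PRODUCTS` and the function name are hypothetical stand-ins; this page does not show the real allocation type or product list.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple

# Hypothetical product set; the real IRON_PRODUCTS is defined in the codebase.
IRON_PRODUCTS = {"hot_metal", "pig_iron", "dri", "hbi"}


class Allocation(NamedTuple):
    """Hypothetical flattened view of one LP allocation."""
    commodity: str
    from_iso3: str
    to_iso3: str
    tonnes: float


def cross_border_iron_tonnes(allocations: Iterable[Allocation]) -> dict:
    """Accumulate cross-border tonnes per iron product."""
    totals = defaultdict(float)
    for alloc in allocations:
        if alloc.commodity not in IRON_PRODUCTS:
            continue  # not an iron product
        if alloc.from_iso3 == alloc.to_iso3:
            continue  # intra-country flow, not international trade
        totals[alloc.commodity] += alloc.tonnes
    return dict(totals)


example = [
    Allocation("dri", "AUS", "JPN", 500.0),
    Allocation("dri", "AUS", "KOR", 250.0),
    Allocation("dri", "AUS", "AUS", 900.0),    # dropped: same country
    Allocation("steel", "CHN", "USA", 100.0),  # dropped: not an iron product
]
iron_trade = cross_border_iron_tonnes(example)
print(iron_trade)
```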
SteelPlotter¶
SteelPlotter (src/steelo/utilities/steeliq_plotter.py) is the unified plotting class that has progressively replaced the standalone functions in src/steelo/utilities/plotting.py. It centralises styling (footers, legends, color schemes), output-path resolution via PlotPaths, and per-chart CSV export.
Class-level conventions:
- Each plot method takes a trace_* dict and an optional iso3_filter / region grouping argument.
- Methods return the saved Path, or None when there is no data.
- export_csv=True (the default on most plots) writes a sibling .csv alongside the .png via _save_chart_data_to_csv(). The CSV has the same data the plot was built from (every furnace, every year), so capacity inventory and breakdowns survive even when the chart truncates or aggregates.
- Plot output subdirectory is selected per call (subdir="plots_dir", "pam_plots_dir", etc.) via _save_figure().
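As a rough illustration of the return-value and sibling-CSV conventions, the sketch below uses only the standard library. `save_chart_sketch`, the file name and the row fields are hypothetical; the real methods draw a matplotlib figure and delegate to _save_figure() and _save_chart_data_to_csv(), whose signatures are not shown here.

```python
import csv
import tempfile
from pathlib import Path
from typing import Optional


def save_chart_sketch(rows: list, png_path: Path, export_csv: bool = True) -> Optional[Path]:
    """Hypothetical plot-method tail: None without data, else the saved path."""
    if not rows:
        return None
    png_path.parent.mkdir(parents=True, exist_ok=True)
    png_path.touch()  # stand-in for the real figure save
    if export_csv:
        # Sibling CSV: same stem, .csv suffix, unaggregated rows.
        with png_path.with_suffix(".csv").open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)
    return png_path


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "plots" / "capacity_by_technology.png"
    empty_result = save_chart_sketch([], target)
    saved = save_chart_sketch(
        [{"furnace_group_id": "fg-1", "year": 2025, "capacity_t": 1.5e6}], target
    )
    csv_lines = target.with_suffix(".csv").read_text().splitlines()

print(empty_result, saved.name, csv_lines)
```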
Plot catalogue:
| Method | Trace consumed | Output subdir |
|---|---|---|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
Plot folder layout¶
Emissions and cost-curve plots write to top-level sibling folders rather than under plots/PAM/…:
output/
plots/
PAM/ # plant-agent plots (capacity, capex, charges, prices)
GEO/ # geospatial / new-plant plots
TM/ # trade-model plots
emissions/ # SteelPlotter.plot_emissions_by_technology
cost_curves/ # SteelPlotter cost-curve methods
Cost-curve filenames follow cost_curve_{product}_by_{aggregation}_{year}.png (e.g. cost_curve_iron_by_region_2025.png); the legacy ordering {product}_cost_curve_by_{aggregation}_{year}.png is no longer produced by the new methods. plot_cost_curve_with_breakdown retains its previous filename convention.
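The naming pattern can be expressed as a small helper. The function name is hypothetical; the pattern itself is the one quoted above.

```python
def cost_curve_filename(product: str, aggregation: str, year: int) -> str:
    """Build a cost-curve file name in the current ordering (hypothetical helper)."""
    return f"cost_curve_{product}_by_{aggregation}_{year}.png"


name = cost_curve_filename("iron", "region", 2025)
print(name)
```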
Post-processed CSV columns¶
extract_and_process_stored_dataCollection() in src/steelo/adapters/dataprocessing/postprocessing/post_process_datacollection.py assembles a per-FG-per-year DataFrame from the stored pickle data. The column set is no longer hardcoded — keys come from runtime arguments computed once at simulation start.
Header columns (deterministic order)¶
| Column | Source |
|---|---|
| | |
| | |
| | |
| | Per-FG state |
Headers are placed first via explicit reordering so downstream consumers can rely on a stable schema.
Dynamic feedstock / carrier columns¶
Wide-form columns are emitted for each canonical feedstock or carrier key. Three families share this pattern:
| Family | Key source | Example column |
|---|---|---|
| | | |
| | | |
| | | |
The previous hardcoded STANDARD_COST_BREAKDOWN_COLUMNS list and the fluxes / lime → burnt lime rename map have been removed; new energy carriers and feedstocks now appear in the post-processed CSV automatically without code edits. Missing keys are zero-padded so the schema is the same across runs.
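A minimal sketch of the zero-padding idea, assuming each FG record is a plain dict keyed by carrier or feedstock. The `cost_breakdown_` prefix, the key names and the helper are hypothetical; the real post-processor receives its canonical keys as runtime arguments.

```python
def widen_with_zero_padding(records, canonical_keys, prefix):
    """Expand per-record dicts into wide-form rows with one column per canonical key.

    Keys absent from a record are written as 0.0, so every run yields the same columns.
    """
    columns = [f"{prefix}{key}" for key in canonical_keys]
    rows = [
        {col: float(record.get(key, 0.0)) for key, col in zip(canonical_keys, columns)}
        for record in records
    ]
    return columns, rows


# Hypothetical canonical keys, computed once at simulation start.
canonical_keys = ["electricity", "hydrogen", "natural_gas"]
records = [
    {"electricity": 42.0, "natural_gas": 18.5},  # this FG uses no hydrogen
    {"hydrogen": 60.0},
]
columns, rows = widen_with_zero_padding(records, canonical_keys, "cost_breakdown_")
print(columns)
print(rows)
```

Adding a new carrier to `canonical_keys` adds a column for every row without touching the widening code, which is the property the paragraph above describes.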
Optional columns¶
| Column | Present when |
|---|---|
| | FG records secondary-output cost adjustments |
| | Carbon-cost calculation produced a value |
| | FG has a non-zero |
| | Wide-form columns expanded from per-FG |
Per-carrier subsidy tracking¶
The data collector records each FG’s per-carrier subsidy (unit_subsidy_<carrier>) and a roll-up unit_subsidy_total. The post-processor picks these up by prefix, places the fixed-order set first, then appends any additional carriers found at runtime, then unit_subsidy_total. This keeps existing dashboards stable while letting new carriers flow through.
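The ordering rule can be sketched as below. The contents of `FIXED_ORDER` are hypothetical, and two details are assumptions rather than documented behaviour: fixed-order columns are only emitted when present, and additional carriers are sorted alphabetically.

```python
PREFIX = "unit_subsidy_"
TOTAL = "unit_subsidy_total"
# Hypothetical fixed-order set; the real list lives in the post-processor.
FIXED_ORDER = ["unit_subsidy_electricity", "unit_subsidy_hydrogen"]


def order_subsidy_columns(columns):
    """Fixed-order carriers first, then extra carriers, then the roll-up total."""
    found = [c for c in columns if c.startswith(PREFIX) and c != TOTAL]
    fixed = [c for c in FIXED_ORDER if c in found]
    extra = sorted(c for c in found if c not in FIXED_ORDER)  # assumed ordering
    ordered = fixed + extra
    if TOTAL in columns:
        ordered.append(TOTAL)
    return ordered


ordered = order_subsidy_columns(
    [
        "year",
        "unit_subsidy_total",
        "unit_subsidy_biomass",
        "unit_subsidy_hydrogen",
        "unit_subsidy_electricity",
    ]
)
print(ordered)
```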
Tabular price outputs¶
SimulationRunner writes a per-year price CSV at the end of the run, keyed off data_collector.trace_price:
output/data/steel_iron_prices.csv
Columns: year, steel_price_usd_per_t, iron_price_usd_per_t, optional scrap_price_usd_per_t, optional iron_weighted_avg_cost_usd_per_t.
A matching matplotlib chart is also produced.
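One way the optional columns could be handled is sketched below. It assumes, purely for illustration, that the trace is shaped as {year: {column: value}} and already uses the CSV column names; the real trace_price layout and the writer inside SimulationRunner may differ.

```python
import csv
import io

BASE_COLUMNS = ["year", "steel_price_usd_per_t", "iron_price_usd_per_t"]
OPTIONAL_COLUMNS = ["scrap_price_usd_per_t", "iron_weighted_avg_cost_usd_per_t"]


def price_trace_to_csv(trace_price: dict) -> str:
    """Render a per-year price table, adding optional columns only when present."""
    present = [
        col for col in OPTIONAL_COLUMNS
        if any(col in values for values in trace_price.values())
    ]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=BASE_COLUMNS + present, lineterminator="\n")
    writer.writeheader()
    for year in sorted(trace_price):
        # Years lacking an optional value get an empty cell (DictWriter's restval).
        writer.writerow({"year": year, **trace_price[year]})
    return buffer.getvalue()


# Hypothetical values in the assumed shape.
example_trace = {
    2026: {"steel_price_usd_per_t": 610.0, "iron_price_usd_per_t": 410.0},
    2025: {
        "steel_price_usd_per_t": 600.0,
        "iron_price_usd_per_t": 400.0,
        "scrap_price_usd_per_t": 300.0,
    },
}
price_csv = price_trace_to_csv(example_trace)
print(price_csv)
```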
Geospatial statistics: LCOE, LCOH, overbuild factors¶
src/steelo/adapters/geospatial/geospatial_statistics.py exports per-country statistics every modelled geospatial year (typically every 5 years where the geospatial pipeline runs):
- output/data/LCOE/lcoe_stats_{year}.csv: average / min / max / p10 / p20 / p25 / p50 LCOE in USD/MWh plus n_grid_points. Source LCOE is the power_price variable in energy_prices (USD/kWh, multiplied ×1000 on export).
- output/data/LCOH/lcoh_stats_{year}.csv: same statistics in USD/kg for the capped_lcoh variable, plus a hydrogen_ceiling_pct column reflecting the configured GeoConfig percentile cap.
- output/data/{factor}_factors/{factor}_stats_{year}.csv: overbuild-factor statistics (e.g. solar_factor, wind_factor, battery_factor) at LCOE percentile points (avg, min, max, p10, p20, p25, p50). Each value is the mean factor across grid points within ±5 % of the LCOE percentile in that country (with a closest-point fallback when the band is empty).
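The band-mean rule for overbuild factors can be sketched in plain Python. This reads "within ±5 % of the LCOE percentile" as a relative band around the percentile's LCOE value, which is an interpretation rather than confirmed behaviour; the helper names, the linear-interpolation percentile and the sample values are likewise assumptions.

```python
def percentile(values, pct):
    """Linear-interpolation percentile, with pct in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def factor_at_lcoe_percentile(lcoe, factor, pct, band=0.05):
    """Mean factor over grid points whose LCOE lies within +/- band of the percentile LCOE."""
    target = percentile(lcoe, pct)
    low, high = target * (1 - band), target * (1 + band)
    in_band = [f for value, f in zip(lcoe, factor) if low <= value <= high]
    if in_band:
        return sum(in_band) / len(in_band)
    # Fallback: no grid point in the band, so use the single closest one.
    closest = min(range(len(lcoe)), key=lambda i: abs(lcoe[i] - target))
    return factor[closest]


# Hypothetical grid points for one country (LCOE in USD/kWh).
lcoe_points = [0.040, 0.042, 0.070, 0.090]
solar_factor = [1.2, 1.4, 2.0, 2.5]

p25_factor = factor_at_lcoe_percentile(lcoe_points, solar_factor, 25)  # two points in band
p40_factor = factor_at_lcoe_percentile(lcoe_points, solar_factor, 40)  # empty band, fallback
print(p25_factor, p40_factor)
```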
At end-of-run, aggregate_lcoe_lcoh_statistics(output_dir, start_year, end_year) concatenates the per-year files into:
- output/data/LCOE/lcoe_stats_{start_year}_{end_year}.csv
- output/data/LCOH/lcoh_stats_{start_year}_{end_year}.csv
sorted by (year, country), formatted to 4 decimals. Per-year files are kept; aggregated files are recognised and excluded from the next aggregation pass via the _<digits> filename suffix check.
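A sketch of the aggregation pass, under several assumptions: per-year files carry `year` and `country` columns, a single `avg` column stands in for the full set of statistics, and the aggregated-file check is a regular expression matching two trailing digit groups. The real aggregate_lcoe_lcoh_statistics may implement the suffix check differently.

```python
import csv
import re
import tempfile
from pathlib import Path

# Assumed check: aggregated files end in _<digits>_<digits>.csv.
AGGREGATED = re.compile(r"_\d+_\d+\.csv$")


def aggregate_stats_sketch(stats_dir: Path, prefix: str, start_year: int, end_year: int) -> Path:
    """Concatenate per-year CSVs, skipping aggregates left by earlier passes."""
    rows = []
    for path in sorted(stats_dir.glob(f"{prefix}_*.csv")):
        if AGGREGATED.search(path.name):
            continue  # an aggregate from a previous pass
        with path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
    rows.sort(key=lambda row: (int(row["year"]), row["country"]))
    out_path = stats_dir / f"{prefix}_{start_year}_{end_year}.csv"
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["year", "country", "avg"])
        writer.writeheader()
        for row in rows:
            writer.writerow({**row, "avg": f"{float(row['avg']):.4f}"})
    return out_path


with tempfile.TemporaryDirectory() as tmp:
    stats_dir = Path(tmp)
    for year, value in ((2030, "41.23456"), (2025, "45.5")):
        (stats_dir / f"lcoe_stats_{year}.csv").write_text(
            f"year,country,avg\n{year},DEU,{value}\n"
        )
    aggregate_stats_sketch(stats_dir, "lcoe_stats", 2025, 2030)
    # Second pass: the aggregate written above must not be re-read.
    out = aggregate_stats_sketch(stats_dir, "lcoe_stats", 2025, 2030)
    aggregated_lines = out.read_text().splitlines()
    per_year_kept = (stats_dir / "lcoe_stats_2025.csv").exists()

print(aggregated_lines)
```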
Per-app-run config artefacts¶
When a simulation is launched from the Django/Electron app, SimulationRunner mirrors the CLI output layout by writing two artefacts into the run’s output directory:
- simulation_config.json: the resolved SimulationConfig (after defaults, parameter merges and validation).
- preparation_metadata.json: metadata about the data preparation that fed this run (cache hash, source master-input version, etc.).
This makes app runs and CLI runs leave the same on-disk shape, simplifying downstream analysis tooling that walks output directories regardless of how the run was started.