OpenFrontIO

alois/OpenFrontIO

Fork 0

mirror of https://github.com/openfrontio/OpenFrontIO.git synced 2026-06-21 22:31:55 +00:00

Commit Graph

Author	SHA1	Message	Date
Evan	bca980f572	Shrink the per-tick worker → main update payload by ~90% (#4244 ) Stacked on #4243 (the `perf:client` harness) — first step of fixing the every-100ms main-thread stutter: make the per-tick burst small before spreading what remains across frames. ## Problem The harness showed the main-thread burst was dominated by `structuredClone` of the `updates` object, and the clone was dominated by two kinds of per-tick churn that re-sent object payloads every tick: - `gold` / `troops` / `tilesOwned` change for nearly every alive player every tick → ~278 partial `PlayerUpdate` objects per tick (world/400 bots), ~508 on giantworldmap. - Attack troop counts tick down every tick → whole `outgoingAttacks`/`incomingAttacks` arrays re-cloned for every fighting player every tick. - `playerNameViewData` (an all-players record) was cloned every tick but only recomputed every 30 ticks. ## Change Three additions to the worker → main protocol (all transferable, zero-clone): 1. `packedPlayerUpdates` — `[smallID, tilesOwned, gold, troops]` float64 quads for players whose stats changed. These fields no longer appear in `PlayerUpdate` diffs (first emissions still carry the full snapshot). Gold is exact in a float64 (game values ≪ 2^53). 2. `packedAttackUpdates` — `[ownerSmallID, direction, index, troops]` quads. Attack arrays are only resent when membership/order/retreating changes — which is exactly the condition that keeps the patch indexes valid (a tick either resends an array or patches it, never both). 3. `playerNameViewData` is now optional — attached only on placement-rebuild ticks (spawn ticks, first ticks, every 30th, spawn end). The client keeps the last applied values; dead players' name placements freeze at death (matching the previous effective behavior). On the client, `GameView.populateFrame` now also rebuilds `names` / `relationMatrix` / `allianceClusters` only when their inputs changed that tick — field presence on a partial `PlayerUpdate` marks them dirty. (`playerStatus`, nuke telegraphs, and attack rings still recompute every tick; they're tick- or unit-dependent.) ## Results (perf:client, this machine; low-end devices ~5–20× slower) Default run (world, 400 bots, 1800 ticks): \| stage \| before \| after \| \|---\|---\|---\| \| clone (serialize+deserialize) \| 1.02ms \| 0.09ms \| \| GameView.update \| 0.62ms \| 0.29ms \| \| WebGLFrameBuilder.update \| 0.04ms \| 0.04ms \| \| TOTAL burst mean \| 1.67ms \| 0.42ms \| \| TOTAL p99 / max \| 3.47 / 10.3ms \| 1.21 / 3.92ms \| giantworldmap/600t: 2.54 → 0.68ms mean. Player update objects: 278 → 6.5 per tick (world), 508 → 12 (giant). The remaining burst is mostly tile apply + per-tick derivations — the part that frame-spreading (next step) addresses. ## Verification - Sim final hash unchanged on all three reference configs (`5607618202213430`, `29309648281599524`, `39945089450032050`) — no simulation behavior change. - View hash unchanged on all three configs (`942106e9`, `a3aae227`, `cbaaf265`) — the rendered view state is provably identical tick-for-tick, including the name-freeze semantics. - New tests: `tests/PackedPlayerUpdates.test.ts` (drain + GameRunner cadence), packed-channel and freeze-at-death cases in `tests/client/view/GameView.test.ts`, `packAttackTroopDeltas` unit tests and updated diff contract in `tests/GameUpdateUtils.test.ts` / `tests/PlayerUpdateDiff.test.ts`. - `npm test` (1490 tests), `eslint`, `prettier`, `tsc --noEmit` all pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-12 16:50:56 -07:00
Evan	2e6f70c098	Speed up the core sim: inline sfc32 PRNG and allocation-free player updates (#4233 ) ## Summary Follow-up to #4230. Two more core-sim optimizations — these are behavior-affecting in controlled ways (unlike #4230, which was hash-identical), so both come with dedicated test coverage written before the change. Combined results (`npm run perf:game`, same machine, before → after): \| run \| mean tick \| ticks/sec \| p99 \| peak heap \| \|---\|---\|---\|---\|---\| \| default (world, 400 bots, 1800 ticks) \| 7.98 → 6.96 ms \| 125 → 144 \| 21.2 → 19.0 ms \| 438 → 294 MB \| \| giantworldmap, 600 ticks \| 17.4 → 15.2 ms \| 58 → 66 \| 32.6 → 30.5 ms \| \| Cumulative with #4230 vs. the original baseline: default run mean 9.04 → 6.96 ms (111 → 144 ticks/sec); giantworldmap 22.5 → 15.2 ms (44 → 66 ticks/sec, max tick 52.8 → 40.1 ms). ### 1. `PseudoRandom`: seedrandom ARC4 → inline sfc32 - ARC4 was ~4% of profiled self time. The new engine is sfc32 with splitmix32 seed expansion and a warmup, using only 32-bit integer ops — sequences are identical across platforms. The class API is unchanged. - This removes the `seedrandom` dependency entirely, making `src/core` actually dependency-free (the import was the only violation of that rule). - ⚠️ The random stream differs, so the deterministic game-state hash changes. All clients run the same code, so cross-client sync is unaffected; the harness reproduces the same hash on repeated runs per seed. New reference hashes: - `--map world --ticks 200 --bots 100` → `5607618202213430` - default run → `29309648281599524` - `--map giantworldmap --ticks 600` → `39945089450032050` - New `tests/PseudoRandom.test.ts` (15 tests) pins the engine-agnostic contract: per-seed determinism, ranges, uniformity, adjacent-seed decorrelation, and every API method. The tests were verified green against the old engine first, then the swap. - The stream change exposed a test that passed by RNG luck: in `AiAttackBehavior.test.ts`, "nation cannot attack allied player" was actually being blocked by the difficulty dice gate in `shouldAttack`, not the alliance check — hiding that the test's `AiAttackBehavior` was constructed without its `NationEmojiBehavior`. The test now supplies one and verifies the real protection layer (`AttackExecution`'s alliance check), robust to any dice outcome. ### 2. `PlayerImpl.toFullUpdate`: allocation-free empty collections - `toFullUpdate` runs for every player every tick and allocated ~10 collections each (allies, embargoes Set, attacks, alliance views, …) even when all were empty — the common case for most of 472 players. Because `lastSentUpdate` retains each snapshot for a full tick, these objects survived minor GC, got promoted, and accumulated as old-space garbage between major GCs — that's the peak-heap drop. - Empty collections now reuse shared frozen module-level singletons, so `diffPlayerUpdate`'s existing `a === b` fast paths skip structural comparison entirely. Non-empty collections build in single passes. Freezing makes accidental in-worker mutation throw loudly instead of silently corrupting every player; consumers across the worker boundary get mutable structured clones as before. (`Set` cannot be frozen — `EMPTY_EMBARGOES` is documented as never-mutate.) - Value-identical: the game-state hash is unchanged by this part (verified against the post-PRNG baseline). - New `tests/PlayerUpdateDiff.test.ts` (8 tests): full-snapshot shape, null-when-unchanged, embargo/alliance/target/attack diffs through the real tick pipeline, and the freeze contract. ### Verification - Full suite passes: 124 files / 1408 tests (23 new) + server tests; lint and prettier clean. - Hash reproducibility confirmed: repeated runs with identical args produce identical hashes on all three configs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-12 08:15:01 -07:00

Author

SHA1

Message

Date

Evan

bca980f572

Shrink the per-tick worker → main update payload by ~90% (#4244 )

Stacked on #4243 (the `perf:client` harness) — first step of fixing the
every-100ms main-thread stutter: make the per-tick burst small before
spreading what remains across frames.

## Problem

The harness showed the main-thread burst was dominated by
`structuredClone` of the `updates` object, and the clone was dominated
by two kinds of per-tick churn that re-sent object payloads every tick:

- `gold` / `troops` / `tilesOwned` change for nearly every alive player
every tick → ~278 partial `PlayerUpdate` objects per tick (world/400
bots), ~508 on giantworldmap.
- Attack troop counts tick down every tick → whole
`outgoingAttacks`/`incomingAttacks` arrays re-cloned for every fighting
player every tick.
- `playerNameViewData` (an all-players record) was cloned every tick but
only recomputed every 30 ticks.

## Change

Three additions to the worker → main protocol (all transferable,
zero-clone):

1. **`packedPlayerUpdates`** — `[smallID, tilesOwned, gold, troops]`
float64 quads for players whose stats changed. These fields no longer
appear in `PlayerUpdate` diffs (first emissions still carry the full
snapshot). Gold is exact in a float64 (game values ≪ 2^53).
2. **`packedAttackUpdates`** — `[ownerSmallID, direction, index,
troops]` quads. Attack arrays are only resent when
membership/order/retreating changes — which is exactly the condition
that keeps the patch indexes valid (a tick either resends an array or
patches it, never both).
3. **`playerNameViewData` is now optional** — attached only on
placement-rebuild ticks (spawn ticks, first ticks, every 30th, spawn
end). The client keeps the last applied values; dead players' name
placements freeze at death (matching the previous effective behavior).

On the client, `GameView.populateFrame` now also rebuilds `names` /
`relationMatrix` / `allianceClusters` only when their inputs changed
that tick — field presence on a partial `PlayerUpdate` marks them dirty.
(`playerStatus`, nuke telegraphs, and attack rings still recompute every
tick; they're tick- or unit-dependent.)

## Results (perf:client, this machine; low-end devices ~5–20× slower)

Default run (world, 400 bots, 1800 ticks):

| stage | before | after |
|---|---|---|
| clone (serialize+deserialize) | 1.02ms | **0.09ms** |
| GameView.update | 0.62ms | **0.29ms** |
| WebGLFrameBuilder.update | 0.04ms | 0.04ms |
| **TOTAL burst mean** | **1.67ms** | **0.42ms** |
| TOTAL p99 / max | 3.47 / 10.3ms | **1.21 / 3.92ms** |

giantworldmap/600t: 2.54 → 0.68ms mean. Player update objects: 278 → 6.5
per tick (world), 508 → 12 (giant). The remaining burst is mostly tile
apply + per-tick derivations — the part that frame-spreading (next step)
addresses.

## Verification

- **Sim final hash unchanged** on all three reference configs
(`5607618202213430`, `29309648281599524`, `39945089450032050`) — no
simulation behavior change.
- **View hash unchanged** on all three configs (`942106e9`, `a3aae227`,
`cbaaf265`) — the rendered view state is provably identical
tick-for-tick, including the name-freeze semantics.
- New tests: `tests/PackedPlayerUpdates.test.ts` (drain + GameRunner
cadence), packed-channel and freeze-at-death cases in
`tests/client/view/GameView.test.ts`, `packAttackTroopDeltas` unit tests
and updated diff contract in `tests/GameUpdateUtils.test.ts` /
`tests/PlayerUpdateDiff.test.ts`.
- `npm test` (1490 tests), `eslint`, `prettier`, `tsc --noEmit` all
pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

2026-06-12 16:50:56 -07:00

Evan

2e6f70c098

Speed up the core sim: inline sfc32 PRNG and allocation-free player updates (#4233 )

## Summary

Follow-up to #4230. Two more core-sim optimizations — these are
**behavior-affecting in controlled ways** (unlike #4230, which was
hash-identical), so both come with dedicated test coverage written
before the change.

Combined results (`npm run perf:game`, same machine, before → after):

| run | mean tick | ticks/sec | p99 | peak heap |
|---|---|---|---|---|
| default (world, 400 bots, 1800 ticks) | 7.98 → **6.96 ms** | 125 →
**144** | 21.2 → **19.0 ms** | 438 → **294 MB** |
| giantworldmap, 600 ticks | 17.4 → **15.2 ms** | 58 → **66** | 32.6 →
30.5 ms | |

Cumulative with #4230 vs. the original baseline: default run mean 9.04 →
6.96 ms (111 → 144 ticks/sec); giantworldmap 22.5 → 15.2 ms (44 → 66
ticks/sec, max tick 52.8 → 40.1 ms).

### 1. `PseudoRandom`: seedrandom ARC4 → inline sfc32

- ARC4 was ~4% of profiled self time. The new engine is sfc32 with
splitmix32 seed expansion and a warmup, using only 32-bit integer ops —
sequences are identical across platforms. The class API is unchanged.
- This **removes the `seedrandom` dependency entirely**, making
`src/core` actually dependency-free (the import was the only violation
of that rule).
- ⚠️ **The random stream differs, so the deterministic game-state hash
changes.** All clients run the same code, so cross-client sync is
unaffected; the harness reproduces the same hash on repeated runs per
seed. New reference hashes:
  - `--map world --ticks 200 --bots 100` → `5607618202213430`
  - default run → `29309648281599524`
  - `--map giantworldmap --ticks 600` → `39945089450032050`
- New `tests/PseudoRandom.test.ts` (15 tests) pins the engine-agnostic
contract: per-seed determinism, ranges, uniformity, adjacent-seed
decorrelation, and every API method. The tests were verified green
against the old engine first, then the swap.
- The stream change exposed a test that passed **by RNG luck**: in
`AiAttackBehavior.test.ts`, "nation cannot attack allied player" was
actually being blocked by the difficulty dice gate in `shouldAttack`,
not the alliance check — hiding that the test's `AiAttackBehavior` was
constructed without its `NationEmojiBehavior`. The test now supplies one
and verifies the real protection layer (`AttackExecution`'s alliance
check), robust to any dice outcome.

### 2. `PlayerImpl.toFullUpdate`: allocation-free empty collections

- `toFullUpdate` runs for every player every tick and allocated ~10
collections each (allies, embargoes Set, attacks, alliance views, …)
even when all were empty — the common case for most of 472 players.
Because `lastSentUpdate` retains each snapshot for a full tick, these
objects survived minor GC, got promoted, and accumulated as old-space
garbage between major GCs — that's the peak-heap drop.
- Empty collections now reuse shared **frozen** module-level singletons,
so `diffPlayerUpdate`'s existing `a === b` fast paths skip structural
comparison entirely. Non-empty collections build in single passes.
Freezing makes accidental in-worker mutation throw loudly instead of
silently corrupting every player; consumers across the worker boundary
get mutable structured clones as before. (`Set` cannot be frozen —
`EMPTY_EMBARGOES` is documented as never-mutate.)
- Value-identical: the game-state hash is unchanged by this part
(verified against the post-PRNG baseline).
- New `tests/PlayerUpdateDiff.test.ts` (8 tests): full-snapshot shape,
null-when-unchanged, embargo/alliance/target/attack diffs through the
real tick pipeline, and the freeze contract.

### Verification

- Full suite passes: 124 files / 1408 tests (23 new) + server tests;
lint and prettier clean.
- Hash reproducibility confirmed: repeated runs with identical args
produce identical hashes on all three configs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

2026-06-12 08:15:01 -07:00

2 Commits