
Great question. Honestly it’s still our biggest pain point.
Current approach:
- Pre-generate a CSV of test user credentials (10k accounts in our staging environment)
- k6 reads from the CSV using SharedArray so each VU gets a unique user
- Product/inventory data is refreshed from a staging snapshot weekly via a script
It’s not elegant, but it works well enough that our load results are meaningfully realistic. The alternative — fully synthetic data — produces numbers that don’t translate to production at all, in our experience.


The “performance theater” description is painfully accurate and way more common than people admit.
On the staging environment problem — proportional scaling is the key concept that helped us. You don’t need a full production replica, you need a consistent environment and you need to interpret results relatively, not absolutely.
If staging is consistently 2x slower than production under equivalent load, that ratio becomes your calibration factor. You care less about the raw p95 number and more about whether it changed compared to the last run.
The regression detection piece is more valuable than the absolute number anyway.
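To make that concrete, here's a rough sketch of the run-over-run comparison. The function names, numbers, and the 15% tolerance are all illustrative, not anything from a real pipeline:

```javascript
// Relative regression check: compare this run's staging p95 against the
// previous run's baseline, rather than against an absolute production SLO.
function regressionRatio(currentP95Ms, baselineP95Ms) {
  return currentP95Ms / baselineP95Ms;
}

// Flag a regression when latency grows past a tolerance (here, 15%).
// The tolerance absorbs normal run-to-run noise in a shared staging env.
function isRegression(currentP95Ms, baselineP95Ms, tolerance = 0.15) {
  return regressionRatio(currentP95Ms, baselineP95Ms) > 1 + tolerance;
}

// Staging may run ~2x slower than production in absolute terms, but the
// run-over-run ratio is what you act on:
console.log(isRegression(470, 400)); // true  (17.5% slower than last run)
console.log(isRegression(420, 400)); // false (5% is within tolerance)
```

The same idea works on whatever summary metric your k6 output gives you; the point is that the comparison is against your own baseline, so the staging-vs-production gap cancels out.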