[Do not merge] Omes scenario to stress test Temporal's visibility layer by deepakkarki · Pull Request #304 · temporalio/omes

deepakkarki · 2026-03-23T16:06:30Z

Visibility Stress Test

What It Does

The visibility_stress scenario generates controlled read and write traffic against Temporal's
visibility store (Elasticsearch or SQL). It creates short-lived workflows that perform custom
search attribute (CSA) updates, while separate goroutines issue List/Count queries and explicit
deletes.

Note : This scenario focuses on benchmarking the visibility store - not testing the correctness or
any other portion of the temporal workflow engine.

Operations Exercised

Visibility Operation	How it's generated
Insert	Workflow starts (at `wfRPS`)
CSA Update	`UpsertSearchAttributes` inside each workflow (at `wfRPS * updatesPerWF`)
Close (success)	Workflow completes normally (~85% by default)
Close (failed)	Workflow returns intentional error (`failPercent`)
Close (timed out)	Workflow exceeds execution timeout (`timeoutPercent`)
Explicit Delete	Deleter goroutine lists terminal WFs and deletes them (`deleteRPS`)
Retention Delete	Server-side GC after namespace retention period
List/Count Query	Querier goroutine with varying filter complexity

Quick Start

# Simplest possible run (single namespace, write + read, 5 minutes)
go run ./cmd run-scenario-with-worker \
  --scenario visibility_stress --language go \
  --duration 5m \
  --option loadPreset=light --option queryPreset=light \
  --option csaPreset=small

This starts a Go worker, registers 6 CSAs on the default namespace, then runs for 5 minutes at
10 workflow starts/sec with light query traffic.

Presets

The scenario is controlled by three independent presets. Each can be selected via --option and
individually overridden.

Load Presets (`--option loadPreset=...`)

Controls write traffic. If omitted, no workflows are created (read-only mode).

Preset	wfRPS	updatesPerWF	deleteRPS	failPercent	timeoutPercent
`light`	10	5	2	10%	5%
`moderate`	100	10	20	10%	5%
`heavy`	1000	20	200	10%	5%
`no-failures`	100	10	20	0%	0%

Override individual values:

--option loadPreset=moderate --option wfRPS=500 --option deleteRPS=0

Query Presets (`--option queryPreset=...`)

Controls read traffic. If omitted, no queries are issued (write-only mode).

Preset	countRPS	listRPS	Filter distribution
`light`	1	2	Balanced
`moderate`	5	10	Balanced
`heavy`	10	25	Biased toward CSA filters

Query types (weighted distribution):

No filter: WorkflowType = 'visibilityStressWorker'
Open: ExecutionStatus = 'Running'
Closed + time range: ExecutionStatus != 'Running' AND CloseTime > ...
Simple CSA: Single CSA filter (e.g., VS_Int_01 > 500)
Compound CSA: Multiple CSAs ANDed (e.g., VS_Int_01 > 200 AND VS_Keyword_01 = 'alpha')

List queries fetch up to 3 pages to exercise pagination.

CSA Presets (`--option csaPreset=...`)

Defines which custom search attributes to register and use. Shared by both writer and querier.

Preset	CSAs
`small`	6 CSAs: 1 each of Int, Keyword, Bool, Double, Text, Datetime
`medium`	10 CSAs: 2 Int, 2 Keyword, 1 Bool, 2 Double, 1 Text, 2 Datetime
`heavy`	20 CSAs: 5 Int, 5 Keyword, 2 Bool, 3 Double, 3 Text, 2 Datetime

If omitted when loadPreset is set, defaults to medium. Required in read-only mode.

Modes

`loadPreset`	`queryPreset`	What happens
Set	Set	Full benchmark: writes + reads simultaneously
Set	Omitted	Write-only: workflows created/updated/deleted, no queries
Omitted	Set	Read-only: queries against existing data from a prior write run
Omitted	Omitted	Error (unless `cleanup=true`)

Common Usage Patterns

Write-Only (Benchmark Inserts/Updates)

go run ./cmd run-scenario-with-worker \
  --scenario visibility_stress --language go \
  --duration 30m \
  --option loadPreset=moderate

Read-Only (Benchmark Queries Against Existing Data)

Run this after a write run that used csaPreset=medium:

go run ./cmd run-scenario-with-worker \
  --scenario visibility_stress --language go \
  --duration 10m \
  --option queryPreset=heavy --option csaPreset=medium

Full Benchmark (Writes + Reads)

go run ./cmd run-scenario-with-worker \
  --scenario visibility_stress --language go \
  --duration 1h \
  --option loadPreset=moderate --option queryPreset=light --option csaPreset=medium

No Intentional Failures

go run ./cmd run-scenario-with-worker \
  --scenario visibility_stress --language go \
  --duration 30m \
  --option loadPreset=no-failures --option csaPreset=heavy

Custom Failure Rate

go run ./cmd run-scenario-with-worker \
  --scenario visibility_stress --language go \
  --duration 30m \
  --option loadPreset=moderate --option failPercent=0.20 --option timeoutPercent=0.10

High Throughput (No Deletes, Retention Only)

go run ./cmd run-scenario-with-worker \
  --scenario visibility_stress --language go \
  --duration 2h \
  --option loadPreset=heavy --option deleteRPS=0 --option retention=24h

Multi-Namespace

For namespaceCount > 1, you must run the workers and scenario separately (
run-scenario-with-worker is not supported for multi-namespace).

# 1. Start a worker per namespace
for i in $(seq 0 4); do
  go run ./cmd run-worker --language go --run-id my-test \
    --namespace "vs-stress-my-test-$i" \
    --server-address localhost:7233 &
done

# 2. Run the scenario
go run ./cmd run-scenario --scenario visibility_stress --run-id my-test \
  --server-address localhost:7233 \
  --duration 30m \
  --option loadPreset=moderate --option queryPreset=light --option csaPreset=medium \
  --option namespaceCount=5 --option createNamespaces=true

# 3. Stop workers when done
kill $(jobs -p)

For single namespace (namespaceCount=1, the default), the scenario uses the CLI's --namespace
flag (default: default). No custom namespaces are created.

Cleanup

After a run, workflows may remain on the server. Clean them up:

# Single namespace
go run ./cmd run-scenario-with-worker \
  --scenario visibility_stress --language go \
  --run-id my-test \
  --option cleanup=true

# Multi-namespace
go run ./cmd run-scenario --scenario visibility_stress --run-id my-test \
  --option cleanup=true --option namespaceCount=5

# Also delete the namespaces
go run ./cmd run-scenario --scenario visibility_stress --run-id my-test \
  --option cleanup=true --option deleteNamespaces=true --option namespaceCount=5

Cleanup terminates all running workflows and deletes all workflows with
WorkflowType = 'visibilityStressWorker' on the scenario's task queue.

All Options Reference

Preset Selection

Option	Values	Default	Description
`loadPreset`	`light`, `moderate`, `heavy`, `no-failures`	(none)	Write traffic profile
`queryPreset`	`light`, `moderate`, `heavy`	(none)	Read traffic profile
`csaPreset`	`small`, `medium`, `heavy`	`medium` (if loadPreset set)	CSA set to register/use

Load Overrides (apply on top of `loadPreset`)

Option	Type	Description
`wfRPS`	float	Workflow starts per second
`updatesPerWF`	float	CSA updates per workflow (fractional OK, e.g., `0.5`)
`deleteRPS`	float	Explicit deletes per second (`0` = retention only)
`failPercent`	float	Fraction of WFs that fail (e.g., `0.10` = 10%)
`timeoutPercent`	float	Fraction of WFs that timeout (e.g., `0.05` = 5%)

Query Overrides (apply on top of `queryPreset`)

Option	Type	Description
`countRPS`	float	CountWorkflowExecutions per second
`listRPS`	float	ListWorkflowExecutions per second

Namespace & Environment

Option	Type	Default	Description
`namespaceCount`	int	`1`	Number of namespaces to spread load across
`createNamespaces`	bool	`false`	Auto-create namespaces (ignored when `namespaceCount=1`)
`retention`	duration	`168h`	Namespace retention period (minimum `24h`)

Modes

Option	Type	Default	Description
`cleanup`	bool	`false`	Cleanup mode: terminate + delete all workflows
`deleteNamespaces`	bool	`false`	Delete namespaces during cleanup (multi-NS only)

CLI Flags (not `--option`)

Flag	Required	Description
`--duration`	Yes	How long to run the steady-state phase
`--language go`	Yes	Must be `go` (Go-only scenario)
`--run-id`	Yes for `run-scenario`	Links worker and scenario to the same task queue

--iterations is not supported by this scenario.

How It Works (Brief)

Setup: Register CSAs on each namespace, poll until propagated (up to 30s).
Writer goroutine: Rate-limited loop starts workflows at wfRPS. Each workflow receives
instructions (CSA update groups, delay, fail/timeout flags) baked into its input. Fire-and-forget.
Workflow: Executes CSA updates with sleeps between them, then reaches a terminal state
(completed / failed / timed out).
Deleter goroutines: One per namespace. Each periodically lists terminal workflows and
deletes them. The list query itself is a visibility read (realistic!).
Querier goroutine: Rate-limited loop issues List/Count queries with varying filter
complexity, fetching up to 3 pages per query.
Teardown: Log final stats (total created, deleted, queried, errors).

Derived Rates (Logged at Startup)

Effective CSA update RPS = wfRPS * updatesPerWF
Effective close RPS ≈ wfRPS (at steady state)
Workflow lifetime ≈ updatesPerWF * updateDelay (then completes/fails/times out)

Example Startup Log

Mode: write+read
Namespaces: [default]
Task queue: omes-my-test
CSAs: 10 total
Write: wfRPS=100.0, updatesPerWF=10.0, effective CSA update RPS≈1000, deleteRPS=20.0
       failPercent=0.10, timeoutPercent=0.05, updateDelay=1s
Read: countRPS=5.0, listRPS=10.0

CLAassistant · 2026-03-23T16:07:02Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

deepakkarki added 5 commits March 22, 2026 14:54

Initial visibility stress prototype

83ca45d

comments and test

a6fd4d3

querier logs

ea229e3

Wait for workflows to drain

400ffcb

Deleter logs

471063f

wait for workers + update presets

f631e9e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Do not merge] Omes scenario to stress test Temporal's visibility layer#304

[Do not merge] Omes scenario to stress test Temporal's visibility layer#304
deepakkarki wants to merge 6 commits intomainfrom
dk/vis-stress

deepakkarki commented Mar 23, 2026 •

edited

Loading

Uh oh!

CLAassistant commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

deepakkarki commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Visibility Stress Test

What It Does

Operations Exercised

Quick Start

Presets

Load Presets (--option loadPreset=...)

Query Presets (--option queryPreset=...)

CSA Presets (--option csaPreset=...)

Modes

Common Usage Patterns

Write-Only (Benchmark Inserts/Updates)

Read-Only (Benchmark Queries Against Existing Data)

Full Benchmark (Writes + Reads)

No Intentional Failures

Custom Failure Rate

High Throughput (No Deletes, Retention Only)

Multi-Namespace

Cleanup

All Options Reference

Preset Selection

Load Overrides (apply on top of loadPreset)

Query Overrides (apply on top of queryPreset)

Namespace & Environment

Modes

CLI Flags (not --option)

How It Works (Brief)

Derived Rates (Logged at Startup)

Example Startup Log

Uh oh!

CLAassistant commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

deepakkarki commented Mar 23, 2026 •

edited

Loading

Load Presets (`--option loadPreset=...`)

Query Presets (`--option queryPreset=...`)

CSA Presets (`--option csaPreset=...`)

Load Overrides (apply on top of `loadPreset`)

Query Overrides (apply on top of `queryPreset`)

CLI Flags (not `--option`)