Add project executor framework for workflow load testing #301

Closed
THardy98 wants to merge 1 commit into main from pr/proj-load-testing

Conversation

THardy98 (Contributor) commented Mar 18, 2026

Motivation

Omes today tests Temporal SDKs through scenarios — predefined load patterns that drive a generic KitchenSink worker using protobuf action sequences. This works well for cross-SDK conformance testing, but makes it difficult to test real workflows written in native Go (or any other language). The protobuf-action model is indirect: you describe what the workflow should do in protobuf, and a generic worker interprets it. You never actually test the workflow code that a team would write in practice.

Project tests are a new paradigm for omes. Instead of describing workflows indirectly, each project is a self-contained Go module with its own workflows, activities, worker, and execution logic. The framework coordinates these projects via gRPC, making them buildable, runnable, and testable independently — while still plugging into omes' load generation and metrics infrastructure.

Architecture

A project test is a Go binary that exposes two subcommands:

  • <project> worker — starts a Temporal worker with the project's real workflows and activities
  • <project> project-server — starts a gRPC server that accepts Init and Execute RPCs

The harness (projecttests/go/harness/) provides this framework. Projects register three callbacks — RegisterWorker, OnInit, OnExecute — and call Run(). The harness handles gRPC server lifecycle, worker management, client pooling, Prometheus metrics, and TLS configuration.

The test runner (CLI or integration test) spawns both processes, sends an Init RPC with connection details and config, then drives iterations via Execute RPCs. This keeps projects decoupled from the omes CLI while allowing omes to orchestrate load patterns (steady-rate, ebb-and-flow, saturation) against any project.

What's included

Two example projects demonstrate the framework:

  • helloworld: minimal (~100 LOC), shows the basic pattern
  • throughputstress: more involved; effectively a port of the existing throughput stress scenario. Like the scenario, it parses a config file to decide which workflow features to exercise (local and remote activities, child workflows, self-queries, self-signals, workflow updates, retry scenarios, heartbeats, continue-as-new, and so on).

Note: The throughput stress project is an example of porting an existing scenario to a project. It's also our most commonly used load-testing scenario, so it is a valuable one to transition. You are not limited to this project for throughput testing: any project using the steady-state executor (i.e. the generic executor) can run throughput tests.

CLI integration adds omes project and omes exec commands for running project-based load tests with configurable executors, metrics collection, and post-run verification.

Note: the post-run verification is quite simple at the moment; it is basically a small library of verification functions we already use.

Docker support uses a two-layer image pattern: a base image with the Go toolchain and harness, plus thin per-project overlay images. A Docker Compose stack provides Temporal server + Prometheus for local testing with metrics.

CI adds a projecttests.yml workflow with three jobs for simple regression testing: integration tests (builds and runs helloworld and throughputstress against a dev server), Docker image build verification, and proto generation/lint consistency checks.

This PR also refactors the existing ebb-and-flow scenario to use shared internal utilities and adds Prometheus export/query support (metrics/prom_export.go, metrics/prom_query.go)

@THardy98 THardy98 requested review from a team as code owners March 18, 2026 17:25
@THardy98 THardy98 force-pushed the pr/proj-load-testing branch 2 times, most recently from 6b6fec8 to 047022f Compare March 18, 2026 18:48
@bergundy bergundy self-requested a review March 18, 2026 19:43
@THardy98 THardy98 requested a review from Sushisource March 18, 2026 20:09
THardy98 (Contributor, Author) commented Mar 18, 2026

FWIW - I spent some time looking at migrating some of the other executors as well (namely, ebbandflow).

I think in cases where the executor wants to modify its load based on feedback from an Execute call (e.g. ebbandflow wants to adjust load based on scheduled activities, and then on completed activities), it may motivate another RPC such as ExecuteStream, where each Execute call/iteration can return arbitrary events that the executor uses to adjust its load. Just a thought.
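To illustrate the kind of feedback loop this would enable, here is a small sketch. It is purely hypothetical: the `Event` type, its kinds, and the `Backlog` helper are invented for this example, and a buffered channel stands in for the response stream of an ExecuteStream-style RPC, which does not exist in this PR.

```go
package main

import "fmt"

// EventKind and Event are hypothetical shapes for what a streaming Execute
// call might report back to the executor.
type EventKind int

const (
	ActivityScheduled EventKind = iota
	ActivityCompleted
)

type Event struct {
	Kind  EventKind
	Count int
}

// Backlog tracks activities that were scheduled but have not completed yet.
type Backlog struct {
	scheduled int
	completed int
}

// Observe folds one streamed event into the running totals.
func (b *Backlog) Observe(e Event) {
	switch e.Kind {
	case ActivityScheduled:
		b.scheduled += e.Count
	case ActivityCompleted:
		b.completed += e.Count
	}
}

// Outstanding returns the number of scheduled-but-not-completed activities.
func (b *Backlog) Outstanding() int { return b.scheduled - b.completed }

// NextBatch returns how many more activities to schedule to reach target,
// or zero if the backlog is already at or above it.
func NextBatch(b *Backlog, target int) int {
	if gap := target - b.Outstanding(); gap > 0 {
		return gap
	}
	return 0
}

func main() {
	// The channel stands in for the response stream of one Execute call.
	events := make(chan Event, 2)
	events <- Event{Kind: ActivityScheduled, Count: 10}
	events <- Event{Kind: ActivityCompleted, Count: 4}
	close(events)

	var b Backlog
	for e := range events {
		b.Observe(e)
	}
	fmt.Println("outstanding:", b.Outstanding())
	fmt.Println("next batch for target 8:", NextBatch(&b, 8))
}
```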

@bergundy bergundy removed their request for review March 20, 2026 23:08
@THardy98 THardy98 force-pushed the pr/proj-load-testing branch 5 times, most recently from f32508d to 960bbc9 Compare March 21, 2026 21:49
@THardy98 THardy98 force-pushed the pr/proj-load-testing branch from 960bbc9 to 8ab5cd7 Compare March 21, 2026 22:05
Sushisource (Member) left a comment

Overall this makes sense, but I have a few concerns:

  • We now have two entire harnesses/frameworks that need to be implemented for every language. That's kind of a lot. I would really prefer if there were just one, but it's not clear to me that there is a path to consolidating the existing stuff into the new structure. IMO we should try very hard to figure out how to make that work. Maybe that means something like retaining the gRPC invocation structure but allowing for project and non-project based things to get invoked that way. If we can't... I can probably accept having the two options... but it feels confusing and messy.
  • I think this could've used a bit more self review before publishing. There are a lot of AI-written comments that sort of say what something is without explaining why it matters or what the semantics are. Try to put yourself in a reviewer's shoes when self-reviewing and editing those.
  • The README situation could use some improvement. The one in projecttests/ is not bad at all, but I desperately would like an architectural overview, probably with a diagram (this also makes life easier for reviewers), and I'd shorten and merge the docker readme into it.

Comment on lines +24 to +27
run: |
go test -v -race -timeout 10m ./projecttests/... 2>&1 | \
go run github.com/jstemmer/go-junit-report/v2@latest \
-set-exit-code -iocopy -out junit-projecttests-go.xml

This should probably just be two separate steps

Comment on lines +60 to +62
- name: Smoke test overlay image
run: |
docker run --rm omes-projecttest-helloworld:ci --help

Not sure I'd call that a smoke test?

}

r.sdkOpts.AddCLIFlags(cmd.Flags())
cmd.Flags().Lookup("language").Usage = "Language to use for workflow tests (go only)"

It's Go-only just for now though, right?

Also, seemingly the language flag is redundant, since the projecttests path includes the language

cmd.Flags().Lookup("language").Usage = "Language to use for workflow tests (go only)"
r.programOpts.AddFlags(cmd.Flags())
cmd.Flags().AddFlagSet(r.loggingOpts.FlagSet())
cmd.Flags().StringVar(&r.processMonitorAddr, "process-monitor-addr", "", "Address for process metrics sidecar (e.g. :9091)")

Should this be just a port number? Seems like it would not make sense to ever bind to anything other than 0.0.0.0?

Comment on lines +36 to +37
omes project --language go --project-dir ./projecttests/go/tests/helloworld --spawn-worker --iterations 100
omes project --language go --project-dir ./projecttests/go/tests/helloworld --iterations 100`,

I'm not sure the names of the commands feel obvious. exec isn't in any way obviously related to project, but it is. Maybe exec should be project-exec and this can be project-run.

Same comment as with exec that the language flag feels unnecessary here.


// DefaultDerivedQueries returns a curated list of PromQL queries that precompute
// worker/process metrics into Omni-friendly metric lines.
func DefaultDerivedQueries() []PromQuery {

This is very duplicative of buildMetricQueries

@@ -0,0 +1,151 @@
# projecttests Docker Setup

This file is fairly useful (if a bit overly verbose), but it's not in a very useful place. Per my overall review comments, this should be combined with some kind of higher level readme that explains the whole structure and how to use it.

Comment on lines +14 to +15
func clientMain(ctx context.Context, config *harness.Config) error {
c, err := pool.GetOrDial("default", config.ConnectionOptions)

IMO rather than expecting everyone to use this pool, it would make more sense to just provide the client as part of the harness execute signature.

Comment on lines +1 to +4
# Project Tests

Self-contained Go programs for testing Temporal workflows under load. Each project is an independent Go module that implements its own workflows, activities, and execution logic, coordinated via gRPC.

This is more like the overview I was looking for but still lacks a real overview of the architecture, and still is written like Go will forever be the only language. Combining the docker readme with this, giving an architectural overview, and at least linking to it from the top level README would go a long way.

fs.StringVar(&r.runFamily, "run-family", "", "Human-readable identifier for grouping related runs")
fs.StringVar(&r.taskQueue, "task-queue", "", "Task queue name (default: omes-<run-id>)")
fs.IntVar(&r.clientPort, "client-port", 0, "Port for local client HTTP server (0 = auto)")
fs.StringVar(&r.executor, "executor", "", "projecttests executor to run")

Seems like this should have a default?

THardy98 (Contributor, Author) commented

Addressing feedback in: #317

@THardy98 THardy98 closed this Mar 31, 2026

2 participants