Agent-first generation: runnable scripts for media generation at scale #13
Open
yining1023 wants to merge 6 commits into main
…ebrand

Add generation skills that run directly via `uv run`, with no SDK setup required:
- scripts/: generate_video.py, generate_image.py, generate_audio.py, list_models.py, get_task.py, and shared runway_helpers.py
- skills/: rw-generate-video, rw-generate-image, rw-generate-audio

All generation scripts and skills include seedance2 (36 credits/sec) with text-to-video, image-to-video, and video-to-video support.

README rebranded with two use cases:
1. Generate media at scale (primary): agent runs scripts directly
2. Integrate into your app (secondary): existing integration skills

Plugin metadata updated to v2.0.0 with agent-first positioning.

Made-with: Cursor
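Because seedance2 is billed per second of output, the cost of a batch can be estimated before anything is submitted. The following is a minimal sketch rather than code from this PR: `estimate_credits` and `CREDITS_PER_SECOND` are hypothetical names, and only the seedance2 rate (36 credits/sec) is taken from the commit message above.

```python
# Hypothetical helper, not part of the PR's scripts.
# Only the seedance2 rate (36 credits/sec) comes from the commit message.
CREDITS_PER_SECOND = {"seedance2": 36}


def estimate_credits(model: str, duration_sec: int, count: int = 1) -> int:
    """Estimate the credit cost of `count` clips of `duration_sec` seconds each."""
    if model not in CREDITS_PER_SECOND:
        raise KeyError(f"no known rate for model {model!r}")
    if duration_sec <= 0 or count <= 0:
        raise ValueError("duration_sec and count must be positive")
    return CREDITS_PER_SECOND[model] * duration_sec * count


if __name__ == "__main__":
    # Ten 5-second seedance2 clips: 36 * 5 * 10 credits.
    print(estimate_credits("seedance2", 5, count=10))
```

An agent generating at scale could run a check like this first and refuse to start a batch that exceeds a budget.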
…ries Made-with: Cursor
Made-with: Cursor
- generate_audio.py: fix TTS voice format to {type: "runway-preset", presetId}
- generate_image.py: default to gemini_2.5_flash (Nano Banana Pro), auto-select
  correct ratio per model (1344:768 for gemini, 1280:720 for others)
- generate_video.py: auto-correct duration for veo3 models (valid: 4, 6, 8),
  handle seedance2 VTV without ratio/duration
- runway_helpers.py: add seedance2 to model registry, add valid durations for
  veo3 models, improve error output with validation details
- list_models.py: add requests dependency
- rw-generate-image SKILL.md: update default model to gemini_2.5_flash
Made-with: Cursor
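Two of the fixes in this commit are easy to picture in code: the per-model default ratio and the TTS voice object. This is only an illustrative sketch. The function names `default_ratio` and `preset_voice` are hypothetical, and matching gemini models by name prefix is an assumption; the ratio values and the `{type: "runway-preset", presetId}` shape are the ones listed in the commit message.

```python
# Hypothetical helpers illustrating the fixes listed above; the function
# names and the prefix check are assumptions, not the PR's actual code.

def default_ratio(model: str) -> str:
    """Pick a default ratio: 1344:768 for gemini models, 1280:720 for others."""
    return "1344:768" if model.startswith("gemini") else "1280:720"


def preset_voice(preset_id: str) -> dict:
    """Build the TTS voice object in the {type, presetId} shape."""
    return {"type": "runway-preset", "presetId": preset_id}


if __name__ == "__main__":
    print(default_ratio("gemini_2.5_flash"))
    print(preset_voice("Noah"))
```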
- generate_video.py: use videoUri (not promptVideo) for gen4_aleph VTV, remove duration/ratio from gen4_aleph VTV (not supported)
- generate_audio.py: use audioUri for isolate/dub, targetLang (not targetLanguage) for dub, media object + voice object for sts
- generate_image.py: error on gen4_image_turbo without reference images
- runway_helpers.py: veo3 only supports duration 8 (not [4, 6, 8])
- rw-integrate-audio SKILL.md: fix TTS to promptText + voice object, isolate to audioUri, dub to audioUri + targetLang, sts to media + voice
- rw-integrate-video SKILL.md: gen4_aleph VTV uses videoUri not promptVideo
- rw-generate-audio SKILL.md: add STS example, document voice presets
Made-with: Cursor
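The field-name corrections above can be summarized as small payload builders, plus a duration check against a per-model registry. This is a sketch under assumptions, not the code in `generate_video.py` or `runway_helpers.py`: all function names, the `VALID_DURATIONS` table, and the `model` and `promptText` keys in the video payload are hypothetical. Only `videoUri`, `audioUri`, `targetLang`, and the veo3 duration of 8 come from the commit message.

```python
# Hypothetical sketch of the corrected request shapes; names are assumptions.

# Per the commit message, veo3 accepts only a duration of 8.
VALID_DURATIONS = {"veo3": [8]}


def coerce_duration(model: str, requested: int) -> int:
    """Snap `requested` to the nearest valid duration for `model`, if one is known."""
    valid = VALID_DURATIONS.get(model)
    if not valid:
        return requested  # no registry entry: leave the value untouched
    return min(valid, key=lambda d: abs(d - requested))


def aleph_vtv_payload(video_uri: str, prompt: str) -> dict:
    """gen4_aleph video-to-video: videoUri (not promptVideo), no ratio or duration."""
    return {"model": "gen4_aleph", "videoUri": video_uri, "promptText": prompt}


def dub_payload(audio_uri: str, target_lang: str) -> dict:
    """Dubbing: audioUri plus targetLang (not targetLanguage)."""
    return {"audioUri": audio_uri, "targetLang": target_lang}


if __name__ == "__main__":
    print(coerce_duration("veo3", 5))
    print(aleph_vtv_payload("https://example.com/in.mp4", "make it snow"))
    print(dub_payload("https://example.com/in.mp3", "es"))
```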
- runway_helpers.py: rewrite upload_file to use two-step presigned URL flow (POST /v1/uploads for uploadUrl+fields, then POST file to that URL)
- Fix seedance2 ratio docs everywhere: API only accepts pixel-based ratios (1280:720, 720:1280, 960:960, etc.), not shorthands (16:9)
- All 14 test scenarios passed: TTV (gen4.5, seedance2), ITV (gen4.5), VTV (gen4_aleph, seedance2), image (gemini_2.5_flash, gen4_image), TTS (Noah, Leslie), SFX, voice isolation, speech-to-speech, dubbing
Made-with: Cursor
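The two-step presigned URL flow described above might look roughly like the sketch below. It is not the PR's `upload_file`: the base URL, the `Authorization` header, and the `filename` request field are assumptions, and only the `/v1/uploads` path and the `uploadUrl` and `fields` response keys come from the commit message. The `post` parameter is a hypothetical injection point so the flow can be exercised without network access; by default it falls back to `requests.post`.

```python
import os
from typing import Callable, Optional

API_BASE = "https://api.dev.runwayml.com"  # assumption: adjust to your environment


def upload_file(path: str, api_key: str, post: Optional[Callable] = None) -> dict:
    """Two-step upload: request a presigned target, then POST the file to it."""
    if post is None:
        import requests  # third-party; imported lazily so tests can inject a fake

        post = requests.post

    # Step 1: ask the API for a presigned upload URL and its form fields.
    resp = post(
        f"{API_BASE}/v1/uploads",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        json={"filename": os.path.basename(path)},  # assumed request body
    )
    resp.raise_for_status()
    info = resp.json()

    # Step 2: POST the file as multipart form data, sending the presigned
    # fields alongside it. No API key is sent to the storage URL.
    with open(path, "rb") as fh:
        stored = post(info["uploadUrl"], data=info["fields"], files={"file": fh})
    stored.raise_for_status()

    # Return the step-1 response so the caller can read whatever reference
    # to the uploaded file the API provides.
    return info
```

Returning the first response unchanged avoids guessing which key holds the uploaded file's reference.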
Update description:
Add generation skills that run directly via `uv run`, with no SDK setup required:
README rebranded with two use cases: