Sora 2 API Tutorial: Generate Videos Programmatically
A developer-focused guide to the Sora 2 API. Learn endpoints, authentication, parameters, code examples, and how to generate, extend, and edit AI videos programmatically.

Build Video Generation Into Your Product
The Sora 2 API turns OpenAI's most advanced video model into a programmable tool. Instead of manually prompting one video at a time, you can generate, extend, edit, and manage AI videos at scale — directly from your codebase.
Whether you're building a content platform, automating ad creative, or adding video generation to an existing SaaS product, the Sora 2 API gives you the building blocks.
This tutorial covers everything a developer needs: authentication, core endpoints, parameter reference, code examples in Python and cURL, the Character API, video extension, the Batch API for production, and common patterns. We'll also explain how VIDEOAI.ME wraps this API for non-technical users — relevant if you're evaluating build vs. buy.
The AI video generation market is growing rapidly. According to Grand View Research, it's projected to reach $2.17 billion by 2032. Developers who understand this API now are building the products that capture that market.
Authentication
The Sora 2 API uses standard OpenAI API authentication: a Bearer token in the Authorization header.
curl https://api.openai.com/v1/videos/generations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
Or with the Python SDK:
from openai import OpenAI
client = OpenAI() # reads OPENAI_API_KEY from env
You need an OpenAI API key with video generation permissions. Video endpoints consume separate usage quotas from text and image generation.
Core Endpoint: Generate Video
The primary endpoint creates a video from a text prompt.
Endpoint
POST /v1/videos/generations
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | sora-2 or sora-2-pro |
| prompt | string | Yes | Scene description (see prompting guide) |
| size | string | Yes | 720x1280, 1280x720, 1080x1920, 1920x1080 |
| seconds | integer | Yes | 4, 8, 12, 16, or 20 |
| character_ids | array | No | Array of character reference IDs |
| image_input | string | No | Base64-encoded image or URL for first-frame anchor |
| n | integer | No | Number of variations to generate (default: 1) |
Resolution Availability
- sora-2: 720x1280, 1280x720
- sora-2-pro: 1080x1920, 1920x1080 (plus all sora-2 resolutions)
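Because an unsupported model/size pairing fails only after the request is sent, it can be worth validating these constraints client-side first. A minimal sketch mirroring the table above (the helper name and structure are my own, not part of the SDK):

```python
# Allowed sizes per model, mirroring the resolution list above.
ALLOWED_SIZES = {
    "sora-2": {"720x1280", "1280x720"},
    "sora-2-pro": {"720x1280", "1280x720", "1080x1920", "1920x1080"},
}

VALID_SECONDS = {4, 8, 12, 16, 20}

def validate_request(model: str, size: str, seconds: int) -> None:
    """Raise ValueError locally instead of burning an API call."""
    if model not in ALLOWED_SIZES:
        raise ValueError(f"unknown model: {model}")
    if size not in ALLOWED_SIZES[model]:
        raise ValueError(f"{size} is not available on {model}")
    if seconds not in VALID_SECONDS:
        raise ValueError(f"seconds must be one of {sorted(VALID_SECONDS)}")

validate_request("sora-2", "720x1280", 12)  # passes silently
```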
cURL Example
curl https://api.openai.com/v1/videos/generations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2",
    "prompt": "A woman in her 30s holds a sleek glass bottle of face serum up to the camera. Soft natural window light. Clean white background. Medium close-up. Warm, airy aesthetic.",
    "size": "720x1280",
    "seconds": 12
  }'
Python Example
from openai import OpenAI
client = OpenAI()
response = client.videos.generations.create(
    model="sora-2",
    prompt=(
        "A woman in her 30s holds a sleek glass bottle "
        "of face serum up to the camera. Soft natural "
        "window light. Clean white background. Medium "
        "close-up. Warm, airy aesthetic."
    ),
    size="720x1280",
    seconds=12,
)
# The response includes a video URL or generation ID
print(response.data[0].url)
Response Structure
The API returns a generation object containing the video URL, metadata, and generation ID (used for extension and editing):
{
  "id": "gen_abc123",
  "object": "video.generation",
  "created": 1711234567,
  "model": "sora-2",
  "data": [
    {
      "url": "https://api.openai.com/v1/files/video_xyz789",
      "generation_id": "gen_abc123",
      "duration": 12,
      "size": "720x1280"
    }
  ]
}
Store the generation_id — you'll need it for extending or editing the video later.
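One lightweight way to keep those IDs around is a small local table. A sketch using stdlib sqlite3 (the schema, filename, and helper are illustrative, not part of any API):

```python
import sqlite3

# Minimal local store for generation records; any database works here.
db = sqlite3.connect("generations.db")
db.execute(
    """CREATE TABLE IF NOT EXISTS generations (
        generation_id TEXT PRIMARY KEY,
        prompt TEXT,
        url TEXT,
        duration INTEGER
    )"""
)

def save_generation(generation_id, prompt, url, duration):
    """Persist one generation so it can be extended or edited later."""
    db.execute(
        "INSERT OR REPLACE INTO generations VALUES (?, ?, ?, ?)",
        (generation_id, prompt, url, duration),
    )
    db.commit()

save_generation(
    "gen_abc123",
    "serum bottle close-up",
    "https://api.openai.com/v1/files/video_xyz789",
    12,
)
```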
Character API: Consistent Characters Across Videos
The Character API is one of Sora 2's most powerful features for production workflows. Create a character reference once, use it in unlimited future generations.
Create a Character
POST /v1/videos/characters
Upload a 2-4 second reference video clip:
import base64
from openai import OpenAI
client = OpenAI()
# Read reference video
with open("reference_clip.mp4", "rb") as f:
    video_data = base64.b64encode(f.read()).decode()

character = client.videos.characters.create(
    reference_video=video_data,
    name="Brand Ambassador Sarah",
)

print(f"Character ID: {character.id}")
# Output: Character ID: char_sarah_abc123
Use a Character in Generation
Pass the character ID into the character_ids array:
response = client.videos.generations.create(
    model="sora-2",
    prompt=(
        "A woman walks through a sunlit farmer's market, "
        "picking up fresh produce and smiling at vendors. "
        "Handheld camera following behind her. Warm, "
        "natural color palette."
    ),
    size="720x1280",
    seconds=16,
    character_ids=["char_sarah_abc123"],
)
The same person from your reference clip appears in this entirely new scene. Different outfit, different location, same face.
Best Practices for Reference Clips
- Length: 2-4 seconds. Longer clips don't improve quality.
- Quality: Clear, well-lit footage with the face visible
- Angle: Front-facing or three-quarter view works best
- Background: Simple backgrounds help the model isolate the character
- Action: Slight head movement is fine; avoid fast motion
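The length rule is easy to enforce before uploading. A sketch that reads a clip's duration with ffprobe (part of the ffmpeg suite, assumed installed) and applies the 2-4 second guideline; the helper names are my own:

```python
import subprocess

def clip_duration_seconds(path: str) -> float:
    """Return a video's duration in seconds using ffprobe."""
    out = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1",
            path,
        ],
        capture_output=True, text=True, check=True,
    )
    return float(out.stdout.strip())

def is_valid_reference_length(duration: float) -> bool:
    """Reference clips should be 2-4 seconds long."""
    return 2.0 <= duration <= 4.0

def check_reference_clip(path: str) -> None:
    duration = clip_duration_seconds(path)
    if not is_valid_reference_length(duration):
        raise ValueError(f"reference clip is {duration:.1f}s; use 2-4s")
```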
Image Input: First-Frame Anchoring
Upload an image to use as the first frame of the generated video. The video will begin from this exact visual and evolve based on your prompt.
import base64
from openai import OpenAI
client = OpenAI()
with open("first_frame.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.videos.generations.create(
    model="sora-2",
    prompt=(
        "The camera slowly zooms out to reveal the full "
        "scene. Soft ambient light increases. The subject "
        "turns to face the camera and smiles."
    ),
    size="720x1280",
    seconds=8,
    image_input=image_data,
)
Important: The image must match the target resolution exactly. A 720x1280 generation requires a 720x1280 input image. Mismatched dimensions will be rejected or cropped.
This feature is invaluable for:
- Starting from a branded template or logo reveal
- Animating a still product photo into a video
- Maintaining exact visual continuity with a previous clip's last frame
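Since a mismatched first frame is rejected, it's cheap to verify dimensions locally before uploading. A stdlib-only sketch for PNG inputs that reads width and height straight from the file's IHDR chunk (helper names are illustrative):

```python
import struct

def png_dimensions(path: str) -> tuple[int, int]:
    """Read width/height from a PNG's IHDR chunk (no image library needed)."""
    with open(path, "rb") as f:
        header = f.read(24)
    if header[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    # Bytes 16-23 of a PNG are the IHDR width and height, big-endian.
    width, height = struct.unpack(">II", header[16:24])
    return width, height

def assert_matches(path: str, size: str) -> None:
    """Fail fast if the image won't match the requested video size."""
    expected = tuple(int(v) for v in size.split("x"))
    actual = png_dimensions(path)
    if actual != expected:
        raise ValueError(f"image is {actual}, expected {expected}")
```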
Video Extension: Build Longer Sequences
The extension endpoint continues a previously generated video, maintaining visual consistency.
POST /v1/videos/extensions
extension = client.videos.extensions.create(
    generation_id="gen_abc123",  # the original video
    prompt=(
        "The woman sets down the serum bottle and picks up "
        "a moisturizer, showing the label to camera. "
        "Same lighting and framing."
    ),
    seconds=8,  # extend by 8 seconds
)
Extension Rules
- Maximum 6 extensions per original video
- Total maximum duration: 120 seconds (original + all extensions)
- Each extension can be 4, 8, 12, 16, or 20 seconds
- The model maintains visual consistency with the original (lighting, characters, environment)
- You can provide a new prompt direction for each extension, creating evolving narratives
This is how you build long-form content with Sora 2. A 20-second base clip extended 5 times at 20 seconds each gives you a 2-minute video — all visually coherent.
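The limits above are easy to check before submitting a multi-extension plan. A pure-Python sketch (the function is my own convenience, not an SDK call):

```python
VALID_SECONDS = {4, 8, 12, 16, 20}
MAX_EXTENSIONS = 6
MAX_TOTAL_SECONDS = 120

def validate_extension_plan(base_seconds: int, extensions: list[int]) -> int:
    """Check a base clip plus planned extensions against the limits above.
    Returns the final total duration in seconds."""
    if len(extensions) > MAX_EXTENSIONS:
        raise ValueError(f"at most {MAX_EXTENSIONS} extensions allowed")
    for s in [base_seconds, *extensions]:
        if s not in VALID_SECONDS:
            raise ValueError(f"{s}s is not a valid segment length")
    total = base_seconds + sum(extensions)
    if total > MAX_TOTAL_SECONDS:
        raise ValueError(f"total {total}s exceeds the {MAX_TOTAL_SECONDS}s cap")
    return total

# The 2-minute example from the text: a 20s base plus five 20s extensions.
validate_extension_plan(20, [20, 20, 20, 20, 20])  # returns 120
```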
Video Editing: Modify Existing Videos
The editing endpoint lets you modify a generated video with new text instructions:
POST /v1/videos/edits
edit = client.videos.edits.create(
    generation_id="gen_abc123",
    prompt="Change the lighting to warm golden sunset tones",
)
Editing is useful for:
- Adjusting color grading after the fact
- Modifying environmental elements
- Iterating on the mood without regenerating from scratch
Batch API: Production at Scale
For production workflows generating dozens or hundreds of videos, the Batch API is essential. Instead of sending individual requests and managing concurrency, you submit a batch of requests and retrieve results asynchronously.
Step 1: Create a JSONL File
Each line is a complete generation request:
{"custom_id": "ad-variant-1", "method": "POST", "url": "/v1/videos/generations", "body": {"model": "sora-2", "prompt": "A young man holding a smartphone, excited expression, speaking to camera. Bright studio lighting.", "size": "720x1280", "seconds": 12}}
{"custom_id": "ad-variant-2", "method": "POST", "url": "/v1/videos/generations", "body": {"model": "sora-2", "prompt": "A woman in her 40s at a kitchen counter, speaking to camera about a cooking app. Warm natural light.", "size": "720x1280", "seconds": 12}}
{"custom_id": "ad-variant-3", "method": "POST", "url": "/v1/videos/generations", "body": {"model": "sora-2", "prompt": "Close-up of hands swiping through a mobile app interface. Clean, modern aesthetic.", "size": "720x1280", "seconds": 8}}
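Rather than hand-writing JSONL, you can generate it from a variant list. A sketch using stdlib json (the variant data here is placeholder content):

```python
import json

# Placeholder variants; in practice these come from your script tooling.
variants = [
    ("ad-variant-1",
     "A young man holding a smartphone, excited expression, "
     "speaking to camera. Bright studio lighting.", 12),
    ("ad-variant-2",
     "A woman in her 40s at a kitchen counter, speaking to "
     "camera about a cooking app. Warm natural light.", 12),
]

with open("video_batch.jsonl", "w") as f:
    for custom_id, prompt, seconds in variants:
        request = {
            "custom_id": custom_id,
            "method": "POST",
            "url": "/v1/videos/generations",
            "body": {
                "model": "sora-2",
                "prompt": prompt,
                "size": "720x1280",
                "seconds": seconds,
            },
        }
        # JSONL: exactly one complete JSON object per line.
        f.write(json.dumps(request) + "\n")
```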
Step 2: Upload and Submit
# Upload the JSONL file
batch_file = client.files.create(
    file=open("video_batch.jsonl", "rb"),
    purpose="batch",
)

# Submit the batch
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/videos/generations",
    completion_window="24h",
)

print(f"Batch ID: {batch.id}")
Step 3: Check Status and Retrieve Results
# Check batch status
status = client.batches.retrieve(batch.id)
print(f"Status: {status.status}")
print(f"Completed: {status.request_counts.completed}")
print(f"Failed: {status.request_counts.failed}")

# When complete, download results
if status.status == "completed":
    results = client.files.content(status.output_file_id)
    # Each line contains the result for one request,
    # matched by custom_id
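Result lines can then be matched back to requests by custom_id. A sketch, assuming each result line wraps the generation object shown earlier under a response/body field (the exact result envelope may differ from this assumption):

```python
import json

def index_results(results_jsonl: str) -> dict:
    """Map custom_id -> first video URL for each completed request.
    Assumes lines shaped like:
      {"custom_id": "...", "response": {"body": {"data": [{"url": ...}]}}}"""
    urls = {}
    for line in results_jsonl.splitlines():
        if not line.strip():
            continue
        result = json.loads(line)
        body = result.get("response", {}).get("body", {})
        items = body.get("data") or []
        if items:
            urls[result["custom_id"]] = items[0].get("url")
    return urls
```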
Batch API Benefits
- Cost savings — batch requests are typically priced lower than individual requests
- No rate limiting — submit hundreds of requests without managing concurrency
- Async processing — submit and check back later, no need to hold connections open
- Failure handling — individual failures don't block the batch; results include per-request status
For teams generating ad variations, this is the production-grade pattern. Create 50 script variants, submit them as a batch, and have all 50 videos ready the next morning.
Common Integration Patterns
Pattern 1: Ad Creative Pipeline
def generate_ad_variants(script_variants, character_id, count=10):
    """Generate multiple ad variants with consistent character."""
    generations = []
    for i, script in enumerate(script_variants[:count]):
        response = client.videos.generations.create(
            model="sora-2",
            prompt=script,
            size="720x1280",
            seconds=16,
            character_ids=[character_id],
        )
        generations.append({
            "variant": i + 1,
            "script": script,
            "video_url": response.data[0].url,
            "generation_id": response.data[0].generation_id,
        })
    return generations
Pattern 2: Long-Form Video Builder
def build_long_video(scenes: list[dict]) -> list[str]:
    """Build a multi-scene video using extension."""
    # Generate the first scene
    first = client.videos.generations.create(
        model="sora-2",
        prompt=scenes[0]["prompt"],
        size=scenes[0].get("size", "1280x720"),
        seconds=scenes[0].get("seconds", 20),
    )
    video_urls = [first.data[0].url]
    gen_id = first.data[0].generation_id

    # Extend for each subsequent scene
    for scene in scenes[1:]:
        ext = client.videos.extensions.create(
            generation_id=gen_id,
            prompt=scene["prompt"],
            seconds=scene.get("seconds", 20),
        )
        video_urls.append(ext.data[0].url)
        gen_id = ext.data[0].generation_id
    return video_urls
Pattern 3: Image-to-Video Product Animation
def animate_product_photo(image_path, animation_prompt, duration=8):
    """Turn a static product photo into a video."""
    with open(image_path, "rb") as f:
        image_data = base64.b64encode(f.read()).decode()
    response = client.videos.generations.create(
        model="sora-2-pro",
        prompt=animation_prompt,
        size="1920x1080",
        seconds=duration,
        image_input=image_data,
    )
    return response.data[0].url
Error Handling
Robust error handling is essential for production integrations:
from openai import OpenAI, APIError, RateLimitError
import time

client = OpenAI()

def generate_with_retry(prompt, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.videos.generations.create(
                model="sora-2",
                prompt=prompt,
                size="720x1280",
                seconds=12,
            )
            return response
        except RateLimitError:
            wait = 2 ** attempt  # exponential backoff
            time.sleep(wait)
        except APIError as e:
            if e.status_code >= 500:
                time.sleep(2 ** attempt)
            else:
                raise  # client errors shouldn't be retried
    raise Exception("Max retries exceeded")
Key error scenarios:
- Rate limiting (429) — back off exponentially and retry
- Server errors (5xx) — retry with backoff
- Invalid parameters (400) — check resolution/model compatibility, prompt length
- Content policy (403) — prompt was flagged; modify and retry
Why Some Developers Choose VIDEOAI.ME Instead
Building a video generation product on the raw API works, but it requires significant engineering investment:
- Queue management — handling generation times, timeouts, and retries
- File storage — storing and serving generated videos
- Character management — building UI for character creation and reuse
- Prompt optimization — developing templates and testing what works
- Billing — metering usage and managing API costs
VIDEOAI.ME handles all of this. For teams that want to use Sora 2's capabilities without building and maintaining the infrastructure, the platform provides:
- A complete video creation interface with AI-powered script writing
- Pre-built AI actor library (no need to create character references manually)
- One-click video generation with optimized prompts
- Built-in video extension and editing
- Export in platform-optimized formats
- Managed billing without per-API-call accounting
For developers evaluating whether to build on the raw API or use a platform, the decision comes down to whether video generation is your core product (build) or a tool your team uses (buy).
Start Building
The Sora 2 API gives developers access to the most capable video generation model available. Whether you're building the next creative tool or automating ad production at scale, the endpoints covered in this tutorial provide every building block you need.
For the full prompting reference, see our beginner's tutorial and best prompts guide.
If you'd rather skip the engineering and start generating videos now, try VIDEOAI.ME free — all the power of Sora 2, no API key required.
Paul Grisel
Paul Grisel is the founder of VIDEOAI.ME, dedicated to empowering creators and entrepreneurs with innovative AI-powered video solutions.
@grsl_fr