How an AI Can Write, Direct, and Film a Short Comedy—All by Itself

The Big Picture

AI can automatically produce 1–2 minute comedy sketches that approach professional quality by running many specialized writer, director, and critic bots that compete and iteratively refine ideas based on viewer engagement data.

The Evidence

Multiple specialized agents (writers, critics, editors, and scene directors) can be arranged into competing populations to brainstorm, evaluate, and refine comedic scripts and their video realizations. Critics trained on thousands of YouTube sketch videos provide human-aligned feedback, and iterative competition across ‘islands’ of scripts leads to stronger, more diverse comedy. The full pipeline generates 1–2 minute sketch videos with coherent dialogue, consistent characters, and shot continuity, at a fraction of typical production cost and time.

Data Highlights

1. Generated sketches are 1–2 minutes long, matching short-form comedy norms.

2. The base configuration runs in roughly one day on a single GPU with an API budget of about $5, far below professional production costs.

3. Critics are aligned using engagement patterns from thousands of YouTube sketch comedy videos to guide what viewers find funny.

What This Means

Engineers building multi-agent systems and creative pipelines can use the approach to coordinate specialized roles (writing, directing, critiquing) and scale creative search. Product and content leaders in studios or platforms can prototype short-form video ideas cheaply and rapidly, using the system to explore diverse comedic styles before investing in live production.

Key Figures

Figure 1: COMIC is an agentic sketch comedy video generator. It takes images, voices, and brief descriptions as input, and automatically generates funny comedy scripts along with corresponding video and audio. Our method flexibly builds stories around multiple characters and custom backgrounds. Each generated comedy is 1–2 minutes long; please watch them at https://susunghong.github.io/COMIC.

Figure 2: Overall agentic flow. Our method is loosely modeled on human production studios, with agentic counterparts for each role, such as writer, critic, and director. The writing and rendering loops allow us to generate scripts and videos with sufficient breadth and depth through island-based competition and iteration, as illustrated in Fig. 4 and Fig. 5, respectively.

Figure 3: Sketch comedy videos featuring various generated situations. See our project page for videos of these results.

Figure 4: Script writing stage. Isolated script populations evolve on separate islands under distinct critic committees sampled from the aligned critic pool. Losing scripts are refined through round-robin pairwise tournaments by each island’s critic committee, driving improvement while supporting aesthetic diversity across islands.
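The island mechanism in Figure 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `round_robin_wins`, `evolve_island`, `prefers_longer`, and `toy_refine` are hypothetical, the critic is a toy function rather than an aligned model, and the choices to decide each pairing by majority vote and to keep the top half unchanged are assumptions made only for illustration.

```python
import itertools
from typing import Callable, List

# Hypothetical critic signature: returns 0 if the first script is preferred, 1 otherwise.
Judge = Callable[[str, str], int]


def round_robin_wins(scripts: List[str], committee: List[Judge]) -> List[int]:
    """Count pairwise wins for every script under one island's critic committee."""
    wins = [0] * len(scripts)
    for i, j in itertools.combinations(range(len(scripts)), 2):
        votes_for_j = sum(judge(scripts[i], scripts[j]) for judge in committee)
        # Assumption: strict majority decides; ties go to the first script.
        winner = j if votes_for_j * 2 > len(committee) else i
        wins[winner] += 1
    return wins


def evolve_island(
    scripts: List[str],
    committee: List[Judge],
    refine: Callable[[str, str], str],
    rounds: int = 1,
) -> List[str]:
    """Run tournaments on one island and rewrite the losing scripts each round."""
    population = list(scripts)
    for _ in range(rounds):
        wins = round_robin_wins(population, committee)
        order = sorted(range(len(population)), key=lambda k: wins[k], reverse=True)
        n_keep = max(1, len(population) // 2)  # assumption: top half survives unchanged
        winners = [population[k] for k in order[:n_keep]]
        losers = [population[k] for k in order[n_keep:]]
        # Losers are revised with the island's current best script as a reference.
        population = winners + [refine(s, winners[0]) for s in losers]
    return population


# Toy stand-ins so the sketch runs without any model calls.
def prefers_longer(a: str, b: str) -> int:
    return 1 if len(b) > len(a) else 0


def toy_refine(loser: str, best: str) -> str:
    return f"{loser} [revised after losing to: {best}]"
```

Running `evolve_island` once per island, each with a different committee, keeps the populations isolated, which is what allows different comedic styles to survive side by side.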

Considerations

The iterative refinement improves quality but still incurs computational and development cost, and results depend on the models plugged into the pipeline. Using YouTube view counts and engagement as a proxy for humor introduces noise from clickbait and platform promotion, which can bias what the system learns to favor. The system does not fully address originality, attribution, or legal risks from training on internet-sourced material, and audio beyond dialogue (like sound effects) is not yet integrated.

Methodology & More

COMIC organizes many specialist agents into an iterative, competitive loop: writers that propose concepts, critic committees that score humor using viewer-aligned signals, editors that revise scripts, and scene directors that break scripts into shots. Scripts evolve on separate “islands” with different critic philosophies; weaker scripts are refined using feedback from winners in round-robin tournaments, which preserves diverse comedic styles (slapstick, dry, surreal) while improving overall quality. After scripts settle, scene directors generate shot-by-shot directions, visual and audio generators produce frames and voice, and rendering critics evaluate and refine realizations through single-elimination tournaments to pick the best video versions.
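The single-elimination step used for rendering can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the paper's code: `single_elimination` and `higher_score` are hypothetical names, the candidates carry a made-up `score` field in place of a rendering critic's judgment, and the refinement that the source says happens between rounds is omitted.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def single_elimination(candidates: List[T], better: Callable[[T, T], T]) -> T:
    """Pair candidates off round by round until a single winner remains."""
    if not candidates:
        raise ValueError("need at least one candidate")
    bracket = list(candidates)
    while len(bracket) > 1:
        next_round = []
        # Compare neighbours two at a time; the preferred one advances.
        for k in range(0, len(bracket) - 1, 2):
            next_round.append(better(bracket[k], bracket[k + 1]))
        # With an odd count, the last candidate gets a bye into the next round.
        if len(bracket) % 2 == 1:
            next_round.append(bracket[-1])
        bracket = next_round
    return bracket[0]


# Toy comparator standing in for a rendering critic (assumption for illustration).
def higher_score(a: dict, b: dict) -> dict:
    return a if a["score"] >= b["score"] else b


renders = [
    {"id": "v1", "score": 0.2},
    {"id": "v2", "score": 0.9},
    {"id": "v3", "score": 0.4},
    {"id": "v4", "score": 0.7},
    {"id": "v5", "score": 0.5},
]
best = single_elimination(renders, higher_score)
```

A bracket over n candidates needs only n - 1 comparisons, compared with n(n - 1)/2 for a round-robin, which is one plausible reason to prefer it when each candidate is an expensive video render.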

Credibility Assessment:

The authors are affiliated with the University of Washington and include Ira Kemelmacher-Shlizerman (h-index ~34) along with other recognized researchers. This is a solid, established group, though the work is an arXiv preprint.
