083: I'm Tim, and I approve this generative message
We’re going to normalize generative versions of ourselves
The promise of video at scale (it will be everywhere! Everyone will create it!) always struck me as false. Have you ever run a content studio?
The most challenging element was never the cameras or the editing software. It has always been the humans, on camera and off, because of simple things like availability, mood, or script changes long after we’ve wrapped filming.
But what if those factors no longer mattered?
Generative AI has changed my mind. (And I can now truly empathize with both sides of the actors’ strike.)
As a quick example: what if you suddenly couldn’t attend that event you’d committed to?
👆🏽 I made this video inside HeyGen, using a script written by Claude.ai, all in about 20 minutes.
Consider the implicit complexity in producing video content of a C-suite leader, a teacher, or a crisis comms pro:
They have to be available on camera
It helps if they’re in the mood to deliver a compelling message
And we’re assuming there’s a script
What about lighting and wardrobe?
[Later, in edit] OMG, they mispronounced the customer’s name
[After first round approvals] Can we make them look less tired?
If you know, you know.
The accessibility of smartphone video and the brilliance of apps like CapCut didn’t solve the core challenges of access or attitude. And that’s why, soon enough, you’re going to have generative video models of all kinds of people in various roles—especially leaders. (And a quick “yes” to some sort of “on chain” proof to govern the provenance, traceability, and authenticity of said content.)
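What that proof might look like is still an open question. One minimal, hypothetical shape: fingerprint the exact file you shipped, record who approved it and when, and anchor that record somewhere tamper-evident (a ledger, a transparency log, or an emerging standard like C2PA). The sketch below is an assumption about the shape of such a record, not any platform’s actual mechanism.

```python
import hashlib
import json
import time

def provenance_manifest(video_path: str, approver: str) -> dict:
    """Hash the finished video and wrap it in a small provenance record.

    Signing this record and anchoring it on a ledger is the hypothetical
    "on chain" step; nothing here is a real platform API.
    """
    sha256 = hashlib.sha256()
    with open(video_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MB chunks
            sha256.update(chunk)
    return {
        "content_sha256": sha256.hexdigest(),  # fingerprint of the exact file shipped
        "approved_by": approver,
        "approved_at": int(time.time()),
        "generator": "generative-avatar",  # label the content as synthetic, on purpose
    }

if __name__ == "__main__":
    print(json.dumps(provenance_manifest("draft.mp4", "Tim"), indent=2))
```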
Imagine—the only skill required to ship a video is writing
We film our talent (leaders, teachers, whomever) reciting words to camera as reference. Two minutes of recording will do.
We’ll upload those to a generative video platform. I use HeyGen. We’ll work through the platform’s tools to fine-tune the pacing, inflection, etc. of our generative self. But once those attributes are set, we’re essentially done with location, wardrobe, makeup, lighting, availability, mood, etc. forever. (Now you can see why the studios and actors care about AI.)
We also record an “I approve” preview.
Then we wait until the need arises.
The need arises! The executive got bogged down. A flight got cancelled. A crisis occurred. You’re having a bad hair day. And suddenly it would be really awesome to have a video, and quickly.
Now the only skill required to make a video is writing.
The script wrangler (you?) uploads words into a generative video platform, tunes the talent’s “performance,” and exports a file. (A rough sketch of this hand-off as an API call follows these steps.)
The video is approved, or edited/exported/edited/exported until approval.
Append the “I approve” preview to the front. (I assembled mine in Premiere, but you can see platforms baking this into their process soon enough.)
Ship the video.
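For the technically inclined, here’s roughly what that hand-off could look like as code rather than clicks. This is a sketch, not a recipe: it assumes HeyGen’s public REST API (a video-generate endpoint plus a status poll), and the avatar_id and voice_id values are placeholders for the generative self you set up earlier. Check the current API docs before trusting any of the field names here.

```python
import time

import requests

API_KEY = "YOUR_HEYGEN_API_KEY"  # from your HeyGen account settings
AVATAR_ID = "your_avatar_id"     # placeholder: the generative "you"
VOICE_ID = "your_voice_id"       # placeholder: the tuned voice

HEADERS = {"X-Api-Key": API_KEY, "Content-Type": "application/json"}

def generate_video(script_text: str) -> str:
    """Submit the script; return the ID HeyGen assigns to the render job."""
    payload = {
        "video_inputs": [{
            "character": {"type": "avatar", "avatar_id": AVATAR_ID},
            "voice": {"type": "text", "voice_id": VOICE_ID, "input_text": script_text},
        }],
        "dimension": {"width": 1280, "height": 720},
    }
    resp = requests.post("https://api.heygen.com/v2/video/generate",
                         json=payload, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()["data"]["video_id"]

def wait_for_video(video_id: str, poll_seconds: int = 15) -> str:
    """Poll until the render completes; return the download URL."""
    while True:
        resp = requests.get("https://api.heygen.com/v1/video_status.get",
                            params={"video_id": video_id}, headers=HEADERS)
        resp.raise_for_status()
        data = resp.json()["data"]
        if data["status"] == "completed":
            return data["video_url"]
        if data["status"] == "failed":
            raise RuntimeError(f"render failed: {data}")
        time.sleep(poll_seconds)

if __name__ == "__main__":
    script = open("script.txt").read()  # the words are the only real input
    url = wait_for_video(generate_video(script))
    with open("draft.mp4", "wb") as f:
        f.write(requests.get(url).content)
```

The same idea ports to any platform with an API. And the point stands: the script is the only input a human has to craft; everything downstream is plumbing.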
I messed around with an earlier approach to all of the above back in February. The technology has improved dramatically since.
Of course, the value will likely always boil down to the writing.
Are the words relevant, sincere, remarkable? Because if they are, the ability to convert them into video pixels that look and sound good enough is already here.
Wait, but…
“We’re not going to fall for this, are we?” you might ask. I think we will. We’re going to learn to accept this approach to content, because the quality of a generative talking head is going to be as good as, if not better than, a traditional filmed approach; especially when it sidesteps the on-camera talent’s inability to deliver a coherent take and smile at the end. “Oh, it’s another one of these.” We’re going to just focus on the message. Which was always the point anyway.
Then the question becomes, what will we do with all this?