154: Removing complexity

Google Vids points to AI enabling more people to easily make more video

Sep 27, 2024

🎭🎬 If you dig long-format podcasts on specific creative topics, i.e. film making, theater and cinematography, I strongly recommend Team Deakins and their latest episode (Apple, Spotify) with actor and director Tim Blake Nelson (O Brother, Where Art Thou?; Dune Part Two; Syriana; Eye of God; etc.). The “team” is acclaimed cinematographer Roger Deakins and his collaborator James Deakins (they helped make 1917; Blade Runner 2049; Fargo; No Country for Old Men; Skyfall; The Shawshank Redemption; etc.). The archive is equally rewarding.

Almost two years ago, as ChatGPT was becoming transcendent, Derek Thompson wrote in The Atlantic (bolding mine),

“A constellation of generative AI start-ups promise to automate an array of tasks we’ve historically considered for humans only: drawing, painting, image editing, audio editing, music writing, video-game designing, and more.”

That quote continues to resonate.

Who is empowered to create continues to spread further with AI, generating fascinating tensions and challenges—opening many doors, while also raising many bars. On the one hand, the sentiment, “Oh, I’m not creative” sounds increasingly hollow. AI technologies assert you will be creative, but likely as an editor or curator. On the other, democratizing technologies threaten ancient perceptions of divine-given talent. There’s an honest sting seeing years of meticulous, hard-won technique reduced to a click. And yet, who doesn’t appreciate so many newly enabling technologies?

How we make video is undergoing profound change

Earlier this week Google released its Cloud 2024 “Gemini at Work” videos. Their approach mirrors Microsoft’s Wave 2 events (covered here) from last week. AI is in everything, unlocking connections, insights, individual potential—and most important, making video creation less and less mysterious.

This is a good thing.

Most marketers are, rightly, not all that intrigued by lens choices, lighting design, production details, wardrobe, aspect ratios, or the lingo of editorial and post production. Let’s be honest: they’re not even that interested in transitional artifacts like scripts and storyboards, for the same reason they’re not attending meetings about ink procurement for the 2025 sales brochures. Most marketers are properly focused on the big picture, and the broad context those myriad production details live within.

But of course the details matter. Which is where so much friction develops when the deliverable is video. Unlike print or static digital where a WYSIWYG concept could be the final deliverable, video typically asks us to consider A) a concept; B) perhaps an evolved script or storyboard, something to elicit “yes, go make that”; C) the process of creating/gathering/curating; and D) the process of editing. So many potential steps. So much industry vernacular. So many headaches.

The evergreen challenge of video has been the complexity of its process.

Granted, my take above refers to “conceptual” video—commercials, brand films and the like. The complexity of those types of video is often kind of the point; the mystique is part and parcel of their magic. But there are so many other kinds of video—product demo, talking head, event recap, on-screen tutorial, to name a few. They are generally straightforward. The knowns are known. And yet—even these types of video remain challenging for most marketers to create.

Which is why “Shot on iPhone” is simultaneously a godsend and a prank. Yes, it’s empowering to have a Hollywood production studio in your pocket. But how many of us have the capabilities of a Roger and James Deakins to take advantage? The same holds true with CapCut, Descript, Canva, Vimeo and Express. Wonderful tools, if you’ve got the wherewithal to learn and hone skills predicated on a century of industry technique.

Now, what if AI removed some of the complexity inherent in these routine, non-conceptual videos?

Let’s be clear: Google Vids does not grant you good taste.

Google Vids does not magically present you with an elusive, career-defining concept. You still have to hire humans (likely using AI) for those.

But Google Vids does deserve credit for demystifying and automating a significant portion of the standard, non-conceptual video-making process, especially the initial structure and editing. As with all things AI, Gemini suggests beginning the process with a prompt. Their example: “Create a recap video of…with takeaways from [document].” Note the capability, similar to Copilot, to extract insights and content from existing materials and fold them into what you’re creating.

And the first thing Gemini does is generate an outline, akin to B) above: “something to elicit ‘yes, go make that.’” This is actually pretty huge: it enables someone to go from a simple prompt to a transitional interface where they organize and edit elements in a sequence. The UX is guiding you to think about narrative structure, even if you don’t know the term.

Next you select a design template, the LLM and video system work their magic, and you arrive at your first view of a draft edit timeline.

The significance here is multimodal: not only has Gemini assembled a sequence of clips, it has also written a draft voiceover (VO) script and filled every image or video slot with royalty-free stock or AI-generated content. Next, you could generate a VO.

Once approved, the script for each clip/section is generated and synced to its related part of the timeline. Then you edit or curate stock images and video, upload additional elements, add or edit sound effects, add music, and either keep or replace what was originally generated. You didn’t have to write or capture content; instead, you serve as editor or curator. Colors and fonts for each template are also updatable.

Finally, you might decide to incorporate additional video from a colleague. Because the edit timeline exists within your broader organizational workspace, you can tag a coworker with a comment at a specific point in the timeline.

A tagged contributor arrives at the appropriate clip with the option to simply record their portion. They could read along with the AI-generated script using their laptop camera. Or not. Clearly you could complicate things if you wanted.

But that’s it. Any contributions are folded into the timeline. If you wanted to edit more, add more, you could. Or share the result with your team.

And as we say about all things AI, this is the worst this tool will ever be.

Is this a film maker’s solution? Of course not. But it could be a marketer’s solution towards templated, brand-safe video content generation—fueled by various AI tooling.

Granted, Google Vids isn’t live just yet. And when it launches you’ll be limited to creating desktop/horizontal videos under 10 minutes in length. But at $12/month per user, what Google is proposing will reduce friction, remove mystery, and enable more people to create in ways they hadn’t before, which is entirely the point of generative AI.

AI+Creativity Update

📆 Google and General Mills are hosting their second Transforming CPG: The Future of AI and Digital Innovation event on October 24 and you’re invited. I’ll be there.

🤖🎙️ OpenAI released an evolved Voice Mode within ChatGPT. We’re going to talk with our AIs more and that’s very new and weird. Nate wrote a thoughtful take on all this. “Talking to an AI feels different than typing—not because the AI is different but because I am.”

🤔 Rishad Tobaccowala offers 10 thoughts on AI, Humans and Work. If you read one additional item, read this. My favorite of his observations: “The key is to think about how to turbo-charge oneself and ones firm versus defending the status quo and fretting about change.”

🤖🎙️ Educator Marc Watkins offers a deep analysis of Google’s NotebookLM and its trending audio overview (podcast) capabilities. He illuminates how Notebook’s advanced retrieval augmented generation (RAG) process helps elevate and contextualize numerous tasks.

✏️ OpenAI CEO Sam Altman wrote about The Intelligence Age. “We are more capable not because of genetic change, but because we benefit from the infrastructure of society being way smarter and more capable than any one of us; in an important sense, society itself is a form of advanced intelligence.”

😳 🤖 Oh, and AI and creativity researchers in Switzerland have determined, “The creative ideas produced by AI chatbots are rated more creative than those created by humans.” Of course it’s much, much more nuanced and complex than that clickbait quote. Maybe read their paper.

Curiosity+Courage
