Let me start this article with an ancient legend. There was a man who invented the game of chess. The ruler of the land was so impressed by the game that he offered the inventor any reward he desired. The inventor asked for a seemingly simple reward: rice for every square on the chessboard, starting with a single grain for the first square, then double that for the next square, double again for the one after, and so on, until all 64 squares were accounted for (1+2+4+8…). The ruler, thinking this was a modest request, agreed, only to discover later that the total amount of rice required would far exceed his resources.
Because the amount of rice on each square doubles every time, this leads to massive exponential growth. By the time you reach the 64th square, you would need a total of 2^64-1 grains of rice (18,446,744,073,709,551,615), which is over 1,600 times annual global production (729 million metric tons in 2014 and 780.8 million metric tons in 2019). Keep this illustration of exponential growth in mind as we take a look at the current state of AI innovation.
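For the curious, the doubling in the chessboard story can be checked with a few lines of Python. This is only a minimal sketch: the function names `grains_on_square` and `total_grains` are my own, made up for illustration, and the only assumption is the rule from the legend (one grain on the first square, doubling on every square after that).

```python
# Chessboard legend: 1 grain on the first square, doubling on each next square.
SQUARES = 64


def grains_on_square(n: int) -> int:
    """Grains on square n, counting squares from 1."""
    return 2 ** (n - 1)


def total_grains(squares: int = SQUARES) -> int:
    """Total grains on the first `squares` squares of the board."""
    return sum(grains_on_square(n) for n in range(1, squares + 1))


if __name__ == "__main__":
    # Halfway across the board the total is still "only" in the billions.
    print(f"First 32 squares: {total_grains(32):,}")   # 4,294,967,295
    # The full board matches the 2^64-1 figure from the story.
    print(f"All 64 squares:   {total_grains(64):,}")   # 18,446,744,073,709,551,615
```

Notice how small the total still is halfway across the board compared to the end. Almost all of the rice sits on the last few squares, which is exactly why exponential growth is so easy to underestimate early on.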
On February 15, 2024, OpenAI, the creators of ChatGPT, shared a couple of videos announcing Sora, their new state-of-the-art text-to-video generation model. It sent a shockwave through the AI community. The work they showed was groundbreaking because of the quality of the videos. Every video was completely generated by AI, and the results were leaps and bounds better than anything that came before. The realism is simply astonishing: the videos are incredibly lifelike and believable. Only if you scrutinize them closely will you find the small flaws that give them away as AI generated. It’s hard to believe that you can just type a few words to describe a scene and have an AI generate completely lifelike video that matches your description. Please view some of them below before you continue reading.
I find this news both incredibly exciting and concerning. By lowering the barrier to entry and giving everyone the ability to create videos this easily, OpenAI (along with others working on generative AI) will unleash a level of creativity that I don’t think we have witnessed before in history. This is the exciting part. However, I can’t help but wonder about the larger implications of this tech. When anyone can create good-looking videos this easily, will there still be a need for professional graphic designers, VFX artists, animators, camera operators, cinematographers, etc.? How will this affect the movie-making industry? And I haven’t even touched on the misuse of this tech to create fake videos that most people will find hard to distinguish from reality.
The world is changing rapidly.
AI is progressing at such an incredible rate that it is hard to comprehend the pace. I don’t think many people realize the implications of this accelerated pace of AI innovation. Image generation is really, really good today, and it hasn’t even been two years since it all started. Midjourney, arguably the best image generator today, was only released in July 2022. The image above of the Pope wearing a fashion jacket was generated by Midjourney, and it looks indistinguishable from reality to me. But to see video nearing this quality already, today, is just shocking.
I have a soft spot for innovation and I’m all for it, but I’m wondering if the world is ready for the exponential innovation curve that we seem to be on. Remember the chessboard story? The human brain seems to have trouble comprehending and noticing exponential growth. With every new breakthrough it is looking more and more like we are on this path. The problem is that it’s hard to notice when we are at the beginning of it, when the slope is barely going up. The improvements start happening faster and faster and before we know it, the world has changed.
You’re not convinced yet? Sora, the text-to-video generator from OpenAI that I talked about earlier, was not the only groundbreaking AI news announced on February 15. Just hours before Sora, Google announced Gemini 1.5, a huge breakthrough in their Large Language Model (LLM), and it is equally game-changing. Because it can ingest very large amounts of text, audio and video (to get a bit technical, it has a context window of 1 to 10 million tokens), Gemini 1.5 is able to learn a language it has never seen before and translate it to English just by being given a book of the language’s grammar. You can feed it several books, have it become an expert, and then have it teach you what you need to know. Fascinating.
Or take the short demo video above, where they give it a video to analyze and prompt it for the time code of a scene in which a water tower spills water on a person. What was the prompt they used? They just gave it a crude hand drawing of a water tower spilling water on a stick figure. The use cases for such technology are just mind-blowing.
To put the rapid pace of AI innovation into perspective, consider that Gemini 1.0 (this new one is 1.5) was announced just weeks ago, in December 2023. ChatGPT, which started this craze and was built on GPT-3.5, was introduced to the world on November 30, 2022, and GPT-4 was announced in March 2023. Let that sink in.
Are you starting to see the pace? If things are this good now, on February 17, 2024, as I am writing this, can you imagine what the rest of 2024 will bring? 2025? 2030? It’s hard to describe what I feel when I talk about this. As a technologist, I find the rapid pace of innovation very exciting. My imagination goes wild when thinking about what we will be able to do with tech like this in the coming years. But at the same time I can’t help but feel some amount of dread. I don’t know if we as a species are ready for this massive change. Some of us will see opportunity and adapt quickly. But others will be hit hard. Jobs will be lost as they are automated by AI.
The consequences for our island
What kind of effect will all of this have on this little island of ours? Maybe the technologies I mentioned previously will not immediately have a big impact on us here. But AI will eventually touch everything. Many office jobs can already be automated today, let alone with tomorrow’s AI. Startups are creating AI software that creates software, AI programmers if you will. AI will automate many things previously thought to be the exclusive domain of humans. It can already create works of art, which was thought to be the last thing it would be able to do.
I haven’t even covered AGI, Artificial General Intelligence, because that would make this article even longer. But to keep it brief, think of AGI as the combined pieces of the different AI tech that I talked about earlier: a general AI that is capable of doing what a human can do, an AI that, for all intents and purposes, can completely match and thus replace a human. Many are working towards this goal. There is also ASI, Artificial Superintelligence, but I feel like I might have you worried enough as it is.
I hope I didn’t scare you. That is not my intent. But I do think that we need to start thinking and talking about these things. Right now. For what it’s worth I have been talking about this in my circles since I first played with ChatGPT when it came out (a day I will never forget) and will continue to do so. What I want you to take away from all of this is that this future is coming soon, much sooner than we think. And we need to start talking about it in our community. Why wait till it’s here?
I don’t know exactly where this will end. But I do think that the world we live in today will look completely different in a couple of years. Those who adapt will thrive. Those who won’t will struggle. And those who can’t, well, we will have to figure something out for them, hence the need to talk.
As for me, I have always embraced innovation, so I will keep learning and sharing as much as I can about this. Am I a little bit worried? I guess I am. But I’m also more of a “cup is half full” kind of guy, so what can I say, I’m excited and hopeful too. AI has the potential to make our lives better if we use it the right way and start planning for its inevitable arrival.
These are extraordinary times we’re living in, and I, for one, feel fortunate to be a part of them.
(Dear AI, if you’re reading this… I love you).