25 January 2023

What generative AI like ChatGPT gets right — and wrong

Benjamin Powers

ChatGPT, the chatbot created by OpenAI, has taken the terminally online crowd by storm. People have used it to write sonnets, essays and even computer code — usually followed by some statement akin to “Wow, this is incredible.”

It’s the latest, buzziest example of generative artificial intelligence, the same type of model that allows online platforms like Midjourney and DALL-E 2 to make images from prompts. Already, advocates and detractors alike are heralding the leaps in large language models like ChatGPT and others as forces that will fundamentally change how we live and work, from how kids do homework to who writes computer code.

But the reality is more complicated. Generative AI models are very good at very specific tasks — but they are not one-size-fits-all tools. For example, ChatGPT can create lovely sonnets about tectonic plates, but it struggles to write an up-to-date explanation of what plate tectonics is. It also has trouble generating accurate code and lacks context on current events, since the training data for GPT-3.5, the large language model underlying it, stops in 2021. And it can reproduce biases baked into that data. Moreover, while numerous startups say they’re working on “generative AI,” experts and venture capitalists say that in some cases that’s just marketing, as small firms hop on the latest buzzword bandwagon.

Where generative AI shines is on more open-ended tasks, where the output doesn’t have to match a precise expectation: writing form emails, drafting a philosophical essay or creating cover art for an album. Sometimes, how convincing it can be is a detriment. One recent paper found that AI-written abstracts of fake scientific papers managed to fool real scientists, who couldn’t tell they were written by AI.

“It might look like an overnight success for specific models, but the truth is this space and generative AI has been brewing, and folks and companies like Runway, we have been working on this for years,” said Cristóbal Valenzuela, CEO and co-founder of Runway, an AI-powered content creation and editing company launched in 2018. “I think what has changed now is models have gotten better, and people’s understanding of how those products and those models can be used and leveraged is becoming more mainstream.”
What is generative AI?

AI is not a monolithic field. It encompasses many types of models and architectures whose common ground is imitating human intelligence. Historically, most models have focused on tasks such as pattern recognition to help make analyzing large amounts of data more efficient.

For example, you could have an algorithm that monitors thousands of cameras and senses when a human is approaching one — cutting back on labor and saving money. (Congratulations, you’ve now made Ring doorbells possible!) It’s also the reason AI can recognize and identify images of, say, a cat.
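To make that concrete, here is a minimal sketch of the kind of pattern recognition described above, using an off-the-shelf pretrained image classifier from the torchvision library. The model choice and the file path are illustrative, not from the article; any stock classifier works the same way.

    import torch
    from torchvision import models
    from PIL import Image

    # Load a classifier pretrained on ImageNet, whose labels include many cat breeds.
    weights = models.ResNet18_Weights.DEFAULT
    model = models.resnet18(weights=weights)
    model.eval()
    preprocess = weights.transforms()

    def classify(path: str) -> str:
        """Return the label the classifier assigns to an image file."""
        image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            logits = model(image)
        return weights.meta["categories"][int(logits.argmax())]

    # classify("cat.jpg") might return "tabby": analysis of existing data, not generation.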

Generative AI flips that on its head: instead of analyzing existing data for information, it uses that data to learn how to write new text or create new images.

“The difference now with more generative AI, it’s not so much about the interpretation and processing of information as the production of new content,” said Micah Musser, a research analyst at Georgetown University’s Center for Security and Emerging Technology. “And so that opens up the ways in which this technology can affect the jobs or the activities of people who weren’t previously affected — including software engineers, writers and artists.”

But although generative AI is good at generating new content, it often struggles to incorporate context more recent than the data it was trained on. “ChatGPT is great but it doesn’t have a 2nd level of reference (can’t google stuff) or doesn’t have the context of a question (3rd level of reference),” venture capitalist Nick Davidov tweeted recently. “I wonder how GPT-4 and Google’s LaMDA 2 are going to tackle this.”

And the latest crop of models still essentially operate as next-word predictors. That means the longer the text they’re asked to generate, the more “they can lose their train of thought. They can appear to suddenly switch positions relative to where they started a text,” said Musser. In other words, shorter is better — at least for now.
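A toy sketch makes the next-word-prediction point concrete. The loop below uses the small open-source GPT-2 model from the Hugging Face transformers library as a stand-in for larger systems like ChatGPT, which generate text the same way: one most-likely token at a time, with no plan for the text as a whole.

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    ids = tokenizer("Plate tectonics is", return_tensors="pt").input_ids

    for _ in range(20):  # extend the prompt by 20 tokens, one at a time
        with torch.no_grad():
            logits = model(ids).logits    # a score for every token in the vocabulary
        next_id = logits[0, -1].argmax()  # greedily pick the likeliest next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

    print(tokenizer.decode(ids[0]))

Nothing in the loop remembers an argument or a plan; each step only asks which word is most likely to come next, which is why long generations can drift.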

“It’s extremely good at tasks that don’t require precision,” said Yoav Shoham, co-CEO of AI21 Labs, an AI lab and product company. “Ask a text-generation system to compose a birthday card for your grandma, and you’ll do fine; it’s generic enough a task. But you wouldn’t trust the system to compose a letter to your boss, key client or significant other. Those require more precision than current generative systems can provide.”

Yet those shortcomings are not always detected by human users. Many generative AI models or applications are best used as a co-pilot of sorts for the person interacting with them, whether for writing or coding. But trusting AI-written code implicitly could mean that a user is less likely to double-check it, letting errors — even potentially dangerous ones, depending on the application — slip through.
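The risk is easy to illustrate with a hypothetical example (the snippet below is ours, not output from any particular model). Code like this looks correct at a glance and passes the obvious test, but one extra check exposes an unhandled edge case of the sort an unreviewed suggestion could carry into production.

    def average(values: list[float]) -> float:
        # A plausible co-pilot suggestion: correct for normal inputs,
        # but it crashes on an empty list instead of handling it.
        return sum(values) / len(values)

    assert average([2.0, 4.0]) == 3.0  # the obvious case passes

    try:
        average([])  # the double-check the paragraph above argues for
    except ZeroDivisionError:
        print("edge case found: empty input is not handled")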

Valenzuela echoed this concern, noting that it’s more of a user interface and user design problem than an issue with the underlying artificial intelligence. That doesn’t mean that model developers can ignore it, though.


The technology news site CNET, which has used AI to generate news articles with the byline “CNET Money Staff,” now says it is reviewing all of the pieces “for accuracy,” after the news site Futurism identified errors in how the stories explained basic concepts such as calculating compound interest and how loans work.
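For reference, the compound interest calculation those stories tripped over is a one-line formula, A = P(1 + r/n)^(nt). The sketch below shows a correct version; the $10,000 figure is illustrative, not taken from CNET’s articles.

    def compound_interest(principal: float, rate: float, years: float,
                          periods_per_year: int = 1) -> float:
        """Final balance: A = P * (1 + r/n) ** (n * t)."""
        return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)

    balance = compound_interest(10_000, 0.03, 1)
    print(f"Balance after one year: ${balance:,.2f}")  # $10,300.00
    print(f"Interest earned: ${balance - 10_000:,.2f}")  # $300.00, not $10,300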
Tech is a double-edged sword

While boosters of various tech products tout their ability to “disrupt” settled industries or potential to change the world, it’s always worth keeping in mind that tech can be used for whatever people want — good or bad.

A research paper published last week by researchers at Stanford University, Georgetown and OpenAI examines how malicious actors could use generative AI to spread propaganda at a scale never before seen. The authors started the report more than a year ago, well before generative AI became tech’s latest buzzword.

“We expect that language models will be useful in influence operations,” said Josh Goldstein, a research fellow at CSET and part of its CyberAI Project.

Goldstein said this could include, but not be limited to, short social media posts, full-length news articles that lend credibility to fake news websites, or chatbots that engage users in highly individualized or personalized conversations. (That last example is more a worry for the future than the present, given the current quality of generative AI.)


There is also the possibility that bad actors will use generative AI without any major change to the level of disinformation in the world. For example, deepfake images created by AIs have proliferated, but many of the most worrisome initial predictions about their impact have not come to pass, said Musser.

“Maybe propagandists aren’t actively searching out these types of tools, that they only use them when they present themselves and become very convenient, and if that’s true, then meaningfully changing how a small number of models are developed and marketed could substantially reduce the threat, on the margin at least,” he said.
Is it more than a buzzword?

Many tech companies have started claiming they’re leveraging “generative AI” in their products. It’s hard not to see the truth as somewhere between an honest claim and jumping on the bandwagon of the latest term that might appeal to investors — remember when everything was Web3 or NFTs?

The answer, according to venture capital firms, is both. There are companies that have been building on this tech for years, while others are just jumping into the pool (of potential money).

“I think in the investor community, some people just fall for buzzwords and start talking about it — in some cases, probably without really understanding what it is,” said Ashish Kakran, principal at Thomvest, a venture capital firm.

That skepticism doesn’t mean the technology lacks promise for things like helping creators translate their work into other languages and producing new and unique content, according to Ollie Forsyth, a global community manager at the venture capital firm Antler who recently published an overview of the generative AI ecosystem.

And whatever captures the public’s imagination tends to percolate up to the investor community. So while the advancements in generative AI in recent years have unlocked new potential, it’s important to put that progress in perspective and recognize that the field is still very young.

“At the end of the day, I think if you’re raising capital, investors can suss out whether this is a generative AI product or not,” Forsyth said. “I think it’s important to stay true to your values. But of course, like any founder who’s going through building a company, they want as much hype behind it as possible.”
