17 September 2018

The world's most prolific writer is a Chinese algorithm

By Douglas Heaven 

Load up the homepage for e-commerce giant Alibaba – a wholesale shopping site that’s more or less China’s answer to eBay – and you’ll find images and descriptions of anything you could wish to buy, from kitchen sinks to luxury yachts. Every item has a short headline, but most are little more than lists of keywords: hand-picked search terms to ensure this USB phone charger or that pair of flame-resistant overalls floats to the top in a sea of thousands upon thousands of similar items. It sounds simple, but there’s an art to this copywriting. Yet Alibaba recently revealed that it is training an artificial intelligence to generate these item descriptions automatically – and it’s not the only one. Over the last few decades AIs have been taught to compose music, paint pictures and write (bad) poems. Now they’re writing advertising copy, 20,000 lines of it a second.

“Generative bots are the new chatbot,” says Jun Wang at University College London. “Generating copy is just one of the applications that can be done.”

Launched by Alibaba’s digital marketing arm, Alimama, the AI copywriter applies deep learning and natural-language processing tech to millions of item descriptions on Alibaba’s Tmall and Taobao sites to generate new copy of its own.

“The tool removes the inconvenience of having to spend hours seeking design inspiration by looking at competitor listings and manufacturer sites,” says an Alibaba spokesperson. “The user can create their ideal copy with just a couple of clicks.”


Alibaba recently announced that it is using AI to generate listings copy (Credit: Alamy)

Despite their forays into the world of art, creating unexciting text such as ad copy is where generative systems will have the biggest impact in the short term. Software will produce millions of words and images that millions of people will see – and be influenced by – every day. And if they do the job well enough, we will never even notice the difference.

The line between human and machine agency is already blurred online. Twitter bots sow the seeds of misinformation, spambots generate oddly poetic emails about Viagra, and automatic aggregators find and republish online news articles so quickly it can be hard to tell who first published what and when.

Take the news about Alibaba’s copywriter. The English version of the press release was picked up by several news outlets, mainly in the UK, the US and India. But among those first reports was a video on an obscure YouTube channel called “Breaking News”. A synthesised voice reads out the news story, with subtitles appearing over a series of stock images related to Alibaba and ecommerce. Buried in the video’s description is a link to the text’s source: an article published an hour or so before by International Business Times, a website based in India.

The speed and weird sloppiness with which the original story was repurposed – the subheadings are copied over as if part of the main text – strongly suggest the video was automatically generated. As does the fact that, apart from the Alibaba video, the channel seems to post nothing but news reports about international football, also all republished from other sources.

Someone may be picking which stories to republish, but no obvious human activity is visible on the channel or the associated Twitter account. So we have news about one AI churned out by another. Welcome to the future: at once weird and mundane. 

“It's not science fiction,” says Wang. He thinks advertising is a perfect fit for generative AI because it has a clear goal. “You want to maximise the number of people that click and then buy,” he says. “We’re not talking about generating art.” 


Alibaba says it can generate 20,000 lines of copy a second (Credit: Alamy)

According to Alibaba, using its tool is simple. You provide a link to the item you want a description of and click a button. “This brings up numerous copy ideas and options,” says the Alibaba spokesperson. “The user can then alter everything from the length to tone, as they see fit.”

The tool is also prolific. Alibaba claims that it can produce 20,000 lines of copy a second and that it is being used nearly a million times a day by companies – including US clothing brand Dickies – that want to create multiple versions of advertisements that still grab our attention when presented in different-sized slots on webpages.

And it’s not just Alibaba. The company’s main rival JD.com says it is also using software – which it calls an “AI writing robot” – to generate item descriptions. According to tech website ZDNet, JD.com’s system can produce more than 1,000 “pieces of content” a day and has a flair for flowery language, describing wedding rings as symbolising “holy matrimony drops from the sky”.

Mark Riedl at the Georgia Institute of Technology is sceptical that such tools are as good as the PR implies, however. Even if we ignore the claim made in Alibaba’s press release that its AI copywriter can ace the Turing Test – in which an AI must pass itself off as human – there are questions about the approach. For one thing, we do not know how good these systems are at achieving their clear goal of making people click and then buy – a process known as conversion.

Learning how to produce text that describes an object is definitely the kind of thing generative systems have become good at, says Riedl. “You can take an image or a few keywords and produce something that looks like a product description.” An AI can recognise an image of a camera, say, look up what it knows about that object and string together a short description that looks like it could have been written by a human.


Alibaba's rival JD.com says it is also using AI to generate descriptions of its products (Credit: Alamy)

But that’s only half the job. “Copywriting is really about the long tail,” says Riedl. To convert clicks into sales, especially when competition for attention online is so fierce, you need to address the specific concerns and interests of a particular – possibly niche – audience.

“You don’t just want to say this is a camera and it has these features, you want to say why someone should buy it or why this camera will solve problems that other cameras won’t,” says Riedl. “That requires a lot more context. You’re going to want to know a lot more information about your product and a lot more information about your audience.”

The problem with a machine-learning approach like that used by Alibaba and JD.com is that the generative system will tend to learn the most average way of saying things. “AI is really good at generic formats but the more you want to specialise or customise it becomes a much, much harder problem,” says Riedl. “I don’t think we’re there yet.”

Perhaps not, but it is where we’re heading.

To understand why, you first need to delve into the ways that advertising is already personalised and adapted to your habits. Wang, for instance, is co-founder of a company called MediaGamma, which uses reinforcement learning – a type of machine learning that also powers DeepMind’s Go-winning software AlphaGo – to help advertisers buy ad space. When you visit a website that has adverts, the ads that you see will probably be different to the ones someone else sees. That’s because the ads have been selected especially for you.

As soon as you load a webpage, the page lets the internet's ad-brokers know who is visiting and a high-speed bidding war kicks off, typically involving around 100 advertisers, with the winners getting to show you their ads. The whole process is over in 100 milliseconds, faster than the blink of an eye.
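To make the mechanics concrete, here is a minimal sketch of one such auction for a single ad slot. It is an illustration only: the advertiser names and bid amounts are invented, and the second-price rule (the winner pays the runner-up's bid) is an assumption, since the pricing rule used by any particular ad exchange is not described here.

```python
from typing import Dict, Tuple


def run_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winner, price) for one ad slot.

    Assumption: a sealed-bid, second-price rule. Real exchanges may
    use different rules; this is a sketch, not a description of one.
    """
    if not bids:
        raise ValueError("no bids submitted")
    # Rank advertisers from highest to lowest bid.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]
    # The winner pays the runner-up's bid; a lone bidder pays its own bid.
    price = ranked[1][1] if len(ranked) > 1 else top_bid
    return winner, price


# Hypothetical bids for one visitor's page load.
bids = {"shoe_store": 2.40, "camera_shop": 1.75, "travel_site": 0.90}
winner, price = run_auction(bids)
print(winner, price)
```

In a real exchange the same decision would be made among around 100 bidders and would have to finish within the 100-millisecond window.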

MediaGamma’s AI helps advertisers make smarter bids in this automatic auction. How much is this person’s attention worth? “We don’t necessarily know who you are but we know your activity online,” says Wang.

The three biggest ad networks – AdSense, AdMob and DoubleClick – are run by Google. And there are few places on the internet where Google cannot track you. Its trackers operate on around 75% of the million most-popular websites. And if Google can’t see you, Facebook – which has trackers on 25% of those sites – probably can.

Those trackers record what we search for, what websites we visit and how long we spend on them. Say someone is interested in shoes and is known to have bought a particular type of shoe from a particular store. MediaGamma’s AI learns to categorise internet users based on all of the information it has available. If your online activity is similar to the person who previously bought footwear then it will advise a shoe advertiser to bid.

“It is highly probable that you will convert,” says Wang. “We also estimate, for this type of user in this type of market, how much to bid in order to win.”
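The idea of matching a visitor to a known buyer can be sketched very simply. The example below is not MediaGamma's method, which uses reinforcement learning; it swaps in a plain cosine-similarity comparison purely for illustration. The feature layout, the `should_bid` helper and the 0.8 threshold are all hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two activity vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_bid(visitor: Sequence[float], known_buyer: Sequence[float],
               threshold: float = 0.8) -> bool:
    """Advise a bid when the visitor's activity resembles a past buyer's.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return cosine_similarity(visitor, known_buyer) >= threshold


# Hypothetical features: [shoe searches, shoe-store visits, minutes on shoe pages]
known_buyer = [5.0, 3.0, 12.0]
similar_visitor = [4.0, 2.0, 10.0]
unrelated_visitor = [0.0, 1.0, 0.0]

print(should_bid(similar_visitor, known_buyer))
print(should_bid(unrelated_visitor, known_buyer))
```

A production system would also estimate how much to bid, as Wang describes, rather than only whether to bid.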


Chinese news agency Xinhua says it has already started using AI to generate short bulletins (Credit: Alamy)

But this is just the start. Last month MediaGamma won a grant from the UK government’s ‘innovation agency’ to develop more advanced AI that can generate text and images for targeted ads. This would effectively involve splicing together something like Alibaba’s copywriter with MediaGamma’s current user-analysis tech. Instead of using your online activity to decide which existing ad you should see, that information could soon be used to generate a bespoke ad on the fly. “We could have a banner ad specifically tailored to a person’s tastes,” says Wang.

Or in Alibaba's hands, such a system could generate a bespoke item description, designed to appeal to your individual preferences, based on what it knows about your purchasing habits. This would be a direct pitch to the long tail that Riedl describes.

These AI systems are getting smarter, but are they getting more creative? Here’s a famous six-word story often attributed to Ernest Hemingway: “For sale: baby shoes, never worn.” It’s ad copy, similar to the descriptions churned out by Alibaba’s AI. But the emotional resonance of those words comes from a deep understanding of human life that machines do not have. Even if they produced those words, we would not react to them in the same way.

At least not yet. Riedl is interested in giving AI a form of narrative intelligence, the ability to construct and understand stories more like people do. One of his experimental systems, called Scheherazade, generates short narratives based on crowd-sourced information about common human activities, such as a trip to the cinema. Here’s an extract of one such story:

With sweaty palms and heart racing, John drove to Sally’s house for their first date. Sally, her pretty white dress flowing in the wind, carefully entered John’s car. John and Sally drove to the movie theatre. John and Sally parked the car in the parking lot. Wanting to feel prepared, John had already bought tickets to the movie in advance. A pale-faced usher stood before the door; John showed the tickets and the couple entered. Sally was thirsty so John hurried to buy drinks before the movie started. John and Sally found two good seats near the back. John sat down and raised the arm rest so that he and Sally could snuggle. John paid more attention to Sally while the movie rolled and nervously sipped his drink. Finally working up the courage to do so, John extended his arm to embrace Sally. He was relieved and ecstatic to feel her move closer to him in response. Sally stood up to use the restroom during the movie, smiling coyly at John before that exit.

It’s not quite Hemingway but narrative generation is a growing area of AI. For Riedl, narrative intelligence will help AIs understand the world more like we do – we often tell stories to make sense of things. Such understanding could make AIs that we interact with, such as Siri, appear less alien.

As well as telling stories and becoming better salespeople, more creative AIs could also be used to generate customised campaign emails or social media posts for political candidates. We are also seeing the first AI copywriters generating short news bulletins. China’s Xinhua news agency recently announced it would begin using software to write some of its newswire reports. This has raised concerns, since Xinhua is viewed by many as a propaganda machine for the Chinese government. If you have hundreds of bots pushing out a particular version of a narrative, it could be hard to counter and could have big implications for news bias.

Yet this is the direction we’re heading. We’re seeing more and more businesses, political campaigns and consulting firms starting to use AI to help with their communications. We should try to spot it when we can. But that’s easier said than done. We may find we are aware of these machine-generated messages about as much as we are of the online ad auctions running every time a website opens. “Nobody notices,” says Wang.
