Imagine a collection of books – maybe millions, maybe even billions – thrown haphazardly by publishers into a giant pile in a field. Every day the pile grows.
These books are full of knowledge and answers. But how would a seeker ever find them? Without organization, the books are useless.
This is the raw internet in all its unfiltered glory. That’s why most of our quests for “enlightenment” online start with Google (and yes, there are other search engines as well). Google’s algorithmic tentacles crawl and index every book in this unholy pile. When someone types a query into the search bar, the search algorithm combs through its indexed version of the internet, retrieves pages, and presents them in a ranked list of top results.
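That index-retrieve-rank pipeline can be sketched in a few lines. The toy corpus and the term-overlap scoring below are purely illustrative (real engines use far more sophisticated signals, such as BM25 and hundreds of learned features), but the three stages are the same ones the article describes:

```python
from collections import defaultdict

# A tiny, hypothetical corpus standing in for the web.
docs = {
    1: "red wine health benefits and risks",
    2: "how to bake bread at home",
    3: "moderate wine may benefit heart health",
}

# Indexing: map each term to the set of documents containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query):
    terms = query.split()
    # Retrieval: gather every document matching any query term.
    candidates = set().union(*(index.get(t, set()) for t in terms))
    # Ranking: score candidates by how many query terms they contain.
    return sorted(
        candidates,
        key=lambda d: sum(t in docs[d].split() for t in terms),
        reverse=True,
    )

print(search("red wine health"))  # → [1, 3]
```

The proposal discussed below would collapse all three stages into a single model that generates the answer directly, rather than returning a ranked list of documents.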
This approach is incredibly useful. So useful, in fact, that it hasn’t fundamentally changed in over two decades. But now AI researchers at Google, the very company that set the bar for search engines in the first place, are sketching out a blueprint for what might come next.
In one paper on the arXiv preprint server, the team suggests that the technology to make the internet even more searchable is at hand. They say large language models — machine learning algorithms like OpenAI’s GPT-3 — could completely replace the current system of indexing, retrieving, then ranking.
Is AI the search engine of the future?
When looking for information, most people would rather ask an expert and get a nuanced, trustworthy answer, the authors write. Instead, they google it. Sometimes that works; sometimes it doesn’t. Like when you get sucked down a panicked, health-related rabbit hole at two in the morning.
Although search engines present sources (hopefully of good quality) that contain at least elements of an answer, it is up to the searcher to scan, filter, and read the results to piece together that answer as best they can.
Search results have improved by leaps and bounds over the years. Yet the approach is far from perfect.
There are question-and-answer tools, like Alexa, Siri and Google Assistant. But these tools are fragile, with a limited (though growing) repertoire of questions they can answer. Although they have their own shortcomings (more on those below), large language models like GPT-3 are much more flexible and can construct new natural language responses to any query or prompt.
The Google team suggests that the next generation of search engines could synthesize the best of all worlds, integrating today’s best information retrieval systems into large-scale AI.
It should be noted that machine learning already plays a role in today’s index-retrieve-rank search engines. But instead of merely augmenting the system, the authors propose that machine learning could replace it entirely.
“What would happen if we got rid of the notion of an index completely and replaced it with a large pre-trained model that efficiently and effectively encodes all the information in the corpus?” Donald Metzler and his co-authors write in the article. “What if the distinction between retrieval and ranking disappeared and there was instead a single response generation phase?”
An ideal outcome they envision is much like the Starship Enterprise’s computer in Star Trek. Information seekers ask questions, and the system responds conversationally, with the kind of natural language answer you would expect from an expert, including authoritative citations in its response.
In the article, the authors outline what they call an ambitious example of what this approach might look like in practice. A user asks, “What are the health benefits of red wine?” The system returns a nuanced answer in plain prose from multiple authoritative sources — in this case WebMD and the Mayo Clinic — outlining the potential benefits and risks of drinking red wine.
It doesn’t have to stop there, however. The authors note that another advantage of large language models is their ability to learn many tasks from just a handful of examples (this is called one-shot or few-shot learning). Thus, they may be able to perform all the tasks that current search engines perform, and dozens more as well.
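Few-shot learning works by placing a handful of worked examples directly in the model’s prompt, so the model infers the task from the pattern rather than being retrained. A minimal sketch of how such a prompt might be assembled (the example questions, answers, and helper function here are hypothetical, not taken from the paper):

```python
def few_shot_prompt(examples, query):
    """Assemble a few-shot prompt from (question, answer) pairs.

    The model is expected to continue the pattern and fill in
    the answer for the final, unanswered question.
    """
    blocks = [f"Q: {question}\nA: {answer}" for question, answer in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)

examples = [
    ("What are the health benefits of green tea?",
     "Green tea contains antioxidants that may support heart health."),
    ("Is running good for you?",
     "Regular running can improve cardiovascular fitness."),
]

prompt = few_shot_prompt(examples, "What are the health benefits of red wine?")
print(prompt)
```

The same model, with different examples in the prompt, could in principle answer questions, summarize pages, or translate text — which is what makes the approach so flexible compared with single-purpose assistants.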
Just another vision
Today, that vision is out of reach. Current large language models are what the authors call “dilettantes.”
Algorithms like GPT-3 can produce prose that is, at times, almost indistinguishable from passages written by humans, but they are also prone to nonsensical responses. Worse still, they indiscriminately reflect the biases embedded in their training data, have no sense of contextual understanding, and cannot cite sources (or even separate high-quality from low-quality sources) to justify their answers.
“They are perceived to know a lot, but their knowledge is skin deep,” the authors write. The paper also outlines the breakthroughs needed to close the gap. Indeed, many of the challenges they describe apply to the field as a whole.
A key advance would be moving beyond algorithms that only model relationships between terms (such as individual words) to algorithms that also model relationships between terms and the documents that contain them — and, beyond that, relationships among the many different items on the internet.
Researchers also need to define what constitutes a quality response. This is not an easy task in itself. But, to begin with, the authors suggest that high-quality answers should be authoritative, transparent, unbiased, accessible, and contain diverse perspectives.
Even today’s most advanced algorithms don’t come close to meeting this bar. And it would be unwise to deploy natural language models at this scale until these challenges are resolved. But if they are (and work is already under way on some of them), search engines would not be the only applications to benefit.
‘Earl Grey, Hot’
It’s a tantalizing vision. Scouring web pages for answers while trying to figure out what’s trustworthy and what’s not can be exhausting.
Undoubtedly, many of us are not doing the job as well as we could or should.
But it’s also worth speculating about how an accessible internet like this would change the way people contribute to it.
If we primarily consume information by reading algorithmically synthesized prose responses, instead of opening and reading the individual pages themselves, would creators publish as much work? And how would Google and other search engine makers compensate the creators who, in essence, themselves produce the information that drives the algorithms?
There would still be plenty of people reading the news, and in those cases search algorithms would still have to provide lists of stories. But I wonder if a subtle shift might occur, where smaller creators contribute less, and in doing so the web becomes less information-rich, weakening the very algorithms that rely on that information.
There’s no way to know. Speculation is often rooted in the problems of the moment and proves naive in hindsight. In the meantime, the work will undoubtedly continue.
Perhaps we will solve these challenges – and others as they arise – and, in doing so, arrive at the scholarly and pleasantly talkative Star Trek computer we have long imagined.