The Future of Search

Today, to search is to google. Specifically, it is to use Google’s search engine to find something on the Web. As for those other searches that once helped define the human condition—for meaning, love, purpose, or God—those have, in little more than a decade, assumed almost secondary importance.

From its now almost apocryphal beginnings at Stanford in 1998, Google was described by its cofounders Larry Page and Sergey Brin as a technology designed to “organize the world’s information.” In an early press release, Brin declared that “a perfect search engine will process and understand all the information in the world.” In its first decade, Google focused on the former undertaking—organizing information on a global scale—by trying to map the World Wide Web, essentially an ever-expanding and highly fragile set of documents connected by hyperlinks. Google’s search engine helped people navigate the Web by tracing the links among webpages. Google’s search engineers thought of the Web as a medium of documents. Accordingly, the search engine they designed was document centric, keyword based, and highly contextual. Search results were always embedded in particular texts—documents that, once you clicked on one, framed information in a particular way.

Google’s first generation of search technology captured an order intrinsic to the Web itself. In their original paper outlining the “anatomy of a large-scale hypertextual Web search engine,” Page and Brin explained that they had started from the insight that, as John Battelle put it in Wired in 2005, the Web “was loosely based on the premise of citation—after all, what is a link but a citation?” The original aim of Google, then, was to trace all of these links among pages, not only the outgoing links from individual pages but also the incoming links. The goal was a more complete model of the citational structure of the Web. And the groundbreaking technology Page and Brin devised was PageRank—a proprietary algorithm that modeled the links that constituted the Web.

What distinguished Google from other search engines, including the early Yahoo product, was that it did not simply collect citations. The PageRank algorithm took the citational logic a step further by differentiating among pages, determining the value of a page according to the number and quality of links to it. A page with more links, or with links from other highly ranked pages, would have a higher value, because PageRank recognized it as more important. The PageRank value of a website was basically a function of a page’s degree of connectedness.
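The ranking logic described above can be made concrete with a small sketch. What follows is a simplified, textbook-style iteration of the PageRank idea, not Google's production algorithm. The four-page link graph, the function name pagerank, the damping factor of 0.85, and the iteration count are all illustrative assumptions rather than details taken from the essay.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict mapping each page to an approximate PageRank score.

    links maps a page name to the list of pages it links to.
    The damping factor and iteration count are assumed defaults.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on in equal shares to the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "hub" is cited by every other page, "orphan" by none.
toy_web = {
    "hub": ["news", "blog"],
    "news": ["hub"],
    "blog": ["hub", "news"],
    "orphan": ["hub"],
}

scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page with the most incoming links ends up with the highest score, and the page nobody links to ends up with the lowest, whatever either page actually says. The score reflects connectedness only.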

As Google became the most important means of searching for information online, the organizational logic of PageRank virtually extinguished those websites with few incoming links or with links from similarly unpopular pages. “An unindexed Internet site,” as technology writer James Gleick put it in The Information (2011), “is in the same limbo as a misshelved library book.”

Furthermore, Google’s PageRank technology makes no claims about the internal content of the pages it tracks. It makes no claims about that content’s truth. The value or worth that PageRank measures is the importance of a website as determined by other websites. PageRank measures how well connected the New York Times website is—its popularity, not the accuracy of its information. In fact, some gossip websites have far higher PageRank scores than many other more accurate sites. PageRank levels the standards of legitimacy so that traditional notions of epistemic authority—expertise, cultural and social capital, scholarly peer review—have little place in its calculations.

For some, such as Michael Lynch, a professor of philosophy at the University of Connecticut, Google PageRank represents everything that is wrong with knowledge in the digital age. It is central to what he calls in The Internet of Us “Google-knowing,” not just the way we use Google’s search engine but the way “we are increasingly dependent on knowing” by means of it and other digital technologies. Although Lynch acknowledges the ample benefits of such technologies, he worries that our increasing reliance on them will ultimately “undermine” and weaken other ways of knowing. He is concerned in particular about how “Google-knowing” impedes ways of knowing that require “taking responsibility for our own beliefs” and understanding how “information fits together.” Information technologies, claims Lynch, may be expanding our “ability to know in one way” but they are impeding our ability “to know in other, more complex ways.” In a more nuanced but familiar tone, Lynch argues that Google is (still) making us stupid.

The problem with “Google-knowing,” for Lynch, is that it is so unstable, so unreliable, so contingent. Relying on the popularity of a website—PageRank’s basic method—is a poor path to knowledge. By contrast, Lynch says, real knowledge requires that we have some firm basis of “trust.” Justified true belief requires reliable and stable sources, experiences, or reasons, or a grasp of the “big picture,” which he calls understanding. Without trust, “Google-knowing” isn’t really knowledge. It’s just opinion.

But why, as Socrates asks Meno in Lynch’s helpful retelling, does knowledge matter more than “true opinion”? Why does it matter to someone asking me for directions to Charlottesville’s downtown mall if I give them directions based on a guess that happens to be right—“go down 5th Street and take a right at West Main, I don’t know, but I think that’s right”—as opposed to giving those same directions based on my own repeated experiences of having driven down 5th Street and taken a right innumerable times? Knowledge, as Socrates says to Meno, “should ever be more prized than right opinion.” Even if opinions happen to be true, they are unreliable. Right opinions are like the statues of Daedalus, the ancient Greek craftsman who fashioned sculptures that, as legend had it, could move. Like the statues, right opinions, “if they are not fastened up,” Socrates says, “play truant and run away; but, if fastened, they stay where they are.” Knowledge is right opinion that is grounded.

Grounded or “real” knowledge requires a basis of trust, reason for justification. And as Lynch argues in this lucid and engaging account, “Google-knowing” may be fast, easy, and efficient, but it is not necessarily reliable in the Socratic sense of knowledge as opposed to right opinion.

Search engineers themselves, at Google and elsewhere, were never fully satisfied with Google Search 1.0. It was too “document-centric.” From the company’s earliest days, Brin made it clear that he wanted Google ultimately to make a search engine that would not merely process and organize all of the world’s information, but would “understand” it. He always wanted Google to capture more than popularity. Like Lynch, he wanted “real knowledge.”

In recent years, Google engineers have started to imagine and even openly predict a new generation of search engines that will do just that. As one of these engineers blogged in 2012, Google would in the near future be transformed from an “information engine” to a “knowledge engine.” In a 2013 interview with the Wall Street Journal, computer scientist and singularity guru Ray Kurzweil, hired in 2012 to help bring “natural language understanding to Google,” predicted the introduction of search technologies that would “know at a semantically deep level” what users wanted without their even asking. Kurzweil promised a search technology that would liberate data from documents and use that data to create knowledge.

In a research paper published last spring, a team of Google engineers proposed and modeled a new search method that would rely not on “exogenous signals” (links) but on “endogenous” ones (facts). The authors’ objective was to evaluate websites on the basis of the “correctness of [the sites’] factual information” by designing an algorithmic method of extracting facts and evaluating the accuracy of websites’ facts. (One example of a subject for their proposed search method was the mountain of gossip that sustained the trumped-up controversy over Barack Obama’s nationality.) This algorithm would, its creators said, yield a trustworthiness score (Knowledge-Based-Trust, or KBT, in Google-talk) in which trustworthiness would be quantified as the probability that a web source contained the correct “value for a fact.”
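To see what a score of this kind amounts to, consider a deliberately crude sketch. It is not the engineers' method. It simply assumes that facts have already been extracted as (subject, attribute) pairs with values, and that a trusted reference set exists to check them against, which are the hard problems the proposal had to solve. The function name knowledge_based_trust, the reference set, and the sample site are all hypothetical.

```python
def knowledge_based_trust(source_facts, reference_facts):
    """Return the fraction of a source's checkable facts that match the reference.

    Both arguments map a (subject, attribute) pair to a value.
    Returns None when the source states nothing the reference can check.
    """
    # Only facts that also appear in the reference set can be judged.
    checkable = [
        (key, value) for key, value in source_facts.items() if key in reference_facts
    ]
    if not checkable:
        return None
    correct = sum(1 for key, value in checkable if reference_facts[key] == value)
    return correct / len(checkable)


# Hypothetical reference set standing in for a curated knowledge base.
reference = {
    ("Barack Obama", "nationality"): "USA",
    ("Stanford University", "state"): "California",
}

# Hypothetical gossip site: one wrong fact, one right fact, one uncheckable claim.
gossip_site = {
    ("Barack Obama", "nationality"): "Kenya",
    ("Stanford University", "state"): "California",
    ("Barack Obama", "favorite color"): "blue",
}

print(knowledge_based_trust(gossip_site, reference))  # prints 0.5
```

Even this toy version makes the epistemological stakes visible: the score is only as good as the reference set that defines what counts as the correct value, and claims that fall outside that set are simply not scored.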

The project raises many questions, including important epistemological ones: What is a fact, and how do you determine one? What kind of knowledge is fact-based knowledge? But perhaps most telling is Google’s interest in trustworthiness in the first place. In its shift from mapping and processing hyperlinks to “recreating intelligence” and anticipating what humans want, Google seeks, as technology writer and social critic Nicholas Carr put it on his blog in 2013, “to read not just the web but us.” The perfect search engine won’t model the Web; it will model us.

At its most ambitious, Google wants to overcome the Web’s relationship to the history of media—documents, citations, footnotes, texts. It wants, like Socrates, “real knowledge,” not merely “right opinion.” In a perhaps ironic historical twist, Google Search 2.0 would mean the end of Lynch’s “Google-knowing.” Google engineers want what Lynch does: a way of knowing that is less “vulnerable to the manipulations and desires of others,” a way of knowing that is more stable and reliable. The key difference, of course, is that whereas Lynch views knowing as fundamentally personal (a process that happens solely in our minds), Google considers knowing to be something tied to tools that extend well beyond our minds. Google’s algorithms are only the latest version in a long history of knowledge technologies.

But what would Google’s engineers’ pursuit of “real knowledge” mean for search? The term search has long had connotations of an open-ended exploration that propels readers beyond themselves and their settled desires to unknown realms that hold out the promise of intellectual serendipity—of finding a book, an idea, or something else that surprises. We don’t always know what we want. Google Search 1.0 placed more emphasis on simply organizing and collating information about information. It never purported to identify facts, adjudicate truth claims, or provide stable knowledge. And it never told us what we really wanted. It was profoundly uncritical and contingent, but it was based on a legitimate premise: Search technologies facilitate but do not replace the messy, context-bound, all-too-human creation of knowledge. In their efforts to make search less about navigating the seas of information and more about understanding human desires, Google’s engineers and Search 2.0 could well render us more passively dependent on search technologies than we already are. And search technologies are never abstract, free-floating tools. They are designed and owned by particular interests with certain ends in mind. Google knows what it wants. And now it plans to engineer what we want. I’ll take “Google-knowing 1.0” any day.

Review of The Internet of Us: Knowing More and Understanding Less in the Age of Big Data, by Michael Patrick Lynch (New York, NY: W.W. Norton, 2016). First appeared in The Hedgehog Review, Summer 2016: 110-113.

Chad Wellmon
