
Harvard AI Expert Jonathan Zittrain Talks Large Language Models with ExtraHop

On August 23rd, ExtraHop hosted a champagne tasting for an exclusive group of about 50 cybersecurity executives. The virtual event featured special guest Jonathan Zittrain, co-founder and faculty director of Harvard’s Berkman Klein Center for Internet & Society. Zittrain, a world-renowned expert on digital technology and policy, gave a talk on the state of artificial intelligence (AI) that attendees said succeeded in demystifying much of this vexing technology. 

“This has been the best presentation on AI that I’ve seen,” one attendee wrote in the chat. 

“Wonderful event. Thanks so much for sharing these insights. Very valuable,” wrote another. 

Read on for a recap of Zittrain’s presentation.

1966: The Birth of Generative AI

One of the longest-standing problems in AI research has been figuring out how to have a conversation with a computer that feels human and far-ranging. Zittrain traced the history of this problem and its influence on AI chatbots and generative AI back to 1966, when Joseph Weizenbaum, a computer scientist and professor at MIT, created ELIZA, “a program which makes natural language conversation with a computer possible.” Weizenbaum named the program after Eliza Doolittle, a character from George Bernard Shaw’s play “Pygmalion” (and its 1964 screen adaptation, “My Fair Lady”) to emphasize that its language abilities could be improved with the help of a teacher.

Zittrain characterized ELIZA as a rudimentary version of what’s known as an expert system. In this form of AI, end users interact with a knowledge base curated from experts. ELIZA was meant to act as a psychotherapist, which was a helpful conceit to hide the fact that its conversational abilities were limited. Weizenbaum admitted that ELIZA’s conversation skills were little more than a parlor trick: it would simply mirror what a user said to it. “I’m” becomes “you’re” in a reply and vice versa in an imitation of the way a therapist might speak. Despite its limitations, users were more than happy to go along, which worried Weizenbaum, who issued a prescient warning that’s as appropriate for today’s chatbots as it was for ELIZA in the mid-1960s: “ELIZA shows, if nothing else, how easy it is to create and maintain the illusion of understanding, hence perhaps judgment deserving of credibility. A certain danger lurks there.” 

How Does Machine Learning Work?

Machine learning (ML) has its roots in neural networks, the building blocks of which have been in place since the 1940s. ML and neural networks have stepped into the limelight in the last decade as improvements in computing power and training data have enabled them to become dramatically more impressive.

Zittrain explained there are at least two methods by which neural networks learn: unsupervised and supervised learning. Figuratively speaking, he compared unsupervised learning to pouring a data set into a bucket, gently shaking it around, and seeing if anything useful appears. Oftentimes, Zittrain noted, nothing of value appears, but sometimes the machine groups the data in a way that makes intuitive sense to humans. The machine doesn’t know anything about the labels and reality behind the data; it’s simply offering a possible grouping. It’s up to us to make inferences and find meaning in the grouping. In a way, Zittrain added, unsupervised learning is simply applied statistics or vectorized math. This method can produce surprisingly powerful results, like predicting songs you’d like based on previous listening habits.
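To make the “bucket” metaphor concrete, here’s a minimal sketch of unsupervised grouping using a toy one-dimensional k-means clustering in plain Python. The data, the number of groups, and the song-length framing are all invented for illustration and were not part of the talk:

```python
# Toy illustration of unsupervised learning: k-means clustering.
# The algorithm is never told what the groups "mean"; it only
# proposes groupings, and humans supply the interpretation.

def kmeans(points, k, iters=20):
    # Deterministic start: use the first k points as initial centers.
    centers = points[:k]
    for _ in range(iters):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical song lengths in minutes: two natural groupings emerge.
data = [2.1, 2.3, 2.2, 8.9, 9.4, 9.1]
centers, clusters = kmeans(data, k=2)
print(centers)  # roughly [2.2, 9.13]
```

Note that the algorithm returns only the groupings; deciding that the clusters correspond to, say, “short songs” and “long songs” is the human’s inference.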

In contrast to unsupervised learning, supervised learning starts with labeled data sets, or “ground truth.” A machine designed to identify frogs would be trained on at least two data sets: one with images labeled “frog” and another with images of other creatures labeled “not frog.” Given this ground truth, the model should ideally be able to classify images from outside the training data as frogs or not frogs.

Zittrain explained that the architecture of these kinds of AI is based on a rudimentary conception of the connections between neurons, which is why they’re called neural networks. In a “frog-not frog” machine, each pixel in an image is fed to a “neuron” in the input layer. Depending on the weighted signals it receives, each node in the next layer either fires or doesn’t. This cascades through more layers until the output layer is reached. There, all the outputs are summed to generate a number between 0 and 1. A one means the image is of a frog. A zero means it isn’t a frog. And something in between indicates something frog-like.
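That cascade can be sketched in a few lines, assuming a toy four-pixel “image” and two hidden neurons with made-up weights (a real network would have vastly more of both, learned rather than hand-picked):

```python
import math

# A toy "frog / not-frog" forward pass: pixel values flow through
# one hidden layer to a single output between 0 (not frog) and 1 (frog).
# The weights here are invented for illustration; a real model learns them.

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def forward(pixels, hidden_weights, output_weights):
    # Each hidden "neuron" fires based on a weighted sum of all pixels.
    hidden = [sigmoid(sum(w * p for w, p in zip(ws, pixels)))
              for ws in hidden_weights]
    # The output layer sums the hidden activations into one score.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

pixels = [0.0, 0.8, 0.9, 0.1]             # a tiny 4-pixel "image"
hidden_weights = [[0.5, -0.2, 0.9, 0.1],  # two hidden neurons
                  [-0.4, 0.6, 0.3, -0.8]]
output_weights = [1.2, -0.7]
score = forward(pixels, hidden_weights, output_weights)
print(round(score, 3))  # a value between 0 and 1: the "frog-ness" score
```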

These supervised models start out untrained, so the first results from the frog-not frog machine will be random. But through various training techniques, said Zittrain, we can start to get more accurate answers by telling the model whether it was right or not.
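One simple version of “telling the model whether it was right” is the classic perceptron update: compare each prediction against its ground-truth label and nudge the weights in the direction of the error. The two-feature “frog” data below is invented for illustration:

```python
# A minimal sketch of supervised training: start with blank weights,
# compare each prediction with the ground-truth label, and nudge the
# weights toward the right answer (a perceptron-style update).

def train(examples, labels, lr=0.1, epochs=50):
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(examples, labels):
            total = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if total > 0 else 0
            error = target - prediction  # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Toy "frog features": [greenness, bugginess-of-eyes]; 1 = frog, 0 = not frog.
examples = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]
weights, bias = train(examples, labels)

def predict(x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

print(predict([0.85, 0.9]))  # frog-like input -> 1
print(predict([0.05, 0.1]))  # not frog-like  -> 0
```

This also foreshadows the interpretability problem Zittrain raised next: even in this tiny model, the learned weights are just numbers, with no labeled “greenness detector” inside.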

Strangely, Zittrain noted, if you look closely at the successfully trained model, you won’t easily find a parameter for “buggy eyes” or “green.” This is known as the problem of interpretability. It’s difficult for a human to look inside a model and see how it “reasons” whether something is a frog or not. Essentially, we know it works, but it’s not clear exactly how.

Surprising Successes and Shortcomings of Large Language Models

Large language models (LLMs) are the most popular current iteration of ML. They are a form of generative AI, which creates something new based on inputs and a corpus of training data. Generally, LLMs start out with unsupervised learning and are then refined through supervised or “reinforcement” learning, according to Zittrain.

ChatGPT is probably the best-known LLM, though it’s far from the first. Zittrain noted that Microsoft released a short-lived chatbot named Tay on Twitter in 2016. Similar to ELIZA’s therapist conceit, Tay was intended to imitate “teenage web speak,” according to Zittrain. He said it’s not clear how Tay was trained, but it seems it may have taken human responses to Twitter posts as examples of “correct” responses. Unfortunately, within 24 hours of its introduction, Tay began reflecting the worst of Twitter by responding to human users in highly inappropriate and offensive ways, leading Microsoft to take it down.

By 2018, advances in unsupervised learning led to auto-completion in texts and emails. And a new Microsoft chatbot was able to confidently answer questions like “who is Adele?” with “she’s a singer,” or “I don’t know who Adele is.”

ChatGPT is based on OpenAI’s GPT-3.5. The model has, by many metrics, improved vastly over previous versions, but, Zittrain remarked, it’s important to note how GPT-2 was originally described. OpenAI billed it as “a large-scale unsupervised language model which generates coherent paragraphs of text” [emphasis added]. Zittrain pointed out that OpenAI didn’t describe GPT-2 as generating “accurate” or “truthful” paragraphs of text; the company specifically used the word “coherent.”

Though the goal was only coherence, GPT-2 began to show elements of cognition in its later iterations. GPT-3.5 could answer riddles coherently but often incorrectly. Now GPT-4 appears to actually give the right answer to some riddles that require logical processing. Zittrain and colleagues, experimenting with an early version of GPT-3, found that it could be prompted to write a speech for a government figure to deliver in the event that the President, say, gets eaten by a snake, rendered perfectly in the style of presidential speechwriters. Normally a skeptic, Zittrain found these capabilities jaw-dropping. On the other hand, if you ask how many “n’s” are in “mayonnaise,” it might tell you only one. Ultimately, “hallucinations” like this are innate to LLMs because they are designed for coherence, not truth or an accurate representation of knowledge, he explained. He added that LLM hallucinations may prevent generative AI tools from becoming the new search engines. At present, without (and sometimes even with) precise prompting, these tools seem just as likely to mislead you as to tell you the truth, he said.

AI Risks and Recommendations

Like the frog-not frog supervised model, LLMs also suffer from the problem of interpretability, according to Zittrain. We don’t really know how they generate their outputs, which makes it impossible to predict what they’ll say next. This has led some researchers to compare LLMs to the Shoggoth, a tentacled monster from the fiction of H.P. Lovecraft. AI researchers working on LLMs have attempted to address this through a type of supervised learning called reinforcement learning from human feedback (RLHF). Experts interact with the model and train it not to give inaccurate or otherwise undesirable answers. Critics say this is just putting a smiley-face mask on the Shoggoth, and Zittrain pointed out that it’s worth considering the power in the hands of whoever designs that mask.

LLMs are also vulnerable to what are known as adversarial perturbations: seemingly random strings of tokens that a user can append to a prompt to unlock forbidden responses. It’s unclear exactly how or why these work, and the phenomenon predates LLMs, appearing across the broader field of ML.

To mitigate the risks and get the most out of LLMs and other AI technologies, Zittrain offered the following recommendations: 

  1. Treat it like a friend. If you want to get the best responses from an LLM, it pays to think carefully about your prompts and instruct it as explicitly as possible. Many of us are used to “search engine shorthand,” said Zittrain, but when interacting with LLMs, the more you treat it like a friend, the better its answers will be. As models improve, their context windows grow, allowing them to remember more of the previous prompts and enabling you to iterate with them until you get the best answer. Asking LLMs to show their work or go step by step produces more accurate results. And in one fascinating example, telling a model that Turing Award winner Yann LeCun was skeptical it could solve a problem led to it finally getting the right answer. “The more you treat it like a friend, whether or not it’s merely a ‘stochastic parrot,’ in the memorable words of some prominent AI researchers, the more it apparently will deliver to you,” he said. 
  2. Experiment early and often. Encourage employees to experiment with these technologies and to share their experiments, results, and findings, including with independent researchers. You need to know where, when, and how these experiments are happening. Do evaluations and perform red-teaming in a structured way so you can catch any problems early on and avoid a public Microsoft Tay moment. 
  3. Keep a ledger. AI is like asbestos, said Zittrain. It can be found in all sorts of places, and you never know where it’s going to show up. Similarly, it also seems to be very good at its intended purpose, but problems may not manifest until later, when it’s difficult to change things. So, like asbestos, it’s best to keep records of where you’re using AI so if you have to peel any of it back, you know where to look.
  4. Be cautious in your implementation. Once you’ve opened the AI Pandora’s box, keep the lid handy. When possible, implement AI tools in such a way that you can pull back if things aren’t working as expected.

Zittrain concluded with a quote from renowned science fiction author Arthur C. Clarke. In his 1962 book, “Profiles of the Future: An Inquiry into the Limits of the Possible,” Clarke put it succinctly: “Any sufficiently advanced technology is indistinguishable from magic.” Like a fine champagne, AI certainly feels like magic at times, but it’s important not to let it bewitch you.

This article was not written with AI. 
