Ever since OpenAI released ChatGPT on November 30, 2022, use of AI as a service (AIaaS), including generative AI tools, has skyrocketed. Within five days of launching, ChatGPT exceeded 1 million users. Two months later, it had 100 million users. By March, it boasted 1 billion. By comparison, it took TikTok eight years to reach 1 billion users.
It’s no wonder these tools have been such a runaway success: they promise (and deliver) productivity improvements for professionals across a range of occupations, from software development to pharmaceutical research.
And even though these tools are not well understood, many large enterprises that see the enormous potential of generative AI and AIaaS are committing significant resources to using them. According to Gartner® Research, the Gartner 2023 CIO and Technology Executive Survey showed that “nine out of 10 respondents (92%) ranked AI as the technology their organizations were most likely to implement by 2025.”1
The downside? Beyond the ethical concerns, organizations using AIaaS tools have learned some hard lessons about data leaks and intellectual property (IP) risk after employees shared proprietary information with these public tools. What employees using AI to accelerate code reviews and new drug discovery may not have realized is that they’re effectively putting confidential data in the public domain: once they share proprietary information with an AIaaS, the service can continue to use it to process other people’s requests, in some cases indefinitely, depending on the terms of the AIaaS provider.
The immediate IP risk centers on users logging into the websites and APIs of these generative AI solutions and sharing proprietary data. However, this risk increases as local deployments of these systems flourish and people start connecting them to each other. Once an AI service is determining what data to share with other AIs, the human oversight element that currently makes that determination is lost. The AI service is unlikely to understand the impact and potential consequences of sharing data it has access to and may not notify its human handlers that data has been shared outside of the organization.
Thus, the risk of IP loss and leakage of customer data has made it imperative for organizations to understand the scope of generative AI use across their businesses. Until now, organizations haven’t had an easy way to audit employee use and potential misuse of these tools.
Data Protection From Rogue AI Use and Accidental Misuse
Earlier this week, ExtraHop released a new capability that helps organizations understand their potential risk exposure from employee use of OpenAI ChatGPT.
ExtraHop Reveal(x) provides customers with visibility into the devices and users on their networks that are connecting to OpenAI domains. This capability is essential as organizations move quickly to adopt policies governing the use of large language models and generative AI tools, because it gives organizations a mechanism to audit compliance with those policies. We are taking this step as part of a larger security platform approach, incorporating AIaaS monitoring into our existing, industry-leading network detection and response (NDR) capabilities.
By tracking which devices are connecting to OpenAI domains, identifying the users associated with those devices, and measuring the amount of data those devices are sending to those domains, Reveal(x) enables organizations to assess the risk associated with their users' ongoing use of AI services. (Watch a demo.)
In addition, because Reveal(x) shows the amount of data being sent to and received from OpenAI domains, security leaders can evaluate what falls within an acceptable range and what indicates potential IP loss. For example, simple user queries to a chatbot should fall within a range of bytes to kilobytes. If security teams see megabytes of data flowing to these domains, that volume may signify that employees are sending proprietary data along with their queries. If the traffic in question is unencrypted and Reveal(x) surfaces related data exfiltration and data staging detections, organizations can also identify the type of data and the individual files that employees are sending to OpenAI domains.
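The volume-based auditing described above can be sketched in a few lines. This is an illustrative example only, not Reveal(x) code: the flow records, field names, domain list, and 1 MB threshold are all hypothetical stand-ins for whatever a real monitoring pipeline would supply.

```python
# Hypothetical sketch: flag hosts whose outbound byte counts to AIaaS
# domains exceed a policy threshold. The flow-record format and domain
# list here are invented for illustration.

AI_DOMAINS = {"api.openai.com", "chat.openai.com"}
THRESHOLD_BYTES = 1_000_000  # ~1 MB: well above typical chat queries


def flag_suspect_hosts(flows):
    """Sum bytes sent per host to AI domains; return hosts over threshold."""
    totals = {}
    for flow in flows:
        if flow["dst_domain"] in AI_DOMAINS:
            totals[flow["src_host"]] = (
                totals.get(flow["src_host"], 0) + flow["bytes_out"]
            )
    return {host: b for host, b in totals.items() if b > THRESHOLD_BYTES}


flows = [
    {"src_host": "laptop-42", "dst_domain": "api.openai.com", "bytes_out": 2_500_000},
    {"src_host": "desk-07", "dst_domain": "chat.openai.com", "bytes_out": 4_096},
    {"src_host": "desk-07", "dst_domain": "example.com", "bytes_out": 9_000_000},
]
print(flag_suspect_hosts(flows))  # only laptop-42 exceeds the threshold
```

Note that desk-07's large transfer to example.com is ignored here: the point is scoping the audit to AI service domains, with separate controls covering general exfiltration.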
Reveal(x) is able to provide this deep visibility and real-time detection because we use network packets as the primary data source for monitoring and analysis. Using a real-time stream processor, Reveal(x) transforms unstructured packets into structured wire data and analyzes payloads and content from OSI Layers 2 through 7 for complete network visibility. From device discovery to behavioral analysis, network telemetry is the immutable source of truth for understanding an organization’s hybrid environment. Logs can tell you that two devices talked to each other, but Reveal(x) provides rich context about the communication.
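To make the idea of turning unstructured packets into structured wire data concrete, here is a deliberately minimal sketch that parses one raw HTTP/1.1 request payload into a structured record. It is not how Reveal(x)'s stream processor works internally; the payload and field choices are assumptions for illustration.

```python
# Illustrative sketch: extract structured fields ("wire data") from an
# unstructured HTTP/1.1 request payload. Real wire-data extraction spans
# many protocols and handles reassembly, encryption, and edge cases.

def parse_http_request(payload: bytes) -> dict:
    """Extract method, path, Host, and Content-Length from a raw request."""
    lines = payload.decode("ascii", errors="replace").split("\r\n")
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        if not line:  # blank line ends the header block
            break
        key, _, value = line.partition(": ")
        headers[key.lower()] = value
    return {
        "method": method,
        "path": path,
        "version": version,
        "host": headers.get("host", ""),
        "content_length": int(headers.get("content-length", 0)),
    }


raw = (
    b"POST /v1/chat/completions HTTP/1.1\r\n"
    b"Host: api.openai.com\r\n"
    b"Content-Length: 348\r\n"
    b"\r\n"
)
record = parse_http_request(raw)
print(record["host"], record["content_length"])  # api.openai.com 348
```

A record like this is far more useful for auditing than a raw byte stream: it names the destination service and quantifies the payload size, which is exactly the context a flow log alone cannot provide.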
At ExtraHop, we can’t overstate the importance of this capability as organizations grapple with the popularity and proliferation of AIaaS and the data leakage risk associated with it. ExtraHop believes the productivity benefits of these tools outweigh the data exposure risks, provided organizations understand how these services will use their data (and how long they’ll retain it), implement policies governing use of these services, and have a control like Reveal(x) in place that allows them to assess policy compliance and spot risks in real time.
Sign up for a demo to learn more.