Data privacy and security in GenAI-enabled services

The rapid adoption of GenAI (generative artificial intelligence) for enterprise business scenarios has ushered in a new era of unprecedented creativity, utility and productivity.

More specifically, the proliferation of large language models (LLMs) such as ChatGPT (the engine behind Copilot), LaMDA, and Falcon 40B has led organizations across a variety of industries to quickly train and deploy GenAI-powered applications and services for their businesses, reshaping digital transformation efforts as we know them. However, to make the most meaningful use of GenAI, organizations must continuously (and securely) integrate large data sets into their machine learning (ML) models; after all, outputs are only as good as the data used to train the services behind them.

While GenAI demonstrates promise, the security and privacy risks associated with ingesting and exposing sensitive data, such as personally identifiable information (PII) and protected health information (PHI), have now become clear.

The real-world risks of not securing PII before it is ingested, or of relying on a poorly tested model, include unintentional data loss, exposure of sensitive intellectual property, and potential violations of regional data privacy regulations.

It’s imperative to understand that ML models are not static algorithms — they’re evolving entities shaped by the data they process. In other words, these models learn and adapt as they encounter diverse data sets. This adaptability introduces inherent security risks that organizations must navigate with caution.

Consider the idea of the “poisoned data chain.” Models like ChatGPT are trained on vast general-knowledge data sets, including content sourced from platforms like Wikipedia, so the risk lies in the potential inclusion of “poisoned” material. Similarly, enterprises adopting GenAI train their ML models on data sets of their own, or on data aggregated from third-party organizations. Those data sets could contain hidden, unknown malware, akin to ransomware, designed to compromise the systems that process them. If the training data contains misinformation or malicious content, it becomes part of the ML model’s learning process.

With this in mind, it’s easy to see how even a small amount of “poisoned” data can exponentially grow into a much larger problem.
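One practical guardrail, offered here as a minimal sketch rather than a complete defense against data poisoning: verify that every file in a training corpus matches a manifest of hashes recorded when the data was originally vetted, so altered or unvetted files never reach the training pipeline. The file names and manifest format below are illustrative assumptions, not features of any particular product.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file in streaming fashion."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_corpus(corpus_dir: str, manifest_path: str) -> list[str]:
    """Return files that are missing from, or do not match, a manifest of
    known-good hashes produced when the data set was originally vetted."""
    manifest = json.loads(Path(manifest_path).read_text())
    suspect = []
    for file in Path(corpus_dir).rglob("*"):
        if file.is_file():
            expected = manifest.get(file.name)
            if expected is None or sha256_of(file) != expected:
                suspect.append(str(file))
    return suspect

# Refuse to start a training run if anything in the corpus fails verification.
# suspect_files = verify_corpus("training_corpus/", "trusted_manifest.json")
# if suspect_files:
#     raise RuntimeError(f"Unvetted or altered training files: {suspect_files}")
```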

Another facet of this challenge is the integration of PII into the training data. When most people think of “poisoned data,” images of malware, threats, and immediate risks come to mind. But the scope of poisoned data extends beyond these conventional threats to include privacy and safety concerns. PII uploaded into the data repositories used to train ML models can lead to the unintended abuse of personal information, posing significant risks to both the individuals concerned and the organization.

As organizations delve into the transformative potential of GenAI, specific applications and services are reshaping enterprise digital transformation efforts. However, these endeavors come with their own unique set of challenges, particularly in ensuring the safety, security, and privacy of sensitive data.

Here are a couple of examples of applying GenAI to digital transformation efforts and the resulting security cautions to address early on:

Using GenAI to orchestrate corporate HR processes: A popular use case lies in training and applying LLM models to automate manual HR tasks, such as global performance and compensation management processes. By adopting GenAI, HR professionals can focus on interactions with employees, orchestrating engagement with relevant, accurate, and up-to-date information.

Security caution: Early adopters of GenAI in HR processes have reported unintended data privacy exposure, such as executive compensation figures or other confidential personal information surfacing during the initial testing and training phase. To prevent this, organizations must ensure privacy-preserving techniques are applied before LLM models are trained.
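A minimal sketch of that idea, under stated assumptions: scrub records for obvious PII and compensation figures before they join a training corpus. The regular expressions and placeholder labels below are illustrative; a production deployment would typically rely on a dedicated PII-detection service (including named-entity recognition for names) and human review.

```python
import re

# Hypothetical patterns for common PII; hand-written expressions are a sketch,
# not a substitute for a dedicated PII-detection service.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b(?:\+?1[ -]?)?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}\b"),
    "SALARY": re.compile(r"\$\s?\d{1,3}(?:,\d{3})+(?:\.\d{2})?"),
}

def redact_record(text: str) -> str:
    """Replace matched PII with typed placeholders before the record
    is added to a training corpus."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}_REDACTED]", text)
    return text

# print(redact_record("Jane Doe (jane.doe@example.com) was awarded $185,000."))
# -> "Jane Doe ([EMAIL_REDACTED]) was awarded [SALARY_REDACTED]."
# Note that the name itself still requires entity recognition to catch.
```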

Using GenAI to enhance customer experiences in billing and claims processing: LLM models can be trained and used to derive quick answers to frequently asked questions, helping billing and claims processors efficiently handle daily volumes. This is especially useful in insurance, finance, and healthcare organizations.

Security caution: The key security risk in deploying GenAI for customer interactions lies in ensuring privacy-preserving techniques are rigorously applied. For instance, during an individual customer’s claims call, it is imperative that the GenAI instrumentation does not expose personally identifiable information belonging to other customers.
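One way to enforce that, sketched under assumptions rather than prescribed as the method: run every generated reply through an output-side check that blocks identifiers not belonging to the customer on the current call. The patterns, the `llm_generate` call, and the sample identifiers below are hypothetical placeholders.

```python
import re

# Identifier patterns checked in model output; illustrative only.
ACCOUNT_PATTERN = re.compile(r"\b\d{8,12}\b")              # account-number-like strings
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

def safe_to_send(response: str, current_customer_ids: set[str]) -> bool:
    """Return False if the response mentions identifiers that do not belong
    to the customer on the current call."""
    found = ACCOUNT_PATTERN.findall(response) + EMAIL_PATTERN.findall(response)
    return all(identifier in current_customer_ids for identifier in found)

# reply = llm_generate(prompt)                  # hypothetical model call
# if not safe_to_send(reply, {"0012345678", "jane.doe@example.com"}):
#     reply = "I'm sorry, I can't share that information."
```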

Training LLM models demands the collection and curation of vast amounts of unstructured data (a corpus). This corpus, essential for the model’s efficacy, must be protected against malicious content and, more critically, screened to prevent data privacy exposures. Organizations adopting GenAI for customer interactions must implement privacy by design. This proactive approach streamlines knowledge management, enhances operational efficiencies, and ensures safe and responsible use of GenAI innovations.

Safeguarding PII before it’s ingested by an LLM is similar to the process of safeguarding data from malware before it reaches an organization’s endpoint. Both are intended to prevent major issues (and liabilities) before they ever have a chance to crop up. Today, artificial intelligence has only increased the need for proactive prevention within the realm of cybersecurity.

The rapid adoption of GenAI has introduced newfound opportunities and challenges for organizations seeking to harness its transformative power. The imperative of real-time secure data integration is underscored by the dynamic nature of ML models, emphasizing the need for rigorous privacy-preserving techniques. Successful integration strategies, as exemplified in HR processes and customer service interactions, showcase the potential of GenAI to elevate digital transformation efforts. However, these applications demand a real-time security approach, with a primary focus on safeguarding sensitive information as it is used and mitigating unintended data privacy exposures.

As organizations navigate this evolving landscape, a commitment to safe and ethical AI practices, regulatory compliance, and continual innovation will be essential. In embracing GenAI, organizations not only embark on a journey of technological advancement but also assume the responsibility of ensuring the safe and secure deployment of this transformative force. By prioritizing privacy, security, and innovation in equal measure, organizations can navigate the challenges posed by GenAI, unlocking its full potential while safeguarding the trust of users and stakeholders alike.

