Data security in AI

Topic

A significant concern for enterprises using public AI models, since proprietary information included in prompts can leak to model builders. This concern is presented as the primary driver of a potential 'on-prem comeback'.


First Mentioned

2/14/2026, 3:56:14 AM

Last Updated

2/14/2026, 4:10:34 AM

Research Retrieved

2/14/2026, 4:10:34 AM

Summary

Data security in AI encompasses the strategies and technologies used to protect sensitive information within artificial intelligence systems, particularly as enterprises navigate the risks of using third-party models from providers like OpenAI. Concerns over data leakage and unauthorized access are driving a significant shift toward on-premises solutions, often referred to as an 'on-prem comeback', as companies seek to maintain control over proprietary data. This focus on security is occurring alongside a massive expansion in specialized AI infrastructure, with AI factories (specialized AI data centers) projected to attract $650 billion in investment in 2026 alone. Key security challenges include adversarial attacks, model poisoning, and data breaches, while mitigation strategies rely on robust AI governance, data minimization, and advanced encryption techniques.

Referenced in 1 Document
Research Data
Extracted Attributes
  • Common Risks

    Adversarial attacks, model poisoning, data breaches, and automated attacks

  • Core Principles

    AI governance, IT security controls, data science security controls, data minimization, and behavior control

  • Primary Concern

    Data leakage and unauthorized access when using third-party AI models

  • Supply Chain Impact

    Shortages in High Bandwidth Memory (HBM) and competition for advanced chips and power

  • Infrastructure Trend

    Resurgence of on-premises solutions (the 'on-prem comeback')

  • Projected Investment

    $650 billion in AI data center spending in 2026

  • Mitigation Techniques

    Adversarial training, data encryption, anonymization, and access-control mechanisms
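
The mitigation techniques above begin at the prompt boundary: minimizing what leaves the enterprise in the first place. Below is a minimal sketch of prompt-side redaction before text is sent to a third-party model; the regex patterns and the `redact` helper are illustrative assumptions, not a production-grade PII or secret detector.

```python
# Minimal prompt-redaction sketch: scrub likely PII and secrets from text
# before it is sent to a third-party model. Patterns here are illustrative;
# real deployments use dedicated PII/secret scanners.
import re

# Order matters: redact API keys before phone numbers so the digit run
# inside a key is not mistaken for a phone number.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "API_KEY": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{16,}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(prompt: str) -> str:
    """Replace matches with typed placeholders so the model never sees them."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

if __name__ == "__main__":
    raw = "Contact jane.doe@acme.com, key sk-abcdef1234567890XYZ, +1 (555) 123-4567"
    print(redact(raw))  # Contact [EMAIL], key [API_KEY], [PHONE]
```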

Timeline
  • Global push to construct specialized AI data center facilities accelerates during the AI boom of the 2020s.

    2020-01-01

  • Stanford University Institute of Human-Centered Artificial Intelligence publishes research on privacy risks in the AI era.

    2024-03-18

  • Reports emerge regarding LinkedIn training AI on user data, highlighting ongoing privacy and security concerns.

    2024-09-18

  • Projected milestone for major tech companies' annual spending on AI data centers to reach $650 billion.

    2026-12-31

AI data center

An AI data center (sometimes known as an AI factory) is a specialized data center facility designed for the computationally intensive tasks of training and running inference for artificial intelligence (AI) and machine learning models. Unlike general-purpose data centers, they are optimized for the parallel processing demands of AI workloads, typically utilizing hardware such as AI accelerators (e.g., GPUs, TPUs) and high-speed interconnects. The global push to construct these specialized facilities accelerated dramatically during the AI boom of the 2020s. This demand has reshaped supply chains: memory manufacturers have prioritized production of the High Bandwidth Memory (HBM) essential for AI servers, contributing to a global memory supply shortage, and a broader competition for advanced chips, power, and infrastructure has followed. Major tech companies are estimated to spend $650 billion on AI data centers in 2026.

Web Search Results
  • What Is AI Security? Risks, Principles and Benefits

    The core principles behind AI security include: AI governance: implementing governance processes for AI risk, integrating AI into information security and software lifecycle processes. Conventional IT security controls: applying risk-based technical IT security controls, as AI systems are IT systems, ensuring standard protections. Data science security controls: applying risk-based controls by data scientists, focusing on data validation and model testing. Data minimization: limiting data amount and retention time, at both development time and runtime, to reduce exposure. Behavior control: controlling the impact of AI on security, whether by mistake or manipulation, through oversight, least privilege, transparency, explainability, and continuous validation. [...]

    Adversarial attacks: threat actors can manipulate AI systems by introducing subtly altered inputs designed to cause misclassification or erroneous outputs. Organizations can mitigate this risk through adversarial training, where security teams deliberately introduce adversarial examples during model training to enhance resilience. Model poisoning: attackers target training data to corrupt AI models, potentially creating backdoors for future exploitation. Implementing robust AI data security processes and maintaining secure training environments significantly reduces this risk. [...]

    Implement data governance: strong data governance ensures that AI systems use accurate, relevant, and secure datasets. Regularly updating training data helps AI models adapt to evolving threats while maintaining accuracy and reliability. Maintain ethics and transparency: documenting algorithms, datasets, and decision-making processes builds trust and accountability in AI use, and transparent communication with stakeholders helps identify biases early and supports responsible AI governance. Integrate AI with existing infrastructure: connecting AI systems with existing tools like threat intelligence feeds enables unified monitoring and faster response, enhancing detection capabilities without disrupting ongoing security operations.
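
The adversarial training that this result describes can be made concrete. The sketch below, assuming a PyTorch setup with a placeholder model, epsilon, and random data, generates FGSM (Fast Gradient Sign Method) adversarial examples on the fly and trains on clean and perturbed batches together; it illustrates the technique rather than any vendor's recipe.

```python
# Adversarial-training sketch: perturb each batch with FGSM and train on
# both clean and adversarial inputs. Model and data are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Generate a Fast Gradient Sign Method adversarial example."""
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    # Step in the direction that increases the loss; clamp to valid pixel range.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

def train_step(model, optimizer, x, y):
    model.train()
    x_adv = fgsm_example(model, x, y)
    optimizer.zero_grad()
    # Mixing clean and adversarial batches teaches the model to resist both.
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()

# Smoke test with a toy linear classifier on random "images".
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
print(train_step(model, opt, x, y))
```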

  • What Is AI Security?

    Despite its benefits, AI poses security challenges, particularly with data security. AI models are only as reliable as their training data. Tampered or biased data can lead to false positives or inaccurate responses. For instance, biased training data used for hiring decisions can reinforce gender or racial biases, with AI models favoring certain demographic groups and discriminating against others. AI tools can also help threat actors more successfully exploit security vulnerabilities. For example, attackers can use AI to automate the discovery of system vulnerabilities or generate sophisticated phishing attacks. [...]

    Ability to scale: AI cybersecurity solutions can scale to protect large and complex IT environments. They can also integrate with existing cybersecurity tools and infrastructure, such as security information and event management (SIEM) platforms, to enhance the network's real-time threat intelligence and automated response capabilities. At the same time, the adoption of new AI tools can expand an organization's attack surface and present several security threats, with data security risks among the most common. [...]

    AI security, short for artificial intelligence (AI) security, is the process of using AI to enhance an organization's security posture. With AI systems, organizations can automate threat detection, prevention and remediation to better combat cyberattacks and data breaches. Organizations can incorporate AI into cybersecurity practices in many ways. The most common AI security tools use machine learning (ML) and deep learning to analyze vast amounts of data, including traffic trends, app usage, browsing habits and other network activity data.
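
The ML-driven monitoring this result describes (models analyzing traffic trends and other network activity) can be illustrated with a small unsupervised sketch. The features, the synthetic data, and the IsolationForest settings below are illustrative assumptions, not a production SIEM pipeline.

```python
# Anomaly-detection sketch over simple network-activity features using an
# unsupervised IsolationForest. Data here is synthetic and illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Columns: requests/minute, bytes transferred (KB), distinct ports touched.
normal = rng.normal(loc=[60, 500, 3], scale=[10, 100, 1], size=(500, 3))
detector = IsolationForest(contamination=0.01, random_state=0).fit(normal)

events = np.array([
    [62, 480, 3],        # ordinary traffic
    [900, 50_000, 40],   # exfiltration-like spike across many ports
])
# predict() returns 1 for inliers and -1 for anomalies, e.g. [ 1 -1 ].
print(detector.predict(events))
```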

  • Exploring privacy issues in the age of AI

    1 “Privacy in an AI Era: How Do We Protect Our Personal Information?” Stanford University Institute of Human-Centered Artificial Intelligence. 18 March 2024. 2 “LinkedIn Is Quietly Training AI on Your Data—Here’s How to Stop It.” PCMag. 18 September 2024. [...]

    Following security best practices: organizations that use AI should follow security best practices to avoid the leakage of data and metadata. Such practices might include using cryptography, anonymization and access-control mechanisms. Providing more protection for data from sensitive domains: data from certain domains should be subject to extra protection and used only in “narrowly defined contexts.” These “sensitive domains” include health, employment, education, criminal justice and personal finance. Data generated by or about children is also considered sensitive, even if it doesn’t fall under one of the listed domains. The article also covers reporting on data collection and storage.
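
As one concrete reading of the "cryptography, anonymization and access-control mechanisms" practice above, the sketch below encrypts a sensitive field with Fernet (symmetric, authenticated encryption from the Python `cryptography` package) and gates decryption behind a role check. The role model and field names are deliberately simplified assumptions.

```python
# Field-level encryption plus a simple access-control gate. The role check
# is a stand-in for a real IAM/policy engine; the key would live in a KMS.
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in practice, fetched from a KMS, not generated inline
fernet = Fernet(key)

record = {"user": "u123", "ssn": fernet.encrypt(b"078-05-1120")}

def read_ssn(record: dict, role: str) -> str:
    """Only privileged roles may decrypt; everyone else gets a masked view."""
    if role != "compliance-officer":
        return "***-**-****"
    return fernet.decrypt(record["ssn"]).decode()

print(read_ssn(record, "analyst"))             # ***-**-****
print(read_ssn(record, "compliance-officer"))  # 078-05-1120
```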

  • The dual face of artificial intelligence in data protection and...

    Data breaches: AI systems require large datasets to function effectively. These datasets often contain sensitive personal information, making them prime targets for cybercriminals. A breach in such systems can lead to the exposure of massive amounts of personal data. Automated attacks: hackers leverage AI to automate and scale their attacks. AI-driven malware can adapt to security measures, making it more difficult to detect and counter. For instance, AI can generate phishing emails that are highly personalized, increasing the likelihood of unsuspecting individuals falling victim to such scams. [...]

    The first factor is that the AI use cases currently on the market are leaving the innovation sphere and will very soon become real production applications. They already use real data even in these initial use cases, which is already a major concern for CISO and CSO organizations, and they will represent an increasing risk once they are widely used by end customers and spread across the complex digital environment present in every company. Analyzing and securing the cyber risks related to AI tools and techniques will therefore be a key element in protecting these investments, with cybersecurity organizations playing a key role in AI adoption. [...]

    Data encryption: AI can enhance data encryption techniques, making it harder for unauthorized parties to access sensitive information. Access control: AI can manage access to data by continuously assessing the risk level of users and adjusting permissions accordingly, ensuring that only authorized individuals have access to sensitive information. Data masking: AI can mask sensitive data in non-production environments, allowing organizations to use real data for testing and development without compromising privacy.
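
The data-masking idea in this result can be sketched with deterministic tokenization: replacing each sensitive value with an HMAC-derived token preserves referential integrity in non-production databases (the same input always maps to the same token) without exposing the real data. The key handling and token format below are illustrative assumptions.

```python
# Deterministic masking sketch: HMAC-based tokens for non-production data.
# The hard-coded key is for illustration only; store real keys outside code.
import hashlib
import hmac

MASKING_KEY = b"rotate-me-and-keep-out-of-source-control"

def mask(value: str) -> str:
    """Map a sensitive value to a stable, irreversible token."""
    digest = hmac.new(MASKING_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"tok_{digest[:12]}"

# The same email masks identically across tables, so joins still work.
print(mask("jane.doe@acme.com"))
print(mask("jane.doe@acme.com") == mask("jane.doe@acme.com"))  # True
```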

  • Privacy in an AI Era: How Do We Protect Our Personal ...

    What kinds of risks do we face, as our data is being bought and sold and used by AI systems? First, AI systems pose many of the same privacy risks we’ve been facing during the past decades of internet commercialization and mostly unrestrained data collection. The difference is the scale: AI systems are so data-hungry and opaque that we have even less control over what information about us is collected, what it is used for, and how we might correct or remove such personal information. Today, it is basically impossible for people using online products or services to escape systematic digital surveillance across most facets of life—and AI may make matters even worse. [...]

    At present, we depend on the AI companies to remove personal information from their training data or to set guardrails that prevent personal information from coming out on the output side. And that’s not really an acceptable situation, because we are dependent on them choosing to do the right thing. [...] so that data is not collected by every possible actor in every place you go.