Multiple Vulnerabilities in AI Platforms Expose Sensitive Data to Anyone

Artificial intelligence (AI) platforms have become integral tools for businesses and organizations worldwide.

These technologies promise efficiency and innovation, from chatbots powered by large language models (LLMs) to complex machine learning operations (MLOps) pipelines.

However, recent investigations have uncovered alarming vulnerabilities in these systems, exposing sensitive data to potential exploitation.

This article delves into the findings of a comprehensive study on AI platform vulnerabilities, focusing on vector databases and LLM tools.

AI platforms are celebrated for streamlining operations and enhancing user experiences.

Businesses utilize these tools to automate tasks, manage data, and interact with customers. However, the convenience of AI comes with significant risks, particularly concerning data security. The Legit Security study highlights two primary areas of concern: vector databases and LLM tools.

Publicly Exposed Vector Databases

Understanding Vector Databases

Vector databases are specialized systems that store data as multi-dimensional vectors, commonly used in AI architectures. They play a crucial role in retrieval-augmented generation (RAG) systems, where AI models rely on external data retrieval to generate responses. Popular platforms include Milvus, Qdrant, Chroma, and Weaviate.
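To ground the idea, here is a minimal sketch of the similarity search a vector database performs inside a RAG pipeline: documents are stored as embedding vectors, and the entry closest to a query vector is retrieved. The toy embeddings and helper names below are illustrative, not the API of any product named above.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest_document(query_vec, store):
    # store: mapping of document text -> embedding vector
    return max(store, key=lambda doc: cosine_similarity(query_vec, store[doc]))

# Toy 3-dimensional "embeddings" standing in for real model output.
store = {
    "quarterly financial report": [0.9, 0.1, 0.0],
    "customer support email":     [0.1, 0.8, 0.3],
    "candidate resume":           [0.0, 0.2, 0.9],
}
print(nearest_document([0.85, 0.15, 0.05], store))  # closest to the financial report
```

The point relevant to this article: whoever can query the store can run exactly this retrieval over everything in it, which is why an unauthenticated instance leaks its full contents.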

Weaviate vector database (Source: Legit Security)

Security Risks

Despite their utility, vector databases pose severe security threats. Many instances are publicly accessible without proper authentication, allowing unauthorized users to access sensitive information.

This includes personally identifiable information (PII), medical records, and private communications. The study found that data leakage and data poisoning are prevalent risks.


Real-World Examples

The investigation uncovered approximately 30 servers containing sensitive corporate or private data, including:

  • Company email conversations
  • Customer PII and product serial numbers
  • Financial records
  • Candidate resumes

In one case, a Weaviate database from an engineering services company contained private emails. Another instance involved a Qdrant database with customer details from an industrial equipment firm.

Publicly Exposed LLM Tools

No-Code LLM Automation Tools

Low-code/no-code platforms like Flowise enable users to build AI workflows by integrating data loaders, caches, and databases. While powerful, these tools are vulnerable to data breaches if not properly secured.

Security Risks

LLM tools face threats similar to those of vector databases, including data leakage and credential exposure. The study identified a critical vulnerability (CVE-2024-31621) in Flowise, allowing authentication bypass through simple URL manipulation.

Exposed Flowise server, which returns HTTP 401 - Unauthorized Error on any API request (source: Legit Security)

Key Findings

The research revealed numerous exposed secrets, such as:

  • OpenAI API keys
  • Pinecone API keys
A Pinecone API key found hardcoded in one of the flow configurations (source: Legit Security)
  • GitHub access tokens
GitHub Tokens and OpenAI API keys from a vulnerable Flowise instance (source: Legit Security)
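Hardcoded credentials like these are often caught by simple pattern scans over configuration text. The sketch below shows the idea; the regexes are rough approximations of OpenAI- and GitHub-style token prefixes, not exhaustive detectors, and real scanners add entropy checks and many more patterns.

```python
import re

# Illustrative patterns only; real secret scanners are far broader.
SECRET_PATTERNS = {
    "openai_api_key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "github_token":   re.compile(r"gh[pousr]_[A-Za-z0-9]{36}"),
}

def scan_for_secrets(text):
    # Return (pattern_name, matched_string) for every hit in the text.
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group()))
    return hits

config = 'apiKey: "sk-' + "A" * 24 + '"'  # fake key; never commit real ones
print(scan_for_secrets(config))
```

Running a scan like this in CI, before flow configurations are deployed, would have flagged the keys the researchers found.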

These findings underscore the potential for significant data breaches if vulnerabilities are not addressed.

Mitigation Strategies

To combat these vulnerabilities, organizations must implement robust security measures. Recommended actions include:

  • Enforcing strict authentication and authorization protocols
  • Regularly updating software to patch known vulnerabilities
  • Conducting thorough security audits and penetration testing
  • Educating staff on best practices for data protection
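As a concrete instance of the first recommendation, a service can be made fail-closed: refuse to start unless an API key is configured, and reject requests that do not present it. A minimal, hypothetical sketch (the environment variable name is an assumption, not any vendor's setting):

```python
import hmac
import os

def load_api_key():
    # Fail closed: no key configured means the service refuses to start,
    # rather than silently running unauthenticated.
    key = os.environ.get("VECTOR_DB_API_KEY")
    if not key:
        raise RuntimeError("VECTOR_DB_API_KEY is not set; refusing to start")
    return key

def is_authorized(presented_key, expected_key):
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(presented_key, expected_key)

os.environ["VECTOR_DB_API_KEY"] = "example-key"  # demo only; use a secret store
key = load_api_key()
print(is_authorized("example-key", key))  # True
print(is_authorized("wrong-key", key))    # False
```

Most of the platforms named in the study support some form of API-key or token authentication; the failure mode reported was not a missing feature but deployments that never turned it on.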

The vulnerabilities uncovered in AI platforms highlight the urgent need for enhanced security measures. As AI permeates various sectors, safeguarding sensitive data must be a top priority. Organizations are urged to proactively mitigate risks and protect their digital assets.

The findings of this study serve as a stark reminder of the potential consequences of neglecting cybersecurity in the age of AI. By addressing these vulnerabilities, businesses can harness the full potential of AI technologies while ensuring the safety and privacy of their data.
