Researchers Hack Google’s Gemini CLI Through Prompt Injections in GitHub Actions

By Published On: December 8, 2025

 

The integration of artificial intelligence into development pipelines offers powerful automation capabilities, yet it also introduces novel attack vectors. Recent research has exposed a critical vulnerability class, dubbed “PromptPwnd,” impacting AI agents within GitHub Actions and GitLab CI/CD environments. This sophisticated attack vector, allowing prompt injections to compromise powerful tools like Google’s Gemini CLI, underscores a significant concern for organizations leveraging AI in their CI/CD processes. Understanding and mitigating PromptPwnd is no longer optional; it’s a security imperative.

Understanding Prompt Injections in CI/CD

Prompt injection is a type of vulnerability where an attacker manipulates the input fed to an AI model, causing it to deviate from its intended behavior or execute unauthorized actions. In the context of CI/CD pipelines, this becomes particularly dangerous as AI agents often operate with elevated privileges to perform tasks like code analysis, deployment, or secret management. The “PromptPwnd” vulnerability specifically targets these AI agents, leveraging untrusted user inputs found in places such as issue titles or pull request bodies.

By crafting malicious prompts embedded within seemingly innocuous text, attackers can trick AI models—including those powering Google’s Gemini CLI—into executing commands they were never authorized to run. This could lead to a cascade of security incidents, from sensitive data leakage to complete workflow alteration and backdoor creation within critical systems.

How PromptPwnd Exploits AI in GitHub Actions

GitHub Actions and GitLab CI/CD pipelines are designed to automate software development workflows, making them central to modern DevOps practices. When AI agents are integrated into these pipelines, they often receive context and instructions from various sources, including user-generated content. This very integration presents the PromptPwnd attack surface.

For example, an attacker could open a pull request with a specially crafted title or description. An AI agent, designed to summarize or analyze pull requests, might then process this malicious input. If the AI model is not adequately secured against prompt injection, it could interpret the malicious instruction as a legitimate command, potentially leading to:

  • Secret Exfiltration: The AI agent could be tricked into printing environment variables, API keys, or other sensitive credentials stored within the CI/CD environment.
  • Code Manipulation: Malicious prompts could induce the AI to alter code, inject backdoors, or disable security checks within the repository.
  • Infrastructure Compromise: With sufficient privileges, the AI agent might execute commands that interact with cloud resources, potentially leading to infrastructure compromise or denial-of-service.
  • Workflow Disruption: Attackers could command the AI to halt pipelines, delete build artifacts, or create infinite loops, disrupting development and deployment schedules.

The reported incidence of this vulnerability affecting at least five Fortune 500 companies underscores the widespread and critical nature of PromptPwnd. While specific CVEs related to this broad class of attack are emerging, a general awareness of CVE-2023-45802 (a related vulnerability in LLMs) highlights the ongoing efforts to categorize these threats.

Remediation Actions for Prompt Injection Vulnerabilities

Addressing PromptPwnd and similar prompt injection vulnerabilities requires a multi-layered security approach, focusing on input validation, privilege minimization, and robust monitoring.

  • Input Sanitization and Validation: All user-generated content that feeds into AI agents must be rigorously sanitized and validated. Implement strict allow-listing for expected inputs and escape any special characters that could be interpreted as commands by the AI.
  • Principle of Least Privilege: Configure AI agents with the absolute minimum necessary permissions. They should only have access to the resources and commands critical for their intended function, limiting the blast radius of any successful injection.
  • Sandboxing AI Agents: Isolate AI agents within sandboxed environments. This ensures that even if an agent is compromised, its ability to interact with critical system resources or data is severely restricted.
  • Contextual Awareness and RAG: Implement Retrieval-Augmented Generation (RAG) or similar techniques to ground AI responses in trusted knowledge bases rather than solely relying on user input. This helps the AI differentiate between legitimate instructions and malicious prompts.
  • Human-in-the-Loop Review: For critical actions or outputs generated by AI agents, introduce a human review step. This serves as a final safeguard against unintended or malicious AI actions.
  • Anomaly Detection and Monitoring: Implement robust logging and monitoring for AI agent activities. Look for unusual command executions, access patterns, or deviations from normal behavior that could indicate a prompt injection attempt.
  • Regular Security Audits: Conduct frequent security audits of CI/CD pipelines and AI agent configurations to identify and rectify potential vulnerabilities before they can be exploited.

Tools for Detection and Mitigation

Several tools and practices can aid in detecting and mitigating prompt injection vulnerabilities within CI/CD pipelines.

Tool Name Purpose Link
OWASP Top 10 for LLMs Comprehensive guide to LLM vulnerabilities OWASP Foundation
OpenAI Moderation API Identifies potentially harmful content in prompts OpenAI
Guardrails AI Framework for adding validation and moderation to LLMs Guardrails AI
SAST/DAST Tools Detects common code vulnerabilities and potential injection points OWASP SAST Tools

Conclusion

The vulnerability class known as “PromptPwnd,” which enables prompt injections against AI agents like Google’s Gemini CLI operating within GitHub Actions and GitLab CI/CD, represents a critical security challenge. The ability for attackers to leverage untrusted user inputs to execute privileged commands, exfiltrate secrets, or alter workflows necessitates an immediate and robust response. Organizations must prioritize stringent input validation, implement the principle of least privilege for AI agents, and establish strong monitoring and human oversight. As AI increasingly integrates into core development processes, comprehending and addressing these unique attack vectors becomes paramount to maintaining secure and resilient software delivery pipelines.

 

Share this article

Leave A Comment