feat(unpinned-images): detect docker/podman commands in run: steps #1677

Open
jfagoagas wants to merge 9 commits into zizmorcore:main from jfagoagas:unpinned-images-run-block

@jfagoagas jfagoagas commented Mar 1, 2026

Pre-submission checks

Please check these boxes:

  • Mandatory: This PR corresponds to an issue (if not, please create
    one first).

  • I hereby disclose the use of an LLM or other AI coding assistant in the
    creation of this PR. PRs will not be rejected for using AI tools, but
    will be rejected for undisclosed use.

If a checkbox is not applicable, you can leave it unchecked.

Motivation

GitHub's agentic workflows are increasingly generating run: step scripts that include docker pull/docker run commands. Unlike human-authored workflows where a developer might remember to pin images, agent-generated scripts almost never pin to a digest. This turns every unpinned docker pull into a supply chain injection point — an attacker who hijacks a tag on a public registry gets code execution inside the workflow.
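
A minimal sketch of what "pinned to a digest" means for an image reference, assuming that only `sha256` digests count and that a digest is written as `@sha256:` followed by 64 hexadecimal characters. The function name `is_digest_pinned` is hypothetical and is not taken from zizmor's code; the audit's real policy may be stricter or more permissive.

```rust
/// Returns true if `image` ends in an `@sha256:<64 hex chars>` digest.
///
/// Assumption: only sha256 digests are considered. Tags such as
/// `ubuntu:22.04` are mutable and therefore count as unpinned.
fn is_digest_pinned(image: &str) -> bool {
    match image.rsplit_once("@sha256:") {
        Some((name, hex)) => {
            // The part before the digest must name an image, and the
            // digest itself must be exactly 64 hex characters.
            !name.is_empty()
                && hex.len() == 64
                && hex.bytes().all(|b| b.is_ascii_hexdigit())
        }
        None => false,
    }
}
```

Under this sketch, `docker pull ubuntu` and `docker pull ubuntu:22.04` are both unpinned, while a reference carrying a full digest is pinned.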

The existing unpinned-images audit only covered declarative container: and services: blocks, missing the imperative docker commands that are becoming more common as CI pipelines get more dynamic.

This also opens the door for future audits targeting other agentic workflow patterns: unvalidated tool outputs piped into shell commands, dynamic action references constructed from agent suggestions, or secrets passed to agent-spawned containers.

Summary

Closes #738.

Extends the unpinned-images audit to detect unpinned image references in docker pull, docker run, docker create, and their podman equivalents within run: steps, in addition to the existing container: and services: checks.

  • Uses tree-sitter bash parsing (following the github-env audit pattern) to identify docker/podman command invocations, avoiding false positives from strings and comments (e.g. echo "docker pull ubuntu")
  • Handles flags, combined short flags (-dit), and --flag=value syntax to correctly extract the image argument
  • Skips shell variable expansions ($IMAGE) that can't be statically resolved
  • Adds audit_step for workflow steps and audit_composite_step for action steps
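
The flag handling described in the list above can be sketched roughly as follows. This is an illustration of the approach, not the PR's code: the name `extract_image`, the abbreviated flag tables, and the omission of global flags before the subcommand are all assumptions made to keep the example short.

```rust
// Abbreviated, illustrative flag tables (assumption: the real lists are longer).
const BOOLEAN_SHORT: &[char] = &['d', 'i', 't', 'P', 'q'];
const BOOLEAN_LONG: &[&str] = &["--detach", "--interactive", "--tty", "--rm", "--quiet"];

/// Extracts the image argument from the arguments that follow `docker`/`podman`.
/// Returns `None` for other subcommands and for images containing `$`.
fn extract_image<'a>(args: &[&'a str]) -> Option<&'a str> {
    let (sub, rest) = args.split_first()?;
    if !matches!(*sub, "pull" | "run" | "create") {
        return None;
    }
    let mut iter = rest.iter().copied();
    while let Some(arg) = iter.next() {
        if arg == "--" {
            // Everything after `--` is positional; the first item is the image.
            return iter.next().filter(|img| !img.contains('$'));
        }
        if let Some(long) = arg.strip_prefix("--") {
            // `--flag=value` carries its own value; boolean flags take none.
            // Any other long flag is assumed to consume the next argument.
            if !long.contains('=') && !BOOLEAN_LONG.contains(&arg) {
                iter.next();
            }
            continue;
        }
        if let Some(shorts) = arg.strip_prefix('-').filter(|s| !s.is_empty()) {
            // Combined short flags such as `-dit`: find the first flag that
            // takes a value. If it is the last character, the value is the
            // next argument; otherwise the value is attached (`-eFOO=bar`).
            if let Some(pos) = shorts.chars().position(|c| !BOOLEAN_SHORT.contains(&c)) {
                if pos + 1 == shorts.chars().count() {
                    iter.next();
                }
            }
            continue;
        }
        // First positional argument: the image, unless it needs expansion.
        return (!arg.contains('$')).then_some(arg);
    }
    None
}
```

For example, `extract_image(&["run", "-dit", "--name=web", "alpine:3.19", "sh"])` yields `Some("alpine:3.19")`, while `extract_image(&["pull", "$IMAGE"])` yields `None`.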

Test Plan

  • cargo build compiles
  • cargo test -p zizmor -- unpinned_images — existing snapshot tests unchanged, new tests pass
  • cargo insta review — accept new snapshots for test_run_docker_pedantic and test_run_docker_regular
  • cargo clippy — no new warnings
  • Unit tests cover: argument extraction, flag skipping, combined flags, subcommand filtering, echo false-positive, variable skipping, podman, multiline scripts

Copilot AI review requested due to automatic review settings March 1, 2026 20:54

Copilot AI left a comment

Pull request overview

Extends the unpinned-images audit to detect unpinned Docker/Podman image references inside run: steps by parsing shell scripts with tree-sitter, complementing the existing container: and services: image checks.

Changes:

  • Add tree-sitter bash-based detection of docker/podman pull|run|create invocations in run: steps (and composite action steps) and emit findings based on pinning policy.
  • Add new integration test workflow fixture and snapshot-based tests for the new detection behavior.
  • Update audit documentation to describe the new run:-step coverage and remediation guidance.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File | Description
docs/audits.md | Documents the new run:-step Docker/Podman detection and adds an additional remediation example.
crates/zizmor/src/audit/unpinned_images.rs | Implements tree-sitter-based parsing of run: scripts to extract Docker/Podman image references and audit them.
crates/zizmor/tests/integration/test-data/unpinned-images/run-docker.yml | Adds workflow test data covering docker/podman commands in run: steps (including multiline scripts and flags).
crates/zizmor/tests/integration/audit/unpinned_images.rs | Adds snapshot tests (regular + pedantic personas) for the new fixture.

Member

@woodruffw woodruffw left a comment

Thanks for opening a PR @jfagoagas.

I'm interested in this feature, but I'm concerned about this approach's stability/brittleness; I've left some comments to that effect.

Separately, this PR is probably too big to review -- anything above roughly ~500 lines takes me much longer to read and understand, so it'd be ideal to break this up into several smaller PRs (each with tests, where possible). If you're interested in taking that on, we can discuss what an MVP/tractable design would look like here.

Comment on lines +21 to +77
/// Tree-sitter query matching `docker` and `podman` command invocations in bash.
///
/// NOTE: `@span` is required by [`utils::SpannedQuery`] and can be used in the
/// future to produce more precise finding locations within multiline scripts.
const BASH_DOCKER_CMD_QUERY: &str = r#"
(command
  name: (command_name) @cmd
  argument: (_)* @args
  (#match? @cmd "^(docker|podman)$")
) @span
"#;

/// Boolean short flags for docker/podman pull/run/create that do NOT consume
/// the next argument.
///
/// NOTE: `-a` is intentionally excluded — it means `--all-tags` for `pull`
/// (boolean) but `--attach` for `run`/`create` (value-consuming). Treating it
/// as value-consuming is the safer default to avoid false negatives.
const BOOLEAN_SHORT_FLAGS: &[&str] = &["-d", "-i", "-t", "-P", "-q"];

/// Boolean long flags for docker/podman pull/run/create that do NOT consume
/// the next argument.
///
/// This is the union of boolean flags across all three subcommands for both
/// Docker and Podman. Unknown flags default to value-consuming, which is the
/// safe choice for a security tool (may miss findings, but won't produce false
/// positives).
const BOOLEAN_LONG_FLAGS: &[&str] = &[
    // docker pull/run/create
    "--all-tags",
    "--detach",
    "--disable-content-trust",
    "--init",
    "--interactive",
    "--no-healthcheck",
    "--oom-kill-disable",
    "--privileged",
    "--publish-all",
    "--quiet",
    "--read-only",
    "--rm",
    "--sig-proxy",
    "--tty",
    "--use-api-socket",
    // podman-only booleans (pull)
    "--tls-verify",
    // podman-only booleans (run/create)
    "--env-host",
    "--http-proxy",
    "--no-hostname",
    "--no-hosts",
    "--passwd",
    "--read-only-tmpfs",
    "--replace",
    "--rmi",
    "--rootfs",
];
Member

I don't think this approach is sustainable long-term, unfortunately -- we either need a lighter-weight way to detect docker images/invocations in run: blocks, or something that's more formal and testable (e.g. explicitly modeling these tools using clap).

(As is, this is going to be brittle on any CLI changes to either docker or podman, which in practice is going to mean a lot of false positive reports.)

Author

@jfagoagas jfagoagas Mar 2, 2026

I think both CLIs are stable enough but we can explore other options to make this simpler.

let text = node.utf8_text(source).ok()?;
text.strip_prefix('\'')?.strip_suffix('\'')
}
_ => node.utf8_text(source).ok(),
Member

This fallback seems risky -- I think you want to match "word" explicitly here and then have unknown node types here be _ => None?

Author

@jfagoagas jfagoagas Mar 2, 2026

"string" => {
// For double-quoted strings, extract the string_content child
// to avoid the surrounding quotes.
if node.named_child_count() == 1 {
Member

I think this should explicitly match the string_content node, rather than assuming that any child of string is string_content -- the latter is probably not guaranteed to be true.

For example, "$(abc)" is string > command_substitution instead of string > string_content.

Author

@jfagoagas jfagoagas Mar 2, 2026

if let Some(image_ref) = Self::extract_docker_image(&args) {
// Skip arguments containing shell variable expansions or
// GitHub Actions expressions — we can't statically resolve those.
if !image_ref.contains('$') {
Member

Noting: this might not be solvable in a clean way (because of actions expressions), but ideally we'd do this filtering via the CST instead (since $ could be a valid part of an image identifier, in principle).
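
The review suggestions in this part of the thread (match `word` explicitly, match `string_content` explicitly, and decide "statically resolvable" from the syntax tree rather than from a `$` substring check) can be sketched with a toy node type. The `Node` enum below is a hypothetical stand-in for tree-sitter nodes, and its variant names only mirror the node kinds discussed here; it does not use the tree-sitter API, and it does not attempt to handle GitHub Actions expressions.

```rust
/// Hypothetical stand-in for a bash CST node (not the tree-sitter API).
#[derive(Debug)]
enum Node {
    Word(String),
    /// Single-quoted string; the text includes the surrounding quotes.
    RawString(String),
    /// Double-quoted string with its child nodes.
    Str(Vec<Node>),
    StringContent(String),
    CommandSubstitution(String),
    SimpleExpansion(String),
}

/// Returns the literal text of an argument only when it is fully static.
fn static_text(node: &Node) -> Option<&str> {
    match node {
        Node::Word(text) => Some(text.as_str()),
        Node::RawString(text) => text.strip_prefix('\'')?.strip_suffix('\''),
        // Accept a double-quoted string only if its single child really is
        // string_content; "$(abc)" has a command_substitution child instead.
        Node::Str(children) => match children.as_slice() {
            [Node::StringContent(text)] => Some(text.as_str()),
            _ => None,
        },
        // Unknown or dynamic node kinds are rejected instead of falling
        // back to their raw text.
        _ => None,
    }
}
```

With this shape, an expansion such as `$IMAGE` is skipped because of its node kind, so no check for a literal `$` character is needed.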

Author

I need to look into this in depth. Rust is not one of my preferred/known languages.

///
/// `args` contains all arguments after the command name (docker/podman).
/// Handles global flags before the subcommand (e.g. `docker --context foo run alpine`).
fn extract_docker_image<'a>(args: &[&'a str]) -> Option<&'a str> {
Member

Per the comment above, I think this is unfortunately going to be too brittle. If the goal is to parse Docker CLI arguments, we should use clap or another fully-equipped parser that can track positionals/consumption states for us.

Member

(This is going to be nontrivial, however, since Docker and Podman have large CLIs.)

Author

I wasn't aware of clap, but it's definitely worth a look for tracking this. We can keep this simpler by covering only some scenarios.

///
/// NOTE: `@span` is required by [`utils::SpannedQuery`] and can be used in the
/// future to produce more precise finding locations within multiline scripts.
const BASH_DOCKER_CMD_QUERY: &str = r#"
Member

What about pwsh, etc.? We generally try to support those too since they're common enough (unfortunately) in GitHub Actions.

(Those don't need to be part of an MVP here, but curious if you have thoughts on them.)

Author

I thought about this but I didn't want to add more to this PR. We can add this as part of the plan.

});

for image_str in self.docker_images_in_run(run, shell)? {
let location = step.location().primary().with_keys(["run".into()]);
Member

This probably needs to use a subfeature to get the exact command/argument span, since run: blocks can be very large.
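
One ingredient of a more precise location is mapping the byte offset of a matched command back to a position inside the run: block. The sketch below shows only that mapping; `line_col` is a hypothetical helper, and how zizmor's subfeature mechanism would consume such a position is not shown here.

```rust
/// Converts a byte offset in `script` to a 1-based (line, column) pair.
/// Columns are counted in characters. Returns `None` if the offset is out
/// of range or does not fall on a character boundary.
fn line_col(script: &str, offset: usize) -> Option<(usize, usize)> {
    if !script.is_char_boundary(offset) {
        return None;
    }
    let before = &script[..offset];
    // Each newline before the offset starts a new line.
    let line = before.matches('\n').count() + 1;
    // The column is the length of the text after the last newline.
    let col = before
        .rsplit('\n')
        .next()
        .map_or(0, |tail| tail.chars().count())
        + 1;
    Some((line, col))
}
```

For a script such as `"echo start\ndocker pull ubuntu\n"`, the offset of `docker` maps to line 2, column 1, which is far more useful in a large run: block than a location covering the whole step.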

@jfagoagas
Author

> Thanks for opening a PR @jfagoagas.

Thank you for maintaining such a great tool 🙌

> I'm interested in this feature, but I'm concerned about this approach's stability/brittleness; I've left some comments to that effect.

I'll take a look and reply inline.

> Separately, this PR is probably too big to review -- anything above roughly ~500 lines takes me much longer to read and understand, so it'd be ideal to break this up into several smaller PRs (each with tests, where possible). If you're interested in taking that on, we can discuss what an MVP/tractable design would look like here.

I suspected this would happen while I was creating it. I saw that this change requires a refactor of the audit, so please let me know how you'd like to split it up! We can create a chain of PRs where each one points to the previous one and can be merged independently.

@woodruffw
Member

> Thank you for maintaining such a great tool 🙌

Thank you for the kind words!

Yeah, my preference here would be to come up with a sequence of PRs that add this, each no more than a few hundred lines. Ideally we'd start with just Docker, and do:

  1. A PR containing accurate models of the Docker CLI, so we can parse it and extract image references without ad-hoc CLI parsing. These could be based on clap or something else; it's a somewhat open design space (but we already use clap for zizmor's own CLI). This would also ideally contain tests so we can pre-validate it before adding it to the audit.
  2. Integrations into this audit, basically extracting command (name) (argument) (argument) ... trees from the bash CST and parsing them with (1) + reaching determinations.
  3. Repeating the same for Podman's CLI.
  4. Repeating the same for non-bash shells, primarily pwsh. This is optional though; I guess it's probably not very common to invoke Docker or other OCI containers on Windows runners now that I think about it.

I think a sequence roughly like that would make this tractable for me to review.

@jfagoagas
Author

jfagoagas commented Mar 3, 2026

> Thank you for the kind words!
>
> Yeah, my preference here would be to come up with a sequence of PRs that add this, each no more than a few hundred lines. Ideally we'd start with just Docker, and do:
>
>   1. A PR containing accurate models of the Docker CLI, so we can parse it and extract image references without ad-hoc CLI parsing. These could be based on clap or something else; it's a somewhat open design space (but we already use clap for zizmor's own CLI). This would also ideally contain tests so we can pre-validate it before adding it to the audit.
>   2. Integrations into this audit, basically extracting command (name) (argument) (argument) ... trees from the bash CST and parsing them with (1) + reaching determinations.
>   3. Repeating the same for Podman's CLI.
>   4. Repeating the same for non-bash shells, primarily pwsh. This is optional though; I guess it's probably not very common to invoke Docker or other OCI containers on Windows runners now that I think about it.
>
> I think a sequence roughly like that would make this tractable for me to review.

Sounds like a plan. I'll start with the first one once I get the chance, most likely over the weekend. If you don't mind, I'll keep this PR as a reference until the feature is implemented.

@woodruffw
Member

woodruffw commented Mar 3, 2026

Yeah, no worries. Thanks for contributing!

@jfagoagas
Author

@woodruffw before jumping into writing code, I'd like to validate the following plan with you:

Plan: Split unpinned-images-run-block into reviewable PR chain

PR chain

PR 1: Docker CLI models with clap (~350-400 lines, from main)

  • New file: crates/zizmor/src/models/docker.rs
  • Clap-derive structs modeling Docker CLI for pull, run, create subcommands
  • Single public API: DockerCli::parse_image(args: &[&str]) -> Option<String>
  • Clap handles combined -dit flags, --flag=value, -eFOO=bar automatically
  • try_parse_from failures return None (safe false negatives)
  • Unit tests only, no audit dependency
  • Docker-only flags (no Podman)

PR 2: Audit integration (~300-400 lines, net negative from deletions)

  • Delete hand-rolled flag parsing (~140 lines of const arrays + helper functions)
  • Replace Self::extract_docker_image(&args) with DockerCli::parse_image(&args_refs)
  • Keep all tree-sitter logic, check_image_ref, Audit trait impls unchanged
  • Add test fixture (run-docker.yml) and integration tests

PR 3: Podman CLI model (~200-250 lines)

  • Add PodmanCli struct with Podman-specific flags (--remote, --syslog, --env-host, --replace, etc.)
  • -a semantics differ between Docker and Podman; separate structs keep this clean
  • Dispatch on @cmd tree-sitter capture: "docker" → DockerCli, "podman" → PodmanCli

PR 4: Documentation (final)

  • docs/audits.md updates for run: block detection
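
The API shape proposed in the plan (one `parse_image` entry point per tool, parse failures mapped to `None`, and dispatch on the command name) could look roughly like this. The sketch deliberately does not use clap: `try_parse` is a hypothetical stand-in that accepts only `<subcommand> <image> [args...]` and rejects any flag, so that the example stays self-contained. `ParseError` and `image_for_command` are also illustrative names, not part of the plan.

```rust
/// Hypothetical error type standing in for a real parser's error.
#[derive(Debug, PartialEq)]
enum ParseError {
    UnknownSubcommand,
    UnsupportedFlag,
    MissingImage,
}

/// Stand-in for a full CLI model (assumption: no flags are supported).
fn try_parse(subcommands: &[&str], args: &[&str]) -> Result<String, ParseError> {
    let (sub, rest) = args.split_first().ok_or(ParseError::UnknownSubcommand)?;
    if !subcommands.contains(sub) {
        return Err(ParseError::UnknownSubcommand);
    }
    let image = rest.first().ok_or(ParseError::MissingImage)?;
    if image.starts_with('-') {
        return Err(ParseError::UnsupportedFlag);
    }
    Ok(image.to_string())
}

struct DockerCli;
struct PodmanCli;

impl DockerCli {
    /// Any parse failure becomes `None`, i.e. a false negative.
    fn parse_image(args: &[&str]) -> Option<String> {
        try_parse(&["pull", "run", "create"], args).ok()
    }
}

impl PodmanCli {
    /// Separate type so Podman-specific flags can diverge from Docker's.
    fn parse_image(args: &[&str]) -> Option<String> {
        try_parse(&["pull", "run", "create"], args).ok()
    }
}

/// Dispatches on the captured command name.
fn image_for_command(cmd: &str, args: &[&str]) -> Option<String> {
    match cmd {
        "docker" => DockerCli::parse_image(args),
        "podman" => PodmanCli::parse_image(args),
        _ => None,
    }
}
```

Keeping the fallible parse separate from the `Option`-returning wrapper means unit tests can still assert on the specific failure, while the audit only ever sees "image found" or "nothing to report".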

Open questions: PR review workflow

Two options for how to structure the chain:

Option A — Sequential merge into main: Each PR targets main. PR #2 waits for PR #1 to merge, PR #3 waits for PR #2. You review and merge one at a time.

Option B — Stacked PRs: PR 1 targets main, PR 2 targets PR 1's branch, PR 3 targets PR 2's branch. You can review the full chain at once, then merge bottom-up.

Which do you prefer?

@woodruffw
Member

Thanks @jfagoagas, that plan sounds reasonable to me. I prefer to do sequential merges for this, as long as that works for you!

(I have some small nitpicks on each, but they're minor and can be addressed on-the-fly.)

Development

Successfully merging this pull request may close these issues.

Feature: unpinned-images could discover docker pull ... patterns in run: clauses

3 participants