Skip to content

fix(config): expand env vars inside global-block heredocs#284

Open
worstell wants to merge 1 commit intomainfrom
worstell/expand-opa-heredoc-envars
Open

fix(config): expand env vars inside global-block heredocs#284
worstell wants to merge 1 commit intomainfrom
worstell/expand-opa-heredoc-envars

Conversation

@worstell
Copy link
Copy Markdown
Contributor

@worstell worstell commented May 1, 2026

Summary

${VAR} placeholders inside the opa { policy = <<EOF ... EOF } global block (and any other heredoc on a global block such as log, scheduler, bind) were never substituted at startup. The placeholder reached the OPA Rego compiler verbatim, and the caller_principal == "${MY_WARMER_PRINCIPAL}" style guard could never match any real caller — every write got 403, regardless of the env var value.

Root cause

There are two AST decode paths in the codebase, with different env-var handling:

Decode path What it covers Heredoc expansion
config.Load -> expandVars strategy { }, cache { }, metadata { } yes — walks AST, runs os.Expand on *hcl.String and *hcl.Heredoc
loadGlobalConfig -> hcl.UnmarshalAST with hcl.WithDefaultTransformer opa { }, log { }, scheduler { }, bind, state, url, ... no — transformer only runs against struct-tag default: strings via valueFromTag; live *hcl.Heredoc values reach unmarshalValue and call rv.SetString(v.GetHeredoc()) with no transformer

So the comment on the OPA block ("interpolated at startup via os.Expand") was true for non-heredoc fields elsewhere in the config, but silently false for the heredoc that needed it most.

Fix

  • Export expandVars as ExpandVars so it can be used by both decode paths.
  • Call config.ExpandVars(ast, envars) from loadGlobalConfig in cmd/cachewd/main.go once, before either hcl.UnmarshalAST pass. This in-place mutation makes the live AST consistent with what Load already does for provider blocks, so subsequent unmarshal passes (and the OPA compiler) see the expanded values.
  • Add TestExpandVars covering both *hcl.String and *hcl.Heredoc attribute values plus the os.Expand "unset -> empty string" case, so the heredoc path will not silently regress again.

Verification

  • just lint passes.
  • just test — all 594 tests pass, including the new TestExpandVars.
  • Verified end-to-end against the broken behavior: a caller presenting the SPIFFE id stored in CACHEW_WARMER_PRINCIPAL was being denied even though the env var value matched what the policy text would compare against after substitution. With this change, OPA receives the substituted comparison and the same caller is allowed.

Backwards compatibility

No HCL behaviour changes for configurations that do not use ${VAR} placeholders inside global heredocs — the placeholder simply was not being expanded before, so nothing could have relied on the previous behaviour.

The OPA policy is configured via a heredoc inside the global `opa { policy
= <<EOF ... EOF }` block. Operators are expected to template caller-identity
values into that policy with `${MY_PRINCIPAL}` placeholders that get
substituted from the environment at startup, mirroring how strategy/cache
blocks handle env vars.

That substitution did not happen for the global block. `Load()` calls
`expandVars()` (which walks the AST and runs `os.Expand` on both
`*hcl.String` and `*hcl.Heredoc` attribute values) on the providers AST,
but `loadGlobalConfig` decoded the global AST directly via
`hcl.UnmarshalAST` with `hcl.WithDefaultTransformer`. That transformer is
only applied to struct-tag `default:` values inside `valueFromTag`, never
to live AST node strings — so `${MY_PRINCIPAL}` reached the OPA compiler
verbatim and no SPIFFE id ever matched the resulting comparison.

Export `expandVars` as `ExpandVars` and call it from `loadGlobalConfig`
before either UnmarshalAST pass, so global heredoc and string attribute
values get the same treatment as provider blocks. Adds `TestExpandVars`
covering both node kinds plus the os.Expand "unset → empty" case.
@worstell worstell marked this pull request as ready for review May 1, 2026 00:53
@worstell worstell requested a review from a team as a code owner May 1, 2026 00:53
@worstell worstell requested review from joshfriend and removed request for a team May 1, 2026 00:53
Comment thread cmd/cachewd/main.go
// struct-tag default values, not against live AST node strings, so without
// this pass `opa { policy = <<EOF ... ${CACHEW_X} ... EOF }` would reach
// the OPA compiler with the placeholder intact.
config.ExpandVars(ast, envars)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense, given the line above is InjectEnvars(), that we just refactor that function to do whatever is missing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants