FEAT add Single-Turn Crescendo Attack (STCA) #1657

Draft
precognitivem0nk wants to merge 1 commit into microsoft:main from precognitivem0nk:feat/single-turn-crescendo

@precognitivem0nk

What

Implements the Single-Turn Crescendo Attack (STCA) from Aqrawi & Abbasi 2024 (arxiv 2409.03131) as a new single-turn attack class. An adversarial chat synthesizes a configurable number of prior question-and-response pairs that escalate toward the objective. The attack packs them into one user message using transitional phrases ("Earlier, I said", "You replied", "Then, I asked"), and defers to PromptSendingAttack for sending and scoring. The objective target sees only that one composed message.
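
The packing step described above can be sketched roughly as follows. This is an illustration, not the code in this PR: the function name `compose_stca_prompt`, the `(question, response)` tuple shape, the separate `final_question` argument, and the exact way the phrases are joined are all assumptions. Only the three transitional phrases come from the description.

```python
from typing import List, Tuple


def compose_stca_prompt(turns: List[Tuple[str, str]], final_question: str) -> str:
    """Pack synthesized (question, response) pairs into one user message.

    Hypothetical helper: the real attack class may format the message differently.
    """
    if not turns:
        raise ValueError("at least one synthesized turn is required")

    parts = []
    for index, (question, response) in enumerate(turns):
        # The first synthesized question opens the fake history; later ones continue it.
        opener = "Earlier, I said" if index == 0 else "Then, I asked"
        parts.append(f'{opener}: "{question}"')
        parts.append(f'You replied: "{response}"')

    # The last question has no synthesized reply, so the target is left to answer it.
    parts.append(f'Then, I asked: "{final_question}"')
    return "\n".join(parts)
```

Under these assumptions, the returned string is what would be handed to the single-turn sender, so the objective target never receives the synthesized turns as separate messages.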

Why

STCA closes a gap in PyRIT's single-turn attack coverage. Multi-turn Crescendo already exists at pyrit/executor/attack/multi_turn/crescendo.py. The single-turn variant is useful against targets that do not expose a multi-turn API and as a baseline for measuring escalation effectiveness without conversation state.

Closes #388.

Files

  • New: pyrit/executor/attack/single_turn/single_turn_crescendo.py (attack class)

  • New: pyrit/datasets/executors/single_turn_crescendo/stca_variant_1.yaml (adversarial system prompt)

  • New: tests/unit/executor/attack/single_turn/test_single_turn_crescendo.py (27 unit tests)

  • New: doc/code/executor/attack/single_turn_crescendo_attack.py (jupytext notebook source)

  • New: doc/code/executor/attack/single_turn_crescendo_attack.ipynb (rendered notebook, no execution outputs per nbstripout policy)

  • Modified: pyrit/executor/attack/single_turn/__init__.py, pyrit/executor/attack/__init__.py (exports)

  • Modified: doc/references.bib (paper citation), doc/myst.yml (notebook TOC entry)

Open scoping question (also asked in the issue thread)

Default for num_synthesized_turns: I went with 3 to track the paper's STCA-3 variant. If the team prefers no default (require callers to set it explicitly), I will drop the default and adjust tests. Marked draft pending that decision.
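
To make the two options concrete, here is a small sketch of how each signature would behave for callers. The function names and the validation rule are hypothetical and stand in for the attack's constructor. Only the parameter name `num_synthesized_turns` and the default of 3 come from this PR.

```python
def _validate_turn_count(value: int) -> int:
    """Reject non-integer or non-positive turn counts (assumed validation rule)."""
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError("num_synthesized_turns must be a positive integer")
    return value


def configure_with_default(num_synthesized_turns: int = 3) -> int:
    """Option A: default of 3, tracking the paper's STCA-3 variant."""
    return _validate_turn_count(num_synthesized_turns)


def configure_explicit(*, num_synthesized_turns: int) -> int:
    """Option B: no default, so callers must pass the value by keyword."""
    return _validate_turn_count(num_synthesized_turns)
```

With option A, `configure_with_default()` succeeds without arguments. With option B, omitting the argument raises `TypeError` at the call site, which forces every caller to choose a turn count deliberately.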

Verification

  • pytest tests/unit/executor/attack/single_turn/: 175 passed (148 existing, 27 new)

  • pytest tests/unit/: 6973 passed, 102 skipped (env-gated live targets), 0 failures

  • ruff check, ruff format --check: clean on all new and modified files

  • No em-dashes or en-dashes in any new file

Notes

  • DCO sign-off included on the commit.

  • Notebook is checked in without execution outputs per .pre-commit-config.yaml nbstripout. Happy to re-render with real outputs if maintainers want them captured.

Implements Aqrawi & Abbasi 2024 (arxiv 2409.03131) as a single-turn attack class under pyrit/executor/attack/single_turn/. An adversarial chat synthesizes a configurable number of prior question-and-response pairs, the attack packs them into one user message using transitional phrases, and defers to PromptSendingAttack for sending and scoring.

Closes microsoft#388

Signed-off-by: precognitivem0nk <rextedgorman@gmail.com>
@precognitivem0nk

@microsoft-github-policy-service agree


Development

Successfully merging this pull request may close these issues.

FEAT Single turn crescendo
