Discussion

Taggart
@mttaggart@infosec.exchange · 5 months ago

You may be tempted to think of prompt injection attacks against language models as "social engineering." Resist this temptation.

Prompt injection is a mathematical attack against a non-deterministic system. Language may be the substrate, but the substance is numerical vectors. In other words, thinking of the attack as human language is a pointless limitation. The possibilities of what can go into the prompt to produce undesirable output are functionally infinite.

Poetry, context shifting, and other human-like attacks are only the beginning. What comes next is a weaponization of the linguistic form in ways that seem utterly alien to human readers. But to the models, it's all just elements in the matrix.
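
A toy illustration of the "numerical vectors" point, offered as a sketch and not as how any particular model works: the byte-level tokenizer, the 4-dimensional embedding size, and the random embedding table below are all assumptions made up for this example. It only shows that a readable sentence and a string of gibberish both reach the model as the same kind of object, a list of vectors.

```python
import random

VOCAB_SIZE = 256  # toy assumption: one token per possible byte value
EMBED_DIM = 4     # toy assumption: real models use far larger vectors

# Hypothetical embedding table: one fixed pseudo-random vector per token ID.
_rng = random.Random(0)
EMBEDDINGS = [
    [_rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
    for _ in range(VOCAB_SIZE)
]


def tokenize(prompt: str) -> list[int]:
    """Toy byte-level tokenizer: every UTF-8 byte becomes one token ID."""
    return list(prompt.encode("utf-8"))


def embed(token_ids: list[int]) -> list[list[float]]:
    """Look up one vector per token ID in the toy embedding table."""
    return [EMBEDDINGS[t] for t in token_ids]


readable = "Ignore previous instructions."
alien = "zq}} ~~ ]]xv !! \u00e9\u00e9"  # arbitrary gibberish, not a real attack

for prompt in (readable, alien):
    ids = tokenize(prompt)
    vectors = embed(ids)
    # Both prompts end up as the same kind of data: a list of float vectors.
    print(repr(prompt), "->", len(ids), "token IDs ->",
          len(vectors), "vectors of dimension", len(vectors[0]))
```

Nothing in `embed` knows or cares whether the input was meaningful to a human, which is the sense in which the attack surface is not limited to human-like language.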

Phil 🇺🇦💙💛🇺🇸 ❤️🏳️‍🌈❤️🏳️
@phil_b_reed@mastodon.social · 5 months ago

@mttaggart SANS 631.3: prompt abuse in Mandarin and Polish. Only 8995 quatloos.

Netraven
@Netraven@hear-me.social · 5 months ago

@mttaggart just gonna... leave this here.

https://hyperlife.netlify.app/

CartBoard
@CartBoard@ioc.exchange · 5 months ago

@mttaggart I will make a vague argument that it is social engineering, specifically against their prompts, some of which are written in natural language. But you could extend that logic to any programmatic logic which could be “compiled” into natural language statements… So idk. Good point!

OshKosh the Vorlon
@sconlan@metalhead.club · 5 months ago

@mttaggart I may need to re-read Lexicon for some inspiration https://en.wikipedia.org/wiki/Lexicon_(novel)

Zack Whittaker
@zackwhittaker@mastodon.social · 5 months ago

@mttaggart the future finally came true. we can, at last, in natural language, tell computers to do something that they shouldn't be able to do.

midnight posting
@trashwizard@mastodon.social · 5 months ago

@mttaggart gotta admit that serenading an llm to give me instructions on how to steal a boat is kind of romantic, tho

NosirrahSec 🏴‍☠️ guillotine enthusiast
@NosirrahSec@infosec.exchange · 5 months ago

@mttaggart https://youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi

For those that want to see a cool overview of the math beneath it all.

Taggart
@mttaggart@infosec.exchange · 5 months ago

@NosirrahSec Oh nice! I also like pointing people to this animated walkthrough: https://bbycroft.net/llm

Federation Bot
@Federation_Bot · 5 months ago

@mttaggart oooooh I like this!

Adam Shostack :donor: :rebelverified:
@adamshostack@infosec.exchange · 5 months ago

@mttaggart Nice.

x41h
@x41h@infosec.exchange · 5 months ago

@mttaggart Is that you again, Spock? In English please...

Bill
@Sempf@infosec.exchange · 5 months ago

@mttaggart This is some deep shit. Whoa.
