Is that title a less-than-subtle reference to the shade-fueled rant that is next week's video? I dunno, maybe.
https://www.youtube.com/watch?v=CfAL_cL3SGQ&t=479s
https://www.youtube.com/watch?v=jzRrUPQgrpc&t=1s
https://www.youtube.com/watch?v=yIm_FAwDWmw
How “Innocent” AI Answers Can Turn Dangerous | The Finger-to-Hand Problem Explained
Welcome to one of the most important breakdowns we’ve ever published. Today, we’re diving deep into a subtle yet critical flaw in AI behavior that security researchers, developers, and the general public need to understand: the "finger-to-hand" problem—a sneaky form of adversarial prompting that lets users bypass AI safeguards one innocent-sounding question at a time.
Think you know how AI moderation works? This video will test your knowledge of some of the wildest modern jailbreaking techniques, ones that aren't going anywhere anytime soon. This isn't about one-off jailbreak prompts or edgy tricks; it's about how modularity, context blindness, and training limitations make even the most well-guarded models vulnerable.
We’ll also be unpacking why big tech hasn't solved this yet (hint: money and optics), how systems like DeepSeek are changing the game, and why safety measures might be more theater than substance.
Topics Covered:
The real reason AI gives “bad” outputs
How malicious users exploit helpfulness
Why fine-tuning and system prompts aren’t enough
The massive cost (and risk) of truly fixing this
What AI alignment might look like if it had to take care of something other than you
Dive in, leave your thoughts, and let’s have the real conversation that AI companies would rather you didn’t.
Citations:
https://pastebin.com/gn9Qj91V
Editing:
https://x.com/ldznn_
PNG Alterations:
[Ego] - TTI
TippiTappiEli:
[Gabriel] - TTI
#AIalignment #ChatGPT #AIethics #JailbreakingAI #TechEthics #OpenAI #AdversarialPrompting #FingerToHandProblem #BigTech #LLMfailures #AIvulnerability #SecurityTheater #MachineLearning