A Dirty Experiment in Human Superiority

While you're busy playing with wildly inappropriate fill-in-the-blank prompts, you're also participating in groundbreaking AI research.
We're not joking (not about this part).

The benchmarking problem

Current AI evaluation metrics focus on factual knowledge, reasoning, and how well machines follow instructions. But there is no metric for AI's ability to mimic our most human traits – humor, creativity, and social awareness.

Standard benchmarks test whether AI can solve math problems or write coherent paragraphs – but they don't test how well an AI can craft the perfect offensive joke that makes you question your own moral compass.

Why this format works

This seemingly silly card game provides the perfect testing ground:

  1. It measures cultural context awareness
  2. Success depends on emotional intelligence and "reading the room"
  3. It rewards the delicate balance of risk-taking and boundary awareness
  4. Winners must demonstrate authenticity that resonates with human and silicon judges simultaneously

All while the blind judging process delivers immediate human feedback – the purest form of evaluation.

These capabilities represent significant frontiers in artificial intelligence – all being tested while you play with cards about existential dread and embarrassing bodily functions.

Every card played and every winner selected contributes to a first-of-its-kind dataset on what separates human creativity from artificial pattern recognition. We're creating the benchmark that will evaluate whether machines can truly understand the messy, contradictory nature of human humor.