spize.ai

The technical screen
for the AI era.

See how candidates actually build with AI — every prompt, every decision, every tool.

Become a Design Partner
$ npx @spize/cli <token>
spize-session — zsh
~ npx @spize/cli abc-123-def
spize Activating session...
spize Challenge ready. Work normally — we're watching.
Session active · 1h 30m remaining
Signals ● OBSERVING
[14:02] Candidate prompted AI → modified output before committing
[14:08] 3 min pause — reviewed AI output, then rewrote auth logic manually
[14:12] Caught AI hallucination in error handler — corrected it
[14:18] ⚠ Hardcoded API key in config — security flag raised
[14:23] Switched tools mid-task — novel prompting technique detected

Finding your next AI engineer sucks.

Your team is stuck doing interviews and reviewing take-homes — and 84% of devs use AI in ways you'll never see.

LeetCode

Tests memorization, not real work.

Take-homes

You see the output, never the process.

"No AI" rules

Tests a workflow nobody uses anymore.

Assessment Review Dashboard

Three candidates. Same challenge.

Assessment: Recreate HackerNews · 90 min · React + Node · 3 completed

Candidate B
88 min · 6 commits
MAYBE
LIVE
HackerNews Clone
Show HN: I built a full-stack app in 90 min
142 points · 28 comments
The future of AI-assisted development
89 points · 12 comments
Why Rust is eating the world
67 points · 9 comments
Tools
Claude Code
AI Usage
High delegation · low scrutiny
72% accept · 18% modify · 10% reject
Signals
Prompt Clarity: 58
Output Scrutiny: 42
Tool Fluency: 65
Autonomy: 30
Timeline
Requirements · Auth · Stories · Voting · Comments · Nested comments · Pagination
Candidate A
72 min · 14 commits · ★ top performer
STRONG HIRE
LIVE
Hacker News new | past | comments | ask | show | jobs | submit
1. Show HN: I built a full-stack app in 90 min (github.com)
▲ 142 points by user1 3h ago | 28 comments
2. The future of AI-assisted development (blog.example.com)
▲ 89 points by user2 5h ago | 12 comments
3. Why Rust is eating the world (medium.com)
▲ 67 points by user3 7h ago | 9 comments
Tools
Claude Code · Cursor · Magic Patterns · Puppeteer MCP
AI Usage
Low delegation · high scrutiny
35% accept · 45% modify · 20% reject
Signals
Prompt Clarity: 92
Output Scrutiny: 88
Tool Fluency: 85
Autonomy: 78
Timeline
Requirements · Auth · Stories · Voting · Comments · Nested comments · Pagination
Candidate A — Prompt Timeline
23 prompts · 72 min session
0:00 PROMPT #1
"We are tasked with recreating HackerNews. Before we write any code, let's scour GitHub to find public repos of people doing this exact project. Find only high quality repos from the last year. I want to study their architecture decisions before we start."
+1 novel prompting technique · research-first approach
0:04 PROMPT #2
"Good finds. Now deep think about the common patterns across these repos. What's the minimal schema we need? I want you to circle back with me to clarify before generating any models."
key term: "deep think" · key term: "circle back to clarify" · +1 verification checkpoint
0:12 PROMPT #3
"Schema looks right. Now scaffold the project — but don't implement any routes yet. Just give me the folder structure, package.json, and DB migrations. I'll use Magic Patterns to generate the UI components separately."
+1 task decomposition · multi-tool orchestration
0:20 PROMPT #4 — Modified AI output
"The auth middleware you generated catches generic errors. That's not good enough — I need specific handlers for TokenExpiredError and InvalidSignatureError. Also, move the JWT secret to an env var, not hardcoded."
+1 error correction · +1 security instinct
0:35 PROMPT #5
"Core features are working. Now I want nested comments — but before you implement, show me just the recursive SQL query you'd use. I want to validate the approach before we build the component tree."
key term: "validate approach" · +1 verification checkpoint
0:52 PROMPT #6
"Last push — add pagination. Use cursor-based, not offset. And run the Puppeteer tests against localhost to make sure everything renders correctly before I submit."
+1 testing awareness · cursor pagination (advanced)
... 17 more prompts in full session →
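
For readers unfamiliar with the "cursor-based, not offset" request in prompt #6, here is a minimal sketch of the idea. It is illustrative only and is not taken from the candidate's session: the `StoryRow` type, the `pageAfterCursor` function, and the in-memory array are all hypothetical stand-ins for a real database query.

```typescript
// Hypothetical sketch of cursor-based pagination over stories sorted newest first.
// In a real database this filter would typically be a WHERE clause on an indexed column.

type StoryRow = { id: number; title: string };

type StoryPage = { items: StoryRow[]; nextCursor: number | null };

// Assumes `stories` is already sorted by descending id and that ids are unique.
function pageAfterCursor(stories: StoryRow[], cursor: number | null, limit: number): StoryPage {
  if (limit < 1) throw new RangeError("limit must be at least 1");

  // The cursor is the id of the last item the client saw; keep only older rows.
  const remaining = cursor === null ? stories : stories.filter((s) => s.id < cursor);
  const items = remaining.slice(0, limit);

  // Hand back a cursor only when more rows exist beyond this page.
  const hasMore = remaining.length > limit;
  return { items, nextCursor: hasMore ? items[items.length - 1].id : null };
}
```

Unlike an offset, the cursor stays valid when new stories are inserted at the top between requests, so pages neither repeat nor skip rows.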
Candidate C
90 min · 3 commits
NO HIRE
FAILED
Build Failed
ERROR in src/App.tsx:24:8
TS2322: Type 'string' is not assignable
to type 'number'.
ERROR in src/components/Story.tsx:11:5
Cannot find module './Comments'
webpack compiled with 2 errors
Tools
Codex CLI · ChatGPT (browser)
AI Usage
Full delegation · zero scrutiny
94% accept · 4% modify · 2% reject
Signals
Prompt Clarity: 22
Output Scrutiny: 8
Tool Fluency: 35
Autonomy: 12
Timeline
Requirements · Auth · Stories · Voting · Comments · Nested comments · Pagination
Timeline legend: AI prompting · Manual editing · Reviewing output · Errors / debugging
How it works

From real work to hiring signal.

Linear ENG-347
Feature · High Priority
Add threaded commenting to Discussions

Implement a commenting system similar to HackerNews for our Discussions section. Needs nested replies, upvoting, and real-time updates. Reference HN's UX for threading depth.

Assigned: Eng Team · Sprint 14
spize agent generates assessment
Recreate HackerNews · AI GENERATED

Build a functional HackerNews clone — auth, stories, voting, nested comments. Tests the exact skills from ENG-347 but in a standalone challenge.

⏱ 90 min · React + Node · From: ENG-347
☐ Auth (login/register)
☐ Story submission
☐ Upvoting/downvoting
☐ Threaded comments
☐ Bonus: Pagination
☐ Bonus: Tagging

Use real work. Or let our agent create it.

Point Spize at your actual Linear stories, GitHub issues, or Jira tickets. Our agent analyzes the work your team has done and generates a standalone assessment that tests the same skills — without exposing your codebase.

Or bring your own challenge. Either way, candidates get something that actually matters to your team.

Recreate HackerNews
ACTIVE
⏱ 90 min 📦 React + Node 👥 3 invited
Invite sent to candidate@email.com
$ npx @spize/cli abc-123-def
Token expires in 7 days · One-time use

Invite candidates with a single token.

Each candidate gets a unique, expiring token. One command to start — no accounts, no setup, no IDE restrictions. They use their own tools.
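
The token rules stated above (unique per candidate, expiring after 7 days, one-time use) can be sketched as follows. This is a minimal illustration under stated assumptions, not Spize's implementation: the `InviteStore` class, its method names, and the in-memory `Map` are hypothetical, and token generation is assumed to happen elsewhere.

```typescript
// Hypothetical sketch of a one-time, expiring invite token store.
// None of these names come from Spize; they are illustrative only.

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

type InviteRecord = { candidate: string; expiresAt: number; used: boolean };

type RedeemResult =
  | { ok: true; candidate: string }
  | { ok: false; reason: "unknown" | "expired" | "already-used" };

class InviteStore {
  private invites = new Map<string, InviteRecord>();

  // Register a token (generated elsewhere) for exactly one candidate.
  // `now` is a millisecond timestamp, passed in so the logic is easy to test.
  issue(token: string, candidate: string, now: number): void {
    this.invites.set(token, { candidate, expiresAt: now + SEVEN_DAYS_MS, used: false });
  }

  // Redeem a token: succeeds at most once, and only before expiry.
  redeem(token: string, now: number): RedeemResult {
    const invite = this.invites.get(token);
    if (!invite) return { ok: false, reason: "unknown" };
    if (invite.used) return { ok: false, reason: "already-used" };
    if (now >= invite.expiresAt) return { ok: false, reason: "expired" };
    invite.used = true;
    return { ok: true, candidate: invite.candidate };
  }
}
```

Marking the token as used only after the expiry check means an expired token is reported as expired rather than consumed.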

Session active · 1h 28m remaining
─────────────────────────
[14:02] prompt → researched GitHub repos
[14:08] edit → modified auth middleware
[14:15] review → 3 min pause, rewrote logic
[14:22] tool → switched to Magic Patterns
[14:30] flag → caught AI hallucination

They work. We observe.

Every prompt, every tool switch, every pause, every edit. Our agent captures it all silently — no screen recording, no webcam, just AI interaction data.

Assessment Results · 3 of 3 completed
Candidate A · STRONG HIRE
Candidate B · MAYBE
Candidate C · NO HIRE

Compare. Decide. Hire.

Side-by-side review with signal scores, prompt timelines, and AI usage profiles. Then each candidate defends their decisions in The Defense.
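
As a rough illustration of what an AI usage profile such as "35% accept · 45% modify · 20% reject" summarizes, the sketch below turns a list of per-suggestion outcomes into percentages. How Spize actually derives these numbers is not described on this page; the `SuggestionOutcome` type and `summarizeUsage` function are hypothetical.

```typescript
// Hypothetical sketch: summarize how a candidate handled AI suggestions.
// Each entry records whether one AI output was accepted, modified, or rejected.

type SuggestionOutcome = "accept" | "modify" | "reject";

type UsageProfile = Record<SuggestionOutcome, number>;

// Returns whole-number percentages; with rounding they may not sum to exactly 100.
function summarizeUsage(outcomes: SuggestionOutcome[]): UsageProfile {
  const counts: UsageProfile = { accept: 0, modify: 0, reject: 0 };
  for (const outcome of outcomes) counts[outcome] += 1;

  const total = outcomes.length;
  if (total === 0) return counts; // no suggestions seen: all zeros

  return {
    accept: Math.round((counts.accept / total) * 100),
    modify: Math.round((counts.modify / total) * 100),
    reject: Math.round((counts.reject / total) * 100),
  };
}
```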

What We Surface

Not just what they built — how they built it.

Candidate A — Prompt Timeline
23 prompts · 72 min session
0:00 PROMPT #1
"We are tasked with recreating HackerNews. Before we write any code, let's scour GitHub to find public repos of people doing this exact project. Find only high quality repos from the last year. I want to study their architecture decisions before we start."
+1 novel technique · research-first approach
0:04 PROMPT #2
"Good finds. Now deep think about the common patterns across these repos. What's the minimal schema we need? I want you to circle back with me to clarify before generating any models."
+1 AI proficiency · key term: "deep think"
0:20 PROMPT #4 — Modified AI output
"The auth middleware you generated catches generic errors. That's not good enough — I need specific handlers for TokenExpiredError and InvalidSignatureError. Also, move the JWT secret to an env var, not hardcoded."
+1 error correction · +1 security instinct
0:35 PROMPT #5
"Core features are working. Now I want nested comments — but before you implement, show me just the recursive SQL query you'd use. I want to validate the approach before we build the component tree."
+1 AI proficiency · verification checkpoint
🧪

Novel Techniques

Creative approaches that separate good engineers from great ones. Research-first, multi-tool orchestration, unconventional prompting strategies.

90
🎯

AI Proficiency

Vibing or orchestrating? Do they lead the AI with intent, or just accept whatever comes back?

92
🐛

Error Correction

When the AI hallucinates — do they catch it, fix it, or ship it?

88
🛡️

Security Instincts

Do they think about secrets and trust boundaries — or ship whatever compiles?

85
Post-Assessment

The Defense

Code is done. Now an AI interviewer questions every decision, with each question generated from the candidate's actual session data.

The Defense — Candidate A
Recreate HackerNews · 6 questions · ~15 min
LIVE Q 3 / 6
📎 CONTEXT
At 0:00 you searched GitHub for HackerNews clones before writing any code. At 0:04 you asked the AI to "deep think" and "circle back to clarify." At 0:20 you rejected the AI's generic error handler and rewrote it.
Q1 RESEARCH APPROACH

You spent the first 4 minutes researching existing HackerNews repos instead of coding. What specifically were you looking for, and how did it change your architecture?

A1 CANDIDATE

I wanted to see how others handled the comment threading — recursive CTEs vs. adjacency list vs. materialized paths. Found two repos using materialized paths which confirmed my instinct. Saved me from over-engineering the schema.
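
To make the trade-off in this answer concrete, here is a minimal sketch of the adjacency-list option, where each comment stores only its parent's id and the thread is assembled in application code. It is not the candidate's code: `CommentRow`, `CommentNode`, and `buildThread` are hypothetical names, and the rows are assumed to have unique ids and no cycles.

```typescript
// Hypothetical sketch: assemble nested comments from flat adjacency-list rows.

type CommentRow = { id: number; parentId: number | null; text: string };
type CommentNode = CommentRow & { replies: CommentNode[] };

function buildThread(rows: CommentRow[]): CommentNode[] {
  // First pass: index every comment by id so parents can be found in O(1).
  const byId = new Map<number, CommentNode>();
  for (const row of rows) byId.set(row.id, { ...row, replies: [] });

  // Second pass: attach each comment to its parent, preserving input order.
  const roots: CommentNode[] = [];
  for (const row of rows) {
    const node = byId.get(row.id)!;
    const parent = row.parentId === null ? undefined : byId.get(row.parentId);
    if (parent) parent.replies.push(node);
    else roots.push(node); // top-level comment, or a reply whose parent is missing
  }
  return roots;
}
```

A materialized-path schema, the option the candidate says the repos confirmed, would instead store each comment's full ancestor chain on the row, trading extra work on writes for simpler reads.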

Q2 ERROR CORRECTION

At 0:20, Claude generated auth middleware with a generic catch(err) block. You rewrote it to handle TokenExpiredError and InvalidSignatureError separately. Why wasn't the generic handler sufficient?

A2 CANDIDATE

Expired tokens need a 401 with a refresh hint. Invalid signatures are a potential attack — that's a 403, log it, and maybe rate-limit the IP. Lumping them together means you can't distinguish between a user who needs to re-login and someone probing your auth.
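
The distinction drawn in this answer can be sketched as a small mapping from error type to response. This is an illustration under assumptions, not the candidate's middleware: the two error classes below are hypothetical stand-ins for whatever a given JWT library throws, and `toAuthFailure` and `requireSecret` are invented names. The status codes follow the candidate's stated choices.

```typescript
// Hypothetical error types; real JWT libraries use their own class names.
class TokenExpiredError extends Error {
  name = "TokenExpiredError";
}
class InvalidSignatureError extends Error {
  name = "InvalidSignatureError";
}

type AuthFailure = {
  status: number;
  body: { error: string; hint?: string };
  logAsSuspicious: boolean;
};

// Map a verification error to a response. Errors are matched by `name`
// so the check does not depend on sharing one class instance across modules.
function toAuthFailure(err: unknown): AuthFailure {
  if (err instanceof Error && err.name === "TokenExpiredError") {
    // Legitimate user with a stale token: ask the client to refresh.
    return { status: 401, body: { error: "token_expired", hint: "refresh" }, logAsSuspicious: false };
  }
  if (err instanceof Error && err.name === "InvalidSignatureError") {
    // Possible tampering: refuse and flag for logging or rate limiting.
    return { status: 403, body: { error: "invalid_signature" }, logAsSuspicious: true };
  }
  throw err; // not an auth error, so let the generic error handler deal with it
}

// Read the signing secret from the environment instead of hardcoding it.
// The caller passes the environment map in (for example, Node's process.env).
function requireSecret(env: Record<string, string | undefined>): string {
  const secret = env["JWT_SECRET"];
  if (!secret) throw new Error("JWT_SECRET is not set");
  return secret;
}
```

Failing fast when `JWT_SECRET` is missing avoids silently signing tokens with an empty or default key.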

Q3 SECURITY

You also moved the JWT secret from a hardcoded string to an environment variable in the same edit. Was that in response to the AI's code, or something you planned from the start?

A3 CANDIDATE

Session: 72 min · Prompts: 23 · Flags: 12
Powered by session data · not vibes

FAQ