Testing magic link auth: every edge case, no scripts

Magic links are the one auth flow that looks easy and is not. The happy path is six lines of code: generate a token, email it, accept it, sign the user in. The failure modes are where the bugs hide — and they're the kind of bugs you ship without noticing, because nobody clicks an expired link in the test environment unless they're trying to.

This is the testing playbook we'd actually run against a magic-link flow, written as five plain-English prompts. No Playwright, no selectors, no token-decoding logic in your test suite. If you've been here before, scroll to the prompts. If you haven't, the next two sections will save you a security bug.

What "covering magic-link auth" actually means

Most teams test the happy path: request a link, click it, end up signed in. That's roughly 20% of what can go wrong. A complete check needs to verify, at minimum:

Token expiry. A link clicked after the configured TTL fails cleanly with a useful error. (OWASP's Authentication Cheat Sheet treats short-lived tokens as table stakes; TTLs of roughly 15–30 minutes are common for login tokens, with 15 minutes a frequent default.)
Single-use enforcement. A token consumed once can't be consumed again, even within the TTL. This is the protection against an attacker who got hold of the email after you used the link.
Account enumeration. The page after "send me a link" shows the same message whether or not an account exists. If it doesn't, you've handed attackers a free user-existence oracle.
Rate limiting. Submitting the same email a hundred times in a minute doesn't actually send a hundred emails — and doesn't tell you the email is valid through timing.
Late delivery. A link that arrives three minutes after the user requested it still works, assuming the TTL hasn't elapsed. (This one trips up people who use "request time" instead of "send time" as the issuance moment.)
Cross-device flow. Requesting from one browser and clicking from another (typically requesting on a laptop and opening the email on a phone) works without weird "session not found" errors.
Already-logged-in side effects. Clicking the link while already logged in as a different account does the right thing: it either signs you out of the wrong account and into the right one, or refuses cleanly, but never leaves you in a "logged in as both" state.

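Anything short of that is a smoke test. The five prompts below cover most of it: the happy path, token reuse, expiry, enumeration, and the cross-device and already-logged-in cases. Rate limiting and late delivery don't get their own prompt here.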
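
The rate-limiting item in the list above is easier to reason about with code in front of you. Here is a minimal sketch of a per-email sliding-window limiter. It assumes an in-memory store and a made-up limit of 3 requests per 60 seconds; `SlidingWindowLimiter` and `simulate_burst` are hypothetical names, and a real deployment would more likely keep counters in a shared store and limit per IP as well.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within the last `window` seconds."""

    def __init__(self, limit=3, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests can control time
        self._events = defaultdict(deque)

    def allow(self, key):
        now = self.clock()
        events = self._events[key]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True


def simulate_burst(n=100):
    """Submit the same address n times at one instant; count emails that would be sent."""
    limiter = SlidingWindowLimiter(limit=3, window=60.0, clock=lambda: 0.0)
    return sum(limiter.allow("user@yourapp.test") for _ in range(n))
```

With these assumed numbers, a hundred submissions in the same instant would result in three sends. Whatever the limiter decides, the page should presumably still show the same generic confirmation, or the limiter itself becomes a signal.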

The model bug everyone ships once

Before the prompts, the bug to look for first: token verification that isn't actually single-use.

Implemented naively, the verification step looks like this:

SELECT * FROM magic_link_tokens WHERE token = $1 AND used_at IS NULL;
-- if found, log the user in
UPDATE magic_link_tokens SET used_at = NOW() WHERE token = $1;

That's two queries with a race window between them. Two parallel requests on the same token both pass the SELECT and both reach the UPDATE — the user (or an attacker) gets two sessions from one token. The fix is atomic consumption:

UPDATE magic_link_tokens
SET used_at = NOW()
WHERE token = $1 AND used_at IS NULL
RETURNING user_id;
-- only proceed if exactly one row was returned

You can't test that race condition reliably from a browser. You can test the visible consequence — clicking the same link twice from two tabs — and you can test the broader single-use rule, which is what prompt 2 below does. If you find single-use isn't enforced at the UI level, that's a strong hint the DB-level enforcement is also missing.
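
To see the single-use rule at the database level, here is a minimal sketch of atomic consumption using Python's built-in sqlite3 module. It is an illustration under assumptions, not the schema from any particular app: the table layout and the `make_db` and `consume_token` helpers are hypothetical, and it checks the affected-row count rather than using RETURNING so that it also runs on older SQLite builds.

```python
import sqlite3


def make_db():
    """Create an in-memory database with an assumed token table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE magic_link_tokens ("
        " token TEXT PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " used_at TEXT)"
    )
    return conn


def consume_token(conn, token):
    """Mark the token used in one statement.

    Returns the user_id, or None if the token is unknown or already used.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE magic_link_tokens SET used_at = datetime('now')"
            " WHERE token = ? AND used_at IS NULL",
            (token,),
        )
        if cur.rowcount != 1:
            return None  # nothing was updated: do not sign anyone in
        row = conn.execute(
            "SELECT user_id FROM magic_link_tokens WHERE token = ?",
            (token,),
        ).fetchone()
        return row[0]
```

The first call for a given token returns the user id and every later call returns None, which is the behaviour the two-tab check is probing for from the outside.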

Setup: pick the environment

You need a staging environment where:

Magic links go to a real test inbox you can read (a @yourapp.test address piped to a service like Mailosaur, or a catch-all on a domain you control).
The token TTL matches production (don't run with MAGIC_LINK_TTL=24h "for testing").
Rate limiting is on, with production-like thresholds.
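
A preflight check is one way to enforce the TTL item above. This is a minimal sketch, assuming TTLs are written as a number plus a unit suffix (as in the MAGIC_LINK_TTL=24h example). `parse_duration` and `ttl_matches_production` are hypothetical helper names, and how you obtain the production value depends on your deployment.

```python
import re

# Assumed unit suffixes; adjust to whatever your config format accepts.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(value):
    """Convert strings like '5s', '15m' or '24h' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNITS[unit]


def ttl_matches_production(staging_ttl, production_ttl):
    """True when both settings describe the same number of seconds."""
    return parse_duration(staging_ttl) == parse_duration(production_ttl)
```

Running something like this before a sweep would catch a staging environment that was quietly left at 24h.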

The agent can read the email if you give it the inbox URL or its API. Most teams point it at a hosted test inbox; if you're using a service like Mailosaur or Mailtrap, give the agent the message URL or a credentials-protected mailbox page. We're not going to bake those credentials into the prompts — pass them via your scenario config and reference them as {{INBOX_URL}} or similar.

Throughout this guide, the URL we'll target is https://staging.yourapp.com/login. Substitute your own.

Prompt 1: the happy path

Start with the boring one. If this is broken, every other test is moot.

Go to https://staging.yourapp.com/login.
Enter the email "magiclink-happy@yourapp.test" and request a magic link.

Verify that the page shows a generic "Check your email" message after submitting.

Then open the test inbox at {{INBOX_URL}}, find the most recent email
sent to that address, and click the magic link inside it.

Verify that:
- The link opens the application and signs the user in
- The user lands on the dashboard (or wherever first-login redirects)
- No error messages appear on the page or in the console

If any of those fail, that's the bug.

The bug class this catches that you didn't think to script: a malformed mailto: link, a token URL that's percent-encoded twice, a redirect target that breaks in incognito because of a missing third-party-cookie fallback.

Prompt 2: token reuse

This is the test most teams skip. It's also the one that most often surfaces a real bug.

Go to https://staging.yourapp.com/login.
Request a magic link for the email "magiclink-reuse@yourapp.test".

Open the test inbox at {{INBOX_URL}} and locate the magic link URL.
Copy the URL without clicking it.

In a fresh incognito window, paste the URL and visit it. Verify it signs in successfully.

Now log out, then visit the same URL again in another incognito window.

Verify that the second visit fails with a clear error message
(something like "this link has already been used" or "expired").
Verify that the second visit does NOT sign anyone in.

The failure mode you're hunting: the token is meant to be single-use, but the endpoint that receives the link doesn't mark it consumed until after the redirect resolves, so two near-simultaneous clicks both succeed. The agent won't reproduce the millisecond-level race, but it will catch the "I already used this link and it still works" class of bug, which usually points at the same root cause: consumption that isn't enforced at verification time.

Prompt 3: expired token

The TTL test. You'll need a way to force-expire a token — either an admin endpoint that ages tokens, an environment flag that sets TTL to 5 seconds for tests, or a clock you can advance. Most teams have one of these in staging; if you don't, the cheapest version is a ?ttl_override=5s query parameter the request endpoint accepts in non-prod environments.

Go to https://staging.yourapp.com/login?ttl_override=5s and request a
magic link for "magiclink-expired@yourapp.test".

Open the test inbox at {{INBOX_URL}} and find the link URL.
Wait 30 seconds without clicking it.

Then visit the link URL.

Verify that the page shows a clear "this link has expired" message.
Verify that the user is NOT signed in.
Verify that the page offers a useful next step — like a button to
request a new link — and that clicking that button works.

The bug to catch: an expired link that fails silently, leaves the user on a blank page, or — worse — accepts the token anyway because the expiry check is on the wrong column.
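
The "wrong column" failure is easier to spot against a reference. Here is a minimal sketch of an expiry check, assuming the token row stores separate `created_at`, `expires_at` and `used_at` timestamps. `TokenRow`, `issue` and `verify` are hypothetical names, not taken from any specific framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class TokenRow:
    created_at: datetime
    expires_at: datetime
    used_at: Optional[datetime] = None


def issue(now, ttl=timedelta(minutes=15)):
    """Create a token row whose expiry is fixed at issuance."""
    return TokenRow(created_at=now, expires_at=now + ttl)


def verify(row, now):
    """Return 'ok', 'used' or 'expired'.

    Expiry is judged against expires_at only, never created_at or used_at.
    """
    if row.used_at is not None:
        return "used"
    if now >= row.expires_at:
        return "expired"
    return "ok"


# Example: a 5-second token checked 30 seconds after issuance.
issued_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
short_lived = issue(issued_at, ttl=timedelta(seconds=5))
status_after_wait = verify(short_lived, issued_at + timedelta(seconds=30))
```

Under these assumptions `status_after_wait` is "expired", which matches the outcome the prompt expects to see on the page.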

Prompt 4: account enumeration

The security one. Most teams have shipped this bug at least once.

Go to https://staging.yourapp.com/login.

Step 1: Submit the form with an email that you know is NOT registered:
"definitely-does-not-exist-{random}@yourapp.test".
Record the exact text of the confirmation message, the response time,
and any console output.

Step 2: Submit the form with an email that IS registered:
"magiclink-happy@yourapp.test".
Record the exact text of the confirmation message, the response time,
and any console output.

Compare the two. The messages should be identical and the responses should
take about the same time. If the wording differs ("we sent you a link" vs
"no account found"), or if one takes noticeably longer than the other,
that's an account enumeration vulnerability. Flag it as a bug.

Enumeration is one of the easiest auth bugs to ship and one of the hardest to notice without a deliberate check. OWASP's Authentication Cheat Sheet covers the same idea in its guidance on generic authentication responses, and CWE-204 catalogs the weakness as "Observable Response Discrepancy": the user-facing language and timing should be indistinguishable.
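
On the server side, one common shape for an enumeration-safe endpoint is to answer every request identically and defer the account lookup. This is a minimal sketch of that idea, not a description of any particular framework: `request_magic_link`, `process_outbox` and the in-process queue are hypothetical stand-ins for a real handler and job queue.

```python
import queue

GENERIC_MESSAGE = "Check your email"

outbox = queue.Queue()  # stand-in for a background job queue


def request_magic_link(email):
    """Return the same response for every address; defer the lookup."""
    outbox.put(email.strip().lower())
    return {"status": 200, "message": GENERIC_MESSAGE}


def process_outbox(registered):
    """Background worker: only registered addresses actually get an email."""
    sent = []
    while not outbox.empty():
        email = outbox.get()
        if email in registered:
            sent.append(email)  # a real worker would send the link here
    return sent
```

Because the handler never touches the user table, its wording and its timing cannot depend on whether the account exists. That property is what the comparison in prompt 4 is checking from the outside.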

Prompt 5: cross-device + already-logged-in

The flow the analytics team will ask about three months from now.

Go to https://staging.yourapp.com/login.
Request a magic link for "magiclink-cross@yourapp.test".

Open the test inbox at {{INBOX_URL}}, find the link, and copy the URL.

Now, in a fresh browser session, sign in to https://staging.yourapp.com
as a DIFFERENT user using whatever credentials work
(test-other@example.com / Password123!). Confirm you're on the dashboard
as that user.

While signed in as that other user, paste the magic link URL into the
same browser tab and visit it.

Verify the outcome is sane. Either:
- The session is replaced with the magic-link user (and the prior
  session is cleanly ended), or
- The user is shown a clear "you are already signed in, sign out first"
  prompt with a working button.

There should NOT be a state where the user is partially logged in as
both, or where the URL shows "/dashboard" for the wrong account.

This is the test that catches the cookie-collision bug, the "we have two session keys" bug, and the "the magic link silently fails when there's an existing session" bug. None of those rises to a critical security issue on its own. Stacked together over time, they're how support tickets get weird.
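
For reference, here is a minimal sketch of the first acceptable outcome (replace the session cleanly), assuming a server-side session store keyed by session id. `SessionStore` and `sign_in_with_magic_link` are hypothetical names, and the in-memory dict stands in for whatever store the app really uses.

```python
import secrets


class SessionStore:
    """Toy server-side session store: session_id -> user_id."""

    def __init__(self):
        self._sessions = {}

    def create(self, user_id):
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = user_id
        return session_id

    def destroy(self, session_id):
        self._sessions.pop(session_id, None)

    def user_for(self, session_id):
        return self._sessions.get(session_id)


def sign_in_with_magic_link(store, current_session_id, link_user_id):
    """Replace any existing session with a fresh one for the magic-link user."""
    if current_session_id is not None:
        store.destroy(current_session_id)  # end the prior session first
    return store.create(link_user_id)      # always issue a new session id
```

Ending the old session before issuing a new id is what rules out the "logged in as both" state. Reusing the old id for the new user is one way the cookie-collision variety of bug can creep in.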

What you do with a failure

A failed agent run on a magic-link flow gives you the same artifact as any other Monito session: a screenshot timeline, the full network log (the POST /api/auth/magic-link/request and the GET /api/auth/magic-link/verify calls with payloads), the console output, and the agent's reasoning at each branch. The run docs cover how to pull session details from the CLI; for auth bugs specifically, the network log is where you'll spend most of your time. (If you're new to reading agent runs, the breakdown in "Why AI QA agents find bugs your scripts miss" is the longer version of why the session beats a green checkmark.)

If the failure is the enumeration one (prompt 4), don't just patch the wording; check that the timing is also indistinguishable. The usual fix is to respond with the success message before the database lookup completes, doing the lookup and the send in the background. The agent can't measure timing to the millisecond, but it can usually tell two responses apart once they differ by 100 ms or more.
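
If you pull response times out of the network log by hand, the 100 ms rule of thumb can be made explicit. This is a minimal sketch, assuming you have a few timings in milliseconds for each kind of address. `timing_gap_ms` and `looks_enumerable` are hypothetical helpers, and the threshold is the rough figure from above, not a standard.

```python
from statistics import median


def timing_gap_ms(registered_ms, unregistered_ms):
    """Absolute difference between the median response times."""
    return abs(median(registered_ms) - median(unregistered_ms))


def looks_enumerable(registered_ms, unregistered_ms, threshold_ms=100.0):
    """Flag the endpoint when the medians differ by at least the threshold."""
    return timing_gap_ms(registered_ms, unregistered_ms) >= threshold_ms
```

Medians are used instead of single samples so one slow request does not decide the verdict. A handful of samples is still a coarse signal, so treat a positive result as a reason to look closer.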

Run these on every release

Five prompts, ~5 minutes of agent time per full sweep, well under a dollar of run credits. Wire them into a CI/CD step that runs against every preview deploy, or against staging on a schedule. The prompts don't rot (the agent doesn't care that you renamed the "Send me a link" button to "Email me a link"), so the maintenance cost over the next year is approximately zero. That's the trade compared to a Playwright suite that would need a data-testid update every time the form moves; we wrote about the broader pattern in "Playwright alternatives without the code".

Magic-link auth is the kind of flow where one bug = a locked-out user who can't even tell you they're locked out. Five prompts is cheap insurance against the silence.

Want the five prompts as a starter pack? Sign up for Monito, paste them into a new Project, point them at your staging URL, and configure your test inbox. First runs are on us.