The work behind a 290,000-strong UK research panel

A specialist UK fieldwork operation used to lose half a day to admin every time a new project started. Turning a client’s brief into a live, quota-managed survey was skilled, repetitive work, and nobody’s favourite afternoon. Now an AI reads the brief in under a minute and hands back a structured screener ready for review. Same humans, same quality bar, less of the rote.

The brief

Research Opinions recruit respondents on behalf of clients across consumer, healthcare, and B2B research. Every project starts with a specification — a paragraph or two of plain English about who the client needs to talk to. “We want 50 men aged 25–34 who have considered switching to an EV and don’t currently own one” is a typical line; real briefs add nuance and run to multiple pages.

Turning a spec like that into a live survey used to require a project manager to do four things in sequence: read the brief, write the screening questions, configure the quota structure, and embed it all into LimeSurvey. They’d then do the work again for the recruitment funnel — a wide-net online survey that feeds eligible respondents through to a deeper telephone interview. Different audience, different register, different question set, same underlying spec. Translating between the two was a separate piece of skilled work.

The team had been doing this manually for years. It worked. But it was slow, and it scaled poorly. The cost wasn’t just time — every translation step is a place an error can creep in.

What we built

AF2 — Autofill v2 — is the platform that sits between client brief and live survey. It manages the full recruitment lifecycle: AI-assisted screener generation, quota engineering, multi-database fraud detection, automated cron orchestration, and four-brand multi-tenant theming.

The AI layer is the heart of it. A PM pastes a specification into the admin UI; the AI reads it and returns a fully structured screener — questions with typed IDs, answer options with screen-out flags, quota cells with condition trees, and a knowledge store of structured notes the system can reference later. Critically, the AI isn’t asked to write prose. It’s constrained to call a specific tool with a defined schema, so anything it returns can be validated and inserted into the database directly. If the response doesn’t match the schema, the system retries with feedback on what went wrong.
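As a rough illustration of the validate-and-retry loop described above, here is a minimal Python sketch. It is not AF2's code: the payload shape (`questions`, `options`, `screen_out`), the `call_model` callable, and the retry limit are all assumptions made for the example, and the real schema is certainly richer.

```python
from typing import Any, Callable

# Hypothetical minimum shape for one question in the returned screener.
REQUIRED_QUESTION_KEYS = {"id", "text", "options"}


def validate_screener(payload: dict[str, Any]) -> list[str]:
    """Return a list of readable problems; an empty list means the payload is valid."""
    questions = payload.get("questions")
    if not isinstance(questions, list) or not questions:
        return ["'questions' must be a non-empty list"]
    errors: list[str] = []
    for i, question in enumerate(questions):
        if not isinstance(question, dict):
            errors.append(f"questions[{i}] must be an object")
            continue
        missing = REQUIRED_QUESTION_KEYS - question.keys()
        if missing:
            errors.append(f"questions[{i}] missing keys: {sorted(missing)}")
            continue
        if not isinstance(question["options"], list):
            errors.append(f"questions[{i}].options must be a list")
            continue
        for j, option in enumerate(question["options"]):
            if not isinstance(option, dict) or not isinstance(option.get("screen_out"), bool):
                errors.append(f"questions[{i}].options[{j}].screen_out must be a boolean")
    return errors


def generate_screener(
    spec: str,
    call_model: Callable[[str, list[str]], dict[str, Any]],
    max_retries: int = 2,
) -> dict[str, Any]:
    """Call the model, validate the result, and retry with the errors as feedback."""
    feedback: list[str] = []
    for _ in range(max_retries + 1):
        payload = call_model(spec, feedback)  # feedback is empty on the first attempt
        feedback = validate_screener(payload)
        if not feedback:
            return payload
    raise ValueError(f"screener failed validation after {max_retries + 1} attempts: {feedback}")
```

Passing the validation errors back into the next call is the point of the design: the model is told what was wrong with its last answer rather than simply being asked again.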

The CATI-to-CAWI conversion is the second AI operation. After a PM approves the telephone screener, one click converts it into an online survey suitable for self-completion. The AI rewrites questions from telephone register to web register, broadens answer options for unassisted respondents, detects questions already captured via URL parameters and suppresses them, generates LimeSurvey screening equations from the quota structure, and writes a full changelog of what it changed and why. The PM sees a summary panel — what was changed, what was considered but not changed, what might be missing — and can refine via free-text instructions.
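Two of the mechanical steps in that conversion can be sketched in a few lines of Python. This is an illustration under assumptions, not the platform's implementation: the `url_param` field, the condition-tree shape (`op`, `children`, `question`, `answer`), and the question codes are hypothetical, and the output only aims to resemble a LimeSurvey expression that compares `.NAOK` variables with `and` / `or`.

```python
from typing import Any


def suppress_prefilled(questions: list[dict[str, Any]], url_params: set[str]) -> list[dict[str, Any]]:
    """Drop questions whose answer already arrives as a URL parameter."""
    return [q for q in questions if q.get("url_param") not in url_params]


def to_equation(node: dict[str, Any]) -> str:
    """Render a quota condition tree as a single screening expression string."""
    op = node["op"]
    if op == "eq":
        question, answer = node["question"], node["answer"]
        # Assumed: codes are plain alphanumerics, so no quoting or escaping is needed.
        if not (question.isalnum() and answer.isalnum()):
            raise ValueError(f"unexpected code in condition: {question!r} / {answer!r}")
        return f'{question}.NAOK == "{answer}"'
    if op in ("all", "any"):
        joiner = " and " if op == "all" else " or "
        return "(" + joiner.join(to_equation(child) for child in node["children"]) + ")"
    raise ValueError(f"unknown condition operator: {op!r}")


# Hypothetical quota cell: men in either of two age bands.
cell = {
    "op": "all",
    "children": [
        {"op": "eq", "question": "GENDER", "answer": "M"},
        {
            "op": "any",
            "children": [
                {"op": "eq", "question": "AGE", "answer": "A2"},
                {"op": "eq", "question": "AGE", "answer": "A3"},
            ],
        },
    ],
}
print(to_equation(cell))
# (GENDER.NAOK == "M" and (AGE.NAOK == "A2" or AGE.NAOK == "A3"))
```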

Underneath the AI layer sits Acumonitor, the quality-scoring system that runs on every respondent before they can reach a survey. It searches two separate databases (the panel and the PM database) for duplicates by email, phone, and name+postcode; scores participation frequency with a six-month decay; checks VPN and proxy signals; and produces a 0–100 score. Healthcare surveys score differently to consumer surveys; the weighting is configurable per brand, per survey.
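The scoring logic can be pictured with a small Python sketch. Everything specific here is an assumption rather than Acumonitor's actual method: the weights, the saturation point, the key normalisation, and the reading of "six-month decay" as a linear fade to zero over 183 days are illustrative choices only.

```python
from dataclasses import dataclass
from datetime import date

DECAY_DAYS = 183  # assumed reading of "six-month decay": linear fade to zero


@dataclass(frozen=True)
class Weights:
    """Penalty weights; hypothetical values, meant to be overridden per brand or survey."""
    duplicate: float = 50.0
    frequency: float = 30.0
    network: float = 20.0


def match_keys(email: str, phone: str, name: str, postcode: str) -> set[str]:
    """Normalised keys for comparing one respondent against other records."""
    keys: set[str] = set()
    if email.strip():
        keys.add("email:" + email.strip().lower())
    digits = "".join(ch for ch in phone if ch.isdigit())  # does not reconcile +44 with a leading 0
    if digits:
        keys.add("phone:" + digits)
    if name.strip() and postcode.strip():
        keys.add("name+postcode:" + " ".join(name.lower().split()) + "|" + postcode.replace(" ", "").upper())
    return keys


def participation_load(participations: list[date], today: date) -> float:
    """Sum of per-participation weights, each fading from 1 to 0 over DECAY_DAYS."""
    total = 0.0
    for taken_on in participations:
        age = (today - taken_on).days
        if 0 <= age < DECAY_DAYS:
            total += 1.0 - age / DECAY_DAYS
    return total


def quality_score(
    keys: set[str],
    panel_keys: set[str],
    pm_keys: set[str],
    participations: list[date],
    vpn_or_proxy: bool,
    today: date,
    weights: Weights = Weights(),
    saturation: float = 4.0,
) -> int:
    """Start at 100 and subtract weighted penalties; the result is clamped to 0-100."""
    penalty = 0.0
    # panel_keys and pm_keys are assumed to hold keys of *other* records in each database.
    if keys & (panel_keys | pm_keys):
        penalty += weights.duplicate
    penalty += weights.frequency * min(participation_load(participations, today) / saturation, 1.0)
    if vpn_or_proxy:
        penalty += weights.network
    return max(0, min(100, round(100 - penalty)))
```

Passing a different `Weights` instance per brand or survey is one simple way to get the configurable weighting the text describes, for example a heavier duplicate penalty on healthcare work.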

A custom cron manager registers and monitors twelve scheduled jobs across the recruitment stack, including quota snapshots, response auto-caching, geocoding enrichment, scheduled survey launches, response alerts, and debrief extraction. A single admin UI shows health across all of them.
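A register-and-monitor pattern of this kind might look like the Python sketch below. The class, the status names, the intervals, and the grace multiplier are hypothetical; the source does not describe how the cron manager decides a job is unhealthy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Job:
    name: str
    interval: timedelta  # how often the job is expected to run
    last_success: Optional[datetime] = None


class CronRegistry:
    """Keeps one record per scheduled job and reports a health status for each."""

    def __init__(self) -> None:
        self._jobs: dict[str, Job] = {}

    def register(self, name: str, interval: timedelta) -> None:
        if name in self._jobs:
            raise ValueError(f"job already registered: {name}")
        self._jobs[name] = Job(name, interval)

    def record_success(self, name: str, at: datetime) -> None:
        self._jobs[name].last_success = at

    def health(self, now: datetime, grace: float = 2.0) -> dict[str, str]:
        """A job is 'stale' once it has gone more than `grace` intervals without success."""
        report: dict[str, str] = {}
        for job in self._jobs.values():
            if job.last_success is None:
                report[job.name] = "never_run"
            elif now - job.last_success > job.interval * grace:
                report[job.name] = "stale"
            else:
                report[job.name] = "ok"
        return report
```

An admin page then only needs to call `health()` and render one row per job.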

How it lands

The platform went live in March 2026. AI screener generation costs around fifteen pence per parse. The build covers four brands — Research Opinions, Healthcare Opinions, Acumen Fieldwork, Acumen Health — on a unified codebase with brand-specific theming, copy, and respondent flow.

What we can measure from the code: each AI job is tracked in a job table with token counts, status (parsed → reviewed → saved), retry count, and the model used. Each generation produces both the structured screener and a knowledge store of extracted notes — spec summary, target audience, exclusion criteria, methodology, client terminology — that subsequent operations reference instead of re-reading the original brief.
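To make the job record and knowledge store concrete, here is a minimal Python sketch. The status order and the five note categories come from the description above; the field names, types, and the rule that a job can only move forward one step at a time are assumptions for illustration.

```python
from dataclasses import dataclass, field

# Status order taken from the text: parsed -> reviewed -> saved.
NEXT_STATUS = {"parsed": "reviewed", "reviewed": "saved"}


@dataclass
class KnowledgeStore:
    """Notes extracted once from the brief so later operations need not re-read it."""
    spec_summary: str = ""
    target_audience: str = ""
    exclusion_criteria: list[str] = field(default_factory=list)
    methodology: str = ""
    client_terminology: dict[str, str] = field(default_factory=dict)


@dataclass
class AIJob:
    model: str
    input_tokens: int
    output_tokens: int
    retry_count: int = 0
    status: str = "parsed"
    knowledge: KnowledgeStore = field(default_factory=KnowledgeStore)

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    def advance(self) -> str:
        """Move to the next status; refuse to move past the final one."""
        if self.status not in NEXT_STATUS:
            raise ValueError(f"job is already in terminal status {self.status!r}")
        self.status = NEXT_STATUS[self.status]
        return self.status
```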

What the platform doesn’t claim to do yet: generate the kind of interlocking quota matrices where Group 2’s age breakdown is itself a structured table. The AI produces flat quotas reliably; matrix logic is the next piece of engineering. We say so out loud because pretending otherwise would catch us out the first time a client asked.

What it’s worth

The headline isn’t “AI replaced our PMs.” The PMs are still in the loop: reviewing, editing, approving. The headline is that the work the AI does is the work humans found least interesting and most error-prone. The PMs review a structured screener instead of authoring one from scratch. The CATI-to-CAWI conversion is a one-click review instead of an afternoon’s rewriting. Quality scoring now runs on every respondent, not just the ones a PM had time to check.

The platform has been running quietly since March. We’ll write up the volume numbers and the operational wins once we’ve had a longer view.

