AI companies are spending billions training the next generation of large language models — and they need real humans to do it. Not programmers. Not data scientists. Doctors, accountants, writers, lawyers, and even voice actors are earning $25–$300 per hour working remotely for AI labs through platforms like Mercor.
This guide covers every remote AI evaluation role available right now, who qualifies, what the work actually looks like, and how to get hired fast.
Quick summary: These are legitimate contract roles from Mercor, a talent marketplace backed by Benchmark and General Catalyst that partners with leading AI labs. Pay is weekly via Stripe or Wise. No office. No set hours.
What is AI evaluation work?
When an AI company trains a model like GPT or Claude, it needs humans to review the AI's outputs and rate them. Those ratings feed a training technique called RLHF (Reinforcement Learning from Human Feedback). The better the human feedback, the better the model gets.
The work looks different depending on your background:
- A doctor might review AI-generated clinical scenarios and flag medical inaccuracies
- A finance professional might evaluate whether an AI's financial analysis is sound
- A mathematician might write hard exam questions the AI has to solve
- A writer might rate which of two AI responses is clearer and more helpful
You don't need prior AI experience. You need deep expertise in your own field — that's the whole point. The AI labs already have engineers. What they lack is a cardiologist who can spot when an AI hallucinates a drug interaction, or a CPA who notices when a balance sheet doesn't add up.
Full list of roles & pay rates (2026)
| Role | Pay | Background needed | Apply |
|---|---|---|---|
| Emergency Medicine Expert | $130–$300/hr | EM physician, board certified | Apply → |
| Superstar Reviewer | $70–$150/hr | Top-performer evaluation role | Apply → |
| Voice Actor — CX Voice Cloning | $50–$150/hr | Voice acting / narration (US only) | Apply → |
| Generalist AI Evaluator | $60–$200/hr | Native English, bachelor's preferred | Apply → |
| Cyber Benchmark — Blue Team | $85–$140/hr | SOC / detection engineering / IR | Apply → |
| Venture Capital Expert | $100/hr | 2+ yrs in VC | Apply → |
| Corporate Finance Expert | $100/hr | 2+ yrs corporate finance | Apply → |
| Accounting Expert | $85–$100/hr | CPA, Big 4 background ideal | Apply → |
| STEM Python Expert | $70–$90/hr | BS/MS/PhD STEM + Python | Apply → |
| Econ & Finance Assessment | $40–$90/hr | PhD / Master's in Econ or Finance | Apply → |
| Business & Commerce Assessment | $40–$90/hr | PhD / MBA in Business | Apply → |
| Enterprise Sales / AE Expert | $50–$90/hr | 5+ yrs AE / B2B sales | Apply → |
| STEM PhD Contributor | $50–$80/hr | STEM PhD from US/UK/CA/EU university | Apply → |
| AI Training Scenario Designer | $30–$80/hr | Writing, PM, UX, or game design background | Apply → |
| Mathematics Assessment Specialist | $25–$60/hr | PhD / Master's in Math | Apply → |
| Bilingual French STEM Expert | $38–$53/hr | Native French, BS in STEM | Apply → |
| Commerce Specialist — AI Agents | $47–$55/hr | Retail / e-commerce operations | Apply → |
| Bilingual German STEM Expert | $38–$43/hr | Native German, BS in STEM | Apply → |
| Generalist — Real World Understanding | $34–$40/hr | Recent grad from selective university | Apply → |
| Bilingual Italian Evaluator | $25–$30/hr | Native Italian (Italy or Swiss-Italian) | Apply → |
Highest paying roles ($60–$300/hr)
1. Emergency Medicine Expert — $130–$300/hr
Mercor is partnering with a leading AI lab to train frontier models on clinical reasoning. EM attendings, dual-boarded physicians, and final-year residents design realistic scenarios, write reference responses, and grade AI outputs against evidence-based rubrics. 20 hrs/week, fully async. US and Canada only.
Apply now →
2. Generalist AI Evaluator — $60–$200/hr
The most accessible high-paying role. Read AI-generated responses and write structured feedback on reasoning quality, accuracy, and nuance. No AI tools allowed — your genuine human judgment is the product. Bachelor's degree preferred, native English required. US, UK, Canada, AU, NZ.
Apply now →
3. Cyber Benchmark — Blue Team Engineer — $85–$140/hr
Design and build benchmark tasks grounded in real SOC and detection engineering work. Construct realistic evaluation environments — multi-host networks, Active Directory, cloud control planes. Requires hands-on blue-team experience in detection engineering, threat hunting, incident response, or malware analysis.
Apply now →
Easiest to get hired (lowest barrier)
If you want to start earning quickly without a PhD or medical license, these three roles have the lowest barriers to entry and still pay well:
- Generalist AI Evaluator ($60–$200/hr) — requires native English; a bachelor's degree is preferred. 2,131 hired this month alone.
- AI Training Scenario Designer ($30–$80/hr) — writers, PMs, UX researchers, and game designers all qualify. Strong communication skills are the main requirement.
- Generalist — Real World Understanding ($34–$40/hr) — designed for recent grads with strong analytical skills. No specific domain expertise required.
Pro tip: The Generalist AI Evaluator role hires the most people by far. If you're unsure where to start, apply there first. You can always apply to a higher-paying specialized role once you're already in the Mercor system.
Best role by background
Not sure which role fits you? Here's the fastest path based on what you do:
- Doctor / Physician → Emergency Medicine Expert ($130–$300/hr)
- STEM PhD → STEM PhD Contributor ($50–$80/hr) or STEM Python Expert ($70–$90/hr)
- CPA / Accountant → Accounting Expert ($85–$100/hr)
- Finance / VC / Banking → Corporate Finance Expert or Venture Capital Expert ($100/hr)
- Cybersecurity → Cyber Benchmark Blue Team ($85–$140/hr)
- Writer / PM / UX → AI Training Scenario Designer ($30–$80/hr)
- Voice actor / Narrator → Voice Actor role ($50–$150/hr, US only)
- Native French/German/Italian speaker → Bilingual French or German STEM Expert, or Bilingual Italian Evaluator ($25–$53/hr)
- Recent grad / General → Generalist AI Evaluator ($60–$200/hr)
How to apply (step by step)
The process is the same for every role:
- Step 1: Click any Apply link above or visit the ReferWorks job board
- Step 2: Upload your resume and complete a short application (~10 minutes)
- Step 3: Complete a 20–30 minute AI-led screening interview
- Step 4: Get matched to a project based on your background
- Step 5: Start working on your own schedule — get paid weekly
Important: Applying through a referral link fast-tracks your application in Mercor's system. All links on this site are referral links — use them and you'll be prioritized over cold applications.
Frequently asked questions
Is this legitimate?
Yes. Mercor is a venture-backed talent marketplace (investors include Benchmark, General Catalyst, Adam D'Angelo, and Jack Dorsey) that connects experts with AI labs. They have paid out millions to contractors. Pay is weekly via Stripe or Wise — not crypto, not gift cards.
Do I need AI experience?
No. The whole point is that you don't. AI labs need real human domain experts — doctors, lawyers, accountants, engineers — who can spot errors that generalist reviewers miss. Your field expertise is the value, not AI knowledge.
Can I work outside the US?
Most roles accept candidates from the US, UK, Canada, Australia, and the EU. Some roles are more restricted: Emergency Medicine is US and Canada only, and Voice Actor is US only. The bilingual roles specifically require candidates from certain countries. Check each listing for location requirements.
How quickly can I start earning?
Most people hear back within 1–4 weeks of applying. After passing the screening, some roles start immediately. Pay is weekly once you begin working.
Can I do multiple roles?
You can apply to multiple roles but will typically be matched to one project at a time. As you build your track record on Mercor, you're more likely to be selected for additional or higher-paying projects.
Ready to apply?
Browse all 20 live roles and find the right fit for your background.
View all roles →