Research at Clera

We're not a research lab publishing papers. We're an engineering team solving hard problems in production. These are the questions we're actively working on — the ones where the answer isn't obvious and getting it right matters for real people.

Calibration & Match Quality

Active

How do you know a match is good before anyone interviews? We're building calibration systems that score candidate-role fit across multiple dimensions — skills, seniority, trajectory, preferences, and company stage. The goal: fewer false positives, more first-round conversions.

Open questions

What signals predict a successful placement vs. a wasted interview?
How do you calibrate match confidence when historical data is sparse?
Can you detect misalignment before either side notices?
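
To make the sparse-data question concrete, here is a minimal sketch of one common approach: shrinking an observed conversion rate toward a prior with a Beta-Binomial model. This is an illustration, not a description of Clera's production system. The function name, the prior rate of 0.2, and the prior strength of 10 are all assumptions chosen for the example.

```python
def shrunk_conversion_rate(
    successes: int,
    trials: int,
    prior_rate: float = 0.2,       # assumed baseline conversion rate (hypothetical)
    prior_strength: float = 10.0,  # how many "pseudo-observations" the prior is worth
) -> float:
    """Posterior mean of a Beta-Binomial model.

    With few trials the estimate stays close to prior_rate; with many
    trials it approaches the observed rate successes / trials.
    """
    if trials < 0 or successes < 0 or successes > trials:
        raise ValueError("need 0 <= successes <= trials")
    alpha = prior_rate * prior_strength
    beta = (1.0 - prior_rate) * prior_strength
    return (successes + alpha) / (trials + alpha + beta)
```

With this sketch, two conversions out of two interviews are not treated as a 100% match rate; the estimate only moves far from the prior once enough outcomes have accumulated.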

Autonomous Agent Workflows

Active

Clera's AI agents handle candidate communication, interview scheduling, status updates, and preparation — autonomously. We're researching how to give agents more responsibility while keeping humans in the loop where it matters. The hard part isn't automation — it's knowing when to stop automating.

Open questions

How do you teach an agent to escalate gracefully?
What's the right level of autonomy for career-critical conversations?
How do you evaluate agent quality beyond task completion?
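
One way to picture "knowing when to stop automating" is an explicit escalation rule that runs before an agent sends anything. The sketch below is hypothetical: the topic labels, the confidence field, and the 0.8 threshold are assumptions made for illustration, not Clera's actual policy.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SEND = "send"
    ESCALATE = "escalate"


# Hypothetical topics where a human should always review the message.
HIGH_STAKES_TOPICS = frozenset({"offer", "rejection", "compensation"})


@dataclass(frozen=True)
class DraftReply:
    topic: str
    confidence: float  # assumed agent self-assessment in [0, 1]
    candidate_asked_for_human: bool = False


def decide(draft: DraftReply, min_confidence: float = 0.8) -> Action:
    """Return ESCALATE if any rule fires, otherwise SEND."""
    if draft.candidate_asked_for_human:
        return Action.ESCALATE
    if draft.topic in HIGH_STAKES_TOPICS:
        return Action.ESCALATE
    if draft.confidence < min_confidence:
        return Action.ESCALATE
    return Action.SEND
```

Keeping the rules in one small function makes the autonomy boundary easy to audit and to adjust as trust in the agent grows.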

Structured Retrieval for Talent

Active

We use structured search (Typesense) rather than vector search for candidate retrieval, and we think that's the right call for our domain. We're researching hybrid approaches that combine the precision of structured filters with the flexibility of semantic understanding, without the loosely related false matches that pure embedding-based search can return.

Open questions

When does semantic search actually outperform structured queries for recruiting?
How do you index career trajectories, not just current titles?
What's the retrieval-augmented generation setup that minimizes hallucinated qualifications?
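
A hybrid approach can be sketched as "filter first, then rank": hard structured constraints decide who is eligible, and semantic similarity only orders the eligible set. The code below is a toy in-memory illustration in plain Python. It does not use the Typesense API, and the `Candidate` fields and two-dimensional embeddings are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    skills: frozenset[str]
    years_experience: int
    embedding: tuple[float, ...]  # toy vector; real embeddings are much larger


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(
    candidates: list[Candidate],
    required_skills: frozenset[str],
    min_years: int,
    query_embedding: tuple[float, ...],
    limit: int = 10,
) -> list[Candidate]:
    # Step 1: structured filter. Candidates who miss a hard requirement
    # are dropped regardless of how similar their embedding is.
    eligible = [
        c for c in candidates
        if required_skills <= c.skills and c.years_experience >= min_years
    ]
    # Step 2: semantic ranking within the eligible set only.
    eligible.sort(key=lambda c: cosine(c.embedding, query_embedding), reverse=True)
    return eligible[:limit]
```

In this arrangement a candidate without the required skills never appears in the results, however close their embedding is to the query.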

LLM Evaluation & Prompt Engineering

Active

Every prompt in production gets traced via Langfuse. We're building evaluation frameworks that go beyond "did the model respond" to "did the response actually help the candidate." This includes automated quality scoring, regression detection, and A/B testing of prompt strategies across different model providers.

Open questions

How do you measure whether a match explanation is useful vs. just plausible?
What evaluation metrics predict downstream outcomes (interviews, hires)?
How do you A/B test prompts when each candidate interaction is unique?
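
For the A/B testing question, a simple starting point is a two-proportion z-test on a binary outcome, such as whether a candidate replied. The sketch below assumes interactions are randomly assigned to prompt variant A or B and that outcomes are independent; both are assumptions that may not hold when every interaction is unique. The function name is hypothetical, and only the Python standard library is used.

```python
import math


def two_proportion_z_test(
    success_a: int, n_a: int, success_b: int, n_b: int
) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in success rates B - A."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both variants need at least one observation")
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:
        return 0.0, 1.0  # all outcomes identical, so there is no evidence of a difference
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

A small p-value only says the variants differ on the measured outcome; whether that outcome predicts interviews or hires is a separate question.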

Multi-Signal Profile Understanding

Active

A candidate is more than their resume. We combine CV data, LinkedIn profiles, conversation history, stated preferences, and behavioral signals to build a comprehensive understanding of what someone is looking for and what they're good at. The research challenge: how to weight conflicting signals and handle incomplete information gracefully.

Open questions

When a candidate's CV says one thing and their preferences say another, which one wins?
How do you infer career goals from behavior rather than just stated intent?
What's the minimum information needed for a reliable match?
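
One simple way to reason about conflicting signals is to give each source a reliability weight and let older signals decay. The sketch below is illustrative only: the source names, the weights, and the 180-day half-life are assumptions, not values Clera uses.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical reliability weights per signal source.
SOURCE_WEIGHT = {
    "stated_preference": 1.0,
    "conversation": 0.8,
    "cv": 0.6,
    "behavior": 0.4,
}


@dataclass(frozen=True)
class Signal:
    source: str      # one of the SOURCE_WEIGHT keys
    value: str       # e.g. the role type the signal points to
    age_days: float  # how long ago the signal was observed


def resolve(signals: list[Signal], half_life_days: float = 180.0) -> Optional[str]:
    """Return the value with the highest total decayed weight, or None if empty."""
    scores: dict[str, float] = {}
    for s in signals:
        # Unknown sources get weight 0.0; weight halves every half_life_days.
        decay = 0.5 ** (s.age_days / half_life_days)
        weight = SOURCE_WEIGHT.get(s.source, 0.0) * decay
        scores[s.value] = scores.get(s.value, 0.0) + weight
    if not scores:
        return None
    return max(scores, key=lambda value: scores[value])
```

Under these assumed weights a fresh stated preference outranks the CV, but a preference stated a year ago does not.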

Feedback Loops & Continuous Learning

Active

Every match outcome teaches us something — but the signal is noisy and delayed. A candidate who doesn't respond might be busy, not uninterested. A rejected candidate might have been perfect for a different role. We're building feedback systems that learn from messy, real-world outcomes without overfitting to noise.

Open questions

How do you learn from negative signals without creating bias?
What's the feedback delay problem in recruiting and how do you handle it?
Can you separate "bad match" from "bad timing"?
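
The feedback delay problem can be illustrated with a labeling rule that refuses to call silence a negative until a response window has passed. This is a minimal sketch under stated assumptions: the three labels and the 14-day window are hypothetical choices for the example.

```python
from enum import Enum


class Label(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    PENDING = "pending"  # too early to tell; exclude from training for now


def label_outcome(
    responded: bool,
    days_since_outreach: float,
    response_window_days: float = 14.0,  # assumed window, not a measured value
) -> Label:
    """Label an outreach outcome while treating recent silence as unknown."""
    if responded:
        return Label.POSITIVE
    if days_since_outreach < response_window_days:
        return Label.PENDING
    return Label.NEGATIVE
```

Holding recent non-responses out as pending keeps a busy candidate from being counted as an uninterested one, at the cost of learning from those outcomes later.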

The Bigger Questions

Some problems don't have clean technical solutions. We think about these too.

Should AI agents disclose their nature in every interaction, or only when asked?

We lean toward always disclosing. But the UX implications are real.

How do you build trust in AI recommendations when the stakes are someone's career?

Explanations help. But we're still learning what candidates actually want to see.

What does "fairness" mean in AI matching when every role has different requirements?

Equal treatment isn't always equitable. We're thinking through this carefully.

Can you measure recruiter quality the same way you'd measure model quality?

We think so. If AI agents are the new recruiters, they need the same accountability.

How Research Works Here

Start with a real problem

Every research question comes from production. We don't explore hypotheticals — we solve problems that our users hit today.

Ship the simplest version

Build the dumbest thing that could work, measure it, and iterate. Most of our best systems started as a single prompt and a Langfuse trace.

Measure what matters

Not model accuracy in isolation — real outcomes. Did the candidate get an interview? Did the company respond? Did the hire stick? That's what counts.

Interested in working on these problems? We're hiring engineers who like hard problems.

