We are seeking an AI Systems & Data Engineer to join our team. We are building a fast, flexible platform on a robust, event-driven architecture. This role requires expertise in building data pipelines in the Databricks environment, specifically for ingesting unstructured data, and in leveraging that data to build AI agents.
💥 What You’ll Do
- Design and operate Databricks pipelines in Python to ingest and normalize large-scale unstructured data
- Build streaming and batch ingestion using Auto Loader, Delta Live Tables, and Workflows
- Model and maintain AI-ready lakehouse tables with Delta Lake and Unity Catalog
- Prepare retrieval and context datasets for RAG and agent systems
- Orchestrate Temporal-based workflows to coordinate data prep, validation, and AI handoff
- Enforce data quality, lineage, and access controls across pipelines
- Optimize PySpark jobs for performance, reliability, and cost
- Integrate pipeline outputs into production AI systems and APIs
- Monitor freshness, schema drift, and pipeline health
🧰 Tech Stack (So Far)
- Python (primary language for all LLM + orchestration work)
- LangChain + LangGraph + LangSmith
- Databricks + PySpark for data processing, labeling, and training-context preparation
- Gemini + model routing logic
- Postgres, plus custom orchestration via MCP
- GitHub Actions, GCP
You’ll play a crucial role in rolling out products that have immediate impact.
💻 How We Build
- Engineers come first: your time, focus, and judgment are respected
- Deep work > chaos: fixed cycles & cooldowns protect focus and keep context switching low
- Autonomy is the default: trusted builders who own outcomes, no babysitters
- Ship daily, safely: merge early, integrate vertically, ship often, use feature flags, and keep momentum
- Outcomes over optics: solve real problems, not ticket soup
- Voice matters: from week one, contribute, improve something, and shape how we build
- Senior peers, no ego: collaborate in a high-trust, async-friendly environment
- Bold problems, cool tech: work on complex challenges that actually move the needle
- Fun is part of it: we move fast, but we also celebrate wins and laugh together
✅ What We’re Looking For
- 5-7 years of experience building production-grade ML, data, or AI systems.
- Strong grasp of prompt engineering, context construction, and retrieval design.
- Comfortable working in LangChain and building agents.
- Experience with PySpark and Databricks to handle real-world data scale.
- Ability to write testable, maintainable Python with clear structure.
- Understanding of model evaluation, observability, and feedback loops.
- Excited to push from prototype → production → iteration.
- Familiarity with the Databricks Data Intelligence Platform, which unifies data warehousing and AI use cases.
- Knowledge of Unity Catalog for open and unified governance of data, analytics, and AI on the lakehouse.
- Understanding of data security concerns related to AI and how to mitigate them using the Databricks AI Security Framework (DASF).
- Confident English skills to collaborate clearly and effectively with teammates.
🔥 Bonus If You:
- Have built scalable agent-like workflows on the Databricks platform.
- Have worked on semantic chunking, vector search, or hybrid retrieval strategies.
- Can walk us through a real-world prompt failure and how you fixed it.
- Have contributed to OSS tools or internal AI platforms.
- Think of yourself as both an engineer and a systems designer.
- Are familiar with data lakehouse architecture.
📍 Location & Compensation
- Must be based in San Francisco, Las Vegas, or Tel Aviv
- Full-time role with competitive comp
- Flexible hours, async-friendly culture, engineering-led environment