Engineering · build the loop

Prove you can ship the model into production.

AI integration, RAG and retrieval, agent orchestration, fine-tuning, and the eval harness that keeps it honest. Verified by build, not buzzwords.

Hiring for: AI Engineer, ML Engineer, RAG Engineer, Applied Scientist.

18-question adaptive quiz across LLM fundamentals, RAG, agents, evals, tool use, prompt caching, and MCP — plus a graded RAG-pipeline project. ~12h total.

4 skill clusters

The microskills the rubric tests against.

1294+ live jobs

AI Engineering roles in the SignalAI marketplace.

18 months

A verified score is good for 18 months before re-certification.

What you'll prove

The clusters every Engineering candidate is graded on.

Retrieval that works

Hybrid retrieval, reranking, chunking strategy. You picked an approach because the numbers said so — not because it was the default.

Agents and orchestration

Tool use, multi-step plans, recovery from bad calls. You can name the failure modes and show how your system survives them.

Evals as a first-class artifact

An eval set sourced from real queries, scored on the metrics that matter, with before/after numbers when you change the system.

Fine-tuning and adaptation

When to fine-tune vs prompt vs retrieve. You've actually shipped one of each and can defend the call.

Sample project brief

Build a hybrid retrieval pipeline with measured quality

Pick a corpus and ship a working RAG pipeline (BM25 + dense + reranker) with an eval harness that lets a reviewer reproduce your numbers.
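One way to combine the BM25 and dense result lists before reranking is reciprocal rank fusion (RRF). The brief does not prescribe a fusion method, so treat this as a minimal sketch: the function name, the sample doc IDs, and the constant `k=60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    Each document earns 1 / (k + rank) from every list it appears in.
    k=60 is a commonly used default, assumed here, not set by the brief.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical retriever outputs, best hit first.
bm25_hits = ["d3", "d1", "d7"]
dense_hits = ["d1", "d9", "d3"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

The fused list is what you would hand to the reranker. Documents found by both retrievers rise above those found by only one.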

Deliverables

  • Public GitHub repo, runs end-to-end from a clean clone
  • Eval set of ≥50 queries with Recall@10 and faithfulness scores
  • One ablation that didn't pan out, documented honestly
  • 1-page narrative on chunk-size and reranker tradeoffs
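The Recall@10 deliverable can be computed with a few lines of code. This is a sketch under assumptions: the eval-set shape (`query` and `relevant` keys) and the `retrieve` callable are hypothetical, not a required format. Faithfulness scoring is left out because it depends on the judge you choose.

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of the labeled relevant docs that appear in the top k results."""
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("query has no labeled relevant documents")
    top_k = set(retrieved[:k])
    return len(top_k & relevant_set) / len(relevant_set)


def mean_recall_at_k(eval_set, retrieve, k=10):
    """Average Recall@k over an eval set.

    eval_set: list of {"query": str, "relevant": [doc_id, ...]} (assumed shape).
    retrieve: callable mapping a query string to a ranked list of doc IDs.
    """
    scores = [recall_at_k(retrieve(item["query"]), item["relevant"], k)
              for item in eval_set]
    return sum(scores) / len(scores)
```

Averaging per-query recall, rather than pooling all hits, keeps queries with many relevant documents from dominating the number.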

How grading works

Transparent rubric. Same bar for everyone.

Each criterion is scored 0–5 with a written rationale. Your composite score is the weighted sum of the criterion scores, published alongside the rubric so an employer can see exactly what you did.
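As a worked illustration of the weighted sum: the four criterion names come from the rubric, but the weights below are hypothetical placeholders, since the page does not publish the actual weighting.

```python
# Hypothetical weights for illustration only; the real rubric weights
# are not stated on this page. They sum to 1.0.
WEIGHTS = {
    "build_quality": 0.3,
    "end_to_end_evals": 0.3,
    "code_clarity": 0.2,
    "build_narrative": 0.2,
}


def composite_score(scores, weights=WEIGHTS):
    """Weighted sum of per-criterion scores, each on the 0-5 scale."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the rubric criteria")
    for name, value in scores.items():
        if not 0 <= value <= 5:
            raise ValueError(f"{name} must be between 0 and 5")
    return sum(weights[name] * scores[name] for name in weights)


example = {
    "build_quality": 4,
    "end_to_end_evals": 3,
    "code_clarity": 5,
    "build_narrative": 4,
}
print(round(composite_score(example), 2))
```

Because the weights sum to 1, the composite stays on the same 0–5 scale as the individual criteria.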

Build quality

Repo runs end-to-end. Pipeline is wired correctly. Ingestion is idempotent.
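"Idempotent" here means re-running ingestion on the same corpus leaves the index unchanged. One common way to get there is to key each chunk by a hash of its content. The sketch below assumes a plain dict stands in for the vector store; `chunk_id` and `ingest` are hypothetical names.

```python
import hashlib


def chunk_id(source, text):
    """Deterministic ID derived from the chunk's source and content."""
    return hashlib.sha256(f"{source}\n{text}".encode("utf-8")).hexdigest()


def ingest(store, source, chunks):
    """Insert chunks keyed by content hash; return how many were new.

    store: dict standing in for a real index (an assumption for this sketch).
    """
    added = 0
    for text in chunks:
        key = chunk_id(source, text)
        if key not in store:  # already ingested: skip, so reruns are no-ops
            store[key] = {"source": source, "text": text}
            added += 1
    return added


store = {}
first_run = ingest(store, "handbook.md", ["chunk one", "chunk two"])
second_run = ingest(store, "handbook.md", ["chunk one", "chunk two"])
print(first_run, second_run, len(store))
```

A reviewer can check this by running ingestion twice from a clean clone and confirming the index size does not change on the second run.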

End-to-end evals

Real query set, real metrics, before/after numbers, one documented ablation.

Code clarity

Right-sized modules. Secrets via env. Eval harness sits next to the pipeline.

Build narrative

Names two real tradeoffs and resolves them with numbers.

Live Engineering jobs

Where AI engineers are hiring right now.

See all 1294

What a verified profile looks like

Every candidate publishes the same way.

See an example scorecard with the composite score, rubric breakdown, project artifact links, and your top microskills from the quiz. Yours will look exactly like this.

See a sample profile

Get verified

Engineering candidates take AI Builder Foundations today.

The AIB Foundations assessment grades all three rubrics — including Engineering — on the same bar. Standalone Engineering Foundations ships in cohort 02.