Reference shelf

Longer study guides and supporting notes that sit behind the main writing feed.

data-engineering / system-design / interview-prep

Data System Design Interview Glossary

A reference of every technical concept a data engineer should be ready to name, define, or trade off in a data architecture interview. Organized to map onto your delivery framework: Functional Requirements → Non-Funct...

markdown / data-engineering / mlops

Study Guide: Data Operations Architecture at Scale

Production ML data operations is a collection of connected systems: ingestion, annotation, synthetic data, multimodal storage, enrichment, quality monitoring, agentic remediation, self-service tooling, scale patterns,...

data-engineering / llm-pretraining / nemotron

Study Guide: The Nemotron 3 Super Data Engineering Pipeline

A structured reference for how NVIDIA built the 25-trillion-token pretraining corpus behind Nemotron 3 Super — a case study in large-scale data engineering for LLM pretraining, distinct from (but related to) operation...
