
Colin Webb

Tag: #data-engineering

SCD via SQL

Slowly Changing Dimensions (SCD) is a data engineering technique for tracking changes to a record over time. It's useful when changes are unpredictable and infrequent, and a historical view of the data is needed.

SQL is ubiquitous in data engineering, which makes it a natural tool for implementing SCD.
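To make the idea concrete, here is a minimal sketch of one common flavour of SCD (often called Type 2), where each change closes the current row and opens a new one. It assumes SQLite via Python's built-in sqlite3 module; the customer_dim table, its columns, and the apply_change helper are hypothetical names chosen for illustration, not taken from the post.

```python
import sqlite3


def apply_change(conn, customer_id, city, as_of):
    """Record a new city for a customer, keeping the old row as history."""
    current = conn.execute(
        "SELECT city FROM customer_dim "
        "WHERE customer_id = ? AND valid_to IS NULL",
        (customer_id,),
    ).fetchone()
    if current is not None and current[0] == city:
        return  # value unchanged, so the current row stays open

    # Close the currently open row, if there is one.
    conn.execute(
        "UPDATE customer_dim SET valid_to = ? "
        "WHERE customer_id = ? AND valid_to IS NULL",
        (as_of, customer_id),
    )
    # Open a new row that is valid from the change date onwards.
    conn.execute(
        "INSERT INTO customer_dim (customer_id, city, valid_from, valid_to) "
        "VALUES (?, ?, ?, NULL)",
        (customer_id, city, as_of),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_dim (
        customer_id INTEGER NOT NULL,
        city        TEXT    NOT NULL,
        valid_from  TEXT    NOT NULL,
        valid_to    TEXT             -- NULL marks the current row
    )
    """
)

apply_change(conn, 1, "Leeds", "2023-01-01")
apply_change(conn, 1, "Leeds", "2023-03-01")  # no change, no new row
apply_change(conn, 1, "York", "2023-06-01")

history = conn.execute(
    "SELECT city, valid_from, valid_to FROM customer_dim "
    "WHERE customer_id = 1 ORDER BY valid_from"
).fetchall()
for row in history:
    print(row)
```

The current value is always the row where valid_to is NULL, and the value on any past date can be recovered by filtering on valid_from and valid_to.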

Fake Data With Python

Real data is best, but sometimes it's not available, or you're not allowed to use it.

Fake data is useful for testing and prototyping. It's also useful for demos and for training models. Generating it is easy and quick.
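As a rough illustration of how little code this can take, the sketch below builds a handful of fake customer records using only Python's standard library. The field names, the sample name and city lists, and the fake_customers helper are all made up for this example; a dedicated fake-data library would offer far more variety.

```python
import random
from datetime import date, timedelta

# Small hand-written pools to draw from (illustrative values only).
FIRST_NAMES = ["Alice", "Bilal", "Chen", "Dara", "Eve"]
CITIES = ["Leeds", "York", "Bristol", "Cardiff"]


def fake_customers(n, seed=0):
    """Return n fake customer records; the seed makes runs repeatable."""
    rng = random.Random(seed)
    start = date(2023, 1, 1)
    rows = []
    for customer_id in range(1, n + 1):
        signup = start + timedelta(days=rng.randrange(365))
        rows.append(
            {
                "customer_id": customer_id,
                "name": rng.choice(FIRST_NAMES),
                "city": rng.choice(CITIES),
                "signup_date": signup.isoformat(),
            }
        )
    return rows


customers = fake_customers(5)
for customer in customers:
    print(customer)
```

Seeding the generator keeps the output identical between runs, which is handy when fake data backs a test.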