Walkthrough of a sample based on a real ML use case. It covers dealing with large unbalanced datasets, lazily preprocessing only the data used for training, and orchestrating the workflow in Apache Beam.
Inspired by: https://wildlifeinsights.org
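
The core idea of handling an unbalanced dataset and preprocessing lazily can be sketched in plain Python. This is only an illustration, not the sample's actual Beam pipeline: `cap_per_class`, `lazy_preprocess`, `fake_preprocess`, and the labels and IDs below are all hypothetical names and data chosen for the example. It assumes the cheap step (choosing which examples to keep) works on lightweight metadata, so the expensive step runs only on the examples that were selected.

```python
import random
from collections import defaultdict

def cap_per_class(examples, max_per_class, seed=0):
    """Keep at most `max_per_class` randomly chosen items for each label.

    `examples` is an iterable of (label, item) pairs of cheap metadata.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for label, item in examples:
        by_label[label].append(item)

    selected = []
    for label, items in by_label.items():
        if len(items) > max_per_class:
            items = rng.sample(items, max_per_class)
        selected.extend((label, item) for item in items)
    return selected

def lazy_preprocess(selected, preprocess):
    """Yield (label, preprocessed item); `preprocess` runs only on demand."""
    for label, item in selected:
        yield label, preprocess(item)

# Hypothetical data: one very common class and one rare class.
metadata = [("deer", f"deer-{i}") for i in range(1000)]
metadata += [("ocelot", f"ocelot-{i}") for i in range(5)]

def fake_preprocess(image_id):
    # Stand-in for an expensive step such as downloading and resizing an image.
    return f"processed:{image_id}"

selected = cap_per_class(metadata, max_per_class=10)
training_data = lazy_preprocess(selected, fake_preprocess)  # no work done yet
```

Because `lazy_preprocess` is a generator, `fake_preprocess` is called only as `training_data` is consumed, and never for the examples that were dropped when balancing the classes. In a Beam pipeline the same ordering (select first, preprocess afterwards) would be expressed as transforms rather than Python generators.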