Machine Learning as a Pipeline

From Deep Learning Patterns and Practices by Andrew Ferlitsch

Like the best software engineering, modern deep learning uses a pipeline architecture based on reusable patterns.

Take 37% off Deep Learning Patterns and Practices by entering fccferlitsch into the discount code box at checkout.

You’ve likely seen this before. A successful ML engineer will need to decompose a machine learning solution into the following steps:

  1. Identify the Type of Model for the Problem
  2. Design the Model
  3. Prepare the Data for the Model
  4. Train the Model
  5. Deploy the Model

ML engineers organized these steps into a two-stage end-to-end (e2e) pipeline. The first e2e pipeline consisted of the first three steps, depicted in figure 1 below as modeling, data engineering, and training. Once the ML engineers were successful with this stage, it was coupled with the deployment step to form a second e2e pipeline. Typically, the model was deployed into a container environment and accessed via a REST-based or microservice interface.

Fig. 1 The prevailing practice in 2017 for an end-to-end machine learning pipeline

That was the prevailing practice in 2017. I refer to it as the discovery phase: what are the parts, and how do they fit together?
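To make the two-stage arrangement concrete, here is a minimal sketch in plain Python. It is not code from the book: every function name, the toy linear model, and the in-process `predict` function standing in for a REST endpoint are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Tuple

# All names below are hypothetical stand-ins for illustration only.
RawRow = Tuple[float, Optional[float]]   # (feature, label or None)
Dataset = List[Tuple[float, float]]      # cleaned (feature, label) pairs
Model = Dict[str, float]                 # a toy "model": weight and bias


def modeling() -> Model:
    """Steps 1-2: pick a model type and design it (here, y = w*x + b)."""
    return {"w": 0.0, "b": 0.0}


def data_engineering(raw: List[RawRow]) -> Dataset:
    """Step 3: prepare the data (here, drop rows with a missing label)."""
    return [(x, y) for x, y in raw if y is not None]


def training(model: Model, data: Dataset,
             epochs: int = 500, lr: float = 0.1) -> Model:
    """Step 4: fit the model with gradient descent on mean squared error."""
    w, b = model["w"], model["b"]
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return {"w": w, "b": b}


def first_pipeline(raw: List[RawRow]) -> Model:
    """First e2e pipeline: modeling -> data engineering -> training."""
    return training(modeling(), data_engineering(raw))


def second_pipeline(raw: List[RawRow]) -> Callable[[float], float]:
    """Second e2e pipeline: the first pipeline coupled with deployment."""
    model = first_pipeline(raw)

    def predict(x: float) -> float:
        # Stands in for a containerized REST or microservice endpoint.
        return model["w"] * x + model["b"]

    return predict


# Toy data drawn from y = 2x + 1, plus one unlabeled row to clean out.
raw_rows: List[RawRow] = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, None)]
serve = second_pipeline(raw_rows)
```

Calling `serve(4.0)` returns a value close to 9, since the toy data follows y = 2x + 1. The point of the sketch is the composition: the second pipeline only wraps the output of the first, which is why the two stages could be developed one after the other.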

Machine Learning as a CI/CD production process

In 2018, businesses were formalizing the continuous integration/continuous delivery (CI/CD) production process, which I refer to as the exploration phase. Figure 2 is a slide I used in a Google presentation to business decision makers in late 2018, which captures where we were then. It was no longer just a technical process; it now integrated planning and quality assurance. Data engineering became more defined as extraction, analysis, transformation, management, and serving steps. Model design and training included feature engineering, and deployment expanded to include continuous learning.

Fig. 2 By 2018, Google and other large enterprise businesses were formalizing the production process to include the planning and quality assurance stages as well as the technical process.

Model Amalgamation in production

Models in production today don’t have a single output layer. Instead, they have multiple output layers, covering essential feature extraction (common layers), representational space, latent space (feature vectors, encodings), and probability distribution space (soft and hard labels). The models are now the whole application; there is no backend. They learn the optimal way to interface and communicate data. The enterprise ML engineer of 2020 now guides the search space within an amalgamation of models. You can see a generalized example of a model amalgamation in figure 3.

Fig. 3 Model Amalgamation — when the models become the entire application!

Let’s break down this generalized example. On the left side is the input to the amalgamation. The input is processed by a common set of convolutional layers, referred to as the shared model bottom. The output from the shared model bottom in this depiction has four learned output representations: 1) high dimensional latent space, 2) low dimensional latent space, 3) pre-activation conditional probability distribution, and 4) post-activation independent probability distribution. Each of these learned output representations is reused by specialized downstream learned tasks, which perform an action (e.g., a state transition change or transformation). Each task, represented in the figure as tasks 1, 2, 3 and 4, reuses the output representation that is optimal (in size, speed, and accuracy) for the task’s goal.

These individual tasks may then produce multiple learned output representations, or combine learned representations from multiple tasks (dense embeddings), for reuse by further downstream tasks, as you saw in the sports broadcasting example.
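The breakdown above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the architecture in figure 3: the fixed arithmetic in `shared_bottom` merely stands in for learned convolutional layers, softmax is assumed for the post-activation output, and the four tasks and all names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def shared_bottom(x: Vector) -> Dict[str, Vector]:
    """Toy stand-in for the shared convolutional layers (not learned)."""
    # 1) High dimensional latent space: expanded, ReLU-activated features.
    high = [max(0.0, v * s) for v in x for s in (1.0, -1.0, 0.5)]
    # 2) Low dimensional latent space: mean-pool each group of three.
    low = [sum(high[i:i + 3]) / 3 for i in range(0, len(high), 3)]
    # 3) Pre-activation output: raw scores for two hypothetical classes.
    logits = [sum(low), -sum(low)]
    # 4) Post-activation output: softmax over the raw scores (assumed).
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    probs = [e / sum(exps) for e in exps]
    return {"high_latent": high, "low_latent": low,
            "logits": logits, "probs": probs}


def argmax(v: Vector) -> int:
    return max(range(len(v)), key=lambda i: v[i])


# Each hypothetical task names the one representation it reuses.
TASKS: Dict[str, Tuple[str, Callable[[Vector], float]]] = {
    "task_1": ("high_latent", lambda r: sum(v * v for v in r)),  # detail-heavy
    "task_2": ("low_latent", lambda r: float(argmax(r))),        # compact
    "task_3": ("logits", lambda r: r[0] - r[1]),                 # score margin
    "task_4": ("probs", lambda r: float(argmax(r))),             # hard label
}


def run_amalgamation(x: Vector) -> Dict[str, float]:
    """Run the shared bottom once, then let every task reuse its output."""
    reps = shared_bottom(x)
    return {name: fn(reps[key]) for name, (key, fn) in TASKS.items()}


outputs = run_amalgamation([1.0, -2.0])
```

The design choice worth noticing is that `shared_bottom` runs once per input and each task only declares which representation it consumes, so adding a fifth task would not change the shared computation.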

Not only do serving pipelines enable these types of solutions, but the components within the pipelines can also be version controlled and reconfigured. This makes the components reusable, which is a fundamental principle in modern software engineering.
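One way to picture version-controlled, reconfigurable components is a registry keyed by name and version, with a pipeline assembled from a configuration list. The sketch below is an assumption about how this might look in plain Python; the registry, the component names, and the version strings are all hypothetical and do not correspond to any particular serving framework.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Component = Callable[[Vector], Vector]

# Hypothetical registry keyed by (component name, version).
REGISTRY: Dict[Tuple[str, str], Component] = {}


def register(name: str, version: str) -> Callable[[Component], Component]:
    """Decorator that stores a component under an explicit version."""
    def wrap(fn: Component) -> Component:
        REGISTRY[(name, version)] = fn
        return fn
    return wrap


@register("normalize", "1.0")
def normalize_v1(xs: Vector) -> Vector:
    # Scale values by the maximum so the largest becomes 1.0.
    top = max(xs)
    return [x / top for x in xs]


@register("normalize", "2.0")
def normalize_v2(xs: Vector) -> Vector:
    # Center values on the mean instead of scaling them.
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]


@register("threshold", "1.0")
def threshold_v1(xs: Vector) -> Vector:
    # Emit a hard 0/1 label per value.
    return [1.0 if x > 0.5 else 0.0 for x in xs]


def build_pipeline(config: List[Tuple[str, str]]) -> Component:
    """Assemble a pipeline from an ordered list of (name, version) pairs."""
    steps = [REGISTRY[key] for key in config]

    def run(xs: Vector) -> Vector:
        for step in steps:
            xs = step(xs)
        return xs

    return run


# Two configurations that reuse the same threshold component but pin
# different versions of the normalize component.
pipeline_a = build_pipeline([("normalize", "1.0"), ("threshold", "1.0")])
pipeline_b = build_pipeline([("normalize", "2.0"), ("threshold", "1.0")])
```

With the input `[1.0, 3.0, 4.0]`, `pipeline_a` yields `[0.0, 1.0, 1.0]` and `pipeline_b` yields `[0.0, 0.0, 1.0]`. Swapping a version is a change to the configuration list only, which is the sense in which the components are reusable.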

That’s all for now. If you want to learn more about the book, check it out on Manning’s liveBook platform here.



