Master These 9 AIOps and MLOps Terms Before You Get Replaced in Your DevOps and Cloud Role
9 AIOps and MLOps terms that every DevOps engineer needs to survive
Hello Everyone
Welcome to your AKVAverse, I’m Abhishek Veeramalla, aka the AKVAman, your guide for Cloud, DevOps, and AI.
After working on a number of projects in AI and ML, I figured, why not break down for you the 9 most important terms that every DevOps engineer should know?
These are the terms you should be familiar with to survive as a DevOps and Cloud engineer, and I personally suggest you learn each one of them.
Let’s Begin
ML Lifecycle
Just as we have a CI/CD pipeline for application code, ML projects have a complete journey of their own: data collection → model training → evaluation → deployment → monitoring. As a DevOps engineer, your job is to automate each step and scale the entire process.
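To make the journey above concrete, here is a minimal sketch of the five stages chained together in plain Python. Everything in it is hypothetical: the stage functions, the toy data, and the 0.1 error threshold are my own stand-ins, not part of any real framework.

```python
# Each stage takes a shared context dict and returns it, like steps in a CI/CD job.

def collect_data(ctx):
    ctx["data"] = [(x, 2 * x) for x in range(10)]  # toy (input, label) pairs
    return ctx

def train_model(ctx):
    # "Training" here just estimates the slope y/x from the non-zero inputs.
    pairs = [(x, y) for x, y in ctx["data"] if x != 0]
    ctx["slope"] = sum(y / x for x, y in pairs) / len(pairs)
    return ctx

def evaluate(ctx):
    errors = [abs(ctx["slope"] * x - y) for x, y in ctx["data"]]
    ctx["mae"] = sum(errors) / len(errors)  # mean absolute error
    return ctx

def deploy(ctx):
    # Gate deployment on the evaluation metric, like a quality gate in CI/CD.
    ctx["deployed"] = ctx["mae"] < 0.1
    return ctx

def monitor(ctx):
    ctx["status"] = "healthy" if ctx["deployed"] else "not deployed"
    return ctx

STAGES = [collect_data, train_model, evaluate, deploy, monitor]

def run_pipeline(stages):
    ctx = {}
    for stage in stages:
        ctx = stage(ctx)
    return ctx

result = run_pipeline(STAGES)
print(result["status"])
```

In a real setup each stage would be its own job or container, but the idea of an ordered, automated chain with a gate before deployment stays the same.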
Data Version Control (DVC)
You obviously know Git and how it helps us version our code. DVC does the same for datasets and ML models. You may ask, "Okay Abhishek, but why can’t we use Git here?" The answer is very simple: the data we use for models is too large to store in Git. Instead, DVC keeps small pointer files in Git and pushes the actual data to remote storage. Simple, right?
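Here is a rough sketch of the idea behind versioning large data outside Git: hash the file, keep the bytes in a separate store, and track only a tiny pointer. This is not DVC's actual file format or API; the `add_to_store` helper and the pointer fields are assumptions I made up for illustration.

```python
import hashlib
import tempfile
from pathlib import Path

def add_to_store(data_file: Path, store: Path) -> dict:
    """Copy the file into a content-addressed store and return a small pointer."""
    content = data_file.read_bytes()
    digest = hashlib.sha256(content).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    (store / digest).write_bytes(content)  # the big data lives here, not in Git
    # Only this small dict would be committed to Git.
    return {"path": data_file.name, "sha256": digest, "size": len(content)}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    dataset = root / "train.csv"

    dataset.write_text("x,y\n1,2\n2,4\n")
    pointer_v1 = add_to_store(dataset, root / "store")

    dataset.write_text("x,y\n1,2\n2,4\n3,6\n")  # the dataset changes
    pointer_v2 = add_to_store(dataset, root / "store")

    stored_versions = len(list((root / "store").iterdir()))

print(pointer_v1["sha256"][:8], pointer_v2["sha256"][:8], stored_versions)
```

Each dataset version gets its own hash, so checking out an old pointer is enough to know exactly which data a model was trained on.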
Experiment Tracking
We track our deployments in CI/CD, right? Similarly, we track the experiments for each model. Why do we need this? To capture the data and the result of each run, which helps us improve our models. You can use tools like MLflow, which log every metric and result of each run and keep a clear history across all runs of the model, so when you need to compare a metric such as precision, you can pull the data from here.
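To show what "tracking a run" boils down to, here is a tiny in-memory sketch. It is not MLflow's API; the `ExperimentLog` class is hypothetical, and the parameter and precision values are made-up illustration numbers, not real results.

```python
import uuid

class ExperimentLog:
    """Keeps the parameters and metrics of every run so they can be compared."""

    def __init__(self):
        self.runs = []

    def log_run(self, params, metrics):
        run = {"run_id": uuid.uuid4().hex[:8], "params": params, "metrics": metrics}
        self.runs.append(run)
        return run

    def best_run(self, metric):
        # Highest value wins; fine for metrics like precision or accuracy.
        return max(self.runs, key=lambda run: run["metrics"][metric])

log = ExperimentLog()
log.log_run({"learning_rate": 0.1}, {"precision": 0.81})   # illustrative values
log.log_run({"learning_rate": 0.01}, {"precision": 0.88})  # illustrative values

best = log.best_run("precision")
print(best["params"], best["metrics"])
```

A real tracking tool adds persistence, a UI, and artifact storage on top, but the core is the same: every run leaves a record you can query later.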
Model Registry
Just as we store images in Docker Hub with different version tags, we store the different versions of a single model in a model registry. You can think of it as an artifact registry for your ML models: it versions, stores, and manages trained models. You can use MLflow Model Registry or AWS SageMaker Model Registry to store your model artifacts.
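A registry can be pictured as a versioned list of records per model name, with one version marked as the one in production. The sketch below is a toy in-memory version under that assumption; the class, the stage names, and the `s3://example-bucket/...` URIs are hypothetical and do not mirror the MLflow or SageMaker APIs.

```python
class ModelRegistry:
    """Toy registry: versions, stores metadata for, and promotes trained models."""

    def __init__(self):
        self._versions = {}  # model name -> list of version records

    def register(self, name, artifact_uri):
        versions = self._versions.setdefault(name, [])
        record = {
            "version": len(versions) + 1,
            "artifact_uri": artifact_uri,
            "stage": "staging",
        }
        versions.append(record)
        return record["version"]

    def promote(self, name, version):
        # Only one version serves production; the previous one is archived.
        for record in self._versions[name]:
            if record["stage"] == "production":
                record["stage"] = "archived"
        self._versions[name][version - 1]["stage"] = "production"

    def get_production(self, name):
        for record in self._versions[name]:
            if record["stage"] == "production":
                return record
        return None

registry = ModelRegistry()
v1 = registry.register("churn-model", "s3://example-bucket/churn/v1")
v2 = registry.register("churn-model", "s3://example-bucket/churn/v2")
registry.promote("churn-model", v1)
registry.promote("churn-model", v2)
print(registry.get_production("churn-model"))
```

Rolling back is then just promoting an older version again, much like redeploying a previous image tag.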
Continuous Training (CT)
As the heading says, here we train models not once but continuously. When new data arrives, we retrain our models with the latest data, similar to how we ship fixes and features for our code. This keeps the model from going stale over time.
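One simple way to picture a continuous training trigger is a check that retrains only when enough new data has arrived. This is a sketch under my own assumptions: the 1000-row threshold, the `state` dict, and the fake training function are all hypothetical.

```python
def should_retrain(rows_at_last_training, rows_available_now, min_new_rows=1000):
    """Retrain only when enough new rows have arrived since the last training."""
    return rows_available_now - rows_at_last_training >= min_new_rows

def continuous_training_step(state, rows_available_now, train_fn):
    if should_retrain(state["trained_on_rows"], rows_available_now):
        state["model"] = train_fn(rows_available_now)
        state["trained_on_rows"] = rows_available_now
        state["model_version"] += 1
    return state

def fake_train(n_rows):
    # Stand-in for a real training job.
    return f"model trained on {n_rows} rows"

state = {"trained_on_rows": 5000, "model_version": 1, "model": None}

state = continuous_training_step(state, 5200, fake_train)  # only 200 new rows
version_after_small_batch = state["model_version"]

state = continuous_training_step(state, 6500, fake_train)  # 1500 new rows
print(version_after_small_batch, state["model_version"], state["model"])
```

In practice this check would run on a schedule or be triggered by a data-arrival event, and the trigger could just as well be a drop in model quality instead of a row count.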
Model Serving
This is the stage where your models start serving the end users for whom you trained and built them. Many teams wrap the model in a REST API using a framework such as FastAPI. Dedicated tools like TensorFlow Serving and KServe can also be used to serve models to your users.
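At its core, serving means turning a request into a prediction and sending it back. The sketch below shows only that request-to-response logic with the standard library; the `predict` rule and the JSON shape are hypothetical, and a web framework would normally route something like a POST request to a function of this kind.

```python
import json

def predict(features):
    # Hypothetical stand-in for a trained model: a fixed linear rule.
    return 2.0 * features["x"] + 1.0

def handle_predict(request_body: str) -> str:
    """Parse a JSON request, run the model, and return a JSON response."""
    try:
        payload = json.loads(request_body)
        prediction = predict(payload["features"])
    except (json.JSONDecodeError, KeyError, TypeError):
        # Bad input should produce a clear error, not crash the server.
        return json.dumps({"error": "expected JSON with a 'features' object containing 'x'"})
    return json.dumps({"prediction": prediction})

response = handle_predict('{"features": {"x": 3}}')
bad_response = handle_predict("not json")
print(response)
print(bad_response)
```

Input validation matters as much here as in any other API, because the model will happily compute nonsense from malformed features.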
Model Monitoring & Drift Detection
As the name suggests, here we monitor our models, and monitoring is arguably even more important than deploying. The reason I say this is that deployed models can degrade when real-world data starts looking different from the training data, and this can ruin all the effort you put into building them. Checking for these shifts, both in the incoming data and in the gap between actual results and model predictions, is called drift detection.
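A very basic drift check compares live data against the training data for one feature. The sketch below measures how far the live mean has moved, in units of the training standard deviation. It is only one simple heuristic among many; the function names, the sample values, and the threshold of 2.0 are assumptions I chose for illustration.

```python
import statistics

def mean_shift_score(train_values, live_values):
    """How many training standard deviations the live mean has moved."""
    train_mean = statistics.mean(train_values)
    train_std = statistics.stdev(train_values)
    return abs(statistics.mean(live_values) - train_mean) / train_std

def has_drifted(train_values, live_values, threshold=2.0):
    return mean_shift_score(train_values, live_values) > threshold

# Toy values for a single numeric feature.
train = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]
live_similar = [10, 9, 11, 10, 10]
live_shifted = [15, 16, 14, 15, 17]

print(has_drifted(train, live_similar))
print(has_drifted(train, live_shifted))
```

Production setups usually use richer statistical tests and track many features at once, but the alerting logic follows the same pattern: compute a score, compare it to a threshold, and raise an alarm.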
Feature Store
Think of it as a shared library of reusable components, but for ML features (data attributes). You can use tools like Feast to ensure consistency between training and live inference. It can be considered a centralized config management system for ML features.
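The consistency point is easier to see in code: if training and serving both read features from one shared set of definitions, they cannot silently diverge. This is a hand-rolled sketch, not Feast's API; the feature names and the raw record fields are hypothetical.

```python
# One central place where every feature is defined exactly once.
FEATURE_DEFINITIONS = {
    "order_total_usd": lambda raw: raw["order_total_cents"] / 100,
    "is_weekend": lambda raw: raw["day_of_week"] in ("sat", "sun"),
}

def get_features(raw_record, feature_names):
    """Compute the requested features from a raw record using the shared definitions."""
    return {name: FEATURE_DEFINITIONS[name](raw_record) for name in feature_names}

MODEL_FEATURES = ["order_total_usd", "is_weekend"]
raw = {"order_total_cents": 2599, "day_of_week": "sat"}

training_row = get_features(raw, MODEL_FEATURES)  # used by the training job
serving_row = get_features(raw, MODEL_FEATURES)   # used by the live API

print(training_row)
```

Without a shared definition, the training job and the API often reimplement the same feature slightly differently, which is a classic source of hard-to-debug prediction errors.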
Kubeflow Pipelines
You can think of Kubeflow Pipelines as the Jenkins pipelines of ML workflows, but built specifically for Kubernetes. You can use it to automate the deployment and scaling of ML pipelines right inside your Kubernetes cluster.
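Under the hood, a pipeline tool treats your workflow as a graph of steps with dependencies and runs them in a valid order. The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9+). It does not use the Kubeflow Pipelines SDK; the step names and the no-op step functions are hypothetical.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it can start.
PIPELINE = {
    "ingest": set(),
    "validate": {"ingest"},
    "train": {"validate"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}

def run(pipeline, steps):
    """Run every step after its dependencies and return the execution order."""
    executed = []
    for name in TopologicalSorter(pipeline).static_order():
        steps[name]()
        executed.append(name)
    return executed

# No-op stand-ins; on Kubernetes each step would typically run in its own container.
steps = {name: (lambda: None) for name in PIPELINE}

order = run(PIPELINE, steps)
print(order)
```

Defining the workflow as a graph is what lets an orchestrator retry a single failed step or run independent steps in parallel.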
A Thought to Leave You With
If you are already aware of CI/CD, containers, and monitoring, stepping into MLOps isn’t scary at all. Obviously you’re not starting from zero; you’re adding machine learning tools and workflows to your existing DevOps skill set.
We all know the demand for MLOps engineers is growing really fast, and I want you all to learn these key terms.
If you want an MLOps roadmap designed by me personally and already trusted by 100+ DevOps engineers, just like this post ❤️ and comment "Roadmap" below. I will make sure it gets delivered straight to your inbox.
Until next time, keep building, keep experimenting, and keep exploring your AKVAverse. 💙
Abhishek Veeramalla, aka the AKVAman