awesome-kubernetes/docs/mlops.md
Inaki Fernandez cdfd26bccd spring + summer
2022-09-11 21:43:43 +02:00


Machine Learning Ops (MLOps) and Data Science

Introduction. MLOps

Object Detection Libraries

MLflow

Kubeflow

Flyte

  • https://flyte.org

  • Union Cloud ML and Data Orchestration powered by Flyte

  • mlops.community: MLOps with Flyte: The Convergence of Workflows Between Machine Learning and Engineering

  • ==Machine Learning in Production. What does an end-to-end ML workflow look like in production? (transcript)== 🌟🌟🌟 - Play Recording

    • Kelsey Hightower joined the @flyteorg team to discuss what ML looks like in the real world, from ingesting data to consuming ML models via an API.
    • @kelseyhightower You can't go swimming in a #data_lake if you actually can't swim, right? You're going to drown. 🏊‍♂️
    • @ketanumare Machine Learning products deteriorate in time. If you have the best model today it's not guaranteed to be the best model tomorrow.
    • @thegautam It's hard to verify models before you put them in production. We need our systems to be fully reproducible, which is why an #orchestration_tool is important, running multiple models in parallel.
    • @ketanumare We at @union_ai unify the extremely fragmented world of ML and give the choice to users when to use proprietary technology versus when to use open source. (1/2)
    • @ketanumare #Flyte makes it seamless to work on #kubernetes with spark jobs, and that's a big use case, but you can also use @databricks. Similarly, we are working on Ray and you can also use @anyscalecompute. (2/2)
    • @Ketanumare Most machine learning engineers are not distributed systems engineers. This becomes a challenge when you're deploying models to production. Infrastructure abstraction is key to unlock your team's potential.
    • @ketanumare on #Machine_Learning workflows: Creating Machine Learning workflows is a team sport. 🤝
    • @arnoxmp: A Machine Learning model is often a blackbox. If you encounter new data, do a test run first.
    • @fabio_graetz In classical software engineering the only thing that changes is the code; in an ML system the data can change too. You need to version and test data changes.
    • @Forcebananza This is actually one of the reasons I really like using #Flyte. You can map a cell in a notebook to its own task, and they're really easy to compose and reuse and copy and paste around. (1/2)
    • @Forcebananza Jupyter notebooks are great for iterating, but moving more towards a standard software engineering workflow and making that easy enough for data scientists is really really important. (2/2)
    • @jganoff Taking snapshots of petabytes of data is expensive, there are tools that version a dataset without having to copy it. Having metadata separate from the data itself allows you to treat a version of a dataset as if it were code.
    • @SMT_Solvers In F500s it is mostly document OCR. Usually batch jobs - an API wouldn't work - you need the binaries on the server even if it is a sidecar Docker container. One org (not mine) blows $$ doing network transfer from AWS to GCP when GCP could license their OCR in a container.
    • @Forcebananza Flyte creates a way for all these teams to work together partially because writing workflows, writing reusable components… is actually simple enough for data scientists and data engineers to work with.
    • @kelseyhightower We're now at a stage where we can start to leverage systems like https://flyte.org to give us more of an opinionated end-to-end workflow. What we call #ML can become a real discipline where practitioners can use a common set of terms and practices.
  • stackoverflow.com: How is Flyte tailored to "Data and Machine Learning"?
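
The point made above by @jganoff, that a dataset can be versioned through metadata kept separate from the data, can be illustrated with a small sketch. This is not how Flyte or any particular tool implements it; the function names `dataset_version` and `changed_paths` are hypothetical, and in-memory bytes stand in for files that a real tool would hash in place. The idea is that the version is derived from a manifest of content hashes, so no data is copied:

```python
from __future__ import annotations

import hashlib
import json


def file_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one file's content."""
    return hashlib.sha256(content).hexdigest()


def dataset_version(files: dict[str, bytes]) -> tuple[str, dict[str, str]]:
    """Build a manifest (path -> content hash) and derive a version id from it.

    Only the manifest is stored as metadata; the data itself is never copied.
    """
    manifest = {path: file_digest(data) for path, data in sorted(files.items())}
    encoded = json.dumps(manifest, sort_keys=True).encode("utf-8")
    version_id = hashlib.sha256(encoded).hexdigest()[:12]
    return version_id, manifest


def changed_paths(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List paths that were added, removed, or modified between two manifests."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Because the version id depends only on the manifest, two snapshots with identical content get the same id, and comparing two manifests works much like a code diff.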

Azure ML

  • docs.microsoft.com: MLflow and Azure Machine Learning - One of the open-source projects that has made #ML better is MLflow. Microsoft is expanding support for APIs, no-code deployment for MLflow models in real-time/batch managed inference, curated MLflow settings, and CLI v2 integrations.
  • bea.stollnitz.com: Creating batch endpoints in Azure ML
    • Suppose you've trained a machine learning model to accomplish some task, and you'd now like to provide that model's inference capabilities as a service. Maybe you're writing an application of your own that will rely on this service, or perhaps you want to make the service available to others. This is the purpose of endpoints: they provide a simple web-based API for feeding data to your model and getting back inference results.
    • Azure ML currently supports three types of endpoints: batch endpoints, Kubernetes online endpoints, and managed online endpoints. I'm going to focus on batch endpoints in this post, but let me start by explaining how the three types differ.
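
The request/response contract that such an endpoint exposes can be sketched in a few lines. This is a generic illustration, not the Azure ML API: the `predict` function is a hypothetical stand-in for a trained model, and the `inputs` and `predictions` field names are assumptions chosen for the example. A real endpoint would wrap a handler like this behind an HTTP server or a batch runner.

```python
import json


def predict(features: list) -> float:
    """Hypothetical stand-in for a trained model: a fixed linear function."""
    weights = [0.5, -0.25]
    bias = 1.0
    return bias + sum(w * x for w, x in zip(weights, features))


def handle_request(body: str) -> str:
    """Parse a JSON request, score every input row, and return JSON results.

    The "inputs" and "predictions" keys are assumptions made for this sketch.
    """
    payload = json.loads(body)
    predictions = [predict(row) for row in payload["inputs"]]
    return json.dumps({"predictions": predictions})
```

Calling `handle_request('{"inputs": [[2.0, 4.0]]}')` returns a JSON string with one prediction per input row, which mirrors the "feed data in, get inference results back" shape described above.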

KServe Cloud Native Model Server

Data Science

Other Tools

Samples

  • fepegar/vesseg Brain vessel segmentation using 3D convolutional neural networks

ML Courses

ML Competitions and Challenges

Tweets


To my JVM friends looking to explore Machine Learning techniques - you don't necessarily have to learn Python to do that. There are libraries you can use from the comfort of your JVM environment. 🧵👇

— Maria Khalusova (@mariaKhalusova) November 26, 2020
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

You don't need to go to a university to learn machine learning - you can do it from your living room, for completely free.

Here is an extensive list of curated free courses and tutorials, from beginner to advanced. ↓

(Trust me, you want to bookmark this tweet.)

— Tivadar Danka (@TivadarDanka) September 21, 2021

I started taking data science courses last year, after studying and coding for at least 10 hours 6 days a week and doing several ML projects alongside data analysis projects, I finally got my first data analyst offer from a Nigerian bank last week after countless rejections

— Sam (@SamsonTontoye) February 20, 2022

Deep Neural Networks are used for many applications. One I'm particularly fond of is medical imaging. A trained model can process the input thanks to the activation functions propagating through a network of perceptrons and generating the output of interest. #NeuralNets #Medical pic.twitter.com/vPwm0TfHnn

— Valerio Pergola (@valerio_pergola) July 6, 2022

#3D intracranial artery segmentation using convolutional neural networks #CNN - #opensource > https://t.co/Z2WDp2UOl3 | #python #TensorFlow #DeepLearning #MachineLearning #Nvidia #GPU #brain #medical #conda #Neurology #Artificial_Intelligence #medical_imaging #Nifti pic.twitter.com/eKrBBuFxSy

— NewUlmDesign (@ulmdesign) July 7, 2022

https://t.co/WxspfKvLFS

— nubenetes (@nubenetes) July 22, 2022

@kelseyhightower We're now at a stage where we can start to leverage systems like #Flyte to give us more of an opinionated end-to-end workflow. What we call #ML can become a real discipline where practitioners can use a common set of terms and practices. #KelseyTakesFlyte #MLOps

— Flyte (@flyteorg) July 22, 2022