2 talks on SageMaker Feature Store
Talk #1: Standardize and Automate your Feature Engineering Workflows with SageMaker Feature Store (Mani Khanuja, AI/ML Specialist Solutions Architect @ AWS / manikhanuja)
As a data scientist, you probably spend a lot of time crafting feature engineering code. Given the experimental nature of this work, even a small project can go through multiple iterations, so you often run the same feature engineering code again and again, wasting time and compute resources on repeated operations. In large organizations the loss of productivity can be even greater, as different teams often run identical jobs, or even write duplicate feature engineering code because they have no knowledge of prior work. Because models are trained on engineered datasets, it’s also imperative that you apply the same transformations to the data used for prediction. This often means rewriting your feature engineering code (sometimes in a different language), integrating it into your prediction workflow, and running it at prediction time. This whole process is not only time-consuming, it can also introduce inconsistencies, as even the tiniest variation in your data transforms can have a large impact on predictions.
In this hands-on session, you’ll learn how to solve these problems with Amazon SageMaker Feature Store, and how to use it with both the SageMaker Studio user interface and the SageMaker SDK. You’ll also see how it works together with SageMaker Data Wrangler to simplify your end-to-end data preparation workflows.
In this talk, I will cover the following:
1) Explore data
2) Pre-process data
3) Create a feature store
4) Ingest data into a feature store
5) Explain the difference between offline and online feature stores
6) Prepare data for training
7) Run the training job
8) Deploy the model
9) Use the online feature store for inference
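To make the offline/online distinction in step 5 concrete, here is a toy, in-memory sketch. It does not use the SageMaker SDK: the class ToyFeatureGroup, its methods, and the field names customer_id, event_time and avg_basket are all hypothetical, chosen only to illustrate the idea that an offline store keeps the full history of records (for training), while an online store keeps only the latest record per identifier (for low-latency inference).

```python
class ToyFeatureGroup:
    """In-memory stand-in for a feature group (illustration only, not the SageMaker API)."""

    def __init__(self, record_id_name, event_time_name):
        self.record_id_name = record_id_name
        self.event_time_name = event_time_name
        self.offline = []  # append-only history of every ingested record
        self.online = {}   # latest record per identifier, for fast lookups

    def ingest(self, record):
        # The offline store keeps every version of the record.
        self.offline.append(dict(record))
        key = record[self.record_id_name]
        current = self.online.get(key)
        # The online store keeps only the record with the newest event time.
        if current is None or record[self.event_time_name] >= current[self.event_time_name]:
            self.online[key] = dict(record)

    def get_record(self, record_id):
        """Online lookup: latest values for one identifier, or None."""
        return self.online.get(record_id)

    def history(self, record_id):
        """Offline lookup: every ingested version for one identifier."""
        return [r for r in self.offline if r[self.record_id_name] == record_id]


group = ToyFeatureGroup(record_id_name="customer_id", event_time_name="event_time")
group.ingest({"customer_id": "c1", "event_time": 100, "avg_basket": 20.0})
group.ingest({"customer_id": "c1", "event_time": 200, "avg_basket": 35.0})
group.ingest({"customer_id": "c2", "event_time": 150, "avg_basket": 12.5})

latest_c1 = group.get_record("c1")   # what an inference endpoint would read
history_c1 = group.history("c1")     # what a training job would read
print(latest_c1)
print(len(history_c1))
```

In a managed feature store the two views are backed by different storage systems; the sketch only mirrors the access pattern of each.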
Talk #2: Feature Stores and why you need them to productionize ML applications (Heiko Hotz, Solution Architect for AI/ML @ AWS)
Businesses are increasingly using machine learning (ML) to make near-real-time decisions, such as placing an ad, assigning a driver, recommending a product, or even dynamically pricing products and services. For example, a model for a ride-sharing app can choose the best price for a ride from the airport, but only if it knows the number of ride requests received in the past 10 minutes and the number of passengers projected to land in the next 10 minutes. A routing model in a call center app can pick the best available agent for an incoming call, but it is only effective if it knows the customer’s latest web session clicks. Although the business value of near-real-time ML predictions is enormous, the architecture required to deliver them reliably, securely, and with good performance is complicated. Solutions need high-throughput updates and low-latency retrieval of the most recent feature values in milliseconds, something most data scientists are not prepared to deliver. To address these challenges, SageMaker Feature Store provides a fully managed central repository for ML features, making it easy to securely store and retrieve features without having to build and maintain your own infrastructure. SageMaker Feature Store lets you define groups of features, use batch ingestion and streaming ingestion, retrieve the latest feature values with single-digit millisecond latency for highly accurate online predictions, and extract point-in-time correct datasets for training.
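The phrase "point-in-time correct" can be illustrated with a small sketch: for each labelled event, a training row should use the feature value that was known at that moment, never a later one. The code below is a plain-Python illustration under assumed data, not the SageMaker Feature Store API; the function point_in_time_value and the sample values are hypothetical.

```python
from bisect import bisect_right


def point_in_time_value(history, as_of):
    """Return the latest feature value whose event time is <= as_of, or None.

    `history` is a list of (event_time, value) pairs sorted by event_time.
    """
    times = [t for t, _ in history]
    idx = bisect_right(times, as_of)
    return history[idx - 1][1] if idx > 0 else None


# Hypothetical feature history for one card: (event_time, txn_count_10m)
feature_history = [(100, 1), (200, 4), (300, 2)]
# Hypothetical labelled events: (label_time, is_fraud)
labels = [(50, 0), (250, 1), (300, 0)]

# Each training row pairs a label with the feature value as of the label time,
# so no information from the future leaks into training.
training_rows = [
    (label_time, point_in_time_value(feature_history, label_time), is_fraud)
    for label_time, is_fraud in labels
]
print(training_rows)
```

The first label predates any feature value, so its feature is None; a real pipeline would have to decide whether to drop or impute such rows.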
The demo walks through a complete example of how you can couple streaming feature engineering with Amazon SageMaker Feature Store to make ML-backed decisions in near real time. We detect credit card fraud by updating aggregate features from a live stream of transactions and using low-latency feature retrievals.
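As a rough idea of what "updating aggregate features from a live stream" can look like, here is a minimal sliding-window sketch in plain Python. It is not the code from the demo: the class SlidingWindowAggregator, the 10-minute window, and the feature names txn_count and avg_amount are assumptions made for illustration, and the sketch assumes transactions for a given card arrive in timestamp order.

```python
from collections import defaultdict, deque


class SlidingWindowAggregator:
    """Per-card aggregates over a trailing time window (illustration only)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        # card_id -> deque of (timestamp, amount), oldest first
        self.events = defaultdict(deque)

    def update(self, card_id, timestamp, amount):
        """Add one transaction and return the refreshed aggregate features."""
        window = self.events[card_id]
        window.append((timestamp, amount))
        # Evict transactions that fell out of the trailing window.
        while window and window[0][0] <= timestamp - self.window_seconds:
            window.popleft()
        count = len(window)
        total = sum(a for _, a in window)
        return {"txn_count": count, "avg_amount": total / count}


agg = SlidingWindowAggregator(window_seconds=600)  # assumed 10-minute window
f1 = agg.update("card-1", timestamp=0, amount=10.0)
f2 = agg.update("card-1", timestamp=300, amount=30.0)
f3 = agg.update("card-1", timestamp=700, amount=50.0)  # the t=0 transaction is evicted
print(f1, f2, f3)
```

In a setup like the one described, each returned dictionary would be written to the online store so the fraud model can read the latest aggregates at prediction time.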
RSVP Webinar: https://www.eventbrite.com/e/1-hr-fre...
Meetup:
https://meetup.datascienceonaws.com
Zoom:
https://zoom.us/j/690414331
Webinar ID: 690 414 331
Phone:
+1 646 558 8656 (US Toll) or +1 408 638 0968 (US Toll)
Related Links
Meetup: https://meetup.datascienceonaws.com
GitHub Repo: https://github.com/data-science-on-aws/
O'Reilly Book: https://datascienceonaws.com
YouTube: https://youtube.datascienceonaws.com
Slideshare: https://slideshare.datascienceonaws.com
Monthly Workshop: https://www.eventbrite.com/e/full-day...