278: Azure is on a Bender: Bite my Shiny Metal FXv2-series VMs


Welcome to episode 278 of The Cloud Pod, where the forecast is always cloudy! When Justin’s away, the guys will… maybe get a show recorded? This week, we’re talking OpenAI, another service scheduled for the grave over at AWS, saying goodbye to pesky IPv4 fees, Azure FXv2 VMs, Valkey 8.0 and so much more! Thanks for joining us, here in the cloud! 

Titles we almost went with this week:

• Another One Bites the Dust


• Peak AI reached: OpenAI Now Puts Print Statements in Code to Help You Debug


A big thanks to this week’s sponsor: Archera (https://shortclick.link/uthdi1)



There are a lot of cloud cost management tools out there. But only Archera provides cloud commitment insurance. It sounds fancy but it’s really simple. Archera gives you the cost savings of a 1- or 3-year AWS Savings Plan with a commitment as short as 30 days. If you don’t use all the cloud resources you’ve committed to, they will literally put money back in your bank account to cover the difference. Other cost management tools may say they offer “commitment insurance”, but remember to ask: will you actually give me my money back? Archera will. Click this link to check them out (https://shortclick.link/uthdi1)

AI Is Going Great – Or How ML Makes All Its Money



00:59 Introducing vision to the fine-tuning API (https://openai.com/index/introducing-...)

• OpenAI has announced the integration of vision capabilities into its fine-tuning API, allowing developers to fine-tune the GPT-4o model so it can analyze and interpret images alongside text and audio inputs.


• This update broadens the scope of applications for AI, enabling more multimodal interactions.


• The fine-tuning API now supports image inputs, which means developers can train models to understand and generate content based on visual data in conjunction with text and audio (a minimal sketch of the data format and job setup follows this list).


• After October 31, 2024, training for fine-tuning will cost $25 per 1 million tokens, with inference priced at $3.75 per 1 million input tokens and $15 per 1 million output tokens. 


• Images are tokenized based on size before pricing. The introduction of prompt caching and other efficiency measures could lower the operational costs for businesses deploying AI solutions.


• The API is also being enhanced with features like epoch-based checkpoint creation, a comparative playground for model evaluation, and integration with third-party platforms like Weights & Biases for detailed fine-tuning data management.


• What does it mean? Admit it – you’re dying to know. 


• Developers can now create applications that not only process text or voice but also interpret and generate responses based on visual cues, fine-tuned for domain-specific applications. This update could lead to more intuitive user interfaces, where users interact with services using images as naturally as they do with text or speech, potentially expanding the user base to the less tech-savvy or to fields where visual data is crucial.
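To make the data format concrete, here is a minimal sketch of vision fine-tuning with the OpenAI Python SDK. The JSONL message shape and the model snapshot name follow OpenAI’s fine-tuning docs as we read them; the bolt-inspection example, file names, and image URL are hypothetical.

    import json
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # One JSONL line per training example; images ride alongside text as
    # chat content parts. (Hypothetical bolt-inspection example.)
    example = {
        "messages": [
            {"role": "user", "content": [
                {"type": "text", "text": "Is this bolt defective?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/bolt-001.jpg"}},
            ]},
            {"role": "assistant", "content": "Defective: stripped threading."},
        ]
    }
    with open("train.jsonl", "w") as f:
        f.write(json.dumps(example) + "\n")

    # Upload the dataset and kick off the fine-tuning job.
    training_file = client.files.create(file=open("train.jsonl", "rb"),
                                        purpose="fine-tune")
    job = client.fine_tuning.jobs.create(training_file=training_file.id,
                                         model="gpt-4o-2024-08-06")
    print(job.id, job.status)

    # Back-of-the-envelope cost at the post-October-31 rate of $25 per 1M
    # training tokens: a 400k-token dataset over 3 epochs is 1.2M tokens,
    # or about $30 to train.

Remember that images are tokenized by size before pricing, so larger images raise both training and inference costs.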




03:53 Jonathan – “I mean, I think it’s useful for things like quality assurance in manufacturing, for example. You could tune it on what your nuts and bolts are supposed to look like, what a good bolt looks like and what a bad bolt looks like coming out of the factory. You just stream the video directly to an AI like this and have it kick out all the bad ones. It’s kind of neat.”



04:41 Introducing the Realtime API (https://openai.com/index/introducing-...)

• OpenAI has launched its Realtime API in public beta, designed to enable developers to create applications with real-time, low-latency, multimodal interactions.


• This API facilitates speech-to-speech conversations, making user interactions more natural and engaging.


• The Realtime API uses WebSockets to maintain a persistent connection, allowing real-time input and output of both text and audio. It also includes function calling capabilities, making it versatile for various applications (a minimal connection sketch follows the use-case list below).


• It leverages the new GPT-4o model, which supports multimodal inputs: text, audio, and now, via fine-tuning, vision.


• Use Cases include:

• Interactive Applications: Developers can now build apps where users can have back-and-forth voice conversations or even integrate visual data for a more comprehensive interaction.


• Customer Service: The API can revolutionize customer service with real-time voice interactions that feel more human-like.


• Voice Assistants: Healthify already uses the API for natural, conversational interactions with its AI coach, Ria.
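For a feel of the wire protocol, here is a minimal text-only round trip, sketched from the beta launch docs: the event names, the OpenAI-Beta header, and the model name reflect our reading of those docs, and the prompt is a placeholder. (Newer releases of the websockets library spell the header keyword additional_headers.)

    import asyncio, json, os
    import websockets  # pip install websockets

    URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
    HEADERS = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",  # beta opt-in header
    }

    async def main():
        # One persistent connection carries client events out and server
        # events back, both directions as JSON frames.
        async with websockets.connect(URL, extra_headers=HEADERS) as ws:
            # Ask for a response; speech-to-speech works the same way,
            # with "audio" in modalities plus streamed audio buffers.
            await ws.send(json.dumps({
                "type": "response.create",
                "response": {"modalities": ["text"],
                             "instructions": "Say hello to The Cloud Pod."},
            }))
            async for frame in ws:
                event = json.loads(frame)
                if event["type"] == "response.text.delta":
                    print(event["delta"], end="", flush=True)
                elif event["type"] == "response.done":
                    break

    asyncio.run(main())

The persistent socket is the point: unlike request/response REST calls, the model can start streaming a reply while the user is still being heard, which is what makes the low-latency voice use cases above feel conversational.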






...
