How Microsoft shipped a production-optimized image model in under a month. The speed of this release deserves attention.
Google LLC’s DeepMind artificial intelligence unit today rolled out a new text-to-speech model called Gemini 3.1 Flash TTS.
OpenAI released its text-to-video artificial intelligence model, Sora, this week after the completion of its testing phase. The Microsoft-backed AI startup first teased the model in February and ...
Microsoft's New AI Models Go Beyond Just Text ...
Multi-modal models that can process both ...
OpenAI's text-to-video tool Sora generates high-quality videos up to one minute in length. (OpenAI) OpenAI on Thursday announced Sora, a brand new model that generates high-definition videos up to ...
Google on Friday added a new, experimental “embedding” model for text, Gemini Embedding, to its Gemini developer API. Embedding models translate text inputs like words and phrases into numerical ...
Researchers at Amazon have trained the largest text-to-speech model yet, which they claim exhibits “emergent” qualities that improve its ability to speak even complex sentences naturally. The ...
The new Sora artificial intelligence (AI) program has obvious pitfalls, but experts highlighted the benefits of the new tech while acknowledging the need for greater safeguards and public education on ...
Kokoro 82M is an 82-million-parameter text-to-speech model that beats many TTS APIs while running locally on CPUs, including ...