OVHcloud service offers API access to open-source AI models

AI Endpoints is a serverless platform that allows developers to integrate more than 40 open-source AI models into applications through APIs, covering use cases like chatbots, speech-to-text, and coding support.

OVHcloud has launched AI Endpoints, a serverless offering that allows developers to integrate a wide range of AI capabilities into their applications without managing infrastructure or requiring machine-learning expertise. The platform provides over 40 open-source large language models (LLMs) and generative AI tools for chatbots, coding assistance, and text-to-speech.

Developers can experiment with the models in a sandbox environment before deploying them into production for internal tools, customer service, or other business processes. With API-based access, the service is intended to streamline how AI models are tested, adopted, and scaled.

AI Endpoints supports several use cases for developers building or upgrading AI-enabled systems. These include conversational interfaces: using LLMs, developers can add agent-based interactions that automate responses or enhance customer engagement through natural-language conversation.
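As a sketch of what API-based access to a hosted LLM typically looks like, the snippet below assumes an OpenAI-compatible chat completions interface; the URL and model name are placeholders, not OVHcloud's actual values, which come from the AI Endpoints catalogue.

```python
import json
import urllib.request

# Placeholder endpoint and model name for illustration only; the real
# values are provided per model in the AI Endpoints catalogue.
API_URL = "https://example-endpoint.ovh.net/v1/chat/completions"
MODEL = "example-llama-instruct"

def build_chat_request(messages, model=MODEL, max_tokens=256):
    """Assemble an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }

def ask(question, api_key):
    """Send a single-turn question to the (assumed) chat endpoint."""
    payload = build_chat_request([{"role": "user", "content": question}])
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses carry the reply under choices[0].message
    return body["choices"][0]["message"]["content"]
```

Because the model sits behind a plain HTTP API, the same pattern covers internal tools and customer-service bots alike: only the prompt and the API key change.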

The platform also includes models that extract and organize unstructured data, streamlining tasks such as document processing and supporting ETL (Extract, Transform, Load) pipelines. Speech transcription and generation are exposed through dedicated APIs, so developers can add voice-to-text and text-to-speech capabilities to applications that handle audio. Models can also be integrated into IDEs, where they provide real-time code suggestions, automate routine tasks, and flag potential errors, supporting both productivity and code quality.
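The extraction use case above usually boils down to prompting a model to emit structured JSON and parsing its reply in the Transform step of the pipeline. The prompt template and helper names below are illustrative assumptions, not part of the AI Endpoints API.

```python
import json

# Hypothetical prompt template for illustration; real-world extraction
# quality depends on the chosen model and on prompt design.
EXTRACTION_PROMPT = (
    "Extract the following fields from the document as a JSON object "
    "with keys {fields}. Return only the JSON object.\n\n"
    "Document:\n{document}"
)

def build_extraction_prompt(document, fields):
    """Build a prompt asking the model to emit structured JSON."""
    return EXTRACTION_PROMPT.format(fields=", ".join(fields), document=document)

def parse_extraction(reply):
    """Pull the JSON object out of a model reply, tolerating extra text."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object in model reply")
    return json.loads(reply[start : end + 1])
```

The parsed dict can then be loaded into a database or downstream system like any other ETL record.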

Cloud infrastructure and data sovereignty

The platform is built on OVHcloud's infrastructure in Europe, with data hosted under European jurisdiction. This setup is intended to meet the needs of organizations concerned about data privacy and exposure to non-European regulations.

In terms of energy use, AI Endpoints is deployed on water-cooled servers housed in OVHcloud data centres, with the goal of reducing environmental impact while maintaining compute performance. The service also supports model transparency by offering open-weight models, allowing customers to deploy the same models on their own infrastructure or other environments.

The service is now available in Asia-Pacific, Canada, and Europe, with deployments handled through OVHcloud's Gravelines data centre. The pay-as-you-go pricing model charges for the number of tokens processed, with per-token costs varying depending on the selected model.
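Per-token pricing makes cost estimation straightforward. The prices below are invented placeholders purely to show the arithmetic; actual per-model rates are published by OVHcloud.

```python
# Hypothetical per-million-token prices for illustration only;
# real AI Endpoints prices vary by model.
PRICE_PER_MILLION_TOKENS = {
    "small-model": 0.10,  # EUR per 1M tokens (assumed)
    "large-model": 0.80,  # EUR per 1M tokens (assumed)
}

def estimate_cost(model, tokens):
    """Estimate the pay-as-you-go cost for a number of processed tokens."""
    return PRICE_PER_MILLION_TOKENS[model] * tokens / 1_000_000
```

For example, under these assumed rates, processing two million tokens on the small model would cost 0.20 EUR.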

The current model catalogue includes LLMs and SLMs such as Llama and Mixtral, code models including Qwen and Codestral Mamba, and the reasoning model DeepSeek-R1. The multimodal model Qwen2.5-VL 72B and SDXL image generation are included as well. For speech, both ASR (automatic speech recognition) and TTS (text-to-speech) models are supported.

According to OVHcloud, the rollout follows a preview phase in which customer feedback was used to shape new features, including support for stable open-source models and more granular API key management.