Managed Inference

Serve Generative AI models and answer prompts from European end-consumers securely.

Choose among ready-to-be-served LLMs

What makes inference fast? Model optimization is one lever. To be served fast, a model must be optimized to the machines that run it.
This isn't always a piece of cake, and can turn into a time-consuming process. That's why Scaleway is providing an evolutionary Model Library, with curated and optimized LLMs.

Benefit from a various range of dedicated GPUs

Thanks to a complete GPU portfolio, the Managed Inference product is powered by the GPU you see fit: L4, H100 PCIe GPU Instances and very soon L40S GPU Instances. Multiple options to offer you the flexibility you need to reach efficiency and cost-effectiveness at the same time.

Run on a fully secured European Cloud

Enjoy tailored security for your infrastructure: from highly secure VPC environments to accessible setups with internet and IAM tokens.
Maintain complete data control: no storage nor third-party access to your data (prompt & responses), ensuring it remains exclusively yours and within Europe.

Available zones:
Paris:PAR 2

State-of-the-art open weights LLMs

Mixtral-8x7B-Instruct-v0.1

Trained on Scaleway's Nabuchodonosor 2023, Mixtral-8x7B is a state-of-the-art, pretrained generative model known as a Sparse Mixture of Experts. It has been benchmarked to surpass the performance of the Llama 2 70B model across a variety of tests.

Benefit from a secured European Cloud ecosystem

Virtual Private Cloud

Your AI endpoints are accessible through low-latency and secure connection to your resources hosted at Scaleway, thanks to a resilient regional Private Network.

Learn more

Access Management

We make generative AI endpoints compatible with Scaleway's Identity and Access Management, so that your deployments are compliant with your enterprise architecture requirements.

Learn more

Cockpit

Identify bottlenecks on your deployments, view inference requests in real time and even report your energy consumption with a fully managed observability solution.

Learn more
  • Scaleway is a NVIDIA Elite Partner