Cover Image for Hugging Face introduces an open-source tool to facilitate the cost-effective deployment of artificial intelligence.
Sun Dec 15 2024

Hugging Face introduces an open-source tool to facilitate the cost-effective deployment of artificial intelligence.

Available at a cost of $1 per hour per container.

Hugging Face has launched its new service, Hugging Face Generative AI Services (HUGS), designed to facilitate the deployment and scaling of generative AI applications using open-source models. This service is based on Hugging Face technologies such as Transformers and Text Generation Inference (TGI), and it promises optimized performance across various hardware accelerators.

For developers using AWS or Google Cloud, the service costs $1 per hour per container and offers a five-day free trial on AWS to help users get started. HUGS allows developers to run AI models on their own infrastructure without the need for manual configurations. One of the biggest challenges when deploying large language models (LLMs) is optimizing them for specific hardware environments, as each type of accelerator, whether an NVIDIA GPU or an AMD GPU, requires tuning to maximize its performance.

With HUGS, these optimizations are handled automatically, providing high performance from the first use. In addition to NVIDIA and AMD GPUs, support for AWS Inferentia and Google TPUs is expected to be extended soon. Hugging Face aims to ease the transition from 'black box' APIs to open, self-hosted solutions, supporting a wide range of well-known models, including LLMs like Llama and Gemma, with plans to introduce multimodal models such as Idefics and Llava in the future. There are also plans to include embedding models like BGE and Jina, giving developers more options to customize their AI applications.

This service is based on standardized APIs that are compatible with OpenAI's model interfaces, making it easy for developers to migrate their own code. For startups, HUGS represents an opportunity to build AI applications without the high costs associated with proprietary platforms. The availability of one-click deployments on DigitalOcean makes it even easier for small teams to experiment with generative AI technologies.

On the other hand, larger companies can leverage HUGS to scale their applications without being tied to a single cloud provider or proprietary API. On DigitalOcean, HUGS is included at no additional cost beyond the standard cost of GPU Droplets. Moreover, Hugging Face offers customized deployment solutions for enterprises through its Enterprise Hub.