Integration Brings Cerebras Inference Capabilities to Hugging Face Hub
AI {hardware} firm Cerebras has teamed up with Hugging Face, the open supply platform and neighborhood for machine studying, to combine its inference capabilities into the Hugging Face Hub. This collaboration supplies greater than 5 million builders with entry to fashions operating on Cerebras’ CS-3 system, the businesses mentioned in a press release, with reported inference speeds considerably greater than standard GPU options.
Cerebras Inference, now out there on Hugging Face, processes greater than 2,000 tokens per second. Current benchmarks point out that fashions comparable to Llama 3.3 70B operating on Cerebras’ system can attain speeds exceeding 2,200 tokens per second, providing a efficiency enhance in contrast to main GPU-based options.
“By making Cerebras Inference out there by Hugging Face, we’re enabling builders to entry various infrastructure for open supply AI fashions,” mentioned Andrew Feldman, CEO of Cerebras, in a press release.
For Hugging Face’s 5 million builders, this integration supplies a streamlined approach to leverage Cerebras’ know-how. Customers can choose “Cerebras” as their inference supplier throughout the Hugging Face platform, immediately accessing one of many {industry}’s quickest inference capabilities.
The demand for high-speed, high-accuracy AI inference is rising, particularly for test-time compute and agentic AI purposes. Open supply fashions optimized for Cerebras’ CS-3 structure allow sooner and extra exact AI reasoning, the businesses mentioned, with pace positive aspects starting from 10 to 70 instances in contrast to GPUs.
“Cerebras has been a pacesetter in inference pace and efficiency, and we’re thrilled to companion to convey this industry-leading inference on open supply fashions to our developer neighborhood,” commented Julien Chaumond, CTO of Hugging Face.
Builders can entry Cerebras-powered AI inference by choosing supported fashions on Hugging Face, comparable to Llama 3.3 70B, and selecting Cerebras as their inference supplier.
Concerning the Creator
John K. Waters is the editor in chief of plenty of Converge360.com websites, with a give attention to high-end improvement, AI and future tech. He is been writing about cutting-edge applied sciences and tradition of Silicon Valley for greater than two many years, and he is written greater than a dozen books. He additionally co-scripted the documentary movie Silicon Valley: A 100 12 months Renaissance, which aired on PBS. He will be reached at [email protected].
Source link
#Integration #Brings #Cerebras #Inference #Capabilities #Hugging #Face #Hub #Campus #Technology