AWS announces new Inferentia machine learning chip

AWS is not content to cede any part of any market to any company. When it comes to machine learning chips, names like Nvidia or Google come to mind, but today at AWS re:Invent in Las Vegas, the company announced a new dedicated machine learning chip of its own called Inferentia.

“Inferentia will be a very high-throughput, low-latency, sustained-performance very cost-effective processor,” AWS CEO Andy Jassy explained during the announcement.

Holger Mueller, an analyst with Constellation Research, says that while Amazon is far behind, this is a good step for them as companies try to differentiate their machine learning approaches in the future.

“The speed and cost of running machine learning operations — ideally in deep learning — are a competitive differentiator for enterprises. Speed advantages will make or break success of enterprises (and nations when you think of warfare). That speed can only be achieved with custom hardware, and Inferentia is AWS’s first step to get in to this game,” Mueller told TechCrunch. As he pointed out, Google has a 2-3 year head start with its TPU infrastructure.

Inferentia supports popular frameworks like INT8, FP16 and mixed precision. What’s more, it supports multiple machine learning frameworks, including TensorFlow, Caffe2 and ONNX.

Of course, being an Amazon product, it also supports data from popular AWS products such as EC2, SageMaker and the new Elastic Inference Engine announced today.

While the chip was announced today, AWS CEO Andy Jassy indicated it won’t actually be available until next year.

Techcrunch event

Disrupt 2026: The tech ecosystem, all in one room

Your next round. Your next hire. Your next breakout opportunity. Find it at TechCrunch Disrupt 2026, where 10,000+ founders, investors, and tech leaders gather for three days of 250+ tactical sessions, powerful introductions, and market-defining innovation. Register now to save up to $400.

Save up to $300 or 30% to TechCrunch Founder Summit

1,000+ founders and investors come together at TechCrunch Founder Summit 2026 for a full day focused on growth, execution, and real-world scaling. Learn from founders and investors who have shaped the industry. Connect with peers navigating similar growth stages. Walk away with tactics you can apply immediately

Offer ends March 13.

San Francisco, CA | October 13-15, 2026

more AWS re:Invent 2018 coverage

Topics

, , , , , , ,
Loading the next article
Error loading the next article