The Cerebras Wafer Scale Engine (WSE) was presented at Hot Chips. Sander Olson provided Nextbigfuture with the presentation deck.
The Cerebras WSE is the most powerful processor for AI, with 1.2 trillion transistors.
It has:
* 400,000 AI-optimized cores
* 46,225 square millimeters of silicon wafer area
* 1.2 trillion transistors
* 18 gigabytes of on-chip memory
* 9 PByte/s memory bandwidth
* 100 Pbit/s fabric bandwidth
* TSMC 16nm process
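Some back-of-the-envelope arithmetic on the spec sheet above puts these numbers in perspective: the 9 PByte/s memory bandwidth could sweep the entire 18 GB of on-chip memory hundreds of thousands of times per second, and the memory works out to tens of kilobytes per core. This is a rough sketch using only the figures quoted here, not Cerebras's own breakdown:

```python
# Back-of-the-envelope check of the WSE spec sheet figures
# quoted above (illustrative arithmetic only).

MEM_BYTES = 18e9     # 18 GB of on-chip memory
MEM_BW = 9e15        # 9 PByte/s memory bandwidth
CORES = 400_000      # AI-optimized cores

# How often could the memory bandwidth read the whole on-chip memory?
passes_per_second = MEM_BW / MEM_BYTES

# Average on-chip memory available per core.
bytes_per_core = MEM_BYTES / CORES

print(f"{passes_per_second:,.0f} full-memory passes per second")  # 500,000
print(f"{bytes_per_core / 1e3:.0f} KB of memory per core")        # 45 KB
```

The bandwidth-to-capacity ratio is the point the commenters below pick up on: the fabric can move data far faster than the on-chip memory can hold it.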
It has room to shrink to 7 nanometer and eventually 1 nanometer lithography. A next-generation wafer chip at 7 nanometers is already in development and will have 2.6 trillion transistors. Reaching 1 nanometer would mean about 25 trillion transistors if the same scaling held.
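One way to read the "about 25 trillion" projection is to assume each process jump repeats the roughly 2.17x transistor-count gain observed going from the 16nm WSE (1.2 trillion) to the 7nm part (2.6 trillion). This is a hedged sketch of that extrapolation, not Cerebras's stated roadmap:

```python
# Hedged extrapolation of the article's transistor-scaling claim.
# Assumption: each future process generation repeats the same
# transistor-count gain seen going from 16 nm to 7 nm.

WSE_16NM = 1.2e12   # transistors on the 16 nm wafer
WSE_7NM = 2.6e12    # transistors on the 7 nm wafer

gain = WSE_7NM / WSE_16NM          # ~2.17x per generation
# Three more jumps of the same size lands near the article's
# "about 25 trillion" figure for a 1 nm-class process.
projected = WSE_7NM * gain**3

print(f"per-generation gain: {gain:.2f}x")                    # 2.17x
print(f"projected transistors: {projected / 1e12:.1f} trillion")  # 26.4
```

Note that the observed 2.17x gain per generation is well below the ideal (16/7)^2 ≈ 5.2x area-density scaling, which is why the projection stays in the tens rather than hundreds of trillions.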
The CS-2, when it becomes available, will feature:
* 850,000 AI-optimized cores
* 2.6 trillion transistors
* TSMC 7nm process
It is already working in Cerebras labs today.
Cerebras CS-1: cluster-scale deep learning performance in a single system. It is powered by the Wafer Scale Engine and offers the programming accessibility of a single node, using TensorFlow, PyTorch, and other frameworks. It deploys into standard datacenter infrastructure. Multiple units have been delivered, installed, and are in use today across multiple verticals.
It is built from the ground up for AI acceleration.
SOURCES- Hot Chips, Primeur Magazine, Sander Olson
Written By Brian Wang, Nextbigfuture.com
Brian Wang is a Futurist Thought Leader and a popular science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked the #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting-edge technologies, he is currently a Co-Founder of a startup and a fundraiser for high-potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker, and a guest on numerous radio and podcast interviews. He is open to public speaking and advising engagements.
What I don't like is the limited SRAM memory of 18 GB. Sure, it sounds like a lot, but when you compare it with the fantastic bus bandwidth (petabytes), it's not that great. Also, there are many large ANNs which would not fit in the memory.
I think that they should address this. If you are paying several million dollars for a system, you would expect more memory. They should stack SRAM on top to reach at least 1 TB of SRAM (at the same power consumption).
Designing the wafer tester for that will be … interesting…
How does this chip's performance compare to the Nvidia A100?
wake me when they do the memristor equivalent
It would be nice for wafer scale integration to finally find its niche.