META CEO Mark Zuckerberg predicts that 2025 is the year an AI assistant will serve over one billion users, and he thinks META will be the company that provides it.
In 2023, Meta revealed AI inference accelerators that it is designing in-house specifically for Meta’s AI workloads, in particular the deep learning recommendation models that are improving a variety of experiences across Meta products.
At yesterday’s earnings call, Meta highlighted the Meta Training and Inference Accelerator (MTIA) family of custom-made chips designed for Meta’s AI workloads. Custom $AVGO MTIA chips are now targeting training workloads and ranking systems, with the aim of reducing dependence on NVIDIA GPUs. The custom ASICs also support inference. Meta expects cost efficiencies from deploying the custom MTIA silicon in areas where optimizing the chip for its unique workloads lowers the cost of compute.
Meta’s quarterly revenue increased 21% year over year to a new record high of $48.4 billion. Net income increased 49% YoY to a new record high of $20.8 billion. Operating margin increased to 48% from 41% a year ago.
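As a rough check on the growth figures above, the year-ago values can be backed out from the reported numbers. This is a minimal sketch using only the rounded figures quoted in the article. The helper name `implied_prior` is hypothetical, and the net margin it prints (net income divided by revenue) is a different measure from the operating margin quoted above.

```python
def implied_prior(current: float, yoy_growth_pct: float) -> float:
    """Back out the year-ago value from a current value and a YoY growth percentage."""
    return current / (1 + yoy_growth_pct / 100)


# Rounded figures quoted in the article, in billions of dollars.
revenue_now = 48.4
net_income_now = 20.8

# Year-ago values implied by the quoted growth rates (21% and 49%).
revenue_prior = implied_prior(revenue_now, 21)
net_income_prior = implied_prior(net_income_now, 49)

print(f"Implied year-ago revenue: ${revenue_prior:.1f}B")
print(f"Implied year-ago net income: ${net_income_prior:.1f}B")

# Net margin = net income / revenue (not the same as operating margin).
print(f"Net margin now: {net_income_now / revenue_now:.0%}")
print(f"Implied net margin a year ago: {net_income_prior / revenue_prior:.0%}")
```

With these rounded inputs, the implied year-ago revenue is about $40 billion and the implied year-ago net income about $14 billion. Because the inputs are rounded, the results are approximations and not reported figures.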
META will still spend $60-65 billion on CapEx in 2025, and most of this will go to AI infrastructure.
AI success involves heavily investing in infrastructure and CapEx to deliver quality products at scale.

AI is driving revenue growth. Four million advertisers are using generative AI tools, up from 1 million six months ago. AI is benefiting Meta's business, and the payoff from its advancements and investments is becoming evident as the company takes a thoughtful approach.
AI is enhancing margins. META will improve margins by developing an AI agent capable of coding at a mid-level engineering standard.
META will focus on AI monetization after it reaches billion-user scale. The company wants to get its AI products to scale first and look at monetization later.
Meta's strategic push: Custom $AVGO MTIA chips now targeting training workloads & ranking systems, aiming to reduce NVIDIA GPU dependence. $NVDA down -5% as Meta's in-house silicon threatens their AI dominance. Strategic shift could impact future datacenter GPU sales $META https://t.co/z9Re1jlJAQ pic.twitter.com/jrpMT97b0N
— semi (@johnwayne12591) January 29, 2025

Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technology and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
A pipelined custom implementation of an LLM will be far more efficient and a lot faster than any GPU, so it scales out more economically. Reminds me of a pipelined FPGA implementation of a speech-to-text model that could handle the equivalent of 1.3m concurrent conversations on one chip...