The Other AI Elephant in the Room

For all the talk about the benefits of Artificial Intelligence (AI), there is an ugly downside. Most of the concern centers on AI as a humanity killer: gaining sentience and running amok, being used to develop a super virus, or commanding killer drone swarms.

But there is another elephant in the room, and it is really hungry. AI has a nearly insatiable appetite for data, manpower, energy, and water.

For the most part, the AI industry has responded with a brute force solution: building more data centers that are powered by nuclear power plants. But there is a more elegant solution: better timing synchronization that allows distributed databases to efficiently send and receive data and optimizes power-hungry AI.

It may seem counterintuitive, but timing precision reduces surge events in databases, eliminates centralized nodes, and reduces the effort to work with the database. Meta and NVIDIA found that a synchronization improvement of 80x made the distributed database run 3x faster - "an incredible performance boost."

And better timing helps address a new problem created by more powerful AI models: power surges.

AI requires a lot of processors working together, each of which has small power surges. To ensure that these surges aren't aligned, these parallel processors need to de-sync via sync (see below).

AI (or, actually, our demand for better AI) is an insatiable beast. Timing can help keep it trim.

Last Week's Theme: Any Port in a Storm

Industry News

Some DoD officials believe that China is on the path to space supremacy but think the US "should partner with them, or at least understand what they're doing." Other officials believe that "it's too early to be fighting a space race with China."
That being said, the best defense is a good offense. The US Space Force is planning to install antennas to jam adversary satellites in case of a conflict.
A recent assessment of "10 Emerging Technologies for Defense in 2024" included Alternative Position, Navigation, and Timing (PNT), noting that the "DOD continues to heavily rely on GPS, a dependency that represents an alarming national security vulnerability."
The G7 Cyber Expert Group (CEG) released a "Statement for the Opportunities and Risks of Quantum Computing," noting that quantum computers pose "a threat to traditional cryptographic algorithms protecting digital communications, IT systems, and data."
In its search for a cryptography standard that cannot be broken by future quantum computers, the US National Institute of Standards and Technology (NIST) has down-selected to fourteen potential options in its ongoing post-quantum cryptography (PQC) competition.
Investment in quantum research continues apace:
- Europe announced an additional $97M in funding for the European Quantum Communication Infrastructure (EuroQCI) program.
- The US Department of Energy announced a $30M Quantum Computing for Computational Chemistry (QC3) program "to develop quantum algorithms to revolutionize diverse areas of energy research."
- The state of Massachusetts announced the Massachusetts Green High Performance Computing Center (MGHPCC).
- The UK opened the National Quantum Computing Centre (NQCC).
- Indiana announced plans to develop a Quantum Corridor.
- Colorado is already on the case. Following designation by the US Department of Commerce’s Economic Development Administration (EDA) as one of the 31 inaugural Tech Hubs, Colorado officials broke ground on the Quantum COmmons as part of "America’s response to beat China."

Conferences

IQT Quantum+AI, October 29 - 30, New York, New York
International Timing and Sync Forum 2024, November 4 - 7, Seville, Spain
UK National Quantum Technologies Showcase, November 8, London, UK
UK PNT Leadership Seminar, November 20, London, UK
SLUSH, November 20 - 21, Helsinki, Finland
Q2B24 Silicon Valley, December 10 - 12, Santa Clara, California
Consumer Electronics Show, January 5 - 7, Las Vegas, NV
Photonics West, January 25 - 30, San Francisco, CA
Workshop on Synchronization and Timing Systems (WSTS), May 12 - 15, 2025, Savannah, GA
European Navigation Conference, May 21 - 23, 2025, Wroclaw, Poland

The More You Know...

"Accurate timing is a new compute glue" for data centers being developed for AI.

Over the past decade there has been a trend of data centers moving "from centralized systems to more decentralized and now more distributed systems" to improve resiliency, scalability and efficiency.

But this also creates new issues that precise synchronization can help mitigate:

Congestion reduction - complex deep learning models typically use "all to all" collective operations, which is "one of the most expensive workloads in large data centers." Creating accurate transmission windows and communication sequences can "put some order to the chaos," reducing data congestion that leads to packet drop and retransmission.
Data consistency - data sequencing, high frequency telemetry and performance analysis require accurate time stamping for consistency, ordering, debugging and monitoring.
Efficiency - storage and databases use accurate timing to accelerate cache coherency and reduce database locking time. Better synchronization lowers the Window of Uncertainty, which means less data traffic, higher server utilization, and reduced costs.
Latency reduction - precision timing streamlines data processing across infrastructure sites.
Security - synchronization enhances security measures, ideally leveraging a PNT parent signal that cannot be spoofed or jammed (unlike GNSS).
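
To make the "Window of Uncertainty" idea concrete, here is a minimal sketch under stated assumptions. It treats each timestamp as an interval of plus or minus the clock uncertainty, so two events can only be ordered with confidence when their intervals do not overlap. The function name and the numbers (a 1 ms uncertainty, tightened by a factor of 80) are hypothetical and chosen only for illustration; they are not taken from any specific database.

```python
def definitely_before(t_a: float, t_b: float, uncertainty_s: float) -> bool:
    """Return True if event A certainly happened before event B.

    Each timestamp is only known to within +/- uncertainty_s, so A is
    certainly earlier only if A's latest possible time is still before
    B's earliest possible time.
    """
    return t_a + uncertainty_s < t_b - uncertainty_s


# Two events recorded 100 microseconds apart on different servers.
event_a = 0.0
event_b = 100e-6

loose_sync = 1e-3             # assumed 1 ms clock uncertainty
tight_sync = loose_sync / 80  # assumed 80x tighter synchronization

# With loose sync the order is ambiguous, so a database would have to
# wait or coordinate; with tight sync the timestamps alone settle it.
ambiguous_before = not definitely_before(event_a, event_b, loose_sync)
resolved_after = definitely_before(event_a, event_b, tight_sync)
```

In this toy model, a smaller window means fewer cases where servers have to exchange extra messages or hold locks just to agree on what happened first.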

With the advent of AI-centric data centers, there is an additional need for precise synchronization.

Processing the data in Large Language Models (LLMs) as fast as possible requires horizontal expansion with a lot of machines working together. These machines and processes tend to run differently, which creates inefficiencies and tail latency that engineers strive to remove.

But this optimization creates other problems.

Individually, these machines have small power spikes that, if aligned, can lead to catastrophe. It wasn't a problem until recently, but as the size of LLMs has grown, so has the number of machines that "require a lot of parallelization," causing power "spikes showing up close to each other."

So data centers need to be synchronized to de-sync these potentially catastrophic power spikes, thereby allowing them to run more efficiently and consume less power and water.
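
The "de-sync via sync" idea can be illustrated with a toy model. This is a sketch under assumed numbers, not a description of how any real data center schedules work: ten hypothetical machines each draw a baseline load plus a short periodic spike, and a shared clock is used to give each machine a different phase offset. All names and wattages below are made up for illustration.

```python
def total_power(tick, offsets, period, spike_len, base_w, spike_w):
    """Sum the power draw of all machines at one time tick."""
    total = 0
    for offset in offsets:
        # Position of this machine within its own repeating cycle.
        phase = (tick - offset) % period
        total += base_w + (spike_w if phase < spike_len else 0)
    return total


def peak_power(offsets, period=100, spike_len=10, base_w=300, spike_w=200):
    """Highest combined draw seen over one full cycle."""
    return max(
        total_power(t, offsets, period, spike_len, base_w, spike_w)
        for t in range(period)
    )


machines = 10
aligned = [0] * machines                                # every spike lands together
staggered = [i * (100 // machines) for i in range(machines)]  # one spike at a time

aligned_peak = peak_power(aligned)      # 10 * (300 + 200) = 5000
staggered_peak = peak_power(staggered)  # 10 * 300 + 200 = 3200
```

Staggering only works if every machine agrees on what time it is, which is why tighter synchronization is a precondition for spreading the spikes apart.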
