google

Exploring a space-based, scalable AI infrastructure system design

Project Suncatcher is a Google moonshot initiative aimed at scaling machine learning infrastructure by deploying solar-powered satellite constellations equipped with Tensor Processing Units (TPUs). By leveraging the nearly continuous energy of the sun in specific orbits and using high-bandwidth free-space optical links, the project seeks to bypass the resource constraints of terrestrial data centers. Early research suggests that a modular, tightly clustered satellite design can achieve the compute density and communication speeds required for modern AI workloads.

### Data-Center Bandwidth via Optical Links

* To match terrestrial performance, inter-satellite links must support tens of terabits per second using multi-channel dense wavelength-division multiplexing (DWDM) and spatial multiplexing.
* The system addresses signal power loss (the link budget) by keeping satellites in extremely close proximity (kilometers or less), far closer than in traditional long-range satellite deployments.
* Initial bench-scale demonstrations have achieved 800 Gbps each-way transmission (1.6 Tbps total) using a single transceiver pair, supporting the feasibility of high-speed optical networking.

### Orbital Mechanics of Compact Constellations

* The proposed system uses a sun-synchronous low Earth orbit (LEO) at an altitude of approximately 650 km to maximize solar exposure and minimize the weight of onboard batteries.
* Researchers use the Hill-Clohessy-Wiltshire equations and JAX-based differentiable models to describe the relative motion of satellites flying in tight 100–200 m formations, including the gravitational perturbations and atmospheric drag that affect them.
* Simulations of 81-satellite clusters indicate that only modest station-keeping maneuvers are required to maintain stable, "free-fall" trajectories within the orbital plane.
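
To make the formation-flying idea more concrete, here is a minimal sketch of the closed-form Hill-Clohessy-Wiltshire (HCW) solution for relative motion around a circular reference orbit. It is not the project's model: it ignores the perturbations and drag mentioned above, it uses plain Python rather than JAX, and the function names, the 100 m radial offset, and the axis convention (x radial, y along-track, z cross-track) are all assumptions chosen for illustration. Only the 650 km altitude comes from the text.

```python
import math

MU_EARTH = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6_378_137.0      # Earth's equatorial radius, m


def mean_motion(altitude_m):
    """Angular rate (rad/s) of a circular orbit at the given altitude."""
    r = R_EARTH + altitude_m
    return math.sqrt(MU_EARTH / r**3)


def hcw_propagate(state0, n, t):
    """Closed-form HCW solution for the unperturbed relative state.

    state0 = (x, y, z, vx, vy, vz) relative to the reference satellite,
    with x radial, y along-track, z cross-track (metres, metres/second).
    """
    x0, y0, z0, vx0, vy0, vz0 = state0
    s, c = math.sin(n * t), math.cos(n * t)
    x = (4 - 3 * c) * x0 + (s / n) * vx0 + (2 / n) * (1 - c) * vy0
    y = (6 * (s - n * t) * x0 + y0
         - (2 / n) * (1 - c) * vx0 + (4 * s - 3 * n * t) / n * vy0)
    z = c * z0 + (s / n) * vz0
    vx = 3 * n * s * x0 + c * vx0 + 2 * s * vy0
    vy = 6 * n * (c - 1) * x0 - 2 * s * vx0 + (4 * c - 3) * vy0
    vz = -n * s * z0 + c * vz0
    return (x, y, z, vx, vy, vz)


n = mean_motion(650e3)     # altitude taken from the text
period = 2 * math.pi / n   # orbital period in seconds

a = 100.0                  # hypothetical radial offset, metres
# Choosing vy0 = -2 * n * x0 cancels the secular along-track drift,
# so the neighbour traces a closed 2:1 ellipse around the reference.
bounded = (a, 0.0, 0.0, 0.0, -2 * n * a, 0.0)
# Same offset with zero relative velocity: drifts along-track every orbit.
drifting = (a, 0.0, 0.0, 0.0, 0.0, 0.0)

after_one_orbit_bounded = hcw_propagate(bounded, n, period)
after_one_orbit_drifting = hcw_propagate(drifting, n, period)
print("period [min]:", period / 60)
print("bounded  (x, y) after one orbit:", after_one_orbit_bounded[:2])
print("drifting (x, y) after one orbit:", after_one_orbit_drifting[:2])
```

In this idealized model, the bounded case returns to its starting point after each orbit, which is the sense in which a well-chosen formation needs little propellant. A differentiable model, such as the JAX-based one the text mentions, would then refine these trajectories under the perturbations that the HCW equations leave out.
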

### Hardware Resilience in Space Environments

* The project specifically tests Google’s Trillium (v6e) Cloud TPUs to determine whether terrestrial AI accelerators can survive the radiation found in LEO.
* Hardware is subjected to 67 MeV proton beams to analyze the impact of Total Ionizing Dose (TID) and Single Event Effects (SEEs) on processing reliability.
* Preliminary testing indicates promising results for the radiation tolerance of high-performance accelerators, suggesting that standard TPU architectures may be viable for orbital deployment with minimal modification.

While still in the research and development phase, Project Suncatcher suggests that the future of massive AI scaling may involve shifting infrastructure away from terrestrial limits and toward modular, energy-rich orbital environments. Organizations should monitor the progress of free-space optical communication and radiation-hardened accelerators, because these technologies are likely to be the primary gatekeepers for space-based computation.
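
The link-budget argument for flying satellites close together can be illustrated with a rough calculation. The sketch below assumes received optical power falls off with the square of the link distance, which is a far-field approximation: it stops applying once the beam spot is no larger than the receiving aperture. The two distances and the helper name `received_power_gain_db` are hypothetical and are not taken from the project.

```python
import math


def received_power_gain_db(d_from_m, d_to_m):
    """Change in received power (dB) when a link shrinks from d_from_m to d_to_m.

    Assumes inverse-square falloff: P_received is proportional to 1 / d**2,
    so the power ratio is (d_from / d_to)**2, i.e. 20 * log10(d_from / d_to) dB.
    """
    return 20.0 * math.log10(d_from_m / d_to_m)


# Hypothetical comparison: a 1,000 km long-range link versus a 1 km
# intra-cluster link of the kind the text describes.
gain_db = received_power_gain_db(1_000e3, 1e3)
power_ratio = 10 ** (gain_db / 10)
print(f"received power gain: {gain_db:.1f} dB (factor of {power_ratio:.0e})")
```

Under this assumption, shortening a link by a factor of 1,000 raises received power by about 60 dB. That kind of margin is what allows many wavelength and spatial channels to be packed into a single link.
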