real-time-data-processing

2 posts

naver

Iceberg Low-Latency Queries with Materialized Views

This technical session from NAVER ENGINEERING DAY 2025 explores the architectural journey of building a low-latency query system for real-time transaction reports. The project focuses on resolving the tension between high data freshness, massive scalability, and rapid response times for complex, multi-dimensional filtering. By leveraging Apache Iceberg in conjunction with StarRocks’ materialized views, the team established a performant data pipeline that meets the demands of modern business intelligence.

### Challenges in Real-Time Transaction Reporting

* **Query Latency vs. Data Freshness:** Traditional architectures often struggle to provide immediate visibility into transaction data while maintaining sub-second query speeds across diverse filter conditions.
* **High-Dimensional Filtering:** Users require the ability to query reports based on numerous variables, necessitating an engine that can handle complex aggregations without pre-defining every possible index.
* **Scalability Requirements:** The system must handle increasing transaction volumes without degrading performance or requiring significant manual intervention in the underlying storage layer.

### Optimized Architecture with Iceberg and StarRocks

* **Apache Iceberg Integration:** Iceberg serves as the open table format, providing a reliable foundation for managing large-scale data snapshots and ensuring consistency during concurrent reads and writes.
* **StarRocks for Query Acceleration:** The team selected StarRocks as the primary OLAP engine to take advantage of its high-speed vectorized execution and native support for Iceberg tables.
* **Spark-Based Processing:** Apache Spark is utilized for the initial data ingestion and transformation phases, preparing the transaction data for efficient storage and downstream consumption.
### Enhancing Performance via Materialized Views

* **Pre-computed Aggregations:** By implementing Materialized Views, the system pre-calculates intensive transaction summaries, significantly reducing the computational load during active user queries.
* **Automatic Query Rewrite:** The architecture utilizes StarRocks' ability to automatically route queries to the most efficient materialized view, ensuring that even ad-hoc reports benefit from pre-computed results.
* **Balanced Refresh Strategies:** The research focused on optimizing the refresh intervals of these views to maintain high "freshness" while minimizing the overhead on the cluster resources.

The adoption of a modern lakehouse architecture combining Apache Iceberg with a high-performance OLAP engine like StarRocks is a recommended strategy for organizations dealing with high-volume, real-time reporting. This approach effectively decouples storage and compute while providing the low-latency response times necessary for interactive data analysis.
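The idea behind pre-computed aggregations and query rewrite can be illustrated with a toy model. The sketch below is plain Python, not StarRocks code, and it does not reproduce StarRocks' actual rewrite rules: the rows, the column names, and the "smallest covering view" heuristic are all assumptions made for illustration. It pre-aggregates `SUM(amount)` for a few dimension sets, then answers a `GROUP BY` query from the smallest view whose dimensions cover the query, falling back to a raw scan when no view applies.

```python
from collections import defaultdict

# Hypothetical transaction rows: (merchant, region, day, amount).
COLUMNS = ("merchant", "region", "day")
TRANSACTIONS = [
    ("m1", "KR", "2025-01-01", 100),
    ("m1", "KR", "2025-01-02", 50),
    ("m2", "JP", "2025-01-01", 70),
    ("m2", "KR", "2025-01-01", 30),
]


def aggregate(rows, row_dims, group_by):
    """SUM(amount) grouped by `group_by`; each row is (*dim_values, amount)."""
    idx = [row_dims.index(d) for d in group_by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def build_views(view_dims_list):
    """Pre-compute one aggregate per dimension set (the toy 'materialized views')."""
    return {dims: aggregate(TRANSACTIONS, COLUMNS, dims) for dims in view_dims_list}


def pick_view(views, group_by):
    """Return the dims of the smallest view that covers the query, or None."""
    covering = [dims for dims in views if set(group_by) <= set(dims)]
    return min(covering, key=lambda dims: len(views[dims]), default=None)


def query_sum(views, group_by):
    """Answer a GROUP BY query from a view when possible, else scan raw rows."""
    dims = pick_view(views, group_by)
    if dims is None:
        return aggregate(TRANSACTIONS, COLUMNS, group_by)
    # SUM is additive, so pre-aggregated rows can be rolled up further.
    view_rows = [key + (total,) for key, total in views[dims].items()]
    return aggregate(view_rows, dims, group_by)


VIEWS = build_views([("region", "day"), ("region",)])
print(pick_view(VIEWS, ("day",)))       # view chosen for a per-day report
print(query_sum(VIEWS, ("day",)))       # served by rolling up that view
print(query_sum(VIEWS, ("merchant",)))  # no covering view, so raw rows are scanned
```

The roll-up step only works because `SUM` is additive; aggregates such as distinct counts or medians cannot be recombined this way. The toy model also leaves out refresh scheduling, which is where the freshness trade-off discussed above comes in.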

google

Android Earthquake Alerts: A global system for early warning

Google’s Android Earthquake Alerts system utilizes the onboard accelerometers of billions of smartphones to create a global, crowdsourced seismic network. By detecting the initial P-waves of an earthquake and rapidly processing aggregate data, the system provides critical early warnings to regions that often lack traditional, expensive seismic infrastructure. This technological shift has expanded earthquake early warning access from roughly 250 million people to over 2.5 billion worldwide.

### Leveraging On-Device Accelerometers

* Mobile accelerometers, typically used for screen orientation, function as mini-seismometers capable of detecting the initial, fast-moving P-waves of an earthquake.
* When a stationary phone detects these vibrations, it sends a signal along with a coarse location to a centralized detection server.
* The system aggregates these signals to confirm the event and estimate its magnitude before the slower, more destructive S-waves reach the population.

### Global Reach and Implementation

* Active in nearly 100 countries, the system has detected over 18,000 earthquakes ranging from M1.9 to M7.8.
* The system has issued alerts for over 2,000 significant earthquakes, resulting in approximately 790 million alerts sent to users globally.
* By utilizing existing consumer hardware, the system serves as a "global safety net" for earthquake-prone regions that cannot afford traditional ground-based sensor networks.

### Magnitude Estimation and Accuracy

* A primary technical challenge is the trade-off between speed and accuracy: the first few seconds of data are limited, yet they are what an early warning has to be based on.
* Over three years, the system's median absolute error for initial magnitude estimates has been reduced from 0.50 to 0.25.
* The accuracy of these smartphone-based detections is now comparable to, and in some cases exceeds, the performance of established traditional seismic networks.
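For readers unfamiliar with the metric, median absolute error can be computed as in the following minimal sketch. The event pairs are invented for illustration and are not data from the Android Earthquake Alerts system; the function name is likewise an assumption.

```python
from statistics import median


def median_absolute_error(pairs):
    """Median of |initial estimate - final magnitude| over (initial, final) pairs."""
    return median(abs(initial - final) for initial, final in pairs)


# Hypothetical (initial estimate, final catalog magnitude) pairs, not real data.
EVENTS = [(5.1, 5.6), (4.8, 4.5), (6.2, 6.0), (5.5, 5.5), (4.0, 4.9)]

print(round(median_absolute_error(EVENTS), 2))
```

Because it uses the median rather than the mean, the metric is not dominated by a small number of badly misjudged events, such as the last pair in the sample.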
### User Alerts and Real-World Impact

* The system delivers two tiers of notifications: "BeAware" alerts for light shaking and "TakeAction" alerts, which use full-screen takeovers and loud sounds for intense shaking.
* During a magnitude 6.7 earthquake in the Philippines, the system issued alerts 18.3 seconds after the quake began, providing users further from the epicenter up to 60 seconds of lead time.
* To maintain privacy, the system relies on coarse location data and requires users to have Wi-Fi or cellular connectivity and location settings enabled.

For users in seismic zones, ensuring that Android Earthquake Alerts and location services are enabled provides a vital layer of protection. This crowdsourced model demonstrates how ubiquitous consumer technology can be repurposed to provide essential public safety infrastructure on a planetary scale.
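The relationship between alert delay, distance, and lead time can be approximated with simple arithmetic. The sketch below is a back-of-the-envelope model and not the system's method: it assumes a constant S-wave speed of 3.5 km/s (a typical crustal value, chosen here as an assumption), and ignores earthquake depth and alert delivery latency. Only the 18.3-second alert delay and the 60-second lead time come from the summary above.

```python
def lead_time_s(distance_km, alert_delay_s, s_wave_speed_km_s=3.5):
    """Seconds between the alert being issued and S-wave arrival at this distance.

    A negative value means the strong shaking arrives before the alert.
    The default S-wave speed is an assumed typical value, not a measured one.
    """
    return distance_km / s_wave_speed_km_s - alert_delay_s


def distance_for_lead_time_km(lead_s, alert_delay_s, s_wave_speed_km_s=3.5):
    """Distance from the epicenter at which a user gets `lead_s` seconds of warning."""
    return (lead_s + alert_delay_s) * s_wave_speed_km_s


ALERT_DELAY_S = 18.3  # alert delay reported for the Philippines event

for distance_km in (50, 100, 200, 300):
    print(distance_km, round(lead_time_s(distance_km, ALERT_DELAY_S), 1))

print(round(distance_for_lead_time_km(60, ALERT_DELAY_S), 1))
```

Under these assumptions, users close to the epicenter receive little or no warning, while a 60-second lead time corresponds to users a few hundred kilometers away. This is consistent with the summary's note that the longest lead times went to users further from the epicenter.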