Techlist.io - Korean Tech Blog Curator

coupang Nov 14, 2024

Coupang Rocket Delivery’s spatial index-based delivery management system | by Coupang Engineering | Coupang Engineering Blog | Medium

Coupang’s Rocket Delivery system recently transitioned from a text-based postal code infrastructure to a sophisticated spatial index-based management system to handle increasing delivery density. By adopting Uber’s H3 hexagonal grid system, the engineering team enabled the visualization and precise segmentation of delivery areas that were previously too large for a single driver to manage. This move has transformed the delivery process into an intuitive, map-centric operation that allows for data-driven optimization and real-time area modifications.

### Limitations of Text-Based Postal Codes

* While postal codes provided a government-standardized starting point, they became inefficient as delivery volumes grew from double to triple digits per code.
* The lack of spatial data meant that segmenting a single postal code into smaller units, such as individual apartment complexes or buildings, required manual input from local experts familiar with the terrain.
* Relying on text strings prevented the system from providing intuitive visual feedback or automated metrics for optimizing delivery routes.

### Adopting H3 for Geospatial Indexing

* The team evaluated different spatial indexing systems, specifically comparing Google’s S2 (square-based) and Uber’s H3 (hexagon-based) frameworks.
* H3 was chosen because hexagons provide a constant distance between the center of a cell and all six of its neighbors, which simplifies the modeling of movement and coverage.
* The hexagonal structure minimizes "edge effect" distortions compared to squares or triangles, making it more accurate for calculating delivery radius and area density.

### Technical Redesign and Implementation

* The system utilizes H3’s hierarchical indexing, allowing the platform to store delivery data at various resolutions to balance granularity with computational performance.
* Delivery zones were converted from standard polygons into "hexagonized" groups, enabling the system to treat complex geographical shapes as sets of standardized cell IDs.
* This transition allowed for the creation of a visual interface where camp leaders can modify delivery boundaries directly on a map, with changes reflected instantly across the logistics chain.

By shifting to a spatial index, Coupang has decoupled its logistics logic from rigid administrative boundaries like postal codes. This technical foundation allows for more agile resource distribution and provides the scalability needed to handle the continued growth of high-density urban deliveries.
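
The equal-distance property cited as the reason for choosing hexagons can be checked on an idealized flat grid. The sketch below is illustrative only: it uses plain axial coordinates rather than the H3 library, the function names are made up for this example, and real H3 cells on the globe are only approximately uniform.

```python
from __future__ import annotations

import math

# Axial-coordinate offsets of the six neighbors of a hexagonal cell.
HEX_NEIGHBORS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

# Offsets of the eight neighbors of a square cell (four edges, four corners).
SQUARE_NEIGHBORS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def hex_center(q: int, r: int, size: float = 1.0) -> tuple[float, float]:
    """Planar center of a pointy-top hexagon at axial coordinates (q, r)."""
    x = size * math.sqrt(3) * (q + r / 2)
    y = size * 1.5 * r
    return x, y


def neighbor_distances_hex(size: float = 1.0) -> list[float]:
    """Center-to-center distances from a hexagon to each of its six neighbors."""
    ox, oy = hex_center(0, 0, size)
    distances = []
    for q, r in HEX_NEIGHBORS:
        x, y = hex_center(q, r, size)
        distances.append(math.hypot(x - ox, y - oy))
    return distances


def neighbor_distances_square(side: float = 1.0) -> list[float]:
    """Center-to-center distances from a square to each of its eight neighbors."""
    return [math.hypot(dx * side, dy * side) for dx, dy in SQUARE_NEIGHBORS]


hex_d = neighbor_distances_hex()
sq_d = neighbor_distances_square()
print("distinct hexagon neighbor distances:", sorted({round(d, 6) for d in hex_d}))
print("distinct square neighbor distances:", sorted({round(d, 6) for d in sq_d}))
```

The hexagon yields a single neighbor distance, while the square yields two (edge neighbors and diagonal neighbors), which is why "cells within k steps" approximates a radius more evenly on a hexagonal grid.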

data-visualization h3 spatial-indexing geospatial-data+2

coupang Nov 14, 2024

Meet Coupang’s Machine Learning Platform | by Coupang Engineering | Coupang Engineering Blog | Medium

Coupang’s internal Machine Learning Platform (MLP) is a comprehensive "batteries-included" ecosystem designed to streamline the end-to-end lifecycle of ML development across its diverse business units, including e-commerce, logistics, and streaming. By providing standardized tools for feature engineering, pipeline authoring, and model serving, the platform significantly reduces the time-to-production while enabling scalable, efficient compute management. Ultimately, this infrastructure allows Coupang to leverage advanced models like Ko-BERT for search and real-time forecasting to enhance the customer experience at scale.

**Motivation for a Centralized Platform**

* **Reduced Time to Production:** The platform aims to accelerate the transition from ad-hoc exploration to production-ready services by eliminating repetitive infrastructure setup.
* **CI/CD Integration:** By incorporating continuous integration and delivery into ML workflows, the platform ensures that experiments are reproducible and deployments are reliable.
* **Compute Efficiency:** Managed clusters allow for the optimization of expensive hardware resources, such as GPUs, across multiple teams and diverse workloads like NLP and Computer Vision.

**Notebooks and Pipeline Authoring**

* **Managed Jupyter Notebooks:** Provides data scientists with a standardized environment for initial data exploration and prototyping.
* **Pipeline SDK:** Developers can use a dedicated SDK to define complex ML workflows as code, facilitating the transition from research to automated pipelines.
* **Framework Agnostic:** The platform supports a wide range of ML frameworks and programming languages to accommodate different model architectures.

**Feature Engineering and Data Management**

* **Centralized Feature Store:** Enables teams to share and reuse features, reducing redundant data processing and ensuring consistency across the organization.
* **Consistent Data Pipelines:** Bridges the gap between offline training and online real-time inference by providing a unified interface for data transformations.
* **Large-scale Preparation:** Streamlines the creation of training datasets from Coupang’s massive logs, including product catalogs and user behavior data.

**Training and Inference Services**

* **Scalable Model Training:** Handles distributed training jobs and resource orchestration, allowing for the development of high-parameter models.
* **Robust Model Inference:** Supports low-latency model serving for real-time applications such as ad ranking, video recommendations in Coupang Play, and pricing.
* **Dedicated Infrastructure:** Training and inference clusters abstract the underlying hardware complexity, allowing engineers to focus on model logic rather than server maintenance.

**Monitoring and Observability**

* **Performance Tracking:** Integrated tools monitor model health and performance metrics in live production environments.
* **Drift Detection:** Provides visibility into data and model drift, ensuring that models remain accurate as consumer behavior and market conditions change.

For organizations looking to scale their AI capabilities, investing in an integrated platform that bridges the gap between experimentation and production is essential. By standardizing the "plumbing" of machine learning, such as feature stores and automated pipelines, companies can drastically increase the velocity of their data science teams and ensure the long-term reliability of their production models.
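
The summary does not say how the platform measures drift. As one illustration of what a data drift check can look like, the sketch below computes the Population Stability Index (PSI), a widely used drift metric, between a training-time sample and a live sample of one numeric feature. The function name `psi`, the bin count, and the smoothing constant are assumptions made for this example, not part of Coupang's platform.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline and a live sample.

    Bins are equal-width over the baseline's range; live values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for a constant feature

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp to the edge bins
            counts[idx] += 1
        total = len(values)
        # Floor each proportion at eps so the logarithm below is defined.
        return [max(c / total, eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: a uniform baseline and a live sample shifted upward.
baseline = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

print("PSI, identical samples:", psi(baseline, baseline))
print("PSI, shifted sample:", round(psi(baseline, shifted), 3))
```

A PSI of zero means the binned distributions match. A commonly cited rule of thumb treats values above roughly 0.25 as a significant shift, though any alerting threshold should be tuned per feature.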

ai machine-learning nlp mlops+5

coupang Apr 17, 2023

Rocket Delivery: New Spatial Indexing-

Coupang transitioned its Rocket Delivery management from a text-based zip code system to a spatial index-based system using Uber’s H3 library. This shift addresses the limitations of zip codes, which became too coarse for high-density delivery areas, by enabling precise, map-based visualization and manipulation of delivery zones. By adopting a hexagonal grid-based approach, Coupang has improved operational flexibility and its ability to handle complex urban delivery environments.

### The Limitations of Zip Code Systems

* Zip codes originally served as the base unit for Rocket Delivery, but as delivery volumes scaled, individual codes became too large for a single driver to manage.
* Sub-dividing these areas (e.g., splitting a zip code into specific apartment complexes or even individual buildings) required the manual expertise of senior managers because text-based addresses lack inherent spatial intelligence.
* The previous reliance on text made it difficult to visualize delivery boundaries or reassign areas quickly in response to changes in order volume.

### Implementing H3 for Geospatial Indexing

* To modernize the system, Coupang adopted H3, a hexagonal hierarchical geospatial indexing system that converts geographic coordinates into unique cell identifiers.
* Hexagons were selected over square grids because they provide uniform distances between the center of a cell and all its neighbors, which minimizes distortion in distance-based calculations.
* The system uses H3’s hierarchical structure to manage different levels of detail, allowing the platform to aggregate small hexagonal units into larger, custom-defined delivery polygons.

### Technical Challenges in System Redesign

* A primary engineering hurdle was selecting the optimal grid resolution to ensure cells were small enough to capture individual building footprints without creating excessive data overhead.
* The team developed algorithms to transform groups of hexagonal indices into filled polygons, enabling camp managers to "draw" and modify delivery zones directly on a digital map.
* By basing the system on spatial coordinates rather than administrative text, the platform can dynamically adjust to urban changes, such as the construction of new high-rises or the demolition of old structures.

Transitioning from text-based addressing to hexagonal indexing allows logistics platforms to move beyond the constraints of administrative boundaries. For high-density urban delivery services, adopting a spatial-first infrastructure like H3 is a necessary step to ensure scalability and operational precision.
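
Once a delivery zone is stored as a set of cell identifiers, redrawing a boundary becomes a set operation. The sketch below is a minimal illustration under stated assumptions: cells are hypothetical axial `(q, r)` tuples on a flat grid rather than real H3 indices, and `reassign` and `boundary_cells` are made-up helper names, not Coupang's code or the H3 API.

```python
from typing import Dict, Iterable, Set, Tuple

Cell = Tuple[int, int]  # hypothetical stand-in for an H3 cell ID

# Axial-coordinate offsets of the six neighbors of a hexagonal cell.
NEIGHBOR_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def neighbors(cell: Cell) -> Set[Cell]:
    """The six cells adjacent to the given cell."""
    q, r = cell
    return {(q + dq, r + dr) for dq, dr in NEIGHBOR_OFFSETS}


def reassign(zones: Dict[str, Set[Cell]], cells: Iterable[Cell],
             src: str, dst: str) -> None:
    """Move cells from zone src to zone dst, keeping the zones disjoint."""
    moving = set(cells)
    missing = moving - zones[src]
    if missing:
        raise ValueError(f"cells not in zone {src!r}: {sorted(missing)}")
    zones[src] -= moving
    zones[dst] |= moving


def boundary_cells(zone: Set[Cell]) -> Set[Cell]:
    """Cells with at least one neighbor outside the zone (its outline)."""
    return {c for c in zone if not neighbors(c) <= zone}


zones = {
    "zone_a": {(0, 0)} | neighbors((0, 0)),  # a center cell plus its ring of six
    "zone_b": {(2, 0), (2, -1), (3, -1)},
}

# A manager drags two cells on the eastern edge of zone_a into zone_b.
reassign(zones, [(1, 0), (1, -1)], src="zone_a", dst="zone_b")

for name, cells in zones.items():
    print(name, "cells:", len(cells), "outline cells:", len(boundary_cells(cells)))
```

The outline cells are the ones a map layer would need in order to trace the polygon for each zone; interior cells do not affect the drawn boundary.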

data-visualization h3 spatial-indexing geospatial-data+2