Amazon OpenSearch Service improves vector database performance and cost with GPU acceleration and auto-optimization | AWS News Blog
Amazon OpenSearch Service has introduced serverless GPU acceleration and auto-optimization features designed to enhance the performance and cost-efficiency of large-scale vector databases. These updates allow users to build vector indexes up to ten times faster at a quarter of the traditional indexing cost, enabling the creation of billion-scale databases in under an hour. By automating complex tuning processes, OpenSearch Service simplifies the deployment of generative AI and high-speed search applications.
GPU Acceleration for Rapid Indexing
The new serverless GPU acceleration streamlines the creation of vector data structures by offloading intensive workloads to specialized hardware.
- Performance Gains: Indexing speed is increased by 10x compared to non-GPU configurations, significantly reducing the time-to-market for data-heavy applications.
- Cost Efficiency: Indexing costs are reduced to approximately 25% of standard costs, and users only pay for active processing through OpenSearch Compute Units (OCU) rather than idle instance time.
- Serverless Management: There is no need to provision or manage GPU instances manually; OpenSearch Service automatically detects acceleration opportunities and isolates workloads within the user's Amazon VPC.
- Operational Scope: Acceleration is automatically applied to both initial indexing and subsequent force-merge operations.
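For reference, a force merge is issued with the standard `_forcemerge` API; per the announcement it is accelerated automatically, with no extra parameters involved. The endpoint, index name, and basic-auth credentials below are placeholder assumptions (SigV4-signed requests work equally well).

```bash
# Merge a vector index down to fewer segments after heavy ingestion; per the
# announcement, this force-merge operation is GPU-accelerated automatically.
# ENDPOINT, index name, and basic-auth credentials are placeholder assumptions.
curl -s -u "admin:$OPENSEARCH_PASSWORD" -X POST \
  "$ENDPOINT/my-vectors/_forcemerge?max_num_segments=1"
```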
Automated Vector Index Optimization
Auto-optimization removes the requirement for deep vector expertise by automatically balancing competing performance metrics.
- Simplified Tuning: The system replaces manual index tuning—which can traditionally take weeks—with automated configurations.
- Resource Balancing: The optimizer finds the best trade-off between search latency, search quality (recall), and memory footprint.
- Improved Accuracy: Users can achieve higher recall rates and better cost savings compared to using default, unoptimized index configurations.
Configuration and Integration
These features can be integrated into new or existing OpenSearch Service domains and Serverless collections through the AWS Console or CLI.
- CLI Activation: Users can enable acceleration on existing domains using the `update-domain-config` command with the `--aiml-options` flag set to enableServerlessVectorAcceleration (a combined sketch of these steps follows this list).
- Index Settings: To leverage GPU processing, users must create a vector index with specific settings, notably setting `index.knn.remote_index_build.enabled` to `true`.
- Supported Workloads: The service supports standard OpenSearch operations, including the Bulk API for adding vector data and text embeddings.
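A minimal end-to-end sketch of these steps using the AWS CLI and the OpenSearch REST API. The `update-domain-config` command, the `--aiml-options` flag, and the `index.knn.remote_index_build.enabled` setting come from the announcement, but the exact JSON accepted by `--aiml-options`, along with the domain name, endpoint, credentials, index layout, and the faiss/HNSW method choice, are illustrative assumptions; confirm field names against the current API reference before running.

```bash
#!/usr/bin/env bash
DOMAIN="my-vector-domain"                                               # placeholder domain name
ENDPOINT="https://search-my-vector-domain.us-east-1.es.amazonaws.com"  # placeholder endpoint

# 1) Enable serverless GPU acceleration on an existing domain. The JSON shape
#    below is an assumption based on the enableServerlessVectorAcceleration
#    flag named above; verify it against the UpdateDomainConfig reference.
aws opensearch update-domain-config \
  --domain-name "$DOMAIN" \
  --aiml-options '{"enableServerlessVectorAcceleration": true}'

# 2) Create a vector index that opts in to remote (GPU-accelerated) builds.
#    Assumes fine-grained access control with basic auth.
curl -s -u "admin:$OPENSEARCH_PASSWORD" -X PUT "$ENDPOINT/my-vectors" \
  -H 'Content-Type: application/json' -d '{
  "settings": {
    "index.knn": true,
    "index.knn.remote_index_build.enabled": true
  },
  "mappings": {
    "properties": {
      "text": { "type": "text" },
      "embedding": {
        "type": "knn_vector",
        "dimension": 3,
        "method": { "name": "hnsw", "engine": "faiss", "space_type": "l2" }
      }
    }
  }
}'

# 3) Add documents and their embeddings with the standard Bulk API.
#    Dimension 3 keeps the example readable; real embedding models
#    typically produce hundreds or thousands of dimensions.
curl -s -u "admin:$OPENSEARCH_PASSWORD" -X POST "$ENDPOINT/_bulk" \
  -H 'Content-Type: application/x-ndjson' --data-binary @- <<'NDJSON'
{ "index": { "_index": "my-vectors", "_id": "1" } }
{ "text": "example document", "embedding": [0.1, 0.2, 0.3] }
NDJSON
```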
For organizations managing large-scale vector workloads for RAG (Retrieval-Augmented Generation) or semantic search, enabling GPU acceleration is a highly recommended step to reduce operational overhead. Developers should transition existing indexes to include the remote_index_build setting to take immediate advantage of the improved speed and reduced OCU pricing.
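Where an existing index predates the feature, one plausible migration path (an assumption on our part, not guidance from the post) is to create a new index with `index.knn.remote_index_build.enabled` set and copy documents across with the standard `_reindex` API; the endpoint, credentials, and index names below are illustrative.

```bash
# Create a replacement index with remote (GPU) index builds enabled, then
# copy documents across with the standard _reindex API.
# Endpoint, credentials, and index names are placeholder assumptions.
curl -s -u "admin:$OPENSEARCH_PASSWORD" -X PUT "$ENDPOINT/my-vectors-accelerated" \
  -H 'Content-Type: application/json' -d '{
  "settings": { "index.knn": true, "index.knn.remote_index_build.enabled": true },
  "mappings": {
    "properties": {
      "text": { "type": "text" },
      "embedding": { "type": "knn_vector", "dimension": 3 }
    }
  }
}'

curl -s -u "admin:$OPENSEARCH_PASSWORD" -X POST "$ENDPOINT/_reindex" \
  -H 'Content-Type: application/json' -d '{
  "source": { "index": "my-vectors" },
  "dest":   { "index": "my-vectors-accelerated" }
}'
```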