amazon-bedrock

5 posts

aws

AWS Weekly Roundup: Kiro CLI latest features, AWS European Sovereign Cloud, EC2 X8i instances, and more (January 19, 2026) | AWS News Blog

The January 19, 2026, AWS Weekly Roundup highlights significant advancements in sovereign cloud infrastructure and the general availability of high-performance, memory-optimized compute instances. The update also emphasizes the maturing ecosystem of AI agents, focusing on enhanced developer tooling and streamlined deployment workflows for agentic applications. These releases collectively aim to satisfy stringent regulatory requirements in Europe while pushing the boundaries of enterprise performance and automated productivity.

## Developer Tooling and Kiro CLI Enhancements

* New granular controls for web fetch URLs allow developers to use allowlists and blocklists to strictly govern which external resources an agent can access (a generic sketch of this pattern follows the summary).
* The update introduces custom keyboard shortcuts to facilitate seamless switching between multiple specialized agents within a single session.
* Enhanced diff views provide clearer visibility into changes, improving the debugging and auditing process for automated workflows.

## AWS European Sovereign Cloud General Availability

* Following its initial 2023 announcement, this independent cloud infrastructure is now generally available to all customers.
* The environment is purpose-built to meet the most rigorous sovereignty and data residency requirements for European organizations.
* It offers a comprehensive set of AWS services within a framework that ensures operational independence and localized data handling.

## High-Performance Computing with EC2 X8i Instances

* The memory-optimized X8i instances, powered by custom Intel Xeon 6 processors, have moved from preview to general availability.
* These instances feature a sustained all-core turbo frequency of 3.9 GHz, which is currently exclusive to the AWS platform.
* The hardware is SAP certified and engineered to deliver higher memory bandwidth and performance for memory-intensive enterprise workloads than other Intel-based cloud offerings.

## Agentic AI and Productivity Updates

* Amazon Quick Suite continues to expand as a workplace "agentic teammate," designed to synthesize research and execute actions based on organizational insights.
* New technical guidance has been released on deploying AI agents on Amazon Bedrock AgentCore.
* GitHub Actions integration is now supported to automate the deployment and lifecycle management of these AI agents, bridging the gap between traditional DevOps and agentic AI development.

These updates signal a strategic shift toward highly specialized infrastructure, both in regulatory compliance with the Sovereign Cloud and in raw performance with the X8i instances. Organizations looking to scale their AI operations should prioritize the new deployment patterns for Bedrock AgentCore to ensure a robust CI/CD pipeline for their autonomous agents.
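The roundup does not show Kiro CLI's actual configuration format, so here is a minimal Python sketch of the generic allowlist/blocklist pattern it describes. The `ALLOWLIST` and `BLOCKLIST` names and the matching rules are illustrative assumptions, not Kiro's real behavior.

```python
from urllib.parse import urlparse

# Hypothetical policy lists -- illustrative only, not Kiro CLI's actual config.
ALLOWLIST = {"docs.aws.amazon.com", "aws.amazon.com"}
BLOCKLIST = {"internal.example.com"}

def is_fetch_allowed(url: str) -> bool:
    """Return True if an agent may fetch this URL.

    Blocklist entries are checked first so they override the allowlist;
    if an allowlist is defined, only listed hosts are permitted.
    """
    host = urlparse(url).hostname or ""
    if host in BLOCKLIST:
        return False
    if ALLOWLIST:
        return host in ALLOWLIST
    return True  # no allowlist configured: default-allow

# Example: gate every web fetch an agent attempts through the policy check.
for candidate in ("https://docs.aws.amazon.com/bedrock/",
                  "https://internal.example.com/secrets"):
    print(candidate, "->", "allow" if is_fetch_allowed(candidate) else "block")
```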

aws

Amazon Bedrock adds reinforcement fine-tuning, simplifying how developers build smarter, more accurate AI models | AWS News Blog

Amazon Bedrock has introduced reinforcement fine-tuning, a new model customization capability that allows developers to build more accurate and cost-effective AI models using feedback-driven training. By replacing the requirement for massive labeled datasets with reward signals, the platform enables average accuracy gains of 66% while automating the complex infrastructure typically associated with advanced machine learning. This approach allows organizations to optimize smaller, faster models for specific business needs without sacrificing performance or incurring the high costs of larger model variants.

**Challenges of Traditional Model Customization**

* Traditional fine-tuning often requires massive, high-quality labeled datasets and expensive human annotation, which can be a significant barrier for many organizations.
* Developers previously had to choose between settling for generic "out-of-the-box" results and managing the high costs and complexity of large-scale infrastructure.
* The high barrier to entry for advanced reinforcement learning techniques often required specialized ML expertise that many development teams lack.

**Mechanics of Reinforcement Fine-Tuning**

* The system uses an iterative feedback loop where models improve based on reward signals that judge the quality of responses against specific business requirements.
* Reinforcement Learning with Verifiable Rewards (RLVR) utilizes rule-based graders to provide objective feedback for tasks such as mathematics or code generation (a minimal sketch of such a grader follows this summary).
* Reinforcement Learning from AI Feedback (RLAIF) uses AI-driven evaluations to help models understand preference and quality without manual human intervention.
* The workflow can be powered by existing API logs within Amazon Bedrock or by uploading training datasets, eliminating the need for complex infrastructure setup.

**Performance and Security Advantages**

* The technique achieves an average accuracy improvement of 66% over base models, enabling smaller models to perform at the level of much larger alternatives.
* Current support includes the Amazon Nova 2 Lite model, which helps developers optimize for both speed and price-to-performance.
* All training data and customization processes remain within the secure AWS environment, ensuring that proprietary data is protected and compliant with organizational security standards.

Developers should consider reinforcement fine-tuning as a primary strategy for optimizing smaller models like Amazon Nova 2 Lite to achieve high-tier performance at a lower cost. This capability is particularly recommended for specialized tasks like reasoning and coding where objective reward functions can be used to rapidly iterate and improve model accuracy.
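The post doesn't include grader code, but the RLVR idea is straightforward to sketch. Below is a minimal, hypothetical rule-based grader for a math-style task: it awards a verifiable reward of 1.0 when the model's final answer matches the known-correct answer. The `grade_math_response` function and its scoring scheme are illustrative assumptions, not Bedrock's actual grader interface.

```python
import re

def grade_math_response(response: str, expected_answer: str) -> float:
    """Hypothetical RLVR-style grader: returns a reward in [0.0, 1.0].

    Rule-based and fully verifiable: no human or AI judge is needed,
    only a deterministic check against the known-correct answer.
    """
    # Extract the last number in the response as the model's final answer.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    if not numbers:
        return 0.0          # no parseable answer at all
    if numbers[-1] == expected_answer:
        return 1.0          # exact match: full reward
    return 0.1              # well-formed but wrong: small shaping reward

# During reinforcement fine-tuning, each sampled response would be scored
# like this, and the scores used as the training reward signal.
print(grade_math_response("12 + 30 = 42, so the answer is 42", "42"))  # 1.0
print(grade_math_response("I think it's 41", "42"))                    # 0.1
```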

aws

Amazon Bedrock AgentCore adds quality evaluations and policy controls for deploying trusted AI agents | AWS News Blog

AWS has introduced several new capabilities to Amazon Bedrock AgentCore designed to remove the trust and quality barriers that often prevent AI agents from moving into production environments. These updates, which include granular policy controls and sophisticated evaluation tools, allow developers to implement strict operational boundaries and monitor real-world performance at scale. By balancing agent autonomy with centralized verification, AgentCore provides a secure framework for deploying highly capable agents across enterprise workflows.

**Governance through Policy in AgentCore**

* This feature establishes clear boundaries for agent actions by intercepting tool calls via the AgentCore Gateway before they are executed (a generic sketch of this interception pattern follows the summary).
* By operating outside of the agent's internal reasoning loop, the policy layer acts as an independent verification system that treats the agent as an autonomous actor requiring permission.
* Developers can define fine-grained permissions to ensure agents do not access sensitive data inappropriately or take unauthorized actions within external systems.

**Quality Monitoring with AgentCore Evaluations**

* The new evaluation framework allows teams to monitor the quality of AI agents based on actual behavior rather than theoretical simulations.
* Built-in evaluators provide standardized metrics for critical dimensions such as helpfulness and correctness.
* Organizations can also implement custom evaluators to ensure agents meet specific business-logic requirements and industry-specific compliance standards.

**Enhanced Memory and Communication Features**

* New episodic functionality in AgentCore Memory introduces a long-term strategy that allows agents to learn from past experiences and apply successful solutions to similar future tasks.
* Bidirectional streaming in the AgentCore Runtime supports the deployment of advanced voice agents capable of handling natural, simultaneous conversation flows.
* These enhancements focus on improving consistency and user experience, enabling agents to handle complex, multi-turn interactions with higher reliability.

**Real-World Application and Performance**

* The AgentCore SDK has seen rapid adoption with over 2 million downloads, supporting diverse use cases from content generation at the PGA TOUR to financial data analysis at Workday.
* Case studies highlight significant operational gains, such as a 1,000 percent increase in content writing speed and a 50 percent reduction in problem resolution time through improved observability.
* The platform emphasizes 100 percent traceability of agent decisions, which is critical for organizations transitioning from reactive to proactive AI-driven operations.

To successfully scale AI agents, organizations should transition from simple prompt engineering to a robust agentic architecture. Leveraging these new policy and evaluation tools will allow development teams to maintain the necessary control and visibility required for customer-facing and mission-critical deployments.
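AgentCore's actual policy schema isn't shown in the announcement, so here is a generic Python sketch of the interception pattern it describes: a gateway that verifies every tool call outside the agent's reasoning loop before executing it. The `POLICY` structure, `enforce_policy`, and `gateway_execute` names are hypothetical illustrations, not AgentCore's API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class ToolCall:
    tool: str
    args: dict[str, Any] = field(default_factory=dict)

# Hypothetical fine-grained policy: which tools an agent may invoke, plus an
# argument-level constraint. Illustrative only -- not AgentCore's schema.
POLICY = {
    "allowed_tools": {"search_catalog", "create_ticket"},
    "max_refund_usd": 100,
}

def enforce_policy(call: ToolCall) -> None:
    """Independent check that runs OUTSIDE the agent's reasoning loop,
    before the tool call reaches the downstream system."""
    if call.tool not in POLICY["allowed_tools"]:
        raise PermissionError(f"tool '{call.tool}' is not permitted")
    if call.tool == "create_ticket" and call.args.get("refund_usd", 0) > POLICY["max_refund_usd"]:
        raise PermissionError("refund exceeds the policy limit")

def gateway_execute(call: ToolCall, tools: dict[str, Callable[..., Any]]) -> Any:
    """Gateway-style dispatcher: every call is verified, then executed."""
    enforce_policy(call)                    # deny-by-default verification
    return tools[call.tool](**call.args)    # only reached if permitted

# Usage: the agent proposes calls; the gateway decides.
tools = {"search_catalog": lambda query: f"results for {query!r}"}
print(gateway_execute(ToolCall("search_catalog", {"query": "ssd"}), tools))
# gateway_execute(ToolCall("delete_db"), tools)  # -> PermissionError
```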

aws

Amazon S3 Vectors now generally available with increased scale and performance | AWS News Blog

Amazon S3 Vectors has reached general availability, establishing the first cloud object storage service with native support for storing and querying vector data. This serverless solution allows organizations to reduce total cost of ownership by up to 90% compared to specialized vector database solutions while providing the performance required for production-grade AI applications. By integrating vector capabilities directly into S3, AWS enables a simplified architecture for retrieval-augmented generation (RAG), semantic search, and multi-agent workflows.

### Massive Scale and Index Consolidation

The move to general availability introduces a significant increase in data capacity, allowing users to manage massive datasets without complex infrastructure workarounds.

* **Increased Index Limits:** Each index can now store and search across up to 2 billion vectors, a 40x increase from the 50 million limit during the preview phase.
* **Bucket Capacity:** A single vector bucket can now scale to house up to 20 trillion vectors.
* **Simplified Architecture:** The increased scale per index removes the need for developers to shard data across multiple indexes or implement custom query federation logic.

### Performance and Latency Optimizations

The service has been tuned to meet the low-latency requirements of interactive applications like conversational AI and real-time inference (a hedged query sketch follows this summary).

* **Query Response Times:** Frequent queries now achieve latencies of approximately 100 ms or less, while infrequent queries consistently return results in under one second.
* **Enhanced Retrieval:** Users can now retrieve up to 100 search results per query (up from 30), providing broader context for RAG applications.
* **Write Throughput:** The system supports up to 1,000 PUT transactions per second for streaming single-vector updates, ensuring new data is immediately searchable.

### Serverless Efficiency and Ecosystem Integration

S3 Vectors functions as a fully serverless offering, eliminating the need to provision or manage underlying instances; you pay only for active storage and queries.

* **Amazon Bedrock Integration:** It is now generally available as a vector storage engine for Bedrock Knowledge Bases, facilitating the building of RAG applications.
* **OpenSearch Support:** Integration with Amazon OpenSearch allows users to use S3 Vectors for storage while leveraging OpenSearch for advanced analytics and search features.
* **Expanded Footprint:** The service is now available in 14 AWS Regions, up from five during the preview period.

With its massive scale and 90% cost reduction, S3 Vectors is a primary candidate for organizations looking to move AI prototypes into production. Developers should consider migrating high-volume vector workloads to S3 Vectors to benefit from the serverless operational model and the native integration with the broader AWS AI stack.
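As a rough illustration of the insert-and-query workflow, here is a minimal Python sketch using boto3. It assumes the `s3vectors` client and its `put_vectors`/`query_vectors` operations behave as in the preview documentation; the bucket name, index name, and tiny 4-dimensional vectors are hypothetical placeholders (real embeddings would come from an embedding model).

```python
import boto3

# Assumption: the boto3 "s3vectors" client exposes put_vectors/query_vectors
# as in the preview docs; names and dimensions below are placeholders.
s3v = boto3.client("s3vectors", region_name="us-east-1")

BUCKET, INDEX = "my-vector-bucket", "docs-index"  # hypothetical resources

# Insert vectors; each carries a key, float32 data, and optional metadata.
s3v.put_vectors(
    vectorBucketName=BUCKET,
    indexName=INDEX,
    vectors=[
        {"key": "doc-1", "data": {"float32": [0.1, 0.2, 0.3, 0.4]},
         "metadata": {"source": "faq"}},
        {"key": "doc-2", "data": {"float32": [0.4, 0.3, 0.2, 0.1]},
         "metadata": {"source": "manual"}},
    ],
)

# Similarity search; GA raises the per-query result cap to 100 (topK).
resp = s3v.query_vectors(
    vectorBucketName=BUCKET,
    indexName=INDEX,
    queryVector={"float32": [0.1, 0.2, 0.3, 0.4]},
    topK=5,
    returnMetadata=True,
)
for match in resp["vectors"]:
    print(match["key"], match.get("metadata"))
```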

aws

Amazon Bedrock adds 18 fully managed open weight models, including the new Mistral Large 3 and Ministral 3 models | AWS News Blog

Amazon Bedrock has significantly expanded its generative AI offerings by adding 18 new fully managed open-weight models from providers including Google, Mistral AI, NVIDIA, and OpenAI. This update brings the platform's total to nearly 100 serverless models, allowing developers to leverage a broad spectrum of specialized capabilities through a single, unified API. By providing access to these high-performing models without requiring infrastructure changes, AWS enables organizations to rapidly evaluate and deploy the most cost-effective and capable tools for their specific workloads.

### Specialized Mistral AI Releases

The launch features four new models from Mistral AI, headlined by Mistral Large 3 and the edge-optimized Ministral series.

* **Mistral Large 3:** Optimized for long-context tasks, multimodal reasoning, and instruction reliability, making it suitable for complex coding assistance and multilingual enterprise knowledge work.
* **Ministral 3 (3B, 8B, and 14B):** These models are specifically designed for edge-optimized deployments on a single GPU.
* **Use Cases:** While the 3B model excels at real-time translation and data extraction on low-resource devices, the 14B version is built for advanced local agentic workflows where privacy and hardware constraints are primary concerns.

### Broadened Model Provider Portfolio

Beyond the Mistral updates, AWS has integrated several other open-weight options to address diverse industry requirements ranging from mobile applications to global scaling.

* **Google Gemma 3 4B:** An efficient multimodal model designed to run locally on laptops, supporting on-device AI and multilingual processing.
* **Global Provider Support:** The expansion includes models from MiniMax AI, Moonshot AI, NVIDIA, OpenAI, and Qwen, ensuring a competitive variety of reasoning and processing capabilities.
* **Multimodal Capabilities:** Many of the new additions support vision-based tasks, such as image captioning and document understanding, alongside traditional text-based functions.

### Streamlined AI Development and Integration

The primary technical advantage of this update is the ability to swap between diverse models using the Amazon Bedrock unified API (a hedged sketch follows this summary).

* **Infrastructure Consistency:** Developers can switch to newer, more efficient models without rewriting application code or managing underlying servers.
* **Evaluation and Deployment:** The serverless architecture allows for immediate testing of different model weights (such as moving from 3B to 14B) to find the optimal balance between performance and latency.
* **Enterprise Tooling:** These models integrate with existing Bedrock features, allowing for simplified agentic workflows and tool-use implementations.

To take full advantage of these updates, developers should utilize the Bedrock console to experiment with the new Mistral and Gemma models for edge and multimodal use cases. The unified API structure makes it practical to run A/B tests between these open-weight models and established industry favorites to optimize for specific cost and performance targets.
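The Bedrock Converse API is the kind of unified interface the summary refers to: one request/response shape across models, so swapping models is just a change of `modelId`. The post doesn't list the new models' identifiers, so the `modelId` strings below are hypothetical placeholders; check the Bedrock console for the real ones.

```python
import boto3

# The Converse API gives one request/response shape across Bedrock models,
# so switching models is just a change of modelId.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def ask(model_id: str, prompt: str) -> str:
    """Send one user turn to any Bedrock model via the unified Converse API."""
    resp = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]

# Hypothetical model IDs for illustration -- look up the actual identifiers
# for Mistral Large 3, Ministral 3, Gemma 3, etc. in the Bedrock console.
for model_id in ("mistral.mistral-large-3", "google.gemma-3-4b"):
    print(model_id, "->", ask(model_id, "Summarize RAG in one sentence."))
```

This is also the natural harness for the A/B testing the summary recommends: run the same prompt set through `ask` with two model IDs and compare quality, latency, and cost.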