AWS / serverless

5 posts


AWS Weekly Roundup: AWS Lambda for .NET 10, AWS Client VPN quickstart, Best of AWS re:Invent, and more (January 12, 2026)

The AWS Weekly Roundup for January 2026 highlights a significant push toward modernization, headlined by the introduction of .NET 10 support for AWS Lambda and Apache Airflow 2.11 for Amazon MWAA. To encourage exploration of these and other emerging technologies, AWS has revamped its Free Tier to offer new users up to $200 in credits and six months of risk-free experimentation. These updates collectively aim to streamline serverless development, enhance container storage efficiency, and provide more robust authentication options for messaging services.

### Modernized Runtimes and Orchestration

* AWS Lambda now supports .NET 10 as both a managed runtime and a container base image, with AWS providing automatic updates to these environments as they become available.
* Amazon Managed Workflows for Apache Airflow (MWAA) has added support for version 2.11, which serves as a critical stepping stone for users preparing to migrate to Apache Airflow 3.

### Infrastructure and Resource Management

* Amazon ECS has extended support for `tmpfs` mounts to Linux tasks running on AWS Fargate and Managed Instances; this allows developers to use memory-backed file systems for containerized workloads and avoid writing sensitive or temporary data to task storage (see the sketch after this summary).
* AWS Config has expanded its monitoring capabilities to discover, assess, and audit new resource types across Amazon EC2, Amazon SageMaker, and Amazon S3 Tables.
* A new AWS Client VPN quickstart was released, providing a CloudFormation template and a step-by-step guide to automate the deployment of secure client-to-site VPN connections.

### Security and Messaging Enhancements

* Amazon MQ for RabbitMQ brokers now supports HTTP-based authentication, which can be enabled and managed through the broker's configuration file.
* RabbitMQ brokers on Amazon MQ also now support certificate-based authentication using mutual TLS (mTLS) to improve the security posture of messaging applications.

### Educational Initiatives and Community Events

* New AWS Free Tier accounts now include a six-month trial period featuring $200 in credits and access to over 30 always-free services, specifically targeting developers interested in AI/ML and compute experimentation.
* AWS published a curated "Best of re:Invent 2025" playlist, featuring high-impact sessions and keynotes for those who missed the live event.
* The 2026 AWS Summit season begins shortly, with upcoming events scheduled for Dubai on February 10 and Paris on March 10.

Developers should take immediate advantage of the new .NET 10 Lambda runtime for serverless applications and review the updated ECS `tmpfs` documentation to optimize container performance. For those new to the platform, the expanded Free Tier credits provide an excellent opportunity to prototype AI/ML workloads with minimal financial risk.
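As a concrete illustration of the new ECS `tmpfs` support, the sketch below registers a Fargate task definition with a memory-backed scratch mount via boto3. It assumes the Fargate launch type accepts the same `linuxParameters.tmpfs` shape that EC2 launch-type tasks already use; the family name, image, and sizes are placeholders.

```python
import boto3

ecs = boto3.client("ecs")

# Register a Fargate task definition whose container gets a memory-backed
# /scratch mount via linuxParameters.tmpfs, so temporary or sensitive data
# never touches task storage. Family, image, and sizes are illustrative.
response = ecs.register_task_definition(
    family="tmpfs-demo",  # hypothetical family name
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    containerDefinitions=[
        {
            "name": "app",
            "image": "public.ecr.aws/docker/library/nginx:latest",
            "essential": True,
            "linuxParameters": {
                "tmpfs": [
                    {
                        "containerPath": "/scratch",  # memory-backed mount point
                        "size": 128,                  # size in MiB
                        "mountOptions": ["noexec", "nosuid"],
                    }
                ]
            },
        }
    ],
)
print(response["taskDefinition"]["taskDefinitionArn"])
```

The `noexec` and `nosuid` mount options are standard Linux tmpfs flags that limit what can execute from the scratch space, which pairs well with the stated goal of keeping transient data off task storage.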


New serverless customization in Amazon SageMaker AI accelerates model fine-tuning

Amazon SageMaker AI has introduced a new serverless customization capability designed to accelerate the fine-tuning of popular models like Llama, DeepSeek, and Amazon Nova. By automating resource provisioning and providing an intuitive interface for advanced reinforcement learning techniques, this feature reduces the model customization lifecycle from months to days. This end-to-end workflow allows developers to focus on model performance rather than infrastructure management, from initial training through to final deployment.

**Automated Infrastructure and Model Support**

* The service provides a serverless environment where SageMaker AI automatically selects and provisions compute resources based on the specific model architecture and dataset size.
* Supported models include a broad range of high-performance options such as Amazon Nova, DeepSeek, GPT-OSS, Meta Llama, and Qwen.
* The feature is accessible directly through the Amazon SageMaker Studio interface, allowing users to manage their entire model catalog in one location.

**Advanced Customization and Reinforcement Learning**

* Users can choose from several fine-tuning techniques, including traditional Supervised Fine-Tuning (SFT) and more advanced methods.
* The platform supports modern optimization techniques such as Direct Preference Optimization (DPO), Reinforcement Learning from Verifiable Rewards (RLVR), and Reinforcement Learning from AI Feedback (RLAIF).
* To simplify the process, SageMaker AI provides recommended defaults for hyperparameters like batch size, learning rate, and epochs based on the selected tuning technique.

**Experiment Tracking and Security**

* The workflow introduces a serverless MLflow application, enabling seamless experiment tracking and performance monitoring without additional setup (see the sketch after this summary).
* Advanced configuration options allow for fine-grained control over network encryption and storage volume encryption to ensure data security.
* The "Continue customization" feature allows for iterative tuning, where users can adjust hyperparameters or apply different techniques to an existing customized model.

**Evaluation and Deployment Flexibility**

* Built-in evaluation tools allow developers to compare the performance of their customized models against the original base models to verify improvements.
* Once a model is finalized, it can be deployed with a few clicks to either Amazon SageMaker or Amazon Bedrock.
* A centralized "My Models" dashboard tracks all custom iterations, providing detailed logs and status updates for every training and evaluation job.

This serverless approach is highly recommended for teams that need to adapt large language models to specific domains quickly without the operational overhead of managing GPU clusters. By utilizing the integrated evaluation and multi-platform deployment options, organizations can transition from experimentation to production-ready AI more efficiently.
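Because the experiment tracking above is standard MLflow, logging a fine-tuning trial from a notebook might look like the sketch below. It assumes the `sagemaker-mlflow` plugin is installed so the client accepts a tracking-server ARN; the ARN, experiment name, and logged values are placeholders, not values from the post.

```python
import mlflow

# Point the MLflow client at the SageMaker-managed tracking server.
# The ARN below is a placeholder; use the one shown in your Studio setup.
mlflow.set_tracking_uri(
    "arn:aws:sagemaker:us-east-1:123456789012:mlflow-tracking-server/fine-tuning"
)
mlflow.set_experiment("llama-fine-tuning")  # hypothetical experiment name

with mlflow.start_run(run_name="dpo-trial-1"):
    # Hyperparameters the service provides recommended defaults for
    # (batch size, learning rate, epochs); values here are illustrative.
    mlflow.log_params({
        "technique": "DPO",
        "batch_size": 8,
        "learning_rate": 2e-5,
        "epochs": 3,
    })
    # Metrics would normally be reported by the training job itself.
    mlflow.log_metric("eval_loss", 0.42)
```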


Build multi-step applications and AI workflows with AWS Lambda durable functions

AWS Lambda durable functions introduce a simplified way to manage complex, long-running workflows directly within the standard Lambda experience. By utilizing a checkpoint and replay mechanism, developers can now write sequential code for multi-step processes that automatically handles state management and retries without the need for external orchestration services. This feature significantly reduces the cost of long-running tasks by allowing functions to suspend execution for up to one year without incurring compute charges during idle periods.

### Durable Execution Mechanism

* The system uses a "durable execution" model based on checkpointing and replay to maintain state across function restarts.
* When a function is interrupted or resumes from a pause, Lambda re-executes the handler from the beginning but skips already-completed operations by referencing saved checkpoints.
* This architecture ensures that business logic remains resilient to failures and can survive execution environment recycles.
* The execution state can be maintained for extended periods, supporting workflows that require human intervention or long-duration external processes.

### Programming Primitives and SDK

* The feature requires the inclusion of a new open-source durable execution SDK in the function code (see the sketch after this summary).
* **Steps:** The `context.step()` method defines specific blocks of logic that the system checkpoints and automatically retries upon failure.
* **Wait:** The `context.wait()` primitive allows the function to terminate and release compute resources while waiting for a specified duration, resuming only when the time elapses.
* **Callbacks:** Developers can use `create_callback()` to pause execution until an external event, such as an API response or a manual approval, is received.
* **Advanced Control:** The SDK includes `wait_for_condition()` for polling external statuses and `parallel()` or `map()` operations for managing concurrent execution paths.

### Configuration and Setup

* Durable execution must be enabled at the time of the Lambda function's creation; it cannot be retroactively enabled for existing functions.
* Once enabled, the function maintains the same event handler structure and service integrations as a standard Lambda function.
* The environment is specifically optimized for high-reliability use cases like payment processing, AI agent orchestration, and complex order management.

AWS Lambda durable functions represent a major shift for developers who need the power of stateful orchestration but prefer to keep their logic within a single code-based environment. The feature is highly recommended for building AI workflows and multi-step business processes where state persistence and cost-efficiency are critical requirements.
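A minimal sketch of how these primitives might compose in a Python handler. The `context.step()`, `context.wait()`, and callback primitives are named in the post, but the import path, decorator, and exact signatures here are assumptions; consult the open-source durable execution SDK for the real interfaces.

```python
# Hypothetical import path -- the real SDK module name may differ.
from durable_execution_sdk import durable_handler


def charge_card(order):
    # Ordinary business logic; if this raises, the surrounding step retries it.
    ...


def notify_approver(order, token):
    # Stub: a real system would deliver the callback token to a human approver.
    ...


@durable_handler
def handler(event, context):
    # Checkpointed step: on replay, Lambda skips this once it has succeeded.
    payment = context.step(lambda: charge_card(event["order"]))

    # Suspend for 24 hours; no compute is billed while the function waits.
    context.wait(hours=24)

    # Pause until an external approval arrives via the callback token
    # (token/result accessors assumed -- the post only names create_callback()).
    callback = context.create_callback()
    notify_approver(event["order"], callback.token)
    approval = callback.result()

    return {"payment": payment, "approved": approval}
```

Because the handler is replayed from the top after every suspension, any side effect that must run exactly once belongs inside a `step`, not in bare handler code.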


Amazon S3 Vectors now generally available with increased scale and performance

Amazon S3 Vectors has reached general availability, establishing the first cloud object storage service with native support for storing and querying vector data. This serverless solution allows organizations to reduce total cost of ownership by up to 90% compared to specialized vector database solutions while providing the performance required for production-grade AI applications. By integrating vector capabilities directly into S3, AWS enables a simplified architecture for retrieval-augmented generation (RAG), semantic search, and multi-agent workflows.

### Massive Scale and Index Consolidation

The move to general availability introduces a significant increase in data capacity, allowing users to manage massive datasets without complex infrastructure workarounds.

* **Increased Index Limits:** Each index can now store and search across up to 2 billion vectors, a 40x increase from the 50 million limit during the preview phase.
* **Bucket Capacity:** A single vector bucket can now scale to house up to 20 trillion vectors.
* **Simplified Architecture:** The increased scale per index removes the need for developers to shard data across multiple indexes or implement custom query federation logic.

### Performance and Latency Optimizations

The service has been tuned to meet the low-latency requirements of interactive applications like conversational AI and real-time inference.

* **Query Response Times:** Frequent queries now achieve latencies of approximately 100 ms or less, while infrequent queries consistently return results in under one second.
* **Enhanced Retrieval:** Users can now retrieve up to 100 search results per query (up from 30), providing broader context for RAG applications.
* **Write Throughput:** The system supports up to 1,000 PUT transactions per second for streaming single-vector updates, ensuring new data is immediately searchable (see the sketch after this summary).

### Serverless Efficiency and Ecosystem Integration

S3 Vectors is a fully serverless offering: there are no underlying instances to provision or manage, and customers pay only for active storage and queries.

* **Amazon Bedrock Integration:** It is now generally available as a vector storage engine for Bedrock Knowledge Bases, facilitating the building of RAG applications.
* **OpenSearch Support:** Integration with Amazon OpenSearch allows users to keep vectors in S3 Vectors for storage while leveraging OpenSearch for advanced analytics and search features.
* **Expanded Footprint:** The service is now available in 14 AWS Regions, up from five during the preview period.

With its massive scale and up to 90% cost reduction, S3 Vectors is a primary candidate for organizations looking to move AI prototypes into production. Developers should consider migrating high-volume vector workloads to S3 Vectors to benefit from the serverless operational model and the native integration with the broader AWS AI stack.
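A brief sketch of writing and querying vectors with boto3. The `s3vectors` client and the `put_vectors`/`query_vectors` operations follow the preview-era API, so parameter names may have evolved at GA; the bucket, index, keys, and embedding values are placeholders.

```python
import boto3

s3vectors = boto3.client("s3vectors")

BUCKET, INDEX = "my-vector-bucket", "docs-index"  # placeholder names
embedding = [0.1, 0.2, 0.3, 0.4]  # toy 4-dim vector; real indexes use model dims

# Stream a single-vector update; writes become searchable immediately.
s3vectors.put_vectors(
    vectorBucketName=BUCKET,
    indexName=INDEX,
    vectors=[{
        "key": "doc-001",
        "data": {"float32": embedding},
        "metadata": {"source": "handbook.pdf"},
    }],
)

# Similarity query; GA raises the per-query result cap from 30 to 100.
response = s3vectors.query_vectors(
    vectorBucketName=BUCKET,
    indexName=INDEX,
    queryVector={"float32": embedding},
    topK=10,
    returnMetadata=True,
    returnDistance=True,
)
for match in response["vectors"]:
    print(match["key"], match.get("distance"))
```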


Amazon Bedrock adds 18 fully managed open weight models, including the new Mistral Large 3 and Ministral 3 models

Amazon Bedrock has significantly expanded its generative AI offerings by adding 18 new fully managed open-weight models from providers including Google, Mistral AI, NVIDIA, and OpenAI. This update brings the platform's total to nearly 100 serverless models, allowing developers to leverage a broad spectrum of specialized capabilities through a single, unified API. By providing access to these high-performing models without requiring infrastructure changes, AWS enables organizations to rapidly evaluate and deploy the most cost-effective and capable tools for their specific workloads.

### Specialized Mistral AI Releases

The launch features four new models from Mistral AI, headlined by Mistral Large 3 and the edge-optimized Ministral series.

* **Mistral Large 3:** Optimized for long-context tasks, multimodal reasoning, and instruction reliability, making it suitable for complex coding assistance and multilingual enterprise knowledge work.
* **Ministral 3 (3B, 8B, and 14B):** These models are specifically designed for edge-optimized deployments on a single GPU.
* **Use Cases:** While the 3B model excels at real-time translation and data extraction on low-resource devices, the 14B version is built for advanced local agentic workflows where privacy and hardware constraints are primary concerns.

### Broadened Model Provider Portfolio

Beyond the Mistral updates, AWS has integrated several other open-weight options to address diverse industry requirements ranging from mobile applications to global scaling.

* **Google Gemma 3 4B:** An efficient multimodal model designed to run locally on laptops, supporting on-device AI and multilingual processing.
* **Global Provider Support:** The expansion includes models from MiniMax AI, Moonshot AI, NVIDIA, OpenAI, and Qwen, ensuring a competitive variety of reasoning and processing capabilities.
* **Multimodal Capabilities:** Many of the new additions support vision-based tasks, such as image captioning and document understanding, alongside traditional text-based functions.

### Streamlined AI Development and Integration

The primary technical advantage of this update is the ability to swap between diverse models using the Amazon Bedrock unified API (see the sketch after this summary).

* **Infrastructure Consistency:** Developers can switch to newer, more efficient models without rewriting application code or managing underlying servers.
* **Evaluation and Deployment:** The serverless architecture allows for immediate testing of different model weights (such as moving from 3B to 14B) to find the optimal balance between performance and latency.
* **Enterprise Tooling:** These models integrate with existing Bedrock features, allowing for simplified agentic workflows and tool-use implementations.

To take full advantage of these updates, developers should utilize the Bedrock console to experiment with the new Mistral and Gemma models for edge and multimodal use cases. The unified API structure makes it practical to run A/B tests between these open-weight models and established industry favorites to optimize for specific cost and performance targets.
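Since Bedrock's unified interface is the Converse API, swapping between the new open-weight models is a one-line change of `modelId`, as in the sketch below. The Mistral Large 3 identifier shown is a hypothetical placeholder; look up the real ID in the Bedrock model catalog.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

# The Converse API gives every Bedrock model the same request/response shape,
# so model swaps require no application rewrites.
MODEL_ID = "mistral.mistral-large-3-v1:0"  # hypothetical ID -- check the catalog

response = bedrock.converse(
    modelId=MODEL_ID,
    messages=[
        {"role": "user", "content": [{"text": "Summarize our Q3 incident report."}]}
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)
print(response["output"]["message"]["content"][0]["text"])
```

Running the same `messages` payload against two candidate `modelId` values is a straightforward way to carry out the A/B testing the post recommends for balancing cost and quality.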