ai-agent

33 posts

toss

The Software 3.0

Is your team using the same LLM? Many development teams are adopting LLMs right now, but looked at honestly, the reality is closer to every engineer fending for themselves. Even with the same model and the same IDE, the difference in output is extreme. One engineer, with a deep understanding of context engineering, assigns the LLM a precise role and finishes a complex refactoring in ten minutes. Another wastes an hour wrestling with hallucinations through a back-and-forth of simple questions and answers. For example, in the same repo…

aws

AWS Weekly Roundup: Claude Sonnet 4.6 in Amazon Bedrock, Kiro in GovCloud Regions, new Agent Plugins, and more (February 23, 2026) | Amazon Web Services

Last week, my team met many developers at Developer Week in San Jose. My colleague Vinicius Senger delivered a great keynote about renascent softwa…

aws

AWS Weekly Roundup: Amazon Bedrock agent workflows, Amazon SageMaker private connectivity, and more (February 2, 2026) | Amazon Web Services

Over the past week, we passed the Laba Festival, a traditional marker in the Chinese calendar that signals the final stretch leading up to the Lunar New Year. For m…

grammarly

How to Use AI Agents: A Simple Guide to Getting Started

AI agents represent a shift from reactive, prompt-based AI to proactive, goal-oriented systems capable of planning and executing multi-step tasks with minimal oversight. By operating in a continuous loop of gathering context, selecting tools, and evaluating results, these agents can manage complex workflows that previously required manual follow-up. The most effective implementation strategy is to start with small, repeatable processes and gradually increase agent autonomy as reliability is proven through feedback and testing.

### The Mechanism of Agentic AI

* Unlike traditional generative AI that responds to isolated instructions, agents possess "agency," allowing them to decide the next best action to reach a defined objective.
* Agents function through an iterative operational cycle: they analyze relevant context, select an action, utilize available tools, and evaluate the outcome to determine whether the goal is met.
* Advanced writing agents, such as those integrated into workplace tools, can proactively suggest revisions for tone, logical progression, and specificity by maintaining contextual awareness across a document's lifecycle.

### Deploying Agents via Repeatable Workflows

* Initial use cases should focus on contained, well-understood tasks rather than end-to-end process overhauls, so that the agent's logic can be easily monitored.
* In research and organization, agents can be tasked with continuously gathering and categorizing sources, updating citations as new data becomes available.
* Communication workflows benefit from agents that can reference historical conversation threads to draft follow-ups, summarize long discussions, and adjust meeting agendas dynamically.
* Content creation agents can manage the transition from rough notes to structured outlines, applying specific tone and clarity feedback across multiple versions of a draft.

### Integration and Tool Selection

* Effective deployment often requires no coding experience, as agentic capabilities are increasingly built into existing word processors, email clients, and project management platforms.
* Using familiar software ecosystems reduces the technical barrier to entry and allows for easier scaling of the agent's behavior over time.
* Project management agents can monitor task progress, adjust timelines based on changing conditions, and surface high-priority items automatically.

### Establishing Goals and Ownership

* Success depends on defining specific end states rather than vague instructions; for example, asking an agent to "flag logical gaps and suggest supporting evidence" is more effective than asking it to "improve writing."
* Defining clear ownership ensures the agent knows which parameters to prioritize, such as maintaining a consistent brand voice while revising for conciseness.
* Testing should begin with small-scale scenarios, like a single recurring email update, to allow instructions and priorities to be refined against real-world performance.

### Scaling Autonomy and Oversight

* Once an agent demonstrates consistent accuracy in a narrow task, its scope can be broadened to include related steps, such as tracking data throughout the week to prepare a draft before being prompted.
* Increased autonomy does not mean a lack of control; humans should remain in the loop to provide feedback, which the agent uses to refine its future decision-making logic.
* The transition from prompts to progress is achieved by allowing agents to work across different tools and contexts as they prove their ability to handle more complex judgment calls.

To get the most out of AI agents, treat them as collaborative partners: start with a narrow focus and provide specific, goal-oriented feedback. Rather than handing off entire processes immediately, delegate repeatable tasks where the agent's ability to plan and adapt can yield the highest immediate value.
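The iterative cycle described above — gather context, select an action, use a tool, evaluate the result — can be sketched in a few lines. This is a minimal illustration, not a real agent framework; all names here (`agent_loop`, the tool dictionaries, the `done` flag) are assumptions chosen for the example.

```python
# Minimal sketch of the agent loop: gather context, select an action,
# use a tool, evaluate the result. Illustrative only, not a real API.

def agent_loop(goal, tools, max_steps=5):
    """Run a gather -> act -> evaluate cycle until the goal is met."""
    context = {"goal": goal, "history": []}
    for _ in range(max_steps):
        # 1. Gather context: the most recent observation, if any.
        observation = context["history"][-1] if context["history"] else None
        # 2. Select an action: first tool whose precondition holds.
        tool = next((t for t in tools if t["when"](context, observation)), None)
        if tool is None:
            break
        # 3. Use the tool and record the outcome.
        result = tool["run"](context)
        context["history"].append((tool["name"], result))
        # 4. Evaluate: stop as soon as the goal condition is satisfied.
        if result.get("done"):
            break
    return context

# Usage: a toy "draft a follow-up email" workflow with two tools.
tools = [
    {"name": "summarize_thread",
     "when": lambda ctx, obs: not ctx["history"],
     "run": lambda ctx: {"summary": "3 open action items", "done": False}},
    {"name": "draft_followup",
     "when": lambda ctx, obs: obs is not None,
     "run": lambda ctx: {"draft": "Hi team, following up on...", "done": True}},
]
state = agent_loop("send weekly follow-up", tools)
print([name for name, _ in state["history"]])  # ['summarize_thread', 'draft_followup']
```

Starting narrow, as the article recommends, maps directly onto this structure: begin with one or two tools whose preconditions are easy to audit, then add tools as the loop proves reliable.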

grammarly

AI Assistants vs. AI Agents: What’s the Difference and When to Use Each

While AI assistants and agents often share the same large language model foundations, they serve distinct roles based on their level of autonomy and task complexity. Assistants operate on a reactive "prompt-response" loop for immediate, single-step tasks, whereas agents function as semi-independent systems capable of planning and executing multistep workflows to achieve a broader goal. Ultimately, the most effective AI strategy leverages assistants for quick, guided interactions while using agents to manage complex, coordinated projects that require memory and tool integration.

### Reactive vs. Proactive AI Architectures

* Assistants are reactive tools that follow a "prompt-response" loop, similar to a tennis match where the user must always serve to initiate action.
* Agents are proactive and semi-independent; once given a high-level goal, they can decompose it into actionable steps and execute them with minimal step-by-step direction.
* In a practical scenario, an assistant might summarize meeting notes upon request, whereas an agent can organize those notes, assign tasks in a project management tool, and schedule follow-ups automatically.

### Technical Capabilities and Coordination

* Both tools use large language models (LLMs) to understand natural language, but agents add advanced features like long-term memory and cross-app integrations.
* Memory allows agents to retain feedback and results from previous interactions to deliver better outcomes over time, while integrations enable them to act on the user's behalf across different software platforms.
* The two systems often work in tandem: the assistant acts as the front-facing interface (the "waiter") for user commands, while the agent acts as the back-end engine (the "kitchen") that performs the orchestration.

### Balancing Control and Complexity

* AI assistants provide high user control and instant setup, making them ideal for "out of the box" tasks like grammar checks, rephrasing text, or answering quick questions.
* AI agents excel at reducing cognitive load by managing "moving parts" like deadline tracking, organizing inputs from different stakeholders, and maintaining project state across various tools.
* Grammarly's implementation of agents serves as a technical example, moving beyond simple text revision to offer context-aware suggestions that help with brainstorming, knowledge retrieval, and predicting audience reactions.

To maximize productivity, delegate isolated, high-control tasks to AI assistants while allowing AI agents to handle the background orchestration of complex projects. Success with these tools depends on maintaining human oversight, using assistant-led prompts to provide the regular feedback that agents need to refine their autonomous workflows.
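The reactive-versus-proactive distinction above can be made concrete in code. The sketch below is purely illustrative — the `Assistant` and `Agent` classes, the stubbed `llm` function, and the hard-coded plan are all assumptions, not Grammarly's (or any vendor's) actual API — but it shows the structural difference: one prompt-response exchange versus goal decomposition with memory carried between steps.

```python
# Illustrative contrast between a reactive assistant and a proactive agent.

def llm(prompt):
    # Stand-in for the shared LLM backend both tools would use.
    return f"response to: {prompt}"

class Assistant:
    """Reactive: one prompt in, one response out; the user always serves."""
    def ask(self, prompt):
        return llm(prompt)

class Agent:
    """Proactive: decomposes a high-level goal into steps and executes them,
    retaining results in memory so later steps can build on earlier ones."""
    def __init__(self):
        self.memory = []  # long-term memory across steps

    def plan(self, goal):
        # Toy decomposition; a real agent would plan via the LLM itself.
        return [f"organize notes for {goal}",
                f"assign tasks for {goal}",
                f"schedule follow-up for {goal}"]

    def run(self, goal):
        for step in self.plan(goal):
            result = llm(step)
            self.memory.append((step, result))  # retained for future steps
        return self.memory

assistant = Assistant()
print(assistant.ask("summarize meeting notes"))  # single exchange, then stops

agent = Agent()
steps = agent.run("the Q3 launch meeting")
print(len(steps))  # 3 — one memory entry per planned step
```

The "waiter and kitchen" pairing from the article corresponds to wiring `Assistant.ask` as the user-facing entry point that hands goals to `Agent.run` behind the scenes.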

toss

Welcoming the Era of

The tech industry is shifting from Software 1.0 (explicit logic) and 2.0 (neural networks) into Software 3.0, where natural language prompts and autonomous agents act as the primary programming interface. While large language models (LLMs) are the engines of this era, they require a "Harness" (a structured environment of tools and protocols) to perform real-world tasks effectively. This evolution does not render traditional engineering obsolete; instead, it demonstrates that robust architectural principles like layered design and separation of powers are essential for building reliable AI agents.

### The Evolution of Software 3.0

* Software 1.0 is defined by explicit "how" logic written in languages like Python or Java, while Software 2.0 shifts that logic into the weights and data of neural networks.
* Software 3.0, a term popularized by Andrej Karpathy, moves to "what" logic, where natural language prompts drive execution.
* The "Harness" concept is critical: just as a horse needs a harness to be useful to a human, an LLM needs tools (CLI, API access, file systems) to move from being a chatbot to being a functional agent like Claude Code.

### Mapping Agent Architecture to Traditional Layers

* **Slash Commands as Controllers:** Tools like `/review` or `/refactor` act as entry points for user requests, similar to REST controllers in Spring or Express.
* **Sub-agents as the Service Layer:** Sub-agents coordinate multiple skills and maintain independent context, mirroring how services orchestrate domain objects and repositories.
* **Skills as Domain Components:** Following the Single Responsibility Principle (SRP), individual skills should handle one clear task (e.g., generating tests) to prevent logic bloat.
* **MCP as Infrastructure/Adapters:** The Model Context Protocol (MCP) functions like the Repository or Adapter pattern, abstracting external systems such as databases and APIs away from the core logic.
* **CLAUDE.md as Configuration:** Project-specific rules and tech stacks are stored in metadata files, acting as the `package.json` or `pom.xml` of the agent environment.

### From Exceptions to Questions

* Traditional 1.0 software must have every branch of logic predefined; if an unknown state is reached, the system throws an exception or fails.
* Software 3.0 introduces human-in-the-loop (HITL) design, where "exceptions" become "questions," allowing the agent to ask for clarification on high-risk or ambiguous tasks.
* Effective agent design requires identifying when to act autonomously (reversible, low-risk tasks) versus when to delegate decisions to a human (deployments, deletions, or high-cost API calls).

### Managing Constraints: Tokens and Complexity

* In Software 3.0, tokens are the "memory" (RAM) of the system; large codebases can lead to "token explosion," causing context overflow or high costs.
* Deterministic logic should be moved into external scripts rather than being re-interpreted by the LLM every time, both to save tokens and to ensure consistency.
* To avoid "skill explosion" (analogous to class explosion), developers should use "Progressive Disclosure": give the agent a high-level entry point and load detailed task knowledge only when it is specifically required.

Traditional software engineering expertise, specifically in cohesion, coupling, and abstraction, is the most valuable asset when transitioning to Software 3.0. By treating prompt engineering and agent orchestration with the same architectural rigor as 1.0 code, developers can build agents that are scalable, maintainable, and truly useful.
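The "exceptions become questions" idea can be sketched as a small dispatch routine: reversible actions run autonomously, high-risk actions pause for human confirmation, and unknown states turn into a question rather than a silent failure. The action names, the two risk sets, and the `input_fn` hook below are illustrative assumptions, not part of any real agent framework.

```python
# Sketch of human-in-the-loop (HITL) dispatch: low-risk actions run
# autonomously; high-risk actions are delegated to a human. Illustrative only.

REVERSIBLE = {"format_code", "generate_tests", "draft_docs"}
HIGH_RISK = {"deploy", "delete_branch", "call_paid_api"}

def execute(action):
    return f"executed {action}"

def ask_human(action, input_fn=input):
    # HITL checkpoint: the agent pauses and asks instead of failing.
    answer = input_fn(f"Agent wants to run '{action}'. Allow? [y/N] ")
    return answer.strip().lower() == "y"

def run_action(action, input_fn=input):
    if action in REVERSIBLE:
        return execute(action)           # act autonomously: low risk
    if action in HIGH_RISK:
        if ask_human(action, input_fn):  # delegate the decision
            return execute(action)
        return f"skipped {action} (denied by human)"
    # Unknown state: 1.0 software would throw here; a 3.0 agent would
    # instead phrase this as another question back to the human.
    raise ValueError(f"unknown action: {action}")

print(run_action("generate_tests"))                  # runs without asking
print(run_action("deploy", input_fn=lambda _: "n"))  # human denies it
```

The same boundary also helps with the token constraints discussed above: deterministic, pre-classified actions like these can live in a plain script, so the LLM never spends context re-deriving the risk policy.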