亚马逊云科技 2025 re:Invent


发布时间:2026/01/12

Agentic AI的实践之路

The Practical Path Forward for Agentic AI

当前,Agentic AI正处于从“技术探索”迈向“价值兑现”的关键转折期,企业必须深刻洞察并适应几大核心战略范式的转移。在这一轮深刻的变革浪潮中,谁能更精准地把握技术实质、更前瞻地布局未来架构,谁便能在市场中确立显著的竞争优势。我们观察到以下三个决定未来格局的关键趋势:

As Agentic AI transitions from a phase of “technological exploration” to one of “value realization,” enterprises must deeply understand and adapt to several pivotal shifts in strategic paradigms. In this profound wave of transformation, those who grasp the essence of the technology most accurately and architect for the future most proactively will establish a decisive competitive advantage in the market. We have identified the following three critical trends that will shape the landscape ahead:

AI基础设施(AI Infra)的范式转移

软件的主要交互对象正在发生根本性变化,正从以人类开发者为中心,转向以“AI Agent(智能体)”为中心。这意味着基础设施层需要重新定义其接口、性能指标和资源调度策略,以适应AI自主消费资源的新模式。

Paradigm Shift in AI Infrastructure (AI Infra):
The primary interaction partner of software is undergoing a fundamental transformation—shifting from being centered on human developers to being centered on AI Agents. This demands a redefinition of infrastructure-layer interfaces, performance metrics, and resource scheduling strategies to accommodate the new paradigm in which AI autonomously consumes computational resources.

AI Agent的规模化爆发临界点

我们正处于AI Agent从概念验证走向全面大规模商业落地的前夜。Gartner技术成熟度曲线表明,AI Agent正在快速爬升,即将成为未来2-5年内影响力最大的技术之一。这不仅是工具的升级,更是业务流程重构和生产力变革的核心引擎。

The Tipping Point for Mass-Scale AI Agent Adoption:
We stand on the cusp of AI Agents’ transition from proof-of-concept experiments to full-scale commercial deployment. According to Gartner’s Hype Cycle, AI Agents are rapidly ascending the curve and are poised to become one of the most impactful technologies over the next 2–5 years. This evolution represents far more than an incremental tool upgrade—it is the core engine driving the reengineering of business processes and a fundamental shift in productivity.

中美齐头并进,中国市场的场景化优势

在应用层创新方面,中美处于同一起跑线。中国凭借其庞大的经济体量、丰富的产业生态和活跃的数字化场景,为AI Agent的试错、迭代和规模化应用提供了全球独一无二的“试验田”。

Parallel Progress in the U.S. and China, China’s Edge in Scenario-Driven Innovation:
At the application layer, the U.S. and China are effectively starting from the same baseline. China, however, possesses a distinct advantage: its vast economic scale, rich industrial ecosystems, and highly dynamic digital environments create a globally unique “testing ground” for rapid experimentation, iteration, and large-scale deployment of AI Agents.

 

在近期的 re:Invent 发布会上,AWS的战略布局清晰地印证了上述趋势,特别是对“Agent化开发”的全面捕捉。其通过彻底重构和升级围绕Agent开发的全栈产品体系,为用户从传统应用向生成式AI架构的迁移提供了高效的路径。

At its recent re:Invent conference, AWS clearly validated the aforementioned trends, particularly through its comprehensive embrace of agent-centric development. By thoroughly rearchitecting and upgrading its full-stack product portfolio around AI Agent development, AWS has provided users with an efficient pathway to migrate from traditional applications to generative AI architectures.

在当前开源模型蓬勃发展的背景下,过度投入底层模型训练已非最优解。AWS明智地选择了平台与工具战略,致力于为用户提供广泛的选择和灵活的集成能力。这体现了以用户价值优先为核心的理念,即通过降低技术门槛和供应商锁定风险,赋能客户在多样化的模型生态中做出最优选择。

Against the backdrop of today’s rapidly flourishing open-source model ecosystem, heavy investment in foundational model training is no longer the optimal strategy. AWS has wisely pivoted toward a “platform and tools” approach, focusing on delivering users broad model choices and flexible integration capabilities. This reflects a core philosophy centered on user value first, lowering technical barriers and mitigating vendor lock-in risks to empower customers to make optimal selections within a diverse and evolving model ecosystem.

 

全栈Agentic AI:重新定义与实践优化

Full-Stack Agentic AI: Redefining Development and Optimizing Practice

“全栈Agentic AI”正在重新定义软件开发和应用交付的模式。这要求基础设施及软件提供商必须从实践中不断优化和探索。

“Full-stack Agentic AI” is fundamentally reshaping the paradigms of software development and application delivery. This demands that infrastructure and software providers continuously refine their approaches through real-world implementation, iterating toward greater efficiency, adaptability, and value creation.

 

用户基础即护城河

在Agent生态的竞争中,谁拥有更庞大的用户基础和更丰富的应用场景,谁就更有可能通过数据飞轮效应,迭代出功能更强大、工具更完备的Agent解决方案。

User Base as the Moat:

 In the competition for Agent ecosystems, whoever possesses a larger user base and richer application scenarios is more likely to leverage the data flywheel effect to iteratively develop more powerful and fully-featured Agent solutions.

重构工作边界

Infra及软件企业面临的核心挑战在于,必须清晰地理解和区分“人类开发者”与“AI Agent”在执行任务时的能力差异和边界。这需要构建能够理解Agent行为模式的新型基础设施。

Redefining the Boundary of Work:

 A core challenge for infrastructure and software companies lies in clearly understanding and architecturally distinguishing the differing capabilities and operational boundaries between human developers and AI Agents. This requires building new infrastructure capable of understanding AI Agent behavior patterns.

动态平衡的协作架构

未来的开发和运维模式将是“人类开发者 + AI Agent”的动态协作。如何在任务执行过程中,实现两者能力的弹性互补和动态平衡,是所有基础设施和软件企业必须解决的架构性挑战。

A Dynamically Balanced Collaborative Architecture:

The future of development and operations will be defined by dynamic collaboration between human developers and AI Agents. A fundamental architectural imperative for all infrastructure and software providers is to enable elastic complementarity and real-time balance between human and agent capabilities throughout task execution.  

我们必须承认,地缘政治因素加剧了全球技术竞争,导致技术栈在短期内呈现出一定程度的分割态势。然而,从长期来看,技术的融合与共享是不可逆转的历史潮流。

We must acknowledge that geopolitical tensions have intensified global technological competition, leading to a degree of fragmentation in technology stacks in the short term. Yet over the long term, the historical tide toward integration and shared progress in technology remains irreversible.

短期的技术竞争和路线之争,客观上有助于去伪存真,并帮助人类社会更好地控制全面AI到来的节奏,为我们留出更多关于AI伦理、安全和治理的思考与准备时间。毕竟,AI技术作为一项通用目的技术,其终极归属是服务于全人类福祉。我们应致力于在竞争中寻求合作,在分歧中构建共识,共同引导这项强大的技术向善而行。

Short-term technological competition and debates over development pathways objectively help to separate the wheat from the chaff and enable human society to better manage the pace at which comprehensive AI arrives, giving us more time to reflect on and prepare for issues related to AI ethics, safety, and governance. After all, as a general-purpose technology, AI’s ultimate purpose is to serve the well-being of all humanity. We should strive to seek cooperation amid competition and build consensus despite differences, jointly guiding this powerful technology toward beneficial outcomes.

 

第一阶段:重塑基础设施——“降本”是 Agent 普及的前提

Phase 1: Rebuilding the Infrastructure — Cost Reduction as the Prerequisite for Agent Adoption

在人工智能从“对话助手”迈向“数字员工”的关键跃迁中,智能体(Agent)正成为企业智能化转型的核心载体。然而,一个不容忽视的现实是:真正的 Agent 必须 24 小时在线,持续感知、推理、决策并执行任务。这意味着其背后需要稳定、高效、低成本的算力基础设施作为支撑。若无法突破高昂的推理与运行成本瓶颈,“用得起”便无从谈起,“用得广”更是一句空话。当前,行业共识正在形成:Agent 的规模化落地,首先是一场基础设施的革命。而这场革命的核心命题,正是“降本”。

As artificial intelligence transitions from “conversational assistants” to “digital employees,” intelligent agents (Agents) are becoming the core vehicle for enterprise intelligent transformation. However, a critical reality cannot be ignored: true Agents must remain online 24/7, continuously perceiving, reasoning, making decisions, and executing tasks. This demands stable, efficient, and low-cost computing infrastructure as foundational support. Without overcoming the bottleneck of high inference and operational costs, affordability—let alone widespread adoption—remains an empty promise. Industry consensus is now coalescing around a key insight: the large-scale deployment of Agents begins with a revolution in infrastructure, and the central theme of this revolution is “cost reduction.”

 

自研芯片的降维打击:从 Trainium3 到 Graviton5,重构 AI 算力经济模型

Custom Silicon’s Disruptive Edge: From Trainium3 to Graviton5, Redefining the Economics of AI Compute

2025 年,亚马逊 AWS 在 re:Invent 大会上亮出“双芯合璧”的底牌——新一代训练芯片 Trainium3 与通用服务器 CPU Graviton5,共同构筑起面向 Agent 时代的高性价比算力基座。作为亚马逊首款基于3nm工艺打造的芯片,Trainium3在计算能力、能效表现及内存带宽方面实现大幅跃升。与传统基于图形处理单元(GPU)的解决方案相比,采用Trainium3进行AI模型训练和推理,最多可降低50%的成本。

At its 2025 re:Invent conference, Amazon AWS unveiled a “dual-chip synergy” strategy—its next-generation training chip Trainium3 and general-purpose server CPU Graviton5—together forming a high-performance, cost-efficient compute foundation tailored for the Agent era. As Amazon’s first chip built on a 3nm process, Trainium3 delivers significant leaps in computational power, energy efficiency, and memory bandwidth. Compared to traditional GPU-based solutions, AI model training and inference using Trainium3 can reduce costs by up to 50%.

亚马逊新一代自研服务器芯片Graviton5则是迄今为止性能最强劲且能效最高级别的数据中心服务器中央处理器(CPU),每个Graviton5核心可访问的L3级别缓存容量是Graviton4的2.6倍,其网络与存储带宽也提升了15%至20%。与前一代相比,Graviton5在保持行业内领先能效的同时可提供高达25%的计算性能提升,使客户能够更快运行应用、大幅降低计算成本并实现可持续发展目标。

Amazon’s new-generation custom server CPU, Graviton5, represents the most powerful and energy-efficient data center CPU in the company’s history. Each Graviton5 core accesses L3 cache capacity 2.6 times larger than that of Graviton4, while network and storage bandwidth have increased by 15% to 20%. Delivering up to 25% higher compute performance than its predecessor—all while maintaining industry-leading energy efficiency—Graviton5 enables customers to run applications faster, substantially lower compute costs, and advance sustainability goals.

 

推理引擎与模型协同优化:Mantle + Nova,打造精细推理流水线

Co-Optimization of Inference Engine and Models: Mantle + Nova, Building a Precision Inference Pipeline

硬件降本之外,软件栈的深度协同同样关键。AWS 推出的 Mantle 推理引擎,专为大模型推理场景设计,通过内核级调度、内存池复用与批处理动态融合等技术,显著降低延迟与资源碎片。在服务层级上,系统允许客户将请求分配到三个通道:Priority通道提供实时低延迟,Standard通道提供稳定可预测的性能,Flex通道适合后台任务且效率优先。每个客户拥有独立队列,一个客户的突发情况不会影响其他客户性能。

Beyond hardware-driven cost savings, deep software-hardware co-optimization is equally crucial. AWS’s newly launched Mantle inference engine is purpose-built for large-model inference scenarios. Through kernel-level scheduling, memory pool reuse, and dynamic batch fusion, Mantle significantly reduces latency and resource fragmentation. At the service layer, the system allows customers to route requests through three distinct channels: Priority for real-time, low-latency workloads; Standard for stable and predictable performance; and Flex for background tasks prioritizing efficiency. Each customer has a dedicated queue, ensuring one customer’s traffic spikes do not impact others’ performance.
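To make the three-channel idea concrete, here is a small, purely illustrative Python sketch of how a caller might decide which tier a request belongs to. The field names, tier labels, and the routing rule are our own shorthand for the Priority/Standard/Flex channels described above, not an actual Mantle API.

```python
# Hypothetical sketch of tier selection for inference requests.
# "priority" / "standard" / "flex" mirror the three service channels above;
# the request fields and thresholds are illustrative only.
from dataclasses import dataclass

@dataclass
class InferenceRequest:
    prompt: str
    interactive: bool       # a user is waiting on the response
    deadline_seconds: int   # acceptable completion window

def choose_tier(req: InferenceRequest) -> str:
    """Map a workload to one of the three service channels."""
    if req.interactive:
        return "priority"   # real-time, low-latency
    if req.deadline_seconds <= 300:
        return "standard"   # stable, predictable performance
    return "flex"           # background work, efficiency first

print(choose_tier(InferenceRequest("summarize this ticket", interactive=True, deadline_seconds=5)))
# -> "priority"
```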

在生成式AI推理方面,亚马逊云科技推出了Amazon Nova 2系列基础模型,这些模型能够满足不同场景下的需求,从低成本的文本到文本响应到功能强大的多模态模型,Amazon Nova系列都能够提供卓越的性能和准确性。配合 Mantle 引擎,Amazon Nova 模型家族进一步细化了成本与性能的平衡空间:Nova 2 Sonic 面向高吞吐、低延迟的实时交互场景(如客服 Agent、交易风控),通过结构化稀疏与量化感知训练,在保持 95%以上原始模型能力的同时,将推理速度提升 2.1 倍;Nova 2 Lite 则聚焦边缘与轻量部署,采用知识蒸馏与模块剪枝技术,模型体积压缩至原版的 1/8,可在 Graviton5 实例上以极低成本并发运行数百个微型 Agent。

In generative AI inference, Amazon Web Services introduced the Amazon Nova 2 family of foundation models, designed to meet diverse application needs—from low-cost text-to-text responses to powerful multimodal capabilities—delivering exceptional performance and accuracy across the board. Coupled with the Mantle engine, the Amazon Nova model family further refines the trade-off between cost and performance. Nova 2 Sonic targets high-throughput, low-latency real-time interaction scenarios (e.g., customer service Agents or transaction risk control), leveraging structured sparsity and quantization-aware training to achieve 2.1× faster inference speeds while retaining over 95% of the original model’s capability. Meanwhile, Nova 2 Lite focuses on edge and lightweight deployments, applying knowledge distillation and module pruning to shrink model size to 1/8 of the original, enabling hundreds of micro-Agents to run concurrently at ultra-low cost on Graviton5 instances.
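For orientation, a minimal Python sketch of invoking a Nova-family model through the Bedrock Runtime Converse API follows. The model ID shown is a placeholder (a current Nova Lite identifier); the exact Nova 2 model IDs and regional availability should be confirmed in the Bedrock console.

```python
# Minimal sketch: calling a Nova-family model via the Bedrock Runtime Converse API.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="amazon.nova-lite-v1:0",  # placeholder; substitute the Nova 2 variant you use
    messages=[{"role": "user", "content": [{"text": "Classify this support ticket: 'refund not received'"}]}],
    inferenceConfig={"maxTokens": 256, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```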

 

硬件级安全护航:Confidential Computing 为 Agent 赋予可信身份

Hardware-Enforced Security: Confidential Computing Grants Agents a “Trusted Identity”

成本之外,信任是企业采纳 Agent 的另一道门槛。当 Agent 开始接触客户数据、财务信息甚至核心业务流程,数据安全与隐私合规便成为不可妥协的红线。AWS 通过机密计算(Confidential Computing)技术,在硬件层面构建“零信任”执行环境。同时,全新第六代 Nitro 系统和 Nitro 隔离引擎,进一步提升 Graviton5 的安全保障。Nitro 系统能将虚拟化、存储和网络任务卸载到专用硬件上。Graviton5 引入的 Nitro 隔离引擎,通过使用形式化验证来增强 Nitro 系统,从而确保工作负载彼此隔离和安全。全新 Nitro 隔离引擎使用精简且经过形式化验证的代码库,其中包含数学证明,以确保其行为完全符合定义,这项技术为经数学验证的云安全树立了新的标准。

Beyond cost, trust is another critical barrier to enterprise Agent adoption. As Agents begin handling customer data, financial information, and even core business workflows, data security and privacy compliance become non-negotiable requirements. AWS addresses this through Confidential Computing—a technology that establishes a “zero-trust” execution environment at the hardware level. Complementing this, the all-new sixth-generation Nitro system and Nitro Isolation Engine further enhance Graviton5’s security posture. The Nitro system offloads virtualization, storage, and networking tasks onto dedicated hardware. Graviton5 introduces the Nitro Isolation Engine, which strengthens the Nitro system through formal verification to ensure strict workload isolation and security. Built on a minimal, formally verified codebase backed by mathematical proofs guaranteeing its behavior strictly adheres to specification, this new Nitro Isolation Engine sets a new benchmark for mathematically verifiable cloud security.

 

 

第二阶段:构建与开发——从“写代码”到“编排智能体”

Phase 2: Construction and Development — From “Writing Code” to “Orchestrating Agents”

当前,人工智能的开发范式正在经历一场深刻变革。传统软件开发依赖于编写确定性的指令序列,而智能体的构建核心已转变为对自主认知系统的编排。这一转变要求开发者从流程编码者,转型为智能体能力与组件的整合者。亚马逊云科技在re:Invent 2025发布的一系列工具与服务,正是为了系统化地支持这一新范式,降低从概念验证到大规模部署的全链路门槛。

A profound transformation is currently underway in the paradigm of artificial intelligence development. While traditional software development relies on writing deterministic sequences of instructions, the core of agent construction has shifted to orchestrating autonomous cognitive systems. This change requires developers to transition from being process coders to integrators of agent capabilities and components. The suite of tools and services launched by AWS at re:Invent 2025 is designed to systematically support this new paradigm, lowering barriers across the entire journey from proof-of-concept to large-scale deployment.

 

新范式下的核心开发框架

为践行模型驱动的开发理念,亚马逊云科技开源了 Strands Agents SDK。该框架摒弃了预先定义刚性工作流的传统思路,其核心在于信任并利用大语言模型固有的规划、推理与工具调用能力。开发者仅需通过简洁的代码定义任务目标与可用工具,即可快速构建功能型智能体,实现“配置即构建”。

Core Development Frameworks for the New Paradigm

To implement the model-driven development philosophy, AWS open-sourced the Strands Agents SDK. This framework departs from the traditional approach of predefining rigid workflows. Its core principle is to trust and leverage the inherent planning, reasoning, and tool-calling capabilities of large language models. Developers can quickly build functional agents by concisely defining task objectives and available tools, achieving “configure to build.”
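As a minimal sketch of this “configure to build” style, the snippet below uses the open-source Strands Agents SDK (pip install strands-agents) in Python: a stub function becomes a tool, and the agent receives only a goal-oriented system prompt plus that tool, with no hand-coded workflow. The import paths follow the project’s published examples; the tool body is a stand-in.

```python
from strands import Agent, tool

@tool
def lookup_order(order_id: str) -> str:
    """Return the shipping status for an order (stubbed for the example)."""
    return f"Order {order_id} shipped yesterday and arrives tomorrow."

# No workflow is hard-coded: the model plans, reasons, and decides when to call the tool.
agent = Agent(
    system_prompt="You are a customer-support agent. Use tools when they help.",
    tools=[lookup_order],
)

agent("Where is order 8861?")
```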

降低门槛与生态扩展

SDK全面支持Python与TypeScript,并内置超过20个开箱即用的工具。通过简单的装饰器,任何函数都能转化为智能体可调用的工具。其与模型上下文协议(MCP)的深度集成,更能让智能体安全接入海量第三方工具与服务,极大扩展了能力边界。

Lowering Barriers and Ecosystem Expansion:

The SDK fully supports Python and TypeScript and comes with over 20 out-of-the-box tools. Any function can be transformed into an agent-callable tool via a simple decorator. Its deep integration with the Model Context Protocol (MCP) further allows agents to securely access a vast array of third-party tools and services, significantly expanding their capability boundaries.
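The MCP integration mentioned above can be sketched as follows. Treat the endpoint URL as hypothetical, and verify the exact client class and helper names against the Strands and MCP SDK versions you install.

```python
# Hedged sketch: attaching tools from an MCP server to a Strands agent.
from strands import Agent
from strands.tools.mcp import MCPClient
from mcp.client.streamable_http import streamablehttp_client

# Hypothetical third-party MCP endpoint exposing, e.g., ticketing tools.
mcp_client = MCPClient(lambda: streamablehttp_client("https://example.com/mcp"))

with mcp_client:
    tools = mcp_client.list_tools_sync()   # discover what the server offers
    agent = Agent(tools=tools)             # hand those tools to the agent
    agent("Open a Jira ticket for the failed nightly build.")
```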

原生支持复杂协作

该框架原生支持多种多智能体协作范式。开发者可根据任务需求,选择适用于严格顺序执行的工作流模式、基于有向无环图结构的图模式,或允许多智能体自主协同的集群模式,以构建适应不同复杂度的系统。

Native Support for Complex Collaboration:

The framework natively supports multiple paradigms for multi-agent collaboration. Depending on task requirements, developers can choose the Workflow mode for strictly sequential execution, the Graph mode based on a directed acyclic graph structure, or the Swarm mode that allows for autonomous collaboration among multiple agents, enabling the construction of systems suited to varying levels of complexity.
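To illustrate the “graph” pattern without tying the example to any one framework’s API, here is a framework-agnostic Python sketch of agents arranged as nodes in a directed acyclic graph, each consuming its upstream outputs. Strands’ Graph mode plays the equivalent role; the node functions here are trivial stand-ins.

```python
# Framework-agnostic sketch of the "graph" collaboration pattern:
# agents as DAG nodes, each consuming its dependencies' outputs.
from graphlib import TopologicalSorter

def researcher(inputs): return "facts about topic X"
def writer(inputs):     return f"draft based on: {inputs['researcher']}"
def reviewer(inputs):   return f"approved: {inputs['writer']}"

# Map each node to the set of nodes it depends on (researcher -> writer -> reviewer).
dag = {"researcher": set(), "writer": {"researcher"}, "reviewer": {"writer"}}
agents = {"researcher": researcher, "writer": writer, "reviewer": reviewer}
results = {}

for node in TopologicalSorter(dag).static_order():
    results[node] = agents[node]({dep: results[dep] for dep in dag[node]})

print(results["reviewer"])
```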

 

面向生产的模块化核心能力

Modular Core Capabilities for Production

将智能体投入企业级生产环境,需系统化解决记忆、安全集成与可信执行等工程挑战。Amazon Bedrock AgentCore 服务采用模块化设计,提供了一系列可独立组合使用的企业级构建块,为智能体的规模化应用提供坚实基础。

Deploying agents in enterprise-grade production environments requires systematically addressing engineering challenges such as memory, secure integration, and trustworthy execution. The Amazon Bedrock AgentCore service adopts a modular design, offering a series of enterprise-grade building blocks that can be used independently or in combination, providing a solid foundation for the scalable application of agents.

记忆,从事实存储到情景学习:

AgentCore Memory 提供分层的记忆管理。除了维持对话上下文的短期记忆和存储用户偏好的长期记忆外,本次新发布的情景记忆功能实现了关键突破。它使得智能体能够从连续的历史交互中进行模式提炼与经验学习,而非仅进行静态事实检索。例如,智能体可自主归纳出用户携带家人出行时需预留更长时间的行为模式,并在未来类似场景中主动应用,从而实现持续优化的个性化服务。
Memory, From Fact Storage to Episodic Learning:

AgentCore Memory provides hierarchical memory management. In addition to short-term memory for maintaining conversational context and long-term memory for storing user preferences, the newly launched Episodic Memory feature represents a key breakthrough. It enables agents to perform pattern extraction and experiential learning from continuous historical interactions, rather than merely retrieving static facts. For instance, an agent can autonomously infer a behavioral pattern such as “the user requires more lead time when traveling with family” and proactively apply it in future similar scenarios, thereby achieving continuously optimized personalized service.
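A conceptual Python sketch of the three layers follows, purely to show how short-term turns, long-term preferences, and distilled episodic patterns differ in shape; the class and method names are illustrative, not the AgentCore Memory API.

```python
# Conceptual sketch of hierarchical agent memory (names are illustrative).
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    short_term: list[str] = field(default_factory=list)      # current conversation turns
    long_term: dict[str, str] = field(default_factory=dict)  # durable user preferences
    episodic: list[str] = field(default_factory=list)        # patterns distilled from past episodes

    def remember_turn(self, utterance: str) -> None:
        self.short_term.append(utterance)

    def learn_pattern(self, pattern: str) -> None:
        """Episodic learning: store a behavioral pattern, not a single fact."""
        self.episodic.append(pattern)

memory = AgentMemory(long_term={"seat": "aisle"})
memory.remember_turn("Book flights for me, my partner and the kids.")
memory.learn_pattern("When traveling with family, the user wants extra lead time.")
```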

身份与集成,保障安全可控的系统接入:

让智能体作为可信实体安全访问内外系统是落地关键。AgentCore Identity 为每个智能体提供独立且可审计的数字身份,并管理其精细的访问权限。与之协同的 AgentCore Gateway 作为统一集成层,将企业内部API(如数据库、CRM系统)和第三方服务(如SlackJira)统一封装、纳管,并安全处理复杂的认证流程。这套机制共同保障了智能体在企业环境中的合规与安全运作。
Identity & Gateway, Ensuring Secure and Controlled System Access:

 Enabling agents to securely access internal and external systems as trusted entities is key to deployment. AgentCore Identity provides each agent with an independent and auditable digital identity, managing its fine-grained access permissions. The closely integrated AgentCore Gateway serves as a unified integration layer, responsible for encapsulating and managing internal enterprise APIs (e.g., databases, CRM systems) and third-party services (e.g., Slack, Jira), while securely handling complex authentication flows. This combined mechanism ensures the compliant and secure operation of agents within the enterprise environment.
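The division of labor can be sketched as follows: the agent carries an identity, and a gateway checks that identity’s permissions before any backend call is forwarded. All names in this sketch are hypothetical; AgentCore Identity and Gateway provide the managed version of this pattern.

```python
# Illustrative sketch of the Identity + Gateway pattern (not the AgentCore API).
ALLOWED = {
    "support-agent-01": {"crm.read", "jira.create"},
}

def gateway_call(agent_id: str, permission: str, invoke):
    """Forward the call only if this agent identity holds the required permission."""
    if permission not in ALLOWED.get(agent_id, set()):
        raise PermissionError(f"{agent_id} lacks {permission}")
    # Credential handling / auth handshakes would live here, outside the agent itself.
    return invoke()

gateway_call("support-agent-01", "crm.read", lambda: {"customer": "ACME", "tier": "gold"})
```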

运行时与沙箱,提供隔离可靠的执行环境:

AgentCore Runtime 是一个为长周期、有状态智能体任务优化的无服务器环境,支持长达数小时的连续执行,并兼容主流开发框架。AgentCore Code Interpreter 则为智能体执行代码或计算逻辑提供了一个安全的沙箱环境,确保任何运算任务都在隔离空间中完成,从而保障底层系统的稳定性与安全性。
Runtime & Code Interpreter, Providing an Isolated and Reliable Execution Environment:

AgentCore Runtime is a serverless environment optimized for long-running, stateful agent tasks, supporting continuous execution for several hours and compatible with mainstream development frameworks. AgentCore Code Interpreter provides a secure sandbox environment for agents to execute code or computational logic, ensuring that any computational task is completed within an isolated space, thereby guaranteeing the stability and security of the underlying system.
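As a toy illustration of the isolation principle (not of the managed Code Interpreter itself), the sketch below runs agent-generated code in a separate, time-limited process so a failure or runaway computation cannot take down the host application.

```python
# Toy illustration of the sandbox idea: agent-generated code runs in a separate,
# time-limited process. The real Code Interpreter is a managed, isolated service;
# this only sketches the principle.
import subprocess, sys

def run_untrusted(snippet: str, timeout: int = 5) -> str:
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout or proc.stderr

print(run_untrusted("print(sum(range(10)))"))   # -> 45
```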

可观测性,确保全流程透明与管理:

AgentCore Observability 提供了全面的监控与洞察能力。它使运维人员能够清晰地追踪智能体的内部决策过程、工具调用链路及性能指标,这对于在复杂业务流中进行问题诊断、效能评估与必要的干预至关重要,是实现智能体负责任、可管理运行的技术基石。
Observability, Ensuring Full-Process Transparency and Management:

AgentCore Observability provides comprehensive monitoring and insight capabilities. It enables operations personnel to clearly trace an agent’s internal decision-making processes, tool invocation chains, and performance metrics. This is crucial for problem diagnosis, performance evaluation, and necessary intervention within complex business flows, serving as the technical foundation for the responsible and manageable operation of agents.
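Because agent traces are essentially spans around model calls and tool calls, a small OpenTelemetry-style sketch shows the kind of signal involved. The attribute names are our own convention; how AgentCore Observability ingests such telemetry should be checked against AWS documentation.

```python
# Sketch: wrapping a tool call in an OpenTelemetry span so decision steps are traceable.
from opentelemetry import trace

tracer = trace.get_tracer("agent.demo")

def call_tool(name: str, args: dict) -> str:
    with tracer.start_as_current_span("tool.invoke") as span:
        span.set_attribute("tool.name", name)
        span.set_attribute("tool.args", str(args))
        result = f"{name} ok"          # the real tool call goes here
        span.set_attribute("tool.result.length", len(result))
        return result

call_tool("lookup_order", {"order_id": "8861"})
```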

 


第三阶段:特定领域的“数字员工”——垂直场景的落地案例

Phase 3: Domain-Specific "Digital Employees" — Vertical Scenario Use Cases

智能体正在从“通用聊天机器人”进化为具备深厚领域专业知识的“数字员工”。它们不再只是回答问题,而是能够理解特定业务逻辑、执行复杂任务流,并作为团队成员与人类并肩工作。

AI Agents are evolving from "general-purpose chatbots" into "digital employees" with deep domain expertise. They no longer just answer questions; they understand specific business logic, execute complex workflows, and work alongside humans as integral team members.

以下是三个核心垂直场景的落地案例分析:

The following is an analysis of case studies across three core vertical scenarios:

1.    软件开发领域:从“辅助编码”到“自主交付”(Kiro)

Software Development: From "Assisted Coding" to "Autonomous Delivery" (Kiro)

在软件开发中,智能体已从单纯的“代码补全”跃升为规范驱动(Spec-driven)的“自主开发伙伴”。

In software development, agents have leaped from simple "code completion" to spec-driven autonomous development partners.

核心逻辑: 开发者通过自然语言描述目标,由 Kiro 生成详细的技术规范(Specs),再由智能体根据规范自主编写代码并执行测试。

惊人的效率提升: 亚马逊内部曾评估一个大型重构项目需要 30 名开发人员工作 18 个月;通过全面采用 Kiro 智能体进行自主开发,最终仅由 6 人在 76 天内即完成了交付。

自主解决积压任务: Kiro Autonomous Agent 作为“前沿智能体”(Frontier Agents),能够连续数天自主工作,处理缺陷分类、跨 15 个微服务库的库升级、以及提升代码覆盖率等琐碎且耗时的任务,让工程师保持在高效的“流状态”。

Essential Logic: Developers describe goals in natural language, Kiro generates detailed technical specifications (Specs), and then agents autonomously write code and execute tests based on those specs.

Stunning Efficiency Gains: Amazon internally assessed that a large-scale refactoring project would require 30 developers working for 18 months; by fully adopting Kiro agents for autonomous development, the project was ultimately delivered by just 6 people in only 76 days.

Autonomous Backlog Resolution: As "Frontier Agents," Kiro Autonomous Agents can work independently for days, handling tasks such as bug triaging, library upgrades across 15 microservice repositories, and improving code coverage—tedious and time-consuming tasks that allow engineers to remain in a high-efficiency "flow state."

2.    安全与运维领域:自动化的“守门人”与“消防员”(Security & DevOps Agent)

Security & Operations: Automated "Gatekeepers" and "Firefighters" (Security & DevOps Agent)

安全与运维智能体通过与开发流程的深度联动,实现了“编码-安全-运维”的自动化闭环。

Security and operations agents achieve an automated closed loop of "coding-security-operations" through deep linkage with the development process.

AWS Security Agent(安全顾问): 改变了以往每年仅执行几次渗透测试的低频模式。它能在上游阶段介入,主动审查设计文档,并在 GitHub 提交代码时扫描漏洞。它能将昂贵、耗时的渗透测试转变为按需执行的自动化流程,即时提供修复建议。

AWS DevOps Agent(在岗运维团队): 充当经验丰富的运维工程师。当系统告警响起,它能在人工介入前瞬时响应,通过关联 CloudWatch 或 Dynatrace 等观测数据追踪根因(例如发现是简单的 IAM 策略变更导致的错误),并直接提交修复建议供人工审批。

AWS Security Agent (Security Consultant): This has changed the low-frequency model of conducting only a few penetration tests per year. It can intervene at the upstream stage, proactively reviewing design documents and scanning for vulnerabilities when code is committed to GitHub. It transforms expensive, time-consuming penetration testing into an on-demand automated process, providing instant remediation suggestions.

AWS DevOps Agent (On-call Ops Team): Acting as an experienced operations engineer, it responds instantaneously to system alerts before human intervention. By correlating observability data from sources like CloudWatch or Dynatrace, it traces root causes (e.g., discovering an error caused by a simple IAM policy change) and directly submits fix recommendations for human approval.

3.    客户服务领域:个性化的“金牌客服”(Amazon Connect)

Customer Service: Personalized "Star Agents" (Amazon Connect)

Amazon Connect 正在通过智能体技术将联络中心重塑为企业的运营骨干。

Amazon Connect is reshaping contact centers into the operational backbone of enterprises through agent technology.

自主执行与语音进化: 集成 Nova Sonic 语音模型后,智能体能以极具自然感、带有音调变化的语音与客户沟通,自主处理复杂的退款或查询任务。

人类客服的实时导师: 智能体能实时分析对话背景和客户情绪,向人类员工推荐下一步操作,并自动完成文档准备工作。

Autonomous Execution and Voice Evolution: After integrating the Nova Sonic voice model, agents can communicate with customers using a highly natural voice with tonal variations, autonomously handling complex refund or inquiry tasks.

Real-time Mentor for Human Agents: Agents can analyze conversation context and customer sentiment in real-time, recommending the next best action to human employees and automatically completing document preparation.

落地成效:

Lyft 为超过 100 万名司机建立了“意图智能体”,通过后台数据直接处理收入异议等问题,使平均问题解决时间缩短了 87%。

欺诈调查案例: 智能体能在几分钟内自动分析数千笔跨地区的交易模式(原本需耗时数天),识别信用卡盗刷风险,并主动为客户提供更安全的旅行账户建议。

Implementation Results:

Lyft: Established "Intent Agents" for over 1 million drivers to directly handle issues such as earnings disputes using backend data, reducing average resolution time by 87%.

Credit Card Fraud Investigation: Agents can automatically analyze thousands of cross-regional transaction patterns in minutes (which would originally take days), identifying credit card fraud risks and proactively providing customers with safer travel account suggestions.

这些案例表明,通过将 AI 智能体部署到具体的专业领域,企业正从“测试阶段”进入产出真实投资回报(ROI)的“工业化应用阶段”。

These cases demonstrate that by deploying AI agents into specific professional domains, enterprises are moving from the "testing phase" into an industrial application phase that yields "real return on investment (ROI)".

 

第四阶段:治理与评估——让 Agent 可信、可控

Phase 4: Governance & Evaluation — Making Agents Trustworthy and Controllable

随着智能体(Agent)从简单的信息检索转向拥有更高权限的自主执行,企业的关注点正从传统的内容安全(防止有害言论)转向核心的行为治理。当智能体能够代表公司进行交易、访问敏感数据时,企业需要一种全新的管理哲学,将 AI 视为具备自主能力的员工进行管理。

As agents shift from simple information retrieval to autonomous execution with higher privileges, corporate focus is moving from traditional "content safety" (preventing harmful speech) to "behavioral governance". When agents can conduct transactions or access sensitive data on behalf of a company, enterprises need a new management philosophy that treats AI as employees with autonomous capabilities.

如果说早期的 AI 治理像是一个“内容过滤器”,只负责挡住脏话;那么现在的 AgentCore 治理就像是为一位高级经理建立的“审计制度”和“财务红线”。你不需要教他怎么写每一封邮件,但你必须确保他没有权限在未经审计的情况下签发百万美元的支票。

If early AI governance was like a "content filter" responsible only for blocking profanity, current AgentCore governance is like establishing an "auditing system" and "financial red lines" for a senior manager. You don't need to teach them how to write every email, but you must ensure they don't have the authority to sign a million-dollar check without an audit.

以下是实现智能体可信、可控的核心治理框架:

The following is the core governance framework for achieving trustworthy and controllable agents:

1.    董事会式治理 Board of Directors Governance

AWS 提出了一个类似于“养育青少年”的管理哲学。

AWS proposes a management philosophy akin to “raising a teenager.”

授权与边界: 管理者不应通过硬编码(Hard coding)来微观管理智能体的每一个步骤,因为这会扼杀其创造力和适应性。

管理模式: 就像 CEO 领导团队或父母对待开始驾车的孩子,人类应设定战略目标、决策边界和风险阈值,并建立检查机制(如环视相机或应用轨迹分析)来确保一切在轨道上运行。

Authorization and Boundaries: Managers should not micromanage every step of an agent through hard coding, as this would stifle its creativity and adaptability.

Management Mode: Much like a CEO leading a team or a parent dealing with a child who has started driving, humans should set strategic goals, decision boundaries, and risk thresholds, and establish inspection mechanisms (such as "surround cameras" or application trace analysis) to ensure everything stays on track.

2.    行为策略:设定确定性红线  Behavioral Policy: Setting Deterministic Red Lines

为了解决智能体生成代码和执行动作时的非确定性问题,AWS 推出了 Bedrock AgentCore Policy

To address the non-deterministic issues when agents generate code or execute actions, AWS launched Bedrock AgentCore Policy.

自然语言定义规则: 管理者可以使用自然语言设定具体的“红线”。例如:“如果退款金额超过 1000 美元,必须拦截并要求人工审批。”

神经符号 AI 的结合: 系统将自然语言自动转化为 Cedar这种确定性的授权策略语言。

实时拦截: 这种策略强制执行位于智能体应用代码之外,充当智能体与企业工具/数据之间的拦截层,在毫秒内验证每一个动作的合规性,确保其永不越权

Defining Rules with Natural Language: Managers can use natural language to set specific "red lines." For example: "If a refund amount exceeds $1,000, it must be intercepted and require human approval."

Neuro-symbolic AI Integration: The system automatically translates natural language into Cedar, a deterministic authorization policy language.

Real-time Interception: This policy enforcement resides outside the agent's application code, acting as an interception layer between the agent and corporate tools/data, verifying the compliance of every action within milliseconds to ensure it never "exceeds its authority."
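To make the idea tangible, here is a hedged sketch of the refund rule above expressed as a Cedar policy, embedded in a Python string. The action name and context attributes are invented for illustration; in practice AgentCore Policy generates and enforces the real policy outside the agent’s application code.

```python
# Hedged sketch: the natural-language rule "refunds over $1,000 require human approval"
# expressed as a Cedar policy. Action and context attribute names are made up here.
REFUND_GUARDRAIL = """
forbid (
    principal,
    action == Action::"IssueRefund",
    resource
) when {
    context.amount > 1000 && context.humanApproved == false
};
"""

print(REFUND_GUARDRAIL)  # in practice this is evaluated by the policy layer, not printed
```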

3.    持续评估:自动化的“绩效考核” Automated "Performance Reviews"

传统的预发布测试无法完全覆盖智能体在真实环境中的非确定性行为。AgentCore Evaluations 提供了一套自动化的持续质量巡检机制。

Traditional pre-release testing cannot fully cover an agent's non-deterministic behavior in real-world environments. AgentCore Evaluations provides an automated mechanism for continuous quality inspection.

全方位考核维度: 该服务提供 13 种预置评估器,涵盖正确性(Correctness)、帮助程度(Helpfulness)、有害性(Harmfulness)以及是否符合品牌调性(On brand)等关键指标。

闭环监控: 评估结果直接与 CloudWatch 集成,允许开发者实时监控生产环境中的智能体表现。

快速反馈: 当智能体升级模型或修改流程时,可以通过数千个模拟场景进行自动测试,确保质量不会下降,像进行年度绩效考核一样,确保数字员工始终称职。

Comprehensive Assessment Dimensions: The service provides 13 pre-built evaluators covering key metrics such as Correctness, Helpfulness, Harmfulness, and Brand Alignment.

Closed-loop Monitoring: Evaluation results are directly integrated with CloudWatch, allowing developers to monitor agent performance in production environments in real-time.

Rapid Feedback: When an agent's model is upgraded or its process modified, automatic testing can be conducted through thousands of simulated scenarios to ensure no degradation in quality—much like an "annual performance review" to ensure digital employees remain competent.
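A minimal Python sketch of the continuous-evaluation loop follows: replay simulated scenarios through the agent under test and score the answers. The scenario format, the toy “correctness” scorer, and the stand-in agent are all hypothetical, standing in for AgentCore’s pre-built evaluators and its CloudWatch integration.

```python
# Illustrative sketch of continuous evaluation with simulated scenarios.
SCENARIOS = [
    {"prompt": "Customer asks for a refund on order 8861", "must_mention": "refund"},
    {"prompt": "Customer asks about data privacy", "must_mention": "privacy"},
]

def fake_agent(prompt: str) -> str:
    # Stand-in for the deployed agent under test.
    return "I can process that refund and respect your privacy preferences."

def correctness(answer: str, scenario: dict) -> float:
    # Toy scorer: did the answer address the required topic?
    return 1.0 if scenario["must_mention"] in answer else 0.0

scores = [correctness(fake_agent(s["prompt"]), s) for s in SCENARIOS]
print(f"pass rate: {sum(scores) / len(scores):.0%}")  # alert (e.g., via CloudWatch) if this drops
```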

 

企业对 AI 管理权的重塑——从“监控对话”进化为“治理行为”。通过 Policy 设定即时红线,通过 Evaluations 进行长期质量审计,企业终于可以打破“概念验证困境”(POC Jail),放心地将智能体部署到核心业务中。

The reshaping of corporate AI management — evolving from "monitoring conversations" to "governing behavior." By setting immediate red lines via Policies and conducting long-term quality audits via Evaluations, enterprises can finally break free from "POC Jail" and confidently deploy agents into their core business.

