Amazon has introduced Nova Act, a
general-purpose AI agent that can carry out tasks in a web browser on its
own. Developed by Amazon's AGI Lab in San Francisco, Nova Act navigates web
pages, fills out forms, and interacts with other web elements, with the goal
of automating routine browser work.
To support developers in creating applications powered
by Nova Act, Amazon has released the Nova Act SDK. This toolkit enables
the development of AI agents that can execute step-by-step tasks in a browser
environment, such as submitting time-off requests or placing online orders. The
SDK is accessible via nova.amazon.com,
which also serves as a hub for Amazon's Nova foundation models.
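
To make "step-by-step tasks" concrete, here is a minimal sketch of how a time-off request might be broken into small instructions and run in order, stopping at the first failure. It does not call the Nova Act SDK: `run_steps`, `StepLog`, `TIME_OFF_STEPS`, and `fake_agent` are hypothetical names invented for illustration, and the `act` callable is an assumed stand-in for whatever function passes one natural-language instruction to a browser agent and reports whether it succeeded.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepLog:
    """Record of which step prompts were attempted and whether each succeeded."""
    completed: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)


def run_steps(act: Callable[[str], bool], steps: List[str]) -> StepLog:
    """Send each step to the agent in order, stopping at the first failure.

    `act` is a hypothetical stand-in for a call that hands one
    natural-language instruction to a browser agent and returns True on success.
    """
    log = StepLog()
    for step in steps:
        if act(step):
            log.completed.append(step)
        else:
            log.failed.append(step)
            break  # later steps depend on earlier ones, so do not continue
    return log


# Hypothetical time-off request, split into small, checkable steps.
TIME_OFF_STEPS = [
    "open the time-off request form",
    "set the start date to next Monday and the end date to next Friday",
    "choose 'Vacation' as the leave type",
    "submit the form",
]


def fake_agent(instruction: str) -> bool:
    """Dry-run agent: pretends every step except the final submission works."""
    return "submit" not in instruction


if __name__ == "__main__":
    result = run_steps(fake_agent, TIME_OFF_STEPS)
    print(len(result.completed), "steps completed;", len(result.failed), "failed")
```

Keeping each instruction short makes a failure easy to localize: the log shows exactly which step the agent could not complete, and nothing after it is attempted. In a real application the `fake_agent` function would be replaced by a call into the agent itself.
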
Nova Act is set to power key features in Amazon's
upcoming Alexa+ upgrade, an enhanced version of the voice assistant
incorporating generative AI capabilities. This integration aims to provide
users with more dynamic and efficient interactions, such as booking services or
making reservations through voice commands.
According to Amazon's own internal evaluations, Nova Act
outperforms comparable agents from OpenAI and Anthropic. On the ScreenSpot
Web Text benchmark, which measures how well an agent interacts with on-screen
text, Nova Act scored 94%, ahead of Anthropic's Claude 3.7 Sonnet (90%) and
OpenAI's Computer Use Agent (88%).
David Luan, Vice President of AGI Autonomy at Amazon
and former OpenAI engineer, emphasized the company's focus on building reliable
AI agents capable of independent actions. He highlighted that Nova Act
represents a foundational step toward achieving general intelligence in AI
systems.
The Nova Act SDK is available as a research preview for
experimentation and prototyping. Amazon is encouraging developers and tech
enthusiasts to explore Nova Act's capabilities and provide feedback to guide
its ongoing development.