Bolt42



Nvidia today launched an AI Blueprint for building AI agents that can analyze video, unveiled during CEO Jensen Huang’s opening keynote at CES 2025.

The new Nvidia AI Blueprint, powered by Metropolis, enables organizations and individuals to enhance productivity and safety, and may even assist Nvidia’s CEO in perfecting his fastball pitch.

The next significant advancement in AI is within reach — quite literally.

Currently, over 1.5 billion enterprise-level cameras are deployed globally, generating approximately 7 trillion hours of video each year. However, only a small percentage of this footage is actually analyzed.

It is estimated that less than 1% of video from industrial cameras is monitored live by humans, meaning crucial operational incidents can often go unnoticed.

This oversight comes at a significant cost. For instance, manufacturers lose trillions of dollars annually due to poor product quality or defects that could have been detected or even predicted earlier using AI agents that can perceive, analyze, and aid humans in taking timely actions.

Interactive AI agents equipped with visual perception capabilities can function as always-on video analysts. They assist factories in operating more efficiently, enhancing worker safety, ensuring smooth operations, and even improving an athlete’s performance.

To expedite the development of such agents, Nvidia today announced early access to a new version of the Nvidia AI Blueprint for video search and summarization. This blueprint, built on the Nvidia Metropolis platform and enhanced by Nvidia Cosmos Nemotron vision language models (VLMs), Nvidia Llama Nemotron large language models (LLMs), and Nvidia NeMo Retriever, provides developers with the necessary tools to build and deploy AI agents that can analyze vast amounts of video and image content.

The blueprint integrates the Nvidia AI Enterprise software platform — which includes Nvidia NIM microservices for VLMs, LLMs, and advanced AI frameworks for retrieval-augmented generation — to facilitate batch video processing that is 30 times faster than real-time viewing.

The blueprint encompasses various agentic AI features — such as chain-of-thought reasoning, task planning, and tool calling — that assist developers in streamlining the creation of powerful and diverse visual agents to address a wide array of challenges.

AI agents with video analysis capabilities can be combined with other agents that possess different skill sets, allowing for even more sophisticated agentic AI services.

Enterprises have the flexibility to build and deploy their AI agents anywhere from the edge to the cloud.

How Video Analyst AI Agents Can Help Industrial Businesses

AI agents equipped with visual perception and analysis skills can be tailored to assist businesses with industrial operations by:

  • Increasing productivity and reducing waste: Agents can ensure standard operating procedures are adhered to during intricate industrial processes like product assembly, and can analyze subtle actions and their sequence.
  • Boosting asset management efficiency through better space utilization: Agents can optimize warehouse inventory storage via 3D volume estimation and centralized understanding across various camera streams.
  • Improving safety through automated incident report generation: Agents can process large volumes of video and summarize it into contextual reports about accidents, as well as ensure compliance with personal protective equipment in factories to enhance workplace safety.
  • Preventing accidents and production issues: AI agents can identify atypical activities promptly to mitigate operational and safety risks, be it in a warehouse, factory, airport, or municipal intersection.
  • Learning from the past: Agents can search through operational video archives and relevant previous information to troubleshoot issues or develop new processes.

Video Analysts for Sports, Entertainment and More

Another sector where video analysis AI agents are set to make a significant impact is sports — a worldwide market valued at $500 billion, projected to grow substantially in the coming years.

Coaches, teams, and leagues — professional and amateur alike — rely on video analytics to assess and improve player performance, prioritize safety, and enhance fan engagement through platforms for player analytics and data visualization. With visually perceptive AI agents, athletes gain unprecedented access to insightful data and improvement opportunities.

During his CES keynote, Nvidia’s Huang showcased an AI video analytics agent that compared an amateur’s fastball pitching technique with a professional’s. Using video of the ceremonial first pitch Huang threw for the San Francisco Giants, the agent could offer suggestions for improvement.

The $3 trillion media and entertainment industry is also positioned to gain from video analysis AI agents. Through the Nvidia Media2 initiative, these agents will facilitate the creation of smarter, more personalized, and more impactful content that can adjust according to individual viewer preferences.

Worldwide Adoption and Availability

Partners around the world are integrating the blueprint for video analysis AI agents into their workflows, including Accenture, Infosys, Linker Vision, Pegatron, Tata Consultancy Services (TCS), Telit Cinterion, and VAST.
