A new artificial intelligence technology enhances video analysis by detecting human actions in real time.
The system has proven capable of analyzing chaotic video recordings with high accuracy.
Researchers at the School of Engineering and Applied Science at the University of Virginia have made strides in the visual capabilities of artificial intelligence with the development of a new video analyzer called the Semantic and Motion-Aware Spatiotemporal Transformer Network (SMAST). This system demonstrates remarkable accuracy in detecting human actions, paving the way for applications in areas such as public safety, movement tracking, and navigation for autonomous vehicles.
One of the standout features of SMAST is its ability to process complex video material, focusing on the most relevant parts of a scene. The system combines a multi-feature selective attention model with a motion-aware 2D positional encoding algorithm. These features work together to enable the AI to detect and interpret human actions with high precision.
The selective attention model allows SMAST to concentrate on critical elements, such as a person or a moving object, while ignoring irrelevant details. For example, it can differentiate between someone throwing a ball and another person simply raising their arm. Meanwhile, the motion-aware algorithm allows the AI to track movements over time, remembering how objects and people have moved within a scene. This enhances SMAST's ability to understand the relationships between different actions, making it more effective at recognizing complex behaviors.
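To make the two ideas above more concrete, here is a minimal Python sketch using NumPy. It is not the researchers' implementation. The function names, the sinusoidal encoding, the top-k selection rule, and the sample coordinates are all assumptions chosen for illustration. The sketch encodes each detected object's 2D position together with its displacement since the previous frame, then applies attention only to the highest-scoring objects.

```python
import numpy as np

def sinusoidal_encoding(values, dim):
    """Encode each scalar in `values` as `dim` sinusoidal features (dim must be even)."""
    freqs = 1.0 / (10000 ** (np.arange(0, dim, 2) / dim))
    angles = values[:, None] * freqs[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

def motion_aware_encoding(xy, prev_xy, dim=8):
    """Hypothetical encoding: position (x, y) plus displacement (dx, dy) per object."""
    dxy = xy - prev_xy  # how far each object moved since the previous frame
    parts = [
        sinusoidal_encoding(xy[:, 0], dim),
        sinusoidal_encoding(xy[:, 1], dim),
        sinusoidal_encoding(dxy[:, 0], dim),
        sinusoidal_encoding(dxy[:, 1], dim),
    ]
    return np.concatenate(parts, axis=1)  # shape: (num_objects, 4 * dim)

def selective_attention(query, keys, values, top_k=2):
    """Scaled dot-product attention restricted to the top_k best-matching keys."""
    scores = keys @ query / np.sqrt(query.shape[0])
    keep = np.argsort(scores)[-top_k:]        # indices of the most relevant objects
    masked = np.full_like(scores, -np.inf)    # everything else is ignored
    masked[keep] = scores[keep]
    weights = np.exp(masked - masked[keep].max())
    weights /= weights.sum()
    return weights @ values, weights

# Illustrative (made-up) object centers in the current and previous frame.
xy = np.array([[10.0, 20.0], [30.0, 5.0], [12.0, 22.0]])
prev_xy = np.array([[10.0, 20.0], [25.0, 5.0], [11.0, 21.0]])

tokens = motion_aware_encoding(xy, prev_xy)
context, weights = selective_attention(tokens[0], tokens, tokens, top_k=2)
print(tokens.shape, weights)
```

In this sketch, objects outside the top-k receive a weight of exactly zero, which is one simple way to model "ignoring irrelevant details". A real system would learn which features to attend to rather than rely on a fixed cutoff.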
In the realm of security and surveillance, this system can enhance public safety by detecting potential threats in real time. For instance, it can identify suspicious behaviors in crowded spaces or recognize if someone is in a dangerous situation. In healthcare, the technology has the potential to monitor patients' movements, facilitating better movement analysis for rehabilitation or monitoring during surgeries.
The researchers claim that SMAST stands out for its ability to handle chaotic, unedited footage. Thanks to its AI-driven approach, the system can learn from data, adapting to different environments and refining its action detection capabilities. SMAST has been evaluated on several academic benchmarks, including AVA, UCF101-24, and EPIC-Kitchens, achieving favorable results.
Professor Scott T. Acton, who serves as chair of the Department of Electrical and Computer Engineering, stated that this AI technology opens doors for real-time action detection in some of the most demanding environments. He emphasized that it represents an advancement that can help prevent accidents, improve diagnoses, and even save lives.