Stable vision from above: Enhanced environment perception for autonomous flight

July 10, 2025 | Fraunhofer IVI and Institut Polytechnique de Paris develop a method to improve the stability of video segmentation in autonomous aerial vehicles


Semantic Similarity Propagation compared to other video segmentation methods: significantly higher temporal stability and segmentation quality.

To enable autonomous flight, aerial systems must be able to perceive their surroundings and continuously classify objects in real time. This requires the use of various sensors – including RGB cameras – whose heterogeneous data must be reliably processed and made available for autonomous flight control. With the help of AI-based semantic segmentation, the system can divide its environment into meaningful categories, such as roads, buildings, vegetation, or obstacles, enabling informed decisions during tasks like obstacle avoidance, emergency landings, or the inspection of critical infrastructure like power lines. However, under changing lighting conditions, rapid movements, or limited training data, the quality of video analysis tends to fluctuate.
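At its core, semantic segmentation assigns every pixel of a camera frame to one of a fixed set of classes. A minimal sketch of that inference step, assuming a network has already produced per-pixel class scores (the class names here are illustrative, not the actual label set used by the researchers):

```python
import numpy as np

# Illustrative class labels for aerial scenes (not the dataset's actual labels).
CLASSES = ["road", "building", "vegetation", "obstacle"]

def segment(logits: np.ndarray) -> np.ndarray:
    """Per-pixel semantic segmentation: pick the highest-scoring class.

    logits: (H, W, C) array of class scores from a segmentation network.
    Returns an (H, W) array of class indices.
    """
    return logits.argmax(axis=-1)

# Toy example: a 2x2 frame with four class scores per pixel.
logits = np.array([
    [[0.9, 0.0, 0.1, 0.0], [0.1, 0.8, 0.0, 0.1]],
    [[0.0, 0.1, 0.9, 0.0], [0.2, 0.1, 0.0, 0.7]],
])
label_map = segment(logits)
print([[CLASSES[i] for i in row] for row in label_map])
# → [['road', 'building'], ['vegetation', 'obstacle']]
```

Downstream flight logic then reasons over this label map, for example treating contiguous "road" regions as candidate emergency-landing zones.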

A new method called Semantic Similarity Propagation (SSP) now offers a solution: it ensures that the AI’s “understanding” of its surroundings remains consistent across multiple video frames, even when the camera is in motion. The method was developed by researchers at Fraunhofer IVI and the Institut Polytechnique de Paris, and presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), held June 11–15, 2025, in Nashville, USA.

Taehyoung Kim at the CVPR 2025 conference
Poster presentation at CVPR 2025 in Nashville, USA: Taehyoung Kim, research scientist at the Fraunhofer Application Center "Connected Mobility and Infrastructure".

Predictive learning from neighboring frames

The SSP method not only analyzes the current camera frame but intelligently compares it to previous ones. By recognizing semantic similarities between frames, the AI detects what remains unchanged and what has shifted. The result: fewer prediction jumps, more stable object tracking, and more reliable segmentations.
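The idea of reusing predictions where consecutive frames look alike can be sketched as a similarity-weighted blend. This is an illustrative simplification, not the published SSP algorithm: it assumes per-pixel feature embeddings from the network and fuses the previous frame's class scores with the current ones according to their cosine similarity.

```python
import numpy as np

def propagate(prev_logits, curr_logits, prev_feat, curr_feat):
    """Blend current predictions with the previous frame's, weighted by
    how semantically similar each pixel's feature is across frames.

    Illustrative sketch only -- the published SSP method differs in detail.

    prev_logits, curr_logits: (H, W, C) class scores for two frames.
    prev_feat, curr_feat:     (H, W, D) per-pixel feature embeddings.
    """
    # Cosine similarity between corresponding pixel features.
    num = (prev_feat * curr_feat).sum(axis=-1)
    den = (np.linalg.norm(prev_feat, axis=-1)
           * np.linalg.norm(curr_feat, axis=-1) + 1e-8)
    sim = np.clip(num / den, 0.0, 1.0)[..., None]  # (H, W, 1)

    # Where frames look alike, trust the previous prediction more;
    # where they differ, rely on the fresh prediction.
    return sim * prev_logits + (1.0 - sim) * curr_logits
```

With identical features the previous prediction is kept unchanged, which is what suppresses frame-to-frame "jumps" in static parts of the scene; where the scene genuinely changes, the current prediction dominates.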

In tests using specialized aerial datasets, SSP improved the temporal consistency of segmentation results by up to 12.5 percent without compromising processing speed. This combination of stability, accuracy, and efficiency makes the approach especially promising for real-world applications such as disaster response, precision agriculture, or traffic monitoring.
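Temporal consistency, the quantity the 12.5 percent figure refers to, can be approximated by how often a pixel keeps its predicted class from one frame to the next. A deliberately simplified proxy follows; real benchmarks typically warp frames with optical flow before comparing, so a static scene is assumed here:

```python
import numpy as np

def temporal_consistency(label_maps):
    """Fraction of pixels whose predicted class is unchanged between
    consecutive frames, averaged over the sequence.

    Simplified proxy: actual evaluations usually compensate for camera
    motion (e.g. via optical flow) before comparing label maps.
    """
    pairs = zip(label_maps, label_maps[1:])
    scores = [(a == b).mean() for a, b in pairs]
    return float(np.mean(scores))

# Two 2x2 label maps that agree on 3 of 4 pixels.
a = np.array([[0, 1], [2, 3]])
b = np.array([[0, 1], [2, 0]])
print(temporal_consistency([a, b]))  # → 0.75
```

A higher score means fewer flickering labels between frames, i.e. more stable object tracking for the flight controller.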

The results clearly show that methods like SSP make onboard video analysis in autonomous aerial systems significantly more robust and reliable – an important step toward safer and more efficient autonomous flight.