Welcome to the Scene Understanding for Autonomous Drone Challenge

This challenge aims to foster the development of fully autonomous Unmanned Aircraft Systems (UAS).

This project’s two key computer vision components are semantic segmentation and depth perception.

This challenge includes the release of a new dataset of drone images for benchmarking semantic segmentation and mono-depth perception. The images in this dataset show realistic backyard scenarios with varied content and were taken at a range of heights Above Ground Level (AGL).

This year’s challenge has two tasks: Semantic Segmentation & Mono Depth Perception.

Semantic segmentation is the labelling of the pixels of an image according to the category of the object to which they belong. The output for this task is an image in which each pixel has the value of the class it represents.

For this task, we focus on labels that matter for a safe landing, such as the location of humans and animals, round or flat surfaces, tall grass and water elements, and vehicles. The complete list of labels is: [WATER, ASPHALT, GRASS, HUMAN, ANIMAL, HIGH_VEGETATION, GROUND_VEHICLE, FAÇADE, WIRE, GARDEN_FURNITURE, CONCRETE, ROOF, GRAVEL, SOIL, PRIMEAIR_PATTERN, SNOW].
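
To make the expected output concrete, here is a minimal sketch of turning per-class model scores into a label image. It assumes NumPy and assumes that class ids follow the order of the list above; that numbering is an assumption for illustration, so check the starter kit for the actual ids and file format. The helper names `scores_to_label_map` and `class_histogram` are hypothetical.

```python
import numpy as np

# Class names copied from the challenge's label list. The index given to each
# name (its position in this list) is an ASSUMPTION, not the official id.
CLASSES = [
    "WATER", "ASPHALT", "GRASS", "HUMAN", "ANIMAL", "HIGH_VEGETATION",
    "GROUND_VEHICLE", "FAÇADE", "WIRE", "GARDEN_FURNITURE", "CONCRETE",
    "ROOF", "GRAVEL", "SOIL", "PRIMEAIR_PATTERN", "SNOW",
]


def scores_to_label_map(scores):
    """Collapse per-class scores of shape (H, W, C) into an (H, W) label image."""
    scores = np.asarray(scores)
    if scores.ndim != 3 or scores.shape[2] != len(CLASSES):
        raise ValueError(f"expected shape (H, W, {len(CLASSES)}), got {scores.shape}")
    # Each pixel takes the index of its highest-scoring class.
    return np.argmax(scores, axis=2).astype(np.uint8)


def class_histogram(label_map):
    """Count pixels per class name, skipping classes that do not occur."""
    counts = np.bincount(np.asarray(label_map).ravel(), minlength=len(CLASSES))
    return {name: int(n) for name, n in zip(CLASSES, counts) if n > 0}
```

A quick histogram of a predicted label image is a cheap sanity check before submitting, for example to spot a model that predicts a single class everywhere.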


Depth estimation measures the distance between the camera and the objects in the scene. It is an important perception task for an autonomous aerial drone. With a pair of stereo cameras, this task can be solved with stereo vision methods. This challenge instead aims to create a model that predicts the depth of every pixel from the image of a single camera.

The output of this task must be an image of equal size to the input image, in which every pixel contains a depth value.
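
The size requirement above can be checked before submitting. The following is a minimal sketch, assuming NumPy; the function name `check_depth_output` is hypothetical, and the depth unit, data type, and the rule that every pixel must be finite are assumptions for illustration rather than part of the challenge specification.

```python
import numpy as np


def check_depth_output(image, depth):
    """Validate that `depth` holds one depth value per pixel of `image`.

    `image` is (H, W) or (H, W, channels); `depth` must be (H, W).
    Returns the depth map as float32 (the dtype is an assumption).
    """
    image = np.asarray(image)
    depth = np.asarray(depth)
    # The output must have the same height and width as the input image.
    if depth.shape != image.shape[:2]:
        raise ValueError(
            f"depth shape {depth.shape} does not match image size {image.shape[:2]}"
        )
    # Assumption: a usable prediction has no NaN or infinite values.
    if not np.all(np.isfinite(depth)):
        raise ValueError("depth map contains NaN or infinite values")
    return depth.astype(np.float32)
```

If a model predicts at a lower resolution, resize its output back to the input size before running a check like this one.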

:sparkles: The challenge is now live

:closed_book: Starter Kit

Check out these easy-to-follow starter kits to get familiar with the documentation, submission flow, and setup. They will help you make your first submission.

  1. Semantic Segmentation
  2. Depth Perception

:muscle: Baselines

Don’t know where to start? Check out these baselines and make your first submission.

  1. Semantic Segmentation
  2. Depth Perception

:speech_balloon: Find Teammates
:memo: Share your feedback and queries

All The Best,
Team SUADD’23