The SynBASe dataset is a collection of 1,295 synthetic images designed to alleviate the scarcity of data for maritime search and rescue tasks carried out from UAVs. It was built with Unreal Engine 4 (https://www.unrealengine.com/), a game-development engine with plug-ins for high-quality synthetic data generation, to imitate the characteristics of a selected subset of images from the SeaDronesSee dataset. Each image contains one or more targets under different scenarios and illumination conditions. Since the objective is the search and rescue of people at sea, we consider a single category, “swimmer”; all other objects are left unlabeled in the annotation files. The following table compares the two datasets:
Specification | SeaDronesSee | SynBASe |
---|---|---|
Number of images (train/val/test) | 966 (677 / 96 / 193) | 1,295 (907 / 129 / 259) |
Image size (px) | 1,280×960 to 5,456×3,632 | 1,200×1,200 |
Total number of bodies | 2,317 | 3,415 |
Range of the number of bodies per image | [1, 12] | [1, 12] |
Avg. number of bodies per image | 3.42 ± 2.78 | 3.76 ± 2.82 |
Avg. relative area of bounding boxes (%) | 0.012 ± 0.020 | 0.015 ± 0.005 |
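To give a sense of scale for the relative-area row, the following minimal sketch computes the percentage of the image covered by a bounding box. It assumes that "relative area" means box area divided by image area; the function name and the 15×15 px example box are illustrative and do not come from the dataset.

```python
def relative_area_percent(box_w, box_h, img_w, img_h):
    """Return the share of the image covered by a box, as a percentage.

    Assumption: relative area = (box area) / (image area) * 100.
    """
    return 100.0 * (box_w * box_h) / (img_w * img_h)


# Hypothetical 15x15 px box in a 1,200x1,200 px SynBASe image.
print(relative_area_percent(15, 15, 1200, 1200))  # 0.015625
```

Under that assumption, the average of 0.015% reported for SynBASe corresponds to targets of roughly 15×15 px in a 1,200×1,200 px image.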
Furthermore, we introduced synthetic variations into the images to simulate adverse conditions that a search and rescue mission may face, such as fog, rain, low lighting, and reflections caused by sunsets. The code required to generate both these new domains and the data is also available for download in the shared repository.
Some examples of each dataset are presented in the figure below, which shows a comparison between the two distributions:
The SeaDronesSee corpus is a collection of high-resolution real images specifically designed for the development of at-sea search and rescue systems. Particularly relevant to this study is the "swimmers" class, which refers to people in the water. We applied additional filtering to these images to select only the samples suitable for the task, discarding those taken with an inclination of less than 67 degrees with respect to the axis perpendicular to the sea, along with those taken at altitudes of less than 40 meters. The list of filenames used can be found in the download link provided.
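The filtering rule described above can be sketched as follows. This is only an illustration of the two thresholds, not the selection script actually used: the record layout and the keys `inclination_deg` and `altitude_m` are hypothetical, and the published list of filenames remains the authoritative selection.

```python
MIN_INCLINATION_DEG = 67.0  # threshold stated in the text
MIN_ALTITUDE_M = 40.0       # threshold stated in the text


def is_suitable(sample):
    """Keep a sample unless it falls below either threshold.

    `sample` is a hypothetical dict with per-image capture metadata.
    """
    return (
        sample["inclination_deg"] >= MIN_INCLINATION_DEG
        and sample["altitude_m"] >= MIN_ALTITUDE_M
    )


def filter_samples(samples):
    """Return only the samples that pass both thresholds."""
    return [s for s in samples if is_suitable(s)]


# Made-up records, for illustration only.
records = [
    {"file": "a.jpg", "inclination_deg": 80.0, "altitude_m": 55.0},
    {"file": "b.jpg", "inclination_deg": 60.0, "altitude_m": 55.0},
    {"file": "c.jpg", "inclination_deg": 80.0, "altitude_m": 20.0},
]
print([r["file"] for r in filter_samples(records)])  # ['a.jpg']
```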
The data used can be found in the file structure of the download link provided. The SynBASe folder contains the images and annotations folders. The images are in RGB format, stored as PNG, and split into partitions and into the new adverse domains. The annotations are provided in three different formats, following the COCO, YOLO and PASCAL-VOC standards. For the SeaDronesSee dataset, there is a folder with the list of images used for each partition.
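Because the three annotation formats encode the same box differently, a short sketch of how they relate may help when switching between them. It relies on the usual conventions: COCO stores `[x_min, y_min, width, height]` in pixels, YOLO stores the box center and size normalized by the image dimensions, and PASCAL-VOC stores the corner coordinates in pixels. The function names and the example box are illustrative and are not part of the dataset tooling.

```python
def coco_to_yolo(bbox, img_w=1200, img_h=1200):
    """Convert a COCO box [x_min, y_min, w, h] (pixels) to YOLO
    (x_center, y_center, w, h), normalized to the range [0, 1].

    The default image size matches the 1,200x1,200 px SynBASe images.
    """
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def coco_to_voc(bbox):
    """Convert a COCO box to PASCAL-VOC corners (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


# Hypothetical swimmer box, for illustration only.
box = [600, 300, 24, 12]
print(coco_to_yolo(box))
print(coco_to_voc(box))  # (600, 300, 624, 312)
```

In a YOLO label file the normalized values are preceded by the class index; with a single category this would typically be 0, although the files in the download should be checked to confirm it.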
In addition, you can find the code to generate the domains in adverse conditions, as well as a README file with instructions to execute it.
To download the SynBASe dataset, as well as the list of images from the SeaDronesSee dataset and the code to generate the domains in adverse weather conditions, use the following link:
If you use the SynBASe dataset or any part of it, please cite the following publication:
@article{martinez2024synbase,
  title = {On the use of synthetic data for body detection in maritime search and rescue operations},
  author = {Juan P. Martinez-Esteso and Francisco J. Castellanos and Adrian Rosello and J. Calvo-Zaragoza and Antonio Javier Gallego},
  journal = {Engineering Applications of Artificial Intelligence},
  volume = {139},
  pages = {109586},
  issn = {0952-1976},
  year = {2025}
}
This work was supported by the Generalitat Valenciana (GV) through the TADMar project, INVEST/2022/450.