Top View Deep Learning Object Detection using Active Perception in Construction Environment
To increase the situational awareness of the crane operator, this thesis develops vision-based deep learning object detection from the crane load view, using adaptive perception in the construction area. Conventional worker detection methods rely on simple shape or color features of the worker's appearance. These methods can fail to recognize workers who do not wear protective gear. Finding an image representation of the object from the top view is crucial, yet doing so manually with handcrafted features is demanding. We therefore employed deep learning methods to learn those features automatically. To yield optimal results, deep learning methods require large amounts of data. Because data is scarce, especially in the construction domain, we developed a photorealistic world to generate data in addition to our samples collected from a real construction area. The simulation platform not only provides diverse data types but also allows research to proceed concurrently, which speeds up the pipeline at low cost. Our findings indicate that combining synthetic and real training samples improves the performance of a state-of-the-art detector. In line with previous studies on bridging the gap between synthetic and real data, preprocessed synthetic images yield substantially better results than the raw data, by approximately 10%. Finding the right deep learning model for load-view detection is challenging. An investigation of our training data shows that the majority of bounding boxes are very small and set against a complex background. In addition, we gave priority to speed over accuracy, based on the construction safety criteria. As a result, RetinaNet was chosen out of the three primary object detection models. Nevertheless, a data-driven detection algorithm can fail to achieve scale invariance, especially when the detector's input size varies over an extremely wide range.
The adaptive zoom feature can enhance the quality of worker detection. To avoid further data gathering and extensive retraining, the proposed automatic zoom method for the load-view crane camera supports the deep learning algorithm, specifically where scale variance is high. A finite state machine implements the control strategy that adapts the zoom level, coping not only with inconsistent detections but also with abrupt camera movement during lifting operations. Consequently, the detector is able to detect small objects through smooth, continuous zoom control without additional training. The adaptive zoom control not only enhances the performance of top-view object detection but also reduces the crane operator's interaction with the camera system, lowering the risk of fatality during load lifting operations.
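The abstract does not give the states or transitions of the finite state machine, so the following is only a minimal sketch of how such a zoom controller could look. The state names, the thresholds, and the `ZoomController` class are all illustrative assumptions, not the design used in the thesis. The input `box_frac` is assumed to be the detected bounding-box height divided by the image height.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ZoomState(Enum):
    """Hypothetical states; the thesis may define different ones."""
    HOLD = auto()      # object size acceptable or short detection gap
    ZOOM_IN = auto()   # detected object is too small
    ZOOM_OUT = auto()  # object too large, lost for long, or camera jolted


@dataclass
class ZoomController:
    # All thresholds below are assumed values for illustration only.
    min_box_frac: float = 0.05  # smallest acceptable box height / image height
    max_box_frac: float = 0.20  # largest acceptable box height / image height
    zoom_step: float = 0.1      # small step to keep the zoom change smooth
    min_zoom: float = 1.0
    max_zoom: float = 10.0
    max_missed: int = 5         # tolerated consecutive frames without detection
    zoom: float = 1.0
    missed: int = 0
    state: ZoomState = ZoomState.HOLD

    def update(self, box_frac=None, abrupt_motion=False):
        """Advance the state machine by one frame and return the new zoom."""
        if abrupt_motion:
            # Widen the view so the object can be found again.
            self.missed = 0
            self.state = ZoomState.ZOOM_OUT
        elif box_frac is None:
            # Tolerate short gaps; react only to a sustained loss.
            self.missed += 1
            if self.missed > self.max_missed:
                self.state = ZoomState.ZOOM_OUT
            else:
                self.state = ZoomState.HOLD
        else:
            self.missed = 0
            if box_frac < self.min_box_frac:
                self.state = ZoomState.ZOOM_IN
            elif box_frac > self.max_box_frac:
                self.state = ZoomState.ZOOM_OUT
            else:
                self.state = ZoomState.HOLD

        # Apply the action of the current state, clamped to the zoom range.
        if self.state is ZoomState.ZOOM_IN:
            self.zoom = min(self.max_zoom, self.zoom + self.zoom_step)
        elif self.state is ZoomState.ZOOM_OUT:
            self.zoom = max(self.min_zoom, self.zoom - self.zoom_step)
        return self.zoom
```

In this sketch, a single missed frame leaves the zoom unchanged, which is one simple way to absorb inconsistent detections, while a flagged abrupt movement immediately steps the zoom back out. How abrupt movement is detected is left open here.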
Author: | Tanittha Sutjaritvorakul |
---|---|
URN: | urn:nbn:de:hbz:386-kluedo-80264 |
DOI: | https://doi.org/10.26204/KLUEDO/8026 |
Advisor: | Karsten Berns |
Document Type: | Doctoral Thesis |
Cumulative document: | No |
Language of publication: | English |
Date of Publication (online): | 2024/04/15 |
Year of first Publication: | 2024 |
Publishing Institution: | Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau |
Granting Institution: | Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau |
Acceptance Date of the Thesis: | 2024/03/06 |
Date of the Publication (Server): | 2024/04/17 |
Page Number: | VIII, 181 |
Faculties / Organisational entities: | Kaiserslautern - Fachbereich Informatik |
DDC-Classification: | 0 General works, computer science, information science / 004 Computer science |
Licence (German): | Creative Commons 4.0 - Attribution, NonCommercial (CC BY-NC 4.0) |