Top View Deep Learning Object Detection using Active Perception in Construction Environment

To increase the situational awareness of crane operators, this thesis develops vision-based deep learning object detection from the crane load view, supported by adaptive perception, for construction sites. Conventional worker detection methods rely on simple shape or color features of the worker's appearance. These methods can fail to recognize workers who do not wear protective gear. Determining an image representation of objects from the top view manually, through handcrafted features, is a crucial step. We therefore employed deep learning methods to learn those features automatically. To yield optimal results, deep learning methods require large amounts of data. Because data are scarce, especially in the construction domain, we developed a photorealistic simulated world to generate data in addition to the samples we collected on real construction sites. The simulation platform offers not only diverse data types but also concurrent research and development, which speeds up the pipeline at low cost. Our findings indicate that combining synthetic and real training samples improves a state-of-the-art detector. In line with previous studies on bridging the gap between synthetic and real data, preprocessed synthetic images yield substantially better results than raw synthetic data, by approximately 10%. Finding the right deep learning model for load-view detection is challenging. An investigation of our training data shows that the majority of bounding boxes are very small and appear against complex backgrounds. In addition, we prioritized speed over accuracy based on construction safety criteria. RetinaNet was finally chosen from three primary object detection models. Nevertheless, a data-driven detection algorithm can fail to achieve scale invariance, especially when the detector's input size varies over an extremely wide range.
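The observation about small bounding boxes can be illustrated with a short sketch that buckets annotated boxes by pixel area. The thresholds follow the COCO size convention (small below 32², large above 96²); the thesis does not state which convention it used, so the function names `size_bucket` and `size_distribution` and the example boxes are purely hypothetical.

```python
from collections import Counter

# COCO-style area thresholds in pixels^2 (an assumption, not taken from the thesis).
SMALL_MAX_AREA = 32 ** 2
MEDIUM_MAX_AREA = 96 ** 2


def size_bucket(width, height):
    """Classify one bounding box by its pixel area."""
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def size_distribution(boxes):
    """Return the fraction of boxes per size bucket for (width, height) pairs."""
    counts = Counter(size_bucket(w, h) for w, h in boxes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: counts[name] / total for name in ("small", "medium", "large")}


# Made-up example boxes, only to show the call.
example_boxes = [(10, 12), (20, 20), (50, 50), (120, 100)]
distribution = size_distribution(example_boxes)
```

A distribution dominated by the "small" bucket is one signal that a detector designed for small objects and cluttered backgrounds is needed.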
The adaptive zoom feature can enhance the quality of worker detection. To avoid further data gathering and extensive retraining, the proposed automatic zoom method for the load-view crane camera supports the deep learning algorithm, specifically for the problem of high scale variance. A finite state machine implements the control strategy, adapting the zoom level to cope not only with inconsistent detections but also with abrupt camera movement during lifting operations. Consequently, the detector can detect small objects through smooth, continuous zoom control without additional training. The adaptive zoom control not only enhances the performance of top-view object detection but also reduces the crane operator's interaction with the camera system, which lowers the risk of fatal accidents during load lifting operations.
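A finite-state zoom controller of the kind described above might look like the following minimal sketch. It is not the controller from the thesis: the three states, the thresholds, the `box_frac` input (detected box height divided by frame height) and the `camera_moving` flag are all illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ZoomState(Enum):
    HOLD = auto()      # keep the current zoom level
    ZOOM_IN = auto()   # object appears too small
    ZOOM_OUT = auto()  # object too large, lost, or camera moving abruptly


@dataclass
class ZoomController:
    # All numeric values are hypothetical placeholders, not thesis parameters.
    min_box_frac: float = 0.05  # below this, start zooming in
    target_frac: float = 0.10   # stop zooming once this size is reached
    max_box_frac: float = 0.20  # above this, start zooming out
    lost_patience: int = 5      # missed frames tolerated before zooming out
    step: float = 0.1           # small per-frame change keeps the zoom smooth
    min_zoom: float = 1.0
    max_zoom: float = 10.0
    zoom: float = 1.0
    state: ZoomState = ZoomState.HOLD
    missed: int = 0

    def update(self, box_frac: Optional[float], camera_moving: bool = False) -> float:
        """Advance one frame; box_frac is None when nothing was detected."""
        self.missed = 0 if box_frac is not None else self.missed + 1
        lost = self.missed > self.lost_patience

        if camera_moving or lost:
            self.state = ZoomState.ZOOM_OUT   # widen the view to recover
        elif box_frac is None:
            self.state = ZoomState.HOLD       # brief dropout: freeze the zoom
        elif self.state is ZoomState.HOLD:
            if box_frac < self.min_box_frac:
                self.state = ZoomState.ZOOM_IN
            elif box_frac > self.max_box_frac:
                self.state = ZoomState.ZOOM_OUT
        elif self.state is ZoomState.ZOOM_IN:
            if box_frac >= self.target_frac:
                self.state = ZoomState.HOLD
        elif self.state is ZoomState.ZOOM_OUT:
            if box_frac <= self.target_frac:
                self.state = ZoomState.HOLD

        if self.state is ZoomState.ZOOM_IN:
            self.zoom = min(self.max_zoom, self.zoom + self.step)
        elif self.state is ZoomState.ZOOM_OUT:
            self.zoom = max(self.min_zoom, self.zoom - self.step)
        return self.zoom
```

In this sketch the gap between `min_box_frac` and `target_frac` acts as hysteresis, so a detection that flickers around one threshold does not make the zoom oscillate, and the `lost_patience` counter tolerates a few inconsistent detections before the view is widened.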

Metadata
Author: Tanittha Sutjaritvorakul
URN: urn:nbn:de:hbz:386-kluedo-80264
DOI: https://doi.org/10.26204/KLUEDO/8026
Advisor: Karsten Berns
Document type: Dissertation
Cumulative document: No
Language of publication: English
Date of publication (online): 15 April 2024
Year of first publication: 2024
Publishing institution: Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau
Degree-granting institution: Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau
Date of thesis acceptance: 6 March 2024
Date of publication (server): 17 April 2024
Number of pages: VIII, 181
Departments / Organizational units: Kaiserslautern - Department of Computer Science
DDC subject groups: 0 General works, computer science, information science / 004 Computer science
License (German): Creative Commons 4.0 - Attribution, NonCommercial (CC BY-NC 4.0)