Deep Convolutional Neural Network Design Approach for 3D-Object Detection for Robotic Grasping
Recognition technology has reached state-of-the-art performance with the advent of the Deep Convolutional Neural Network (DCNN), and with these achievements in computer vision, machine learning, and 3D sensing, industry is on the verge of a new era of automation. However, object detection for robotic grasping remains difficult: varying environments, low illumination, occlusion, and partially visible objects reduce accuracy and slow detection. In this thesis, an approach is presented to recognize objects in an industrial warehouse through robotic vision, advancing warehouse automation with a robot arm. A multimodal architecture is designed to serve as the base (backbone) network of the Single Shot Detector (SSD) to address warehouse challenges such as partial images and low illumination. This architecture takes Red-Green-Blue (RGB) and depth images as input and produces a single output. Most prior work uses the Visual Geometry Group (VGG) network, the Residual Network (ResNet), or MobileNet as the detection backbone, but in this thesis a new DCNN architecture is designed for the specific task of grasping. Four different Red-Green-Blue-Depth (RGB-D) deep neural network architectures are designed and compared on training, testing, and other statistical evaluation metrics. The motivation for the DCNN development is to recognize object images obtained from depth-sensor cameras in warehouses. This research focuses on four objects: 'Bowl,' 'School glue,' 'Dove bar,' and 'Soda can.' Furthermore, details of the designed model, its performance, and the results are discussed in this thesis.
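The abstract states that the multimodal backbone takes RGB and depth images as input and produces a single output, but does not specify the fusion strategy. The following is a minimal sketch of one common option, early fusion by channel concatenation; the function name and the use of NumPy are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def fuse_rgbd(rgb, depth):
    """Stack an RGB image (H, W, 3) with a depth map (H, W) into a
    single 4-channel RGB-D array (H, W, 4), a typical input format
    for a multimodal backbone network (assumed fusion scheme)."""
    depth = depth[..., np.newaxis]              # (H, W) -> (H, W, 1)
    return np.concatenate([rgb, depth], axis=-1)

# Example: a 480x640 frame from a depth-sensor camera.
rgb = np.zeros((480, 640, 3), dtype=np.float32)
depth = np.ones((480, 640), dtype=np.float32)
rgbd = fuse_rgbd(rgb, depth)
print(rgbd.shape)  # (480, 640, 4)
```

Other fusion schemes (e.g., separate RGB and depth branches merged at a later layer) are equally plausible readings of "multimodal architecture"; the choice affects how pretrained weights can be reused.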