Sign language recognition has been an active research field for almost two decades. From early recognition based on electrical signals to modern recognition using deep learning techniques, researchers all over the world have tried to automate this task. While sign language recognition could be seen as a simple gesture recognition problem, sign language does not translate to spoken language word by word. Whereas translation of sign languages simply aims to detect the individual words from the individual signs used while signing a sentence, recognition mainly refers to detecting the complete, meaningful text sentence communicated with signs. In this thesis, we address this translation issue and demonstrate several solution approaches. Our main aim is keypoint-detection-based sign language recognition (SLR) that infers the meaning the signer wants to communicate by generating captions. We use MediaPipe to collect hand keypoints from images and OpenPose to collect holistic pose keypoints from videos. We work with American Sign Language (ASL), specifically an ASL image data set and the How2Sign data set of ASL videos. We use a fully connected neural network with the ReLU activation function to detect alphabet gestures from images, and achieve an accuracy of 83% and a precision of 90% for the recognition of individual letters. Additionally, we test the recognition on images captured through a webcam. We also provide the architecture of a model that uses transformer cells to recognize complete sentences from sign language videos.
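As a rough illustration of the letter classifier described above, the following minimal sketch runs the forward pass of a small fully connected network with ReLU over one hand's flattened keypoints. It is not the thesis implementation: the hidden size, the 26-class output, the helper names (`init_layer`, `dense`, `predict`), and the random stand-in input are all assumptions made for illustration, and the weights are untrained. The only detail taken from MediaPipe is that its hand model reports 21 landmarks with x, y, z coordinates each.

```python
import math
import random

NUM_LANDMARKS = 21        # MediaPipe Hands reports 21 landmarks per hand
COORDS_PER_LANDMARK = 3   # x, y, z
INPUT_DIM = NUM_LANDMARKS * COORDS_PER_LANDMARK  # 63 input features
HIDDEN_DIM = 32           # assumed hidden size, not taken from the thesis
NUM_CLASSES = 26          # assumed: one class per letter A-Z


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with random weights."""
    scale = math.sqrt(2.0 / n_in)  # He-style scaling, commonly paired with ReLU
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Affine map: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over a list of scores."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def predict(keypoints, layers):
    """Forward pass: ReLU on hidden layers, softmax on the output layer."""
    h = keypoints
    for layer in layers[:-1]:
        h = relu(dense(h, layer))
    probs = softmax(dense(h, layers[-1]))
    best = max(range(len(probs)), key=probs.__getitem__)
    return chr(ord("A") + best), probs


rng = random.Random(0)
layers = [init_layer(INPUT_DIM, HIDDEN_DIM, rng),
          init_layer(HIDDEN_DIM, NUM_CLASSES, rng)]

# Stand-in for one hand's flattened landmarks; real values would come from
# a keypoint detector such as MediaPipe.
sample = [rng.random() for _ in range(INPUT_DIM)]
letter, probs = predict(sample, layers)
print(letter, len(probs))
```

Because the weights here are random, the predicted letter carries no meaning; the sketch only shows how 63 keypoint coordinates would flow through ReLU layers to a per-letter probability distribution.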