Pi-Reader
- Yahvin Gali
- Feb 3, 2020
Updated: Sep 22, 2021
Raspberry Pi based Optical Reader for the Blind and Visually Impaired using OCR and TTS.
Blindness and vision impairment can result from congenital conditions, eye disease, injury, brain trauma, diabetes, multiple sclerosis, and other causes. The Blind and Visually Impaired (VI) have access to printed material only if it is available in Braille (a tactile writing system), as an audiobook, or in large or magnified print, or if a personal assistant reads it to them.
The goal of this project is to design an easy-to-operate, low-cost, Raspberry Pi based Optical Reader built on open-source Optical Character Recognition (OCR) and Text-To-Speech (TTS) modules, and to automate it with a Python script and Mindstorms EV3 motors. The device is meant to give the Blind and Visually Impaired (VI) access to a wide range of printed text with minimal help and without having to learn any new concepts.

With a single press of a button, a Pi Camera captures an image of the document or book placed in front of it. The image is passed to the Raspberry Pi, where an Optical Character Recognition (OCR) engine recognizes the printed characters and converts them into digital text. Next, the Text-to-Speech (TTS) module uses predefined libraries to synthesize an audio waveform from that text, which is played through an audio output device. The Pi Reader uses Tesseract for OCR and Pico2Wave for TTS. A BrickPi-EV3 page-turning mechanism was added to automate the Pi Reader further. Approximately 200 lines of Python code run the Pi Reader.
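The capture, OCR, and speech steps described above could be sketched roughly as follows. This is not the project's actual script, only a minimal illustration under some assumptions: the `raspistill`, `tesseract`, `pico2wave`, and `aplay` command-line tools are installed, the file names `page.jpg` and `page.wav` are placeholders, and the helper names (`capture_cmd`, `ocr_cmd`, `tts_cmd`, `play_cmd`, `clean_text`, `read_page`) are made up for this example.

```python
import re
import subprocess


def capture_cmd(image_path):
    # Assumption: the legacy Pi Camera tool raspistill is available.
    return ["raspistill", "-o", image_path]


def ocr_cmd(image_path, lang="eng"):
    # Using "stdout" as the output base makes tesseract print the text.
    return ["tesseract", image_path, "stdout", "-l", lang]


def tts_cmd(text, wav_path, lang="en-US"):
    # pico2wave writes the synthesized speech to a WAV file.
    return ["pico2wave", "-l", lang, "-w", wav_path, text]


def play_cmd(wav_path):
    # Assumption: ALSA's aplay is the audio player on the Pi.
    return ["aplay", wav_path]


def clean_text(raw):
    # Rejoin words hyphenated across line breaks, then collapse whitespace
    # so the speech does not pause at every printed line ending.
    joined = re.sub(r"-\n(?=\w)", "", raw)
    return " ".join(joined.split())


def read_page(image_path="page.jpg", wav_path="page.wav", run=subprocess.run):
    """Capture one page, recognize its text, and speak it aloud."""
    run(capture_cmd(image_path), check=True)
    ocr = run(ocr_cmd(image_path), check=True, capture_output=True, text=True)
    text = clean_text(ocr.stdout)
    if text:  # skip speech when the page produced no recognizable text
        run(tts_cmd(text, wav_path), check=True)
        run(play_cmd(wav_path), check=True)
    return text
```

On the device, a button handler would simply call `read_page()`. The `run` parameter exists only so the flow can be exercised without a camera or speaker attached, by passing in a stand-in for `subprocess.run`.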

In Stage 1, six document types - Plain-Text, Text-on-Colored-Background, Different Fonts, Text-with-Images, Handwriting, and Text-on-Image - were tested for OCR and TTS accuracy while keeping the focal distance and camera attributes constant. Plain-Text gave the best results in terms of text conversion and readability, followed by Different Fonts, Text-on-Image, Text-with-Images, Text-on-Colored-Background, and Handwriting, in that order.
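One way to put a number on OCR accuracy for a comparison like this is a character-level similarity between the OCR output and a hand-typed copy of the same page. The post does not say which measure was actually used, so the sketch below is only one possible choice, based on Python's standard `difflib`; the function name `ocr_accuracy` is hypothetical.

```python
from difflib import SequenceMatcher


def ocr_accuracy(ocr_text, reference_text):
    """Return a 0.0-1.0 similarity between OCR output and a reference."""
    # Normalize whitespace so line wrapping does not count as an error.
    ocr = " ".join(ocr_text.split())
    ref = " ".join(reference_text.split())
    if not ocr and not ref:
        return 1.0  # two empty pages are treated as a perfect match
    # autojunk=False avoids difflib's popularity heuristic on long texts.
    return SequenceMatcher(None, ocr, ref, autojunk=False).ratio()
```

Scoring each document type this way against the same reference text would give a single figure per type that can be sorted into a ranking.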

In Stage 2, two books of 430 and 459 pages were tested. Both were read successfully, with minor hiccups on italic fonts, symbols, and images of fonts.

In Stage 3, four different page-turning mechanisms were built and tested for how precisely they could turn a single page. Design 4 performed best, followed by Designs 2, 1, and 3.

The Pi Reader read most accurately in good ambient lighting and with standard fonts printed in black on white, in regular style, at sizes of 12pt and above. Design 4 of the page-turning mechanism also gave the Pi Reader a higher degree of automation. Although Stage 3 of the Pi Reader is not perfect, the results show that the device can be used as a standalone reading solution for the Visually Impaired: it works without internet or Wi-Fi, which makes it suitable for remote areas. I am currently experimenting with NLP (Natural Language Processing) for Indic languages.
The Pi Reader in action.