The power of Bangla
Left: a scanned image converted to editable Bangla Unicode text by the OCR application. Right: a screenshot of the 'Katha' application, which reads aloud any Bangla text typed into it.
Imagine a blind person being able to use a computer almost as efficiently as a sighted person! All he does is move the mouse cursor to a certain position on the screen, and the computer reads aloud whatever button is there. And in Bangla, too!
This, and a few more marvels, are now available for free to everyone who wants to use Bangla in their everyday lives, thanks to two newly released products from CRBLP (Center for Research on Bangla Language Processing) of BRAC University.
On February 19, CRBLP announced the first official release of its Bangla language processing software packages, 'Katha' (text-to-speech) and BanglaOCR (optical character recognition). At the event, the audience (which included experts in the field and several blind people) was shown how the computer could create Bangla Unicode text from scanned images and then read the text out.
The TTS and OCR run on Linux, Windows and Mac OS X. There is also a web-enabled front-end for the TTS (one for the OCR is under development), making these tools available at any time and from anywhere. Currently, the group is working on better integration with screen readers in collaboration with the visually impaired community.
The Bangla language processing tools developed at CRBLP are free and open source software, released under the GNU General Public License v2, and supported in part by funding from Canada's IDRC and BRAC University.
"We have come a long way, but we have an even longer way to go," says Dr. Mumit Khan, professor of Computer Science and Engineering at the university and the head of CRBLP. "We are trying to develop capacity for Bangla language processing in Bangladesh. These days we are talking about ICT and Digital Bangladesh. If you talk about ICT for a country, you have to localise. And localisation means not only translation; you also need to incorporate local culture for widespread use."
"We start from very basic spelling checkers and optical character recognition. Speech [recognition] would enable us not to type but to dictate. That would not only make a recording but also turn it into editable text, where you can check for spelling and grammatical errors. These things are available for the English language through various software. For all this to happen in Bangla, you need a lot of linguistic research first. And in Bangla we are a bit behind in all this. So for us, the first step was capacity building. It took us a year and a half to find out what we needed to know. So our target was to do something simple yet concrete: make applications that people can use," he added.
A corpus (a collection of Bangla words as they are actually used, something a dictionary does not provide), efficient spell checkers, OCR, machine translation and syntax checking: all of these are required to make full-scale Bangla computing possible.
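How a corpus feeds a basic spell checker can be illustrated with a minimal sketch. This is not CRBLP's code: the function names `build_word_list` and `unknown_words` are hypothetical, and the approach (treating any run of characters from the Bengali Unicode block as a word and flagging words never seen in the corpus) is a simplifying assumption.

```python
import re
from collections import Counter

# Assumption: a "word" is any run of characters from the Bengali
# Unicode block (U+0980 to U+09FF). Real tokenisation is more involved.
BANGLA_WORD = re.compile(r"[\u0980-\u09FF]+")


def build_word_list(corpus_text):
    """Count how often each Bangla word occurs in the corpus text."""
    return Counter(BANGLA_WORD.findall(corpus_text))


def unknown_words(text, word_list):
    """Return the words in `text` that never occur in the corpus."""
    return [w for w in BANGLA_WORD.findall(text) if w not in word_list]


if __name__ == "__main__":
    corpus = "আমার নাম সানজিদা। আমার দেশ বাংলাদেশ।"
    words = build_word_list(corpus)
    print(words.most_common(3))
    print(unknown_words("আমার নম", words))
```

A practical spell checker would go further, for instance by suggesting the closest known word, but even this simple lookup depends entirely on how large and representative the corpus is.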
The purpose of the research is therefore to:
1) Build linguistic resources using the corpus
2) Create new applications
3) Develop capacity
The applications will support various kinds of academic research, and they matter all the more because the country has a serious literacy problem.
The TTS targets three groups of people:
1) The illiterate
2) The visually impaired
3) People who can't read Bangla
OCR has countless applications, including the fast digitisation of old and rare Bangla books, which would save a great deal of time compared with manually typing every word in those books.
When asked about the inspirations behind developing these applications, Dr. Khan mentioned Sightsavers International and the JPUF (Jatiyo Protibondhi Unnoyon Foundation). By using these applications, blind people, too, can become citizens of the 'net world'. The sponsors of these projects are Canada's IDRC (International Development Research Centre), under its PAN Localisation Project, and BRAC University itself.
"At this stage, the text-to-speech software sounds a bit wooden, but we hope to improve on it within the next six months or so. For example, 'Amar Nam Sanjida' sounds like 'Amar Nam Sa-no-ji-da' right now. The good news is that this is just the beginning of even better applications. Future developments would include a female voice, which is an even more difficult thing to do, and intonation that reflects mood, whether the person is angry or in a good mood. Right now we are working on the 'broadcast dialect'. Incorporating dialects is an altogether new issue," he said.
At CRBLP, there are six researchers in the core group, and many other people are affiliated with the research projects, including students and teachers from the Dhaka University Linguistics Department (with which the centre has an MoU) and researchers from 14 other countries.
When asked about the state of research in computer science in Bangladesh, Dr. Khan said that there is certainly room for improvement, but that research in his field is not totally nonexistent, as many people would think. Other than CRBLP, he mentioned Prof Saidur Rahman's (BUET) research on graph theory, which has won international acclaim.
For more information on the research centre and its activities, visit CRBLP's website: http://www.bracu.ac.bd/research/crblp/.
Short Notes:
1) TTS: The TTS (text-to-speech) application generates speech from Bangla text. This can help tackle the illiteracy problem, empower the visually impaired and open up possibilities for improved human-machine interaction. This project has developed a TTS system for Bangla using diphone and unit selection concatenation techniques based on the Festival speech synthesis technology. The developers in this project are Firoj Alam, S.M. Murtoza Habib and Kamrul Hayder.
2) OCR: Optical Character Recognition (OCR) is the process of converting images of printed text into editable text. This project has developed a Bangla OCR that takes the scanned image of a printed page or document as input and converts it into editable Unicode text. The developers in this project are Md. Abul Hasnat and Souro Chowdhury.
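The diphone concatenation idea mentioned in the TTS note can be sketched roughly as follows. This is only an illustration and not Festival's API or the Katha code: the function names, the phoneme labels and the `inventory` of recorded units are all hypothetical. A diphone covers the transition from one sound to the next, so a phoneme sequence is first padded with silence and then split into overlapping pairs, and the recorded unit for each pair is joined end to end. Unit selection, which chooses among several recorded candidates, is not shown.

```python
def to_diphones(phonemes, silence="_"):
    """Split a phoneme sequence into overlapping diphone names.

    The sequence is padded with a silence symbol at both ends so that
    the first and last sounds also get a transition unit.
    """
    padded = [silence] + list(phonemes) + [silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def concatenate(diphones, inventory):
    """Join the recorded samples of each diphone into one waveform.

    `inventory` is a hypothetical dict mapping a diphone name to a
    list of audio samples; a missing unit raises KeyError.
    """
    waveform = []
    for name in diphones:
        waveform.extend(inventory[name])
    return waveform


if __name__ == "__main__":
    # Hypothetical phoneme labels for the word "amar".
    units = to_diphones(["a", "m", "a", "r"])
    print(units)
```

Because the units are simply placed side by side, the result can sound flat, which is consistent with the "wooden" quality Dr. Khan describes; smoothing the joins and adding intonation are separate steps.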