Medical intelligence and language engineering lab
The Medical Intelligence and Language Engineering Laboratory, also known as the MILE lab, is a research laboratory in the Department of Electrical Engineering at the Indian Institute of Science, Bangalore. The lab is known for its work on image processing, online handwriting recognition, text-to-speech and optical character recognition[1] systems, focused mainly on documents and speech in Indian languages.[2] The lab is headed by A. G. Ramakrishnan.[3]
Research focus
One of the commitments of the MILE lab is the development of technology that lets people with visual impairment access knowledge from any available printed material in Indian languages.[4] Its work towards this goal has included: document mosaicing of coloured, camera-captured images; text extraction from complex colour images, including camera-captured images; document layout analysis; detection of broken and merged characters; OCR technology for Tamil and Kannada;[5] text-to-speech conversion in Tamil and Kannada;[6] pitch modification using discrete cosine transform in the source domain;[7] automated part-of-speech tagging; phrase prediction and prosody modeling.
Mozhi Vallan, the Tamil OCR[8] product developed by MILE Lab, is used by Worth Trust and Karna Vidya Technology Centre, Chennai[9] to convert printed school and college books to Braille format. Sri Ramakrishna Math, Chennai[10] uses it to convert its printed philosophical books in Tamil to computer-readable text. Lipi Gnani, the Kannada OCR developed by MILE Lab, is used by the Braille Transcription Centres of Mitrajyothi[11] and Canara Bank Relief & Welfare Society,[12] Bangalore for similar purposes. Thirukkural,[13] the Tamil TTS system[14] developed by MILE Lab, is used by some school teachers in Singapore for assignments. Madhura, the Kannada TTS[15] developed by the lab, is used by two blind students, integrated with a screen reader, to read aloud text recognized by Lipi Gnani from Kannada books.
The lab is currently researching machine listening,[16] and has proposed a novel temporal feature named the plosion index, which has been shown to be highly effective in detecting closure-burst transitions of stop consonants and affricates in continuous speech, even in noise.[17] Another proposed feature is DCTILPR,[18] a voice-source-based feature vector that improves the recognition performance of a speaker identification system.
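As a rough illustration of how a temporal feature can flag a closure-burst transition, the sketch below computes, for each sample, the ratio of its absolute amplitude to the mean absolute amplitude of an earlier window. This is a simplified reading of the idea and not the definition from the cited paper;[17] the function name `burst_ratio`, the `gap` and `window` parameters, the sampling rate and the synthetic signal are all assumptions made for the example.

```python
import numpy as np

def burst_ratio(signal, gap, window):
    """Per-sample ratio of |signal[n]| to the mean |signal| over an earlier window.

    The background window holds `window` samples and ends `gap` samples
    before n (both parameters are hypothetical, chosen for illustration).
    Samples without enough history get a ratio of 0.
    """
    mag = np.abs(np.asarray(signal, dtype=float))
    ratios = np.zeros(len(mag))
    for n in range(gap + window, len(mag)):
        background = mag[n - gap - window : n - gap].mean()
        ratios[n] = mag[n] / max(background, 1e-12)  # guard against division by zero
    return ratios

# Synthetic stop consonant (assumed data): a near-silent closure, then a burst.
fs = 16000                                   # assumed sampling rate in Hz
rng = np.random.default_rng(1)
closure = 0.001 * rng.standard_normal(800)   # 50 ms of low-level noise
t = np.arange(160) / fs                      # 10 ms burst
burst = 0.5 * np.cos(2 * np.pi * 2000 * t) * np.exp(-t / 0.003)
signal = np.concatenate([closure, burst])
onset = len(closure)                         # index of the first burst sample

ratios = burst_ratio(signal, gap=16, window=256)
detected = int(np.argmax(ratios))
print("largest ratio at sample", detected, "true onset at", onset)
```

The ratio stays small while the signal is stationary and rises sharply at the burst, because the background window still lies entirely inside the closure when the first burst samples arrive.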
In its early days, the lab carried out significant work in medical signal and image processing. An algorithm was proposed for ECG compression that treats each cardiac cycle as a vector, normalizes its period using multirate-processing-based interpolation, and applies linear prediction on the discrete wavelet transform of this vector.[19] The maturity of the fetal lung was predicted using image texture features obtained from the liver and lung regions of ultrasound images of pregnant women.[20] An effective technique was proposed for lossless compression of 3D magnetic resonance images of the brain: each MRI slice was represented by a uniform or adaptive mesh, an affine transformation was applied between the corresponding mesh elements of adjacent slices, and context-based entropy coding was applied to the residues.[21]
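The ECG compression steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published algorithm:[19] linear interpolation stands in for multirate interpolation, a Haar wavelet stands in for whichever wavelet the authors used, prediction is assumed to run across consecutive cycles with a single least-squares gain, and the cycle boundaries and the synthetic beats are invented for the example.

```python
import numpy as np

def normalize_period(cycle, length):
    """Resample one cardiac cycle to a fixed length.

    Assumption: plain linear interpolation, used here as a simple
    stand-in for multirate-processing-based interpolation.
    """
    old = np.linspace(0.0, 1.0, num=len(cycle))
    new = np.linspace(0.0, 1.0, num=length)
    return np.interp(new, old, cycle)

def haar_dwt(x):
    """Full orthonormal Haar decomposition; len(x) must be a power of two."""
    x = np.asarray(x, dtype=float)
    details = []
    while len(x) > 1:
        approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
        detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
        details.append(detail)
        x = approx
    # Coarsest coefficients first, finest detail band last.
    return np.concatenate([x] + details[::-1])

def prediction_residuals(coeffs):
    """Predict each cycle's coefficients from the previous cycle's.

    Assumption: first-order prediction with one least-squares gain.
    """
    prev, cur = coeffs[:-1], coeffs[1:]
    gain = np.sum(prev * cur) / np.sum(prev * prev)
    return gain, cur - gain * prev

# Synthetic beats with varying periods (assumed data, not real ECG).
rng = np.random.default_rng(0)
cycles = []
for period in [90, 100, 110, 95, 105]:
    t = np.linspace(0.0, 1.0, num=period, endpoint=False)
    beat = np.exp(-((t - 0.5) ** 2) / 0.002)        # one narrow pulse per cycle
    cycles.append(beat + 0.01 * rng.standard_normal(period))

N = 128  # common normalized length (hypothetical choice)
coeffs = np.array([haar_dwt(normalize_period(c, N)) for c in cycles])
gain, residuals = prediction_residuals(coeffs)
energy_ratio = np.sum(residuals ** 2) / np.sum(coeffs[1:] ** 2)
print("residual energy / coefficient energy:", energy_ratio)
```

Because consecutive beats look alike once their periods are normalized, the prediction residuals carry far less energy than the coefficients themselves, which is what makes them cheaper to encode.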
References
- ^ "MILE Lab at IISc: Developing technologies to enable the specially abled".
- ^ MILE Lab. "MILE Lab in news". Retrieved 28 April 2013.
- ^ MILE Lab. "People". Archived from the original on 3 September 2014. Retrieved 28 April 2013.
- ^ "Walking an extra MILE for the specially abled - Bangalore Mirror".
- ^ Pati, Peeta Basa; Ramakrishnan, A.G. (2008). "Word level multiscript identification". Pattern Recognition Letters. 29 (9): 1218–1229. doi:10.1016/j.patrec.2008.01.027.
- ^ Shiva Kumar, H. R.; Ashwini, J. K.; Rajaram, B. S. R.; Ramakrishnan, A. G. (2013). "MILE TTS for Tamil and Kannada for Blizzard Challenge 2013" (PDF). Proc. Blizzard Challenge Workshop, Barcelona, Spain, 3 September 2013.
- ^ "Pitch synchronous pitch modification". Speech Communication. 42: 143–154. doi:10.1016/j.specom.2003.05.001.
- ^ Subramanian, Karthik (17 January 2014). "Article in The Hindu on MILE Lab Tamil OCR". The Hindu.
- ^ "Karna Vidya Technology Centre, Guindy, Chennai".
- ^ "Sri Ramakrishna Math, Chennai".
- ^ "Mitrajyothi Braille Transcription Centre, Bangalore". Archived from the original on 3 February 2011.
- ^ "Braille Transcription Centre, Canara Bank Relief & Welfare Society, Bangalore".
- ^ Jayavardhana Rama, G.L.; Ramakrishnan, A.G.; Muralishankar, R.; Prathibha, R. (2002). "A complete text-to-speech synthesis system in Tamil" (PDF). Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002. pp. 191–194. doi:10.1109/WSS.2002.1224406. ISBN 0-7803-7395-2. S2CID 13870581.
- ^ "Blog in Tamil Manam on Thirukkural Tamil TTS".
- ^ "Deccan Herald: IISc develops text-to-speech software for Kannada, Tamil". 26 June 2010.
- ^ "MILE Lab research focus".
- ^ Ananthapadmanabha, T. V.; Prathosh, A. P.; Ramakrishnan, A. G. (2014). "Plosion index, a temporal feature to detect bursts in stops and affricates". The Journal of the Acoustical Society of America. 135 (1): 460–71. doi:10.1121/1.4836055. PMID 24437786.
- ^ Ramakrishnan, A. G.; Abhiram, B.; Prasanna, S. R. (2015). "Voice source characterization using pitch synchronous discrete cosine transform for speaker identification". The Journal of the Acoustical Society of America. 137 (6): EL469-75. doi:10.1121/1.4921679. PMID 26093457.
- ^ Ramakrishnan, A. G.; Saha, S. (1997). "Cardiac cycle synchronized compression of ECG" (PDF). IEEE Transactions on Bio-Medical Engineering. 44 (12): 1253–61. doi:10.1109/10.649997. PMID 9401225. S2CID 8834327.
- ^ Prakash, K. N.; Ramakrishnan, A. G.; Suresh, S.; Chow, T. W. (2002). "Predicting maturity of fetal lung from ultrasound image features" (PDF). IEEE Transactions on Information Technology in Biomedicine. 6 (1): 38–45. doi:10.1109/4233.992160. PMID 11936595. S2CID 14662967.
- ^ Srikanth, R.; Ramakrishnan, A. G. (2005). "3D brain MRI compression using adaptive mesh and contextual encoding" (PDF). IEEE Transactions on Medical Imaging. 24 (9): 1199–206. doi:10.1109/TMI.2005.853638. PMID 16156357. S2CID 7523030.