User:BCarrilloM/sandbox/InfoScope

From Wikipedia, the free encyclopedia

An InfoScope is a handheld translation device that combines a digital camera with wireless internet access. The device is an "information augmentation system"[1] that translates foreign-language text: the user photographs the text, and the image is sent over the internet to a remote computer for translation. Using the same procedure, it can also give the user information about a particular location or building. The device is in the prototype stage and was conceived and developed by Ismail Haritaoglu at IBM's Almaden Research Center. It was named one of Time magazine's "Best Inventions" of 2002.[2]

Development

Prototype

The prototype was built in 2002 by Ismail Haritaoglu, who received his B.S. and M.S. from Bilkent University, Turkey, and his Ph.D. in Computer Science from the University of Maryland, College Park.[3] He developed it while working at IBM's Almaden Research Center, in response to the "increase of real world and digital world connections made possible by handheld devices".[1] The growing availability of new applications and devices introduced new ways of connecting users to the digital world, and Haritaoglu wanted InfoScope to operate on the same premise of connectivity. The goal was to create a device that made it easier to obtain information about the real world using digital applications.[1]

Future Work

The InfoScope prototype faced several technical problems. The two major ones were the "speed of the wireless communication" and the "processor power limitations of available PDA's". As a result, the prototype takes 15 seconds in total to return the requested translation to the user. Future development of InfoScope aims to cut this to less than 5 seconds by upgrading the original GSM modem to a 3G modem.[4]

Mechanism Description

InfoScope's main function is to translate text from one language into another, using several components that work together. It does this in a series of steps: an image of the real-world scene is captured and brought into the digital world, the text in it is extracted and translated, and the translation is placed back into the image at the location of the original text.[1]

Components

The InfoScope consists of five main parts. The first is a color camera, a Casio JK-710DC, which takes the picture of the text. The second is a PDA, a Cassiopeia E 125, which is attached to the camera and displays the image and the translation. The third is a GSM modem, a Nokia 8190 GSM phone, which lets the user send the image and related information to the server. The fourth is a GPS receiver, a Pharos Nav GPS, which provides the user's location. The last is an IBM eServer, which supplies the processing power the system needs.[4]

Applications

The InfoScope prototype runs two basic applications. The first is the "automatic sign/text application", which translates text in a foreign language into the language the user needs. The user sees the translation in the live image, exactly where the original text was located. The user begins by selecting, in the captured image, the piece of text that needs translation. InfoScope then extracts the text and characters from the selected region and converts them to ASCII text, and the translated text is displayed on the image. The extraction relies on a process called sign segmentation, which filters out noise in the image and keeps only the text and characters that need to be translated.[4]
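
The idea behind sign segmentation can be illustrated with a minimal sketch. This is not the algorithm from the cited paper: it assumes a grayscale image stored as rows of integers, treats dark pixels as candidate text, and uses the removal of isolated pixels as a crude stand-in for noise filtering. The function name `segment_text_region` and the `threshold` parameter are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale rows: 0 = black, 255 = white


def segment_text_region(
    image: Image, threshold: int = 128
) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the dark pixels left after filtering.

    Illustrative only: a toy stand-in for sign segmentation, not the
    method used by InfoScope.
    """
    height = len(image)
    width = len(image[0]) if height else 0

    # Step 1: binarize. Pixels darker than the threshold are candidate text.
    dark = [[pixel < threshold for pixel in row] for row in image]

    def has_dark_neighbor(r: int, c: int) -> bool:
        # Check the 8 surrounding pixels, staying inside the image bounds.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width and dark[rr][cc]:
                    return True
        return False

    # Step 2: drop isolated dark pixels, which are assumed to be noise.
    kept = [
        (r, c)
        for r in range(height)
        for c in range(width)
        if dark[r][c] and has_dark_neighbor(r, c)
    ]
    if not kept:
        return None

    # Step 3: the bounding box of the surviving pixels is the text region.
    rows = [r for r, _ in kept]
    cols = [c for _, c in kept]
    return (min(rows), min(cols), max(rows), max(cols))
```

In a full system, the returned region would be cropped and passed to character recognition and translation; those stages are outside the scope of this sketch.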

The second application is "Information Augmentation in the City", which displays information associated with a building or place. Like the first application, it overlays its output onto the image of the real scene displayed on the PDA. It has two prototype sub-applications. One works as a city guide, giving people information about historic places and buildings. The other is a "realtor guide" that gives the user information about a building itself, such as its age, its price, whether it is for rent, and whether there is an open house.[1]
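
One way a location-based lookup of this kind could work is sketched below. It assumes that the device's GPS position is matched to the closest entry in a table of known places using the haversine great-circle distance. The sources cited here do not describe the actual matching method, so the approach and the names `haversine_km` and `nearest_place` are hypothetical.

```python
import math
from typing import Dict, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a: Coordinate, b: Coordinate) -> float:
    """Great-circle distance between two coordinates, in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_place(position: Coordinate, places: Dict[str, Coordinate]) -> str:
    """Return the name of the known place closest to the given position.

    Assumption: `places` is a non-empty, hypothetical lookup table that
    maps place names to coordinates.
    """
    return min(places, key=lambda name: haversine_km(position, places[name]))
```

The name returned by `nearest_place` could then serve as the key for retrieving the city-guide or realtor information to overlay on the image.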

References

  1. "Ubicomp 2001: Ubiquitous Computing | SpringerLink" (PDF). doi:10.1007/3-540-45427-6.pdf#page=261.
  2. "Best Inventions of 2002 - TIME". TIME.com. Retrieved 2017-10-24.
  3. "Ismail Haritaoglu, Ph.D.- Anvato". Anvato. 2011-01-15. Retrieved 2017-10-24.
  4. Haritaoglu, I. (2001). "Scene text extraction and translation for handheld devices". Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001. 2: II-408–II-413. doi:10.1109/CVPR.2001.990990.