26-08-11: Urban OCR: Exploitation of written information sources in Urban Environments

Type: Bachelor's / Master's Thesis, Studienarbeit / Diplomarbeit (student research project / diploma thesis)

Urban OCR: Exploitation of written information sources in Urban Environments

The man-made environment of a city is a rich source of semantic information. Apart from distinctive objects and buildings, the text on street signs, doors and shop windows also offers information about the locality. Even when interpreted only coarsely, the words on a sign can give a clue about the type of a shop or the name of a street. These semantic cues can be valuable for robotic navigation in urban environments. When describing a route to a certain location, people tend to include landmarks such as distinctive shops or street names. While distinguishing different buildings in a city from their appearance alone would be difficult even for humans, signs can tell what kind of service the respective shop offers. Consequently, the description "turn left at the Indian restaurant" becomes useful if a robot is able to detect and recognize the word "restaurant" in its environment.

This work aims to develop robust computer vision algorithms that enable a robotic platform to detect text in an urban environment and to recognize key words that carry semantic information. As the task list below shows, several subproblems have to be tackled; they can be assigned separately or bundled.

Tasks:

  • Review of the state of the art
  • Saliency-based detection of text fields in urban environments (provided)
  • Segmentation and recognition of characters on signs (provided)
  • Classification of clusters of characters into words (provided)
  • Improvement of the OCR using a spell checker or language model
  • Online learning of the features used in the visual saliency (main topic)
  • Generalization of the online learning approach to an urban environment
  • Alignment of the algorithm output with the sought-after word
  • Performance evaluation and testing
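
To illustrate the spell-checking and alignment tasks above, the following is a minimal sketch, not the method used in this project. It assumes that the OCR stage delivers one lowercase word at a time and that a small vocabulary of sought-after keywords is available. The names editDistance and matchKeyword and the maxDistance threshold are hypothetical and chosen for illustration only. The sketch compares the OCR output against each keyword using the Levenshtein (edit) distance and accepts the closest keyword if it is within the threshold.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance between a and b: the minimum number of single-character
// insertions, deletions and substitutions needed to turn a into b.
// Only one row of the dynamic-programming table is kept in memory.
std::size_t editDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> row(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) {
        row[j] = j;  // distance from the empty prefix of a to b[0..j)
    }
    for (std::size_t i = 1; i <= a.size(); ++i) {
        std::size_t diag = row[0];  // table entry (i-1, j-1)
        row[0] = i;                 // distance from a[0..i) to the empty string
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t up = row[j];  // table entry (i-1, j)
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            row[j] = std::min({up + 1,          // delete a[i-1]
                               row[j - 1] + 1,  // insert b[j-1]
                               diag + cost});   // substitute or match
            diag = up;
        }
    }
    return row[b.size()];
}

// Hypothetical helper: returns the vocabulary entry closest to ocrWord,
// or an empty string if no entry is within maxDistance edits.
// Assumption: ocrWord and all vocabulary entries are already lowercase.
std::string matchKeyword(const std::string& ocrWord,
                         const std::vector<std::string>& vocabulary,
                         std::size_t maxDistance) {
    std::string best;
    std::size_t bestDistance = maxDistance + 1;
    for (const std::string& word : vocabulary) {
        const std::size_t d = editDistance(ocrWord, word);
        if (d < bestDistance) {
            bestDistance = d;
            best = word;
        }
    }
    return best;
}
```

With the vocabulary {"restaurant", "bakery", "pharmacy"} and a threshold of 2, a misread such as "rest4urant" would be mapped to "restaurant", while an unrelated string is rejected. A language model, as named in the task list, could replace the uniform edit costs with character confusion probabilities typical for OCR.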

Prerequisites:

  • Good programming skills in C or C++
  • For a Diploma or Master's thesis: experience in the field of computer vision

Supported Languages:

  • German
  • English

Helpful but not required:

  • Familiarity with Linux
  • Experience with OpenCV
  • Experience with ROS
  • Your own laptop running Ubuntu

Supervisor:

Literature:

  • Using Text-Spotting to Query the World
    I. Posner, P. Corke, and P. Newman
    Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS)
    2010

  • AdaBoost Learning for Detecting and Reading Text in City Scenes
    X. Chen and A.L. Yuille
    Computer Vision and Pattern Recognition (CVPR), oral presentation
    2004

  • Real time image enhancement and segmentation for sign/text detection
    E.D. Haritaoglu and I. Haritaoglu
    Proc. of the Int. Conf. on Image Processing (ICIP)
    2003

  • Quadrilateral Signboard Detection and Text Extraction
    Angela Tam, Hua Shen, Jianzhuang Liu, and Xiaoou Tang
