Making the Visual World Audible

  • Hardcover
  • 300 pages
CHF 65.00
Usually ships within 4 to 6 business days.



Dr.-Ing. Michael Banf completed his doctoral thesis under the supervision of Prof. Dr. rer. nat. Volker Blanz at the Media Systems Group, Institute for Vision and Graphics, Department of Electrical Engineering and Computer Science, University of Siegen.

Jacket text

Leonardo da Vinci once said, "The eye encompasses the beauty of the whole world." Making this visual beauty more accessible to visually impaired people has inspired researchers in Computer Vision for a long time. Perhaps the most ambitious software solution to the vision problem would be an algorithm that produces a semantic description of the image content, which a speech synthesis device then outputs in natural language. Such an automated image analysis system would mimic a partner with normal vision who describes the image to the user. However, automated image understanding will remain a challenge to researchers for many years, and even a working system of this kind would continue to deprive the visually impaired of a direct perceptual experience, an active exploration, and an impression of where things are in the image and what visual appearance they have.

The approaches described in this work therefore aim to augment the sensory capabilities of visually impaired persons by translating image content into sounds. The task of analyzing and understanding images is still up to the user, which is why we call our approach "auditory image understanding". Very much like a blind person who explores a Braille text or a bas-relief image haptically with the tip of her finger, our users touch the image via a touch screen and experience the local properties of the image as auditory feedback.

The approaches presented combine low-level information, such as color, edges, and roughness, with mid- and high-level information obtained from Machine Learning algorithms. This includes object recognition and the classification of regions into the categories "man-made" versus "natural", based on a novel type of discriminative graphical model. We argue that this multi-level approach gives users direct access to the identity and location of objects and structures in the image, while still exploiting the potential of recent developments in Computer Vision and Machine Learning.
The sound translation is inspired by the way humans perceive colors and structures visually. Because the sensory mapping from visual to auditory is simple and direct, we can harness the human ability to learn, and we consider the brain of the user a fundamental part of the system. Visually impaired persons can use the system to analyze images that they find on the internet, but also personal photos that their friends or loved ones want to share with them. It is this application scenario that makes the direct perceptual access most valuable. The user feedback that we received for our system indicates that visually impaired persons appreciate obtaining more than an abstract verbal description, and that images cease to be meaningless entities to them. In the words of one adult participant: "What amazes me is that I start to develop some sort of a spatial imagination of the scene within my mind which really corresponds with what is shown in the image."
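
To make the touch-to-sound idea above more concrete, here is a minimal Python sketch. It is purely illustrative and is not the sonification model of the book, whose details this description does not give. The mapping itself (hue selects pitch, brightness selects loudness), the function names `color_to_tone` and `sonify_touch`, and all parameter values are assumptions made for the example.

```python
import colorsys
import math


def color_to_tone(r, g, b, low_hz=220.0, high_hz=880.0):
    """Map an RGB pixel (0-255 per channel) to (frequency_hz, amplitude).

    Assumed mapping, for illustration only: hue picks the pitch on a
    logarithmic scale between low_hz and high_hz, brightness picks the
    loudness in [0, 1].
    """
    hue, _saturation, value = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    frequency = low_hz * (high_hz / low_hz) ** hue
    return frequency, value


def sonify_touch(image, x, y, sample_rate=8000, duration_ms=100):
    """Return sine-wave samples for the pixel under the finger at (x, y).

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    r, g, b = image[y][x]
    frequency, amplitude = color_to_tone(r, g, b)
    n_samples = sample_rate * duration_ms // 1000
    return [
        amplitude * math.sin(2.0 * math.pi * frequency * i / sample_rate)
        for i in range(n_samples)
    ]


# A tiny 1x2 "image": a black pixel next to a pure red pixel.
tiny_image = [[(0, 0, 0), (255, 0, 0)]]
red_samples = sonify_touch(tiny_image, x=1, y=0)
```

In this sketch a black pixel is silent and a bright pixel is loud, so sliding a finger across the image changes pitch and loudness with the local color. A real system along the lines described above would also have to encode edges, roughness, and recognized objects, which this example leaves out.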

Product information

Title: Making the Visual World Audible
Subtitle: Auditory Image Understanding for the Visually Impaired Based on a Modular Computer Vision Sonification Model
EAN code: 9783844023992
ISBN: 978-3-8440-2399-2
Format: Hardcover
Publisher: Shaker Verlag
Genre: Other
Number of pages: 300
Weight: 536 g
Dimensions: H 238 mm x W 171 mm x D 20 mm
Year: 2013

