Multimodal Interactive Pattern Recognition and Applications


By:

Alejandro Héctor Toselli

Enrique Vidal

Francisco Casacuberta


Select format

Paperback: CHF 124.80
Hardcover: CHF 119.35
E-book (PDF): CHF 118.00

Paperback list price: CHF 156.00; discounted price: CHF 124.80 (20% off, a saving of CHF 31.20)

Print on demand: a copy is printed for you.

Free delivery

No right of return

Description

This book presents a different approach to pattern recognition (PR) systems, in which users of a system are involved during the recognition process. This can help to avoid later errors and reduce the costs associated with post-processing. The book also examines a range of advanced multimodal interactions between the machine and the users, including handwriting, speech and gestures. Features: presents an introduction to the fundamental concepts and general PR approaches for multimodal interaction modeling and search (or inference); provides numerous examples and a helpful Glossary; discusses approaches for computer-assisted transcription of handwritten and spoken documents; examines systems for computer-assisted language translation, interactive text generation and parsing, relevance-based image retrieval, and interactive document layout analysis; reviews several full working prototypes of multimodal interactive PR applications, including live demonstrations that can be publicly accessed on the Internet.

Provides a new interactive pattern recognition (PR) paradigm in which traditional PR and multimodal human interaction are fused together
Provides numerous examples and a helpful Glossary
Reviews several full working prototypes of multimodal interactive PR applications, including live demonstrations that can be publicly accessed through the Internet
Includes supplementary material: sn.pub/extras

About the authors

Alejandro Héctor Toselli is currently working as a postdoc (María Zambrano grant) at the Universitat Politècnica de València. He obtained an Electrical Engineering degree from the Universidad Nacional de Tucumán (Argentina, 1997) and a PhD in Computer Science from the Universitat Politècnica de València (UPV) (Spain, 2004). His research focuses primarily on Document Analysis and Recognition, a field in which he has more than 20 years of experience, publishing on these topics and working on related projects funded by European and US institutions. He held postdoctoral fellowships at Northeastern University (Boston, USA), in the multi-institutional Open Islamicate Texts Initiative (OpenITI), and at the "Institut de Recherche en Informatique et Systèmes Aléatoires" (IRISA, Rennes, France).

Joan Puigcerver received his MSc and PhD in Computer Science from the Universitat Politècnica de València, in 2014 and 2018, respectively, focusing on probabilistic indexing and handwritten text recognition. In 2018, he joined Google Research as a software engineer. His research focuses on deep learning architectures, transfer learning, and computer vision. Joan is a member of the Spanish Society for Pattern Recognition and Image Analysis (AERFAI), an affiliate organization of the International Association for Pattern Recognition (IAPR).

Enrique Vidal is an emeritus professor of the Universitat Politècnica de València (Spain) and former co-leader of the PRHLT research center there. He is co-author of hundreds of research papers in the fields of Pattern Recognition, Multimodal Interaction and applications to Language, Speech and Image Processing, and has led many important projects in these fields. Enrique is a fellow of the International Association for Pattern Recognition (IAPR).

Jacket text

Many real-world applications of pattern recognition (PR) systems require human post-processing to correct the errors committed by machines. This can create bottlenecks in recognition systems, yielding high operational costs.

This important text/reference proposes a radically different approach to this problem, in which users of a system are involved during the recognition process. This can help to avoid later errors and reduce the costs associated with post-processing. The book also examines a range of advanced multimodal interactions between the machine and the users, including handwriting, speech and gestures.

Topics and features:

Presents a thorough introduction to the fundamental concepts and general PR approaches for multimodal interaction modelling and search (or inference)
Provides numerous examples and a helpful Glossary
Includes work carried out in the context of the Spanish research program Multimodal Interaction in Pattern Recognition and Computer Vision (MIPRCV), which involves more than 100 highly qualified researchers from ten research institutions
Discusses approaches for computer-assisted transcription of handwritten and spoken documents
Examines systems for computer-assisted language translation, interactive text generation and parsing, relevance-based image retrieval, and interactive document layout analysis
Reviews several full working prototypes of multimodal interactive PR applications, including live demonstrations that can be publicly accessed through the Internet

Addressing the emerging field of interactive and multimodal systems in a fresh, unified and integrated way, this unique book is highly recommended reading for graduate students, academic and industrial researchers, lecturers, and practitioners working in the field of pattern recognition.

Dr. Alejandro Héctor Toselli is an Associate Professor at the Department of Computer Systems and Computation of the Polytechnic University of Valencia, Spain. Dr. Enrique Vidal and Dr. Francisco Casacuberta both hold the title of Full Professor at the same institution.

Contents

General Framework
Computer Assisted Transcription: General Framework
Computer Assisted Transcription of Text Images
Computer Assisted Transcription of Speech Signals
Active Learning and Interactive Handwritten Transcription
Interactive Machine Translation
Multi-modality for Interactive Machine Translation
Incremental and Adaptive Learning for Interactive Machine Translation
Interactive Parsing
Interactive Text Generation
Interactive Image Retrieval
Prototypes and Demonstrators