

Description
The volume of natural language text data has been rapidly increasing over the past two decades, due to factors such as the growth of the Web, the low cost of publishing, and progress in the digitization of printed texts. This growth, combined with the proliferation of natural language systems for searching and retrieving information, provides tremendous opportunities for studying the areas where database systems and natural language processing systems overlap.
This book explores two interrelated and important areas of overlap: (1) managing natural language data and (2) developing natural language interfaces to databases. It presents relevant concepts and research questions, state-of-the-art methods, related systems, and research opportunities and challenges covering both areas. Topics discussed under natural language data management include data models, data sources, queries, storage and indexing, and transforming natural language text. Under natural language interfaces, the book presents the anatomy of these interfaces to databases, the challenges related to query understanding and query translation, and relevant aspects of user interactions. Each challenge is covered systematically: a quick overview of the topic is followed by a comprehensive view of recent techniques proposed to address the challenge, along with illustrative examples. The book also reviews some notable systems in detail, in terms of how they address different challenges and what they contribute. Finally, it discusses open challenges and opportunities for natural language data management and interfaces.
The goal of this book is to provide an introduction to the methods, problems, and solutions involved in managing natural language data and building natural language interfaces to databases. It serves as a starting point for readers who are interested in pursuing additional work on these exciting topics in both academic and industrial environments.
About the Authors
Yunyao Li is the Head of Machine Learning at Apple Knowledge Platform. Until early 2022, she was a Distinguished Research Staff Member and Senior Research Manager at IBM Research - Almaden. She was also a Master Inventor and a member of IBM Academy of Technology. She is an ACM Distinguished Member and a member of the inaugural New Voices program of the American National Academies. Her expertise is at the intersection of natural language processing, databases, human-computer interaction, machine learning, and information retrieval. Her contributions in these areas have led to over 100 research publications with multiple awards, 36 patents granted, multiple graduate-level courses (including 2 Massive Open Online Courses), and billions in revenue generated from technology transfer. Yunyao pioneered some of the landmark work on NLIDB, including NaLIX, the first conversational NLIDB for XML. She is the co-author of Natural Language Data Management and Interfaces. Yunyao holds a Ph.D. in Computer Science & Engineering from the University of Michigan - Ann Arbor.
Dragomir Radev was the A. Bartlett Giamatti Professor of Computer Science at Yale University. He had been a Fellow of ACM, AAAI, AAAS, and ACL. Dragomir's interests were in semantic parsing, text summarization, natural language generation, logical reasoning, and information retrieval. Dragomir was the co-author of Graph-based Natural Language Processing and Information Retrieval with Rada Mihalcea (Cambridge University Press, 2011). He was also the editor of a two-volume collection of problems from the North American Computational Linguistics Open Contest (NACLO), Puzzles in Logic, Languages and Computation: The Red Book and Puzzles in Logic, Languages and Computation: The Green Book (both published by Springer in 2013). Dragomir held a Ph.D. in Computer Science from Columbia University. Sadly, Drago passed away shortly before publication of this book in 2023.
Davood Rafiei holds the position of Professor of Computer Science and is an active member of the Database Systems Research Group at the University of Alberta. His areas of expertise span databases, information retrieval, and NLP with a focus on managing large complex data, data integration, and natural language interfaces to databases. He has co-authored the book Natural Language Data Management and Interfaces and, more recently, the article "DIN-SQL: Decomposed in-context learning of text-to-SQL with self-correction," which held the top position on two major text-to-SQL leaderboards. Davood regularly serves on the program committees for major database and data mining conferences (such as SIGMOD, VLDB, KDD, CIKM) and Web and IR conferences (such as WWW, SIGIR, WSDM). His academic journey includes undergraduate studies at Sharif University, a Master's degree from the University of Waterloo, and a Ph.D. from the University of Toronto. He has been a visiting scientist at Google (2007-2008), Kyoto University (2014), and the University of Paris Descartes (2015).
Contents
Preface.- Acknowledgments.- Introduction.- Background.- Natural Language Data Management.- Natural Language Interfaces to Databases.- Open Challenges and Opportunities.- Conclusions.- Bibliography.- Authors' Biographies.- Index.