This book lays out a path leading from the linguistic and cognitive basics, to classical rule-based and machine learning algorithms, to today's state-of-the-art approaches, which use advanced empirically grounded techniques, automatic knowledge acquisition, and refined linguistic modeling to make a real difference in real-world applications. Anaphora and coreference resolution both refer to the process of linking textual phrases (and, consequently, the information attached to them), within as well as across sentence boundaries, to the same discourse referent.
The book offers an overview of recent research advances, focusing on practical, operational approaches and their applications. Part I (Background) provides a general introduction that succinctly summarizes the linguistic, cognitive, and computational foundations of anaphora processing and the key classical rule- and machine-learning-based anaphora resolution algorithms. Acknowledging the central importance of shared resources, part II (Resources) covers annotated corpora, formal evaluation, preprocessing technology, and off-the-shelf anaphora resolution systems. Part III (Algorithms) provides a thorough description of state-of-the-art anaphora resolution algorithms, covering enhanced machine learning methods as well as techniques for accomplishing important subtasks such as mention detection and acquisition of relevant knowledge. Part IV (Applications) deals with a selection of important anaphora and coreference resolution applications, discussing particular scenarios in diverse domains and distilling a best-practice model for systematically approaching new application cases.
In the concluding part V (Outlook), based on a survey conducted among the contributing authors, the prospects of the research field of anaphora processing are discussed, and promising new areas of interdisciplinary cooperation and emerging application scenarios are identified.
Given the book's design, it can be used both as an accompanying text for advanced lectures in computational linguistics, natural language engineering, and computer science, and as a reference work for research and independent study. It addresses an audience that includes academic researchers, university lecturers, postgraduate students, advanced undergraduate students, industrial researchers, and software engineers.
About the authors
Massimo Poesio is a cognitive scientist with a primary interest in computational linguistics but interests in psycholinguistics and neuroscience as well. His research includes the development of computational models of semantic and discourse interpretation (in particular, anaphora resolution); the creation of corpora of anaphorically annotated data (he pioneered the use of games-with-a-purpose for computational linguistics with the development of Phrase Detectives, http://www.phrasedetectives.org); the study of commonsense knowledge using a combination of methods from computational linguistics and from neuroscience; and the application of text analytics methods to real-life problems, such as deception detection and the identification of reports of human rights violations in social media.
Roland Stuckardt works as a consultant, research & development manager, and scientific researcher in the fields of computational linguistics and natural language processing. He studied computer science and economics at Goethe University Frankfurt. During his work at the German National Research Center for Information Technology (GMD) Darmstadt, he specialized in text analysis, parsing, discourse semantics, and robust anaphor resolution.
He received his PhD at Goethe University for his research on computer-based text content analysis in the social sciences. Among his research interests and main fields of work are anaphora processing, information extraction, media content monitoring, innovative natural language processing applications in general, and computer chess.
Yannick Versley is a group leader in the Leibniz-ScienceCampus "Empirical Linguistics and Computational Language Modeling", a collaboration between the Institute for German Language (IDS) in Mannheim and the Institute for Computational Linguistics at the University of Heidelberg. He studied Computer Science, Physics and Mathematics in Hamburg before doing a PhD in Tübingen on the coreference resolution of definite noun phrases in German newspaper text. During his subsequent work in Rovereto/Trento, Tübingen, and Heidelberg, he has worked on a number of topics including statistical parsing, coreference resolution, discourse relations, and distributional semantics, with particular attention to German.
Contents
Preface
1. Introduction
Part I Background
2. Linguistic and Cognitive Evidence About Anaphora
3. Early Approaches to Anaphora Resolution: Theoretically Inspired and Heuristic-Based
Part II Resources
4. Annotated Corpora and Annotation Tools
5. Evaluation Metrics
6. Evaluation Campaigns
7. Preprocessing Technology
8. Off-the-shelf Tools
Part III Algorithms
9. The Mention-Pair Model
10. Advanced Machine Learning Models for Coreference Resolution
11. Integer Linear Programming for Coreference Resolution
12. Extracting Anaphoric Agreement Properties from Corpora
13. Detecting Non-reference and Non-anaphoricity
14. Using Lexical and Encyclopedic Knowledge
Part IV Applications
15. Coreference Applications to Summarization
16. Towards a Procedure Model for Developing Anaphora Processing Applications
Part V Outlook
17. Challenges and Directions of Further Research
Index