Computer Vision

Cognitive Models for Visual Commonsense

Yixin Zhu, Song-Chun Zhu

Hardcover

Other edition:
eBook, PDF

Other customers were also interested in

Computer Vision - ACCV 2024 Workshops

50,99 €
Advanced Topics in Computer Vision

76,99 €
Xin Yang
3D Scene Modeling and Robotics Interaction

136,99 €
Computer Vision - ECCV 2018

77,99 €
Computer Vision -- ECCV 2014

39,99 €

Product description

This volume on visual commonsense reasoning, part of a comprehensive three-volume series, presents a computational framework for bridging the gap between modern computer vision capabilities and human-like visual understanding. While current AI systems excel at pattern recognition tasks, they often lack the sophisticated reasoning capabilities that humans demonstrate effortlessly in understanding and interacting with their environment. This work addresses this limitation by integrating physical, social, and abstract reasoning within a unified computational framework.

The volume is organized into three parts. The first part establishes the theoretical foundations of visual commonsense through a systematic examination of physical understanding, including affordances, intuitive physics, causality, and tool use. These components form the basis for understanding how objects and environments behave and interact. The second part turns to social reasoning, exploring intent, theory of mind, and nonverbal communication, all crucial capabilities for AI systems to interpret and predict human behavior. The third part investigates abstract visual reasoning, examining higher-level cognitive capabilities.

Drawing from cognitive science, computer vision, and artificial intelligence, this work:
Provides a systematic treatment of visual commonsense ranging from foundational theories to practical implementations
Introduces computational frameworks integrating multiple forms of reasoning
Demonstrates applications through extensive examples and case studies
Highlights current challenges and future directions in developing human-like visual AI
This carefully crafted volume serves as an invaluable resource for researchers, graduate students, and practitioners in computer vision, artificial intelligence, cognitive science, and related fields. It offers both theoretical insights and practical guidance for developing AI systems with more sophisticated visual understanding capabilities, moving closer to human-like visual intelligence.

Product details

Publisher: Springer / Springer Nature Switzerland / Springer, Berlin
Publisher's item no.: 978-3-031-98106-7
Pages: 570
Publication date: 24 January 2026
Language: English
Dimensions: 241mm x 160mm x 36mm
Weight: 1157g
ISBN-13: 9783031981067
ISBN-10: 3031981065
Item no.: 74341143

Manufacturer information
Libri GmbH
Europaallee 1
36244 Bad Hersfeld
gpsr@libri.de

About the authors

Yixin Zhu is a Boya Assistant Professor at the Institute for Artificial Intelligence, Peking University, where he serves as Assistant Dean. Dr. Zhu received his Ph.D. in Statistics from the University of California, Los Angeles (2018), advised by Professor Song-Chun Zhu. His research aims to construct interactive AI systems by fusing high-level common sense (including functionality, affordance, intuitive physics, causality, and intent) with raw sensory data such as pixels and haptic signals. This interdisciplinary approach seeks to endow machines with sophisticated representations and robust reasoning capabilities across objects, scenes, shapes, numbers, and intelligent agents.

Song-Chun Zhu is a distinguished computer scientist specializing in computer vision, cognitive AI, and robotics. He received his B.S. from the University of Science and Technology of China (1991) and Ph.D. from Harvard University (1996). After positions at Stanford University and Ohio State University, he served as professor at UCLA (2002-2020), where he directed the Center for Vision, Cognition, Learning and Autonomy. Since 2020, he has been Chair Professor at Peking University and Tsinghua University, directing the Beijing Institute for General Artificial Intelligence (BIGAI). His pioneering work includes the FRAME model, stochastic grammar, and cognitive AI frameworks integrating visual commonsense reasoning. His contributions have earned him numerous accolades, including the David Marr Prize (2003), J.K. Aggarwal Prize (2008), and IEEE Fellow (2011). Through his research, institution building, and leadership in major conferences, Dr. Zhu continues to advance the development of interpretable and generalizable AI systems that bridge computational approaches with human-like reasoning.

Table of contents

Introduction.- Affordance and Functionality.- Physical Commonsense Reasoning.- Causality in Daily Activities.- Tool-use.- Mirroring and Imitation.- Utility.- Nonverbal Communication: Gaze, Pointing and Drawing.- Intention.- Animacy: Physical vs. Social Perception.- Theory of Mind Representations.- Explainable AI.- Communicative Learning.- Abstract Reasoning.- The Current State and Challenges.
