Video Grounding and Its Generalization (eBook, PDF)

From I.D. and Task-specific Models to O.O.D. and Large Foundation Models

Xin Wang, Xiaohan Lan, Wenwu Zhu

Format: PDF

Devices: PC
No copy protection
Size: 27.29 MB

Other customers were also interested in

Computational Intelligence for Multimedia Understanding (eBook, PDF)

36,95 €
Advances in Multimedia Information Processing - PCM 2018 (eBook, PDF)

40,95 €
Digital TV and Wireless Multimedia Communication (eBook, PDF)

72,95 €
Advances in Multimedia Information Processing - PCM 2018 (eBook, PDF)

72,95 €
Advances in Multimedia Modeling (eBook, PDF)

40,95 €
Advances in Multimedia Information Processing - PCM 2018 (eBook, PDF)

72,95 €
Advances in Multimedia Modeling (eBook, PDF)

40,95 €

Product description

This book consists of two parts: Part I, Methodologies for Video Grounding, and Part II, Generalized Video Grounding and Trending Directions. To keep the book self-contained and current, Part I covers basic and advanced methodologies for Video Grounding and discusses key comparisons with several representative Vision-Language learning tasks, including multimodal understanding and generation. Part II covers our insights into Generalized Video Grounding and the development of Video Grounding in the era of large foundation models, and discusses future directions, such as Out-of-Distribution settings, that deserve further investigation.

The discussion covers both Video Grounding itself and other Vision-Language tasks, as well as the relations between them. The basics and advances span Video Grounding from model to benchmark, from supervised learning to unsupervised pre-training, from single-video grounding to video-corpus grounding, and from in-distribution settings to out-of-distribution settings. For Generalized Video Grounding, we discuss cross-modal grounding, event grounding for multi-modal tasks, various distribution shifts in out-of-distribution settings, explainable Video Grounding, and large foundation models for Video Grounding.

We sincerely hope this book will benefit interested readers in both academia and industry, from junior researchers just starting out to senior practitioners in IT companies.

For legal reasons, this download can only be delivered to customers with a billing address in A, B, BG, CY, CZ, D, DK, EW, E, FIN, F, GR, HR, H, IRL, I, LT, L, LR, M, NL, PL, P, R, S, SLO, SK.

Product details

Publisher: Springer Nature Switzerland
Number of pages: 209
Publication date: 1 January 2026
Language: English
ISBN-13: 9783031948374
Item no.: 76270714

About the authors

Xin Wang is currently an Associate Professor in the Department of Computer Science and Technology, Tsinghua University. He received both his Ph.D. and B.E. degrees in Computer Science and Technology from Zhejiang University, China. He also holds a Ph.D. degree in Computing Science from Simon Fraser University, Canada. His research interests include multimedia intelligence and machine learning and its applications. He has published over 200 research papers in venues including ICML, NeurIPS, IEEE TPAMI, IEEE TKDE, ACM KDD, WWW, ACM SIGIR, and ACM Multimedia, winning three best paper awards, including one at ACM Multimedia Asia. He is a recipient of the ACM China Rising Star Award and the IEEE TCMC Rising Star Award, and was named a DAMO Academy Young Fellow.

Xiaohan Lan obtained her M.S. degree from Shenzhen International Graduate School, Tsinghua University. She received her B.E. degree from the Department of Computer Science and Technology of Beijing Normal University in 2020. Her main research interests include multimedia computation, vision and language understanding, and deep learning.

Wenwu Zhu is currently a Professor in the Department of Computer Science and Technology at Tsinghua University. He also serves as the Vice Dean of the Beijing National Research Center for Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008. He worked at Bell Labs, New Jersey, as a Member of Technical Staff from 1996 to 1999. He received his Ph.D. degree from New York University in 1996. His research interests include graph machine learning, curriculum learning, data-driven multimedia, and big data. He has published over 400 refereed papers and is the inventor of over 100 patents. He has received ten Best Paper Awards, including ACM Multimedia 2012 and IEEE Transactions on Circuits and Systems for Video Technology in 2001 and 2019. He serves as the Editor-in-Chief (EiC) of IEEE Transactions on Circuits and Systems for Video Technology, and served as the EiC of IEEE Transactions on Multimedia (2017-2019) and as the Chair of the steering committee for IEEE Transactions on Multimedia (2020-2022). He served as General Co-Chair for ACM Multimedia 2018 and ACM CIKM 2019. He is an AAAS Fellow, IEEE Fellow, ACM Fellow, SPIE Fellow, and a member of Academia Europaea.

Table of contents

Preface.- Introduction.- Traditional Temporal Sentence Grounding in Videos.- Generalized Video Grounding.- Future Research Directions.- References.
