  • Paperback

Product description
Video captioning, the task of describing the content of a video in natural language, is a popular task in both computer vision and natural language processing. Early work generated sentence-level captions for short video clips (Venugopalan et al., 2015). Krishna et al. (2017) proposed dense video captioning, in which the system must first detect event segments and then generate captions for them. Park et al. (2019) proposed video paragraph captioning: they use ground-truth event segments and focus on generating coherent paragraphs. Lei et al. (2020) follow this task setting and propose a recurrent transformer model that generates more coherent and less repetitive paragraphs. Because ground-truth event segments are often unavailable in practice, our goal is to generate paragraph captions without them.