The data is collected from TV programs covering non-fictional content, broadcast over various channels in English and German.
Format: TAB-separated fields consisting of Id, Program Title, Program Short Description, Speech Transcription, Channel Name, Date of Show.
Channels: 'cnbc', 'zdf', 'bloomberg-europe', 'mdr-sachsen', 'n-tv', 'hr', 'zdf-info', 'skynews-intl', 'bbc-world-service', 'swr-fernsehen-bw', 'sf-info', 'euronews-en', 'DE_arte', 'deutsche-welle', 'phoenix', 'cnn-international', 'wdr-koeln', 'n24', 'ndr-niedersachsen', 'einsextra', 'br-alpha'
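
As a rough illustration of the format above, the following Python sketch reads TAB-separated lines into dictionaries. The field labels in FIELDS, the function name read_records, and the sample line are all assumptions made for this example; the dataset description does not say whether the files carry a header row or how the date is formatted, so adjust accordingly.

```python
import io
from typing import Dict, Iterable, Iterator

# Hypothetical labels for the six columns listed in the format description.
FIELDS = ["id", "title", "short_description", "transcription", "channel", "date"]


def read_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per TAB-separated line.

    Assumption: lines that do not have exactly six fields are skipped.
    """
    for line in lines:
        parts = line.rstrip("\r\n").split("\t")
        if len(parts) != len(FIELDS):
            continue
        yield dict(zip(FIELDS, parts))


# Made-up sample line, not taken from the dataset.
sample = io.StringIO(
    "1\tNews Hour\tDaily news summary\tGood evening and welcome\tcnbc\t2015-01-01\n"
)
records = list(read_records(sample))
print(records[0]["channel"], records[0]["title"])
```

With a real file, the same function could be given an open file handle, for example `open(path, encoding="utf-8")`, where `path` points at a downloaded file.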


You can download the content from the following links:


  • We thank ZATTOO for providing access to TV shows for generating the dataset.
  • We thank SAIL LABS for providing speech software for research.


Aditya Mogadala

For any queries, please mail: aditya DOT mogadala AT kit DOT edu

Copyright © 2016 Aditya Mogadala. All Rights Reserved.
