Information is of strategic importance not only for businesses and governmental agencies, but also for individual citizens. Automatic methods for selecting and disseminating information would enable media monitoring companies to cover a much larger variety of media sources while working more cost-efficiently and providing 24-hour coverage and availability. This thesis investigates how professional media monitoring, which is currently a largely manual process, can be supported by automatic methods. Three main modules are necessary for automatic media monitoring: speech recognition, topic segmentation, and topic classification. The research conducted on these three topics and the resulting innovations are presented. The performance of the individual modules, as well as that of the complete system, is thoroughly investigated. The focus of this thesis is German news. Topic boundaries are determined using a novel approach to visual indexing. A speech recogniser transforms the audio signals into text, which is then classified for the presence of pre-defined topics. For topic classification, approaches based on Hidden Markov Models, Neural Networks, and Support Vector Machines (SVMs) are investigated. One contribution of this thesis is the introduction of novel couplers for SVMs that have advantages over known couplers. An additional topic covered in this thesis is Unsupervised Topic Discovery, a field nearly neglected in the literature. It makes it possible to find keywords in texts without a pre-defined topic list or training samples.