Cross-Media Lecture Retrieval System for Lecture Videos
Title: Cross-Media Lecture Retrieval System for Lecture Videos

Abstract: This research proposes a cross-media lecture-on-demand system that lets users selectively view specific segments of lecture videos by submitting text queries. Users who cannot come up with effective keywords can formulate queries from the textbook associated with the target lecture. The system extracts the audio track from the lecture video, transcribes it using large vocabulary continuous speech recognition, and builds a text index from the transcription. Experimental results show that adapting speech recognition to the topic of the lecture increases recognition accuracy and improves retrieval accuracy to a level comparable with human transcription.

Research Question: How can a cross-media lecture-on-demand system be designed so that users can retrieve relevant video/audio passages in response to text queries, improving the efficiency of information retrieval from lecture videos?

Methodology: The proposed system consists of an offline process and an online process. In the offline process, the audio track of the target lecture video is extracted and segmented into passages. A speech recognition system transcribes each passage, and the transcribed passages are indexed for efficient retrieval. To adapt speech recognition to a specific lecturer, unsupervised speaker adaptation is performed using an initial speech recognition result (i.e., a transcription). In the online process, users submit text queries to retrieve relevant video/audio passages.

Results: The experiments demonstrate that adapting speech recognition to the topic of the lecture increases recognition accuracy and improves retrieval accuracy to a level comparable with human transcription. This indicates that the proposed system effectively retrieves relevant video/audio passages in response to text queries.
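The offline indexing and online retrieval steps described under Methodology can be illustrated with a minimal sketch. It assumes the passages have already been transcribed by a speech recognizer, and it ranks passages with a simple TF-IDF score. The scoring scheme, the function names (`build_index`, `retrieve`), and the passage format `(start_sec, end_sec, transcript)` are illustrative assumptions, not the paper's actual implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(passages):
    """Offline step (hypothetical): index already-transcribed passages.

    `passages` is a list of (start_sec, end_sec, transcript) tuples.
    """
    term_counts = [Counter(tokenize(text)) for _, _, text in passages]
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())
    n = len(passages)
    # Smoothed inverse document frequency; an assumed weighting scheme.
    idf = {t: math.log((n + 1) / (df + 1)) + 1.0 for t, df in doc_freq.items()}
    return {"passages": passages, "term_counts": term_counts, "idf": idf}


def retrieve(index, query, top_k=3):
    """Online step (hypothetical): rank passages against a text query."""
    query_counts = Counter(tokenize(query))
    scored = []
    for (start, end, _), counts in zip(index["passages"], index["term_counts"]):
        score = sum(
            qtf * counts[term] * index["idf"].get(term, 0.0) ** 2
            for term, qtf in query_counts.items()
        )
        if score > 0:
            scored.append((score, start, end))
    scored.sort(key=lambda item: item[0], reverse=True)
    # Return the time spans so a player could seek to each passage.
    return [(start, end) for _, start, end in scored[:top_k]]
```

A query could be a sentence copied from the textbook; `retrieve` returns the time spans of the best-matching passages, which a video player could then use to seek to the relevant segments.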
Implications: The research highlights the potential of cross-media systems to improve the efficiency of information retrieval from multimedia content. By adapting speech recognition to the topic of the lecture, the system achieves high recognition and retrieval accuracy, making it a promising approach for other multimedia retrieval applications. Additionally, using textbooks to formulate queries provides a user-friendly interface that lets users retrieve relevant information more easily and efficiently.

Link to Article: https://arxiv.org/abs/0309021v1

Authors:

arXiv ID: 0309021v1

[[Category:Computer Science]] [[Category:Lecture]] [[Category:Recognition]] [[Category:Retrieval]] [[Category:System]] [[Category:Speech]]