TubeTagger: Semantic Concepts To Fuel Next Gen of Video Search

May 20th, 2008 by John-Alistair George    

Finding video on the web has been a hit-or-miss affair for quite some time, because search relies solely on the descriptions that users add to video content. While some argue that this is the best and most scalable model, it has inherent flaws.

However, we may be close to a more workable solution. Video recognition software has been around since 2001, but companies like YouTube have mainly used it to detect copyright infringement, especially after a slew of lawsuits by content owners like Viacom. The software is fairly rudimentary, working much like fingerprint matching. The most effective approach to date is to train the system on a large library of sample video clips, and even that barely yields a success rate 20% above a baseline guess.

All that is about to change thanks to the Computer Science Department of the University of Kaiserslautern and its system, called “TubeTagger”. Its main feature is that it eliminates the need for end users to train the system: it can be trained using any video source available to it.

In an experiment, the creators tested 22 tags, downloading 100 YouTube videos for each tag. For each tag, 50 videos were used for learning and 50 for testing, for a combined total of 195 hours of online video. The creators averaged about a 37% success rate for some of the tags and less for others. It’s still a far cry from full automation, but it is a great start. Detection is based on automatically recognizing semantic concepts such as locations, actions and objects in the video. The creators are now expanding this to include audio recognition and text scrubbing.
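The experimental setup described above (22 tags, 100 videos per tag, half for learning and half for testing) can be sketched roughly as follows. This is only an illustration: the tag names and video IDs are hypothetical placeholders, the random shuffle is an assumption, and the creators' actual selection procedure is not described here.

```python
import random


def split_per_tag(videos_by_tag, train_size=50, seed=0):
    """Split each tag's videos into a learning set and a test set.

    The shuffle and the fixed seed are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    train, test = {}, {}
    for tag, videos in videos_by_tag.items():
        shuffled = list(videos)
        rng.shuffle(shuffled)
        train[tag] = shuffled[:train_size]
        test[tag] = shuffled[train_size:]
    return train, test


# Hypothetical placeholder data: 22 tags with 100 video IDs each.
videos_by_tag = {
    f"tag_{t:02d}": [f"tag_{t:02d}_video_{v:03d}" for v in range(100)]
    for t in range(22)
}

train, test = split_per_tag(videos_by_tag)
total_videos = sum(len(videos) for videos in videos_by_tag.values())
print(total_videos, "videos in total")
```

If the 195 hours cover all 2,200 clips, the average clip would run a little over five minutes.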


Most companies gave up on this type of approach because of its complexity and the vast variety of video material. But it’s great to see people making progress on one of the most dreaded problems out there: “automatic relevant meta tagging of content”. Cracking this nut will lead to a host of new tools and features for finding, sharing and monetizing online video.

The group does have its skeptics. Alexander Hauptmann of Carnegie Mellon University, for one, does not think the system can be applied successfully in a broader landscape.

Nonetheless, this is still a bright spot on the video search landscape. Here is more on how it works:

The system uses the visual content of the videos only and learns statistical models for the connections between tags and features. Thereby, it integrates multiple features in so-called feature pipelines:

  • color
  • texture
  • motion
  • a bag-of-visual-words descriptor based on local patches
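
To make the idea of combining several feature pipelines more concrete, here is a minimal sketch in Python. It is not the TubeTagger implementation: the function names (`color_histogram`, `fuse_scores`, `rank_tags`), the coarse RGB histogram, the averaging of per-tag scores and the example scores are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean


def color_histogram(pixels, bins=4):
    """Turn (r, g, b) pixels with 0..255 channels into a normalized histogram.

    A stand-in for a "color" pipeline; the real system's features may differ.
    """
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Quantize each channel into `bins` buckets (0 .. bins - 1).
        qr, qg, qb = (c * bins // 256 for c in (r, g, b))
        hist[qr * bins * bins + qg * bins + qb] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def fuse_scores(pipeline_scores):
    """Average the per-tag scores produced by each pipeline (assumed fusion rule)."""
    collected = defaultdict(list)
    for scores in pipeline_scores.values():
        for tag, score in scores.items():
            collected[tag].append(score)
    return {tag: mean(values) for tag, values in collected.items()}


def rank_tags(fused, top_k=3):
    """Return the tags with the highest fused scores, best first."""
    return sorted(fused, key=fused.get, reverse=True)[:top_k]


# Hypothetical per-tag scores, as if each pipeline had its own trained model.
scores = {
    "color": {"beach": 0.8, "concert": 0.2},
    "motion": {"beach": 0.4, "concert": 0.6},
}
fused = fuse_scores(scores)
print(rank_tags(fused))
```

Averaging is only one way to combine pipelines. It is used here because it keeps each feature's model independent of the others.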

For a complete overview of the system and its reasoning as well as experimental results, please refer to our publication: Download PDF

To view the demo click here.

For more info visit the project page:

http://demo.iupr.org/videotagging/tagging-description.html

