Google To Extract Text From Images
By Reshma Kumar at January 08, 2008
This is super cool and interesting. Imagine having your flattened GIF text or video being read and indexed by the search engines - in particular, Google. Well, it might just happen. Google has apparently filed an application with the World Intellectual Property Organization to patent such a technology. As the filing puts it, “The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image.”
Google Image searches already show the image thumbnail, the title and URL of the webpage where the image resides, and the image's dimensions, file size, and format. The proposed technology would take this a step further and could improve, and possibly expand, the results of universal search on Google - i.e. web, images, maps, news, shopping, blogs, books, etc. That way our multimedia assets would be fully put to work in improving their findability. I wonder how this will work for highly stylized text, like a checkmark used to denote the letter “v”, or GIF text that is highly pixelated by design. Would this technology be able to read and interpret such characters?
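
To make the quoted method a little more concrete, here is a minimal Python sketch of the idea: keep a collection of keywords extracted from image text, look up the images whose text matches one or more of the search terms, and return them. Everything in it is an assumption for illustration, not Google's implementation: the sample data, the `build_index` and `search_images` names, and the ranking by number of matching terms are all hypothetical, and the step that actually reads text out of an image is not shown.

```python
from collections import defaultdict

# Hypothetical data: image URL -> text assumed to have already been
# extracted from the image (the extraction step itself is not shown).
EXTRACTED_TEXT = {
    "img/sale-banner.gif": "Winter Sale 50% Off",
    "img/contact.gif": "Contact Us",
    "img/logo.png": "Acme Winter Gear",
}


def build_index(extracted_text):
    """Map each lowercase keyword to the set of images whose text contains it."""
    index = defaultdict(set)
    for image, text in extracted_text.items():
        for word in text.lower().split():
            index[word].add(image)
    return index


def search_images(index, query):
    """Return images matching one or more search terms, best matches first."""
    hits = defaultdict(int)
    for term in set(query.lower().split()):
        # .get() avoids adding empty entries to the index for unknown terms.
        for image in index.get(term, ()):
            hits[image] += 1
    # Rank by number of matching terms (an assumption), then by name.
    return sorted(hits, key=lambda image: (-hits[image], image))


index = build_index(EXTRACTED_TEXT)
print(search_images(index, "winter sale"))
```

Searching for "winter sale" here returns the sale banner first, because it matches both terms, followed by the logo, which matches only "winter".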
Currently, we depend on ALT attribute text for the indexing of imagery, but it has been suggested that its relevance has been decreasing due to misuse. So, what’s next? Will search engines be able to interpret and assign meaning to standard iconic images without text, like an envelope used to denote email? Will they interpret shapes like the map of the US, an animal, or a graph, or even recognize photos or caricatures of popular people? There are lots of interesting things that can be done, but I guess we will have to wait and see.
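
For comparison, the ALT text that image indexing depends on today is easy to read straight out of a page's markup. The sketch below uses Python's standard `html.parser` module to collect it; the `AltTextCollector` class and `collect_alt_text` function are hypothetical names chosen for this example, and how a real search engine weighs this text is not something the sketch tries to model.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collect the ALT text of every <img> tag, keyed by its src."""

    def __init__(self):
        super().__init__()
        self.alt_by_src = {}

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if src:
            # A missing or valueless alt attribute is stored as an empty string.
            self.alt_by_src[src] = (attributes.get("alt") or "").strip()


def collect_alt_text(html):
    """Return a dict of image src -> ALT text for the given HTML string."""
    parser = AltTextCollector()
    parser.feed(html)
    parser.close()
    return parser.alt_by_src


page = '<p><img src="mail.gif" alt="Email us"> <img src="banner.gif"></p>'
print(collect_alt_text(page))
```

The second image has no ALT text at all, which is exactly the gap that reading text out of the image itself would fill.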


