The State of Google Images and Visual Search

8th August 2018


Hannibal Lecter talking to Clarice Starling

“We begin by coveting what we see every day. […] And don’t your eyes seek out the things you want?” (Hannibal Lecter to Clarice Starling, The Silence of the Lambs, 1991)

The SEO world is already old enough to live off two recurring trends:

  1. Discovering and dissecting shiny new objects, whose practical impact on business revenue has yet to reach its promised potential;
  2. Reviving topics that, for one reason or another, the same SEO world had archived as no longer sexy to talk about, but that never actually stopped being relevant as tasks.

In the first group, we can put the frenzy around voice search, augmented and virtual reality, and personal assistants.

To avoid misunderstandings: I am not saying that SEOs shouldn’t investigate the new fields and landscapes that technology is offering the search industry. But, let’s be honest, we all talk about them as if everybody in the world were doing complex voice searches, wearing a VR headset all day or relying on personal assistants for everything (and we know that, in the vast majority of cases, it is not so).

In the second group, we can cite old classics like log analysis, which all of a sudden returned as a primary task a couple of years ago, so much so that it spurred the creation of new log file analysis tools and the inclusion of this task in almost all the major crawlers. We can also count the renaissance of internal linking amongst these revivals.

But there is one field in Search that does not seem able to attract the same kind of interest from mainstream SEO publications, apart from the sporadic mention or post: image and visual search.

I find this quite surprising.

The same evolution of Machine Learning that is driving, for instance, Voice Search is revolutionizing a classic like Image Search, and it is opening up an entire, almost virgin field filled with opportunities: Visual Search.

The surprising thing, then, is that for once businesses are anticipating and creating the rules that govern both Image and Visual Search (even if we should reconsider the glorified idea we SEOs have of SEO as the most advanced digital marketing channel, but that is a topic for another post, maybe). I am not talking only about Google and Bing, but also about Pinterest, Amazon, Clarifai and even brands like Target, Macy’s or Home Depot.

In this post, I will concentrate on Google and I will try to offer you not only a truthful portrait of the state of Images and Visual Search but also actionable ideas you can implement in your SEO strategy right now and a glimpse of what the future of a fully integrated Voice + Visual Search will be.


At the last edition of The Inbounder, Rand Fishkin presented fresher data from earlier research he had conducted with the help of Jumpshot:

february 2018 Jumpshot search market USA

The data offered a surprise: Google Image Search accounted for 22.6% of all searches done on the Internet in the USA:

  • 5x more than YouTube;
  • 10x more than Amazon or Bing;
  • Almost 20x more than Facebook.

As I wrote before, though, this is perhaps a surprise only for us SEOs, because we have been neglecting Image Search ever since we decided that the quality of the traffic it generates is almost zero.

However, it doesn’t seem to be a surprise for the search engines (and not only for them).

In fact, if Image Search weren’t important for their core business, they would not have invested time, money and effort in radically changing the overall experience and the mechanisms governing Image Search.

Indeed, thanks to Machine Learning algorithms, Image Search is steadily transforming into Visual Search.

Nevertheless, we are still in a transitional phase, in which Image and Visual Search are still partly separate entities, but with a grey zone where they are colliding.

The State of Google Image Search

Image Search can be considered the perfect example of Search right now.

In fact, practically all the old classic rules of Image SEO are still valid, albeit often forgotten or deemed no longer influential.

At the same time, though, other new rules have come and their weight is becoming predominant.

The old classic rules that still matter

Classic Image SEO is still valid.

For those who are new to SEO and for those who may need to refresh dusty memories, here they are:

  • Image file names must be descriptive.

Letting web designers upload images named after their raw ID number is still nonsense.

Try, for instance, searching for a random-number.jpg and see how many different images Google Images returns.
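To make the file name rule concrete, here is a minimal sketch of a helper that turns a human description into a descriptive file name. The function name and the hyphen-separated, lowercase convention are my own assumptions for illustration, not a Google requirement.

```python
import re
import unicodedata


def descriptive_filename(description: str, extension: str = "jpg") -> str:
    """Turn a human description into a lowercase, hyphen-separated file name."""
    # Strip accents so that "Café" becomes "Cafe"
    ascii_text = (
        unicodedata.normalize("NFKD", description)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of non-alphanumeric characters with a single hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    return f"{slug}.{extension}"


print(descriptive_filename("Red Leather Tote Bag, Front View"))
# red-leather-tote-bag-front-view.jpg
```

A helper like this could run at upload time in a CMS, so that nobody has to remember to rename IMG_4312.jpg by hand.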

  • Alt attributes (commonly called alt tags or alt text) are still important for Image SEO.

Created for accessibility (for blind or visually impaired users), they are used by search engine crawlers, which are themselves blind, to understand the meaning of an image.

Plus, the alt text is used as “anchor text” when the image is wrapped in a link.
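A quick way to check how a page is doing on this point is to list the images that have no alt text at all. The following is only a sketch using Python's standard html.parser module; the class and function names are mine, and the sample HTML is invented.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> whose alt attribute is absent or empty."""

    def __init__(self) -> None:
        super().__init__()
        self.missing: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An attribute written without a value is reported as None
        alt = attributes.get("alt") or ""
        if not alt.strip():
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html: str) -> list[str]:
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


sample = """
<a href="/bags/red-tote"><img src="/img/red-leather-tote-bag.jpg" alt="Red leather tote bag"></a>
<img src="/img/IMG_4312.jpg">
<img src="/img/banner.jpg" alt="">
"""
print(images_missing_alt(sample))
# ['/img/IMG_4312.jpg', '/img/banner.jpg']
```

Keep in mind that an empty alt is legitimate for purely decorative images, so the output is a list of candidates to review, not a list of errors.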

  • Image captions (the classic journalistic custom of writing a short text under a photo to describe it) and image descriptions, which are used as readable text on image-only URLs, are still important for giving search engines hints about the nature of an image.
  • Image Sitemaps, which have taken on new importance since JavaScript-based frameworks became popular, because “onclick” is not something Google uses for discovering image links.
  • Contextual information, in other words the text surrounding an image, which Google often uses as a “meta description” in Image Search.
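For the Image Sitemaps point, this is a minimal sketch of how such a file could be generated with Python's standard library. The two namespaces are the ones used by the sitemap protocol and by Google's image extension; the function name and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages: dict[str, list[str]]) -> str:
    """Build a sitemap where each page URL lists the images it contains."""
    # Register prefixes so the output uses xmlns and xmlns:image
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


example_pages = {
    "https://example.com/bags/red-tote": [
        "https://example.com/img/red-leather-tote-bag.jpg",
        "https://example.com/img/red-leather-tote-bag-worn.jpg",
    ]
}
print(build_image_sitemap(example_pages))
```

The point of listing images explicitly is that Google does not need to execute a click handler to find them.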

Then we have “new” Image Search SEO rules, which are becoming more and more relevant for earning greater visibility not only in the Google Images vertical, but also in Universal Search.

I am talking about Structured Data for Images.

Google requires images to be marked up for:

  1. Products;
  2. Recipes;
  3. Carousels;
  4. Articles (in their AMP version);
  5. Logos (to appear in the Knowledge Panel);
  6. Restaurants, in the case of Local Businesses;
  7. Video (the thumbnail representing the video).

Google is very precise about how to mark up structured data for Image Objects:

Image object structured data guidelines from the Search Gallery of Google.

Structured Data, then, is what Google has been using since the end of December 2016 to give new meaning to what was essentially a zombie with no monetization.

Product Rich Result in Google Images Search

Apart from the introduction of Product Listing Ads (PLAs) in Image Search, a topic that I will not cover here, the use of Product Structured Data, with the mandatory markup of the Image Object representing the product and of its price, is what makes Google Image Search similar to Pinterest.

With this said, we may also consider the increased use of structured data for painting the SERPs a symptom of a more general evolution of Google from strings to things or, to use a definition that Cindy Krum made famous with her series of posts on MobileMoxie, to an entity-first search engine.
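As an illustration of Product markup that carries both the image and the price, here is a sketch that builds a JSON-LD object with schema.org's Product and Offer types. The function name and the sample values are invented, and the exact set of required and recommended properties should be checked against Google's current documentation.

```python
import json


def product_json_ld(name: str, image_urls: list[str], price: float,
                    currency: str, page_url: str) -> str:
    """Return Product structured data (JSON-LD) as a string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        # One or more images representing the product
        "image": image_urls,
        "offers": {
            "@type": "Offer",
            "url": page_url,
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": "https://schema.org/InStock",
        },
    }
    return json.dumps(data, indent=2)


markup = product_json_ld(
    name="Red leather tote bag",
    image_urls=["https://example.com/img/red-leather-tote-bag.jpg"],
    price=129.0,
    currency="EUR",
    page_url="https://example.com/bags/red-tote",
)
print(markup)
```

The resulting string is meant to be placed inside a script element of type application/ld+json on the product page.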

This is especially true in Image Search.

Classic Image SEO factors (e.g. the contextual information surrounding an image), combined with structured data, improve the overall understanding of the meaning of the content published on our page.

The better Google understands the true meaning of our content without ambiguity, the more likely it is to use that content as an answer to the related queries users may make.

Obviously, this is not the same as saying that structured data and semantics are ranking factors, but their use surely helps a piece of content respond to a larger set of queries, and hence be more visible in the SERPs.

Another big change, whose importance has not been fully understood even though it was introduced two years ago, is the tagging system, with which Google offers us the opportunity to refine our image searches:

Images Search result for Lightsaber and Tagging Grouping Explained

The tags are not keywords: they are Named Entities that Google associates with our query on the basis of its Knowledge Base and of web documents related to the query.

Google then groups them by contextual closeness, as you can see in the annotated screenshot above.

Because these suggestions are based on entity closeness and grouping, they can be of wonderful help when we need to understand the overall ontology that, according to Google, our main entity is part of, and hence for improving the accuracy of our entity research and, in turn, our topical research.

This tagging of semantically related entities in Image Search can offer us even better topic modeling insights than Google Suggest or Related Searches do in Universal Search:

Images Search Tags as Topical Research Tool

Strangely (or not so strangely), no tool has yet discovered the potential of the Image Search tagging system…

However, the tagging system isn’t the only new feature that Google has introduced in Image Search.

Google Related Items

Remember this example I used before?

The image was purposely cropped so as not to show you immediately one of the other new features Google introduced in 2017 and started expanding this year: Related Items.

Related Items is available only in mobile search.

Its purpose is obvious from the screenshot. However, it is quite surprising that this potentially powerful feature is not widely used by ecommerce websites.

The second new Image Search feature Google introduced in 2017 was Style Ideas.

When we are looking, for instance, for a bag, Google may show us not only similar items but also style ideas for combining that bag with other products.

You can imagine the potential in terms of organic traffic these 2 features can offer.

But how do they work?

As we saw, structured data plays a big role in Related Items, but this doesn’t seem to be the case for Style Ideas.

This last Image Search feature is almost entirely based on Visual Search, and it is a first glimpse of where Google really wants to move next.

The State of Google Visual Search

Visual Search, as Purna Virji defined it very well at a conference, is the ideal and only way of searching for something when we don’t even know what to call it.

To use a metaphor, any Visual Search engine is a sort of Shazam for images.

Visual Search, like Voice Search, wouldn’t be possible without Machine Learning, and it is based on image recognition algorithms.

It is not my intention to turn this post into a brainy and arid (albeit possibly fascinating) dissertation on image recognition algorithms; therefore I will try to be more didactic than purely scientific.

How image recognition algorithms work (in plain English)

Night Sky for simple image recognition test

Can you see the star?

Well, actually it is not a star, but our own planet Earth as seen from the surface of Mars.

Earth as seen from Mars

This is the simplest example I can offer you of how Image Recognition works: “identifying the target while discarding the distractors”.

The grey pixels of the Martian sky are the distractors and the pale blue dot of the Earth is the target (as in the example on the left in the image below).

This obviously becomes harder and harder as more and different distractors are present in an image (the example on the right):

Image Recognition explained – Distractors Target and Pop-Up Effect
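To show the target and distractors idea in code, here is a deliberately naive toy: it treats an image as a grid of grey values and picks the pixel that differs most from the typical one. This is only an illustration of the "pop-out" intuition under my own simplifying assumptions; real image recognition relies on learned models, not on this rule, and the grid values below are invented.

```python
from statistics import median


def find_target(grid: list[list[int]]) -> tuple[int, int]:
    """Return (row, column) of the pixel that differs most from the typical pixel."""
    # The median stands in for the background, i.e. the distractors
    background = median(value for row in grid for value in row)
    best = max(
        (abs(value - background), r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
    )
    return best[1], best[2]


# A nearly uniform grey "Martian sky" with one bright dot
sky = [
    [90, 92, 91, 90],
    [91, 90, 93, 92],
    [90, 91, 210, 90],
    [92, 90, 91, 91],
]
print(find_target(sky))
# (2, 2)
```

With a uniform background the target is trivial to find. As soon as the grid contains many different bright values, this simple rule breaks down, which is exactly the harder case shown on the right of the image.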

Image recognition algorithms, however, are still not perfect. In fact, MIT researchers successfully tricked Google’s with their experiments, and many of the accusations of “racism” Google received in the recent past (the infamous case of a black woman tagged as a gorilla) are due to these imperfections, which are usually directly related to the low quality of the images analyzed by the algorithm.

Nevertheless, Google’s image recognition technology, Google Cloud Vision, is quite impressive, and it is the basis of Google’s Visual Search.

Cloud Vision (and Google’s Visual Search), though, relies not only on image recognition and image similarity (e.g. predominant colors shared by two images). To offer better visual search results, it also uses Entity extraction from:

  1. The page hosting the image (if it exists);
  2. Websites using the same or very similar images;
  3. “Image SEO best practices”.

An important side note is that Cloud Vision allows Google to offer Visual Search not only in an image set but also in a video set (obvious, because a video is a sequence of static images, but expensive in terms of computing costs).

Let’s see a couple of examples.

The Ugly Duckling test

Ugly Duckling Cloud Vision test

In this example, I uploaded to Cloud Vision (and you can run the same test) a frame I captured from the original The Ugly Duckling animated short by Walt Disney.

I saved the image with the descriptive name “Brutto anatroccolo.png”, which is “ugly duckling” in Italian, and uploaded it to test Cloud Vision capabilities.

Thanks to Image Recognition, Cloud Vision was able to see which other websites have the same image or very similar images.

From the information retrieved from those websites, Google was then able to extract the web entities related to the image I uploaded, assigning each web entity a score:

  1. Walt Disney;
  2. The Ugly Duckling (meant as the original fable title);
  3. The Ugly Duckling (meant as the title of the Disney Silly Symphony short inspired by the fable of the same name);
  4. Merbabies (another of Disney’s Silly Symphonies animated shorts, made in the same years as The Ugly Duckling);
  5. The Old Mill (Silly Symphonies short);
  6. Wynken Blynken and Nod (Silly Symphonies short);
  7. Farmyard Symphony (another Silly Symphonies short).
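If you prefer to repeat this test programmatically rather than through the demo page, the sketch below shows the general shape of a web detection request for the Cloud Vision REST endpoint (images:annotate) and how the returned web entities could be ranked by score. The helper names are mine, and the sample response uses made-up IDs and scores purely for illustration; it is not the output of my test.

```python
import base64
import json


def web_detection_request(image_bytes: bytes, max_results: int = 10) -> str:
    """Build the JSON body asking Cloud Vision for web detection on one image."""
    body = {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "WEB_DETECTION", "maxResults": max_results}],
            }
        ]
    }
    return json.dumps(body)


def ranked_web_entities(response: dict) -> list[tuple[str, float]]:
    """Return (description, score) pairs, highest score first."""
    detection = response["responses"][0].get("webDetection", {})
    entities = detection.get("webEntities", [])
    # Some entities come back without a description; skip those
    named = [(e["description"], e.get("score", 0.0))
             for e in entities if e.get("description")]
    return sorted(named, key=lambda pair: pair[1], reverse=True)


# Hypothetical response: IDs and scores are invented for this example
sample_response = {
    "responses": [
        {
            "webDetection": {
                "webEntities": [
                    {"entityId": "/m/hypothetical1", "score": 0.7, "description": "The Ugly Duckling"},
                    {"entityId": "/m/hypothetical2", "score": 1.2, "description": "Walt Disney"},
                    {"entityId": "/m/hypothetical3", "score": 0.4},
                ]
            }
        }
    ]
}
print(ranked_web_entities(sample_response))
# [('Walt Disney', 1.2), ('The Ugly Duckling', 0.7)]
```

Sending the request requires a Google Cloud project and credentials, which I have left out on purpose; the sketch only covers building the body and reading the answer.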

The fashion photo test

An even better example is this other one, for which I used an image taken directly from a random search in Google Images:

Fashion Photo Cloud Vision test 1


Fashion Photo Cloud Vision test 2

Almost everything is correct:

In fact, this is:

  1. a photo of the fashion model Karlie Kloss
  2. wearing an haute couture
  3. gown
  4. designed by Alessandro Michele and
  5. presented during the Gucci Cruise Collection fashion show
  6. at the Palazzo Pitti
  7. in Florence.

Then, as I anticipated, Cloud Vision also presents the dominant color pattern of the image, giving each color a value based on its relevance in the image:

Fashion Photo Cloud Vision test 3

This Dominant Colors information is extremely useful not only for improving, for instance, reverse image search, but also for populating Style Ideas:

Style Ideas by Google
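To give an idea of what "a value per dominant color" means, here is a rough sketch that groups pixels into coarse color buckets and reports the share of the image each bucket covers. This is my own simplified stand-in, assuming pixels are plain (red, green, blue) tuples; it is not how Cloud Vision computes its scores.

```python
from collections import Counter


def dominant_colors(pixels: list[tuple[int, int, int]],
                    bucket: int = 64, top: int = 3):
    """Return the most common coarse colors with the fraction of pixels they cover."""

    def quantize(channel: int) -> int:
        # Snap each 0-255 channel to the centre of its bucket
        return (channel // bucket) * bucket + bucket // 2

    counts = Counter(tuple(quantize(ch) for ch in pixel) for pixel in pixels)
    total = sum(counts.values())
    return [(color, count / total) for color, count in counts.most_common(top)]


# Invented sample: mostly red pixels with a little blue
pixels = [(250, 10, 10)] * 6 + [(245, 20, 5)] * 2 + [(10, 10, 240)] * 2
print(dominant_colors(pixels))
# [((224, 32, 32), 0.8), ((32, 32, 224), 0.2)]
```

Two photos whose top buckets overlap could then be treated as similar in color, which is the kind of signal the color matching mentioned above would build on.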

Finally, if we consider that Google is able to make Cloud Vision and the Cloud Natural Language algorithms work together seamlessly, then we can see how much Visual Search (in combination with Actions) is at the base of all the search capabilities of Google Lens:

Google Lens GIF

If Image Recognition has started playing a bigger role in the slow but steady transformation of Image Search into Visual Search, how can we optimize for it?

This is how we can do Visual Search SEO.

  • Image Search SEO

Yes! Image Search SEO still has real value for Visual Search too, no longer (or not only) in terms of “keywords”, but in terms of giving Google a better, unambiguous understanding of the meaning of an image.

  • Links

Incredible, isn’t it? In the example I presented before, we saw how Google relies on the content of the pages using our images to better understand the image itself.

However, links here are to be understood purely as a medium that helps the image recognition algorithm discover matching images, hence as Search Entities, and not in terms of the pure Link Graph.

This means that we should make it easy for other websites to use our images and to link back to the pages of our website they took them from.

  • High-Quality Images/Photos

We have also seen how quality directly influences image recognition.

In other words: the lower the quality, the poorer the image recognition.

However, we also know how much images weigh on everything related to PageSpeed. Therefore, while using high-quality images, we still need to take care not to bloat our web pages with extra megabytes that cause dangerous web performance issues, especially on mobile.

We can use image compression software like TinyPNG, Optimizilla and others (plus studying this excellent post by Kristine Schachinger).

  • Respect the 16:9 or 4:3 aspect ratio for ideal cropping in Image Search

Respecting the 16:9 or 4:3 format is needed to keep Google itself from deforming our images in the Image Search results.

Pay attention to this detail, because there are studies showing that deformed image thumbnails have the worst CTR.
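If the original photo has a different shape, we can crop it ourselves before publishing instead of leaving the decision to Google. The sketch below computes the largest centered crop with a given ratio; the function name is hypothetical, and a centered crop is just one simple choice (a subject near the edge would need a smarter one).

```python
def center_crop_box(width: int, height: int,
                    ratio_w: int = 16, ratio_h: int = 9) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered crop with the given ratio."""
    # Compare width/height with ratio_w/ratio_h using integers only
    if width * ratio_h > height * ratio_w:
        # Image is too wide: keep the full height and trim the sides
        new_w = height * ratio_w // ratio_h
        new_h = height
    else:
        # Image is too tall (or already right): keep the full width
        new_w = width
        new_h = width * ratio_h // ratio_w
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, left + new_w, top + new_h


print(center_crop_box(1200, 1200))        # square photo cropped to 16:9
# (0, 262, 1200, 937)
print(center_crop_box(1920, 1080, 4, 3))  # 16:9 photo cropped to 4:3
# (240, 0, 1680, 1080)
```

The returned tuple follows the (left, upper, right, lower) order that image libraries such as Pillow expect for a crop box.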

  • Pay attention to details

When producing the photos to use on our website, we must:

  1. Create photos that show our product in a real environment: for instance, a bag carried by a model and not only the classic product photo of the bag, or a decorative item actually used to decorate a room.
  2. Take great care to ask photographers to shoot photos that do not hide the additional products, for instance in “lifestyle” fashion photos. This matters if we want Google to use our photo as a “Style Idea” for the secondary items too, and not only for the main object of the photo (somewhat like working on secondary keywords in on-page SEO).

  • Do not use stock photos

Stock photos are used by many websites, whose nature can differ widely, and usually in every possible context. As we saw, that context is very important for Cloud Vision when extracting the Entities related to an image, so a photo that appears everywhere gives Google no consistent context to associate with our page.


Google Image Search is undergoing a deep transformation from the useless vertical it was only 18 months ago into a complex, ever-growing search option used by millions of people, whose behavior clearly tells us that they are more comfortable searching and communicating with images than with words (think of Pinterest, Instagram, Snapchat and the explosion of visual content on Facebook).

Moreover, Image Search and, soon, Visual Search form a search vertical that promises to be more interesting to marketers for product discovery and transactional searches. The same is not true of Voice Search, which seems to be more about informational searches and, when it comes to transactional ones, about buying branded products that the user has already tried or discovered.

Finally… most SEOs are still not paying attention to Image and Visual Search… don’t be one of them. Start optimizing for Image and Visual Search and gain a competitive advantage.


Written By
Gianluca Fiorelli is an SEO and Web Marketing Strategist who operates in the Italian, Spanish and English-speaking markets. He also works regularly as an independent consultant with bigger international SEO agencies.