This post is presented by CIBC

From today’s vantage point, it looks like nothing will slow the momentum of the Visual Web.

Selfies, pins, memes, and other picture formats are now dominating our once text-based Web. Sites like Pinterest, Tumblr, and Instagram don’t just rule mobile; they also boast the youngest, hippest audiences coveted by marketers and startups alike. It’s no wonder that in 2013, the Visual Web became a billion-dollar trend, sparking skyrocketing valuations for the leading players. 

But there’s only been one truth on the Web that’s sure to last, and that’s “innovate or die.” Like every other disruptive technology, the Visual Web will run into challenges that could jeopardize its widespread adoption.

If the Visual Web wants to last, image-centered networks and apps will need to keep improving. Here are the biggest issues facing the Visual Web today—and the ways it might overcome them.

True Visual Search 

Since the adolescence of the Web, search engines have identified textual cues and linked them to the objects that users are most likely looking for. Web users and Web publishers have grown accustomed to looking for and providing answers based on text. 

Needless to say, that’s not going to work on the Visual Web. Algorithms that rank images by the text around them are easy to game. Case in point: in the early days of Pinterest, the network’s algorithm gave more weight to image titles than anything else. That meant anybody could make their image the “most” relevant by typing the title three times.
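
To see why a title-weighted ranking is so easy to game, here is a minimal sketch in Python. It is not Pinterest’s actual algorithm, which the article does not describe in detail; the function name `naive_title_score` and the sample titles are hypothetical, and the scoring rule (count how often the query words appear in the title) is an assumption chosen only to illustrate the problem.

```python
def naive_title_score(query: str, title: str) -> int:
    """Hypothetical scoring rule: count occurrences of each query word in the title."""
    terms = query.lower().split()
    words = title.lower().split()
    return sum(words.count(term) for term in terms)


# Two made-up image titles: one honest, one with the title typed three times.
honest = "Red velvet cake"
stuffed = "Red velvet cake red velvet cake red velvet cake"

query = "red velvet cake"
ranked = sorted(
    [honest, stuffed],
    key=lambda title: naive_title_score(query, title),
    reverse=True,
)
print(ranked[0])  # the stuffed title ranks first under this rule
```

Under this assumed rule, repeating the title triples the score without changing anything about the image itself, which is exactly the weakness described above.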

According to Apu Gupta, CEO of Curalate, the provider of a suite of Visual Web analytics tools, the biggest challenge the Visual Web faces right now is restoring context to its content—with no text required. 

“Every single social media analytics tool that has ever been created has assumed text would be a constant,” he said. “As consumers increasingly communicate using pictures, they decreasingly use words. In a world where the text based cues become smaller and smaller, determining what an image represents becomes harder and harder.”

Right now, players in the Visual Web are working on this issue in several different ways. The most obvious solution is for companies and vendors to develop technology that reads the images themselves. Facebook has long done this with its less-than-reliable facial recognition software. GazeMetrix takes this a step further by identifying brands and logos from the pixels of an online image. Curalate itself uses a technique called “pixel recognition” to identify images on the Visual Web by their pixel makeup.

Pinterest has another solution, an internal system called Rich Pins, which aims to surface relevant data about pinned images. These pins look especially text-heavy to users, but they actually attach metadata to the image that makes it easier for the Pinterest algorithm to recognize and categorize. Unfortunately, so far they require the cooperation and labor of businesses and developers to work correctly.

Analysis Through Open Data

The second challenge the Visual Web faces today is making itself work for marketers. The support of brands will make it far easier for Visual Web networks to become self-sustaining.

You’d think the Visual Web would be a marketer’s dream, and it is—if you’re a big business. On Instagram, Sony worked with a popular photographer to indirectly advertise its cameras. On Tumblr, Sephora outright uses one of the platform’s blogs as its sole fashion magazine. For big brands, there are plenty of options. But what about small businesses?

If you’re an up-and-coming business using the Visual Web to gain publicity for the first time, your best bet is to create an image that goes viral. And according to Danny Maloney, cofounder and CEO of Pinterest analytics company Tailwind, that’s when you can run into a host of issues. 

“If you post an image that is going around the Web and has been shared a bunch of times and you want to edit it, that’s almost impossible to do,” he said. “The Visual Web is missing the ability to update content that has already been published. For that to work, you’d need the ability to identify everywhere the image has been shared.” 

Even worse, users sharing your image might crop out your watermark, making it impossible to trace the image back to you. Right now, you can’t replace those copies with the right picture, because you can’t yet tell everywhere your image has been shared. Even if you can tell for one network, you can’t tell for all of them.

Fortunately, Maloney said, the technology to match images does exist. A former strategy associate at YouTube, he recalled the work that company put into building advanced technology to perform video matching and identify when somebody uploaded a pirated video. It’s a problem with solutions, but for them to work, Visual Web sites would have to work together.
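
One well-known family of image-matching techniques is perceptual hashing, and a toy version fits in a few lines of Python. This is a sketch under stated assumptions, not the system YouTube, Tailwind, or Curalate uses: the “average hash” below is a swapped-in illustration, images are represented as tiny grayscale grids rather than real files, and the names `find_matches`, `tumblr_repost`, and `other_image` are hypothetical.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Build a fingerprint: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_matches(original, candidates, max_distance=2):
    """Return the names of candidate images whose fingerprint is close to the original's."""
    target = average_hash(original)
    return sorted(
        name
        for name, grid in candidates.items()
        if hamming_distance(target, average_hash(grid)) <= max_distance
    )


# Made-up 4x4 grayscale images (0 = black, 255 = white).
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
# A copy that was brightened slightly when it was reshared.
brightened = [[value + 20 for value in row] for row in original]
# An unrelated checkerboard image.
unrelated = [
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
]

print(find_matches(original, {"tumblr_repost": brightened, "other_image": unrelated}))
```

The brightened copy keeps the same fingerprint, so it is found even though its pixels no longer match exactly. A real system would first resize every image to a fixed grid and would need something sturdier to survive heavy cropping. It would also need what Maloney describes next: access to the images on every platform, since a fingerprint is only useful if there is something to compare it against.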

“I think the bigger question is, ‘How do you get multiple data providers to open access to their visual content?’ The more open access to data is by web platforms, the more solvable this problem becomes. No one of the platforms can solve it itself; Tumblr will never have all the data. But if they collaborate with a third party, it can be solved.” 

It’s fortunate that the biggest challenges the Visual Web is facing are both technological ones, because if anything’s constant, it’s that technology keeps improving. The question is whether big players will be able to solve these problems before the next big thing comes along.