Thursday, 15 October 2015

Have We Been Wrong About Panda All Along?

Posted by MarieHaynes

Thin content! Duplicate content! Everyone knows that these are huge Panda factors. But are they really? In this article, I will explore the possibility that Panda is about so much more than thin and duplicate content. I don’t have a list of ten steps to follow to cure your Panda problems. But, I do hope that this article provokes some good discussion on how to improve our websites in the eyes of Google’s Panda algorithm.

The duplicate content monster

Recently, Google employee John Mueller ran a webmaster help hangout that focused on duplicate content issues. It was one of the best hangouts I have seen in a while—full of excellent information. John commented that almost every website has some sort of duplicate content. Some duplicate content could be there because of a CMS that sets up multiple tag pages. Another example would be an eCommerce store that carries several sizes of a product and has a unique URL for each size.

He also said that when Google detects duplicate content, it generally does not do much harm; rather, Google determines which page it thinks is best and displays that page.

But wait! Isn’t duplicate content a Panda issue? This is widely believed in the SEO world. In fact, the Moz Q&A has almost 1,800 pages indexed that ask about duplicate content and Panda!

I asked John Mueller whether duplicate content issues could be Panda issues. I wondered if perhaps duplicate content reduced crawl efficiency and this, in turn, would be a signal of low quality in the eyes of the Panda algorithm. He responded that these were not related, but were in fact two separate issues.

The purpose of this post is not to instruct you on how to deal with duplicate content. Google has some good guidelines here. Cleaning up your duplicate content can, in many cases, improve your crawl efficiency—which in some cases can result in an improvement in rankings. But I think that, contrary to what many of us have believed, duplicate content is NOT a huge component to the Panda algorithm.
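As a concrete illustration of Google's guidelines, the standard way to consolidate duplicate URLs (such as the per-size product URLs mentioned earlier) is a rel=canonical tag pointing at the preferred version of the page. The URLs below are hypothetical, shown only as a sketch of the technique:

```html
<!-- Placed in the <head> of each size-variant URL, e.g.
     /product/widget?size=large, so search engines consolidate
     signals onto the preferred page. Hypothetical URLs. -->
<link rel="canonical" href="https://www.example.com/product/widget" />
```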

Where duplicate content can get you in trouble is if you are purposely duplicating content in a spammy way in order to manipulate Google. For example, if a huge portion of your site consists of articles duplicated from other sources, or if you are purposely duplicating content with the intent of manipulating Google, then this can earn you a manual penalty and can cause your site to be removed from the Google index.

These cases are not common, though. Google isn't talking about penalizing sites that have duplicate product pages or a boatload of WordPress tag pages. While it's always good to have as clean a site as possible, I'm going to make a bold statement here and say that this type of issue likely is not important when it comes to Panda.

What about thin content?

This is where things can get a little bit tricky. Recently, Google employee Gary Illyes caused a stir when he stated that Google doesn’t recommend removing thin content but rather, beefing up your site to make it “thick” and full of value.

Jen Slegg from The SEM Post had a great writeup covering this discussion. If you’re interested in reading more, I wrote a long post discussing why I believe that we should indeed remove thin content when trying to recover from a Panda hit, along with a case study showing a site that made a nice Panda recovery after removing thin content.

The current general consensus amongst SEOs who work with Panda-hit sites is that thin content should be improved upon wherever possible. But, if a site has a good deal of thin, unhelpful pages, it does make sense to remove those pages from Google’s index.

The reason for this is that Panda is all about quality. In the example I mentioned above, where a site recovered from Panda after removing thin content, the site had hosted thousands of forum posts containing unanswered questions. A user landing on one of these questions would not have found the page helpful and would likely have moved on to another site to answer their query.

I believe that thin content can indeed be a Panda factor if that content consistently disappoints searchers who land on that page. If you have enough pages like this on your site, then yes, by all means, clean it up.
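As a rough illustration of how an audit might surface candidates for cleanup, a sketch like the following flags pages by visible word count. The 150-word threshold and the sample page data are my own assumptions for illustration; a real audit would also weigh engagement data and each page's purpose before removing anything:

```python
# Rough sketch: flag candidate "thin" pages by visible word count.
# The 150-word threshold is an arbitrary assumption, not anything
# Google has published; treat flagged pages as candidates for
# review, not automatic removal.
THIN_WORD_THRESHOLD = 150

def flag_thin_pages(pages):
    """pages: dict mapping URL -> extracted body text.
    Returns the URLs whose body text falls under the threshold."""
    return [url for url, text in pages.items()
            if len(text.split()) < THIN_WORD_THRESHOLD]

# Hypothetical crawl data: an unanswered forum question vs. a long guide.
pages = {
    "/forum/unanswered-question-1": "Does anyone know how to fix this?",
    "/guide/choosing-a-widget": " ".join(["word"] * 600),
}
print(flag_thin_pages(pages))  # only the short forum page is flagged
```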

Panda is about so much MORE than duplicate and thin content

While some sites can recover from Panda after clearing out pages and pages of thin content, for most Panda-hit sites, the issues are much deeper and more complex. If you have a mediocre site that contains thousands of thin pages, removing those thin pages will not make the site excellent.

I believe Panda is entirely about excellence.

At Pubcon in Vegas, Rand Fishkin gave an excellent keynote speech in which he talked about living in a two-algo world. Rand spoke about the “regular algorithm,” which, in years past, we've worked hard to figure out and conquer by optimizing our title tags, improving our page speed, and gaining good links. But then he also spoke of a machine learning algorithm.

When Rand said “We’re talking about algorithms that build algorithms,” something clicked in my head and I realized that this very well could be what's happening with Panda. Google has consistently said that Panda is about showing users the highest-quality sites. Rand suggested that machine learning algos may classify a site as a high quality one if they're able to do some of the following things:

  • Consistently garner a higher click-through rate than their competitors.
  • Get users to engage more with their site than others in their space.
  • Answer more questions than other sites.
  • Earn more shares and clicks that result in loyal users.
  • Be the site that ultimately fulfills the searcher's task.

There are no quick ways to fulfill these criteria. Your site ultimately has to be the best in order for Google to consider it the best.

I believe that Google is getting better and better at determining which sites are the most helpful ones to show users. If your site has been negatively affected by Panda, it may not be because you have technical on-site issues, but because your competitors’ sites are of higher overall quality than yours.

Is this why we're not seeing many Panda recoveries?

In mid- to late 2014, Google was still refreshing Panda monthly. Then, after October of 2014, we had nine months of Panda silence. We all rejoiced when we heard that Google was refreshing Panda again in July of 2015. Google told us it would take a while for this algo to roll out. At the time of writing, Panda has supposedly been rolling out for three months. I’ve seen some sporadic reports of mild recoveries, but I would say that probably 98% of the sites that have made on-site quality changes in hopes of a Panda recovery have seen no movement at all.

While it’s possible that the slow rollout still hasn’t affected the majority of sites, I think that there's another frightening possibility.

It's possible that sites that saw a Panda-related ranking demotion will only be able to recover if they can drastically improve the site to the point where users GREATLY prefer this site over their competitors’ sites.

It is always good to do an on-site quality audit. I still recommend a thorough site audit for any website that has suffered a loss in traffic that coincides with a Panda refresh date. In many cases, fixing quality issues—such as page speed problems, canonical issues, and confusing URL structures—can result in ranking improvement. But I think that we also need to put a HUGE emphasis on making your site the best of its kind.

And that’s not easy.

I've reviewed a lot of eCommerce sites that have been hit by Panda over the years. I have seen few of these recover. Many of them have had site audits done by several of the industry’s recognized experts. In some cases, the sites haven't recovered because they have not implemented the recommended changes. However, there are quite a few sites that have made significant changes, yet still seem to be stuck under some type of ranking demotion.

In many cases like this, I've spent some time reviewing competitors’ sites that are currently ranking well. What I’ll do is try to complete a task, such as searching for and reaching the point of purchase on a particular product, on the Panda-hit site as well as the competitors’ sites. In most cases, I’ll find that the competitors offer a vastly better search experience. They may have a number of things that the Panda-hit site doesn't, such as the following:

  • A better search interface.
  • Better browsing options (e.g., search by color, size, etc.).
  • Pictures that are much better and more descriptive than the standard stock product photos.
  • Great, helpful reviews.
  • Buying guides that help the searcher determine which product is best to buy.
  • Video tutorials on using their products.
  • More competitive pricing.
  • A shopping cart that's easier to use.

The question that I ask myself is, “If I were buying this product, would I want to search for it and buy it on my client’s site, or on one of these competitors’ sites?” The answer is almost always the latter.

And this is why Panda recovery is difficult. It’s not easy for a site to simply improve its search interface, add legitimate reviews that aren't just scraped from another source, or create guides and video tutorials for many of its products. And even if the site did add these features, that would only bring it to the level where it is perhaps just as good as its competitors. I believe that in order to recover from Panda, you need to show Google that, by far, users prefer your website over any other one.

This doesn’t just apply to eCommerce sites. I have reviewed a number of informational sites that have been hit by Panda. In some cases, clearing up thin content can result in Panda recoveries. But often, when an informational site is hit by Panda, it’s because the overall quality of the content is sub-par.

If you run a news site and you’re pushing out fifty stories a day that contain the same information as everyone else in your space, it’s going to be hard to convince Google’s algorithms that they should be showing your site’s pages first. You’ve got to find a way to make your site the one that everyone wants to visit. You want to be the site that, when people see you in the SERPs, even if you’re not sitting at position #1, they say, “Oh…I want to get my news from THAT site…I know them and I trust them…and they always provide good information.”

In the past, a mediocre site could be propelled to the top of the SERPs by tweaking things like keywords in title tags, improving internal linking, and building some links. But as Google’s algorithms get better and better at determining quality, the only sites that are going to rank well are the ones that are really good at providing value. Sure, the algorithms are not quite there yet, but they keep improving.

So should I just give up?

No! I still believe that Panda recovery is possible. In fact, I would say that we're at a point in the Internet's history where there's enormous potential for improvement. If you've been hit by Panda, then this is your opportunity to dig in deep, work hard, and make your site an incredible site that Google would be proud to recommend.

The following posts are good ones to read for people who are trying to improve their sites in the eyes of Panda:

How the Panda Algorithm Might Evaluate Your Site – A thorough post by Michael Martinez that looks at each of Amit Singhal’s 23 Questions for Panda-hit sites in great detail.

Leveraging Panda To Get Out Of Product Feed Jail – An excellent post on the Moz blog in which Michael Cottam gives some tips to help make your product pages stand out and be much more valuable than your competitors’ pages.

Google’s Advice on Making a High-Quality Site – This is short, but contains many nuggets.

Case Study – One Site’s Recovery from an Ugly SEO Mess – Alan Bleiweiss gives thorough detail on how implementing advice from a strong technical audit resulted in a huge Panda recovery.

Glenn Gabe’s Panda 4.0 Analysis – This post contains a fantastic list of things to clean up and improve upon for Panda-hit sites.

If you have been hit by Panda, you absolutely must do the following:

  • Start with a thorough on-site quality audit.
  • Find and remove any large chunks of thin content.
  • Deal with anything that annoys users, such as huge popups or navigation that doesn’t work.
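For the thin-content step, the usual mechanism for pulling a page out of Google's index while keeping it live for visitors is a robots noindex meta tag. This is a generic example, not the only option; outright deletion with a 404/410, or consolidation into a stronger page, can also be appropriate:

```html
<!-- Placed in the <head> of a thin page you want dropped from
     Google's index while it remains accessible to visitors. -->
<meta name="robots" content="noindex" />
```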

But then we have to do more. In the first few years of Panda’s existence, making significant changes in on-site quality could result in beautiful Panda recoveries. I suspect, though, that now, as Google gets better at determining which sites provide the most value, this may not be enough for many sites.

If you have been hit by Panda, it is unlikely that there is a quick fix. It is unlikely that you can tweak a few things or remove a chunk of content and see a dramatic recovery. Most likely, you will need to DRAMATICALLY improve the overall usefulness of the site to the point where it's obvious to everyone that your pages are the best choices for Google to present to searchers.

What do you think?

I am seriously hoping that I'm wrong in predicting that the only sites we'll see make significant Panda recoveries are ones that have dramatically overhauled all of their content. Who knows…perhaps one day soon we'll start seeing awesome recoveries as this agonizingly slow iteration of Panda rolls out. But if we don’t, then we all need to get working on making our sites far better than anyone else’s site!

Do you think that technical changes alone can result in Panda recoveries? Or is vastly improving upon all of your content necessary as well?

