In true cheap blogging fashion many people have shared Google’s new guidelines for web content quality. Ironically, it is this type of sharing without adding value or providing a reason for sharing that Google despises. Why share something other than a link to the source if you have nothing to say about it other than “these are Google’s new web content guidelines”?
These are Google’s New Web Content Guidelines
So, rather than just paste the guidelines I will now share my understanding of these guidelines, based on messing about with the content on my websites over the last 6 months. I was going to write up a document on this Panda recovery stuff but I have now done away with plans to publish an eBook. The long awaited How to recover from Google Panda will never be published as I now feel that it is a waste of time and effort on my part to do so. Sorry to those that really wanted a copy, but really, you do not need it. So, now to the Google quality guidelines, which are just a bunch of questions you need to ask yourself about your website. Remember, have more pages with a higher quality score and there is less chance of a Panda Slap.
1. Would you trust the information presented in this article?
OK, so, how do people decide whether or not to trust what they read? What are the signals that say to you, “hey, this must be the real deal, this must be true, this must be reliable information”? Can you think of any? Well, I can think of many, and many pages fail to include these signals. It is logical (remember, it is a search engine that now uses human reasoning, but still works on logic) to assume that increasing the positive signals will improve your page quality score. When a teacher used to ask you to write an essay, what did they expect you to do to prove your argument?
2. Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature?
Simple to find out. Many websites are full of articles on a range of topics that are written purely for search engine traffic. These articles are not written by experts. You do not need to be qualified in your chosen field to be considered an expert, but you should be very experienced and knowledgeable about your area, and be able and willing to explain why.
Many news reporters specialise in specific areas, such as finance, politics, celebrity or sports. Some focus on sub-niches such as London markets, Middle Eastern politics, Hollywood or the NFL. Be focused and be authoritative. A good track record goes a long way. Signals? Hmmm, authorship maybe?
3. Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?
There are many reasons why this may occur. Some are innocent, some are not. It is not uncommon for an SEO to find that some keyword phrases bring in (or used to bring in) a healthy level of search traffic that would generally convert well or cause high paying Adsense adverts to show.
As a result, some people made many similar pages with slight alterations in the keywords shown and the order in which they appeared. In the past this appeared to create an illusion of depth in the topic (an illusion that often fooled Google, but not readers).
Now Google, with the clever Panda stuff that analyses relationships between pages etc., can spot this, yell “spammer” and hammer your site down. The solution? Just make sure each page is sufficiently different from every other page. If you had 2 similar pages and they have both tanked in the SERPs, change one page enough so that they are definitely different from each other. Maybe they will pop back again. Or merge similar content together.
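If you want a rough way to find your own near-duplicate pages before deciding which ones to rewrite or merge, something like the sketch below could help. It compares two pieces of page copy by the overlap of their three-word phrases (Jaccard similarity). To be clear, this is my own illustration and not how Google does it; the function names and the choice of three-word phrases are assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short to form a full shingle: treat the whole text as one.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    a = shingles(text_a, size)
    b = shingles(text_b, size)
    if not a or not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)


# Hypothetical page copy: two pages that differ by a single keyword.
page_a = "cheap car insurance quotes for young drivers in London"
page_b = "cheap car insurance quotes for new drivers in London"
print(jaccard_similarity(page_a, page_b))
```

A score close to 1.0 suggests the two pages say almost the same thing and are candidates for merging. Where to draw the line is a judgment call, and I do not know of any published threshold.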
4. Would you be comfortable giving your credit card information to this site?
A bit vague, probably alluding to the idea that if you are selling stuff you should have a secure server set up with secure checkout and the badges to prove that your site is safe and …. secure. SSL Certificate? If you don’t have one you probably need one. I guess that using a trustworthy 3rd party system like Paypal or Google Checkout works well enough. I do not do eCommerce stuff so cannot comment really, no experience there.
5. Does this article have spelling, stylistic, or factual errors?
Spelling is easy to fix (I recommend using the After the Deadline extension for Chrome). There are some services that offer webpage spell checking. Really you need to proofread (something I fail to do properly, no need to tell me!). Factual errors are trickier to resolve and go back to the first point, about trusting information – how can a computer trust that your facts are correct?
6. Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
In short – do not keyword stuff your titles, headers and content with the high paying / costing Adwords keywords. Write naturally, write in a way that reads well and makes sense. From this we can assume that Google has improved its ability to determine what is “good content” and what is spammy, spun, hastily written drivel.
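One quick sanity check for keyword stuffing is to look at which words dominate a page. The sketch below lists the most frequent longer words and the share of the copy each one takes up. It is only an illustration under my own assumptions (the function name, the minimum word length of 4, and the idea that a dominant word is worth a second look); Google has not published any density threshold that I know of.

```python
import re
from collections import Counter


def top_terms(text, n=5, min_length=4):
    """Return the n most frequent words as (word, share_of_counted_words)."""
    # Very short words are skipped as a crude stand-in for a stop-word list.
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if len(w) >= min_length]
    total = len(words)
    if total == 0:
        return []
    return [(word, count / total)
            for word, count in Counter(words).most_common(n)]


# Hypothetical stuffed copy.
copy = ("Cheap insurance quotes. Get cheap insurance today. "
        "Our cheap insurance is the cheapest insurance.")
for word, share in top_terms(copy, n=3):
    print(f"{word}: {share:.0%}")
```

If one commercial phrase accounts for a large share of the words, read the page aloud. If it sounds unnatural to you, it probably reads that way to everyone else too.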
7. Does the article provide original content or information, original reporting, original research, or original analysis?
Obvious – are you copying what you just read in your news feed and hoping that with a few keywords and some quick social sharing you will outrank the source? Are you even linking to the source, or pretending that you are both the political correspondent based in Kabul and the technology reporter based in Silicon Valley, on the same afternoon?
There is nothing wrong with telling the same story so long as you add value and give your opinion. But, if nobody cares about your opinion (because you do not fulfil the first point), or you are obviously not an expert (the second point), or it is so quickly spun that it is full of spelling and factual errors (the fifth point), then forget about it.
8. Does the page provide substantial value when compared to other pages in search results?
Search for your most used keyword for a particular page (or the one you did rank well for pre-Panda) and look at the first 10 pages / sites in Google search. What sets them apart from your site or those ranking on page 2 and beyond? Anything? Are they all the same? Do they all tell a different story? Consider all the above points and then compare to your own pages. Remember, each story has more than one side to it.
9. How much quality control is done on content?
Do you have an editor? Are pages reviewed or updated? Does a page get published and then is never changed? Does the content fulfil all the above points?
10. Does the article describe both sides of a story?
Not possible for all pages but does give a good indication of authority and trustworthiness when it does. An information site should strive to provide a balanced argument. A page that looks at the pros and cons of hitch hiking cannot be accused of writing with the specific aim of selling a hitchhiking service. If you only write about the benefits of a service, a product or solution then that could be seen as being biased and untrustworthy.
This is suggesting that more specific niches will do better, so long as they fulfil all the above points. I do see this in the SERPs. This is part of the reason why those pesky keyword domains seem to be doing so well – they focus on one specific area. However, I suspect that some of them will come onto the radar sooner or later, as further improvements in the algorithm reveal that many of them break several of the above quality guidelines, although at the moment many are being very cunning to avoid a penalty.
12. Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?
Basically – how big is it? A bit vague. Wikipedia fails here – many people all over the place adding and editing content. But, it fulfils all other points. It is essentially a dynamic, peer-reviewed encyclopaedia with just a little bit of spam that ebbs and flows with the gentle tide of the ever developing Internet.
This is really the key “farmer” part of the update that everyone was talking about. Some sites have outsourced content writing to teams of people in countries where the writers are not highly educated or experienced in the topics that they write about, and where English may not even be their first language.
The very last bit on attention and care goes back to the ninth point on quality control. Are the articles reviewed? In Wikipedia the information is being constantly reviewed. In other sites the information is static and this does not provide a very good signal to Google that it will be useful to readers. Who wants to read a page on the web about medical advancements in breast cancer that is 5 years old? A lot happens, content needs to change or be archived.
13. Was the article edited well, or does it appear sloppy or hastily produced?
Another quality guideline, sounds like points 7 to 11 really. Maybe that is the point. Goes back to writing good quality content, writing for a purpose, providing all arguments or ideas, showing why your article should be trusted and updating the information when required.
Goes all the way back to point 2 really – why trust the content you are reading? Is the person qualified to write on the subject? A reporter does not need to have higher qualifications to write on certain topics if they can show that they write at length on this topic.
Someone who has been reporting on issues surrounding fitness and nutrition, for example, may have acquired a lot of information on the subject over many years. However, if there is not a signal to tell Google this, it will likely err on the side of caution.
Think – how would you determine if the information is trustworthy? If you read something in an online tabloid newspaper what would make you trust the information? Should you trust it at all? (Tabloids have an excellent track record for taking one line from a research paper, throwing the rest out, and making up their own sensationalist conclusions.)
This could be talking about branding and also hinting at keyword domains. If you did search for “pros and cons of hitch hiking”, which domain would you naturally trust more:
16. Does this article provide a complete or comprehensive description of the topic?
Again, is it authoritative? Does it tell both sides of the story? Is it short and missing vital points or long and providing an in-depth analysis of a topic? My last tutor used to say something like “if your essay is less than 1000 words then you are most probably missing some vital information”. Lots of short articles are likely to be another signal to Google of a poorer website in terms of quality. Some short articles are OK, and to be expected, but if all of them are short (just written to fill the gap between adverts) then expect trouble.
17. Does this article contain insightful analysis or interesting information that is beyond obvious?
Simple: has the writer analysed the information and drawn their own conclusions, and then added to that to provide a new idea or concept on the subject? Does the article just report, or does it provide new ideas and concepts that build on previous work? If you have ever written essays then you will know that you are expected to review the work of others and then draw on that work to develop your own conclusions. And of course, reference the sources.
Well, if it is the sort of page you would want to bookmark and share, it is logical that in time this is a signal to Google. The only way Google can know that a site or page is bookmarked or shared is by monitoring public stats, like Google+ shares, Facebook likes, Twittering tweets and maybe some older bookmarking things like Delicious, Digg and StumbleUpon.
Of course, seeing that so many people use Google Analytics Google could conceivably use this data to see how people find a site, although they have stated that they will never use that information in the search ranking algorithm.
19. Does this article have an excessive amount of ads that distract from or interfere with the main content?
Many people have talked about this and so far nobody has provided a definite answer to how many is too many. There is criticism from some webmasters about Google Adsense pushing for webmasters to add more adverts (3 content blocks, 2 link units, 2 search boxes etc) and placing ads above the fold, below the navigation and generally in all the advert hotspots. This seems to go against the Search team’s quality guidelines. However, some sites that have added more adverts since getting slapped by Panda have also seen signs of recovery.
20. Would you expect to see this article in a printed magazine, encyclopedia or book?
In short, is it really good enough for general release? Would it pass the strict editorial review of a hardened newspaper editor with 30 years of experience under his belt? Or would it get thrown out with the other 500 articles that would-be journalists and freelance reporters send in each day? Even shorter: is it really any good?
21. Are the articles short, unsubstantial, or otherwise lacking in helpful specifics?
Didn’t we do this already? Isn’t this just repeating the bits from the first 10 points all over again? Maybe there is a reason – is it a more important ranking factor / quality factor?
Short articles waste a reader’s time as they do not provide the answer they were looking for. They probably just raise more questions, and if there is no way from your webpage for the reader to find those answers, that is one unhappy bunny. Unhappy bunnies are BAD SEO! Keep the bunnies happy.
Write in enough detail to please the reader. Remember school and exams? What did your teacher keep reminding you to do? Answer the question! Consider your page to be an attempt to answer a question. As my OU tutors generally say, “if your answer is fewer than 1000 words it is probably missing vital information, if it is more than 2000 words it is probably repeating and waffling. Remember that your conclusions are also marked”. That last line is important too – that takes us back to points 16 and 17 – does it add value?
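If you want to run my tutors’ rule of thumb over your own articles, a few lines of code will do it. The sketch below counts words and reports which side of the 1000 to 2000 word band an article falls on. The band comes straight from the quote above; it is essay advice, not a Google rule, and the function names are my own.

```python
def word_count(text):
    """Count whitespace-separated words in the text."""
    return len(text.split())


def length_verdict(text, low=1000, high=2000):
    """Apply the tutors' rule of thumb to one article's body text."""
    count = word_count(text)
    if count < low:
        return "probably missing vital information"
    if count > high:
        return "probably repeating and waffling"
    return "within the suggested range"


# Hypothetical article body of 500 words.
article = "word " * 500
print(word_count(article), length_verdict(article))
```

Treat the result as a prompt to reread the page rather than a verdict. A short page that fully answers a simple question is fine, and padding an article to hit a number just makes it worse.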
22. Are the pages produced with great care and attention to detail vs. less attention to detail?
Are they sloppy? This seems to cover many of the quality points above, but we could also deduce that it is suggesting something beyond the simple written word. Does it make good use of headers, paragraphs, blockquotes, font sizes? Does it have nice images, photos, videos – something stylish, something different?
If you look at a newspaper or magazine article the content is laid out in a stylish way. There are breaks in the text, images with captions and important quotes are often repeated in larger fonts to attract your attention. The pages look interesting. Pretty pictures maybe? It is relatively easy to knock up some new words, but getting a unique image is harder. And yes, Google knows when an image is unique.
23. Would users complain when they see pages from this site?
If people are calling trading standards, calling the police, reporting your site to Google, then there is a good chance that Google will find out and take action. In the past people boasted about getting search traffic from bad reviews. Google did not like that very much. The Panda Unleashed is dealing with the problem. Also note the second Google.com reference below.
I get a strong impression that Panda has resulted in link text becoming less important, although links in general are still important. Google now makes a better analysis of page copy and, combined with author and website authority, ranks pages (and sites) for the keywords on the page that are meaningful to the search term, even if they are not identical to the search term itself.
It is as if search is going full circle. In 1998 META keywords provided Google with an indication of what was on the page. These have long been forgotten (due to misuse). Now Google is able to carry out a full page review for each page on a site, compare it to all other pages on the same site, cross reference with similar sites in that niche and determine its quality. Really a remarkable accomplishment.
I really believe that these “Google Panda” updates are just the beginning of a new way of indexing and ranking websites. SEOs have spent years gaming the search engines with well placed links containing keyword phrases that match searches and on-page content (titles, headers etc). Now the emphasis is moving towards an analysis of the content on the page and using PageRank as a guide and a means to rank pages that are equal in terms of quality, but the “Panda Factor” takes care of determining if the site is fit for purpose.
Well, that is it. That is my take on the whole Google Panda thing.
Matt Cutts talks a little about Panda
The last 30 seconds are pretty insightful – think like Google!
The very last thing he says is …. “uuummmm”. He probably could have edited that out
References and Resources
Want to learn more about Google Panda or search engine optimisation in general? Check out these pages. Some of the last links are a little random but I found them interesting to give some possible clues and insights into how the whole search thing may be working, or at least, what the ultimate goal is. Relationship of content – page to page, page to sites, sites to sites, sites to people, people to memes, ideas, dreams etc. A somewhat unscientific and haphazard holistic approach to what should be nothing but hardcore logic. But this is my head and my blog.
- More guidance on building high-quality sites – Google.com.
- High-quality sites algorithm goes global, incorporates user feedback – Google.com.
- Google: Remove Low Quality Content If You Were Impacted By Farmer/Panda – Seroundtable.com
- Can You Dig Out Of Your Google Panda Hole By Offloading To Subdomains? – Searchengineland.com
- TED 2011: The ‘Panda’ That Hates Farms: A Q&A With Google’s Top Search Engineers – Wired.com
- How a Search Engine May Measure the Quality of Its Search Results – Seobythesea.com
- Searching Google for Big Panda and Finding Decision Trees – Seobythesea.com
- How to Reverse Engineer the Google Panda Update – Seo-theory.com
- Method and apparatus for establishing relationship between documents – by Qing Bo Wang et al, a Google Patent.
- System and method of textual information analytics – another Google Patent that talks about relationships between content (note, not keyword anchor text or PageRank stuff)
- Formulating context-dependent similarity functions by Gang Wu, Edward Y. Chang, Navneet Panda. ACM International Conference on Multimedia. Abstract.
- “Fast Algorithms for Finding Extremal Sets”, Roberto J. Bayardo, Biswanath Panda, Proc. of the 2011 SIAM Int’l Conf. on Data Mining – Abstract: http://research.google.com/pubs/pub36974.html. Full paper: http://bayardo.org/ps/sdm2011.pdf
- “PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce”, Biswanath Panda, Joshua S. Herbach, Sugato Basu, Roberto J. Bayardo, Proceedings of the 35th International Conference on Very Large Data Bases (VLDB-2009). Abstract: http://research.google.com/pubs/pub36296.html Full paper: http://www.bayardo.org/ps/vldb2009.pdf
- Many thanks to J. Patrick Fischer for sharing the photo of the panda.