
Several SEOs are watching closely now that Google’s policy change from last month has gone into effect, specifically its decision to stop passing the search query in the referrer for logged-in users.
Google assured us this change would affect only about 2% of our data (the effect being that the keyword is listed as “(not provided)”). But when people started looking at the numbers, they seemed to be losing far more data than expected, something closer to 20%.
When I first heard about what people were experiencing, I wasn’t too worried … I thought Google hadn’t intentionally made the data go away and that its disappearance was just a side effect of the switch to the https protocol. Google has offered an https URL for its search service for a while, but so few people used it that the effect was negligible.
If this had been the real reason the data was lost, it wouldn’t be Google’s fault: the HTTP/1.1 specification (RFC 2616) says that a browser going from an https URL to an http URL shouldn’t send the referrer, because sensitive information could be disclosed (the URL MAY have some sensitive data in it). The good thing is that if both pages are https, the referrer is passed and you could get your keyword data.
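The rule above can be sketched in a few lines of Python. This is only a simplified model of the default browser behavior described in the paragraph, and the function name referrer_is_sent is my own invention for illustration, not part of any library.

```python
from urllib.parse import urlsplit


def referrer_is_sent(from_url, to_url):
    """Simplified model: the referrer is dropped only on https -> http.

    Hypothetical helper; it ignores everything except the two URL schemes.
    """
    source_scheme = urlsplit(from_url).scheme.lower()
    target_scheme = urlsplit(to_url).scheme.lower()
    return not (source_scheme == "https" and target_scheme == "http")


# A secure search page linking to a plain http site: no referrer.
print(referrer_is_sent("https://www.google.com/search", "http://cartercole.com/"))
# Both pages secure: the referrer is passed along.
print(referrer_is_sent("https://www.google.com/search", "https://cartercole.com/"))
```

Under this model, serving the destination site over https would be enough to get the referrer back, which is the assumption the next paragraph puts to the test.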
If that were the case, all you would have to do is install an SSL certificate and serve your entire site securely (https). But when I fired up a network sniffer to analyze the data Google was sending to the destination page, I was a little surprised: the HTTP referrer was set, but the query parameter had been blanked out.
This deliberate removal of the query makes me wonder why they actually did it. Some say it was meant to help Google keep its monopoly on online advertising: other ad networks were using that data to build better mousetraps, so Google cut off the data source. So what does the referrer look like? Here is a sample of the headers my server saw after a user was referred by Google:
HTTP_CONNECTION:keep-alive
HTTP_ACCEPT:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_CHARSET:ISO-8859-1,utf-8;q=0.7,*;q=0.3
HTTP_ACCEPT_ENCODING:gzip,deflate,sdch
HTTP_ACCEPT_LANGUAGE:en-US,en;q=0.8
HTTP_COOKIE:__qca=P0-841215490-1313768566581; ASPSESSIONIDQASTBCRC=BMOHIJBDNNMJFII DNMFCONGC; __utma=254081434.784855759.1310058665.1322775856.1323197355.8; __utmc=254081434; __utmz=254081434.1322775856.7.2.utmcsr=blog.cartercole.com| utmccn=(referral)|utmcmd=referral|utmcct=/2010/01/analysis-of-forum-spam-attack -my-spam.html; ASPSESSIONIDCASSTQDR=NFGLFJBAHAFACMGCOHDEDJGD
HTTP_HOST:cartercole.com
HTTP_REFERER:http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CC cQFjAA&url=http%3A%2F%2Fcartercole.com%2Fdonate.asp&ei=gWviTt-PGeiqsQKSufmfBg&usg=AFQ jCNHSCcQ-053LI9fWYk6Iv5Wx-K-MJA&sig2=CdBaeGrDJC7OkJ5JoTaQKw
HTTP_USER_AGENT:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2
See the “HTTP_REFERER”? Because of the Panda release of its search software, Google needs a way to track which URL you click on in the SERPs, so it now redirects every URL in the results. The click first goes to the http version of Google (so they could pass the data if they wanted to), but they leave q= blank (q is the query string parameter for the query you googled). So what’s the real reason they did this? We have no clue, but it’s not very popular. The guys over at SEOmoz did a whiteboard video about it, and other users have written open letters to Google complaining about the change.
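To check for the blank q= on your own server, something like the following Python sketch could work. The function name search_query_from_referrer is hypothetical, the host check only covers google.com and its subdomains (an assumption, since country domains are left out), and the sample URL is a shortened version of the referrer shown above.

```python
from urllib.parse import urlsplit, parse_qs


def search_query_from_referrer(referrer):
    """Return the q value of a Google referrer.

    Returns '' when q= is present but blank, and None when the referrer
    is not from google.com or has no q parameter at all.
    """
    parts = urlsplit(referrer)
    host = parts.hostname or ""
    # Assumption: only google.com and its subdomains count as Google.
    if not (host == "google.com" or host.endswith(".google.com")):
        return None
    # keep_blank_values=True is needed, otherwise a blank q= is dropped.
    params = parse_qs(parts.query, keep_blank_values=True)
    if "q" not in params:
        return None
    return params["q"][0]


sample = ("http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1"
          "&url=http%3A%2F%2Fcartercole.com%2Fdonate.asp")
query = search_query_from_referrer(sample)
if query == "":
    print("(not provided)")
```

An empty string here is what shows up in analytics as “(not provided)”: the referrer arrived, but the keyword did not.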
Keep track of the percentage of search traffic that isn’t showing keywords, and start looking at other sources such as Bing or Yahoo, which still let you view the queries from incoming search engine traffic.
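One rough way to track that percentage is to run your logged referrers through a small script. The Python sketch below is only an illustration: the function name not_provided_share and the sample referrers are made up, and it counts only google.com referrers as Google search traffic.

```python
from urllib.parse import urlsplit, parse_qs


def not_provided_share(referrers):
    """Percentage of Google referrers whose q parameter is blank or missing."""
    total = 0
    hidden = 0
    for referrer in referrers:
        parts = urlsplit(referrer)
        host = parts.hostname or ""
        # Assumption: only google.com and its subdomains are counted.
        if not (host == "google.com" or host.endswith(".google.com")):
            continue
        total += 1
        params = parse_qs(parts.query, keep_blank_values=True)
        if not params.get("q", [""])[0].strip():
            hidden += 1
    return 100.0 * hidden / total if total else 0.0


# Made-up referrers standing in for a day of server logs.
log = [
    "http://www.google.com/url?sa=t&q=&cd=1",
    "http://www.google.com/url?sa=t&q=forum+spam&cd=2",
    "http://www.google.com/url?sa=t&q=seo+tools&cd=1",
    "http://www.google.com/url?sa=t&q=donate&cd=3",
    "http://www.bing.com/search?q=forum+spam",
]
print(not_provided_share(log))
```

Running this on a regular schedule against real logs would show whether the hidden share is staying near the promised 2% or drifting toward the 20% people have been reporting.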
Comments
The Google Panda project aims to clean shallow content out of the web. On one hand, it may lower e-commerce profits; on the other, users will get relevant web content. Moreover, competitors with low-content websites will be pushed down, while you have the chance to raise your rankings by adding relevant content to your website.
Based on the new "Social Search" that Google just released, that seems to be the real cause of them hiding this keyword data, and Panda had nothing to do with it.