<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.3.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: Indexing Search Results: Why Google Is Going Against Its Own Advice</title>
	<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/</link>
	<description>Ciarán Norris blogs about SEO, social media, Web 2.0, music, stuff &#38; nonsense</description>
	<pubDate>Sat, 06 Sep 2008 16:51:41 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.3</generator>
	<item>
		<title>By: David Lindop</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-595</link>
		<dc:creator>David Lindop</dc:creator>
		<pubDate>Tue, 06 May 2008 08:00:40 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-595</guid>
		<description>I don't quite get the logic there, Chris, but I think I'm going to run some specificity tests of my own with different robots.txt, meta instructions, in-site SERPs and inline nofollows.

Man, do I know how to party!</description>
		<content:encoded><![CDATA[<p>I don&#8217;t quite get the logic there, Chris, but I think I&#8217;m going to run some specificity tests of my own with different robots.txt, meta instructions, in-site SERPs and inline nofollows.</p>
<p>Man, do I know how to party!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chris Dimmock</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-594</link>
		<dc:creator>Chris Dimmock</dc:creator>
		<pubDate>Tue, 06 May 2008 06:05:42 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-594</guid>
		<description>Ciaran, 
You can't use a robots.txt AND a meta noindex together.

If you block a page with robots.txt, Googlebot will never crawl the page and will never read any meta tags on the page (so it never reads the meta noindex).

If you allow a page via robots.txt but block it from being indexed using a meta tag, Googlebot will access the page, read the meta tag, and subsequently not index it.

If you have both, the robots.txt trumps the meta noindex: the bot won't visit or parse the page, so it can never read the meta noindex, and you get a 'thin' result...

i.e. exactly what you are seeing here - 105,000 thin results

Robots.txt = thin result
Meta noindex = no thin result - nothing
Use both = thin result</description>
		<content:encoded><![CDATA[<p>Ciaran,<br />
You can&#8217;t use a robots.txt AND a meta noindex together.</p>
<p>If you block a page with robots.txt, Googlebot will never crawl the page and will never read any meta tags on the page (so it never reads the meta noindex).</p>
<p>If you allow a page via robots.txt but block it from being indexed using a meta tag, Googlebot will access the page, read the meta tag, and subsequently not index it.</p>
<p>If you have both, the robots.txt trumps the meta noindex: the bot won&#8217;t visit or parse the page, so it can never read the meta noindex, and you get a &#8216;thin&#8217; result&#8230;</p>
<p>i.e. exactly what you are seeing here - 105,000 thin results</p>
<p>Robots.txt = thin result<br />
Meta noindex = no thin result - nothing<br />
Use both = thin result</p>
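<p>The three cases above can be sketched in Python, as a rough illustration only. It uses the standard library's <code>urllib.robotparser</code> to decide whether a crawler may fetch a URL; the <code>outcome</code> function, the example.com URL, and the labels it returns are assumptions made for this sketch, not a description of how Googlebot is actually implemented.</p>

```python
from urllib.robotparser import RobotFileParser


def outcome(robots_lines, page_has_meta_noindex,
            url="http://example.com/search/shoes"):
    """Model what a robots.txt-respecting crawler could list for `url`.

    `robots_lines` is the robots.txt content as a list of lines.
    The return labels follow the wording used in this comment thread;
    they are illustrative, not real search engine output.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)

    if not parser.can_fetch("*", url):
        # The page is never fetched, so any meta noindex on it is never
        # read. Only the URL (and links pointing at it) is known.
        return "thin result"

    if page_has_meta_noindex:
        # The page was fetched and the meta tag was seen and obeyed.
        return "nothing"

    return "full result"


blocking_robots = ["User-agent: *", "Disallow: /search"]

print(outcome(blocking_robots, page_has_meta_noindex=False))
print(outcome([], page_has_meta_noindex=True))
print(outcome(blocking_robots, page_has_meta_noindex=True))
```

<p>In this model the meta noindex only has an effect when robots.txt lets the crawler reach the page, which is why the &#8220;use both&#8221; case gives the same answer as robots.txt alone.</p>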
]]></content:encoded>
	</item>
	<item>
		<title>By: Ciaran</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-584</link>
		<dc:creator>Ciaran</dc:creator>
		<pubDate>Mon, 05 May 2008 22:03:27 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-584</guid>
		<description>You're right (obviously!) but why should they have to?
Google says "don't let us find search results" and then finds a way to do just that.

Err......</description>
		<content:encoded><![CDATA[<p>You&#8217;re right (obviously!) but why should they have to?<br />
Google says &#8220;don&#8217;t let us find search results&#8221; and then finds a way to do just that.</p>
<p>Err&#8230;&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joost de Valk</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-582</link>
		<dc:creator>Joost de Valk</dc:creator>
		<pubDate>Mon, 05 May 2008 21:50:58 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-582</guid>
		<description>btw the fix is pretty simple: change the search query from a GET to a POST, and it's all done :)</description>
		<content:encoded><![CDATA[<p>btw the fix is pretty simple: change the search query from a GET to a POST, and it&#8217;s all done <img src='http://ciarannorris.co.uk/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ciaran</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-580</link>
		<dc:creator>Ciaran</dc:creator>
		<pubDate>Mon, 05 May 2008 21:40:30 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-580</guid>
		<description>Joost - it does seem like we (although I don't work there any more) were rather naive to believe that following Google's guidelines would mean that Google did what we asked.

I have to disagree with your second statement though; they're not being linked to - they don't exist unless you use the search box. And this is something that Google has only just started doing, and goes completely against their own guidelines.</description>
		<content:encoded><![CDATA[<p>Joost - it does seem like we (although I don&#8217;t work there any more) were rather naive to believe that following Google&#8217;s guidelines would mean that Google did what we asked.</p>
<p>I have to disagree with your second statement though; they&#8217;re not being linked to - they don&#8217;t exist unless you use the search box. And this is something that Google has only just started doing, and goes completely against their own guidelines.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joost de Valk</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-577</link>
		<dc:creator>Joost de Valk</dc:creator>
		<pubDate>Mon, 05 May 2008 21:10:16 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-577</guid>
		<description>The solution is pretty darn easy: get rid of that robots.txt line... Doing both is not double protection, it just shows that you don't know how it works :) Google has ALWAYS indexed links that it wasn't allowed to spider, for various reasons, and though I don't agree with them doing that either, it's how things work. 

And even though you're throwing the blame at Google now, I would seriously wonder how come 105,000 INTERNAL search result pages on that site are being linked to, apparently without a nofollow condom, thus losing loads of link juice... What benefit is it to ANYONE to link to 105,000 internal SERPs?</description>
		<content:encoded><![CDATA[<p>The solution is pretty darn easy: get rid of that robots.txt line&#8230; Doing both is not double protection, it just shows that you don&#8217;t know how it works <img src='http://ciarannorris.co.uk/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> Google has ALWAYS indexed links that it wasn&#8217;t allowed to spider, for various reasons, and though I don&#8217;t agree with them doing that either, it&#8217;s how things work. </p>
<p>And even though you&#8217;re throwing the blame at Google now, I would seriously wonder how come 105,000 INTERNAL search result pages on that site are being linked to, apparently without a nofollow condom, thus losing loads of link juice&#8230; What benefit is it to ANYONE to link to 105,000 internal SERPs?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Mihm</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-574</link>
		<dc:creator>David Mihm</dc:creator>
		<pubDate>Mon, 05 May 2008 20:15:20 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-574</guid>
		<description>Brent, get your mind out of the gutter! :D

Ciaran, nice find &#38; well-written piece.</description>
		<content:encoded><![CDATA[<p>Brent, get your mind out of the gutter! <img src='http://ciarannorris.co.uk/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /><br />
Ciaran, nice find &amp; well-written piece.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brent D. Payne</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-570</link>
		<dc:creator>Brent D. Payne</dc:creator>
		<pubDate>Mon, 05 May 2008 16:59:17 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-570</guid>
		<description>What a huge black hat opportunity.  More or less, what Google's Webmaster Forum is stating (linked to above from Ann Smarty) is that if you disallow a page it can still be indexed based off of the inbound anchor text and relevance of the pages pointing to that page.  Googlebot is not allowed to view the page at all (and in this example can't determine it has a noindex tag).  If Google can't view the page at all then you could theoretically rank really well on something that has absolutely nothing to do with the page.  Imagine the possibilities.  Linkbait widgets that have anchor text of xyz that point to an abc page that is disallowed and thus Googlebot has no clue that the page is about abc versus xyz.

I miss the good ole days . . . too bad my hat has to be white now.  This could've been a lot of fun.  ;-)

Brent</description>
		<content:encoded><![CDATA[<p>What a huge black hat opportunity.  More or less, what Google&#8217;s Webmaster Forum is stating (linked to above from Ann Smarty) is that if you disallow a page it can still be indexed based off of the inbound anchor text and relevance of the pages pointing to that page.  Googlebot is not allowed to view the page at all (and in this example can&#8217;t determine it has a noindex tag).  If Google can&#8217;t view the page at all then you could theoretically rank really well on something that has absolutely nothing to do with the page.  Imagine the possibilities.  Linkbait widgets that have anchor text of xyz that point to an abc page that is disallowed and thus Googlebot has no clue that the page is about abc versus xyz.</p>
<p>I miss the good ole days . . . too bad my hat has to be white now.  This could&#8217;ve been a lot of fun.  <img src='http://ciarannorris.co.uk/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>Brent</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ciaran</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-568</link>
		<dc:creator>Ciaran</dc:creator>
		<pubDate>Mon, 05 May 2008 16:56:45 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-568</guid>
		<description>Ok Ann, I've been thinking some more about this.

The search results pages in question are not being indexed because they have been linked to from another site (at least according to a quick look at Yahoo Site Explorer); they are being indexed because Google has decided to index the deep web. Therefore any semantics about crawling, indexing or anything like that is just so much rubbish - they're indexed because Google wants them to be, and nothing the site owner says is likely to make the slightest bit of difference.</description>
		<content:encoded><![CDATA[<p>Ok Ann, I&#8217;ve been thinking some more about this.</p>
<p>The search results pages in question are not being indexed because they have been linked to from another site (at least according to a quick look at Yahoo Site Explorer); they are being indexed because Google has decided to index the deep web. Therefore any semantics about crawling, indexing or anything like that is just so much rubbish - they&#8217;re indexed because Google wants them to be, and nothing the site owner says is likely to make the slightest bit of difference.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Lindop</title>
		<link>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-564</link>
		<dc:creator>David Lindop</dc:creator>
		<pubDate>Mon, 05 May 2008 16:34:42 +0000</pubDate>
		<guid>http://ciarannorris.co.uk/2008/05/05/indexing-search-results-why-google-is-going-against-its-own-advice/#comment-564</guid>
		<description>Cheers for the link Ann, that's a good way of putting it.</description>
		<content:encoded><![CDATA[<p>Cheers for the link Ann, that&#8217;s a good way of putting it.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
