I have several large sites, each with 30,000 to 70,000 articles.
For some unknown reason, over the last two months the Google index
has been steadily declining, falling from more than 20,000 indexed
articles per site to around 500.
The sites have been up for more than a year, and all sitemaps were submitted
to and accepted by Google without problems several months ago.
At the high point, the sites were indexed at about 20,000 - 50,000 pages per site.
Since older versions of the pages are still on the sites, the correct
index count should be over 50,000 pages until Google removes the old versions.
Recently I discovered that Google search does not report a consistent
number of indexed pages on its SERPs.
For example, for the site http://mfcgoldmine.uuuq.com we see the following results:
Search on site:mfcgoldmine.uuuq.com:
1,250 from mfcgoldmine.uuuq.com
But once in a while you do get the correct number of pages indexed:
61,900 from mfcgoldmine.uuuq.com
So when you run the search, the chance of getting a count almost 40 times
lower than what Google has actually indexed is over 98%.
For example, for the site http://mfcgoldmine.by.ru, a search on
site:mfcgoldmine.by.ru produces the result shown in the screenshot
(not reproduced here).
Here are the stats for all the sites in question:
http://mfcgoldmine.uuuq.com
Search on site:mfcgoldmine.uuuq.com
61,800 from mfcgoldmine.uuuq.com - correct result
1,160 from mfcgoldmine.uuuq.com - incorrect result
(off by a whopping 40 times)
Search on mfcgoldmine.uuuq.com domain, without site:...
Results 21 - 30 of about 1,030,000 for mfcgoldmine.uuuq.com - correct result
Results 1 - 10 of about 3,360 for mfcgoldmine.uuuq.com - incorrect result
(off by a whopping 250 times)
Site http://mfcgoldmine.by.ru
Search on site:mfcgoldmine.by.ru
37,900 from mfcgoldmine.by.ru - correct result
1,490 from mfcgoldmine.by.ru - incorrect result
(off by 20 times)
Search on mfcgoldmine.by.ru domain, without site:...
Results 1 - 10 of about 826,000 for mfcgoldmine.by.ru - correct result
Results 1 - 10 of about 4,460 for mfcgoldmine.by.ru - incorrect result
(off by 200 times)
Site http://cppgoldmine.uuuq.com
Search on site:cppgoldmine.uuuq.com
32,800 from cppgoldmine.uuuq.com - almost correct result, but lower than it should be
2,880 from cppgoldmine.uuuq.com - incorrect result
(off by 10 times)
Search on cppgoldmine.uuuq.com domain, without specifying site:...
Results 1 - 10 of about 543,000 for cppgoldmine.uuuq.com - correct result
Results 1 - 10 of about 4,320 for cppgoldmine.uuuq.com - incorrect result
(off by > 100 times)
Site: http://cppgoldmine.by.ru
Search on site:cppgoldmine.by.ru
25,400 from cppgoldmine.by.ru - almost correct result, but lower than it should be
315 from cppgoldmine.by.ru - incorrect result
(off by almost 100 times)
Search on cppgoldmine.by.ru domain, without specifying site:...
Results 1 - 10 of about 513,000 for cppgoldmine.by.ru - correct result
3,060 for cppgoldmine.by.ru - incorrect result
(off by 150 times)
Site http://javagoldmine.uuuq.com
Search on site:javagoldmine.uuuq.com
10,400 from javagoldmine.uuuq.com - incorrect, but closer to reality (which should be > 40,000)
3,190 from javagoldmine.uuuq.com - incorrect result
Search on javagoldmine.uuuq.com domain, without specifying site:...
Results 1 - 10 of about 459,000 for javagoldmine.uuuq.com - correct result
Results 1 - 10 of about 6,200 for javagoldmine.uuuq.com - incorrect result
(off by > 70 times)
Site http://tarkus01.by.ru
Search on site:tarkus01.by.ru
71,200 from tarkus01.by.ru - correct result
15,400 from tarkus01.by.ru - incorrect result, but better than other incorrect results
Search on tarkus01.by.ru domain, without specifying site:...
Results 1 - 10 of about 471,000 for javagoldmine.by.ru - correct result
Results 1 - 10 of about 13,600 for tarkus01.by.ru - incorrect result
Google does not index all pages of a site, and it can drop pages over time.
There can be many reasons for pages being dropped, and it is hard to say
which applies without looking at your site and going through the articles.
Are your articles unique content? Most article sites that take
submissions from the public have major problems with duplicate content,
because people submit their articles to many different sites.
You need to go through your sites with something like Copyscape to find
out whether people are submitting original content.
Kevin
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
1 person says this answers the question:
1) Google is not obliged/required to crawl/index anything.
2) Google will typically only crawl/index a set % of a site anyway
- the % may vary based on things like Trust, Authority, Popularity,
Internal Link structure, server responses and response times etc.
3) Crawling does Not = Indexing.
4) Indexed does Not = shown in SERPs.
5) Google may/can/does Filter results in the SERPs ...
It may decide Not to show some URLs if it sees them as Duplicates
(full/partial - Internally/Externally and/or due to Canonical issues).
It may decide Not to show some URLs if it perceives them as being "weak"
(little/no content, little/no original content, no internal links, poor response times,
poor response history etc.).
6) Results given may vary based on DataCenters - G's info is on multiple networks
- the DC you speak to may change based on the response speed of the DCs, your ISP,
your Browser, the time of day etc.
7) The figures shown by G tend to be "estimates" or "guesses"
- as you click through the Pager Links at the bottom, the figures tend to change.
(It may say "of about 5000" on page one, go to page 25 and it may say
"of about 1200", and if you go to page 79 it may say "of about 800".)
8) You may find that Google is actually "consolidating" its figures.
The figures you saw before could have been wild guesses - but now that G has had time
to properly crawl the site, it has realised that it actually only has X pages,
and never had Y pages in the first place!
9) The only way to have a better idea (but still NOT likely to be 100% accurate!)
is to click through ALL the pager links!
Due to the size of the site, using the site: operator is likely to be ineffective
- instead you should use the domain plus a Directory, or possibly even a SubDirectory...
site:yoursite.com/directory1/ site:yoursite.com/directory1/subdirectory1a/ etc.
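To make that click-through less tedious, here is a rough Python sketch of how one might log the "of about N" estimate from each results page. The URL format and the "of about" phrasing are assumptions based on the SERP lines quoted in this thread; scraping Google this way can break at any time and may be against its terms of service.

```python
import re
import urllib.parse

def build_serp_url(query, start=0):
    """Google result-page URL for the given query, 10 results per page."""
    return "http://www.google.com/search?" + urllib.parse.urlencode(
        {"q": query, "start": start})

def parse_estimate(serp_text):
    """Pull the 'of about N' estimate out of a results page, if present."""
    m = re.search(r"of about ([\d,]+)", serp_text)
    return int(m.group(1).replace(",", "")) if m else None
```

Fetching `build_serp_url("site:yoursite.com", start)` for start = 0, 10, 20, 30, ... and printing `parse_estimate(...)` for each page would turn the reported 4th-page jump into a log entry instead of something you have to catch by eye.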
OK, I've had a quick look at the site. You have problems with the way you
are running it - you have www and non-www versions of the pages, and
you also offer a frames version and a non-frames version of the site.
You have definite duplicate-content issues, as your site provides
a lot of sample code that is found elsewhere on the web in a much more
user-friendly (and search-engine-friendly) form.
I would suggest you look at the way you are delivering the site
to see how you can make it smoother and more user-friendly -
and then you need to deliver some unique content.
Kevin
Well, the issue is not whether things go up and down. I don't even question that at the moment.
The issue is why the numbers jump by tens and even hundreds of times.
The articles are unique, except that an article relevant to more than one
chapter is included in each of those chapters. Within each chapter the
articles are guaranteed to be unique. It is hard to estimate the exact number.
Sure, there are some things that need to be polished, but that is a matter of priorities.
Right now the main priority is identifying the cause of these dramatic
jumps. Once we figure that out, we can go and take care of the finer points.
I am not even concerned with optimising the sites for the absolute best
ratings or trying to get into the #1 position. There is sufficient traffic and it is steady.
But it did drop threefold while the number of articles indexed dropped thirtyfold.
Sorry, can you give me a reference to that page?
I checked that one just the other day and it was OK. I'm not sure it is
the same site you are looking at.
"and you also offer a frames version and a non frames version of the site."
Well, I thought this is a non issue because they are both accessible from
the 2nd level page and site is crawlable via non frame version of the site.
Actually, the chapter index is exactly the same for both version. The only
thing I can think of to be a problem is that it uses "target", and I was thinking
of creating a separate index page that does not use target, if that will change anything.
All we can do is reduce the chances of losing pages in the index; the web
is a dynamic place. Duplicate content is the biggest reason for pages being
dropped, but there could also be a time factor involved (though whether 1 year
is enough for that, I'm not sure).
One of my original sites has over 3,500 pages, but Google now has only
290 pages indexed, and that's a site that has been around well over 10 years.
I'm quite sure that somewhere in the Google index there's a popularity factor
whereby, if no one calls up your page for x number of months or years, it will get
dropped, because even Google cannot hope to store everything that nobody wants.
Indexing can go up and down, and I'm sure that is related in some way to
what people are searching for.
Kevin
Well, I do not expect indexing to be steady as a rock. Obviously.
It did fluctuate during these 6 months, and quite a bit, which is understandable.
But what I am after is these jumps by tens and even hundreds of times
while doing the same thing. And there is something very consistent in those
jumps, and it IS reproducible.
What happens is this: when you do a search, most of the time it gives you
the lower number, say 1,000 pages. But if you use the > link on the SERP, at
exactly the 4th page, and CONSISTENTLY on the 4th page, it will all of a sudden
jump tens, if not hundreds, of times, say to 10,000 pages.
AND, on top of that, even if you continue to navigate the SERPs,
it stays at that higher number. Interestingly enough, after it jumps to the higher
number, if you do a domain search (not site:...), that number also jumps
accordingly, say from 5,000 pages to 500,000 pages, and also stays
there even if you navigate through the SERPs or randomly select a different starting page.
I AM aware of the fact that the SERP estimate changes from page to page, and it can
change dramatically, even by tens of times. But what I am seeing here is
totally different behaviour. If it were jumping the way it usually does, it would not
stay consistent at the higher number once you pass that magic 4th page
in the SERPs, as these show:
Results 31 - 40 of about 37,900 from mfcgoldmine.by.ru. (0.13 seconds)
Results 31 - 40 of about 73,400 from tarkus01.by.ru
Results 31 - 40 of about 33,200 from cppgoldmine.uuuq.com
Results 31 - 40 of about 25,400 from cppgoldmine.by.ru
etc.
As you can see, it is EXACTLY the same condition.
Now, interestingly enough, you never jump to that much higher number
if you simply keep redoing the search by pushing the Search button again, no
matter how many times.
And it is not always reproducible by doing Next on a SERP. If you are not
lucky, and the count did not jump to the higher number when you hit that
magic 4th page, then from then on, no matter what you do, you won't be
able to get it to jump to that number.
Thanks, I'll look at that site.
I was mostly dealing with the uuuq.com domain sites as a reference.
There is a lot of work in maintaining all these sites, and things may and do get out of sync.
By the way, that high number IS consistent. It does not jump to a completely
different higher number, as it would if you were just navigating the SERPs. It is
ALWAYS the same exact number for a given site, whether you navigate the SERPs
with the next/prev buttons or randomly choose some SERP page. This is not the
behaviour I usually see while doing next/prev.
Also, I have verified quite a few pages by randomly jumping to different SERPs
and clicking on some link. They were all valid, existing pages from all sorts
of places in the world - Indian, Chinese, Japanese, Arabic sites, you name it.
I'm not sure I saw a single page that did not exist.
That means the large number is correct. It is not just some kind of bug or
a totally off-the-wall estimate. Whenever I have seen the estimate jump
dramatically while navigating a SERP via the next/prev links, it ALWAYS jumped
lower, never higher, and especially not by tens or hundreds of times.
This is a different animal we are dealing with here, and this started happening
relatively recently. I noticed it accidentally within the last week. I had been
watching the index steadily and quite radically declining over the last couple
of months, and was trying to figure out why that would be, until all of a sudden
I did a site: search and it jumped hundreds of times higher, and stayed there
no matter what I did.
Then, after a couple of hours, if I repeated the search for the same site, it would
jump back to the lower number (with the usual variation of +/- 10% or so), and there
was nothing I could do to make it jump to the higher number, until I magically
stumbled upon it again and noticed that magic 4th-page phenomenon.
And that is exactly what I am talking about here.
There are screenshots above. You can look at them. Maybe they will give you
some idea. Who knows?
Basically, the bottom line for me is this:
I have no problem with the higher number, because according to my estimates
it does correspond to the real state of affairs, considering the G bot may find
things I did not even suspect. That is all fine with me and I am not questioning any of it.
But the issue is this: the higher number is not just an aberration. It corresponds
to what the indexing eventually converged to, and the very existence of that much
lower number is the obvious suspect, regardless of all other criteria and factors.
By the way, if you need more sample data of any kind, such as a sample history
with exact timestamps and all the exact numbers, just ask. We can get that
in the wink of an eye, including all the other screenshots showing the correct
high numbers in the SERPs.
1) Google is not obliged/required to crawl/index anything
Not applicable to this case.
2) Google will typically only crawl/index a set % of a site anyway - the % may
vary based on things like Trust, Authority, Popularity, Internal Link structure,
server responses and response times etc.
Not applicable to this case.
3) Crawling does Not = Indexing
Not applicable to this case.
4) Indexed does Not = shown in SERPs
Not applicable to this case.
5) Google may/can/does Filter results in the SERPs ...
It may decide Not to show some URLs if it sees them as Duplicates
(full/partial - Internally/Externally and/or due to Canonical issues).
It may decide Not to show some URLs if it perceives them as
being "weak" (little/no content, little/no original content,
no internal links, poor response times, poor response history etc.).
Not applicable to this case.
6) Results given may vary based on DataCenters - G's info is on multiple networks
- the DC you speak to may change based on the response speed of the DCs, your ISP,
your Browser, the time of day etc.
Not applicable to this case.
The variations between different DCs normally produce a totally different picture
in terms of sample deviation and the stability of the post-settle period. (Post-settle
means that once you hit that higher number, it does not vary from there, while in the
situations you describe it will indeed change, and relatively wildly, as has been
observed on numerous occasions before.)
7) The figures shown by G tend to be "estimates" or "guesses"
- as you click through the Pager Links at the bottom, the figures tend to change.
Not applicable to this case.
(It may say "of about 5000" on page one, go to page 25 and
it may say "of about 1200", and if you go to page 79 it may say "of about 800")
Not applicable to this case.
The highest possible number on the SERPs is the number you hit the first time
you push the Search button. From then on, no matter how you navigate the
SERPs, you can NEVER get a number higher than the initial number you
got when you hit the Search button.
8) You may find that Google is actually "consolidating" its figures.
The figures you saw before could have been wild guesses - but now that G has had time
to properly crawl the site, it has realised that it actually only has X pages,
and never had Y pages in the first place!
Not applicable to this case.
9) The only way to have a better idea (but still NOT likely to be 100% accurate!)
is to click through ALL the pager links!
Due to the size of the site, using the site: operator is likely to be ineffective -
instead you should use the domain plus a Directory, or possibly even a SubDirectory...
site:yoursite.com/directory1/
site:yoursite.com/directory1/subdirectory1a/
etc.
Not applicable to this case.
But thanks for the suggestion. I have tried these kinds of things as well.
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
Hold on one minute.
PLEASE give us some URLs of these "searches":
A Google SERPs URL showing the Low figure,
A Google SERPs URL showing the High figure,
A Google SERPs URL for the "domain search".
One more time: I can get you all the screenshots for the site:... and domain
types of searches showing the higher numbers. The lower numbers you can get
yourself using those URLs.
I have all the screenshot images saved in files. Just tell me which ones you
want to see, because there are quite a few of them.
I'll make you a page where you can verify it 100%.
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
Screenshots do NOT explain WHY your site is so special that it wouldn't suffer ANY
of the typical variances that occur in crawling/indexing/ranking and/or
SERPs composition/display.
I want to KNOW why your site/URLs are not subject to such things.
I am not talking about the site being "special" or not.
It is a different issue - the issue of the CONSISTENCY of SERP behaviour
under the same exact conditions, regardless of how "good" or "bad" you think
that site MAY look to the G bot.
As far as SERP variations go, even from navigating by next/prev links,
I AM aware of those, and they have been described above.
I am talking about THE SAME EXACT conditions within the context of
the same site. SERPs do not just wildly jump up and down. Again, the
results can only DECREASE from the initial SERP you get when you hit that
Search button. They should never jump higher than that number, which is
exactly what we are seeing here.
That search criterion matches anything that contains that string - whether it is on that domain or not!
So if ANYONE talks about your site and mentions your domain name, then that page may be included in those SERPs!
If I went and put "mfcgoldmine.uuuq.com" on one of my sites...
in a few days, that result figure might move up to
"of about 1,030,001"
instead of the current
"of about 1,030,000".
See that?
It would go up by 1 because I would have put that bit of text -
the one that matches what you are searching for - on a page somewhere.
Naff all to do with your site, naff all to do with what Google has indexed from your site.
Is this making any sense at all?
(Because that initial number is the ABSOLUTE BEST number
for that site as seen by the particular DC that happened to serve
that result at that particular instant.
No matter how you navigate the SERPs from then on, you cannot
possibly jump to a number tens or even hundreds of times higher
than the one on the INITIAL SERP.)
"What I am seeing is this
http://img10.yfrog.com/img10/8379/mfcgoldmineuuuqcomgoogl.png
WHICH IS WRONG!!!!
That Search criteria means anything that contains that string - whether
it is on that domain or Not!"
- Not applicable to this case.
Because, whatever that number happens to be, it should always come out more or less
the same, within reasonable variation.
But it should NEVER jump from 4,000 to 3,000,000,
no matter what kind of search you perform.
This just not how things work.
"If I went and put "mfcgoldmineuuuq.com" on one of my sites...
in a few days, that result figure may move up to be
"of about 1,030,001"
instead of hte current
"of about 1,030,000"
"
Fine, no argument about that.
But...
That number should NEVER go down to 4,000 from 3,000,000
in a SINGLE session, no matter which DC you happened to have
hit, and no matter what stage of transaction (updating that exact
site on that exact DC at that exact time) happens to be.
It should be consistent within a reasonable range,
which is calculated as follows:
Crawl rate per day (or index update period, irrelevant)
as derived from your rule set divided by total number of pages indexed
on your site, given your particular ranking.
THAT is your expected variation of the outcome.
ALL other conditions are irrelevant and are not applicable.
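The expected-variation rule described in the preceding paragraph can be written out directly. The function below is simply that verbal formula turned into code; the figures in the comment are hypothetical, not measured from any of the sites in this thread.

```python
def expected_variation(crawl_rate_per_day, pages_indexed):
    """Expected relative day-to-day swing in the indexed-page count:
    pages (re)crawled per day as a fraction of the total index size."""
    return crawl_rate_per_day / pages_indexed

# Hypothetical figures: 500 pages crawled per day against a 50,000-page
# index gives an expected swing of about 1% - nowhere near a 40x jump.
print(expected_variation(500, 50000))  # 0.01
```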
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
But you are NOT going to the same DC every time - are you!
There could be a bottleneck, there could be a recurring issue with your site,
there could be various other issues.
But until you are testing it properly, you do not know.
Go and dig out a Google DC IP.
Start doing your tests on that same DC IP.
In fact, go and pick 3 or 4.
Log the results from those.
Then you may get more insight.
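A minimal way to run that logging, assuming you have already dug out one or more DC IPs (the address used in the example below is a placeholder, not a real Google DC): pin each request to a fixed IP by connecting to it directly and sending the normal Host header, then record each observed estimate with a timestamp.

```python
import time
import urllib.parse

def build_dc_request(dc_ip, query):
    """URL and headers that pin one query to one datacenter IP.
    dc_ip is a placeholder - substitute an address you have verified."""
    url = "http://%s/search?%s" % (dc_ip, urllib.parse.urlencode({"q": query}))
    return url, {"Host": "www.google.com"}

def log_line(dc_ip, query, estimate, now=None):
    """One tab-separated record for the suggested results log."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return "\t".join([stamp, dc_ip, query, str(estimate)])
```

Repeating the same query against each fixed IP every hour or so, and comparing the logged estimates per DC, would separate genuine per-DC differences from the 4th-page jump described earlier.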
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
Furthermore, your "equation" does not include any value, nor variance,
for "pushes" to the various DCs, nor for the possibility of "roll-backs".
(The latter being more than understandable - G isn't likely to tell us how often
that may happen, the extent it goes to, nor which sites were affected.)
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
Basically, you are viewing it as "flat" or "linear"... it is not.
It is matrixed and multi-dimensional.
There are numerous DCs that get updated at different times, holding data from different times.
Further, if there are any "SERPs compilation changes" due to things like dupes/weak pages etc.,
they may not be applied to all DCs at the same time... the change may roll out and catch up... or,
if your site is responding poorly on occasion, it may result in "waves" through the
DCs as the responses fail/improve/fail/improve... or as those of other sites do, etc.
"But you are NOT going to the Same DC every time - are you!"
ALL DCs are updated within the same 24 hr. period, more or less,
without knowing the exact scheduling of events.
That means, that ALL of them, at some point during the 24 hr. period
would come to the same number for your particular site, which is
exactly what I have observed upto now. The updates of one of your
domains may create a variation between different DCs DURING THE
TRANSACTION TIME ONLY, which lasts for a given site no longer
than a couple of hours, considering it is done in even smaller chunks.
So, after a couple of hours, ALL DCs stabilize as to your particular site
and all of them should show the same exact number, which is what we
are seeing here all the time.
REGARDLESS of bottlenecks and granularity of transaction.
Meaning, by "grand" transaction as to your particular site, is COMPLETE
update to bring that particular DC to full correspondence with the results
of the latest update.
That "grand" transaction is not necessarily performed all at once,
but is broken into smaller transactions.
After "grand" transaction is finished, ALL DCs contain the same data.
So, whenever you see the variations in your FIRST SERPs page,
that means a "grand" transaction is in progress for that particular domain.
During that time, you WILL experience MAXIMAL variations, that, at the
same time, can not possibly exceed the max. crawling rate given a
particular set of parameters for your ranking, etc.
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
Of course, we are also discounting the possibility of Resource Preservation
- maybe G introduces cut-offs on the allocation of resources for the estimates etc.
Maybe it's only done on certain DCs.
Maybe it's only done at certain times of day, or on every X number of requests.
Understood. I am not viewing this as some kind of flat, linear, single-dimensional event.
Trust me.
My own filters are multi-dimensional too. :--}
And I have a pretty good idea of exactly what is involved in multi-DC updates,
except that it is too technical for this level. We are already way too deep into technical details.
But again, the bottom line is this: it can NEVER jump from 3,000,000 down
to 5,000, no matter what.
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
You are clearly wrong - because it is happening.
That suggests there is either a flaw in the reasoning, a flaw in the equation,
or a flaw in the information you are basing all this on.
And, before you argue that point, think on it...
IF the SERPs are showing these varied results, then it IS happening.
Irrefutable proof - correct? Supplied by yourself - correct?
Therefore, it is possible.
There ARE strategies to ensure you do not get such drastic deviations.
Basically, either a transaction completes,
or it does not.
If it does not, that means your system is broken and in an inconsistent state.
And if THAT is the case, NONE of your existing data can be trusted.
You would have to unwind the transaction, get back to the state you were in
before it started, and then reschedule it and do it again later,
no matter what kind of resource bottleneck there is.
OR
the transaction does complete, no matter how long it takes, depending
on traffic jams, resource starvation or all sorts of other things.
It all translates into your propagation delay (meaning the length of time
it takes to complete a "grand" transaction for that particular time).
ALL other cases indicate a fundamentally broken system,
which can never come to a state of consistency.
Meaning: ALL RED ALARMS ARE ON,
and I am not going to tell you which is the MAIN one.
So, let us keep it within the same scope of reality.
The program on your box either works, or it does not,
at least as far as major functionality is concerned.
When you write to a file, you reasonably expect ALL the data to be written,
regardless of how small your memory is and how many threads you are
running at the moment. It ALL translates into propagation delay, or a TOTAL
deadlock - such as when you run out of disk space and there is nowhere to
write, or you run out of memory and out of swap, so no more memory can be
allocated or swapped. You are in a deadlock state.
In a deadlock state, G stops responding to ALL sorts of things.
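The write-it-all-or-not-at-all expectation described above is exactly what an atomic file update gives you. A minimal sketch using the standard write-temp-then-rename idiom (nothing specific to Google; purely an illustration of the all-or-nothing property):

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace the file at `path` with `data` all at once: readers see
    either the complete old contents or the complete new contents,
    never a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
        os.replace(tmp, path)  # the atomic step (POSIX rename semantics)
    except BaseException:
        os.unlink(tmp)
        raise
```

If the write fails partway, the temp file is discarded and the original is untouched; the rename only happens once the new data is fully on disk.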
Do you comprehend what we are talking about?
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
Yes.
And you appear to be making assumptions about how Google handles the push/update process:
Is it fed from a single origin to all the others?
Is it fed from DC to DC in a predefined sequence?
Is it fed from DC to DC in a varied sequence based on load/resources?
Is it fed en masse every time, to each DC?
Is it fed en masse some of the time, to each DC?
etc. etc. etc.
Then you have to examine how they handle the flagging of issues...
If resources are bottoming out, does it slow up or cease?
If it ceases, does it quit and have to restart, or does it quit and resume?
Does a cessation affect others in the chain?
There are LOTS of potentials there.
Do you KNOW how G does it - or are you assuming?
Do you KNOW how G handles any encountered issues - or are you assuming?
And another spanner in the works - go back and look at your calculation of
how much is indexed and what the range should be.
Where does it factor in the potential removal of previously listed data?
You have 10 pages indexed.
You get between 2 and 5 indexed per day.
That gives you, within the next 24 hours, 10-15.
Then factor in that G may have decided that 6 of the original pages were junk.
That means it may total 4-9 pages - that's fewer than you started with!
Then add in that you may be talking to a different DC - one that is a little "out
of it"... it may still be showing data from the last update period...
so you may have 5 to 8 pages indexed.
Now what happens if the "junk" data is pushed ahead of the recently indexed data?
You may have even fewer pages listed... (technically it could be at -1 for a short period, yes?)
.
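Tightening the arithmetic in that example slightly (10 pages plus 2-5 new, minus 6 junked, gives 6-9 rather than 4-9, though the point stands either way), the bookkeeping looks like this:

```python
def indexed_range(current, new_min, new_max, junk_removed=0):
    """Possible indexed-page count after one update cycle: current count,
    plus the range of newly indexed pages, minus pages dropped as junk."""
    return (current + new_min - junk_removed,
            current + new_max - junk_removed)

# 10 pages indexed, 2-5 new per day, 6 of the originals junked:
print(indexed_range(10, 2, 5, 6))  # (6, 9) - fewer than you started with
```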
Is that making sense?
And I'm not convinced about the "24 hours" either.
I've seen some sites take more than 3 weeks for a change to appear in the SERPs.
It depends on how "important" the site is in G's opinion.
So there is yet another missing factor to include... what happens
if G decides that your site is not as important all of a sudden?
That means not only may the update take longer...
but you may get fewer pages crawled... and possibly even less in the index from that crawl!
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
JohnMu (Google Employee) and 3 other people say this answers the question:
Again, you are being assumptive.
How do you know which figures are correct?
Particularly in light of the fact that you are doing a search for a "term" and
not for a site:
site:whatever.com
will yield data for a specific domain;
whatever.com
will yield data from multiple domains.
That means you may be dealing with a damn sight more variance.
.
I am NOT stating you are wrong.
What I am stating is that I think you are not viewing it correctly,
not taking numerous other potentials into account, and are generating skewed results.
(That doesn't make you wrong - it just means the examples provided are next to useless.)
Top Contributor - Webmaster Help Bionic Poster - 7/4/09
Now, personally, I've had enough.
I've given multiple examples of your equation being non-exhaustive and lacking
additional factors.
I've explained some of those other factors.
I've pointed out that some of your data is non-relational.
You are simply going to sit there and refute it.
Do so at your leisure.
I personally cannot be bothered to waste any more time on someone who isn't
going to acknowledge the potential flaws... and I doubt anyone else will either.
So please, do sit here and continue to be assumptive.
If you are lucky, someone from G may pop in.
.
Best of luck in getting an answer, though.
"And you appear to making assumptions on how Google handles the psuh/update process.
Is it fed from a single origin to all others
Is it fed from DC to DC in a predefined sequence
Is it fed from DC to DC in a varied sequence based onload/resource
Is it fed enmasse every time, to each DC
Is it fed enamasse some of the time, to each DC
etc. etc. etc. etc. etc. etc. etc."
Correct. All these things were considered.
"You have 10 pages indexed.
You get betwen 2 and 5 indexed per day.
That gives you, within the next 24 Hours, 10-15.
Then factor in that G may have decided that 6 of the original pages were junk.
That means it may total 4-9 pages - thats less than you started with!"
There are rules in transaction processing, which force you to introduce
only predictable amount of new data to the system so the whole system
still remains CONSISTENT. The principle of consistency is the MOST
critical principle.
So, at EACH stage of the game, no matter what kind of multi-dimensional
anything you can even begin to conceive, your system MUST remain
CONSISTENT. If, at ANY given point, it is inconsistent, it means DEATH
to the whole system.
That is why databases and SQL work.
Otherwise, all the computers and all the business on the planet Earth would stop.
In multi-dimensional systems, you always have a delta (a smallest part of transaction).
So, you introduce YOUR delta into some DC's context.
Once that DC updates itself with YOUR delta,
it then provides YOU with ITS delta.
At this point you are both consistent with each other.
So, yes, the issues are highly complex, because there are not just 2 DCs,
but many. But at any given junction, there is either reconciliation between
DCs one to one, or of ALL delta DCs against the MASTER, or reference, DC.
So, transactions may be conducted via three-way exchanges, in smaller
stages. But this is WAY beyond the scope of this thread.
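The pairwise delta exchange described above can be sketched in a few lines. This is an illustrative toy model only - the class and method names are mine, and nothing here claims to reflect Google's actual datacenter update mechanism. It just shows how two replicas that swap their pending deltas end up mutually consistent:

```python
# Toy sketch of pairwise delta reconciliation between two replicas.
# All names are illustrative; this is NOT Google's actual mechanism.

class Replica:
    def __init__(self, name):
        self.name = name
        self.docs = set()      # document IDs this replica has indexed
        self.pending = set()   # local changes not yet shared (the "delta")

    def index(self, doc_id):
        """Locally index a new document, creating a pending delta."""
        self.docs.add(doc_id)
        self.pending.add(doc_id)

    def exchange(self, other):
        """Two-way delta exchange: each side applies the other's delta.
        After this call, both replicas hold the same document set,
        i.e. they are mutually consistent."""
        my_delta = self.pending
        their_delta = other.pending
        self.docs |= their_delta
        other.docs |= my_delta
        self.pending = set()
        other.pending = set()

a = Replica("DC-A")
b = Replica("DC-B")
a.index("page-1")
b.index("page-2")
a.exchange(b)
assert a.docs == b.docs == {"page-1", "page-2"}
```

With many replicas the same idea applies pair by pair, or with every delta replica reconciling against one reference replica; the invariant is always that each completed exchange leaves the two parties consistent with each other.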
Tell me, is there ANY way some developer from appropriate department
may get to look at this data?
Is there a developer's forum or something of that kind?
Finally, both the lower and the higher numbers ARE consistent,
which indicates we are not getting some random data in the middle
of a transaction. Otherwise, we would be getting a random lower number.
But that lower number remains the same, within the normal variation typical
of updating any single DC.
Top Contributor
Webmaster Help Bionic Poster
7/4/09
>>>
Ok, I see what is going on. It is a bug.
So what? In what way does this "bug" affect you, or me, or anybody else?
In all those threads you opened for this crap, I've not once seen any kind
of explanation of why you freak out over a couple of contradictory numbers.
Why don't you stick to monitoring and increasing traffic instead of making a
fuss over something you don't have and never will have any control over?
Anyway, I find your examples quite questionable, and don't really understand
what you want to say. Whenever you do this search:
http://www.google.com/search?q=mfcgoldmine.uuuq.com
(3.800 for the complete string or 1.030.000 for the string and parts of it,
your site does by NO means have a million references *LOL* you better forget that
erroneous newbie assumption right away *LOL*)
you're not querying for your domain or something, you're searching
for the string "mfcgoldmine[dot]uuuq[dot]com" inside the content of
your and other sites, whereas searching:
Top Contributor
Webmaster Help Bionic Poster
7/4/09
pgelqued,
there is no point discussing numbers larger than 1,000 returned by the search info, because
they are only estimates,
and you cannot go beyond 1,000 results in search listings anyway.
Also, the total number can be affected by duplicate content
found by Googlebot at the time you do the search.
Autocrat is right:
specify a URL that is not indexed in search results, and that you expect to be indexed.
And please make your posts shorter, because the way you write them
makes it very difficult for people to follow what problems you are raising.
Do you think this answers the question?
Top Contributor
Webmaster Help Bionic Poster
7/4/09
1 person says this answers the question:
Seriously people - it's not worth it.
I spent ages last night pointing out the various holes/flaws/problems with the whole thing,
the OP isn't interested in being shown they are wrong.
They want to be told they are right ... or that Google is wrong.
I suggest just boycotting this one and letting them get on with it.
(And thanks JM ;) )
Do you think this answers the question?
First of all, dear luzie, Autocrat, cristina and others who get so angry
(I can just hear the stomping of your feet and the grinding of your teeth), let me ask you this:
Why are you SO upset about this, like Google's LIFE depended on it?
If you are so convinced of your totally erroneous and utterly inapplicable conclusions,
and I mean EVERY SINGLE ONE OF THEM, then just move on, and do something
creative for once in your lives, instead of engaging in insult, ridicule and outright
harassment.
What is at stake here for you?
Why does it bother you so much?
Why do you need to attack someone with totally erroneous and insulting conclusions
that do not correspond to reality and hard facts and hard data as presented?
And ALL of you are "Top Contributors".
And ALL of you know the rules of conduct on these forums.
Does THIS kind of behavior create "positive user experience for everyone"?
I thereby make an official request to authorised Google personnel to take
appropriate action to stop this uncalled-for behavior. If these people behave
as they do, and this is not an isolated case by ANY means, and there is
plenty of evidence on record, then the very motives of their participation on
these forums are suspect.
Furthermore, I would like to mention this:
Several threads related to specific issues raised in this thread were removed.
Why?
WHO did a removal?
For what exact purpose?
Ok, let us look at one specific argument from your side:
luzie,
"http://www.google.com/search?q=mfcgoldmine.uuuq.com
(3.800 for the complete string or 1.030.000 for the string and parts of it,
your site does by NO means have a million references *LOL*
you better forget that erroneous newbie assumption right away *LOL*)"
Incorrect and TOTALLY irrelevant to the exact issue on the table.
The problem is that the search result cannot possibly fluctuate as badly
as we see here.
First of all, even if the search engine breaks the mfcgoldmine.uuuq.com string
into 3 different tokens, still mfcgoldmine, and especially cppgoldmine, are
so unique that the statistical chance that your search result will indeed refer
to one of the sites in question is probably well over 90%. But, in order to make a
definite conclusion, a detailed study needs to be conducted.
Secondly, there is no stemming involved here, if any of you know what it means.
If you don't, do a google search on search+engine+stemming.
But fine. Let us eliminate that component by performing a
LITERAL string search, that is, "mfcgoldmine.uuuq.com".
The CORRECT result for search on "mfcgoldmine.uuuq.com"
Results 1 - 10 of about 1,030,000 for "mfcgoldmine.uuuq.com". (0.32 seconds)
Still, EXACTLY the same as in the original case. Furthermore, it is TOTALLY
consistent with ALL previous samples going back several days, which tells you
what?
For the incorrect result, which is seen in over 98% of cases, you can perform the
search yourself.
The INCORRECT result for the exact same search under the exact same conditions,
obtained by simply pushing the Search button again, is:
Results 1 - 10 of about 3,700 for "mfcgoldmine.uuuq.com". (0.05 seconds)
This is TOTALLY inconsistent. A properly functioning system
cannot possibly produce these absolutely astounding variations
of over 60 times for the exact same search under the exact same conditions.
Search on "javagoldmine.uuuq.com"
Results 1 - 10 of about 460,000 for "javagoldmine.uuuq.com"
Clicking on Search button again produces this result.
Results 1 - 10 of about 7,200 for "javagoldmine.uuuq.com"
Conclusion: THESE RESULTS ARE TOTALLY INCONSISTENT
AND INDICATE THAT EITHER SEARCH ENGINE IS BROKEN
OR RESULTS ARE ARTIFICIALLY MANIPULATED.
ALL of the 1,030,000 results for "mfcgoldmine.uuuq.com" DO EXIST in the index,
within a reasonable range, considering that this is an estimate
that could conceivably vary within +/- 10%.
Therefore, the lower numbers are totally incorrect.
The consequences of such a behaviour on ANY kind of search are hard
to even begin to estimate without conducting a detailed study.
Correction: the search result for "mfcgoldmine.uuuq.com" is not 60 times off,
but 278 times!
And this kind of difference can never happen in a properly functioning system
under ANY conditions conceivable.
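For the record, the arithmetic behind this correction is trivial to check. A minimal sketch, using nothing but the "about N results" estimates already pasted in this thread:

```python
# Discrepancy ratios from the "about N results" estimates quoted above.
pairs = {
    '"mfcgoldmine.uuuq.com"': (1_030_000, 3_700),   # high vs. low estimate
    '"javagoldmine.uuuq.com"': (460_000, 7_200),
}
ratios = {query: high / low for query, (high, low) in pairs.items()}
for query, r in ratios.items():
    print(f"{query}: off by roughly {r:.0f}x")
# "mfcgoldmine.uuuq.com": off by roughly 278x
# "javagoldmine.uuuq.com": off by roughly 64x
```

So 1,030,000 / 3,700 is indeed roughly 278, not 60.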
All other snapshots of correct results are available and will be presented
in due time.
WARNING: If this thread is removed again, we are going to be dealing
with a totally different set of issues, and the results may not necessarily be
the ones you are trying to achieve.
Thank you for admitting you are the one who removed these threads.
As far as your exact statement:
"But it was me, because valuable volunteer time was being taken up to no useful purpose"
I can tell you this: NOBODY IS FORCING YOUR VALUABLE VOLUNTEERS
TO PARTICIPATE ON THIS THREAD.
Especially considering the fact that NONE of them have sufficient competence
in the issues of search engine internals. Otherwise, they would be working as
DEVELOPERS at Google, making a cool couple of hundred thousand dollars a year.
So, there is no need to even bother.
As I repeatedly stated already: I am only interested in COMPETENT opinion
by someone who can even begin to understand these kinds of discrepancies,
and that is either top level developers at Google or architects that know the
exact mechanics of search engine internals.
I would also like to mention that it seems totally inappropriate
to allow non-Google-authorised personnel to remove threads,
especially in the context of conflict-of-interest issues and the very fact
that the issues addressed in these threads lie WAY outside their
level of competence.
Top Contributor
Webmaster Help Bionic Poster
7/5/09
2 people say this answers the question:
> I can tell you this: NOBODY IS FORCING YOUR VALUABLE VOLUNTEERS TO PARTICIPATE ON THIS THREAD.
No and I don't need to recommend they stop, because they all seem to have taken that decision on their own.
As regards getting Google's attention - this week you pretty well won't. The ones who normally monitor this forum are either out of their offices doing other things or on vacation - monitoring by Google is pretty much "on demand" only.
Even the ones that do monitor the forum regularly when they're here have pretty much given up trying to talk to you. Nothing is broken, nothing needs changing, and even if it did you are only one of tens of millions of webmasters - why break something for ten million to please one? No one else - of all the millions - is having this problem. Believe me - Google is NOT going to change ANYTHING to please you.
It comes back to the fact that you have a spectacularly poor, mostly copied and garbled set of interlinked sites that are of very little use to anyone.
Do you think this answers the question?
"Believe me - Google is NOT going to change ANYTHING to please you"
It is not a matter of pleasing anyone. These results refer not to a single isolated case, but to several cases, and there are reasons to believe such discrepancies would cause similar discrepancies in other search engine queries regardless of site.
As to your opinion regarding the quality of those sites and the information they contain, it is simply inappropriate and not conducive to "positive user experience for everyone".
Furthermore, it is totally irrelevant as far as exact set of issues discussed are concerned.
As I said before: I did not come here asking for the ways to improve either ratings or performance. This is a non issue at the moment.
Furthermore, your very first statement on the original thread you admitted to having removed (Question: Google index drops like a rock) is this:
"Top Contributor Webmaster Help Bionic Poster 6/30/09 It's possible "selected from very large archives" is a euphemism for "stolen from the Library of Congress".
I, personally, find these kinds of questions HIGHLY offensive, and DEMAND an explanation, that can possibly justify this kind of position.
Do you have ANY evidence or facts or reasons to believe that this collection was either "stolen from the Library of Congress", or a subject of ANY kind of copyright related issues?
Furthermore, your following statements on the original thread and on this one have been successfully refuted.
"We don't need to prove it. Simple numeracy tells us this is an unusual phenomenon. There are around sixty million indexable domains on the planet. If 1% hit such problems we'd have 600,000 aggrieved webmasters posting here and in every newspaper that's printed.
The simple fact is - you're all on your own. One in sixty million is fifty times less likely than winning the UK's national lottery."
You cannot possibly prove such an assumption, and it is TOTALLY invalid to begin with, as has been described in the original thread, which YOU, personally, have removed, and I quote:
"Not necessarily. First of all, they were all told, and MANY times over, to the point of being zombified, that Google does NOT guarantee anything and their index MAY and DOES fluctuate, no matter what they think is reasonable.
Plus, how many webmasters do you think are willing to rock the boat and tell the business owner he is losing millions because of those wild fluctuations? Why would anyone in his clear mind do that? You see, it is much more profitable just to keep quiet and pretend you did not see any of it, cause there is nothing you can do anyway, instead of rocking the boat and possibly losing your job, if the boss learns that he lost millions "because of this clueless bozo, who calls himself SEO".
Get the picture?
Finally, how many webmasters even participate on these forums? According to the way you do YOUR statistics, it is less than 0.00000000000000000000000000000000001 % of all the webmasters in that huge ocean called Internet."
[...]
"How many web masters even KNOW they are having these wild fluctuations?
How many of them keep the constant watch of their google index?
How many of them take regular snapshots of their statistics, at least as to the number of total pages indexed by Google from your sites?
Does Google provide charts of the total number of pages indexed by Google from your site, even in Google Analytics?
How many webmasters do you think can produce a running report of their total pages indexed and post it here?
Would you like me to produce one for you and see if YOURS is as good as mine?"
And you had no argument on it whatsoever. So...
This is one more chance for you to prove your point in order to restore your tainted reputation.
"It comes back to the fact that you have a spectacularly poor, mostly copied and garbled set of interlinked sites that are of very little use to anyone."
It is simply outrageous. Simple as that, and the EXACT information has been provided to you on threads you have removed.
These sites happen to be the REFERENCE sites for a number of Universities and other educational institutions.
These sites are REGULARLY visited by the biggest software houses in the world, such as Microsoft, Sun, Intel, HP and other biggest names in software, hardware, business, banking and finance, leading world manufacturing corporations, governments and even the military.
These are probably the cleanest sites on the net as far as producing PURE content goes: without a single ad, on the cleanest pages that do not have ANY kind of visual garbage whose sole purpose is to milk the site for ad revenue - which is exactly why MOST sites on the net contain very little useful information on a page, as a ratio of useful information to total page size.
Some of the "top ranking" sites contain less than 10% of on-topic, useful information on every single page. It goes as far as having 2-3 sentences relating to the issue and topic, and pages worth of all sorts of advertising spam.
There are PLENTY of reasons to believe that the top ranking sites, at least as far as issues covered by Goldmine collections go, are in fact the biggest spamming sites there are.
The vast majority of the information they provide is nothing more than marketing spam.
This particular issue is one of the central issues of Goldmine collection organization. The article pages contain NOTHING but exact information, extracted with the most sophisticated filtering technology that exists at this junction, and are GUARANTEED to correspond to a chapter Title with WELL over 90% certainty.
The ratio of useful, hard-to-find, competent information to the total number of articles on a given chapter's topic is probably the highest you can find ANY place on the net in the context of similar information.
The amount of practical code examples and snippets on subjects covered by the collections is simply unprecedented; it allows one to find the answer to ANY conceivable issue or the most difficult problem one might have.
The VARIETY of examples, views, expert opinions on ANY given subject or topic reflects the best of the best, the state of the art and is probably the most valuable collection of similar information existing on the planet Earth at this particular junction.
Because...
Well, because of a simple fact: "If we don't have it, it probably does not exist".
Add to it: and if you find a more precise collection of similar information ANYWHERE on the net, that includes this kind of coverage, depth and precision, including, but not limited to Google's own collection, considering the ratio of valuable/total information, Microsoft, Sun, Intel, IBM, or you name it, I would be curious to see your references.
VAST majority of similar collections are simply a garbage dump, that contains every single article regardless of its appropriateness to a given subject or topic.
The chances of you finding truly useful information in that garbage dump are less than 1% in most cases, if not much worse than that.
So, I find these kinds of remarks by TOTALLY incompetent individuals, such as all those foaming at the mouth and throwing around all sorts of mud, insults, harassment, humiliation and ridicule, totally off base, totally ungrounded and totally incompetent.
All articles in collection archives are guaranteed to be unique with 100% certainty.
All articles in any chapter are guaranteed to be unique with 100% certainty.
SOME articles may appear in more than one chapter if the issues covered by that particular article are DIRECTLY applicable to a different chapter with well over 90% certainty, which is unprecedented for similar collections on the net.
ALL article pages are validated under the strictest HTML standards possible, and that is HTML 4.01 Strict.
Yes, as a result of recent changes, one or two articles out of an average of 50,000 articles in these collections do indeed have validation errors, which, nevertheless, do not affect the page rendering or the ability of Googlebot to index these collections to FULL extent. There is a guaranteed path from the top-level index page to every single article in a collection, regardless of what kind of browser is used and whether CSS is enabled.
There is no link stuffing, hidden text meant to be exploited in order to artificially cause the page rank to go higher, or any other tricks used to artificially inflate the ratings.
As far as "farms" go, the argument of the opposing side is totally invalid, as has been explained in the articles removed.
Each of these collections has at least one mirror site, which is 100% duplicate of the original site. It has exact same index pages, exact same articles and exact same everything.
There is no benefit to having a mirror site in terms of any kind of rating. No matter how many mirrors there are, there is only one article that can be viewed, and it does not matter from which mirror. The page view count is not going to go higher just because a particular page was accessed from a different mirror.
Mirrors are used extensively on the net to increase reliability and decrease the traffic load.
The Web has an inherent problem related to single points of failure. If a page is served from a single site, then any kind of attack on that site can take the whole information library offline.
Since these collections are used by professional programmers 24 hrs./day, and those programmers have the toughest issues to resolve in the shortest possible time frame, it is IMPERATIVE that these collections be protected by reliable mirrors. Our mirrors are some of the most reliable mirrors on the net, and page load time is among the best in the industry, for some very specific reasons that INHERENTLY make these mirrors the most reliable possible: they do not allow ANY kind of executable content, including, but not limited to, PHP, any kind of scripts, shell access, etc. The ONLY thing allowed on these mirrors is the simplest non-executable SSI statements, and THAT is one of the reasons these are some of the most reliable sites on the net.
Furthermore, Googlebot can EASILY discover that these mirrors are in fact mirrors and not just some trick to inflate the ratings.
http://jsgoldmine.uuuq.com (this one does not even have mirror)
On top of it, ALL of these collections clearly belong to the same Google webmaster account. There are no tricks used to hide ANYTHING.
So, Google has MULTIPLE ways and means to distinguish the essence of "interlinking" as far as any conceivable aspect goes.
Finally, if Google decides to penalize the valid mirrors, the net effect on the Internet will be disastrous.
First of all, it will drastically reduce the availability of most collections of information and all sorts of distribution channels. Some of the most valuable resources on the net will simply become inaccessible as they will be dropped from Google index, which, by now, is the biggest resource on the net and is recognized as a #1 choice for all searches on the net.
Top Contributor
Webmaster Help Bionic Poster
7/5/09
2 people say this answers the question:
Demand all you like sunshine.
You've been told your logic is faulty. You've been told that you are not including all the relevant factors. You've been told that your method of examining references is incorrect.
If you are too damn stupid to accept all of that - from multiple people ... and to just get on with it - that's down to you.
Do NOT expect anyone else to make the effort to aid you. do NOT expect anyone else to bother giving you any attention.
if I see you making multiple Topics about the same STUFFING THING - I'll * well delete/report!
Am I Clear?
(And I'm not kidding - I'm sick to damn death of your whinging and refusing to acknowledge you're wrong - but that is your choice. What I don't have to put up with is you trashing this forum/community, nor upsetting other posters or regulars whilst you're being stubborn!)
Do you think this answers the question?
I find these kinds of remarks totally off base and not in line with "positive user experience" issues and guidelines.
As for the "link stuffing": the exact URLs were given in the argument related to site interlinking so everyone could see exactly what we are talking about, and the accusation simply looks strange, especially considering the fact that these exact URLs already appear in the same thread, only in a different context.
Finally, again, there is no need to get upset to the point of blowing up. If you, personally, do not find this thread of interest to you, there is no need to interfere or even bother about it.
The issues ARE valid and specific evidence was provided, and, as stated before, not a single opposing argument so far corresponds to the exact issue being discussed in this thread.
Just relax. Why be so worked up about it, especially if it has nothing to do with your problems?
Or do you have some kind of vested interest in this information being suppressed?
Your reactions seem too strange, too overboard and too uncalled for.
Does anyone bother you? Does anyone ask for YOUR particular opinion?
You presented your position and it has been reviewed and evaluated to full extent.
Nice picture. I think some of the top programmers in the world that use the Programmer's Goldmine collections would find it pretty entertaining and representative of the "pleasant user experience for everyone" slogan.
Since YOU have posted this picture, you probably know what is the meaning of it.
I would just like to ask a question:
Why did you put up a picture of a young African American guy, whose hands could be tied behind his back, and who has a football stuffed into his mouth?
Is it a message to all people? Is this your opinion of what Google thinks about its customers and users?
Interestingly enough a couple of "Top Contributors" even clicked on "Yes" button, next to "Do you think this answers the question?"
You guys seem to have plenty of sense of humour.
Does it make you feel better about yourself? Like you are some kind of "elite", who has the authority not only to insult, ridicule, humiliate and harass people, as you do here all the time it seems, but, for some strange reason, is even allowed to delete other people's posts!
An unusually broad authority given to you by Google, I'd say. I am just curious, how does Google select its "Top Contributors" and what are their authorities and relationship to Google in general?
Are some of you PAID for your noble efforts to help those "clueless" webmasters all day long?
Top Contributor
Webmaster Help Bionic Poster
7/5/09
"... The issues ARE valid and specific evidence was provided, and, as stated before, not a single opposing argument so far corresponds to the exact issue being discussed in this thread. ..."
Hello? This is your early wake-up call from REALITY!
As stated (now several times) your methodology is FLAWED. Scroll up. Look at the response ticked as Best Answer. Look at who and what ticked it.
A Google Employee has visited. A GE has made a judgement.
Take the * hint!
Do you think this answers the question?
Autocrat, Becky Sharpe, webado, Phil Payne, cristina, luzie, Kevin-UK think this is the "Best answer".
Basically ALL the "Top Contributors" think the same way.
Well, I guess the majority opinion DOES define Truth. If everyone thinks the Earth is still flat, then it MUST be!
If everyone thinks that planet Earth IS the centre of the Universe, then it MUST be.
Otherwise, they would not burn some guys for making such proclamations.
But...
There is a very little but:
You see, properly working systems I know of, and I know plenty about that stuff, do not work like this. Otherwise... Everything is nice and kosher indeed.
But, it's been a pleasure to hear some enlightening views as to Google internals, the distributed nature of DC (Datacenters), the procedures and principles of transaction updates, roll-backs and all sorts of other useful things, including the difference between the mfcgoldmine.uuuq.com and "mfcgoldmine.uuuq.com" searches, that, for some strange reason, showed exactly the same result nevertheless.
Top Contributor
Webmaster Help Bionic Poster
7/6/09
1 person says this answers the question:
You can make out that you know what you are talking about as much as you like. But anyone reading this (sympathies to them) who sees the searches you were making will KNOW you are clueless. Anyone examining your method of calculation will also spot the numerous screwups, miscalculations, lacking factors etc.
In short - no one is going to think you are smart nor knowledgeable on this.
Too many errors, too many assumptions. I'm not a data engineer - and I can see how far fetched your approach is.
I honestly pray a Google Engineer pops in on this.
Just so you can shut your * cake-hole :D
Do you think this answers the question?
Results 1 - 10 of about 5,850 from mfcgoldmine.uuuq.com for threads
Search:
site:mfcgoldmine.uuuq.com
Results 1 - 10 of about 1,250 from mfcgoldmine.uuuq.com
I wonder if anyone can explain THIS one. What mere mortals would probably conclude is that, according to the SERPs, there are more articles in a single chapter than in the entire site.
Which one is right in this case? Are they BOTH right? One is right and the other one is wrong? BOTH wrong? None of the above? And ALL of above included?
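The contradiction can be stated mechanically: a query restricted to one chapter of a site can never legitimately report more results than a site: query covering the whole site, because the chapter's pages are a subset of the site's pages. A one-line check with the two estimates quoted above (variable names are mine, for illustration):

```python
# Estimates quoted above; variable names are illustrative.
chapter_estimate = 5_850  # "about 5,850 from mfcgoldmine.uuuq.com for threads"
site_estimate = 1_250     # "about 1,250" for site:mfcgoldmine.uuuq.com
# A subset count cannot exceed the count of the whole,
# so at least one of these two estimates must be wrong.
print(chapter_estimate > site_estimate)  # True, i.e. inconsistent
```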
"And now, the Truth has been spoken" -- Sankaracharya, India 5000 B.C.
Now, WHO said "They probably NEVER existed" or something of a kind?
Lemme see here...
Oh, I see, familiar faces, Top Contributors, those same people that for some strange reason delete a perfectly valid thread after all the "Top Contributors" had enough fun insulting, humiliating, harassing and abusing.
Interestingly enough, the Google employee came and told me something along the lines of: "be nice, these are nice forums, and all nice user experience should not be interrupted". Sure, this is not a literal quote, but we can dig that one up easily.
Anyway, here we go, and I quote:
Phil Payne Top Contributor Webmaster Help Bionic Poster 7/2/09
"> But can you tell me the possible reason for such a rapid blips in index (from 1410 to 59,600 and back to 1060, all in one day?
Yep, I know what you see in 98% of cases from my statistical estimates.
Now, can you add those few chapters above from that single site and see how many articles we already have, and we are about 10% into the site, as far as site size is concerned.
Impressive, I tell you.
Now, anybody is interested in seeing even MORE exciting numerology here?
And this one is hard to classify as anything but unbelievable. Why?
Well, do you know how many articles are there in Debugging Experts chapter?
Well, Articles: 1872, just in the latest run, and there are probably 3000 articles there as old versions that Google did not remove to this day, even though the sitemap was removed and the sitemap file itself was deleted as well.
They even went as far, as making claims that I am "spamming" this forum.
Who, ME?
And what about YOU? Don't you have a vested interest in this whole thing with your constant advertisements about your great services in your favicons?
This looks like about the best place under the Sun to collect some clients, while doing all this "volunteer work", doesn't it?
Because it really looks strange that all these high-powered SEOs are hanging out here, for free, as some of them state, trying to help the biggest name in the information business. You mean Google cannot afford to PAY COMPETENT personnel to take care of its own beloved customers, even being as big as Google is?
And you mean ALL these people are here to help ANYONE, but themselves? Doing all this work for free?
So far, I have not seen a single one of them who is genuinely interested in helping ANYONE. This seems like a feeding orgy to them.
On top of it, they, for some strange reason, are given the power to even DELETE any articles they like, regardless...
One more time for all the "experts", "Top Contributors" here.
The correct and incorrect Google search results are:

Search on site:mfcgoldmine.uuuq.com
61,800 from mfcgoldmine.uuuq.com - correct result
1,160 from mfcgoldmine.uuuq.com - incorrect result (off by a whopping 40 times)

Search on mfcgoldmine.uuuq.com domain, without site:...
Results 21 - 30 of about 1,030,000 for mfcgoldmine.uuuq.com - correct result
Results 1 - 10 of about 3,360 for mfcgoldmine.uuuq.com - incorrect result (off by a whopping 250 times)

Search on site:mfcgoldmine.by.ru
37,900 from mfcgoldmine.by.ru - correct result
1,490 from mfcgoldmine.by.ru - incorrect result (off by 20 times)

Search on mfcgoldmine.by.ru domain, without site:...
Results 1 - 10 of about 826,000 for mfcgoldmine.by.ru - correct result
Results 1 - 10 of about 4,460 for mfcgoldmine.by.ru - incorrect result (off by 200 times)

Search on site:cppgoldmine.uuuq.com
32,800 from cppgoldmine.uuuq.com - almost correct result, but lower than it should be
2,880 from cppgoldmine.uuuq.com - incorrect result (off by 10 times)

Search on cppgoldmine.uuuq.com domain, without specifying site:...
Results 1 - 10 of about 543,000 for cppgoldmine.uuuq.com - correct result
Results 1 - 10 of about 4,320 for cppgoldmine.uuuq.com - incorrect result (off by > 100 times)

Search on site:cppgoldmine.by.ru
25,400 from cppgoldmine.by.ru - almost correct result, but lower than it should be
315 from cppgoldmine.by.ru - incorrect result (off by almost 100 times)

Search on cppgoldmine.by.ru domain, without specifying site:...
Results 1 - 10 of about 513,000 for cppgoldmine.by.ru - correct result
3,060 for cppgoldmine.by.ru - incorrect result (off by 150 times)

Search on site:javagoldmine.uuuq.com
10,400 from javagoldmine.uuuq.com - incorrect, but closer to reality (which should be > 40,000)
3,190 from javagoldmine.uuuq.com - incorrect result

Search on javagoldmine.uuuq.com domain, without specifying site:...
Results 1 - 10 of about 459,000 for javagoldmine.uuuq.com - correct result
Results 1 - 10 of about 6,200 for javagoldmine.uuuq.com - incorrect result (off by > 70 times)

Search on site:tarkus01.by.ru
71,200 from tarkus01.by.ru - correct result
15,400 from tarkus01.by.ru - incorrect result, but better than other incorrect results

Search on tarkus01.by.ru domain, without specifying site:...
Results 1 - 10 of about 471,000 for javagoldmine.by.ru - correct result
Results 1 - 10 of about 13,600 for tarkus01.by.ru - incorrect result
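For anyone who wants to track these estimates over time, here is a rough sketch (my own, not part of the original complaint) of how the "about N" counts could be pulled out of copied SERP summary lines. The regex and helper name are assumptions for illustration only:

```python
import re

# Hypothetical helper (my own, not from the original post): pull the
# estimated result count out of a copied Google SERP summary line such as
# "Results 1 - 10 of about 61,800 from mfcgoldmine.uuuq.com".
COUNT_RE = re.compile(r"(?:of about\s+|about\s+)?([\d,]+)\s+(?:from|for)\s")

def estimated_count(serp_line):
    """Return the estimated result count embedded in a SERP summary line."""
    match = COUNT_RE.search(serp_line)
    if match is None:
        raise ValueError("no result count found in: %r" % serp_line)
    return int(match.group(1).replace(",", ""))

# Two observations of the same site: query, hours apart, as quoted above:
high = estimated_count("Results 1 - 10 of about 61,800 from mfcgoldmine.uuuq.com")
low = estimated_count("1,160 from mfcgoldmine.uuuq.com")
print(high, low)  # 61800 1160 - the two estimates differ by roughly 50x
```

Logging the extracted numbers with timestamps would document the oscillation described below far better than screenshots.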
Some quotes from the thread "Question: Google index drops like a rock" that was deleted by Phil Payne, as he himself admitted on this very thread:
"because valuable volunteer time was being taken up to no useful purpose. And my fingers are itching again."
Are you SURE about this?
Quote:
Ashley Level 4 6/30/09
"So, you can stomp your feet and say you're brilliant (you probably are, I'm not debating this), but if you want to show up in Google you have to play by Google's rules"
I see...
Quote:
Ashley Level 4 7/1/09
2 people say this answers the question: (I know. Heard that before, and not once :--})
"Sorry about that...
"I just got off the phone with Dr. Google and your sites will be Number 1 tomorrow. All tied for Number 1. And no, I'm not cheating. It's for REAL!"
Ashley, are you cheating again?
How many days have passed and Dr. Google doesn't even move a finger... Cause all those SERP counts are TOTALLY off the wall, you know? It is kinda upsetting... I am a technical guy and I can not rest until things are humming like a Bentley. Actually, the Rolls Royce Silver Spur has very nice seats.
Plus, what kind of offer is this? Are you trying to bribe me? :--}
Quote by the next poster in the same thread:
KevinKatovic 7/1/09
"maybe we should short bio tech and google tommoro if it doesn't happen ashley."
Well, how would I know?
Don't ask ME on my thread. I am not running it. That other "expert" (at you know what by now) is running it. Ask HIM, Well, Ashley will do just as fine. Plus we are friends with her by now, with all these nice and "positive user experiences for everyone". It has been a PURE pleasure.
If Ashley drops me a line to the address on this page...
Well, maybe we should not introduce ourselves so fast. Who knows... We still have about 9 hours till I lose interest in this thread completely.
Oh, I just thought of something. Lemme see here...
Jeez, I don't believe this. Google stock down gap opening on July 2? Closed down 10.50 - 2.51%
Wow! That bites. That's how much?
About $2 billion bucks in one trading session? Surely Google could afford to hire competent pros, and even send them into search engine internals too, if this kind of money is at stake.
Ashley, are you sure Mr. Google is in town and you did indeed talk to him on the phone on July 1?
I have to talk to my astrologer. Cause July 1 comes right before July 2 I thought, and then BOOM!
Could someone please essplain this to me, cause I must be even dumber than that Autocrat says. Just look at his favicon. Do you see what kinda guy he is? PURE authority, and you better believe it, or else...!!!...!!!...!!! Grrrrr!!!
Plus just the way he talks, it almost shattered my screen. I had to hold it so it does not fall down the table the last time I heard him talking about all that "nice, positive user experience for everyone".
"The author of Programmer's Goldmine Collections is a blowhard who should be ignored."
Quote:
Ashley Level 4 7/2/09
"Blowhard? Goodness Phil. I love it!"
That's about the only not-so-nice thing Ashley said. Otherwise, she is a fine lady, full of dedication to free service, plus she is working here all day long without pay. I think she deserves at least a nice dinner in a fancy restaurant.
Ashley, btw, how was that dinner with your nice husband? Was it nice?
Btw, I have a nice Help button. I think I am going to patent the idea of having a Help button on web pages. Do you think it'd work? I promise to take you and your nice husband to a very nice restaurant, especially if you give me your expert advice on some things. Wink, wink.
Well, looks like I got a cold. Sneezing and all that, and I have to push some buttons here and there so all those automatic things work as advertised. Otherwise, you never know, you never know.
What is the best holistic remedy for cold? I heard heating up the red wine with pepper and then drinking half a glass when it is really hot really helps. But some say boil some milk and add a spoonful of butter. What's the latest fashion?
Well, another search does not seem to work as advertised. But we need to take a closer look at it before we make too much noise about it. One thing seems to be clear: if we have to do the searches by chapter (by specifying the full path to some chapter articles), that can only further reduce the number of articles as reported by the SERP when you do a site:... search.
And, since the topmost search results (for a site:... search) are obviously wrong, and by orders of magnitude, further narrowing the search scope by specifying the full path can not possibly produce more articles than the topmost search reports, and that means...
Well, we need to look at that closer, but if this is in line with previous results, then the overall impact would be devastating to the whole net.
And the reason for being forced to do the search by chapter is that some "experts" stated that it is the proper way to extract the maximum number of articles from the site, however ridiculous that might sound. But fine. We'll get you those numbers also, but I just don't work for free. Sorry, and my rates are kinda steep, especially considering the fact that I had to waste months seeing the site:... index dropping on me by about 40 times on average, which is TOTALLY off the wall.
And if I have to go through hell for a couple of months, and, on the top of it, get humiliated, insulted and harassed by these "volunteers" from around here, and for DAYS on, and have my perfectly valid articles removed, what do you expect as a result, Dr. Google?
I would really like to see this happen by the opening bell:
I want to see a solution that will allow my users to perform a
search_string site:mfcgoldmine.uuuq.com
type of search, and what I'd like them to find is the MAXIMAL number of articles that do exist in the Google index, and yes, I DO understand that "Google isn't even obliged to index any of my articles, and Google could care less if I exist," AND "Google is ALWAYS 'right' and I am ALWAYS 'wrong', no matter what, even if black turns out to be white, or vice versa."
In other words, the proposal by one of the "experts" from around here to conduct the search by narrowing the scope, that is, using a subdirectory, in order to INCREASE the number of articles shown, is UTTERLY contradictory to the most rudimentary principles of logic.
That is, a search of the type site:my.domain is the widest possible scope for a search on a given site. That means that specifying a subdirectory, as in my.domain/subdirectory, can not logically widen the scope of the search.
It is like saying that the number of files in a subdirectory can be larger than the number of files in the parent directory, which is utterly wrong in its most basic logical consequences.
And yet, this is EXACTLY what we are seeing here. According to sample searches produced in this very thread, the subdirectories MAY contain a number of articles LARGER than the parent directory, which is UTTERLY and COMPLETELY WRONG, and in the most profound sense there is.
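The containment argument can be illustrated with a toy filesystem (my own sketch, nothing to do with Google's index): every file under a subdirectory is also under the parent, so the subdirectory count can never exceed the parent count.

```python
from pathlib import Path
import tempfile

# A minimal sketch (not from the original post) of the containment argument:
# every file under site/chapter1 is also under site/, so a count restricted
# to the subdirectory can never exceed the count for the whole site.
root = Path(tempfile.mkdtemp())
(root / "chapter1").mkdir()
(root / "chapter2").mkdir()
(root / "index.html").write_text("home")
(root / "chapter1" / "a.html").write_text("a")
(root / "chapter1" / "b.html").write_text("b")
(root / "chapter2" / "c.html").write_text("c")

site_files = {p for p in root.rglob("*") if p.is_file()}
subdir_files = {p for p in (root / "chapter1").rglob("*") if p.is_file()}

assert subdir_files <= site_files           # scopes are strictly contained
print(len(site_files), len(subdir_files))   # 4 2
```

The same reasoning applies to any scope-narrowing query: a subset can never be larger than its superset.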
Furthermore, regardless of ANY conditions whatsoever, the initial search result (the one immediately after pushing the search button) can not jump 100 or even 1,000 times lower, as we did see in our case, which is evidenced by the screenshots provided.
Furthermore, this jumping magic was utterly consistent with previous results during SEVERAL DAYS, not hours, not even minutes, which indicates we were not in the middle of a transaction, and there was no unwinding or fall-back of any kind. The system can not be unwinding a transaction for several days. That is simply impossible. Otherwise Google would cease to function.
On top of it, what we see here are the normal variations of the lower-number index, as usually happen during normal operation, which indicates that we are definitely not unwinding anything.
On top of it, what we are seeing is that at least one of the sites has not been indexed during the last two months, while its exact mirror IS being indexed.
And we are seeing some other not so pleasant things.
Your index can not possibly jump from 3,000,000 to 3,000 within hours, back and forth. Is it oscillating? Between what and what? Why does it oscillate, especially considering the overwhelming tendency to produce the lower number in well over 90% of all cases?
One more time:
Does the Google search engine work, or does it not?
Does Google consider this kind of behaviour normal, and if so, would they kindly provide an explanation on a competent level (if they even care that anybody exists in the world to begin with)?
Is a search on a subdirectory broader in scope than a search on site:...?
What do I tell my users about the proper way of doing a search? I have over a hundred subdirectories, all parallel to each other. How do I do a search on the site and get the true number of articles that do exist in the Google index, as the subdirectory searches prove in no uncertain terms?
A whole lot of things to explain, and not to me only. By FAR.
And those people do not understand all this techie talk. That much is clear.
Did you see that opening down gap during the last trading session? The heaviest trade volume was during what time of session? What does THAT tell you, if anything? Because it does tell me some things indeed.
Are you God? Do you think what YOU think is something that defines anything? Something that proves anything beyond a shadow of a doubt? Matters to the very essence of argument?
What was your very first statement on one of the first thread related to these issues?
And what was your second statement?
So, you are proven to be biased, cowardly and destructive on top of it, as your 3rd statement regarding removal of the thread showed, with this "and my fingers are itching again", meaning to delete even THIS thread.
Ok, here is your chance, if THAT is the only kind of "solution" you are capable of producing with all your "holier than thou" assertions.
"Every now and then in the webmaster blogosphere and forums, this issue comes up: when a webmaster performs a [site:example.com] query on their website, the number of indexed results differs from what is displayed in their Sitemaps report in Webmaster Tools. Such a discrepancy may smell like a bug, but it's actually by design."
It goes on to mention "rough estimate" and it's sometimes very rough. It's not intended as a reconcilable interface and never was. Nor is it ever going to be.
And ALL of you, so-called volunteers, spending all day long here, "doing it for free" as you claim, have an inherent conflict of interest: ALL of you are "SEOs" of one kind or another, and the LAST thing you are interested in seeing is that the webmasters are right and Google turns out to be "wrong". Because otherwise, how can you milk these forums for customers if they are not "wrong"? Most of your proposals are quite meaningless in the scheme of things, and I have seen plenty of your proposals that do NOT correspond to reality, and that is what I am seeing and experiencing with my own eyes.
One of you told me in his very first statement: "hey, without 'professional' services, you are out of luck". I did not care to look at his favicon then, but I can just guarantee you, he was doing this wink, wink number on me.
So, you spam these forums and post all sorts of things just to promote your own services, for your own benefit, and not for some totally useless "volunteer work" ideology, going as far as engaging in these torturing orgies just for fun, and then, if you don't like ANYTHING for ANY reason, you can just delete someone's articles and entire threads.
Isn't THAT the case?
Or are you living manifestations of Jesus Christ, whose sole purpose here is to help those poor webmasters, without getting a penny in return?
Sure, if someone budges and refuses to admit to you that HE is the one who screwed up, not you, then he is not a customer, and hey, what is he worth then if you can not milk him, and what are the Google webmaster forums good for if you can not milk them, while doing all this "work for free" number?
Who are you kidding with all this "volunteer" stuff?
You have to have at least a grain of honesty.
No wonder the very first thing that came to your mind was "or maybe these articles were stolen from the Library of Congress"?
And what did I ask you?
Well, sir, why was stealing the first thing that entered your mind? Are you a thief? Because any psychologist knows these simple things.
When I describe some specific issue, instead of even looking at it, what do you do instead? Well, you try to project some guilt and fear onto them, so they get into a defensive position. Because once you are in a defensive position, you start dissipating lots of energy, and become vulnerable to all sorts of "practical suggestions" and "recommendations", one of the first of which is "well, you really do need some COMPETENT SEO to fix your 'problem'", wink, wink.
Isn't that how this story goes?
Tell me where I am wrong? Actually, it does not matter. I am done with you. So far, ALL you, personally, were able to produce is the ugliest and most destructive things, and it is simply mind-boggling that Google allows this kind of stuff on these forums, where many webmasters come, some crawling on their knees, because they see their whole operation falling apart before their eyes.
Top Contributor - Webmaster Help Bionic Poster - 7/6/09
1 person says this answers the question:
>>> Furthermore, I would like to mention this:
>>> Several threads related to specific issues related to this thread were removed.
>>> WHO did a removal?
A caring soul :-)
>>> Why?
Because you started to insult people in a very aggressive and really ugly manner; try to do that in real life, if you dare, but don't complain if you get slapped in the face right away then.
>>> For what exact purpose?
To spare others your disgusting rants.
>>> I would also like to mention the fact that it seems totally inappropriate
>>> to allow non Google authorised personnel to be able to remove threads,
>>> especially in the context of conflict of interest issues and the very fact
>>> that the issues, addressed on these threads lay WAY outside of their
>>> level of competence.
My level of competence in recognizing a troll is quite high.
>>> I want to see a solution that will allow my users to find
>>> the MAXIMAL number of articles that do exist in Google index
Ah, finally you've managed to come up with something (took you a hundred posts and more) that could explain your interest in this. Still, your users won't care a bit about these numbers, they don't need 'em ^^
-luzie-
It is none of your business what my users care or do not care about, or how I am going to make it easier for them to find what they are looking for, and all sorts of other things like that.
Secondly, you are not God, omniscient, knowing the outcome of this and that. All you have is your own limited view on things, and a biased one at that, and of necessity biased.
So, keep your ideas to yourself and recommend them to YOUR customers. I am not interested in discussing the customer issues or why would anyone do this thing or that.
I am interested in getting to the bottom of this, and I will, I ALWAYS do. Because I can not simply leave anything incomplete. That is just a nature of a technical mind. Hope you can appreciate that.
If you have some SPECIFIC ideas on SPECIFIC issues, and you ARE competent enough to understand what you are talking about, and that is search engine INTERNALS, then fine. Otherwise, sorry, not interested.
One more thing: this is a thread to discuss MY issues, not yours. Spewing these kinds of ugly garbage is the same thing as hijacking a thread, contributing nothing to equation and simply sucking energy.
Not good for your Karma Marma Dharma Bharati Bati and Pati. Plus you may need to pay your shrink more money in the end. Who knows what might come of it?
Top Contributor - Webmaster Help Bionic Poster - 7/6/09
1 person says this answers the question:
> Ah, finally you've managed to come up with something (took you a hundred posts and more), that could explain your interest in this.
Yup. Probably told some advertiser somewhere that he's got gazillions of pages indexed and the truth is that it's never been more than a couple of thousand.
And even the ones that are indexed will probably never get displayed in the SERPs - they have to pass duplicate content filtering first.
Well, it looks to me like the search engine is FULL of bugs, unless something else is going on that is even worse.
The SERPs behaviour is totally inconsistent in the most observable ways.
The estimated number of articles indexed is totally wrong, to the point where it jumps WILDLY from one page to another in a totally unreasonable fashion.
These numbers can not jump like this in a properly functioning system, no matter how close those estimates are, especially within a short period of time, when the DC updates can not possibly produce such wild variances so quickly.
The original figures, stated in the very first article on this thread, are still pretty much the same, within reasonable variations as the DC updates go on.
Nevertheless, these two utterly stable values of 3,000 and 3,000,000 articles are totally unexplainable, especially the tendency of the lower number to come up in well over 90% of all cases.
The results of doing a site:my.domain search can not possibly be smaller than those of a "search_string" site:my.domain search, under ANY conditions, just as explained in one of the previous posts.
You can not possibly have the results that have already been published above.
No wonder people come here all the time asking "where did my pages go", and they are given ALL sorts of totally off the wall "answers" that do not bear any resemblance to any kind of reality of a properly functioning system.
Sorry, but the conclusions are simply inevitable.
Furthermore, there can not possibly be 296 articles indexed by Google for cppgoldmine.by.ru site.
Search: site:cppgoldmine.by.ru
Results 1 - 10 of about 296 from cppgoldmine.by.ru. (0.08 seconds)
Because, in a single chapter, or in a single keyword search, the same search engine produces tens of times more results:
Search: threads site:cppgoldmine.by.ru
Results 1 - 10 of about 1,730 from cppgoldmine.by.ru for threads. (0.17 seconds)
which is 6 times as many as the whole site has, according to the previous result.
And this one simply blows all these theories by the "experts" around here to pieces.
Search: thread site:cppgoldmine.by.ru
Results 1 - 10 of about 3,270 from cppgoldmine.by.ru for thread. (0.39 seconds)
Things like these can not possibly happen in a properly functioning system, under ANY conditions whatsoever.
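To make the set logic explicit, here is a toy model (my own illustration, not Google's internals): a keyword search over a site is a filter applied to the site's page set, so its result count is bounded above by the bare site: count. The page paths and texts below are made up.

```python
# Toy site index: page path -> page text (made-up data for illustration).
site_index = {
    "/Convert/Articles/Abstract_Class_Code/index.html": "links abstract class",
    "/threads/t1.html": "thread about dynamic_cast",
    "/threads/t2.html": "another thread",
    "/misc/notes.html": "no keyword here",
}

def search(index, keyword=None):
    """Return pages matching the keyword; no keyword means the whole site."""
    if keyword is None:
        return set(index)
    return {page for page, text in index.items() if keyword in text}

all_pages = search(site_index)               # like site:my.domain
thread_pages = search(site_index, "thread")  # like: thread site:my.domain

assert thread_pages <= all_pages             # filtering can only shrink the set
print(len(all_pages), len(thread_pages))     # 4 2
```

Whatever the index internals are, a restricted query is a subset of the unrestricted one; a count of 3,270 for a keyword query against a 296-page site: count is therefore internally inconsistent.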
Meanwhile, the conclusions about every single "explanation" provided by these so-called experts, and the very nature of those "answers" on this very thread, are simply inevitable, just as described in:
Furthermore, the number of links to the Programmer's Goldmine collections as shown on the Webmaster pages is TOTALLY wrong, by hundreds, if not thousands, of times.
And, therefore, so is the "rank", which is currently low for every single site; and since just about the heaviest weighted component of the rank is the number of backlinks, just as the original Stanford research paper on the Google search engine architecture shows, the positioning of these collections is totally and completely wrong, and I am willing to stand in a court of any land to testify to that, if need be.
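The Stanford paper referred to here describes PageRank, where a page's rank is driven by the link graph. A minimal power-iteration sketch of the published formula, PR(p) = (1-d)/N + d * sum(PR(q)/out(q)) over pages q linking to p, using a made-up four-page graph (my own illustration, not anyone's actual ranking code):

```python
def pagerank(links, d=0.85, iters=50):
    """Power iteration over a link graph: page -> list of outlinked pages."""
    pages = sorted(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {}
        for p in pages:
            # Sum contributions from every page q that links to p.
            inbound = sum(pr[q] / len(links[q]) for q in pages if p in links[q])
            new[p] = (1 - d) / n + d * inbound
        pr = new
    return pr

# Made-up graph: three pages linking to "home" boost its rank.
graph = {
    "home": ["ch1"],
    "ch1": ["home"],
    "ch2": ["home"],
    "ch3": ["home"],
}
ranks = pagerank(graph)
assert ranks["home"] == max(ranks.values())  # most-linked page ranks highest
```

This toy graph has no dangling pages, so the ranks stay normalized; it only illustrates the claim that backlink counts dominate the rank in the original formulation.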
Top Contributor - Webmaster Help Bionic Poster - 7/8/09
2 people say this answers the question:
:rolls up sleeves: :cracks knuckles: :grins:
I'm so gonna love this - soooo much, that I'm going to take my time replying. I'm going to savour every single moment as I pound you as you deserve. Let's see you immortalise this, you * muppet.
.
1) This is Not a representation of Google. This is a fair representation of the Volunteers that help/support hundreds/thousands of Genuine people with Real problems - responding to a jackass without a clue. Further - what the OP has Not told you about is the various Other Topics they created - in which they were rude, abusive and nasty to other members of this group.
2) I very much doubt that anyone from Google can touch this topic - as it's simply a Flame. The closest a Google Employee has come is to select a previous answer (around 1/2 way into this topic now) as Best Answer.
3) Numerous people have still gone to the effort of attempting to help the OP understand the various aspects/issues involved - but the OP (as you can see) has blatantly ignored it - ALL of it! The OP is NOT interested in realising they are incorrect (WRONG) in their summation of how things work - nor in the fact that they are misunderstanding and making assumptions. Worse than that though - the OP refuses to accept that their method of Searching for the comparative data is very wrong. (Those who have managed to read through the topic will of course have realised that the OP is a complete * * without a clue, despite all their protestations)
That's right people - not only is the OP a completely offensive individual who, though they strongly claim otherwise, doesn't have a clue about how the data is handled, who refuses to accept that the way they search is wrong, who cannot acknowledge that they have been given help/assistance and is just too stupid to accept the simple fact that they are wrong ... ... they also have Duplicate content - which we ALL know causes Filtering and thus results in the SERPs figures being even more out of whack than they usually are.
Then afterwards - I expect to see an apology on there to a) Google for misrepresenting them (Bad things to be doing! They have way more money than you do sunshine!) b) all the people you verbally abused and offended in the other topics you decided not to tell people about c) Your readers - for making them trawl through the ravings of a stupid troll without a clue and a bad attitude.
Do NOT mess with the Wookie :D
Top Contributor - Webmaster Help Bionic Poster - 7/8/09
And yes dear readers - I too have obtained a full copy of the entire topic. This has been done as we can all tell that the OP is likely to ditch and burn.
Top Contributor - Webmaster Help Bionic Poster - 7/8/09
I also strongly suggest that you go and Block that copy of yours. As it currently stands - you have an exact Duplicate (is that a habit?) and technically - that could also be seen as a breach of copyright.
So Noindex/robots.txt to prevent the bots from crawling/indexing it, and put in some content explaining that you have made an unauthorised copy and for what purpose - placed at the top and bottom.
(Anyone know if Google does the DMCAs? If so - where can I read them - as I'd love to watch this one get slapped with a DMCA from Google - that would just be perfect .... or should we simply file Spam Reports for stealing content? :D)
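The "Noindex/robots.txt" suggestion can be sanity-checked with Python's standard urllib.robotparser; the robots.txt contents and URLs below are made up for illustration, not taken from the actual sites in the thread:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a mirror site: block the duplicate copy
# while leaving the primary content crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mirror/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The mirrored copy is blocked; the primary content is still crawlable.
print(parser.can_fetch("Googlebot", "http://example.com/mirror/index.html"))  # False
print(parser.can_fetch("Googlebot", "http://example.com/Convert/index.html"))  # True
```

Checking the rules this way before deploying them avoids accidentally blocking the pages that are supposed to stay indexed.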
Top Contributor - Webmaster Help Bionic Poster - 7/8/09
1 person says this answers the question:
pgelqued,
I've gone through the labour of reading the great article you've written about your deleted thread "Google index drops like a rock" and am quite sad now that you didn't mention me there :-( Would you please include some insults against luzie too, I need that for my reputation.
Furthermore, I found an error in your article I'd have to point out. You write a TC can "ban anyone he likes" ... unfortunately this is not true (yet) - as your presence here proves - otherwise I would have banned you long ago.
And now back to the topic. You've finally given away the true motive for everything you've said so far, look at this:
... the positioning of these collections are totally and completely wrong,
Hooray! Your sites rank badly? That's all the whole fuss has been about? You need a hundred posts in half a dozen threads to tell me your sites don't rank as you expect them to? I've seen thousands of posters do that in a single sentence.
... and I am willing to stand in court of any land to testify that, need be.
No-no, there's no need for that, really.
-luzie-
Do you want to get famous also, and be known to the cream of the crop of programmers worldwide? Fine, produce something as deserving as Autocrat does.
As far as "duplicate content" on a help page goes: yes, the exact same page is used for all the sites and mirrors, and it is quite fine that Google indexes only one copy of it. Not a problem at all.
As far as ANY kind of copyright issues anyone might have, fine, do as you please. Not a problem at all. To the best of my knowledge, there are no copyright issues related to these collections, and if there are any, ask Google and see what they tell you.
How unfortunate that you can not ban anyone <b>YET</b>, and if what you are saying is true, fine, that correction will be reflected in the latest version of
and all the copies of it that exist on all the sites.
Btw, if you find some other issues you think are incorrect, fine, we can look at that.
One more thing: can you guys tell me what the nature of the relationship between the "volunteers" and Google is, as far as any kinds of contracts or agreements go, written or otherwise?
Sorry, your performance is not impressive enough to warrant any kind of response at the moment. Everything has been covered either in this very thread, or other threads that address the issues discussed in this thread.
I think your picture on your favicon is much more impressive. Btw, was that YOU in your better days?
As to me being as "wrong" as you describe, I think we have already addressed every single argument you had, point by point. One thing to keep in mind is that when I say "Not applicable to this case", well, that is exactly what it means, and it is not rocket science to see why exactly it is not applicable to this particular situation, no matter how big a pile of confusion you are trying to make out of it by adding more and more variables into the equation, not even realizing they do not correlate to the sample data provided. You see, if the DC updates or the entire operation of various search engine subsystems were as confusing, complex, contradictory, etc., as you describe here, the whole search engine would cease to operate, even as badly as it does at this very junction.
What I do have to admit is that Google did achieve some spectacular results in terms of response time to search queries, considering the size of the various databases. Beyond that, sorry, it does not impress me personally. By far.
As far as who insults and harasses who, it is all on record. So as you please.
And no, none of your "answers" actually answer anything as far as the exact and specific issues discussed on this thread go. Not a single statement of yours proves anything.
Furthermore, you guys seem to have this tendency of always drifting off topic and sliding into the domain of insults and ugly concoctions. Understood, by now it is a habit for you. As far as I know, all the people that have the authority to abuse things, such as deleting threads, do abuse it.
"Power corrupts, and absolute power corrupts absolutely". Not sure if you even know who said that or even heard this in your lives.
As far as I can see, there is no harm in any thread to exist, unless it clearly violates some law and places LEGAL responsibilities on Google to remove it.
Otherwise, it is just a different view of the issues and you never know who and for what reason would like to be able to see such information. Because as it stands NOW, what you are doing by deleting someone else's articles, and even threads, is what is called a destruction of evidence.
Are you trying to cover up the tracks, so no one could see what actually took place on those threads you have destroyed, by ANY chance? Because I WOULD like to see all those threads you have destroyed and make my own opinion about it, and I have a feeling that reading those threads would provide me much more specific technical data to make a better conclusion as to what is exactly going on in my own situation.
Sorry, I do not mean to be insulting, but what I do find is that not a single response I have seen from any of you, "experts" so far, does actually correspond to exact issues we discuss here.
If ANY of you have specific issues you think I did not either prove in sufficient detail, or interpreted incorrectly, please do point them out and I promise to give you as detailed of a reasoning as you might see ANY place on this planet, short of talking to top architects at Google.
Otherwise, there were so many various statements made by you, "top contributors", and so few of them actually had to do with the exact issues and hard data presented in the original thread, that it would be totally unproductive for me to waste hours discussing points that are so obviously wrong, in my opinion, that there is nothing to even talk about. To me, it is like saying 2 plus 2 is not equal to 4.
Interestingly enough, one IBM researcher stated:
"One thing is certain: 2 plus 2 NEVER equals 4!"
Wow! How do you like THAT one for breakfast? And you know what? Well, strangely enough, that was more or less the foundation of the Fractal Theory, and now the leading 3D-related software is using those principles in the most advanced rendering and animation packages, and all the effects you are seeing in the movies are, in one way or another, the result of that original work.
So, be my guest and present the specific examples of your answers and I promise to give you a detailed reasoning, unless they are so off the wall, that I would not even consider any of it in my wildest dreams.
Now - take your time, (and if you have problems following this, just let me know). We are going to break those into 2 parts. We will break them down into Domain and Path. Thus we will have
So we have 2 Domains, and 2 Paths. Still keeping up with me here? Okay ... I'll wait whilst you stumble along. [wait for muppet to catch up with the whole 2 steps I've just taken] [still waiting] [ah ... finally]
So, we have 2 Different Domains .... but wait - what's that .... the Paths look the same! Gosh!
Surely not though. Must just be a coincidence. Sure, you just happen to have used similar Paths - and have different content, right? I know - to be fair - we will do some "sampling". That means we will pick out certain bits from one, and compare them against the related bits in the other. Okay - I see that confused you. In baby speak - I will look at 1 site, and look at the first link ... then I will look at the 2nd site, and look at the first link there - then we can see if they are different. Make a little more sense now? Good. Then we can repeat it ... let's say we do 1, 5, 10, 15 and 20, yes? Won't this be fun. Now hold my hand like a good boy, and let's go look.
http://cppgoldmine.uuuq.com/Convert/Articles/Abstract_Class_Code/index.html 1 - 1st Link) Links in C++ Abstract Class Code articles 2 - 5th Link) '*' cannot appear in a constant-expression problem 3 - 10th Link) [Q] Strange dynamic_cast problem 4 - 15th Link) A thinking about implement with interface and analysis problem by set theory. 5 - 20th Link) abstract base class containing class
http://cppgoldmine.by.ru/Convert/Articles/Abstract_Class_Code/index.html 1 - 1st Link) Links in C++ Abstract Class Code articles 2 - 5th Link) '*' cannot appear in a constant-expression problem 3 - 10th Link) [Q] Strange dynamic_cast problem 4 - 15th Link) A thinking about implement with interface and analysis problem by set theory. 5 - 20th Link) abstract base class containing class
Oh no - they are the same!
But wait - okay - maybe you are just very Very anally retentive! Maybe you just utilise the same structure and labelling system. (That means you call All your teddy bears Rollo)
Oh dear - it appears you have been a complete and utter * muppet doesn't it. Yes you have.
Silly * ...pgelqued... has got the exact same content on 2 different sites. What does that mean? Well - how do I explain this to a child? Hmmmm..... Let's say you have 2 copies of the same book. That's okay - we know that if we pick up one of those books, it will be the same as the other book, yes? Now, imagine how confused you would get (doing anything!) if one of those books had a different cover on it! Yes indeed - you would get confused, wouldn't you! Well, you have done something very similar here. You have 2 versions of the same thing - but called them different names.
But that is not all you have problems with is it ...pgelqued... ? No - copies of the same thing between your own sites just wasn't enough was it? No.
That's right, ...pgelqued... the * * muppet of a * is honestly expecting to rank for content that is not only shown on 2 different sites that he owns - it also appears on numerous other sites!
So, on top of having a seriously questionable "formula", as well as not including several factors, and using the wrong search criteria, you are ALSO duplicating content across mutually owned sites and replicating content found on other sites.
Go on - keep arguing. Everyone can see you have not got a clue, and are just arguing as a rearguard in an attempt to stop yourself looking like a complete and utter * imbecile.
Hate to tell you this sunshine - you're a little too late.
Top Contributor (Webmaster Help Bionic Poster), 7/8/09 - 1 person says this answers the question:
As to your question regarding "... what is the nature of the relationship of "volunteers" and Google as far as any kinds of contracts or agreements go, written or otherwise? ..."
There isn't really any. We show up. We try to help out. Over time, you get "points". Over time, with exemplary behaviour and a record of knowing what you are on about, you "may" get asked if you would like to be a Top Contributor.
That's it.
Just relax. This isn't exactly the end of the world. This issue HAS been discussed already, and I did mention more than once that these are the EXACT same sites. Every single site has at least one mirror, and that is exactly how I want it, except it wouldn't be a bad idea to add more mirrors.
This is used for reliability purposes. EXCLUSIVELY. There is no benefit whatsoever from having a mirror, because when a user finds some article, it does not matter which mirror he is accessing. The total page impression count is not going to change.
If anything, Google has to make sure it can handle these kinds of situations without penalising mirrors all over the world.
Secondly, this is the way it has been for almost a year, and there has never been any kind of problem as far as sites being properly indexed by Google. This is a recent phenomenon.
Google can EASILY determine that these are in fact mirrors and there is no trick of any kind used to bump up anything.
Furthermore, ALL these sites were created via the same exact account on the webmaster pages. Nothing was done to even attempt to "cover up the tracks". The simplest domain name analysis shows that these are the exact same sites. The index pages are exactly the same, and so are all the chapter index pages for every single chapter, and so is every single article on each of these sites (assuming we are talking about the same collection).
And if you think THIS could be considered some kind of "evil", then what is going to happen is that some of the biggest repositories of information in the world are going to just disappear from the Google index, and one cannot even BEGIN to comprehend the consequences.
In summary, there is absolutely nothing "wrong" with having mirror sites. Google has been given so many clues as to what they are that it is the simplest thing in the world to make the proper adjustments in the Google index.
Furthermore, Google ALREADY has provisions to handle these kinds of situations via the indented "Similar pages" links.
What happens if someone does a search, clicks on a link, and that site is offline for all sorts of reasons? Well, then they have the option of clicking on the mirror link if they think the article has the potential to answer their question.
But ALL of this has absolutely nothing to do with the issues on the table. Again: the issue is how come Google SERPs show WILDLY different results, as much as 1000 times apart, for the same exact search under the same exact conditions, from the same exact machine? Just by clicking on the Search button, you may get either 3,000 articles or 3,000,000. How is THAT kind of thing possible?
And how come those results are stable over a period of weeks, within the reasonable and expected normal variation, as various DCs get updated continuously?
As to the "questionable formula", you need to keep things in perspective. We are talking about the SAME exact conditions and the same exact search. And I insist: the results should NEVER, under ANY conditions, vary so wildly in a properly functioning system, REGARDLESS of what kind of DC update is going on and what the views from various DCs are of the same exact site. Yes, there ARE some normal variations, because all DCs may have different data at the exact same time and are not sufficiently synchronized. But that has been taken into account in my analysis.
Simple as that.
Finally, this "sunshine" language is clearly offensive, just like every single response of yours. It is getting kinda boring. Can you come up with something more creative?
One more time: to this very moment, you were unable to present a SINGLE argument, that has ANYTHING to do with the exact problem we are talking about.
We are not discussing the copyright issues on this thread. If ANYONE has ANY copyright related issues with these sites, they know what to do, better than you.
Do YOU have a problem with copyright issues?
Why don't you listen carefully to what is being said?
One more time: TALK TO GOOGLE ABOUT THESE COPYRIGHT ISSUES AND SEE WHAT THEY TELL YOU. And that is if you can even manage to talk to Google to begin with.
Thank you for answering my question regarding relationship to Google.
That tells me that you are not authorised to make ANY kind of representations on behalf of Google, its search engine mechanics, or search engine internals.
ALL your opinions are just that: opinions, based on rumors.
Otherwise, I could just look up the exact description on some Google page that describes the mechanics in no uncertain and very specific terms.
Why did you have to waste all this time doing the most useless thing possible, trying to "prove" that these are mirrors? This has already been discussed, and not just once.
Secondly, it has absolutely nothing to do with the exact issues on the table, and with claims supported by the hardest evidence there is, the screenshots - at least given my lack of access to Google's internal information and databases.
I do not deny that you have plenty of useful experience and know all sorts of tricks that are useful from SEO standpoint, and I am not questioning any of it, or trying to discredit the value of what you do, regardless of what your underlying motives are.
I am not saying you are just an idiot that does not have a clue. At the same time, I do not see you even looking at the data from the appropriate angle, nor do you truly understand what you are saying as far as search engine mechanics go.
Without bragging too much as to how "great I am", I can tell you this much:
I do understand what I am saying, and I do understand distributed systems, synchronization issues, database theory, transaction processing, the nature of information and search engine internals, the principles of file system design, and PLENTY of other things that ARE applicable to the exact issues we are discussing here, down to the nitty gritty of it.
Just to put a fine touch to all this "you are just dumb" stuff you are constantly blabbering, I can tell you this: I consulted for some of the biggest names in the industry, such as HP, Intel, SGI (Silicon Graphics), Amdahl, Fujitsu (HAL Computers), Lynx and a few others, on a contract basis, doing kernel level development work and being paid money you can only hope to ever make, and all the projects I did for them were done on time and with the best results possible - in MOST cases, results that exceeded their normal expected results several times over.
Hope that rings at least some bells in your cockpit.
Top Contributor (Webmaster Help Bionic Poster), 7/8/09 - 1 person says this answers the question:
Just keep on going. At the end of the day, this topic ranks for your sites. Everyone can see what you are like, and will realise that you are clueless.
If your intent was to harangue Google and make a name for yourself, congratulations - you partially succeeded. Rather than causing Google a nightmare, you got us. Yet you did make a name for yourself - and it's quite long and unpleasant.
You are a laughing stock. Your complete inability to understand all the above points that can/do influence the results seen, and your various incorrect assumptions, show you as nothing more than an idiot.
so please - keep it up. It's your reputation, your sites etc.
None of us care.
I've been the only one to make a serious effort to assist you - and I'm tired of it. So now you can sit and suffer. Watch your results fluctuate, vary, and dwindle (as they will!) over time.
I know I'm going to enjoy watching it :D
Top Contributor (Webmaster Help Bionic Poster), 7/8/09 - 1 person says this answers the question:
Sorry buddy, but that's NOT how you are supposed to implement a distributed system.
You don't allow multiple sites to be indexed for the same thing.
One site only will be indexed and you direct traffic by a proper load balancing method to whichever server happens to be available, less busy, etc, in a totally transparent manner to the visitor (be it human or robot).
Of course this is not a cheap method. It's not at the disposal of every Tom, Dick and Harry to use. I cannot afford it. If my server is down or too busy, I can only wait until it's sorted. I don't have the resources to go fancy. But I would never set up mirror sites and let them all get indexed, this is idiotic in the extreme. You're cutting your own branch by introducing so much superfluous material that all of it will be trashed to varying degrees. Think quality vs quantity. Think long and hard about that.
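The load-balancing approach described above - one indexed hostname, with traffic silently distributed to whichever server is up - can be sketched as a minimal round-robin selector. This is an illustration of the idea only; a real deployment would use a DNS or reverse-proxy layer, and the backend addresses here are made up.

```python
from itertools import cycle

# Hypothetical backend pool: one public hostname, interchangeable servers.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

class RoundRobinBalancer:
    """Send each request to the next backend in turn, so the visitor
    (human or robot) only ever sees the single public domain."""

    def __init__(self, backends):
        self._pool = cycle(backends)

    def pick(self):
        return next(self._pool)

lb = RoundRobinBalancer(BACKENDS)
first_four = [lb.pick() for _ in range(4)]
print(first_four)  # cycles back to the first backend on the 4th request
```

The key property for indexing purposes is that all backends sit behind one hostname: the crawler sees one site, so there is nothing to mark as duplicate, while availability is still covered by multiple servers.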
And this doesn't even begin to address the other problem: your published content, even if it existed on only ONE of your sites, ALSO exists verbatim on many other sites on the web which are not controlled by you (if they were, that would just add to the same issue as your mirror sites). You are aggregating content from many sources. This therefore means it's not original (not original to your site), not unique, and ultimately doesn't deserve to be indexed in the minutest detail, and even less to rank at all.
But there is a pretty interesting twist on all this.
The issue is public domain information. Who, and by what criteria, is to be included in the index if ALL of them have the same articles? Well, according to Google, the biggest and the baddest. But do they "own" ANY of it?
Secondly, consider this: most, if not all, of the sites I am aware of that carry the same exact information simply have it in bulk. Searching THAT database will produce pretty poor performance in terms of the appropriateness of an article to the search query, and the very quality of the articles that show up in a search, if some particular site has priority over all others.
For example, if you are interested in an "abstract class" issue and would like to see some specific code examples describing some pretty subtle points, when you do a search you may get ALL sorts of articles on abstract classes, and about 99% of them will be totally off base as to the original query.
But... If you do not simply include the entire historical archive on the site, but filter the information with sophisticated filters, what YOU are going to have as a result is ORDERS of magnitude more precise and to-the-point information than anyone else, including Google, Yahoo and Microsoft.
But... You may never even get indexed if Google considers it "duplicate content", despite the fact that you have the best collection of the most on-topic articles, with the most extensive collection of code examples on ANY issue conceivable that exists on the face of planet Earth at this particular juncture.
And I can tell you more: Google itself cannot possibly produce anything equally precise, no matter what they do, because their filtering technology is totally outdated in its most fundamental principles. They use methods and techniques that have been known since the 1960s. Yes, they did improve some things, but only incrementally. They cannot possibly find the information we can find, and for them to be able to do that, they would have to spend several years totally changing the entire architecture, because very little of their current technology is capable of utilizing these principles - which, for obvious reasons, I am not going to discuss in detail.
So...
What IS the benefit of the current view of information? WHO is to be shown in the search results, and according to what criteria and principles?
Yes, this issue does have relevance and there is no easy solution, even from a logical standpoint, and yes, there is no point in including every single copy of the same article. Otherwise, the users will simply be overwhelmed.
BUT...
This is not the end of the equation. It is just the beginning of it. Just take, for example, the simplest issue of article formatting. If you look at the biggest players in the world, their article formatting is not necessarily the best you can find. With their current formatting, it is much more difficult to see who says what and who responds to whom, etc. Plus, as far as pure formatting goes, they just copy the original article, with all those weird special characters, to the output page. If you look at the exact same article in my collections, the difference is night and day.
And again, if you do a search on code examples for abstract classes in the Microsoft or Google database, you are going to get hundreds of thousands of articles, the majority of which do not even have anything to do with abstract classes as such, because their very technology is not capable of fine and sophisticated filtering.
The very traffic on the goldmine sites is a living indication of the validity of such an approach. Penalising these new technologies with the methodology used for dealing with crooks can hardly produce benefits.
Btw, I saw your article with recommendations on things to do to make sure you are treated fairly by the Google search engine, and I find it one of the best I have seen, including all the information Google provides on various pages, which has very little practical value and is too vague to even pay attention to.
And the way your information was presented is exactly point by point. A very good article, and in the context of the utterly useless Google documentation, it does shine indeed.
Wow, so Microsoft is doing searches on MFC using mfcgoldmine.uuuq.com and doing chapter searches directly on some pretty fancy things? Not only that, but reviewing several chapters? And Intel and AMD too?
Hey, now we are talking!
Btw, dear Microsoft, if you ever see this post: I see what you are looking for. We'll make a new chapter for this kind of issue. I had not thought of that one. But you can still find most things via the CDialog chapter, just as you are doing.
You know how to find me, right? Just go to the help page, it has all you need.
While Google is scratching its head, I have made you a couple of chapters on exactly what you were looking for, with code examples and expert articles.
I bet you this is the best stuff on the net on these specific issues. All nicely sorted, organized, formatted better than you and Google can do, and squeaky clean. Check it out.
Command bar code chapter: http://mfcgoldmine.uuuq.com/Convert/Articles/CommandBar_Code/index.html
Command bar experts chapter: http://mfcgoldmine.uuuq.com/Convert/Articles/CommandBar_Experts/index.html
Plus I am going to make 4 more chapters on toolbar and status bar.
We can beat that big G hands down. Right now, we are only on the 4th order; we can go 5 orders higher right out of the box, which will allow you to find a needle, not just in a haystack, but on the Moon. :--}
Sorry for the inconvenience, but the chapter names have been changed to comply with the MFC class naming convention, so those chapters no longer exist. The following are the correct and latest versions:
Command bar code chapter: http://mfcgoldmine.uuuq.com/Convert/Articles/CCommandBar_Code/index.html
Command bar experts chapter: http://mfcgoldmine.uuuq.com/Convert/Articles/CCommandBar_Experts/index.html
Status bar and menu chapters are coming up shortly.
So... Now you can do any kind of control, status or other bars your soul desires in minutes. No need to waste days trying to dig up code examples or trying to find a competent opinion on this stuff any longer. Plus you get several alternative approaches and all possible views on it from the top guns.
Where are those lazy developers? Wake 'em up! Things are happening in a hurry now.
This place is no fun, I tell ya. You are WAY too slow, dear developers. And if I say you've got bugs, you bet you do. Up to you-know-where, if they are just simple innocent bugs... Oh, you think they are not that "critical"? How do you know? Why don't you please explain how this kind of magic is possible in ANY computation theory there is?
And I bet you 1 to 10 you can't. And it is going to take you at least 3 years to get to where we already are, considering the speed with which you solve problems and the type of bugs you have in your code, and I mean ALL OVER.
Just look at your ugly formatting of Usenet articles. Do you even begin to realise that not a single experienced guy is using Google Groups, because it is so screwed up to the point of being unusable? Did you even bother to ask the top guns in software, or any other professional field, what they think about the way you format the articles and about your user interface? Or could you not care less, just like in this particular case? You think YOU have the cream-of-the-crop developers, right? WRONG!!!
This is the latest and greatest version of Google, NWO style, and the reason we are seeing what we are seeing is that there are bugs in it, and not just one. We were not supposed to see those real numbers. ALL we were supposed to see was the gradual degradation of the Google index, which would immediately be "explained" by these "Top Contributors", and the whole nine yards of "explanations" would be provided to keep you chasing your own tail for months, if not years.
That index was supposed to gradually degrade and settle at a MUCH lower number in about two months, just as we see here, and it would all look like there is a new kid on the block who is so cool that he could overshadow even the sites that are consistently on the 1st page for MOST of the keywords. Isn't THAT how this story goes?
Just read the other threads telling you pretty much the same thing, only in slightly different permutations. "Look, I have a site that was consistently in the #1 position for ten years, and now BOOM!", and on, and on, and on. Not nice, Dr. Google.
But...
Sorry, but it did not quite work out as expected.
So...
I bet you Google will not have any explanation for this completely off-the-wall behaviour, which is evidenced in the hardest possible way under the circumstances: the screenshots. Furthermore, it is even reproducible, and THAT is a total disaster for those masters of disasters.
They will just tell you something like "oh, that is nothing, it does not affect anything", or something of this sort. Oh yeah? How much are you going to bet on it?