Author |
Message |
mrix
Client

Joined: Dec 04, 2004
Posts: 757
|
Posted:
Sun Oct 02, 2005 3:02 am |
|
Hello all, Google stopped caching my pages about a month ago. At first I thought it might have been the latest Sentinel I installed, but now I don't think so. My site is PHP-Nuke 7.6 with the latest patch (3.1) and I also run the latest Sentinel. I have only one thing to go on, really: about 2 months ago I added my site to Google Sitemaps and everything was OK, but some error occurred and now my sitemap.xml doesn't get seen by Google. I am sure this is a big part of my problem. When I submit my sitemap I get this error:
The server returned an error when we tried to access the URL provided. Please verify that the Sitemap URL is correct and resubmit your Sitemap.
But as you can see, my sitemap is here and appears reachable:
http://www.sea-fishing.org/sitemap.xml
Thanks for any help, as I'm running out of ideas now.
Cheers
mrix |
|
|
|
 |
hitwalker
Sells PC To Pay For Divorce

Joined:
Posts: 5661
|
Posted:
Sun Oct 02, 2005 7:05 am |
|
Well, it's a sitemap of 800 KB,
and the browser just freezes.
As far as I can tell, you have included a lot of junk in your sitemap that shouldn't be in there...
That's how you get an 800 KB sitemap. |
|
|
|
 |
mrix

|
Posted:
Sun Oct 02, 2005 8:12 am |
|
Hi, thanks for your reply, but from what I have read sitemaps can be up to 10 MB? What I'll do anyway is create a very small sitemap and see if that helps.
Thanks
mrix |
|
|
|
 |
hitwalker

|
Posted:
Sun Oct 02, 2005 8:21 am |
|
Well, don't believe everything you read...
The thing is, you have URLs in your sitemap that should not be in there,
like the topsites ins and outs,
stats,
and a whole bunch of ftopict links from your forums.
That's not the idea of a sitemap.
A sitemap is a reflection of a website's total index,
which is how I created my sitemap.
So in fact, the links that shouldn't be in there are the topsites and the forums. |
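To make the trimming step concrete, here is a minimal Python sketch that lists which `<loc>` entries would survive an exclusion list. The substrings in `EXCLUDED_SUBSTRINGS` and the example.org URLs are assumptions for illustration, not the real link patterns of either site.

```python
import xml.etree.ElementTree as ET

# Hypothetical substrings marking URLs to drop; adjust to the site's own links.
EXCLUDED_SUBSTRINGS = ("ftopict", "Topsites", "Statistics")


def local_name(tag):
    """Strip any '{namespace}' prefix so the schema version does not matter."""
    return tag.rsplit("}", 1)[-1]


def filter_sitemap(xml_text, excluded=EXCLUDED_SUBSTRINGS):
    """Return the <loc> URLs that contain none of the excluded substrings."""
    root = ET.fromstring(xml_text)
    kept = []
    for element in root.iter():
        if local_name(element.tag) != "loc":
            continue
        url = (element.text or "").strip()
        if url and not any(part in url for part in excluded):
            kept.append(url)
    return kept


# Made-up sample sitemap with one content page, one forum topic, one topsites link.
sample = """<urlset>
  <url><loc>http://www.example.org/sea-garfish.html</loc></url>
  <url><loc>http://www.example.org/ftopict-123.html</loc></url>
  <url><loc>http://www.example.org/modules.php?name=Topsites&amp;op=out</loc></url>
</urlset>"""

print(filter_sitemap(sample))
```

Only the content page is kept here. Whether forum topics belong in a sitemap is a judgment call that the thread itself disagrees on, so treat the exclusion list as a starting point.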
|
|
|
 |
mrix

|
Posted:
Sun Oct 02, 2005 11:42 am |
|
Hi again, I have chopped out all the forum and topsites URLs, so now the sitemap is very small, but it still comes up with the same error.
What could it be? It's driving me up the wall now, to be honest.
Thanks for your help
mrix |
|
|
|
 |
Steptoe
Involved


Joined: Oct 09, 2004
Posts: 293
|
Posted:
Sun Oct 02, 2005 12:11 pm |
|
Are you submitting sitemap.xml or sitemap.xml.gz? I think the latter is better.
There is no need to have
member, submit, size limit... You have taken out ALL your forums? You just need to take out the quote, reply, search and FAQ links, leaving topics and posts.
I believe a sitemap is meant to be updated every week or so and resubmitted with the updated URLs.
We have a site very similar to yours (birds, not fish). Our forums are a little larger now: sitemap.xml is 95 KB and sitemap.xml.gz is 4 KB.
One thing I do notice on our site: robots.txt is a "hey, don't go here" file,
while the sitemap is meant to be "JUST go to these".
Then check the Google sitemap errors and it says "can't access ***** because of robots.txt"... ***** is not in the sitemap and it is in robots.txt. |
_________________ My Spelling is NOT incorrect, it's Creative |
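The robots.txt versus sitemap overlap described in this post can be checked offline. Below is a rough sketch using Python's standard `urllib.robotparser`; the robots.txt rules and the example.org URLs are invented for illustration and are not taken from either site.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and sitemap URLs.
robots_txt = """User-agent: *
Disallow: /admin/
"""

urls = [
    "http://www.example.org/sitemap.xml",
    "http://www.example.org/admin/index.php",
    "http://www.example.org/ftopict-1.html",
]

print(blocked_urls(robots_txt, urls))
```

Any URL this prints is listed in the sitemap but disallowed by robots.txt, which is the kind of conflict that would show up as a "can't access because of robots.txt" entry.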
|
|
 |
hitwalker

|
Posted:
Sun Oct 02, 2005 12:33 pm |
|
|
|
 |
mrix

|
Posted:
Sun Oct 02, 2005 2:11 pm |
|
Hi hitwalker, I have copied your sitemap style, re-uploaded it and tried again, and Google is still reporting the same error. Unfortunately my mystery goes on...
Cheers
mrix |
|
|
|
 |
hitwalker

|
Posted:
Sun Oct 02, 2005 2:17 pm |
|
|
|
 |
mrix

|
Posted:
Sun Oct 02, 2005 2:28 pm |
|
Hi, this is the error:
The server returned an error when we tried to access the URL provided. Please verify that the Sitemap URL is correct and resubmit your Sitemap.
Cheers
mrix |
|
|
|
 |
hitwalker

|
Posted:
Sun Oct 02, 2005 2:31 pm |
|
Are you sure the URL is OK?
No weird typos?
No security preventing anything? |
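One way to test the "security preventing anything" question is to request the sitemap with different User-Agent headers and compare the HTTP status codes. This is only a sketch: the user-agent strings are assumptions, and it can only reveal blocking based on the User-Agent header. A block on Googlebot's IP addresses would not show up when the request comes from your own machine.

```python
import urllib.error
import urllib.request

# Assumed user-agent strings, for illustration only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent):
    """Build a GET request that presents the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this user agent.

    Connection failures (DNS errors, timeouts) are left to raise URLError.
    """
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 403, 404, 500 and similar arrive as exceptions; the code is what matters here.
        return err.code


# Example usage (needs network access, so it is left commented out):
# for ua in (BROWSER_UA, GOOGLEBOT_UA):
#     print(ua, fetch_status("http://www.sea-fishing.org/sitemap.xml", ua))
```

If the browser-style request returns 200 while the Googlebot-style one returns 403 or 500, something on the server is filtering on the user agent.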
|
|
|
 |
mrix

|
Posted:
Sun Oct 02, 2005 2:35 pm |
|
Yes, the URL is fine: http://www.sea-fishing.org/sitemap.xml. I am not sure whether something could be blocking it, though. I haven't got anything banned in Sentinel, not that I am aware of. Maybe that's it: something is totally blocking the sitemap, and Google along with it?
mrix |
|
|
|
 |
hitwalker

|
Posted:
Sun Oct 02, 2005 2:44 pm |
|
Well, I don't know much about the rules for creating the XML, but I do see the following difference.
You have it like this:
Code:
<loc>http://www.sea-fishing.org/sea-garfish.html</loc>
|
But mine is like this:
Code:
<loc>
http://www.hitwalker.nl/phpx/html/modules.php?name=CPanel_User_Guide&page=addingAuthorizedUser.htm
</loc>
|
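The two snippets above differ only in the line breaks around the URL. A separate point worth checking in any sitemap with `modules.php?...&...` style links is that a raw `&` inside `<loc>` makes the file not well-formed XML; it has to be written as `&amp;` (the forum may simply have displayed it unescaped). A small sketch of generating a safe `<loc>` element, with a made-up example.org URL:

```python
from xml.sax.saxutils import escape

# Quotes are escaped as well as &, < and >.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def loc_element(url):
    """Return a <loc> element with surrounding whitespace trimmed and entities escaped."""
    return "<loc>" + escape(url.strip(), EXTRA_ENTITIES) + "</loc>"


print(loc_element("\nhttp://www.example.org/modules.php?name=Forums&file=viewtopic\n"))
# → <loc>http://www.example.org/modules.php?name=Forums&amp;file=viewtopic</loc>
```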
|
|
|
|
 |
mrix

|
Posted:
Sun Oct 02, 2005 2:53 pm |
|
I have another website on a subdomain of that webspace, so I decided to upload the same sitemap to the subdomain site and submit it on the Google Sitemaps page. Guess what happened: yes, it accepted it. Now my problem is to find out what's actually blocking the sitemap on my fishing site. My robots.txt seems OK and Sentinel appears to be OK, but I can't think of anything else that would block it.
cheers
mrix |
|
|
|
 |
hitwalker

|
Posted:
Sun Oct 02, 2005 3:07 pm |
|
Well, the robots file doesn't hold anything back; it cannot do that.
The only thing left is Sentinel... somehow... |
|
|
|
 |
mrix

|
Posted:
Sun Oct 02, 2005 3:13 pm |
|
I have just got rid of all the Sentinel tables and re-installed them, but still no joy. What is blocking it? This is so frustrating now.
Cheers
mrix |
|
|
|
 |
hitwalker

|
Posted:
Sun Oct 02, 2005 3:26 pm |
|
Well, taking out the Sentinel tables wasn't really needed...
But what's holding it back, I don't have a clue. |
|
|
|
 |
Guardian2003
Site Admin

Joined: Aug 28, 2003
Posts: 6799
Location: Ha Noi, Viet Nam
|
Posted:
Sun Oct 02, 2005 7:34 pm |
|
I see from your sitemap you have
The '-' doesn't need to be there.
I can see no other valid reason why Google is not crawling that URL.
Remove those '-', re-upload your sitemap file and I'll add the URL to my own account at Google just to see what happens. |
|
|
|
 |
mrix

|
Posted:
Sun Oct 02, 2005 11:24 pm |
|
The actual code view of my sitemap doesn't have those dashes before the <url> tag; they only appear when you load it through a browser. This sitemap is identical to hitwalker's, and his works fine. If you upload this sitemap you will find it does work: as I explained in my other post, I uploaded it to my other site and it works fine there. Something is definitely blocking it on my fishing site, though, but what?
Cheers
mrix |
|
|
|
 |
Guardian2003

|
Posted:
Mon Oct 03, 2005 12:10 am |
|
I found it interesting that Google accepted 'the same sitemap' when you placed it on a sub-domain. Google should have given you an error, as it only accepts sitemaps with links to the domain the sitemap is linked to.
i.e. if your domain is www.sub.mydomain.com, it would reject the map if the root URL in it is www.mydomain.com.
The fact that Google crawled a sitemap on a sub-domain but not on another site within the same domain would seem to confirm that whichever Googlebot is doing the crawling has been blocked from accessing the site.
As a server-wide block would prevent the sitemap (wherever it is located on the server) from being crawled, we can logically deduce that the block has been done either by Sentinel or by some other code, such as an entry manually added to the IP Ban module.
Do you have any blocked IPs at all in the Sentinel blocked table? |
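The same-domain rule mentioned at the top of this post is easy to check before submitting. Here is a minimal sketch that lists sitemap entries whose host differs from the host serving the sitemap; the mydomain.com hosts are the placeholder names from the post, and the exact rule Google applies is assumed to be a plain host comparison.

```python
from urllib.parse import urlparse


def off_host_urls(sitemap_url, page_urls):
    """Return the page URLs whose host differs from the sitemap's own host."""
    sitemap_host = urlparse(sitemap_url).netloc.lower()
    return [url for url in page_urls if urlparse(url).netloc.lower() != sitemap_host]


# Placeholder hosts, as in the post above.
sitemap_url = "http://www.sub.mydomain.com/sitemap.xml"
page_urls = [
    "http://www.sub.mydomain.com/index.html",
    "http://www.mydomain.com/sea-garfish.html",
]

print(off_host_urls(sitemap_url, page_urls))
```

An empty result means every entry is on the same host as the sitemap, so a mismatch could be ruled out as the cause.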
|
|
|
 |
mrix

|
Posted:
Mon Oct 03, 2005 12:21 am |
|
Hi there, when I check Sentinel for any IP bans there is nothing there. Yesterday I even deleted the Sentinel tables and re-installed it, just to make sure there was not some kind of glitch somewhere. As for the IP Ban module, I'll check that, but I haven't actually added any bans myself.
Cheers
mrix |
|
|
|
 |
Guardian2003

|
Posted:
Mon Oct 03, 2005 4:14 am |
|
The only other thing I can think of would be the possibility of the Googlebot IP being listed in your .htaccess file as being denied access. |
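A quick way to look for such rules is to scan the .htaccess text for access-control directives. The sketch below only flags a few common Apache patterns (`Deny from`, `RewriteCond %{HTTP_USER_AGENT}`, `SetEnvIf User-Agent`); the sample file contents and the 192.0.2.0/24 range are made up, and a clean result does not prove nothing is blocked.

```python
import re

# Directives that commonly deny access by IP address or by user agent.
SUSPECT_PATTERNS = [
    re.compile(r"^\s*Deny\s+from\s+\S+", re.IGNORECASE),
    re.compile(r"^\s*RewriteCond\s+%\{HTTP_USER_AGENT\}", re.IGNORECASE),
    re.compile(r"^\s*SetEnvIf(NoCase)?\s+User-Agent\s+", re.IGNORECASE),
]


def suspect_lines(htaccess_text):
    """Return (line_number, line) pairs for non-comment lines matching a pattern."""
    hits = []
    for number, line in enumerate(htaccess_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue
        if any(pattern.search(line) for pattern in SUSPECT_PATTERNS):
            hits.append((number, line.strip()))
    return hits


# Hypothetical .htaccess contents.
sample = r"""# hypothetical .htaccess contents
RewriteEngine On
RewriteRule ^sea-garfish\.html$ modules.php?name=Content [L]
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
"""

print(suspect_lines(sample))
```

Each flagged line is worth comparing against the addresses and user agents seen in the server's access log for the failed sitemap fetches.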
|
|
|
 |
mrix

|
Posted:
Mon Oct 03, 2005 4:26 am |
|
I haven't got anything to do with IPs in my .htaccess file at all.
I do really appreciate these ideas, though. As you can imagine, having a site somehow being blocked gets you down; I mean, the site is still ranked Google 5.
Cheers
mrix |
|
|
|
 |
hitwalker

|
Posted:
Mon Oct 03, 2005 5:48 am |
|
Well, I used a website checkup and that was allowed, so I wasn't blocked by anything.
Weird...
But another question:
if there is an .htaccess file in your root, can you see it? |
|
|
|
 |
mrix

|
Posted:
Mon Oct 03, 2005 8:23 am |
|
Yes, I do have an .htaccess file, which mainly runs GoogleTap, but I don't use it for Sentinel.
mrix |
|
|
|
 |
|