robots.txt - How to disallow bots from a single page or file


How do I disallow bots from crawling a single page, while allowing all other content to be crawled?

It's important that I don't get this wrong, so I'm asking here; I can't find a definitive answer elsewhere.

Is this correct?

    User-agent: *
    Disallow: /dir/mypage.html
    Allow: /

The Disallow line is the only one that's needed. It blocks access to anything whose path starts with "/dir/mypage.html".

The Allow line is superfluous. The default for robots.txt is Allow: /, so in general Allow is not required. It's there so that you can override access that would otherwise be disallowed. For example, suppose you want to disallow access to the "/images" directory, except for images in its "public" subdirectory. You would write:

    Allow: /images/public
    Disallow: /images

Note that order is important here. Crawlers are supposed to use a "first match" algorithm. If you wrote the Disallow line first, the crawler would assume that access to "/images/public" is blocked.
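As a rough illustration, the "first match" rule described above can be sketched in Python. This is a hypothetical simplification (real crawlers also handle wildcards and per-agent groups, and some, like Googlebot, use longest-match rather than first-match), but it shows why rule order matters:

```python
def is_allowed(path, rules):
    """Check a path against robots.txt rules using first-match semantics.

    rules: list of (directive, prefix) tuples in file order,
           where directive is "allow" or "disallow".
    The first rule whose prefix matches the path wins; if no
    rule matches, crawling is allowed by default.
    """
    for directive, prefix in rules:
        if path.startswith(prefix):
            return directive == "allow"
    return True  # no rule matched: allowed by default


# Allow listed before Disallow, as in the example above:
rules = [("allow", "/images/public"), ("disallow", "/images")]
print(is_allowed("/images/public/logo.png", rules))   # True
print(is_allowed("/images/private/logo.png", rules))  # False

# With the order reversed, Disallow matches first and
# "/images/public" is blocked too:
reversed_rules = [("disallow", "/images"), ("allow", "/images/public")]
print(is_allowed("/images/public/logo.png", reversed_rules))  # False
```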

