robots.txt - How to disallow bots from a single page or file


How do I disallow bots from a single page, and allow all other content to be crawled?

It's important not to get this wrong, so I'm asking here; I can't find a definitive answer elsewhere.

Is this correct?

    User-agent: *
    Disallow: /dir/mypage.html
    Allow: /

The Disallow line is all that's needed. It will block access to anything that starts with "/dir/mypage.html".
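As a rough way to check the prefix behaviour described above, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The bot name `ExampleBot` and the `example.com` URLs are made up for illustration, and this parser is only one implementation; a real crawler may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Only the Disallow line is present; everything else is allowed by default.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dir/mypage.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs used purely for illustration.
urls = [
    "https://example.com/dir/mypage.html",      # the blocked page itself
    "https://example.com/dir/mypage.html.bak",  # also starts with the blocked path
    "https://example.com/dir/other.html",       # different page in the same directory
    "https://example.com/",                     # site root
]

# Map each URL to whether the (hypothetical) bot may fetch it.
results = {url: parser.can_fetch("ExampleBot", url) for url in urls}

for url, allowed in results.items():
    print(f"{'allowed' if allowed else 'blocked'}  {url}")
```

With this parser, the first two URLs come back blocked and the last two allowed, which matches the "starts with" rule.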

The Allow line is superfluous. The default for robots.txt is Allow: /. In general, Allow is not required. It's there so that you can override access to something that would otherwise be disallowed. For example, say you want to disallow access to the "/images" directory, except for images in the "public" subdirectory. You would write:

    Allow: /images/public
    Disallow: /images

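Note that order is important here. Crawlers are supposed to use a "first match" algorithm. If you wrote the Disallow line first, the crawler would assume that access to "/images/public" was blocked. Some crawlers, Googlebot among them, instead apply the most specific (longest) matching rule, in which case order does not matter, but listing Allow first is safe either way.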
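To see the effect of rule order under a first-match parser, here is a small sketch using Python's standard-library `urllib.robotparser`, which checks rules in the order they appear. The helper `can_fetch`, the bot name `ExampleBot`, and the `example.com` URLs are hypothetical names chosen for this example; other crawlers may resolve the conflict differently.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Allow listed before Disallow, as recommended above.
ALLOW_FIRST = """\
User-agent: *
Allow: /images/public
Disallow: /images
"""

# Same rules with the order reversed.
DISALLOW_FIRST = """\
User-agent: *
Disallow: /images
Allow: /images/public
"""

PUBLIC_IMAGE = "https://example.com/images/public/logo.png"
PRIVATE_IMAGE = "https://example.com/images/private/photo.png"

allow_first_public = can_fetch(ALLOW_FIRST, PUBLIC_IMAGE)
disallow_first_public = can_fetch(DISALLOW_FIRST, PUBLIC_IMAGE)
allow_first_private = can_fetch(ALLOW_FIRST, PRIVATE_IMAGE)
disallow_first_private = can_fetch(DISALLOW_FIRST, PRIVATE_IMAGE)

print("Allow first, public image:   ", allow_first_public)
print("Disallow first, public image:", disallow_first_public)
print("Allow first, private image:  ", allow_first_private)
print("Disallow first, private image:", disallow_first_private)
```

With this parser, the public image is reachable only when Allow comes first; the private image is blocked in both orderings.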

