Jun 28, 2008

Robots Meta tag values


Did you know that you can tell search engine robots whether your pages should be indexed, whether a cached copy may be shown, how Open Directory Project listings are handled, and more, just by changing a meta tag?

This is how to do it.

NEW:
In June 2008, the revised Robots Exclusion Protocol added NOARCHIVE, NOODP and NOSNIPPET to the list of values supported by Google and MSN Live Search, and Yahoo added NOYDIR.
Why add robots meta tags for search engines?
Because it is better to tell the search engines directly what you want them to do with your pages.
The default values are now assumed to be INDEX, FOLLOW, ARCHIVE, ODP, SNIPPET and YDIR.
There is no actual need to include these,
unless someone on the web team needs reminding.
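To make the defaults concrete, here is a minimal Python sketch of how a robot might turn a content string into on/off settings. It assumes the six defaults listed above and that a NO prefix switches one of them off. The name `effective_directives` is hypothetical, and real search engines may handle unknown or conflicting values differently.

```python
# Defaults as listed in the post; a NO-prefixed value switches one off.
DEFAULTS = ("INDEX", "FOLLOW", "ARCHIVE", "ODP", "SNIPPET", "YDIR")


def effective_directives(content: str) -> dict:
    """Return {directive: bool} for a robots META content string (hypothetical helper)."""
    state = {name: True for name in DEFAULTS}  # everything is allowed by default
    for raw in content.split(","):
        value = raw.strip().upper()  # values are treated as case-insensitive here
        if value in state:
            state[value] = True
        elif value.startswith("NO") and value[2:] in state:
            state[value[2:]] = False
        # Unknown values are simply ignored in this sketch.
    return state


print(effective_directives("NOINDEX, FOLLOW"))
```

An empty content string leaves every setting at its default, which is why the default values never need to be written out.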
So let's begin.
You probably already know what a meta tag is.
It looks like this.
Example:
<HEAD>   
<title>Should Not Be Indexed (Meta Robots noindex nofollow)
</title>   
<META name="robots" content="NOINDEX,NOFOLLOW" /> 
</HEAD>
The meta tag above tells search engines not to index this page and not to follow the links on it.
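If you want to check what a page actually declares, the tag can be read back with Python's standard `html.parser` module. This is only a sketch: the class `RobotsMetaParser` and the function `robots_values` are hypothetical names, and the sample page is the example above with the title written on one line.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the values of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META> arrives as "meta".
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.values += [v.strip().upper() for v in content.split(",") if v.strip()]


def robots_values(html: str) -> list:
    """Return the robots META values found in an HTML string, upper-cased."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.values


PAGE = """<HEAD>
<title>Should Not Be Indexed (Meta Robots noindex nofollow)</title>
<META name="robots" content="NOINDEX,NOFOLLOW" />
</HEAD>"""

print(robots_values(PAGE))
```

The words "noindex nofollow" in the title are ordinary text and are not picked up; only the content attribute of the robots meta tag counts.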
Below is the list of options you can use.
Task / Entry / Notes
Do not index, but follow links

<META name="ROBOTS" content="NOINDEX, FOLLOW">

Use this for pages with many links on them, but no useful data, such as a site map. Because "follow" is the default, you don't have to include it.
Index, but do not follow links

<META name="ROBOTS" content="NOFOLLOW, INDEX">

Use this for pages which have useful content but outdated or problematic links.
Do not index or follow links

<META name="ROBOTS" content="NOINDEX,NOFOLLOW">

This is for sections of a site that shouldn't be indexed and shouldn't have links followed. Access control, such as password protection, is much better for security.
Index and follow links: default behavior

<META name="ROBOTS" content="INDEX,FOLLOW">

This is the default behavior: you don't have to include these.
Search results pages should not show "cache" link

<META name="ROBOTS" content="NOARCHIVE">

Useful if the content changes frequently: headlines, auctions, etc. The search engine probably still archives the information, but won't show it in the results.
Do not display the Open Directory Project (ODP) title and description for the page in search results.

<META name="ROBOTS" content="NOODP">

Danny Sullivan provides good examples of how outdated descriptions and even titles show up when the ODP content is used for search results.

Encourages search engines to use the page title tag and either the matched terms in context or the META Description tag content, instead of the ODP content, which may be misleading or outdated.
Do not display the Yahoo! Directory title and description for the page

<META name="ROBOTS" content="NOYDIR">

(Yahoo Slurp robot only)

Same as above, but for the Yahoo! Directory; other search indexers will ignore it.
Do not display any description or text context for this URL in search results.

<META name="ROBOTS" content="NOSNIPPET">

Encourages the search engines to use the title only, and to suppress the "cache" link. Might be useful if the site has special plus box listings in search results, but otherwise, not so much.
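Since the default values never need to be written out, a site template could generate the tag from a few on/off settings and emit only the values that differ from the defaults. The sketch below illustrates that idea in Python. The function `robots_meta_tag` and its parameter names are hypothetical, and it assumes the six defaults described earlier.

```python
def robots_meta_tag(index=True, follow=True, archive=True,
                    odp=True, snippet=True, ydir=True) -> str:
    """Build a robots META tag containing only the non-default values (hypothetical helper)."""
    settings = [("INDEX", index), ("FOLLOW", follow), ("ARCHIVE", archive),
                ("ODP", odp), ("SNIPPET", snippet), ("YDIR", ydir)]
    # Only switched-off settings need to be stated; the rest are assumed by default.
    values = ["NO" + name for name, allowed in settings if not allowed]
    if not values:
        return ""  # all defaults: no tag is needed
    return '<META name="ROBOTS" content="%s">' % ", ".join(values)


print(robots_meta_tag(index=False))                # a site map page
print(robots_meta_tag(archive=False, odp=False))   # frequently changing headlines
```

Calling it with no arguments returns an empty string, matching the "Index and follow links" row, where no tag is required at all.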

Robots Information
Introduction to web crawling robots for search indexing and other purposes
Robot Exclusion Protocol (REP)
Information on the original protocol and the June 2008 search engines extensions
Elements of Robots.txt
Robots.txt Details
Practical notes on implementing robots.txt
META Robots Tag Page
Describes the META Robots tag contents and implications for search indexing robots.
Indexing Robot Checklist
A list of important items for those creating robots for search indexing.
List of Robot Source Code
Links to free and commercial source code for robot indexing spiders
List of Robot Development Consultants

1 comments:

Unknown said...

This is very useful to me.
I learned how to done it now.
thanks
