ROBOTS.TXT OPTIMIZATION
Fundamentally, we want the content of our
websites to be indexed immediately so that traffic is driven to the site and search
engine rankings improve. In some situations, however, we need a way
to hide certain pages and private files on our website from the index.
The most useful and popular way of hiding
files from search engines is the robots.txt file. The robots
meta tag works too, but some engines cannot read meta tags. If robots.txt is
used, all major search engines will see it.
Robots.txt is also referred to as the Robots
Exclusion Standard (or Robots Exclusion Protocol). It was developed in 1994, when it
was first adopted by engines such as Lycos and AltaVista. As the
extension says, it is a plain text file, not an HTML document. Robots.txt can help
prevent search engines from crawling certain pages, but like any other tool,
robots.txt can be unreliable at times, so do not entrust your top-secret information
to a text file.
Put the robots.txt file in the main
directory. It is important to put the file in the proper directory, because engines
look first in the main directory, e.g.
[http://www.your-domain-name.com/robots.txt]. If a user agent (search
engine crawler) cannot find the file there, it assumes that all of the web pages
may be indexed.
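As a sketch of how a crawler interprets these rules, Python's standard urllib.robotparser module can evaluate a robots.txt file; the domain and the rules below are placeholders, not part of any real site:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would fetch it from
# http://www.your-domain-name.com/robots.txt via rp.set_url(...) and rp.read()
rules = """
User-agent: *
Disallow: /wp-admin
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A well-behaved crawler skips disallowed paths and fetches everything else
print(rp.can_fetch("*", "http://www.your-domain-name.com/wp-admin"))  # False
print(rp.can_fetch("*", "http://www.your-domain-name.com/about"))     # True
```

Note that if the parser is given no rules at all (the missing-file case), can_fetch returns True for every URL, which is exactly the "index everything" fallback described above.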
What does robots.txt look like?
Robots.txt has a really plain structure,
composed only of user agents and disallowed directories and files. The
term used for a search engine crawler is "user agent", and "disallow"
introduces the files and directories you want to exclude from indexing. If
you want to insert a comment, put a # sign at the start of the line.
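For example, a minimal robots.txt using all three elements might look like this (the directory names here are illustrative, not a recommendation):

```
# keep all crawlers out of the admin area
User-agent: *
Disallow: /admin
Disallow: /private-files

# Googlebot only: also skip the scripts directory
User-agent: Googlebot
Disallow: /scripts
```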
Robots.txt is one of the many tools for
your website. Having a personal website spells fun! You can benefit from the
advancements in the technology world. If you have one, don't forget to deliver
your audience quality content to drive site traffic. Contact an organic SEO
company for clean and proper SEO techniques if you want to optimize your
website for your target niche. Below is an example of a robots.txt file optimized for a WordPress site:
User-agent: *
# prevents indexing sensitive files
Disallow: /cgi-bin
Disallow: /wp-login.php
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /category/*/*
Disallow: */trackback
Disallow: */feed
Disallow: */comments
Disallow: /*?
# prevents Googlebot from indexing certain file types
User-agent: Googlebot
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.swf$
Disallow: /*.wmv$
Disallow: /*.cgi$
Disallow: /*.xhtml$
# Allow Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*
# Allow Google AdSense
User-agent: Mediapartners-Google
Disallow:
Allow: /*
# Tells crawlers where to find the sitemap
Sitemap: http://www.geekpress.fr/sitemap.xml
Robots.txt for Blogger
To exclude certain content from being searched, go to Settings | Search Preferences and click Edit next to "Custom robots.txt." Enter the rules for the content you would like web robots to ignore. For example:
User-agent: *
Disallow: /about
Robots.txt for WordPress
WordPress contains sensitive files and folders, such as wp-admin and wp-includes, which for safety should not be indexed.
With a robots.txt file optimized for WordPress, you indicate to the different search engines which folders and files not to index.
The robots.txt file contains a list of commands for the indexing spiders of the different search engines. It specifies the pages or folders that should or should not be indexed by robots.
This unique file must be at the root of your website and accessible at an address of the form www.your-domain-name.com/robots.txt.
All search engines begin exploring a site by looking for the robots.txt file at this address. If the file does not exist, the robot starts its indexing from the homepage.
The example shown above is a robots.txt optimized for a website or blog running WordPress.