How to remove a web page from search engines using meta tags?

This approach is suitable when you do not have root access to the server and cannot create a “robots.txt” file.

To prevent all robots from indexing a page on your site, place the following meta tag into the <head> section of your page:

<meta name="robots" content="noindex">

To let other robots index the page while preventing only Google’s robots from indexing it:

<meta name="googlebot" content="noindex">

When Google sees the noindex meta tag on a page, it drops the page completely from its search results, even if other pages link to it. Other search engines, however, may interpret this directive differently, so a link to the page can still appear in their search results.

If the content is currently in Google’s index, Google will remove it the next time it crawls the site. To expedite removal, use the URL removal request tool in Google Webmaster Tools.
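Before waiting for the next crawl, it can help to confirm that the page really carries the tag. The following is a minimal sketch using Python's standard html.parser module. The names RobotsMetaParser and has_noindex and the sample page are illustrative assumptions, and real crawlers may apply additional rules.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of <meta> tags addressed to one crawler name."""

    def __init__(self, crawler="robots"):
        super().__init__()
        self.crawler = crawler.lower()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == self.crawler:
            for value in attr.get("content", "").split(","):
                if value.strip():
                    self.directives.add(value.strip().lower())


def has_noindex(html, crawler="robots"):
    """Return True if the page asks the given crawler name not to index it."""
    parser = RobotsMetaParser(crawler)
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical sample page that addresses only Googlebot.
page = '<html><head><meta name="googlebot" content="noindex"></head></html>'
print(has_noindex(page, "googlebot"))  # True
print(has_noindex(page))               # False
```

The second call returns False because the sample page addresses only “googlebot”, not the generic “robots” name.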

What is a Robots Meta Tag?

You can use a special HTML <META> tag to tell robots not to index the content of a page, and/or not scan it for links to follow.

For example:

<html>
<head>
<title>Test Page</title>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
</head>
<body>
</body>
</html>

There are two important considerations when using the robots <META> tag:

- Robots can ignore your <META> tag. Malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers, in particular, will pay no attention to it.

- The NOFOLLOW directive applies only to links on this page. A robot may well find the same links on some other page without a NOFOLLOW (perhaps on some other site) and so still arrive at the page you wanted to keep it away from.

How to write a Robots Meta Tag?

Where to put it:
Like any <META> tag it should be placed in the HEAD section of an HTML page, as in the example above. You should put it in every page on your site, because a robot can encounter a deep link to any page on your site.

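What to put into it:
The robots meta tag has two attributes: “NAME” and “CONTENT”.
The “NAME” attribute is normally “ROBOTS”, which addresses all robots. It can also name a single crawler, as in the “googlebot” examples above.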
Valid values for the “CONTENT” attribute are: “INDEX”, “NOINDEX”, “FOLLOW”, “NOFOLLOW”. Multiple comma-separated values are allowed, but obviously only some combinations make sense. If there is no robots <META> tag, the default is “INDEX, FOLLOW”, so there’s no need to spell that out. That leaves:

<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">
<META NAME="ROBOTS" CONTENT="INDEX, NOFOLLOW">
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
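How a robot might interpret the CONTENT value can be sketched in a few lines of Python. This is an illustration only: the function name interpret_robots_content is hypothetical, and the rule that a restrictive value wins over a conflicting permissive one is an assumption, since the text above does not cover conflicts.

```python
def interpret_robots_content(content=None):
    """Map a robots CONTENT string to a pair of booleans (index, follow).

    A missing tag (content=None) falls back to the default INDEX, FOLLOW.
    """
    values = set()
    if content:
        values = {v.strip().upper() for v in content.split(",") if v.strip()}
    # Assumption: if both a value and its negation appear, the negation wins.
    index = "NOINDEX" not in values
    follow = "NOFOLLOW" not in values
    return index, follow


for content in (None, "NOINDEX, FOLLOW", "INDEX, NOFOLLOW", "NOINDEX, NOFOLLOW"):
    print(content, "->", interpret_robots_content(content))
```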

How to remove cached copies of web pages using the robots meta tag?

Google automatically takes a “snapshot” of each page it crawls and archives it. This “cached” version allows a web page to be retrieved for your end users if the original page is ever unavailable. The cached page appears to users exactly as it looked when Google last crawled it, and Google displays a message at the top of the page to indicate that it’s a cached version. Users can access the cached version by choosing the “Cached” link on the search results page.

Depending on what you want to achieve, do one of the following:

To update the cached version of a page:
Change the content of the page. The next time Google crawls the page, it will update the cached version.

To remove cached versions of a page from Google’s index and prevent Google from caching the page in the future:
Add a noarchive meta tag to that page. The next time Google crawls the site, it will see the tag and remove the cached copy.

To prevent all search engines from showing a “Cached” link for your site, place this tag in the <HEAD> section of your page:

<meta name="robots" content="noarchive">

To prevent only Google from displaying one, use the following tag:

<meta name="googlebot" content="noarchive">

Once the tag is in place, you can use the URL removal tool in Webmaster Tools to request expedited removal of the cached content. The removal lasts for a minimum of six months.

How to remove the snippets that appear below your pages in Google search results and describe their content?

A snippet is a text excerpt that appears below a page’s title in Google’s search results and describes the content of the page.

To prevent Google from displaying snippets for your page, place this tag in the <HEAD> section of your page:

<meta name="googlebot" content="nosnippet">

Note: Removing snippets also removes cached pages.
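When these tags are produced by a template or script rather than typed by hand, a small helper can keep the directive names consistent. The sketch below is one possible approach in Python. The function robots_meta_tag is hypothetical, and its list of known directives is limited to the ones mentioned in this article.

```python
from html import escape

# Directives discussed in this article. Other values are rejected here only
# to catch typos, not because crawlers recognize nothing else.
KNOWN_DIRECTIVES = {"index", "noindex", "follow", "nofollow", "noarchive", "nosnippet"}


def robots_meta_tag(directives, crawler="robots"):
    """Build a robots meta tag addressed to the given crawler name."""
    cleaned = [d.strip().lower() for d in directives]
    unknown = [d for d in cleaned if d not in KNOWN_DIRECTIVES]
    if unknown:
        raise ValueError("unknown directive(s): " + ", ".join(unknown))
    content = ", ".join(cleaned)
    return f'<meta name="{escape(crawler)}" content="{escape(content)}">'


print(robots_meta_tag(["noarchive"]))
print(robots_meta_tag(["noindex", "nosnippet"], crawler="googlebot"))
```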

How to remove outdated pages from the Google index by returning the proper server response?

Google updates its entire index regularly. When Google crawls the web, it automatically finds new pages, removes outdated links, and reflects updates to existing pages, keeping the Google index fresh and as up-to-date as possible.

If outdated pages from your site appear in the search results, ensure that the pages return a status of either 404 (not found) or 410 (gone) in the header. These status codes tell Googlebot that the requested URL isn’t valid.
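How a server might choose between these status codes can be sketched with Python's standard http.server module. Everything specific here is an assumption made for illustration: the paths in GONE_PATHS and LIVE_PAGES, the helper status_for, and the handler name RemovalAwareHandler. A production site would normally configure this in its web server or framework instead.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical paths, for illustration only.
GONE_PATHS = {"/old-page.html", "/2009/offer.html"}
LIVE_PAGES = {"/": b"<html><body>Home</body></html>"}


def status_for(path):
    """Choose the HTTP status code for a requested path."""
    if path in GONE_PATHS:
        return 410  # Gone: the page was removed on purpose
    if path in LIVE_PAGES:
        return 200
    return 404  # Not Found: the URL is unknown


class RemovalAwareHandler(BaseHTTPRequestHandler):
    """Answers GET requests with 200, 404 or 410 according to status_for."""

    def do_GET(self):
        status = status_for(self.path)
        body = LIVE_PAGES[self.path] if status == 200 else b""
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


print(status_for("/old-page.html"), status_for("/missing.html"), status_for("/"))
```

To try it locally, the handler can be passed to http.server.HTTPServer, for example HTTPServer(("localhost", 8000), RemovalAwareHandler).serve_forever(). The sketch compares the raw request path, so a URL with a query string would need to be split first.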

How to remove images from Google Image Search using a robots.txt file?

To remove an image from Google’s image index, add a robots.txt file to the root of the server that blocks the image.

For example, if you want Google to exclude the logo.jpg image that appears on your site at www.yoursite.com/images/logo.jpg, add the following to your robots.txt file:

User-agent: Googlebot-Image
Disallow: /images/logo.jpg

To remove all the images on your site from the Google index, place the following robots.txt file in your server root:

User-agent: Googlebot-Image
Disallow: /

Additionally, Google has introduced increased flexibility to the robots.txt file standard through the use of asterisks. Disallow patterns may include “*” to match any sequence of characters, and patterns may end in “$” to indicate the end of a name. To remove all files of a specific file type (for example, to keep .jpg images in the index but remove .gif images), you’d use the following robots.txt entry:

User-agent: Googlebot-Image
Disallow: /*.gif$
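The effect of “*” and “$” can be checked by translating a pattern into a regular expression. This is a rough sketch of the matching rule as described above, not Google's actual implementation. The function names pattern_to_regex and is_blocked are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Disallow pattern with '*' and a trailing '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal part, then let each '*' match any characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(path, disallow_pattern):
    """Return True if the URL path matches the Disallow pattern."""
    if not disallow_pattern:
        return False  # an empty Disallow blocks nothing
    # re.match anchors at the start, so a pattern without '$' is a prefix match.
    return pattern_to_regex(disallow_pattern).match(path) is not None


print(is_blocked("/images/logo.gif", "/*.gif$"))         # True
print(is_blocked("/images/logo.jpg", "/*.gif$"))         # False
print(is_blocked("/images/logo.gif?size=2", "/*.gif$"))  # False
```

The last call shows why the “$” matters: with the end anchor, a URL that continues after “.gif” is not matched by this pattern.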

Sometimes a publisher creates a new website and wants to remove the old one from search engines. The publisher can do this with the help of a “robots.txt” file.

The “robots.txt” file is a text file in the website’s server root. It is used to ask search engines to remove your site and to prevent robots from crawling it in the future.

To prevent all robots from crawling your site, create a file named “robots.txt” in your server root and paste the following content into it:

User-agent: *
Disallow: /

To remove your site from Google only and prevent just Googlebot from crawling your site in the future, paste the following content into the file:

User-agent: Googlebot
Disallow: /

Each port must have its own robots.txt file. In particular, if you serve content via both http and https, you’ll need a separate robots.txt file for each of these protocols. For example, to allow all robots to index all http pages but no https pages, you’d use the robots.txt files below.

For the http protocol (http://yourserver.com/robots.txt):

User-agent: *
Allow: /

For the https protocol (https://yourserver.com/robots.txt):

User-agent: *
Disallow: /
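Because the file is looked up per protocol, host and port, it is easy to work out which robots.txt governs a given page. A minimal sketch in Python, where the function name robots_txt_url and the port 8080 URL are assumptions chosen for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Scheme, host and port are kept; path, query and fragment are dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://yourserver.com/shop/item.html"))
print(robots_txt_url("https://yourserver.com/shop/item.html"))
print(robots_txt_url("http://yourserver.com:8080/index.html"))
```

The three calls yield three different robots.txt URLs, one for each protocol and port combination.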

Note: If a robot discovers your site by other means, for example by following a link to your URL from another site, your content may still appear in Google’s index and search results. To entirely prevent a page from being added to the Google index even if other sites link to it, use a noindex meta tag.

Some More Examples:

Example 1:

The following example “/robots.txt” file specifies that no robots should visit any URL starting with “/India/delhi/” or “/test/”, or the page “/prince.html”:

# robots.txt for http://www.princejain.com/

User-agent: *
Disallow: /India/delhi/ # This is an infinite virtual URL space
Disallow: /test/ # these will soon disappear
Disallow: /prince.html

Example 2:
This example “/robots.txt” file specifies that no robots should visit any URL starting with “/India/delhi/”, except the robot called “Googlebot”:

# robots.txt for http://www.princejain.com/
User-agent: *
Disallow: /India/delhi/ # This is an infinite virtual URL space

# Googlebot knows where to go.
User-agent: Googlebot
Disallow:

Example 3:
This example indicates that no robots should visit this site further:

# go away
User-agent: *
Disallow: /
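Rules like these can be tested before they are uploaded. Python's standard urllib.robotparser module reads robots.txt rules and answers whether a given robot may fetch a URL. The sketch below feeds it the rules from Example 2. The robot name SomeOtherBot and the page URLs are made up for the test, and this parser follows the basic standard, so it does not understand the “*” and “$” pattern extensions.

```python
from urllib.robotparser import RobotFileParser

# Rules from Example 2: everyone is kept out of /India/delhi/ except Googlebot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /India/delhi/

User-agent: Googlebot
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

blocked_url = "http://www.princejain.com/India/delhi/page.html"
open_url = "http://www.princejain.com/prince.html"

print(parser.can_fetch("Googlebot", blocked_url))     # True
print(parser.can_fetch("SomeOtherBot", blocked_url))  # False
print(parser.can_fetch("SomeOtherBot", open_url))     # True
```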

SEO seems to have a very simple, one-phrase answer (Search Engine Optimization), but it covers many points and changes with technology trends and with search algorithms.

I am trying to cover some of the responsibilities and work involved. Please don’t mind if I forgot to mention something, and feel free to add your input.

Analysis Phase or Keyword Selection:
In this phase, SEO work involves analyzing and selecting keywords for the website being developed. This is a major part of website design, because keyword selection plays a major role in a website’s success on the internet. It is therefore a challenging and innovative phase of website design.

Website Design and Development:
In this phase, SEO work involves the layout or design of the website, along with database connectivity and coding. This phase also involves generating scripts and writing SQL scripts. The website design should be browser-compatible and search-engine friendly, with good resolution.

Search Engine Submission and Optimization (SEO):
In this phase, SEO work involves the SEO services themselves. It includes search engine submission and directory submission, with the aim of ranking highly in search engines for the desired keywords by obtaining one-way or reciprocal (two-way) links.

Some other work that is also part of SEO:
• Content analysis and writing
• Good knowledge of search engine optimization, email marketing, market research, web analytics, ad management, PPC, and web tools (AdWords, Google Analytics, Yahoo, Overture)
• Blogging and article writing
• Finding new ideas in current systems, finding new business verticals, and successfully implementing business models and revenue models
• Project management, SEO, client interaction, team leadership, and website design and development

Search Engine Optimization (SEO)

Posted by: Prince in SEO

Web Content Mass + Keyword Optimization + Links = SEO

How does web content really affect SEO? It’s often said that the answer is simply that content does not affect SEO very much – it’s all about more technical issues. Yet a website’s content still plays an enormous and fairly direct role in search engine ranking. Of course, the whole goal of the search engines’ ranking schemes is precisely to deliver good, relevant content to users. The mechanism for how search engines select and reward good, relevant content is essentially just a technical issue, though admittedly an extremely important technical issue.

But even in purely technical, mechanistic terms, web content affects search engine rankings in three ways:

1. inbound links
2. website mass
3. keyword optimization

1. Web Content and Inbound Links

Inbound links are the number-one factor in getting search engine rankings. They also yield plenty of traffic on their own. The importance of links is what has led many people to say that content is no longer important. But those people forget that content really does play a big role in getting links in the first place:

  • At the very least, good content will make potential link partners more comfortable with linking to your site. No one wants to link to a link farm, splog, junk site, or even just an unprofessional-looking site.

  • Lots of good content gives other webmasters (and particularly bloggers) a reason to link to your site spontaneously without being asked.

  • You can allow other websites to post your content in exchange for a link back to your site.

2. Web Content Mass

More web pages of content = more search engine traffic

Here’s why:

1. Adding pages to your site is like putting out extra nets to catch surfers.
2. Search engines see bigger websites as more prestigious and reliable.
3. The more content you have, the more reasons you give other webmasters, particularly bloggers, to link to your site spontaneously, without being asked.

3. Web Content Keyword Optimization

Keyword optimization used to be the most important step in SEO. Now it matters little in ranking for highly competitive keywords.

Still, keyword optimization can really help you get traffic from searches on non-competitive keywords. While you may never rank number 1 for “finance,” you may still come out on top for a search on “household finance rent federal tax deductions” if you have that phrase somewhere in your content. Such non-competitive searches make up a very large proportion of total web searches.

Web Content Keyword Optimization Checklist:

There are four legs to keyword optimization:
• Research/selection
• Density
• Prominence
• Stemming/Variation

Keyword Research and Selection

You need to identify the keywords your target audience searches on. Use tools such as those offered by WordTracker and Yahoo Search Marketing (formerly Overture).

There are two big pitfalls to avoid:

  • “Negative keywords” that look relevant but are not really searched on by your target market. For instance, “website copy” is a synonym for “website content,” but most people searching on “website copy” are looking for software that copies an entire website to the hard drive for offline browsing.

  • Impossibly competitive keywords that you have no realistic chance of ranking high for. How do you know if a keyword is impossibly competitive? One rough measure is to look at the PageRank of the web pages currently ranking in the top three for that keyword. If the PageRank of those pages is much higher than the PageRank your site will likely have in the future, you will probably never outrank those pages.

A pay-per-click campaign with Google AdWords or Yahoo! Search Marketing will help you find which keywords your target audience really searches on.

Keyword Density

Keywords should appear in the content often enough for search engines to recognize the page as relevant, but not so often that it looks like keyword stuffing. The longer the content, the more times the keyword should appear.

Keyword Prominence

Keywords should appear in just the right positions within your web pages for search engines to recognize them as relevant. The page title, headings, and first lines of the page are often considered the most prominent positions.

Keyword Stemming/Keyword Variation

  • Using variations of the keyword will help ensure web pages appear relevant to the next generation of more sophisticated search engine algorithms.

  • In the meantime, variations of popular keywords help your site appear for the “non-standard” searches on variations of the keyword.

There are three main types of keyword variations:

  • Word-stem variations. A stem of a word is its base. For instance, “optimize” is the stem of “optimized.” Other stem variations of “optimize” include “optimizing,” “optimizer,” and “optimization.” You can also shuffle the component words of multiple-word keywords. Variations of “website content” would be “web site content”, “web content”, “content for websites”, and “site content”.

  • Synonyms (such as “web page content”, “internet content”, or “writing for the web” for “website content”).

  • Related terms (such as “internet”, “SEO” or “web page”).
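The density and prominence checks from the checklist can be approximated with a short script. The text gives no formula for density, so the definition used here (words belonging to keyword occurrences divided by total words) is an assumption, as are the function names and the sample sentence. Search engines do not publish how they weigh these signals, so treat the numbers as a rough self-check only.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_positions(text, keyword):
    """Word indexes at which the (possibly multi-word) keyword starts."""
    words, key = tokenize(text), tokenize(keyword)
    if not key:
        return []
    return [
        i for i in range(len(words) - len(key) + 1)
        if words[i:i + len(key)] == key
    ]


def keyword_density(text, keyword):
    """Assumed definition: keyword words divided by all words in the text."""
    words = tokenize(text)
    if not words:
        return 0.0
    hits = len(keyword_positions(text, keyword))
    return hits * len(tokenize(keyword)) / len(words)


def first_position(text, keyword):
    """Prominence proxy: index of the first occurrence, or -1 if absent."""
    positions = keyword_positions(text, keyword)
    return positions[0] if positions else -1


sample = "Website content matters. Good website content earns links."
print(keyword_density(sample, "website content"))  # 0.5
print(first_position(sample, "website content"))   # 0
```

A lower first position means the keyword appears earlier in the text, which corresponds to the “first lines of the page” mentioned under Keyword Prominence.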

For many people, the SEO side of content feels like a moot point. You need to create content for your visitors even if no search engine spider ever notices. But there is a case to be made that an extra page of content is good not just for visitors but for search engine spiders, too. Every website budget, both of money and time, is finite. If you’re ever choosing whether to invest in another link to please search engines or another page of content to please your visitors, don’t forget: search engines still like content, too.