Fast page indexing in Yandex. Checking the number of pages in the Yandex index using operators. Link-free methods

Many webmasters know what site indexing in search engines is. They eagerly await each update of the search database, either to enjoy the indexing results or to find and fix the optimization errors that interfere with proper indexing and further website promotion.

Thanks to high-quality indexing of sites on the Internet, you can find anything you want.

How does the indexing system work in major search engines?
Search engines have robot programs (search bots) that constantly "walk" along links in search of new pages. If a bot finds a new page that meets the requirements of the search engine's algorithm, the page is indexed and included in the search results.


pic: Indexing helps find sites

The most valuable and at the same time most complex part of a search engine is the algorithm by which it selects pages for its search base. Each search engine has its own: some are more advanced, some are simpler. This should also be taken into account when working on a site's indexing. They say you can find anything on the Internet. And what makes that possible? Exactly: high-quality indexing of sites.

How do I add a site to the search engine index?

How can you quickly and easily add your site to the search engine index? It would seem there is nothing complicated about it: you just need to put the site online, and the search engines themselves will rush to it. If everything were that simple, numerous SEO specialists would be out of work.

Let's see what indexing is. Indexing is the process of adding pages on your site to the search engine's database. In simple terms, the search engine collects your pages so that they can then be shown to users for specific queries. In what order to display and for what queries - this is the topic of more than one article.

Indexing a site is quite simple: you need to "tell" the search engine that you have a site that may be of interest to it. Each search engine has a form for adding sites to the index. Here are links to forms for adding sites to the index of some search engines:

To speed up indexing, many recommend registering your site with social bookmarking systems. This is really justified because search robots (programs that conduct indexing) visit such sites very often. If they see a link to your resource there, then its indexing will not take long.

You can register a site in search engines and social bookmarking services yourself, or you can entrust this task to companies that specialize in site promotion.

Why do I need indexing?

Do you need a website that increases sales of your company and promotes your products? Or maybe you need a website that is profitable by itself? Maybe you want to keep a personal diary and get paid for it? If you answered in the affirmative to any of these questions, then you should at least in general terms understand what the indexing of a site in search engines is.

Follow the main condition - create a site “for people”, convenient and with unique content.

Indeed, if your site is not in the search results of the largest search engines (Yandex, Google, Rambler ...), then you may not even hope to make a profit and promote your goods or services. The site will be an extra burden, eating away the firm's budget for its maintenance.

A completely different situation will arise if the site is indexed. Moreover, the more pages have been indexed, the better. The main thing that is necessary for successful indexing is optimization and uniqueness of the site content.

Search engines are developing rapidly, indexing algorithms are constantly being improved. Now search engines have no difficulty in identifying plagiarism or unreadable text. Therefore, follow the main condition that is necessary for successful indexing - create a site “for people”, convenient and with unique content.

Site indexing not only brings a large number of targeted visitors (which ultimately affects the sales of your company's products), it also contributes to the development of the project itself and can point the site owner toward more promising ways to expand the Internet project.

How often does indexing happen on the Internet?

On many large forums devoted to site promotion and optimization, you can find topics with roughly the same name: search engine "ups". What are they, how often are search engine databases updated, and how does all this affect indexing? Let's try to figure it out.

A person who is even slightly versed in Internet terminology probably knows what an "up" is. But what an update of the search base, or indexing update, is - that only those who deal with site promotion know. It is clear that search engine data cannot be updated continuously: this is fraught not only with banal server overloads, but also with equipment failure. Of course, small databases can change their state constantly, but search engine databases responsible for indexing sites are a completely different matter.

Imagine how many queries the indexing database receives every second. What would become of it if indexing information were also changing in parallel? Naturally, it might not hold up - which is exactly what was observed at the dawn of search engines.

Today this problem is solved in a rather universal way: data on indexing from search robots are stored in temporary databases, and the update of the "main" database occurs with a delay of several days. Therefore, the indexing of sites in major search engines is quite fast and without glitches.

Preparing the site for indexing

Many novice webmasters on specialized forums ask the same question: how to properly prepare a site for indexing. Perhaps these recommendations will help you:

  1. Successful indexing requires high-quality unique content. This is perhaps the first and most important condition. If your site uses "stolen" content, then the likelihood that the indexing will be successful is low.

  2. Do not use "gray" and "black" optimization methods: once and for all, give up lists of keywords set in the page's background color, as well as various iframe tricks. If the search robot suspects you of such violations, the entire domain may be banned from indexing.

  3. After you have uploaded the site to the server, do not rush to add it wherever possible. Check again the content, code for validity, internal linking of pages. If everything is done correctly, notify search bots and invite them for indexing.

  4. Check that the pages have meta tags with keywords and descriptions, page titles, and alt text for images (see the sketch after this list). If all this is in place, you can safely proceed to indexing.

  5. Add your site to search engines through special panels.
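For clarity, here is a rough sketch of what such elements might look like in the page code (the texts are placeholders, not recommendations):

<head>
<title>Page title reflecting its content</title>
<meta name="description" content="A short, unique description of the page">
<meta name="keywords" content="main keyword, related phrase">
</head>

and, in the body of the page, images with a filled-in alt attribute:

<img src="photo.jpg" alt="Brief description of the image">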

As you can see, the tips are pretty simple. But for some reason, many novice optimizers do not pay enough attention to them, and then complain that the indexing of their sites is delayed for several months.


Site indexing is the most important, necessary and primary step in its optimization. After all, it is precisely thanks to the index that search engines can respond extremely quickly and accurately to all user queries.

What is site indexing?

Site indexing is the process of adding information about a site's content to the search engine database; the index is, in essence, that database. For a site to be indexed and appear in search results, a special search bot must visit it. The bot examines the entire resource, page by page, according to a certain algorithm, finding and indexing links, images, articles and so on. Sites whose authority is higher than the rest appear higher in the search results.

There are two ways a site can get into a search engine's index:

  • The search robot discovers the new resource or fresh pages on its own - this method works well if there are active links to your site from other, already indexed sites. Otherwise, you can wait for the search robot indefinitely;
  • Manually submitting the site URL through the special form provided by the search engine - this option lets the new site "queue up" for indexing, which can take quite a long time. The method is simple, free and requires entering only the address of the site's main page. This can be done through the webmaster panels of Yandex and Google.

How to prepare a site for indexing?

It should be noted right away that it is highly undesirable to publish the site at the development stage. Search engines can index unfinished pages with incorrect information, spelling errors, etc. As a result, this will negatively affect the site's ranking and the issuance of information from this resource in the search.

Now let's list the points that should not be forgotten at the stage of preparing a resource for indexing:

  • indexing restrictions apply to Flash files, so it is better to build the site in HTML;
  • JavaScript content is likewise not indexed by search robots, so site navigation should be duplicated with text links, and important information that needs to be indexed should not be generated with JavaScript;
  • you need to remove all broken internal links so that each link leads to the real page of your resource;
  • the structure of the site should allow you to easily move from the bottom pages to the main page and back;
  • it is better to move unnecessary and secondary information and blocks to the bottom of the page, as well as hide them from bots with special tags.

How often does indexing happen?

Indexing a site, depending on a number of reasons, can take from several hours to several weeks, up to a whole month. Indexing updates, or search engine ups, occur at different intervals. According to statistics, on average, Yandex indexes new pages and sites for a period of 1 to 4 weeks, and Google manages for a period of up to 7 days.

But with proper preliminary preparation of the created resource, these terms can be reduced to a minimum. In fact, all search engine indexing algorithms and the logic of their work boil down to giving the most accurate and relevant answer to the user's query. Accordingly, the more regularly quality content appears on your resource, the faster it will be indexed.

Methods for speeding up indexing

First, you should "notify" the search engines that you have created a new resource, as mentioned in the section above. Many people also recommend adding a new site to social bookmarking systems, but I don't. This really did speed up indexing several years ago, since search robots "visit" such resources often, but in my opinion it is now better to post a link from popular social networks. The robots will soon notice the link to your resource and index it. A similar effect can be achieved with direct links to the new site from already indexed resources.

After several pages have entered the index and the site has begun to develop, you can try to "feed" the search bot to speed up indexing further. To do this, publish new content at roughly regular intervals (for example, 1-2 articles every day). Of course, the content should be unique, high-quality, literate and not overloaded with key phrases. I also recommend creating an XML sitemap, which is discussed below, and adding it to the webmaster panels of both search engines.

Robots.txt and sitemaps

The robots.txt file contains instructions for search engine bots. Among other things, it makes it possible to prohibit the indexing of selected site pages for a given search engine. If you create it manually, it is important that the file name is written strictly in lowercase letters and that the file is located in the root directory of the site; most CMS generate it automatically or via plugins.
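As a rough illustration (the directory names are placeholders), a simple robots.txt might look like this:

# rules for the Yandex robot only
User-agent: Yandex
Disallow: /search/

# rules for all other robots
User-agent: *
Disallow: /admin/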

A sitemap is a page containing a complete model of the site structure, meant to help "lost" users. From it, you can move from page to page without using the regular site navigation. It is advisable to also create such a map in XML format for search engines and reference it in the robots.txt file to improve indexing.
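A minimal sketch of such an XML map (the addresses are hypothetical) and the line that points robots to it from robots.txt:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://your-site.ru/</loc>
  </url>
  <url>
    <loc>https://your-site.ru/about/</loc>
  </url>
</urlset>

Sitemap: https://your-site.ru/sitemap.xml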

You can get more detailed information about these files in the corresponding sections by clicking on the links.

How to prevent a site from being indexed?

You can manage indexing - including blocking the whole site or a single page - using the robots.txt file already mentioned above. To do this, create a text document with this name, place it in the root folder of the site and specify in the file which search engine's robots the site should be hidden from. To address all robots at once, use the * sign. The instruction below will prohibit indexing by all search engines.

User-agent: *
Disallow: /

For WordPress sites, you can prohibit indexing through the control panel. To do this, in the site visibility settings, check the option that asks search engines not to index the site. Yandex will most likely honor this request, but Google is not obliged to, so some pages may still end up in the index.

The pages need to be indexed. So what is site indexing in simple words? Each search engine has its own search robot. It can visit the site at any time, "walk" through it and transfer all scanned documents (all the HTML code, text, images, links and everything else) to its search engine's database. This process is usually called "scanning".

Well, now let's look for answers to the questions "how to speed up indexing?" and "how to improve indexing?"

How to check the indexing of a site in Google and Yandex

There are several ways to get an answer to the questions "is this page indexed?", "How many pages are indexed?" etc. Let's take a look at some of the most effective ones. But to begin with, a little clarification - the processing of a search query and the formation of search results is based on indexed "copies" of pages in the database of the search engine, and not on the basis of the pages available on the site. Now about three ways to check indexing:

    Webmaster panels of the search engines (Yandex, Google, Mail.ru, etc.). There you can get all the necessary data with 100% reliability.

    Runet is full of suitable online services. Almost all of them work the same way: you enter the site address and get the data and the entire indexing history of the site at a glance.

    Manual checking using the site: operator. That is, you type site:your-site.guru into the search bar and immediately get a list of indexed pages (in the form of search results) and their number.

Check site indexing in Yandex.Webmaster

Here you can paste in a whole column of sites that need to be checked for indexing. The list can contain both plain domain names (for example, your-site.go) and links to specific documents (for example, your-site.go/content/domashka/) - it does not matter, because the service extracts the domain names automatically.

This service is able to check indexing in Google and Yandex. Up to 250 checks per day are allowed.

In short, the service is completely similar to the previous one, with two exceptions:

  • the service checks indexing "one site at a time", i.e. only one site can be checked per query;
  • you can also check the indexing in Bing.

How to speed up website indexing

It is unlikely that any of you will ask "why speed up indexing?" - the answer is obvious: the sooner the site is indexed, the sooner you can take all the necessary measures to conquer the top positions for the promoted queries.

The very first thing to do is add the site to the webmaster accounts of Google and Yandex. If this is not done, the site will be indexed very slowly and very rarely, leaving the webmaster alone with his dreams of conquering at least the TOP-10.

Next, you need to configure robots.txt correctly, because a search robot can scan only a certain number of pages per visit. It would be very disappointing if the robot spent that limit on pages of a "technical" nature (for example, the registration page or the login page), which have no business being in the index. To prevent this, the robots.txt file lists the pages that should not be crawled. We will not discuss how robots.txt is compiled here, because we already have a detailed article on this topic.

The next step is to set up an XML sitemap (sitemap.xml). It contains a list of all pages that need to be indexed by search engines. You can also specify the indexing priority and even update intervals. The more often the sitemap is updated, the higher the likelihood that the search robot will decide to visit the site more often (which is exactly what we need, right?).
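As a sketch, one entry in sitemap.xml with a priority and an update interval might look like this (the address and values are purely illustrative):

<url>
  <loc>https://your-site.ru/blog/new-article/</loc>
  <lastmod>2024-01-15</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.8</priority>
</url>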

The speed of the site can also affect indexing: the faster the search robot receives a response to its request, the better. If you effectively tell the robot "we're at lunch, come back in an hour", it will simply leave - and for a long time.

Errors in the code can also affect indexing, and only negatively. It is therefore extremely important that there are no errors in the code at all: instead of a 200 code (the code returned when everything works correctly), the search robot may receive, for example, a 404 or another code indicating an error.

In most cases, the measures described above are enough for the site to be indexed quickly enough.

Fast site indexing in Yandex

There are a couple more non-obvious ways to speed up the indexing of a new site in Yandex. We have already mentioned registering in the webmaster accounts of Yandex, Google, Bing and Mail.ru many times.

In these panels, you can get statistics on the re-indexing of pages, as well as receive notifications about certain errors related to the operation of the site.

And now a small "life hack". In Yandex.Webmaster, go to "Indexing" - "Re-crawling pages" and specify the address of the newly created page. True, this does not always work: after being notified, the Yandex search robot acts at its own discretion and does not always decide to visit the page, but at least it will know about it. We therefore recommend using this tool to speed up the indexing of a new site or its pages.

Indexing site images

Basically, getting this information will be useful if you are promoting, say, photo hosting.

It is important to note that Google, Bing and Yandex index text and images with different robots. Google indexes images the fastest. Yandex indexes images much more slowly than Google, but much faster than Bing, which indexes Runet sites very slowly. Don't believe it?

However, there is one universal piece of advice for optimizing images: write the alt="" and title="" attributes in the code for each image, because both are very important.

If you hover over a picture, an explanatory text hint may "pop up" to make it easier for the user to understand what they are looking at. The text of this hint is written in the title="" attribute.

There are also situations when an image does not load for some reason (it may have been deleted, or image loading may be disabled in the browser - it doesn't matter). In this case, instead of the photo, text should appear describing what exactly is shown in the image that did not load. This text is written in the alt="" attribute.
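In the page code, it might look like this (the file name and texts are placeholders):

<img src="kitten.jpg" alt="A gray kitten sleeping on a windowsill" title="Our cat taking a nap">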

How to find out how many pictures are indexed by a search engine?

The answer is extremely simple: go to the search engine you are interested in, type the site: operator into the search bar, write the domain name after the colon without a space, confirm the query with the Enter key, and then go to the "Images" or "Pictures" tab (depending on the search engine). You will see the pictures themselves and find out their exact number.

For Yandex it will look like this:

And for Google - like this

Questions

How to prevent site indexing

Sometimes it becomes necessary to prohibit the indexing of the site. There may be several reasons for this - technical work on the site, or a new design is being tested, etc.

There are several ways to tell search engines “don't index anything here”.

The very first, the most popular and the simplest is through the robots.txt file. It's enough just to write this code:
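User-agent: *
Disallow: /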

Now let's look at what this code means.

User-agent is a directive that specifies the name of a search robot for which a list of rules has been generated. If an asterisk is specified as a name, then the list is relevant for all search robots (except for those for whom individual lists of rules have been compiled). If you specify the name of a search robot instead of an asterisk, then the list of rules will apply specifically to it:

User-agent: yandex

The Disallow directive tells search robots which files/folders must not be scanned. In our case, a single slash indicates that indexing of the entire site is prohibited.

There are also special cases when you need to close indexing for all search engines except one. In that case there will be two lists of rules: the general one shown above and an individual list for the particular robot - here, the robot of the Yandex search engine.

User-agent: Yandex
Allow: /

We've dealt with User-agent and Disallow; now let's deal with the Allow directive. It is a permissive directive. In simple terms, the robots.txt code above prohibits indexing of the site by all search engines except Yandex.

Yes, we do not argue: despite the prohibitions, search engines can still index the site. However, this happens so rarely that it can be written off as a statistical error.

The second way is the robots meta tag. To do this, add the following line to the site's code:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

IMPORTANT!!! You can add it so that it is loaded into the code of every page of the site without exception, or you can add it only to certain pages. In both cases, this meta tag must be placed inside the head section of the HTML.

Compared to the first, the second method is more cumbersome for mass indexing bans and easier for targeted ones.

The third way is to close the site through .htaccess

The method is also quite simple - add the following code to the .htaccess file:
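A rough sketch of such protection via HTTP authentication (the path to the password file is a placeholder; the .htpasswd file with the login and password must be created separately):

AuthType Basic
AuthName "Restricted area"
AuthUserFile /home/user/site/.htpasswd
Require valid-user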

Now the search robot will not be able to access the site without a password.

This is the surest way to close the site from indexing, but it creates another problem: it becomes difficult to scan the site for errors, because not every parser can log in.

Method 4 is customizing the HTTP response header

This method is also quite effective for a targeted indexing ban. It works as follows: along with the server response code, an X-Robots-Tag header with the noindex directive is sent. Having received such an "accompanying" answer, the search engine will not index the page.

If necessary, you can send several X-Robots-Tag headers at once - for example, noarchive and unavailable_after.
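A sketch of such a fragment of the server response (the date is purely illustrative):

X-Robots-Tag: noarchive
X-Robots-Tag: unavailable_after: 25 Jun 2025 15:00:00 GMT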

If necessary, you can also specify, before the directive, the search robot to which it is addressed, and the directives themselves can be written separated by commas. For example, like this:
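X-Robots-Tag: googlebot: noindex, nofollow

(the bot name and the set of directives here are illustrative: this header is addressed only to Google's robot and asks it not to index the page and not to follow its links)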

And, of course, you have already guessed that if no user agent name is specified, the command will be obeyed by all robots. That is, directives can be combined in different ways for different crawlers.

How to open a site for indexing

The answer has already been given - with the help of the Allow directive (remember the example where, using robots.txt, we blocked the site from indexing for all search engines except Yandex).

Allow: / allows indexing of the site, and Disallow: / prohibits.

When was the site indexed last

It is quite easy to view the history of page crawling by Google's search robots. Type the site: operator into the search bar, write your domain name after the colon without a space, then click "Tools" and select the appropriate period in one of the drop-down lists. If you select "in the last 24 hours", you will either get a list of the pages indexed during that time, or "Nothing found", which means your site has not been re-indexed in the last 24 hours.

In Yandex, everything is made even easier - all the necessary information and history is available in Yandex.Webmaster:

How to add a site for indexing

This information will be useful to everyone who has just created their first website: after all, the faster the site is indexed, the faster you get your first search traffic.

Adding a site to Google

First, follow the link http://www.google.com/addurl/?continue=/addurl , log in (if necessary), enter the URL, go through the captcha (in this case, check the box "I am not a robot" and go through 1-2 tasks with pictures), and click "Send request".

Then you will be given the following message. It means "OK, the site has been taken into account, we'll look there soon."

If everything is in order with the site, the indexing will happen very quickly.

Adding a site to Yandex

First of all, go to https://webmaster.yandex.ru/, register, then go to "Indexing" - "Re-crawling pages" and add the domain name of your site there. That's all.

Adding a site to Bing

It's even easier here: go to http://www.bing.com/toolbox/submit-site-url, enter the domain name and the captcha, and you're done! Registration is not required.

IMPORTANT!!! At the moment, it makes no sense to promote Russian-language sites on Bing, unlike English-language ones. This is due to the fact that very few people in Russia know about the Bing search engine.

Adding a site to Mail.ru Search

This procedure is also quite simple - go to the mail.ru webmaster's office at the link http://go.mail.ru/addurl, register / log in, then indicate the site's domain name, enter the captcha and click “Add”.

Then the following message will be displayed on the screen:

P.S. Indexing a site on Wordpress, Wix, Joomla, Ucoz or any other CMS or constructor is no different. It all depends on the set of rules that are written in the robots.txt file or in the page code itself. For more information on how to properly configure the robots.txt file, as well as how to open or close pages from crawling in Yandex and Google, read THIS article! Also, in the article you will find ready-made robots files for the correct indexing of WordPress, Joomla, Wix sites in Yandex and Google.

Site indexing in search engines - how it happens and how to speed it up

After creating their own website, many webmasters relax and think that the hardest part is over. In fact, this is not the case. First of all, the site is created for visitors.

After all, it is visitors who will read the pages with articles, buy goods and services posted on the site. The more visitors, the more profit. And traffic from search engines is the basis of everything, that's why it is so important that the indexing of the site is fast and the pages are kept in the index stably.

If there is no traffic, then few people will even know about the site, especially this provision is relevant for young Internet resources. Good indexing helps the page get to the top of search engines as soon as possible and, as a result, attracts a large number of targeted visitors.

What is indexing and how it works

First you need to understand what it is. Site indexing is the process of collecting information from site pages and then entering it into the search engine database. After that, the received data is processed. Then, after a while, the page will appear in the search engine results and people will be able to find it using this search engine.

Programs that collect and analyze information are called search robots or bots. Each search engine has its own robots. Each of them has its own name and purpose.

As an example, there are 4 main types of Yandex search robots:

1. A robot that indexes site pages. Its task is to detect and enter the found content pages into the database.

2. A robot that indexes pictures. Its task is to detect and enter into the search engine database all graphic files from the site pages. Then these pictures can be found by users in a Google image search or in the Yandex.Pictures service.

3. A robot that indexes site mirrors. Sometimes sites have multiple mirrors. The task of this robot is to identify these mirrors using information from robots.txt, and then give users only the main mirror in the search.

4. A robot that checks the availability of the site. Its task is to periodically check the site added via Yandex.Webmaster for its availability.

In addition to the above, there are other types of robots. For example, robots that index video files and favicons on site pages, robots that index "fast" content, and robots that check the performance of an Internet resource hosted in Yandex.Catalog.

Indexing of site pages by search engines has its own characteristics. If the robot discovers a new page on the site, it is entered into the database. If the robot detects changes on old pages, their previously stored versions are deleted from the database and replaced with the new ones. All this happens over a period of time, usually 1-2 weeks. Such long periods are explained by the fact that search robots have to work with a huge amount of information (a large number of new sites appear every day, and old ones are constantly updated).

Now about the files that search engine bots can index.

In addition to web pages, search engines also index some closed file formats, but with certain restrictions. For example, in PDF files robots read only the text content. Flash files quite often are not indexed at all (or only the text placed in special blocks is indexed). Robots also do not index files larger than 10 megabytes. Search engines handle text best of all: when indexing it, the minimum number of errors occurs and the content is entered into the database in full.

To summarize, many search engines at the moment can index formats such as TXT, PDF, DOC and DOCX, Flash, XLS and XLSX, PPT and PPTX, ODP, ODT, RTF.

How to speed up the process of site indexing in search engines

Many webmasters are thinking about how to speed up indexing. First, you need to understand what the indexing time frame is. This is the time between visits to the site by the search robot. And this time can vary from a few minutes (on large information portals) to several weeks or even months (on forgotten and abandoned small or new sites).

Content theft is common. Someone can simply copy your article and post it on their own website. If a search engine indexes that copy before it indexes your page, it may consider the copying site, not yours, to be the author. And although tools have appeared that allow you to claim authorship of content, the speed of indexing of site pages remains just as relevant.

Therefore, below we will give tips on how to avoid all this and speed up the indexing of your resource.

1. Use the "Add URL" function - the so-called "addurl" forms, in which you can enter the address of any page of the site and submit it. The page will then be added to the indexing queue.

It is available in many major search engines. So that you do not have to search for all the addresses of the forms for adding site pages, we have collected them in a separate article: "". This method cannot be called 100% protection against plagiarism, but it is a good way to inform the search engine about new pages.

2. Register the site in Google Webmaster Tools and Yandex.Webmaster service. There you can see how many pages of the site have already been indexed, and how many have not been indexed. You can add pages to the indexing queue and do much more with the tools available there.

3. Make a sitemap in two formats - HTML and XML. The first is needed for placement on the site and for ease of navigation. The second map is needed for search engines: it contains text links to all pages of your site, so when indexing, the robot will not miss any of them. A sitemap can be created using CMS plugins or numerous online services.

The following are excellent solutions for creating it:

  • For CMS Joomla, the Xmap component;
  • For WordPress, the Google XML Sitemaps plugin;
  • For CMS Drupal, the SitemapXML module;
  • The www.mysitemapgenerator.com service can serve as a universal tool for creating a sitemap.

4. Announce articles on social networks - Google+, Twitter, Facebook, Vkontakte. Immediately after adding a new article to the site, announce it on your Google+ page, Twitter feed, and pages on Facebook and Vkontakte. It is best to put social media buttons on the site and post the announcements simply by clicking them. You can set up automatic announcements on Twitter and Facebook.

5. Cross-post to various blog platforms. You can create blogs for yourself on services such as: Li.ru, Livejournal.com, wordpress.ru, blogspot.com and publish there short announcements of your articles with links to their full versions on your site.

6. Make an RSS feed of the site and register it in various RSS directories. You can find their addresses in the article: "".

7. Frequency of site updates. The more often new materials appear on your site, the more often search robots will visit it. For a new site, this is best done every day, well, at least every other day.

9. Only post unique content on your site. This is a universal rule of thumb that improves more than just indexing. The more unique the material, the better search engines will treat your site, and the more often search robots will visit you.

These methods for speeding up indexing will be quite enough for a young or middle-aged site. They won't take you long and have a good effect.

Prevent indexing pages

In some cases, the webmaster needs to close the site from indexing or close its individual pages and sections. What is it for? For example, some of the pages on your site do not contain useful information; these can be all sorts of technical pages. Or you need to close unnecessary external links, banners, and so on from indexing.

1. Robots.txt.

You can close individual pages and sections of the resource from indexing using the robots.txt file. It is placed in the root directory. It contains rules for search robots regarding the indexing of individual pages and sections, and even rules for individual search engines.

With the help of special directives of this file, you can very flexibly control the indexing.

Here are some examples:

You can prohibit indexing of the entire site by all search engines using the following directive:

User-agent: *
Disallow: /

Disable indexing of a specific directory:

User-Agent: *
Disallow: /files/

Disable indexing of pages whose url contains "?":

User-agent: *
Disallow: /*?

And so on. The robots.txt file has many directives and capabilities, and this is a topic for a separate article.

2. There are also the noindex and nofollow tags and the robots meta tag.

To prohibit indexing of certain content on a page, just place it between the <noindex> and </noindex> tags, but these tags only work for the Yandex search engine.

If you need to close a separate page or several pages of the site from indexing, you can use meta tags. To do this, between the <head> and </head> tags of the page, you need to add the following:
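<meta name="robots" content="noindex, nofollow">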

If you add:
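<meta name="robots" content="noindex">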

then the document will not be indexed either.

If you add:
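<meta name="robots" content="nofollow">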

then the search engine robot will not follow the links on this page, but will index the page itself.

In this case, what is indicated in the meta tags takes precedence over the directives of the robots.txt file. Therefore, if you prohibit indexing of a certain directory of your site in robots.txt, but the pages in this directory contain the following meta tag:
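<meta name="robots" content="index, follow">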

then these pages will still be indexed.

If the site is built on some CMS, then some of them have the ability to close the page for indexing using special options. In other cases, these meta tags will have to be inserted manually into the site pages.

In the next articles, we will take a closer look at the procedure for prohibiting indexing and everything connected with it (using the robots.txt file, as well as the noindex and nofollow tags).

Indexing and page dropping issues

There are many reasons why an Internet resource may not be indexed. Below we list the most common ones.

1. The robots.txt file is configured or written incorrectly.

2. The domain of your site was previously used for another site and has a bad history; most likely, some kind of filter was applied to it earlier. Most often, problems of this kind concern indexing by Yandex. The pages of the site may get into the index during the first indexing, then drop out completely and no longer be indexed. When you contact Yandex support, you will most likely be told to keep developing the site and everything will be fine.

But as practice shows, even after 6 months of publishing high-quality unique content on the site, there may be no positive movement. If you are in a similar situation and the site has not been indexed for 1-2 months, it is better to move it to a different domain. As a rule, after that everything falls into place and the pages of the site begin to be indexed.

3. Non-unique content. Add only unique material to the site. If your site contains a large amount of copy-paste, then do not be surprised that over time these pages may drop out of the index.

4. The presence of spam in the form of links. On some sites, pages are literally inundated with external links. The webmaster usually hosts all of this in order to make more money. However, the end result can be very sad - certain pages of the site and the entire site can be excluded from the index, or some other sanctions can be imposed.

5. The size of the article. If you look at the source code of any page on your site, you will see that the text of the article itself does not take up much space compared to the code of other elements (header, footer, sidebar, menu, etc.). If the article is too small, it can simply get lost in the code, and such a page may also have problems with uniqueness. Therefore, try to publish articles with a text volume of at least 2,000 characters; such content is unlikely to cause problems.

How to check site indexing

Now let's talk about how to check the indexing of your Internet resource and find out exactly how many pages are indexed.

1. First of all, try entering the address of the page you are interested in (or a fragment of its text) into a regular Google or Yandex search. The results should contain this page. If it is not there, the page is not indexed.

2. To check the indexing of all pages of a site in Yandex, it is enough to insert host: your-site.ru | host: www.your-site.ru and search. For Google, it is enough to insert into the search form site: your-site.ru

3. You can also check your site using a service such as pr-cy.ru. Everything is simple and understandable here. You just need to drive the address of your resource into the field located in the center, and then click the "Analyze" button. After the analysis, you will receive the results of the check and find out how many pages are indexed in a particular search engine (you can do this in the corresponding section called "Key site indicators").

4. If your site has been added to the Yandex Webmaster service, then there you can also track the indexing of the website pages by this search engine.

Quite often, a new site cannot be found in Yandex. Even if you type its name in the search box. The reasons for this can be different. Sometimes search engines simply do not yet know that a new resource has appeared. To figure out what's the matter and solve the problem, you need to register the site in Yandex Webmaster.

What is site indexing in Yandex

First, let's figure out how search engines generally find out about new sites or about changes to existing ones. Yandex has a special program called a search robot. This robot surfs the Internet looking for new pages. Sometimes it revisits old ones to check whether anything new has appeared on them.

When the robot finds a useful page, it adds it to its database. This database is called the search index. When we search for something, we see sites from this database. Indexing is the process of the robot adding new documents to it.

The robot cannot crawl the entire Internet every day - it does not have enough capacity for that. Therefore, it needs help: it needs to be told about new pages or about changes to old ones.

What is Yandex Webmaster and why is it needed

Yandex.Webmaster is an official service from Yandex. You need to add a site to it so that the robot knows about its existence. With its help, resource owners (webmasters) can prove that this is their site.

You can also see in the Webmaster:

  • when and where the robot has been;
  • which pages it has indexed and which it has not;
  • which search queries bring visitors to the site;
  • whether there are any technical errors.

Through this service, you can configure the site: set the region and product prices, and protect your texts from theft. You can ask the robot to revisit pages where you have made changes. Yandex Webmaster also makes it easier to move to https or to another domain.

How to add a new site to Yandex Webmaster

Go to the Webmaster's panel. Click Sign In. You can enter the login and password that you use to log into Yandex mail. If you don't have an account yet, you'll need to register.

After logging in, you will be taken to a page with a list of added resources. If you have not used the service before, the list will be empty. To add a new resource, click the "+" button.

On the next page, enter your site address and confirm adding it.

At the last stage, you need to verify your rights - to prove to Yandex that you are the owner. There are several ways to do this.

How to verify the rights to the site in Yandex Webmaster

The easiest way to verify rights in Yandex Webmaster is to add a file to the site. To do this, click on the "HTML file" tab.

A small file is downloaded. You will need this file now, so save it somewhere where you can see it. For example, on the Desktop. Don't rename the file! You don't need to change anything in it.

Now upload this file to your website. Usually, file managers are used for this, but for InSales users, none of this is necessary. Just go to the back office, click "Files". Then at the top of the page - "Add file". Select the file you downloaded earlier.

Then return to the Yandex.Webmaster panel and click the "Check" button. After successful confirmation of access rights, your site will appear in the list of added ones. Thus, you have informed Yandex Webmaster about the new site.

Meta tag Yandex Webmaster

Sometimes the method described above does not work, and the owners cannot verify the rights to the site in the Webmaster. In this case, you can try another way: add a line of code to the template.

In the Webmaster, go to the Meta Tag tab. You will see a line that needs to be added to the HTML code.

InSales users can contact technical support and ask to insert this code. This will be done as part of a free revision.

When they have done this, in the Webmaster, click the "Check" button. Congratulations, you have registered your site in the search engine!

Preconfiguring Yandex Webmaster

The site has been added to the search, now the robot will definitely come to you and index it. It usually takes up to 7 days.

Add link to sitemap

In order for the robot to index the resource faster, add the sitemap.xml file to the Webmaster. This file contains the addresses of all pages of the resource.

Online stores on InSales have this file already configured and should be added to the Webmaster automatically. If not, add a link to your sitemap.xml in the Indexing - Sitemap Files section.

Check robots.txt

The robots.txt file specifies the pages that the robot does not need to visit. These are shopping cart, checkout, back office and other technical documents.

InSales creates robots.txt by default, which does not need to be changed. Just in case, we recommend that you check if there are any errors in the robots. To do this, go to "Tools" - "Robots.txt Analysis".

Set site region

On the page "Site Information" - "Regionality" you can specify the region of the site. For online stores, these are the cities, regions and countries where purchased goods are delivered. If you don't have a store, but a directory or blog, then the whole world will be a region.

Set the sales region as shown in the screenshot:

What else is Webmaster useful for?

On the "Search queries" page, you can see the phrases that bring visitors to you from search.

The "Indexing" section displays information about when the robot visited the site and how many pages it found. The "Moving a site" subsection will help if you decide to install an SSL certificate and switch to https. The "Re-crawling pages" subsection is also extremely useful: in it, you can point the robot to pages where the information has changed, and on its next visit the robot will index them first.

On the "Products and Prices" page of the "Site Information" section, you can provide information about your online store. To do this, the resource must be configured to upload data about goods and prices in YML format. If configured correctly, the search results of product pages will display prices and shipping information.

If you want to improve the visibility of your company in Yandex services, you should use the "Useful services" section. In Yandex.Directory, you can specify the phone number and address of your store, opening hours. This information will be displayed directly in Yandex search results. It will also add you to Yandex.Maps.

Yandex.Metrica is another important tool for the owner of an Internet resource that displays traffic data. Statistics and dynamics of website traffic are displayed in tables, charts and graphs convenient for analysis.

After connecting to the Yandex.Webmaster and Yandex.Metrica services, you will receive a sufficient amount of information to manage the site's positions and traffic. These are indispensable tools for site owners who want to promote their resources in the most popular search engine in Russia.

The next step in website promotion is to register the site in Google's similar service, Search Console.