---
title: "Robots.txt"
slug: "robotstxt"
description: "A robots.txt file is used to communicate with web crawlers and other automated agents about which pages of your knowledge base should not be indexed. "
tags: ["Robot.txt"]
updated: 2026-02-03T15:22:34Z
published: 2026-02-03T15:22:34Z
---

# Robots.txt

## What is a Robots.txt file?

A Robots.txt file is a plain-text file that tells web crawlers and other automated agents which pages of your knowledge base should not be crawled and indexed. It contains rules specifying which pages may be accessed by which crawlers.

> [!NOTE]
> For more information, read this [help article](https://developers.google.com/search/docs/crawling-indexing/robots/robots_txt) from Google.

---

## Accessing Robots.txt in Document360

To access Robots.txt in Document360:

1. Navigate to **Settings** in the left navigation bar.
2. Go to **Knowledge base site** > **Article settings & SEO** > **SEO** tab.
3. Locate **Robots.txt** and click **Edit**.

   The **Robots.txt settings** panel will appear.
4. Type in your desired rules.
5. Click **Update**.

![Settings page showing SEO options and robots.txt configuration for a knowledge base.](https://cdn.document360.io/860f9f88-412e-4570-8222-d5bf2f4b7dd1/Images/Documentation/robots.txt.png)

---

### Use cases of Robots.txt

A Robots.txt file can block a folder, file (such as a PDF), or specific file extensions from being crawled.
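As a minimal sketch of the folder and file cases, the following Python snippet feeds a small rule set to the standard library's `urllib.robotparser` and checks which URLs a crawler may fetch. The paths `/private/` and `/downloads/guide.pdf`, the `example.com` host, and the bot name `ExampleBot` are hypothetical placeholders, not Document360 defaults. Blocking by file extension usually relies on wildcard patterns, which this parser is not documented to support, so that case is left out here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one folder and one PDF file for every crawler.
FOLDER_AND_FILE_RULES = """\
User-agent: *
Disallow: /private/
Disallow: /downloads/guide.pdf
"""

folder_file_parser = RobotFileParser()
folder_file_parser.parse(FOLDER_AND_FILE_RULES.splitlines())


def is_allowed(path: str) -> bool:
    """Return True if the placeholder crawler may fetch the given path."""
    return folder_file_parser.can_fetch("ExampleBot", "https://example.com" + path)


for path in ("/private/notes", "/downloads/guide.pdf", "/docs/intro"):
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Rules are matched as path prefixes, so `Disallow: /private/` covers everything inside that folder.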

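You can also slow down the rate at which bots crawl your site by adding a `Crawl-delay` directive to your Robots.txt file. This is useful when your site is experiencing high traffic. Not every crawler honors this directive; Googlebot, for example, ignores it.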

```plaintext
User-agent: *
Crawl-delay: 10
```
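One way to confirm that a crawl delay is readable by a well-behaved bot is to parse the rule with Python's `urllib.robotparser`. This is only a sketch of how a client might read the value; the bot name `ExampleBot` is a hypothetical placeholder.

```python
from urllib.robotparser import RobotFileParser

CRAWL_DELAY_RULES = """\
User-agent: *
Crawl-delay: 10
"""

delay_parser = RobotFileParser()
delay_parser.parse(CRAWL_DELAY_RULES.splitlines())

# crawl_delay() returns the delay in seconds that applies to the given
# user agent, or None when no Crawl-delay rule applies to it.
delay_seconds = delay_parser.crawl_delay("ExampleBot")
print("Seconds to wait between requests:", delay_seconds)
```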

---

### Restricting crawlers from accessing admin data

```plaintext
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
```

- `User-agent: *` - Specifies that the rules apply to every bot.
- `Disallow: /admin/` - Restricts crawlers from accessing admin data.
- `Sitemap: https://example.com/sitemap.xml` - Tells bots where to find the sitemap. This makes crawling easier, as the sitemap lists all the URLs of the site.
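The effect of these three directives can be checked with a short Python sketch based on `urllib.robotparser`. The bot name `ExampleBot` and the two test URLs are hypothetical; `site_maps()` assumes Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

ADMIN_RULES = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

admin_parser = RobotFileParser()
admin_parser.parse(ADMIN_RULES.splitlines())

# Anything under /admin/ is off limits; the rest of the site stays crawlable.
admin_ok = admin_parser.can_fetch("ExampleBot", "https://example.com/admin/users")
docs_ok = admin_parser.can_fetch("ExampleBot", "https://example.com/docs/intro")

# site_maps() lists every Sitemap URL declared in the file (None if there are none).
sitemaps = admin_parser.site_maps()

print("admin page allowed:", admin_ok)
print("docs page allowed:", docs_ok)
print("sitemaps:", sitemaps)
```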

---

### Restricting a specific search engine from crawling

```plaintext
User-agent: Bingbot
Disallow: /
```

> The above Robots.txt file blocks Bingbot from crawling the entire site.

- `User-agent: Bingbot` - Specifies the crawler from the Bing search engine.
- `Disallow: /` - Restricts Bingbot from crawling the site.
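A rule group addressed to one crawler leaves every other crawler unaffected. The sketch below illustrates this with `urllib.robotparser`, using short user-agent tokens (`Bingbot`, `Googlebot`) rather than full browser-style user-agent strings; the `example.com` URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

BINGBOT_RULES = """\
User-agent: Bingbot
Disallow: /
"""

bing_parser = RobotFileParser()
bing_parser.parse(BINGBOT_RULES.splitlines())

home = "https://example.com/"

# Only the named crawler is blocked; crawlers without a matching
# group (and with no "User-agent: *" fallback) are allowed.
bing_allowed = bing_parser.can_fetch("Bingbot", home)
google_allowed = bing_parser.can_fetch("Googlebot", home)

print("Bingbot allowed:", bing_allowed)
print("Googlebot allowed:", google_allowed)
```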

---

#### Best Practices

- **Include links** to the most important pages.
- **Block links** to pages that do not provide any value.
- Add the sitemap location in the **Robots.txt** file.
- A site can have only one Robots.txt file. Check the basic guidelines in the [Google Search Central](https://developers.google.com/search/docs/advanced/robots/create-robots-txt#format_location) documentation for more information.

> [!NOTE]
> A web crawler, also known as a Spider or Spiderbot, is a program or script that automatically navigates the web and collects information about various websites. Search engines like Google, Bing, and Yandex use crawlers to replicate a site's information on their servers.
> 
> Crawlers open new tabs and scroll through website content, just like a user viewing a webpage. Additionally, crawlers collect data or metadata from the website and other entities (such as links on a page, broken links, sitemaps, and HTML code) and send it to the servers of their respective search engine. Search engines use this recorded information to index search results effectively.

---

### FAQs

#### How do I remove my Document360 project from the Google search index?

To exclude the entire project from the Google search index:

1. Navigate to **Settings** in the left navigation bar in the **Knowledge base portal**.
2. Go to **Knowledge base site** > **Article settings & SEO** > **SEO** tab.
3. Locate **Robots.txt** and click **Edit**.
4. Paste the following code:

```plaintext
User-agent: Googlebot
Disallow: /
```

5. Click **Update**.
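The value after `Disallow:` matters here: an empty value places no restriction on the crawler, while `/` blocks the whole site. The following sketch compares the two with Python's `urllib.robotparser`; the helper name `googlebot_can_fetch` and the `example.com` URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

EMPTY_DISALLOW = "User-agent: Googlebot\nDisallow:\n"
DISALLOW_ALL = "User-agent: Googlebot\nDisallow: /\n"


def googlebot_can_fetch(rules: str, url: str = "https://example.com/docs/intro") -> bool:
    """Parse the given rules and report whether Googlebot may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("Googlebot", url)


print("empty Disallow allows crawling:", googlebot_can_fetch(EMPTY_DISALLOW))
print("'Disallow: /' allows crawling:", googlebot_can_fetch(DISALLOW_ALL))
```

Keep in mind that these rules govern crawling; pages that are already indexed may take time to drop out of search results.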

#### How do I prevent tag pages from being indexed by search engines?

To exclude the tag pages from the search engines:

1. Navigate to **Settings** in the left navigation bar.
2. Go to **Knowledge base site** > **Article settings & SEO** > **SEO** tab.
3. Locate **Robots.txt** and click **Edit**.
4. Paste the following code:

```plaintext
User-agent: *
Disallow: /docs/en/tags/
```

5. Click **Update**.
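Because the rule ends with a trailing slash, it matches only URLs inside the tags folder, not other pages whose path merely begins with similar text. A small sketch with `urllib.robotparser` shows the prefix matching; the article paths and the bot name `ExampleBot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

TAG_RULES = """\
User-agent: *
Disallow: /docs/en/tags/
"""

tag_parser = RobotFileParser()
tag_parser.parse(TAG_RULES.splitlines())


def tag_rule_allows(path: str) -> bool:
    """Return True if the placeholder crawler may fetch the given path."""
    return tag_parser.can_fetch("ExampleBot", "https://example.com" + path)


for path in ("/docs/en/tags/release-notes", "/docs/en/tags-overview", "/docs/en/getting-started"):
    print(path, "->", "allowed" if tag_rule_allows(path) else "blocked")
```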
