Best SEO for Your CDN

By Jonas Krummenacher

Updated on January 28, 2020

Search engine crawlers (also known as bots or spiders) scan your website whether you like it or not. They scan pretty much everything that's available, which is normally a good thing. Why is SEO with CDNs so important? Once you start using a CDN, the same content can be served from more than one domain. There is nothing wrong with that as long as search engines can tell which copy is the original. If that is not clearly declared, they may treat the CDN copy as duplicate content, which can hurt your rankings.

We offer two options for getting the best SEO results. Both avoid duplicate content and keep your site aligned with what search engines expect. Let's take a closer look at both solutions.

1. Canonical URLs

An extra HTTP header added to your Zone lets the crawler know that the content from the CDN is only a copy. Once the response carries a rel="canonical" header, we're on the safe side. Crawlers are aware that this is only a copy.

The rel="canonical" header will be applied to the entire Zone. If you already send a canonical header from your origin server, there is no need to enable it in the dashboard.
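To make the header concrete, here is a minimal Python sketch of how a canonical value for a CDN URL could be built. It assumes the common Link header form, `Link: <origin-url>; rel="canonical"`, that search engines document for HTTP-level canonicals. The hostnames `cdn.example.com` and `www.example.com` and the helper name `canonical_link_header` are hypothetical, and the exact header KeyCDN emits is not shown in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_link_header(cdn_url: str, origin_host: str) -> str:
    """Build a Link header value that points a CDN URL back to the origin copy.

    Illustrative only: origin_host is an assumed hostname for the origin site.
    """
    parts = urlsplit(cdn_url)
    # Swap only the host; keep the scheme, path and query, drop the fragment.
    origin_url = urlunsplit((parts.scheme, origin_host, parts.path, parts.query, ""))
    return f'<{origin_url}>; rel="canonical"'


# Hypothetical hostnames for a CDN Zone and its origin server.
header_value = canonical_link_header(
    "https://cdn.example.com/img/logo.png", "www.example.com"
)
print("Link:", header_value)
```

A crawler that fetches the image from the CDN hostname and sees this header can attribute the content to the origin URL instead of counting it as a second copy.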

In KeyCDN, the Canonical Header feature is enabled automatically when a new Zone is added. You can disable this setting if you don't want this response header added.

2. Robots.txt file

Search engines check for a robots.txt file at the root of a site. If the file is present, they follow its instructions; if no file is present, they scan everything. In KeyCDN, the Robots.txt feature is disabled automatically when a new Zone is added. This means the robots.txt file on the origin server is applied.

If this setting is enabled instead, the following robots.txt file is applied:

User-agent: *
Disallow: /

The first line defines which crawler the rule applies to. In the example above, the robots.txt applies to every crawler. User-agent: Googlebot would apply only to Googlebot.
The next line defines which paths are off limits to the crawler. Disallow: / tells the search engine not to crawl anything.
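The effect of these two lines can be checked with Python's standard `urllib.robotparser` module. The sketch below is only an illustration: the helper `is_allowed`, the crawler name `Bingbot` and the URL on `cdn.example.com` are assumptions chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt that is applied when the Robots.txt feature is enabled.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Variant that targets a single crawler, for comparison.
BLOCK_GOOGLEBOT = """\
User-agent: Googlebot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://cdn.example.com/img/logo.png"  # hypothetical Zone URL
print(is_allowed(BLOCK_ALL, "Googlebot", url))        # False
print(is_allowed(BLOCK_ALL, "Bingbot", url))          # False
print(is_allowed(BLOCK_GOOGLEBOT, "Googlebot", url))  # False
print(is_allowed(BLOCK_GOOGLEBOT, "Bingbot", url))    # True
```

With `User-agent: *`, every crawler is kept out. With `User-agent: Googlebot`, only Googlebot is blocked and other crawlers remain free to fetch the URL.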

To customize the default shown above, you can add your own rules with the Custom Robots.txt feature.