SEO: seems like Google is not indexing properly the 14.04 Server Guide and Desktop guide pages

Bug #1378539 reported by Csipak Attila on 2014-10-07
This bug affects 2 people
Affects                Status         Importance   Assigned to          Milestone
Ubuntu Documentation   Fix Released   High         Gunnar Hjalmarsson
Ubuntu Server Guide    Fix Released   High         Unassigned

Bug Description

This looks like an SEO problem and not a content problem. Please forward it accordingly.

Take a look:

https://www.google.com/search?q=site:help.ubuntu.com/12.04/serverguide/

12.04 Server Guide: 144 results

https://www.google.com/search?q=site:help.ubuntu.com/14.04/serverguide/

14.04 Server Guide: only 3 results

Same searches on Bing:

http://www.bing.com/search?q=site%3Ahelp.ubuntu.com%2F12.04%2Fserverguide%2F

12.04 - 2480 results

http://www.bing.com/search?q=site%3Ahelp.ubuntu.com%2F14.04%2Fserverguide%2F

14.04 - 573 results

I don't know about you, but I'd rather have the official documentation turn up in the search results than some forum or Q&A site...
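
To repeat the comparison above for other releases or guides, the "site:" queries can be generated rather than typed by hand. This is only a minimal sketch: the helper name `site_query_url` is made up for illustration, and it percent-encodes the query the way the Bing links above do (Google accepts the encoded form as well). Result counts still have to be read off the search pages manually.

```python
from urllib.parse import urlencode


def site_query_url(search_base: str, version: str, guide: str = "serverguide") -> str:
    """Build a 'site:' search URL for one version of a guide on help.ubuntu.com.

    Hypothetical helper, not part of any Ubuntu or search-engine tooling.
    """
    query = f"site:help.ubuntu.com/{version}/{guide}/"
    # urlencode percent-encodes ':' and '/' in the query value.
    return f"{search_base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    for version in ("12.04", "14.04"):
        print(site_query_url("https://www.google.com/search", version))
        print(site_query_url("http://www.bing.com/search", version))
```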

description: updated
Doug Smythies (dsmythies) wrote :

I am not aware of anything we (the people who contribute to the Ubuntu Serverguide) can do about the issue raised in this bug report.
I also do not know who we would forward this to or what project it might better be set to.

@Csipak (or anybody else): Do you have any suggestions? If not, then I'll set the status to "Opinion".

Changed in serverguide:
status: New → Incomplete
status: Incomplete → New
Gunnar Hjalmarsson (gunnarhj) wrote :

I don't know right now either, but we should certainly make an effort to find out IMO. Noticed that it's just as bad for the desktop guide. Maybe an item for the next team meeting?

Changed in serverguide:
importance: Undecided → High
status: New → Confirmed
Changed in ubuntu-docs:
importance: Undecided → High
status: New → Confirmed
summary: SEO: seems like Google is not indexing properly the 14.04 Server Guide
- pages
+ and Desktop guide pages
Gunnar Hjalmarsson (gunnarhj) wrote :

This search:

https://www.google.com/search?q=site:help.ubuntu.com/14.04/ubuntu-help

results in two items, of which one is the "What's new" page. The contents of that page are significantly different compared to the 12.04 version.

So one theory is that the search engines notice that we publish several versions of pages with the same or very similar contents, and refuse to index more than one instance of each page. If that's the case, this is a huge problem.

As regards the desktop guide, you may also wonder why they seem to prefer the Chinese page names and descriptions...

Gunnar Hjalmarsson (gunnarhj) wrote :

Maybe a robots.txt file would help. I prepared the linked merge proposal, so we have something to talk about.

I noticed that there is a robots.txt file already:

https://help.ubuntu.com/robots.txt

It's not in the branch, and as far as I can tell it's useless.

What do you think? Worth a try?

Doug Smythies (dsmythies) wrote :

Gunnar,

I had thought of what you want to do with the robots.txt file; however, there is already one there.
I do not know whether one that we supply would get overwritten, or whether we could merge them somehow.

Also, my thinking was that we would disallow everything except the lts and stable directories, and thus the bots would always be pulling from the correct place. My thinking seems to be different than what you are proposing.

Gunnar Hjalmarsson (gunnarhj) wrote :

Hi Doug,

Yes, as I wrote in my previous comment, I know there is already one there. It seems to be intended for the help wiki, and currently it's probably useless. In effect my proposal tries to 'merge' them through this line:

Disallow: /community/?action=

which, unlike the current line, would work.

Most likely the current file would be overwritten. If we think this is the route to take, I can't see why we shouldn't make an attempt.

On 2014-10-09 05:38, Doug Smythies wrote:
> Also, my thinking was that we would disallow everything except the lts
> and stable directories and thus the bots would always be pulling from
> the correct place. My thinking seems to be different than what you are
> proposing.

Right, that thought struck me just as I had made the MP. But in that case we should just allow 'stable', shouldn't we, or else we would keep duplicating in a way that the search engines seem to dislike.
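
The two variants discussed above (allow 'lts' and 'stable', or allow 'stable' only) can be compared with a rough sketch like the following. It is not the content of the merge proposal, which is not quoted in this report; the helper names `build_robots_txt` and `is_crawlable` and the exact rules are assumptions for illustration. Note that a blanket "Disallow: /" as used here would also block the wiki under /community/, so a real file would need additional rules.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(allowed_dirs):
    """Return robots.txt text that allows only the given top-level directories.

    Hypothetical helper; the directory names are assumptions based on the
    'lts' and 'stable' directories mentioned in the discussion.
    """
    lines = ["User-agent: *"]
    # Allow lines go first: Python's parser applies the first matching rule.
    lines += [f"Allow: /{name}/" for name in allowed_dirs]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Check a URL against robots.txt text using the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    lts_and_stable = build_robots_txt(["lts", "stable"])
    stable_only = build_robots_txt(["stable"])
    for path in ("/stable/serverguide/", "/lts/serverguide/", "/14.04/serverguide/"):
        url = "https://help.ubuntu.com" + path
        print(path, is_crawlable(lts_and_stable, url), is_crawlable(stable_only, url))
```

The standard-library parser is only an approximation of how a given search engine evaluates rules, so a check like this shows the intent of the file rather than guaranteeing crawler behaviour.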

Gunnar Hjalmarsson (gunnarhj) wrote :

I made a few adjustments to the MP which I think combine our thoughts.

Changed in ubuntu-docs:
status: Confirmed → Fix Released
Changed in serverguide:
status: Confirmed → Fix Released
Doug Smythies (dsmythies) wrote :

It appears it did not work. In the past, the published help.ubuntu.com has always updated by this time. We'll give it a few more hours.

Gunnar Hjalmarsson (gunnarhj) wrote :
Changed in ubuntu-docs:
status: Fix Released → In Progress
Changed in serverguide:
status: Fix Released → In Progress
Changed in ubuntu-docs:
assignee: nobody → Gunnar Hjalmarsson (gunnarhj)
Gunnar Hjalmarsson (gunnarhj) wrote :

Fixed by Canonical.

Changed in ubuntu-docs:
status: In Progress → Fix Released
Changed in serverguide:
status: In Progress → Fix Released