SEO: seems like Google is not indexing properly the 14.04 Server Guide and Desktop guide pages

Bug #1378539 reported by Csipak Attila
This bug affects 2 people
Affects               Status        Importance  Assigned to         Milestone
Ubuntu Documentation  Fix Released  High        Gunnar Hjalmarsson
Ubuntu Server Guide   Fix Released  High        Unassigned

Bug Description

This looks like an SEO problem and not a content problem. Please forward it accordingly.

Take a look:

https://www.google.com/search?q=site:help.ubuntu.com/12.04/serverguide/

12.04 Server Guide: 144 results

https://www.google.com/search?q=site:help.ubuntu.com/14.04/serverguide/

14.04 Server Guide: only 3 results

Same searches on Bing:

http://www.bing.com/search?q=site%3Ahelp.ubuntu.com%2F12.04%2Fserverguide%2F

12.04 - 2480 results

http://www.bing.com/search?q=site%3Ahelp.ubuntu.com%2F14.04%2Fserverguide%2F

14.04 - 573 results

I don't know about you, but me, I'd rather have the official documentation turn up in the search results than some forum / QA site...

description: updated
Doug Smythies (dsmythies) wrote :

I am not aware of anything we (people who contribute to the Ubuntu Serverguide) can do about the issue raised in this bug report.
I also do not know who we would forward this to, or which project it would be better assigned to.

@Csipak (or anybody else): Do you have any suggestions? If not, then I'll set the status to "Opinion".

Changed in serverguide:
status: New → Incomplete
status: Incomplete → New
Gunnar Hjalmarsson (gunnarhj) wrote :

I don't know right now either, but we should certainly make an effort to find out IMO. Noticed that it's just as bad for the desktop guide. Maybe an item for next team meeting?

Changed in serverguide:
importance: Undecided → High
status: New → Confirmed
Changed in ubuntu-docs:
importance: Undecided → High
status: New → Confirmed
summary: SEO: seems like Google is not indexing properly the 14.04 Server Guide
- pages
+ and Desktop guide pages
Gunnar Hjalmarsson (gunnarhj) wrote :

This search:

https://www.google.com/search?q=site:help.ubuntu.com/14.04/ubuntu-help

results in two items, of which one is the "What's new" page. The contents of that page are significantly different from the 12.04 version.

So one theory is that the search engines notice that we publish several versions of pages with the same or very similar contents, and refuse to index more than one instance of each page. If that's the case, this is a huge problem.

As regards the desktop guide, you may also wonder why the search engines seem to prefer the Chinese page names and descriptions...

Gunnar Hjalmarsson (gunnarhj) wrote :

Maybe a robots.txt file would help. I prepared the linked merge proposal, so we have something to talk about.

I noticed that there is a robots.txt file already:

https://help.ubuntu.com/robots.txt

It's not in the branch, and as far as I can tell it's useless.

What do you think? Worth a try?

Doug Smythies (dsmythies) wrote :

Gunnar,

I thought of what you want to do with the robots.txt file; however, there is already one there.
I do not know whether one that we supply would get overwritten, or whether we could merge them somehow.

Also, my thinking was that we would disallow everything except the lts and stable directories, and thus the bots would always be pulling from the correct place. My thinking seems to be different from what you are proposing.

Gunnar Hjalmarsson (gunnarhj) wrote :

Hi Doug,

Yes, as I wrote in my previous comment, I know there is already one there. It seems to be intended for the help wiki, and currently it's probably useless. In effect my proposal tries to 'merge' them through this line:

Disallow: /community/?action=

which, unlike the current line, would work.

Most likely the current file would be overwritten. If we think this is the route to take, I can't see why we shouldn't make an attempt.

On 2014-10-09 05:38, Doug Smythies wrote:
> Also, my thinking was that we would disallow everything except the lts
> and stable directories and thus the bots would always be pulling from
> the correct place. My thinking seems to be different from what you are
> proposing.

Right, that thought struck me just as I had made the MP. But in that case we should just allow 'stable', shouldn't we, or else we would keep duplicating in a way that the search engines seem to dislike.
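
For discussion, here is a minimal sketch of the "allow only some directories" idea, checked with Python's standard urllib.robotparser. It is not the content of the merge proposal (which is not shown in this thread); the helper names build_robots_txt and is_crawlable are hypothetical, and the assumption is that everything outside the allowed directories should be blocked for all user agents.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(allowed_dirs):
    """Return robots.txt text that allows only the given top-level directories.

    Hypothetical helper for illustration; the directory names are assumptions.
    """
    lines = ["User-agent: *"]
    # Allow rules go first: urllib.robotparser applies the first matching rule.
    lines += [f"Allow: /{name}/" for name in allowed_dirs]
    # Everything not explicitly allowed above is disallowed.
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_crawlable(robots_txt, url):
    """Check whether a generic crawler ("*") may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


# Variant discussed above: only 'stable' is open to crawlers.
robots_txt = build_robots_txt(["stable"])
print(robots_txt)

for url in (
    "https://help.ubuntu.com/stable/serverguide/",
    "https://help.ubuntu.com/14.04/serverguide/",
    "https://help.ubuntu.com/12.04/serverguide/",
):
    print(url, is_crawlable(robots_txt, url))
```

Passing ["lts", "stable"] instead gives the other variant (both directories open). Note that a blanket "Disallow: /" would also block anything else served from the same host unless it gets its own Allow line, so that would need checking before deploying something like this.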

Gunnar Hjalmarsson (gunnarhj) wrote :

I made a few adjustments to the MP which I think combine our thoughts.

Changed in ubuntu-docs:
status: Confirmed → Fix Released
Changed in serverguide:
status: Confirmed → Fix Released
Doug Smythies (dsmythies) wrote :

It appears it did not work. In the past, the published help.ubuntu.com has always been updated by this time. We'll give it a few more hours.

Gunnar Hjalmarsson (gunnarhj) wrote :
Changed in ubuntu-docs:
status: Fix Released → In Progress
Changed in serverguide:
status: Fix Released → In Progress
Changed in ubuntu-docs:
assignee: nobody → Gunnar Hjalmarsson (gunnarhj)
Gunnar Hjalmarsson (gunnarhj) wrote :

Fixed by Canonical.

Changed in ubuntu-docs:
status: In Progress → Fix Released
Changed in serverguide:
status: In Progress → Fix Released