Ignore duplicates in CSV upload

Bug #731647 reported by Kristina Hoeppner
This bug affects 3 people
Affects: Mahara
Status: Invalid
Importance: Low
Assigned to: Unassigned

Bug Description

When an institution wants to upload their users via a csv file, but some of their users already have a Mahara account, i.e. the email address already exists in Mahara, an error message is returned and no new user accounts are created.

Instead, only these duplicates should be ignored and all other accounts should be created. Thus, when a csv file is uploaded and existing accounts / email addresses are encountered, convert the error message into a warning that is printed to the screen and keep going with the account creation.

Tags: csvupload
Kristina Hoeppner (kris-hoeppner) wrote :

Preferably, there's also a check box for the admin to decide whether existing user details (name, password, login) are updated from the csv file or not. I.e.:

Scenario 1: Ignore duplicates in CSV file
- email <email address hidden> already exists and belongs to Student Test
- existing record is kept without any change
- new accounts are created
- warning is issued that <email address hidden> already existed and no change was made

Scenario 2: Update duplicates with CSV file information
- admin clicks the setting to update existing accounts with new information from the CSV file
- before admin continues he sees a pop-up message asking whether he is sure that he wants to proceed because if duplicates are detected, the information from the CSV file will overwrite them (username, name, ...)
- email <email address hidden> already exists and belongs to Student Test
- existing record shall be updated to reflect the correct name of the user as it appears in an external directory, e.g. Student Test-QA
- the CSV file lists <email address hidden> for Student Test-QA
- because the admin approved the setting to update duplicates, the <email address hidden> account is updated with the new information
- new accounts are created
- after the upload is complete, admin receives warning messages on the screen alerting him to which accounts have been updated
- admin can save these warning messages as txt file for future reference, e.g. if a user has a question why his login details changed.
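The two scenarios above can be sketched as a single import routine with an "update existing" switch. This is a minimal, language-neutral sketch in Python, not Mahara's PHP code: the function name `import_users`, the `ImportResult` class, the CSV column names, and the in-memory `accounts` dict (standing in for the user table) are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class ImportResult:
    created: list = field(default_factory=list)
    updated: list = field(default_factory=list)
    warnings: list = field(default_factory=list)


def import_users(csv_text, accounts, update_existing=False):
    """Import users from CSV text into `accounts`, a dict keyed by email.

    Hypothetical helper: `accounts` stands in for the real user table.
    """
    result = ImportResult()
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 of the file is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        email = row["email"].strip().lower()
        if email not in accounts:
            # New user: create the account in both scenarios.
            accounts[email] = dict(row, email=email)
            result.created.append(email)
        elif update_existing:
            # Scenario 2: overwrite stored details with the CSV values.
            accounts[email].update(row, email=email)
            result.updated.append(email)
            result.warnings.append(
                f"Line {line_no}: {email} already existed and was updated."
            )
        else:
            # Scenario 1: keep the existing record and only warn.
            result.warnings.append(
                f"Line {line_no}: {email} already existed and no change was made."
            )
    return result
```

Joining `result.warnings` with newlines and writing them to a file would give the admin the txt record mentioned in the last point. The confirmation pop-up is a UI concern and is not modelled here.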

Kristina Hoeppner (kris-hoeppner) wrote :

Francois: This bug may not be bite-sized anymore ;-)

tags: removed: bite-sized
Kristina Hoeppner (kris-hoeppner) wrote :

Some more information from Piers if we want to make it work with SAML authentication as well (a number of these things are important for http://wikieducator.org/LMS-MyPortfolio_Interoperability_Project and will be looked at when architecting that).

For the auto creation of accounts to work effectively with SAML, several
things need to happen:
* when logged in via SAML and no account exists, users need to be guided
through the signup process, with pre-populated values from the SAML
assertions - on completion of registration, the SAML auth and Mahara
accounts are linked.
* for existing users - a login-link process needs to be built whereby, when a
user logs in via SAML and there is no linked account, they are
prompted to log in manually and then asked if they wish to make the
association between the two accounts.
* in order to make the above work, there are likely to be some core
changes to the Mahara login process as well as the SAML auth plugin.
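
The decision described in the first two points can be sketched roughly as follows. This is a language-neutral sketch in Python, not code from Mahara or its SAML plugin. The names `NextStep` and `next_step_for_saml_login` are hypothetical, and the way an "existing user" is recognised (matching on the email address from the SAML assertion) is an assumption, since the comment above does not say how that would be detected.

```python
from enum import Enum


class NextStep(Enum):
    LOG_IN = "log in to the linked account"
    SIGN_UP = "guide through signup with pre-populated values"
    OFFER_LINK = "prompt for manual login, then offer to link accounts"


def next_step_for_saml_login(saml_id, saml_email, links, accounts_by_email):
    """Decide what should happen after a successful SAML login.

    `links` maps SAML identifiers to Mahara usernames and
    `accounts_by_email` maps email addresses to usernames. Both are
    hypothetical stand-ins for database lookups.
    """
    if saml_id in links:
        # The SAML identity is already linked to a Mahara account.
        return NextStep.LOG_IN
    if saml_email in accounts_by_email:
        # Assumption: an email match means an unlinked existing account.
        return NextStep.OFFER_LINK
    # No account at all: start the pre-populated signup process.
    return NextStep.SIGN_UP
```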

Changed in mahara:
milestone: 1.4.0 → none
tags: added: csvupload
removed: csvimport
Aaron Wells (u-aaronw) wrote :

Also related to this: when you try to upload a file that contains duplicate users, it generates a warning message:

[WAR] e8 (lib/mahara.php:1123) Object of class stdClass could not be converted to string
Call stack (most recent first):

    log_message("Object of class stdClass could not be converted to...", 8, true, true, "/home/aaronw/www/mahara/htdocs/lib/mahara.php", 1123) at /home/aaronw/www/mahara/htdocs/lib/errors.php:446
    error(4096, "Object of class stdClass could not be converted to...", "/home/aaronw/www/mahara/htdocs/lib/mahara.php", 1123, array(size 3)) at Unknown:0
    sprintf("Line %s of the file specifies a remote username "%...", 4, "Polly", object(stdClass)) at Unknown:0
    call_user_func_array("sprintf", array(size 4)) at /home/aaronw/www/mahara/htdocs/lib/mahara.php:1123
    format_langstring("Line %s of the file specifies a remote username "%...", array(size 3), "en.utf8") at /home/aaronw/www/mahara/htdocs/lib/mahara.php:496
    get_string_location("uploadcsverrorremoteusertaken", "admin", array(size 3)) at /home/aaronw/www/mahara/htdocs/lib/mahara.php:293
    get_string("uploadcsverrorremoteusertaken", "admin", 4, "Polly", object(stdClass)) at /home/aaronw/www/mahara/htdocs/admin/users/uploadcsv.php:338
    uploadcsv_validate(object(Pieform), array(size 16)) at Unknown:0
    call_user_func_array("uploadcsv_validate", array(size 2)) at /home/aaronw/www/mahara/htdocs/lib/pieforms/pieform.php:1324
    Pieform->validate(array(size 16)) at /home/aaronw/www/mahara/htdocs/lib/pieforms/pieform.php:492
    Pieform->__construct(array(size 4)) at /home/aaronw/www/mahara/htdocs/lib/pieforms/pieform.php:161
    Pieform::process(array(size 4)) at /home/aaronw/www/mahara/htdocs/lib/pieforms/pieform.php:71
    pieform(array(size 4)) at /home/aaronw/www/mahara/htdocs/admin/users/uploadcsv.php:646

We should clean that up.

Aaron Wells (u-aaronw) wrote :

Additionally, the error message that gets displayed to the user contains some "br" tags that get printed directly. That needs to be cleaned up as well:

Line 2 of the file specifies the email address "<email address hidden>" that is already taken by another user.<br> Line 2 of the file specifies a remote username "fake" that is already taken by the user "".<br> Line 2 of the file specifies the username "paula" that already exists.<br> Line 3 of the file specifies the email address "<email address hidden>" that is already taken by another user.<br> Line 3 of the file specifies a remote username "petra" that is already taken by the user "".<br> Line 3 of the file specifies the username "petra" that already exists.<br> Line 4 of the file specifies the email address "<email address hidden>" that is already taken by another user.<br> Line 4 of the file specifies a remote username "Polly" that is already taken by the user "".<br> Line 4 of the file specifies the username "polly" that already exists.
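
One general way to avoid this class of problem is to escape each message on its own and only then join the messages with the line-break markup, so the separator is never escaped. The sketch below is in Python rather than Mahara's PHP, and `render_messages` is a hypothetical name. It illustrates the idea and does not describe how Mahara builds this message.

```python
from html import escape


def render_messages(messages):
    """Escape each message separately, then join with literal <br> tags."""
    return "<br>".join(escape(message) for message in messages)


def render_messages_broken(messages):
    """Join first and escape afterwards: the <br> separators get escaped too."""
    return escape("<br>".join(messages))
```

With the second variant the browser shows the text "<br>" between messages, which matches the symptom above.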

Sarah Capps (sarah-capps) wrote :

Hello,

We recently upgraded to Mahara 1.7 and I noticed that the above error, where the "br" tags are printing directly, is occurring. Is this something that is happening across the board? Will this be (or has this been) addressed in any subsequent releases?

Thank you,

-Sarah

Kristina Hoeppner (kris-hoeppner) wrote :

Hello Sarah,

We have not yet addressed the <br>. The message that is generated doesn't handle HTML code well.

It might be similar to bug #946880.

Robert Lyon (robertl-9) wrote :

Have added a patch to stop the escaping of <br> tags

https://reviews.mahara.org/#/c/2721

The other error I couldn't duplicate - it may have already been fixed.

Changed in mahara:
status: Confirmed → In Progress
assignee: nobody → Robert Lyon (robertl-9)
Mahara Bot (dev-mahara) wrote : A change has been merged

Reviewed: https://reviews.mahara.org/2721
Committed: http://gitorious.org/mahara/mahara/commit/ed76087663c5972003483d44d53f97b94eca58bc
Submitter: Robert Lyon (<email address hidden>)
Branch: master

commit ed76087663c5972003483d44d53f97b94eca58bc
Author: Robert Lyon <email address hidden>
Date: Wed Nov 20 10:06:42 2013 +1300

Unescaping the <br> tag in warning message (bug #731647)

- also the bug mentions in comment #4 an Object warning
but I couldn't duplicate in testing with v1.8

Change-Id: I59960f9f717154d710de844546cef9cda29f17f3
Signed-off-by: Robert Lyon <email address hidden>

Robert Lyon (robertl-9)
Changed in mahara:
status: In Progress → Fix Committed
milestone: none → 1.9.0
Robert Lyon (robertl-9)
Changed in mahara:
status: Fix Committed → Fix Released
Juan Carrera (carreraj) wrote :

The fix released is only for the escaped <br> issue. The process still doesn't ignore existing users.

Aaron Wells (u-aaronw) wrote :

Good point, Juan. Changing this bug back to "In Progress".

Changed in mahara:
status: Fix Released → In Progress
status: In Progress → Confirmed
milestone: 1.9.0 → none
Stacey Walker (stacey) wrote :

We have experienced a similar issue.

When uploading users, if duplicate emails exist but the usernames don't match, the system reports a non-recoverable error in the server logs but presents only a simple "no users uploaded" type of error in the UI. It gives no further explanation of what happened or what was incorrect in the file, so the administrator cannot correct the CSV file.

This means that the administrative user has to contact server admins to identify the error and correct their file accordingly. This is especially problematic because the non-recoverable error prevents any other genuine accounts without errors from being created.

It'd be better if Mahara provided clearer error reporting for the administrative user in the UI rather than failing and only providing this in the backend.
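
As a rough illustration of that kind of reporting, a validation pass could collect one line-numbered message per conflict and hand the whole list back to the UI instead of failing on the first problem. This is a sketch in Python under stated assumptions, not Mahara's code: `find_email_conflicts` and the `owner_by_email` lookup are hypothetical, and the message wording is borrowed from the error text quoted earlier in this report.

```python
def find_email_conflicts(rows, owner_by_email):
    """Return a message for every row whose email belongs to another username.

    `rows` is a list of dicts with "username" and "email" keys, and
    `owner_by_email` maps existing email addresses to usernames. Both
    are hypothetical stand-ins for the parsed CSV and the user table.
    """
    problems = []
    # Line 1 of the file is the header, so data rows start at line 2.
    for line_no, row in enumerate(rows, start=2):
        owner = owner_by_email.get(row["email"])
        if owner is not None and owner != row["username"]:
            problems.append(
                f"Line {line_no} of the file specifies the email address "
                f'"{row["email"]}" that is already taken by another user.'
            )
    return problems
```

An empty list would mean the file has no such conflicts. A non-empty list can be shown to the administrator as is.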

Robert Lyon (robertl-9) wrote :

I don't believe this is still a problem.

Uploading a CSV file that contains some new users and some existing users works as expected:

- when not enabling the 'Update users' option we get error messages about what went wrong
- when enabling the 'Update users' option we get information about what users were added and what users were updated

I didn't get any site error when uploading a CSV where one line contained the existing username of one user and the existing email of another user. I got the expected message:

Line 4 of the file specifies the email address "<email address hidden>" that is already taken by another user.

So this all looks to be working now.

Changed in mahara:
status: Confirmed → Invalid
assignee: Robert Lyon (robertl-9) → nobody