Consider a different approach to libxml_disable_entity_loader(true) in init.php

Bug #1340151 reported by Dan Poltawski
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Mahara
Fix Released
High
Aaron Wells
1.10
Fix Released
High
Unassigned
1.9
Won't Fix
High
Unassigned
15.04
Fix Released
High
Unassigned
15.10
Fix Released
High
Unassigned
16.04
Fix Released
High
Unassigned

Bug Description

Unfortunately it seems like using libxml_disable_entity_loader(true) in init.php is unkind to other applications living on the same system.

PHP Bug https://bugs.php.net/bug.php?id=64938 is the heart of the problem - the use of this setting leaks between different threads and by setting it for the duration of every single Mahara request this bug comes into play much more easily.

The other problem is https://bugs.php.net/bug.php?id=62577 which means that simplexml_load_file() will not even load *local* files off disk. For example, this would break in Mahara even though no entities come into play:

$xml = simplexml_load_file(get_config('libroot').'/db/install.xml');

In Moodle we've been warned on one of our issues that users have seen this problem in the wild with Mahara, I didn't find an issue reported to you about it, so i'll copy and paste the report from our tracker here:

"We had experience with this problem on an upgrade to Mahara 1.7, when https://bugs.launchpad.net/mahara/+bug/1047111 was fixed. The approach was the same as is suggested in this ticket. This caused seemingly random failures in Moodle (which is run on the same webserver) whenever there was usage of Mahara. The root cause was not obvious, nor well advertised with the Mahara fix, so it took awhile to finally implement a fix. In the meantime, significant functionality is broken. Unfortunately, there's really no way around this unless you switch from using mod_php."

Yesterday whilst debugging some code with Yuliya we realised she also was encountering this horrible combination of bugs affecting her Moodle install so it seems this is not a theoretical problem.

In Moodle we have decided to use the same approach of many other projects and enable/disable the loader around vulnerable code to reduce the chance of this combination of bugs affecting other applications or uses of the the simplexml_load_file:
http://cgit.drupalcode.org/drupal/commit/?id=b912710
https://github.com/symfony/symfony/blob/8ef8a1d289a6ce454b7c79baeddbfb45e4af6191/src/Symfony/Component/Config/Util/XmlUtils.php#L41
http://www.mediawiki.org/wiki/XML_External_Entity_Processing

summary: - Consider a different approach for libxml_disable_entity_loader(true)
+ Consider a different approach to libxml_disable_entity_loader(true) in
+ init.php
Robert Lyon (robertl-9)
Changed in mahara:
milestone: none → 1.10.0
importance: Undecided → High
status: New → Confirmed
Changed in mahara:
assignee: nobody → Aaron Wells (u-aaronw)
Aaron Wells (u-aaronw)
Changed in mahara:
milestone: 1.10.0 → 1.10.1
Revision history for this message
Aaron Wells (u-aaronw) wrote :

We've had multiple reports of Mahara interfering with Joomla sites running on the same server also. I bet this PHP bug is the cause of that as well.

https://mahara.org/interaction/forum/topic.php?id=6525&offset=0&limit=10#post27651

Revision history for this message
Kristina Hoeppner (kris-hoeppner) wrote :

Hi Dan,

Do you happen to have a test case? We have trouble replicating the issue as our servers are not set up "the wrong way". ;-)

Thank you
Kristina

Revision history for this message
Yuliya Bozhko (yuliya.bozhko) wrote :

Hi Kristina,

It breaks for me when I install Mahara and then try to install Moodle on the same server.

Anything that tries to load XML on the same server as Mahara will fail.

Maybe Dan can provide a better test case.

Yuliya

Revision history for this message
Aaron Wells (u-aaronw) wrote :

Hi Yuliya,

Any specific operation in Moodle that causes problems?

I seem to recall that maybe it was creating (or restoring) course backups that could foul up? Because that's actually using XML, which is what gets affected by this bug.

Cheers,
Aaron

Revision history for this message
Aaron Wells (u-aaronw) wrote :

Discussed this with Yuliya on IRC. She says you can replicate like this:

1. Install Mahara & Moodle on the same web server

2. Access the Mahara home page (or any Mahara page, because the offending code is in init.php)

3. Run the Moodle code checker.

This will cause Moodle to crash with this error: Warning: simplexml_load_file(): I/O warning : failed to load external entity "/home/yuliya/repos/moodle/lib/thirdpartylibs.xml" in /home/yuliya/repos/moodle/local/codechecker/locallib.php on line 194

Revision history for this message
Aaron Wells (u-aaronw) wrote :

Dan helpfully posted a bunch of links in the bug description showing how other products have addressed this problem. I find the Mediawiki one to be the most concise and explanatory: http://www.mediawiki.org/wiki/XML_External_Entity_Processing

Based on that, we need this plan of action:

1. Remove the part from init.php where we do libxml_disable_entity_loader(true);

2. Provide wrapper functions for DOMDocument -> loadXML() and simplexml_load_string(), which will run libxml_disable_entity_loader() before the vulnerable code, and then set it back to the old value immediately after.

3. Replace all existing calls to DOMDocument -> loadXML() and simplexml_load_string() with calls to our wrapper functions.

XMLReader->XML() is also vulnerable, but we don't use that in the Mahara codebase anywhere.

We'll need to test this to make sure that it no longer crashes Moodle, *and* that it doesn't leave us vulnerable to the initual security issue we were trying to fix.

Revision history for this message
Robert Lyon (robertl-9) wrote :

Looking at this patch to PHP

http://git.php.net/?p=php-src.git;a=commitdiff;h=de31324c221c1791b26350ba106cc26bad23ace9

It looks like the problem has been fixed in ext/libxml/libxml.c

Not sure when that fix will make it out to stable releases though

Revision history for this message
Aaron Wells (u-aaronw) wrote :

I can't wait! ;)

I see it in the release notes for:
- 5.6.7
- 5.5.23
- 5.4.39

Although, since we claim to still support PHP 5.3, we still need to release a fix for this bug anyway.

Cheers,
Aaron

tags: added: no-behat-needed
Aaron Wells (u-aaronw)
no longer affects: mahara/1.8
Revision history for this message
Mahara Bot (dev-mahara) wrote : A patch has been submitted for review

Patch for "15.10_STABLE" branch: https://reviews.mahara.org/5737

Revision history for this message
Mahara Bot (dev-mahara) wrote :

Patch for "master" branch: https://reviews.mahara.org/5738

Revision history for this message
Aaron Wells (u-aaronw) wrote :

Okay, I wrote up a little test script to check for this. Ubuntu 14.04 is still on PHP 5.5.9, so it is still affected by this bug (not patched in 5.5 until 5.5.23).

    <pre><?php
    $before = libxml_disable_entity_loader(true);
    libxml_disable_entity_loader($before);
    var_dump($before);
    exit();

This will basically find out the state of the XML entity loader (true means you can load them; false means you can't), and print that out.

When I run this on a CLI script, or in a newly restarted Apache site, it always returns false.

When I install Mahara and load up the Mahara front page, and then access this page via the web browser, it intermittently returns true. The intermittency is because this setting leaks across processes (and maybe also threads, but my Apache is unthreaded). So in my case, if I hit a process that had previously served Mahara, it came back true, otherwise, it came back false.

Tweaking my Apache config file /etc/apache2/mods-enabled/mpm_prefork.conf to force Apache to use only one workerprocess, I found that it returned true every time after I hit Mahara.

<IfModule mpm_prefork_module>
# StartServers 5
# MinSpareServers 5
# MaxSpareServers 10
# MaxRequestWorkers 150
# MaxConnectionsPerChild 0
        StartServers 1
        MinSpareServers 1
        MaxSpareServers 1
        MaxConnectionsPerChild 0
        MaxRequestWorkers 1
        ServerLimit 1
</IfModule>

Note that if you use this configuration to test, the Mahara front page takes a long time to load! :-D Because the request for every image, CSS, and JS file must be made one at a time. I worked around that by calling the front page via CURL, which doesn't automatically attempt to load any assets.

After I loaded the code in patch 5738 and rebooted Apache, I found that after hitting the front page of Mahara, the state of libxml_disable_entity_loader() did not change! So I think we can call that a success. Although to be more thorough I should also hit some of the code that temporarily enables it.

Cheers,
Aaron

Revision history for this message
Aaron Wells (u-aaronw) wrote :

Actually now the patch still causes this bug, but only after you've done something that uses XML (like an export). I'll patch it to fix that...

Revision history for this message
Aaron Wells (u-aaronw) wrote :

Okay, fixed a couple problems in the code, so that now we always set the state of libxml_disable_entity_loader() back to what it was initially, unless the script crashes unexpectedly before we can do that.

Now it passes my tests. The state of the entity loader remains the same before and after running a Mahara script. If anyone wants to try to replicate my tests using my script from comment 11, here's the stops:

1. Have an affected PHP version installed (you can do php -v at the commandline, or run phpinfo() in a web script, to check this). See comment 8 for affected versions.

2. Copy my test script into a file called "test.php", and put it into your web root.

3. Restart Apache.

4. Load up my test.php script in your web browser. Reload it a few times. Note what it says (it will probably be "false" every time).

5. Install Mahara

6. Export a Leap2a archive from Mahara

7. Load up test.php again, and see what it says now. Refresh it a dozen or so times in case your Apache is using multiple workers or threads.

Expected result: The value you get from test.php in step 7 should be the same as the value you got in step 4.

Actual result (before patch): The value you got from test.php in step 7 was always "true".

As a further test you can alter test.php so that it leaves the the entity loader set to "true". Then revert it back to its original state that merely checks the state of the entity loader. You should now find that reloading test.php always shows "true". Now try running a Leap2a export from Mahara again, and then load up test.php after. It should again continue to show "true". Mahara should not have changed the current setting of libxml_disable_entity_loader().

Cheers,
Aaron

Revision history for this message
Mahara Bot (dev-mahara) wrote : A change has been merged

Reviewed: https://reviews.mahara.org/5738
Committed: https://git.mahara.org/mahara/mahara/commit/76a2c6c9893539070db47747c8614a7f63676512
Submitter: Robert Lyon (<email address hidden>)
Branch: master

commit 76a2c6c9893539070db47747c8614a7f63676512
Author: Robert Lyon <email address hidden>
Date: Wed Nov 18 21:14:26 2015 +1300

Bug 1340151: dealing with libxml_disable_entity_loader

Set the state of the libxml entity loader & use internal
errors settings, back to what they were before we called
them, as a workaround to https://bugs.php.net/bug.php?id=64938,
which causes them to be shared by all threads in a single
process.

behatnotneeded

Change-Id: I0720146b1e91c24095a18636e09981830ef4ce8f
Signed-off-by: Robert Lyon <email address hidden>

Revision history for this message
Mahara Bot (dev-mahara) wrote : A patch has been submitted for review

Patch for "15.04_STABLE" branch: https://reviews.mahara.org/5769

Revision history for this message
Mahara Bot (dev-mahara) wrote :

Patch for "1.10_STABLE" branch: https://reviews.mahara.org/5771

Revision history for this message
Mahara Bot (dev-mahara) wrote : A change has been merged

Reviewed: https://reviews.mahara.org/5737
Committed: https://git.mahara.org/mahara/mahara/commit/1580c0a7b73218e9d65cd097e423635f8d43f232
Submitter: Robert Lyon (<email address hidden>)
Branch: 15.10_STABLE

commit 1580c0a7b73218e9d65cd097e423635f8d43f232
Author: Robert Lyon <email address hidden>
Date: Wed Nov 18 21:14:26 2015 +1300

Bug 1340151: dealing with libxml_disable_entity_loader

Set the state of the libxml entity loader & use internal
errors settings, back to what they were before we called
them, as a workaround to https://bugs.php.net/bug.php?id=64938,
which causes them to be shared by all threads in a single
process.

behatnotneeded

Change-Id: I0720146b1e91c24095a18636e09981830ef4ce8f
Signed-off-by: Robert Lyon <email address hidden>

Revision history for this message
Mahara Bot (dev-mahara) wrote :

Reviewed: https://reviews.mahara.org/5769
Committed: https://git.mahara.org/mahara/mahara/commit/ca2306b7818a39fb20b3537f5f0452cde709fa33
Submitter: Robert Lyon (<email address hidden>)
Branch: 15.04_STABLE

commit ca2306b7818a39fb20b3537f5f0452cde709fa33
Author: Robert Lyon <email address hidden>
Date: Wed Nov 18 21:14:26 2015 +1300

Bug 1340151: dealing with libxml_disable_entity_loader

Set the state of the libxml entity loader & use internal
errors settings, back to what they were before we called
them, as a workaround to https://bugs.php.net/bug.php?id=64938,
which causes them to be shared by all threads in a single
process.

behatnotneeded

Change-Id: I0720146b1e91c24095a18636e09981830ef4ce8f
Signed-off-by: Robert Lyon <email address hidden>

Revision history for this message
Mahara Bot (dev-mahara) wrote :

Reviewed: https://reviews.mahara.org/5771
Committed: https://git.mahara.org/mahara/mahara/commit/a2149215b87059d1307de8ba27836811450724d7
Submitter: Robert Lyon (<email address hidden>)
Branch: 1.10_STABLE

commit a2149215b87059d1307de8ba27836811450724d7
Author: Robert Lyon <email address hidden>
Date: Wed Nov 18 21:14:26 2015 +1300

Bug 1340151: dealing with libxml_disable_entity_loader

Set the state of the libxml entity loader & use internal
errors settings, back to what they were before we called
them, as a workaround to https://bugs.php.net/bug.php?id=64938,
which causes them to be shared by all threads in a single
process.

behatnotneeded

Change-Id: I0720146b1e91c24095a18636e09981830ef4ce8f
Signed-off-by: Robert Lyon <email address hidden>

Changed in mahara:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.