odd rabbit configuration sequence in yuixhr tests

Bug #883980 reported by Robert Collins on 2011-10-30

This bug affects 1 person

Affects           Status   Importance  Assigned to  Milestone
Launchpad itself  Triaged  High        Unassigned

Bug Description

With the following patch applied to LP I was able to determine that the rabbit configuration in the yuixhr appserver gets toggled off after starting enabled.

As we want to probe actions taking place in the subordinate appserver, we want to share some things:
- test db
- rabbit instance
- librarian
etc

From the attached trace, this is what happens:
- test runner starts (pid 1447) w/no rabbit
- test slave starts (pid 1687) w/ rabbit configured and running port 54056
- librarian is started (pid 1701) w/out rabbit
- librarian is reconfigured during startup (same pid no rabbit)
- test runner connects to oops queue to catch oops (from pid 1447) on rabbit 54056
- test slave attempts to raise an oops but rabbit is no longer configured

-> something during the slave startup is overriding the rabbit config to say it's not configured.
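One way to narrow down where the override happens, similar in spirit to the attached tracing patch (whose contents are not reproduced here), is to record every change to the setting together with the pid and the calling function. A minimal sketch; `TracedSetting`, `start_slave` and `layer_setup` are hypothetical names, not Launchpad code:

```python
import os
import traceback


class TracedSetting:
    """Hold one setting and record who changed it, and from which process."""

    def __init__(self, value=None):
        self._value = value
        self.history = []  # (pid, old value, new value, calling function)

    def get(self):
        return self._value

    def set(self, value):
        # The last two stack entries are the caller of set() and set() itself.
        caller = traceback.extract_stack(limit=2)[0]
        self.history.append((os.getpid(), self._value, value, caller.name))
        self._value = value


rabbit_host = TracedSetting()


def start_slave():
    # Hypothetical: the slave starts with rabbit configured.
    rabbit_host.set("localhost:54056")


def layer_setup():
    # Hypothetical: a later startup step clears the setting again.
    rabbit_host.set(None)


start_slave()
layer_setup()
for pid, old, new, caller in rabbit_host.history:
    print(f"pid {pid}: {caller} changed rabbit from {old!r} to {new!r}")
```

Recording the old value next to the new one makes the "enabled, then toggled off" pattern visible in a single trace line.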

triaging as high as AFAICT this is a preexisting defect in the yuixhr environment (and will affect any attempt to use rabbit features with it)

Robert Collins (lifeless) wrote on 2011-10-30:

#1

patch to trace oops config/rabbit connections (1.7 KiB, text/plain)

Robert Collins (lifeless) wrote on 2011-10-30:

#2

trace of oops config/rabbit connections (50.8 KiB, text/plain)

Robert Collins (lifeless) wrote on 2011-10-30:

#3

setting INTERACTIVE_TESTS=1 when running the test suite 'fixes' this by creating a rabbit within the testapp environment. its trace shows:
slave appserver rabbit - initially 59260
reconfigured to port 53162

and oopses raised in the appserver won't be seen by the parent test runner, so not a good solution, but makes me think I'm in the right area.

Robert Collins (lifeless) wrote on 2011-10-30:

#4

One thing that might make this easier is if the process controller layer used fixtures (with their details API) to expose log files etc from the subprocesses.
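A rough sketch of that idea, using only the standard library rather than the real fixtures package: a context manager starts the subprocess, keeps its log file, and exposes the log as a named "detail" that a test runner could attach to a failure. `SubprocessLogFixture` and `get_details` are hypothetical names that only imitate the shape of a details API:

```python
import os
import subprocess
import sys
import tempfile


class SubprocessLogFixture:
    """Run a child process and expose its log output as a named detail."""

    def __init__(self, args):
        self.args = args
        self._details = {}

    def __enter__(self):
        self._tmp = tempfile.TemporaryDirectory()
        self.log_path = os.path.join(self._tmp.name, "child.log")
        self._log = open(self.log_path, "w")
        self.process = subprocess.Popen(
            self.args, stdout=self._log, stderr=subprocess.STDOUT)
        # Register a callable so the log is only read when someone asks for it.
        self._details["subprocess-log"] = self._read_log
        return self

    def _read_log(self):
        with open(self.log_path) as log:
            return log.read()

    def get_details(self):
        # Must be called before the fixture is torn down.
        return {name: read() for name, read in self._details.items()}

    def __exit__(self, *exc_info):
        self.process.wait()
        self._log.close()
        self._tmp.cleanup()
        return False


child = [sys.executable, "-c", "print('rabbit is not configured')"]
with SubprocessLogFixture(child) as fixture:
    fixture.process.wait()
    details = fixture.get_details()
print(details["subprocess-log"])
```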

Robert Collins (lifeless) wrote on 2011-10-30:

#5

baseLayer.setUp() causes the valid rabbit config to be nuked.

Robert Collins (lifeless) wrote on 2011-10-30:

#6

(when called from start_testapp in runlaunchpad.py)

Robert Collins (lifeless) wrote on 2011-10-30:

#7

and hah. that resets the appserver config name
which means we're running a discrete config, won't be talking to the same librarian etc. It also means that slave appserver startups are going to be significantly more expensive than desired (we're starting a new DB clone etc).

It *also* means we're at non-trivial risk of skew with other utilities as well, because the config is being yanked around underfoot.
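To illustrate the failure mode described in the last few comments, here is a toy model, not Launchpad's actual config machinery. It assumes that switching the config instance name discards any settings pushed under the previous name; `Config`, `set_instance`, `base_layer_set_up` and the instance names are all hypothetical:

```python
class Config:
    """Hypothetical process-wide config with per-instance overlays."""

    def __init__(self, instance_name):
        self.instance_name = instance_name
        self.overlays = {}

    def set_instance(self, instance_name):
        # Switching instance drops everything pushed under the old name.
        self.instance_name = instance_name
        self.overlays = {}


def base_layer_set_up(config):
    # Stand-in for a layer setUp that unconditionally selects the
    # default test config, regardless of what the parent passed in.
    config.set_instance("testrunner")


# The slave starts with the config its parent chose, rabbit included.
config = Config("appserver-from-parent")
config.overlays["rabbit.host"] = "localhost:54056"
before = (config.instance_name, dict(config.overlays))

base_layer_set_up(config)
after = (config.instance_name, dict(config.overlays))

print("before:", before)
print("after: ", after)
```

In this model the slave ends up on a different instance name with no rabbit setting, which matches the symptom in the trace: configured at start, unconfigured by the time an oops is raised.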

Robert Collins (lifeless) wrote on 2011-10-30:

#8

All that said, I think the simplest thing to unbreak my branch that triggered this is to remove the INTERACTIVE_TESTS variable - it's not documented in-tree nor is it anything other than an optimisation to avoid having rabbitmq all the time (and my branch means we need it all the time).

That won't close this bug: but this bug can become 'oopses generated by slave appservers are not reported to the test driver'.

Robert Collins (lifeless) wrote on 2011-10-31:

#9

The way that other tests using slaves work is they run 'bin/run' not run-testapp. This passes in the config to use and things Just Work. Possibly the easiest fix is to change run_name to 'run' and delete the whole run-testapp infrastructure as redundant.
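The general pattern of a parent handing its config choice to a child process can be sketched as follows. This is not how bin/run is actually invoked; the environment variable `APP_CONFIG_NAME` and the function `start_child` are made up for illustration:

```python
import os
import subprocess
import sys


def start_child(config_name):
    """Start a child process that is told which config to use."""
    # Copy the parent's environment and add the (hypothetical) config variable.
    env = dict(os.environ, APP_CONFIG_NAME=config_name)
    child_code = "import os; print(os.environ['APP_CONFIG_NAME'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip()


seen_by_child = start_child("appserver-from-parent")
print("child is using config:", seen_by_child)
```

The point of the pattern is that the child never picks a config on its own, so it keeps talking to the same database, librarian and rabbit instance as the parent.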

Gary Poster (gary) wrote on 2011-11-03:

#10

run-testapp is there specifically to support the needs of yuixhr. bin/run didn't work as is: run-testapp emerged because of this. I strongly suspect that this is still the case.

Considerations for changing this include the following:

- Unlike all other LP tests, and inherent to their use cases, the browser determines when a yuixhr test begins and ends. It communicates test end to the server with xhr, so that the server can tear down any fixtures, and in particular reset the database. The server is the LP subprocess, against which the browser is running. That subprocess needs access to the DB test fixture that allows resetting the database. (It may also need access to other standard test fixtures for introspection or resetting.)

Other broad approaches that could address this problem other than having a run-testapp that starts its own fixtures include the following.

   * We abstract the code to reset the database out of the current db fixture. The subprocess uses these abstractions to control it. The parent process needs to be able to handle when changes happen out from underneath it. Any other fixture bits that the subprocess needs in the future get a similar treatment. This seems like the cleanest and best alternative.
   * The subprocess has a channel to the main process that lets it control the fixtures in the parent.
   * The parent process also has a webserver and the browser talks to it for server-side test setup and teardown.

- The yuixhr tests should be able to be run interactively to help with test creation and maintenance. Ideally they will be run as similarly as possible to how they are run in the test suite. Right now, particularly with the removal of the change in behavior that INTERACTIVE_TESTS caused, they are very close to identical. The first approach above seems to be the easiest to match to this goal, but it will still be difficult.
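A very rough sketch of the first approach listed above, with the database reset pulled out into a plain function that the subprocess calls when the browser signals the end of a test. Everything here is hypothetical (`FakeDatabase`, `reset_database`, `XhrSessionController`); a real version would receive the actions over xhr and operate on the shared test database:

```python
class FakeDatabase:
    """Hypothetical stand-in for the shared test database."""

    def __init__(self):
        self.rows = []
        self.reset_count = 0


def reset_database(db):
    # The reset logic lives in a plain function rather than inside a
    # fixture, so any process with a handle on the database can call it.
    db.rows.clear()
    db.reset_count += 1


class XhrSessionController:
    """Handle the setup/teardown requests the browser would send."""

    def __init__(self, db):
        self.db = db
        self.active_test = None

    def handle(self, action, test_name):
        if action == "setup":
            self.active_test = test_name
            self.db.rows.append(f"sample data for {test_name}")
        elif action == "teardown":
            reset_database(self.db)
            self.active_test = None
        else:
            raise ValueError(f"unknown action: {action}")


db = FakeDatabase()
controller = XhrSessionController(db)
controller.handle("setup", "test_example")
rows_during_test = list(db.rows)
controller.handle("teardown", "test_example")
print(rows_during_test, db.rows, db.reset_count)
```

The sketch leaves out the hard part named above: the parent process still has to cope with the database changing underneath it.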

As Robert knows, I prefer the current approach of reusing layer setup and teardown in run-testapp to solve these issues. As I know, Robert does not like it, and raises the concern that more services will require more and more code gluing fixtures and layers together. We haven't resolved this disagreement yet.
