Also, looking at http://ci.jujucharms.com/job/charm-storage/ I see only a
single success at a906cf9, and all earlier revisions are marked as failures,
so I'm not sure whether that is relevant or not.
There does seem to be something fishy going on with Mongo on Trusty. I see
the Ping Batcher operating:
2017-10-02 09:11:30 TRACE juju.state.presence pingbatcher.go:327
0xc42008b950 [cef56d] recorded 1 pings for 1 ping slot(s) and 1 fields in
0.010s
However, in the actual database there is no 'presence.pings' collection:
juju:PRIMARY> show collections
presence.beings
presence.seqs
system.indexes
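The same check from Python, for what it's worth (a hedged sketch: it assumes
pymongo 3.x, an unauthenticated mongod reachable on localhost, and a database
named 'juju', which may not match a real controller):

# List the presence-related collections via pymongo.
from pymongo import MongoClient

db = MongoClient('localhost', 27017)['juju']
print([name for name in db.collection_names() if name.startswith('presence')])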
While in the shell I did some testing, and it seems that in Mongo 2.4
update(..., {upsert: true}) will *not* create a document that doesn't
already exist if one of the operations is a $bit operation. Specifically,
doing:
juju:PRIMARY> db.presence.test.find()
juju:PRIMARY> db.presence.test.update({_id: "blah:1245"}, {$set: {"slot": 1245}, $bit: {"alive.0": {"or": 1}}}, {"upsert": true})
juju:PRIMARY> db.presence.test.find()
# nothing returned, nothing created
juju:PRIMARY> db.presence.test.update({_id: "blah:1245"}, {$set: {"slot": 1245}, $inc: {"alive.0": NumberInt(1)}}, {"upsert": true})
juju:PRIMARY> db.presence.test.find()
{ "_id" : "blah:1245", "alive" : { "0" : 1 }, "slot" : 1245 }
# $inc *did* create the document
juju:PRIMARY> db.presence.test.update({_id: "blah:1245"}, {$set: {"slot": 1245}, $bit: {"alive.0": {"or": 2}}}, {"upsert": true})
juju:PRIMARY> db.presence.test.find()
{ "_id" : "blah:1245", "alive" : { "0" : 3 }, "slot" : 1245 }
# And now the $bit operation is able to increment it.
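The same behaviour should be visible from Python as well; something like this
(untested pymongo sketch, using a scratch 'presence.test' collection and a
local unauthenticated mongod) ought to show it:

# Rough pymongo equivalent of the shell session above.
from pymongo import MongoClient

coll = MongoClient('localhost', 27017)['juju']['presence.test']

# On Mongo 2.4 this upsert appears to be a no-op: $bit alone does not
# create the missing document.
coll.update_one(
    {'_id': 'blah:1245'},
    {'$set': {'slot': 1245}, '$bit': {'alive.0': {'or': 1}}},
    upsert=True,
)
print(coll.find_one({'_id': 'blah:1245'}))  # None on 2.4

# $inc does create the document, and a subsequent $bit on the
# now-existing document behaves normally.
coll.update_one(
    {'_id': 'blah:1245'},
    {'$set': {'slot': 1245}, '$inc': {'alive.0': 1}},
    upsert=True,
)
print(coll.find_one({'_id': 'blah:1245'}))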
I'll play with it a bit more, but I would have thought we had a Trusty-based
run of the full test suite, and that should have failed as well, rather than
just a CI test.
On Mon, Oct 2, 2017 at 1:07 PM, John Meinel <email address hidden> wrote:
> I don't see any successes for multi-series-charm, so I'm not sure that
> this patch is actually the cause there.
>
> The failure around 'agent-lost' or 'agent-is-not-communicating-with-the-server'
> could indeed be caused by the change from this patch, as it deals
> specifically with how we detect whether agents are alive and
> communicating.
>
> I haven't managed to actually run the test yet, but just doing "bootstrap
> && deploy" I don't see the failure. I've been trying with Xenial and LXD; I
> may try with Trusty to see if it makes a difference. (It's possible that
> Mongo 2.4 on Trusty doesn't support the $bit operation the way we think it
> does, even though the test suite was passing.)
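>
> One quick sanity check before digging further (hedged and untested, and it
> assumes an unauthenticated local mongod, which a real controller may not
> allow): confirm which server version Trusty is actually running, e.g.
>
> # Print the MongoDB server version via pymongo.
> from pymongo import MongoClient
>
> print(MongoClient('localhost', 27017).server_info()['version'])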
>
> On Mon, Oct 2, 2017 at 12:07 PM, John Meinel <email address hidden>
> wrote:
>
>> Given the need to run against $ENV, I'm guessing this expects to have a
>> controller already bootstrapped? There shouldn't be anything in the patch
>> that changes agents (vs possibly changing how controllers operate).
>>
>> I'm missing something in how to run the tests manually. If I try:
>>
>> ./assess_storage lxd ~/dev/go/bin/juju ~/dev/tmp/juju_test
>>
>> Then it warns me that:
>> /home/jameinel/dev/go/src/github.com/juju/juju/acceptancetests/utility.py:88: UserWarning: Not a directory /home/jameinel/dev/tmp/jujutest
>>   warnings.warn('Not a directory {}'.format(maybe_dir))
>> and fails with:
>>   File "/home/jameinel/dev/go/src/github.com/juju/juju/acceptancetests/jujupy/configuration.py", line 37, in get_selected_environment
>>     'Environment "{}" does not exist.'.format(selected))
>> jujupy.configuration.NoSuchEnvironment: Environment "lxd" does not exist.
>>
>> However, if I first do "juju bootstrap lxd" I then get:
>> /home/jameinel/dev/go/src/github.com/juju/juju/acceptancetests/utility.py:94: UserWarning: Directory '/home/jameinel/dev/tmp/jujutest' has existing contents.
>>   'Directory {!r} has existing contents.'.format(maybe_dir))
>>
>> and a final failure of:
>> jujupy.configuration.NoSuchEnvironment: Environment "lxd" does not exist.
>>
>> Now, trying to look at the code, I see it trying to do:
>>
>> def get_juju_home():
>>     home = os.environ.get('JUJU_HOME')
>>     if home is None:
>>         home = os.path.join(os.environ.get('HOME'), '.juju')
>>     return home
>> ...
>> def get_environments_path(juju_home):
>>     return os.path.join(juju_home, 'environments.yaml')
>> ...
>> def get_environments():
>>     """Return the environments for juju."""
>>     home = get_juju_home()
>>     with open(get_environments_path(home)) as env:
>>         return yaml.safe_load(env)['environments']
>>
>> So it seems that this code assumes you have a leftover Juju-1.x setup
>> lying around with a matching environments.yaml file?
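>>
>> If that's right, then presumably (an untested guess on my part) dropping a
>> stub environments.yaml under $JUJU_HOME would get past this check, e.g.:
>>
>> import os
>>
>> import yaml
>>
>> # Untested guess: get_environments() above only needs an 'environments'
>> # mapping with a key matching the name given on the command line
>> # ('lxd' here); the body of the entry is just a placeholder.
>> stub = {'environments': {'lxd': {'type': 'lxd'}}}
>>
>> juju_home = os.environ.get('JUJU_HOME') or os.path.join(
>>     os.environ['HOME'], '.juju')
>> if not os.path.isdir(juju_home):
>>     os.makedirs(juju_home)
>> with open(os.path.join(juju_home, 'environments.yaml'), 'w') as f:
>>     yaml.safe_dump(stub, f)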
>>
>>
>> On Mon, Oct 2, 2017 at 11:37 AM, John Meinel <email address hidden>
>> wrote:
>>
>>> Note that you have to be logged in to see the 'charm-storage' job.
>>> Otherwise those links come back as 404.
>>>
>>> I'm investigating this now.
>>>
>>> John
>>>
>>> On Thu, Sep 28, 2017 at 7:50 PM, Burton Swan <email address hidden>
>>> wrote:
>>>
>>>> ** Description changed:
>>>>
>>>> - Juju version: 2.3-alpha1
>>>> + Juju version: 2.3-alpha1 and 2.2 series.
>>>>
>>>> PR7868 is highly likely the cause of this issue:
>>>> https://github.com/juju/juju/pull/7868
>>>>
>>>> Using charm-storage test as an example:
>>>> http://ci.jujucharms.com/job/charm-storage/
>>>>
>>>> Re-run the last successful job, revision build 5743, commit a906cf9,
>>>> works well:
>>>> http://ci.jujucharms.com/job/charm-storage/2027/
>>>>
>>>> Then re-run the first failed job, revision build 5745, commit
>>>> 774f2f0, failed again:
>>>> http://ci.jujucharms.com/job/charm-storage/2028/
>>>> 2017-09-28 14:11:49 ERROR ('0', 'agent is not communicating with the
>>>> server', 1149.777357)
>>>>
>>>> commit 774f2f0 also triggered other failures, e.g.:
>>>> http://ci.jujucharms.com/job/multi-series-charm/
>>>>
>>>> This issue can also be manually reproduced on the latest Juju
>>>> 2.3-alpha1 by running:
>>>> $ ./assess_storage.py $ENV $JUJU_BIN $JUJU_DATA
>>>>
>>>> ** Description changed:
>>>>
>>>> Juju version: 2.3-alpha1 and 2.2 series.
>>>>
>>>> PR7868 is highly likely the cause of this issue:
>>>> https://github.com/juju/juju/pull/7868
>>>> +
>>>> + On Juju 2.2 series it is PR7863:
>>>> + https://github.com/juju/juju/pull/7863
>>>>
>>>> Using charm-storage test as an example:
>>>> http://ci.jujucharms.com/job/charm-storage/
>>>>
>>>> Re-run the last successful job, revision build 5743, commit a906cf9,
>>>> works well:
>>>> http://ci.jujucharms.com/job/charm-storage/2027/
>>>>
>>>> Then re-run the first failed job, revision build 5745, commit
>>>> 774f2f0, failed again:
>>>> http://ci.jujucharms.com/job/charm-storage/2028/
>>>> 2017-09-28 14:11:49 ERROR ('0', 'agent is not communicating with the
>>>> server', 1149.777357)
>>>>
>>>> commit 774f2f0 also triggered other failures, e.g.:
>>>> http://ci.jujucharms.com/job/multi-series-charm/
>>>>
>>>> This issue can also be manually reproduced on the latest Juju
>>>> 2.3-alpha1 by running:
>>>> $ ./assess_storage.py $ENV $JUJU_BIN $JUJU_DATA
>>>>
>>>> --
>>>> You received this bug notification because you are subscribed to the bug
>>>> report.
>>>> https://bugs.launchpad.net/bugs/1720155
>>>>
>>>> Title:
>>>> Juju agent is not communicating with the server.
>>>>
>>>> To manage notifications about this bug go to:
>>>> https://bugs.launchpad.net/juju/+bug/1720155/+subscriptions
>>>>
>>>
>>>
>>
>