Comment 27 for bug 1272083

Xiaoming Wang (xwang2713) wrote : Re: [Bug 1272083] Re: HPCC Charm initial check-in for review

Hi Matt,

Here are two typical cases in which roxie fails to start:
1) Not enough system memory. We recommend 4GB; with 2GB it may be possible
to start it and run some basic functions.
2) Network resources with juju-local.
    In some environments juju-local doesn't start the Linux instance with
the correct network resources.
    Many things are missing under /proc/sys/net/core/. When
/proc/sys/net/core/rmem_max is missing, the roxie process fails to start
with:
    "/proc/sys/net/core/rmem_max value 0 is less than 131071"
    "EXCEPTION: (1455): System socket max read buffer is less than 131071"
    (A quick check for this is sketched below.)

I will check your case when I have the roxie.log and the file list under
/proc/sys/net/core.
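
For reference, the commands I would run to collect those (a sketch; adjust
the unit name to whichever unit failed):

    # From the Juju client machine:
    juju ssh cluster1/1 'ls -l /proc/sys/net/core/'
    juju ssh cluster1/1 'tar czf /tmp/roxie-logs.tar.gz /var/log/HPCCSystems/myroxie/'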

Thanks

On Wed, Mar 5, 2014 at 1:29 PM, Xiaoming Wang <email address hidden> wrote:

> Hi Matt,
>
> I thought about "set -e". It will cause the shell script to terminate if
> any statement returns a non-zero value.
> I understand the code is cleaner that way, but given the time limit
> (I heard March 7 is the deadline for this charm to complete) I'd
> rather defer adding "set -e". We do have some code that parses return
> codes (grep, etc.), so it will require more testing if I make the change.
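>
> For example, a minimal sketch of the interaction I mean (the file path is
> illustrative, not taken from the charm):
>
>     set -e
>     # grep returns non-zero when it finds no match, so under "set -e"
>     # this line would abort the whole script instead of letting us
>     # branch on the result:
>     grep -q roxie /etc/HPCCSystems/environment.xml
>
>     # The usual workaround is to test the status in a conditional,
>     # since commands in an "if" condition are exempt from "set -e":
>     if grep -q roxie /etc/HPCCSystems/environment.xml; then
>         echo "roxie is configured"
>     fi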
>
> We prefer minimal code changes at this stage of the game unless it is a
> must-fix issue, for example hpcc failing to start (roxie fails to start
> in your log).
>
>
> Let me know. I have included my manager, Ort Stuart, in our discussion.
>
> Thanks
>
>
>
>
> On Wed, Mar 5, 2014 at 1:07 PM, Xiaoming Wang <email address hidden> wrote:
>
>> Hi Matt,
>>
>> Thanks for the log. Could you give the output of /proc/sys/net/core/ on
>> unit 1 or 3, where the roxie process failed?
>> Also the logs under /var/log/HPCCSystems/myroxie.
>>
>> I get a similar error in two of my LXC environments, which is due to a
>> problem with juju-local or some kind of system configuration in the
>> juju-local packages. None of my other team members have this problem.
>> I am still investigating the reason; I suspect it may be related to my
>> VirtualBox settings.
>>
>>
>> On Wed, Mar 5, 2014 at 12:45 PM, Matt Bruzek <
>> <email address hidden>> wrote:
>>
>>> I have attached the juju status log, where I could see cluster1/0,
>>> cluster1/1 and cluster1/3 with a hook in error state. Please note
>>> cluster1/2 did not appear to have a failed hook, but I included the core
>>> files for comparison.
>>>
>>>
>>> ** Attachment added: "Log files from the hpcc units in error."
>>>
>>> https://bugs.launchpad.net/charms/+bug/1272083/+attachment/4008636/+files/hpcc_logs.tar.gz
>>>
>>> --
>>> You received this bug notification because you are subscribed to the bug
>>> report.
>>> https://bugs.launchpad.net/bugs/1272083
>>>
>>> Title:
>>> HPCC Charm initial check-in for review
>>>
>>> Status in Juju Charms:
>>> Incomplete
>>>
>>> Bug description:
>>> A new Juju charm, hpcc, is submitted for review.
>>> The README.md has the information on how to use the charm.
>>>
>>> For the charm code:
>>> config.yaml: there are parameters to control which HPCCSystems
>>> version the user can install, and other parameters for how to
>>> configure the cluster.
>>> bin/: there are some helper scripts we put here, mainly for
>>> re-configuring the HPCC cluster, since the default configuration may
>>> not meet user needs.
>>> We understand this is probably not the recommended way for a Juju
>>> charm, but we want to give users a convenient way to access the
>>> tools. We are open to discussing these during the review process.
>>> icon.svg: we haven't decided which icon we should use; it is going
>>> through our internal process.
>>>
>>> Currently the HPCC charm doesn't have relations to any other charms.
>>>
>>
>>
>