s390x: erlang.cookie creation fails when changing file permissions

Bug #1808984 reported by James Page
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu on IBM z Systems
Fix Released
High
bugproxy
elixir-lang (Ubuntu)
Invalid
Undecided
Unassigned
erlang (Ubuntu)
Fix Released
Undecided
Unassigned
rabbitmq-server (Ubuntu)
Invalid
Undecided
Unassigned

Bug Description

Ubuntu: disco development

On a new deployment, erlang will create a cookie with a random secret value for use; however it must have 0400 permissions otherwise erlang will refuse to use it.

On s390x the automatic cookie creation code fails with:

   args: []
    format: "Failed to change mode: einval"
    label: {error_logger,error_msg}
2018-12-18 13:52:08.629474 crash_report #{label=>{proc_lib,crash},report=>[[{initial_call,{auth,init,['Argument__1']}},{pid,<0.58.0>},{registered_name,[]},{error_info,{error,"Failed to change mode: einval",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}},{ancestors,[net_sup,kernel_sup,<0.46.0>]},{message_queue_len,0},{messages,[]},{links,[<0.56.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,987},{stack_size,27},{reductions,1164}],[]]}
2018-12-18 13:52:08.629867 supervisor_report #{label=>{supervisor,start_error},report=>[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{"Failed to change mode: einval",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}},{offender,[{pid,undefined},{id,auth},{mfargs,{auth,start_link,[]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
2018-12-18 13:52:08.634421 supervisor_report #{label=>{supervisor,start_error},report=>[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,{shutdown,{failed_to_start_child,auth,{"Failed to change mode: einval",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}}}},{offender,[{pid,undefined},{id,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
2018-12-18 13:52:08.636693 crash_report #{label=>{proc_lib,crash},report=>[[{initial_call,{application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}},{pid,<0.45.0>},{registered_name,[]},{error_info,{exit,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Failed to change mode: einval",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}}}}},{kernel,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,138}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}},{ancestors,[<0.44.0>]},{message_queue_len,1},{messages,[{'EXIT',<0.46.0>,normal}]},{links,[<0.44.0>,<0.43.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,987},{stack_size,27},{reductions,225}],[]]}
2018-12-18 13:52:08.640322 std_info #{label=>{application_controller,exit},report=>[{application,kernel},{exited,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Failed to change mode: einval",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}}}}},{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{\"Failed to change mode: einval\",[{auth,init_cookie,0,[{file,\"auth.erl\"},{line,286}]},{auth,init,1,[{file,\"auth.erl\"},{line,140}]},{gen_server,init_it,2,[{file,\"gen_server.erl\"},{line,374}]},{gen_server,init_it,6,[{file,\"gen_server.erl\"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,249}]}]}}}}},{kernel,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Failed to change mode: einval",[{aut

Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done

Works fine on all other architectures so suspect something endian related.

Revision history for this message
James Page (james-page) wrote :

Might be related:

Dec 18 14:14:35 rmq-more-tesitng rabbitmq[24111]: warning: Elixir is running in a system with a different endianness than the one its source code was compiled in. Please make sure Elixir and all source files were compiled in a machine with the same endianness as the current one: big

Revision history for this message
James Page (james-page) wrote :

Attempting to resolve by making elixir Arch: any

https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3568/+packages

However tests fail on s390x which probably tells us something interesting.

Revision history for this message
James Page (james-page) wrote :

All of the errors are related to file permissions; however I do think this is something in erlang rather than elixir.

James Page (james-page)
description: updated
Frank Heimes (fheimes)
tags: added: s390x
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Undecided → Medium
assignee: nobody → bugproxy (bugproxy)
Revision history for this message
James Page (james-page) wrote :

FWIW this issue is blocking migration of any python related updates in disco development:

http://people.canonical.com/~ubuntu-archive/proposed-migration/update_excuses_by_team.html#ubuntu-openstack

We have a number of openstack projects with autopkgtests which use rabbitmq-server which are currently failing on s390x.

Revision history for this message
James Page (james-page) wrote :

I think the Elixir warning is unrelated, but problem does need address; the unit test suite errors are symptomatic of the same underlying issue effecting RMQ.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Medium → High
bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-174200 severity-high targetmilestone-inin1804
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

create_cookie(Name) ->
    Seed = abs(erlang:monotonic_time()
               bxor erlang:unique_integer()),
    Cookie = random_cookie(20, Seed, []),
    case file:open(Name, [write, raw]) of
        {ok, File} ->
            R1 = file:write(File, Cookie),
            ok = file:close(File),
            R2 = file:raw_write_file_info(Name, make_info(Name)),
            case {R1, R2} of
                {ok, ok} ->
                    ok;
                {{error,Reason}, _} ->
                    {error,
                     lists:flatten(
                       io_lib:format("Failed to write to cookie file '~ts': ~p", [Name, Reason]))};
                {ok, {error, Reason}} ->
                    {error, "Failed to change mode: " ++ atom_to_list(Reason)}
            end;
        {error,Reason} ->
            {error,
             lists:flatten(
               io_lib:format("Failed to create cookie file '~ts': ~p", [Name, Reason]))}
    end.

So, R2 = file:raw_write_file_info(Name, make_info(Name)), is failing.
raw_write_file_info is obsolete, and is a wrapper around write_file_info now.
Digging more.... maybe i could invoke an erlang expert.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

http://paste.ubuntu.com/p/GmXsSrZD7C/

To me it looks like int_to_int32bytes() wrapper got dropped around Mode, Gid, Uid. Inside the write_file_info implementation.

So maybe make_info(Name) now needs to do that..... Checking.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Hmm... i think i'm out of ideas, cause it looks like make_info(Name) does use the right mode=8#400.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

targetmilestone-inin1804 -> this affects 1904 only at the moment, the in-development Disco Dingo series.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

refactored the code into a small hello.erl, and indeed file mode setting fails on s390x. without doing anything else.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

stracing that, yields:

[pid 62574] chown(".erlang.cookie", 442314011, 1047536456) = -1 EPERM (Operation not permitted)

which is odd, given that owner is not changed....

whereas on amd64 it has

[pid 14595] chown(".erlang.cookie", -1, -1 <unfinished ...>
[pid 14595] <... chown resumed> ) = 0

From code, -1 is converted from undefined. So it feels as if "undefined" is broken on big endian.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :
Revision history for this message
Dimitri John Ledkov (xnox) wrote :
Changed in rabbitmq-server (Ubuntu):
status: New → Invalid
Changed in elixir-lang (Ubuntu):
status: New → Invalid
Changed in erlang (Ubuntu):
status: New → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package erlang - 1:21.2+dfsg-1ubuntu2

---------------
erlang (1:21.2+dfsg-1ubuntu2) disco; urgency=medium

  * Apply upstream suggested patch to fix signiness of uid/gid/permission
    file_info values. LP: #1808984

 -- Dimitri John Ledkov <email address hidden> Thu, 20 Dec 2018 23:56:39 +1100

Changed in erlang (Ubuntu):
status: Fix Committed → Fix Released
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Fix Released
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-12-20 11:26 EDT-------
IBM Bugzilla status-> closed, Fix Released, after fixing by Canonical

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

@james-page

Things started to look better I think. I have retried all the things I have spotted on the report page you pointed at, such that hopefully the new rabbitmq server will migrate... But I'm about to go EOY, so will not be able to monitor this further.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.