Port 6000, used by the swift object-server, isn't listening after a multi-node OpenStack deployment

Bug #1673657 reported by MarginHu
This bug affects 2 people
Affects    Status         Importance   Assigned to   Milestone
kolla      Fix Released   Undecided    MarginHu
Ocata      Fix Released   Undecided    Unassigned

Bug Description

Hi Guys,

After deploying a multi-node environment, I found that uploading a file to a Swift container fails. In my environment, I deployed a storage node located on "kode1", using 3 disks as Swift storage devices.

[root@kode0 ~]# source admin-openrc.sh
[root@kode0 ~]# swift stat
               Account: AUTH_d77042e9abb4410581a6d8d438598c37
            Containers: 0
               Objects: 0
                 Bytes: 0
       X-Put-Timestamp: 1489704184.74999
           X-Timestamp: 1489704184.74999
            X-Trans-Id: tx86b6ecf45bcf4fc2a5315-0058cb14f8
          Content-Type: text/plain; charset=utf-8
X-Openstack-Request-Id: tx86b6ecf45bcf4fc2a5315-0058cb14f8
[root@kode0 ~]#

[root@kode0 ~]# swift upload mycontainer dnsmasq.log
Object HEAD failed: http://192.168.101.254:8080/v1/AUTH_d77042e9abb4410581a6d8d438598c37/mycontainer/dnsmasq.log 503 Service Unavailable

The swift log has the following output:

2017-03-17T02:36:00Z syslog.local0.err {"Payload":"swift-proxy-server: ERROR with Object server 192.168.104.21:6000/d2 re: Trying to HEAD /v1/AUTH_d77042e9abb4410581a6d8d438598c37/mycontainer/dnsmasq.log: Connection refused (txn: tx5dac48ba5b9b4b23bdf4c-0058cb4b90) (client_ip: 192.168.101.20)\u0000"}

192.168.101.0/24 is my external network.
192.168.104.0/24 is my storage network.
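
A quick way to confirm the refused connection from the proxy side (a hedged sketch; the target address and port come from the error message above, the command itself is an assumption, not from the report):

[root@kode0 ~]# # expect the same "Connection refused" the proxy logged
[root@kode0 ~]# curl -v http://192.168.104.21:6000/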

[root@kode1 ~]# docker ps | grep swift-object
09821d264ed9 192.168.103.16:5000/bgi/centos-binary-swift-object-expirer:4.0.0.1 "kolla_start" 4 hours ago Up 4 hours swift_object_expirer
7dc5dcd2e9b6 192.168.103.16:5000/bgi/centos-binary-swift-object:4.0.0.1 "kolla_start" 4 hours ago Up 4 hours swift_object_updater
d1eebab1fc5c 192.168.103.16:5000/bgi/centos-binary-swift-object:4.0.0.1 "kolla_start" 4 hours ago Up 4 hours swift_object_replicator
d3e08921a239 192.168.103.16:5000/bgi/centos-binary-swift-object:4.0.0.1 "kolla_start" 4 hours ago Up 4 hours swift_object_auditor
b0fd76ea8601 192.168.103.16:5000/bgi/centos-binary-swift-object:4.0.0.1 "kolla_start" 4 hours ago Up 4 hours swift_object_server
[root@kode1 ~]# docker logs b0fd76ea8601
[root@kode1 ~]#

[root@kode1 ~]# docker inspect b0fd76ea8601 | grep -i pid
            "Pid": 24196,
            "PidMode": "",
            "PidsLimit": 0,
[root@kode1 ~]# netstat -anlp | grep -i 24196
[root@kode1 ~]#
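
For completeness, a hedged sketch of checking all swift storage ports on the storage node (6000/6001/6002 are the default object/container/account server ports; the ss invocation is an assumption, not taken from the report):

[root@kode1 ~]# # object (6000), container (6001) and account (6002) servers
[root@kode1 ~]# ss -lntp | grep -E ':(6000|6001|6002) '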

My OpenStack is the Ocata version, from RDO.

Revision history for this message
MarginHu (margin2017) wrote :

I entered the container and found the process stuck at "sleep 1":

()[root@kode1 /]# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
swift 1 0.0 0.0 192 4 ? Ss+ 15:07 0:00 /usr/local/bin/dumb-init /bin/bash /usr/local/
swift 7 0.0 0.0 11632 1404 ? Ss 15:07 0:00 /bin/bash /usr/local/bin/kolla_start
root 1012 0.1 0.0 15200 2016 ? S 15:24 0:00 -bash
swift 1048 0.0 0.0 4308 344 ? S 15:24 0:00 sleep 1
root 1049 0.0 0.0 50872 1828 ? R+ 15:24 0:00 ps aux
()[root@kode1 /]# ls
anaconda-post.log dev home lib64 media opt root sbin sys usr
bin etc lib lost+found mnt proc run srv tmp var
()[root@kode1 /]#

()[root@kode1 /]# cat /usr/local/bin/kolla_start
#!/usr/local/bin/dumb-init /bin/bash
set -o errexit

# Wait for the log socket
if [[ ! "${!SKIP_LOG_SETUP[@]}" && -e /var/lib/kolla/heka ]]; then
    while [[ ! -S /var/lib/kolla/heka/log ]]; do
        sleep 1
    done
fi
....

But I found that the other swift processes work well; why? It seems the root cause lies in heka: kolla_start waits for the heka log socket and never gets past that loop, so swift-object-server is never started.
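
A hedged way to confirm this from inside the stuck container (the mount path comes from the script excerpt above; the commands themselves are assumptions): the heka volume directory is mounted, but the log socket the loop waits for never appears.

()[root@kode1 /]# # directory is bind-mounted, but the socket never shows up
()[root@kode1 /]# ls -la /var/lib/kolla/heka/
()[root@kode1 /]# test -S /var/lib/kolla/heka/log && echo socket-present || echo socket-missing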

Revision history for this message
MarginHu (margin2017) wrote :

I commented out the heka-related lines in kolla-ansible/ansible/roles as follows:

nova/defaults/main.yml:35: #- "heka_socket:/var/lib/kolla/heka/"
swift/tasks/start.yml:163: #- "heka_socket:/var/lib/kolla/heka/"
swift/tasks/start.yml:177: #- "heka_socket:/var/lib/kolla/heka/"

and found that the issue was resolved.
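
For reference, a minimal sketch of applying the same workaround and redeploying (the sed expression and the inventory name "multinode" are assumptions; only the file paths and the commented-out line come from the report):

$ cd kolla-ansible/ansible/roles
$ # prefix the heka_socket volume entries with "#" in the files listed above
$ sed -i 's|^\( *\)- "heka_socket:/var/lib/kolla/heka/"|\1#- "heka_socket:/var/lib/kolla/heka/"|' \
      nova/defaults/main.yml swift/tasks/start.yml
$ kolla-ansible -i multinode deploy   # recreate the containers without the heka_socket volume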

Changed in kolla:
milestone: none → pike-1
status: New → Confirmed
MarginHu (margin2017)
Changed in kolla:
assignee: nobody → MarginHu (margin2017)
Revision history for this message
MarginHu (margin2017) wrote :

I see that the heka-related code has been removed on the master branch, so you can set the bug status to "Fix Released".

Changed in kolla:
status: Confirmed → Fix Released