CoreOS doesn't support multi-part mimes and as such doesn't work

Bug #1499909 reported by Kris Lindgren
Affects   Status      Importance   Assigned to   Milestone
Magnum    Confirmed   Medium       Unassigned

Bug Description

I am trying to use CoreOS as the OS image for a Magnum bay, but the minion never starts provisioning, let alone finishes it.

The cause is that CoreOS no longer supports multipart MIME user data, as of roughly a year ago (whenever they refactored cloud-init).

This is a problem because kubecluster-coreos.yaml and kubeminion-coreos.yaml both use the Heat multipart MIME resource to create the user data for the VMs they create.

Output from running coreos-cloudinit by hand on a minion, the same way it is run at boot:
k8-fc6ydlyoox6p ~ # /usr/bin/coreos-cloudinit --oem=ec2-compat
Checking availability of "cloud-drive"
Checking availability of "ec2-metadata-service"
Fetching user-data from datasource of type "cloud-drive"
Attempting to read from "/media/configdrive/openstack/latest/user_data"
line 1: error: must be "#cloud-config" or begin with "#!"
Fetching meta-data from datasource of type "cloud-drive"
Attempting to read from "/media/configdrive/openstack/latest/meta_data.json"
Attempting to read from "/media/configdrive/openstack/content/0000"
Failed to parse user-data: Unrecognized user-data format
Continuing...
Merging cloud-config from meta-data and user-data
2015/09/25 19:12:08 Set hostname to k8-fc6ydlyoox6p.cloud.dev1.gdg
2015/09/25 19:12:08 Authorized SSH keys for core user
2015/09/25 19:12:08 Ensuring runtime unit file "etcd.service" is unmasked
2015/09/25 19:12:08 Ensuring runtime unit file "etcd2.service" is unmasked
2015/09/25 19:12:08 Ensuring runtime unit file "fleet.service" is unmasked
2015/09/25 19:12:08 Ensuring runtime unit file "locksmithd.service" is unmasked
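
The rejection in the log above can be approximated with a short check. This is only a sketch of the rule implied by the error message (line 1 must be "#cloud-config" or begin with "#!"); it is not coreos-cloudinit's actual code, and the function name classify_user_data is made up for illustration.

```python
def classify_user_data(user_data: str) -> str:
    """Classify user data by its first line, per the error message in the log.

    Returns 'cloud-config', 'script', or 'unrecognized'.
    Assumption: only line 1 matters, as the message "line 1: error: ..." suggests.
    """
    first_line = user_data.split("\n", 1)[0].rstrip()
    if first_line == "#cloud-config":
        return "cloud-config"
    if first_line.startswith("#!"):
        return "script"
    return "unrecognized"


if __name__ == "__main__":
    # A multipart MIME document starts with a MIME header, not with
    # "#cloud-config" or "#!", so it falls through to 'unrecognized'.
    multipart = 'Content-Type: multipart/mixed; boundary="==X=="\nMIME-Version: 1.0\n'
    print(classify_user_data(multipart))
    print(classify_user_data("#cloud-config\ncoreos:\n"))
    print(classify_user_data("#!/bin/sh\necho hello\n"))
```

Under this rule the Heat-generated user_data fails on its very first line, before any of the embedded "#cloud-config" parts are ever looked at.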

user_data is as follows:
Content-Type: multipart/mixed; boundary="===============6798670219374071106=="
MIME-Version: 1.0

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#cloud-boothook
#!/bin/sh

setenforce 0

sed -i '
  /^SELINUX=/ s/=.*/=permissive/
' /etc/selinux/config

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#cloud-config
merge_how: dict(recurse_array)+list(append)
write_files:
  - path: /etc/sysconfig/heat-params
    owner: "root:root"
    permissions: "0644"
    content: |
      KUBE_ALLOW_PRIV="true"
      KUBE_MASTER_IP="10.224.51.148"
      WAIT_HANDLE="https://openstack-dev.int.godaddy.com:8000/v1/waitcondition/arn%3Aopenstack%3Aheat%3A%3Af48e57277a7a484290ba9afdc49a21a9%3Astacks%2Fk8sbay-gs76finufomu-kube_minions-gr7mn7ymlgoi-0-aasmqcbt2typ%2F050834f1-a511-409e-9ef2-ac1b44cbeac7%2Fresources%2Fnode_wait_handle?Timestamp=2015-09-25T18%3A42%3A26Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=574213b8940e48519b8400bbc68e487f&SignatureVersion=2&Signature=oLopPNQlun%2FxQbUkyhzfE45ejmMUHEIgyzWIYUe6NZc%3D"
      DOCKER_VOLUME="$DOCKER_VOLUME"

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#cloud-config
system_info:
  default_user:
    name: minion
    lock_passwd: true
    gecos: Kubernetes Interactive User
    groups: [wheel, adm, systemd-journal]
    sudo: ["ALL=(ALL) NOPASSWD:ALL"]
    shell: /bin/bash

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#cloud-config
merge_how: dict(recurse_array)+list(append)
write_files:
  - path: /etc/kubernetes/examples/web.pod
    owner: "root:root"
    permissions: "0644"
    content: |
      kind: Pod
      apiVersion: v1beta1
      labels:
        name: web
      desiredState:
        manifest:
          version: v1beta1
          id: web
          containers:
            - name: web
              image: larsks/thttpd
              ports:
                - containerPort: 80
  - path: /etc/kubernetes/examples/web.service
    owner: "root:root"
    permissions: "0644"
    content: |
      kind: Service
      apiVersion: v1beta1
      id: web
      port: 8000
      selector:
        name: web
      containerPort: 80

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#!/bin/sh

# Under atomic, we need to make sure the 'dockerroot' group exists in
# /etc/group (because /lib/group cannot be modified by usermod).
echo "making 'dockerroot' group editable"
if ! grep -q dockerroot /etc/group; then
 grep dockerroot /lib/group >> /etc/group
fi

# make 'minion' user a member of the dockerroot group
# (so you can run docker commands as the 'minion' user)
echo "adding 'minion' user to 'dockerroot' group"
usermod -G dockerroot minion

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#!/bin/sh

. /etc/sysconfig/heat-params

echo "configuring kubernetes (minion)"

myip=$(ip addr show eth0 |
awk '$1 == "inet" {print $2}' | cut -f1 -d/)
myip_last_octet=${myip##*.}

sed -i '
/^KUBE_ALLOW_PRIV=/ s/=.*/="--allow_privileged='"$KUBE_ALLOW_PRIV"'"/
/^KUBE_ETCD_SERVERS=/ s|=.*|="--etcd_servers=http://'"$KUBE_MASTER_IP"':4001"|
' /etc/kubernetes/config

sed -i '
/^KUBELET_ADDRESS=/ s/=.*/="--address=0.0.0.0"/
/^KUBELET_HOSTNAME=/ s/=.*/="--hostname_override='"$myip"'"/
/^KUBELET_API_SERVER=/ s/=.*/="--api_servers='http:\/\/"$KUBE_MASTER_IP"':8080"/
' /etc/kubernetes/kubelet

sed -i '
/^KUBE_MASTER=/ s/=.*/="--master='"$KUBE_MASTER_IP"':8080"/
' /etc/kubernetes/apiserver

sed -i '
/^FLANNEL_ETCD=/ s|=.*|="http://'"$KUBE_MASTER_IP"':4001"|
' /etc/sysconfig/flanneld

cat >> /etc/environment <<EOF
KUBERNETES_MASTER=http://$KUBE_MASTER_IP:8080
EOF

cpu=$(expr $(nproc) \* 1000)
memory_kb=$(cat /proc/meminfo | awk '/MemTotal: /{print $2}')
memory=$(expr $memory_kb \* 1024)
curl -sf -X POST -H 'Content-Type: application/json' \
    --data-binary "{\"kind\":\"Minion\",\"id\":\"$myip\",\"apiVersion\":\"v1beta1\",
        \"resources\":{\"capacity\":{\"cpu\":$cpu,\"memory\":$memory}}}" \
    http://$KUBE_MASTER_IP:8080/api/v1beta1/minions

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#cloud-config
merge_how: dict(recurse_array)+list(append)
bootcmd:
  - mkdir -p /etc/systemd/system/docker.service.d
  - mkdir -p /etc/systemd/system/flanneld.service.d
write_files:
  - path: /usr/local/bin/flannel-docker-bridge
    owner: "root:root"
    permissions: "0755"
    content: |
      #!/bin/sh

      if ! [ "$FLANNEL_SUBNET" ] && [ "$FLANNEL_MTU" ] ; then
        echo "ERROR: missing required environment variables." >&2
        exit 1
      fi

      mkdir -p /run/flannel/
      cat > /run/flannel/docker <<EOF
      DOCKER_NETWORK_OPTIONS="--bip=$FLANNEL_SUBNET --mtu=$FLANNEL_MTU"
      EOF
  - path: /etc/systemd/system/flannel-docker-bridge.service
    owner: "root:root"
    permissions: "0644"
    content: |
      [Unit]
      After=flanneld.service
      Before=docker.service
      Requires=flanneld.service

      [Service]
      Type=oneshot
      EnvironmentFile=/run/flannel/subnet.env
      ExecStart=/usr/local/bin/flannel-docker-bridge

      [Install]
      WantedBy=docker.service
  - path: /etc/systemd/system/docker.service.d/flannel.conf
    owner: "root:root"
    permissions: "0644"
    content: |
      [Unit]
      Requires=flannel-docker-bridge.service
      After=flannel-docker-bridge.service

      [Service]
      EnvironmentFile=/run/flannel/docker
  - path: /etc/systemd/system/flanneld.service.d/flannel-docker-bridge.conf
    owner: "root:root"
    permissions: "0644"
    content: |
      [Unit]
      Requires=flannel-docker-bridge.service
      Before=flannel-docker-bridge.service

      [Install]
      Also=flannel-docker-bridge.service

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#!/bin/sh

# docker is already enabled and possibly running on centos atomic host
# so we need to stop it first and delete the docker0 bridge (which will
# be re-created using the flannel-provided subnet).
echo "stopping docker"
systemctl stop docker
ip link del docker0

# make sure we pick up any modified unit files
systemctl daemon-reload

for service in flanneld docker.socket docker kubelet kube-proxy; do
 echo "activating service $service"
 systemctl enable $service
 systemctl --no-block start $service
done

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#!/bin/sh

. /etc/sysconfig/heat-params

echo "notifying heat"
curl -sf -X PUT -H 'Content-Type: application/json' \
 --data-binary '{"Status": "SUCCESS",
 "Reason": "Setup complete",
 "Data": "OK", "UniqueId": "00000"}' \
 "$WAIT_HANDLE"

--===============6798670219374071106==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

#cloud-config

coreos:
  etcd:
    # generate a new 9b158e8c835442f593b15e6de05e2dc3 for each cluster from https://discovery.etcd.io/new
    discovery: https://discovery.etcd.io/$9b158e8c835442f593b15e6de05e2dc3
    # multi-region and multi-cloud deployments need to use $public_ipv4
    addr: $private_ipv4:4001
    peer-addr: $private_ipv4:7001
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
  ssh-rsa XXXXXXX
  - $ssh-rsa XXXXX

--===============6798670219374071106==--
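
For anyone who wants to inspect a user_data file like the one above, the following sketch uses Python's standard email package to split a multipart document and report the first line of each part. The function name list_part_markers and the shortened SAMPLE document are illustrative assumptions, not part of Magnum or Heat.

```python
from email import message_from_string
from typing import List


def list_part_markers(user_data: str) -> List[str]:
    """Return the first non-empty leading line of every leaf part."""
    msg = message_from_string(user_data)
    markers = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the container, only look at leaf parts
        payload = part.get_payload()
        lines = payload.lstrip("\n").splitlines()
        markers.append(lines[0] if lines else "")
    return markers


# Shortened stand-in for the Heat-generated document (hypothetical boundary).
SAMPLE = (
    'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
    "MIME-Version: 1.0\n"
    "\n"
    "--BOUNDARY\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "#cloud-boothook\n"
    "#!/bin/sh\n"
    "setenforce 0\n"
    "\n"
    "--BOUNDARY\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "#cloud-config\n"
    "write_files: []\n"
    "\n"
    "--BOUNDARY--\n"
)

if __name__ == "__main__":
    for marker in list_part_markers(SAMPLE):
        print(marker)
```

Run against the real file, this would list the mix of "#cloud-boothook", "#cloud-config", and "#!/bin/sh" parts that the templates bundle together.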

Ton Ngo (ton-i) wrote :
Kris Lindgren (klindgren) wrote :

Ton,

That is correct. I used: http://stable.release.core-os.net/amd64-usr/current/coreos_production_openstack_image.img.bz2 as of a few days ago.

This issue is specifically called out in: https://github.com/coreos/bugs/issues/792

Adrian Otto (aotto)
Changed in magnum:
milestone: none → mitaka-1
hongbin (hongbin034)
Changed in magnum:
status: New → Confirmed
Adrian Otto (aotto) wrote :

See also: https://bugs.launchpad.net/magnum/+bug/1543308. We must not disable SELinux.

Changed in magnum:
importance: Undecided → Medium
yatin (yatinkarel) wrote :

We are using #cloud-config, as pointed out in https://github.com/coreos/bugs/issues/792. It seems the issue no longer exists, as a CoreOS cluster can be created successfully.
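
One way to picture the single "#cloud-config" approach is to fold the shell fragments into write_files entries of one document instead of shipping them as separate MIME parts. The sketch below is only an illustration under that assumption: the function name scripts_to_cloud_config and the path /opt/bin/notify-heat.sh are hypothetical, and it does not reproduce what the Magnum templates actually contain. It only writes the files; running them would additionally need a unit or similar mechanism, which is not shown.

```python
from typing import Dict


def scripts_to_cloud_config(scripts: Dict[str, str]) -> str:
    """Render scripts as write_files entries of a single cloud-config document.

    Assumptions: paths are plain strings that need no YAML quoting, and each
    script starts at column 0 (as "#!/bin/sh" scripts do), so a YAML literal
    block with fixed indentation is sufficient.
    """
    lines = ["#cloud-config", "write_files:"]
    for path, content in scripts.items():
        lines.append(f"  - path: {path}")
        lines.append('    permissions: "0755"')
        lines.append("    content: |")
        for script_line in content.splitlines():
            # Keep blank lines empty; indent everything else under "content: |".
            lines.append(("      " + script_line) if script_line else "")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical target path for illustration only.
    print(scripts_to_cloud_config(
        {"/opt/bin/notify-heat.sh": '#!/bin/sh\n\necho "notifying heat"\n'}
    ))
```

The result begins with "#cloud-config" on line 1, which is the form the coreos-cloudinit error message in the bug description asks for.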
