This appears to be a repeat of bug 1667697.
Scenario:
- Downstream release based on stable/pike
- Error occurs when deploying a small overcloud (1 control, 1 compute)
- Overcloud nodes are real hardware
- Non-trivial network configuration
- Cinder services running in containers
The error:
u'message': u"Failed to run action [action_ex_id=0bd20d27-42f0-46f2-9d6f-16c9ff6757df, action_cls='<class 'mistral.actions.action_factory.DeployStackAction'>', attributes='{}', params='{u'skip_deploy_identifier': False, u'container': u'overcloud', u'timeout': 240}']\n ERROR: Request limit exceeded: JSON body size (2099504 bytes) exceeds maximum allowed size (2097152 bytes).",
u'status': u'FAILED'}
Workaround:
I was borrowing access to the systems and didn't have time to do much more than patch in a workaround: I increased max_json_body_size in the undercloud's /etc/heat/heat.conf and restarted the undercloud, after which I was able to deploy the overcloud (the problem went away).
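For reference, the change was a one-line edit on the undercloud. A sketch of what I set (the exact value isn't critical, anything comfortably above the ~2 MB request works; I'm assuming the option lives in the [DEFAULT] section, which is where heat reads it):

```ini
# /etc/heat/heat.conf on the undercloud
[DEFAULT]
# The rejected request was 2099504 bytes against a 2097152-byte (2 MiB)
# limit; doubling to 4 MiB gives plenty of headroom.
max_json_body_size = 4194304
```

The heat services need a restart to pick this up.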
Details:
% openstack overcloud deploy \
--templates ~/pilot/templates/overcloud \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
-e ~/pilot/templates/overcloud/environments/network-isolation.yaml \
-e ~/pilot/templates/network-environment.yaml \
-e ~/pilot/templates/node-placement.yaml \
-e ~/docker_registry_containerized_cinder.yaml \
-e ~/containerized-cinder.yaml \
--control-flavor baremetal \
--compute-flavor baremetal \
--control-scale 1 \
--compute-scale 1 \
--ntp-server 192.168.120.201
Started Mistral Workflow tripleo.validations.v1.check_pre_deployment_validations. Execution ID: 0935a6c7-fe79-4bee-b3be-745d0de83502
Waiting for messages on queue '443bee40-4ea3-4414-9b3e-066355a917ce' with no timeout.
WARNINGS
[u"7 nodes with profile None won't be used for deployment now", u"7 nodes with profile None won't be used for deployment now"]
Configuration has 2 warnings, fix them before proceeding.
Removing the current plan files
Uploading new plan files
Started Mistral Workflow tripleo.plan_management.v1.update_deployment_plan. Execution ID: d28e093c-93d5-4f8a-9725-ce5deece2548
Plan updated.
Processing templates in the directory /tmp/tripleoclient-yZj8Ob/tripleo-heat-templates
Started Mistral Workflow tripleo.plan_management.v1.get_deprecated_parameters. Execution ID: fd697e56-3a57-44ef-b05c-0a05cd6a066b
Deploying templates in the directory /tmp/tripleoclient-yZj8Ob/tripleo-heat-templates
Started Mistral Workflow tripleo.deployment.v1.deploy_plan. Execution ID: acb28c92-5c08-4bb1-8304-3da672b3bca6
{u'execution': {u'created_at': u'2018-01-04 17:37:28',
u'id': u'acb28c92-5c08-4bb1-8304-3da672b3bca6',
u'input': {u'container': u'overcloud',
u'queue_name': u'dc65d443-2f0b-4123-b482-da5a8fa91e88',
u'run_validations': False,
u'skip_deploy_identifier': False,
u'timeout': 240},
u'name': u'tripleo.deployment.v1.deploy_plan',
u'params': {u'namespace': u''},
u'spec': {u'description': u'Deploy the overcloud for a plan.\n',
u'input': [u'container',
{u'run_validations': False},
{u'timeout': 240},
{u'skip_deploy_identifier': False},
{u'queue_name': u'tripleo'}],
u'name': u'deploy_plan',
u'tags': [u'tripleo-common-managed'],
u'tasks': {u'add_validation_ssh_key': {u'input': {u'container': u'<% $.container %>',
u'queue_name': u'<% $.queue_name %>'},
u'name': u'add_validation_ssh_key',
u'on-complete': [{u'run_validations': u'<% $.run_validations %>'},
{u'create_swift_rings_backup_plan': u'<% not $.run_validations %>'}],
u'type': u'direct',
u'version': u'2.0',
u'workflow': u'tripleo.validations.v1.add_validation_ssh_key_parameter'},
u'create_swift_rings_backup_plan': {u'input': {u'container': u'<% $.container %>',
u'queue_name': u'<% $.queue_name %>',
u'use_default_templates': True},
u'name': u'create_swift_rings_backup_plan',
u'on-error': u'create_swift_rings_backup_plan_set_status_failed',
u'on-success': u'get_heat_stack',
u'type': u'direct',
u'version': u'2.0',
u'workflow': u'tripleo.swift_rings_backup.v1.create_swift_rings_backup_container_plan'},
u'create_swift_rings_backup_plan_set_status_failed': {u'name': u'create_swift_rings_backup_plan_set_status_failed',
u'on-success': u'send_message',
u'publish': {u'message': u'<% task(create_swift_rings_backup_plan).result %>',
u'status': u'FAILED'},
u'type': u'direct',
u'version': u'2.0'},
u'deploy': {u'action': u'tripleo.deployment.deploy',
u'input': {u'container': u'<% $.container %>',
u'skip_deploy_identifier': u'<% $.skip_deploy_identifier %>',
u'timeout': u'<% $.timeout %>'},
u'name': u'deploy',
u'on-error': u'set_deployment_failed',
u'on-success': u'send_message',
u'type': u'direct',
u'version': u'2.0'},
u'get_heat_stack': {u'action': u'heat.stacks_get stack_id=<% $.container %>',
u'name': u'get_heat_stack',
u'on-error': u'deploy',
u'on-success': [{u'set_stack_in_progress': u'<% "_IN_PROGRESS" in task(get_heat_stack).result.stack_status %>'},
{u'deploy': u'<% not "_IN_PROGRESS" in task(get_heat_stack).result.stack_status %>'}],
u'type': u'direct',
u'version': u'2.0'},
u'run_validations': {u'input': {u'group_names': [u'pre-deployment'],
u'plan': u'<% $.container %>',
u'queue_name': u'<% $.queue_name %>'},
u'name': u'run_validations',
u'on-error': u'set_validations_failed',
u'on-success': u'create_swift_rings_backup_plan',
u'type': u'direct',
u'version': u'2.0',
u'workflow': u'tripleo.validations.v1.run_groups'},
u'send_message': {u'action': u'zaqar.queue_post',
u'input': {u'messages': {u'body': {u'payload': {u'execution': u'<% execution() %>',
u'message': u"<% $.get('message', '') %>",
u'status': u"<% $.get('status', 'SUCCESS') %>"},
u'type': u'tripleo.deployment.v1.deploy_plan'}},
u'queue_name': u'<% $.queue_name %>'},
u'name': u'send_message',
u'on-success': [{u'fail': u'<% $.get(\'status\') = "FAILED" %>'}],
u'retry': u'count=5 delay=1',
u'type': u'direct',
u'version': u'2.0'},
u'set_deployment_failed': {u'name': u'set_deployment_failed',
u'on-success': u'send_message',
u'publish': {u'message': u'<% task(deploy).result %>',
u'status': u'FAILED'},
u'type': u'direct',
u'version': u'2.0'},
u'set_stack_in_progress': {u'name': u'set_stack_in_progress',
u'on-success': u'send_message',
u'publish': {u'message': u'The Heat stack is busy.',
u'status': u'FAILED'},
u'type': u'direct',
u'version': u'2.0'},
u'set_validations_failed': {u'name': u'set_validations_failed',
u'on-success': u'send_message',
u'publish': {u'message': u'<% task(run_validations).result %>',
u'status': u'FAILED'},
u'type': u'direct',
u'version': u'2.0'}},
u'version': u'2.0'}},
u'message': u"Failed to run action [action_ex_id=0bd20d27-42f0-46f2-9d6f-16c9ff6757df, action_cls='<class 'mistral.actions.action_factory.DeployStackAction'>', attributes='{}', params='{u'skip_deploy_identifier': False, u'container': u'overcloud', u'timeout': 240}']\n ERROR: Request limit exceeded: JSON body size (2099504 bytes) exceeds maximum allowed size (2097152 bytes).",
u'status': u'FAILED'}
At shardy's suggestion, I'm attaching a tarball copy of all the custom env files. They are pretty innocuous, and don't do any get_file on large external files.
One thing to note is that the overcloud deploy command specifies "--templates ~/pilot/templates/overcloud", but that directory is an exact copy of the stock templates in /usr/share/openstack-tripleo-heat-templates (this is a quirk of the customer's installation tooling).
I suspect the custom templates directory inflates the file names enough to bloat the JSON body. The problem does not occur if I eliminate the custom templates directory from the deploy. In fact, removing just the containerized-cinder.yaml env file is enough to slip below the current JSON body limit.
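To put numbers on how marginal this is, here's a quick check using the figures from the error message above; with only a couple of KiB of headroom, it's plausible that dropping one small environment file is the difference between success and failure:

```python
# Numbers taken from the "Request limit exceeded" error above.
limit = 2 * 1024 * 1024   # heat's max_json_body_size: 2097152 bytes (2 MiB)
body = 2099504            # size of the rejected JSON request body

overage = body - limit
print(overage)            # -> 2352

# Only ~2.3 KiB over the limit, so removing a single small environment
# file (containerized-cinder.yaml) slipping the request back under the
# cap is consistent with the behaviour observed above.
```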