Action Execution Response Timeout

Bug #1581649 reported by Ryan Brady
This bug affects 3 people
Affects   Status         Importance   Assigned to   Milestone
Mistral   Fix Released   High         Ryan Brady

Bug Description

Problem:
When attempting to run an action or workflow from the command line or directly through the API, the request frequently fails with a messaging timeout:

"ERROR (app) MessagingTimeout: Timed out waiting for a reply to message ID a64bd59c9c8449a0a4bcd9ec1acb8ef3"

Running with a JSON string argument:
------------------------------------
mistral run-action --debug zaqar.queue_post '{"queue_name": "test_queue", "messages": [{"body": "I am a message 2"}]}'
DEBUG (v2) Making authentication request to http://192.0.2.1:5000/v2.0/tokens
DEBUG (extension) found extension EntryPoint.parse('yaml = clifftablib.formatters:YamlFormatter')
DEBUG (extension) found extension EntryPoint.parse('json = clifftablib.formatters:JsonFormatter')
DEBUG (extension) found extension EntryPoint.parse('html = clifftablib.formatters:HtmlFormatter')
DEBUG (extension) found extension EntryPoint.parse('table = cliff.formatters.table:TableFormatter')
DEBUG (extension) found extension EntryPoint.parse('json = cliff.formatters.json_format:JSONFormatter')
DEBUG (extension) found extension EntryPoint.parse('shell = cliff.formatters.shell:ShellFormatter')
DEBUG (extension) found extension EntryPoint.parse('value = cliff.formatters.value:ValueFormatter')
DEBUG (extension) found extension EntryPoint.parse('yaml = cliff.formatters.yaml_format:YAMLFormatter')
DEBUG (httpclient) HTTP POST http://192.0.2.1:8989/v2/action_executions 201
{"result": {"resources": ["/v2/queues/test_queue/messages/5736151c30eabc6a50f1917f"]}}

[stack@instack ~]$ mistral run-action zaqar.queue_post '{"queue_name": "test_queue", "messages": [{"body": "I am a message 3"}]}'
ERROR (app) MessagingTimeout: Timed out waiting for a reply to message ID a64bd59c9c8449a0a4bcd9ec1acb8ef3

[stack@instack ~]$ mistral run-action zaqar.queue_post '{"queue_name": "test_queue", "messages": [{"body": "I am a message 4"}]}'
{"result": {"resources": ["/v2/queues/test_queue/messages/5736172b30eabc6a50f19181"]}}

[stack@instack ~]$ mistral run-action zaqar.queue_post '{"queue_name": "test_queue", "messages": [{"body": "I am a message 5"}]}'
ERROR (app) MessagingTimeout: Timed out waiting for a reply to message ID 07fe288dccb94f0a983a038c44ecc72f

[stack@instack ~]$ mistral run-action zaqar.queue_post '{"queue_name": "test_queue", "messages": [{"body": "I am a message 6"}]}'
{"result": {"resources": ["/v2/queues/test_queue/messages/5736192c30eabc6a50f19183"]}}

mistral --debug run-action zaqar.queue_post '{"queue_name": "test_queue", "messages": [{"body": "I am a message 9"}]}'
DEBUG (v2) Making authentication request to http://192.0.2.1:5000/v2.0/tokens
DEBUG (extension) found extension EntryPoint.parse('yaml = clifftablib.formatters:YamlFormatter')
DEBUG (extension) found extension EntryPoint.parse('json = clifftablib.formatters:JsonFormatter')
DEBUG (extension) found extension EntryPoint.parse('html = clifftablib.formatters:HtmlFormatter')
DEBUG (extension) found extension EntryPoint.parse('table = cliff.formatters.table:TableFormatter')
DEBUG (extension) found extension EntryPoint.parse('json = cliff.formatters.json_format:JSONFormatter')
DEBUG (extension) found extension EntryPoint.parse('shell = cliff.formatters.shell:ShellFormatter')
DEBUG (extension) found extension EntryPoint.parse('value = cliff.formatters.value:ValueFormatter')
DEBUG (extension) found extension EntryPoint.parse('yaml = cliff.formatters.yaml_format:YAMLFormatter')
DEBUG (httpclient) HTTP POST http://192.0.2.1:8989/v2/action_executions 500
ERROR (app) MessagingTimeout: Timed out waiting for a reply to message ID fb81be0d27644a548316921a503da7db
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 346, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/display.py", line 79, in run
    column_names, data = self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/mistralclient/commands/v2/action_executions.py", line 125, in take_action
    **params
  File "/usr/lib/python2.7/site-packages/mistralclient/api/v2/action_executions.py", line 44, in create
    self._raise_api_exception(resp)
  File "/usr/lib/python2.7/site-packages/mistralclient/api/base.py", line 143, in _raise_api_exception
    error_message=error_data)
APIException: MessagingTimeout: Timed out waiting for a reply to message ID fb81be0d27644a548316921a503da7db
Traceback (most recent call last):
  File "/bin/mistral", line 10, in <module>
    sys.exit(main())
  File "/usr/lib/python2.7/site-packages/mistralclient/shell.py", line 422, in main
    return MistralShell().run(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 226, in run
    result = self.run_subcommand(remainder)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 346, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/display.py", line 79, in run
    column_names, data = self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/mistralclient/commands/v2/action_executions.py", line 125, in take_action
    **params
  File "/usr/lib/python2.7/site-packages/mistralclient/api/v2/action_executions.py", line 44, in create
    self._raise_api_exception(resp)
  File "/usr/lib/python2.7/site-packages/mistralclient/api/base.py", line 143, in _raise_api_exception
    error_message=error_data)
mistralclient.api.base.APIException: MessagingTimeout: Timed out waiting for a reply to message ID fb81be0d27644a548316921a503da7db
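
The runs above alternate between a successful result and a MessagingTimeout. As a rough aid for tallying that pattern from captured CLI output, here is a minimal Python sketch. The function name `summarize_runs` is hypothetical, and the sketch assumes the output lines look exactly like the ones in the transcript above; it counts only the `ERROR (app)` line for each failure so that the repeated `APIException` lines in the tracebacks are not double-counted.

```python
import re

# Error line format copied from the transcript above.
TIMEOUT_RE = re.compile(
    r"^ERROR \(app\) MessagingTimeout: Timed out waiting for a reply "
    r"to message ID ([0-9a-f]+)"
)


def summarize_runs(output):
    """Return (number of successful results, list of timed-out message IDs)."""
    successes = 0
    timed_out_ids = []
    for line in output.splitlines():
        line = line.strip()
        # A successful run prints a JSON object starting with "result".
        if line.startswith('{"result":'):
            successes += 1
            continue
        match = TIMEOUT_RE.match(line)
        if match:
            timed_out_ids.append(match.group(1))
    return successes, timed_out_ids


# Sample lines taken from the transcript above.
sample = "\n".join([
    '{"result": {"resources": ["/v2/queues/test_queue/messages/5736172b30eabc6a50f19181"]}}',
    "ERROR (app) MessagingTimeout: Timed out waiting for a reply to message ID fb81be0d27644a548316921a503da7db",
    "APIException: MessagingTimeout: Timed out waiting for a reply to message ID fb81be0d27644a548316921a503da7db",
    '{"result": {"resources": ["/v2/queues/test_queue/messages/5736192c30eabc6a50f19183"]}}',
])
print(summarize_runs(sample))
```

The collected message IDs can then be searched for in the Mistral server logs to see which side of the RPC call never replied.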

Running with a JSON file argument:
----------------------------------
[stack@instack ~]$ mistral --debug run-action zaqar.queue_post msg10.json
DEBUG (v2) Making authentication request to http://192.0.2.1:5000/v2.0/tokens
DEBUG (extension) found extension EntryPoint.parse('yaml = clifftablib.formatters:YamlFormatter')
DEBUG (extension) found extension EntryPoint.parse('json = clifftablib.formatters:JsonFormatter')
DEBUG (extension) found extension EntryPoint.parse('html = clifftablib.formatters:HtmlFormatter')
DEBUG (extension) found extension EntryPoint.parse('table = cliff.formatters.table:TableFormatter')
DEBUG (extension) found extension EntryPoint.parse('json = cliff.formatters.json_format:JSONFormatter')
DEBUG (extension) found extension EntryPoint.parse('shell = cliff.formatters.shell:ShellFormatter')
DEBUG (extension) found extension EntryPoint.parse('value = cliff.formatters.value:ValueFormatter')
DEBUG (extension) found extension EntryPoint.parse('yaml = cliff.formatters.yaml_format:YAMLFormatter')
DEBUG (httpclient) HTTP POST http://192.0.2.1:8989/v2/action_executions 201
{"result": {"resources": ["/v2/queues/test_queue/messages/573621ab30eabc6a50f19187"]}}
[stack@instack ~]$ mistral --debug run-action zaqar.queue_post msg11.json
DEBUG (v2) Making authentication request to http://192.0.2.1:5000/v2.0/tokens
DEBUG (extension) found extension EntryPoint.parse('yaml = clifftablib.formatters:YamlFormatter')
DEBUG (extension) found extension EntryPoint.parse('json = clifftablib.formatters:JsonFormatter')
DEBUG (extension) found extension EntryPoint.parse('html = clifftablib.formatters:HtmlFormatter')
DEBUG (extension) found extension EntryPoint.parse('table = cliff.formatters.table:TableFormatter')
DEBUG (extension) found extension EntryPoint.parse('json = cliff.formatters.json_format:JSONFormatter')
DEBUG (extension) found extension EntryPoint.parse('shell = cliff.formatters.shell:ShellFormatter')
DEBUG (extension) found extension EntryPoint.parse('value = cliff.formatters.value:ValueFormatter')
DEBUG (extension) found extension EntryPoint.parse('yaml = cliff.formatters.yaml_format:YAMLFormatter')
DEBUG (httpclient) HTTP POST http://192.0.2.1:8989/v2/action_executions 500
ERROR (app) MessagingTimeout: Timed out waiting for a reply to message ID ff29e7a7bd5f4990a708402698a4b922
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 346, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/display.py", line 79, in run
    column_names, data = self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/mistralclient/commands/v2/action_executions.py", line 125, in take_action
    **params
  File "/usr/lib/python2.7/site-packages/mistralclient/api/v2/action_executions.py", line 44, in create
    self._raise_api_exception(resp)
  File "/usr/lib/python2.7/site-packages/mistralclient/api/base.py", line 143, in _raise_api_exception
    error_message=error_data)
APIException: MessagingTimeout: Timed out waiting for a reply to message ID ff29e7a7bd5f4990a708402698a4b922
Traceback (most recent call last):
  File "/bin/mistral", line 10, in <module>
    sys.exit(main())
  File "/usr/lib/python2.7/site-packages/mistralclient/shell.py", line 422, in main
    return MistralShell().run(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 226, in run
    result = self.run_subcommand(remainder)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 346, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/display.py", line 79, in run
    column_names, data = self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/mistralclient/commands/v2/action_executions.py", line 125, in take_action
    **params
  File "/usr/lib/python2.7/site-packages/mistralclient/api/v2/action_executions.py", line 44, in create
    self._raise_api_exception(resp)
  File "/usr/lib/python2.7/site-packages/mistralclient/api/base.py", line 143, in _raise_api_exception
    error_message=error_data)
mistralclient.api.base.APIException: MessagingTimeout: Timed out waiting for a reply to message ID ff29e7a7bd5f4990a708402698a4b922
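
To measure how often the failure occurs, rather than re-running the command by hand, the same invocation can be repeated in a loop. The following is a minimal Python sketch, not part of the original report. The helper name `repeat_command` is hypothetical, and it assumes that the `mistral` client is on the PATH, that `msg10.json` exists in the working directory, and that the CLI exits with a non-zero status when it prints an error (an assumption, not something shown in the transcript).

```python
import subprocess


def repeat_command(argv, attempts):
    """Run argv `attempts` times and return the list of exit codes."""
    codes = []
    for _ in range(attempts):
        # Capture output so repeated runs do not flood the terminal.
        proc = subprocess.run(
            argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
        codes.append(proc.returncode)
    return codes


# Example usage (left commented out because it needs a working Mistral
# installation and the msg10.json file from the transcript above):
#
# cmd = ["mistral", "run-action", "zaqar.queue_post", "msg10.json"]
# codes = repeat_command(cmd, 10)
# failures = sum(1 for code in codes if code != 0)
# print("%d of %d runs failed" % (failures, len(codes)))
```

Running the calls sequentially mirrors the manual reproduction above, where each command finished (or timed out) before the next one started.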

/etc/mistral/mistral.conf:
[DEFAULT]

#
# From oslo.log
#

# If set to true, the logging level will be set to DEBUG instead of
# the default INFO level. (boolean value)
#debug = false

# If set to false, the logging level will be set to WARNING instead of
# the default INFO level. (boolean value)
# This option is deprecated for removal.
# Its value may be silently ignored in the future.
#verbose = true

# The name of a logging configuration file. This file is appended to
# any existing logging configuration files. For details about logging
# configuration files, see the Python logging module documentation.
# Note that when logging configuration files are used then all logging
# configuration is set in the configuration file and other logging
# configuration options are ignored (for example,
# logging_context_format_string). (string value)
# Deprecated group/name - [DEFAULT]/log_config
#log_config_append = <None>

# Defines the format string for %%(asctime)s in log records. Default:
# %(default)s . This option is ignored if log_config_append is set.
# (string value)
#log_date_format = %Y-%m-%d %H:%M:%S

# (Optional) Name of log file to send logging output to. If no default
# is set, logging will go to stderr as defined by use_stderr. This
# option is ignored if log_config_append is set. (string value)
# Deprecated group/name - [DEFAULT]/logfile
#log_file = <None>

# (Optional) The base directory used for relative log_file paths.
# This option is ignored if log_config_append is set. (string value)
# Deprecated group/name - [DEFAULT]/logdir
#log_dir = <None>
log_dir = /var/log/mistral

# Uses logging handler designed to watch file system. When log file is
# moved or removed this handler will open a new log file with
# specified path instantaneously. It makes sense only if log_file
# option is specified and Linux platform is used. This option is
# ignored if log_config_append is set. (boolean value)
#watch_log_file = false

# Use syslog for logging. Existing syslog format is DEPRECATED and
# will be changed later to honor RFC5424. This option is ignored if
# log_config_append is set. (boolean value)
#use_syslog = false

# Syslog facility to receive log lines. This option is ignored if
# log_config_append is set. (string value)
#syslog_log_facility = LOG_USER

# Log output to standard error. This option is ignored if
# log_config_append is set. (boolean value)
#use_stderr = true

# Format string to use for log messages with context. (string value)
#logging_context_format_string = %(asctime)s.%(msecs)03d %(process)d %(levelname)s %(name)s [%(request_id)s %(user_identity)s] %(instance)s%(message)s

# Format string to use for log messages when context is undefined.
# (string value)
#logging_default_format_string = %(asctime)s.%(msecs)03d %(process)d %(levelname)s %(name)s [-] %(instance)s%(message)s

# Additional data to append to log message when logging level for the
# message is DEBUG. (string value)
#logging_debug_format_suffix = %(funcName)s %(pathname)s:%(lineno)d

# Prefix each line of exception output with this format. (string
# value)
#logging_exception_prefix = %(asctime)s.%(msecs)03d %(process)d ERROR %(name)s %(instance)s

# Defines the format string for %(user_identity)s that is used in
# logging_context_format_string. (string value)
#logging_user_identity_format = %(user)s %(tenant)s %(domain)s %(user_domain)s %(project_domain)s

# List of package logging levels in logger=LEVEL pairs. This option is
# ignored if log_config_append is set. (list value)
#default_log_levels = amqp=WARN,amqplib=WARN,boto=WARN,qpid=WARN,sqlalchemy=WARN,suds=INFO,oslo.messaging=INFO,iso8601=WARN,requests.packages.urllib3.connectionpool=WARN,urllib3.connectionpool=WARN,websocket=WARN,requests.packages.urllib3.util.retry=WARN,urllib3.util.retry=WARN,keystonemiddleware=WARN,routes.middleware=WARN,stevedore=WARN,taskflow=WARN,keystoneauth=WARN,oslo.cache=INFO,dogpile.core.dogpile=INFO

# Enables or disables publication of error events. (boolean value)
#publish_errors = false

# The format for an instance that is passed with the log message.
# (string value)
#instance_format = "[instance: %(uuid)s] "

# The format for an instance UUID that is passed with the log message.
# (string value)
#instance_uuid_format = "[instance: %(uuid)s] "

# Enables or disables fatal status of deprecations. (boolean value)
#fatal_deprecations = false

#
# From oslo.messaging
#

# Size of RPC connection pool. (integer value)
# Deprecated group/name - [DEFAULT]/rpc_conn_pool_size
#rpc_conn_pool_size = 30

# ZeroMQ bind address. Should be a wildcard (*), an ethernet
# interface, or IP. The "host" option should point or resolve to this
# address. (string value)
#rpc_zmq_bind_address = *

# MatchMaker driver. (string value)
# Allowed values: redis, dummy
#rpc_zmq_matchmaker = redis

# Type of concurrency used. Either "native" or "eventlet" (string
# value)
# Allowed values: eventlet, native
#rpc_zmq_concurrency = eventlet

# Number of ZeroMQ contexts, defaults to 1. (integer value)
#rpc_zmq_contexts = 1

# Maximum number of ingress messages to locally buffer per topic.
# Default is unlimited. (integer value)
#rpc_zmq_topic_backlog = <None>

# Directory for holding IPC sockets. (string value)
#rpc_zmq_ipc_dir = /var/run/openstack

# Name of this node. Must be a valid hostname, FQDN, or IP address.
# Must match "host" option, if running Nova. (string value)
#rpc_zmq_host = localhost

# Seconds to wait before a cast expires (TTL). The default value of -1
# specifies an infinite linger period. The value of 0 specifies no
# linger period. Pending messages shall be discarded immediately when
# the socket is closed. Only supported by impl_zmq. (integer value)
#rpc_cast_timeout = -1

# The default number of seconds that poll should wait. Poll raises
# timeout exception when timeout expired. (integer value)
#rpc_poll_timeout = 1

# Expiration timeout in seconds of a name service record about
# existing target ( < 0 means no timeout). (integer value)
#zmq_target_expire = 120

# Use PUB/SUB pattern for fanout methods. PUB/SUB always uses proxy.
# (boolean value)
#use_pub_sub = true

# Use ROUTER remote proxy for direct methods. (boolean value)
#use_router_proxy = false

# Minimal port number for random ports range. (port value)
# Minimum value: 0
# Maximum value: 65535
#rpc_zmq_min_port = 49153

# Maximal port number for random ports range. (integer value)
# Minimum value: 1
# Maximum value: 65536
#rpc_zmq_max_port = 65536

# Number of retries to find free port number before fail with
# ZMQBindError. (integer value)
#rpc_zmq_bind_port_retries = 100

# Size of executor thread pool. (integer value)
# Deprecated group/name - [DEFAULT]/rpc_thread_pool_size
#executor_thread_pool_size = 64

# Seconds to wait for a response from a call. (integer value)
#rpc_response_timeout = 60

# A URL representing the messaging driver to use and its full
# configuration. If not set, we fall back to the rpc_backend option
# and driver specific configuration. (string value)
#transport_url = <None>

# The messaging driver to use, defaults to rabbit. Other drivers
# include amqp and zmq. (string value)
#rpc_backend = rabbit
rpc_backend = rabbit

# The default exchange under which topics are scoped. May be
# overridden by an exchange name specified in the transport_url
# option. (string value)
#control_exchange = openstack
control_exchange = openstack

[cors]

#
# From oslo.middleware.cors
#

# Indicate whether this resource may be shared with the domain
# received in the requests "origin" header. Format:
# "<protocol>://<host>[:<port>]", no trailing slash. Example:
# https://horizon.example.com (list value)
#allowed_origin = <None>

# Indicate that the actual request can include user credentials
# (boolean value)
#allow_credentials = true

# Indicate which headers are safe to expose to the API. Defaults to
# HTTP Simple Headers. (list value)
#expose_headers = Content-Type,Cache-Control,Content-Language,Expires,Last-Modified,Pragma

# Maximum cache age of CORS preflight requests. (integer value)
#max_age = 3600

# Indicate which methods can be used during the actual request. (list
# value)
#allow_methods = GET,POST,PUT,DELETE,OPTIONS

# Indicate which header field names may be used during the actual
# request. (list value)
#allow_headers = Content-Type,Cache-Control,Content-Language,Expires,Last-Modified,Pragma

[cors.subdomain]

#
# From oslo.middleware.cors
#

# Indicate whether this resource may be shared with the domain
# received in the requests "origin" header. Format:
# "<protocol>://<host>[:<port>]", no trailing slash. Example:
# https://horizon.example.com (list value)
#allowed_origin = <None>

# Indicate that the actual request can include user credentials
# (boolean value)
#allow_credentials = true

# Indicate which headers are safe to expose to the API. Defaults to
# HTTP Simple Headers. (list value)
#expose_headers = Content-Type,Cache-Control,Content-Language,Expires,Last-Modified,Pragma

# Maximum cache age of CORS preflight requests. (integer value)
#max_age = 3600

# Indicate which methods can be used during the actual request. (list
# value)
#allow_methods = GET,POST,PUT,DELETE,OPTIONS

# Indicate which header field names may be used during the actual
# request. (list value)
#allow_headers = Content-Type,Cache-Control,Content-Language,Expires,Last-Modified,Pragma

[database]

#
# From oslo.db
#

# The file name to use with SQLite. (string value)
# Deprecated group/name - [DEFAULT]/sqlite_db
#sqlite_db = oslo.sqlite

# If True, SQLite uses synchronous mode. (boolean value)
# Deprecated group/name - [DEFAULT]/sqlite_synchronous
#sqlite_synchronous = true

# The back end to use for the database. (string value)
# Deprecated group/name - [DEFAULT]/db_backend
#backend = sqlalchemy

# The SQLAlchemy connection string to use to connect to the database.
# (string value)
# Deprecated group/name - [DEFAULT]/sql_connection
# Deprecated group/name - [DATABASE]/sql_connection
# Deprecated group/name - [sql]/connection
#connection = <None>
connection = mysql+pymysql://mistral:***********************************@192.0.2.1/mistral

# The SQLAlchemy connection string to use to connect to the slave
# database. (string value)
#slave_connection = <None>

# The SQL mode to be used for MySQL sessions. This option, including
# the default, overrides any server-set SQL mode. To use whatever SQL
# mode is set by the server configuration, set this to no value.
# Example: mysql_sql_mode= (string value)
#mysql_sql_mode = TRADITIONAL

# Timeout before idle SQL connections are reaped. (integer value)
# Deprecated group/name - [DEFAULT]/sql_idle_timeout
# Deprecated group/name - [DATABASE]/sql_idle_timeout
# Deprecated group/name - [sql]/idle_timeout
#idle_timeout = 3600

# Minimum number of SQL connections to keep open in a pool. (integer
# value)
# Deprecated group/name - [DEFAULT]/sql_min_pool_size
# Deprecated group/name - [DATABASE]/sql_min_pool_size
#min_pool_size = 1

# Maximum number of SQL connections to keep open in a pool. (integer
# value)
# Deprecated group/name - [DEFAULT]/sql_max_pool_size
# Deprecated group/name - [DATABASE]/sql_max_pool_size
#max_pool_size = <None>

# Maximum number of database connection retries during startup. Set to
# -1 to specify an infinite retry count. (integer value)
# Deprecated group/name - [DEFAULT]/sql_max_retries
# Deprecated group/name - [DATABASE]/sql_max_retries
#max_retries = 10

# Interval between retries of opening a SQL connection. (integer
# value)
# Deprecated group/name - [DEFAULT]/sql_retry_interval
# Deprecated group/name - [DATABASE]/reconnect_interval
#retry_interval = 10

# If set, use this value for max_overflow with SQLAlchemy. (integer
# value)
# Deprecated group/name - [DEFAULT]/sql_max_overflow
# Deprecated group/name - [DATABASE]/sqlalchemy_max_overflow
#max_overflow = 50

# Verbosity of SQL debugging information: 0=None, 100=Everything.
# (integer value)
# Deprecated group/name - [DEFAULT]/sql_connection_debug
#connection_debug = 0

# Add Python stack traces to SQL as comment strings. (boolean value)
# Deprecated group/name - [DEFAULT]/sql_connection_trace
#connection_trace = false

# If set, use this value for pool_timeout with SQLAlchemy. (integer
# value)
# Deprecated group/name - [DATABASE]/sqlalchemy_pool_timeout
#pool_timeout = <None>

# Enable the experimental use of database reconnect on connection
# lost. (boolean value)
#use_db_reconnect = false

# Seconds between retries of a database transaction. (integer value)
#db_retry_interval = 1

# If True, increases the interval between retries of a database
# operation up to db_max_retry_interval. (boolean value)
#db_inc_retry_interval = true

# If db_inc_retry_interval is set, the maximum seconds between retries
# of a database operation. (integer value)
#db_max_retry_interval = 10

# Maximum retries in case of connection error or deadlock error before
# error is raised. Set to -1 to specify an infinite retry count.
# (integer value)
#db_max_retries = 20

[matchmaker_redis]

#
# From oslo.messaging
#

# Host to locate redis. (string value)
#host = 127.0.0.1

# Use this port to connect to redis host. (port value)
# Minimum value: 0
# Maximum value: 65535
#port = 6379

# Password for Redis server (optional). (string value)
#password =

# List of Redis Sentinel hosts (fault tolerance mode) e.g.
# [host:port, host1:port ... ] (list value)
#sentinel_hosts =

# Redis replica set name. (string value)
#sentinel_group_name = oslo-messaging-zeromq

# Time in ms to wait between connection attempts. (integer value)
#wait_timeout = 500

# Time in ms to wait before the transaction is killed. (integer value)
#check_timeout = 20000

# Timeout in ms on blocking socket operations (integer value)
#socket_timeout = 1000

[oslo_messaging_amqp]

#
# From oslo.messaging
#

# address prefix used when sending to a specific server (string value)
# Deprecated group/name - [amqp1]/server_request_prefix
#server_request_prefix = exclusive

# address prefix used when broadcasting to all servers (string value)
# Deprecated group/name - [amqp1]/broadcast_prefix
#broadcast_prefix = broadcast

# address prefix when sending to any server in group (string value)
# Deprecated group/name - [amqp1]/group_request_prefix
#group_request_prefix = unicast

# Name for the AMQP container (string value)
# Deprecated group/name - [amqp1]/container_name
#container_name = <None>

# Timeout for inactive connections (in seconds) (integer value)
# Deprecated group/name - [amqp1]/idle_timeout
#idle_timeout = 0

# Debug: dump AMQP frames to stdout (boolean value)
# Deprecated group/name - [amqp1]/trace
#trace = false

# CA certificate PEM file to verify server certificate (string value)
# Deprecated group/name - [amqp1]/ssl_ca_file
#ssl_ca_file =

# Identifying certificate PEM file to present to clients (string
# value)
# Deprecated group/name - [amqp1]/ssl_cert_file
#ssl_cert_file =

# Private key PEM file used to sign cert_file certificate (string
# value)
# Deprecated group/name - [amqp1]/ssl_key_file
#ssl_key_file =

# Password for decrypting ssl_key_file (if encrypted) (string value)
# Deprecated group/name - [amqp1]/ssl_key_password
#ssl_key_password = <None>

# Accept clients using either SSL or plain TCP (boolean value)
# Deprecated group/name - [amqp1]/allow_insecure_clients
#allow_insecure_clients = false

# Space separated list of acceptable SASL mechanisms (string value)
# Deprecated group/name - [amqp1]/sasl_mechanisms
#sasl_mechanisms =

# Path to directory that contains the SASL configuration (string
# value)
# Deprecated group/name - [amqp1]/sasl_config_dir
#sasl_config_dir =

# Name of configuration file (without .conf suffix) (string value)
# Deprecated group/name - [amqp1]/sasl_config_name
#sasl_config_name =

# User name for message broker authentication (string value)
# Deprecated group/name - [amqp1]/username
#username =

# Password for message broker authentication (string value)
# Deprecated group/name - [amqp1]/password
#password =

[oslo_messaging_notifications]

#
# From oslo.messaging
#

# The Drivers(s) to handle sending notifications. Possible values are
# messaging, messagingv2, routing, log, test, noop (multi valued)
# Deprecated group/name - [DEFAULT]/notification_driver
#driver =

# A URL representing the messaging driver to use for notifications. If
# not set, we fall back to the same configuration used for RPC.
# (string value)
# Deprecated group/name - [DEFAULT]/notification_transport_url
#transport_url = <None>

# AMQP topic used for OpenStack notifications. (list value)
# Deprecated group/name - [rpc_notifier2]/topics
# Deprecated group/name - [DEFAULT]/notification_topics
#topics = notifications

[oslo_messaging_rabbit]

#
# From oslo.messaging
#

# Use durable queues in AMQP. (boolean value)
# Deprecated group/name - [DEFAULT]/amqp_durable_queues
# Deprecated group/name - [DEFAULT]/rabbit_durable_queues
#amqp_durable_queues = false

# Auto-delete queues in AMQP. (boolean value)
# Deprecated group/name - [DEFAULT]/amqp_auto_delete
#amqp_auto_delete = false

# SSL version to use (valid only if SSL enabled). Valid values are
# TLSv1 and SSLv23. SSLv2, SSLv3, TLSv1_1, and TLSv1_2 may be
# available on some distributions. (string value)
# Deprecated group/name - [DEFAULT]/kombu_ssl_version
#kombu_ssl_version =

# SSL key file (valid only if SSL enabled). (string value)
# Deprecated group/name - [DEFAULT]/kombu_ssl_keyfile
#kombu_ssl_keyfile =

# SSL cert file (valid only if SSL enabled). (string value)
# Deprecated group/name - [DEFAULT]/kombu_ssl_certfile
#kombu_ssl_certfile =

# SSL certification authority file (valid only if SSL enabled).
# (string value)
# Deprecated group/name - [DEFAULT]/kombu_ssl_ca_certs
#kombu_ssl_ca_certs =

# How long to wait before reconnecting in response to an AMQP consumer
# cancel notification. (floating point value)
# Deprecated group/name - [DEFAULT]/kombu_reconnect_delay
#kombu_reconnect_delay = 1.0

# EXPERIMENTAL: Possible values are: gzip, bz2. If not set compression
# will not be used. This option may not be available in future
# versions. (string value)
#kombu_compression = <None>

# How long to wait for a missing client before abandoning sending it
# its replies. This value should not be longer than
# rpc_response_timeout. (integer value)
# Deprecated group/name - [DEFAULT]/kombu_reconnect_timeout
#kombu_missing_consumer_retry_timeout = 60

# Determines how the next RabbitMQ node is chosen in case the one we
# are currently connected to becomes unavailable. Takes effect only if
# more than one RabbitMQ node is provided in config. (string value)
# Allowed values: round-robin, shuffle
#kombu_failover_strategy = round-robin

# The RabbitMQ broker address where a single node is used. (string
# value)
# Deprecated group/name - [DEFAULT]/rabbit_host
#rabbit_host = localhost
rabbit_host = 192.0.2.1

# The RabbitMQ broker port where a single node is used. (port value)
# Minimum value: 0
# Maximum value: 65535
# Deprecated group/name - [DEFAULT]/rabbit_port
#rabbit_port = 5672

# RabbitMQ HA cluster host:port pairs. (list value)
# Deprecated group/name - [DEFAULT]/rabbit_hosts
#rabbit_hosts = $rabbit_host:$rabbit_port

# Connect over SSL for RabbitMQ. (boolean value)
# Deprecated group/name - [DEFAULT]/rabbit_use_ssl
#rabbit_use_ssl = false

# The RabbitMQ userid. (string value)
# Deprecated group/name - [DEFAULT]/rabbit_userid
#rabbit_userid = guest
rabbit_userid = 3ca2a51eb955e2cc6d94a74ac68823441394f49a

# The RabbitMQ password. (string value)
# Deprecated group/name - [DEFAULT]/rabbit_password
#rabbit_password = guest
rabbit_password = ************************

# The RabbitMQ login method. (string value)
# Deprecated group/name - [DEFAULT]/rabbit_login_method
#rabbit_login_method = AMQPLAIN

# The RabbitMQ virtual host. (string value)
# Deprecated group/name - [DEFAULT]/rabbit_virtual_host
#rabbit_virtual_host = /

# How frequently to retry connecting with RabbitMQ. (integer value)
#rabbit_retry_interval = 1

# How long to backoff for between retries when connecting to RabbitMQ.
# (integer value)
# Deprecated group/name - [DEFAULT]/rabbit_retry_backoff
#rabbit_retry_backoff = 2

# Maximum interval of RabbitMQ connection retries. Default is 30
# seconds. (integer value)
#rabbit_interval_max = 30

# Maximum number of RabbitMQ connection retries. Default is 0
# (infinite retry count). (integer value)
# Deprecated group/name - [DEFAULT]/rabbit_max_retries
#rabbit_max_retries = 0

# Try to use HA queues in RabbitMQ (x-ha-policy: all). If you change
# this option, you must wipe the RabbitMQ database. In RabbitMQ 3.0,
# queue mirroring is no longer controlled by the x-ha-policy argument
# when declaring a queue. If you just want to make sure that all
# queues (except those with auto-generated names) are mirrored across
# all nodes, run:
# "rabbitmqctl set_policy HA '^(?!amq\.).*' '{"ha-mode": "all"}' "
# (boolean value)
# Deprecated group/name - [DEFAULT]/rabbit_ha_queues
#rabbit_ha_queues = false

# Positive integer representing duration in seconds for queue TTL
# (x-expires). Queues which are unused for the duration of the TTL are
# automatically deleted. The parameter affects only reply and fanout
# queues. (integer value)
# Minimum value: 1
#rabbit_transient_queues_ttl = 1800

# Specifies the number of messages to prefetch. Setting to zero allows
# unlimited messages. (integer value)
#rabbit_qos_prefetch_count = 0

# Number of seconds after which the Rabbit broker is considered down
# if heartbeat's keep-alive fails (0 disables the heartbeat).
# EXPERIMENTAL (integer value)
#heartbeat_timeout_threshold = 60

# How many times during the heartbeat_timeout_threshold we check the
# heartbeat. (integer value)
#heartbeat_rate = 2

# Deprecated, use rpc_backend=kombu+memory or rpc_backend=fake
# (boolean value)
# Deprecated group/name - [DEFAULT]/fake_rabbit
#fake_rabbit = false

# Maximum number of channels to allow (integer value)
#channel_max = <None>

# The maximum byte size for an AMQP frame (integer value)
#frame_max = <None>

# How often to send heartbeats for consumer's connections (integer
# value)
#heartbeat_interval = 3

# Enable SSL (boolean value)
#ssl = <None>

# Arguments passed to ssl.wrap_socket (dict value)
#ssl_options = <None>

# Set socket timeout in seconds for connection's socket (floating
# point value)
#socket_timeout = 0.25

# Set TCP_USER_TIMEOUT in seconds for connection's socket (floating
# point value)
#tcp_user_timeout = 0.25

# Set delay for reconnection to some host which has connection error
# (floating point value)
#host_connection_reconnect_delay = 0.25

# Maximum number of connections to keep queued. (integer value)
#pool_max_size = 30

# Maximum number of connections to create above `pool_max_size`.
# (integer value)
#pool_max_overflow = 0

# Default number of seconds to wait for a connection to become available
# (integer value)
#pool_timeout = 30

# Lifetime of a connection (since creation) in seconds or None for no
# recycling. Expired connections are closed on acquire. (integer
# value)
#pool_recycle = 600

# Threshold at which inactive (since release) connections are
# considered stale in seconds or None for no staleness. Stale
# connections are closed on acquire. (integer value)
#pool_stale = 60

# Persist notification messages. (boolean value)
#notification_persistence = false

# Exchange name for sending notifications (string value)
#default_notification_exchange = ${control_exchange}_notification

# Max number of not acknowledged message which RabbitMQ can send to
# notification listener. (integer value)
#notification_listener_prefetch_count = 100

# Reconnecting retry count in case of connectivity problem during
# sending notification, -1 means infinite retry. (integer value)
#default_notification_retry_attempts = -1

# Reconnecting retry delay in case of connectivity problem during
# sending notification message (floating point value)
#notification_retry_delay = 0.25

# Time to live for rpc queues without consumers in seconds. (integer
# value)
#rpc_queue_expiration = 60

# Exchange name for sending RPC messages (string value)
#default_rpc_exchange = ${control_exchange}_rpc

# Exchange name for receiving RPC replies (string value)
#rpc_reply_exchange = ${control_exchange}_rpc_reply

# Max number of not acknowledged message which RabbitMQ can send to
# rpc listener. (integer value)
#rpc_listener_prefetch_count = 100

# Max number of not acknowledged message which RabbitMQ can send to
# rpc reply listener. (integer value)
#rpc_reply_listener_prefetch_count = 100

# Reconnecting retry count in case of connectivity problem during
# sending reply. -1 means infinite retry during rpc_timeout (integer
# value)
#rpc_reply_retry_attempts = -1

# Reconnecting retry delay in case of connectivity problem during
# sending reply. (floating point value)
#rpc_reply_retry_delay = 0.25

# Reconnecting retry count in case of connectivity problem during
# sending RPC message, -1 means infinite retry. If actual retry
# attempts is not 0 the rpc request could be processed more than one
# time (integer value)
#default_rpc_retry_attempts = -1

# Reconnecting retry delay in case of connectivity problem during
# sending RPC message (floating point value)
#rpc_retry_delay = 0.25

[keystone_authtoken]
auth_uri=http://192.0.2.1:5000/v3
identity_uri=http://192.0.2.1:35357
admin_user=mistral
admin_password=**************************
admin_tenant_name=service

Changed in mistral:
importance: Undecided → High
milestone: none → newton-1
Marios Andreou (marios-b) wrote :

+1, I have hit this a lot. I assumed there was not enough RAM on the virt host, which was causing issues for Mistral executions.

Ryan Brady (rbrady) wrote :

I don't think it's a RAM issue. I ran "watch -n 5 'free -m'" in another terminal while running two executions (one that completed and one that failed). During the execution that passed, free memory dropped from 185m to 183m. During the failure, it dropped to 154m and went back up to 184m after the timeout error message appeared.

Ryan Brady (rbrady) wrote :

I worked with Renat today to attempt to debug this issue. I made the following changes to the environment:

1. Added new section to /etc/mistral/mistral.conf to ensure the mistral engine wasn't having its messages stolen by another service using the same topic.

[engine]
topic = mistral_engine_tripleo

2. Added code to https://github.com/openstack/mistral/blob/master/mistral/engine/rpc.py, in the start_action and run_action methods of each of the Client classes, to log the return value.

e.g.

    @wrap_messaging_exception
    def start_action(self, action_name, action_input,
                     description=None, **params):
        """Starts action sending a request to engine over RPC.
        :return: Action execution.
        """
        ret_val = self._client.call(
            auth_ctx.ctx(),
            'start_action',
            action_name=action_name,
            action_input=action_input or {},
            description=description,
            params=params
        )
        LOG.info("[engine]After start action %s", ret_val)
        return ret_val

With this change, the log messages from the ExecutorClient were displayed, but the log messages from the engine were missing from /var/log/mistral/mistral-server.log.
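
For step 1 above, a quick way to confirm that the [engine] topic override is actually present in the config file is to read it back. The following is only a rough sketch: the helper name engine_topic is hypothetical, and it uses Python's configparser rather than oslo.config (which is what Mistral itself uses), so it may not handle every construct a real mistral.conf can contain.

```python
import configparser


def engine_topic(conf_text, default=None):
    """Return the [engine]/topic value from mistral.conf-style text.

    Hypothetical helper for a manual sanity check only; the real
    service parses its configuration with oslo.config.
    """
    # interpolation=None so that '%' characters in values (for example
    # in passwords) are not treated as interpolation syntax.
    # strict=False tolerates duplicate sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    # fallback covers both a missing [engine] section and a missing option.
    return parser.get("engine", "topic", fallback=default)
```

To use it against the real file, pass in the contents of /etc/mistral/mistral.conf, e.g. engine_topic(open("/etc/mistral/mistral.conf").read()). A result of None would mean the override was not picked up from that file.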

Renat's guidance at the end of our session today was:

'Try to run API w/o Apache, if it works then we need to look at how Mistral API is configured for launching with Apache init script or something. My guess is that the RPC executor is not properly configured, e.g. look at: https://github.com/openstack/mistral/blob/master/mistral/cmd/launch.py#L125. In the init script we may want to change "eventlet" to "blocking" because I think eventlet won't work with Apache and hence RPC messages won't be handled'

-r

Ryan Brady (rbrady) wrote :

I was able to test running the API without Apache/WSGI and it ran with a 100% success rate. To check this, I created a quick bash script that runs mistral run-action from the CLI in a loop of 100 iterations. We still need to figure out how to run the Mistral API with WSGI.

Steps taken:

Disabled API WSGI:

1. removed /etc/httpd/conf.d/10-mistral_wsgi.conf
2. commented out "Listen 8989" in /etc/httpd/conf/ports.conf and restarted httpd

Started Mistral with cmd/launch.py:

mistral-server --config-file=/etc/mistral/mistral.conf --server=api
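
The bash script used for the 100-iteration test is not included in this report. A minimal sketch of the same kind of loop, written in Python, might look like the following. The function name run_repeatedly and the timeout value are hypothetical, and it assumes that the mistral CLI exits with a non-zero status when it prints the MessagingTimeout error.

```python
import subprocess
import sys


def run_repeatedly(cmd, iterations=100, timeout=120):
    """Run cmd `iterations` times and return (successes, failures).

    A run counts as a success only if the process exits with status 0
    (assumption: the CLI returns non-zero on MessagingTimeout).
    """
    successes = 0
    failures = 0
    for _ in range(iterations):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                timeout=timeout,
            )
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            # A hung client is counted as a failure as well.
            ok = False
        if ok:
            successes += 1
        else:
            failures += 1
    return successes, failures


# Command taken from the bug description; only the message body differs.
MISTRAL_CMD = [
    "mistral", "run-action", "zaqar.queue_post",
    '{"queue_name": "test_queue", "messages": [{"body": "loop test"}]}',
]
```

Calling run_repeatedly(MISTRAL_CMD) on the undercloud would then return the number of successful and failed runs, which can be compared between the Apache/WSGI setup and the standalone API server.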

Ryan Brady (rbrady) wrote :

Here is the code for the WSGI app that drives the mistral api. I'm looking at the launch.py file now to see if there are changes I can make.

from mistral.api import app
from mistral import config

# By default, oslo.config parses the CLI args if no args is provided.
# As a result, invoking this wsgi script from gunicorn leads to the error
# with argparse complaining that the CLI options have already been parsed.
config.parse_args(args=[])

application = app.setup_app()

Ryan Brady (rbrady)
Changed in mistral:
assignee: nobody → Ryan Brady (rbrady)
Changed in mistral:
milestone: newton-1 → newton-2
Changed in mistral:
milestone: newton-2 → newton-3
Renat Akhmerov (rakhmerov) wrote :

I closed the bug for now as I believe it was fixed as a side effect of the work that we did over the last few months. If it pops up again, we can re-open it.

Changed in mistral:
status: New → Fix Released