When SSL endpoint configured via Vault, ceilometer-agent-notification fails to connect to gnocchi

Bug #1867924 reported by Yoshi Kadokawa
This bug affects 1 person
Affects: OpenStack Ceilometer Charm
Status: Fix Released
Importance: High
Assigned to: Liam Young

Bug Description

When deploying OpenStack Train with Ceilometer and Gnocchi,
and configuring all endpoints as TLS terminated with Vault,
the ceilometer-agent-notification service fails to connect to gnocchi.
As a result, it fails to collect metrics.

This is reproducible with the following bundle and network overlay.

Bundle (based on the openstack-base bundle, adding ceilometer, gnocchi, and vault):
https://pastebin.ubuntu.com/p/2d87W7QvqS/

Overlays:
https://pastebin.ubuntu.com/p/vBQPqWkQNr/

After everything is deployed (Vault is unsealed and ceilometer-upgrade is done),
create any resource, for example upload a glance image,
and you will see the following error in /var/log/ceilometer/ceilometer-agent-notification.log on ceilometer/0:

2020-03-18 13:05:53.565 28968 ERROR ceilometer.pipeline.sample [-] Pipeline meter_sink: Continue after error from publisher <ceilometer.publisher.gnocchi.GnocchiPublisher object at 0x7fc2de6ab4a8>: gnocchiclient.exceptions.BadRequest: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
Reason: You're speaking plain HTTP to an SSL-enabled server port.<br />
 Instead use the HTTPS scheme to access this URL, please.<br />
</p>
<hr>
<address>Apache/2.4.29 (Ubuntu) Server at 172.16.10.204 Port 443</address>
</body></html>
 (HTTP 400)

So it looks like ceilometer-agent-notification still tries to connect to gnocchi over http, not https.
As a workaround, restarting the ceilometer-agent-notification service works for now.
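
To tell whether a unit has hit this symptom, the log can be searched for the Apache message quoted above. The following is a minimal sketch only: `find_plain_http_errors` and `PLAIN_HTTP_MARKER` are hypothetical names, not part of ceilometer or the charm, and the sample text is copied from the error in this report.

```python
# Hypothetical helper for illustration; not part of ceilometer or the charm.
PLAIN_HTTP_MARKER = "You're speaking plain HTTP to an SSL-enabled server port"


def find_plain_http_errors(log_text):
    """Return the 1-based line numbers on which the Apache marker appears."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if PLAIN_HTTP_MARKER in line
    ]


# Sample taken from the error body quoted in this report.
sample = "\n".join([
    "<p>Your browser sent a request that this server could not understand.<br />",
    "Reason: You're speaking plain HTTP to an SSL-enabled server port.<br />",
    " Instead use the HTTPS scheme to access this URL, please.<br />",
])
print(find_plain_http_errors(sample))
```

In practice the text would come from reading /var/log/ceilometer/ceilometer-agent-notification.log on the affected unit.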

Yoshi Kadokawa (yoshikadokawa) wrote :

I have attached the juju crashdump

Nobuto Murata (nobuto) wrote :
Yoshi Kadokawa (yoshikadokawa) wrote :

This issue is still reproducible with the latest charm (tested on 2020-07-22).
This is the bundle that I tested with:
https://pastebin.ubuntu.com/p/kWQnbz5tjc/

1. deploy the bundle
2. initialize vault
3. upload a glance image
4. check the log in /var/log/ceilometer/ceilometer-agent-notification.log in ceilometer/0
5. the following command also returns empty output
$ openstack metric list

I will attach the juju crashdump later.

Yoshi Kadokawa (yoshikadokawa) wrote :
Yoshi Kadokawa (yoshikadokawa) wrote :

Also subscribing this to field-high.

James Page (james-page) wrote :

I think you are missing a relation between ceilometer and keystone:

  juju add-relation ceilometer:identity-notifications keystone:identity-notifications

When endpoints get updated in keystone, this relation triggers a restart of the evaluator and notifier agents, which otherwise keep using the endpoints they cached at a previous startup.
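
The effect of a cached endpoint can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `Catalog` and `NotificationAgent` are stand-ins, not ceilometer or keystone classes, and the `gnocchi.example` URLs are placeholders. It only shows why an agent that resolves its endpoint once at startup keeps using http after the catalog has switched to https, until it is restarted.

```python
class Catalog:
    """Stand-in for the keystone service catalog (hypothetical)."""

    def __init__(self, gnocchi_url):
        self.gnocchi_url = gnocchi_url


class NotificationAgent:
    """Stand-in for an agent that resolves its endpoint once at startup."""

    def __init__(self, catalog):
        self._catalog = catalog
        # The endpoint is looked up once and then reused for every request.
        self.endpoint = catalog.gnocchi_url

    def restart(self):
        # A restart is the only point at which the endpoint is looked up again.
        self.endpoint = self._catalog.gnocchi_url


catalog = Catalog("http://gnocchi.example/")
agent = NotificationAgent(catalog)

# TLS is enabled later, so the catalog entry changes to https.
catalog.gnocchi_url = "https://gnocchi.example/"

stale = agent.endpoint   # still the http URL cached at startup
agent.restart()
fresh = agent.endpoint   # https URL picked up after the restart
print(stale, fresh)
```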

Changed in charm-ceilometer:
status: New → Incomplete
importance: Undecided → High
James Page (james-page) wrote :

Checked our reference bundle and it has:

- - ceilometer:identity-notifications
  - keystone:identity-notifications

James Page (james-page) wrote :

Raised bug 1888515 against FCE for this missing relation

Changed in charm-ceilometer:
status: Incomplete → Invalid
Yoshi Kadokawa (yoshikadokawa) wrote :

Thank you James.
I have tried with the revised bundle (I just added the relation you pointed out).
https://pastebin.ubuntu.com/p/dG9RXW3mnM/

However, it looks like I'm still seeing the same issue.

Changed in charm-ceilometer:
status: Invalid → New
Yoshi Kadokawa (yoshikadokawa) wrote :

Here is the new juju crashdump retrieved from an environment that has the relation
"ceilometer:identity-notifications", "keystone:identity-notifications"

Chris MacNaughton (chris.macnaughton) wrote :

In the latest crashdump, was the ceilometer-agent-notification service restarted manually? In the logs, I do see the referenced error, immediately followed by a graceful service restart and no further errors, so it looks like the failure was observed before the keystone notifications had finished propagating?

Changed in charm-ceilometer:
status: New → Incomplete
Yoshi Kadokawa (yoshikadokawa) wrote :

Hello Chris,

Yes, I restarted the service manually after the symptom was observed,
so the service was not restarted by Juju.
Just to clarify, the error can only be observed when you create a resource on OpenStack (e.g. create a Glance image).

Liam Young (gnuoy)
Changed in charm-ceilometer:
assignee: nobody → Liam Young (gnuoy)
Changed in charm-ceilometer:
milestone: none → 20.10
Liam Young (gnuoy) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceilometer (master)

Reviewed: https://review.opendev.org/747128
Committed: https://git.openstack.org/cgit/openstack/charm-ceilometer/commit/?id=e3982a2d98a98c164890734d1dada1b1aa44ba9e
Submitter: Zuul
Branch: master

commit e3982a2d98a98c164890734d1dada1b1aa44ba9e
Author: Liam Young <email address hidden>
Date: Thu Aug 20 09:10:18 2020 +0000

    Fix restart when endpoint notification is received

    Restarts were configured only when the ceilometer agent endpoint
    had changed and only alarm services were triggered to be restarted.
    This change adds a check for the gnocchi service having changed
    too and restarts all services to be on the safe side.

    Closes-Bug: #1867924
    Change-Id: I48e2f079e2db640d485bc74bfc2cedfd7e82ac84
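
The decision described in the commit message can be sketched roughly as follows. This is not the charm's actual code, which is not shown in this report. `services_to_restart` and `WATCHED_SERVICES` are hypothetical names, and representing the notification as a set of changed service names is an assumption.

```python
# Hypothetical sketch of the restart decision; not the charm's real hook code.
# Assumption: the notification is reduced to a set of changed service names.
WATCHED_SERVICES = {"ceilometer", "gnocchi"}


def services_to_restart(changed_endpoints, all_services):
    """Return every service if a watched endpoint changed, otherwise none."""
    if WATCHED_SERVICES & set(changed_endpoints):
        # Restart everything "to be on the safe side", as the commit puts it.
        return list(all_services)
    return []


# Placeholder service list; only the first name appears in this report.
services = ["ceilometer-agent-notification", "some-other-ceilometer-service"]
print(services_to_restart({"gnocchi"}, services))
print(services_to_restart({"glance"}, services))
```

Before the fix, by the commit's own description, only a change to the ceilometer endpoint was checked and only the alarm services were restarted, which is why a gnocchi endpoint change to https left ceilometer-agent-notification running with the old URL.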

Changed in charm-ceilometer:
status: In Progress → Fix Committed
Yoshi Kadokawa (yoshikadokawa) wrote :

I can confirm that the issue is resolved with the -next charm for ceilometer [0].

[0] cs:~openstack-charmers-next/ceilometer-375

Changed in charm-ceilometer:
status: Fix Committed → Fix Released