td-agent user does not have permission to create /var/run/fluentd

Bug #1844574 reported by Ken Vondersaar
This bug affects 1 person
Affects: kolla
Status: Expired
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

When elasticsearch is configured as an output, fluentd crashes continuously with the following stack trace:

2019-09-18 13:38:24 -0500 [error]: #0 unexpected error error_class=Errno::EACCES error="Permission denied @ dir_s_mkdir - /var/run/fluentd"
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/fileutils.rb:230:in `mkdir'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/fileutils.rb:230:in `fu_mkdir'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/fileutils.rb:208:in `block (2 levels) in mkdir_p'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/fileutils.rb:206:in `reverse_each'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/fileutils.rb:206:in `block in mkdir_p'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/fileutils.rb:191:in `each'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/fileutils.rb:191:in `mkdir_p'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/plugin/in_tail.rb:192:in `start'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/root_agent.rb:203:in `block in start'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/root_agent.rb:192:in `block (2 levels) in lifecycle'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/root_agent.rb:191:in `each'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/root_agent.rb:191:in `block in lifecycle'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/root_agent.rb:178:in `each'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/root_agent.rb:178:in `lifecycle'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/root_agent.rb:202:in `start'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/engine.rb:274:in `start'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/engine.rb:219:in `run'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/supervisor.rb:805:in `run_engine'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/supervisor.rb:549:in `block in run_worker'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/supervisor.rb:730:in `main_process'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/supervisor.rb:544:in `run_worker'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/lib/fluent/command/fluentd.rb:316:in `<top (required)>'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.4.2/bin/fluentd:8:in `<top (required)>'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/bin/fluentd:23:in `load'
  2019-09-18 13:38:24 -0500 [error]: #0 /opt/td-agent/embedded/bin/fluentd:23:in `<top (required)>'
  2019-09-18 13:38:24 -0500 [error]: #0 /usr/sbin/td-agent:7:in `load'
  2019-09-18 13:38:24 -0500 [error]: #0 /usr/sbin/td-agent:7:in `<main>'
2019-09-18 13:38:24 -0500 [error]: #0 unexpected error error_class=Errno::EACCES error="Permission denied @ dir_s_mkdir - /var/run/fluentd"
  2019-09-18 13:38:24 -0500 [error]: #0 suppressed same stacktrace
2019-09-18 13:38:24 -0500 [info]: Worker 0 finished unexpectedly with status 1

Creating the directory inside the container and assigning ownership to the td-agent user resolves the issue.
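
The workaround can be sketched roughly as follows. This is only an illustration, not the fix that kolla adopted: the helper name `ensure_run_dir` is hypothetical, and it assumes the code runs as root inside the container and that a `td-agent` user exists there.

```python
import os
import shutil


def ensure_run_dir(path="/var/run/fluentd", user="td-agent", group=None):
    """Create `path` if it is missing and hand ownership to `user`.

    Hypothetical helper. The defaults mirror the directory and user named
    in this report; changing ownership to another user requires root.
    """
    os.makedirs(path, exist_ok=True)  # equivalent of `mkdir -p`
    shutil.chown(path, user=user, group=group)  # user may be a name or a uid
    return path
```

Calling `ensure_run_dir()` with the defaults before td-agent starts should leave a directory the worker can write to. Note that /var/run is often recreated on container restart, so the change may not persist unless it is baked into the image.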

Mark Goddard (mgoddard) wrote :

Interesting. I think I saw a fix for this in a patch.

Changed in kolla:
importance: Undecided → Medium
status: New → Triaged
Mark Goddard (mgoddard) wrote :

I think this happens because td-agent uses /var/run/td-agent, rather than /var/run/fluentd. Perhaps the plugin is using the wrong directory for some reason?
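
One way to test this hypothesis is to list the directories the fluentd configuration expects to exist. The trace fails in `mkdir_p`, called from `start` in in_tail.rb. The sketch below assumes that call creates the parent directory of a configured `pos_file`, which this report does not confirm. All function names are hypothetical, and the path in the comment is illustrative only.

```python
import os
import re

# Matches config lines such as "pos_file /var/run/fluentd/example.pos"
# (illustrative path, not taken from the kolla configuration).
POS_FILE_RE = re.compile(r"^\s*pos_file\s+(\S+)", re.MULTILINE)


def pos_file_dirs(config_text):
    """Return the distinct parent directories of every pos_file entry."""
    paths = POS_FILE_RE.findall(config_text)
    return sorted({os.path.dirname(os.path.abspath(p)) for p in paths})


def nearest_existing(path):
    """Walk up from `path` to the closest ancestor that exists."""
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def uncreatable_dirs(config_text):
    """List missing pos_file directories the current user could not create."""
    bad = []
    for d in pos_file_dirs(config_text):
        base = nearest_existing(d)
        # A missing directory can only be created if its nearest existing
        # ancestor is writable and searchable by the current user.
        if base != d and not os.access(base, os.W_OK | os.X_OK):
            bad.append(d)
    return bad
```

Run as the td-agent user against the rendered configuration, `uncreatable_dirs` would be expected to report /var/run/fluentd if the configuration points there while only /var/run/td-agent is writable. Run as root it reports nothing, because root passes the access check.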

Mark Goddard (mgoddard) wrote :

Related patch now extracted to this review: https://review.opendev.org/#/c/683676.

Radosław Piliszek (yoctozepto) wrote :

If this is Debian, then yes. We already fixed that for Ubuntu and should do the same for Debian, since it is no different.

Radosław Piliszek (yoctozepto) wrote :

On Debian we have fluentd at the moment, so it cannot be Debian then.

@Ken - please give us more details: the distro used in the images, the host architecture, the build type (source/binary), and any specific extra config you have.

Changed in kolla:
status: Triaged → Incomplete
Ken Vondersaar (kvondersaar) wrote :

Sorry about the long delay.

The host is Ubuntu on x86_64:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic

Container is kolla/ubuntu-source-fluentd:stein

Extra config: I've enabled external elasticsearch.

I believe I had the same issue with binary containers as well. I'll attempt to verify and report back.

Let me know if there's anything else needed.

Mark Goddard (mgoddard) wrote :

Enabling elasticsearch is an interesting data point. The related patch (https://review.opendev.org/#/c/683676) was actually abandoned. We've had a number of changes to fluentd in master, so I think this scenario needs testing again with those changes in place.

Launchpad Janitor (janitor) wrote :

[Expired for kolla because there has been no activity for 60 days.]

Changed in kolla:
status: Incomplete → Expired