auditd fails after moving /var it a new filesystem and turning /var/run into a symlink to /run

Bug #1873627 reported by Trey Schisser
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
audit (Ubuntu)
Confirmed
Undecided
Unassigned

Bug Description

Auditd was working on my system (Ubuntu 18.04LTS, kernel 4.15.0-1065-aws) until recently. But after splitting off /var into a new filesystem it fails to launch.

running '/sbin/auditd -f' as root indicates a problem writing the pid file (no file exists even when it says one does) Post config load command output:
Started dispatcher: /sbin/audispd pid: 16927
type=DAEMON_START msg=audit(1587280022.692:2019): op=start ver=2.8.2 format=raw kernel=4.15.0-1065-aws auid=878601141 pid=16925 uid=0 ses=24 subj=unconfined res=success
config_manager init complete
Error setting audit daemon pid (File exists)
type=DAEMON_ABORT msg=audit(1587280022.692:2020): op=set-pid auid=878601141 pid=16925 uid=0 ses=24 subj=unconfined res=failed
Unable to set audit pid, exiting
The audit daemon is exiting.
Error setting audit daemon pid (Permission denied)

/var/run is a symlink to /run
/var/run permissions are 777 root:root
/run permissions are 755f root:root
no /run/auditd.pid and subsiquently no /var/run/auditd.pid exists (even though the error incorrectly reports otherwise.

/var/log/audit/audit.log output
type=DAEMON_START msg=audit(1587278222.942:5617): op=start ver=2.8.2 format=raw kernel=4.15.0-1065-aws auid=4294967295 pid=7529 uid=0 ses=4294967295 subj=unconf
ined res=success
type=DAEMON_ABORT msg=audit(1587278222.943:5618): op=set-pid auid=4294967295 pid=7529 uid=0 ses=4294967295 subj=unconfined res=failed

I have been pulling my hair out over this one. So I ran 'strace /sbin/auditd -f' and found the following line in the output.
"openat(AT_FDCWD, "/var/run/auditd.pid", O_WRONLY|O_CREAT|O_TRUNC|O_NOFOLLOW, 0644) = 4"
I am grasping at straws, but suspect that the O_NOFOLLOW option is causing a failure in creating the pid file since /var/run is a symlink. I could be wrong but I can't find anything else to suspect.

Since it is best practice to split/var into a separate file system to prevent filling the root filesystem in case of an unexpected increase in log collection I suspect this is a bug. So either the system needs to be able to follow symlinks or an option such as pid_file=[filepath] needs to be available in /etc/audit/auditd.conf.

Revision history for this message
Seth Arnold (seth-arnold) wrote :

Running under strace may change the execution environment enough that it's not reflective of the actual error, but it's still worth a shot -- can you pastebin the whole auditd strace logs? That openat() line is actually a success -- the error we're looking for will come from the audit_set_pid(3) function, which uses netlink, which is an incredibly complicated protocol. The error may not look like an error in strace output.

Is there any chance the kernel has logged whatever the failure was in dmesg output?

Thanks

Revision history for this message
Trey Schisser (treys) wrote : Re: [Bug 1873627] Re: auditd fails after moving /var it a new filesystem and turning /var/run into a symlink to /run
Download full text (4.9 KiB)

Unfortunately I can't, because I fixed the problem with a workaround and
can't recreate the problem on _this_ server. My workaround was to mount the
new filesystem as /var/log (since the goal was to keep logs from filling up
the root file system), leaving the /var/run symlink on the same filesystem
as /run and now everything works.

If you give me a couple of days I can throw up a new server and see if I
can reproduce the behavior.

Trey Schisser
Waveland Technologies - https://wavelandrcm.com
Director of Security Operations and IT Infrastructure
<email address hidden>
*mobile* 512-496-6660

Never send passwords, regulated data, or confidential information via
unencrypted email.
Please use Signal (https://signal.org) to send short secure messages
(512-496-6660)
Or PGP to send encrypted messages via email (See attached public key)

PGP Public key for <email address hidden>: F056 40E6 AEE2 EB92

-----BEGIN PGP PUBLIC KEY BLOCK-----Version: FlowCrypt 7.6.0 Gmail Encryption
Comment: Seamlessly send and receive encrypted
emailxjMEXmbFHhYJKwYBBAHaRw8BAQdAaaneOd78NEVLKtQROusVcda2zUSTeF2o
NCWvClyafb3NKVRyZXkgU2NoaXNzZXIgPHRzY2hpc3NlckB3YXZlbGFuZHJj
bS5jb20+wncEEBYKAB8FAl5mxR4GCwkHCAMCBBUICgIDFgIBAhkBAhsDAh4B
AAoJEPBWQOau4uuSIcYBAImzI7Dc+63PVDI3OTyL6VoDZOegP1dnC1Jug3wu
1uYBAP4/bZz5Pa1qNgsAuB+HfptfdJvq+EQkbFQv4t7oDwfVAs44BF5mxR4S
CisGAQQBl1UBBQEBB0A5TvjO/keeyllJllXC1fvwKNvADE/T+gNXZJ8EzYVC
GAMBCAfCYQQYFggACQUCXmbFHgIbDAAKCRDwVkDmruLrkkO7AP9CUat2JSbw
nk5fIpKG23eLrZOZ1JGhHQMDYMus/kOXowD/bNVOZ/yoWZ4cWq7gV9/3k3dD
4pxkHA1GaPmQwKr2/gI=
=fqR7
-----END PGP PUBLIC KEY BLOCK-----

On Mon, Apr 20, 2020 at 4:40 PM Seth Arnold <email address hidden>
wrote:

> Running under strace may change the execution environment enough that
> it's not reflective of the actual error, but it's still worth a shot --
> can you pastebin the whole auditd strace logs? That openat() line is
> actually a success -- the error we're looking for will come from the
> audit_set_pid(3) function, which uses netlink, which is an incredibly
> complicated protocol. The error may not look like an error in strace
> output.
>
> Is there any chance the kernel has logged whatever the failure was in
> dmesg output?
>
> Thanks
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1873627
>
> Title:
> auditd fails after moving /var it a new filesystem and turning
> /var/run into a symlink to /run
>
> Status in audit package in Ubuntu:
> New
>
> Bug description:
> Auditd was working on my system (Ubuntu 18.04LTS, kernel
> 4.15.0-1065-aws) until recently. But after splitting off /var into a
> new filesystem it fails to launch.
>
> running '/sbin/auditd -f' as root indicates a problem writing the pid
> file (no file exists even when it says one does) Post config load command
> output:
> Started dispatcher: /sbin/audispd pid: 16927
> type=DAEMON_START msg=audit(1587280022.692:2019): op=start ver=2.8.2
> format=raw kernel=4.15.0-1065-aws auid=878601141 pid=16925 uid=0 ses=24
> subj=unconfined res=success
> config_manager init complete
> Error setting audit daemon pid (File exists)
> type=DAEMON...

Read more...

Revision history for this message
Trey Schisser (treys) wrote :
Download full text (4.6 KiB)

Alright, close the bug, I have built two different machines and tried to
reproduce the problem several different ways, but have been unable to do
so. Sorry to bother you with this.

Trey Schisser
Waveland Technologies - https://wavelandrcm.com
Director of Security Operations and IT Infrastructure
<email address hidden>
*mobile* 512-496-6660

Never send passwords, regulated data, or confidential information via
unencrypted email.
Please use Signal (https://signal.org) to send short secure messages
(512-496-6660)
Or PGP to send encrypted messages via email (See attached public key)

PGP Public key for <email address hidden>: F056 40E6 AEE2 EB92

-----BEGIN PGP PUBLIC KEY BLOCK-----Version: FlowCrypt 7.6.0 Gmail Encryption
Comment: Seamlessly send and receive encrypted
emailxjMEXmbFHhYJKwYBBAHaRw8BAQdAaaneOd78NEVLKtQROusVcda2zUSTeF2o
NCWvClyafb3NKVRyZXkgU2NoaXNzZXIgPHRzY2hpc3NlckB3YXZlbGFuZHJj
bS5jb20+wncEEBYKAB8FAl5mxR4GCwkHCAMCBBUICgIDFgIBAhkBAhsDAh4B
AAoJEPBWQOau4uuSIcYBAImzI7Dc+63PVDI3OTyL6VoDZOegP1dnC1Jug3wu
1uYBAP4/bZz5Pa1qNgsAuB+HfptfdJvq+EQkbFQv4t7oDwfVAs44BF5mxR4S
CisGAQQBl1UBBQEBB0A5TvjO/keeyllJllXC1fvwKNvADE/T+gNXZJ8EzYVC
GAMBCAfCYQQYFggACQUCXmbFHgIbDAAKCRDwVkDmruLrkkO7AP9CUat2JSbw
nk5fIpKG23eLrZOZ1JGhHQMDYMus/kOXowD/bNVOZ/yoWZ4cWq7gV9/3k3dD
4pxkHA1GaPmQwKr2/gI=
=fqR7
-----END PGP PUBLIC KEY BLOCK-----

On Mon, Apr 20, 2020 at 4:40 PM Seth Arnold <email address hidden>
wrote:

> Running under strace may change the execution environment enough that
> it's not reflective of the actual error, but it's still worth a shot --
> can you pastebin the whole auditd strace logs? That openat() line is
> actually a success -- the error we're looking for will come from the
> audit_set_pid(3) function, which uses netlink, which is an incredibly
> complicated protocol. The error may not look like an error in strace
> output.
>
> Is there any chance the kernel has logged whatever the failure was in
> dmesg output?
>
> Thanks
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1873627
>
> Title:
> auditd fails after moving /var it a new filesystem and turning
> /var/run into a symlink to /run
>
> Status in audit package in Ubuntu:
> New
>
> Bug description:
> Auditd was working on my system (Ubuntu 18.04LTS, kernel
> 4.15.0-1065-aws) until recently. But after splitting off /var into a
> new filesystem it fails to launch.
>
> running '/sbin/auditd -f' as root indicates a problem writing the pid
> file (no file exists even when it says one does) Post config load command
> output:
> Started dispatcher: /sbin/audispd pid: 16927
> type=DAEMON_START msg=audit(1587280022.692:2019): op=start ver=2.8.2
> format=raw kernel=4.15.0-1065-aws auid=878601141 pid=16925 uid=0 ses=24
> subj=unconfined res=success
> config_manager init complete
> Error setting audit daemon pid (File exists)
> type=DAEMON_ABORT msg=audit(1587280022.692:2020): op=set-pid
> auid=878601141 pid=16925 uid=0 ses=24 subj=unconfined res=failed
> Unable to set audit pid, exiting
> The audit daemon is exiting.
> Error setting audit daemon pid (Permission denied)
>
> /var/r...

Read more...

Revision history for this message
supporto (inetworking) wrote :
Download full text (3.4 KiB)

Hi,
I'm trying to install auditd on my system (Ubuntu 20.04.2 LTS, kernel 5.4.0-72-generic) but I've got the same problem:

# systemctl status auditd
● auditd.service - Security Auditing Service
     Loaded: loaded (/lib/systemd/system/auditd.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2021-05-13 11:52:40 CEST; 22min ago
       Docs: man:auditd(8)
             https://github.com/linux-audit/audit-documentation
    Process: 2715947 ExecStart=/sbin/auditd (code=exited, status=1/FAILURE)

May 13 11:52:40 ***.***.** audispd[2715950]: No plugins found, exiting
May 13 11:52:40 ***.***.** auditd[2715948]: Error setting audit daemon pid (File exists)
May 13 11:52:40 ***.***.** auditd[2715948]: Unable to set audit pid, exiting
May 13 11:52:40 ***.***.** auditd[2715948]: The audit daemon is exiting.
May 13 11:52:40 ***.***.** auditd[2715948]: Error setting audit daemon pid (Permission denied)
May 13 11:52:40 ***.***.** auditd[2715947]: Cannot daemonize (Success)
May 13 11:52:40 ***.***.** auditd[2715947]: The audit daemon is exiting.
May 13 11:52:40 ***.***.** systemd[1]: auditd.service: Control process exited, code=exited, status=1/FAILURE
May 13 11:52:40 ***.***.** systemd[1]: auditd.service: Failed with result 'exit-code'.
May 13 11:52:40 ***.***.** systemd[1]: Failed to start Security Auditing Service.

I've modified the path for pid from /run to /var/run but nothing changed.
This is the output from auditd -f:

Config file /etc/audit/auditd.conf opened for parsing
local_events_parser called with: yes
write_logs_parser called with: yes
log_file_parser called with: /var/log/audit/audit.log
log_group_parser called with: adm
log_format_parser called with: RAW
flush_parser called with: INCREMENTAL_ASYNC
freq_parser called with: 50
max_log_size_parser called with: 8
num_logs_parser called with: 5
priority_boost_parser called with: 4
qos_parser called with: lossy
dispatch_parser called with: /sbin/audispd
name_format_parser called with: NONE
max_log_size_action_parser called with: ROTATE
space_left_parser called with: 75
space_action_parser called with: SYSLOG
verify_email_parser called with: yes
action_mail_acct_parser called with: root
admin_space_left_parser called with: 50
admin_space_left_action_parser called with: SUSPEND
disk_full_action_parser called with: SUSPEND
disk_error_action_parser called with: SUSPEND
use_libwrap_parser called with: yes
tcp_listen_queue_parser called with: 5
Listener support is not enabled, ignoring value at line 30
tcp_max_per_addr_parser called with: 1
Listener support is not enabled, ignoring value at line 31
tcp_client_max_idle_parser called with: 0
Listener support is not enabled, ignoring value at line 33
enable_krb5_parser called with: no
krb5_principal_parser called with: auditd
distribute_network_parser called with: no
Started dispatcher: /sbin/audispd pid: 2734164
type=DAEMON_START msg=audit(1620901085.986:3240): op=start ver=2.8.5 format=raw kernel=5.4.0-72-generic auid=444800001 pid=2734162 uid=0 ses=105896 subj=unconfined res=success
config_manager init complete
Error setting audit daemon pid (File exists)
type=DAEMON_ABORT msg=audit(1620901085.988:3241): op=se...

Read more...

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in audit (Ubuntu):
status: New → Confirmed
Revision history for this message
Seth Arnold (seth-arnold) wrote :
Download full text (4.2 KiB)

Thanks for the strace, these looked like the 'important' parts:

sendto(3, {{len=56, type=AUDIT_SET, flags=NLM_F_REQUEST|NLM_F_ACK, seq=3, pid=0}, "\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa2\xb8\x29\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"...}, 56, 0, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 56
poll([{fd=3, events=POLLIN}], 1, 500) = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, {{len=76, type=NLMSG_ERROR, flags=0, seq=3, pid=2734242}, {error=-EEXIST, msg={{len=56, type=AUDIT_SET, flags=NLM_F_REQUEST|NLM_F_ACK, seq=3, pid=0}, "\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa2\xb8\x29\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"...}}}, 8988, MSG_PEEK|MSG_DONTWAIT, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, [12]) = 76
recvfrom(3, {{len=76, type=NLMSG_ERROR, flags=0, seq=3, pid=2734242}, {error=-EEXIST, msg={{len=56, type=AUDIT_SET, flags=NLM_F_REQUEST|NLM_F_ACK, seq=3, pid=0}, "\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa2\xb8\x29\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"...}}}, 8988, MSG_DONTWAIT, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, [12]) = 76
write(2, "Error setting audit daemon pid ("..., 44Error setting audit daemon pid (File exists)) = 44

...

write(2, "The audit daemon is exiting.", 28The audit daemon is exiting.) = 28
write(2, "\n", 1
) = 1
sendto(3, {{len=56, type=AUDIT_SET, flags=NLM_F_REQUEST|NLM_F_ACK, seq=4, pid=0}, "\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"...}, 56, 0, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 56
poll([{fd=3, events=POLLIN}], 1, 500) = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, {{len=76, type=NLMSG_ERROR, flags=0, seq=4, pid=2734242}, {error=-EACCES, msg={{len=56, type=AUDIT_SET, flags=NLM_F_REQUEST|NLM_F_ACK, seq=4, pid=0}, "\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"...}}}, 8988, MSG_PEEK|MSG_DONTWAIT, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, [12]) = 76
recvfrom(3, {{len=76, type=NLMSG_ERROR, flags=0, seq=4, pid=2734242}, {error=-EACCES, msg={{len=56, type=AUDIT_SET, flags=NLM_F_REQUEST|NLM_F_ACK, seq=4, pid=0}, "\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"...}}}, 8988, MSG_DONTWAIT, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, [12]) = 76
write(2, "Error setting audit daemon pid ("..., 50Error setting audit daemon pid (Permission denied)) = 50
write(2, "\n", 1
) = 1

I don't understand why it's issuing an AUDIT_SET command after it already decided to exit -- maybe it's just trying to tear itself down cleanly.

I found a few cases in the kernel code for returning both file exists and permission denied:

kernel/audit.c audit_netlink_ok():

                /* Only support auditd and auditctl in initial pid namespace
                 * for now. */
                if (task_active_pid_ns(current) != &init_pid_ns)
                        re...

Read more...

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.