Passive FTP is not handled properly by the ip_vs_ftp module

Bug #1453180 reported by Shawn Heisey
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: Medium
Assigned to: Unassigned

Bug Description

I have a setup on CentOS 5 (kernel 2.6.18-128.1.6.el5.centos.plus, ipvsadm v1.24, ldirectord v1.186-ha-2.1.3) that handles passive FTP perfectly. I'm migrating because the software on that system is very old.

After migrating the config to Ubuntu 14, fully updated with aptitude, only active FTP works. The kernel is 3.13.0-52-generic, ipvsadm is v1.26, and ldirectord is v1.186-ha -- all are installed from Ubuntu packages.

root@lb1:~# lsb_release -rd
Description: Ubuntu 14.04.2 LTS
Release: 14.04
root@lb1:~# uname -a
Linux lb1 3.13.0-52-generic #86-Ubuntu SMP Mon May 4 04:32:59 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Passive FTP, which should be handled by the ip_vs_ftp module, doesn't work properly. The control channel works, but data connections are never established. The ip_vs_ftp module is loaded from /etc/rc.local and the system has been rebooted a number of times. The ldirectord process is not started by upstart; it is started by pacemaker.

The LVS load balancer is being configured by ldirectord. This is the ldirectord config:

checktimeout=5
checkinterval=10
negotiatetimeout=20
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=no

virtual=XX.XXX.XXX.71:21
        fallback=127.0.0.1:21
        real=10.100.2.61:21 masq 65535
        real=10.100.2.60:21 masq 1
        service=ftp
        request="monitortest.txt"
        receive="good"
        login="lbtest"
        passwd="PASSWD"
        scheduler=wrr
        protocol=tcp
        checktype=negotiate

On both CentOS 5 and Ubuntu 14, the machine has actual public IP addresses on it, and that virtual address is a public IP. The firewall is disabled.

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-52-generic 3.13.0-52.86
ProcVersionSignature: Ubuntu 3.13.0-52.86-generic 3.13.11-ckt18
Uname: Linux 3.13.0-52-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 May 7 22:02 seq
 crw-rw---- 1 root audio 116, 33 May 7 22:02 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.10
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Fri May 8 09:15:14 2015
HibernationDevice: RESUME=UUID=cbeacb5e-cd21-4b18-a72f-7d6ebaec9c40
IwConfig:
 lo no wireless extensions.

 em2 no wireless extensions.

 em1 no wireless extensions.
MachineType: Dell Inc. PowerEdge R320
PciMultimedia:

ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-52-generic root=UUID=58c5cea9-08d7-41d7-8950-cd1c5ff86cde ro splash quiet vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-52-generic N/A
 linux-backports-modules-3.13.0-52-generic N/A
 linux-firmware 1.127.11
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/10/2014
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.3.3
dmi.board.name: 0KM5PX
dmi.board.vendor: Dell Inc.
dmi.board.version: A02
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.3.3:bd07/10/2014:svnDellInc.:pnPowerEdgeR320:pvr:rvnDellInc.:rn0KM5PX:rvrA02:cvnDellInc.:ct23:cvr:
dmi.product.name: PowerEdge R320
dmi.sys.vendor: Dell Inc.

Shawn Heisey (elyograg) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Shawn Heisey (elyograg) wrote :

I notice from lsmod that the ip_vs_ftp module depends on nf_nat (nf_nat is listed as used by ip_vs_ftp). Does this mean that FTP mangling cannot happen without the firewall?

I really don't want to enable the Linux firewall ... all of this is behind a Cisco firewall with restrictive ACLs, even though I'm using public IPs on this machine.

root@lb1:~# lsmod | grep ftp
ip_vs_ftp 13079 0
ip_vs 136629 2 ip_vs_ftp
nf_nat 21841 1 ip_vs_ftp
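
For readers unsure which way the dependency runs: the last column of lsmod output is "Used by", so each row names the modules that depend on that row's module. The following is only a small sketch for reading that column from the output pasted above; the helper name `parse_lsmod` is made up for illustration and is not part of any tool.

```python
def parse_lsmod(text):
    """Map each module name to the list of modules that use it.

    Expects lsmod-style rows: name, size, use count, optional comma list.
    """
    used_by = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "Module":
            continue  # skip the header row and malformed lines
        users = fields[3].split(",") if len(fields) > 3 else []
        used_by[fields[0]] = [u for u in users if u]
    return used_by


# The three rows from the lsmod output in this comment.
sample = """\
ip_vs_ftp 13079 0
ip_vs 136629 2 ip_vs_ftp
nf_nat 21841 1 ip_vs_ftp
"""

deps = parse_lsmod(sample)
for module, users in deps.items():
    print(module, "is used by", users or "nothing")
```

Read this way, nf_nat is a dependency of ip_vs_ftp, not the other way around. Whether that dependency means netfilter rules must be active is a separate question that this output alone does not answer.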

If I have to enable the firewall, then I will need help configuring it. In addition to being a load balancer, this machine also serves as a router -- the only way to access the back-end servers, even directly by private IP, is by routing through it.

Shawn Heisey (elyograg) wrote :

I grabbed a packet capture on the FTP client of the attempted FTP through LVS. When the client sends the PASV command, it never gets a response.

Repeating the packet capture on the machine doing LVS (and capturing both interfaces), I got more info. The FTP server sends the response to the PASV command, which the ip_vs_ftp module should mangle (changing the address to the public IP) and forward to the client ... but it never does. Instead, thousands of duplicate ACKs begin traversing the network. I will attach a screenshot of the capture in Wireshark. The IP addresses are different from those in my ldirectord config above ... I had to set up a temporary FTP server and run a different virtual address, because the other FTP servers are using the old machine as their default gateway.
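
To make the expected mangling concrete, here is a rough sketch of the rewrite a load balancer has to apply to the server's PASV reply. It is not the kernel module's code. The function names are invented for illustration, and the public address 203.0.113.71 is a documentation placeholder, since the real virtual IP is masked in this report. A real implementation must also cope with the reply changing length inside the TCP stream, which this sketch ignores.

```python
import re

# A 227 reply carries the data endpoint as (h1,h2,h3,h4,p1,p2).
PASV_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def rewrite_pasv(reply, public_ip):
    """Replace the address in a 227 reply, keeping the two port bytes."""
    match = PASV_RE.search(reply)
    if not reply.startswith("227") or match is None:
        return reply  # not a passive-mode reply, pass it through untouched
    p1, p2 = match.group(5), match.group(6)
    new_tuple = "(" + ",".join(public_ip.split(".") + [p1, p2]) + ")"
    return reply[:match.start()] + new_tuple + reply[match.end():]


def pasv_port(reply):
    """Decode the data port (p1 * 256 + p2) from a 227 reply."""
    match = PASV_RE.search(reply)
    return int(match.group(5)) * 256 + int(match.group(6))


# What a real server behind the balancer might send (private address).
from_server = "227 Entering Passive Mode (10,100,2,61,195,80)\r\n"
to_client = rewrite_pasv(from_server, "203.0.113.71")
print(to_client.strip())
print("data port:", pasv_port(to_client))
```

If the client receives the unmodified reply, or no reply at all as in the capture, it has no reachable address for the data connection, which matches the symptom of a working control channel and failing transfers.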

Shawn Heisey (elyograg) wrote :

I cloned the latest resource-agents repository from github, built a new ldirectord, and started up that copy. No change.

Shawn Heisey (elyograg) wrote :

Will it be possible to solve this problem without turning on the firewall?

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.1 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.1-rc3-vivid/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Shawn Heisey (elyograg) wrote :

That rc3-vivid kernel doesn't seem to exist.

I set up a lab machine, tried it out, and saw the same behavior that I've reported here.

Then I installed the 4.1 rc2-vivid kernel package for amd64 and rebooted. It complained about missing firmware for my realtek nics, but networking appears to work just fine. The newer kernel did not help.

Linux lb5 4.1.0-040100rc2-generic #201505032335 SMP Mon May 4 03:36:35 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

I also tried enabling UFW on my test machine, putting a config in /etc/ufw/applications.d for port 21/tcp (FTP) and allowing that application. That didn't help either.

Shawn Heisey (elyograg) wrote :

Additional data point -- I've built temporary production FTP load balancers with CentOS 6, and they work properly. The firewall is disabled here too. Here's the "uname -a" output of the online machine:

Linux lb5 2.6.32-504.16.2.el6.centos.plus.x86_64 #1 SMP Wed Apr 22 00:59:31 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Shawn Heisey (elyograg) wrote :

Side issue: The username on my Ubuntu account (cz-ubuntu) was not chosen by me, and I'd like to change it. I can't get into the Ubuntu forums with this account. Who do I need to contact for that?

Shawn Heisey (elyograg) wrote :

I have been trying out other solutions in the lab, such as having haproxy (which is also running on these machines) handle the FTP. So far I have not been able to find the right config to make that work.

Shawn Heisey (elyograg) wrote :

I filed this as a bug because I have a config that works perfectly on 2.6 kernels but that I cannot get working on 3.13 or 4.1 kernels.

I figure there are two possible reasons:

1) The feature has become broken and needs to be fixed.
2) Something changed in how the feature works and now it requires a different config.

I think it's probably number 1. If it is number 2, then I will need to know the new way of configuring it, and I would expect to file a bug against ldirectord.

I'm not sure how to go about it, but if there's a way to load a system with a 2.6.32 kernel, prove that it works right, and then step through each kernel release after that, we could figure out which specific release broke it.
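
Stepping through releases one at a time is not strictly necessary: with a known-good and a known-bad kernel, a binary search needs far fewer boots. The sketch below shows the idea only. The version list is a hypothetical set of candidates between the releases named in this report, and the `works` check with its made-up breakpoint stands in for the manual step of booting a kernel and trying passive FTP. Nothing here says where the regression actually is.

```python
def first_bad(versions, works):
    """Return the first version for which works(version) is False.

    Assumes versions is ordered oldest to newest, versions[0] works,
    versions[-1] fails, and the feature stays broken once it breaks.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Hypothetical candidate list; only 2.6.32, 3.13 and 4.1 come from this report.
candidates = ["2.6.32", "2.6.35", "3.0", "3.2", "3.5", "3.8", "3.13", "4.1"]

# Invented breakpoint, purely so the example runs. In practice works()
# would mean: boot that kernel, run the passive FTP test, report the result.
FAKE_BREAKPOINT = "3.5"
tested = []


def works(version):
    tested.append(version)
    return candidates.index(version) < candidates.index(FAKE_BREAKPOINT)


print("first failing kernel (fake data):", first_bad(candidates, works))
print("kernels that had to be booted:", tested)
```

With mainline source builds, `git bisect` automates the same search at commit granularity rather than release granularity.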

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired