[MIR] netcf

Bug #904014 reported by Serge Hallyn on 2011-12-13
This bug affects 8 people
Affects: netcf (Ubuntu); Importance: Medium; Assigned to: Unassigned
Nominated for Precise by Adam Stokes
Affects: Quantal; Importance: Medium; Assigned to: Unassigned

Bug Description

Binary package hint: netcf

Availability: Currently in universe
Rationale: libvirt would like to use netcf to query and customize network interfaces on the host
Security: No CVEs.
Quality Assurance: Package works out of the box with no prompting. There are no major bugs in Ubuntu and there are no major bugs in Debian.
Standards Compliance: FHS and Debian Policy compliant.
Maintenance: Simple, stable package which is mostly (*) in sync with Debian. The maintainer is a member of the Ubuntu server team, making quick responses likely.
Dependencies: All are in main

According to comment #22, the testcases require netlink which is not allowed in the builders, therefore the testcases will remain off.

(Upstream has accepted patches to enable libnl3, so the old comment about that has been removed.)

Marc Deslauriers (mdeslaur) wrote :

netcf for libvirt is also required for virt-manager to enumerate networks properly without the use of HAL...

Changed in netcf (Ubuntu):
assignee: nobody → Jamie Strandboge (jdstrand)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in netcf (Ubuntu):
status: New → Confirmed
Jamie Strandboge (jdstrand) wrote :

Looks fine to me. Builds ok with just main, looks maintainable, no security issues, packaging looks good. It would be good to have a man page for /usr/bin/ncftool, but that won't hold up this MIR. Since these commands will be running as root, I spot-checked the code, and memory allocations, etc. seem to be handled well.

There is a test suite available and I would like to see it enabled in the build as a condition of this MIR. 'make check' seems to work just fine without requiring network access. Marking 'In Progress'.

Changed in netcf (Ubuntu):
assignee: Jamie Strandboge (jdstrand) → nobody
status: Confirmed → In Progress
assignee: nobody → Serge Hallyn (serge-hallyn)
Serge Hallyn (serge-hallyn) wrote :

Thanks, Jamie. New version uploaded will run make check at each build.

Serge Hallyn (serge-hallyn) wrote :

Given that there is only a single test for netcf itself (the rest are gnulib tests), and i can't reproduce the failures anywhere except in our buildds for amd64 and i386, i'm at a loss for how to try to have part of the testsuite run.

Should I have 'make check' only run on arm, where they pass? Then a blatant breakage of the package would be spotted there even though the amd64/i386 tests aren't run.

Jamie Strandboge (jdstrand) wrote :

What if instead you just have the tests run, but not fail the build? Ideally we would have arm fail the build and i386/amd64 not fail the build. I think it is important to have the tests run on i386/amd64 so that people can notice problems before uploading. These would also be useful when comparing build logs (something the security team does before uploading).

Serge Hallyn (serge-hallyn) wrote :

Would I do this by simply patching the test case to return 0? Or is there a magic DEB_ flag I can set for that?

Ralf Spenneberg (ralq) wrote :

We are currently in the process of building a web-based GUI for the management of virtual machines via libvirt. It would be great if netcf were included in libvirt, because then the creation of vlans and bridges connected to these vlans could be handled via libvirt.
Please link libvirt to netcf.

Jamie Strandboge (jdstrand) wrote :

I'm afraid I don't have the debhelper incantation OTOH to give you. With libvirt I have been known to patch a test case to output that it failed, but return 0. Eg, from 9008-fix-daemon-conf-ftbfs.patch in lucid's libvirt:
-wait $pid || fail=1
+#wait $pid || fail=1
+# warn, but do not fail since Ubuntu buildds do not work too well with this test
+wait $pid || {
+ printf "WARNING: wait failed (TODO: update test suite). Reason:\n" 1>&2
+ cat log 1>&2
+}

You could apply a similar methodology.
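
The same warn-but-do-not-fail idea, combined with the per-architecture policy discussed earlier in this thread, could be sketched roughly as follows. This is only an illustration: the `run_checks` helper and the `FATAL_ARCHES` set are hypothetical names, and the choice of architectures is an assumption, not something netcf's packaging actually contains.

```python
import subprocess
import sys

# Architectures on which a test failure should abort the build.
# Assumption for illustration only; not taken from any real package.
FATAL_ARCHES = {"armel", "armhf"}


def run_checks(cmd, arch):
    """Run cmd and return the exit status the build should see."""
    status = subprocess.run(cmd).returncode
    if status == 0:
        return 0
    if arch in FATAL_ARCHES:
        # Tests are trusted on this architecture: propagate the failure.
        return status
    # Warn, but do not fail, so the problem stays visible in the build log.
    print(
        f"WARNING: {' '.join(cmd)} failed with status {status} on {arch}",
        file=sys.stderr,
    )
    return 0
```

A caller would pass the test command and the host architecture, for example `run_checks(["make", "check"], "amd64")`, and use the return value as the exit status of the build step.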

Serge Hallyn (serge-hallyn) wrote :

Quoting Jamie Strandboge (<email address hidden>):
> I'm afraid I don't have the debhelper incantation OTOH to give you. With libvirt I have been known to patch a test case to output that it failed, but return 0. Eg, from 9008-fix-daemon-conf-ftbfs.patch in lucid's libvirt:
> -wait $pid || fail=1
> +#wait $pid || fail=1
> +# warn, but do not fail since Ubuntu buildds do not work too well with this test
> +wait $pid || {
> + printf "WARNING: wait failed (TODO: update test suite). Reason:\n" 1>&2
> + cat log 1>&2
> +}
>
> You could apply a similar methodology.

All output from netcf is redirected and immediately deleted. I've spent
two days trying to get meaningful output, and think i finally got it by
having debian/rules tweak the autogenerated Makefile to cat the logfile.
PPA compiles have at least a 5 hour delay right now, but hopefully I'll
have debug output soon!

Note that I've already done several tests to confirm that the tests
themselves do not fail. That is, I can hook up nothing but a single test
that does nothing but assert that 1 == 1, and it still fails. But if I
hook up no tests, it passes. Anyway, should have output soon.

Thanks.

Serge Hallyn (serge-hallyn) wrote :

Still not working. Presumably the Makefile looks different when created on the buildds, so that the

 sed -i 's/cat $$@-t >>$$@;/\0 \\\necho XXXXX;pwd;ls -ld test-debian.log; cat test-debian.log;echo XXXXX; /' tests/Makefile

done in override_dh_auto_configure, which works in my sbuilds, does not catch anything there.

Adam Stokes (adam-stokes) wrote :

Hi Serge,

Curious if you made any progress on this? If not, I'd like to try and help get this building if you are being spread thin from other action items.

Thanks,
Adam

Serge Hallyn (serge-hallyn) wrote :

Quoting Adam Stokes (<email address hidden>):
> Hi Serge,
>
> Curious if you made any progress on this? If not, I'd like to try and
> help get this building if you are being spread thin from other action
> items.

I haven't. Help would be greatly appreciated :)

-serge

Ralf Spenneberg (ralq) wrote :

Hi Serge,

is there anything we can help from the outside? Unfortunately we do not have access to the Ubuntu build system. We need netcf in libvirt for our environment and would gladly offer some time to push this ahead.

Thanks,

Ralf

Serge Hallyn (serge-hallyn) wrote :

Quoting Ralf Spenneberg (<email address hidden>):
> Hi Serge,
>
> is there anything we can help from the outside? Unfortunately we do not
> have access to the Ubuntu build system. We need netcf in libvirt for our
> environment and would gladly offer some time to push this ahead.

The testsuite does also fail in ppas, so without access to the build
system (which I also do not have) you can tweak the package to add
debugging and iteratively push to ppa. It's slow, but it's what I'll
be doing when I get time next cycle. You could certainly do the same
thing.

I will start (next cycle) by writing a cleaner patch to handle both
libnl1 and libnl3 to push upstream, and merging the newer version.
Then I'll get the testsuite working, so we can get into main. If you
can get the testsuite working first, that'd be terrific.

Hendrik Schwartke wrote :

Hi Serge,

Ralf asked me to inspect this testsuite bug. I'm not done yet but it seems that there is a problem with libnl.
The cause of the problem is the nl_connect call in netlink_init in src/dutil_linux.c:806, which returns a nonzero value.
So far I don't know why, but I will have a closer look at it this week.

Btw, the reason why even a very simple test case causes an error is because it happens in the init code.
CuSuiteSetup(suite, setup, teardown) registers tutil:setup and this calls ncf_init, which calls drv_init and finally netlink_init.

Serge Hallyn (serge-hallyn) wrote :

Thanks, Hendrik, that makes sense.

Hendrik Schwartke wrote :

In nl.c:117 the nl_connect call tries to create a netlink socket: sk->s_fd = socket(AF_NETLINK, SOCK_RAW | flags, protocol);
(protocol is NETLINK_ROUTE).
This call returns errno 22 (EINVAL on Linux, not ENOENT, which is 2). So it seems that the ppa environment doesn't support netlink.

Unfortunately it's necessary to call setup and therefore nl_connect before processing any test case.
The only possibility I see, besides changing the ppa environment, is to disable the tests.
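
For anyone who wants to check an environment without building netcf, a small probe along these lines might help. It is a sketch only: it opens the same kind of socket (AF_NETLINK, SOCK_RAW, NETLINK_ROUTE) from Python rather than through libnl, and the `probe_netlink` name is hypothetical.

```python
import errno
import socket


def probe_netlink():
    """Return None if a NETLINK_ROUTE socket can be opened, else an error name."""
    family = getattr(socket, "AF_NETLINK", None)
    protocol = getattr(socket, "NETLINK_ROUTE", None)
    if family is None or protocol is None:
        # Python only exposes these constants on platforms with netlink.
        return "UNSUPPORTED_PLATFORM"
    try:
        sock = socket.socket(family, socket.SOCK_RAW, protocol)
    except OSError as exc:
        # Translate the numeric errno into its symbolic name, e.g. 22 -> EINVAL.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    sock.close()
    return None
```

Printing `probe_netlink()` inside a build chroot and on a normal host should show whether the two environments differ.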

Hendrik Schwartke wrote :

Hi Serge,
there are quite some important features of libvirt that can't be used without netcf.
So I think it would be a great benefit for virtualisation with ubuntu if libvirt is linked to netcf.

So is there anything I can do to support you?

Serge Hallyn (serge-hallyn) wrote :

Quoting Hendrik Schwartke (<email address hidden>):
> Hi Serge,
> there are quite some important features of libvirt that can't be used without netcf.
> So I think it would be a great benefit for virtualisation with ubuntu if libvirt is linked to netcf.
>
> So is there anything I can do to support you?

I'm hoping to get to netcf work within the next week (contingent upon
other bugs). The steps I see are:

 1. update debian and ubuntu packages to the latest upstream
    release
 2. make a clean patch to make netcf work with either libnl1 or
    libnl3
 3. resend any patches plus the libnl patch upstream
 4. get the testsuite either running, or detecting that it can't
    run. Thanks again for the insight about netlink as a
    potential cause.
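
As a rough illustration of step 4 ("detecting that it can't run"), a test suite can skip itself when netlink is unavailable instead of failing. The sketch below uses Python's unittest for brevity; netcf's own tests are written in C with CuTest, so this shows the idea rather than the actual change, and `netlink_usable` and `InterfaceTests` are hypothetical names.

```python
import socket
import unittest


def netlink_usable():
    """Return True if a NETLINK_ROUTE socket can be created here."""
    try:
        socket.socket(
            socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE
        ).close()
    except (AttributeError, OSError):
        # AttributeError: platform without netlink constants.
        # OSError: the environment refuses to create the socket.
        return False
    return True


@unittest.skipUnless(netlink_usable(), "netlink sockets are not available here")
class InterfaceTests(unittest.TestCase):
    def test_placeholder(self):
        # Placeholder for a real check that needs netlink.
        self.assertTrue(True)
```

With this arrangement a restricted builder reports the tests as skipped, while a developer machine with netlink still runs them.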

Changed in netcf (Ubuntu):
milestone: none → ubuntu-12.04.1
importance: Undecided → Medium
Serge Hallyn (serge-hallyn) wrote :

As soon as upstream releases 0.2.0, that will include the libnl3 support. Given comment #22's explanation of the testcase failures in builders, I believe that with the 0.2.0 package we should be ready for main. (hoping :)

description: updated
Martin Jackson (mhjacks) wrote :

+1 for 0.2.0 release. It seems the git repo has Serge's changes; what would prevent an 0.2.0 release at this point? It seems like:

netcf MIR blocks libvirt build with netcf; the libvirt bug is getting a lot of attention from users, but this bug is the real linchpin for that issue. (Once 0.2.0 is released, it should pass the test suite, which will allow it into main, which will enable libvirt to depend on it, which will fix the libvirt bug).

Serge's changes appear to be the most recent in the Git repo, so what is blocking a netcf 0.2.0 release? It seems like there is some traffic on the mailing list about FreeBSD support; is netcf waiting for that work to complete to release 0.2.0?

Adam Stokes (adam-stokes) wrote :

I've spoken with upstream maintainers and Laine stated that once he's back from vacation (~July 9th) he will put together a release.

I'll keep in touch with him to make sure this stays on his radar.

Thanks,
Adam

Changed in netcf (Ubuntu Quantal):
milestone: ubuntu-12.04.1 → ubuntu-12.10
Serge Hallyn (serge-hallyn) wrote :

0.2.0 was released today. I will start on the debian and ubuntu packages on Monday.

Serge Hallyn (serge-hallyn) wrote :

(Note that per comment #18, since netlink is apparently not supported in buildd/ppa, tests cannot be run and will therefore be disabled. I will however run them on my own system before pushing.)

Serge Hallyn (serge-hallyn) wrote :

0.2.0-1ubuntu1 has been pushed to quantal. I've built a libvirt in ppa (ppa:serge-hallyn/virt) linked against it, which is working (virsh iface-list works).

Ping on MIR?

Changed in netcf (Ubuntu Quantal):
assignee: Serge Hallyn (serge-hallyn) → Jamie Strandboge (jdstrand)
Jamie Strandboge (jdstrand) wrote :

Based on my previous assessment and that 'make check' cannot be run on the buildd, ACK.

Changed in netcf (Ubuntu Quantal):
assignee: Jamie Strandboge (jdstrand) → nobody
status: In Progress → Fix Committed
Colin Watson (cjwatson) wrote :

Moved to main.

Changed in netcf (Ubuntu Quantal):
status: Fix Committed → Fix Released
Ralf Spenneberg (ralq) wrote :

Thanks a lot for your great work.

Is it possible to make the same move for Precise? Precise, being an LTS release, will be around much longer than Quantal.
We would really appreciate it if netcf could be moved to main for 12.04.1, as was originally planned.

Any thoughts?

Colin Watson (cjwatson) wrote :

We never touch /dists/precise/ after release (barring legal action, at least). If this is going to happen then it will have to be associated with a stable release update (possibly without source changes) to netcf so that we can move it in precise-{proposed,updates}, and presumably if this is at all worthwhile then it would also be associated with an update to libvirt to make use of it. I'm not expressing any particular opinion right now on whether the latter would meet stable update rules, but if it doesn't then there doesn't seem much point in moving netcf.
