speedup startup by delaying binding of ports

Reported by alzhimer on 2005-06-06
Affects: Zope 2
Importance: Medium
Assigned to: Tres Seaver

Bug Description

we found the early binding of ports when starting up zope is problematic in our load balanced (behind a pound) setup. if a backend restarts and binds its ports, pound almost immediately thinks it's back in business and starts sending requests. obviously, the server needs some more time to finish its startup; meanwhile, those requests go unanswered (or are at least delayed for quite a while), and users need to wait for the response (if any). we only use high ports for the servers, so there's no real need for us to bind the ports with root privileges (of course it would be cleaner to check if low ports are configured, and only delay the binding if not).

last but not least: the late binding seems to have a huge impact on the startup time (although this is not a hard fact, just a feeling).

our solution is rather simple:

in lib/python/Zope2/Startup/ (or lib/python/Zope/Startup/ for 2.7):

Index: __init__.py
===================================================================
--- __init__.py (revision 30421)
+++ __init__.py (working copy)
@@ -88,8 +88,8 @@
         self.setupPublisher()
         # Start ZServer servers before we drop privileges so we can bind to
         # "low" ports:
-        self.setupZServer()
-        self.setupServers()
+        #self.setupZServer()
+        #self.setupServers()
         # drop privileges after setting up servers
         self.dropPrivileges()
         self.makeLockFile()
@@ -100,6 +100,10 @@
         # emit a "ready" message in order to prevent the kinds of emails
         # to the Zope maillist in which people claim that Zope has "frozen"
         # after it has emitted ZServer messages.
+
+        self.setupZServer()
+        self.setupServers()
+
         logger.info('Ready to handle requests')
         self.setupFinalLogging()

Cheers,

Sascha

Chris McDonough (chrism-plope) wrote :

> = Request - Entry #1 by alzhimer on Jun 6, 2005 6:32 am
>
> we found the early binding of ports when starting up zope is problematic
> in our load balanced (behind a pound) setup. if a backend restarts and
> binds its ports, pound almost immediately thinks it's back in business
> and starts sending requests. obviously, the server needs some more time to
> finish its startup; meanwhile, those requests go unanswered (or are at
> least delayed for quite a while), and users need to wait for the response
> (if any). we only use high ports for the servers, so there's no real need
> for us to bind the ports with root privileges (of course it would be
> cleaner to check if low ports are configured, and only delay the binding
> if not).

Yep. Probably any patch that is meant to go into mainline Zope to bind late will need to do this. Really the only reason we bind the ports so early is to be able to drop privileges for further steps.

For what it's worth, using ZEO with Zope can reduce startup time by quite a bit. You didn't mention whether you use ZEO or not but if you don't, you might consider it.

alzhimer (alzhimer) wrote :

yes, we have a ZEO setup. i might be wrong, but i have the feeling that the early incoming requests are slowing the startup down.

i didn't know how to find out if low ports are configured, but i guess it's a no-brainer for someone with deeper knowledge than me...or maybe someone has a hint?

Chris McDonough (chrism-plope) wrote :

As far as blocked requests slowing startup, I guess the only way to know is to do some tests and share the results. Would you be willing to do that? Assuming it really does speed things up, to conditionally do late binding of ports, something just needs to check whether the port of each server "factory" that is part of the config (self.cfg.servers) is <= 1024. These server factories are defined in lib/python/ZServer/datatypes.py . All TCP server factories appear to have a "port" attribute which is the port number, so iterating over all of them, checking for "port" and comparing it against <= 1024 should work ok.

alzhimer (alzhimer) wrote :

okay, I shouldn't trust my feelings too much. I compared some startup times with and without the late binding change, and must admit that they look very similar. However, I believe that the late binding is a good thing for our load balanced setup - I don't want users to get blank pages only because zope opens a port while it is still not ready to handle requests. Here is an improved patch which does the late binding only if there is no port <= 1024 configured:

Index: __init__.py
===================================================================
--- __init__.py (revision 30421)
+++ __init__.py (working copy)
@@ -86,10 +86,14 @@
         self.setupLocale()
         self.setupSecurityOptions()
         self.setupPublisher()
-        # Start ZServer servers before we drop privileges so we can bind to
-        # "low" ports:
         self.setupZServer()
-        self.setupServers()
+        self.setBindEarly()
+
+        if self.bind_early:
+            # Start ZServer servers before we drop privileges so we can bind to
+            # "low" ports:
+            self.setupServers()
+
         # drop privileges after setting up servers
         self.dropPrivileges()
         self.makeLockFile()
@@ -100,6 +104,10 @@
         # emit a "ready" message in order to prevent the kinds of emails
         # to the Zope maillist in which people claim that Zope has "frozen"
         # after it has emitted ZServer messages.
+
+        if not self.bind_early:
+            self.setupServers()
+
         logger.info('Ready to handle requests')
         self.setupFinalLogging()

@@ -208,6 +216,13 @@
                                                  % (server.servertype(),e[1]))
         self.cfg.servers = servers

+    def setBindEarly(self):
+        self.bind_early = False
+        for server in self.cfg.servers:
+            if server.port <= 1024:
+                self.bind_early = True
+        return self.bind_early
+
     def dropPrivileges(self):
         return dropPrivileges(self.cfg)

Cheers, Sascha

Chris McDonough (chrism-plope) wrote :

Thanks for the work! However, users should never see a blank page regardless of late or early binding. Even though the port is bound, a request will block waiting for the publisher to start. If you see something happen otherwise, that's a bug. But if you don't, there's not much of a reason to include this logic. :-(

alzhimer (alzhimer) wrote :

maybe I wasn't clear enough, but we've often seen a problem with the early binding. what happens is, the load balancer thinks the backend is back to life when it sees the backend's listen port is open. then it starts sending requests to that backend, which still might need quite a while until it's really ready to start handling requests (think of checking the zeo cache). on a busy site the queue of unhandled requests can grow quite fast, which results in users needing to wait a long time for their responses; and if the queue grows big enough some might even get timeouts.
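
To illustrate the effect described above, here is a minimal standalone sketch (plain Python sockets, not Zope or pound code; the function name is made up for this example). It assumes the usual behaviour of common TCP stacks, where the kernel completes the handshake for a listening socket even though the application never calls accept(). That is why a port-based health check can consider a backend "up" long before it can answer anything:

```python
import socket


def connect_before_accept():
    """Return True if a client can connect to a listening socket
    although the server side never calls accept()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(5)                # port is now "open" for health checks
    port = server.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        # The connection is queued in the listen backlog by the kernel;
        # nobody is serving it yet.
        client.connect(("127.0.0.1", port))
        connected = True
    except OSError:
        connected = False
    finally:
        client.close()
        server.close()
    return connected
```

Any request sent over such a connection simply waits in the queue until the application gets around to accepting it.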

Chris McDonough (chrism-plope) wrote :

OK, I understand now. That is a convincing reason to merge this, given that pound would have otherwise sent the request to an already-ready appserver and the request would get serviced more quickly instead of sticking around in the waiting appserver's listen queue. I will merge this into the 2.7 and 2.8 branches as well as the trunk and see if anyone complains. ;-) Note that there is a small problem with the patch inasmuch as not every server has a 'port' attr that is an integer (some servers don't listen on TCP ports), but I can fix that.

alzhimer (alzhimer) wrote :

oh, I thought I had changed this.

if server.port is not None and server.port <= 1024:

should do the trick, right?
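
A slightly more defensive variant of that check might look like the sketch below. It is not the code that was merged; `FakeServerFactory` and `needs_early_bind` are hypothetical names used only for illustration, and the factory class merely stands in for the real ones in ZServer/datatypes.py. The `<= 1024` threshold is the one used in the patch above (privileged ports are conventionally those below 1024).

```python
PRIVILEGED_PORT_LIMIT = 1024  # threshold taken from the patch in this thread


class FakeServerFactory(object):
    """Hypothetical stand-in for a ZServer server factory."""

    def __init__(self, port=None):
        self.port = port


def needs_early_bind(servers, limit=PRIVILEGED_PORT_LIMIT):
    """Return True if any configured server wants a "low" port.

    Servers without a 'port' attribute, or with a non-integer port
    (e.g. servers that don't listen on TCP), are skipped.
    """
    for server in servers:
        port = getattr(server, "port", None)
        if isinstance(port, int) and port <= limit:
            return True
    return False
```

With this, `needs_early_bind([FakeServerFactory(8080), FakeServerFactory(None)])` is false, so binding could be delayed, while a configuration containing port 80 would still bind before privileges are dropped.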

Tres Seaver (tseaver) wrote :

Status: Pending => Accepted

 Supporters added: tseaver

See also the patch referred to by #1942 (which duplicates this
issue):

  http://mail.zope.org/pipermail/zope-dev/2005-November/025709.html

Tres Seaver (tseaver) wrote :

The attached patch (against the 2.9 branch) integrates the patches from
that earlier issue, using explicit configuration rather than heuristics to
enable the "deferred listening."

Darryl Dixon (esrever-otua) wrote :

For what it's worth, for all the users of older (Zope <=2.9.8) releases, we did the same thing for the same reason with the following slightly hacky hotfix. It hooks into Lifetime.loop to only light up an 'HAport' (in Pound terms) once the application is actually ready (Pound watches for the appearance/disappearance of the HAport instead of the usual application port). Because of some other initialisation stuff we do, our Zope startup times were anywhere up to 5-10 minutes, so this was a critical fix for us:

import socket, thread, zLOG
from exceptions import Exception

from App.config import getConfiguration

import Lifetime

_original_loop = Lifetime.loop

def halistenerStarter():
    zLOG.LOG('ZZZHAListener', zLOG.INFO, 'Attempting to initialise')
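    # Default to an empty list so the check below cannot hit an
    # undefined name if reading the configuration fails.
    servers = []
    try:
        servers = [x.getsockname() for x in getConfiguration().servers]
    except Exception, e:
        zLOG.LOG('ZZZHAListener', zLOG.INFO, 'Dying on: %s' % str(e))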
    if servers:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # The first one in the config file is the HTTP Server...
        # We could just specify a value for the ip-address variable in
        # the zope.conf and use getConfiguration().ip_address, but
        # this would limit us in some other ways.
        run_ha = True
        try:
            s.bind((servers[0][0], 58080))
        except:
            zLOG.LOG('ZZZHAListener', zLOG.INFO, "Couldn't bind to %s on port 58080 - probably there's something already listening there? Starting up without the HAport" % servers[0][0])
            run_ha = False

        def ha_run(s):
            s.listen(socket.SOMAXCONN)
            while(1):
                try:
                    conn, addr = s.accept()
                    conn.close()
                except socket.error:
                    pass
            s.close()

        if run_ha:
            zLOG.LOG('ZZZHAListener', zLOG.INFO, 'Coming online now')
            thread.start_new_thread(ha_run, (s,))
    else:
        zLOG.LOG('ZZZHAListener', zLOG.WARN, 'Could not find a valid HTTPServer to bind to the IP address of')
    _original_loop()

Lifetime.loop = halistenerStarter
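
Stripped of the Zope specifics, the idea behind the hotfix can be sketched roughly as follows. This is a standalone illustration, not a drop-in replacement: `start_ready_port` is a made-up name, and the caller is assumed to invoke it only after application start-up has finished, so that an open port really means "ready".

```python
import socket
import threading


def start_ready_port(host="127.0.0.1", port=0):
    """Open a listener that accepts and immediately closes connections.

    Returns the listening socket; a load balancer would watch this
    port instead of the real application port.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))     # port 0 lets the OS choose a free port
    listener.listen(5)

    def serve():
        while True:
            try:
                conn, _addr = listener.accept()
            except OSError:
                break               # listener was closed; stop serving
            conn.close()            # a successful connect is the whole signal

    worker = threading.Thread(target=serve)
    worker.daemon = True            # do not keep the process alive on exit
    worker.start()
    return listener
```

Closing the returned socket (or stopping the process) makes the port disappear again, which is the signal the load balancer uses to take the backend out of rotation.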

Andreas Jung (ajung) wrote :

Zope already had some configuration option that controlled when the ports were opened. Strange but I can not find this option any more in the current Zope 2 core.

Hanno Schlichting (hannosch) wrote :

The fast-listen support was added to Zope a long time ago.

Changed in zope2:
status: Confirmed → Fix Released