Stop trigger Nagios alarm when outgoing mail gets a transient error
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
KARL4 | Fix Released | High | Paul Everitt | | |
Bug Description
We have a bunch of things going on with mailin/mailout that need some attention. Several of these issues do a log.error which then causes Nagios to tell me KARL is in a critical error state. [wink]
This is the first one. For outgoing mail, gocept's queue is set up to do address verification for each outgoing message, with a cache. If the remote mail server doesn't answer in time, gocept's mail server gives KARL (repoze.sendmail) a warning. That then triggers, I believe, a log.error in KARL, which triggers Nagios to think KARL is broken.
It isn't a severe error. repoze.sendmail tries again in 3 hours and always is able to deliver. (Unfortunately, that 3h setting is not easily configurable, but correct me if I am wrong on that.)
Ideally this would generate a log.warning or something in KARL. We can know about it, but not go crazy.
Some extra notes:
- The full traceback is below
- The .ini files that configure OSF are in a separate package (osideploy) which Fabric uses to generate
Error while sending mail from to x@y.com
Traceback (most recent call last):
File "/srv/osfkarl/
self.
File "/srv/osfkarl/
connection.
File "/usr/lib/
raise SMTPDataError(code, resp)
SMTPDataError: (450, '4.1.1 id=25378-35 - Temporary MTA failure on relaying, from MTA(smtp:
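A minimal sketch of the downgrade the description asks for, assuming the alarm keys on ERROR-level log records. The logger name and both helper functions are hypothetical and are not KARL's actual mailout code; only `smtplib` and `logging` from the standard library are used.

```python
import logging
import smtplib

log = logging.getLogger("karl.mailout.sketch")  # hypothetical logger name


def log_send_failure(exc, recipient):
    """Log a failed send, using WARNING for transient (4xx) SMTP replies."""
    code = getattr(exc, "smtp_code", None)
    if code is not None and 400 <= code <= 499:
        # Transient: the server asked for a later retry, so stay quiet.
        level = logging.WARNING
    else:
        # Permanent (5xx) or unknown failure: keep it loud.
        level = logging.ERROR
    log.log(level, "Error while sending mail to %s: %s", recipient, exc)
    return level


def send_quietly(send, message, recipient):
    """Call send(message); report SMTP failures through log_send_failure."""
    try:
        send(message)
    except smtplib.SMTPException as exc:
        log_send_failure(exc, recipient)
        return False
    return True
```

With this shape, a 450 reply is still recorded, but at a level that a monitor watching only for errors would ignore.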
Carlos de la Guardia (cguardia) wrote : | #1 |
Changed in karl4: | |
status: | New → Fix Committed |
Paul Everitt (paul-agendaless) wrote : Re: [Bug 1510266] Stop trigger Nagios alarm when outgoing mail gets a transient error | #2 |
It’s really hard to replicate and will take quite a testing setup to do so, I think.
Is the idea that it’s an unhandled exception in repoze.sendmail that makes it all the way to Pyramid’s handler? And thus, since you are handling it, Pyramid’s standard logging (which marks it as an error instead of a warning) gets triggered?
—Paul
> On Nov 12, 2015, at 4:35 AM, Carlos de la Guardia <email address hidden> wrote:
>
> I looked at all the code involved, and I think the easy way out is to
> just catch the SMTPDataError and log it. You are right that there is no
> easy way to configure the 3 hour setting, short of forking
> repoze.sendmail.
>
> My problem is that I can't easily replicate this. Can you?
Carlos de la Guardia (cguardia) wrote : | #3 |
Yeah, repoze.sendmail catches all SMTP exceptions, but only acts on error codes between 500 and 599. It re-raises the exception otherwise, and mailout was not handling it.
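The behaviour described here can be modelled roughly as follows. This is a simplified stand-in, not repoze.sendmail's actual code: `deliver` and `transient_failure` are hypothetical names, and only the 500-599 check is taken from the thread.

```python
import smtplib


def deliver(send, message):
    """Simplified model: handle 5xx replies locally, re-raise everything else."""
    try:
        send(message)
    except smtplib.SMTPResponseException as e:
        if 500 <= e.smtp_code <= 599:
            # Permanent rejection: treated as handled at this level.
            return "rejected"
        # Anything else (such as a 450) propagates to the caller.
        raise
    return "sent"


def transient_failure(message):
    # Stand-in for an SMTP server answering with a temporary failure.
    raise smtplib.SMTPDataError(450, b"4.1.1 Temporary MTA failure on relaying")


try:
    deliver(transient_failure, "hello")
except smtplib.SMTPDataError as e:
    escaped_code = e.smtp_code  # the 450 reached the caller unhandled
```

Under this model, any caller that does not catch the re-raised exception leaves it to whatever top-level handler logs it.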
Changed in karl4: | |
status: | Fix Committed → Fix Released |
Paul Everitt (paul-agendaless) wrote : | #4 |
Hmm, production just raised an alarm for this:
SMTPDataError: (450, '4.1.1 id=04334-07 - Temporary MTA failure on relaying, from MTA(smtp:
Carlos de la Guardia (cguardia) wrote : | #5 |
This is discouraging. Why is the exception not bubbling up? We might need to fork repoze.sendmail.
Paul Everitt (paul-agendaless) wrote : Re: [Bug 1510266] Re: Stop trigger Nagios alarm when outgoing mail gets a transient error | #6 |
Does that mean you think there is a bare except handler somewhere?
—Paul
> On Nov 16, 2015, at 2:02 PM, Carlos de la Guardia <email address hidden> wrote:
>
> This is discouraging. Why is the exception not bubbling up? We might
> need to fork repoze.sendmail.
Paul Everitt (paul-agendaless) wrote : | #7 |
Here is the full traceback:
Traceback (most recent call last):
File "/srv/osfkarl/
self.
File "/srv/osfkarl/
connection.
File "/usr/lib/
raise SMTPDataError(code, resp)
SMTPDataError: (450, '4.1.1 id=04334-07 - Temporary MTA failure on relaying, from MTA(smtp:
Is this the relevant section? A 450 error code means it hits the “else”. Looks ok to me, your exception handler should have caught it…although it’s weird, the traceback above doesn’t go through our KARL code. Is this because it’s done at TM commit time?
try:
except smtplib.
if 500 <= e.smtp_code <= 599:
—Paul
> On Nov 16, 2015, at 2:02 PM, Carlos de la Guardia <email address hidden> wrote:
>
> This is discouraging. Why is the exception not bubbling up? We might
> need to fork repoze.sendmail.
Carlos de la Guardia (cguardia) wrote : | #8 |
I think I know where to go with this. I'll do it on Wednesday.
Paul Everitt (paul-agendaless) wrote : | #9 |
Hi Carlos. Based on your last comment, this one is still open. We're getting a Nagios alarm several times a day, so it would be nice to quiet this.
Changed in karl4: | |
importance: | Low → High |
milestone: | 012 → 013 |
status: | Fix Released → In Progress |
Carlos de la Guardia (cguardia) wrote : | #10 |
I made a PR to repoze.sendmail to allow us to avoid this or any other transient errors. If it's not merged we could use my branch in the meantime.
Paul Everitt (paul-agendaless) wrote : | #11 |
Should I make our own -agendaless package and put in our index? Probably off of master plus this branch, so we’d need to be careful when deploying it (big jump forward in repoze.sendmail version.)
—Paul
> On Dec 10, 2015, at 3:16 AM, Carlos de la Guardia <email address hidden> wrote:
>
> I made a PR to repoze.sendmail to allow us to avoid this or any other
> transient errors. If it's not merged we could use my branch in the
> meantime.
>
> https:/
Paul Everitt (paul-agendaless) wrote : | #12 |
Also, does that repoze.sendmail PR imply a change in our code, to add ignore_
—Paul
> On Dec 10, 2015, at 8:14 AM, Paul Everitt <email address hidden> wrote:
>
>
> Should I make our own -agendaless package and put in our index? Probably off of master plus this branch, so we’d need to be careful when deploying it (big jump forward in repoze.sendmail version.)
>
> —Paul
Carlos de la Guardia (cguardia) wrote : | #13 |
To avoid unpleasant surprises, maybe I should backport the fix to the branch that we use?
You are right, we would need to add that ignore_
Paul Everitt (paul-agendaless) wrote : | #15 |
I think we previously agreed that we should upgrade to the latest, to try and fix the multiple Message-Id issue. Thus, if you agree, I will fork that repo into karlproject and make a package based on your branch.
—Paul
> On Dec 10, 2015, at 1:08 PM, Carlos de la Guardia <email address hidden> wrote:
>
> To avoid unpleasant surprises, maybe I should backport the fix to the
> branch that we use?
>
> You are right, we would need to add that ignore_
> call.
Carlos de la Guardia (cguardia) wrote : | #14 |
By the way, which version of repoze.sendmail are we using?
Carlos de la Guardia (cguardia) wrote : | #16 |
Ok, let's fork it into karlproject. I will add the mailout change when that's done.
Paul Everitt (paul-agendaless) wrote : | #17 |
repoze.sendmail 2.3
—Paul
> On Dec 10, 2015, at 1:26 PM, Carlos de la Guardia <email address hidden> wrote:
>
> By the way, which version of repoze.sendmail are we using?
Paul Everitt (paul-agendaless) wrote : | #18 |
Today is a KARL day for me. I’ll focus first on various things related to repoze.sendmail.
I made this fork of master:
https:/
I will edit setup.py to use PEP 440-compliant “local version identifiers”. Thus, the version field will be “4.2+agendaless.1”.
Should I change the repository name to be repoze.
—Paul
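For reference, a PEP 440 local version identifier is a public version followed by `+` and a local label. The sketch below splits the two parts with a deliberately simplified pattern; `split_local_version` is a hypothetical helper, not part of setuptools or any packaging library.

```python
import re

# Simplified pattern: a public version, then an optional "+label" made of
# alphanumeric segments separated by periods (the normalized PEP 440 form).
_LOCAL_VERSION = re.compile(
    r"^(?P<public>[0-9A-Za-z.!]+)"
    r"(?:\+(?P<local>[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*))?$"
)


def split_local_version(version):
    """Return (public, local) for a version string; local is None if absent."""
    match = _LOCAL_VERSION.match(version)
    if match is None:
        raise ValueError("not a recognized version string: %r" % version)
    return match.group("public"), match.group("local")


public, local = split_local_version("4.2+agendaless.1")
```

Here `public` holds the upstream version and `local` holds the fork's own label, which keeps the fork distinguishable from an upstream release with the same public version.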
> On Dec 10, 2015, at 1:33 PM, Carlos de la Guardia <email address hidden> wrote:
>
> Ok, let's fork it into karlproject. I will add the mailout change when
> that's done.
Paul Everitt (paul-agendaless) wrote : | #19 |
As an FYI and reminder, it occurs to me, now that we have forked this, we can do some other things we wanted to do. For example, the crazy changes I made previously in KARL, to monkey-patch stuff in here, can now be done directly in here. Also, we can make the 3 hour retry delay configurable.
—Paul
> On Dec 10, 2015, at 1:33 PM, Carlos de la Guardia <email address hidden> wrote:
>
> Ok, let's fork it into karlproject. I will add the mailout change when
> that's done.
Changed in karl4: | |
milestone: | 013 → 014 |
Carlos de la Guardia (cguardia) wrote : | #20 |
I just pushed the change to use the ignore_transient feature in mailout.
Paul Everitt (paul-agendaless) wrote : | #21 |
Over to me for testing/
Changed in karl4: | |
assignee: | Carlos de la Guardia (cguardia) → Paul Everitt (paul-agendaless) |
status: | In Progress → Fix Committed |
Paul Everitt (paul-agendaless) wrote : | #22 |
This is now in testing on the master branch (with some other things per an email just now to Nat/Carlos), which is set up on karlstaging.
Changed in karl4: | |
status: | Fix Committed → Fix Released |
Paul Everitt (paul-agendaless) wrote : | #23 |
Hmm, after the production update, I'm still getting errors logged for:
Traceback (most recent call last):
File "/srv/osfkarl/
self.
File "/srv/osfkarl/
connection.
File "/usr/lib/
raise SMTPDataError(code, resp)
SMTPDataError: (450, '4.1.1 id=30340-30 - Temporary MTA failure on relaying, from MTA(smtp:
Changed in karl4: | |
status: | Fix Released → In Progress |
Changed in karl4: | |
milestone: | 014 → 015 |
Carlos de la Guardia (cguardia) wrote : | #24 |
Pretty strange. SMTPDataError is a subclass of SMTPResponseExc
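The subclass relationship can be checked directly against Python's standard `smtplib`. A quick sketch, with the 450 reply text shortened from the traceback:

```python
import smtplib

# SMTPDataError derives from SMTPResponseException in the standard library,
# so a handler for the base class also receives the 450 shown in the traceback.
assert issubclass(smtplib.SMTPDataError, smtplib.SMTPResponseException)

try:
    raise smtplib.SMTPDataError(450, b"4.1.1 Temporary MTA failure on relaying")
except smtplib.SMTPResponseException as exc:
    caught = (type(exc).__name__, exc.smtp_code)
```

If a handler written this way does not fire in production, the code containing it is probably not the code that is deployed.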
Paul Everitt (paul-agendaless) wrote : Re: [Bug 1510266] Stop trigger Nagios alarm when outgoing mail gets a transient error | #26 |
Crap, you’re right, it’s on master but I didn’t put that into the egg. :(
—Paul
> On Jan 4, 2016, at 1:02 AM, Carlos de la Guardia <email address hidden> wrote:
>
> Pretty strange. SMTPDataError is a subclass of SMTPResponseExc
> it should be caught inside the queue code. Are we 100% sure the
> ignore_
Paul Everitt (paul-agendaless) wrote : | #25 |
Paul needs to get this into an egg and into a release.
Changed in karl4: | |
status: | In Progress → Fix Released |