Hi,

Since Mike raised the issue of out-of-order mails, and since we've also seen
some out-of-band issues with subscribers trying to mail lua-l since it moved
to Pepperfish, I thought I'd take a moment to explain how Pepperfish handles
email and how that affects the Lua mailing list.

When someone sends an email to lua-l, their mail server makes a connection to
Pepperfish and sends a line which identifies the sender.  This is known as the
'MAIL FROM:' line in SMTP parlance and is the "envelope sender" as far as mail
processing is concerned.  This address is the address to which any errors would
be reported.  Note: This can (and sometimes does) differ from the address in
the 'From' header in the mail itself.
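
To illustrate the distinction, here is a minimal Python sketch using the
standard library's email package.  All addresses are made up for illustration
and the helper names are hypothetical.  The point is only that the address on
the 'MAIL FROM:' line is chosen independently of the 'From' header.

```python
from email.message import EmailMessage

# Hypothetical addresses, for illustration only.
ENVELOPE_SENDER = "bounces@example.org"  # sent on the 'MAIL FROM:' line
HEADER_FROM = "alice@example.com"        # shown to readers in the 'From' header


def build_post():
    """Build a message whose 'From' header differs from the envelope sender."""
    msg = EmailMessage()
    msg["From"] = HEADER_FROM
    msg["To"] = "list@example.net"
    msg["Subject"] = "Example post"
    msg.set_content("Hello, list.")
    return msg


def mail_from_line(envelope_sender):
    """Format the SMTP command that announces the envelope sender."""
    return f"MAIL FROM:<{envelope_sender}>"


msg = build_post()
print(mail_from_line(ENVELOPE_SENDER))  # errors would be reported to this address
print("From header:", msg["From"])      # what the recipient's mail client displays
```

With smtplib, the envelope sender would be the from_addr argument to
SMTP.sendmail(), which likewise does not have to match the header.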

When Pepperfish receives the MAIL FROM, before *any* other processing occurs,
we subject that address to what is called 'Sender callout verification'.  This
means that Pepperfish looks up what mail server would handle any errors which
could be generated, and connects to that mail server, pretending to send an
error report to that address.  If such an action results in a failure from the
mail server in question, Pepperfish's mail server immediately rejects that
sender because it would be unable to produce an error report and reasonably
expect it to reach the sender of the email.

This is the point at which several subscribers have hit issues.  Pepperfish
performs sender callout verification with a NULL sender address.  This is both
valid and recommended: because the probe carries no sender address, the mail
server Pepperfish connects to has nothing on which to perform a callout
verification of its own, as it would have for any non-NULL sender address
which Pepperfish might otherwise have used.  Thus loops in sender verification
are prevented.  Without such protection, mails might never get through.
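
As a rough sketch of what such a callout looks like on the wire, the Python
below lists the SMTP commands a probe with a NULL sender might send and how
the reply to 'RCPT TO:' could be judged.  This is not Pepperfish's actual
configuration: the function names and the HELO name are hypothetical, and the
sketch treats any non-2xx reply as a failure, whereas a real mail server
would normally treat temporary (4xx) failures differently from permanent
ones.

```python
def callout_commands(address, helo_name="mail.example.org"):
    """Return the SMTP commands for a callout probe of `address`.

    `helo_name` is a made-up host name for illustration.
    """
    return [
        f"HELO {helo_name}",
        "MAIL FROM:<>",          # NULL sender: nothing for the far end to verify
        f"RCPT TO:<{address}>",  # would an error report to this address be taken?
        "QUIT",                  # stop before DATA, so no mail is actually sent
    ]


def callout_passes(rcpt_reply_code):
    """Assumption of this sketch: only a 2xx reply to RCPT TO counts as success."""
    return 200 <= rcpt_reply_code < 300


for line in callout_commands("someone@example.com"):
    print(line)
print(callout_passes(250), callout_passes(550))
```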

Sender callout verification is a very common approach to reducing spam loads,
and ensuring that badly set up clients get notified as quickly as possible.
Pepperfish is by no means unique in doing this.  This protection alone allows
Pepperfish to reject 43% of incoming SMTP connections before they cause any
more load to the system.

Next, the sender provides a recipient line (the 'RCPT TO:' in SMTP parlance,
and the "envelope recipient" for processing purposes).  Pepperfish immediately
performs a recipient callout check to ensure that the incoming recipient is
valid and deliverable.  This check is done in much the same way as sender
verification, except that the now-validated sender can be used to perform the
check this time.  This action allows Pepperfish to reject an additional 17% of
all incoming traffic.

At this point, the sender provides the actual mail body and Pepperfish checks
it for header syntax etc.  Very few mails are rejected at this point; a mail
which passes is accepted for delivery to the mailing list manager (mailman).
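
The order of those three stages can be sketched in Python as follows.  The
function and stage names are hypothetical and the checks are stand-ins
supplied by the caller.  The sketch only shows that a failure at an earlier
stage stops processing before any later stage runs, and that, by the figures
above, roughly 40% of incoming traffic gets as far as sending a body.

```python
def process_incoming(sender_ok, recipient_ok, headers_ok):
    """Run the checks in SMTP order, stopping at the first one that fails.

    Each argument is a zero-argument callable returning True or False.
    """
    stages = [
        ("MAIL FROM", sender_ok),   # sender callout verification
        ("RCPT TO", recipient_ok),  # recipient callout check
        ("DATA", headers_ok),       # header syntax etc.
    ]
    for name, check in stages:
        if not check():
            return f"rejected at {name}"
    return "accepted"


# Percentages of all incoming traffic, as quoted in the text.
REJECTED_AT_SENDER = 43
REJECTED_AT_RECIPIENT = 17
reaching_data = 100 - REJECTED_AT_SENDER - REJECTED_AT_RECIPIENT

print(process_incoming(lambda: True, lambda: False, lambda: True))
print(f"{reaching_data}% of traffic reaches the DATA stage")
```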

Mailman accepts the mail, performs its own verification (e.g. subscriber
checks) and, assuming it is satisfied, prepares outgoing mails to everyone on
the list.

In order to take advantage of the Pepperfish server's ability to process mail
in parallel, mailman limits each outgoing mail to 25 recipients and limits
itself to 9 of those batches per connection to the mail server.  These numbers
were selected to split the list up efficiently without fragmenting it too far,
so that Exim (the mail server software) can parallelise itself well without
overwhelming the server.  This produces around 75 queue entries to be
processed by the mail server.
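
The batching arithmetic can be sketched in Python like so.  The limits of 25
and 9 come from the description above.  The subscriber count is purely an
assumption, picked so that the result lands on 75 queue entries, and the
addresses are placeholders.

```python
def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


RECIPIENTS_PER_MAIL = 25    # from the text
BATCHES_PER_CONNECTION = 9  # from the text
SUBSCRIBERS = 1875          # ASSUMPTION: chosen so that 1875 / 25 = 75

subscribers = [f"user{i}@example.com" for i in range(SUBSCRIBERS)]
queue_entries = chunk(subscribers, RECIPIENTS_PER_MAIL)
connections = chunk(queue_entries, BATCHES_PER_CONNECTION)

print(len(queue_entries), "queue entries over", len(connections), "connections")
```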

Each time mailman does this, the subscribers in any given queue item may
change.  The recipients in a given queue entry are processed in order and
deliveries are attempted.  Some deliveries take longer than others; e.g. mails
to yahoo.com.cn and yahoo.com.br can sometimes take up to a minute to defer.
This sometimes holds up other deliveries in that particular queue item.  As a
result, you may sometimes receive mails "out of order" if a quick reply is
made while the server is still stuck delivering the original mail to someone
ahead of you in the same queue item.  You will, however, eventually receive
all the mail you are meant to.
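
A toy Python model of that effect, under simplifying assumptions: each queue
item is handled strictly in order, the timings and addresses are invented,
and the reply happens to be batched with you ahead of the slow recipient.

```python
def finish_times(start, batch):
    """Return when each delivery attempt in a queue item completes.

    `batch` is a list of (recipient, seconds_taken) pairs handled in order,
    beginning at time `start` (in seconds).
    """
    times = {}
    clock = start
    for recipient, seconds in batch:
        clock += seconds
        times[recipient] = clock
    return times


# Original mail at t=0: you are queued behind a recipient that takes 60s to defer.
original = finish_times(0, [("slow@example.org", 60), ("you@example.com", 1)])
# Quick reply at t=5: this time the queue item has you first.
reply = finish_times(5, [("you@example.com", 1), ("slow@example.org", 60)])

print("original reaches you at", original["you@example.com"])  # 61
print("reply reaches you at", reply["you@example.com"])        # 6
```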

As you can see, there are many points in that process where delays can be
introduced; however, the Pepperfish server is tuned to minimise those delays.
If there are points where the delays balloon unacceptably then I will do my
best to tweak the configuration so that they don't happen again.

I hope this explanation is of some use to those of you who were concerned with
how mail is processed by the Pepperfish server and that it lays to rest some of
your concerns over how your posts to the mailing list are processed.

Here are a few example timings based on a mail processed by lua-l earlier
today:

Time from receipt of mail at Pepperfish to first enqueueing of the result for
sending to subscribers: 1 second.

Time from receipt of first of those outgoing batches, to the last of them (i.e.
time for Mailman to generate all of its output): 8 seconds.

Time from receipt of incoming mail to reach 50% delivered to subscribers: 16
seconds.

Time to reach 75% delivered: an additional 9 seconds after that.

Time from receipt of incoming mail to reach 95% delivered: 53 seconds.

Time from then to reach 99%: approx 3 minutes.

The remaining stragglers were dealt with over the subsequent few hours as retry
timers expired.

This means that Pepperfish shunts somewhere in the region of 1500 mails in the
course of those first 27 seconds, at an average of more than 50 mails per
second.
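
For the record, the arithmetic behind that average, as a couple of lines of
Python using only the two figures quoted above:

```python
MAILS = 1500  # "somewhere in the region of 1500 mails"
SECONDS = 27  # "those first 27 seconds"

rate = MAILS / SECONDS
print(f"{rate:.1f} mails per second")  # 55.6 mails per second
```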

Regards,

Daniel.

-- 
Daniel Silverstone                         http://www.digital-scurf.org/
PGP mail accepted and encouraged.            Key Id: 3CCE BABE 206C 3B69