(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and bastardization of classic Unix |
There are two common methods of detecting email spam:
It makes sense to construct a spam filter hierarchically, first eliminating "header-detectable" spam and then analyzing the body of the remaining messages (say 2%). The weighted-score idea implemented in Spam Assassin is very questionable, as in this case you usually need the results of all of the checks to compute the weight.
One of the most typical signs of email spam is fake information in the envelope. Checking the envelope for consistency, along with simple DNS checks, is probably the most efficient way to detect spam, and in combination with other header checks you can usually get a ~98% accuracy level.
That means that in most cases you do not even need to analyze the body. That demonstrates well "the power of metadata", about which we now know so much due to Snowden's revelations about NSA activities.
In other words, the most efficient approach to spam filtering is checking headers for RFC compliance and mercilessly junking deviations not covered by the whitelists.
Mail routing information is especially important, as in spam it is routinely faked.
But you need to analyze your mail stream before implementing this measure and create exception lists (whitelists), as many legitimate sources abuse the SMTP protocol (FedEx and many other Fortune 100 companies).
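The hierarchical idea described above can be sketched roughly as follows. This is only an illustration in Python, not code from any existing filter: the whitelist entry, the verdict names and the function names are all hypothetical, the only headers treated as mandatory are From and Date (the two that RFC 5322 requires), and the DNS checks mentioned above are left out entirely.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical whitelist: domains known to send legitimate but sloppy mail.
WHITELISTED_DOMAINS = {"shipper.example"}

# RFC 5322 requires these two header fields in every message.
REQUIRED_HEADERS = ("From", "Date")


def domain_of(address):
    """Return the lowercased domain of an address, or '' if there is no '@'."""
    local, at, domain = address.rpartition("@")
    return domain.lower() if at else ""


def header_verdict(raw_message, envelope_sender):
    """Classify a message from envelope and headers only.

    Returns 'deliver', 'junk' or 'analyze-body' (hypothetical verdict names).
    """
    msg = message_from_string(raw_message)
    envelope_domain = domain_of(envelope_sender)

    # Whitelisted sources are exempt from the strict checks.
    if envelope_domain in WHITELISTED_DOMAINS:
        return "deliver"

    # Missing mandatory headers: junk without looking at the body.
    if any(msg.get(name) is None for name in REQUIRED_HEADERS):
        return "junk"

    # A From header without a usable address is treated the same way.
    from_domain = domain_of(parseaddr(msg.get("From"))[1])
    if not from_domain:
        return "junk"

    # Envelope and header disagree: only now is body analysis worth the cost.
    if from_domain != envelope_domain:
        return "analyze-body"
    return "deliver"
```

In this sketch only the "analyze-body" verdict would ever reach a content filter, which is the point of ordering the checks from cheap to expensive.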
Catherine Hampton has developed a set of Procmail configuration files, named the "Spam Bouncer", which takes Procmail to the limit by implementing a spam blocking scheme using just pure Procmail regex features. This is a very limited approach, and it has value mainly as a Procmail tutorial. One of the interesting capabilities of the Spam Bouncer that might deserve further attention from implementers of similar tools is its ability to divide potentially objectionable mail into two different levels of suspicion: blatant spam messages and questionable messages:
IMHO Perl is one of the best languages for writing email spam filters, and a Perl interface to the MTA can help. But with current servers, implementations based on milters are plain vanilla overkill. You can get the same result by simply moving messages from one Sendmail (or other MTA) instance to another. Filtering in this case means that while you move the files that represent messages, you filter out spam and reprocess questionable messages to delete attachments. This solution easily scales to approximately ten million messages per day (if a two-socket server with 32GB or more of RAM and an SSD disk for the OS is used). It is also simpler and more reliable than milter-based solutions.
Actually Sendmail could benefit from incorporating some scripting language too (it could also replace the dinosaur macro-based rewriting rules :-).
Perl-based spam filters are usually open source. I know only one commercial Perl-based spam filter and it sucks ;-) Active State developed Perlmx (now Pure Message), see perl.com Filtering Mail with PerlMx [Oct. 10, 2001], which looks like an expensive commercial variant of Procmail written in Perl, but with all due respect to Perl, this is a weak overpriced product. It does not even come close to the free (and far from perfect) SpamAssassin.
Actually Pure Message is a nice demonstration of the level of degradation of commercial solutions, where all development efforts go into marketing and interface and none into fundamental algorithms. That tendency permits open source to compete with them despite much less manpower. What is really funny is that Sophos bought Active State with the explicit goal of entering the anti-spam market. Paradoxically they managed to choose probably the weakest product that Active State had.
Procmail proved to be a really useful free tool that was available at the right place at the right time, but its age now shows and it is very limited in its filtering capabilities. You can use Procmail to invoke Perl scripts, and that approach is one of the simplest and most effective strategies to fight spam. This combined procmail+Perl approach can probably remove 80% or slightly more of spam with a very low number of false positives.
Again, Perl is a much better tool, and the combined approach (procmail+Perl) is probably the first choice to consider. It proved to be simple and scalable on regular midrange Solaris hardware (4 CPUs, 8GB of RAM) up to approximately half a million messages per day.
It is very important to understand that spam changes the nature of email and unfortunately a "spam filter" further amplifies this effect. Nothing can compensate for this deterioration of the mail environment due to the spam filter, but one can easily amplify the negative effects of spam by using too much zeal in spam filtering ;-). The road to hell is paved with good intentions.
The fact is that the combination of "spam + a spam filter" turns email from a reasonably reliable delivery mechanism ("old email environment") into a new variation of "Alice in Wonderland" ("new email environment"). This "new email environment" is a really unreliable, capricious mechanism that can arbitrarily block useful mail, so delivery of any message is no longer assured. From this point of view the importance of whitelists cannot be overestimated, as they restore predictability for at least a part of the address space.
An "overzealous" spam filter can make this situation much worse by completely killing a weak useful signal: this is a demonstration of the law of unintended consequences of adopting a weak commercial solution that I tried to stress.
The problem here is not spam that slips through (false negatives) but useful messages classified as spam (false positives). Suppose you get 1000 emails and the filter marks 990 of them as spam, with just one useful message among those 990 and one spam message left among the 10 that are delivered. Then you have lost 10% of your legitimate mail (one useful message out of ten) while the overall filtering accuracy is 99.8%. And one lost useful message per 1K spam messages is actually typical for top of the line solutions.
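The arithmetic behind this example can be checked with a few lines of Python. The counts below are one reading of the example above (990 of 1000 messages marked as spam, one useful message among the marked ones, one spam message delivered), and the function name is made up for illustration.

```python
def filter_rates(legit_delivered, legit_blocked, spam_blocked, spam_delivered):
    """Return (overall accuracy, share of legitimate mail lost)."""
    total = legit_delivered + legit_blocked + spam_blocked + spam_delivered
    accuracy = (legit_delivered + spam_blocked) / total
    legit_loss = legit_blocked / (legit_delivered + legit_blocked)
    return accuracy, legit_loss


# 1000 messages: 990 marked as spam (989 real spam + 1 useful message),
# 10 delivered (9 useful messages + 1 spam message).
accuracy, legit_loss = filter_rates(
    legit_delivered=9, legit_blocked=1, spam_blocked=989, spam_delivered=1
)
print(f"overall accuracy: {accuracy:.1%}")
print(f"legitimate mail lost: {legit_loss:.1%}")
```

With these counts the overall accuracy comes out near 99.8% while a tenth of the useful mail is lost, which is why a headline accuracy figure says little about what the user actually experiences.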
That means that if you are aggressive with spam filtering then you can really hurt users. And most users are conservative and still have expectations of the "old email" (predictable) environment while actually they need to operate in a completely different (unpredictable) "new email" environment.
The problem with spam is not just useless or obnoxious messages, but the fact that it pollutes the stream of incoming email that a person relies upon. That means that the marking mode is not much better than the spam blocking mode: marking the subject line at the gateway (and delivering the messages to a special folder, for example the Spam folder in Lotus Notes or Netscape Messenger, using a client filter rule) does not help much because it is very easy to overlook misidentified important mail, as people tend to trust "marks". That can be lessened by using, say, three gradations of "spam warning", but the problem remains. Still, marking has one tremendous advantage: it makes unnecessary a complex Web-based interface (often backed by PostgreSQL or MySQL) and other means of retrieving email from quarantine. You can just move emails marked "Surely spam" to the local spam folder using client filtering capabilities, and leave messages marked "Probably spam" in the input folder. This makes users independent of the reliability of spam storage on the central server. Actually I think that centralizing spam storage is an almost sure sign of incompetent email architects.
What is actually important here is flexibility. A user who has a local spam folder does not depend on central infrastructure, and correcting a mistake is just a simple move from one folder to another. Please note that I am talking about the business situation, where a single missed email might mean lost business, not about home mail, but this argument is partially applicable to the home email stream too. Just as crazy and stupid people tend to destroy communication in a group, crazy and stupid spam filters are destroying email: people sometimes miss very important emails due to misconfigured or badly written spam filters, both in enterprise and home environments.
Summarizing the argument above, an overzealous spam filter in a business situation looks more like a Trojan horse that harms the business than a useful addition to mail (especially if it is implemented at the gateway level with the blocking mode as default). IMHO the part of IT that implemented such a solution might even look like an example of "sysadmin fascism". Spam filtering solutions that do not take into account the interests of the business really deserve very close scrutiny and are ripe for outsourcing, along with the personnel who maintain them :-)
Actually the current situation with commercial spam filters reminds me of the first generation of virus scanners: the quality is extremely questionable (partially due to the problem of "too much zeal" that I mentioned above), and products suffer from deadly "creeping featurism/excessive complexity" in the interface (if a Web interface is implemented, it constitutes probably more than 50% of development efforts, see Pure Message as an example). Taking into account the limitations of the current technology, it is very important to know where to stop. IMHO the key today is not "almost complete elimination of spam" but user friendliness and minimization of false positives. There is nothing horrible if a user manually deletes a half dozen messages that manage to leak through the spam filter. But it is horrible if a user is blocked from receiving an important message, as he/she may never know about it.
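The marking mode described above can be sketched in Python as follows. This is an assumption-laden illustration, not any product's behavior: the score scale, both thresholds and the function names are hypothetical, and only the two labels "Surely spam" and "Probably spam" come from the text.

```python
from email.message import EmailMessage

# Hypothetical thresholds on a hypothetical 0..1 spam score.
SURE_THRESHOLD = 0.9
PROBABLE_THRESHOLD = 0.5


def mark_subject(msg, score):
    """Gateway side: prefix the Subject line instead of blocking the message."""
    if score >= SURE_THRESHOLD:
        tag = "[Surely spam]"
    elif score >= PROBABLE_THRESHOLD:
        tag = "[Probably spam]"
    else:
        return msg  # clean mail is delivered untouched
    subject = msg.get("Subject", "")
    del msg["Subject"]  # a no-op if the header is absent
    msg["Subject"] = f"{tag} {subject}".strip()
    return msg


def client_folder(msg):
    """Client side: the kind of rule a mail client filter would apply."""
    if msg.get("Subject", "").startswith("[Surely spam]"):
        return "Spam"   # local folder, easy to inspect and undo
    return "Inbox"      # "Probably spam" stays visible to the user
```

Nothing is ever deleted or held centrally in this scheme: a misclassified message sits in a local folder on the user's side and can be moved back by hand.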
I would advocate a rule-based filter with simple rules (that can be dynamic, for example in SN and subject line checking) and a user friendly interface over complex "God knows how we figured it out" type filters, especially Bayesian filters (BTW "bogofilter" is definitely written in the wrong language ;-).
Even "spam assassin" written in a more suitable for the task language (Perl) evolved from more or less simple (and reasonable) tool into a complex (and rather unpredictable) beast :-).
BTW the whole idea of assigning probabilities to individual words and using Bayesian logic is a very attractive and false direction. IMHO text pattern recognition, even in the most primitive form (static regular expressions), is a more predictable approach from the user standpoint (and thus more reliable). Moreover, users can be educated about this approach and warned if the set of rules changes to accommodate new sources of spam or new types of spam messages.
My feeling is that unless you have flexible user-controlled exception lists, it's very unclear how you can diminish the chances of filtering out an extremely important letter (a miss that essentially kills the usefulness of the filter once and forever for a particular user), the problem that I outlined above. If the exception list is dynamic and user controlled, then I can add the recipient's address to the filter exception list each time I send an important e-mail to somebody. That prevents the reply from being blocked, and this is already something useful. A step in the right direction, so to speak.
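A dynamic, user-controlled exception list of the kind described above might look like this minimal Python sketch. The class and function names are hypothetical, and the spam decision itself is passed in as a plain boolean because the sketch is only about the exception list.

```python
class ExceptionList:
    """Per-user list of addresses whose mail is never filtered."""

    def __init__(self):
        self._addresses = set()

    def note_outgoing(self, recipient):
        """Call when the user sends mail: the expected reply must get through."""
        self._addresses.add(recipient.strip().lower())

    def remove(self, address):
        """The user stays in control and can drop an entry at any time."""
        self._addresses.discard(address.strip().lower())

    def allows(self, sender):
        return sender.strip().lower() in self._addresses


def route(sender, exceptions, looks_like_spam):
    """Exception list first; the filter's opinion matters only afterwards."""
    if exceptions.allows(sender):
        return "inbox"
    return "spam" if looks_like_spam else "inbox"
```

For example, after `note_outgoing("partner@example.org")` a reply from that address reaches the inbox even if the filter would otherwise have flagged it.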
Summarizing I would like to state:
Spam seems to have demonstrated that in its current implementation SMTP mail has outlived its usefulness and needs to be enhanced or replaced to deal with the changed environment. Authentication, at least at the level of major mail hubs, is the most logical way to proceed, as this is where most spam is injected into the mail stream. See SMTP Authentication
If you need a spam filter I would look into the following features:
But what we probably need is not a better mousetrap (spam filter), but a new mail protocol (or a revision of SMTP, which is an outdated protocol anyway) that helps to restore confidence in email, maybe along with a strong legal framework for bringing obnoxious spammers to justice. Of course not all spammers are created equal, but some of them might benefit from some period of complete isolation from society.
Dr. Nikolai Bezroukov
I'd like to add you to my professional network on LinkedIn.
- Hilda Booker
Accept View invitation from Hilda Booker WHY MIGHT CONNECTING WITH Hilda Booker BE A GOOD IDEA?
Hilda Booker's connections could be useful to you
After accepting Hilda Booker's invitation, check Hilda Booker's connections to see who else you may know and who you might want an introduction to. Building these connections can create opportunities in the future.
Anonymous Coward:
"One thing that's long worried me is that the bulk of spammers and malware writers may hire copywriters with a better grasp of English than most of the ones I see now"
At least in the '419-style' scams, research from Microsoft [microsoft.com] implies that the bad English is, at least in part, deliberate. It's obvious enough to 'smart' people that they won't bother responding (and therefore tying up the spammer's time trying to extricate their funds/credentials/whatever). However, less-savvy people might not realize it's a scam and therefore follow the links. As a result the hit rate of people who do respond is likely to be higher, resulting in a better yield for the scammer.
stillpixel:
I suppose that technique would boil down to basically use the grammar most likely used by the persona you are targeting much like in advertising. So if you are targeting people less educated or computer savvy, then use poor grammar and misspell words.
N0Man74:
They are different attack vectors with different goals. Phishing relies on confusing a fake organization for a legitimate one. The more authentic and professional looking the better. Even a non-gullible person might fall prey to some of these sites (especially when more people are viewing e-mails on their phones and phones make it MUCH harder to see the tell-tale sign of a bad link).
When all you need is log-in information, or a bit of personal information, the more legitimate looking the better. You don't care if the person is gullible or not, because you are asking less of them. You set up a web server and just collect data with no need for human interaction with the visitors.
The Nigerian scams need people that are more gullible because those scams require more human time investment (and direct interaction) on the part of the scammer, and a greater amount of gullibility for their prey (since it also involves them sending money, not just filling in a form).
organgtoo:
The only thing I use bookmarks for now is to make sure I don't fat-finger the URL to one of my financial sites and enter my credentials into an imposter's site. Whenever I get an e-mail that I have a new statement or that I need to reset my password, I use the bookmark rather than clicking the link in the body of the e-mail.
Anonymous Coward
The decline of western civilazation (Score:1)
God, I am so tired of people who don't give a fuck about anyone but themselves. This goes for more than just the spammers. I would have thought that in the 21st century, with all of the technology and information available, that people would be a bit more willing to think about what's not just good for them, but also what helps out society and world as a whole. I remember how Usenet was once a thriving and intelligent community - and because of folks like this, it is now a shadow of itself. Way to go! Yeah, I blame capitalism, ignorance and greed - short-term gains for long-term losses. Anything to make a buck.
Welcome to the future where the banksters and spammers and morally bankrupt politicians and 'corporate persons' rule the day and 'apologize' when they're caught. It's time for the human race to grow the hell up and think of more than just profit. Yeah, I'm ranting - for now. Thanks for reading. ;-)
Domain Registry of America Scam
DROA serves as a reseller of domain name registration services for eNom, Inc. ("eNom"), an ICANN-accredited registrar of second level domain names. DROA's domain name registration services enable its customers to establish their identities on the web.
In the course of offering domain name services, DROA has engaged in a direct mail marketing campaign aimed at soliciting consumers in the United States to transfer their domain name registrations from their current registrar to eNom through DROA.
In many instances, consumers do not realize that by returning the invoices along with payment to "renew" their domain name registrations they are, in fact, transferring their domain name registrations from their then-current registrars to eNom. DROA's renewal notices/invoices do not clearly and conspicuously inform consumers of this material fact. 16. Defendant's renewal notices/invoices also fail to inform consumers that DROA charges a processing fee of $4.50 for any transfers of domain name registrations that are not completed, even if through no fault of the consumers. 17. In many instances, DROA promises credits to consumers who request them, but fails to transmit the credits to the consumers' credit card accounts in a timely manner.
Despite the FTC ruling against DRoA (located online at: http://www.ftc.gov/os/2003/12/031219stipdomainreg.pdf)
IT IS HEREBY ORDERED that, in connection with the advertising, marketing, promotion, offering for sale, selling, distribution, or provision of any domain name services, Defendant, its successors, assigns, officers, agents, servants, and employees, and those persons in active concert or participation with it who receive actual notice of this Order by personal service or otherwise are hereby permanently restrained and enjoined from making or from assisting in the making of, expressly or by implication, orally or in writing, any false or misleading statement or representation of material fact, including but not limited to any representation that the transfer of a domain name registration is a renewal.
II. IT IS FURTHER ORDERED that, in any written or oral communication where Defendant makes any representation that a domain name service is expiring or requires renewal, Defendant, its successors, assigns, officers, agents, servants, and employees, and those persons in active concert or participation with it who receive actual notice of this Order by personal service or otherwise are hereby permanently restrained and enjoined from failing to disclose, in a clear and conspicuous manner, in advance of receipt of any payment for services: A. Any cancellation or processing fees imposed prior to the effective date of any transfer or renewal; and B. Any limitations or restrictions on cancelling a request for domain name services.
June 27, 2010 | Website Design, Content Management System And SEO Blog
A few days ago we received a statement in the mail from Domain Registry of America. The invoice gives us the impression that a couple of our domain names are up for renewal and are about to expire. The letter actually states that, "Your domain name registrations will expire November 19, 2010!" Even though the dates they have on file are correct, we're not falling for this type of direct mail scam and you shouldn't either! This type of marketing scam is aimed at consumers who do not realize that by returning the invoices along with a payment, their domain names are in fact transferring from their current domain registrar to DROA.
If you received one of these letters, please ignore it! Do NOT complete the payment slip at the bottom or make any payments to this company. To add insult to injury, the letter has their address listed as: 2316 Delaware Avenue #266 Buffalo, New York. With some quick help from Google maps, the address comes up the same as the UPS Store, so guaranteed it's just a mail box!
July 7, 2006 | Lucid Design
We have discovered that a company called "Domain Registry of America" or "DROA" has been emailing domain name owners with deceptive messages about domain transfers. The goal of the emails is to trick people into transferring their domain names away from their existing domain name provider. The emails falsely claim to be a response to a transfer request made by the current owner and should NOT be acted upon. This has been going on for over a year and several of our clients have been duped by this scam. Once DROA takes over ownership it can be somewhat difficult to regain control of the domain. This is in addition to the phenomenally high prices they charge (they make it sound like you get a good deal with them).
This scam seems to be targeting .com domains only and I haven't seen any cases yet for other domains.
If people wish to express their concern, they can contact The Federal Trade Commission (in the US) at www.ftc.gov or the Ministry of Consumer Affairs' scam watch (NZ) at www.consumeraffairs.govt.nz/scamwatch/
Please be aware that this is a snowballing spam generation scheme in the US (the company behind it is based in California and has approx 200 employees. That's right: two hundred). They run a social network called fanbox.com. See
http://en.wikipedia.org/wiki/SMS.ac,_Inc.
To lure people to their network they invented a neat social engineering trick, based on the popularity of social networks, that allows them to collect millions of email addresses and claim a membership on the order of 3.5 million people.
The scheme works something like this:
First you get a letter from one of your friends that looks innocent and pretty plausible, for example
Hi,
I set up a profile where I can post photos, connect and share.
Do me a favor and confirm our relationship here.
Thanks,
<name of your friend>
If you click the link (very bad idea :-) it will invite you to log in to this social networking site using any of your existing Webmail accounts (hotmail, gmail, yahoomail, etc). It also asks you to send an invitation to your friends.
What it does next is harvest all the email addresses in your Web address book (it understands various formats) and send invitations to those on the list, like a regular email virus does. Pretty neat trick...
Fanbox.com, formerly known as sms.ac, is one of the most annoying and sleaziest spams and misrepresentations going right now. Here's how to stop receiving this spam.
If you are receiving email like this, we urge you to forward them to the Federal Trade Commission. If they receive enough complaints, perhaps they'll get off their lazy government backsides and do something about the scum behind this scam:
How to Block Fanbox Emails or Cancel Your Account
Don't click on the link to cancel your account. That will only confirm to these scum that your email address is being used and ensure MORE spam. And since you never signed up for it, you haven't got an account to cancel. They are just trying to trick you into clicking on a link and confirming your information!
Instead, put fanbox.com, fanboxapps.com, and sms.ac in your junk / blocked senders, junk email or spam list in your email program (eg., Outlook junk mail list)
Report these spammers to the government:
To forward unwanted or deceptive spam to the Federal Trade Commission; send it to [email protected],
Also see the FTC and here to Report Porn Spam. In California, also use [email protected]. In Missouri, use [email protected]. In Virginia, use [email protected].
If you think you have been taken advantage of by a spam scam, file a complaint with the FTC online at www.ftc.gov. Complaints will help the FTC find and stop people who are using spam to defraud consumers.
How their scam works:
When you sign up for FanBox, it asks for your permission to email everyone in your address book. After you give them your password (DON'T do it!) it will start spamming everyone in your contact list / address book. It will send them these stupid ":____ asked you a question" spams.
We've received them here; and verified that the senders had no intention of sending them to us, or "asking" a question. They felt victimized.
For detailed discussion of this scam see these links:
2. Spamhuntress.com: sms.ac turns into fanbox/
Hello,
Rocky my boyfrend received faxbox invitation from a girl into hotmail account. This invitation was relating to his fanbox login. He says that he did not register himself in fanbox. His fanbox nickname is like his skype name or hotmail messanger nickname. Is that possible that he did not registered himself or he is lying?
He almost certainly did not register with Fanbox/Faxbox. According to this article, they get people's names and addresses from other victims, and then spam the new victims. They try to make it look like they have an account, and it can be canceled/unsubscribed/shut down. But, they ignore your request for removal and add you to a verified "good email account" list.
"Name and Shame", or socially responsible use of your log dataYour logs contain an ever-growing mass of data on spammers. How about making an effort to make that data useful to others?
Those of us who run email services know, from sometimes painful experience, what it takes to ensure that the minimum possible amount of unwanted advertising and scams that may turn out to be security hazards reaches our users' inboxes.
Email: This should have been very simple
Handling email should really be quite simple: The server is configured to know what domains it receives mail for and what users actually exist in those domains. When a machine makes contact and indicates that it intends to deliver email, the server checks if the recipient is a valid user. If the recipient is valid, the message is received and put in the relevant user's mailbox. Otherwise, a message about a failed delivery and optionally the reason for the failure is sent to the user specified as the sender.
If they were all honest people
In each part of the process, the underlying premise is that the communicating partners offer each other correct information. Frequently that is the case, and we have legitimate communications between partners with a valid reason for contacting each other. Unfortunately there are other cases where the implicit trust is abused, such as when email messages are sent with a sender address other than the real one, quite likely a made-up one in a domain that belongs to other people. Some of us occasionally receive delivery failure messages for messages we verifiably did not send[1]. If we take the time to study the contents of those messages, in almost all cases we will find that the messages are spam, sometimes the scamming kind and perhaps part of an attempt to take control of the recipient's computer or steal sensitive data.
What do the ones in charge do, then?
If you ask a typical system administrator what measures are in effect to thwart attempts at delivering unwanted or malicious messages to their users, you will most likely get a description that says, essentially, the messages are filtered through systems that inspect message contents. If the message does not contain anything known to be bad (known spam or malware) or something sufficiently similar to a known bad, the message is delivered to the user's mailbox. If the system determines that the message contents indicate it should not be delivered, the message is thrown away undelivered, and some system administrators will tell you that the system also sends a message about the decision not to deliver the message to the stated sender address.
Large parts of this are likely part of moderately educated users' passive knowledge, and most of us are likely to accept that content filtering is all we can do to keep dubious or downright criminal elements out of our working environment. For the individual end user, only minor adjustments to this are likely to be possible.
Measures based on observed behavior
But those of us who actually run the service also have the opportunity to study the automatically generated log data from our systems and use spammers' (that is, senders of all types of unwanted mail, including malware) behavior patterns to remove most of the unwanted traffic before actual message content is known. In order to do that, it is necessary to go to a more basic level of network traffic and study sender behavior on the network level.
One of the simpler forms of behavior based measures emerged in the form of a technique called greylisting in 2003. The technique is based on a slightly pedantic and rather creative interpretation of established standards. The Internet protocol for email transfer, SMTP (the Simple Mail Transfer Protocol), allows servers that experience temporary problems that make it impossible to receive mail to report a specific 'temporary local problem' status code to correspondents trying to deliver mail. Correctly configured senders will interpret and act on the status code and delay delivery for a short time. In most circumstances, the delivery will succeed within a short time. It is worth noting that this part of the standard was formulated to help the mail service's reliability. At most times, the retries happen without alerting the person who wrote and sent the message. The messages generally reach their destination eventually.
Lists of grey and black, little white lies
Greylisting works like this: the server reports a temporary local problem to all attempts at delivery from machines the server has not exchanged mail with earlier. Experience shows that the pre-experiment hypothesis was mainly correct: Essentially all machines that try to deliver valid email are configured to check return codes and act on them, while almost all spam senders dump as many messages as possible, and never check any return codes. This means that somewhere in the eighty to high nineties percentage of all spam volume is discarded at the first delivery attempt (before any content filtering), while legitimate email reaches its intended recipients, occasionally with delayed delivery of the initial message from a new correspondent.
One other behavior based technique that predates greylisting is the use of 'blacklists' - lists of machines that have been classified as spam senders - and rejecting mail from machines on such lists. Some groups eventually started experimenting with 'tarpits', a technique that essentially means your end of the communication moves along very slowly. A much cited example is the spamd program, released as a part of the free operating system OpenBSD in May of 2003. The program's main purpose at the time was to answer email traffic from blacklisted hosts one byte per second, never leaving a blacklisted host any real chance of delivering messages.
The combination of blacklists and greylisting proved to work very well, but the quest for even more effective measures continued. Yet again, the next logical step grew out of observing spammer behavior. We saw earlier that spammers do not bother to check whether individual messages are in fact delivered.
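The greylisting decision described above can be sketched in a few lines of Python. This is a toy model, not how spamd or any particular MTA implements it: the class name and the 300-second minimum delay are made up, and keying on the (client IP, sender, recipient) triplet is an assumption borrowed from common greylisting practice, whereas the text speaks only of machines the server has not exchanged mail with before. The 451 reply is the standard SMTP code for a temporary local error.

```python
import time


class Greylist:
    """Temporarily refuse first-time senders; accept them when they retry."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # hypothetical: seconds before a retry counts
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # triplet -> time of first delivery attempt

    def check(self, client_ip, sender, recipient):
        """Return the (SMTP code, text) to give this delivery attempt."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return 250, "OK"
        # A well-behaved MTA queues the message and retries later;
        # a fire-and-forget spam sender simply never comes back.
        return 451, "Temporary local problem, please try again later"
```

A real deployment would also expire stale entries and remember hosts that have passed once, so that only the first message from a new correspondent is delayed.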
Laying traps and bait
By early 2005, these observations lead to a theory that was soon proved useful: If we have one or more addresses in our own domains the are certain to never receive any valid mail, we can be almost a hundred percent certain that any mail addressed to those addresses is spam. The addresses are spamtraps. Any machines that try to deliver spam to those addresses are placed in a local blacklist, and we keep them busy by answering their traffic at a rate of one byte per second. The machines stay on the blacklist for 24 hours unless otherwise specified.The new technique, dubbed greytrapping was launched as part of the improved spamd in OpenBSD 3.8, released May 2005. In early 2006, Bob Beck, one of the main spamd developers announced that his greytrapping hosts at the University of Alberta generates a downloadable blacklist based on the greyptrap data, updated once per hour, ready for inclusion in spamd setups elsewhere. This is obviously useful. Machines that try to deliver mail to addresses that were never deliverable most likely do not have any valid mail to deliver, and it we are doing society at large a favor by delaying their deliveries and wasting their time to the maximum extent possible.
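The spamtrap logic described above can be illustrated with a small Python sketch. It is not spamd's implementation: the class and method names are hypothetical, and the one-byte-per-second tarpit is left out, so the sketch only covers how a host lands on the local blacklist and drops off again after the 24 hours mentioned in the text.

```python
import time

TRAP_TTL = 24 * 60 * 60  # seconds a trapped host stays blacklisted (24 hours)


class Greytrap:
    """Blacklist any host that tries to deliver to a spamtrap address."""

    def __init__(self, spamtraps, clock=time.time):
        self.spamtraps = {address.lower() for address in spamtraps}
        self.clock = clock    # injectable so expiry can be tested
        self.blacklist = {}   # client IP -> time at which the entry expires

    def observe(self, client_ip, recipient):
        """Record one delivery attempt; trap the host if it hit a spamtrap."""
        if recipient.lower() in self.spamtraps:
            self.blacklist[client_ip] = self.clock() + TRAP_TTL

    def is_blacklisted(self, client_ip):
        expiry = self.blacklist.get(client_ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self.blacklist[client_ip]  # the entry has aged out
            return False
        return True

    def export(self):
        """Currently trapped hosts, e.g. for publishing as a downloadable list."""
        return sorted(ip for ip in list(self.blacklist) if self.is_blacklisted(ip))
```

Because the trap addresses never receive valid mail, this check needs no content inspection at all; the recipient address alone decides.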
It is worth mentioning that during the period we have used the University of Alberta blacklist at our site, it has contained a minimum of twenty-some thousand IP addresses, and during some busy periods have reached almost two hundred thousand.
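As a rough illustration of the greytrapping idea, the following Python sketch keeps a set of spamtrap addresses and blacklists any client that tries to deliver to one of them for 24 hours. The class and method names are hypothetical, and the in-memory storage is an assumption; spamd itself keeps this state in its own database.

```python
import time

TRAP_TTL = 24 * 60 * 60  # the 24-hour default mentioned in the text


class Greytrap:
    """Blacklist clients that attempt delivery to a spamtrap address."""

    def __init__(self, trap_addresses, ttl=TRAP_TTL, clock=time.time):
        self.traps = {a.lower() for a in trap_addresses}
        self.ttl = ttl
        self.clock = clock      # injectable clock, handy for testing
        self.trapped = {}       # client IP -> time at which the entry expires

    def note_recipient(self, ip, recipient):
        """Record a delivery attempt; trap the client if it hit a spamtrap."""
        if recipient.lower() in self.traps:
            self.trapped[ip] = self.clock() + self.ttl

    def is_trapped(self, ip):
        """True while the client's blacklist entry has not expired."""
        expiry = self.trapped.get(ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self.trapped[ip]
            return False
        return True
```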
You can help, too
Fortunately you do not need to be a core developer to be able to contribute. The exact same tools Bob Beck uses to generate his blacklist are available to everybody else as part of OpenBSD, and they are actually not very hard to use productively.
Here at BSDly.net and associated domains we saw during the (Northern hemisphere) summer of 2007 a marked increase in email sent to addresses that have never actually existed in our domains. This was clearly a case of somebody, one or more groups, making up or generating sender addresses to avoid seeing any reactions to the spam they were sending. This in turn led to us starting an experiment that is still ongoing. We record invalid addresses in our own domains as they turn up in our logs. From these addresses we pick the really improbable ones, put them in our local spamtrap list and publish the list on a specific web page on our server[2].
Experience shows that it takes a very short time for the addresses we put on the web page to turn up as target addresses for spam. This means that we have succeeded in feeding the spammers data that makes it easier for us to stop their attempts, and frequently we make spam senders use significant amounts of time communicating with our machines with no chance of actually achieving anything. The number of spamtrap addresses has reached fifteen thousand, and we have at times observed groups of machines that spend weeks working through the whole list, with average time spent per unsuccessful delivery attempt clocked at roughly seven minutes.
As a byproduct of the active spammer trapping we started exporting our own list of machines that had been trapped via the spamtrap addresses during the last 24 hours and making the list available for download. This list's existence has only been announced via the spamtrap addresses web page and a few blog posts, but we see that it's retrieved, most likely automatically, at intervals and is apparently used by other sites in their systems.
At this point we have established that it is possible to create a system that makes it very unlikely that spam actually makes it through to users, while at the same time it is quite unlikely that legitimate mail is adversely affected. In other words, we have the cyberspace equivalent of good fences around our property, but spammers are still out there and may create serious problems for those who are without adequate protection.
Collecting evidence, or at least seeking clarity
We would have loved to see law enforcement take the spammer problem seriously. This is not just because the spam that reaches its targets is irritating, but rather because almost all spam is sent via equipment that spammers use without the legal owners' consent. We would have liked to see resources allocated in proportion to the criminal activity the spam represents. We would have liked to help, but it might seem that we would not have usable evidence available due to the fact that we do not actually receive the messages the spammers try to deliver. On the other hand, we have at all times a list of machines that have tried to deliver spam, identified with an almost hundred percent certainty based on the spammer trapping addresses. In addition, our systems routinely produce logs of all activity, with the level of detail we set ourselves. This means that it is possible to search our logs for the IP addresses that have tried to deliver spam to our systems during the last 24 hours, and get a summary of what those machines have done.
A search of this kind typically yields a result like this:
Aug 10 02:34:29 skapet spamd[13548]: 190.20.132.16: connected (4/3)
Aug 10 02:34:41 skapet spamd[13548]: (GREY) 190.20.132.16: <[email protected]> -> <[email protected]>
Aug 10 02:34:41 skapet spamd[13548]: 190.20.132.16: disconnected after 12 seconds.
Aug 10 03:41:42 skapet spamd[13548]: 190.20.132.16: connected (14/13), lists: spamd-greytrap
Aug 10 03:42:23 skapet spamd[13548]: 190.20.132.16: disconnected after 41 seconds. lists: spamd-greytrap
Aug 10 06:30:35 skapet spamd[13548]: 190.20.132.16: connected (23/22), lists: spamd-greytrap becks
Aug 10 06:31:16 skapet spamd[13548]: 190.20.132.16: disconnected after 41 seconds. lists: spamd-greytrap becks
The first line here states that 190.20.132.16 contacts our system at 02:34:29 AM on August tenth, as the fourth active SMTP connection, three blacklisted. A few seconds later it appears that this is an attempt at delivering a message to the address [email protected]. That address was already one of our spamtraps, most likely one that was harvested from our logs and was originally made up somewhere else. After 12 seconds, the machine disconnects. The attempted delivery to a spamtrap address means that the machine is added to our local spamd-greytrap blacklist, as indicated in the entry for the next attempt about one hour later. This second attempt lasts for 41 seconds. The third try in our log material happens just after 06:30, and the addition of the list name becks indicates that in the meantime the machine has tried to deliver to one of Bob Beck's spammer trap addresses and has entered that blacklist, too.
Unfortunately, it is unlikely that logs of this kind are sufficient as evidence for criminal prosecution purposes, but the data may be of some use to those who have an interest in keeping machines in their care from sending spam.
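A per-host summary of such log excerpts can be produced with a short script. The sketch below is an assumption about how one might do it in Python, not the tooling actually used at bsdly.net; the regular expressions are written against the log format shown above and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches syslog lines written by spamd, with an optional (GREY)/(BLACK) tag.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+ [\d:]{8} \S+ spamd\[\d+\]: "
    r"(?:\((?:GREY|BLACK)\) )?(?P<ip>\d{1,3}(?:\.\d{1,3}){3}): (?P<rest>.*)$"
)
SECONDS_RE = re.compile(r"disconnected after (\d+) seconds")
LISTS_RE = re.compile(r"lists: (.+)$")


def summarize(lines):
    """Return {ip: {"connections": n, "seconds": total, "lists": set}}."""
    hosts = defaultdict(lambda: {"connections": 0, "seconds": 0, "lists": set()})
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a spamd line in the expected format
        host = hosts[m["ip"]]
        rest = m["rest"]
        if rest.startswith("connected"):
            host["connections"] += 1
        secs = SECONDS_RE.search(rest)
        if secs:
            host["seconds"] += int(secs.group(1))
        lists = LISTS_RE.search(rest)
        if lists:
            host["lists"].update(lists.group(1).split())
    return dict(hosts)
```

Fed the excerpt for 190.20.132.16 above, this counts three connections and the blacklists the host ended up on, which is the kind of condensed view a network owner could act on.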
"Name And Shame", or just being neighborly?
After some discussions with colleagues I decided in early August 2008 to generate daily reports of the activities of machines that had made it into the local blacklist on bsdly.net and publish the results. If all we have is the fact that a machine has entered a blacklist as an IP address (such as 24.165.4.190), and there is no supporting material, it is fairly easy for whoever is in charge of that address range to just ignore the entry as an unsupported allegation. We hope that when whoever is responsible for the network containing 24.165.4.190 sees a sequence like this,
Host 24.165.4.190:
Aug 10 02:57:40 skapet spamd[13548]: 24.165.4.190: connected (9/8)
Aug 10 02:57:54 skapet spamd[13548]: (GREY) 24.165.4.190: <[email protected]> -> <[email protected]>
Aug 10 02:57:55 skapet spamd[13548]: (GREY) 24.165.4.190: <[email protected]> -> <[email protected]>
Aug 10 02:57:56 skapet spamd[13548]: 24.165.4.190: disconnected after 16 seconds.
Aug 10 02:58:16 skapet spamd[13548]: 24.165.4.190: connected (8/6)
Aug 10 02:58:30 skapet spamd[13548]: (GREY) 24.165.4.190: <[email protected]> -> <[email protected]>
Aug 10 02:58:31 skapet spamd[13548]: (GREY) 24.165.4.190: <[email protected]> -> <[email protected]>
Aug 10 02:58:32 skapet spamd[13548]: 24.165.4.190: disconnected after 16 seconds.
Aug 10 02:58:39 skapet spamd[13548]: 24.165.4.190: connected (7/6), lists: spamd-greytrap
Aug 10 03:02:24 skapet spamd[13548]: (BLACK) 24.165.4.190: <[email protected]> -> <[email protected]>
Aug 10 03:03:17 skapet spamd[13548]: (BLACK) 24.165.4.190: <[email protected]> -> <[email protected]>
Aug 10 03:05:01 skapet spamd[13548]: 24.165.4.190: From: "Preston Amos" <[email protected]>
Aug 10 03:05:01 skapet spamd[13548]: 24.165.4.190: To: [email protected]
Aug 10 03:05:01 skapet spamd[13548]: 24.165.4.190: Subject: Wonderful enhancing effect on your manhood.
Aug 10 03:06:04 skapet spamd[13548]: 24.165.4.190: disconnected after 445 seconds. lists: spamd-greytrap
they will find that to be sufficient reason for action of some kind. The material we generate is available via the "The Name And Shame Robot" web page. The latest complete report of log excerpts is available via links at that page. Previous versions are archived offline, but will be made available on request to parties with valid reasons to request the data.
"The Name And Shame Robot" is rather new, and it is too early to say what effect, if any, the publication has had. We hope that others will do similar things based on their local log data or even synchronize their data with ours. If you are interested in participating, please make contact.
Regardless of other factors, we hope that the data can be useful as indicators of potential for improvement in the networks that appear regularly in the reports as well as material for studies that will produce even better techniques for spam avoidance.
A shorter version of this article in Norwegian was published in Computerworld's Norwegian edition on August 22, 2008; the longer Norwegian version is available as an earlier blog post.
[1] A collection of such failure messages collected earlier this year is available at http://www.bsdly.net/~peter/joejob-archive.2008-07-28.txt.
[2] See http://www.bsdly.net/~peter/traplist.shtml, references at that page lead to my blog, which consists of public field notes, as well as other relevant material.
About the author
Peter N. M. Hansteen ([email protected]) is a consultant, system administrator and writer, based in Bergen, Norway. He has written various articles as well as "The Book of PF", published by No Starch Press in 2007, and lectures on Unix- and network-related topics. He is a main organizer of BLUG (Bergen (BSD and) Linux User Group), vice president of NUUG (Norwegian Unix User Group) and an occasional activist for EFF's Norwegian sister organization EFN (Elektronisk Forpost Norge).
January 26, 2005 | John Beck's Weblog
I've spent a lot of time over the past couple of months trying out some new (and some not so new) anti-spam techniques. Note that this article assumes some familiarity with sendmail m4 macros; see $CFDIR/README for background and all sorts of details on these, where $CFDIR is one of:
- /etc/mail/cf on Solaris 10
- /usr/lib/mail on Solaris 7, 8 or 9
- the cf sub-directory of the sendmail distribution for people "rolling their own"
These techniques are in the form of FEATURE and HACK m4 macros (the difference being that the former are provided and blessed by sendmail.org / Solaris whereas the latter are not, though a HACK may evolve into a FEATURE in a future release). For a HACK, one would use
HACK(`hack-name')dnl
in one's .mc file, likewise
FEATURE(`feature-name')dnl
When installing hacks, one must create $CFDIR/hack (if it does not already exist) and place hack-name.m4 in that directory. Note that the sendmail distribution comes with such a sub-directory but Solaris does not.
Also, to explain some terms used below: the access list is enabled by the FEATURE(`access_db') macro; details on this are in $CFDIR/README, both in its sub-section in the FEATURES section, and in the ANTI-SPAM CONFIGURATION CONTROL section. And FEATURE(`delay_checks') is strongly recommended, as it is needed to enable the overrule by an OK entry in the access list that I mention in a few places; this feature is also described in its subsection in the FEATURES section, as well as in the "Delay all checks" sub-section of the ANTI-SPAM CONFIGURATION CONTROL section.
Anyway, onto the details. In the order I started deploying them:
- The first is HACK(`block_bad_helo'), written by Neil Rickert, a professor in the Computer Science Department at Northern Illinois University and a volunteer sendmail.org contributor. SMTP clients are supposed to send the client FQHN (fully qualified host name) as the HELO/EHLO parameter, but many broken clients send the server FQHN (or IP address) instead, or something without a ".". This rejects any such transmissions. The upside is that I have found it to block a good amount of spam, with no false positives for me. A couple of users of my personal domain have had some small number of false positives with it, though. And the down side is that it cannot be overruled by an OK entry in the access list. Bart has had a lot of troubles with this rule; apparently old versions of Netscape and early versions of Mac's Mail.App got this wrong.
- The second is a regular feature: DNS-based black-lists. I use (note: line wrapped for readability)
FEATURE(`enhdnsbl', `bl.spamcop.net', `"Spam blocked see: http://spamcop.net/bl.shtml?"$&{client_addr}', `t')dnl
while Bart uses
FEATURE(`dnsbl', `sbl-xbl.spamhaus.org')dnl
Both have proven extremely effective with very few false positives, and this feature, using whichever list, has the added virtue of allowing an OK override in the access list.
- The last is HACK(`require_rdns'), also written by Neil Rickert. I enhanced Neil's original version so that it would allow an OK override in the access list. This enhanced version has been unbelievably effective, in sheer numbers, while also so far amazingly accurate (I estimated a false positive rate of 0.5% after a few weeks, but I think that may have gone even lower since I white-listed a few sites). As the name suggests, it requires that the SMTP client's IP address reverse map to some name, and also that the name forward map to the same address. (An IP address can have multiple A records; this merely requires that the original IP address is one of them.) This is the single most effective anti-spam rule I have ever deployed.
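The check that require_rdns performs (the client's address must reverse map to a name, and that name must forward map back to the same address) can be sketched outside sendmail as follows. This is a Python illustration rather than the hack's actual m4 code; the function names are hypothetical, the sketch handles IPv4 only, and the resolver functions are passed in as parameters purely so the logic can be exercised without network access.

```python
import socket


def _reverse(ip):
    """PTR lookup: return the primary host name for an address."""
    return socket.gethostbyaddr(ip)[0]


def _forward(name):
    """A lookup: return the list of IPv4 addresses for a host name."""
    return socket.gethostbyname_ex(name)[2]


def passes_rdns(ip, reverse=_reverse, forward=_forward):
    """True if ip reverse maps to a name that forward maps back to ip."""
    try:
        name = reverse(ip)
        # A name may have several A records; the original address
        # only needs to be one of them.
        return ip in forward(name)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
```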
Overall, spam getting thru my personal domain's mail server to my users (including myself, my wife, my siblings, our mom, etc.) has dropped about 90% since I started using these techniques, despite the ever-increasing spam trends on the rest of the Internet.
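For readers unfamiliar with how a DNS-based blacklist is consulted, the following Python sketch shows the conventional lookup: the octets of the client address are reversed, the list's zone is appended, and the resulting name is resolved; if it resolves, the address is listed. The function names are hypothetical, the sketch covers IPv4 only, and treating every lookup failure (including timeouts) as "not listed" is a simplifying assumption.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up, e.g. 99.2.0.192.<zone> for 192.0.2.99."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """True if the blacklist zone returns an A record for the address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name (or any other lookup failure): treat as not listed.
        return False
```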
My simple solution to spam (Score:5, Informative)
by KalvinB (205500) on Saturday January 03, @08:33PM (#7870109)
(http://www.icarusindie.com/)
Spammers need images to get past word filters and to make an ad "stand out." Images can't be sent with the e-mail so src tags are used. href tags are also used for links they expect people to click on. "http://" is a unique identifier that absolutely cannot be obfuscated or it will not work. You can add a lot of junk before an @ symbol but eventually the real link must be there. Simply block that link and poof, no more spam from spammers advertising using that domain. You can block countless spammers by blocking a single 100% unique URL that no legitimate e-mail will ever contain. The full write up [icarusindie.com] of my take on what I see as horribly flawed ways to combat spam and source code for the custom programs I use to strip links out of e-mails.
I have an example of spam posted there where everything is just a mess in the e-mail. The headers are forged, the text is all obfuscated. But there, clear as day is an "HTTP://"
Poof, killed the spam domain. And there's no way to circumvent my method except by not having links of any form in the e-mail. If you put a link in a spam, I will find it and I will block it.
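The link-blocking idea in this comment can be sketched in a few lines of Python. This is not the poster's program (which is not reproduced here); the function names and the blocked-host set are assumptions. The sketch pulls every http(s) URL out of a message body, strips any junk placed before an @ sign, and compares the real host against a local block list.

```python
import re
from urllib.parse import urlsplit

# Rough URL matcher: stops at whitespace, quotes and angle brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def linked_hosts(body):
    """Return the set of lower-cased host names that the body links to."""
    hosts = set()
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname  # drops any user:password@ prefix
        except ValueError:
            continue  # malformed URL; ignore it in this sketch
        if host:
            hosts.add(host.rstrip("."))
    return hosts


def is_spam(body, blocked_hosts):
    """True if the body links to at least one blocked host."""
    return not linked_hosts(body).isdisjoint(blocked_hosts)
```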
Slashdot
Re:Bout Time! (Score:5, Informative)
by Just Some Guy (3352) <kirk+slashdot.strauser@com> on Monday June 28, @09:45AM (#9550230)
(http://subwiki.honeypot.net/ | Last Journal: Wednesday December 31, @03:36PM)
I "augmented" SpamAssassin with an extremely tight Postfix ruleset. A remote server has to jump through these hoops before SA ever gets a crack at it:
1. HELO Filtering
- Reject any connection that doesn't start with HELO or EHLO.
- Allow any host on my LAN to continue on to step 2.
- Reject any host not on my LAN that sends a hostname or IP of a machine on my LAN.
- Reject non-FQDN hostnames (ala "mailserver").
- Reject invalid hostnames (ala "432$@@112").
- Let everyone who makes it this far continue on to step 2.
2. Sender Filtering
- Allow authenticated senders to continue on to step 3.
- Allow hosts on my LAN to continue on to step 3.
- Reject non-FQDN sender domains ("foo@bar").
- Reject unknown sender domain ("[email protected]") - after all, if I can't resolve their domain, then I couldn't reply to them anyway, right?
- Let everyone who makes it this far continue on to step 3.
3. Recipient Filtering
- Reject non-FQDN recipient domains (they'd bounce anyway).
- Reject unknown recipient domains (same as above).
- Allow authenticated users to send their mail and stop processing.
- Allow hosts on my LAN to send their mail and stop processing.
- Reject mail from anyone else that isn't to one of my domains, or one I'm an MX for.
- Use SPF to reject spoofed email.
- Use the relays.ordb.org, list.dsbl.org, and sbl-xbl.spamhaus.org DNS blackhole lists.
- Greylist all email not coming in from or going out to peer MXes.
- Pass everything else to step 4.
4. Content Filtering and Delivery
- Use ClamAV to reject viruses. This takes a big load off SpamAssassin.
- Use SpamAssassin to tag messages.
- Use Cyrus's Sieve to reject high-probability spam, put medium-probability messages into a "review" folder, and filter everything else into the appropriate folders.
I reject over 95% of all incoming mail before it ever gets to SpamAssassin. This means that SA's success rate isn't as good as on other systems (since I weed out all of the obvious spam), but my mailbox is happy and shiny.
SpamAssassin is a brilliant last line of defense, but I wouldn't advise just dumping your raw incoming stream into it. Much of the useful information about a message isn't available to spamd (such as your list of local domain names, relay domains, etc.) and you should consider using a set of cheaper filters to flush out the blatant chaff.
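The HELO filtering stage of the ruleset above is cheap enough to express in a few lines. The Python sketch below is an illustration of the checks as described in the comment, not Postfix configuration; the function name, verdict strings and the my_names parameter are hypothetical, and the hostname test is deliberately simple (address literals such as "[192.0.2.1]" are rejected as invalid here).

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with "-".
LABEL_RE = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def helo_verdict(helo, client_is_local, my_names):
    """Apply the HELO checks in order and return "accept" or a reject reason.

    my_names is a set of lower-case host names and addresses of local machines.
    """
    if client_is_local:
        return "accept"  # hosts on the LAN skip the remaining checks
    name = helo.strip().rstrip(".").lower()
    if name in my_names:
        return "reject: claims to be one of my hosts"
    labels = name.split(".")
    if not all(LABEL_RE.match(label) for label in labels):
        return "reject: invalid hostname"
    if len(labels) < 2:
        return "reject: not fully qualified"
    return "accept"
```

Only messages that survive inexpensive checks like these would then be handed to the costlier content filters.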
Re:Great News! (Score:5, Interesting)
by NigritudeUltramarine (778354) on Saturday June 26, @04:59AM (#9535803)
A success rate of 95% really sucks when (like me) you get just over 2,500 spams a day. That'd still mean around 125 spams a day would be getting through. (I've had the same email address since the early 1990's, back when there was no reason to keep your email address "secret.") Personally I do use SpamAssassin, but as an intermediate step.
First step: Check a whitelist of known senders. Deliver if the sender is on the list, AND the message originated from an IP subnet that I allow for them personally.
Second step: Scan with SpamAssassin. If the score is really high (above 20) throw it the hell out.
Third step: If the score is less than 20, and the person wasn't whitelisted, run the message through TMDA [tmda.net] and politely tell the sender I'm not sure who they are, and I get a lot of spam, and could you please click this link to prove that you're a real person.
I've been using this three-step system for eighteen months now, and out of over one million messages that have come into my mailbox (really), exactly FOUR spam messages have made it all the way through. Apparently the spammers decided to go ahead and click on the little link, or they used a real person's return address, and when that person got the autoreply, they were too stupid to understand what was going on.
Even better, I have not received ANY indication that I've lost any messages; at least, no one has ever mentioned anything about an email that I didn't get.
I've got five other people at my domain using the same system, although for not quite as long (one for fifteen months, three for about a year, and one for just a month now); they have all had similar success.
So based on those numbers I'd estimate a success rate of 99.9997% for eliminating spam (which is, admittedly, COMPLETELY INSANE), and a false-positive (or at least "lost message") rate of 0% so far (fingers crossed). A few people have had to confirm their messages, of course, but I've whitelisted them as that happens.
I actually wrote all the connecting code in PHP, believe it or not, with a MySQL database as a backend. It's invoked using .qmail files. PHP is indeed good for things other than web pages; and was a little bit easier for me to maintain and deal with than Perl. The whole thing is less than 25KB of code. There is also a web backend which I use to configure it; that adds another 40KB. The whole system took about twelve hours of programming to set up, on one Saturday.
Now, for correspondence to companies (such as Microsoft, or Amazon.com), I use a different scheme (although it's handled by the same PHP code). I create a unique email address for each of them, which ONLY allows mail to or from that domain (for example "[email protected]" only allows messages from amazon.com). Those addresses are also easily cancellable, individually, if the company starts to annoy me with spam. Basically, each email address can be assigned its own unique whitelist, and can be cancelled individually at any time, through the little web interface.
I also have a number of email addresses for things such as customer support for our company (I write computer software). I'm using the same system for those, also, but instead of checking whitelists based on the sender, I've found a simple way to do it is to check for ANY of our product names anywhere in the message body or subject. If the message doesn't mention any of them, it sends a simple autoreply back similar to that in (3) above, but mentioning that the message didn't seem to be about any of our products, but if it was, please click here, blah blah. We don't have a high volume of support messages (about one or two a day; we're a small company) but in the last year only three or four people have had to click through like that, and, honestly, their support requests were so f*cked up anyways that I'd rather it just dropped them on the floor. ;-)
Then, as a very last step in all this, I also catch all email sent to invalid addresses in my various domains (which come to over 5,000 messages a day), and report those as spam to Vipul's Razor [sourceforge.net]. Which helps out the community, and me indirectly because my SpamAssassin installation also uses the Razor.
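The three-step triage described earlier in this comment (whitelist with a per-sender subnet check, discard on a very high SpamAssassin score, otherwise challenge the sender) can be summarized in a short Python sketch. The poster's real implementation was written in PHP and is not shown; the function name, the whitelist layout and the return values below are assumptions made for illustration.

```python
import ipaddress

DISCARD_SCORE = 20.0  # the threshold quoted in the post


def triage(sender, client_ip, score, whitelist):
    """Return "deliver", "discard" or "challenge" for one message.

    whitelist maps a lower-case sender address to a list of networks
    (CIDR strings) from which that sender is allowed to send.
    """
    addr = ipaddress.ip_address(client_ip)
    # Step 1: known sender AND message came from a subnet allowed for them.
    for net in whitelist.get(sender.lower(), ()):
        if addr in ipaddress.ip_network(net):
            return "deliver"
    # Step 2: a really high score is thrown out unseen.
    if score > DISCARD_SCORE:
        return "discard"
    # Step 3: everything else gets a challenge (TMDA in the post).
    return "challenge"
```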
3.0, late-July, early August (Score:5, Informative)
by chathamhouse (302679) on Saturday June 26, @04:31AM (#9535751)
(http://www.chathamhouse.org)
3.0.0pre1 was made available last week. It will apparently take another month or so to finalize the weighting of the rules.
I've put 3.0.0pre1 on a production system that filters ~350k messages per day. With some tweaking of the RBL, bayes, and AWL rules, it is much (~10%) more efficient at tagging spam than 2.63, which I'm running on a parallel server that also sees ~350k messages/day (load balancing is your friend).
More info: http://www.au.spamassassin.org/full/3.0.x/dist/build/3.0.0_change_summary
sorting mail by spamassassin score (Score:5, Informative)
by David Jao (2759) * <[email protected]> on Saturday June 26, @03:08AM (#9535570)
(http://dominia.org/djao/)
I'd like to delete anything with a score > 15, simply store anything with a score > 5, and send an auto-reply for scores between 5 and 10 indicating that the message was marked as spam and I'll probably never look at it.
I can't speak for auto-replies, but you can do the sorting part client-side. The key is that spamassassin adds a line like "X-Spam-Level: *****" where the number of *'s is the score of the email. Almost any email client can filter mail to different folders based on headers. The unary representation of the spam score ensures that even a primitive filter can work.
For example, one popular client is Microsoft Outlook, and there are several web pages in google (such as this one [carleton.ca]) that explain how to reroute mail to specific folders depending on the spamassassin score.
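Sorting on the X-Spam-Level header needs nothing more than counting asterisks. The Python sketch below is a hypothetical stand-in for a mail client's filter rules; the function names, folder names and default thresholds are assumptions. Because the star count only reflects the whole-number part of the score, the thresholds are approximations of "score > 5" and "score > 15".

```python
from email import message_from_string


def spam_level(raw_message):
    """Count the asterisks in the X-Spam-Level header (0 if it is absent)."""
    msg = message_from_string(raw_message)
    return (msg.get("X-Spam-Level") or "").count("*")


def folder_for(raw_message, store_at=5, delete_at=15):
    """Pick a destination based on the number of stars."""
    level = spam_level(raw_message)
    if level >= delete_at:
        return "delete"
    if level >= store_at:
        return "spam"
    return "inbox"
```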
Get the owner, not the dog..... (Score:5, Insightful)
by Univac_1004 (643570) on Saturday June 26, @04:13AM (#9535710)
(Last Journal: Monday June 21, @11:35PM)
Spam Assassin, while a very clever program, is as misdirected as the "Canned Spam" legislation. It has no effect on the real economics of spam: who pays for it. Somebody is paying for the spamming, and we know exactly who it is. The URL of that organization is prominently displayed in every item of spamail. It is the advertiser.
The advertiser is right there out in the open, easy to locate. If they're not, the spam isn't doing its job, and wouldn't have been sent. And easy to locate means easy to go after, easy to sue, to fine, DoS or whatever.
Dinging the advertisers, and dinging them hard, will instantly put the spammers out of business.
Spamming can be eliminated without blocking, white lists, or anti-spoofing RFC's. Just go to where it's pointing.
To draw an [ugly, graphic] picture: a dog comes and poops on sidewalk in front of my house, and I step in it. Yelling at the dog is going to be only moderately successful, building a poop filter is difficult, messy, and leaky (as Spam Assassin demonstrates) . Following the dog's leash and fining the owner is what works.
The owner doesn't bring the dog back since s/he doesn't want to pay another fine.
No owner, no dog, no spam.
Get the owner.
OT: Spam Cannibal (Score:2)
by gilgongo (57446) on Saturday June 26, @06:50PM (#9539590)
(http://www.hatters.org.uk/ | Last Journal: Tuesday July 29, @04:19PM)
As it seems now obligatory to mention anti-spam systems whenever a /. story mentions spam, I thought I'd add the following: Please have a look at Spam Cannibal [spamcannibal.org]
It's an interesting concept that if correctly deployed (big "if") by even a relatively few admins around the world, could really make a difference to the amount of spam on the net. It can also protect hosts against DoS attacks of various kinds.
Don't get me wrong, I'm not astroturfing this (much...). It has flaws - there are those who think blacklisting is a bad idea, and I can see their point of view on that - but I just think Spam Cannibal needs more visibility as an approach.
Challenge-Response schemes are more effective (Score:2, Interesting)
by cpghost (719344) on Saturday June 26, @04:59AM (#9535802)
(http://www.cordula.ws/)
Filtering spam generates way too many false positives. Challenge/Response schemes are IMHO much more effective. TMDA [tmda.net] and similar programs can be configured with whitelists for your regular mail partners, auto-whitelists for everyone who confirms their e-mail identity, and, if necessary, with blacklists too.
Re:DSpam (Score:4, Interesting)
by Chief Typist (110285) on Saturday June 26, @11:42AM (#9537291)
(http://www.iconfactory.com/)
The best feature of DSPAM, in my opinion, is that the SPAM never leaves the mail server. The bad messages go into a quarantine on the server and can be reviewed by the end user using a web-based interface (looking for false positives.) In the press of a button, that quarantine can be emptied, freeing up disk resources on the server.
Other SPAM solutions (like SpamAssassin) mark the message and continue with delivery. What's the point in downloading the SPAM to your mail client just to throw them away?
Re:Is this a *smart* idea? (Score:5, Interesting)
by DocSnyder (10755) on Saturday March 20, @07:58AM (#8620216)
(http://docsnyder.de/)
I don't know, whether this is such a brilliant idea - if this gets widely adopted it can't be long before some idiot will get the idea of paying for a spam to "advertise" one of his competitors just to get HIS site blocked...
I'm sure AOL won't block any joe-jobbed targets but only bulletproof servers hosted at Chinanet, Telecom Malaysia, Procergs.com.br etc. which have been spamvertised by known spam gangs.
This is *really* a good idea - Alan Ralsky uses several "throw-away" domains per spam run, but only a handful of different servers to host his crap. Null route these and Ralsky can enlarge his own penis.
This is mandatory for webmails (Score:5, Interesting)
by chrysalis (50680) on Saturday March 20, @07:59AM (#8620220)
(http://www.pureftpd.org/)
The company I'm working for provides a free webmail service ( http://www.skymail.fr ). This kind of service frequently gets abused by spammers. They abuse it in two ways:
1) they open an account, just to have a valid address in order to bypass basic spam filters. Then, they send their spam through other servers using this address as the sender.
2) they use scripts to send spam through the service, as any regular user would. This is extremely annoying.
For 1) we publish SPF for all domains we send mail from. Now, it's up to people to enable SPF on their mail servers.
For 2) we filter _all_ packets coming from China, Korea, Nigeria and addresses listed in Spews and Spamhaus databases. That's about 13000+ filtered networks. Thanks to OpenBSD packet filter, it's trivial to set up and it doesn't introduce any slowdown.
This tutorial describes how to configure BSD systems to use DNS blacklists, procmail, mail "sanitizing" scripts, daemons that watch logs for evidence of spamming and "mail bombing," and similar utilities. Prevention of unauthorized relaying and detection and blocking of outbound spam are also discussed. Countermeasures against address harvesting and privacy invasion techniques such as "Rumplestiltskin" attacks, fingerd scans, tracking via identd, e-mail cookies, and malicious image tags in HTML mail are covered in detail.
kuro5hin.org
tbc Mon Mar 24th, 2003 at 10:02:22 PM EST
I use the "plussed user" feature of sendmail. I searched for plus sign email at Google, and the first ten results weren't too helpful. If nothing else, I hope this diary entry changes that after Google indexes it.
I delete most spam and think nothing of it. Then I started getting spam on my cellphone, and I started logging them on my spammer blacklist wiki page. I got spam a couple days ago, though, that warrants a diary entry. 8 copies of the same message were sent to addresses undeniably harvested off my Web pages: timc+web+writing@divide.net, timc+issre2k@divide.net, timc+web+cancer@divide.net, timc+ca125@divide.net, timc+uflaccid@divide.net, timc+hacks@divide.net, timc+geekcode@divide.net, and timc+web@divide.net.
My ISP supports this feature, which allows mail addressed to [email protected] to be delivered to [email protected], and I have procmail rules that delete all mail sent to plain [email protected].
"Plussed users" are explained at sendmail.org. Not all ISPs support this, but it's easy enough to try. Just send a message to yourself with +anything appended to your regular e-mail account name and see if you get it. I tested yahoo.com and hotpop.com; neither one supports it.
I pepper my Web pages with these tagged e-mail addresses so I know why people are writing to me. Each of the hyperlinked plussed users in this article's introduction triggers a Google search to see which page the spammer was harvesting from.
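Working out which tagged address a message was sent to is a simple string operation. The Python sketch below is an illustration only (the diary's author used procmail rules, which are not shown); the function names are hypothetical. It splits a plussed address into user, tags and domain, and tallies the tags seen across a batch of recipient addresses.

```python
from collections import Counter


def split_plussed(address):
    """Split "user+tag1+tag2@domain" into (user, [tags], domain)."""
    local, _, domain = address.rpartition("@")
    user, *tags = local.split("+")
    return user, tags, domain


def tag_counts(recipients):
    """Count how often each full tag (e.g. "web+writing") was targeted."""
    counts = Counter()
    for address in recipients:
        _, tags, _ = split_plussed(address)
        if tags:
            counts["+".join(tags)] += 1
    return counts
```

Applied to the harvested addresses listed above, each tag points back to the page on which that address was published.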
Here's the spam, with my commentary.
Received: from [198.126.104.216] by 207.76.102.240 with ESMTP id XSZCPC; Sat, 22 Mar 03 08:36:11 +0400
Received: from [175.59.87.96] by 198.126.104.216 with ESMTP id ZDJEED; Sat, 22 Mar 03 08:20:11 +0400
From: "Joyce Bryant" <[email protected]>
The new 2003 edition of the xxxxxxxx xxxxxxxx
xxxxxxxxx is out! It includes comprehensive and
updated information on xxxxxxxx xxxxxxxxx, xxxx,
xxxxxxxxx, xxxxxxxxxx, xxxxxxxxxx, xxxxxx xxxxx
xxxxxxxx, xxxxxxxxxxxxxxxxxxxx, email addresses
and much more. The cost of the xxxxxxxxx is $285.
To order the xxxxxxxx xxxxxxxx xxxxxxxxx, please
print this email, complete the information below
and fax it to 905-751-0199 (tel: 905-751-0919)....
To unsubscribe: Send a blank email to: [email protected]
with "Remove" in the subject line.
Yeah, right.
See also: the c2 wiki's SpamProof page.
Slice the Spam into workable chunks (Score:1)
by JumperCable (673155) on Sunday January 04, @12:01AM (#7870832)
Everyone is complaining that no solution works against the spam problem. True, there is no single magic bullet. But instead of throwing up our hands, yelling that we are screwed and letting the bastards overrun us, we need to break the problem down into workable chunks.
[Nov 10, 2003] Spam Nation
November 10, 2003 | InformationWeek
Pinpointing the origin of spam, a necessary step for effective law enforcement, is one of the thorniest problems, because of the mutability of message-header information and "relay raping," the practice of using open server relays to conceal the path of a message. And anti-spam tools don't help, Richter contends. "All these technology companies are doing is taking legitimate marketers who aren't causing problems and filtering our mail because that's all they can catch consistently," he says.
... ... ...
Internet service providers put a lot of effort into combating spam, blocking illegitimate incoming messages and bouncing spammers sending out messages from their systems. While technology can be employed to automate the identification and blocking of unsolicited bulk E-mail, catching and legally removing a spam sender remains a human-driven process. "The way we find out that spam has traveled across our network is when we receive a complaint from a user," says Craig Silliman, director of the network and facilities legal team for MCI. Mary Youngblood, abuse team manager for EarthLink Inc., says it can take months to get a resilient spammer off the network through the legal system.
Youngblood at EarthLink says for this reason, the ISP relies on monitoring tools to seek out spammers: "We look at E-mails themselves, we look at the products they're selling, we look at how many times our automatic processes had to end the connection with their mail machine because of 'user unknowns' [undeliverable mail], we look at our spam filters."
Spammers, she says, make no effort to fine-tune lists to get higher-percentage response rates. "They don't think that way. What they say is, 'Gee, if I get a one-out-of-a-thousand response, think how much I would get if I doubled my E-mail,'" she says. "Spammers deal in volume, instead of only sending E-mail to those who want it." Of course, it's possible to disagree about whether permission was given to receive messages. Many of those who believe they've been spammed, Richter says, received the unwanted E-mail as a result of their own actions, such as registering for prizes at Web sites.
... ... ...
Atkins sees the cost of enforcement as a problem. "Most of the spam out there breaks existing consumer-protection, criminal, or fraud laws," she says, echoing similar concerns voiced by ePrivacyGroup's Everett-Church. "But spammers are hard to prosecute. They hide, they lie, they cheat, and it costs a lot of money to track them down and build a case against them. That is money a lot of states don't have."
Richter concurs. "The people who these laws are supposed to be trying to attack, they're not going to be affected," he says. "The guy overseas isn't affected."
Greylisting got its name because it is kind of a cross between black- and white-listing, with mostly automatic maintenance. A key element of the Greylisting method is this automatic maintenance.
The Greylisting method is very simple. It only looks at three pieces of information (which we will refer to as a "triplet" from now on) about any particular mail delivery attempt:
- The IP address of the host attempting the delivery
- The envelope sender address
- The envelope recipient address
From this, we now have a unique triplet for identifying a mail "relationship". With this data, we simply follow a basic rule, which is:
If we have never seen this triplet before, then refuse this delivery and any others that may come within a certain period of time with a temporary failure.
Since SMTP is considered an unreliable transport, the possibility of temporary failures is built into the core spec (see RFC 821). As such, any well behaved message transfer agent (MTA) should attempt retries if given an appropriate temporary failure code for a delivery attempt (see below for discussion of issues concerning non-conforming MTA's).
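The rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not the reference implementation: the class name, the in-memory dictionary, and the use of reply codes 451 and 250 are choices made here, and the one-hour delay is simply the default value mentioned later in the discussion. A real deployment would also expire old records.

```python
class Greylist:
    """In-memory greylist keyed on (client IP, envelope sender, envelope recipient)."""

    def __init__(self, delay=3600):
        self.delay = delay      # seconds a new triplet must wait (assumed: 1 hour)
        self.first_seen = {}    # triplet -> time of the first delivery attempt

    def check(self, ip, sender, recipient, now):
        """Return an SMTP-style reply code for one delivery attempt."""
        triplet = (ip, sender, recipient)
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.delay:
            return 451          # temporary failure: a real MTA will retry later
        return 250              # the triplet has waited long enough: accept


gl = Greylist(delay=3600)
t = ("192.0.2.10", "alice@example.org", "bob@example.net")  # hypothetical triplet
print(gl.check(*t, now=0))      # first attempt
print(gl.check(*t, now=600))    # retry inside the delay window
print(gl.check(*t, now=4000))   # retry after the delay has passed
```

Only the third attempt is accepted. A sender that never retries never gets past the first temporary failure.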
During the initial testing of Greylisting, it was observed that the vast majority of spam appears to be sent from applications designed specifically for spamming. These applications appear to adopt the "fire-and-forget" methodology. That is, they attempt to send the spam to one or several MX hosts for a domain, but then never attempt a true retry as a real MTA would. From our testing, this means that currently, based on a fairly conservative interpretation of testing data, we see effectiveness of over 95%, and that is with no legitimate mail ever being permanently blocked.
This blocking comes with a minimal price in terms of local resources. Assuming the use of a local datastore for the triplet and other metadata, there is no required network traffic caused by Greylisting other than that associated with the connection itself. Since we are not checking the contents of the message at all there is very little processing overhead, unlike many other spam blocking methods.
There is one effect that could be seen as either a positive or negative. Since the Greylisting method delays acceptance of unknown mail, that will generate a little more work for the sending MTA of legitimate mail. The flip side is that it generates a lot more work and smarts for the spammer's systems, hopefully enough to make the costs of spamming higher, possibly even to the point of making spamming unprofitable for some of them.
The best part is that since we never permanently fail a message delivery, as long as the delivering MTA's are well behaved, we should never cause a legitimate mail to bounce. There should never be a false positive!
Slashdot
Re:security through obscurity, again? (Score:5, Interesting)
by blakestah (91866) on Friday June 20, @02:48PM (#6256300)
(http://www.keck.ucsf.edu/~dblake)
The thing that is wrong is the SMTP protocol, and most people's conception of a spammer. Once you see a few "confessions of ex-spammers", everything changes. There are people out there who pay $10000 in startup costs, and then make $2000/week for spamming. The $10000 gets them software written by knowledgeable internet security experts. This software finds any and every way to anonymify the email spam, and finds lists of people to spam.
As long as knowledgeable internet security experts are getting paid good cash to enable spammers, and SMTP doesn't change, spam will only continue to get worse. There needs to be a fundamental change in SMTP protocols. It oughta take the spammers about 2 days to fix their MTA bug to get around greylisting.
Re:security through obscurity, again? (Score:4, Insightful)
by SillySlashdotName (466702) on Friday June 20, @03:14PM (#6256584)
I see that, in fine /. tradition, you didn't RTFA. From the article: If we have never seen this triplet before, then refuse this delivery and any others that may come within a certain period of time with a temporary failure. (emphasis added)
Later in the article it goes into much more detail about the delay, how long to delay if the triplet has not been seen before, life time of the whitelist, etc.
It also talks about configuring the times - they mention the default delay is 1 hour, but their records suggest that 1 minute would have caught 99% of the same spam messages - "The data collected during testing showed that more than 99% of the mail that was blocked with the tested setting of 1 hour would still have been blocked with a delay setting of only 1 minute. At that point, having a larger initial delay will definitely help, as it gives time for other blocking methods to act. For this reason, it is suggested that at least a one hour delay value be kept as a default, since spammers will start adapting as soon as this method becomes known and starts being used."
Re:security through obscurity, again? (Score:5, Interesting)
by letxa2000 (215841) on Friday June 20, @04:48PM (#6257552)
(http://www.geocities.com/efaxslams)
is reject the mails on the greylist after holding the connection for, say, 10 minutes. That will help deter spamming software, I doubt it. I would assume the spam software would have a timeout, and I doubt it's ten minutes. If they want to hit-and-run and aren't even willing to make a second delivery attempt when an error code is returned, I doubt they're going to wait 10 minutes. I'm sure that within 30 seconds or less they'll consider it a dead connection and hang up.
Problem is, I used to have my sendmail HANG UP in real-time on an incoming connection as soon as it realized a message was spam. I.e., the incoming message was filtered in the DATA phase and if it was spam I hung up immediately. It worked great and it felt good, but there were many spam programs that took the disconnection as some kind of TCP/IP failure and immediately tried again. So I had one day where a single message was attempted to be delivered about 30,000 times as the spammer connected, I hung up, spammer software said "Oops, let me try again!" About one delivery attempt every second or so.
I'd be willing to bet if you put a 10 minute timeout in sendmail you'll see lots of spammer software disconnecting sooner and just trying again. It takes more of their resources, but takes more of yours, too.
Re:security through obscurity, again? (Score:5, Insightful)
by blakestah (91866) on Friday June 20, @03:33PM (#6256813)
(http://www.keck.ucsf.edu/~dblake)
RTFA! There is no magical waiting period or re-try period that cannot be trivially coded around. And, with good money on the line, will be trivially coded around.
You don't get it. Really smart people are getting paid a whole lot of money to make programs to exploit every possible crack in the way we send email. There is no general rule to spammers, except that it is a lot of money and they are very clever. Little bandaids are not going to stop this one - there needs to be a much more fundamental change. And I am not talking about laws against spam - I am talking about changes in the protocols we use to send email.
Re:your first mistake (Score:5, Interesting)
by Henry Stern (30869) <[email protected]> on Friday June 20, @03:29PM (#6256774)
(http://www.stern.ca/)
It means they have to do retries...that means spam runs take longer, especially since they have to run...then wait for a locally defined timeout, and run all those addresses again
AND they have to do it from the same IP.
Not to mention that if this is used in conjunction with other collaborative tools (e.g. RBL, checksums), by the time that the spamming MTA can return, its IP address will have been submitted to MAPS/etc. and the contents of the message will have been submitted to Razor/Pyzor/DCC.
I think that this greylisting idea will be pretty hard to beat by Joe spammer. Since the game of spam detection is pretty much an arms race, slowing him down will probably be enough to turn the battle in your favour.
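For readers unfamiliar with how an RBL of the kind mentioned above is consulted: DNS-based blocklists are conventionally queried by reversing the octets of the client's IPv4 address and appending the list's zone name. The sketch below only builds that query name and does no network lookup; the function name and the zone dnsbl.example.org are placeholders, not a real service.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name that a blocklist lookup would query for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)   # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Placeholder zone; a real deployment would use the zone of the list it trusts.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
# prints 99.2.0.192.dnsbl.example.org
```

If a lookup of that name returns an address record, the client is listed; greylisting's delay gives such lists time to catch up before the retry arrives.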
Re:can't believe their numbers (Score:5, Informative)
by McDutchie (151611) on Friday June 20, @02:49PM (#6256312)
(http://slashdot.org/)
Eh, open relays are soooo 20th century. :) Actually most open relays today are either blocked or closed, and newly installed MTAs are secure against third-party relaying by default, so this spam method is dying out [it-analysis.com]. Most spam today is sent either directly to the receiving MTA, through open proxies, or through formmail.pl and similar exploits.
Tempfailing is not new and unique (Score:5, Informative)
by HiKarma (531392) * on Friday June 20, @02:39PM (#6256198)
This idea isn't so new or unique. It's been discussed a fair bit on the ASRG [ietf.org] mailing list under the name "tempfailing". First I heard of it was from Landon Noll and Mel Pleasant. It is noted in brief as one of the techniques in this plan to end spam [templetons.com] (though their plan, which did include the triplets, is not laid out in full there.)
It is a worthwhile technique for a little while, and if spammers were rational, would be worthwhile for some time to come. But spammers are not rational, and already this technique is not as useful as would be hoped.
Do a Google Search for Tempfailing [google.com] especially in ASRG to see statistics etc.
Re:1 false positive is not acceptable. (Score:5, Interesting)
by pclminion (145572) on Friday June 20, @03:00PM (#6256426)
Wrong. 1 false positive can be acceptable, and in fact is probably better than how things are now. At USENIX '03 there was a paper presented on artificial intelligence techniques for spam detection. I can't provide a link since only USENIX members can download the paper (at this point, at least). I was a coauthor of that paper.
One of the things we've discovered in our research is that some classes of filters (most notably, the one I have been developing along with a few other individuals) are actually more effective at correctly classifying email than humans are. That is to say, you can train the learning algorithm on mostly-correctly-classified data, then re-run it over the training data, and almost miraculously, it discovers all kinds of email in the training set that was incorrectly classified.
I.e., this filter has discovered mail that I myself incorrectly thought was spam. It's scary, because there's a lot of it.
To assume that a human will always be 100% accurate at classifying their own email isn't just arrogant, it's plain wrong. Newer filters that will be introduced in the near future might possibly be more accurate than you, a frail human, could ever be.
How about Habeas' haiku method? (Score:4, Interesting)
by siskbc (598067) on Friday June 20, @02:56PM (#6256372)
The best idea I've seen in YEARS was to have people start using a specific, original poem as their signatures. Then, the author granted license to anyone who WASN'T sending spam. Therefore, they could sue any spammer for copyright infringement if they used it, and you could train your mail filter to look for the signature. Once spamassassin took it up, it pretty much snowballed. See story here [wired.com]
Re:Bayesian Filtering (Score:2)
by anti$pam (682702) on Friday June 20, @04:42PM (#6257503)
The key is to make spammers not make money! If people start adopting anti-spam technologies we would reduce the return spammers get from sending spam. Reduce this enough and the spamming business will no longer be profitable.
POPFile is great. I've also used SAProxy (http://saproxy.bloomba.com/) under windows and it works great too.
Again, the idea is not to eliminate all spam, but to reduce the return rate, and therefore the money made by spammers.
Published a paper? (Score:4, Informative)
by Call Me Black Cloud (616282) on Friday June 20, @02:58PM (#6256400)
Where? To me, publishing a paper means your writing appeared in some peer-reviewed journal (where the "peers" are acknowledged as domain experts). What you did was put up a web page. With a donation link at the bottom. For others looking for a solution, try POPFile [sourceforge.net]. Open source, cross platform, gives me 96% accuracy.
One more thing: "practically eliminates" is not the same as "eliminates".
Re:Published a paper? (Score:4, Insightful)
by vidarh (309115) <[email protected]> on Friday June 20, @03:33PM (#6256824)
(http://www.personalnames.com/ | Last Journal: Friday April 04, @04:47AM)
To me publishing a paper in a peer reviewed journal instead of on the web would mean that I'd expect the audience to be reduced to a ridiculously small fraction of people that might be interested. If I wanted to publish something I'd do it on the web first, and if it stacks up people I respect would start talking about it and link to it. Yes, I realize that for "serious" science people still expect things to be published in peer reviewed journals, but in most cases I can't help but think that getting the article out there would be more useful. Sure, peer review is important, and somewhere to look for some kind of verification of the value of a paper is useful. But I much prefer the Research Index [researchindex.com] way, where I can get a good indication of the value of a paper by looking at how many people have cited a paper and WHO have cited a paper.
Anyway, pretending that putting up a document on a website is somehow less publishing a paper than having it printed in a journal, is just plain elitist. You should probably be a bit more critical of papers that you don't know have been through a proper review, especially if you're not a domain expert yourself, but being aware of the source is something that you always need to be.
Delaying email by one hour! (Score:5, Insightful)
by pjrc (134994) <[email protected]> on Friday June 20, @03:04PM (#6256484)
(http://www.pjrc.com/ | Last Journal: Thursday June 27, @05:31PM)
From the linked paper: An hour is short enough that in most cases, users will not notice the delay.
I'm wondering how I'm going to explain that to a new customer over the phone who says "I'll just email that file right now so we can go over it together".
Re:Delaying email by one hour! (Score:5, Insightful)
by vidarh (309115) <[email protected]> on Friday June 20, @03:24PM (#6256712)
(http://www.personalnames.com/ | Last Journal: Friday April 04, @04:47AM)
Agreed. I was involved in operating a large (hundreds of thousands of active users) mail system a couple of years ago, and users would complain if their mail took more than seconds. We had to upgrade our system at one point because rapid growth had made mail delivery take a couple of minutes on average, and it caused bad publicity - a lot of users had a clear expectation that e-mail should be delivered in a few seconds and that if it didn't something was wrong. I think changing that perception of e-mail as near instant will be incredibly hard. And if you succeed it will just move even more traffic over to the IM networks and cause spamming of IM networks to escalate instead.
Bogofilter does pretty well for a client filter (Score:4, Interesting)
by lxdbxr (655786) on Friday June 20, @03:15PM (#6256612)
(http://www.oenone.demon.co.uk/)
The summary does not seem completely accurate; since the greylisting MTA sends an SMTP temp failure there should never be any false positives as long as the sending MTA is vaguely RFC-compliant (sadly not true I suspect). Or at least that was my reading of the paper... I'm currently using Bogofilter [sourceforge.net] (and looking into CRM114 [sourceforge.net]) and getting better than 99% accuracy (about 1 in 200 false negatives at the moment) and very very few false positives (maybe 2 in 5000 messages).
Of course these are MUA level filters (and yes, I know, I've already "paid" with bandwidth to download the spam) - however since the proposed "greylister" would have to be installed as the MTA at major ISPs (as the authors note) I'm not convinced that is more likely to get widespread adoption than the various sorts of adaptive client-based filtering now available, particularly as it requires a database to back the method up.
As far as I am concerned the major factor in a spam filter should be zero false positives - personally I don't mind reviewing one or two spams a week but I would get really annoyed if I were to lose a real message (note the two false positives I have seen to date with bogofilter contained forwarded sales pitches along with a message).
97%? not impressive. It's POPfile for me (Score:4, Informative)
by YE (23647) on Friday June 20, @03:24PM (#6256710)
I get 98-98.5% accuracy with POPfile [sourceforge.net]. I get about 200 mails a day, of which around 30% is spam. I get about 1 false negative a day, and maybe 2 or 3 false positives a month. It's a personal solution and as such is much more attractive to me than something server-based which has to be installed by a [typically VERY uncooperative] BOFH. I use it experimentally for general mail classification (business/personal/a variety of mailing lists etc., all in all 7 buckets) on my home machine, and it works fine in these conditions too, although the accuracy is a bit lower (around 95%).
Greylisting is dead (Score:1)
by MasTRE (588396) on Friday June 20, @05:27PM (#6257886)
All of you naysayers out there (I'd be one too if I said it but I won't, read on to find out why) are making a terrible, terrible assumption: that every mail system admin out there will jump on the greylisting bandwagon and implement this. Back in reality, a lot less than 0.01% will actually implement this technique, especially after reading this thread. So, it's a non-issue. Greylisting is dead.
I'm skeptical (Score:2)
by chrysalis (50680) on Friday June 20, @05:15PM (#6257791)
(http://www.pureftpd.org/)
Greylisting mainly relies on this (quote) : "These applications appear to adopt the "fire-and-forget" methodology. That is, they attempt to send the spam to one or several MX hosts for a domain, but then never attempt a true retry as a real MTA would."
I strongly disagree. A vast majority of spammers actually use real mail servers like Qmail. Or strange spam-specific software with support for retries.
Apart from Spam Assassin, I'm using OpenBSD built-in "spamd" ip-based filter. A quick look at the spamd log files shows that the same spammers retry over and over, usually during 7 days.
What I like in Greylisting is that it actually prioritizes mails. A mail coming from a known source will be processed before a mail coming from an unknown source (that will have to wait for the next try). Not really an antispam feature, but still nice to have.
Anti-Spam Techniques: Honeypot spam detection! (Score:4, Informative)
by mabu (178417) on Friday June 20, @07:06PM (#6258537)
Aside from the obvious of getting the authorities to crack down on the existing illegal activities (relay hijacking, violation of TOS of ISPs, header forging, etc.) which is the only true solution, I think there are much better approaches than this "greylisting" method. The problem with the greylist method is it still slows down mail service, and potentially more than the relay blacklist features. The objective here is that end-user/networks should not be penalized in the fight against spam. We already waste too many resources, and according to my latest mail server stats, more than 65% of our inbound mail is UCE. I'm fed up with more than half my e-mail bandwidth being crap my users didn't request so more resource allocation on a local level in the fight against spam is counterproductive!
Here's a very clever, much more practical method I found recently.
A company in Canada has set up what it calls SORBS [sorbs.net]: Spam and Open Relay Blocking System.
What's different from their blacklist is that they maintain "honeypots" strategically located around the Internet. These are servers they specifically set up as inbound mail relays, but never for legitimate purposes. If the servers get [select] mail activity, it's assumed to not be legitimate and it flags the source as a potential spammer... it makes a lot of sense. You create a domain name, but don't promote it in any legitimate manner, and/or you seed spam lists with these e-mail addresses and then let the spammers send to your key systems around the internet and *bam*, they're identified in real time, and then added to a blacklist.
I really like this idea. Like any other system, it has the potential for abuse but the beauty is the identity of the honeypot systems is kept secret, so it's very difficult for anyone other than spammers to exploit the network.
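The honeypot idea described above can be sketched as follows. This is a toy model under loud assumptions: the class name, the trap addresses, and the policy of listing a host after a single hit are all invented here for illustration and say nothing about how SORBS actually operates.

```python
class SpamTrap:
    """Toy honeypot: any host that mails a trap address gets blacklisted."""

    def __init__(self, trap_addresses):
        # Addresses that are never published for legitimate use (hypothetical).
        self.traps = {a.lower() for a in trap_addresses}
        self.blacklist = set()

    def observe(self, client_ip, recipient):
        """Record one delivery attempt; return True if the client is listed."""
        if recipient.lower() in self.traps:
            self.blacklist.add(client_ip)
        return client_ip in self.blacklist


trap = SpamTrap(["bait@trap.example"])
print(trap.observe("198.51.100.7", "someone@real.example"))  # ordinary mail
print(trap.observe("198.51.100.7", "bait@trap.example"))     # hits the trap
print(trap.observe("198.51.100.7", "someone@real.example"))  # now listed
```

Because nobody has a legitimate reason to write to a trap address, a single hit is treated as evidence here; a production list would presumably apply more careful thresholds and expiry.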
Slashdot
Spammers, scorched earth and stolen subnets (Score:5, Interesting)
by Xeger (20906) <slashdot@@@tracker...xeger...net> on Wednesday June 11, @04:13PM (#6174798)
(http://www.eatgod.com/)
This article raises an interesting point. When a spammer successfully hijacks address space and uses it to send spam, his IPs are naturally going to appear on various blacklists before too long. The problem isn't limited to blacklists, either. Bayesian spam filters [paulgraham.com] will quickly learn to recognize Received-From headers bearing the stolen IPs. Collaborative hashing filters [sourceforge.net] will also be affected, to a degree.
So...the spammer steals a subnet, uses it to spam for a while, and then is either shut down or abandons his activities. He leaves behind a zone of "scorched earth" -- addresses that effectively cannot host a mail transfer agent. It is now the job of the next legitimate recipient to clean up the spammer's mess. He might not even notice anything's wrong until half his emails have gone missing and the other half are bounced with mysterious messages. Having identified the problem, it is now up to him to track down various blacklists and get his addresses removed. The damage done to the Bayesian and collaborative filters simply cannot be undone. Mail will be lost.
To me, this is the real tragedy. Once an address block has been used for spamming, it's effectively ruined until someone inherits it and puts a great deal of time and effort into restoring its good reputation.
i've seen this firsthand (Score:3, Interesting)
by Tancred (3904) on Wednesday June 11, @07:02PM (#6176336)
I'm part of the IP Admin group of a large international ISP and have seen this firsthand. New customers routinely ask us to route space, and sometimes it's difficult to tell if it's theirs or not what with all the mergers, acquisitions and renaming of companies. There's definitely more scrutiny of these requests than there was a year ago. A few months ago spammers started to hijack IP space that was registered to companies that are now out of business, which means that most likely nobody is going to notice what they've done.
After a while it's almost like getting squatters' rights - I've been using it and nobody else has a real claim to it, so it's mine.
SecurityFocus
Network operators were galvanized by a particularly brazen case in April, when a trail of spam led to the discovery that no fewer than six /16s -- nearly 400,000 addresses -- had been misappropriated from Trafalgar House, a British construction and shipping conglomerate that's now part of Aker Kvaerner, headquartered in Norway. From the U.K., Cox discovered that the perpetrators conned the American Registry for Internet Numbers (ARIN) into changing the contact information for the space. One of the /16s was traced to a Dutch spammer, and the other five to a mysterious company called "Fedfinancial Corp."
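As a quick check on the "nearly 400,000 addresses" figure: a /16 leaves 16 host bits, so each block holds 2^16 = 65,536 addresses, and six of them hold 393,216. The snippet below confirms the arithmetic with Python's standard ipaddress module; the 10.0.0.0/16 network is only a placeholder and is not one of the hijacked blocks.

```python
import ipaddress

# Placeholder /16 used purely for counting; not one of the blocks in the story.
block = ipaddress.ip_network("10.0.0.0/16")
per_block = block.num_addresses        # 2 ** (32 - 16)
print(per_block, 6 * per_block)        # prints 65536 393216
```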
Fedfinancial managed to convince ARIN that it had been contracted to provide network management services for Trafalgar. ARIN won't say exactly how it was swindled, but registration records show the grifters had an authentic-looking e-mail address at a newly-minted "traf-infosystems.net" domain, and a genuine street address with matching voice and fax telephone numbers. But the phone numbers ring to Nevada and Offshore Business Formation, a company that sets up corporations for a fee, and takes orders over the Web. Public records show that they incorporated Fedfinancial as a Nevada corporation last January, on behalf of an unnamed client. The street address is also theirs.
ARIN president Ray Plzak says the registry doesn't comment on specific cases, but acknowledged that address space hijacking is a problem. "We have measures in place to detect these kinds of things, and we have a set of procedures that we follow to verify information, and we're continuously looking into ways of improving that" says Plzak. "No procedure is ever 100% perfect, and we recognize that."
Once the ARIN record for a block of space has been tweaked, the new "owner" can show it to a network access provider as proof that he has the right to use the addresses. Kacperski found three providers for his purloined L.A. County block; anyone who questioned his sudden good fortune was treated to a tall tale about an old friend who bequeathed Kacperski the mammoth space when his company went bankrupt.
Anti-spammers argue that access providers should be more skeptical when someone comes in with a ridiculously large allocation. "If it's a customer connecting with T1 and walking in with a /16, or two or three of them, this is something that should set off some alarm bells," says Schlichting. But additional vigilance goes against an access provider's financial interest -- they make money by connecting people, not by turning them away.
And until spammers discovered the technique, IP hijacking was largely considered a dishonest but forgivable path to acquiring old, unused address space belonging to defunct companies. The perpetrators were what the Spamhaus Project describes as "a few crufty geeks" in search of "cheap digs." The scam is victimless in that it normally targets dormant allocations that are otherwise going to waste, in many cases taking blocks of space that belong to defunct companies, or, like the Trafalgar House space, have long faded from corporate memory.
But like the mob moving in on a neighborhood poker game, spammers have turned a once-harmless misdemeanor into an organized and well-funded scheme. Internet defenders shudder at the thought of large portions of the net's real-estate under the control of anonymous rogue entities. "There's no accountability. You don't know who really owns this particular address space. You have no way of finding out," says Schlichting. Some even worry that malefactors will go a step further, and begin hijacking address space that's already in active use. "This whole episode has identified huge weaknesses in the Internet's own infrastructure," says Cox. "What we've seen happen is trivial compared to what we've seen possible."
InformationWeek
In light of EarthLink's announcement and the prospect of millions more users sending challenges, many list administrators already have vowed to ignore them, effectively barring recipients who employ the technique.
"They can get pretty overwhelming is a nice polite way of putting it," said David Farber, a former Federal Communications Commission chief technologist who runs a 25,000-member list on technology.
Though Farber is sympathetic to the war on spam--up to half his inbox is junk--he considers challenge-based techniques too simplistic.
EarthLink's spam filter blocks up to 80 percent of spam. But spam has increased sixfold over the past 18 months.
The company decided to offer its customers the challenge-response option because cranking up spam filtering would only cause more legitimate mailings to get tossed by mistake, said Jim Anderson, vice president of product development.
"It's as close to a silver bullet as you're going to get," Anderson said. "We're simply providing a tool for customers to retake control of the inbox from spammers."
Others deem challenge-response a knee-jerk reaction.
"I'm worried people are going to implement systems like that too quickly because they are so desperate," said Eric Thomas, chief executive of L-Soft International Inc., a Swedish company that makes the popular Listserv mailing list software. "The cure might be worse than the ailment."
America Online now blocks up to 80 percent of incoming E-mail traffic, or more than 2 billion messages a day.
But company spokesman Nicholas Graham says AOL won't adopt challenge-response because having to send out 2 billion challenges a day would tax the system. And why create delays for subscribers?
"They don't want to hear 'You got mail and you just have to wait a few minutes longer,'" Graham said. "They expect to get E-mail quickly and responses quickly."
Anderson said EarthLink has developed the system over several months to minimize the burden on users and list administrators.
Standards call for messages from mailing lists to come with a priority code marked "list" or "bulk." EarthLink's software wouldn't challenge such messages. But because spammers can easily incorporate such coding, such messages would be sorted to a "suspect mail" folder.
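The sorting described above might look roughly like the sketch below. Several things here are assumptions rather than anything EarthLink has published: that the "priority code" is the conventional Precedence header, that pre-approved senders are checked first, and the folder names returned by the hypothetical route function.

```python
from email import message_from_string
from email.utils import parseaddr


def route(raw_message, approved_senders):
    """Return 'inbox', 'suspect' or 'challenge' for one raw message (sketch)."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    if sender in approved_senders:
        return "inbox"                  # pre-approved sender: deliver normally
    # Assumption: the list/bulk marker is carried in the Precedence header.
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in ("list", "bulk"):
        return "suspect"                # not challenged, but not trusted either
    return "challenge"                  # unknown sender: send a challenge


approved = {"friend@example.org"}       # hypothetical approved-sender list
bulk = "From: News <news@example.com>\nPrecedence: bulk\n\nWeekly digest\n"
print(route(bulk, approved))            # prints suspect
```

The sketch also shows why the header alone cannot be trusted: any sender can add the marker, which is why such mail goes to a separate folder instead of the inbox.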
The pre-approved sender scheme also poses difficulties because it doesn't work well with Yahoo Groups and other services where multiple list members post.
Online receipts from Amazon.com and other E-commerce sites also create problems; because they are automated, they won't respond to challenges.
Robert Craddock, chief executive of challenge-response developer DirectPop.net, said that although the system requires legitimate senders to do more work, "I don't think that's a lot to ask in this day and age when everybody's E-mail box is getting inundated."
Some spam experts question whether such techniques will even work. They believe spammers will figure out how to automate responses to challenges--and also learn to make messages appear to come from pre-approved senders or to be "challenges" themselves, said John Levine, a board member of the Coalition Against Unsolicited Commercial E-Mail.
"It's very easy to come up with things that look like a solution," Levine said. "Lots of people say this will solve everything, spam won't be a problem anymore. Of course, they said the same things about a variety of previous techniques."
As spam has proliferated - and with it the attempts by big Internet providers to block messages sent from the addresses of known spammers - many mass e-mailers have become more clever in avoiding the blockades by aggressively bouncing messages off the computers of unaware third parties.
In the last two years, more than 200,000 computers worldwide have been hijacked without the owners' knowledge and are currently being used to forward spam, according to AOL and other Internet service providers. And each day thousands of additional PC's are compromised at companies, institutions and - most commonly of all - homes with high-speed Internet connections shared by two or more computers.
"The spammers have mutated their techniques," said Ronald F. Guilmette, a computer consultant in Roseville, Calif., who has developed a list of computers that are forwarding spam. "Today, if you are trying to do a really mass spamming, it is de rigueur to do it in an underhanded manner."
Just last Thursday, 17 law enforcement agencies and the Federal Trade Commission issued a public warning about some of the ways spammers now commandeer computers to evade detection. The officials translated the warning into 11 languages because many of the exploited computers are known to be in China, South Korea, Japan and other countries with heavy Internet use.
Mostly, the spammers are exploiting security holes in existing software, but increasingly they are covertly installing e-mail forwarding software, much like a computer virus. For some, hacking is no longer about pranks, but making a profit.
"This is not about a hacker trying to show off, or give you a hard time," said William Hancock, chief security officer for Cable and Wireless, the British telecommunications company. "This is about money. As long as there are people who want spam to go out, this is not going to go away."
Spam fighters say that some software is too easy to exploit and should be fixed. Moreover, computer users can take technical precautions to safeguard their machines. But not everyone will bother to take those steps, even if he or she discovers having been dragooned into the spammers' global army.
To begin with, most users do not see much effect when their computer has been co-opted. Surfing the Web from the victimized computer may be slower than usual but that is not always easy to detect. In most cases, the owners' e-mail addresses are not added to the spammed messages, so there is no need to worry that friends and associates will think the PC owners have suddenly started peddling herbal Viagra.
Indeed, the only way most users even become aware of such hijackings is when they receive telephone calls or e-mail from their Internet service providers saying a piece of spam was traced back to their machines.
"People are shocked," said Bobby Arnold, a network abuse engineer at Earthlink, the big Internet provider. "Someone will say, `I thought my computer was running a little slow, but I had no idea it was being used to send spam.' "
Some of the victims of the hidden spammers are revolted to learn, Mr. Arnold said, that they are aiding the hucksters and pornographers responsible for what many Internet users consider the medium's great blight. The truly offended rush to safeguard their machines.
But others, who see no direct impact to themselves, simply shrug off the problem, Internet providers say. Intent on reducing their network clutter, the providers then often try to cajole them into cooperating - and, if that fails, will sometimes cut off a user's service.
Sometimes people do find that someone has been sending spam and using their e-mail address as the sender, but this does not mean that their computers were used. Nothing on the Internet verifies that an e-mail message was actually sent by the person listed in the "From" address, which is one reason fighting spam is so hard.
And spammers like to send e-mail that appears to be from their enemies or names chosen at random. The legitimate owners of those addresses are often left to clean out hundreds or thousands of complaints from their e-mailboxes.
When a computer receives an e-mail message, it does record a code number, called an Internet protocol address, that can be traced to the computer that is connecting to it. But often e-mail is passed from one machine to another and the identity of the original sender cannot be verified.
Indeed, the rapid rise in the number of spammers trying to hijack innocent computers is a direct result of their desire to hide their own Internet protocol addresses from spam blockers. Most commonly, they are taking advantage of a backdoor in much of the software that office users or people with high-speed connections at home often install to share an Internet link among several computers - or so-called proxy servers. Some other types of e-mail and Web surfing software, typically run by larger companies, can also be taken advantage of if security features are not properly set up.
Because it essentially enables one computer to masquerade as another, a proxy server is an ideal tool for anyone seeking to use the Internet anonymously. So proxy servers are used by people in some countries to visit Web sites blocked by government censors. They are also used by hackers trying to attack other machines. And they are perfect for spammers trying to avoid filters.
None of these uses would be possible if the owners of the proxy servers made sure to configure them for access only by authorized users. But whether from laziness or ignorance, many users of proxy servers leave them open to anyone on the Internet.
AnalogX Proxy, a free proxy-server program that has been downloaded by more than a million people, is automatically in the open state when it is first installed. Mark Thompson, the author of AnalogX, said he had rebuffed the requests of many antispam activists to distribute the software with the security features already activated because doing so would make it harder to set up.
"The biggest plug for the proxy is it is really easy to get it running," he explained. Mr. Thompson said he did try to achieve a compromise by revising the program to give people a warning about security problems every time it starts.
Even so, Wirehub, a Dutch Internet service provider, says that 45,000 of the 150,000 open proxy servers it has identified as sending spam appear to be using AnalogX.
To find all these vulnerable machines, spammers and other hackers deploy computers that do nothing more than try to connect to millions of computers across the Internet, looking for open proxy servers to exploit.
At the Flint Hills School, "it was pretty amazing how fast our vulnerability was picked up by the spammers," Robert Hampton, the school's director of technology, said recently. Once the problem was identified, the school was able to fix it immediately.
Spammers and hackers trade or sell lists of open proxy servers on dozens of Web sites. And other sites sell software a would-be spammer can use to find new servers.
In the last six months, an increasingly common trick has been for spammers to attach rogue e-mail-forwarding software to other e-mail messages or hide it in files that are meant to emulate songs on music sharing sites like KaZaA.
As with all such hacker contraptions, and much spam, it is difficult to figure out who is behind these programs. But there is some evidence that one of the major spam-sending programs, known as Jeem, originated in Russia, which has been a fertile ground for both spammers and hackers.
Last October, Michael Tokarev, a Russian computer programmer active in the worldwide antispam effort, noticed a lot of spam in Russian that offered bulk-mailing services. The messages were identical, but they came from many different computers. He investigated and found they were forwarded by a program, calling itself Jeem, that had not been seen before.
Mr. Tokarev said that in December, a Russian forum for spammers called Carderplanet.com contained a posting offering to sell the Internet addresses of open proxy servers, for $1 each, that appeared to be machines infected with Jeem. "Since the last week of December, several big U.S. spammers started to use those Jeems, too," Mr. Tokarev wrote in an instant message interview last week.
Machines infected with Jeem, which is especially hard to find because it keeps switching its identity on the computers it borrows, seem to be used these days mostly by spammers selling pornography, David Ritz, a volunteer spam fighter, said. Using a software monitoring tool he helps run, Mr. Ritz last week examined the messages sent to Internet news groups from just one home computer infected with Jeem. On one day last week, this computer sent 773 pornographic news postings with subjects like "Lolita paradise" and "N.U.D.E -- L,O,L,I,T,A,S."
"Open proxies are the single greatest threat to the integrity of the network that we see now," he said.
AOL, which has made fighting spam a central part of its marketing thrust, is taking what some see as radical action against open proxy servers. It will no longer accept any incoming e-mail sent directly from the computers of individual home users with high-speed service. This will not affect most home users because they typically do not run e-mail servers on their own computers but connect their e-mail programs to servers run by their Internet providers. But a handful of advanced users and small businesses do run their own e-mail servers connected to high-speed lines, and they no longer can send e-mail to AOL users.
For buyers, there is no choice but to buy tactically. Spam is a large, expensive problem, and there is no benefit to waiting for the vendor market to mature. Meta Group believes a "cocktail" approach is best, whereby multiple techniques are used to combat spam (such as content blocking, plus DNS lookup, plus subscriptions to known spammer lists).
Companies may also want to consider hosted anti-spam offerings as tactical solutions, though Meta Group has noticed growing pains in small hosted suppliers as they try to scale up to meet demand (for example, Message Labs halting all inbound traffic for several hours). Post-2005, however, Meta Group believes most Global 2000 companies will use a single vendor for multiple mail hygiene needs (such as protection from spam, viruses, denial of service, and trade secret disclosures). Virus companies will play a role here, along with a few surviving upstarts.
Given the economic incentives, spammers will fight mightily to continue to do business, despite increasingly tall barriers. Therefore, Meta Group recommends that anti-spam suppliers rapidly update technology (as much as once per quarter) to stay ahead of new spam sending techniques. Some of the newer ideas being contemplated or already in use for spam fighting include:
• Challenge/response: Some vendors are betting that the spam problem will get so bad that the only truly effective way to combat it will be for each user to compile a list of senders from whom they will willingly accept mail (a so-called "whitelist"). All other senders will be asked to prove that they have legitimate messages via a challenge/response system. Given the burden on the sender to prove legitimacy and on the recipient to maintain a whitelist, Meta Group believes the challenge/response approach will be used only sporadically for extremely spam-sensitive individuals by 2005.
• Spam application programming interfaces (APIs): Just as some e-mail system suppliers (such as Microsoft Exchange) designed specific APIs that enable virus protection vendors to write directly to operational code for greater accuracy and faster processing, Meta Group expects vendors to offer spam-specific APIs for similar reasons during 2003-05.
• Weighted values: As noted above, Meta Group believes the spam "cocktail" approach of using various blocking techniques is most effective. To maximize this multi-discipline approach, Meta Group believes vendors will develop sophisticated point systems enabling mail managers to dedicate a certain amount of points to content, the sending domain, header analysis, etc., and to then set thresholds for spam determination and disposal: for example, 85 points being regarded as suspected spam, resulting in the message being sent to the user but stamped as suspected spam, and over 95 points being deleted outright. Trial and error will be used mostly to determine the optimal mix, making broad testing capabilities an important criterion for vendor selection once this facility is in common use (2004). Other vendors (such as Cloudmark) will write their own rules (150 or more), have their own weighting, and provide downloads of the formula on a periodic basis.
• Client-side tools: Meta Group expects IBM and Microsoft to increasingly add better spam-blocking tools with their next client versions (Notes 7.0 in the second half of 2004, Outlook 11 in the second half of 2003). Third parties will also develop sophisticated client/server tools for training mail clients to recognize spam (for example, Banter in first quarter 2003). Companies like Orchestria will focus on invoking pre-emptive client-side hygiene services prior to sending.
• Spam signatures: Just as companies commonly subscribe to known spammer domain listing services, Meta Group expects companies to subscribe by 2004 to public and private spam signature update services, whereby a spam signature is created for each new spam instance.
• Image detection: Pornographic images continue to be a threat to organizations. Early attempts at image detection through pixel analysis and pattern recognition have not worked well, but Meta Group expects significant investment in image detection to yield effective systems in 2004-05.
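The weighted-values idea from the list above can be illustrated with a small Python sketch. The thresholds of 85 and 95 points come from the text; the check names and the individual point values are hypothetical and would in practice be tuned by trial and error, as the text notes.

```python
# Thresholds taken from the example in the text.
SUSPECT_THRESHOLD = 85   # at or above: deliver, but stamp as suspected spam
DELETE_THRESHOLD = 95    # above: delete outright

# Hypothetical point allocation per kind of check.
WEIGHTS = {"content": 40, "sending_domain": 30, "header_analysis": 30}

def score_message(hits, weights=WEIGHTS):
    """Sum the points for every check that fired on a message."""
    return sum(weights[name] for name in hits if name in weights)

def disposition(score):
    """Map a total score to the action described in the text."""
    if score > DELETE_THRESHOLD:
        return "delete"
    if score >= SUSPECT_THRESHOLD:
        return "tag-as-suspected-spam"
    return "deliver"
```

With these illustrative weights a message that trips only the content and sending-domain checks scores 70 and is delivered normally, while one that trips all three scores 100 and is deleted.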
Slashdot
whirlycott writes "I just published a paper called The Spam Problem: Moving Beyond RBLs on my site. I comprehensively describe RBLs and list eight specific problems with them. I also get into ideas that next generation antispam system creators should read. I hope that this will be useful to anybody who is attending the Spam Conference at MIT on Jan 17th."
Abstracts
Adaptive Spam Filtering
Jason Rennie, MIT AI Lab
Spam is a rampant problem with an annoying characteristic. As quickly as heuristics are developed to spot spam, the spammers change their tactics. Current systems all have one hole that spammers love to sneak through: they can't adapt. Hand-crafted rule-based classifiers have static rules. Bayesian approaches use static pre-processing that ignores "!!!!" and/or Japanese (for example). We need a new approach---a way to dynamically learn patterns that can identify spam. I describe one such approach: spam filtering as a compression problem. Given a set of e-mails and their labels (spam/non-spam), the objective is to encode a program for identifying spam in less space than it takes to encode the labels. In conjunction with a classification algorithm, this framework provides a natural way to score patterns. As part of a spam filtering system, it can be used to adapt the set of features used for labeling e-mail as spam. I describe the rationale for this approach and give examples of its performance on real data.
Work done in conjunction with Tommi Jaakkola.
__________________________________________________
Following Their Patterns
John Draper, ShopIP
I've spent considerable time tracking specific spammers to get an idea of how they operate. Using the Crunchbox security system, we've been able to track them almost in real time, recording when their mail arrives in our mailbox and studying the patterns. They are not as consistent as I had hoped, though some are incredibly persistent, almost to the point of harassment. We are writing "snort" rules, which the Crunchbox triggers instantly when any specific spam we are looking for comes into our network.
__________________________________________________
The Case for Spam Research Infrastructures
Paul Judge, CipherTrust
The scale and effect of the spam epidemic leads us to suggest that spam is no longer simply a nuisance, but is a type of information security problem. Therefore, we encourage systematic efforts to understand and analyze the problem and propose solutions. As part of these efforts in spam research, there is a need for the types of infrastructures that have proved useful in other areas of computer research. We identify three types of such infrastructures:
1) public trace data, 2) research tools, and 3) technical conferences.
Public trace data has been used for years in networking research and in network security research. Recently, SpamArchive.org has been established to provide publicly available spam and non-spam archives useful for testing, training, and benchmarking. We discuss the goals, current status, and possible future directions of SpamArchive.org. Research tools are necessary for collecting, processing, and analyzing spam-related data. In the past, developers interested in contributing to anti-spam efforts largely have written spam filters. We stress the importance of other types of tools and discuss examples of necessary tools including: 1) tools to anonymize spam and non-spam messages; 2) tools to measure global spam activity; and 3) tools to perform automated testing including automated effectiveness and accuracy measurements.
__________________________________________________
eXpurgate: a different approach in filtering E-Mail and detecting SPAM
Robert Rothe, eleven GmbH
eXpurgate is a new service developed and provided by eleven, allowing companies and consumers to reliably protect themselves against SPAM. Furthermore, eXpurgate categorizes E-Mails into clean, bulk and dangerous and therefore allows its users to differentiate between important, less important, dangerous and unsolicited messages.
eXpurgate tests the main characteristic of SPAM E-Mail: that it is sent en masse. This does not necessarily require an E-Mail to be forwarded through the system; a short fingerprint or "key" communicated to the eXpurgate system is sufficient to allow the system to perform the categorization. This fingerprint gives no evidence of the textual content of the E-Mail.
In my presentation I will describe the concept of eXpurgate and will address the following issues: Absence of a common SPAM definition; DNA of SPAM?; SPAM is just part of the problem; limitations of single-ended approaches
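The fingerprint idea in this abstract can be illustrated with a rough Python sketch. The abstract does not say how eXpurgate computes its fingerprint, so everything here is an assumption: a SHA-256 digest over the whitespace-normalized body stands in for the "key", and the class `BulkDetector` and its threshold are hypothetical.

```python
import hashlib
from collections import Counter

class BulkDetector:
    """Counts how often a fingerprint has been reported (hypothetical central service)."""

    def __init__(self, bulk_threshold=3):
        self.bulk_threshold = bulk_threshold
        self.seen = Counter()

    @staticmethod
    def fingerprint(body):
        # A one-way digest: the service sees the key, never the message text.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, fp):
        """Record one sighting and classify the message as 'clean' or 'bulk'."""
        self.seen[fp] += 1
        return "bulk" if self.seen[fp] >= self.bulk_threshold else "clean"
```

An exact digest like this is trivially defeated by per-recipient variations in the text, so a production system would need a fuzzier fingerprint; the sketch only shows why the mail itself never has to pass through the central system.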
__________________________________________________
Spam Filtering, Round 2
Paul Graham, Arc
Good statistical spam filters will be delivered to a lot of end-users in the coming year. How will spam change as a result? To some extent, it will change by going away, as spam ceases to be a money-making proposition. But doubtless spammers will try a few tricks before giving up. We're already seeing them rephrasing their messages and trying to frustrate tokenization. In this talk I'll try to predict how spam will mutate in response to better filters, and what we'll have to do to catch it.
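For readers unfamiliar with what a "statistical" filter does, here is a minimal Python sketch. It is a generic naive Bayes classifier with add-one smoothing, not the specific formula used in the speaker's own filter; the tokenizer pattern and the class name `NaiveBayesFilter` are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Keep characters such as "$" and "!" that are often telling in spam.
    return re.findall(r"[a-z0-9$'!-]+", text.lower())

class NaiveBayesFilter:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one labeled message ('spam' or 'ham') to the token statistics."""
        self.counts[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        # Start from the smoothed prior odds, then add per-token log-likelihood ratios.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for label, sign in (("spam", 1), ("ham", -1)):
            total = sum(self.counts[label].values())
            for tok in tokenize(text):
                p = (self.counts[label][tok] + 1) / (total + len(vocab) + 1)
                log_odds += sign * math.log(p)
        log_odds = max(-50.0, min(50.0, log_odds))  # keep exp() in range
        return 1 / (1 + math.exp(-log_odds))
```

Because the score depends entirely on which tokens the tokenizer produces, tricks that break words apart or rephrase a message attack exactly this step, which is the kind of mutation the talk anticipates.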
A nice intro. Some useful links.
10. What are DNS blacklists?
DNS blacklists are lists of domains that are known to originate Spam. Many anti-spam software programs use these lists to control Spam by refusing any email that originates from one of these domains. DNS blacklists are usually maintained by anti-spam organizations or by individuals with an intense dislike for Spam. The difficulty with DNS blacklists is the need for objectivity in deciding when to blacklist a domain. In order to know that a domain is producing Spam, the offence must be reported. Reporting Spam without any anti-abuse mechanism in place, however, leaves nothing to stop people from getting servers added to a DNS blacklist out of malice. The obvious solution would be to require a minimum number of reported incidents before blacklisting a server. This proves equally unsatisfactory however as a measure to stop Spam mail. Anyone who manages large mailing lists knows that a small percentage of people who subscribe subsequently accuse the sender of spamming them when they receive their email. Naturally, a company that sends out millions of legitimate commercial emails will receive more accusations of Spam than one that sends out a smaller amount of spam free bulk email.
The real solution lies in good management. A system administrator that knows about Spam, that knows who the large legitimate bulk mailers are and responds rapidly to complaints from unjustly blacklisted domains will ultimately provide a useful service to the Internet community at large. There are some well-managed DNS blacklists on the Internet and these can be a useful addition to the feature set of anti spam software. Below is a short list of the better known sites:
Realtime Blackhole List
Spam Cop
Spews.org
Open Relay Data Base
Monkeys.com
Rfc-ignorant.org
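As a rough illustration of how software consults such a list, the Python sketch below uses the common convention for address-based lists: the octets of the connecting IPv4 address are reversed and looked up as a hostname under the list's DNS zone, and any answer means "listed". The zone name `dnsbl.example.org` and the function names are placeholders, not references to any of the services above.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the hostname to look up, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone="dnsbl.example.org"):
    """Return True if the list answers for this address (placeholder zone)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such record. Note that a resolver failure ends up here as well,
        # so this sketch fails open rather than rejecting mail.
        return False
```

Whether a lookup failure should count as "not listed" is a policy decision; failing open, as here, avoids rejecting legitimate mail when the list itself is unreachable.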
version 1.5
SpamAssassin is a mail filter which attempts to identify spam using text analysis and several internet-based realtime blacklists.
Using its rule base, it uses a wide range of heuristic tests on mail headers and body text to identify "spam", also known as unsolicited commercial email.
Once identified, the mail can then be optionally tagged as spam for later filtering using the user's own mail user-agent application.
In its most recent test, SpamAssassin differentiated between spam and non-spam mail correctly in 99.94% of cases. Since then, it's just been getting better and better!
SpamAssassin also includes support for reporting spam messages automatically, and/or manually, to collaborative filtering databases such as Vipul's Razor [1].
[1]: http://razor.sourceforge.net/
The distribution provides "spamassassin", a command line tool to perform filtering, along with "Mail::SpamAssassin", a set of perl modules which implement a Mail::Audit plugin, allowing SpamAssassin to be used in a Mail::Audit filter, spam-protection proxy SMTP or POP/IMAP server, or a variety of different spam-blocking scenarios.
In addition, Craig Hughes has contributed "spamd", a daemonized version of SpamAssassin, which runs persistently. Using "spamc", a lightweight C client, this allows an MTA to process large volumes of mail through SpamAssassin without having to fork/exec a perl interpreter for each one.
Ian R. Justman has contributed "spamproxy", a spam-filtering SMTP proxy server. This lives in the "spamproxy" directory.
SpamAssassin lives at http://spamassassin.org/ or in CPAN, and is distributed under the same license as Perl itself. Use of the SpamAssassin name is restricted as documented in the file named "Trademark".
This module owes a lot of inspiration to Mark Jeftovic's filter.plx, which I used for a long time, and contributed some code to. However, SpamAssassin is a ground-up rewrite with a new, greatly improved ruleset, a different code model and installation system, and hopefully will be easy to adapt for a multitude of applications.
About: MailScanner is an Email virus scanner and spam tagger. It supports sendmail and Exim MTAs, and the Sophos, McAfee, F-Prot, F-Secure, CommandAV, InoculateIT, Inoculan 4.x, and Kaspersky anti-virus scanners. It supports SpamAssassin for highly successful spam identification. It is specifically designed to handle Denial Of Service attacks. It is very easy to install, and requires no changes at all to your sendmail.cf file. It is designed to be lightweight, and so won't grind your mail system to a halt with its load.
Changes: This release fixes the problems caused by viruses embedding newline characters in the middle of the subject line, which broke the MIME parsing code. The Sophos "autoupdate" script was fixed to properly handle the new Sophos "NSV" version of their virus scanner.
and my '.procmailrc' looks something like this
    # Preliminaries
    VERBOSE=yes
    SHELL=/bin/sh #Use the Bourne shell (check your path!)
    PROCMAILDIR=${HOME}/.procmail
    MAILDIR=${HOME}/mail #First check what your mail directory is!
    FILTERDIR=${PROCMAILDIR}/filters # Location of filterfile
    LOGFILE=${PROCMAILDIR}/logs/procmail.log # My logfiles goes here.
    LOG="--- Logging ${LOGFILE} for ${LOGNAME}, "

    # Whatever rulesets you'll use
    # The order of the rulesets is significant

    ##############################################################################
    # PERSONAL MAIL
    ##############################################################################
    # This section first picks out all legal mail based on the $FILTERFILE and
    # returns them to default mailspool (/var/spool/mail/username). Then you can
    # add other rulesets below, eg. sorting mailinglists.
    #
    # Filter personal legal mail based on ${FILTERDIR}/whitelist
    ##############################################################################
    :0
    * ? formail -x"From" -x"From:" -x"Sender:" \
                -x"Reply-To:" -x"Return-Path:" -x"To:" \
        | egrep -is -f ${FILTERDIR}/whitelist
    ${DEFAULT}

    ##############################################################################
    # MISC. MAILINGLISTS AND OTHER IMPERSONAL MAILS
    ##############################################################################
    # Filter a mailinglists to an own mailbox ($HOME/mail/veritas), because
    # we don't want to mingle personal mail and highvolume mailinglists.)
    # Take a look at a mailheader and look for a line to trigger the filter.
    #
    # In this example the line "X-BeenThere:" identifies this mailinglist.
    ##############################################################################
    :0:
    * ^X-BeenThere:.*[email protected]
    veritas

    # You can combine different rulesets. Just separate them with '|'.
    :0:
    * ^From:.*[email protected]|^From:.*[email protected]
    redhat

    ##############################################################################
    # JUNKMAIL - FINAL CHECKPOINT
    ##############################################################################
    # All mail reaching this point will be treated as junk, and placed in folder
    # 'spam' ($HOME/mail/spam).
    #
    # For fascist zero tolerance policy, replace 'spam' with '/dev/null'.
    # WARNING: ALL MAIL WILL THEN BE INSTANTLY AND PERMANENTLY DESTROYED!
    # (Locking is not recommended.) Use ":0"
    ##############################################################################
    :0:
    spam
perl.com
PerlMx is a utility by ActiveState that allows Perl programs to interface with Sendmail connections. It's quite a powerful tool, and once installed, it's very easy to use. This article will detail how to install and setup PerlMx, and provide an overview both of what you can do with PerlMx, and how to do it. This overview will be based on spamNX, the anti-spam code I developed, available at https://sourceforge.net/projects/spamnx. My next PerlMx article will go through spamNX in depth, to demonstrate how to harness the power of PerlMx.
Prerequisites
PerlMx is made possible by the excellent Milter code provided in Sendmail versions 8.10.0 and higher. This code, when compiled into Sendmail, allows external programs to hook into the Sendmail connection process via C callbacks. PerlMx passes these C hooks to the Perl interpreter, where you can access the information with a simple shift.
Versions 8.12 and higher of Sendmail enable Milter by default. In prior versions, you must first enable the code in Sendmail. To do so, go to the devtools/site directory off of the Sendmail source code. Add the following lines to your site.config.m4 file:

    dnl Milter
    APPENDDEF(`conf_sendmail_ENVDEF', `-D_FFR_MILTER=1')
    APPENDDEF(`conf_libmilter_ENVDEF', `-D_FFR_MILTER=1')

Now compile and install Sendmail. Once installed, add the following lines to your config.mc file (again, for Sendmail below version 8.12):

    define(`_FFR_MILTER','1')dnl
    INPUT_MAIL_FILTER(`<filter_name>', `S=inet:3366@localhost, F=T')

Be warned that if you enable Milter in your configuration file, all Sendmail connections will fail unless PerlMx is running. So wait until your code is ready to go before you change your config file.
Your Sendmail installation is now ready to go. PerlMx also needs Perl 5.6.0 or higher, with ithreads enabled, and cannot have big integer (nor really big) support enabled. Prior to 5.6.1, PerlMx will also need the File-Temp module installed. With Perl properly configured, run the installation program provided by ActiveState.
Once PerlMx is installed and your code is ready to go, run:

    pmx <package> &

to launch PerlMx. At this point, you can safely turn on the Milter code in your Sendmail configuration. There are a few command-line options available for PerlMx. Most are unnecessary, but I've found I need to allow more than the default five threads. To do so, run:

    pmx -r <# of threads> <package> &

You can read about all the available options by running pmx -h.
ActiveState has an FAQ available if you run into any trouble or have questions that aren't covered here.
Sendmail, Perl, and PerlMx should now be installed and ready to go. The following section provides an overview of how to write PerlMx code.
CNET.com
High-risk activity
So how do you stay off spam lists? We think we've pretty much nailed the biggest culprits. Our advice: Avoid the following people and places.
The culprit: an unscrupulous message board
Spam servings: up to 10 per day
I opened an e-mail account with Hotmail in December of 1999 and used it in a single message at what was then Deja.com's Usenet Discussion Service (now part of Google). That was the only time I ever used that address.
Five months later, unsolicited mail started popping into that mailbox. Over the next two months, in addition to 16 "legitimate" marketing messages from Hotmail and Deja.com, a backlog of 61 bulk advertising messages leaked in. As time passed, I got as many as 10 messages per day with subjects ranging from debt consolidation to Ponzi schemes, herbal ecstasy to celebrity hot tub sessions.
The remedy
Your best line of defense against this kind of unwanted e-mail is not to get on marketing lists in the first place: just don't use your regular e-mail address on message board and Usenet postings. If you simply must participate in Web-wide discussions with your actual address, turn on whatever spam protection your e-mail service provides. Hotmail's Inbox Protector, for example, diverted around 60 percent of the unwanted messages from my e-mail account; nonetheless, it let through more than one message per day.
The culprit: America Online's chat room
Spam servings: up to 60 messages per month
With my brand-new 700-free-hours America Online screen name, I hopped into a chat room for San Francisco residents and lurked in a second, generic AOL chat room (Town Talk). Not long after, I discovered a message in my AOL in-box with the subject line "My sister and I went to a nude beach... (Over 18)." Well, there are a few nude beaches around San Francisco, but that didn't seem to be the focus of this message, which described Tammy, Syndi, and Simone's exploits, with copious Web links. Six other messages followed in swift succession, all prurient and crass. By the next day, the count reached 10. Two weeks later: 31. One month later: 51 messages.
The remedy
There's only one way to avoid hassle from chat louts: use a dedicated screen name for chat and block e-mail to that screen name. From the master screen name, enter the keywords mail controls, select your chat-room screen name from the list, and check off either Block All E-mail or Customize Mail Controls. If you block all e-mail, use a different screen name for e-mail and give it out only to trusted chat buddies. If you customize mail controls, you can block all incoming mail except from names that you list. This is a slick trick, but, unfortunately, not one that works outside the AOL world.
The culprit: an online lottery
Spam servings: 10 or more per week
Didn't your mama ever warn you about games of chance? When I entered a sweepstakes at iWin and gave them a new AOL address, I didn't notice any messages about marketing from third parties, so I assumed I'd be OK. Wrong, wrong, wrong: I received eight promotional e-mails in two weeks, none of which came from iWin. Some boasted disingenuous subject headings such as "Do you know these people?" designed to lull me into a false sense of security, but they contained lottery information from GroupLotto, an iWin affiliate. Once I hit the Unsubscribe link, the spam dwindled to no more than three messages per day. After a second attempt to unsubscribe, GroupLotto stopped sending me e-mail, but then I got spam from something called CustomerOffers. Meanwhile, an affiliate program called SFI welcomed me (I hadn't signed up for it, apparently uh------ @aol.com had done so for me; thanks, uh------!).
The remedy
It's simple: When the sweepstakers come a-calling, just say no.
Unixreview
MS Outlook, Netscape Mail 4 (and above) and Eudora 5.0 all include powerful rules-based filtering of mail. This has eliminated the need for specialized plug-ins that were required for Eudora 4.0 and other earlier Windows-based mail clients.
While each mail client has its own way of dealing with filters, most modern mail clients give users the ability to filter based on virtually every aspect of the message body and header. For example, you could accept mail only from known senders. This could be problematic, as you would need to constantly update the list of people who can send you email.
Another filtering technique is to filter based on a domain or user. For example, if the domain @spammer.net was a large source of spam, you could filter all mail from @spammer.net into the trash folder and not ever have to see it. Again, there may be times when you want to see mail from [email protected], so you would have to periodically check the trash. This method would block all UCE from "@spammer.net", but it wouldn't block any UCE from "@uce.net".
A third technique is to filter based on a lack of address in the "to:" field. Many spammers utilize the blind copy field ("bcc:") to help hide their tracks. Numerous mail programs have the ability to filter email that doesn't have your address in the "to:" field.
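The third technique is easy to sketch outside any particular mail client. The fragment below is only an illustration, using Python's standard email package; the address in MY_ADDRESSES and the two sample messages are made up for the example. Keep in mind that legitimate mailing-list traffic also arrives without your address in "To:", so this check is better suited to sorting mail into a "suspect" folder than to outright deletion.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical: the addresses you actually receive mail at.
MY_ADDRESSES = {"me@example.org"}


def addressed_to_me(raw_message: str) -> bool:
    """Return True if one of MY_ADDRESSES appears in the To: or Cc: header."""
    msg = message_from_string(raw_message)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = {addr.lower() for _name, addr in getaddresses(fields)}
    return bool(recipients & MY_ADDRESSES)


# Sample messages (invented): one addressed directly, one sent via blind copy.
direct = "From: a@example.com\nTo: Me <me@example.org>\nSubject: hi\n\nbody\n"
blind = "From: x@example.net\nTo: list@example.net\nSubject: offer\n\nbody\n"

print(addressed_to_me(direct), addressed_to_me(blind))
```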
Email Security through Procmail attempts to address the trend towards "enhancing" email clients with support for active content, which exposes end-users to many and varied threats, by "sanitizing" email: removing obvious exploit attempts and disabling the channels through which exploits are delivered. Facilities for detecting and blocking Trojan Horse exploits and worms are also provided.
Changes: Minor bugfixes.
ZDNet Story
WHAT THE POLITICIANS ARE DOING
An anti-spam bill is working its way through the U.S. Congress. It would accomplish a lot. Maybe too much. Here are some of the things House Bill 3113 would do:
- Require a valid return address on any unsolicited email
- Prohibit header forgery
- Allow ISPs to post and enforce policies regulating spam sent to their users
Some others are suggesting that a bounty be put on spammers' heads. This would make the fight much less reactive and give ISPs and others a nice incentive to go after spammers.
Filtering. Monitoring spammers in real time is an approach working for Brightmail, which has seeded the Internet with email accounts it controls. When one of those accounts is spammed, an employee analyzes it and adds it to a filter -- which is updated and distributed daily to its ISP and enterprise customers. AT&T's Worldnet recently joined Brightmail's growing list of customers -- Earthlink and Excite@Home among them.
sendmail.net
Internet developers and administrators have a common goal that usually goes unspoken: they want to make the Internet as efficient and effective as possible, but they want to do so without top-down solutions from a central authority - governmental or otherwise. Sure, the Internet was originally built with US government funding. But today's Net is so entrepreneurial and laissez-faire that pundits have coined the phrase "cyber-libertarian" to encapsulate the prevailing view that a free Net and a free society go together.
It's important to understand that RFC 2505, "Anti-Spam Recommendations for SMTP MTAs," reflects those core beliefs, albeit at a level that's more personal than political. Although the IETF members who contributed to these recommendations obviously consider spam a serious problem for humans and hardware alike, their suggestions adhere scrupulously to the established concepts of netiquette. Fighting deceptive behavior with more deceptive behavior just isn't how civilized netizens get things done.
After all, the most obvious way to get rid of spam would be simply to throw it away. Sendmail could accept mail that it interprets as spam, return "250 OK," and then quietly delete said mail. The spammer would assume said mail had been delivered and go away, misinformed and smug. But RFC 2505's authors are pretty clear that such behavior isn't the sort of thing they advocate. "This clearly violates the intent of RFC821," they write, "and should not be done without careful consideration." After all, tossing out spam isn't just dishonest, it's risky: a mail daemon that throws away unwanted mail might be inadvertently throwing the good stuff away right along with it.
What does RFC 2505 recommend instead? For the most part, finer control over existing functions implemented by sendmail and other SMTP agents. For instance:
- More flexible, fine-tunable filtering rules for blocking specific domains and senders. Specifically, the ability to treat specific recipients or senders differently from others at their domain, instead of choosing only to accept or refuse all mail from, say, hotmail.com. One could then block all mail from spammer.com, except for mail from [email protected]. Or, conversely, one could refuse mail from spammer.com addresses, unless the message is from [email protected], or perhaps if the mail is being sent to [email protected] from an otherwise blocked domain.
- More selective mail filtering based on both sides of the "@" sign. For example, an administrator at the newly merged AOL/Time-Warner could configure sendmail 8.10 to do things like:
Refuse mail addressed to [email protected], unless it comes from the aol.com or timewarner.com domains.
Refuse all mail from mattdrudge.com, unless it comes from [email protected] or is addressed to [email protected].
Accept all mail from hotmail.com, unless it comes from [email protected]
- The ability to set rate controls on accepting mail from the same sender or host, to prevent a spammer from taking up too much bandwidth.
- Disabling the EXPN and VRFY commands by default, to prevent spammers from extracting lists of valid addresses without the administrator's knowledge.
- Access control lists (ACLs) to prevent just anyone from, say, issuing an ETRN command that reruns the entire mail queue.
- More specific return status codes, so that administrators at the other end have a better idea why their mail isn't being accepted, and can do something about it.
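The "both sides of the @ sign" idea boils down to a most-specific-match lookup: a rule for a full address overrides the rule for its domain. The sketch below is an illustration only, not sendmail's access database format; the rule table, the domain names and the function name sender_action are all invented for the example.

```python
# Hypothetical rule table: a full address overrides the rule for its domain.
SENDER_RULES = {
    "spammer.example": "REJECT",
    "friend@spammer.example": "OK",
    "hotmail.example": "OK",
    "pest@hotmail.example": "REJECT",
}


def sender_action(address: str, default: str = "OK") -> str:
    """Look up the most specific rule: exact address first, then the domain."""
    address = address.lower()
    domain = address.rpartition("@")[2]
    for key in (address, domain):
        if key in SENDER_RULES:
            return SENDER_RULES[key]
    return default


for sender in ("friend@spammer.example", "bulk@spammer.example",
               "pest@hotmail.example", "someone@elsewhere.example"):
    print(sender, "->", sender_action(sender))
```

The same lookup order can be applied to recipients, which is what makes exceptions like "refuse this domain unless the mail is addressed to one particular user" possible.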
It's important to understand that RFC 2505 doesn't change or extend the existing SMTP protocol and extensions. Rather, it suggests enhanced configuration capabilities and better settings that should be added to existing SMTP servers. It also suggests ways in which these capabilities and settings could be used to restrict large volumes of unsolicited bulk email without blocking entire domains. Administrators should read this lengthy RFC thoroughly and decide which recommendations work best for them. In future articles, we'll show you how to specifically implement some of RFC 2505's recommendations in sendmail 8.10.
Fortunately, cold turkey is built into sendmail 8.9. In this version, for the first time, forwarding of SMTP messages is turned off by default. The fact is that if you administer a mail server, your default approach should be to disable SMTP relaying to avoid helping spammers pollute the world's inboxes even as they defile their own souls.
InformationWeek
How many digitally signed or encrypted E-mails do you get in a day?
I'm probably not typical because I get somewhere around 800 E-mail messages a day (thank goodness for autoresponders!). But consider the percentage: of those 800-some E-mails, only a dozen or so are digitally signed. I can't ever recall having gotten an encrypted message, and I've been using E-mail since around 1980.
It's surprising because it's ridiculously easy to spoof E-mail. At the simplest level, many users are unaware how easy it is to alter the "From" and "Reply To" fields in E-mail. It's child's play to send someone an E-mail that will look (to a casual or inexperienced eye) like a message from, say, a boss, a co-worker, or a spouse. The potential for mischief or outright fraud is enormous.
It's not a lot harder to hack many mail servers: Spammers do it all the time, and the "warez" boards are full of tools that will help a hacker find poorly guarded mail servers they can exploit.
But it is also very easy to use digital certificates or simple encryption to validate messages or protect them from prying eyes. For example, Netscape Messenger and Microsoft's Outlook and Outlook Express both support the S/MIME (Secure Multipurpose Internet Mail Extensions) standard, and both can use digital certificates that can verify the identity of E-mail senders and receivers, helping to keep the mail contents private.
(Check the "Help" files of your E-mail client for more information. Or for Netscape digital-signing information, see http://home.netscape.com/security/basics/email.html. For Microsoft digital-signing info, see http://support.microsoft.com/support/kb/articles/q168/7/26.asp.)
Netscape and Microsoft also make it easy to obtain a basic digital certificate, and the benefits of getting one are enormous: A digital certificate can eliminate the need for multiple passwords on various Web sites; it helps you really know who you're talking to or hearing from; and it makes sending encrypted E-mail a snap.
So, why don't more people use certificates and encryption? I have several theories:
Cost. These days, you can get a browser for free. You can get an E-mail client for free. Heck, you can get an entire PC, including an OS and applications, for "free" (if you sign up with the right ISP). But you can't get a digital certificate for free. The most popular certificate vendor, VeriSign, charges $10 per year for a very low-end "Class 1" certificate--and all that really does is prove that you have a valid E-mail address. (VeriSign offers more secure certificates for E-commerce and developers, but they cost much more. A Java-Signing Certificate, for example, costs $400 per year.)
Hassle. E-mail has become the lingua franca of business because it's so fast and easy. Appending a certificate and encrypting your messages takes extra steps, extra clicks, extra thought, and extra time.
Cost. Did I mention that, unlike almost everything else these days, certificates aren't free?
Ignorance. Many users understand the need to know who's sending them programs and attachments, but few worry about the source of basic E-mail--perhaps because they don't know how easy it is to fake an E-mail address.
And did I mention cost? Here's an opportunity for some browser or E-mail client vendor--perhaps AOL/Netscape?--to make some major headway: Subsidize free Class 1 certificates for users. At a stroke, this would elevate the product (the one with the free certificate) above the competition. It would increase public awareness about certificates and would start generating thousands or even millions of new certificate users, which would help encourage others to get and use certificates, too. And it would help make the Net a safer place.
Any takers? Do you or your business use digital certificates or encryption for E-mail? Why or why not? Would you use one if it were free and part of your basic E-mail application? What do you think it will take to foster general acceptance and use of digital certificates? Join in!
Killer app that it is, email is also a serious productivity drain. The typical U.S. worker receives over 200 emails per day, according to a new Pitney Bowes survey. That's a staggering 1,000 messages a week to deal with -- read, reply, delete and/or ignore. Thank goodness for filters that kill spam, as we highlighted in a recent story. But what happens when the spammer is the guy in the next office, or Great Aunt Edna, or your best friend from college?
For most corporations, email is a mission-critical application. It often is the number one communications medium for developers, sales, and customers. However, unsolicited commercial email (UCE or spam) has reached levels at which it is starting to interfere with the effectiveness of email as a communication tool. Separating junk from real email wastes not only network/computing resources but also employee time. More important, many people consider spam to be an invasion of their private mailboxes; arguably the worst aspect of spam is that it demoralizes employees and can even jeopardize their emotional well-being.
Most corporate postmasters have been given the responsibility of dealing with spam. A quick search on the Internet reveals various technical solutions that have been created to help stop spam. One big implementation problem with these anti-spam measures is that they are usually applied on a site-wide basis. For most corporations, some email addresses - such as sales, technical support, and bug reporting - must not be blocked. Some of the spammers are our customers; we want their purchase orders to get through but not their spam. We never want to block bug reports from coming in, even if they are from a known spammer.
This article discusses configuration changes that can be made to sendmail rulesets in order to implement an anti-spam filtering policy on a per-user basis. Users can decide if they want to activate anti-spam features and what level of filtering they want.
The Anti-Spam Features of sendmail
Beginning with sendmail 8.8, the check_* group of rulesets was added as a feature. This group of rulesets provides hooks into the SMTP dialog. For the sake of clarity, I'll show the SMTP dialog here:
1. The sending machine issues a HELO (or EHLO) in which it identifies itself.
2. The sending machine issues a MAIL FROM in which it identifies the sender of the message.
3. The sending machine issues a RCPT TO in which it identifies the recipient of the message.
4. The sending machine issues a DATA to tell the receiving machine it is about to transfer the message.
5. The message is transferred, and the sending machine ends the message with a "." on a line by itself.
6. The receiving machine acknowledges that it got the message, usually by issuing a unique number.
Sendmail 8.8 included the following four check rulesets:
- check_relay: called after step 1 in the SMTP dialog above. It is used to prevent unauthorized IPs from connecting to your machine.
- check_mail: called after step 2 in the SMTP dialog. It is used to stop mail from known spam senders.
- check_rcpt: called after step 3 in the SMTP dialog. It is primarily used to stop relaying (not to be confused with check_relay above). Relaying occurs when an external user sends mail to your server meant for a different external user; they are using your server as a relay for their email. Spammers often do this in order to hide their identity or to take advantage of your resources. Since we know both the sender and recipient at this point, we can decide whether or not the email is relayed.
- check_compat: called after step 5 in the SMTP dialog. It can be used to stop delivery of a message after it has been accepted.
Although these check_* hooks were provided, it was left to the system administrator to actually develop rules using these hooks. Claus Assmann[1] and Robert Harker[2] maintain a set of effective rules based on these hooks.
When Sendmail 8.9 was released, Eric Allman included some basic anti-spam features that could be configured into sendmail to take advantage of these hooks. By default, Sendmail 8.9 had relaying turned off (implemented in the check_rcpt ruleset). Furthermore, you could enable rejection of email based on either a DNS lookup or the results of a database lookup (implemented in the check_mail ruleset).
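The relaying decision made in check_rcpt can be stated in a few lines. The following is a rough sketch of the logic only, not sendmail's ruleset syntax. It assumes the decision is based on the connecting client's IP address rather than the envelope sender (which is trivially forged); LOCAL_DOMAINS, TRUSTED_NETS and the function name are hypothetical.

```python
import ipaddress

# Hypothetical site settings.
LOCAL_DOMAINS = {"example.org"}                        # domains we accept mail for
TRUSTED_NETS = [ipaddress.ip_network("192.0.2.0/24")]  # our own client machines


def is_unauthorized_relay(client_ip: str, rcpt_address: str) -> bool:
    """True when an outside client asks us to deliver to an outside domain."""
    client = ipaddress.ip_address(client_ip)
    from_inside = any(client in net for net in TRUSTED_NETS)
    rcpt_domain = rcpt_address.rpartition("@")[2].lower()
    to_inside = rcpt_domain in LOCAL_DOMAINS
    return not from_inside and not to_inside


print(is_unauthorized_relay("203.0.113.5", "victim@elsewhere.example"))
print(is_unauthorized_relay("203.0.113.5", "user@example.org"))
print(is_unauthorized_relay("192.0.2.10", "partner@elsewhere.example"))
```

Only the first case (outsider to outsider) is a relay attempt; inbound mail for local users and outbound mail from local clients are both normal traffic.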
The Problem
The main problem with the anti-spam features included with sendmail is that the checks are made too early in the SMTP dialog. As configured by sendmail, both the DNS and database check are made in check_mail (SMTP step 2), after the sender has been identified. If the sender fails the checks, the mail is rejected.
The rejection comes too early because we do not know whom the mail is meant for yet. Also, this means that mail will be bounced regardless of who the recipient was. This is a problem for corporations because there may be some addresses that must receive all email. Also, some users may actually want to get spam (true case)!
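One way around the problem is to remember the result of the sender check at step 2 and act on it only at step 3, when the recipient is known. The sketch below illustrates that idea in Python and is not the article's actual sendmail ruleset; the class SmtpSession, the blacklist and the per-user preference table are all invented for the example.

```python
from dataclasses import dataclass, field

# Hypothetical data: a sender blacklist and per-recipient filtering preferences.
BLOCKED_SENDER_DOMAINS = {"spammer.example"}
USER_WANTS_FILTERING = {"alice@example.org": True, "sales@example.org": False}


@dataclass
class SmtpSession:
    sender_flagged: bool = False
    accepted: list = field(default_factory=list)

    def mail_from(self, sender: str) -> str:
        # Step 2: only record the result of the check; do not reject yet.
        domain = sender.rpartition("@")[2].lower()
        self.sender_flagged = domain in BLOCKED_SENDER_DOMAINS
        return "250 OK"

    def rcpt_to(self, recipient: str) -> str:
        # Step 3: the recipient is now known, so apply that user's policy.
        # Users without an entry get filtering by default (an assumption).
        wants_filter = USER_WANTS_FILTERING.get(recipient.lower(), True)
        if self.sender_flagged and wants_filter:
            return "550 Rejected by recipient's spam policy"
        self.accepted.append(recipient)
        return "250 OK"


session = SmtpSession()
print(session.mail_from("bulk@spammer.example"))
print(session.rcpt_to("alice@example.org"))
print(session.rcpt_to("sales@example.org"))
```

With this arrangement the sales address still receives everything, while users who opted in are protected.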
In May of 1994, while I was reading the phl.food newsgroup, I saw something new. It was a message with this subject:
U.S. Green Card Lottery - New Immigration Opportunity
That was not what I expected to see in phl.food, so I wrote to the author:
This has nothing at all to do with food, and you posted it to phl.food. Please be more polite in the future and keep announcements in relevant and appropriate groups.
And he sent me a reply:
People in your group are interested. Why do you wish to deprive them of what they consider to be important information??
I was really startled. I had naïvely expected that the author would recognize that he had done something incorrect once it was pointed out to him. Gosh, was I wrong!
That was the beginning of my life with spam. It was the now infamous `Green Card Spam' from Lawrence Canter and Martha Siegel, a pair of incompetent lawyers. But they were on the leading edge of a big trend. Within two years the newsgroups were clogged with spam, and at the same time, email spam was becoming common.
Whitelist-based spam filtering
I get a lot of spam email. In the first half of December 2000, I received an average of more than 36 spam messages per day, out of 384 total messages per day.
I tried various ways of filtering it in the past, and finally decided the best way to do it is to use whitelist-based filtering.
Most spam filtering systems use blacklists, where mail from a certain list of email addresses or matching a certain list of text patterns is rejected or otherwise filtered. These lists take a lot of time and effort to maintain, and in the end still don't work very well.
The way whitelist-based filtering works is: you create a list of addresses of people you expect to receive mail from, and filter anything that is not from them into a separate low-priority mailbox that you check once a week or once a month or something.
There are other features that could be implemented as well, such as sending a reply to unknown senders automatically to notify them that you might not read their mail for a while, or possibly asking them to verify that the message was sent by a human (in which case it would then be delivered directly to your inbox.)
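The core of whitelist-based filtering fits in a dozen lines. This is a minimal sketch using Python's standard email package, not the author's actual filter; the WHITELIST contents and the mailbox names are hypothetical. Since the "From:" header is easy to forge, a whitelist like this sorts mail by priority but proves nothing about who really sent it.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical list of people you expect to receive mail from.
WHITELIST = {"friend@example.org", "boss@example.com"}


def choose_mailbox(raw_message: str) -> str:
    """Return 'inbox' for whitelisted senders, 'low-priority' for everyone else."""
    msg = message_from_string(raw_message)
    _name, address = parseaddr(msg.get("From", ""))
    return "inbox" if address.lower() in WHITELIST else "low-priority"


print(choose_mailbox("From: Friend <Friend@Example.org>\nSubject: lunch\n\nhi\n"))
print(choose_mailbox("From: deals@bulk.example\nSubject: offer\n\nbuy now\n"))
```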
perl.com Stopping Spam with SpamAssassin [Mar. 06, 2002]
I receive a lot of spam; an absolute massive bucket load of spam. I received more than 100 pieces of spam in the first three days of this month. I receive so much spam that Hormel Foods sends trucks to take it away. And I'm convinced that things are getting worse. We're all being bombarded with junk mail more than ever these days.
Well, a couple of days ago, I reached my breaking point, and decided that the simple mail filtering I had in place up until now just wasn't up to the job. It was time to call in an assassin.
SpamAssassin is a rule-based spam identification tool. It's written in Perl, and there are several ways of using it: You can call a client program, spamassassin, and have it determine whether a given message is likely to be spam; you can do essentially the same thing but use a client/server approach so that your client isn't always loading and parsing the rules each time mail comes; or, finally, you can use a Perl module interface to filter spam from a Perl program.
Configuring Sendmail 8.9 Anti-Relaying
Unsolicited Bulk Email: Definitions and Problems (ube-def.html)
A paper covering the basic terms and issues to help facilitate discussion of unsolicited bulk email (UBE), better known as "UCE" and "spam".
Unsolicited Bulk Email: Mechanisms for Control (ube-sol.html)
Describes the many solutions that have been proposed for the UBE problem. This report gives extensive information about each proposed solution, as well as the pros and cons of each. The second edition of this report was released in May, 1998.
Story Spam TKO Toolkit How to Score in the War Against Junk Email -- contains useful feedback from readers. Among them:
...Have you tried Calypso? http://www.mcsdallas.com. It works great specially for those freaks like me who have four or more POP3 mailboxes to check. It is shareware but I think it is worth the pennies... I have also found very useful E.R.C. It is freeware and it works fine too. It is basically a mail redirection tool. You set up the POP3 account to be check periodically and this small tray utility will redirect your mails to another POP3 account...
...Since I've started using SpamCop, I don't get spam anymore. And I was receiving probably 20 spams a day. Check them out at http://SpamCop.net
...I find that Hotmail's approach is best I've found so far. You are required to open all mail, short of immediately deleting each or every email received, but when you do open the email item and see it's spam, you have the option to 'Block the Sender' where the filter is set so you don't receive any more email from that particular sender. Now, it's not so much like instantly eliminating spam but once a message is identified, its sender is blocked and you never see spam from that sender again.
...Easiest thing in the world to do - set up filters in your e-mail client. I have - in my MS Outlook client, and it was a no-brainer. For heaven's sake, don't bring the government into it.
... Spam is all uninvited ads thrust in my face. Amazon icons intruding into my research are as much spam as banner ads from banks when I'm searching for real estate and grocery flyers that crowd out the letters from my family. All are as much 'spam' as the little guys hoping that I'll fall for their junk and schemes instead of the junk and schemes from the legit (i.e., paying) advertisers.
It's not honest (or fair or decent) to wage holy war against the little intruders while blessing the big ones. That gambit itself is just spam from the purveyors of ad-space -- and I'm not a/never would be an advertiser of either sort.
PC World Online - Bounce Spam Mail Fool those spam mailers. Send a fake bounce message back to them, making their targeted mailing list manager think your address was invalid. Works in a lot of cases where you can't get to the real sender of the message.
PC World Online -- Advanced E-mail Protector. Add a new spam weapon to your e-mail arsenal. Advanced Email Protector runs your messages through a multi-stage filtering gauntlet before they finally reach your mailbox.
Also you may try 'Genius II' from http://www.sinnerz.com/genius/
Consumers Won't Tolerate Spam, Varney Warns
Why Can't I Send Emails to People About My Products
Milter Helping You Mangle Your Mail At Will
When dealing with SPAM, the big question is, do you return REJECT to save your bandwidth, or TEMPFAIL to tie up resources on the spammer's machine?
For more detailed information, check the Milter API [sendmail.com].
The purpose of this page is to make it so that spammers who attempt to collect email addresses off the web through programs will not have real email addresses in their database, causing them trouble because they will have to clean out their list. This page has one hundred randomly generated email addresses (reload and new ones will appear). At the bottom of the page is a link to this page again, essentially reloading it for programs to collect more fake email addresses. Email collecting programs will be sent in an infinite loop by following the link at the bottom of the page and will get more and more fake email addresses stuck in their databases. This helps to place many invalid email addresses that won't help spammers (they will get more returned email ;) and is our effort to FIGHT SPAM.
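A page of this kind is trivial to generate. The sketch below is one possible reading of the description above, not the code behind the original page; the function names and the self_url parameter are invented. It deliberately uses the reserved ".invalid" top-level domain so that no real person or mail server is ever hit by the resulting spam, at the price that a smarter harvester could filter such addresses out.

```python
import random
import string


def fake_address(rng: random.Random) -> str:
    """Build one random address under the reserved .invalid top-level domain."""
    user = "".join(rng.choices(string.ascii_lowercase, k=rng.randint(5, 10)))
    host = "".join(rng.choices(string.ascii_lowercase, k=rng.randint(5, 10)))
    return f"{user}@{host}.invalid"


def trap_page(self_url: str, count: int = 100, seed=None) -> str:
    """Return HTML with `count` fake mailto links and a link back to the page."""
    rng = random.Random(seed)
    links = [f'<a href="mailto:{a}">{a}</a><br>'
             for a in (fake_address(rng) for _ in range(count))]
    # The closing link sends the harvester around the loop again.
    links.append(f'<a href="{self_url}">More addresses</a>')
    return "<html><body>\n" + "\n".join(links) + "\n</body></html>"


page = trap_page("/trap.cgi", seed=1)  # "/trap.cgi" is a placeholder URL
print(page.count("mailto:"), "fake addresses generated")
```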
bayespam 0.9.2 - The qmail spam filter that learns
Changes include:
- MIME support and decoding
- DBM support
- Incremental add/remove of emails
- Changed to simple 0/1 exit value
- More control over .qmail setup
- Includes filter test script
- Command-line arguments
- Directory recursion
- Ignore files over certain size
- Ignore all-numeric tokens, HTML comments
- Ignore duplicate "interesting" tokens
- Switchable case sensitivity
- Better error checking
- Various optimizations
freshmeat.net Project details for bogofilter
O'Reilly Network Bayesian Filtering with bogofilter and Sylpheed Claws
Janne's bogofilter setup - Janne Nikula's pages
Busting Spam with Bogofilter, Procmail and Mutt
Update (Feb. 26, 2003): My Linux Journal article is out (March 2003 issue), which goes beyond the article presented here by bringing in a further improvement based on the chi-square distribution. Unfortunately I can't supply a link since it's only in the printed magazine for now. In other developments, Hexamail says their Hexamail Guard filter is based on this work.
Update (Dec. 9, 2002): I've written an article on the techniques described here plus a further one involving Fisher's approach to meta-analysis, which will be published in an upcoming issue of Linux Journal. These techniques, together, have beaten naive Bayesian classification and classification based on the Bayesian chain rule in head-to-head testing. Due to the constraints of the publishing process, I can't make the article available here until it's published by Linux Journal, but Greg Louis has written up the idea here. Note that there are further enhancements waiting to be tested. One last item for today, I became aware today that SpamAssassin is using some of these ideas.
Another update (Nov. 5, 2002): Bogofilter now has the approach described here as a built-in option. It is testing very well against the original approach.
A fair amount of testing has been done since the original version of this essay was posted on Sept. 16, 2002 with the results that as of Sept. 26, 2002 the spambayes project has decided to use the algorithms below. The first test that was done that combines the original idea and "Further Improvement 1" is discussed here and it is very positive. In general the spambayes mail list has emerged as the center of testing, so if you're interested, it would make sense to go there. Also, this essay is undergoing extensive revisions as feedback comes in. Where someone has made an important contribution, I'll point it out.
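The algorithms themselves are not reproduced in this excerpt. For orientation only, here is a sketch of the general technique the updates refer to: combining per-token spam probabilities with Fisher's method, where -2 times the sum of the logarithms is compared against a chi-square distribution with 2n degrees of freedom. This is a reconstruction of the textbook method under stated assumptions, not the author's or bogofilter's actual code; the function names are invented, and the input probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def chi2_survival(x: float, dof: int) -> float:
    """P(X >= x) for a chi-square variable with an even number of degrees of freedom."""
    assert dof > 0 and dof % 2 == 0
    half = x / 2.0
    term = math.exp(-half)          # i = 0 term of the closed-form series
    total = term
    for i in range(1, dof // 2):    # add (x/2)^i / i! terms for i = 1 .. dof/2 - 1
        term *= half / i
        total += term
    return min(total, 1.0)


def fisher_indicator(word_probs):
    """Combine per-token spam probabilities into one score between 0 and 1.

    Near 1 suggests spam, near 0 suggests legitimate mail, and values
    around 0.5 mean the evidence is weak or contradictory.
    """
    n = len(word_probs)
    if n == 0:
        return 0.5
    # Fisher's statistic: -2 * sum(ln p) follows chi-square with 2n d.o.f.
    stat_spam = -2.0 * sum(math.log(p) for p in word_probs)
    stat_ham = -2.0 * sum(math.log(1.0 - p) for p in word_probs)
    h = chi2_survival(stat_spam, 2 * n)  # close to 1 when tokens look spammy
    s = chi2_survival(stat_ham, 2 * n)   # close to 1 when tokens look legitimate
    return (1.0 + h - s) / 2.0


print(fisher_indicator([0.99] * 10))
print(fisher_indicator([0.01] * 10))
print(fisher_indicator([0.5] * 10))
```

The attraction of this form is the middle ground: a message with equally strong spam and non-spam evidence lands near 0.5 instead of being forced to one extreme.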
ZDNet Software Library - Anti-Spam Tools -- several reviews of anti spam tools
ZDNet Software Library - Search Results (Spam) -- rather old list, you can get a better list by doing search on the keyword spam in http://www.zdnet.com/swlib/ yourself.
Spam Buster Destroy junk email |
12-27-99 |
Contact Plus Corporation
Spam Buster is a powerful tool in the battle against spam. Loaded with an editable list of more
than 15,000 spam sources and editable filter criteria for subject, header, source, and more, it
will check up to 12 email accounts automatically at the interval you set. By not downloading full
messages, you save time. Spam Buster runs from the task tray or Start menu. It notifies you when
email arrives, summarizes your filter strategy, and lets you set "never filter" lists so that important
mail always gets through. You control whether mail is deleted or marked as questionable. When a
spam gets through, you can blacklist that sender and never hear from them again. The program even
checks to validate domain names, stopping those fictitious origination sources. You'll need a POP
mail server, but the program will work behind a proxy server/firewall. Use it free with an ad banner,
or register to eliminate the ads. This one is easy to configure, loaded with options, and very effective.
Reviewed on Jan 17 1998.
FlameThrower is a highly configurable add-in for Microsoft's Exchange and Outlook email
that scans incoming messages to determine whether they're spam. If they meet the criteria you set,
they're moved to a special folder. You can also create a list of senders you want excluded from
the spam lists. (Shareware/Win95-98-NT)
SpamAssassin Welcome to SpamAssassin -- Traditional approach. Should be used with caution: can backfire (blocking business mail). The spam-identification tactics used include:
perl.com Stopping Spam with SpamAssassin [Mar. 06, 2002]
For those of you who aren't familiar with
Mail::Audit
, the idea is simple: just like withprocmail
, you write recipes that determine what happens to your mail. However, in the case ofMail::Audit
, you specify the recipe in Perl. For instance, here's a recipe to move all mail sent to[email protected]
to another folder:use Mail::Audit; my $mail = Mail::Audit->new(); if ($mail->from =~ /perl5-porters\@perl.org/) { $mail->accept("p5p"); } $mail->accept();
For more details on how to construct mail filters with
Mail::Audit
, see my previous article.Plugging SpamAssassin into your filters couldn't be simpler. First of all, you absolutely need the latest version of
Mail::Audit
, version 2.1 from CPAN. Nothing earlier will do! Now write a filter like this:use Mail::Audit; use Mail::SpamAssassin; my $mail = Mail::Audit->new(); ... the rest of your rules here ... my $spamtest = Mail::SpamAssassin->new(); my $status = $spamtest->check($mail); if ($status->is_spam ()) { $status->rewrite_mail() }; $mail->accept("spam"); } $mail->accept();
As you might be able to guess, the important thing here is the calls to check and is_spam. check produces a "status object" that we can query and use to manipulate the e-mail. is_spam tells us whether the mail has exceeded the number of "spam points" required to flag an e-mail as spam.

The rewrite_mail method adds some headers and rewrites the subject line to include the distinctive string "*****SPAM******". The additional headers explain why the e-mail was flagged as spam. For instance:

    X-Spam-Status: Yes, hits=6.1 required=5.0 tests=SUBJ_HAS_Q_MARK,REPLY_TO_EMPTY,SUBJ_ENDS_IN_Q_MARK version=2.1
This message had a question mark in the subject, an empty reply-to, and the subject ended in a question mark. The mail wasn't actually spam, but this goes to prove that the technique isn't perfect. Nevertheless, since installing the spam filter, I've only seen about 10 false positives, and zero false negatives. I'm happy enough with this solution.
One important point to remember, however, is where in the course of your filtering you should call SpamAssassin's checks. For instance, you want to do so after your mailing list filtering, because mail sent to mailing lists may have munged headers that might confuse SpamAssassin. However, this means that spam sent to mailing lists might slip through the net. Experiment, and find the best solution for your own e-mail patterns.
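The point-scoring idea described above can be sketched in a few lines. This is a minimal illustration in Python, not SpamAssassin's code: the test names are taken from the X-Spam-Status example, while the point values, the helper names (run_tests, check, status_header) and the sample message are assumptions, with the weights chosen only so that they add up to the 6.1 shown in the header.

```python
from email import message_from_string

# Hypothetical point values; only the test names come from the header example.
WEIGHTS = {
    "SUBJ_HAS_Q_MARK": 1.5,
    "REPLY_TO_EMPTY": 3.0,
    "SUBJ_ENDS_IN_Q_MARK": 1.6,
}
REQUIRED = 5.0  # score at which a message is flagged


def run_tests(msg):
    """Return the names of the tests the message triggers."""
    subject = (msg.get("Subject") or "").strip()
    reply_to = msg.get("Reply-To")
    hits = []
    if "?" in subject:
        hits.append("SUBJ_HAS_Q_MARK")
    if reply_to is not None and not reply_to.strip():
        hits.append("REPLY_TO_EMPTY")
    if subject.endswith("?"):
        hits.append("SUBJ_ENDS_IN_Q_MARK")
    return hits


def check(msg, required=REQUIRED):
    """Sum the weights of all triggered tests and compare with the threshold."""
    hits = run_tests(msg)
    score = sum(WEIGHTS[name] for name in hits)
    return score, hits, score >= required


def status_header(score, hits, is_spam, required=REQUIRED):
    """Format a header in the style of the X-Spam-Status line."""
    flag = "Yes" if is_spam else "No"
    return (f"X-Spam-Status: {flag}, hits={score:.1f} "
            f"required={required:.1f} tests={','.join(hits)}")


# A harmless message that still trips three weak rules.
sample = message_from_string(
    "From: colleague@example.com\n"
    "Reply-To: \n"
    "Subject: Are you free on Friday?\n"
    "\n"
    "Lunch?\n"
)
score, hits, is_spam = check(sample)
print(status_header(score, hits, is_spam))
```

Note that every test has to run before the total is known, and that an innocent message can cross the threshold simply by matching several weak rules, which is exactly the false-positive case the article mentions.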
[Aug 16, 2002] The SpamBouncer a Procmail-Based Spam Filter version 1.5
junkfilter: Junk Mail Filtration with Procmail - Junkfilter filters sex spam, MLM schemes, and all other types of unsolicited commercial e-mail (UCE). Current version: 20020519
A Plan for Spam -- by Paul Graham. An overly optimistic and oversimplified paper, but it can serve as an introduction to Bayesian spam filtering.
Better Bayesian Filtering -- Paul Graham's follow-up to the paper above.
http://www.activestate.com/PureMessage. PerlMx (now PureMessage) is a very weak, antivirus-style commercial solution. The only positive thing about this extremely weak product is that it allows the creation of Perl scripts which run inside the Sendmail system; these scripts can do things like reject, log, or rewrite mail. PureMessage 4.6 is a marginal product that you probably would not accept even as a gift, to say nothing of paying money for this expensive commercial product.
The Perl interpreter runs as a separate process in its own context. Thus, it can run without any sort of special privileges, which makes a lot of things easier. As long as the communication channel between sendmail and PerlMx remains secure, it should be very hard to introduce new security problems with PerlMx.
ASPN PerlMx Spam Filter PerlMx Spam Filter
ASPN PerlMx Docs Developer Reference
ASPN Reference pmx-faq - Frequently Asked Questions about PerlMx
PerlMx filters are plain Perl modules written to conform to a particular interface. They should be installed into the Perl library tree located beneath the root PerlMx installation directory. Your filter should have a .pm extension.
If you wrote your filter with a Makefile.PL (see ExtUtils::MakeMaker), you can install your filter with these commands:
    perl Makefile.PL
    make
    make install

If you did not write your filter with a Makefile.PL, run

    perl -V:installsitelib

to find where to put the module and manually copy the filter module into the specified location.

We recommend that you install a filter into the Perl library tree beneath the root PerlMx installation directory, but this is not mandatory if you are using pmx1 during development. pmx1 looks for the filter in the current directory if it doesn't find the filter in the Perl module search path. You can edit the search path in the pmx1 command line with the following command:

    pmx1 MyFilter -- -I/look/here/first

If you are developing a filter module with a MakeMaker style Makefile.PL and corresponding blib tree, you can edit the search path in the pmx1 command line with the following command:

    pmx1 MyFilter -- -Mblib

Note that you do not need to install your filters while you're testing them.
PerlMx - PerlMx--Samples - PerlMx sample filters
PerlMx - Getting Started with PerlMx
perl.com Filtering Mail with PerlMx [Oct. 10, 2001]
Note: most are pretty raw and not tuned to high volume environments.
McAfee.com - SpamKiller -- for personal use; Windows-based.
Lyris MailShield Email Filtering
Lyris MailShield is $4,995 for a single-server license. There are no annual license fees at this time.
8e6 Technologies - Press Room - Press Releases
The BlackMail Anti Spam Mailer Daemon
ESCOM's Active SMTP (ASMTP) Internet Appliance -- APPL5 ASMTP Appliance in a 2U rackmount case configured and licensed for up to 5000 users. Includes hardware with software and documentation installed, printed installation notes and QC client software. $3500.
PC WEEK The 'Star Wars' phase of anti-spam tools
The Arms Race between those who send unsolicited commercial e-mail--or "spam"--and those who try to block it will enter the "Star Wars" stage this week with the introduction of a filtering product that blocks e-mail at the Internet gateway. Unlike previous escalations in the war, this tactic requires minimal effort and may work for a while.
Aside from the "do nothing" approach, the most popular response to junk e-mail is to filter it using a database of spammers (a "blacklist")--whether at the Internet gateway or ISP, at the e-mail server, or (most commonly) at the client. But that's not a very effective solution. Most junk mailers have long ago figured out how to hide the true origin of their e-mail, which is why most junk e-mail comes from made-up addresses such as "[email protected]." It's been estimated that less than 20 percent of junk e-mail comes from repeat offenders.
Other products have tried using an analysis of the content of messages to recognize junk, looking for phrases such as "once in a lifetime," "act now" or "XXX." That's more effective, but there's also a risk of rejecting legitimate e-mail. At PC Week, for example, we get a lot of junk e-mail, but we also get legitimate press releases that contain phrases such as "once in a lifetime." (I wouldn't mind seeing those e-mails filtered out, but not everyone feels that way.)
The most effective tools use a combination of both approaches. Integralis' MIMEsweeper, for example, filters on both the From: address and the content of e-mail messages. However, the product only works with specific e-mail servers. Version 3.2, for example, just added support for Microsoft's Exchange.
This week, Berkeley Software Design (www.bsdi.com) will announce a product called BSDI Mail Filter that combines both approaches but works before the spam clogs your server or reaches your client. And, since it works at the SMTP level, it can work with any e-mail server or gateway.
Mail Filter is a network appliance. It's just a box, with no keyboard or monitor, that is configured and administered using a Web browser. Corporations plug it in, tell it where their e-mail server is and modify their Domain Name System to point e-mail at the Mail Filter. That's it. As e-mail arrives, it first goes to Mail Filter and then to your real e-mail server. Junk e-mail is hopefully stopped (and returned) by Mail Filter before it gets to your e-mail server or users.
BSDI claims that preliminary tests (on its own network) show Mail Filter to be about 90 percent to 95 percent effective. Rather than examining the content of messages, it uses a combination of a blacklist and an analysis of e-mail headers. BSDI estimates that a majority of junk e-mail can be stopped by rejecting messages with invalid From: addresses and messages with improperly formatted e-mail headers.
Spammers will eventually figure out a way around BSDI's filters, but corporations can sign up for a subscription so that Mail Filter is automatically updated over the Internet with the latest and most effective algorithms. If you're nervous about losing important e-mail, you can also configure Mail Filter to simply tag suspected spam (such as adding "SPAM" to the subject line) until you are comfortable with the filtering. You can also create an "exceptions" list to forward some e-mail that would normally be blocked, such as e-mail from a customer with a broken or nonconforming e-mail gateway.
We haven't tested BSDI's approach yet, but it is the easiest I've seen. Just plug the box in and forget it. It can also protect an entire corporation at the source. If I were a spammer, I'd be worried.
Have you found another solution that works? Let me know at [email protected].
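The header-based filtering that the article attributes to Mail Filter (reject invalid From: addresses and malformed headers, optionally tag instead of reject, and honor an exceptions list) can be approximated as follows. This is a minimal sketch in Python using only the standard library; it is not BSDI's algorithm. The exceptions entry, the address regex and the function names are assumptions made for illustration, and "malformed" is reduced here to a missing From: or Date: header, the two fields RFC 5322 requires.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Hypothetical exceptions list: senders with known-broken gateways.
EXCEPTIONS = {"orders@customer.example"}

# Deliberately simple shape check: local part, "@", dotted domain.
ADDR_RE = re.compile(r"^[^@\s]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")


def header_problems(msg):
    """Return a list of header defects found in the message."""
    problems = []
    for name in ("From", "Date"):
        if not msg.get(name):
            problems.append(f"missing {name}: header")
    if msg.get("From"):
        _, addr = parseaddr(msg.get("From"))
        if not ADDR_RE.match(addr):
            problems.append("invalid From: address")
    return problems


def filter_message(raw, tag_only=True):
    """Return (action, message); action is 'deliver', 'tagged' or 'reject'."""
    msg = message_from_string(raw)
    _, sender = parseaddr(msg.get("From", ""))
    if sender.lower() in EXCEPTIONS:
        return "deliver", msg          # whitelisted despite any defects
    if not header_problems(msg):
        return "deliver", msg
    if tag_only:
        subject = msg.get("Subject", "")
        del msg["Subject"]             # replace rather than duplicate the header
        msg["Subject"] = f"SPAM {subject}".strip()
        return "tagged", msg
    return "reject", msg
```

Running with tag_only=True first corresponds to the cautious rollout the article suggests: suspected spam is only marked until the exceptions list has been tuned against real traffic.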
Slashdot The Spam Problem Moving Beyond RBLs
whirlycott writes "I just published a paper called The Spam Problem: Moving Beyond RBLs on my site. I comprehensively describe RBLs and list eight specific problems with them. I also get into ideas that next generation antispam system creators should read. I hope that this will be useful to anybody who is attending the Spam Conference at MIT on Jan 17th."
If you want to be more aggressive in your filtering of spam, you can configure your spam filter to check a variety of Internet sites that keep lists of known spammers and relays:
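Such lists are normally queried over DNS: the octets of the connecting IPv4 address are reversed, the list's zone is appended, and any A record in the answer means the address is listed. A minimal sketch in Python, where the zone name dnsbl.example.org and the function names are placeholders rather than a real service or API:

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to look up an IPv4 address in a blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """True if the zone answers with an A record for the address.

    Any lookup failure (including a network outage) is treated as
    'not listed', so mail is not rejected just because DNS is down.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False


# Hypothetical zone; substitute the list you actually subscribe to.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Failing open on lookup errors is a design choice: it trades a little extra spam during DNS problems for not losing legitimate mail.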
MailScanner -- GPLed program. Uses two instances of Sendmail and serves as a bridge between them.
About: MailScanner is an Email virus scanner and spam tagger. It supports sendmail and Exim MTAs, and the Sophos, McAfee, F-Prot, F-Secure, CommandAV, InoculateIT, Inoculan 4.x, and Kaspersky anti-virus scanners. It supports SpamAssassin for highly successful spam identification.
It is specifically designed to handle Denial Of Service attacks. It is very easy to install, and requires no changes at all to your sendmail.cf file.
It is designed to be lightweight, and so won't grind your mail system to a halt with its load.
Changes: This release fixes the problems caused by viruses embedding newline characters in the middle of the subject line, which broke the MIME parsing code. The Sophos "autoupdate" script was fixed to properly handle the new Sophos "NSV" version of their virus scanner.
Brief Description
MailScanner is a complete e-mail security system designed for use on e-mail gateways. It protects against viruses, and detects attacks against e-mail client packages (such as Outlook, Outlook Express, Eudora). It can also detect almost all unsolicited commercial e-mail (spam) passing through it and respond to all incidents in a wide variety of ways.
Not only can it scan for known viruses, but it can also protect against unknown viruses hidden inside e-mail attachments by refusing entry to attachments whose filenames match any given pattern. This can include generic patterns that trap filenames attempting to hide the true filename extension (e.g. ".txt.vbs").
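Filename-pattern blocking of this kind is easy to illustrate. The sketch below is in Python and is not MailScanner's rule set: the list of executable extensions, the rule names and the check_filename helper are assumptions for illustration only.

```python
import re

# Hypothetical list of extensions treated as executable.
EXECUTABLE = r"(?:vbs|exe|scr|pif|bat|com)"

# Rules are tried in order; the first match supplies the reason.
RULES = [
    # A harmless-looking extension followed by an executable one,
    # possibly padded with spaces to push the real extension out of view.
    ("double extension",
     re.compile(r"\.[a-z0-9]{1,4}\s*\." + EXECUTABLE + r"$", re.IGNORECASE)),
    ("executable attachment",
     re.compile(r"\." + EXECUTABLE + r"$", re.IGNORECASE)),
]


def check_filename(name):
    """Return the reason an attachment name is refused, or None to allow it."""
    name = name.strip()
    for reason, pattern in RULES:
        if pattern.search(name):
            return reason
    return None


for candidate in ("report.txt.vbs", "setup.EXE", "notes.txt"):
    print(candidate, "->", check_filename(candidate))
```

Because the check looks only at the name, it also stops attachments that no virus scanner has a signature for yet, which is the point made above.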
Attachments containing viruses that can be disinfected (e.g. word processor macro viruses) are automatically disinfected and sent on to their original destination.
It is superior to many commercial packages in its ability to handle attacks against itself, such as Denial Of Service attacks caused by messages containing the "Zip of Death".
It is easy to install into an existing e-mail gateway, requiring very little knowledge of sendmail (or Postfix, Exim or ZMailer) and no change to an existing sendmail configuration.
If you cannot afford to run it as a virus scanner, but wish to use it solely for e-mail spam protection and e-mail client vulnerability protection, you can just set "Virus Scanner = none" and it will no longer require any form of virus scanner to operate.
MailScanner itself is entirely open source, but it uses widely known commercial virus scanning packages at its core. The other software it uses is all high quality open source software, leading to a system that can be trusted for performance and reliability.
In its most common use, sendmail provides both SMTP service and delivery service at the same time. It listens for incoming e-mail messages on the SMTP port, places them into a queue, and delivers them to their destination at the earliest opportunity.
When using MailScanner, this is split into two separate jobs, each handled by a different sendmail process and a different queue. The first sendmail process listens for messages on the SMTP port and places them into an incoming queue. MailScanner is responsible for collecting messages from the incoming queue, checking and filtering them, then placing them in an outgoing queue and triggering the second sendmail process to deliver them.
Due to the design and structure of sendmail, this split is extremely simple to achieve and requires no recompiling or configuration file changes. All the required changes can be easily done by editing the commands used to start sendmail.
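The hand-off between the two queues can be pictured with a small sketch. It is written in Python and simplifies heavily: each message is assumed to be a single file (real sendmail queues keep separate control and data files), there is no locking, the delivery trigger for the second sendmail is left out, and the directory names, the process_incoming function and the is_clean callback are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def process_incoming(incoming, outgoing, quarantine, is_clean):
    """Move every queued message file to the outgoing or quarantine directory."""
    delivered, held = [], []
    for path in sorted(Path(incoming).iterdir()):
        if not path.is_file():
            continue
        clean = is_clean(path.read_text(errors="replace"))
        target = Path(outgoing if clean else quarantine)
        shutil.move(str(path), str(target / path.name))
        (delivered if clean else held).append(path.name)
    return delivered, held


def demo():
    """Run the scanner once over a throwaway queue layout."""
    with tempfile.TemporaryDirectory() as root:
        dirs = {name: Path(root) / name
                for name in ("incoming", "outgoing", "quarantine")}
        for directory in dirs.values():
            directory.mkdir()
        (dirs["incoming"] / "msg1").write_text(
            "Subject: minutes\n\nSee attached notes.\n")
        (dirs["incoming"] / "msg2").write_text(
            "Subject: offer\n\nAct now, once in a lifetime!\n")
        return process_incoming(
            dirs["incoming"], dirs["outgoing"], dirs["quarantine"],
            is_clean=lambda text: "act now" not in text.lower(),
        )


print(demo())
```

The receiving process never waits for the scanner: it only writes into the incoming directory, and everything slow happens between the two queues.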
Spam research can be done with standard Unix tools.
How to Complain About Spam, or, Put a Spammer in the Slammer
spammimic - hide a message in spam
There are terrific tools (like PGP and GPG) for encrypting your mail. If somebody along the way looks at the mail they can't understand it. But they do know you are sending encrypted mail to your pal.
The answer: encode your message into something innocent looking.
Your messages will be safe and nobody will know they're encrypted!
There is tons of spam flying around the Internet. Most people can't delete it fast enough. It's virtually invisible. This site gives you access to a program that will encrypt a short message into spam. Basically, the sentences it outputs vary depending on the message you are encoding. Real spam is so stupidly written it's sometimes hard to tell the machine written spam from the genuine article.
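The underlying trick, output sentences that vary with the message, can be shown with a toy encoder. Strictly speaking this hides (encodes) the message rather than encrypting it. The sketch below is in Python and is not spammimic's grammar: the four phrases and the two-bits-per-line scheme are invented for illustration.

```python
# Four invented spam-like lines; each one stands for a two-bit value 0..3.
PHRASES = [
    "Dear Friend, this letter was specially selected to be sent to you!",
    "This is a one time mailing, there is no need to request removal!",
    "We will help you increase customer response by 150%!",
    "Act now and you will be rich beyond your wildest dreams!",
]


def encode(message):
    """Turn each byte of the message into four spam-like lines."""
    lines = []
    for byte in message.encode("utf-8"):
        for shift in (6, 4, 2, 0):
            lines.append(PHRASES[(byte >> shift) & 0b11])
    return "\n".join(lines)


def decode(spam):
    """Recover the message by mapping each line back to its two-bit value."""
    digits = [PHRASES.index(line) for line in spam.splitlines()]
    data = bytearray()
    for i in range(0, len(digits), 4):
        a, b, c, d = digits[i:i + 4]
        data.append((a << 6) | (b << 4) | (c << 2) | d)
    return data.decode("utf-8")


hidden = encode("hi")
print(hidden)
print(decode(hidden))
```

A real encoder would need a far richer grammar: with only four sentences the repetition is obvious, and anyone who knows the phrase table can read the message.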
spamgourmet - disposable email addresses, spam filtering
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...
Disclaimer:
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. If you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.
Last modified: March 12, 2019