Please visit the LangaList Home Page

Please note: Older issues may contain information that is now out of date.

How To Subscribe and Unsubscribe is at the end of this note.
Mailing List Trouble? See http://langa.com/help.htm
Please recommend the LangaList to a friend! (And maybe win a prize!)
An easier-to-read formatted HTML version of this newsletter is available.

The LangaList 2004-01-15

Please visit our sponsors and help keep the LangaList S.E. free!

--- ( Your Clicks On Ad Links Help Keep The LangaList Free! ) ---
--------------( the above is an advertisement )-------------

1) Spammers Get Smarter
I've seen that too, Jeff. The text is sometimes gibberish, but other times is an actual letter or even a passage from a literary work. What's going on in all these cases is that the spammer is trying to overwhelm Bayesian filters by altering the context in which the spam trigger words appear.

Bayesian filters operate statistically; they compute the odds of any given email being spam by looking at the type and frequency of certain words and formatting conventions. Some combinations almost always indicate spam. For example, if the Latin name of a male body part appears many times in a short email, along with words associated with making a purchase, odds are it's spam. (This is actually a difficult subject to write about without using words and phrases that will trigger spam filters! <g>) But if that same Latin term appears infrequently in a long text that also contains the words "Michelangelo" and "David," odds are it's not spam, but a description of a statue.

So, spammers have started adding off-topic content to their emails to defeat statistical analysis. In the simplified example I'm using, if a spammer placed a long block of text on classical sculpture in an ad for some enlargement potion, a Bayesian filter might be tricked into letting the ad through, thinking it's not spam or that it's only "possible spam."

Still, because Bayesian filters are sensitive to context, they usually work pretty well, and remain our single best option in fighting spam.

In contrast, blacklist-oriented filters are the worst choice because they're far too crude, often blocking entire ranges of IP addresses or even whole ISPs because of one or two spammers. This is analogous to blocking all paper mail from all residents of, say, Texas because a couple of Texans sent out some bad mail. While this broad block would stop the bad Texans, it also would be grotesquely unfair to the majority of totally innocent mail users in Texas.

But this kind of wholesale blocking is exactly what email blacklists/blocklists do. Any ISP, web host or mailing service of any size will have some small percentage of sleazeball spammers, and they deserve to be punished. But blacklist-oriented services punish the innocent along with the guilty by indiscriminately blocking all users of a given ISP, web host or mailing service. Blacklists are evil, and I can't wait for them to go away.

A third anti-spam approach has some merit: It's "rule-based" filtering. It's far less flexible than Bayesian filtering, and has more false positives, but actually is better at catching some kinds of disguised spam. For example, a rule-based filter could be set up to treat as spam any email that (1) mentions the Latin word for a male body part plus (2) the brand name for some enlargement potion and (3) contains instructions on making a purchase. The rule-based filter will work even if there's (say) a long block of text on classical sculpture appended to the email.

But rule-based filters are hard to maintain, and must be extremely complex to avoid a huge number of false positives. They're very hard to do well, and require a lot of intervention, where Bayesian filters can be more or less "set and forget": The filter watches what you mark as spam and non-spam, and automatically builds and updates its statistical rules.

If you can only use one tool, make it a Bayesian filter (see http://www.informationweek.com/story/IWK20021115S0018 ). But I've actually been getting excellent results with two-stage filtering: I use a highly-developed rule-based tool ( http://www.spamassassin.org/ ) as the initial filter, and then follow with a Bayesian filter (the one built into the current Eudora; http://www.eudora.com ). It's still not perfect, but has reduced the spam in my inbox to a trickle, with a low percentage of false positives.
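
For the code-minded, here's a minimal Python sketch of the filtering ideas described above: a toy Bayesian scorer, a single hand-written rule, and the two chained together. Everything in it is an assumption made up for illustration: the `TinyBayes` class, the placeholder tokens `bodypartus` (standing in for the Latin term) and `enlargo` (an invented brand name), the tiny training set, and the 0.9 threshold. It is not how SpamAssassin or Eudora work internally; it only shows why off-topic padding can drag a statistical score down while a fixed rule still fires.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    words = (w.strip(".,!?;:()\"'").lower() for w in text.split())
    return [w for w in words if w]


class TinyBayes:
    """Toy word-frequency filter; needs at least one spam and one non-spam sample."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        spam_total = sum(self.counts["spam"].values()) + len(vocab)
        ham_total = sum(self.counts["ham"].values()) + len(vocab)
        # Start from the prior odds, then add the evidence of each known word.
        log_odds = math.log(self.docs["spam"] / self.docs["ham"])
        for word in tokenize(text):
            if word not in vocab:
                continue  # never-seen words carry no evidence either way
            p_spam = (self.counts["spam"][word] + 1) / spam_total  # add-one smoothing
            p_ham = (self.counts["ham"][word] + 1) / ham_total
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


def rule_based_is_spam(text):
    """Hypothetical rule: body-part term + brand name + purchase wording."""
    words = set(tokenize(text))
    return (
        "bodypartus" in words
        and "enlargo" in words
        and bool(words & {"buy", "order"})
    )


def two_stage_verdict(text, bayes, threshold=0.9):
    """Run the rule first, then fall back to the statistical score."""
    if rule_based_is_spam(text):
        return "spam (rule)"
    if bayes.spam_probability(text) >= threshold:
        return "spam (bayes)"
    return "ok"


# Tiny invented training set (placeholder words only).
bayes = TinyBayes()
bayes.train("buy enlargo now bodypartus bodypartus order today", "spam")
bayes.train("order enlargo cheap bodypartus buy now", "spam")
bayes.train("michelangelo carved david the statue shows bodypartus in marble", "ham")
bayes.train("the museum tour covers michelangelo and classical sculpture", "ham")
bayes.train("meeting moved to friday please confirm", "ham")

plain_ad = "buy enlargo today bodypartus order now"
padding = "michelangelo david statue marble sculpture museum classical " * 3
padded_ad = plain_ad + " " + padding

print("plain ad score: ", round(bayes.spam_probability(plain_ad), 4))
print("padded ad score:", round(bayes.spam_probability(padded_ad), 4))
print("two-stage verdict on padded ad:", two_stage_verdict(padded_ad, bayes))
```

With this toy data, the plain ad scores close to 1, while the same ad padded with sculpture talk falls well below 0.5, which is exactly the trick the spammers are playing. The rule doesn't care about the padding, so running it first catches what the statistical stage alone would have let through. A real rule set needs to be far more elaborate than this one to keep false positives down.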
All of this is occasioned by the article "E-Mail--Hideously Unreliable," at http://www.informationweek.com/story/showArticle.jhtml?articleID=17300016 . That article describes a test of almost 11,000 emails I sent to LangaList volunteers, in which some 40% of the attempted communications failed, with many of the utterly benign messages being eaten by hyperactive spam filters. Plus, at the end of that article, I also sum up the best-available techniques--- not only filters, but also methods of sending email--- to help you ensure successful deliveries, and minimize the chances that your emails will be lost.

Odds are, if you're reading this newsletter, email is important to you. Please check out the article at http://www.informationweek.com/story/showArticle.jhtml?articleID=17300016 so you'll know what we're all up against, and what you can do about it! (See also next item.)

2) Test Issues

A number of readers took issue with the email test (described in full at http://www.informationweek.com/story/showArticle.jhtml?articleID=17300016 ):
First, spam is unsolicited commercial email; and it's usually sent in bulk. This email was solicited--- the volunteers invited it. It contained no commercial message. It was not sent in bulk. Thus, it was not spam, by any normal definition.

And I had clearly stated "I won't tell the volunteers in advance what address the mail will come from or what the subject line will be.... " Yes, if I'd carefully designed the mail so you'd know it was from me, and if I'd sent it from an already-known/whitelisted address, then it would have bypassed most filters. But what would that have proved? I wanted all filtering--- both software and human--- to come into play because that's where I suspected most mail was being lost. Many people do indeed simply discard *all* unfamiliar email--- and that means that a lot of valid mail is getting trashed.

Think of it: If your filters (or you) treat unrecognized email as spam, then you will never hear from anyone you don't already know. You'll never be able to get any reply from any new web site you visit, or any new email service you sign up for. You won't hear from friends or coworkers who change email addresses, or who write to you from secondary accounts. In fact, you can never be contacted by *anyone* you haven't already heard from and approved. Surely that's NOT what you want. 8-) And I believe that's why 40% of the test communications failed!

Other readers took issue with other parts of the test, but I still think it did what it set out to do: To examine the reliability of unanticipated or initial-contact non-spam emails from non-hostile but non-whitelisted senders: the kind of mail you might exchange with co-workers, friends, business associates, or customers whose addresses aren't already in your "approved senders" list.

But maybe I'm wrong. To voice your own opinions, pro or con, please join the discussion via the link on the last page of the article at http://www.informationweek.com/story/showArticle.jhtml?articleID=17300016 . See you there!

3) Microsoft Blinks--- Again!

Microsoft has once again stepped to the brink of dropping Win98 support, and then backed away at the last minute:
If you're still using or supporting Win98, that's good news. But don't celebrate too much: What it mostly means is that Microsoft will continue to offer limited paid support (at $35 per call) for Win98 users; and may--- may--- offer additional security patches and updates for the aging OS. Microsoft had previously announced that the Win98 Knowledgebase and other self-help items would remain alive until 2006. That remains in effect.

It's great that Microsoft has taken the pressure off Win98 users, but it doesn't change the fact that Win98 is getting quite long in the tooth, in tech terms. Even if Microsoft releases additional security patches, there are fundamental limits: Win98 predates a whole range of current and new technologies--- USB2, support for huge hard drives, newer CPU types and optimizations, SATA, etc. etc.--- and there's only so much back-filling you can do.

Microsoft sometimes force-feeds meaningless updates to its customers, like WinME, which should have been a free update for Win98 users. But all OSes eventually need to be upgraded at their core. That's not a Microsoft thing. It's the way all tech evolves.

So even if Microsoft will now keep Win98 on limited life support, it's still time for Win98 users to start thinking about an upgrade. It could be a free or low-cost Linux distribution, or upgrading to XP, or even getting a new PC with a newer OS preinstalled (which doesn't have to cost a lot: http://www.informationweek.com/story/IWK20030206S0014 ). It's wonderful that there's some extra time now, so your upgrade can be slow and methodical rather than a hasty, slapped-together thing. Take your time, but do make a move fairly soon. It's time!

4) Odd Ripple Effects

A *very* weird chain of events affected huge numbers of users of Microsoft Office and Norton Antivirus last week--- and may still be affecting some. Other users were affected too, although in lower numbers. You'll know if you're affected because your copy of Microsoft Word became very, very slow to open; and Excel might not have been able to start at all; and/or your whole system may have bogged down. The problem isn't in Word, Excel, Windows, or NAV--- but is something else entirely:
Thanks, Joshua, and all the other readers who wrote about this. This is actually a security problem: Verisign couldn't handle the load on its servers as millions of PCs sought to update an expired certificate list. Norton AntiVirus, among many other apps, relies on that list to help ensure that its patches are authentic. When NAV couldn't get the updated list, it stalled, bringing the default "scan on open" operations in Word and Excel to a halt.

But there's a fix on both the Symantec site, above, and at Verisign ( http://langa.com/u/2t.htm ). If your PC is still slow or erratic, check out those links. It just goes to show that, in PCs, the symptom may be far, far removed from the root problem!

5) Keep Master Backups Current
I update my "master" image a couple times a year: I make a normal image of my working setup, store it, and then reinstall my original master image--- the "perfect, like new" setup from some time ago. I then run Windows Update, Office Update, and any other updates for the software installed in the master setup. I then do a cleanup, defrag the system, and burn a new master image: This becomes the new reference standard; a foundation image for any new installs I have to do. (Note: I still keep the old master images on CD so I can roll the system back all the way to as-delivered-from-the-factory, if I need to.)

Once the master image is up to date, I then reinstall my most-recent working image, and resume normal operation. This way, the master image and working image are never too far out of synch; and the master image stays clean of the debris of day-to-day operation.

It's one of those things that sounds harder than it is. 8-) In operation, it's really not bad at all. I usually can restore, update, and save a new master image in much less than an hour (albeit with a fast system). Done a couple times a year, it's a small price to pay for having a current master image to fall back on, just in case!

7) Free System Cleanup Tool
Thanks--- looks handy, giving you a simple, unified front end to "...delete Cookies, clear Internet Explorer Cache, delete index.dat Files, clear Typed URLs, Windows Temp Folder and much more."

9) CD Shredders

Hi Fred, I have a large number of old diskettes and CD Roms with sensitive information. As the information is outdated, I want to destroy the physical media instead of writing over them as I figured that might be faster and less tedious. What's the best and most effective way? (PS. I tried using a very strong magnet on the diskettes but the data stubbornly remains intact!) Thanks, Tim Thackery

One simple, reliable approach is to physically destroy them with a shredder. There are specialty shredders made just for discs, and the data is extremely difficult--- nearly impossible--- to recover: http://langa.com/u/2v.htm

If that's not possible, you can cut the floppies with shears and score the top of the CDs with an awl or other sharp instrument. The more cutting/scoring, the harder the data would be to recover. (But it's not a perfect method.)

10) Just For Grins

In the early days of computing, programmers with too much free time created "ASCII graphics": pictures--- often risque--- composed entirely of numbers, letters, and punctuation marks used as pixels. Some of the drawings were amazingly detailed, with surprising artistry. Which brings us to this:
Amazing!

11) Plus! Edition Highlights:
(Want to give a gift subscription to the LangaList Plus edition?

See you next issue!

Best,

An easier-to-read formatted HTML version is available in the "Current Issue" section of http://www.langa.com. (The HTML version of each issue normally is available by 9AM EST [UT-5] of the issue date.) All past LangaList issues are also available at the Langa.Com site.

UNSUBSCRIBE (instant removal!): http://langa.com/leave_langalist.htm

CHANGE ADDRESS? LIST TROUBLE? HAVE QUESTIONS? OTHER PROBLEM? NEED HELP? See http://langa.com/help.htm

This newsletter is SPAM PROOF and requires two levels of subscriber confirmation before delivery begins: See http://langa.com/info.htm