Illiterate phishers

I've recently noticed an uptick in spam with good graphical quality but terrible proofreading. A few random examples are below.

The spammeisters at ResearchGate are acting like proud grade-school parents over my contributions to the field of Natural Glyage Processing:

Some people who pretend to be from nightly.news.report or maybe from Home & Life Magazine want to tell me about Cannabis Oil. The core of their message is reasonably professional, with just a couple of typos or L2-English infelicities:

But that email starts out with this curious bit of text:

Myl history :thinking: i'll allow it username checks out watch out, he's gonna get you with

…and ends with this:

NMB three six seven zeroround meadow lane hatboro PA 19040
If you do not wish to receive future mailings from NMB Resources, Cancel/stop-.

now i want to watch blade now r/tytumblrgifs luke cage film my reaction when people say x-men started the modern
should remain alert to signs of anxiety, depression, and opioid use disorder (see recommendations 8 and 12) that might be unmasked by an opioid taper and arrange for management of these co-morbidities. for patients agreeing to taper to lower opioid

Meditative text messages among the phishers? Dribblings from a text-generation program that's lost its tiny mind?

The same crew (?), pretending to be ABC News, wrote to tell me about relief from "Chornic Pain" under the Subject line "CBD oils aer now legal in all states":

No pre-graphics burbling, but afterwards came this:

it?? Kama asked, speaking in a low voice as he edged closer to the pair. ?Not easily,? Skizer replied as Razen glared at Kama. ?With both drones, I
am able to generate a low fidelity.



17 Comments

  1. Chris C. said,

    February 6, 2018 @ 3:41 pm

    Most likely, snippets of text designed not so much to convey information as to confound Bayesian spam filters.

  2. Aaron said,

    February 6, 2018 @ 4:03 pm

    Obligatory link to research about the false positive problem: https://www.microsoft.com/en-us/research/publication/why-do-nigerian-scammers-say-they-are-from-nigeria/. Not only can misspellings help get through spam filters, they also fool only the people most likely to actually fall for whatever the spammers are selling.

    I agree that that last paragraph of gibberish does seem like a lorem ipsum generator gone mad :)

    [(myl) That hypothesis doesn't account for ResearchGate's interest in Natural Glycage Processing…]

  3. bratschegirl said,

    February 6, 2018 @ 7:30 pm

    So if with both drones one is able to generate a low fidelity, do more than two drones lead to higher or lower fidelity? And which provides more relief from that pesky chornic pain?

  4. Gregory Kusnick said,

    February 6, 2018 @ 7:56 pm

    Those text snippets don't seem machine-generated to me. A few minutes of googling turned up the source of the opioid snippet on the CDC website, and the bit about Razen, Kama, and their drones appears to come from an old reddit thread cached here.

    So it seems pretty clear that these bits are being harvested from actual web pages for use as camouflage to sneak spam past the filters, as Chris suggests.

  5. AntC said,

    February 6, 2018 @ 9:53 pm

    ?myl is running a couple of careers on the side: both glyage processing and glycage processing??

    Google search seems to know that 'glyage' is a thin cover for 'language', and finds those ResearchGate papers straight away. It's more puzzled by 'glycage'.

    I would have thought 'glyage' is sufficiently lexically distant from 'language' that Google might offer all sorts of re-spellings(?)

    Perhaps we need to behave more like Chinese netizens to evade big brother Google: invent cunning ways to more obliquely say stuff?

  6. Termy said,

    February 7, 2018 @ 1:59 am

    I suspect that "Glyage" comes from (https://aclanthology.info/pdf/J/J92/J92-2007.pdf), which Google seems to have transcribed as "N A T U R A L. GLYAGE. PROCESSING."

    In fact if I highlight, cut and paste the title it comes out as "THE ACL-MIT PRESS SERIES in
    NATURAL GLYAGE
    PROCESSING "

  7. Termy said,

    February 7, 2018 @ 3:34 am

    My previous comment of some hours ago appears to have vanished into the yonder – possibly eaten by the spam filter?

    If you search for "glyage", besides this LL article Google returns an archived page of the ACL anthology advertising the "ACL-MIT PRESS SERIES in NATURAL LANGUAGE PROCESSING", featuring none other than one Mark Y Liberman. From 1992!

    If the title is highlighted, cut, and pasted, the text renders as "THE ACL-MIT PRESS SERIES in NATURAL GLYAGE PROCESSING", hence the Google title. ResearchGate have obviously grabbed this, complete with the odd text rendering.

  8. Keith said,

    February 7, 2018 @ 5:16 am

    Do these really come from ResearchGate? Have you looked in the headers and used a whois service to find the holders of the domains from whence the messages were sent?

    That postal address, 3670 Round Meadow Ln, Hatboro, PA 19040, is the address of A-Z Magic:

    Arlen Zachary of A-Z Magic is one of the Philadelphia area's premier children's entertainers. He has been performing magic and entertaining audiences in and around the Philadelphia area for over 35 years. A full-time, professional magician who presents a high-energy, spellbinding program of magic, that is full of fun

    I wonder what Mr Zachary would think of the spammers using his address?

  9. Ursa Major said,

    February 7, 2018 @ 7:03 am

    The ResearchGate record for that paper has the same error in the title, and it also appears on the cover sheet of the full text pdf (generated by ResearchGate). https://www.researchgate.net/publication/265080569_THE_ACL-MIT_PRESS_SERIES_in_NATURAL_GLYAGE_PROCESSING

    I don't know much about how RG works… Are the records created by users? But it seems unlikely that someone would mangle the typing of 'language' so badly. My first thought was actually that it was an OCR error, but the absence of any other errors in the text held by RG and the decent-looking quality of the scanned pdf seem to rule that out. So it does look like a mystery as to where glyage came from.

  10. Rodger C said,

    February 7, 2018 @ 7:48 am

    Perhaps it was typed by someone who saw "natural … processing" and was primed to expect a chemical, like glycerin, or glycogen.

  11. Emily said,

    February 7, 2018 @ 11:04 am

    Sounds like the cannabis oil sellers have been sampling a bit too much of their own merchandise…

  12. Ursa Major said,

    February 7, 2018 @ 11:13 am

    Hi Rodger. As a chemist, when I first glanced at the image a similar thought crossed my mind, but as soon as I read it properly I knew it couldn't be the case. The suffix -age is rarely (never?) used in chemical names, and while the stem gly- occurs in some kinds of names I would expect glyc- in any conceivable context implied by the sentence. So to my chemist's eyes 'glyage' looks like a nonsense word. Also, 'natural language processing' is a standard term in its field, but 'natural xxx processing' doesn't feel like the way a chemist would describe something.

  13. Ernie in Berkeley said,

    February 7, 2018 @ 2:15 pm

    The random text is indeed included to foil spam filters. Here's an explanation:

    http://www.hoax-slayer.com/hidden-text-spam.html

    And go here to generate some of your own.

    http://www.spammimic.com/encode.shtml

  14. J.W. Brewer said,

    February 7, 2018 @ 4:27 pm

    Of course perfectly fluent L1 Anglophone spammers are not always good at their job and sometimes come up with weirdly-off headlines or catchphrases. I just got an advertising email seeking to sell a reasonably-reputable product from a reasonably-reputable and US-based source with the subject line "Don’t miss certain threats to your identity." Nothing ungrammatical about it, but it makes no sense pragmatically if you're selling something. Why would the reader/prospect be concerned about missing certain (unspecified) threats but be nonchalant about missing other (also unspecified) threats? It's just incredibly bad phrasing in context even though there's nothing wrong with it syntactically and the body of the email doesn't have any obvious signs of ESLishness. (I think the point they're trying to make is that their product/service detects *additional* important threats that could lead to identity-theft etc that the competition doesn't detect, on top of also doing the stuff the competition also does. That sounds like a good marketing pitch if they could figure out a way to convey it in the subject line.)

  15. Rodger C said,

    February 9, 2018 @ 7:46 am

    @Ursa Major, I have an undergrad minor in chemistry. I think you're assuming that the transcriber (a) actually knows chemistry, (b) is paying attention to the task. Neither seems likely to me.

  16. Francis Boyle said,

    February 10, 2018 @ 4:38 am

    Isn't glyage the process that gives us words like 'covfefe'?

  17. mg said,

    February 10, 2018 @ 11:26 am

    I recently got spam inviting me to submit a paper to the prestigious (pay to publish, of course) publication <>.
