making gifs automatically
Marc LeBrun
mlb at well.com
Thu Nov 7 03:05:43 CET 2002
>=Rick Shepherd
> However, now it occurs to me that a more
> sophisticated robot of the future (if not of the present)
> could probably use OCR (Optical Character Recognition)
> to convert the gif back to text and still harvest the addresses.
There are already very clever programs which make this extremely
difficult. They use arbitrary fonts, tilt and mis-register the character
images, add noise and meaningless graphics (e.g. light grid lines), use
varying colors from image to image, etc.
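To make those tricks concrete, here is a rough sketch of my own, not the code of any of the programs I have in mind. It assumes a made-up 5x5 bitmap font (GLYPHS, covering only the three characters of the demo string) instead of arbitrary real fonts, and it writes a plain-text PPM rather than a gif, just to stay within the Python standard library. The names obfuscate and write_ppm and all the parameter defaults are hypothetical.

```python
import random

# Hypothetical 5x5 glyphs, just enough to spell the demo string "A@B".
GLYPHS = {
    "A": [".###.", "#...#", "#####", "#...#", "#...#"],
    "@": [".###.", "#.###", "#.#.#", "#.###", ".##.."],
    "B": ["####.", "#...#", "####.", "#...#", "####."],
}


def obfuscate(text, seed=None, scale=3, pad=6, noise=0.03, grid=8):
    """Return rows of (r, g, b) pixels showing `text` with distortions."""
    rng = random.Random(seed)
    cell = 5 * scale + pad              # horizontal room per character
    width = cell * len(text) + pad
    height = 5 * scale + 2 * pad
    img = [[(255, 255, 255)] * width for _ in range(height)]

    # Light grid lines: meaningless graphics that clutter the image.
    for y in range(height):
        for x in range(width):
            if x % grid == 0 or y % grid == 0:
                img[y][x] = (215, 225, 235)

    jitter = pad // 2
    for i, ch in enumerate(text):
        # Each character gets its own dark color, offset, and slant.
        ink = tuple(rng.randrange(120) for _ in range(3))
        dy = rng.randint(-jitter, jitter)   # vertical mis-registration
        shear = rng.choice((-1, 0, 1))      # crude tilt: shift rows sideways
        for gy, row in enumerate(GLYPHS[ch]):
            for gx, bit in enumerate(row):
                if bit != "#":
                    continue
                for sy in range(scale):
                    for sx in range(scale):
                        y = pad + dy + gy * scale + sy
                        x = pad + i * cell + gx * scale + sx + shear * (gy - 2)
                        if 0 <= y < height and 0 <= x < width:
                            img[y][x] = ink

    # Speckle noise: recolor a small random fraction of the pixels.
    for y in range(height):
        for x in range(width):
            if rng.random() < noise:
                img[y][x] = tuple(rng.randrange(256) for _ in range(3))
    return img


def write_ppm(path, img):
    """Write the pixel rows as a plain-text (P3) PPM file."""
    with open(path, "w") as f:
        f.write(f"P3\n{len(img[0])} {len(img)}\n255\n")
        for row in img:
            f.write(" ".join(f"{r} {g} {b}" for r, g, b in row) + "\n")
```

Calling write_ppm("addr.ppm", obfuscate("A@B")) gives a differently distorted picture on every run; passing a seed makes the result repeatable. A real version would of course need a full character set and a gif encoder.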
Certainly one could imagine a very sophisticated OCR program handling this
(after all, humans can!<;-) but it's not clear that it would be
economically feasible for a spambot web crawler to subject every random gif
or jpeg it came across to such treatment, on the off chance that it would
produce a viable eMail address.
I think there was a group at CMU(?) that had algorithms for doing this over
a year ago. Unfortunately I've been unable to track them down... does
anyone know who they were?
More information about the SeqFan mailing list