IJSRP, Volume 4, Issue 3, March 2014 Edition [ISSN 2250-3153]
Mrs. Latha K., Nivedha P., Menagagandhi G., Ramya T.
In collaborative spam filtering based on near-duplicate detection, an e-mail abstraction scheme is needed that reliably captures the evolving nature of spam. In this work, we explore an e-mail abstraction scheme that is more sophisticated and robust than those used in prior research: it represents each e-mail by its layout structure. We propose a specific procedure, SAG, that generates the e-mail abstraction from the HTML content of an e-mail, and this newly devised abstraction captures the near-duplicate phenomenon of spam more effectively.
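To make the idea of a layout-based abstraction concrete, the following is a minimal sketch, not the SAG procedure itself, whose details are not given here. It assumes that an e-mail can be abstracted as the ordered sequence of its HTML tags, with text and attributes discarded, and that two e-mails count as near-duplicates when their sequences match exactly. The names `TagSequenceExtractor`, `layout_abstraction`, and `is_near_duplicate` are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class TagSequenceExtractor(HTMLParser):
    """Collects HTML tags in document order, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Attributes (e.g. href values) are dropped on purpose: spammers
        # change them freely, while the layout tends to stay the same.
        self.tags.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.tags.append(f"</{tag}>")


def layout_abstraction(html_body):
    """Return the tag sequence of an HTML e-mail body as a tuple.

    Assumption: this tag-only sequence stands in for the paper's
    abstraction; the actual SAG procedure may differ.
    """
    parser = TagSequenceExtractor()
    parser.feed(html_body)
    parser.close()
    return tuple(parser.tags)


def is_near_duplicate(html_a, html_b):
    """Assumed matching rule: identical tag sequences mean near-duplicates."""
    return layout_abstraction(html_a) == layout_abstraction(html_b)


if __name__ == "__main__":
    spam_a = '<html><body><p>Buy cheap pills</p><a href="http://x.example/1">click</a></body></html>'
    spam_b = '<html><body><p>Best offer today!!</p><a href="http://y.example/2">here</a></body></html>'
    ham = "<html><body><div>Hi</div></body></html>"

    print(is_near_duplicate(spam_a, spam_b))  # same layout, different wording
    print(is_near_duplicate(spam_a, ham))     # different layout
```

In this sketch the two spam messages differ in wording and link targets but share one tag sequence, so they are treated as near-duplicates, whereas a purely content-based comparison would see them as different. Exact sequence equality is the simplest possible matching rule; a real system would likely need a more tolerant comparison.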