Text::Deduper

Near-duplicate detection module

Text::Deduper Ranking and Summary

  • Rating:
  • ƯÇã:
  • Perl Artistic License
  • °¡°Ý:
  • FREE
  • °Ô½ÃÀÚ À̸§:
  • Jan Pomikalek
  • °Ô½ÃÀÚ À¥»çÀÌÆ®:
  • http://search.cpan.org/~janpom/

Text::Deduper Description

Text::Deduper is a Perl module for near-duplicate detection. It uses the resemblance measure proposed by Andrei Z. Broder et al. (http://www.ra.ethz.ch/cdstore/www6/technical/paper205/paper205.html) to detect similar (near-duplicate) documents based on their text.

Caution: the module only works correctly for languages whose text can be tokenized into words by detecting sequences of alphabetical characters. It may therefore not give very good results for languages such as Chinese.

Synopsis:

    use Text::Deduper;

    # Index two documents, then look up documents similar to a third.
    my $deduper = Text::Deduper->new();
    $deduper->add_doc("doc1", $doc1text);
    $deduper->add_doc("doc2", $doc2text);
    my @similar_docs = $deduper->find_similar($doc3text);

    ...

    # Delete near duplicates from an array of texts.
    my $i = 0;
    my @no_near_duplicates;
    $deduper = Text::Deduper->new();
    foreach my $text (@texts) {
        # Skip any text that resembles one already kept.
        next if $deduper->find_similar($text);
        $deduper->add_doc($i++, $text);
        push @no_near_duplicates, $text;
    }

Requirements:

  • Perl
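To make the resemblance measure in the description more concrete, here is a minimal sketch in plain Perl. It treats each text as a set of overlapping word sequences ("shingles") and computes the size of the intersection of the two sets divided by the size of their union. This is not Text::Deduper's actual code: the function names `tokenize`, `shingles`, and `resemblance` and the default shingle width of 3 are assumptions made for illustration, and the module may tokenize, size, or sample shingles differently.

```perl
use strict;
use warnings;

# Split text into lowercase words by detecting alphabetical character runs.
# (Hypothetical helper; mirrors the tokenization caveat in the description.)
sub tokenize {
    my ($text) = @_;
    return map { lc($_) } $text =~ /([[:alpha:]]+)/g;
}

# Build the set of w-word shingles (contiguous word sequences) as a hash ref.
sub shingles {
    my ($text, $w) = @_;
    my @tokens = tokenize($text);
    my %set;
    for my $i (0 .. $#tokens - $w + 1) {
        $set{ join ' ', @tokens[$i .. $i + $w - 1] } = 1;
    }
    return \%set;
}

# Resemblance: shared shingles divided by all distinct shingles.
# The default width of 3 is an assumption, not the module's setting.
sub resemblance {
    my ($text_a, $text_b, $w) = @_;
    $w //= 3;
    my $set_a  = shingles($text_a, $w);
    my $set_b  = shingles($text_b, $w);
    my $common = grep { exists $set_b->{$_} } keys %$set_a;
    my $union  = keys(%$set_a) + keys(%$set_b) - $common;
    return $union ? $common / $union : 0;
}

# Two of the four distinct 3-word shingles are shared.
printf "%.2f\n", resemblance(
    "the quick brown fox jumps",
    "the quick brown fox sleeps",
);    # prints 0.50
```

A score of 1 means the shingle sets are identical and 0 means they share nothing. A deduplication tool would typically treat two documents as near duplicates when the score exceeds some chosen threshold.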


Text::Deduper Related Software

filterunit

filterunit lets you devise unit tests for command-line programs. ...
