Natural Language Toolkit

Natural Language Toolkit is a suite of Python libraries and programs for symbolic and statistical natural language processing.

Natural Language Toolkit Ranking and Summary


  • Rating:
  • License: GPL
  • Price: FREE
  • Publisher name: Steven Bird
  • Publisher website:

Natural Language Toolkit Tags


Natural Language Toolkit Description

Natural Language Toolkit (NLTK) is a suite of Python libraries and programs for symbolic and statistical natural language processing. NLTK includes graphical demonstrations and sample data. It is accompanied by extensive documentation, including tutorials that explain the underlying concepts behind the language processing tasks the toolkit supports.

Documentation: A substantial amount of documentation on how to use NLTK is available from the NLTK home page. In particular, the home page hosts three types of documentation:

  • Tutorials teach students how to use the toolkit in the context of performing specific tasks. They are suitable for anyone who wants to learn how to use the toolkit.
  • Reference documentation describes every module, interface, class, method, function, and variable in the toolkit. It should be useful to both users and developers.
  • Technical reports explain and justify the toolkit's design and implementation. The toolkit's developers use them to guide and document its construction. Students can consult them for more detail on how the toolkit is designed and why.

What's new in this release:

NLTK:
  - Expanded semantics package for first-order logic, linear logic, glue semantics, DRT, and LFG (Dan Garrette)
  - New WordSense class in WordNet, supporting access to synsets from sense keys and access to sense counts (Joel Nothman)
  - Interface to Mallet's linear-chain CRF implementation (nltk.tag.crf)
  - Miscellaneous bug fixes, including synsets and maxent
  - Improved support for chunkers, including a flexible chunk corpus reader and a new rule type, ChunkRuleWithContext
  - New GUI for POS-tagged concordancing (nltk.draw.pos_concordance)
  - New GUI for developing regexp chunkers (nltk.draw.rechunkparser)
  - Added bio_sents() and bio_words() methods to ConllChunkCorpusReader in conll.py for reading (word, tag, chunk_typ) tuples from the CoNLL-2000 corpus; ConllChunkCorpusView was modified to support these changes
  - Added tree classes that automatically maintain parent pointers (nltk.tree.ParentedTree and nltk.tree.MultiParentedTree)
  - New WordNet browser GUI (Jussi Salmela, Paul Bone)
  - Improved support for lazy sequences
  - More flexible parser for converting bracketed strings to trees
  - Fixes to docstrings to improve the API documentation

Contrib (work in progress):
  - New NLG package, FUF/SURGE (Petro Verkhogliad)
  - New dependency parser package (Jason Narad)
  - New coreference package, including support for the ACE-2, MUC-6, and MUC-7 corpora (Joseph Frazee)
  - CCG parser (Graeme Gange)
  - First-order resolution theorem prover (Dan Garrette)

Data:
  - New NPS Chat Corpus and corpus reader (nltk.corpus.nps_chat)
  - ConllCorpusReader can now read the CoNLL 2004 and 2005 corpora
  - HMM-based Treebank POS tagger and phrase chunker for nltk_contrib.coref, implemented in api.py; pickled versions of these objects are checked in under data/taggers and data/chunkers

Book:
  - Miscellaneous corrections in response to reader feedback

What's new in this release:

  • This version finalizes NLTK's API ahead of the 2.0 release and the publication of the NLTK book. There have been dozens of minor enhancements and bug fixes. Many names of the form nltk.foo.Bar are now available as nltk.Bar. The decision tree, collocations, and Toolbox modules have expanded functionality. A new translation toy, nltk.misc.babelfish, has been added. A new module, nltk.help, gives access to tagset documentation. Imports have been fixed so that NLTK builds and installs without Tkinter (for running on servers). New data includes a maximum entropy chunker model and updated grammars. NLTK Contrib includes updates to the coreference package (Joseph Frazee) and the ISRI Arabic stemmer (Hosam Algasaier). The book has undergone substantial editorial corrections ahead of final publication.
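The release notes mention reading (word, tag, chunk type) tuples from the CoNLL-2000 corpus. As a rough illustration of what such tuples look like, the sketch below parses a few lines in the three-column CoNLL-2000 layout using only the Python standard library. It is not NLTK code and does not reproduce ConllChunkCorpusReader: the function names read_bio_sents and read_bio_words are hypothetical, and SAMPLE is made-up illustrative data.

```python
from typing import Iterator, List, Tuple

# Illustrative data only (an assumption, not taken from the corpus files):
# one token per line as "word POS-tag chunk-tag"; a blank line ends a sentence.
SAMPLE = """\
He PRP B-NP
reckons VBZ B-VP
the DT B-NP
current JJ I-NP
account NN I-NP
deficit NN I-NP

It PRP B-NP
narrowed VBD B-VP
. . O
"""

Token = Tuple[str, str, str]  # (word, tag, chunk_type)


def read_bio_sents(text: str) -> Iterator[List[Token]]:
    """Yield each sentence as a list of (word, tag, chunk_type) tuples.

    Hypothetical helper for illustration; not part of NLTK.
    """
    sentence: List[Token] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            # A blank line closes the current sentence, if any.
            if sentence:
                yield sentence
                sentence = []
            continue
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
        word, tag, chunk_type = fields
        sentence.append((word, tag, chunk_type))
    # Emit the last sentence when the text does not end with a blank line.
    if sentence:
        yield sentence


def read_bio_words(text: str) -> List[Token]:
    """Return all tokens as one flat list, ignoring sentence boundaries."""
    return [token for sent in read_bio_sents(text) for token in sent]


if __name__ == "__main__":
    for sent in read_bio_sents(SAMPLE):
        print(sent)
```

In the chunk column, B- marks the first token of a chunk, I- a token that continues it, and O a token outside any chunk, which is why the scheme is called BIO.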


Natural Language Toolkit Related Software