Apache Nutch MultiLingual Support
How?
java.net's
Getting Started
After installing the nutch binary locally:

    dormael@dormael-desktop:~/nutch-test/nutch-0.8.1$ mkdir test
    dormael@dormael-desktop:~/nutch-test/nutch-0.8.1$ vi test/nutch
    dormael@dormael-desktop:~/nutch-test/nutch-0.8.1$ cat test/nutch
    http://my.domain.name/

To keep the crawler from following links out to external sites, edit the URL filter as follows.

    dormael@dormael-desktop:~/nutch-test/nutch-0.8.1$ vi conf/crawl-urlfilter.txt
    # accept hosts in MY.DOMAIN.NAME
    +^http://([a-z0-9]*\.)*my.domain.name/

Problems and Solutions
The nutch crawler kept throwing NullPointerException without any detailed message.
On investigation, it turned out that some additionally required settings were missing from the default configuration.

    dormael@dormael-desktop:~/nutch-test/nutch-0.8.1$ vi conf/nutch-site.xml

After filling in the crawler's information in the properties below, the crawl ran without problems.
The exception appears to have been caused by these values being empty by default.

    <property>
      <name>http.agent.name</name>
      <value>My Nutch Test</value>
    </property>
    <property>
      <name>http.agent.description</name>
      <value>Test</value>
    </property>
    <property>
      <name>http.agent.url</name>
      <value>no</value>
    </property>
    <property>
      <name>http.agent.email</name>
      <value>no</value>
    </property>
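
Since the exception seemed to come from blank agent settings, it may help to check them before starting a crawl. The following is only a rough sketch: `check_agent_props` is a hypothetical helper, not part of Nutch, and it uses a regular expression rather than a real XML parser, so it assumes each `<name>` is followed directly by its `<value>` as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): report http.agent.* properties
# that are missing or blank in a nutch-site.xml style file.
# Assumption: each <name> element is immediately followed by its <value>.
check_agent_props() {
    conf_file=$1
    missing=0
    # Flatten the file so a <name>/<value> pair split across lines still matches.
    flat=$(tr '\n' ' ' < "$conf_file")
    for prop in http.agent.name http.agent.description http.agent.url http.agent.email; do
        # The value must contain at least one non-blank character.
        # Note: the dots in the property name act as regex wildcards here,
        # which is loose but harmless for a quick check.
        if ! printf '%s\n' "$flat" | grep -Eq "<name>${prop}</name>[[:space:]]*<value>[^<]*[^<[:space:]][^<]*</value>"; then
            echo "empty or missing: $prop"
            missing=1
        fi
    done
    # Exit status 0 means all four properties have a value.
    return "$missing"
}
```

Running `check_agent_props conf/nutch-site.xml` from the nutch directory prints one line per blank property and returns a non-zero status if any are found.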