Unicode::Japanese::JA(3pm)
NAME
Unicode::Japanese::JA - XXXXXXXXXX
SYNOPSIS
use Unicode::Japanese;
use Unicode::Japanese qw(unijp);
# convert utf8 -> sjis
print Unicode::Japanese->new($str)->sjis;
print unijp($str)->sjis; # same as above.
# convert sjis -> utf8
print Unicode::Japanese->new($str,'sjis')->get;
# convert sjis (imode_EMOJI) -> utf8
print Unicode::Japanese->new($str,'sjis-imode')->get;
# convert zenkaku (utf8) -> hankaku (utf8)
print Unicode::Japanese->new($str)->z2h->get;
DESCRIPTION
- Unicode::Japanese XXXXXXXXXXXXXXXXXXXXXXXXXXX
- XX
- o Unicode::Japanese XXXXXXXXXUTF-8 XXXXXXXXXXX
- o XS XX/XXXXXXXXXXXXXXXX XS XXXXXXXXXXXXXXXXX No-XS
XXXXXXXXXXXXXXXXXXXX (Japanese.pm XXXXXXXXXXXXXX)X
- o XXXXXXXXXXXXXXXXXXXXXXXXXXXX
- o XXXX (DoCoMo i-modeXKDDI AU, Softbank Mobile, ASTEL dot-i) XXXXX
Unicode XXXXXXXXXXXXXXXXDB XXXXXXXXXXXXXXX
- o XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- o SJIS XX MS-CP932 XXXXX Unicode XXXXXXXXXXXX
- o Unicode -> SJISXXXEUC-JP/JISX XXXXXXXXSJIS XXXXXXXXXX &#dddd;
XXXXXXXXX XXXUnicodeXXXXXXXXXXXXXXX '?'XXXXX. XX, XXXXXXXXXXXX,
XXXXXXXXXXXX'?'XXXXX.
- o Perl-5.8.0 XXXXXX, utf8 XXXXXXXXXXXXX. utf-8 `XXX'X XXXXX utf8()
XXXXX, utf-8 `XX'X XXXXX getu() XXXXXXXXX.
get() XXXXXXXXXX utf-8 `XXX'X XXXXX (XXXXXXXXXXXXXXXXX).
sjis(), jis(), utf8(), etc. XXXXXXXXXXXXXXX.
new, set, getcode XXXXXXXXX, utf8-flagged/bytes XXXXXX.
XXXXXXXX
- o perl 5.10.x, 5.8.x, etc. (5.004 XX).
- o (XXXXOK) C XXXXX. XXXXXXXX XS X Pure Perl XXXXXXXXXX. C
- XXXXXXXXXXXX, Unicode::Japanese X Pure Perl XXXXXXXXXXXXXXXXXX.
- o (XXXXOK) XXXXX Test.pm XX Test::More.
- XXXXXXXXXXXXXXXXXX.
METHODS
- $s = Unicode::Japanese->new($str [, $icode [, $encode]])
- XXX Unicode::Japanese XXXXXXXXXXXXX
- XXXXXXXXXXXX"set" XXXXXXXXXXX
- $s = unijp($str [, $icode [, $encode]])
- Unicode::Japanese->new(...) XXX.
- $s->set($str [, $icode [, $encode]])
- $str: XXX
$icode: XXXXXXXXXXXXXXXX 'utf8'
$encode: XXXXXXXXXXXXXX
- XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX UTF-8 XXXXXXXX
- XXXXXXXXXX:
auto
utf8 ucs2 ucs4
utf16-be utf16-le utf16
utf32-be utf32-le utf32
sjis cp932 euc euc-jp jis
sjis-imode sjis-imode1 sjis-imode2
utf8-imode utf8-imode1 utf8-imode2
sjis-doti sjis-doti1
sjis-jsky sjis-jsky1 sjis-jsky2
jis-jsky jis-jsky1 jis-jsky2
utf8-jsky utf8-jsky1 utf8-jsky2
sjis-au sjis-au1 sjis-au2
jis-au jis-au1 jis-au2
sjis-icon-au sjis-icon-au1 sjis-icon-au2
euc-icon-au euc-icon-au1 euc-icon-au2
jis-icon-au jis-icon-au1 jis-icon-au2
utf8-icon-au utf8-icon-au1 utf8-icon-au2
ascii binary
- ( XXX.)
- XXXXXXXXXXXXXXXX'auto' XXXXXXXXXXXXXX 'auto' XXXXXXXXXXXXXgetcode() XXXXXXX XXXXXX
- XXXXXXXXXXXX'base64' XXXXXXXXX base64 XXXXXXXXXbase64 XXXXXXXXX
Unicode::Japanese XXXXXXXXXXXXX
- XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 'binary' XXXXXXX
- sjis-imodeXsjis-dotiXXXXXXXXXX &#dddd; X XXXXXXXXXXX
- XXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXX
- sjis, utf8 XXXXXXXXXXXXXXXXXsjisX sjis-auXsjis-doti
XXXXXXXXXXXXXXXXXsjis-auX XXXXXX
- $str = $s->get
- $str: XXX(UTF-8)
- XXXX UTF-8 XXXXXXXXXXX
- XXX `XXX' X XXXXXX, XXXXXXXXXXXXXXXXX.
- XXXXXXXXX utf8() XXXXX, XXXXXXXX getu() XXXXXXXXXXXXXXXXX.
- $str = $s->getu
- $str: XXX(UTF-8)
- XXXX UTF-8 XXXXXXXXXXX
- Perl-5.8.0 XXXXXXX, utf-8 XXXXXXX utf-8 XXXXXX XXXX.
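The difference between get and getu can be seen with a minimal sketch. It assumes Unicode::Japanese is installed and Perl 5.8.0 or later, and that get returns a byte string as this documentation currently describes. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Unicode::Japanese;

# UTF-8 bytes for U+3042 (HIRAGANA LETTER A), written as escapes so the
# example does not depend on the encoding of this source file.
my $input = "\xE3\x81\x82";

# The input code defaults to 'utf8' when it is not given.
my $s = Unicode::Japanese->new($input);

my $bytes = $s->get;     # UTF-8 as a byte string
my $chars = $s->getu;    # UTF-8 as a utf8-flagged (character) string

# length() counts bytes for the first value and characters for the second.
printf "get:  length=%d flagged=%s\n",
    length($bytes), utf8::is_utf8($bytes) ? 'yes' : 'no';
printf "getu: length=%d flagged=%s\n",
    length($chars), utf8::is_utf8($chars) ? 'yes' : 'no';
```

Use getu when the result feeds character-oriented code (regular expressions, length, substr), and get when writing raw bytes to a file or socket.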
- $code = $s->getcode($str)
- $str: XXX
$code: XXXXXXXXXXX
- XXXXXXX($str)XXXXXXXXXXXXXXX
- XXXXXX, XXXX, XXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- (PurePerlX)
- 1. UTF-32 X BOM XXXXXutf32 XXXXXXX
- 2. UTF-16 X BOM XXXXXutf16 XXXXXXX
- 3. UTF-32BE XXXXXXXXXXXutf32-be XXXXXXX
- 4. UTF-32LE XXXXXXXXXXXutf32-le XXXXXXX
- 5. ESC XX XXX 8 XXXXXXXXXXXXXXXXXXXXXXXascii XXXXX XXESC XXXX
ASCII XXXX (0x00-0x1F XX 0x7F) X ascii XXXXXXXXX XX
- 6. JISXXXXXXXXXXXXXXXXXXXjis XXXXXXX
- 7. J-PHONE XXXXXXXXXXXXXsjis-jsky XXXXXXX
- 8. EUC-JP XXXXXXXXXXXXXXeuc XXXXXXX
- 9. SJIS XXXXXXXXXXXXXXsjis XXXXXXX
- 10. SJIS XXXX au XXXXXXXXXXXXXXXsjis-au XXXXXXX
- 11. SJIS X i-mode XXXXXXXXXXXXXXXsjis-imode XXXXXXX
- 12. SJIS X dot-i XXXXXXXXXXXXXXXsjis-doti XXXXXXX
- 13. UTF-8 XXXXXXXXXXXutf8 XXXXXXX
- 14. XXXXXXXXXXXXXXXunknown XXXXXXX
- (XSX)
- 1. UTF-32 X BOM XXXXXutf32 XXXXXXX
- 2. UTF-16 X BOM XXXXXutf16 XXXXXXX
- 3. XXXXXXXXXX, XXXXXXXXXXXXXXXXXXXXXXXXXX.
ascii / euc / sjis / jis / utf8 / utf32-be / utf32-le / sjis-jsky /
sjis-imode / sjis-au / sjis-doti
- 4. XXXXXXXXXXXXXXX, XXXXXXXXXXXXXX, XXXXXXXX.
utf32-be / utf32-le / ascii / jis / euc / sjis / sjis-jsky /
sjis-imode / sjis-au / sjis-doti / utf8
- 5. XXXXXXXXXXXXXXXunknown XXXXXXX
- XXXXXXXXXXXXXXXXXXXXXXXXXXX
- o UTF-8 XXXXXXSJISXXXXXXXXXXXXXXXXXX
- o UCS2 XXXXXXXXXXXX
- o UTF-16 X BOM XXXXXXXXXXXXXXX
- o XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
&#dddd; XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- XSXPurePerlXX, XXXXXXXXXXXXXXXXX, XXXXXXXXXXXXXXXX. XX,
XXXXXXXXXXXXXsjisXXX, PurePerlXXsjisXXXXXXX XSXXXXXXXX. XXXsjis-jskyXXXXXXXXXXXXX. XX, XX XXXXXXXXXXXXX, euc-jpXXXXX,
XXXXXXXXXXXXXXXXX XXXXXX.
- $code = $s->getcodelist($str)
$str: XXX
$code: XXXXXXXXXXX
- XXXXXXX($str)XXXXXXXXXXXXXXX
- getcode XXXX, XXXXXXXXXXXXXXX XXXXXXX.
- $str = $s->conv($ocode, $encode)
$ocode: XXXXX (XXXXXX)
utf8 ucs2 ucs4 utf16
sjis cp932 euc euc-jp jis
sjis-imode sjis-imode1 sjis-imode2
utf8-imode utf8-imode1 utf8-imode2
sjis-doti sjis-doti1
sjis-jsky sjis-jsky1 sjis-jsky2
jis-jsky jis-jsky1 jis-jsky2
utf8-jsky utf8-jsky1 utf8-jsky2
sjis-au sjis-au1 sjis-au2
jis-au jis-au1 jis-au2
sjis-icon-au sjis-icon-au1 sjis-icon-au2
euc-icon-au euc-icon-au1 euc-icon-au2
jis-icon-au jis-icon-au1 jis-icon-au2
utf8-icon-au utf8-icon-au1 utf8-icon-au2
binary
- ( XXX.)
- XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
$encode: XXXXXXXXXXXXXX
$str: XXX
- XXXXXXXXXXXXXXXXXXXXXXXXXXX
- XXXXXXXXXX'base64' XXXXXXXXX base64 XXXXXXXXXbase64 XXXXXXXX
XXXXXXXXXX
- perl-5.8.0 XXXXXX, XXX utf-8 XXXXXXXXXXXXXXXXX.
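As a small illustration of conv, the sketch below converts one character to two of the output codes listed above and prints the result as hex. It assumes Unicode::Japanese is installed. The last call passes 'base64' as $encode; its exact output format (for example line breaks) is not asserted here.

```perl
use strict;
use warnings;
use Unicode::Japanese;

# U+3042 (HIRAGANA LETTER A) as UTF-8 bytes.
my $s = Unicode::Japanese->new("\xE3\x81\x82");

my $sjis = $s->conv('sjis');
my $euc  = $s->conv('euc');

# Hex dumps keep the output readable on any terminal.
printf "sjis: %s\n", unpack('H*', $sjis);
printf "euc:  %s\n", unpack('H*', $euc);

# With 'base64' as the second argument the converted bytes are
# returned base64-encoded rather than raw.
my $b64 = $s->conv('sjis', 'base64');
print "sjis/base64: $b64\n";
```

conv is convenient when the output code is only known at run time; when it is fixed, the dedicated methods such as sjis and euc read more directly.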
- $s->tag2bin
XXXXXXXXX &#dddd; XXXXXXXXXXXXXXXXXXXXXXXXX
- $s->z2h
XXXXXXXXXXXX
- $s->h2z
XXXXXXXXXXXX
- $s->hira2kata
XXXXXXXXXXXXXXXX
- $s->kata2hira
XXXXXXXXXXXXXXXX
- $str = $s->jis
$str: JIS XXXXXXXXXXXXXXX
- XXXX JIS (ISO-2022-JP) XXXXXXXXXXX
- $str = $s->euc
$str: euc-jp XXXXXXXXXXXXXXX
- XXXX EUC-JP XXXXXXXXXXX
- $str = $s->utf8
$str: utf-8 XXXXXXXXXXXXXXX
- XXXX UTF-8 XXXXXXXXXXX
- perl-5.8.0 XXXXXXX, XXXXXXXXX.
- $str = $s->ucs2
$str: ucs2 XXXXXXXXXXXXXXX
- XXXX UCS2 XXXXXXXXXXX
- $str = $s->ucs4
$str: ucs4 XXXXXXXXXXXXXXX
- XXXX UCS4 XXXXXXXXXXX
- $str = $s->utf16
$str: utf-16 XXXXXXXXXXXXXXX
- XXXX UTF-16 XXXXXXXXXXX BOMXXXXXXX XXXXXXXXXXXXXXXXXX
- $str = $s->sjis
$str: sjis XXXXXXXXXXXXXXX
- XXXX SJIS (MS-CP932) XXXXXXXXXXX
- $str = $s->sjis_imode
$str: sjis/imodeXXX XXXXXXXXXXXXXXX
- XXXX i-mode XXXXX SJIS XXXXXXXXXXX XXXimodeXXXXXXXX.
- $str = $s->sjis_imode1
$str: sjis/imode XXX XXXXXXXXXXXXXXX
- XXXX i-mode XXXXX SJIS XXXXXXXXXXX XXXXXXXXXXXXX.
- $str = $s->sjis_imode2
$str: sjis/imode XXX XXXXXXXXXXXXXXX
- XXXX i-mode XXXXX SJIS XXXXXXXXXXX XXXXX, XXXXXXXXXX.
- $str = $s->sjis_doti
$str: sjis/dot-i XXX XXXXXXXXXXXXXXX
- XXXX dot-i XXXXX SJIS XXXXXXXXXXX
- $str = $s->sjis_jsky
$str: sjis/j-sky XXX XXXXXXXXXXXXXXX
- XXXX j-sky XXXXX SJIS XXXXXXXXXXX XXXj-skyXXX(VERSION 0.15 XX, jsky2)XXXXX.
- $str = $s->sjis_jsky1
$str: sjis/j-sky XXX XXXXXXXXXXXXXXX
- XXXX j-sky XXXXX SJIS XXXXXXXXXXX Page 1-3 XXXXXXXXXXX.
- $str = $s->sjis_jsky2
$str: sjis/j-sky XXX XXXXXXXXXXXXXXX
- XXXX j-sky XXXXX SJIS XXXXXXXXXXX Page 1-6 XXXXXXXXX.
- $str = $s->sjis_icon_au
$str: sjis/AU iconXX XXXXXXXXXXXXXXX
- XXXX AU XXXXX SJIS XXXXXXXXXXX
- $str_arrayref = $s->strcut($len)
$len: XXXXXXX(XXXX)
$str_arrayref: XXX
- $lenXXXXXXXXX(XX)XXXXXXXXXXXXXXXX
- XXXXXXX, utf-8 XXXXXXXutf-8XXXXX.
- $len = $s->strlen
$len: XXXXXXX
- UTF-8 XXXXXX length() XXXXXXXXXXXXXXXXX 3 XXXXXXXXXXX XXXXXXXXXXXXXXXX SJIS XXXXXXXXXXXXXXXXXX 2 XXXXXX
- $s->join_csv(@values);
@values: XXXXX
- XXX CSV XXXXXXXXXXXXXXXXXXXXX XXXXXXXXXX("\n")XXXXXXXX
- @values = $s->split_csv;
@values: XXXXX
- XXXXXXXXXXXXXXXXXX CSV XXXXXXXXXXXXXX
XXXXXXXXXXX("\n")XXXXXXXXXXXXXXXX
- XXX binary XXXXX utf-8 XXXXXXXX. binary XXXXXXXXXXXXXXX.
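The strlen entry above contrasts length() on UTF-8 data (3 per full-width character) with an SJIS-style count (2 per full-width character). The following core-Perl sketch illustrates that counting idea only. It is not the module's implementation; the helper name width_len_sketch and the rule "every non-ASCII character counts as 2" are assumptions made for the example.

```perl
use strict;
use warnings;

# Sketch only: every non-ASCII character is counted as 2 and every
# ASCII character as 1. This rule is an assumption for illustration.
sub width_len_sketch {
    my ($utf8_bytes) = @_;
    my $chars = $utf8_bytes;                 # utf8::decode works in place
    utf8::decode($chars) or die "input is not valid UTF-8\n";
    my $len = 0;
    $len += (ord($_) < 0x80 ? 1 : 2) for split //, $chars;
    return $len;
}

# Three hiragana letters (U+3042, U+3044, U+3046) as UTF-8 bytes.
my $hiragana = "\xE3\x81\x82\xE3\x81\x84\xE3\x81\x86";

printf "length(): %d\n", length($hiragana);            # counts bytes
printf "sketch:   %d\n", width_len_sketch($hiragana);  # counts 2 per letter
```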
SUPPORTED ENCODINGS
+---------------+----+-----+-------+
|encoding       | in | out | guess |
+---------------+----+-----+-------+
|auto           : OK : --  | ----- |
+---------------+----+-----+-------+
|utf8           : OK : OK  | OK    |
|ucs2           : OK : OK  | ----- |
|ucs4           : OK : OK  | ----- |
|utf16-be       : OK : --  | ----- |
|utf16-le       : OK : --  | ----- |
|utf16          : OK : OK  | OK(#) |
|utf32-be       : OK : --  | OK    |
|utf32-le       : OK : --  | OK    |
|utf32          : OK : --  | OK(#) |
+---------------+----+-----+-------+
|sjis           : OK : OK  | OK    |
|cp932          : OK : OK  | ----- |
|euc            : OK : OK  | OK    |
|euc-jp         : OK : OK  | ----- |
|jis            : OK : OK  | OK    |
+---------------+----+-----+-------+
|sjis-imode     : OK : OK  | OK    |
|sjis-imode1    : OK : OK  | ----- |
|sjis-imode2    : OK : OK  | ----- |
|utf8-imode     : OK : OK  | ----- |
|utf8-imode1    : OK : OK  | ----- |
|utf8-imode2    : OK : OK  | ----- |
+---------------+----+-----+-------+
|sjis-doti      : OK : OK  | OK    |
|sjis-doti1     : OK : OK  | ----- |
+---------------+----+-----+-------+
|sjis-jsky      : OK : OK  | OK    |
|sjis-jsky1     : OK : OK  | ----- |
|sjis-jsky2     : OK : OK  | ----- |
|jis-jsky       : OK : OK  | ----- |
|jis-jsky1      : OK : OK  | ----- |
|jis-jsky2      : OK : OK  | ----- |
|utf8-jsky      : OK : OK  | ----- |
|utf8-jsky1     : OK : OK  | ----- |
|utf8-jsky2     : OK : OK  | ----- |
+---------------+----+-----+-------+
|sjis-au        : OK : OK  | OK    |
|sjis-au1       : OK : OK  | ----- |
|sjis-au2       : OK : OK  | ----- |
|jis-au         : OK : OK  | ----- |
|jis-au1        : OK : OK  | ----- |
|jis-au2        : OK : OK  | ----- |
|sjis-icon-au   : OK : OK  | ----- |
|sjis-icon-au1  : OK : OK  | ----- |
|sjis-icon-au2  : OK : OK  | ----- |
|euc-icon-au    : OK : OK  | ----- |
|euc-icon-au1   : OK : OK  | ----- |
|euc-icon-au2   : OK : OK  | ----- |
|jis-icon-au    : OK : OK  | ----- |
|jis-icon-au1   : OK : OK  | ----- |
|jis-icon-au2   : OK : OK  | ----- |
|utf8-icon-au   : OK : OK  | ----- |
|utf8-icon-au1  : OK : OK  | ----- |
|utf8-icon-au2  : OK : OK  | ----- |
+---------------+----+-----+-------+
|ascii          : OK : --  | OK    |
|binary         : OK : OK  | ----- |
+---------------+----+-----+-------+
(#): guessed when it has bom.
- XXXXXXXX
1. utf32 (#)
2. utf16 (#)
3. utf32-be
4. utf32-le
5. ascii
6. jis
7. sjis-jsky (pp)
8. euc
9. sjis
10. sjis-jsky (xs)
11. sjis-au
12. sjis-imode
13. sjis-doti
14. utf8
15. unknown
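The ordering matters because earlier checks shadow later ones. The core-Perl sketch below models only a few entries of the list above (the two BOM checks, ascii, utf8 and unknown) to show the idea. It is a simplified illustration, not the module's detection code; guess_encoding_sketch is a hypothetical helper, and the jis, euc, sjis and carrier-specific steps are deliberately omitted.

```perl
use strict;
use warnings;

# Simplified sketch of the guessing order: only the BOM, ascii and utf8
# steps are modelled. Inputs that the real list would classify as jis,
# euc or sjis fall through to later steps here.
sub guess_encoding_sketch {
    my ($bytes) = @_;

    # UTF-32 is tested before UTF-16 because the UTF-32LE BOM
    # (FF FE 00 00) starts with the UTF-16LE BOM (FF FE).
    return 'utf32' if $bytes =~ /\A(?:\x00\x00\xFE\xFF|\xFF\xFE\x00\x00)/;
    return 'utf16' if $bytes =~ /\A(?:\xFE\xFF|\xFF\xFE)/;

    # No ESC and no byte with the high bit set: plain ascii.
    return 'ascii' if $bytes !~ /[\e\x80-\xFF]/;

    # utf8::decode works in place, so test a copy of the input.
    my $copy = $bytes;
    return 'utf8' if utf8::decode($copy);

    return 'unknown';
}

for my $sample ("\xFF\xFE\x00\x00A\x00\x00\x00", "\xFE\xFF\x30\x42",
                "hello", "\xE3\x81\x82", "\x82\xA0") {
    printf "%-16s => %s\n", unpack('H*', $sample),
        guess_encoding_sketch($sample);
}
```

The last sample is SJIS data; the sketch reports it as unknown only because the sjis step is not modelled.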
DESCRIPTION OF UNICODE MAPPING
Unicode XXXXXXXXXXXXXXXXXXXX
- Shift_JIS
- MS-CP932 XXX Unicode XXXXXXXXXXXX XXXXXXXXXXXXXURLXXXXXXXXXXXX
- <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT>
- Unicode XX SJIS XXXXXXXXXXXXXXXXXXXXXXXXX XXXXX &#dddd; XXXXXXXXX
XXXXXXXXXXX?XXXXXXXXX
- XXXXXXXX SJIS XXXXXXXXXXXXXXXXXXXXXX?XXXXXXXXX
- EUC-JP/ISO-2022-JP
XXSJISXXXXXXXXXXXUnicode XXXXXXXXXX XXXXXSJIS XXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXX
- DoCoMo i-mode
F800 - F9FF XXXXXXXXXXXXXXXXXXU+0FF800 - U+0FF9FF XXXXXXXXXXXXX
- ASTEL dot-i
F000 - F4FF XXXXXXXXXXXXXXXXXXU+0FF000 - U+0FF4FF XXXXXXXXXXXXX
- J-PHONE J-SKY
J-SKY XXXXXXXXXXXXXXXX "\e\$" XXXXXXX1XXXXX
1XXXXXXX2XXXXX"\x0f"XXXXXXX 1XXXXXXXXXXXXXXXXX2XXXXXXXXXXXXXXXX
XXXXXXXXXXXX
- XX1XXXXX2XXXXXXXX1XXXXXXXX4500 - 47FF XXXXX U+0FFB00 - U+0FFDFF
XXXXXXXXXXXXX
- Unicode::Japanese XXXUnicode XX J-SKY XXXXXXXXXXXXXX
1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- AU
XXXXXXXXXXXXU+0FF500 - U+0FF6FF XXXXXXXXXXXXX
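The pictogram ranges above can be read as simple offsets into the private area. The core-Perl sketch below works through that arithmetic for the three carriers whose source ranges are stated (i-mode, dot-i, J-SKY). It assumes the mapping is linear within each range; the table and the helper emoji_to_private are hypothetical names for this example, not part of Unicode::Japanese.

```perl
use strict;
use warnings;

# Each entry: [first carrier code, last carrier code, first private
# code point], taken from the ranges listed above.
my %EMOJI_RANGE = (
    imode => [ 0xF800, 0xF9FF, 0x0FF800 ],
    doti  => [ 0xF000, 0xF4FF, 0x0FF000 ],
    jsky  => [ 0x4500, 0x47FF, 0x0FFB00 ],
);

# Returns the private code point, or nothing when the carrier is not
# in the table or the code lies outside the carrier's range.
# Assumption: codes map linearly within the range.
sub emoji_to_private {
    my ($carrier, $code) = @_;
    my $range = $EMOJI_RANGE{$carrier} or return;
    my ($first, $last, $base) = @$range;
    return if $code < $first || $code > $last;
    return $base + ($code - $first);
}

printf "imode F89F -> U+%06X\n", emoji_to_private('imode', 0xF89F);
printf "jsky  4521 -> U+%06X\n", emoji_to_private('jsky',  0x4521);
```

For i-mode and dot-i the offset is a constant 0x0F0000, so the carrier code stays visible in the low 16 bits; J-SKY codes are shifted so they land in U+0FFB00 - U+0FFDFF.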
PurePerl mode
- use Unicode::Japanese qw(PurePerl);
- use XXXXX 'PurePerl' XXXXXXX, XSXXXXXXXXXXXXXXXXXX.
BUGS
XXXXXX "bug-unicode-japanese at rt.cpan.org" XX XXXXXXXX. XXXX
<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Unicode-Japanese>. XXX
web XXXXXXXXXXXXXXXXXX. XXXXXXXXXX, XXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXX.
- o EUC-JPXJIS XXXXXSJIS XXXXXXXX UTF-8 XXXXXXXXX SJIS
- XXXXXXXXXXXXXXXXXXXXXXXXXXX
- o XSXXXXXXXXXXEUC-JPXSJIS(XXXXX)XXXXXXXXX \e XXXXXXXEUC-JPXSJIS
XXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXX
- o Japanese.pm XXXXXXXXXXXXXXXXXXFTP X ASCII XXXX XXXXXXXXXXXXXXX
SUPPORT
- XXXXXXXXXXXXXXX perldoc XXXXXXXXXXXXXX.
- perldoc Unicode::Japanese
- XX, XXXXXXXXXXXXXXXX:
- o AnnoCPAN: Annotated CPAN documentation
<http://annocpan.org/dist/Unicode-Japanese>
- o CPAN Ratings
<http://cpanratings.perl.org/d/Unicode-Japanese>
- o RT: CPAN's request tracker
<http://rt.cpan.org/NoAuth/Bugs.html?Dist=Unicode-Japanese>
- o Search CPAN
<http://search.cpan.org/dist/Unicode-Japanese>
CREDITS
Thanks very much to:
NAKAYAMA Nao
SUGIURA Tatsuki & Debian JP Project
COPYRIGHT & LICENSE
Copyright 2001-2008 SANO Taku (SAWATARI Mikage) and YAMASHINA Hio, all
rights reserved.
- XXXXXXXXXXXXXXXXXXXXXXXX Perl XXX XXXXXX XXXXXXXXXXXXXXXXXXX.