Random Notes

Book impressions, technical notes, and the like

Perl's -C option

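When processing Unicode strings in Perl, printing data that has the UTF-8 flag set (a character string containing code points above 255) as-is, to a handle with no encoding layer, produces the warning "Wide character in print".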

$ perl -E 'say pack("U*", 0x3042)'
Wide character in print at -e line 1.
あ

One way around this is binmode:

$ perl -E 'binmode STDOUT, ":utf8"; say pack("U*", 0x3042)'
あ

The '-C' option does the same job in a way that better suits a one-liner.

$ perl -CO -E 'say pack("U*", 0x3042)'
あ

Besides O, quite a few other settings appear to be available, as listed in the perlrun documentation:

                I     1   STDIN is assumed to be in UTF-8
                O     2   STDOUT will be in UTF-8
                E     4   STDERR will be in UTF-8
                S     7   I + O + E
                i     8   UTF-8 is the default PerlIO layer for input streams
                o    16   UTF-8 is the default PerlIO layer for output streams
                D    24   i + o
                A    32   the @ARGV elements are expected to be strings encoded
                          in UTF-8
                L    64   normally the "IOEioA" are unconditional,
                          the L makes them conditional on the locale environment
                          variables (the LC_ALL, LC_CTYPE, and LANG, in the order
                          of decreasing precedence) -- if the variables indicate
                          UTF-8, then the selected "IOEioA" are in effect
                a   256   Set ${^UTF8CACHE} to -1, to run the UTF-8 caching code in
                          debugging mode.