Is it true that you shouldn't use Perl on Windows?

"Is it true that you shouldn't use Perl on Windows?"

Let's answer this question.

Perl supports Windows

First and foremost, the Perl core supports Windows.

Perl is written in C, but if you look at the source code, you'll find porting for Windows.

Distribution for Windows

Perl has distributions for Windows, Active Perl and Strawberry Perl.

Think of a distribution as an easy-to-install and easy-to-use version of Perl on Windows.

Both Active Perl and Strawberry Perl were released shortly after the latest version of Perl was released.

Why should some people not use Perl on Windows?

Why does some people say that Perl works fine on Windows and the distribution is up to date, but you shouldn't use Perl on Windows?

It's a Japanese problem. It is a criticism that Japanese is difficult to handle on Windows.

To write the conclusion first, Perl can handle Japanese correctly on Windows, and if you learn the correct manners, you will not be inconvenienced.

However, the fact that such criticisms came out means that Perl's Japanese processing certainly had a period of turmoil historically.

What is the confusion of Perl's Japanese processing?

When you search the Web, Perl has various Japanese processing methods.

jcode.pl, JCode.pm, Encode, encoding, JPerl, binmode, utf8, utf8::is_uft8.

So what exactly should we trust?

Why are the Japanese processing methods that exist on the Web so mixed?

One reason is the transition of the times.

Perl 4 and Perl 5 early

that did not support Japanese correctly

Initially in Perl 4 and Perl 5, there was no Unicode support.

Therefore, when I tried to use Perl with cp932 (a derivative of Shift_JIS) of Windows at this time, Japanese processing was inconvenient.

Therefore, JPerl was developed by applying a patch to Perl that can handle the character code of cp932 as it is.

This is a patch to Perl itself. In other words, jperl is a derivative of Perl, not Perl itself.

Apart from this, there was also a way to handle Japanese processing in Perl by using Perl and a character encoding module called jcode.pl.

Jcode.pm is a modularization of jcode.pl.

In the latter half of Perl 5, by supporting Unicode, Perl itself has become a trend to handle multi-byte characters like Japanese.

Unicde was experimentally supported in Perl 5.6, and Encode was introduced in Perl 5.8. Perl 5.8.7 is free from bugs around Unicode.

This is between 2002 and 2007.

Japanese processing, time of turmoil

Of course, just because Unicode is supported doesn't mean you can easily change to a new habit because you have the old habit.

In fact, refactoring a program that uses jcode.pl or JCode.pm into the new Unicode mechanism takes a lot of effort.

From the JPerl side, it was a fact that it would take a lot of work to use Perl 5.8's Unicode support for convenience.

Also, the new Unicode mechanism is a mechanism that clearly distinguishes between the internal character code and the external character code, so the concept is fundamentally different from the encoding mechanism of jcode.pl and JCode.pm. ..

The new Unicode mechanism was inconvenient and unclear. When I combined the characters, they were garbled and I got a Wide Character warning, and I felt that this was a failure of the Perl core team.

There was a time when utf8::is_utf8 was misunderstood that Perl could determine if it was using an internal character code or an external character code.

It was misunderstood by people familiar with Perl, the Perl community, and CPAN authors.

There was a time of great turmoil.

But when I learned Perl and character encoding and looked back from 2020, I asked, "Is there a better way to introduce Unicode while maintaining backward compatibility with Perl?" I feel it.

When I think "what should I do?", I can't think of a better way.

I think Perl's core team has done the best implementation within the constraints.

The simple solution is encode when you go out, decode when you go inside

The simple and good policy of Perl's character code in 2020 is to use the encode function of the Encode module to convert it to the actual character code, and then use the decode function to convert it to the internal character code. ..

Save the program in UTF-8 and write "use utf 8" at the beginning.

Only this.

There are other ways to do it in terms of performance and ease of writing, but first it's a good idea to learn the above method, which is simple and can handle any pattern.

Details are explained in the following article.

Encode module-Properly handle multibyte strings such as Japanese

Frequently Asked Questions

How do you handle Japanese file names?

Encode to cp932 when passing it to the open function. The file name is outside the program (OS is outside the program), so encode it at the position where you call the open function.

use utf8;

my $file ='Ah.txt';

open my $fh,'<', encode ('cp932', $file)
  or die "Can't open file: $!";

The same applies to functions that receive file names other than open(unlink, File::Copy::copy, File::Copy::move).

If you have too many, subrouting it will make things a little easier.

use utf8;

sub cp932 {
  my ($file) = @_;
  
  $file = encode ('cp932', $file);
  
  return $file;
}

my $file ='Ah.txt';

open my $fh,'<', cp932 ($file)
  or die "Can't open file: $!";

Perl Philosophy