Being the thoughts and writings of one Gustaf Erikson; father, homeowner, technologist.

This category contains posts about computing, programming, and development.

Tuesday, 2006-01-31


Converting from ISO-8859-1 to UTF-8 in Perl

When I post my observations via email, Gmail converts any Swedish characters to quoted-printable ISO-8859-1. However, this blog is in UTF-8. This is how I translate the body of the mail message.

#!/usr/bin/perl -w
use strict;
use MIME::QuotedPrint qw( decode_qp );
use Encode qw( decode encode );
# split the mail message
my ( $headers, $body );
{
    local $/ = undef;
    ( $headers, $body ) = split( "\n\n", <STDIN>, 2 );
}
# decode the quoted-printable input
$body = decode_qp( $body );
# decode to Perl's internal format
$body = decode( 'iso-8859-1', $body );
# encode to UTF-8
$body = encode( 'utf-8', $body );
print $body, "\n";

The result is piped into a second script that formats the actual posting.

Pretty basic, eh? But until you know how, it can be a bit frustrating getting this to work.
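
For anyone who wants to watch the bytes change, here is a minimal sketch of the same three steps wrapped in a function and run on a sample string. The helper name qp_latin1_to_utf8 and the sample input are made up for illustration; they are not part of the script above. The output is printed as hex so that the terminal's own encoding doesn't confuse matters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::QuotedPrint qw( decode_qp );
use Encode qw( decode encode );

# Hypothetical helper (not in the original script): takes a
# quoted-printable ISO-8859-1 string and returns UTF-8 bytes.
sub qp_latin1_to_utf8 {
    my ($qp) = @_;
    # "=E4" becomes the single byte 0xE4
    my $latin1_bytes = decode_qp($qp);
    # ISO-8859-1 bytes become Perl's internal characters
    my $characters = decode( 'iso-8859-1', $latin1_bytes );
    # characters become UTF-8 bytes
    return encode( 'utf-8', $characters );
}

# Made-up sample: "räksmörgås" with ä, ö and å quoted-printable encoded
my $utf8 = qp_latin1_to_utf8('r=E4ksm=F6rg=E5s');

# prints: 72 c3 a4 6b 73 6d c3 b6 72 67 c3 a5 73
print join( ' ', map { sprintf '%02x', ord($_) } split //, $utf8 ), "\n";
```

Each Swedish character went in as one byte (e4, f6, e5) and comes out as two (c3 a4, c3 b6, c3 a5), which is what UTF-8 output should look like.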
