Wednesday, June 11, 2008

Re: [GENERAL] encoding confusion

Sim Zacks wrote:
> The data in the longblob field might be text, which could be causing the
> confusion. For example, when I look at the data in the longblob field, I
> see \n for a newline and when I look at the bytea it is 012.

That's right: newline is ASCII 10 in decimal, which is 012 in octal.
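
As a quick sanity check of that equivalence, here is a minimal Python sketch. It only shows that the newline byte has the value 10 and how that value prints in octal; the variable names are illustrative and not taken from the transfer script.

```python
# The newline character as a single byte.
newline = b"\n"

# Indexing a bytes object yields the integer value of that byte.
code = newline[0]

# Zero-padded three-digit octal, the form that matches the 012 seen in the bytea.
octal_text = "%03o" % code

print(code, octal_text)
```

So the \n seen on one side and the 012 seen on the other can both describe the same byte.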

> I can only tell you what happened in the client end, in terms of
> corruption. I am using the Thunderbird client. When I clicked on a
> message, it didn't show the data and when I looked at the headers, it
> was just a big mess. I'm guessing that somehow the newlines didn't work
> and the headers and message were overlaid on top of each other.

Well, that might be a problem with dmail's setup rather than the
database. I think headers are restricted to ASCII only (the body is a
different matter). The best bet is to establish for certain whether the
database is to blame.
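
If it helps, one way to test the ASCII-only assumption on a stored message is sketched below in Python. It assumes the raw message is available as bytes and that the header block ends at the first blank line; the function name is hypothetical and not part of any mail library.

```python
def non_ascii_header_bytes(raw: bytes):
    """Return (offset, byte) pairs for header bytes above 0x7F.

    Assumption: the headers end at the first blank line, which may be
    written as CRLF CRLF or as LF LF.
    """
    # Find the earliest header/body separator of either style.
    positions = [p for p in (raw.find(b"\r\n\r\n"), raw.find(b"\n\n")) if p != -1]
    end = min(positions) if positions else len(raw)
    headers = raw[:end]
    # Iterating over bytes yields integers, so compare against 0x7F directly.
    return [(offset, value) for offset, value in enumerate(headers) if value > 0x7F]
```

An empty list means the header block is plain ASCII, in which case the encoding of the headers is probably not what the mail client is tripping over.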

Find a problem entry, dump that one row to a file from MySQL, do the
same from PostgreSQL, and also from the midpoint in your Python code
doing the transfer. Then use a hex editor / dumper (e.g. "hexdump -C" on
Linux) to see which bytes differ between the files.
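
If you would rather do the comparison from Python than by eye, something along these lines should work. It is only a sketch: the function names and the file paths passed to compare_files are hypothetical, and it reads each dump fully into memory, which is fine for a single row.

```python
import itertools


def diff_bytes(a: bytes, b: bytes, limit: int = 20):
    """Return up to `limit` (offset, byte_a, byte_b) tuples where a and b differ.

    None stands for a position past the end of the shorter input.
    """
    diffs = []
    for offset, (x, y) in enumerate(itertools.zip_longest(a, b)):
        if x != y:
            diffs.append((offset, x, y))
            if len(diffs) >= limit:
                break
    return diffs


def compare_files(path_a, path_b, limit=20):
    """Compare two dump files byte by byte (the paths are whatever you dumped to)."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return diff_bytes(fa.read(), fb.read(), limit)
```

The first tuple returned is the interesting one. A 13 on one side against a 10 on the other, for instance, would mean a carriage return was dropped or added somewhere in the transfer, and every later offset differs only because the data shifted by one byte.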

--
Richard Huxton
Archonet Ltd

