Saturday, August 30, 2008

Re: [GENERAL] ERROR: relation . . . does not exist

On Aug 30, 2008, at 9:19 AM, Christophe wrote:

>
> On Aug 30, 2008, at 6:26 AM, Albretch Mueller wrote:
>> Well, then obviously there is the need for it and you were not
>> successful enough at convincing these developers that they were
>> "confusing postgresql with a spreadsheet program"
>
> The behavior you are looking for is typical of a spreadsheet,
> because spreadsheet programs (generally; I'm sure there are
> exceptions) don't have the notion of a schema; each cell can hold
> its own particular type. That being said, the automatic type-
> guessing that Excel, say, provides is far from foolproof; I've
> probably spent more time cleaning up Excel's bad guesses than would
> have been saved by my just specifying a type for each column.
>
> As has been noted, text representations of values are extremely
> ambiguous as to which Postgres type they mean... and, of course, you
> could have user-defined domains and types as well. It's true that
> it could take a wild guess, but that's not a traditional part of a
> database engine.
>
> That being said, it would not be too hard to write a client that
> accepted a CSV or tab-delimited file, parsed the header into column
> names, and then scanned the values of the columns to take a
> reasonable guess as to the column type from a highly limited set of
> possibilities. This is probably one of those classic "twenty lines
> of Perl" problems.

About 150 lines of Perl[1]. It can actually work quite well, but it is
entirely a client-side problem. None of that sort of heuristic should
go anywhere near COPY in.

> It doesn't seem as though COPY INTO is the right place for that,
> since the particular guesses and set of types that one would make
> strike me as very closely tied to your particular application domain.

Cheers,
Steve

[1] A validator (regex) for each data type; then, for each column, track
which data types it may still be as you scan through the file. Use the
relative priorities of the different data types to assign something
appropriate to each column, then do a second pass translating the
values into a format PostgreSQL is comfortable with and feed them into
pg_putcopydata.
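
The two passes described in [1] can be sketched roughly as follows. This
is a minimal illustration in Python rather than the Perl script mentioned
above, and everything in it is an assumption made for the example: the
set of candidate types, the regexes, the priority order, treating empty
fields as NULL, and the function names. It only builds the CREATE TABLE
statement and the COPY text-format payload as strings; handing them to
the server is left to whatever COPY interface your driver offers.

```python
import csv
import io
import re

# Candidate types in priority order (most specific first). The types and
# regexes are illustrative assumptions, not the original script's validators.
# They check shape only: "2008-99-99" passes here and would fail on load.
VALIDATORS = [
    ("boolean", re.compile(r"^(true|false|t|f)$", re.IGNORECASE)),
    ("integer", re.compile(r"^[+-]?\d{1,9}$")),
    ("bigint", re.compile(r"^[+-]?\d{1,18}$")),
    ("numeric", re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)$")),
    ("date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),
]
FALLBACK = "text"


def guess_column_types(rows, ncols):
    """First pass: per column, discard every type whose validator fails."""
    patterns = dict(VALIDATORS)
    candidates = [[name for name, _ in VALIDATORS] for _ in range(ncols)]
    seen = [False] * ncols
    for row in rows:
        for i, value in enumerate(row[:ncols]):
            if value == "":  # assumed to mean NULL, so it rules nothing out
                continue
            seen[i] = True
            candidates[i] = [t for t in candidates[i] if patterns[t].match(value)]
    # Highest-priority survivor wins; all-empty or unmatched columns are text.
    return [c[0] if s and c else FALLBACK for c, s in zip(candidates, seen)]


def quote_ident(name):
    """Quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def copy_line(row):
    """Second pass: one row in COPY text format (tab-separated, \\N for NULL)."""
    def esc(value):
        if value == "":
            return r"\N"
        return (value.replace("\\", "\\\\").replace("\t", "\\t")
                     .replace("\n", "\\n").replace("\r", "\\r"))
    return "\t".join(esc(v) for v in row) + "\n"


def plan_import(csv_text, table):
    """Return (CREATE TABLE statement, COPY payload) for CSV text with a header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)
    types = guess_column_types(rows, len(header))
    cols = ", ".join(f"{quote_ident(n)} {t}" for n, t in zip(header, types))
    ddl = f"CREATE TABLE {quote_ident(table)} ({cols});"
    return ddl, "".join(copy_line(r) for r in rows)
```

Calling plan_import(text, "people") on a small file gives you the DDL to
run first and the payload to stream afterwards. The guesses stay on the
client, where the priority list can be tuned to the application, which
is the point made above about keeping them out of COPY.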

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
