> On Thu, Aug 28, 2008 at 7:50 PM, Adrian Klaver <aklaver@comcast.net> wrote:
> > Define easily.
>
> ~
> OK, let me try to outline the approach I would go for:
> ~
> I think "COPY FROM CSV" should have three options, namely:
> ~
> 1) the way we have used it in which you create the table first
> ~
> 2) another way in which defaults are declared, generally as:
> ~
> 2.1) aggressive: data type, value and formatting analysis is done; if
> only 1 or 0 are found, then declare a BOOLEAN; if repeated data is
> found (say state codes) and the stratification nodes cover the rest of
> the data, stratify the data out to a separate table (they have a name
> I can't recall now) and index it ...; if data is kind of numeric with
> forward slashes and/or hyphens, could they possibly be dates? If they are
> definitely dates, convert them to bigint and do the formatting in the
> presentation code (this is also a win-win situation with i18n code) ...
> ~
> 2.2) conservative: data type and value analysis is done, but no
> formatting analysis, and the next larger encompassing data type is
> selected: say for 1 or 0 data use bytes [0, 255], for bytes use int, if
> something could be encoded as char(2), use varchar instead, ...
> ~
> 2.3) dumb: just use the coarsest data type possible; bigint for
> anything that looks like a number and varchar for the rest
> ~
> the "dumb" option should tell the DBA which option they are using,
> the quantified consequences of their decisions (larger DBs for no
> reason, approx. reduction in speed, ...) and how not to be "dumb"
> ~
> 3) or you could define "import templates" declaring which specific
> data types to use for data that looks a certain way, which could be
> declared per column using regexps
> ~
>
> > I could go on, but the point is that table data types require some
> > thought on the part of the DBA.
>
> ~
> Well, it still requires their minds and input, but they will have
> jobs even if they get some help, don't you think so ;-)
> ~
> lbrtchx
This is a combination of more work than necessary and putting the cart before
the horse. All I can see happening is delaying the point of decision to a
later time and/or dumping the decision process on someone else. There is
already a "dumb" solution that has been brought up many times on this list. It
involves creating a holding table that has text-only fields, copying the
data into it, and then moving the data from there to a final table. As far as
import templates go, I suggest looking at:
http://pgloader.projects.postgresql.org/
It also addresses some of your other suggestions. It does not automatically
create a table, though.
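
For anyone who has not seen the holding-table approach spelled out, here is a
rough sketch of the three steps. It is only an illustration: it uses Python's
built-in sqlite3 module with an in-memory database as a stand-in for
PostgreSQL so that it runs on its own, and the table names (holding,
customers), the columns and the sample data are all made up. In PostgreSQL
itself, step 1 would normally be a COPY of the file into the holding table
(something like COPY holding FROM '...' WITH CSV HEADER) rather than
row-by-row inserts.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; in practice this would be the CSV file on disk.
CSV_TEXT = """state,active,signup_date,amount
CA,1,2008-08-28,19.99
NY,0,2008-08-27,5
CA,1,2008-08-26,100.50
"""


def load_via_holding_table(conn, csv_text):
    """Load CSV text into a typed table by way of an all-text holding table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    # Step 1: holding table where every column is TEXT, so the raw load
    # cannot fail on a type mismatch.
    cols = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f"CREATE TABLE holding ({cols})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO holding VALUES ({placeholders})", data)

    # Step 2: final table whose types were chosen by a person who has
    # looked at the data (this is the decision that cannot be skipped).
    conn.execute(
        """
        CREATE TABLE customers (
            state       TEXT    NOT NULL,
            active      INTEGER NOT NULL,
            signup_date TEXT    NOT NULL,
            amount      REAL    NOT NULL
        )
        """
    )

    # Step 3: move the data across with explicit casts, then drop the
    # holding table.
    conn.execute(
        """
        INSERT INTO customers (state, active, signup_date, amount)
        SELECT state,
               CAST(active AS INTEGER),
               date(signup_date),
               CAST(amount AS REAL)
        FROM holding
        """
    )
    conn.execute("DROP TABLE holding")

    return conn.execute(
        "SELECT state, active, signup_date, amount "
        "FROM customers ORDER BY signup_date"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for row in load_via_holding_table(connection, CSV_TEXT):
        print(row)
```

The point of the sketch is that the type decisions live in step 2 and the
casts in step 3, where a bad value shows up as a visible conversion problem
instead of being guessed around at load time.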
--
Adrian Klaver
aklaver@comcast.net
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general