Tuesday, June 3, 2008

Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak

On May 28, 2008, at 1:22 PM, Gregory Stark wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> "Tomasz Rybak" <bogomips@post.pl> writes:
>>> I tried to use COPY to import 27M rows to table:
>>> CREATE TABLE sputnik.ccc24 (
>>>     station CHARACTER(4) NOT NULL REFERENCES sputnik.station24 (id),
>>>     moment INTEGER NOT NULL,
>>>     flags INTEGER NOT NULL
>>> ) INHERITS (sputnik.sputnik);
>>> COPY sputnik.ccc24(id, moment, station, strength, sequence, flags)
>>>     FROM '/tmp/24c3' WITH DELIMITER AS ' ';
>>
>> This is expected to take lots of memory because each row-requiring-check
>> generates an entry in the pending trigger event list.  Even if you had
>> not exhausted memory, the actual execution of the retail checks would
>> have taken an unreasonable amount of time.  The recommended way to do
>> this sort of thing is to add the REFERENCES constraint *after* you load
>> all the data; that'll be a lot faster in most cases because the checks
>> are done "in bulk" using a JOIN rather than one-at-a-time.
>
> Hm, it occurs to me that we could still do a join against the pending
> trigger event list...  I wonder how feasible it would be to store the
> pending trigger event list in a temporary table instead of in RAM.
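
For reference, the bulk-load pattern Tom recommends looks roughly like
this for the table from the bug report (a minimal sketch; the constraint
name is an assumption):

    -- Load first, without the FK in place, then validate in bulk.
    CREATE TABLE sputnik.ccc24 (
        station CHARACTER(4) NOT NULL,   -- no REFERENCES yet
        moment  INTEGER NOT NULL,
        flags   INTEGER NOT NULL
    ) INHERITS (sputnik.sputnik);

    COPY sputnik.ccc24(id, moment, station, strength, sequence, flags)
        FROM '/tmp/24c3' WITH DELIMITER AS ' ';

    -- Adding the constraint afterward checks all existing rows with a
    -- single join instead of queuing one trigger event per row.
    ALTER TABLE sputnik.ccc24
        ADD CONSTRAINT ccc24_station_fkey
        FOREIGN KEY (station) REFERENCES sputnik.station24 (id);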


Related to that, I really wish our statement-level triggers provided NEW
and OLD recordsets the way some other databases do. That would allow RI
triggers to run on a per-statement basis and aggregate the keys to be
checked.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
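
What Jim describes here eventually arrived in PostgreSQL 10 as transition
tables. A minimal sketch of a set-oriented check using them follows; the
orders/customers tables and the function name are hypothetical, and
EXECUTE FUNCTION is the PostgreSQL 11+ spelling:

    -- Statement-level trigger with a transition table (PostgreSQL 10+).
    CREATE FUNCTION check_order_customers() RETURNS trigger AS $$
    BEGIN
        -- One set-oriented check for the whole statement, instead of one
        -- queued trigger event per inserted row.
        IF EXISTS (
            SELECT 1
            FROM new_rows n
            WHERE NOT EXISTS (
                SELECT 1 FROM customers c WHERE c.id = n.customer_id)
        ) THEN
            RAISE EXCEPTION 'insert on orders violates reference to customers';
        END IF;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER orders_customer_check
        AFTER INSERT ON orders
        REFERENCING NEW TABLE AS new_rows
        FOR EACH STATEMENT
        EXECUTE FUNCTION check_order_customers();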
