Saturday, June 14, 2008

Re: [GENERAL] Backup using GiT?

On Fri, Jun 13, 2008 at 11:11 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Tom Lane wrote:
>> "James B. Byrne" <byrnejb@harte-lyne.ca> writes:
>
>> > GiT works by compressing deltas of the contents of successive versions of file
>> > systems under repository control. It treats binary objects as just another
>> > object under control. The question is, are successive (compressed) dumps of
>> > an altered database sufficiently similar to make the deltas small enough to
>> > warrant this approach?
>>
>> No. If you compress it, you can be pretty certain that the output will
>> be different from the first point of difference to the end of the file.
>> You'd have to work on uncompressed output, which might cost more than
>> you'd end up saving ...
>
> The other problem is that since the tables are not dumped in any
> consistent order, it's pretty unlikely that you'd get any similarity
> between two dumps of the same table. To get any benefit, you'd need to
> get pg_dump to dump sorted tuples.
>
> --
> Alvaro Herrera                                http://www.CommandPrompt.com/
> The PostgreSQL Company - Command Prompt, Inc.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

The idea of using Git for backing up databases is not that bad.

I would propose the following:
-- dump the creation script to a separate file (or perhaps one file
per object: table, view, function, etc.);
-- dump the contents of each table to its own file;
-- dump the tuples sorted, but in plain text (as COPY data or
INSERTs, maybe), as Alvaro suggested;
-- don't use compression (as Tom and Chander suggested), because
Git already compresses its pack files.
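
The steps above could be sketched roughly as follows. This is only a planning helper under stated assumptions, not a tested backup tool: the database name, the table-to-sort-columns mapping, and the output layout are all hypothetical, and nothing is executed. It relies on pg_dump --schema-only and on COPY (SELECT ...) TO STDOUT, which needs PostgreSQL 8.2 or later.

```python
import shlex


def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def plan_dump(dbname, tables, outdir="dump"):
    """Return (output_path, argv) pairs for a Git-friendly dump.

    `tables` maps a table name to the columns to sort by (a hypothetical
    input, typically the primary key). Nothing is executed here.
    """
    # The schema goes into its own uncompressed file.
    plan = [(f"{outdir}/schema.sql", ["pg_dump", "--schema-only", dbname])]
    # One plain-text file per table, rows sorted so that successive dumps
    # line up with each other.
    for table in sorted(tables):
        order_by = ", ".join(quote_ident(c) for c in tables[table])
        sql = (f"COPY (SELECT * FROM {quote_ident(table)} "
               f"ORDER BY {order_by}) TO STDOUT")
        plan.append((f"{outdir}/data/{table}.copy",
                     ["psql", "-d", dbname, "-c", sql]))
    return plan


if __name__ == "__main__":
    # Print each command with the file its stdout would be redirected to.
    for path, argv in plan_dump("mydb", {"orders": ["id"], "users": ["id"]}):
        print(shlex.join(argv), ">", path)
```

Each command's standard output would be written to the listed file, followed by git add -A and git commit in the dump directory. Sorting by the primary key is a choice made for this sketch; any stable, total ordering would do.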

One advantage of using Git in the manner described above would be
change tracking: with just a simple git diff you could see the
modifications (inserts, updates, deletes, schema alterations, etc.).
Going a step further, you could also do merges between multiple
databases with the same structure (each database would have its own
branch).
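
To illustrate why sorted plain-text dumps matter for change tracking, here is a small sketch using Python's difflib as a stand-in for git diff. The rows are made-up COPY-style lines (id, tab, name), and the "unsorted" order is simply an arbitrary permutation, assumed for illustration.

```python
import difflib


def changed_lines(old, new):
    """Count the lines a line-based diff would report as added or removed."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return sum((i2 - i1) + (j2 - j1)
               for tag, i1, i2, j1, j2 in matcher.get_opcodes()
               if tag != "equal")


# Hypothetical dump of a four-row table, sorted by id.
before = ["1\talice", "2\tbob", "3\tcarol", "4\tdave"]

# The same table after a single UPDATE of row 2.
after = ["1\talice", "2\trobert", "3\tcarol", "4\tdave"]

# The same rows as `after`, in an arbitrary order such as an unsorted
# dump might produce.
after_unsorted = ["3\tcarol", "1\talice", "4\tdave", "2\trobert"]

print("sorted dump:  ", changed_lines(before, sorted(after)))
print("unsorted dump:", changed_lines(before, after_unsorted))
```

With the sorted dump, the single UPDATE shows up as one removed and one added line. With the shuffled dump, the diff also reports rows that did not change at all, which is the problem Alvaro pointed out.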

Just imagine how simple a database schema upgrade would be in most
situations, when both the development and the deployed schemas have
been modified and we want to bring them back into sync.

In conclusion, I would subscribe to such an idea.

Ciprian Craciun.

