Thursday, May 29, 2008

Re: [HACKERS] [PERFORM] Memory question on win32 systems



Merlin Moncure wrote:
On Thu, May 29, 2008 at 10:19 AM, Justin <justin@emproshunts.com> wrote:   
To my understanding Postgresql only caches queries and results in memory for that specific connection.  So when that connection is closed those cached results are cleared out.    So cached indexs and queries are for that connection only.   I hope my understanding is correct.     
 completely wrong.  They are called 'shared' buffers for a reason.  Also you are missing the point of the o/s file cache which lies under that.  If you have a computer with 4gb ram that is addressable, all its memory is used for caching at all times, period, less what operating system needs or what is used by running programs or temporary demands (sorting, etc).  Also, postgresql doesn't as a rule cache 'results and queries'.  shared buffers setting reserves memory for postgresql's internal cache (plus some housekeeping things like locks)...which can be faster than the o/s cache because it is more tightly integrated with the backend. However a page fault to disk is much more interesting in performance terms than the performance differences between shared buffers and o/s cache.  merlin    
This is not meant to be argumentative i'm trying to clearify what is going because documentation for different sources seem to be in conflict  or confusing. 
---------------------------------------------------
From Postgresql Second Edition:
    Shared_Buffers: This cache is shared by all clients connected to a single cluster. DISK I?) (and cache I/O) is performed in 8KB chunks.  The shared_buffers parameter determines how many 8KB will be created in the shared cache.

from here explanation is different
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html  it referrs to queries which means to me the SQL command and the result sets (but that is wrong)

We have sort_mem aka work_mem
---------------------------------------------------
From Postgresql Second Edition:
     where postgresql processes query it transforms the query from string form into an execution plan. An execution plan is a sequence of operations that must be performed in order satisfy the query.   

(This specific to the client  connection and when the client is closed out this is cleared my thinking.)
----------------------------------------------------
From postgresql Documentation
work_mem (integer)

Specifies the amount of memory to be used by internal sort operations and hash tables before switching to temporary disk files. The value is defaults to one megabyte (1MB). Note that for a complex query, several sort or hash operations might be running in parallel; each one will be allowed to use as much memory as this value specifies before it starts to put data into temporary files. Also, several running sessions could be doing such operations concurrently. So the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries

---------------------------------------------------------
Quoting You "Also, postgresql doesn't as a rule cache 'results and queries'.

Then what is the purpose of shared buffers if nothing is being reused  is it only used to keep track locks, changes and what is to being spooled to the kernel???

I'm confussed trying to figure out how caches are being use and being moving through postgresql backend.   As many have stated that small database can fit completely in the caches How does that really work.








No comments: