IO:DataFileRead
DataFileRead occurs when a backend process needs a data page that is not currently present in the PostgreSQL Shared Buffers.
The process must wait while the operating system reads the page from the disk (or returns it from the filesystem cache) so that it can be placed in Shared Buffers.
Implication:
While some disk reads are normal, a high percentage of these wait events typically indicates that your active dataset (working set) is larger than your available memory,
or that inefficient queries are forcing unnecessary disk reads.
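A rough way to gauge how often reads miss Shared Buffers is the buffer hit ratio, computed from the `blks_hit` and `blks_read` counters in `pg_stat_database`. The sketch below is illustrative only: the function name and the query string are assumptions made for this example, not part of any tool. Note that `blks_read` counts pages requested from the operating system, so some of those reads may still be served from the filesystem cache rather than the disk.

```python
from typing import Optional

# Query to run with any PostgreSQL client. The counters are cumulative
# since the last statistics reset, so compare two samples for a recent view.
HIT_RATIO_QUERY = """
SELECT datname, blks_hit, blks_read
FROM pg_stat_database
WHERE datname = current_database();
"""


def buffer_hit_ratio(blks_hit: int, blks_read: int) -> Optional[float]:
    """Return the share of page requests satisfied from Shared Buffers.

    Returns None when there has been no activity, to avoid dividing by zero.
    """
    total = blks_hit + blks_read
    if total == 0:
        return None
    return blks_hit / total
```

A ratio that drops while DataFileRead waits climb is consistent with the working set no longer fitting in memory, but it is only one signal and should be read alongside the query-level causes below.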
Why it is happening (Root Causes)
If this event is dominating, look for these culprits:
- Sequential Scans (Missing Indexes):
If a query scans a whole table because no suitable index exists, PostgreSQL must read many pages it does not otherwise need from the disk, which can push frequently used pages out of the cache.
Missing or inefficient indexes are a common cause of high DataFileRead events.
- Insufficient Memory:
The shared_buffers (PostgreSQL's cache) or the OS Page Cache (System RAM) is too small to hold the "hot" (frequently accessed) data.
- Cold Cache:
This is expected immediately after a database restart, because Shared Buffers start empty and must be repopulated from the disk.
- Bloat:
If tables/indexes are bloated with dead rows, PostgreSQL has to read more pages from the disk to get the same amount of live data.
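Two of the culprits above, sequential scans and bloat, can be screened for using columns from the `pg_stat_user_tables` view (`seq_scan`, `idx_scan`, `n_live_tup`, `n_dead_tup`). The following is a minimal sketch under stated assumptions: the `TableStats` class, the `flag_suspects` function, and all three thresholds are hypothetical and chosen only for illustration. The dead-row count is an estimate and only approximates bloat.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TableStats:
    """One row of pg_stat_user_tables, reduced to the columns used here."""
    relname: str
    seq_scan: int
    idx_scan: Optional[int]  # NULL in the view when the table has no indexes
    n_live_tup: int
    n_dead_tup: int


def flag_suspects(
    tables: List[TableStats],
    min_rows: int = 100_000,       # assumption: ignore small tables
    seq_scan_share: float = 0.5,   # assumption: arbitrary cut-off
    dead_share: float = 0.2,       # assumption: arbitrary cut-off
) -> List[Tuple[str, str]]:
    """Return (table, reason) pairs worth a closer look."""
    findings = []
    for t in tables:
        idx_scans = t.idx_scan or 0
        scans = t.seq_scan + idx_scans
        # Large tables that are mostly read by sequential scans.
        if t.n_live_tup >= min_rows and scans > 0:
            if t.seq_scan / scans >= seq_scan_share:
                findings.append((t.relname, "mostly sequential scans"))
        # Tables where dead rows make up a large share of all rows.
        rows = t.n_live_tup + t.n_dead_tup
        if rows > 0 and t.n_dead_tup / rows >= dead_share:
            findings.append((t.relname, "high dead-row share"))
    return findings
```

A flagged table is a starting point for investigation (for example, checking query plans or vacuum activity), not proof of the cause.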
Comparison: IO:DataFileRead vs. IO:DataFilePrefetch
It is important not to confuse this with Prefetching.
- DataFileRead: The process is stuck waiting for a specific page it needs right now.
- DataFilePrefetch: PostgreSQL has asked the OS to read pages ahead of time because it anticipates needing them soon (an optimization).
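To see which of the two events is actually dominating, one option is to sample `pg_stat_activity` repeatedly and tally the `wait_event` values where `wait_event_type` is `IO`. The sketch below assumes the samples have already been fetched as `(wait_event_type, wait_event)` tuples; the query string and the function name `io_wait_shares` are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Query to run at a fixed interval with any PostgreSQL client.
SAMPLE_QUERY = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE state = 'active' AND wait_event IS NOT NULL;
"""


def io_wait_shares(samples: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return each IO wait event's share of all sampled IO waits."""
    counts = Counter(event for etype, event in samples if etype == "IO")
    total = sum(counts.values())
    if total == 0:
        return {}
    return {event: n / total for event, n in counts.items()}
```

For example, three sampled `DataFileRead` waits and one `DataFilePrefetch` wait give shares of 0.75 and 0.25, with non-IO waits ignored.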