Why AWR Reports Are Misread
Most DBAs open an AWR report, scroll straight to "Top 5 Timed Events," pick the top entry, and start tuning that. That's the wrong approach. The top wait event is often a symptom, not a root cause. Reading AWR correctly means starting with the load profile, understanding DB Time, and only then looking at waits — with the context to interpret them.
This guide walks through the sections that actually matter and shows you what to look for in each.
Generating an AWR Report
AWR snapshots are taken automatically every hour (default). To generate a report:
-- List recent snapshots to find your window
SELECT snap_id, begin_interval_time, end_interval_time
FROM dba_hist_snapshot
ORDER BY snap_id DESC
FETCH FIRST 48 ROWS ONLY;
-- Generate the HTML report for snap IDs 1200–1201 on instance 1
@$ORACLE_HOME/rdbms/admin/awrrpt.sql
-- Choose: html or text, instance number, begin/end snap IDs
For a specific time window from the command line:
-- AWR report for the last 2 hours
VARIABLE rpt CLOB;
EXEC :rpt := DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_HTML(
l_dbid => (SELECT dbid FROM v$database),
l_inst_num => 1,
l_bid => (SELECT min(snap_id) FROM dba_hist_snapshot
WHERE begin_interval_time >= SYSDATE - 2/24),
l_eid => (SELECT max(snap_id) FROM dba_hist_snapshot
WHERE end_interval_time <= SYSDATE)
);
Section 1: Report Summary — DB Time Is Everything
The first thing to look at is DB Time in the Report Summary. DB Time is the total Oracle CPU + wait time consumed by all foreground sessions during the snapshot window. It is the single most important number in the report.
DB Time: 1,482.3 (mins)
Elapsed time: 60.1 (mins)
DB CPU: 421.6 (mins)
Calculate the active session count: DB Time ÷ Elapsed Time = 1482.3 ÷ 60.1 = 24.7 average active sessions.
If your server has 32 CPUs and you see 24.7 average active sessions, the database is well-utilized. If you see 150 average active sessions on the same server, you have a throughput problem — sessions are piling up waiting.
DB CPU ÷ DB Time = 421.6 ÷ 1482.3 = 28% of DB Time was on CPU. The remaining 72% was wait time. That means wait events matter here.
Section 2: Load Profile — Throughput at a Glance
Per Second Per Transaction
DB Time(s): 24.7 0.04
DB CPU(s): 7.0 0.01
Redo size (bytes): 2,847,312 4,210.2
Logical reads (blocks): 98,421 145.7
Block changes: 8,492 12.6
Physical reads (blocks): 1,847 2.7
Physical writes (blocks): 2,104 3.1
User calls: 4,823 7.1
Parses: 2,312 3.4
Hard parses: 147 0.2
Executions: 67,621 100.1
Rollbacks: 32 0.0
Transactions: 676
What to look for:
Hard parses per second > 100: SQL is not being reused. Check cursor_sharing or application-level bind variable usage. Hard parses are CPU-expensive and cause library cache latch contention.
Logical reads per transaction > 10,000: Queries are doing full scans or missing indexes. Cross-check with the SQL ordered by logical reads section.
Redo size per second > 50 MB: Heavy write workload. Check for bulk DML without commit batching, or missing direct-path inserts.
Physical reads per second disproportionate to logical reads: Buffer cache hit ratio is low. Consider increasing DB_CACHE_SIZE.
Section 3: Top 5 Timed Events — Read These Last
Now that you have context, look at Top 5 Timed Events:
Top 5 Timed Foreground Events
Event Waits Time (s) Avg wait (ms) % DB Time
------------------------------ --------- --------- -------------- ---------
DB CPU 25,300 28.4%
db file sequential read 2,847,201 31,200 10.96 35.0%
log file sync 412,100 8,904 21.61 10.0%
db file scattered read 198,300 4,200 21.18 4.7%
latch: shared pool 12,400 3,600 290.32 4.0%
DB CPU first: If CPU is 28% of DB Time and you have capacity, CPU is not the bottleneck.
db file sequential read (single block I/O): Index reads or undo reads. High total time with reasonable avg wait (10ms) usually means the query is doing many correct index lookups — not necessarily a problem. If avg wait > 30ms, I/O subsystem is slow.
log file sync: Wait experienced by a COMMIT. If avg wait > 20ms, your redo log group is on slow storage or log_buffer is undersized. Move redo logs to SSD.
db file scattered read (multi-block I/O): Full table scans or fast full index scans. High here often matches high logical reads per transaction — look for missing indexes.
latch: shared pool: If this appears with high total time, you have hard parse or cursor invalidation issues. Check cursor_sharing and look at v$sql for statements with high parse_calls/executions ratio.
Section 4: SQL Ordered by CPU and Elapsed Time
This is where you find the specific SQL causing load:
-- Find the same top SQL from v$sql in real time
SELECT sql_id,
ROUND(cpu_time/1e6, 2) cpu_sec,
ROUND(elapsed_time/1e6, 2) elapsed_sec,
executions,
ROUND(cpu_time/NULLIF(executions,0)/1e6, 4) cpu_per_exec,
SUBSTR(sql_text, 1, 100) sql_preview
FROM v$sql
WHERE executions > 0
ORDER BY cpu_time DESC
FETCH FIRST 20 ROWS ONLY;
For a top SQL from the AWR report, get its execution plan history:
-- See all plans for a specific SQL ID from AWR history
SELECT plan_hash_value,
MIN(begin_interval_time) first_seen,
MAX(end_interval_time) last_seen,
SUM(executions_delta) total_execs,
ROUND(SUM(elapsed_time_delta)/1e6 / NULLIF(SUM(executions_delta),0), 2) avg_elapsed_ms
FROM dba_hist_sql_plan p
JOIN dba_hist_sqlstat s USING (sql_id, plan_hash_value)
JOIN dba_hist_snapshot sn USING (snap_id)
WHERE p.sql_id = '&your_sql_id'
GROUP BY plan_hash_value
ORDER BY last_seen DESC;
A plan_hash_value change between two snapshots means the optimizer chose a different plan — often the root cause of a sudden performance regression.
Section 5: Wait Event Histograms
The histogram section shows how waits are distributed. A histogram where 90% of "db file sequential read" waits complete in <8ms tells a very different story than one where 40% take >64ms.
-- Current wait event histogram from memory
SELECT event, wait_time_milli, wait_count
FROM v$event_histogram
WHERE event IN ('db file sequential read', 'db file scattered read', 'log file sync')
ORDER BY event, wait_time_milli;
For I/O events, if you see a bimodal distribution (lots of fast waits + a long tail of slow waits), suspect I/O subsystem contention at specific times rather than a persistent problem.
Using ASH to Drill Into a Specific Window
AWR covers an hour; ASH covers seconds. When a user says "it was slow between 14:23 and 14:31," use ASH:
-- What was happening between 14:23 and 14:31 today?
SELECT
TO_CHAR(sample_time, 'HH24:MI:SS') sample_time,
session_state,
event,
COUNT(*) active_sessions
FROM v$active_session_history
WHERE sample_time BETWEEN
TO_DATE('2026-06-09 14:23:00', 'YYYY-MM-DD HH24:MI:SS') AND
TO_DATE('2026-06-09 14:31:00', 'YYYY-MM-DD HH24:MI:SS')
GROUP BY TO_CHAR(sample_time, 'HH24:MI:SS'), session_state, event
ORDER BY 1, 4 DESC;
For a specific SQL during that window:
SELECT sql_id, event, COUNT(*) samples,
ROUND(COUNT() 100.0 / SUM(COUNT(*)) OVER(), 1) pct
FROM v$active_session_history
WHERE sample_time BETWEEN
TO_DATE('2026-06-09 14:23:00', 'YYYY-MM-DD HH24:MI:SS') AND
TO_DATE('2026-06-09 14:31:00', 'YYYY-MM-DD HH24:MI:SS')
AND session_state = 'WAITING'
GROUP BY sql_id, event
ORDER BY samples DESC
FETCH FIRST 20 ROWS ONLY;
For historical ASH (older than the in-memory window):
-- From DBA_HIST_ACTIVE_SESS_HISTORY — same query on the history table
SELECT sql_id, event, COUNT(*) samples
FROM dba_hist_active_sess_history
WHERE sample_time BETWEEN
TO_DATE('2026-06-08 14:23:00', 'YYYY-MM-DD HH24:MI:SS') AND
TO_DATE('2026-06-08 14:31:00', 'YYYY-MM-DD HH24:MI:SS')
AND session_state = 'WAITING'
GROUP BY sql_id, event
ORDER BY samples DESC;
The Six Questions an AWR Report Should Answer
When you open any AWR report, work through this checklist:
- What is the average active session count? (DB Time ÷ Elapsed Time) — are you over your CPU count?
- What fraction of DB Time is on CPU vs waiting? If >70% CPU, look at CPU-bound SQL. If <30% CPU, look at waits.
- What is the hard parse rate? Anything above 100/second warrants attention.
- What are the top 2–3 waits by total time, and what are their average wait times? Average wait tells you severity; total time tells you scope.
- Are there any "new" events in Top 5 compared to your baseline? A new latch or enqueue event is a signal.
- Which SQL IDs dominate CPU and elapsed time? Pull their plans and check for plan regressions.
Common AWR Patterns and What They Mean
| Pattern | Likely Cause | First Check |
|---------|-------------|-------------|
| log file sync >15ms avg | Slow redo storage | Move redo to SSD; check iostat on redo disk |
| library cache: mutex X high | Hard parses / plan invalidation | cursor_sharing=FORCE; check v$sql_plan_statistics_all |
| enq: TX — row lock contention | Application lock contention | v$session.blocking_session |
| buffer busy waits > 1% DB Time | Hot blocks (segment header or data block) | dba_segments where segment_type='TABLE' |
| latch: cache buffers chains | Hot data blocks | Find object with v$bh by file#/block# |
| cursor: pin S wait on X | Hard parse concurrency | cursor_sharing, avoid shared pool flush |
| read by other session | I/O bottleneck with session contention | Check I/O queuing, consider async I/O |
TuneVault's health check runs AWR analysis automatically on every check, surfacing the top wait patterns and flagging SQL regressions without requiring you to manually generate and read the report.