
Addition of diagnostic collectors and bug fixes #6

Open
vyruss wants to merge 13 commits into main from feat/additions_202605

@vyruss (Collaborator) commented May 2, 2026

Summary

Adds 5 new diagnostic collectors and extends 4 existing collectors with columns supporting deeper analysis: bloat, vacuum health, wraparound headroom, index validity/usage, sequence headroom, SSL audit, replication slot spill, extension version metadata. Also fixes the read_ahead system collector and splits the hostname collector to disambiguate kernel hostname from resolver-canonical FQDN.

New collectors:

  • stat_ssl (postgresql/) - per-backend SSL/TLS state from pg_stat_ssl
  • stat_replication_slots (postgresql/) - logical replication slot spill counters from pg_stat_replication_slots (PG14+)
  • sequences (databases/{db}/) - sequences with last/min/max values from pg_sequences
  • bloat (databases/{db}/) - heuristic table bloat estimate from pg_stats
  • pgstattuple (databases/{db}/) - authoritative bloat via pgstattuple_approx() when the extension is installed
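The last/min/max values collected from pg_sequences are what make a sequence headroom estimate possible. A minimal sketch of that arithmetic follows, assuming an ascending sequence with non-negative bounds. The helper name `seqHeadroom` and its parameters are illustrative and are not taken from this PR.

```go
package main

import (
	"fmt"
	"math"
)

// seqHeadroom is a hypothetical helper, not code from this PR. Given columns
// from pg_sequences it estimates how many values an ascending sequence can
// still hand out and what fraction of its range is already consumed.
// hasLast is false when last_value is NULL (the sequence was never used).
// Assumes incrementBy > 0 and non-negative bounds, so the int64 subtractions
// below cannot overflow.
func seqHeadroom(last, minValue, maxValue, incrementBy int64, hasLast bool) (remaining int64, usedFrac float64) {
	if !hasLast {
		n := (maxValue - minValue) / incrementBy
		if n < math.MaxInt64 {
			n++ // count the first value itself
		}
		return n, 0
	}
	remaining = (maxValue - last) / incrementBy
	span := float64(maxValue) - float64(minValue) + 1
	usedFrac = (float64(last) - float64(minValue) + 1) / span
	return remaining, usedFrac
}

func main() {
	// Invented sample values, for illustration only.
	remaining, used := seqHeadroom(50, 1, 100, 1, true)
	fmt.Printf("remaining=%d used=%.0f%%\n", remaining, used*100)
}
```

Descending or cycling sequences would need different handling; this sketch deliberately ignores them.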

Extended collectors:

  • databases - datfrozenxid, datminmxid, datconnlimit, datistemplate, datallowconn
  • tables - rebuilt on pg_class + pg_stat_all_tables for vacuum/analyze timestamps, dead-tup counters, reloptions, relpersistence, heap size (pg_relation_size), table size (pg_table_size), reltoastrelid/toast size; capped at top 1000 by size
  • indexes - rebuilt on pg_index + pg_stat_all_indexes for semantic-key columns, indisvalid, idx_scan/idx_tup_read/idx_tup_fetch, index size; capped at top 1000 by size
  • available_extensions - now sourced from pg_available_extension_versions for per-version installed/trusted/superuser/relocatable/requires metadata
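To illustrate how datfrozenxid supports a wraparound headroom estimate: transaction IDs are 32 bits wide, so a database can look back at most 2^31 transactions. The sketch below turns an XID age, as returned by `age(datfrozenxid)`, into remaining headroom. The function name `wraparoundHeadroom` is hypothetical, and PostgreSQL's own safety limits take effect before this raw horizon is reached, so treat the result as an upper bound.

```go
package main

import "fmt"

// xidHorizon is the 2^31 look-back window of 32-bit transaction ID comparison.
const xidHorizon int64 = 1 << 31

// wraparoundHeadroom is a hypothetical helper, not code from this PR.
// frozenXIDAge is the value of age(datfrozenxid) for one database.
// It returns the XIDs left before the raw horizon and the fraction used.
func wraparoundHeadroom(frozenXIDAge int64) (remaining int64, usedFrac float64) {
	if frozenXIDAge < 0 {
		frozenXIDAge = 0
	}
	remaining = xidHorizon - frozenXIDAge
	if remaining < 0 {
		remaining = 0
	}
	return remaining, float64(frozenXIDAge) / float64(xidHorizon)
}

func main() {
	// Invented sample age, for illustration only.
	remaining, used := wraparoundHeadroom(200_000_000)
	fmt.Printf("remaining=%d used=%.1f%%\n", remaining, used*100)
}
```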

Bug fixes:

  • read_ahead system collector now handles virtio (/dev/vd*) and xen (/dev/xvd*) block devices and gracefully skips unmatched globs; output is now labeled per device.
  • hostname system collector split: system/hostname.out now contains plain hostname (kernel hostname / gethostname()), and a new system/hostname_fqdn.out contains hostname -f (resolver canonical name).
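Since read_ahead output is now labeled per device, a consumer can parse it line by line. A rough sketch follows, assuming each line has the form `device: value` as produced by the `echo "$d: $ra"` step of the collector's shell loop. The function name `parseReadAhead` and the sample values are invented for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseReadAhead is a hypothetical helper, not code from this PR. It parses
// lines such as "/dev/vda: 256" into a map from device path to the
// read-ahead value reported by blockdev --getra (in 512-byte sectors).
func parseReadAhead(out string) (map[string]int, error) {
	result := make(map[string]int)
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue // tolerate blank lines and a trailing newline
		}
		dev, val, ok := strings.Cut(line, ": ")
		if !ok {
			return nil, fmt.Errorf("malformed read_ahead line %q", line)
		}
		ra, err := strconv.Atoi(strings.TrimSpace(val))
		if err != nil {
			return nil, fmt.Errorf("bad read-ahead value for %s: %w", dev, err)
		}
		result[dev] = ra
	}
	return result, nil
}

func main() {
	m, err := parseReadAhead("/dev/vda: 256\n/dev/xvdb: 8192\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m["/dev/vda"], m["/dev/xvdb"])
}
```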

Checklist

  • Unit tests (TestQueryTaskColumnsCoverage asserts new column tokens and LIMIT 1000 stay in the queries; TestPostgreSQLCollectors covers stat_ssl and stat_replication_slots)
  • Regression tests (existing tests still pass)
  • Docs/README updated (README.md, docs/index.md, DATA.md, docs/data.md)
  • Integration test impact (all 6 scenarios pass on PG18; pgstattuple correctly skipped when extension absent; stat_replication_slots PG14+ guarded by existing 42P01 SkipError path)
  • Real-world validation (collected against vorlon's PG18 instance — 24 SSL backends, 393 sequences, 176 MB bloat detected on oc_filecache, top-by-size ordering verified, pgstattuple correctly skipped across all 10 DBs)
  • Security checks (no new attack surfaces, no new dependencies)
  • PR links to a tracking issue/ticket
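The 42P01 guard mentioned in the checklist relies on SQLSTATE 42P01 (undefined_table), which PostgreSQL returns when a queried relation such as pg_stat_replication_slots does not exist on older servers. The sketch below shows one way such an error could be turned into a skip. `SkipError` here is a stand-in for the project's type of the same name, whose definition is not shown on this page; `sqlStater`, `classify`, `isSkip`, and `fakePgError` are all assumptions made for illustration, and how the SQLSTATE is actually obtained depends on the driver in use.

```go
package main

import (
	"errors"
	"fmt"
)

// SkipError is a stand-in for the project's own type of the same name.
type SkipError struct{ Reason string }

func (e *SkipError) Error() string { return "skipped: " + e.Reason }

// sqlStater is an assumed interface for errors that carry a SQLSTATE code.
type sqlStater interface{ SQLState() string }

// undefinedTable is the SQLSTATE for a missing relation.
const undefinedTable = "42P01"

// classify converts an undefined_table error into a SkipError and passes
// every other error (including nil) through unchanged.
func classify(err error) error {
	var s sqlStater
	if errors.As(err, &s) && s.SQLState() == undefinedTable {
		return &SkipError{Reason: "relation not available on this server"}
	}
	return err
}

// isSkip reports whether err is, or wraps, a SkipError.
func isSkip(err error) bool {
	var skip *SkipError
	return errors.As(err, &skip)
}

// fakePgError simulates a driver error so the sketch runs without a database.
type fakePgError struct{ code string }

func (e fakePgError) Error() string    { return "SQLSTATE " + e.code }
func (e fakePgError) SQLState() string { return e.code }

func main() {
	fmt.Println(isSkip(classify(fakePgError{code: "42P01"}))) // missing relation
	fmt.Println(isSkip(classify(fakePgError{code: "42501"}))) // insufficient privilege
}
```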

coderabbitai Bot commented May 2, 2026

Walkthrough

This PR updates PostgreSQL collectors and documentation to use richer catalog/stat views (adds bloat/pgstattuple/sequences, switches to pg_available_extension_versions, adds stat_replication_slots and stat_ssl), adds a unit test validating query contents, updates collector expectations, and separately broadens the Linux read_ahead system task to probe more device globs.

Changes

PostgreSQL Data Collection Expansion

| Layer / File(s) | Summary |
|---|---|
| Data Shape / Documentation (DATA.md, README.md, docs/data.md, docs/index.md) | Docs updated: pg_available_extension_versions replaces pg_available_extensions; added pg_stat_ssl, pg_stat_replication_slots; per-database collectors now include bloat.tsv, pgstattuple.tsv (if installed), sequences.tsv; tables/indexes described as sourced from pg_class+pg_stat_all_tables and pg_index+pg_stat_all_indexes with expanded metrics. |
| Core Implementation (Query Tasks) (postgres_tasks.go) | Query changes: available_extensions now selects explicit columns from pg_available_extension_versions; databases adds frozenxid_age/minmxid_age; new instance task stat_replication_slots; per-database tasks added/replaced: bloat, pgstattuple, sequences; indexes and tables replaced with catalog+stat joins returning richer size/stat/vacuum fields. |
| Wiring / Collector Registration (postgres_test.go) | TestPostgreSQLCollectors expected collector list updated to include stat_replication_slots and stat_ssl, with stat_slru reordered. |
| Tests / Validation (radar_test.go) | New TestQueryTaskColumnsCoverage verifies that selected postgres and per-database SimpleQueryTask.Query strings contain required tokens/subqueries (column names, fragments like LIMIT 1000). |

System Task: read_ahead probe

| Layer / File(s) | Summary |
|---|---|
| Command Invocation Change (system_tasks_linux.go) | read_ahead task replaced a single blockdev --getra call with a shell loop over device globs /dev/sd*, /dev/nvme*, /dev/vd*, /dev/xvd*, invoking blockdev --getra per device, echoing non-empty results, and exiting 0. |

Poem

🐰 New sources spring, catalog and stats entwine,
Queries dig deeper, row by careful line.
Slots and SSL, bloat and sequences too—
The rabbit hops through schemas, gathering truth. 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title accurately captures the main changes: adding new diagnostic collectors and fixing the read_ahead system collector, as evidenced by the summary and file changes. |
| Description check | ✅ Passed | The PR description comprehensively describes the changeset, detailing five new collectors, extensions to four existing collectors, bug fixes, and testing/validation performed. |

codacy-production Bot commented May 2, 2026

Up to standards ✅

🟢 Issues: 1 medium

Results: 1 new issue

| Category | Results |
|---|---|
| Complexity | 1 medium |

🟢 Metrics: 0 duplication

| Metric | Results |
|---|---|
| Duplication | 0 |

@vyruss vyruss changed the title from "Add diagnostic collectors (bloat, sequences, SSL, slot stats) and extend existing TSVs" to "Add diagnostic collectors" May 2, 2026

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
radar_test.go (1)

532-559: ⚡ Quick win

Expand coverage checks for changed available_extensions and age fields.

This test is a great guardrail; add assertions for the available_extensions source token and the new computed database age aliases so future query edits don’t silently regress those additions.
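The guardrail idea can be sketched in isolation: check a query string for a list of required tokens and report the ones that are absent. The helper name `missingTokens` and the sample query below are invented for illustration and are not the test's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// missingTokens is a hypothetical helper, not code from this PR. It returns
// the tokens that do not occur anywhere in query, preserving their order.
func missingTokens(query string, tokens []string) []string {
	var missing []string
	for _, tok := range tokens {
		if !strings.Contains(query, tok) {
			missing = append(missing, tok)
		}
	}
	return missing
}

func main() {
	// Invented query text, for illustration only.
	query := "SELECT datname, datfrozenxid, datminmxid FROM pg_database"
	fmt.Println(missingTokens(query, []string{"datfrozenxid", "datconnlimit"}))
}
```

Plain substring matching is cheap but loose: a token also matches when it appears inside a longer identifier or a comment, so a check like this catches removals rather than proving a column is selected.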

🔧 Suggested test additions
  checks := []struct {
      taskList    string
      taskName    string
      mustContain []string
  }{
+     {"postgres", "available_extensions", []string{
+         "pg_available_extension_versions", "installed", "trusted", "requires",
+     }},
      {"postgres", "databases", []string{
          "datfrozenxid", "datminmxid", "datconnlimit",
-         "datistemplate", "datallowconn",
+         "datistemplate", "datallowconn", "frozenxid_age", "minmxid_age",
      }},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@radar_test.go` around lines 532 - 559, Extend the coverage checks in two
places. First, add a new {"postgres","available_extensions",...} entry whose
expected tokens are the source view "pg_available_extension_versions" plus
"installed", "trusted", and "requires". Second, append the computed database age
aliases "frozenxid_age" and "minmxid_age" to the existing
{"postgres","databases",...} entry.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d9e9d5c8-0d3c-4b73-a361-42f1a8dac090

📥 Commits

Reviewing files that changed from the base of the PR and between 5ae14db and d893c6a.

📒 Files selected for processing (7)
  • DATA.md
  • README.md
  • docs/data.md
  • docs/index.md
  • postgres_tasks.go
  • postgres_test.go
  • radar_test.go

@vyruss vyruss changed the title from "Add diagnostic collectors" to "Add diagnostic collectors (bloat, sequences, SSL, slot stats), extend existing TSVs, and fix read_ahead" May 2, 2026
@vyruss vyruss changed the title from "Add diagnostic collectors (bloat, sequences, SSL, slot stats), extend existing TSVs, and fix read_ahead" to "Add diagnostic collectors and fix read_ahead" May 2, 2026

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (3)
system_tasks_linux.go (1)

338-338: ⚡ Quick win

Consider adding /dev/md* (software RAID) to the device glob list.

Enterprise PostgreSQL deployments frequently sit on software RAID arrays (mdadm), and blockdev --getra is a legitimate read-ahead tuning surface for those devices. The same [ -b "$d" ] guard already handles no-match silently.

♻️ Proposed addition
-Args: []string{"-c", "for d in /dev/sd* /dev/nvme* /dev/vd* /dev/xvd*; do [ -b \"$d\" ] && ra=$(blockdev --getra \"$d\" 2>/dev/null) && [ -n \"$ra\" ] && echo \"$d: $ra\"; done; exit 0"},
+Args: []string{"-c", "for d in /dev/sd* /dev/nvme* /dev/vd* /dev/xvd* /dev/md*; do [ -b \"$d\" ] && ra=$(blockdev --getra \"$d\" 2>/dev/null) && [ -n \"$ra\" ] && echo \"$d: $ra\"; done; exit 0"},

Note: /dev/dm-* (LVM/device-mapper) read-ahead is technically valid too, but its behaviour at the DM layer is less commonly tuned and the output can be voluminous on systems with many LVs — leaving it out is a reasonable call.
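For comparison, the same probe logic (expand several globs, ignore patterns that match nothing, keep only block devices) can be written in pure Go. This is a sketch under stated assumptions, not the collector's implementation: the function name `blockDevices` is hypothetical, and the pattern list simply mirrors the one proposed above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// blockDevices is a hypothetical helper, not code from this PR. It expands
// each glob pattern and returns only the paths that are block devices,
// mirroring the shell loop's [ -b "$d" ] guard. Patterns with no match
// contribute nothing, and malformed patterns are skipped.
func blockDevices(patterns []string) []string {
	var devices []string
	for _, pattern := range patterns {
		matches, err := filepath.Glob(pattern)
		if err != nil {
			continue // only returned for a malformed pattern
		}
		for _, path := range matches {
			info, err := os.Stat(path) // follows symlinks, like [ -b ]
			if err != nil {
				continue
			}
			mode := info.Mode()
			// Block devices have ModeDevice set and ModeCharDevice clear.
			if mode&os.ModeDevice != 0 && mode&os.ModeCharDevice == 0 {
				devices = append(devices, path)
			}
		}
	}
	return devices
}

func main() {
	patterns := []string{"/dev/sd*", "/dev/nvme*", "/dev/vd*", "/dev/xvd*", "/dev/md*"}
	for _, d := range blockDevices(patterns) {
		fmt.Println(d)
	}
}
```

The block-device filter matters for patterns like /dev/nvme*, which also match character-device controller nodes that have no read-ahead setting.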

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@system_tasks_linux.go` at line 338, The device glob used in the command
assigned to Args currently iterates /dev/sd*, /dev/nvme*, /dev/vd*, /dev/xvd*
but omits software RAID devices; update the command string (the Args value in
system_tasks_linux.go where Args: []string{"-c", "..."} is defined) to include
/dev/md* in the glob (e.g. add /dev/md* to the list) so blockdev --getra will
also query mdadm devices while keeping the existing [ -b "$d" ] guard and the
rest of the pipeline intact.
radar_test.go (2)

532-559: ⚡ Quick win

available_extensions is missing from the coverage checks despite being reworked in this PR.

The test's stated goal is to "catch accidental column removal during future edits." The available_extensions task was explicitly reworked in this PR to use pg_available_extension_versions, adding installed, trusted, relocatable, requires, and superuser — yet none of those new columns are guarded here.

✅ Suggested addition
 		{"postgres", "databases", []string{
 			"datfrozenxid", "datminmxid", "datconnlimit",
 			"datistemplate", "datallowconn",
 		}},
+		{"postgres", "available_extensions", []string{
+			"pg_available_extension_versions", "installed", "trusted",
+			"relocatable", "requires", "superuser",
+		}},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@radar_test.go` around lines 532 - 559, The test coverage list is missing the
reworked available_extensions check; add a new entry for the
"available_extensions" task in the coverage slice (similar to other {"postgres",
"...", []string{...}} entries) and include the new columns added in this PR:
"installed", "trusted", "relocatable", "requires", and "superuser" (alongside
any existing name/version columns used by available_extensions) so the test will
catch accidental removals; locate the slice in radar_test.go where other
{"postgres", "...", []string{...}} entries are defined and append the
{"postgres", "available_extensions", []string{ "installed", "trusted",
"relocatable", "requires", "superuser", /* include existing columns like
name/version as appropriate */ }} entry.

573-578: ⚡ Quick win

Add a default case to fail clearly on unrecognized taskList values.

Without a default, a typo in taskList (e.g., "Postgres" instead of "postgres") leaves task as nil, and the failure message at line 580 says task "X" not found in Postgres tasks — suggesting the task itself is missing rather than that the list identifier is wrong.

🛡️ Proposed fix
 		switch c.taskList {
 		case "postgres":
 			task = taskByName(postgresQueryTasks, c.taskName)
 		case "perDB":
 			task = taskByName(perDatabaseQueryTasks, c.taskName)
+		default:
+			t.Errorf("unknown taskList %q for task %q", c.taskList, c.taskName)
+			continue
 		}
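The effect of the default branch can be sketched outside the test. In the version below an unknown list identifier produces its own error instead of falling through to a misleading "task not found" message. The type `queryTask`, the two task slices, and the function `tasksFor` are hypothetical stand-ins for the project's SimpleQueryTask lists.

```go
package main

import "fmt"

// queryTask is a hypothetical stand-in for the project's SimpleQueryTask.
type queryTask struct{ Name string }

// Placeholder task lists, for illustration only.
var (
	postgresTasks = []queryTask{{Name: "databases"}}
	perDBTasks    = []queryTask{{Name: "tables"}}
)

// tasksFor resolves a list identifier to its task slice and fails clearly
// when the identifier itself is wrong (for example a capitalisation typo).
func tasksFor(list string) ([]queryTask, error) {
	switch list {
	case "postgres":
		return postgresTasks, nil
	case "perDB":
		return perDBTasks, nil
	default:
		return nil, fmt.Errorf("unknown taskList %q (want \"postgres\" or \"perDB\")", list)
	}
}

func main() {
	if _, err := tasksFor("Postgres"); err != nil {
		fmt.Println(err)
	}
}
```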
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@radar_test.go` around lines 573 - 578, The switch on c.taskList that assigns
task (using taskByName with postgresQueryTasks or perDatabaseQueryTasks) needs a
default branch to fail fast when c.taskList is unrecognized; add a default case
in that switch which calls the test failure helper (e.g., t.Fatalf or
equivalent) with a clear message stating the invalid taskList value and the
allowed options so typos (like "Postgres") surface as an invalid list rather
than a missing task; ensure the error references c.taskList and the expected
names to aid debugging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df0d4cf9-354b-4f46-8328-2672125843fe

📥 Commits

Reviewing files that changed from the base of the PR and between d893c6a and 8fd7d28.

📒 Files selected for processing (5)
  • DATA.md
  • docs/data.md
  • postgres_tasks.go
  • radar_test.go
  • system_tasks_linux.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • DATA.md
  • postgres_tasks.go

@vyruss vyruss changed the title from "Add diagnostic collectors and fix read_ahead" to "Addition of diagnostic collectors and bug fixes" May 2, 2026

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/data.md (1)

49-112: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

read_ahead.out source description not updated to reflect the bug fix.

Line 93 (unchanged) still documents:

| `system/read_ahead.out` | `blockdev --getra /dev/*` | ...

The PR explicitly fixes read_ahead to also probe /dev/vd* (virtio) and /dev/xvd* (Xen) in addition to the original pattern. The source description should reflect the multi-pattern probe, e.g. `blockdev --getra` (sd/vd/xvd/nvme) or similar, so readers understand virtio and Xen hosts are covered.

📝 Suggested description update
-| `system/read_ahead.out` | `blockdev --getra /dev/*` | Block device read-ahead settings |
+| `system/read_ahead.out` | `blockdev --getra` (sd/vd/xvd/nvme) | Block device read-ahead settings per device |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/data.md` around lines 49 - 112, Update the table entry for
system/read_ahead.out to reflect the expanded probe patterns: change the source
description from "blockdev --getra /dev/*" to something explicit like "`blockdev
--getra` (sd/vd/xvd/nvme devices)" or "`blockdev --getra /dev/sd* /dev/vd*
/dev/xvd* /dev/nvme*`" so it documents that the read_ahead collector now probes
/dev/vd* (virtio) and /dev/xvd* (Xen) in addition to the original devices;
locate the row by the symbol system/read_ahead.out and update its third-column
description accordingly.
🧹 Nitpick comments (1)
DATA.md (1)

49-50: ⚖️ Poor tradeoff

DATA.md and docs/data.md are identical — consider consolidating.

Both files receive the same edits every time documentation changes. Maintaining two copies in lockstep is error-prone (a future update to one will silently diverge from the other). Consider making one a symlink or redirect to the other, or removing the duplicate.

Also applies to: 185-186, 227-227, 265-265, 267-267, 288-305

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DATA.md` around lines 49 - 50, The repository contains duplicate
documentation files DATA.md and docs/data.md; consolidate them by keeping a
single source of truth (choose either DATA.md or docs/data.md), remove the
duplicate file, and replace it with a symlink or a redirect to the retained file
so future edits only need to be made once; update any references or links in the
repo and CI/docs build config to point to the retained file and verify examples
like the `system/hostname.out`/`system/hostname_fqdn.out` entries still appear
correctly in the remaining file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3cd0013a-9091-40e0-a5c1-80a1a28610f6

📥 Commits

Reviewing files that changed from the base of the PR and between 8fd7d28 and 53e2d77.

📒 Files selected for processing (4)
  • DATA.md
  • docs/data.md
  • system_tasks_darwin.go
  • system_tasks_linux.go

@vyruss vyruss requested a review from bonesmoses May 2, 2026 09:07