Skip to content

Add/get report hosts command#2861

Open
ozgen wants to merge 14 commits intomainfrom
add/get-report-hosts-command
Open

Add/get report hosts command#2861
ozgen wants to merge 14 commits intomainfrom
add/get-report-hosts-command

Conversation

@ozgen
Copy link
Copy Markdown
Member

@ozgen ozgen commented Apr 2, 2026

What

  • Move report host handling into separate files.
  • Add management support for host-only responses.
  • Add the get_report_hosts GMP command.
  • Add documentation for the new command.
  • Verify by build and manual gvm-cli testing.

Why

  • To separate host-specific logic from get_reports.
  • To improve maintainability and code organization.
  • To support fetching only report hosts when full report data is not needed.
  • To help reduce large GMP response sizes.

References

GEA-1692

ozgen added 4 commits April 2, 2026 10:02
Extract report host related SQL and XML generation logic from the
existing report handling code into dedicated report hosts modules.

Keep the existing behavior unchanged and prepare the codebase for a
separate get_report_hosts GMP command.
Add management-layer handling for sending report host XML separately
from get_reports.

Reuse the existing filtering and host selection logic, including
result_hosts_only and container scan handling, so host-only responses
can be generated without changing the existing report output.
Introduce the get_report_hosts GMP command for retrieving host data
from a single report.

Support report_id, filter parsing and lean mode, and route the command
through the new management-layer report hosts handling.
@ozgen ozgen requested review from a team as code owners April 2, 2026 08:09
@ozgen ozgen marked this pull request as draft April 2, 2026 08:09
@ozgen ozgen marked this pull request as ready for review April 2, 2026 08:28
@greenbonebot greenbonebot enabled auto-merge (rebase) April 2, 2026 09:37
Copy link
Copy Markdown
Contributor

@mattmundell mattmundell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new command and the report host extraction look like great ideas.

src/manage_sql.c Outdated

#if ENABLE_CONTAINER_SCANNING
if (is_container_scanning_report)
if (print_report_hosts_xml (&ctx,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you're fast, but I would have loved for this extraction part to be a separate initial PR.

It's harder to see what's going on when it's all piled into one big PR. It also can lead to mingling changes in with later commits in this PR.

This is moving some of the report printing out into separate files. I wonder if that's a good idea because the report printing is all linked together with the new context variable. Might make it harder to replace as a whole.

Also, just something to consider, we've combined XML output in our "libmanage" layer and in the long run maybe that belongs more in the upper layers (because it's GMP protocol specific, and less to do with managing the data/scans/schedules).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mattmundell, that makes sense. I see your point that the extraction and the new command ended up mixed together, which makes it harder to review. My intention was to keep the existing logic unchanged and reuse it for the new command with minimal risk, but I agree that a separate PR for the extraction would have made that clearer.

Regarding the split into separate files, the main goal was to isolate the report-host-specific code, not to fully decouple it from the existing report printing path. I agree the code is still tied together through the report printing context, so this is not a full modular separation yet.

Also agreed on the broader architectural point: for this change I stayed close to the existing design to avoid a larger rewrite, but I understand the concern.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which makes it harder to review

Even just splitting 83d24a8 into multiple commits would make this PR easier to follow:

  • a commit that just moves whole functions to new files (even a few commits if there are many functions)
  • a commit that moves whole functions around within files (why was print_report_context_cleanup moved within manage_sql.c? Maybe just a patch artifact from static removal, possibly due to the patch being so big, but now I'm unsure...)
  • a commit that extracts the report host xml printing.

I also highly recommend that every commit compiles, because this would keep out confusing changes, like in 83d24a8 there are stray files in CMakeLists.txt, and there are #includes of files that are missing (eg manage_report_hosts.h).

I know I'm generally old and grumpy, but it's very annoying and time consuming having to sift through so much stuff trying to understand what the intention was, like when I'm looking at potential problems (eg flagged by an agent) or when I'm trying to understand the general design direction.

Regarding the split into separate files, the main goal was to isolate the report-host-specific code, not to fully decouple it from the existing report printing path. I agree the code is still tied together through the report printing context, so this is not a full modular separation yet.

OK cool. What I was trying to say was that all the report printing code could stay together for now, but I understand the idea of separating it cleanly into modules.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is really helpful feedback.

You're right, the commits got a bit mixed, which makes the PR harder to follow. Splitting them more clearly would definitely help.

Also good point about keeping each commit buildable. The missing includes and CMake issues were not intentional, but I see how that adds confusion.

I will keep this in mind for future changes.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also plan to split certificate, CVE, and error results to reduce the XML size. I will keep your suggestions in mind for that too.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, this will help in my drive to reduce the size of print_report_xml_start.

@ozgen ozgen force-pushed the add/get-report-hosts-command branch from cd7e0c8 to eaa5ae3 Compare April 2, 2026 11:13
@ozgen ozgen requested a review from mattmundell April 2, 2026 11:28
@ozgen ozgen force-pushed the add/get-report-hosts-command branch from 4ce038c to 45d7914 Compare April 2, 2026 14:40
@ozgen ozgen requested a review from mattmundell April 2, 2026 14:42
src/manage_sql.c Outdated

#if ENABLE_CONTAINER_SCANNING
if (is_container_scanning_report)
if (print_report_hosts_xml (&ctx,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which makes it harder to review

Even just splitting 83d24a8 into multiple commits would make this PR easier to follow:

  • a commit that just moves whole functions to new files (even a few commits if there are many functions)
  • a commit that moves whole functions around within files (why was print_report_context_cleanup moved within manage_sql.c? Maybe just a patch artifact from static removal, possibly due to the patch being so big, but now I'm unsure...)
  • a commit that extracts the report host xml printing.

I also highly recommend that every commit compiles, because this would keep out confusing changes, like in 83d24a8 there are stray files in CMakeLists.txt, and there are #includes of files that are missing (eg manage_report_hosts.h).

I know I'm generally old and grumpy, but it's very annoying and time consuming having to sift through so much stuff trying to understand what the intention was, like when I'm looking at potential problems (eg flagged by an agent) or when I'm trying to understand the general design direction.

Regarding the split into separate files, the main goal was to isolate the report-host-specific code, not to fully decouple it from the existing report printing path. I agree the code is still tied together through the report printing context, so this is not a full modular separation yet.

OK cool. What I was trying to say was that all the report printing code could stay together for now, but I understand the idea of separating it cleanly into modules.

@ozgen ozgen requested a review from mattmundell April 2, 2026 21:40
#include "manage_settings.h"
#include "manage_sql_report_hosts.h"

#include <util/fileutils.h>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason this works, but everywhere else uses gvm-libs/util/fileutils.h

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: 2c9c7f4

is_container_scanning_report,
result_hosts_only,
result_hosts,
host_summary_buffer);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As host_summary_buffer is always checked for NULL before being used, this could just pass NULL.

Would save the allocation and make this function simpler.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: 51c7b6c

{
term = filter_term (get->filt_id);
if (term == NULL)
return 2;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is leaking elements of ctx. It looks like you're trying to mirror print_report_xml_start, so I think you should do the ctx inits in the same places (so after deriving the filter controls).

Side note: this derive section looks like a good candidate to share between print_report_xml_start and manage_send_report_hosts.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: 2c9c7f4

}

max_results = manage_max_rows (max_results, get->ignore_max_rows_per_page);
(void) first_result;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check because these variables are so verbose: you're doing this so the filter derive code can stay the same as print_report_xml_start, right? The manage_report_filter_controls allows NULL to be passed instead of an address for all these args.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: 2c9c7f4


print_report_context_cleanup (&ctx);

if (xml_dir[0] != '\0')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is always true.
Maybe use a second var like xml_dir = &xml_dir_buffer that is set NULL on mkdtemp failure, or just a flag.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: 2c9c7f4

<summary>Whether to include host details</summary>
<type>boolean</type>
</attrib>
</pattern>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing the lean attrib.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GET_REPORTS doc also mentions lean in the description.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: d3dc449

report_t report;
task_t task;
gchar *usage_type;
gboolean is_container_scanning_report = FALSE;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate init

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: 2c9c7f4

</summary>
</option>
<option>
<name>rows</name>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering about these options. From what I can see they still need to be implemented. I guess that will be quite complex (for one, due to result_hosts_only there may need to also be a result filter in GET_REPORT_HOSTS).

If it's going to be some time until they're implemented then maybe they should be hidden here.

Copy link
Copy Markdown
Member Author

@ozgen ozgen Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: d3dc449

const gchar *usage_type,
gboolean is_container_scanning_report,
int lean,
gmp_parser_t *parser)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a bad idea to pull the GMP parser down into the manage layer. We already have the pattern of a send arg in manage_send_reports.

I understand that the manage layer is already working with the GMP XML to some extent but I think we should look to reducing this. Perhaps the XML parts can be moved up, or extracted into something that both the GMP layer and manage layer use.

Copy link
Copy Markdown
Member Author

@ozgen ozgen Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually implemented earlier to move it in that direction. Later, @timopollmeier suggested restructuring it this way, so I adjusted accordingly.

If we agree that the previous approach (keeping the GMP parsing out of the manage layer) is the better direction, I can revert and align with that.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed: 2c9c7f4

ozgen and others added 4 commits April 3, 2026 13:21
Co-authored-by: Matt Mundell <32057441+mattmundell@users.noreply.github.com>
Extract duplicated report filter control setup into a shared helper
and use it in manage_report_hosts and manage_sql_report_hosts.

Remove duplicate initialization in gmp_report_hosts.c and guard
temporary directory cleanup with a dedicated mkdtemp success flag.
Document the lean parameter for report hosts and remove options that are
not currently implemented to keep the GMP documentation aligned with
actual behavior.
@ozgen ozgen requested a review from mattmundell April 3, 2026 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants