Conversation
Extract report host related SQL and XML generation logic from the existing report handling code into dedicated report hosts modules. Keep the existing behavior unchanged and prepare the codebase for a separate get_report_hosts GMP command.
Add management-layer handling for sending report host XML separately from get_reports. Reuse the existing filtering and host selection logic, including result_hosts_only and container scan handling, so host-only responses can be generated without changing the existing report output.
Introduce the get_report_hosts GMP command for retrieving host data from a single report. Support report_id, filter parsing and lean mode, and route the command through the new management-layer report hosts handling.
mattmundell
left a comment
There was a problem hiding this comment.
The new command and the report host extraction look like great ideas.
src/manage_sql.c
Outdated
|
|
||
| #if ENABLE_CONTAINER_SCANNING | ||
| if (is_container_scanning_report) | ||
| if (print_report_hosts_xml (&ctx, |
There was a problem hiding this comment.
I know you're fast, but I would have loved for this extraction part to be a separate initial PR.
It's harder to see what's going on when it's all piled into one big PR. It also can lead to mingling changes in with later commits in this PR.
This is moving some of the report printing out into separate files. I wonder if that's a good idea because the report printing is all linked together with the new context variable. Might make it harder to replace as a whole.
Also, just something to consider, we've combined XML output in our "libmanage" layer and in the long run maybe that belongs more in the upper layers (because it's GMP protocol specific, and less to do with managing the data/scans/schedules).
There was a problem hiding this comment.
Thanks @mattmundell, that makes sense. I see your point that the extraction and the new command ended up mixed together, which makes it harder to review. My intention was to keep the existing logic unchanged and reuse it for the new command with minimal risk, but I agree that a separate PR for the extraction would have made that clearer.
Regarding the split into separate files, the main goal was to isolate the report-host-specific code, not to fully decouple it from the existing report printing path. I agree the code is still tied together through the report printing context, so this is not a full modular separation yet.
Also agreed on the broader architectural point: for this change I stayed close to the existing design to avoid a larger rewrite, but I understand the concern.
There was a problem hiding this comment.
which makes it harder to review
Even just splitting 83d24a8 into multiple commits would make this PR easier to follow:
- a commit that just moves whole functions to new files (even a few commits if there are many functions)
- a commit that moves whole functions around within files (why was
print_report_context_cleanupmoved within manage_sql.c? Maybe just a patch artifact fromstaticremoval, possibly due to the patch being so big, but now I'm unsure...) - a commit that extracts the report host xml printing.
I also highly recommend that every commit compiles, because this would keep out confusing changes, like in 83d24a8 there are stray files in CMakeLists.txt, and there are #includes of files that are missing (eg manage_report_hosts.h).
I know I'm generally old and grumpy, but it's very annoying and time consuming having to sift through so much stuff trying to understand what the intention was, like when I'm looking at potential problems (eg flagged by an agent) or when I'm trying to understand the general design direction.
Regarding the split into separate files, the main goal was to isolate the report-host-specific code, not to fully decouple it from the existing report printing path. I agree the code is still tied together through the report printing context, so this is not a full modular separation yet.
OK cool. What I was trying to say was that all the report printing code could stay together for now, but I understand the idea of separating it cleanly into modules.
There was a problem hiding this comment.
Thanks, this is really helpful feedback.
You're right, the commits got a bit mixed, which makes the PR harder to follow. Splitting them more clearly would definitely help.
Also good point about keeping each commit buildable. The missing includes and CMake issues were not intentional, but I see how that adds confusion.
I will keep this in mind for future changes.
There was a problem hiding this comment.
I also plan to split certificate, CVE, and error results to reduce the XML size. I will keep your suggestions in mind for that too.
There was a problem hiding this comment.
Great, this will help in my drive to reduce the size of print_report_xml_start.
cd7e0c8 to
eaa5ae3
Compare
4ce038c to
45d7914
Compare
src/manage_sql.c
Outdated
|
|
||
| #if ENABLE_CONTAINER_SCANNING | ||
| if (is_container_scanning_report) | ||
| if (print_report_hosts_xml (&ctx, |
There was a problem hiding this comment.
which makes it harder to review
Even just splitting 83d24a8 into multiple commits would make this PR easier to follow:
- a commit that just moves whole functions to new files (even a few commits if there are many functions)
- a commit that moves whole functions around within files (why was
print_report_context_cleanupmoved within manage_sql.c? Maybe just a patch artifact fromstaticremoval, possibly due to the patch being so big, but now I'm unsure...) - a commit that extracts the report host xml printing.
I also highly recommend that every commit compiles, because this would keep out confusing changes, like in 83d24a8 there are stray files in CMakeLists.txt, and there are #includes of files that are missing (eg manage_report_hosts.h).
I know I'm generally old and grumpy, but it's very annoying and time consuming having to sift through so much stuff trying to understand what the intention was, like when I'm looking at potential problems (eg flagged by an agent) or when I'm trying to understand the general design direction.
Regarding the split into separate files, the main goal was to isolate the report-host-specific code, not to fully decouple it from the existing report printing path. I agree the code is still tied together through the report printing context, so this is not a full modular separation yet.
OK cool. What I was trying to say was that all the report printing code could stay together for now, but I understand the idea of separating it cleanly into modules.
src/manage_report_hosts.c
Outdated
| #include "manage_settings.h" | ||
| #include "manage_sql_report_hosts.h" | ||
|
|
||
| #include <util/fileutils.h> |
There was a problem hiding this comment.
For some reason this works, but everywhere else uses gvm-libs/util/fileutils.h
src/manage_report_hosts.c
Outdated
| is_container_scanning_report, | ||
| result_hosts_only, | ||
| result_hosts, | ||
| host_summary_buffer); |
There was a problem hiding this comment.
As host_summary_buffer is always checked for NULL before being used, this could just pass NULL.
Would save the allocation and make this function simpler.
src/manage_report_hosts.c
Outdated
| { | ||
| term = filter_term (get->filt_id); | ||
| if (term == NULL) | ||
| return 2; |
There was a problem hiding this comment.
This is leaking elements of ctx. It looks like you're trying to mirror print_report_xml_start, so I think you should do the ctx inits in the same places (so after deriving the filter controls).
Side note: this derive section looks like a good candidate to share between print_report_xml_start and manage_send_report_hosts.
src/manage_report_hosts.c
Outdated
| } | ||
|
|
||
| max_results = manage_max_rows (max_results, get->ignore_max_rows_per_page); | ||
| (void) first_result; |
There was a problem hiding this comment.
Just to check because these variables are so verbose: you're doing this so the filter derive code can stay the same as print_report_xml_start, right? The manage_report_filter_controls allows NULL to be passed instead of an address for all these args.
src/manage_report_hosts.c
Outdated
|
|
||
| print_report_context_cleanup (&ctx); | ||
|
|
||
| if (xml_dir[0] != '\0') |
There was a problem hiding this comment.
This is always true.
Maybe use a second var like xml_dir = &xml_dir_buffer that is set NULL on mkdtemp failure, or just a flag.
| <summary>Whether to include host details</summary> | ||
| <type>boolean</type> | ||
| </attrib> | ||
| </pattern> |
There was a problem hiding this comment.
Missing the lean attrib.
There was a problem hiding this comment.
The GET_REPORTS doc also mentions lean in the description.
| report_t report; | ||
| task_t task; | ||
| gchar *usage_type; | ||
| gboolean is_container_scanning_report = FALSE; |
src/schema_formats/XML/GMP.xml.in
Outdated
| </summary> | ||
| </option> | ||
| <option> | ||
| <name>rows</name> |
There was a problem hiding this comment.
I'm wondering about these options. From what I can see they still need to be implemented. I guess that will be quite complex (for one, due to result_hosts_only there may need to also be a result filter in GET_REPORT_HOSTS).
If it's going to be some time until they're implemented then maybe they should be hidden here.
src/manage_report_hosts.c
Outdated
| const gchar *usage_type, | ||
| gboolean is_container_scanning_report, | ||
| int lean, | ||
| gmp_parser_t *parser) |
There was a problem hiding this comment.
I think it's a bad idea to pull the GMP parser down into the manage layer. We already have the pattern of a send arg in manage_send_reports.
I understand that the manage layer is already working with the GMP XML to some extent but I think we should look to reducing this. Perhaps the XML parts can be moved up, or extracted into something that both the GMP layer and manage layer use.
There was a problem hiding this comment.
I actually implemented earlier to move it in that direction. Later, @timopollmeier suggested restructuring it this way, so I adjusted accordingly.
If we agree that the previous approach (keeping the GMP parsing out of the manage layer) is the better direction, I can revert and align with that.
Co-authored-by: Matt Mundell <32057441+mattmundell@users.noreply.github.com>
Extract duplicated report filter control setup into a shared helper and use it in manage_report_hosts and manage_sql_report_hosts. Remove duplicate initialization in gmp_report_hosts.c and guard temporary directory cleanup with a dedicated mkdtemp success flag.
Document the lean parameter for report hosts and remove options that are not currently implemented to keep the GMP documentation aligned with actual behavior.
What
get_report_hostsGMP command.Why
References
GEA-1692