Conversation
|
Could you show the screenshow at least 3-4 mins data? One dot in only one minute is not very clear. |
There was a problem hiding this comment.
Review: Support Virtual-GenAI monitoring
Critical Issues
1. Config file exclude mismatch in server-starter/pom.xml
Exclude says gen-ai-settings.yml but actual file is gen-ai-config.yml.
2. requiredModules() returns empty array
GenAIAnalyzerModuleProvider uses CoreModule in start() but doesn't declare it in requiredModules(). Should return new String[] { CoreModule.NAME }.
3. Module naming convention violation
Existing analyzer modules use lowercase-hyphenated: agent-analyzer, log-analyzer, meter-analyzer. New module genAI-analyzer should be gen-ai-analyzer.
4. Package should be org.apache.skywalking.oap.server.analyzer.genai
Currently uses org.apache.skywalking.oap.meter.analyzer.* which collides with the existing meter-analyzer module's package. Since this is a trace span analyzer (shared across SkyWalking/OTEL/Zipkin receivers), the package should be org.apache.skywalking.oap.server.analyzer.genai.*.
Design Issues
5. Duplicate OAL metric
gen_ai_provider_resp_time and gen_ai_provider_latency_avg both compute from(GenAIProviderAccess.latency).longAvg(). Remove one.
6. totalCost semantics are confusing
Stored value is tokens * costPerM, dashboard divides by 1,000,000. Better to store actual cost by dividing at computation time.
7. Missing NamingControl in VirtualGenAIProcessor
Other virtual processors all use NamingControl to normalize service names. GenAI processor skips this.
8. Tag key inconsistency: gen_ai.stream.ttfr vs timeToFirstToken
Tag says "ttfr", field says "timeToFirstToken", doc doesn't mention this tag at all.
Code Quality Issues
9. GenAIConfigLoader constructor ignores Yaml parameter
Accepts Yaml but creates a new one in loadConfig().
10. fastjson dependency in e2e test
No new dependency version should be added directly in sub-module pom.xml.
Dependencies are managed by BOM. We have decided not to include this repo as it had a lot of critical CVEs before. We have to fix those(re-release patch version), it is too pain.
11. E2E Dockerfile clones unpinned external repo
Dockerfile.provider clones spring-projects/spring-ai-examples without pinning a commit/tag. Any upstream change could break the e2e test.
12. Documentation typo
virtual-genai.md: "Virtual cache represent the Generative AI service nodes" - copy-paste from virtual-cache doc.
Minor Issues
13. Missing newline at end of file in multiple files: gen-ai-config.yml, menu.yaml, SPI files, e2e expected YAMLs, dashboard JSONs.
14. GenAIModelAccessDispatcher bypasses normal dispatch flow - directly calls MetricsStreamProcessor.getInstance().in(traffic).
15. VirtualGenAIProcessor.recordList should be final.
16. Blank line in import block in VirtualServiceAnalysisListener.java between java.util and lombok imports.
There was a problem hiding this comment.
Additional issue: should use percentile2 instead of percentile
All production OAL files use percentile2(10). The old percentile function only exists in e2e test OAL for backward-compatibility testing.
In virtual-gen-ai.oal, the following lines should use percentile2:
gen_ai_provider_latency_percentile = from(GenAIProviderAccess.latency).percentile2(10);
gen_ai_model_latency_percentile = from(GenAIModelAccess.latency).percentile2(10);
gen_ai_model_ttft_percentile = from(GenAIModelAccess.timeToFirstToken).filter(timeToFirstToken > 0).percentile2(10);
And your UI doesn't show the correct percentile labels.
|
@wu-sheng |
|
UI side got merged. When you update this PR, please include the submodule update. |
|
not yet finish, some check fails in my local env, still fixing |
| config: test/e2e-v2/cases/zipkin/kafka/e2e.yaml | ||
| - name: Zipkin BanyanDB | ||
| config: test/e2e-v2/cases/zipkin/banyandb/e2e.yaml | ||
| - name: Virtual-genai |
There was a problem hiding this comment.
| - name: Virtual-genai | |
| - name: Virtual GenAI |
| @@ -0,0 +1,16 @@ | |||
| # Virtual GenAI | |||
There was a problem hiding this comment.
You need to update the demo to point to here. I think from Marketplace/General Service?
There was a problem hiding this comment.
Not just this. menu.yml is not updated in the /docs/en
|
e2e fails, please fix it. |

CHANGESlog.