Implement WebSearch and WebOpen with Playwright and DDGS integration by anakori · Pull Request #15 · xemantic/golem-xiv

anakori · 2026-01-15T00:40:10Z

This PR adds new web searching and improved web browsing capabilities for Golem:

Web Search via DDGS
Web browsing with JavaScript rendering via Playwright, with jina.ai as lightweight fallback
Session for authenticated workflows (login, cookies, localStorage persistence)

Web searching

Minimal Python/FastAPI wrapper for DDGS (Dux Distributed Global Search) metasearch library

search backends: bing, brave, duckduckgo, google, wikipedia and more
grokipedia is filtered out. We are NOT poisoning Golem with Elon Musk garbage
configurable parameters: region, safesearch, time filters, etc.
gradle handles venv creation and dependency management

Endpoints:

GET /health - health check
GET /search - web search with full parameter control
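
A request to the /search endpoint listed above could be assembled roughly as follows. This is only a sketch: the query parameter names (`q`, `region`, `safesearch`, `timelimit`) and the helper `buildSearchUri` are assumptions made for illustration, since the PR description does not spell out the service's parameter names.

```kotlin
import java.net.URI
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical helper: builds a GET /search URI for the local DDGS wrapper.
// The parameter names are assumed, not taken from ddgs_service.py.
fun buildSearchUri(
    baseUrl: String,
    query: String,
    region: String = "us-en",
    safesearch: String = "moderate",
    timelimit: String? = null
): URI {
    val params = buildList {
        add("q" to query)
        add("region" to region)
        add("safesearch" to safesearch)
        // Optional filter is only sent when set
        if (timelimit != null) add("timelimit" to timelimit)
    }
    val queryString = params.joinToString("&") { (name, value) ->
        "$name=" + URLEncoder.encode(value, StandardCharsets.UTF_8)
    }
    return URI.create("$baseUrl/search?$queryString")
}
```

Calling `buildSearchUri("http://localhost:8001", "Kotlin coroutines", timelimit = "w")` yields `http://localhost:8001/search?q=Kotlin+coroutines&region=us-en&safesearch=moderate&timelimit=w`.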

Gradle tasks:

./gradlew createVenv - create Python venv
./gradlew installDdgsDeps - install Python dependencies
./gradlew runDdgsSearch - start the service on localhost:8001; runs createVenv and installDdgsDeps first if they have not been run yet

Web browsing

Playwright is initialized with a graceful fallback in case the bundled browser is unavailable:
- try bundled Playwright chromium first
- fall back to system chromium at known paths
- supports specifying a non-standard chromium binary path via the --chromium-path CLI option
DefaultWebBrowser is created if Playwright succeeds
DefaultWeb, with the optional browser, is injected into GolemScriptDependencyProvider
Resources are cleaned up on shutdown
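
The fallback order described above can be sketched independently of Playwright. `LaunchedBrowser`, `Launcher`, and `launchWithFallback` are hypothetical names used only for illustration; the actual code in this PR works with Playwright's browser objects and may be structured differently.

```kotlin
// Hypothetical stand-in for a started browser; the real code would hold a Playwright browser.
data class LaunchedBrowser(val source: String)

// A named launch attempt, e.g. "bundled chromium" or "system chromium".
class Launcher(val name: String, val launch: () -> LaunchedBrowser)

// Tries each launcher in order and returns the first browser that starts.
// Returns null when every attempt fails, so the caller can fall back to jina.ai.
fun launchWithFallback(launchers: List<Launcher>): LaunchedBrowser? {
    for (launcher in launchers) {
        val result = runCatching(launcher.launch)
        result.onFailure { println("Launching ${launcher.name} failed: ${it.message}") }
        result.getOrNull()?.let { return it }
    }
    return null
}
```

With a failing "bundled" launcher followed by a working "system" launcher, the function returns the system browser; with only failing launchers it returns null.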

HTML to markdown conversion:

custom HTML to markdown converter using jsoup (replaces Flexmark dependency). I am planning to extend it in the near future.
handles tables, nested lists, code blocks, images, and links
two operation modes:
- keepPagesOpen=false: default, fresh page per request, closed after use
- keepPagesOpen=true: reuses page for debugging with --show-browser

`Web` Interface

interface Web {
    suspend fun open(url: String): String
    suspend fun openInSession(sessionId: String, url: String): String
    suspend fun closeSession(sessionId: String)
    fun listSessions(): Set<String>
    suspend fun search(
        query: String,
        provider: String? = null,  // "ddgs" (default, works well) or "anthropic" (not yet implemented, expensive)
        page: Int = 1,
        pageSize: Int = 10,
        region: String = "us-en",
        safesearch: String = "moderate",
        timelimit: String? = null  // "d", "w", "m", "y"
    ): String
}

`WebBrowser` Interface

interface WebBrowser {
    suspend fun open(url: String): String
    suspend fun openInSession(sessionId: String, url: String): String
    suspend fun closeSession(sessionId: String)
    fun listSessions(): Set<String>
}

CLI options:

--show-browser: run chromium in non-headless mode (window visible)
--chromium-path=/path/to/chromium: use chromium installed in non-standard path

Usage in GolemScript

// Simple web search
val results = web.search("xemantic AI")

// Filtered search
val recentResults = web.search(
    query = "Kotlin coroutines",
    timelimit = "w",  // Last week
    pageSize = 20
)

// Get webpage content
val content = web.open("https://example.com")

// TODO/WIP: Authenticated browsing session
// I plan on adding support for interacting with websites with Golem
web.openInSession("github", "https://github.com/login")
val privateData = web.openInSession("github", "https://github.com/settings/profile")
web.closeSession("github")

Tests

Unit tests (mocked)

DefaultWebTest.kt:

open() with Playwright success/failure scenarios
open() fallback to jina.ai
openInSession() behavior
search() with DDGS service
error handling for various failure modes

DefaultWebBrowserTest.kt:

HTML to Markdown conversion
headings, paragraphs, links, images, lists, tables
code blocks and blockquotes

Integration tests

DefaultWebIntegrationTest.kt:

real DDGS service integration
verifies search result format and content

DefaultWebBrowserIntegrationTest.kt:

real Playwright browser integration
tests actual web page fetching
tests session creation and management

Integration tests are tagged with @Tag("integration") and skip gracefully when required services are unavailable

Dependencies

Python dependencies

fastapi
uvicorn
ddgs

JVM dependencies

jsoup (for HTML parsing in Playwright module)

How to run it

Open 4 terminals

Terminal #1 - start Neo4j

./gradlew runNeo4j

Terminal #2 - start DDGS service

./gradlew runDdgsSearch

Terminal #3 - start Golem

export ANTHROPIC_API_KEY=your_key
./gradlew run

Optionally, if you want to see the web browser used by Playwright:

export ANTHROPIC_API_KEY=your_key
./gradlew run --args="--show-browser"

If you want to specify a chromium binary located in a non-standard path:

export ANTHROPIC_API_KEY=your_key
./gradlew run --args="--show-browser --chromium-path=/snap/bin/chromium"

Terminal #4 - web UI

./gradlew jsBrowserDevelopmentRun --continuous

Running tests

# Unit tests only
./gradlew :golem-xiv-core:test --tests "*DefaultWebTest*"
./gradlew :golem-xiv-playwright:test --tests "*DefaultWebBrowserTest*"

# Integration tests (start DDGS service first!)
./gradlew :golem-xiv-core:test --tests "*Integration*"
./gradlew :golem-xiv-playwright:test --tests "*Integration*"

More

I would like to take it further and let Golem see the web by taking screenshots, and allow it to click on elements, log in to websites, fill in forms, etc. I am considering extending/improving markanywhere so it can be used as the HTML to markdown converter, but I am not sure about that yet.

morisil

I left my initial comments, there is much more, but I think we should start by drafting the architecture together first. Let's have a meeting focused on that, and then we can proceed with the implementation.

anakori · 2026-02-09T14:02:17Z

Change 1: New SearchProvider Interface

New file: golem-xiv-api-backend/src/main/kotlin/Web.kt:

interface SearchProvider {
    suspend fun search(
        query: String,
        page: Int = 1,
        pageSize: Int = 10,
        region: String = "us-en",
        safeSearch: String = "moderate",
        timeLimit: String? = null
    ): String
}

The original Web interface combined two distinct responsibilities:

content fetching (open, openInSession)
web searching

By enforcing interface segregation:

each search provider (DDGS, Anthropic) can implement just the search contract
Web interface stays focused on content fetching
new providers can be added without modifying Web interface
different search implementations can be swapped at runtime

Change 2: renamed open() to fetch() with content negotiation

What changed:

Before:

suspend fun open(url: String): String

After:

val MarkdownContentType = ContentType("text", "markdown")
suspend fun fetch(url: String, accept: ContentType = MarkdownContentType): String

Removed from Web interface:

suspend fun openInSession(sessionId: String, url: String): String
suspend fun closeSession(sessionId: String)

For now we want to simplify web.open() by dropping session management and leaving only basic content fetching. The name web.fetch() is more accurate because it suggests that we only fetch content, whereas web.open() might suggest persistent browser state.
The ContentType parameter enables future support for different output formats (HTML, plain text, JSON).
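
One way the accept parameter could eventually drive the output format is sketched below. The `ContentType` class here is a minimal stand-in defined for this example, and `HtmlContentType` and `renderAs` are hypothetical names; none of this is the implementation in the PR.

```kotlin
// Minimal stand-in for the ContentType used in the PR, defined here so the sketch is self-contained.
data class ContentType(val type: String, val subtype: String)

val MarkdownContentType = ContentType("text", "markdown")
val HtmlContentType = ContentType("text", "html") // hypothetical second format

// Hypothetical helper: picks the output representation of already fetched HTML.
// toMarkdown stands in for whatever HTML to markdown converter is configured.
fun renderAs(
    html: String,
    accept: ContentType = MarkdownContentType,
    toMarkdown: (String) -> String
): String = when (accept) {
    MarkdownContentType -> toMarkdown(html)
    HtmlContentType -> html
    else -> throw IllegalArgumentException(
        "Unsupported content type: ${accept.type}/${accept.subtype}"
    )
}
```

Rejecting unknown types explicitly, rather than silently returning markdown, keeps callers from assuming a format that was never produced.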

Change 3: search provider map injection

Before:

override suspend fun search(..., provider: String?, ...): String {
    return when (provider) {
        "anthropic" -> throw UnsupportedOperationException(...)
        "ddgs", null -> searchWithDdgs(...)
        else -> throw IllegalArgumentException(...)
    }
}

After:

class DefaultWeb(
    private val searchProviders: Map<String?, SearchProvider>,
    private val httpClient: HttpClient,
    ...
) : Web {
    override suspend fun search(..., provider: String?, ...): String =
        searchProviders[provider]?.search(...)
            ?: throw IllegalArgumentException("Unknown search provider: $provider")
}

By removing hardcoded web search provider logic, we remove the need to modify DefaultWeb when adding new web search providers. Providers are now injected from outside and DefaultWeb class doesn't need to know how any provider works.
This also means we can inject mocked web search providers for testing without needing the actual DDGS service running or Anthropic API key. DefaultWeb is now open for extension (new providers), but closed for modification.
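
The testing benefit can be illustrated with a simplified sketch. To stay free of coroutine dependencies, the contract below is a non-suspending, single-parameter version of SearchProvider, and `SketchWeb` is a hypothetical class standing in for DefaultWeb.

```kotlin
// Simplified, non-suspending version of the SearchProvider contract (an assumption for this sketch).
fun interface SearchProvider {
    fun search(query: String): String
}

// Hypothetical stand-in for DefaultWeb: it only delegates to whatever providers were injected.
class SketchWeb(private val searchProviders: Map<String?, SearchProvider>) {
    fun search(query: String, provider: String? = null): String =
        searchProviders[provider]?.search(query)
            ?: throw IllegalArgumentException("Unknown search provider: $provider")
}

// A fake provider for tests: no DDGS service and no API key required.
val fakeProvider = SearchProvider { query -> "fake result for '$query'" }

// The null key marks the default provider used when the caller names none.
val webUnderTest = SketchWeb(mapOf(null to fakeProvider, "ddgs" to fakeProvider))
```

Here `webUnderTest.search("xemantic AI")` returns the fake result through the null (default) key, while `webUnderTest.search("xemantic AI", provider = "bing")` throws IllegalArgumentException with the message "Unknown search provider: bing".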

Change 4: new golem-xiv-ddgs module

New golem-xiv-ddgs module contains DdgsSearchProvider implementing SearchProvider:

class DdgsSearchProvider(
    httpClient: HttpClient,
    ddgsServiceUrl: String = "http://localhost:8001"
) : SearchProvider {
    override suspend fun search(...): String { ... }
}

Before:

golem-xiv-core
  └── DefaultWeb (contains DDGS logic embedded)

After:

golem-xiv-ddgs
  └── DdgsSearchProvider (isolated)
golem-xiv-cognizer-anthropic
  └── AnthropicSearchProvider (isolated)
golem-xiv-core
  └── DefaultWeb (delegates to injected providers)

Golem is supposed to have a modular architecture, allowing for easy composition of custom Golem distributions tailored to the specific needs of specific clients. With this change:

DDGS-specific code lives in its own module with its own dependencies
projects that don't need DDGS can exclude this module entirely
DDGS integration tests can run in isolation
the module clearly encapsulates everything DDGS-related

Change 5: AnthropicSearchProvider in golem-xiv-cognizer-anthropic module

New file: golem-xiv-cognizer-anthropic/src/main/kotlin/AnthropicSearchProvider.kt

class AnthropicSearchProvider : SearchProvider {
    override suspend fun search(...): String {
        TODO("Not yet implemented")
    }
}

Why this is needed:

Anthropic-specific code belongs in the Anthropic cognizer module, since this module already depends on anthropic-sdk-kotlin and can properly integrate with the Anthropic API.

Other minor changes:

correct camelCase naming convention (safeSearch, timeLimit)
removed implementation details from Golem constitution (only API usage shown)
removed misleading "backward compatibility" comment on http service

claude · 2026-02-09T14:05:27Z

Pull Request Review: Web Search and Web Browsing Integration

Summary

This is a well-structured PR that adds comprehensive web search and browsing capabilities to Golem XIV. The implementation demonstrates good architectural decisions, thorough testing, and attention to operational concerns.

🟢 Strengths

Architecture & Design

Clean separation of concerns: Modular design with separate packages follows SOLID principles well
Graceful degradation: Excellent fallback chain (bundled Chromium → system Chromium → jina.ai)
Provider pattern for search: Pluggable SearchProvider interface enables easy extensibility
Session management: Clean separation between stateless fetching and stateful browsing
Resource management: Proper cleanup on shutdown (GolemServer.kt:269-276)

Code Quality

Excellent error handling: Informative messages with clear remediation steps
Comprehensive logging: Debug logs with content previews aid troubleshooting
Thread-safe sessions: Proper use of ConcurrentHashMap, Mutex, and fast-path optimization
Well-tested: Both unit tests (mocked) and integration tests with graceful skipping

Documentation

Clear CLAUDE.md updates: Accurate feature descriptions and setup requirements
Helpful KDoc comments: Clear interface documentation with examples
Good inline comments: E.g., grokipedia filtering rationale

🟡 Areas for Improvement

Security Concerns ⚠️

HIGH PRIORITY

Missing URL validation in DefaultWeb.fetch() (DefaultWeb.kt:44)
- Risk: SSRF vulnerability - could fetch internal URLs like http://localhost:8001/admin, file:///etc/passwd, or cloud metadata (http://169.254.169.254/)
- Fix: Add URL validation to allow only http/https and block localhost/private IPs
Exception stack traces in Python service (ddgs_service.py:88)
- exc_info=True could leak sensitive info in production
- Consider sanitizing error messages

MEDIUM PRIORITY

No resource limits beyond navigationTimeoutMs
- keepPagesOpen mode could accumulate pages
- Consider page-level resource limits

Performance Considerations

HTML to Markdown is synchronous (WebBrowsing.kt:276-517)
- Large HTML could block dispatcher
- Consider withContext(Dispatchers.Default) for CPU-intensive parsing
No explicit connection pooling
- Add HttpTimeout and pool config to webHttpClient (GolemServer.kt:222)
Session cleanup is manual-only
- Abandoned sessions could leak memory
- Add TTL or LRU eviction in DefaultWebBrowser
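
A TTL-based eviction along the lines suggested above might look like the following sketch. `SessionRegistry` and its methods are hypothetical and not part of DefaultWebBrowser; time is passed in explicitly so the behavior is easy to test.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical registry that forgets sessions not used within ttlMillis.
class SessionRegistry<S : Any>(
    private val ttlMillis: Long,
    private val onEvict: (S) -> Unit = {} // e.g. close the browser context
) {
    private class Entry<T>(val session: T, lastAccess: Long) {
        @Volatile var lastAccess: Long = lastAccess
    }

    private val entries = ConcurrentHashMap<String, Entry<S>>()

    // Returns the existing session or creates one, refreshing its last-access time.
    fun getOrCreate(id: String, now: Long, create: () -> S): S {
        val entry = entries.computeIfAbsent(id) { Entry(create(), now) }
        entry.lastAccess = now
        return entry.session
    }

    // Removes every session idle for longer than the TTL and returns the evicted ids.
    fun evictExpired(now: Long): Set<String> {
        val evicted = mutableSetOf<String>()
        for ((id, entry) in entries) {
            if (now - entry.lastAccess > ttlMillis && entries.remove(id, entry)) {
                onEvict(entry.session)
                evicted += id
            }
        }
        return evicted
    }

    fun listSessions(): Set<String> = entries.keys.toSet()
}
```

A periodic task could call `evictExpired(System.currentTimeMillis())`; a session touched between the expiry check and the removal may still be evicted in this simple version, which a production implementation would need to address.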

Code Quality Issues

MEDIUM PRIORITY

Magic numbers in HTML converter (WebBrowsing.kt:304)
- Extract listOf('\n', ' ', '[', '(') to named constants
Duplicate code in DefaultWebBrowser
- openWithFreshPage() and openKeepingPageOpen() have ~70% duplication
- Extract common navigation/conversion logic
Inconsistent null handling
- searchProviders map allows null keys but filters them in error messages
- Use null consistently or explicit "default" key
Session API partially exposed
- DefaultWeb has openInSession(), closeSession(), listSessions() (lines 75-102) but not in Web interface
- Either remove if unfinished or add to interface - current state is confusing

LOW PRIORITY

Python service binds to 0.0.0.0 (ddgs_service.py:101)
- Use 127.0.0.1 for better local security
Verbose debug logging
- Full content previews (DefaultWeb.kt:49-60, 114-127) could impact performance
- Make configurable or use TRACE level

Test Coverage Gaps

No concurrent session tests - Add test with multiple coroutines accessing same session
No Playwright fallback integration test - Verify Playwright → jina.ai fallback
No malformed HTML edge cases - Test deeply nested, malformed, and very large HTML
Missing negative test cases - Test invalid providers, negative page numbers, etc.

🔵 Additional Observations

Build System

Python venv management: Well-done with OS-specific handling and onlyIf optimization
Dependencies: Current and appropriate (jsoup is a solid choice)

API Design

Web.fetch() ignores ContentType parameter (GolemScriptApi.kt:100)
- Either implement content negotiation or remove parameter
String-based enums for safeSearch, timeLimit
- Acceptable for flexibility but consider validation/documentation

🟢 Minor Suggestions

Add health check endpoint to Golem server (like DDGS /health)
Make jina.ai URL configurable (currently hardcoded at DefaultWeb.kt:105)
Add metrics: search counts, Playwright vs jina.ai ratio, response times
Consider retry logic with exponential backoff for network failures
Add privacy-focused browser args when launching Chromium
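
The retry suggestion above could be sketched as below. `retryWithBackoff` is a hypothetical helper written in blocking style for self-containment; coroutine-based code would presumably suspend instead of calling Thread.sleep.

```kotlin
// Hypothetical helper: runs block up to maxAttempts times,
// multiplying the wait between attempts by factor after each failure.
fun <T> retryWithBackoff(
    maxAttempts: Int = 3,
    initialDelayMillis: Long = 100,
    factor: Double = 2.0,
    sleep: (Long) -> Unit = { Thread.sleep(it) }, // injectable so tests need not wait
    block: () -> T
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var delayMillis = initialDelayMillis
    repeat(maxAttempts - 1) {
        try {
            return block()
        } catch (e: Exception) {
            sleep(delayMillis)
            delayMillis = (delayMillis * factor).toLong()
        }
    }
    // Final attempt: any exception propagates to the caller.
    return block()
}
```

With the defaults, a call that fails twice and then succeeds waits 100 ms and then 200 ms before returning the result of the third attempt.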

🎯 Recommendations

Must Fix Before Merge

SSRF protection: Add URL validation
Session API: Remove unfinished methods or add to interface

Should Fix Before Merge

ContentType parameter: Implement or remove
Code duplication: Extract common logic in DefaultWebBrowser
Session cleanup: Add TTL mechanism

Can Address Later

Performance optimizations (dispatcher, pooling)
Additional test coverage
Metrics and observability

✅ Conclusion

This is a high-quality PR that adds significant value. The implementation is well-architected, thoroughly tested, and demonstrates excellent engineering practices. The fallback mechanisms and error handling are particularly well done.

Recommendation: Approve with minor changes. Address the security concerns (SSRF) and API consistency issues, and this will be ready to merge.

Great work! This feature will significantly enhance Golem's capabilities.

Reviewed: 28 files (+3435/-53 lines) | Date: 2026-02-09

anakori · 2026-02-09T16:42:19Z

Ad security concerns

Ad 1. Missing URL validation in `DefaultWeb.fetch()`:

I believe Claude Bot's SSRF concern here is a false positive, at least for now.

Traditional SSRF applies when:

An external, untrusted user supplies a URL to a web-facing service
The server fetches that URL with its own privileges, exposing internal infrastructure

Golem's architecture is fundamentally different:

The entity calling web.fetch() is Golem itself, the AI agent running GolemScript
The human operator runs Golem locally and grants it access to their machine
There is no external attacker injecting URLs; the AI autonomously decides what to fetch based on its reasoning
Restricting localhost/LAN/file access would break legitimate use cases like fetching web search results from the DDGS service at localhost:8001 or accessing internal development servers

However, for a future deployment that is publicly accessible on the web, we should implement SandboxedWeb, which wraps DefaultWeb with URL filtering, rate limits, etc., without changing DefaultWeb:

class SandboxedWeb(
  private val delegate: Web,
  private val urlPolicy: UrlPolicy  // configurable per deployment
) : Web {
  override suspend fun fetch(url: String, accept: ContentType): String {
      urlPolicy.validate(url)  // throws on disallowed URLs
      return delegate.fetch(url, accept)
  }
  // ...
}
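
A possible shape for the UrlPolicy used above, as a sketch only. The interface signature is inferred from the `urlPolicy.validate(url)` call, and `PublicHttpOnlyPolicy` is a hypothetical implementation. Hostnames are resolved through DNS at validation time, so this alone does not protect against redirects or DNS rebinding.

```kotlin
import java.net.InetAddress
import java.net.URI

// Assumed contract: validate throws when the URL is not allowed.
interface UrlPolicy {
    fun validate(url: String)
}

// Hypothetical policy: only http/https, and no loopback, private, or link-local targets.
class PublicHttpOnlyPolicy : UrlPolicy {
    override fun validate(url: String) {
        val uri = URI(url)
        val scheme = uri.scheme?.lowercase()
        require(scheme == "http" || scheme == "https") { "Scheme not allowed: $scheme" }
        val host = requireNotNull(uri.host) { "URL has no host: $url" }
        // IP literals are parsed directly; hostnames trigger a DNS lookup.
        val address = InetAddress.getByName(host)
        val internal = address.isLoopbackAddress ||
            address.isSiteLocalAddress ||  // 10/8, 172.16/12, 192.168/16
            address.isLinkLocalAddress ||  // 169.254/16, includes cloud metadata
            address.isAnyLocalAddress
        require(!internal) { "Internal address not allowed: $host" }
    }
}
```

With this policy, `file:///etc/passwd`, `http://169.254.169.254/`, and `http://127.0.0.1:8001/search` are all rejected with IllegalArgumentException, which is why a local deployment would need a more permissive policy to keep reaching the DDGS service.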

@morisil do you want me to implement SandboxedWeb now, in this PR?

Ad 2. Exception stack traces in Python service `(ddgs_service.py:88)`:

exc_info=True on line 88 writes the stack trace to the server-side log (stdout/stderr). It does NOT send it to the client.
I changed the DDGS service to bind to 127.0.0.1, so it is only reachable from localhost. The only consumer is Golem's process on the same machine. There's no external party to "leak" to.

Ad 3. No resource limits beyond `navigationTimeoutMs`

keepPagesOpen reuses a single page:

val page = statelessPage ?: browser.newPage().also {
  statelessPage = it
}

It creates one page and navigates it to different URLs. Pages don't accumulate. The statelessPageMutex ensures serialized access. In keepPagesOpen=false mode (the default), each page is created and closed in a try/finally block.

Ad performance considerations

Ad 1. HTML to Markdown is synchronous

We will migrate HTML to Markdown conversion to markanywhere later, so this is not an issue for now.

Ad 2. No explicit connection pooling

Java's built-in HttpClient has connection pooling by default

Ad 3. Session cleanup is manual-only

Sessions aren't exposed in the Web interface yet

Ad code quality issues

Ad 1. Magic numbers in HTML converter

We will migrate HTML to Markdown conversion to markanywhere later, so this is not an issue for now.

Ad 2. Duplicate code in DefaultWebBrowser

We could extract navigateAndConvert(page, url): String helper, but since we're planning to migrate to markanywhere anyway, that refactoring would be thrown away

Ad 3. Inconsistent null handling

The null key is intentional as it represents the default provider. When Web.search() is called without specifying a provider, the map lookup searchProviders[null] resolves to DDGS

Ad 4. Session API partially exposed

This is intentional; we are planning to develop it further in the future

Ad 5. Python service binds to 0.0.0.0

Changed to 127.0.0.1

Ad 6. Verbose debug logging

These are all inside logger.debug { ... } lambda blocks. When the log level is INFO or higher (which it will be in normal operation), the lambdas are never executed, so there is no performance overhead in production
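
The lazy-evaluation behavior can be demonstrated with a tiny stand-in logger. `SketchLogger` and `PreviewCounter` are hypothetical names and this is not the logging library used by Golem; the sketch only shows the mechanism of passing the message as a lambda.

```kotlin
// Hypothetical minimal logger: the message lambda runs only when debug is enabled.
class SketchLogger(private val debugEnabled: Boolean) {
    fun debug(message: () -> String) {
        if (debugEnabled) println("DEBUG " + message())
    }
}

// Counts how often the (potentially expensive) preview is actually built.
class PreviewCounter {
    var built = 0
        private set

    fun preview(content: String): String {
        built++
        return "content preview: " + content.take(20)
    }
}
```

Logging `{ counter.preview(page) }` through `SketchLogger(debugEnabled = false)` leaves `counter.built` at 0, while the same call through an enabled logger raises it to 1.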

Ad Test Coverage Gaps

Will be developed further once markanywhere is ready as the HTML to markdown converter

Ad Additional Observations

API design

Ad 1. `Web.fetch()` ignores ContentType parameter

Will be developed further

anakori · 2026-02-09T16:53:12Z

Build failing because GitHub is throwing HTTP 500 errors every few requests 🫠

morisil requested changes Jan 15, 2026

anakori added 2 commits February 4, 2026 21:45

Implement WebSearch and WebOpen with Playwright and DDGS integration

c682603

Implement WebSearch/WebOpen with updated API

7111ead

anakori force-pushed the feature/webSearchAndWebOpen branch from e62486d to 7111ead Compare February 9, 2026 14:00

anakori requested a review from morisil February 9, 2026 14:02

bind DDGS to 127.0.0.1

b057cb1

anakori mentioned this pull request Feb 9, 2026

[Task]: WebSearch simplification and architectural changes to search and open functions #29

Open

2 tasks

Migrate test assertions to xemantic-kotlin-test conventions

0b62f7d

anakori wants to merge 4 commits into main from feature/webSearchAndWebOpen
