Live testing with a fake username ('noonewouldeverusethis7') revealed
29 of the 69 re-enabled sites in #2478 are false positives:
- 18 OP.GG regional trackers (search URL returns results for any input)
- Tom's Guide, Pocket Stars, Rocket Tube, Kerch Forum (identical pages)
- We Heart It, Oracle Community, Mydarling, Librusec (errors/timeouts)
- Twitter Shadowban, Reddit Search (Pushshift), TikTok Online Viewer,
Kali community (fetch errors or unaccounted for)
Root cause: --self-check alone is insufficient for validating
re-enablement. It verifies claimed→CLAIMED and unclaimed→AVAILABLE,
but does not catch sites that return CLAIMED for ANY arbitrary input.
A direct query with a fake username is the required second filter.
Remaining from #2478: 40 cleanly validated sites stay enabled.
Total: 2608 → 2579 enabled sites.
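
The two-stage filter described above can be sketched as follows. This is a minimal illustration, not the project's actual code: `Status`, `passes_self_check`, `passes_fake_username`, and `safe_to_enable` are hypothetical names, and `check` stands in for whatever function queries a site and returns a status for a username.

```python
from enum import Enum
from typing import Callable


class Status(Enum):
    CLAIMED = "claimed"
    AVAILABLE = "available"
    UNKNOWN = "unknown"


# Hypothetical signature: takes a username, returns the status the site reports.
Check = Callable[[str], Status]


def passes_self_check(check: Check, claimed: str, unclaimed: str) -> bool:
    """First filter: the stored usernames map to the expected statuses."""
    return check(claimed) is Status.CLAIMED and check(unclaimed) is Status.AVAILABLE


def passes_fake_username(check: Check, fake: str = "noonewouldeverusethis7") -> bool:
    """Second filter: an arbitrary, never-registered username must be AVAILABLE."""
    return check(fake) is Status.AVAILABLE


def safe_to_enable(check: Check, claimed: str, unclaimed: str) -> bool:
    """A site is re-enabled only if it passes both filters."""
    return passes_self_check(check, claimed, unclaimed) and passes_fake_username(check)


# Illustrative stubs (assumptions, not real sites):
def healthy_site(username: str) -> Status:
    return Status.CLAIMED if username == "faker" else Status.AVAILABLE


def false_positive_site(username: str) -> Status:
    # Reports AVAILABLE only for the one stored unclaimed name, CLAIMED otherwise.
    return Status.AVAILABLE if username == "stored_unclaimed" else Status.CLAIMED
```

With these stubs, `false_positive_site` passes the first filter on its own but is rejected by `safe_to_enable`, which is the failure mode the fake-username query is meant to catch.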
The op.gg engine was broken: both presenseStrs and absenceStrs from the old definition appeared on EVERY page (claimed AND unclaimed) because they matched strings inside a JSON localization bundle embedded in every op.gg page, not the actual rendered HTML.

Fix:
- presenseStrs: 'href="/lol/summoners/': profile links that only appear in search results when a summoner is found (117 hits on claimed, 0 on unclaimed).
- absenceStrs: '>\u201cNo search results for': the rendered H1 tag only present on empty search results (the \u201c left-curly-quote distinguishes it from the JSON template, which uses \\" escaping).

Also:
- URL updated to new domain: op.gg/lol/summoners/search?q=&region= (old www.op.gg/summoners/search now 308-redirects).
- Sites renamed: "OP.GG [LeagueOfLegends] Korea" → "OP.GG LoL Korea" (shorter, same information).

Validation: all 17 regions tested with both 'faker' (claimed, found on all 17) and 'noonewouldeverusethis7' (unclaimed, not found on all 17). Zero false positives. Triple-checked.

op.gg engine stats: 0/17 (0%) → 17/17 (100%).
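
How the two markers separate the page types can be illustrated with a small sketch. It assumes a simple substring check with absence markers evaluated first; that ordering, the `classify_page` function, and the sample HTML fragments are hypothetical and not taken from the project or from real op.gg pages. Only the two marker strings come from the commit message above.

```python
PRESENCE_STRS = ['href="/lol/summoners/']
ABSENCE_STRS = [">\u201cNo search results for"]


def classify_page(html: str, presence_strs: list[str], absence_strs: list[str]) -> str:
    """Classify a fetched page by plain substring markers (illustrative only)."""
    if any(marker in html for marker in absence_strs):
        return "AVAILABLE"
    if any(marker in html for marker in presence_strs):
        return "CLAIMED"
    return "UNKNOWN"


# Made-up fragment standing in for the JSON localization bundle that is
# embedded in every page; note the backslash-escaped straight quote.
BUNDLE = '<script>{"empty":"\\"No search results for {q}\\""}</script>'

claimed_page = BUNDLE + '<a href="/lol/summoners/kr/faker">faker</a>'
unclaimed_page = (
    BUNDLE + "<h1>\u201cNo search results for noonewouldeverusethis7\u201d</h1>"
)

claimed_result = classify_page(claimed_page, PRESENCE_STRS, ABSENCE_STRS)
unclaimed_result = classify_page(unclaimed_page, PRESENCE_STRS, ABSENCE_STRS)
```

Because the absence marker starts with `>` followed by the left curly quote, it matches only the rendered heading and not the escaped template text in the bundle, so the bundle alone yields neither marker.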
Remove site database changes that don't belong in the AI mode branch:
- Revert 'Re-disable 29 false positives from #2478' (e776dd9)
- Revert 'Fix op.gg engine, rename sites, re-enable 17 LoL regions' (686dc31)

Restore data.json and sites.md to the state of 777e503 (Re-enable 69 stale-disabled sites validated via self-check #2478). Update the date in sites.md and data_sha256 in db_meta.json to match the restored data.

Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com>
Done in a27b656. The changes from e776dd9 and 686dc31 have been reverted: data.json, sites.md, and db_meta.json are restored to the state from 777e503 (before those two commits). Since force-pushing isn't available, the revert is applied as a new commit on top of the branch rather than removing the commits from history.