diff --git a/.gitignore b/.gitignore index dfe0ab105..f6902a534 100644 --- a/.gitignore +++ b/.gitignore @@ -44,3 +44,4 @@ settings.json *.egg-info build LLM +lib diff --git a/docs/source/use-cases/crypto.rst b/docs/source/use-cases/crypto.rst new file mode 100644 index 000000000..b48e15cea --- /dev/null +++ b/docs/source/use-cases/crypto.rst @@ -0,0 +1,147 @@ +.. _use-case-crypto: + +Cryptocurrency & Web3 Investigations +===================================== + +Blockchain transactions are public, but the people behind wallets are not. Maigret helps bridge this gap by finding Web3 accounts tied to a username, revealing the person behind a pseudonymous crypto persona. + +Why it matters +-------------- + +Crypto investigations often start with a wallet address or an ENS name but hit a wall — the blockchain tells you *what* happened, not *who* did it. A username, however, is reused across platforms. If someone trades on OpenSea as ``zachxbt`` and posts on Warpcast as ``zachxbt``, Maigret connects the dots and builds a full profile. + +Common scenarios: + +- **Scam attribution.** A rug-pull promoter uses the same alias on Fragment (Telegram username marketplace), OpenSea, and a personal blog. +- **Sanctions compliance.** Verifying whether a counterparty's online footprint matches known sanctioned individuals. +- **Due diligence.** Before an OTC deal or DAO vote, checking whether the other party has a consistent online presence or is a freshly created sockpuppet. +- **Stolen funds tracing.** A stolen NFT appears on OpenSea under a new account — but the username matches a Warpcast profile with real-world links. + +Supported sites +--------------- + +Maigret currently checks the following crypto and Web3 platforms: + +.. 
list-table:: + :header-rows: 1 + :widths: 20 40 40 + + * - Site + - What it reveals + - Notes + * - **OpenSea** + - NFT collections, trading history, profile bio, linked website + - + * - **Rarible** + - NFT marketplace profile, collections, listing history + - Complements OpenSea for NFT attribution across marketplaces + * - **Zora** + - Zora Network profile, minted NFTs, creator activity + - Ethereum L2 creator platform; useful for on-chain art attribution + * - **Polymarket** + - Prediction-market profile, positions, public portfolio P&L + - Useful for political/financial prediction attribution + * - **Warpcast** (Farcaster) + - Decentralized social profile, posts, follower graph, Farcaster ID + - Every Farcaster ID maps to an Ethereum address via the on-chain ID registry + * - **Fragment** + - Telegram username ownership, TON wallet address, purchase date and price + - Valuable for linking Telegram identities to TON wallets + * - **Paragraph** + - Web3 blog/newsletter, ETH wallet address, linked Twitter handle + - Richest cross-platform data among crypto sites + * - **Tonometerbot** + - TON wallet balance, subscriber count, NFT collection, rankings + - TON blockchain analytics + * - **Spatial** + - Metaverse profile, linked social accounts (Discord, Twitter, Instagram, LinkedIn, TikTok) + - Rich cross-platform links + * - **Revolut.me** + - Payment handle: first/last name, country code, base currency, supported payment methods + - Not strictly Web3, but widely used by crypto OTC traders for fiat off-ramps; the public API returns structured KYC-adjacent data + +Real-world example: zachxbt +--------------------------- + +`ZachXBT `_ is a well-known on-chain investigator. Let's see what Maigret can find from just the username ``zachxbt``: + +.. 
code-block:: console + + maigret zachxbt --tags crypto + +Maigret finds 5 accounts and automatically extracts structured data from each: + +**Fragment** — confirms the Telegram username ``@zachxbt`` is claimed, reveals the TON wallet address (``EQBisZrk...``), purchase price (10 TON), and date (January 2023). + +**Paragraph** — the richest result. Returns the real name used on the platform (``ZachXBT``), bio (``Scam survivor turned 2D investigator``), an Ethereum wallet address (``0x23dBf066...``), and a linked Twitter handle (``zachxbt``). The ``wallet_address`` field is especially valuable — it directly links the pseudonym to an on-chain identity. + +**Warpcast** — Farcaster profile with a Farcaster ID (``fid: 20931``), profile image, and social graph (33K followers). Every Farcaster ID is tied to an Ethereum address via the on-chain ID registry, so this is another on-chain anchor. + +**OpenSea** — NFT marketplace profile with bio (``On-chain sleuth | 10x rug pull survivor``), avatar (hosted on ``seadn.io`` with an Ethereum address in the URL path), and a link to an external investigations page. + +**Hive Blog** — blockchain-based blog account created in March 2025. Low activity (1 post), but confirms the username is claimed across blockchain ecosystems. + +From a single username, Maigret produces: + +- **2 wallet addresses** — one TON (from Fragment), one Ethereum (from Paragraph) +- **1 confirmed Twitter handle** — ``zachxbt`` (from Paragraph) +- **1 Telegram username** — ``@zachxbt`` (from Fragment) +- **1 external link** — ``investigations.notion.site`` (from OpenSea) +- **Social graph data** — 33K Farcaster followers, blog activity timestamps + +This is enough to pivot into blockchain analysis tools (Etherscan, Arkham, Nansen) using the wallet addresses, or into social media analysis using the Twitter handle. + +Workflow: from username to wallet +--------------------------------- + +**Step 1: Search crypto platforms** + +.. 
code-block:: console + + maigret <username> --tags crypto -v + +Review the results. Pay attention to: + +- **Fragment** — if the username is claimed, you get a TON wallet address directly. +- **Paragraph** — blog profiles often contain an ETH address and a Twitter handle. +- **Warpcast** — Farcaster IDs map to Ethereum addresses via the on-chain registry. +- **OpenSea** — avatar URLs sometimes contain wallet addresses in the path. + +**Step 2: Expand with extracted identifiers** + +Maigret automatically extracts additional identifiers from found profiles (real names, linked accounts, profile URLs) and recursively searches for them. This is enabled by default. If Maigret finds a linked Twitter handle on a Paragraph profile, it will automatically search for that handle across all sites. + +**Step 3: Cross-reference with non-crypto platforms** + +The real power is connecting crypto personas to mainstream accounts. Drop the tag filter: + +.. code-block:: console + + maigret <username> -a + +This checks all 3000+ sites. A match on GitHub, Reddit, or a forum can reveal the person behind the wallet. + +Workflow: from wallet to identity +--------------------------------- + +If you start with a wallet address rather than a username, you can use complementary tools to get a username first: + +1. **ENS / Unstoppable Domains** — resolve the wallet address to a human-readable name (``vitalik.eth``). Then search that name in Maigret. +2. **Etherscan labels** — check if the address has a public label (exchange, known entity). +3. **Fragment** — search the TON wallet address to find which Telegram usernames it purchased. +4. **Arkham Intelligence / Nansen** — blockchain attribution platforms that may tag the address with a known identity. + +Once you have a username candidate, feed it to Maigret. + +Tips +---- + +- **Username reuse is the #1 signal.** Crypto-native users often reuse their ENS name (``alice.eth``) or a variation (``alice_eth``, ``aliceeth``) across platforms. Try all variations. 
+- **Fragment is uniquely valuable** because it directly links Telegram usernames to TON wallet addresses — a rare on-chain / off-chain bridge. +- **Warpcast profiles are Ethereum-native.** Every Farcaster account is tied to an Ethereum address via the ID registry contract. If you find a Warpcast profile, you implicitly have a wallet address. +- **Paragraph often has the richest data** — wallet address, Twitter handle, bio, and activity timestamps in a single API response. +- **Use** ``--exclude-tags`` **to skip irrelevant sites** when you're focused on crypto: + + .. code-block:: console + + maigret alice_eth --exclude-tags porn,dating,forum diff --git a/maigret/resources/data.json b/maigret/resources/data.json index 3aa755e6c..fb2de4bfc 100644 --- a/maigret/resources/data.json +++ b/maigret/resources/data.json @@ -2620,7 +2620,8 @@ }, "Paypal": { "tags": [ - "finance" + "finance", + "fintech" ], "checkType": "message", "absenceStrs": [ @@ -6041,6 +6042,7 @@ "Venmo": { "tags": [ "finance", + "fintech", "us" ], "checkType": "status_code", @@ -8952,6 +8954,7 @@ "banki.ru": { "disabled": true, "tags": [ + "finance", "ru" ], "engine": "engine404", @@ -16017,7 +16020,8 @@ "usernameUnclaimed": "noonewouldeverusethis7", "alexaRank": 108253, "tags": [ - "finance" + "finance", + "fintech" ] }, "forum.kineshemec.ru": { @@ -24307,7 +24311,12 @@ ], "url": "https://cash.app/${username}", "usernameClaimed": "john", - "usernameUnclaimed": "noonewouldeverusethis7" + "usernameUnclaimed": "noonewouldeverusethis7", + "tags": [ + "finance", + "fintech", + "us" + ] }, "Castingcallclub": { "checkType": "message", @@ -29522,6 +29531,7 @@ "disabled": true, "tags": [ "finance", + "fintech", "ru" ], "urlProbe": "https://api.qiwi.me/piggybox/{username}", @@ -34880,6 +34890,39 @@ "social" ] }, + "Polymarket": { + "url": "https://polymarket.com/@{username}", + "urlMain": "https://polymarket.com", + "checkType": "status_code", + "usernameClaimed": "shayne", + "usernameUnclaimed": 
"noonewouldeverusethis7", + "tags": [ + "crypto" + ] + }, + "Zora": { + "url": "https://zora.co/@{username}", + "urlMain": "https://zora.co", + "checkType": "status_code", + "usernameClaimed": "jacob", + "usernameUnclaimed": "noonewouldeverusethis7", + "tags": [ + "crypto", + "nft" + ] + }, + "Revolut.me": { + "url": "https://revolut.me/{username}", + "urlProbe": "https://revolut.me/api/web-profile/{username}", + "urlMain": "https://revolut.me", + "checkType": "status_code", + "usernameClaimed": "adamp", + "usernameUnclaimed": "noonewouldeverusethis7", + "tags": [ + "finance", + "fintech" + ] + }, "Fragment": { "url": "https://fragment.com/username/{username}", "urlMain": "https://fragment.com", @@ -35385,6 +35428,7 @@ "erotic", "fashion", "finance", + "fintech", "forum", "freelance", "gambling", diff --git a/maigret/resources/db_meta.json b/maigret/resources/db_meta.json index 48ee4f474..faf8572c1 100644 --- a/maigret/resources/db_meta.json +++ b/maigret/resources/db_meta.json @@ -1,8 +1,8 @@ { "version": 1, - "updated_at": "2026-04-21T08:51:46Z", - "sites_count": 3143, + "updated_at": "2026-04-21T09:05:55Z", + "sites_count": 3144, "min_maigret_version": "0.6.0", - "data_sha256": "839ad4bc48719e4145564ef55c96995e9dbf91e43cda1fdac96d7d3475794caa", + "data_sha256": "8a24e0d08b964fb1d6133e90c5580f5dd73d4aba8f567013e7df831e530624bd", "data_url": "https://raw.githubusercontent.com/soxoj/maigret/main/maigret/resources/data.json" } \ No newline at end of file diff --git a/sites.md b/sites.md index a828e4bfd..24d3e5a6f 100644 --- a/sites.md +++ b/sites.md @@ -1,5 +1,5 @@ -## List of supported sites (search methods): total 3143 +## List of supported sites (search methods): total 3144 Rank data fetched from Majestic Million by domains. @@ -132,7 +132,7 @@ Rank data fetched from Majestic Million by domains. 1. ![](https://www.google.com/s2/favicons?domain=https://www.fiverr.com/) [Fiverr (https://www.fiverr.com/)](https://www.fiverr.com/)*: top 1K, shopping* 1. 
![](https://www.google.com/s2/favicons?domain=https://huggingface.co/) [HuggingFace (https://huggingface.co/)](https://huggingface.co/)*: top 1K, coding, llm, tech* 1. ![](https://www.google.com/s2/favicons?domain=https://laracasts.com/) [Laracast (https://laracasts.com/)](https://laracasts.com/)*: top 1K, coding, education* -1. ![](https://www.google.com/s2/favicons?domain=https://www.paypal.me) [Paypal (https://www.paypal.me)](https://www.paypal.me)*: top 1K, finance* +1. ![](https://www.google.com/s2/favicons?domain=https://www.paypal.me) [Paypal (https://www.paypal.me)](https://www.paypal.me)*: top 1K, finance, fintech* 1. ![](https://www.google.com/s2/favicons?domain=https://itch.io/) [Itch.io (https://itch.io/)](https://itch.io/)*: top 1K, gaming* 1. ![](https://www.google.com/s2/favicons?domain=https://bitbucket.org/) [BitBucket (https://bitbucket.org/)](https://bitbucket.org/)*: top 1K, coding* 1. ![](https://www.google.com/s2/favicons?domain=https://last.fm/) [last.fm (https://last.fm/)](https://last.fm/)*: top 1K, music* @@ -239,7 +239,7 @@ Rank data fetched from Majestic Million by domains. 1. ![](https://www.google.com/s2/favicons?domain=https://www.giantbomb.com) [Giantbomb (https://www.giantbomb.com)](https://www.giantbomb.com)*: top 5K, gaming* 1. ![](https://www.google.com/s2/favicons?domain=https://replit.com/) [Replit (https://replit.com/)](https://replit.com/)*: top 5K, coding* 1. ![](https://www.google.com/s2/favicons?domain=https://steemit.com) [Steemit (https://steemit.com)](https://steemit.com)*: top 5K, news* -1. ![](https://www.google.com/s2/favicons?domain=https://venmo.com/) [Venmo (https://venmo.com/)](https://venmo.com/)*: top 5K, finance, us* +1. ![](https://www.google.com/s2/favicons?domain=https://venmo.com/) [Venmo (https://venmo.com/)](https://venmo.com/)*: top 5K, finance, fintech, us* 1. 
![](https://www.google.com/s2/favicons?domain=https://www.inaturalist.org) [iNaturalist (https://www.inaturalist.org)](https://www.inaturalist.org)*: top 5K, hobby, science* 1. ![](https://www.google.com/s2/favicons?domain=https://gfycat.com/) [Gfycat (https://gfycat.com/)](https://gfycat.com/)*: top 5K, photo, sharing*, search is disabled 1. ![](https://www.google.com/s2/favicons?domain=https://leetcode.com/) [LeetCode (https://leetcode.com/)](https://leetcode.com/)*: top 5K, coding* @@ -342,7 +342,7 @@ Rank data fetched from Majestic Million by domains. 1. ![](https://www.google.com/s2/favicons?domain=https://www.wowhead.com) [Wowhead (https://www.wowhead.com)](https://www.wowhead.com)*: top 10K, gaming* 1. ![](https://www.google.com/s2/favicons?domain=https://www.periscope.tv/) [Periscope (https://www.periscope.tv/)](https://www.periscope.tv/)*: top 10K, streaming, video* 1. ![](https://www.google.com/s2/favicons?domain=https://www.sports.ru/) [sports.ru (https://www.sports.ru/)](https://www.sports.ru/)*: top 10K, ru, sport* -1. ![](https://www.google.com/s2/favicons?domain=https://banki.ru) [banki.ru (https://banki.ru)](https://banki.ru)*: top 10K, ru*, search is disabled +1. ![](https://www.google.com/s2/favicons?domain=https://banki.ru) [banki.ru (https://banki.ru)](https://banki.ru)*: top 10K, finance, ru*, search is disabled 1. ![](https://www.google.com/s2/favicons?domain=https://www.skyscrapercity.com) [SkyscraperCity (https://www.skyscrapercity.com)](https://www.skyscrapercity.com)*: top 10K, forum*, search is disabled 1. ![](https://www.google.com/s2/favicons?domain=https://www.drive2.ru/) [Drive2 (https://www.drive2.ru/)](https://www.drive2.ru/)*: top 10K, ru* 1. ![](https://www.google.com/s2/favicons?domain=https://www.empowher.com) [Empowher (https://www.empowher.com)](https://www.empowher.com)*: top 10K, medicine* @@ -715,7 +715,7 @@ Rank data fetched from Majestic Million by domains. 1. 
![](https://www.google.com/s2/favicons?domain=https://monkeytype.com/) [Monkeytype (https://monkeytype.com/)](https://monkeytype.com/)*: top 10M, gaming* 1. ![](https://www.google.com/s2/favicons?domain=https://www.gingerbread.org.uk) [Gingerbread (https://www.gingerbread.org.uk)](https://www.gingerbread.org.uk)*: top 10M, gb* 1. ![](https://www.google.com/s2/favicons?domain=https://rive.app) [rive.app (https://rive.app)](https://rive.app)*: top 10M, design* -1. ![](https://www.google.com/s2/favicons?domain=https://cash.me/) [CashMe (https://cash.me/)](https://cash.me/)*: top 10M, finance* +1. ![](https://www.google.com/s2/favicons?domain=https://cash.me/) [CashMe (https://cash.me/)](https://cash.me/)*: top 10M, finance, fintech* 1. ![](https://www.google.com/s2/favicons?domain=https://mstdn.io/) [mstdn.io (https://mstdn.io/)](https://mstdn.io/)*: top 10M, mastodon, social* 1. ![](https://www.google.com/s2/favicons?domain=https://7dach.ru/) [7dach (https://7dach.ru/)](https://7dach.ru/)*: top 10M, ru* 1. ![](https://www.google.com/s2/favicons?domain=https://dota2.ru/) [Dota2 (https://dota2.ru/)](https://dota2.ru/)*: top 10M, gaming, ru*, search is disabled @@ -2052,7 +2052,7 @@ Rank data fetched from Majestic Million by domains. 1. ![](https://www.google.com/s2/favicons?domain=) [Buzznet ()]()*: top 100M*, search is disabled 1. ![](https://www.google.com/s2/favicons?domain=) [Caringbridge ()]()*: top 100M* 1. ![](https://www.google.com/s2/favicons?domain=) [Carrd.co ()]()*: top 100M* -1. ![](https://www.google.com/s2/favicons?domain=) [Cash.app ()]()*: top 100M* +1. ![](https://www.google.com/s2/favicons?domain=) [Cash.app ()]()*: top 100M, finance, fintech, us* 1. ![](https://www.google.com/s2/favicons?domain=) [Castingcallclub ()]()*: top 100M* 1. ![](https://www.google.com/s2/favicons?domain=) [CD-Action ()]()*: top 100M* 1. 
![](https://www.google.com/s2/favicons?domain=) [Cda.pl ()]()*: top 100M* @@ -2510,7 +2510,7 @@ Rank data fetched from Majestic Million by domains. 1. ![](https://www.google.com/s2/favicons?domain=http://prizyvnikmoy.ru) [prizyvnikmoy.ru (http://prizyvnikmoy.ru)](http://prizyvnikmoy.ru)*: top 100M, ru* 1. ![](https://www.google.com/s2/favicons?domain=https://pvpru.com/) [pvpru (https://pvpru.com/)](https://pvpru.com/)*: top 100M, gaming, ru*, search is disabled 1. ![](https://www.google.com/s2/favicons?domain=https://python.su/) [python.su (https://python.su/)](https://python.su/)*: top 100M, ru* -1. ![](https://www.google.com/s2/favicons?domain=https://qiwi.me) [qiwi.me (https://qiwi.me)](https://qiwi.me)*: top 100M, finance, ru*, search is disabled +1. ![](https://www.google.com/s2/favicons?domain=https://qiwi.me) [qiwi.me (https://qiwi.me)](https://qiwi.me)*: top 100M, finance, fintech, ru*, search is disabled 1. ![](https://www.google.com/s2/favicons?domain=https://forum.quik.ru) [quik (https://forum.quik.ru)](https://forum.quik.ru)*: top 100M, forum, ru* 1. ![](https://www.google.com/s2/favicons?domain=http://realitygaming.fr/) [realitygaming.fr (http://realitygaming.fr/)](http://realitygaming.fr/)*: top 100M, forum, fr* 1. ![](https://www.google.com/s2/favicons?domain=http://relasko.ru) [relasko.ru (http://relasko.ru)](http://relasko.ru)*: top 100M, ru* @@ -3138,7 +3138,7 @@ Rank data fetched from Majestic Million by domains. 1. ![](https://www.google.com/s2/favicons?domain=https://warpcast.com) [Warpcast (https://warpcast.com)](https://warpcast.com)*: top 100M, crypto, social* 1. ![](https://www.google.com/s2/favicons?domain=https://polymarket.com) [Polymarket (https://polymarket.com)](https://polymarket.com)*: top 100M, crypto* 1. ![](https://www.google.com/s2/favicons?domain=https://zora.co) [Zora (https://zora.co)](https://zora.co)*: top 100M, crypto, nft* -1. 
![](https://www.google.com/s2/favicons?domain=https://revolut.me) [Revolut.me (https://revolut.me)](https://revolut.me)*: top 100M, payment* +1. ![](https://www.google.com/s2/favicons?domain=https://revolut.me) [Revolut.me (https://revolut.me)](https://revolut.me)*: top 100M, finance, fintech* 1. ![](https://www.google.com/s2/favicons?domain=https://paragraph.com) [Paragraph (https://paragraph.com)](https://paragraph.com)*: top 100M, blog, crypto* 1. ![](https://www.google.com/s2/favicons?domain=https://tonometerbot.com) [Tonometerbot (https://tonometerbot.com)](https://tonometerbot.com)*: top 100M, crypto* 1. ![](https://www.google.com/s2/favicons?domain=https://www.spatial.io) [Spatial (https://www.spatial.io)](https://www.spatial.io)*: top 100M, crypto, gaming* @@ -3150,13 +3150,13 @@ Rank data fetched from Majestic Million by domains. The list was updated at (2026-04-21) ## Statistics -Enabled/total sites: 2557/3143 = 81.36% +Enabled/total sites: 2565/3144 = 81.58% -Incomplete message checks: 326/2557 = 12.75% (false positive risks) +Incomplete message checks: 331/2565 = 12.9% (false positive risks) -Status code checks: 639/2557 = 24.99% (false positive risks) +Status code checks: 639/2565 = 24.91% (false positive risks) -False positive risk (total): 37.74% +False positive risk (total): 37.81% Sites with probing: 500px, Armchairgm, BinarySearch (disabled), BleachFandom, Bluesky, BongaCams, Boosty, BuyMeACoffee, Calendly, Cent, Chess, Code Sandbox, Code Snippet Wiki, DailyMotion, Discord, Diskusjon.no, Disqus, Docker Hub, Duolingo, FandomCommunityCentral, GitHub, GitLab, Google Plus (archived), Gravatar, HackTheBox, Hackerrank, Hashnode, Holopin, Imgur, Issuu, Keybase, Kick, Kvinneguiden, LeetCode, Lesswrong, Livejasmin, LocalCryptos (disabled), Medium, MicrosoftLearn, MixCloud, Monkeytype, NPM, Niftygateway, Omg.lol, Paragraph, Picsart, Plurk, Polarsteps, Rarible, Reddit, Reddit Search (Pushshift) (disabled), Revolut.me, RoyalCams, Scratch, Soop, SportsTracker, 
Spotify, StackOverflow, Substack, TAP'D, Topcoder, Trello, Twitch, Twitter, Twitter Shadowban (disabled), UnstoppableDomains, Vimeo, Warframe Market, Warpcast, Weibo, Wikipedia, Yapisal (disabled), YouNow, en.brickimedia.org, nightbot, notabug.org, qiwi.me (disabled) @@ -3202,7 +3202,7 @@ Sites by engine: Top 20 tags: -- (1058) `NO_TAGS` (non-standard) +- (1057) `NO_TAGS` (non-standard) - (750) `forum` - (128) `gaming` - (80) `coding` @@ -3214,10 +3214,10 @@ Top 20 tags: - (33) `music` - (31) `shopping` - (27) `crypto` +- (26) `finance` - (25) `sharing` - (23) `video` - (23) `education` -- (23) `finance` - (21) `art` - (21) `freelance` - (18) `hobby`