Identity and accounts
Defensive OSINT: what you let leak
Using offensive intelligence tools on yourself to anticipate what an adversary will see in 2 hours.
This version was translated with AI assistance and reviewed by a human.
A client sits across from me, arms crossed, and tells me there is nothing about him online. Two hours later I put a single sheet of paper on the table: his last three home addresses, the first names of his two children and the school one of them attends, the make and plate region of the car parked in front of his house, the name of the wealth manager he plays tennis with, and the GPS coordinates baked into a photo he posted of his terrace. Every line came from a public source. He never said a word back. He just stared at the children’s names.
Reading angle
The usual trap
The dominant advice on personal exposure is some variation of “Google yourself and clean up what you find.” It is comforting, it is wrong, and it is exactly why people who believe they are careful are the easiest to profile. Googling your own name returns a curated, SEO-shaped reflection of what you intentionally published and what large platforms decided to rank. It is the lobby of the building. An adversary does not stop at the lobby. A competent investigator uses specialized tools, cross-references breach dumps, reconstructs your relationship graph, and knows that the information that hurts you lives in metadata and in secondary traces you never realized you were leaving.
The second piece of bad advice is the legal fantasy: invoke the right to be forgotten (GDPR Article 17, the right to erasure of personal data under certain conditions), fire off a few takedown requests, assume the problem is solved. GDPR Article 17 gives you a real lever against some data brokers and against search engine indexing inside the EU, but it does nothing about the original sources, the offshore aggregators, the cached copies on archive.today, or the breach databases already circulating across a dozen channels. De-indexing is not deletion. The data stays exactly where it is; you simply stop seeing it from your chair, which makes you feel safe and changes nothing about what the attacker pulls.
There is a third, quieter mistake that traps the careful: optimizing the wrong axis. People who take privacy seriously tend to harden the things that feel exposed — the public Facebook posts, the embarrassing tagged photos — and ignore the things that are actually exploitable because they feel boring. A reused base password across three breached services is invisible, unembarrassing, and far more dangerous than any photo. A consistent username spanning a professional profile and a decade-old forum account feels harmless and is the single thread that collapses two identities into one. The instinct to protect what is embarrassing rather than what is exploitable is exactly the instinct an attacker counts on, because it leaves the load-bearing leaks standing while you tidy the cosmetic ones.
The trap, underneath all three, is believing that invisibility-to-yourself equals invisibility-to-an-attacker. The two are unrelated. The only way to know what an adversary sees is to do what an adversary does: same tools, same method, against yourself, on a schedule. That discipline has a name: defensive OSINT, intelligence drawn from open, public sources such as social media, registries, and archives. It is not paranoia, and it is not a one-time spring cleaning. It is reconnaissance you run on your own attack surface before someone else runs it for profit, so that the surprises land on your timeline instead of theirs.
The real threat model: what two hours of reconnaissance pulls
Let me be concrete about what competent work surfaces, because the abstract version never lands. An attacker preparing a SIM swap (convincing your carrier to port your number to their SIM), a spear-phishing campaign, or a CEO fraud wire does not start with an exploit. They start with a profile. And the profile is built from sources you handed them, sorted into four layers that stack into something far more dangerous than any single piece.
LinkedIn is the spine. It is engineered to maximize professional visibility, which makes it a reconnaissance gift. A standard profile yields full name, current photo, complete employment timeline, education with graduation years that bracket your age, named internal projects, languages, and a visible connection graph. Cross-reference your connections against theirs and an investigator reconstructs your org chart, identifies your direct manager and your reports, and finds the mutual contact who makes a pretext call believable. “Hi, I’m following up on what you discussed with [a real, named colleague] last week” is devastating precisely because the name is real, public, and verifiable in thirty seconds.
Breach data is the password layer. Run your email through Have I Been Pwned and you see which services leaked you. An attacker goes deeper, into the actual leak databases, combolists, and stealer logs that get sold or dumped. They are not chasing one old password; they are mapping your password patterns: the reused base words, the predictable suffixes, the security answers you gave the same way in 2014 (raw material for social engineering), and the secondary email addresses that quietly link your identities together. A decade of breaches is a fingerprint of how your brain builds credentials, and brains do not change their habits.
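To see your own pattern the way a profiler would, a minimal sketch follows. It assumes you have collected your own leaked passwords by hand; the service names and passwords are hypothetical, and `base_word` and `reused_bases` are illustrative helpers, not part of any tool named here.

```python
import re


def base_word(password: str) -> str:
    """Reduce a password to its lowercase core.

    Leading and trailing digits and symbols are stripped, so variants
    built on the same word collapse to one key.
    """
    core = re.sub(r"^[^A-Za-z]+|[^A-Za-z]+$", "", password)
    return core.lower()


def reused_bases(leaked: dict[str, str]) -> dict[str, list[str]]:
    """Map each base word to the services that used it.

    Only bases seen on two or more services are returned.
    """
    by_base: dict[str, list[str]] = {}
    for service, password in leaked.items():
        base = base_word(password)
        if base:
            by_base.setdefault(base, []).append(service)
    return {b: sorted(s) for b, s in by_base.items() if len(s) > 1}


# Hypothetical sample: your own old passwords, one per breached service.
leaked = {
    "forum-2014": "Bordeaux14!",
    "shop-2017": "bordeaux2017",
    "fitness-2019": "Bordeaux#19",
    "news-2016": "tr0ub4dor",
}
patterns = reused_bases(leaked)
print(patterns)
```

Any base word that shows up on more than one service is a pattern an attacker can extend with the current year. Run this only on your own data, offline, and delete the input afterwards.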
Metadata is the layer nobody thinks about. This is where the GPS coordinates on my client’s terrace photo came from. EXIF data in images can carry latitude, longitude, timestamp, device model, and serial number, and it survives upload on more platforms than people assume. Office documents and PDFs carry author names, internal usernames, organization paths, software versions, and revision history. A company publishes a “press kit” PDF that leaks the internal editor’s Windows username and a network share path; that single string hands an attacker your Active Directory naming convention before they have touched your perimeter. Metadata, the data about data (who wrote what, when, where, to whom), is the involuntary confession of digital life.
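How little work the GPS step takes is worth seeing once. EXIF stores latitude and longitude as three rationals (degrees, minutes, seconds) plus a hemisphere reference. The sketch below does only the arithmetic that turns those into a map pin; the values are hypothetical and `dms_to_decimal` is an illustrative helper.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) to signed decimal degrees.

    dms is three (numerator, denominator) pairs, the way EXIF stores them.
    ref is the hemisphere letter: "N", "S", "E" or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value  # south and west are negative by convention
    return float(value)


# Hypothetical values, as they might appear in a photo's GPS tags.
lat = dms_to_decimal(((48, 1), (51, 1), (30, 1)), "N")
lon = dms_to_decimal(((2, 1), (17, 1), (4020, 100)), "E")
print(round(lat, 6), round(lon, 6))
```

Two numbers pasted into any map service are the whole distance between a terrace photo and a street address.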
Username reuse stitches it all together. Pick a handle once and you tend to reuse it forever. A tool like Sherlock checks one username across hundreds of platforms in seconds. The polished LinkedIn persona, the gaming forum from 2009, the photography account, the fitness app that maps your morning run past your own front door: same handle, now one graph. Maltego and SpiderFoot exist to automate exactly this stitching, turning scattered traces into a relationship map a human can read at a glance. The danger is never the single data point. It is the join.
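The join itself is a few lines of logic. The sketch below is not how Sherlock or Maltego work internally; it is a simplified illustration, with hypothetical accounts, of why near-identical handles collapse into one identity as soon as someone normalizes them.

```python
from collections import defaultdict


def normalize(handle: str) -> str:
    """Lowercase and drop separators so 'J.Martin_77' and 'jmartin77' match."""
    return "".join(ch for ch in handle.lower() if ch.isalnum())


def join_by_handle(accounts):
    """Group (platform, handle) pairs by normalized handle.

    Only handles that appear on two or more platforms are returned.
    """
    groups = defaultdict(set)
    for platform, handle in accounts:
        groups[normalize(handle)].add(platform)
    return {h: sorted(p) for h, p in groups.items() if len(p) > 1}


# Hypothetical inventory of your own accounts.
accounts = [
    ("linkedin", "J.Martin_77"),
    ("old-forum", "jmartin77"),
    ("fitness-app", "JMartin77"),
    ("photo-site", "lumen_grain"),
]
linked = join_by_handle(accounts)
print(linked)
```

The account with the unrelated handle stays out of the group, and that is the entire case for compartmentalizing handles.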
Walk the graph the way an investigator does and the stacking becomes obvious. The seed is almost always a single email address, because email is the hinge every other identity hangs from. From that one address, a tool like Holehe fires sign-up and password-reset probes at more than a hundred services and reports back which ones recognize the address as registered, including the dozen you forgot you ever signed up for. Each registered service exposes a username or a display name. Each username, run through Sherlock, lights up other platforms. Each platform yields photos, comments, friend lists, and a writing style. A photo with intact EXIF yields coordinates; coordinates yield a home address through a land registry or a people-finder; the address yields neighbors, relatives, a property value, and the make of the car in the driveway. In parallel, the email run through Have I Been Pwned and the underlying leak databases yields old passwords and the security-question answers that let an attacker pass a bank or carrier identity check. Email becomes usernames becomes photos becomes address becomes family becomes employer becomes the script for the phone call. None of it required breaking anything. Every hop was a public source, and the whole chain is the difference between “nothing comes up when I Google myself” and the sheet of paper on my client’s table.
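The chain above is a graph walk, and writing it down as one makes the hop counts visible. The sketch below is a plain breadth-first search over a hand-built map; every node name and edge is hypothetical, standing in for what you record during your own audit.

```python
from collections import deque

# Hypothetical pivot map: each finding points to what it reveals next.
PIVOTS = {
    "email:me@example.org": ["username:jmartin77", "breach:forum-2014"],
    "username:jmartin77": ["photo:terrace.jpg", "profile:linkedin"],
    "photo:terrace.jpg": ["coords:home"],
    "coords:home": ["address:home"],
    "address:home": ["family:household"],
    "profile:linkedin": ["employer:current"],
    "breach:forum-2014": ["answer:security-question"],
}


def walk(seed, pivots):
    """Breadth-first walk from seed.

    Returns {node: hops}, the fewest pivots needed to reach each finding.
    """
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for nxt in pivots.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


reach = walk("email:me@example.org", PIVOTS)
for node, distance in sorted(reach.items(), key=lambda item: item[1]):
    print(distance, node)
```

Cutting one edge early in the graph, for example the shared username, removes everything that was only reachable through it. That is a more useful way to prioritize than deleting leaf nodes one at a time.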
The right routine: run the attack on yourself, on a clock
The shift is to stop treating this as a cleanup chore and start treating it as recurring reconnaissance with a defined scope, a method, and a cadence. You are building a threat model of yourself, a map of the actors, motivations, capabilities, and potential impacts you face, then collapsing it deliberately rather than reacting to whatever you happen to stumble on.
Map before you delete. The instinct is to find something embarrassing and immediately scrub it. Resist that on the first pass. Enumerate everything first — every account, every leaked credential, every metadata leak, every cross-linked identity — because the connections between the data points are almost always more dangerous than any single point. An old forum post is harmless. An old forum post that reuses your professional username, reveals your home city, and confirms a security-question answer is a SIM-swap kit waiting for a phone call. You can only see that if you map before you cut.
Triage by exploitability, not by embarrassment. Sort what you find into three buckets. Removable: data broker listings, de-indexable search results, accounts you can close, metadata you can strip before re-posting. Mitigable: information you cannot remove but can render useless. Change the reused passwords, rotate the security answers to non-truthful ones, and move multi-factor authentication (MFA) off SMS onto a FIDO2 hardware key, which is phishing-resistant, so a leaked phone number stops being a master key. Accept-and-monitor: the press article, the public company filing, the conference talk you cannot and should not erase. These you log, you set an alert on, and you stop worrying about because you have priced them in. Most people invert this and burn their energy scrubbing the embarrassing-but-harmless while the exploitable-but-boring sits untouched.
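The three buckets translate directly into a small sorting routine. The sketch below is one way to structure an audit log, not a standard. The `Finding` fields and the 1-to-5 exploitability score are assumptions made for illustration, and the sample findings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    label: str
    removable: bool      # can it be deleted, closed, or de-listed?
    mitigable: bool      # if it stays, can it be made useless?
    exploitability: int  # assumed scale: 1 (cosmetic) to 5 (direct path in)


def bucket(finding: Finding) -> str:
    """Assign one of the three buckets, preferring removal over mitigation."""
    if finding.removable:
        return "removable"
    if finding.mitigable:
        return "mitigable"
    return "accept-and-monitor"


def triage(findings):
    """Return {bucket: [labels]}, most exploitable findings first."""
    plan = {"removable": [], "mitigable": [], "accept-and-monitor": []}
    for f in sorted(findings, key=lambda f: -f.exploitability):
        plan[bucket(f)].append(f.label)
    return plan


# Hypothetical findings from a first mapping pass.
findings = [
    Finding("old tagged photo", True, False, 1),
    Finding("data broker listing with home address", True, False, 4),
    Finding("reused password in a 2014 breach", False, True, 5),
    Finding("conference talk video", False, False, 1),
]
plan = triage(findings)
print(plan)
```

The sort key is exploitability and nothing else. Embarrassment has no field in the data structure, which is the point.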
Scope the people, not just the person. The hardest correction I make on these engagements is widening the target list. Everyone wants to audit themselves and stop there, but you are rarely the softest path to you. The assistant who books your travel publishes your itinerary by accident. The teenager in your household geotags the house on a platform you have never opened. A former colleague’s open profile confirms the project name that makes a pretext call land. An attacker does not respect your org chart or your front door; they take the weakest link in the cluster around you and walk inward. So the scope of a serious self-audit is the cluster: you, the people who handle your logistics and your money, and the household members whose accounts touch your physical location. Audit the cluster the way an attacker maps it, or you will harden the one node they were never going to bother with.
Fix metadata at the source, not after the fact. For metadata specifically, the fix is fast and permanent if you build it into the workflow. ExifTool strips EXIF from a photo in one command (exiftool -all= photo.jpg); the same tool cleans document properties. The discipline is to strip before publishing, every time, so the leak never happens rather than getting chased after it is already cached in five places. Modern phones offer to remove location data when sharing — turn it on and then verify it actually worked, because “I assumed the platform stripped it” is precisely how the terrace photo happened.
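If you want the strip-then-verify habit scripted, a minimal sketch follows. It wraps the ExifTool command line and assumes `exiftool` is installed and on your PATH. It also assumes that asking for `-gps:all` on a file with no GPS tags prints nothing, which you should confirm on your own install. The function names are illustrative.

```python
import shutil
import subprocess


def strip_command(path: str) -> list[str]:
    """Build the ExifTool call that removes all writable metadata."""
    # -all= clears every writable tag.
    # -overwrite_original skips the "_original" backup copy.
    return ["exiftool", "-all=", "-overwrite_original", path]


def gps_check_command(path: str) -> list[str]:
    """Build the ExifTool call that prints any GPS-group tags still present."""
    return ["exiftool", "-gps:all", path]


def strip_and_verify(path: str) -> bool:
    """Strip metadata, then return True only if no GPS tags are reported."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool not found on PATH")
    subprocess.run(strip_command(path), check=True, capture_output=True)
    check = subprocess.run(
        gps_check_command(path), check=True, capture_output=True, text=True
    )
    # Assumption: empty output means no GPS tags survived.
    return check.stdout.strip() == ""


print(" ".join(strip_command("photo.jpg")))
```

Work on a copy the first time. `-overwrite_original` leaves no backup, which is what you want for publishing and not what you want for your photo archive.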
Set the cadence. New breaches land monthly. You create new accounts. Platforms change defaults and silently re-expose you. A single audit is a photograph; you need the film. Re-run the core checks on a calendar — quarterly for individuals, continuously and tooled for high-exposure executives — so the map stays current and a new leak surfaces while it is still cheap to close.
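The film rather than the photograph comes down to diffing one audit against the last. The sketch below assumes you store each audit as a dictionary of sets, which is a format chosen here for illustration; the category names and entries are hypothetical.

```python
def new_exposure(previous, current):
    """Return, per category, the items present now that were absent last time."""
    delta = {}
    for category, items in current.items():
        added = items - previous.get(category, set())
        if added:
            delta[category] = added
    return delta


# Hypothetical snapshots from two consecutive quarterly audits.
q1 = {
    "breaches": {"forum-2014"},
    "accounts": {"linkedin", "old-forum"},
}
q2 = {
    "breaches": {"forum-2014", "shop-2017"},
    "accounts": {"linkedin", "old-forum"},
    "brokers": {"people-search-a"},
}
delta = new_exposure(q1, q2)
print(delta)
```

Only the delta needs your attention each quarter, which is what keeps the routine short enough to actually repeat.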
The toolkit is small, free, and the same one an attacker uses, which is the point. Holehe enumerates the services your email is registered against, including the long-abandoned ones, by abusing the same sign-up and password-reset flows that already leak that information to anyone who asks. Sherlock takes a username and checks it across hundreds of platforms in one pass, so you see your own cross-links before the stitching tool does. Have I Been Pwned tells you which breaches contain you and what categories of data each one exposed; if it lists passwords, assume those credentials are being tested against your accounts in credential-stuffing runs (automated replay of leaked email and password pairs against other services) right now, today, and treat every reuse as already compromised. ExifTool reads and strips the metadata in photos and documents from one command line. SpiderFoot and Maltego automate the aggregation and draw the relationship graph for you when the manual walk gets too large to hold in your head. You do not need to master all of them. You need to run the first three against yourself this quarter and put a stripping habit in front of every upload, and you will have closed most of the cheap, high-yield paths an opportunistic profiler relies on.
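For the password side of the breach check, Have I Been Pwned also publishes a Pwned Passwords range endpoint. It takes only the first five characters of a SHA-1 hash, so the password itself never leaves your machine. The sketch below reflects the endpoint and response format (`SUFFIX:COUNT` per line) as I understand them; check the current API documentation before relying on it. The helper names and the User-Agent string are illustrative.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_hash(password: str) -> tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(range_body: str, suffix: str) -> int:
    """Look up a hash suffix in a range response made of 'SUFFIX:COUNT' lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def pwned_count(password: str) -> int:
    """Query the range endpoint; only the 5-character prefix is sent."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL.format(prefix=prefix),
        headers={"User-Agent": "self-audit-sketch"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_range(body, suffix)


prefix, suffix = split_hash("password")
print(prefix)
```

Calling `pwned_count` on each password you still reuse gives you a number to act on. Anything above zero is already in an attacker's wordlist.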
One sober note so the routine does not curdle into false confidence: some of what you find, you will never erase. The right to be forgotten reduces indexation; it does not delete data. The Wayback Machine, the Internet Archive's web archive, has been capturing pages since 1996 and does not honor arbitrary deletion requests. Breach dumps, once redistributed, circulate independently of any legal process: data exfiltrated in 2017 is still feeding social engineering campaigns today. Screenshots and reposts made by other people are outside your reach entirely. This is not a counsel of despair; it is the reason the framework ends in accept-and-monitor rather than delete everything. You cannot make yourself disappear. You can make yourself an expensive, well-understood target instead of a cheap, surprising one, and against the economics most attackers actually run on, that is the win that matters.
What this means concretely
For you, as a person
Three things, this week, under 200 euros total — most of it free.
- Strip metadata before anything leaves your hands. Install ExifTool (free) or use your phone’s built-in “remove location when sharing.” Check the last ten photos you posted publicly and any PDF or document on your personal site or LinkedIn. The terrace-photo GPS leak takes five minutes to close and never comes back once stripping is a habit.
- Audit your own exposure like an attacker would. Run your email through Have I Been Pwned (free), search your most-used username across platforms (Sherlock or manually), and Google your name in quotes alongside your city and former employers. Write down what links to what. Then change every reused password the breaches reveal and move your email and bank MFA off SMS onto an authenticator app or a key.
- Remove yourself from the loudest data brokers. Spend an hour submitting opt-outs to the major data broker and people-search sites that list your home address and your relatives. It is tedious and partial, but it raises the cost for an opportunistic profiler and breaks the easiest path to your physical location.
For you, CISO / CIO / executive
1. Commission an offensive OSINT exercise on your 3 to 5 most exposed people. Have a third party run real reconnaissance — LinkedIn graphing, breach correlation, metadata harvesting, username stitching — against your CEO, CFO, and anyone who can authorize a wire transfer. Direct consequence: you learn your exposure before your adversary monetizes it, and the report becomes the concrete brief for executive protection instead of another generic awareness slide.
2. Treat published metadata as a leak class with a control. Documents, images, and PDFs leaving your organization carry author names, internal paths, and software versions that map your environment for free. Put a stripping step in the publishing and marketing pipeline. Direct consequence: you stop handing attackers your AD naming convention and internal usernames through a press release — one of the cheapest reconnaissance wins you can simply deny them.
3. Make breach and exposure monitoring continuous, not annual. Tie executive emails and corporate domains to monitoring of new leak databases and stealer logs. Direct consequence: a credential dump triggers a forced reset and an MFA check within hours, instead of being discovered during the incident it caused.
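For the metadata control in point 2, a pre-publication check can be as small as the sketch below. It reads the core properties of an Office Open XML file (.docx, .xlsx, .pptx), which is a ZIP archive with author fields in `docProps/core.xml`. The sample document and the usernames in it are hypothetical, and a real pipeline would also need to cover PDFs and images.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}
FIELDS = (("creator", "dc:creator"), ("last_modified_by", "cp:lastModifiedBy"))


def ooxml_identity_fields(data: bytes) -> dict[str, str]:
    """Return the author-style fields found in an OOXML file's core properties."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))
    found = {}
    for name, path in FIELDS:
        node = root.find(path, NS)
        if node is not None and node.text:
            found[name] = node.text
    return found


def sample_document() -> bytes:
    """Build a tiny in-memory stand-in for a .docx (hypothetical content)."""
    core = (
        '<cp:coreProperties xmlns:cp="' + NS["cp"] + '" xmlns:dc="' + NS["dc"] + '">'
        "<dc:creator>jdoe-adm</dc:creator>"
        "<cp:lastModifiedBy>CORP\\jdoe</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core)
    return buffer.getvalue()


leaks = ooxml_identity_fields(sample_document())
print(leaks)
```

A publishing step that refuses any file where this returns a non-empty dictionary is a blunt control, but it is the kind that survives contact with a marketing deadline.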
Mistakes we see all the time
- Deleting before mapping. People find one ugly result, scrub it in a panic, and never see the graph it belonged to. They feel safer and are not. Map the whole surface first; cut second.
- Confusing de-indexing with deletion. A successful right-to-be-forgotten request removes a result from searches on your name on the EU versions of search engines. The source page, the cache, the archive copy, and the breach dump are all untouched. Treat de-indexing as cosmetic and go fix the source.
- Trusting platforms to strip metadata. “I assumed LinkedIn / the CMS / the phone removed the GPS” is the single most common cause of the location leak. Verify it yourself with ExifTool; assume nothing.
- Auditing once and filing it. A January audit is fiction by June. New breaches, new accounts, changed platform defaults. Without a cadence the report ages into a comfort blanket.
- Scoping to the CEO only. The executive assistant who books the travel, the finance clerk who runs the wires, and the spouse with the wide-open social account are frequently softer targets and just as useful to an attacker. Scope to the people around the principal, not only the principal.
- Treating handles as harmless. Reusing one username across professional and personal platforms is what lets a tool stitch your identities in seconds. Compartmentalize handles the way you compartmentalize passwords.
- Searching only your real name. Usernames, secondary emails, registered domains, and former employers are all OSINT seeds that lead straight back to the full profile. The real name is the least productive starting point.
Actionable checklist
- Level 1: Run your primary and secondary emails through Have I Been Pwned and note every breached service
- Level 1: Search your most-used username across platforms (Sherlock or manual) and write down every cross-link
- Level 1: Strip EXIF/GPS from your last 10 public photos and from any public PDF or document
- Level 1: Change every reused password the breaches reveal; rotate security answers to non-truthful ones
- Level 2: Move email and bank MFA off SMS onto an authenticator app or FIDO2 key
- Level 2: Submit opt-outs to the major data brokers and people-search sites listing your address and relatives
- Level 2: Build metadata stripping into your publishing workflow so it happens before, not after, posting
- Level 2: Schedule a recurring self-audit: quarterly for individuals, continuous monitoring for high-exposure roles
- Level 3: Commission a third-party offensive OSINT exercise on your 3-5 most exposed people
- Level 3: Tie executive emails and corporate domains to continuous leak-database and stealer-log monitoring
Further reading
The OSINT Framework and Bellingcat’s Online Investigation Toolkit (both in the sources) are the same maps an investigator uses — work through them against yourself and you see precisely what they see. Michael Bazzell’s IntelTechniques is the reference for personal exposure and for the data-broker opt-out workflow no single article can fully list. For the breach layer, Have I Been Pwned is the practical starting point, and ExifTool is the one tool every person who publishes photos or documents should have installed and wired into their habits before the next upload.
Sources and further reading
- OSINT Framework [official]
- Bellingcat — Online Investigation Toolkit [official]
- IntelTechniques (Michael Bazzell) [official]
- Have I Been Pwned — Pwned websites [official]
- ExifTool (Phil Harvey) [official]