Inference.

Published Jun 8, 2023

The way we leak information will eventually change, but for now we're overlooking this because there aren't enough known-bad events to support the change, aside from those teams that have the means and business support to eradicate inference capabilities.

Let's look at what we're not defending against. There are a few types of inference we can do online. I tend to stick to email addresses and phone numbers as identity markers, but my favourite inference points are sign-up pages, password reset flows and anything else that is application specific.

Inference is learning this thing by asking about that thing.

For example, say I wanted to pick on my mate Daniel Cuthbert and find out why his hair is so beautiful and thick at this stage in life ... I'm not angry, I'm just disappointed in my own hairline.

If I go to johnnybravothickhair.com, click sign up and provide Daniel's email address, the site will often check whether that email already exists. Let's say I get an error message: 'Account already exists, would you like to log in?'

Ah, so Daniel does have an account here! That's it, there's your inference.
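To make the leak concrete, here is a toy model of that sign-up check. It is only a sketch: the account list, the example.com addresses and the function names are made up for illustration, and nothing here talks to a real site.

```shell
#!/usr/bin/env bash
# Toy, in-memory model of a leaky sign-up check (hypothetical names and data).

known_accounts="daniel@example.com"   # stand-in for the site's user table

has_account() {
  # Succeeds if the address appears in the space-separated list above.
  case " $known_accounts " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

leaky_signup() {
  # The reply differs depending on whether the address is already registered,
  # so anyone who can type an address learns whether it has an account.
  if has_account "$1"; then
    echo "Account already exists, would you like to log in?"
  else
    echo "Thanks for signing up, check your inbox."
  fi
}

leaky_signup "daniel@example.com"
leaky_signup "nobody@example.com"
```

The two calls print different messages, and that difference is the whole inference.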

Let's say it isn't a cool hair site. Maybe it's medical, maybe defence, maybe personal (cough), maybe insert your own reason for secrecy/privacy ... you could create pressure with this information or even put users at risk.

Let's use Twitter as a nice example:

I don't have an account on Elon's Twitter, so when I give it my number it tells me so.

But if I provide it with a number that does have an account (sorry Caesar)

It asks me for a password. Caesar doesn't get notified, and I learn that number has at least one account on that service. Similarly I could use email addresses, but in this case I used phone numbers because I wanted to talk about scraping.

You want all the mobile phone numbers in the UK?

Run this in Bash:

seq -f '07%09.0f' 0 999999999 > MobilePhoneNo.txt

You'll end up with 12 gig of all possible phone numbers (not including the international dial code).
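The 12 gig figure is easy to sanity check. A quick sketch, assuming each entry is the 11-digit number plus one separator character (the variable and function names are just for illustration):

```shell
#!/usr/bin/env bash
# Back-of-envelope size check for the full 07xxxxxxxxx range.

count=1000000000        # 000000000..999999999 is 10^9 suffixes
bytes_per_entry=12      # 11 digits + 1 separator (space or newline)
total_bytes=$((count * bytes_per_entry))

echo "$total_bytes bytes, roughly $((total_bytes / 1000000000)) GB"

# Format a single suffix the same way, to show the shape of each entry.
uk_mobile() {
  printf '07%09d\n' "$1"
}

uk_mobile 0
uk_mobile 1
uk_mobile 999999999
```

That is a billion candidates, most of which will never have been allocated to anyone, which is why the interesting part is checking them against services rather than generating them.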

If you were so inclined, you could script something that checks those numbers against services and start profiling which numbers use which services.

Tinder™

Account recovery process:

If someone were to do this with their partner's number while they're in the same room, and they hear an SMS beep ... that might be a conversation. But if you see the screen below, it means that number has received a text message and still has an account (it may not be active, but the account exists).

If they don't, you'll see the image below.

Looking at the technical stuff under the hood, it looks real easy to do some naughty automation on Tinder.

Email address already exists

There are so many sites that will tell you 'email already exists', or a variant of that, when you hit the sign-up page. Thanks, good to know.

The problem is the assumption that the full process of password reset, user sign-up, or other edge features of an application is going to be completed for its intended reasons. Inference as a concept, if left untreated, will lead to mass data collection, scraping, whatever you call it.

OK, so we can make every phone number, but we can't make every email ... the answer here as an attacker is just to use whatever emails you can get hold of. Facebook dumps, web breach data, it's all available if you look. Obviously you won't get 100% coverage, but this is about pulling enough data to be concerning. That concern may be driven by the volume of scrapees, or by the context: where the data was scraped from and who is identified.

The other side of this is mobile phone address books and the access we surrender. It's disgusting. But we can also use it in our favour for inference, and that means the bad guys can too. How many times have you had a random number call and saved it to see what pops up on WhatsApp, Signal, Insta, Facebook, whatever? You might get a picture, a status, a video without even talking to that person ... scale that, scrape it, store it. Are people already doing this? How would we ever know? (We wouldn't.)

The answer to a lot of this (not the address book issue) is magic link access to services: websites that don't tell the internet anything, only ask for what they need and give the same message to everyone, while doing the real work in the background.

Websites should not assume the person providing the information is the owner of the information, especially in unauthenticated zones of the app.
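Here is a rough sketch of what "same message to all, work in the background" could look like. Everything in it is an assumption for illustration: the account list, the queue file standing in for a background job queue, the job names and the wording of the reply.

```shell
#!/usr/bin/env bash
# Sketch of a uniform-response flow (hypothetical names, no real mail is sent).

known_accounts="daniel@example.com"   # stand-in for the site's user table
queue_file="$(mktemp)"                # stand-in for a background job queue

has_account() {
  case " $known_accounts " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

request_access() {
  # The branch only changes what gets queued for the inbox owner.
  # The person typing the address sees the same reply either way.
  if has_account "$1"; then
    echo "send-magic-link $1" >> "$queue_file"
  else
    echo "ignore $1" >> "$queue_file"
  fi
  echo "Thanks. If we can help with that address, an email is on its way."
}

request_access "daniel@example.com"
request_access "nobody@example.com"
```

Both calls print the identical line, so the page itself no longer answers the question "does this address have an account?". A real service would also want the two branches to take about the same time, which this sketch doesn't attempt.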

I'll make this post longer when I have some time to create more meaningful examples and possibly some tooling.
