Compromise from the first moment we open a browser


Internet vs Local File Content: Interference and Control, Part 1

 

I started looking at this a while ago. I’d look something up online, go away, tell people about it or make a calendar appointment, only to return to the source material and find it somewhat different from my recollection. At first I thought I was getting a bit forgetful, but the memory was very clear: I’d double check, and it would still change.

 

Over time it started seeping out further, progressing from time differences on files saved in OneDrive (desktop vs mobile view, not just my memory vs the PC, and this was the first clue) to lists rearranging themselves in Excel and data copied from one source to another showing alterations. But seemingly only when I was being particularly boastful, or especially stressed.

 

I remember I’d just started working at a new place and, keen to impress, made a spreadsheet that saved a LOT of manual reconciliation for the team. The first and second runs went great, with no major issues. The third and later runs developed errors (and this was in a VLOOKUP, which is a standard, easy-to-predict function that shouldn’t change much between runs). Whilst the saved file hadn’t been tampered with, I started to suspect that the data input may have been. This wasn’t the start of the issues; it’s just a good example of it interfering with life in a way that makes you doubt your skills and abilities. Electronic gaslighting.

 

Background

The actual code that allows such interference eluded me, however, until I came across a website called string-functions.com. Here we could convert string characters to or from hex, decimal and binary:

"String(Hello) = Hex(48656c6c6f0d0a) = Dec(20377714673257738) = Binary(1001000011001010110110001101100011011110000110100001010)"

 

… but oddly, binary back to string gave me this: “?????”, which I presume is down to how the software defines and executes each step. Chain more than one conversion and there’s a chance that, without standardisation between all elements, you can’t reverse the process, at least not by presuming that the output of step n can be fed back in to recover the input of step 1. The same is true of translating between languages, when you think about it.
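A minimal sketch of one way the round trip can fail, assuming the input was "Hello" followed by a CR LF line break (which is what the trailing 0d0a in the hex suggests). The function names are mine, and I don't know how string-functions.com actually works internally. The point is only that the binary form above has 55 digits, not 56, because the leading zero of the first byte has been dropped, so a converter that cuts the digits into groups of eight from the left gets every byte wrong.

```python
def text_to_forms(text, encoding="utf-8"):
    """Return the hex, decimal and binary forms of the encoded text."""
    raw = text.encode(encoding)
    as_hex = raw.hex()
    as_int = int.from_bytes(raw, "big")
    as_bin = bin(as_int)[2:]  # bin() drops any leading zero bits
    return as_hex, as_int, as_bin


def binary_to_text_naive(bits):
    """Cut into 8-bit groups from the left, ignoring the missing leading zero."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    raw = bytes(int(chunk, 2) for chunk in chunks)
    # Bytes that are not valid ASCII become the U+FFFD replacement character.
    return raw.decode("ascii", errors="replace")


def binary_to_text_padded(bits, encoding="utf-8"):
    """Restore the leading zeros so the length is a multiple of 8, then decode."""
    width = (len(bits) + 7) // 8 * 8
    raw = int(bits.zfill(width), 2).to_bytes(width // 8, "big")
    return raw.decode(encoding)


as_hex, as_int, as_bin = text_to_forms("Hello\r\n")
print(as_hex, as_int, len(as_bin))
print(repr(binary_to_text_naive(as_bin)))
print(repr(binary_to_text_padded(as_bin)))
```

The padded version recovers the original text; the naive one does not, which is at least consistent with the “?????” result.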

 

Then I spotted character encoding. I’d largely forgotten that this was a thing, but it’s the second half of making sure what I type is what you see. The text is stored in this document as strings when you view it, and probably as binary somewhere further down the storage chain (ask Microsoft!) but to get from the building blocks back to the text we need to know how it’s encoded.

 

Internet packet analysers often use terms like “Big5”, “ANSI”, “UTF-8” and “UTF-16”. Vast lookup tables tell software that when a document opens with (X), it needs to read the data with encoder (Y). There’s no left to right or right to left when looking at a data file, just sequence.
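One concrete case of "opens with (X), read with encoder (Y)" is the byte order mark (BOM) that some Unicode files start with. The sketch below uses the BOM constants from Python's standard codecs module; the function names and the UTF-8 fallback are my own assumptions, and real software uses many more signals than this.

```python
import codecs

# Order matters: the UTF-32 LE BOM starts with the same two bytes as the
# UTF-16 LE BOM, so the longer signatures have to be checked first.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_bom(data, fallback="utf-8"):
    """Decode using the BOM if there is one, else an assumed fallback."""
    name, skip = sniff_bom(data)
    return data[skip:].decode(name or fallback)


print(sniff_bom("Hello".encode("utf-16")))
print(decode_with_bom("Hello".encode("utf-16")))
```

Without a BOM the software has to guess, and that guess is where the same bytes can turn into different text.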

 

Here’s the clever bit. Some methods of encoding have visible characters to tell the computer how to display the sequence of data. Some don’t. Either way, the data is still there and will be read, and in just about every single modern application, it will be processed in the order in which it is read.

 

For example, I could add a “right to left” control character in the above paragraph between “clever” and “bit”, with a “stop” and “left to right” control character between “the” and “order” in its last sentence, with the aim that it will only ever be copied and pasted as “Here’s the clever order in which it is read”.

The same applies to information received by your browser, and to all the webpages that it prefetches every time you open the Bing search page on a fresh install of Windows. As far as I can tell, these control characters can call every database function you can think of, in addition to reordering text (which includes websites, IP addresses and search terms you put into the address bar which, thanks to convenience, is no longer just a locked-down address bar).
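A minimal sketch of how to find these invisible characters in a piece of text, using Python's standard unicodedata module. The sample string and function names are hypothetical. Under the Unicode standard, directional controls such as RIGHT-TO-LEFT OVERRIDE change how a renderer displays the text; the stored sequence itself is unchanged, which is why they can be listed and stripped like this.

```python
import unicodedata

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Hypothetical sample: the word "bit" wrapped in directional controls.
sample = "Here's the clever " + RLO + "bit" + PDF + " in order"


def find_format_controls(text):
    """List (index, Unicode name) for every invisible format character."""
    return [
        (i, unicodedata.name(ch, "UNNAMED"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"  # Cf = "Other, format"
    ]


def strip_format_controls(text):
    """Remove all format characters.

    Blunt on purpose: category Cf also covers legitimate characters such
    as the soft hyphen and zero width joiner.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


print(find_format_controls(sample))
print(strip_format_controls(sample))
```

Running the finder over text copied from a suspicious page is a quick way to see whether any such characters are really there.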

 

I think I'm on the right lines of thought here. My computer has a small fit every time I try to post about it or investigate further. 

 

Next time, I'll cover cookies, timestamping, and computer certificates.


An Example

This attack vector has a number of helpful side effects for the attacker. A standard user would never be able to trace it, it won’t trip malware detectors, and IT professionals will spend so much time chasing it down blind alleys that they will exhaust themselves. Why? Because it’s so attractive to chase the last thing that moved; a very basic predator instinct still kicking around our heads somewhere, I suspect. We seldom stop to think “so who released that tasty-looking rabbit, and where was it released from?”

 

The important thing to remember with this attack is that everything that is reported to you, every single thing you see that is “looked up” from somewhere, has to be processed by something to decode and format it. I was busily chasing a promising-looking source IP when I realised this. Go and have a look at the interaction charts on VirusTotal if you need proof. Almost everything on there is one or two hops away from malware and a suspicious foreign IP. You’d be forgiven for thinking you had got to the root of the problem when seeing that in a submission you make, especially if you find a manifest file that looks like it’s written in Chinese.

 

I can almost guarantee you it’s encoded with UTF-32 or similar. Mine decoded to a reference to “bytewise-leveldb”, after I’d finished tripping over control characters. 
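A sketch of how plain ASCII text can end up looking like Chinese. I'm using UTF-16 (little-endian) here as an assumption for illustration, because it shows the effect cleanly; I can't say which encoding was actually involved in my file. Every pair of lowercase ASCII bytes, read as one 16-bit unit, lands in the CJK ideograph range.

```python
# The text my manifest eventually decoded to, as plain ASCII bytes.
ascii_bytes = b"bytewise-leveldb"

# Read the same bytes as UTF-16 little-endian: two bytes per character.
misread = ascii_bytes.decode("utf-16-le")


def looks_cjk(ch):
    """True if the character is in the main CJK Unified Ideographs block."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


print(misread)
print([hex(ord(ch)) for ch in misread])
print(all(looks_cjk(ch) for ch in misread))

# Reversing the wrong decode gives the original text back.
recovered = misread.encode("utf-16-le").decode("ascii")
print(recovered)
```

So a file that "looks Chinese" may just be ordinary text being read with the wrong encoder.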

 

Changing results from first run to rerun

 

So, I ran a trace on MxToolbox. It returned a GET call to the API and two Google ad collect calls. All looked very standard. Lookup, advert, advert, get next advert…

RichardDrozda_0-1626442667966.png

Except that for some reason, some icons could be highlighted whilst others couldn’t.

Midway through investigating the next block, my computer went from low power warning to flat in about 30 seconds, which isn’t its usual staying power. When I reloaded my .har, the waterfall for this section looked like this, which is not the same as my experience of waiting for those events to complete.

RichardDrozda_1-1626442667968.png
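A .har file is JSON, so it can be read without the browser's waterfall view getting in between. Below is a minimal sketch that lists the requests in start order and takes a fingerprint of the capture, so that a later reload can be compared against the original. The sample data, URLs and IP addresses are hypothetical; the field names (startedDateTime, time, request, response, serverIPAddress) are the ones I understand the HAR 1.2 format to use, and sorting the timestamps as text assumes they all share one format and time zone.

```python
import hashlib
import json

# Hypothetical two-entry capture, for illustration only.
SAMPLE_HAR = json.dumps({
    "log": {
        "version": "1.2",
        "entries": [
            {"startedDateTime": "2021-07-16T12:00:01.500Z", "time": 42.0,
             "request": {"method": "GET", "url": "https://example.invalid/collect"},
             "response": {"status": 200},
             "serverIPAddress": "192.0.2.20"},
            {"startedDateTime": "2021-07-16T12:00:01.000Z", "time": 120.5,
             "request": {"method": "GET", "url": "https://example.invalid/api/lookup"},
             "response": {"status": 200},
             "serverIPAddress": "192.0.2.10"},
        ],
    }
})


def summarise_har(har_text):
    """Return one row per request, sorted by start time."""
    har = json.loads(har_text)
    rows = []
    for entry in har["log"]["entries"]:
        rows.append({
            "started": entry["startedDateTime"],
            "method": entry["request"]["method"],
            "url": entry["request"]["url"],
            "status": entry["response"]["status"],
            "ip": entry.get("serverIPAddress", ""),  # optional field
            "ms": entry["time"],
        })
    rows.sort(key=lambda row: row["started"])
    return rows


def fingerprint(har_text):
    """SHA-256 of the capture, to check whether a reloaded copy is identical."""
    return hashlib.sha256(har_text.encode("utf-8")).hexdigest()


for row in summarise_har(SAMPLE_HAR):
    print(row["started"], row["method"], row["url"], row["ip"], row["ms"])
print(fingerprint(SAMPLE_HAR))
```

If the fingerprint taken at save time matches the one taken on reload, the file itself has not changed and any difference is in how it is being displayed.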

 

Another data burst arrived, uninvited

 

RichardDrozda_2-1626442667972.png

The next block of activity looked like the above. I'd not done anything to trigger it that I know of, unless phoning mum counts?

 

Note how the “value=” entries repeat and grow with each iteration. Each has a different IP.

 

Font glyphicons-halflings-regular.woff2 – a font family. The request type is common, like GET favicon.ico, and is normally seen as formatting / background essentials. It arrived at the party a little late, and from local cache too.

https://mxtoolbox.com/fonts/glyphicons-halflings-regular.woff2 - copy of the “link”

data:image/gif, - the link response (served from cache). Note that blocking “data:” breaks so many things, so this is a good place to keep anything you need to call, at any time, from just about anywhere, thanks to intents.
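A "data:" link carries its content inside the link itself, after the comma, so nothing needs to be fetched from a server to resolve it. A minimal sketch of pulling one apart follows; the function name is mine, and the base64 example holds only the 6-byte GIF header rather than a full image.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(uri):
    """Split a data: URI into (media type, payload bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, comma, payload = uri[len("data:"):].partition(",")
    if not comma:
        raise ValueError("data: URI has no comma")
    is_base64 = header.endswith(";base64")
    media_type = header[:-len(";base64")] if is_base64 else header
    if is_base64:
        data = base64.b64decode(payload)
    else:
        data = unquote_to_bytes(payload)  # percent-decoding
    return media_type, data


# The response seen in the trace: a GIF media type with an empty payload.
print(parse_data_uri("data:image/gif,"))

# Illustration only: base64 of just the 6-byte GIF header.
print(parse_data_uri("data:image/gif;base64,R0lGODlh"))
```

In the trace above the part after the comma is empty, so that particular response carries a type but no content.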

 

Although the request was reported as served from memory cache, who.is says this is an Amazon Web Services IP, 13.33.52.74:80, and the DNS entry is: SOA 899 dns-external-master.amazon.com root@amazon.com 14569 3600 900 604800 900

 

My old, usually reliable command prompt hates it when ports are used on a request. So do most online lookups, so I’d removed the “:80”. However, the final three items on the list, whilst going to the same URL, resolved to three different IP addresses, so I put the “:80” back.

 

Somehow (again) this happened:

RichardDrozda_3-1626442667980.png

 

So I looked up the content of the record too:

RichardDrozda_4-1626442667986.png

 

…and spotting another rabbit escaping my grasp, I thought: “if whatever is happening on my PC can redirect traffic based on additions to the query part of the URL, the part that tells the server how to process the request rather than telling the PC where to find the server, then let’s presume the IP was processed as an IP, which brings us to the first of the above screenshots, and THEN add the :80.”

This time we got to:

 

RichardDrozda_5-1626442667990.png

 

dns-external-master.amazon.com80       SOA              86394                 a.root-servers.net nstld@verisign-grs.com 2021071600 1800 900 604800 86400
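A sketch of how a URL parser separates host from port, and what happens to the name when the colon is lost. It uses urlsplit from Python's standard library; the helper names are mine. With the colon, "80" is a port. Without it, "com80" becomes the last label of the hostname, which is a different name altogether. One possible reading of the record above, and it is only my assumption, is that the lookup site treated the whole string as a name: the root zone's own SOA record names a.root-servers.net and nstld@verisign-grs.com, and that is what a resolver typically hands back when the top-level label does not exist.

```python
from urllib.parse import urlsplit


def host_and_port(url):
    """Return (hostname, port) as a standard URL parser sees them."""
    parts = urlsplit(url)
    return parts.hostname, parts.port


def top_level_label(hostname):
    """Return the last dot-separated label of a hostname."""
    return hostname.rstrip(".").rsplit(".", 1)[-1]


with_colon = "http://dns-external-master.amazon.com:80/"
without_colon = "http://dns-external-master.amazon.com80/"

for url in (with_colon, without_colon):
    host, port = host_and_port(url)
    print(host, port, top_level_label(host))
```

The same check can be run on an IP address with a port, for example "http://13.33.52.74:80/", to confirm which part a tool is treating as the address.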

 

… and a VeriSign reference, which is remarkable for being the only certificate I’ve ever exported from my certificate store that (on reopening) became TWO separate certificates and, if that’s not enough, had a “logotype” field in the properties that contained both a link to the logo and a lot of other data too.

 

Learning from this section

  • The language a file is written in may not be language, it could be encoding.
  • A computer, when not influenced by any malicious external force, will process the same request, in the same way, using the same application. An application like my browser suddenly being able to process the port in an IP via a lookup site is worrying, especially as the results differed. We know a port is a way to speak to a given server at a given address, but now it has become an address / server local modification.
  • What else are we aware of that can modify a URL? Yes – cookies.
  • The "URL" or "IP" that you think you are entering may not correspond to the query your PC is processing. I can't think of a better way than that to keep security professionals running around in circles whilst most people continue largely unaware. This means a potential sword hanging over the head of all who investigate such issues. 
  • Presuming that “logotype” isn’t a standard feature of the certificate released by VeriSign way back when (and I presume it’s not, given its relative uniqueness in my certificate store, and the fact that looking up information about it is virtually impossible) – we have a possible route of entry bringing not just a URL, but a lot of surrounding data with it.

Looking at the certificates there now:

One has a subject with:

OU = www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign

OU = VeriSign International Server CA - Class 3

OU = VeriSign, Inc.

O = VeriSign Trust Network

 

…and the one I remembered from 1997 (sadly there is no way to check whether the data is the same) has this:

The logo call:

“30 5f a1 5d a0 5b 30 59   0_.].[0Y

30 57 30 55 16 09 69 6d   0W0U..im

61 67 65 2f 67 69 66 30   age/gif0

21 30 1f 30 07 06 05 2b   !0.0...+

0e 03 02 1a 04 14 8f e5   ........

d3 1a 86 ac 8d 8e 6b c3   ......k.

cf 80 6a d4 48 18 2c 7b   ..j.H.,{

19 2e 30 25 16 23 68 74   ..0%.#ht

74 70 3a 2f 2f 6c 6f 67   tp://log

6f 2e 76 65 72 69 73 69   o.verisi

67 6e 2e 63 6f 6d 2f 76   gn.com/v

73 6c 6f 67 6f 2e 67 69   slogo.gi

66                        f”

… and different properties in the certificate details screens we can view.

RichardDrozda_6-1626442667999.png
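The bytes in the logo call above can be picked apart rather than guessed at. The layout looks like ASN.1 DER (tag, length, value, nested), and RFC 3709 does define a logotype extension for X.509 certificates that carries a media type, a hash and a link. Whether this field is that extension is my assumption. The sketch below is a bare-bones walker I wrote for this dump, not a full ASN.1 parser.

```python
# Hex bytes copied from the dump above (the ASCII column is left out).
DUMP = (
    "30 5f a1 5d a0 5b 30 59 "
    "30 57 30 55 16 09 69 6d "
    "61 67 65 2f 67 69 66 30 "
    "21 30 1f 30 07 06 05 2b "
    "0e 03 02 1a 04 14 8f e5 "
    "d3 1a 86 ac 8d 8e 6b c3 "
    "cf 80 6a d4 48 18 2c 7b "
    "19 2e 30 25 16 23 68 74 "
    "74 70 3a 2f 2f 6c 6f 67 "
    "6f 2e 76 65 72 69 73 69 "
    "67 6e 2e 63 6f 6d 2f 76 "
    "73 6c 6f 67 6f 2e 67 69 "
    "66"
)


def walk_der(data, depth=0, nodes=None):
    """Collect (depth, tag, value) for every tag-length-value element."""
    if nodes is None:
        nodes = []
    pos = 0
    while pos < len(data):
        tag = data[pos]
        length = data[pos + 1]
        pos += 2
        if length & 0x80:  # long form: low 7 bits count the length bytes
            count = length & 0x7F
            length = int.from_bytes(data[pos:pos + count], "big")
            pos += count
        value = data[pos:pos + length]
        pos += length
        nodes.append((depth, tag, value))
        if tag & 0x20:  # constructed: the value is itself a run of elements
            walk_der(value, depth + 1, nodes)
    return nodes


def decode_oid(value):
    """Turn OBJECT IDENTIFIER bytes into dotted form (first arc 0 or 1 only)."""
    arcs = [value[0] // 40, value[0] % 40]
    acc = 0
    for byte in value[1:]:
        acc = (acc << 7) | (byte & 0x7F)
        if not byte & 0x80:  # high bit clear marks the last byte of an arc
            arcs.append(acc)
            acc = 0
    return ".".join(str(arc) for arc in arcs)


nodes = walk_der(bytes.fromhex(DUMP))
strings = [v.decode("ascii") for _, tag, v in nodes if tag == 0x16]  # IA5String
oids = [decode_oid(v) for _, tag, v in nodes if tag == 0x06]         # OID
digests = [v.hex() for _, tag, v in nodes if tag == 0x04]            # OCTET STRING

print(strings)
print(oids)
print(digests)
```

Read this way, the field holds a media type, a link to the logo, and a 20-byte value labelled with the object identifier 1.3.14.3.2.26, which is the identifier for SHA-1. That would make the "lot of other data" a hash of the image rather than anything more exotic, though I can't confirm that from the dump alone.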

 

 

I’ve tried to get help from the authorities to find out who did this to my computers, and who continues to have influence over them despite factory reset after reset and even buying new devices.

Can anyone in the community suggest anything? I’ve not even started on what the TPM credentials look like yet, it’s too depressing.