Tag Archives: security

Going beyond Wireshark: experiments in visualising network traffic

Introduction
At NVISO Labs, we are constantly trying to find better ways of understanding the data our analysts are looking at. This ranges from our SOC analysts looking at millions of collected data points per day all the way to the malware analyst tearing apart a malware sample and trying to make sense of its behaviour.

In this context, our work often involves investigating raw network traffic logs. Analysing these often takes a lot of time: a 1MB network traffic capture (PCAP) can easily contain several hundred different packets (and most are much larger!). Going through these network events in traditional tools such as Wireshark is extremely valuable; however they are not always the best to quickly understand from a higher level what is actually going on.

In this blog post we want to perform a series of experiments to try and improve our understanding of captured network traffic as intuitively as possible, by exclusively using interactive visualisations.

Screen Shot 2018-02-15 at 00.07.26.png

A screen we are all familiar with – our beloved Wireshark! Unmatched capabilities to analyse even the most exotic protocols, but scrolling & filtering through events can be daunting if we want to quickly understand what is actually happening inside the PCAP. In the screenshot a 15Kb sample containing 112 packets.

For this blog post we will use this simple 112 packet PCAP to experiment with novel ways of visualising and understanding our network data. Let’s go!

Experiment 1 – Visualising network traffic using graph nodes
As a first step, we simply represent all IP packets in our PCAP as unconnected graph nodes. Each dot in the visualisation represents the source of a packet. A packet being sent from source A to destination B is visualised as the dot visually traveling from A to B. This simple principle is highlighted below. For our experiments, the time dimension is normalised: each packet traveling from A to B is visualised in the order they took place, but we don’t distinguish the duration between packets for now.

network_traffic_as_nodes_raw.mov.gif

IP traffic illustrated as interactive nodes.

This visualisation already allows us to quickly see and understand a few things:

  • We quickly see which IP addresses are most actively communicating with each other (172.29.0.111 and 172.29.0.255)
  • It’s quickly visible which hosts account for the “bulk” of the traffic
  • We see how interactions between systems change as time moves on.

A few shortcomings of this first experiment include:

  • We have no clue on the actual content of the network communication (is this DNS? HTTP? Something else?)
  • IP addresses are made to be processed by a computer, not by a human; adding additional context to make them easier to classify by a human analyst would definitely help.

Experiment 2 – Adding context to IP address information
By using basic information to enrich the nodes in our graph, we can aggregate all LAN traffic into a single node. This lets us quickly see which external systems our LAN hosts are communicating with:

network_traffic_with_context_raw.mov.gif

Tagging the LAN segment clearly shows which packets leaves the network.

By doing this, we have improved our understanding of the data in a few ways:

  • We can very quickly see which traffic is leaving our LAN, and which external (internet facing) systems are involved in the communication.
  • All the internal LAN traffic is now represented as 1 node in our graph; this can be interesting in case our analyst wants to quickly check which network segments are involved in the communication.

However, we still face a few shortcomings:

  • We still don’t really have a clue on the actual content of the network communication (is this DNS? HTTP? Something else?)
  • We don’t know much about the external systems that are being contacted.


Experiment 3 – Isolating specific types of traffic
By applying simple visual filters to our simulation (just like we do in tools like Wireshark), we can make a selection of the packets we want to investigate. The benefit of doing this is that we can easily focus on the type of traffic we want to investigate without being burdened with things we don’t care about at that point in time of the analysis.

In the example below, we have isolated DNS traffic; we quickly see that the PCAP contains communication between hosts in our LAN (remember, the LAN dot now represents traffic from multiple hosts!) and 2 different DNS servers.

network_traffic_rogue_dns_raw.mov.gif

When isolating DNS traffic in our graph, we clearly see communication with a non-corporate DNS server.

Once we notice that the rogue DNS server is being contacted by a LAN host, we can change our visualisation to see which domain name is being queried by which server.We also conveniently attached the context tag “Suspicious DNS server” to host 68.87.71.230 (the result of our previous experiment). The result is illustrated below. It also shows that we are not only limited by showing relations between IP addresses; we can for example illustrate the link between the DNS server and the hosts they query.

network_traffic_rogue_dns_domains_raw.mov.gif

We clearly see the suspicious DNS server making request to 2 different domain names. For each request made by the suspicious DNS server, we see an interaction from a host in the LAN.

Even with larger network captures, we can use this technique to quickly visualise connectivity to suspicious systems. In the example below, we can quickly see that the bulk of all DNS traffic is being sent to the trusted corporate DNS server, whereas a single hosts is interacting with the suspicious DNS server we identified before.

network_traffic_rogue_dns_lots_of_traffic_raw.mov.gif

So what’s next?
Nothing keeps us from entirely changing the relation between two nodes; typically, we are used to visualising packets as going from IP address A to IP address B; however, more exotic visualisations as possible (think about the relations between user-agents and domains, query lengths and DNS requests, etc.); in addition, there is plenty of opportunity to add more context to our graph nodes (link an IP address to a geographical location, correlate domain names with Indicators of Compromise, use whitelists and blacklists to more quickly distinguish baseline vs. other traffic, etc.). These are topics we want to further explore.

Going forward, we plan on continuing to share our experiments and insights with the community around this topic! Depending on where this takes us, we plan on releasing part of a more complete toolset to the community, too.

our_rats_love_cheese_raw.mov.gif

Making our lab rats happy, 1 piece of cheese at a time!

Squeesh out!Ā šŸ€

About the author
Daan Raman is in charge of NVISO Labs,Ā the research arm of NVISO. Together with the team, he drives initiatives around innovation to ensure we stay on top of our game; innovating the things we do, the technology we use and the way we work form an essential part of this. Daan doesn’t like to write about himself in third-person. You can contact him at draman@nviso.be or find Daan online onĀ TwitterĀ and LinkedIn.

 

Stalking a reporter – behind the scenes!

Introduction
Around mid-October we got a call from a reporter working on an article covering online privacy and social media. Rather than writing about others, the reporter wanted to have his own story. So, he asked NVISO to research him on-line, and find out as much as possible about him! Of-course, after agreeing on some “ground rules” with the journalist, we were 100% up for it!

The ground rules that were put in place:

  • We would focus on mining only publicly available information, not make him a target of an attack.
  • We were not allowed to use social engineering tricks on him or his friends and family to get additional information.
  • In other words: we could use any information already available online, without actively asking for more.

The article that was recently published by the journalist can be found online (Dutch only, sorry!):Ā http://www.nieuwsblad.be/cnt/dmf20171107_03174488. For anyone interested in the “behind the scenes” on how we approached this – keep on reading!

Let’s hunt!

The team that stalks togetherā€¦

We assembled a small group of volunteers and got to the task. We created a repository to collectively track our findings, as all bits of information would help the other researchers to move further on their own search, or validate information pieces gathered from different sources. The starting point was obvious: We had the journalists’ name, the email address he used to propose the experiment and a phone number on his signature. Plenty of information to start from!

The first step was to find his Facebook profile. We quickly found out that the reporter does not use the combination name + last name as the profile name, making it more difficult to track down the profile (assuming that he actually has a Facebook profile šŸ˜Š). But as it often happens, some friends had mentioned his full name on publicly tagged Facebook pictures. We knew his face now! After that, identifying his own profile was possible by looking at metadata in the pictures, including the tagged friends. From there on we started building our file. The privacy settings for the profile were (unfortunately for us) quite restrictiveā€¦ luckily for us that was not the case for all his friends!

Screen Shot 2017-11-14 at 14.57.36

Facebook showing profile picture after you’ve entered a wrong password

We found his personal email account by guessing and trying to login with the email account to Facebook. Facebook shows you your profile picture when you say that you forgot your password. That is how we could link his personal email with Facebook. We correctly guessed that he also used this email for other social media and apps, and used the same method to see of he had an account.Ā From there onwards, figuring which other services he was using was easy. From there we could gather additional interests, routines, professional activities, social relationsā€¦

Of course, this kind of research leads to many false positives. In our case, someone with the same name happens to live close by to our reporter, and some of our data actually referred to that person. That is where crossing data from different sources comes in handy. It allows to discard some of the bits that donā€™t really match the puzzle.

During our investigations, we also discovered details on the ex-girlfriend of the journalist – her online activity proved to be an excellent source of information! Prolific Instagrammer, her account gave us a lot of info about travels they did together, pictures, friends… Why do we know she was his ex? They are not friends on Facebook anymore! We got no juicy stuff about the breakup, though.

With the parents name (which we found in a cached document on Google), we could find their house, pets, and social media accounts with additional clues… We could assemble a fairly decent family tree. We were also able to find his home address.

With these results we got back to our reporter. He was quite surprised by the things we found out without directly approaching him or his friends! He found particularly scary what we found out about his family and his ex-girlfriend. Ā He was surprised, though, not seeing his birthdate on our data list.

One step further … go phishing
After our initial investigations, we mutually agreed to take it one step further.Ā 

So what did we do ? We created a fake Facebook profile to trick him or his friends into sharing additional information, contacting his parent or just some good, old phishing to get his credentials and access his email account.Ā We opted for the last option.

We crafted an email based on his interests (which we already identified during the first part of the research). We sent him a link that sounded very relevant for him, so he would definitely try to check it it. And it worked, even though he knew he was going to be targeted by us in this time window. He clicked, and he was directed to a google authentication page. Google? Well, actually NVISO-owned, Google-looking. That is how we got his password.

Once we gain access to his account, we stopped the game. We called him and showed him we were in by sending a mail via his own private inbox to his work email. The challenge was completed. We left nicely, whiteout reading his emails.

Untitled

Emailing from within the inbox of the reporter to the reporter (yes, this is a dummy screenshot!)

In a meeting afterwards, we explained him how he was phished. Up till then he had no clue how he had given us his credential!Ā But we have to confess: still, we didnā€™t get his birthdate.

Conclusion
Most people aren’t too surprised anymore about the wealth of information available on each of us online. What is interesting, though, is how often we believe we are fine, just because we have our privacy settings nicely set and reviewed for all our accounts (as was the case with the journalists’ Facebook profile for example). That old account you forgot you had, the friends that tagged you, the university bulletin, a legal document or a nice note in memoriam of a loved one can give most valuable information to anyone who is interested.

Entering the active reconnaissance part, phishing once again proved a very reliable method to get additional details from a target – in our case, it even gave us full access to the journalists’ mailbox.

We are all on-line. Most of our life is these days, and that is not necessarily a bad thing. But it is important to remember that, despite all privacy and security rules we want to enforce, humans are still the weakest link. Understanding how we share information online and the impact it has on us and others is key in an increasingly digitalised world – we are happy to have contributed to the article & hope to have raised some awareness with the readers!