Multi-dimensional data visualization

October 1st, 2009 by kowsik

Way back in grad school, I was working on a project involving Auralization. The key idea was that your ear can process multi-dimensional data (pitch, volume, instruments, silence, tempo, etc.) way better than your eyes can (try closing your eyes and listening to a Bach Fugue). So back then, we tried to take these types of data (stocks, sales reports, expenses, etc.) and create MIDI files from them to understand trends. Ever since I saw Hans Rosling’s TED Talk, I’ve wondered about the applicability of this type of visualization to something other than economics.

Enter pcapr! With the recent influx of packet captures on pcapr, we are rapidly exceeding the amount of data that one person can process, and searches become harder since there are just so many protocols and packets. So we focused on a few questions to unravel the meaning of it all:

  • How do the coverage and number of pcaps for a given protocol trend over time?
  • When was a protocol first introduced into pcapr?
  • How do I quickly get to a pcap uploaded on a certain date?
  • What is 42 and what does it have to do with packet captures?

Gapminder, the software used by Hans Rosling, has since been converted by Google into a visualization mashup component. And CouchDB, which powers all of pcapr, provides all the necessary ingredients to rapidly harness data using map/reduce.

Armed with a 20-line Ruby script, we extracted 5-dimensional data from pcapr:

[ time, protocol, %coverage, #pcaps, %contribution ]

and ended up with Trends.
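
For the curious, here is a minimal sketch of what such an extraction could look like. This is not the actual script: the record layout, the field names and the sample numbers are all made up for illustration, and so are the definitions of the percentages (coverage is assumed to be the average per-pcap coverage for a protocol in a month, contribution the share of that month's pcaps containing the protocol). It also runs the map and reduce steps in memory, whereas in CouchDB they would live in a view.

```ruby
require 'date'

# Hypothetical pcap records; the field names and coverage numbers are
# invented for illustration and are not pcapr's actual schema.
PCAPS = [
  { uploaded_on: Date.new(2009, 8, 3),  coverage: { 'dns' => 40.0, 'http' => 60.0 } },
  { uploaded_on: Date.new(2009, 8, 20), coverage: { 'http' => 80.0 } },
  { uploaded_on: Date.new(2009, 9, 5),  coverage: { 'dns' => 20.0, 'sip' => 50.0 } }
].freeze

# "Map" step: emit one [[month, protocol], coverage] pair per protocol in a pcap.
def map_pcap(pcap)
  month = pcap[:uploaded_on].strftime('%Y-%m')
  pcap[:coverage].map { |protocol, pct| [[month, protocol], pct] }
end

# "Reduce" step: group the emitted pairs by key and build one
# [time, protocol, %coverage, #pcaps, %contribution] row per key.
# The two percentages use the assumed definitions described above.
def trend_rows(pcaps)
  emitted = pcaps.flat_map { |pcap| map_pcap(pcap) }
  pcaps_per_month = pcaps.group_by { |p| p[:uploaded_on].strftime('%Y-%m') }
                         .transform_values(&:size)

  emitted.group_by(&:first).sort.map do |(month, protocol), pairs|
    coverages = pairs.map(&:last)
    count = coverages.size
    avg_coverage = (coverages.sum / count).round(1)
    contribution = (100.0 * count / pcaps_per_month[month]).round(1)
    [month, protocol, avg_coverage, count, contribution]
  end
end

trend_rows(PCAPS).each { |row| puts row.inspect }
```

Each row has the same five-column shape as the tuple above, with one row per month and protocol, which is roughly the layout a motion chart wants to consume.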


Pcapr Trends

What you find is a beautiful orchestration of multi-dimensional data visualized in a nifty way. As you drag the slider, it’s immediately obvious when a protocol entered pcapr, the overall coverage of the pcaps that contained the protocol, the total number of pcaps that contained it and, finally, the meaning of 42.

Just so you know, there were no packets harmed in this process!

Posted in CouchDB, UI, pcapr, Announcements, Ruby, Tools

7 Responses

  1. Twitter Trackbacks for Mu Dynamics Research Labs » Blog Archive » Multi-dimensional data visualization [mudynamics.com] on Topsy.com

    […] Mu Dynamics Research Labs » Blog Archive » Multi-dimensional data visualization labs.mudynamics.com/2009/10/01/multi-dimensional-data-visualization […]

  2. Jason Davies

    Wow, this is amazing. I’ve been combining protovis with CouchDB recently but I didn’t know Hans Rosling’s software had been made available by Google.

  3. Cedric Bonhomme (cedricbonhomme) 's status on Friday, 02-Oct-09 08:29:48 UTC - Identi.ca

    […] http://labs.mudynamics.com/2009/10/01/multi-dimensional-data-visualization/ that would be cool for IP-Link ( http://ip-link.wikidot.com/ ) […]

  4. James Marca

    Cool stuff! Glad I finally remembered to check this out on my machine where Flash doesn’t crash.

    I too have scads of data in CouchDB and this is an awesome idea for displaying it. Any gotchas or tricks you used on the CouchDB end would be helpful (like, did you use a full map reduce, or just a null document when building the view?). I for one would appreciate any tips you could post up, say on the CouchDB wiki?

  5. Tims Blog » Blog Archive » Another hurdle

    […] multi-dimensional data visualization […]

  6. Alfred Inselberg

    Real Nice Blog

    Allow me to recommend Parallel Coordinates - This book is about visualization, systematically incorporating
    the fantastic human pattern recognition into the problem-solving …
    www.springer.com/mathematics/numerical…/978-0-387-21507-5 -

    which is now available. It contains an easy to read chapter (10) on Data Mining. Among
    others, I received a wonderful compliment from Stephen Hawking who also recommended
    this “valuable book” to his students.

    Best regards

    Alfred

    — Alfred Inselberg, Professor
    School of Mathematical Sciences
    Tel Aviv University
    Tel Aviv 69978, Israel

  7. kowsik

    Thanks Alfred. I’m aware of parallel coordinates, and it definitely looks like a cool way to visualize n-dimensional data. I’ve been thinking of adding that to http://www.pcapr.net/xtractr to visualize trends within packets.
