Multi-dimensional data visualization
Way back in grad school, I was working on a project involving auralization. The key idea was that your ear can process multi-dimensional data (pitch, volume, instruments, silence, tempo, etc.) way better than your eyes can (try closing your eyes and listening to a Bach fugue). So back then, we tried to take these types of data (stocks, sales reports, expenses, etc.) and create MIDI files out of them to understand trends. Ever since I saw Hans Rosling’s TED Talk, I’ve wondered about the applicability of this type of visualization to something other than economics.
Enter pcapr! With the recent influx of packet captures on pcapr, we are rapidly exceeding the amount of data that one person can process, and searches become harder because there are just so many protocols and packets. So we focused on a few questions to unravel the meaning of it all:
- How do the coverage and number of pcaps for a given protocol trend over time?
- When was a protocol first introduced into pcapr?
- How do I quickly get to a pcap uploaded on a certain date?
- What is 42 and what does it have to do with packet captures?
Gapminder, used by Hans Rosling, has since been converted by Google into a visualization mashup component. And CouchDB, which powers all of pcapr, provides all the necessary ingredients to rapidly harness data using map/reduce.
Armed with a 20-line Ruby script, we extracted 5-dimensional data from pcapr:
[ time, protocol, %coverage, #pcaps, %contribution ]
and ended up with Trends.
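To make the shape of that five-element tuple concrete, here is a minimal Ruby sketch. It is not the script used for pcapr. The sample records, the field names, and the `trend_rows` helper are all hypothetical, and the definitions of %coverage (share of pcaps uploaded so far that contain the protocol) and %contribution (share of all packets so far that belong to the protocol) are assumptions made for illustration.

```ruby
require 'date'

# Hypothetical input: one record per uploaded pcap, with a packet count per
# protocol. Field names and values are made up; this is not pcapr's schema.
PCAPS = [
  { date: Date.new(2009, 8, 3),  protocols: { 'dns' => 10, 'http' => 30 } },
  { date: Date.new(2009, 8, 20), protocols: { 'http' => 20 } },
  { date: Date.new(2009, 9, 5),  protocols: { 'dns' => 5, 'sip' => 15 } }
].freeze

# Builds rows of [time, protocol, %coverage, #pcaps, %contribution],
# one row per (month, protocol), cumulative up to and including that month.
# Assumed definitions:
#   %coverage     = pcaps containing the protocol / all pcaps so far
#   %contribution = packets of the protocol / all packets so far
def trend_rows(pcaps)
  rows = []
  months = pcaps.map { |p| p[:date].strftime('%Y-%m') }.uniq.sort
  months.each do |month|
    # Everything uploaded up to and including this month.
    seen = pcaps.select { |p| p[:date].strftime('%Y-%m') <= month }
    total_packets = seen.sum { |p| p[:protocols].values.sum }

    pcap_counts = Hash.new(0)   # protocol => number of pcaps containing it
    packet_counts = Hash.new(0) # protocol => number of packets
    seen.each do |p|
      p[:protocols].each do |proto, packets|
        pcap_counts[proto] += 1
        packet_counts[proto] += packets
      end
    end

    pcap_counts.keys.sort.each do |proto|
      coverage = (100.0 * pcap_counts[proto] / seen.size).round(1)
      contribution = (100.0 * packet_counts[proto] / total_packets).round(1)
      rows << [month, proto, coverage, pcap_counts[proto], contribution]
    end
  end
  rows
end

trend_rows(PCAPS).each { |row| puts row.inspect }
```

Each row maps onto one bubble at one position of the time slider: the protocol names the bubble, and the three numeric columns can drive its x position, y position, and size. On a real data set the per-protocol counting would more naturally live in a CouchDB view, with the script only reshaping the view output.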
What you find is a beautiful orchestration of multi-dimensional data visualized in a nifty way. As you drag the slider, it’s immediately obvious when a protocol entered pcapr, the overall coverage of the pcaps that contained the protocol, the total number of pcaps that contained that protocol, and finally the meaning of 42.
Just so you know, there were no packets harmed in this process!

October 1st, 2009 at 9:05 am
Wow, this is amazing. I’ve been combining protovis with CouchDB recently but I didn’t know Hans Rosling’s software had been made available by Google.
October 2nd, 2009 at 12:29 am
[…] http://labs.mudynamics.com/2009/10/01/multi-dimensional-data-visualization/ that would be cool for IP-Link ( http://ip-link.wikidot.com/ ) […]
October 16th, 2009 at 12:14 pm
Cool stuff! Glad I finally remembered to check this out on my machine where Flash doesn’t crash.
I too have scads of data in CouchDB and this is an awesome idea for displaying it. Any gotchas or tricks you used on the CouchDB end would be helpful (like, did you use a full map reduce, or just a null document when building the view?). I for one would appreciate any tips you could post up, say on the CouchDB wiki?
April 2nd, 2010 at 8:28 am
Real Nice Blog
Allow me to recommend Parallel Coordinates - This book is about visualization, systematically incorporating the fantastic human pattern recognition into the problem-solving …
www.springer.com/mathematics/numerical…/978-0-387-21507-5
which is now available. It contains an easy to read chapter (10) on Data Mining. Among others, I received a wonderful compliment from Stephen Hawking who also recommended this “valuable book” to his students.
Best regards
Alfred
— Alfred Inselberg, Professor
School of Mathematical Sciences
Tel Aviv University
Tel Aviv 69978, Israel
April 2nd, 2010 at 8:34 am
Thanks Alfred. I’m aware of parallel coordinates, and it definitely looks like a cool way to visualize n-dimensional data. I’ve been thinking of adding that to http://www.pcapr.net/xtractr to visualize trends within packets.