Sunday, February 12, 2017
Technologies unfolding before our eyes
Sunday, March 28, 2010
Health Data, Self-serve, Visualization, Semantic Analysis and Collective Intelligence
The title for the session was "Biomedical Data Mining: Successes, Failures, and Challenges" (streamed online from Fireside C).
The topic stemmed from last year's similarly named session, Biomedical Data Mining: Dimensionality, Noise, Applications, which has now split into several discussions, including Bioinformatics & Genome Sequencing, organized by Raymond McCauley, and Dimensionality Reduction, moderated by Luca Rigazio (HLDA / HDA; LDPP; Core Vector Machines; Sparse Proj SQ; Random Projection and Feature Selection).
Main sub-topics of the biomedical session were:
- Reality Mining
- Visualization
- Imaging
- Signal Processing
Another interesting aspect of reality mining is crowdsourcing, or collective intelligence. To get useful information from all the data (temporospatial location, GPS, activity, food, symptoms, behavior, communication content, proximity sensing), we need to analyze it not only at the individual level but also at the group level. We need to share more, without sacrificing privacy and security. Collective contributions can be reliable: Shamod Lacoui's answer is to select those who contribute and to restrict inputs to a domain. This would help to “filter out the dross” while “saving the best”. Noise must be suppressed in order to infer intelligence from the collection of facts, clicks, steps, or whatever else one can contribute. This resonates with discussions at the Transparency Camp - one of the useful tools is SwiftRiver, a free, open source software platform to validate and filter news. Swift relies on Natural Language Processing, Machine Learning and Veracity Algorithms to track and verify the accuracy of reports and suppress noise (like duplicate content, irrelevant cross-chatter and inaccuracies). Transparency Camp also posed the question of whether there is a need for an FDA-like institution to ensure information safety and healthy information consumption.
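As a rough illustration of what suppressing duplicate content can look like, here is a minimal Python sketch. It is not SwiftRiver's algorithm: the word-shingle Jaccard similarity, the function names, the 0.6 threshold and the sample reports are all assumptions made for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text, punctuation ignored."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(reports, threshold=0.6):
    """Keep the first report of each near-duplicate group, in input order.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    kept = []
    kept_shingles = []
    for report in reports:
        s = shingles(report)
        if all(jaccard(s, other) < threshold for other in kept_shingles):
            kept.append(report)
            kept_shingles.append(s)
    return kept


# Made-up reports, for illustration only.
reports = [
    "Road closed near the clinic after flooding this morning",
    "Road closed near the clinic after flooding this morning!!",
    "Flu cases are rising in the northern district",
]
unique_reports = drop_near_duplicates(reports)
```

Here the second report differs from the first only in punctuation, so it is dropped, while the third is kept. Comparing every report against every kept report is quadratic, which is fine for a sketch but would need hashing or indexing on a real stream.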
Self-serve was the topic of a smaller Data Mining Camp session. Even though it was aimed at sales reps who need to go beyond Excel spreadsheets to mine private data of interest to them, self-service is currently the only option for health care consumers. People need to analyze everyday life for health implications. They need better tools so that they do not focus on metrics that are easy to collect instead of the metrics that need to be collected.
To mine the high-dimensional health space, many disparate types of data should be mashed up and validated, gaps should be bridged, and structured metadata should be added to the data. Randy Kerber talked about data formats and approaches to make this happen. Semantic web discussions involved NoSQL experts, who mentioned limitations of increasingly popular technologies such as MongoDB, Cassandra and HBase. Another relevant session, on cloud computing, discussed its (sometimes over-rated?) performance and Hadoop technologies.

One of the participants of the Biomedical Session developed kidsdata.org (@kidsdata on Twitter). It provides insights into geospatial autism statistics and visualizes trends and other useful health-related information.
Another way to display geospatial data is Dynamic Choropleth Maps. Complex networks can also be explored with alluvial diagrams and other approaches. More visualization techniques and tools can be applied to health data - to look at the data in new ways and gain useful insights.
Some of the questions from the audience were on the availability of data. Sources discussed included CDC (see, for example, NHANES laboratory files; eHealth metrics) and Entrez Life Sciences databases.
Signal Detection and Signal Processing for Mining Information was another discussion topic.
Questions were on data mining versus simple tracking and signal monitoring. It was agreed that data mining is the key to health management. Cardionet, body sensors (see posts on teletracking, M-health, Telemedicine: part 1; Telemedicine: part 2; Health 2.0 Software tools, Devices to keep you healthy), SNP detection, telemedicine applications, random and rare electrocardiographic events and other applications were also discussed.
See other materials from Data mining Camp 2010:
- ACM Data Mining Camp, March 20, 2010 (sfbayacm.org)
- Data Mining Camp Report (Zemanta)
- Mining Data mining Camp Impressions (Aurametrix)
- My Experience at ACM Data Mining Camp #DMcamp (bytemining.com)
- Transparency Camp, March 27-28, 2010 (The Sunlight Foundation)
- Notes from Transparency Camp (David “Oso”, Rising Voices)
Monday, December 21, 2009
The Power Of Visualization

The figure was generated by (i) clustering the original networks observed at each time point; (ii) generating and clustering the bootstrap replicate networks for each time point; (iii) determining the significance of the clustering at each time point; and finally, to reveal stories in the network data and connect the changes between time points, (iv) generating an alluvial diagram.
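Steps (i) to (iii) can be sketched for a single time point in a few lines of Python. This is only an illustration under loud assumptions: connected components stand in for whatever clustering method produced the figure, replicates are built by resampling edges with replacement, and "significance" is reduced to how often two nodes stay in the same cluster. The function names and the toy network are hypothetical, and step (iv), drawing the alluvial diagram, is not covered.

```python
import random
from collections import defaultdict
from itertools import combinations


def connected_components(nodes, edges):
    """Stand-in clustering: label each node by its connected component."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return {n: find(n) for n in nodes}


def co_clustering_support(nodes, edges, replicates=200, seed=0):
    """For each pair of nodes clustered together in the original network,
    return the fraction of bootstrap replicates that keep them together."""
    rng = random.Random(seed)
    original = connected_components(nodes, edges)  # step (i)
    pairs = [(a, b) for a, b in combinations(sorted(nodes), 2)
             if original[a] == original[b]]
    together = defaultdict(int)
    for _ in range(replicates):
        # Step (ii): build a replicate by resampling edges with replacement,
        # then cluster it the same way as the original network.
        sample = [rng.choice(edges) for _ in edges]
        labels = connected_components(nodes, sample)
        for a, b in pairs:
            if labels[a] == labels[b]:
                together[(a, b)] += 1
    # Step (iii): support close to 1.0 means the grouping is stable.
    return {pair: together[pair] / replicates for pair in pairs}


# Toy network, for illustration only.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
support = co_clustering_support(nodes, edges)
```

Pairs with low support are the ones whose grouping depends on a few particular edges, which is the kind of clustering one would not want to highlight when comparing time points.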
The technology used to generate biological networks is not, and cannot be, very accurate. Yeast two-hybrid screening of protein-protein interactions, for example, has "medium" accuracy, with false positives on the order of 50%. A recent article in PNAS, "Missing and Spurious Interactions and the Reconstruction of Complex Networks" by Guimera et al., outlines a method to accurately analyze different types of complex networks. It relies strongly on the fact that the nodes in networks can be organized into groups, and that group membership governs which nodes can interact.
Many other, simpler data visualization tools and methods exist that are suited to other specific data-rich applications. IBM's Many Eyes site provides a variety of ways to analyze text, maps, trends and relationships among data points.
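The idea that group membership governs which nodes interact can be illustrated with a small Python sketch. It assumes a single, already known partition of the nodes and scores every node pair with a smoothed link density, (l + 1) / (r + 2), where l is the number of observed links between the two groups and r is the number of possible links between them. The article's method considers many partitions rather than one, which this sketch omits; the function name, the partition and the toy links are made up for the example.

```python
from itertools import combinations


def link_reliability(groups, edges):
    """Score each node pair by (l + 1) / (r + 2) for its pair of groups.

    groups: dict mapping node -> group label (an assumed, fixed partition)
    edges:  iterable of undirected observed links (u, v), no self-loops
    """
    observed = {frozenset(e) for e in edges}
    sizes = {}
    for g in groups.values():
        sizes[g] = sizes.get(g, 0) + 1

    def block(u, v):
        # Unordered pair of group labels, used as a dictionary key.
        return tuple(sorted((groups[u], groups[v])))

    links = {}
    for e in observed:
        u, v = tuple(e)
        links[block(u, v)] = links.get(block(u, v), 0) + 1

    scores = {}
    for u, v in combinations(sorted(groups), 2):
        ga, gb = block(u, v)
        if ga == gb:
            possible = sizes[ga] * (sizes[ga] - 1) // 2  # pairs within one group
        else:
            possible = sizes[ga] * sizes[gb]  # pairs across two groups
        scores[(u, v)] = (links.get((ga, gb), 0) + 1) / (possible + 2)
    return scores


# Toy partition and links, for illustration only.
groups = {"p1": "A", "p2": "A", "p3": "A", "p4": "B", "p5": "B"}
edges = [("p1", "p2"), ("p2", "p3"), ("p4", "p5"), ("p3", "p4")]
scores = link_reliability(groups, edges)
```

In this toy case the unobserved pair p1-p3 sits inside a densely linked group and scores 0.6, a candidate missing interaction, while the observed link p3-p4 crosses two sparsely connected groups and scores 0.25, a candidate spurious one.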
Bubble diagrams (also known as bubble charts and spray diagrams) are centered around one (like mind maps) or more concepts. They can then be used to compare and contrast concepts, and identify the common ground and the areas of difference. Diagramic builds such diagrams from spreadsheets and text files.
The World Wide Web makes huge quantities of information readily available. "How do you take a big collection of things and make sense out of it?" asks Gary Flake, founder and director of Microsoft Live Labs, which designs experimental Web tools. The lab's answer to this question is Pivot, a tool recently released to the public - see a demonstration Flake gave at the TED conference in Long Beach, CA.
Ajax.org
This platform is a pure javascript application framework for creating real-time collaborative applications that run in the browser.
AnyChart
AnyChart is a flexible Flash based solution that allows you to create interactive and great looking flash charts.
Axiis
Axiis is a Data Visualization Framework for Flex. It has been designed to be a concise, expressive, and modular framework that let developers and designers create compelling data visualization solutions.
BirdEye
BirdEye is a community project to advance the design and development of a comprehensive open source information visualization and visual analytics library for Adobe Flex. The actionscript-based library enables users to create multi-dimensional data visualization interfaces for the analysis and presentation of information.
Degrafa
Degrafa is a declarative graphics framework for creating rich user interfaces, data visualization, mapping, graphics editing and more.
DojoX Data Chart
An addition in the Dojo 1.3 release is the new dojox.charting class. Its primary purpose is to make connecting a chart to a Data Store a simple process.
Chronoscope
If you need to visualize thousands or millions of points of data, check this out. Very well designed and can be navigated with the keyboard or mouse. There's a Javascript API, a Google Visualization API or try it as a Google Gadget on Google Spreadsheets, iGoogle, or Open Social.
Dundas
Dundas provides a wide range of data visualization solutions for Microsoft technologies. They offer a number of data visualization tools including: Chart, Gauge, Map and Calendar for .net and Dashboards for Silverlight.
ExtJs
Ext JS is a cross-browser JavaScript library for building rich internet applications. It also includes charts.
Flex
Flex has built-in chart controls: area, bar, bubble, candlestick, column, HLOC, line, pie, plot.
Flex uses FXG, a graphical interchange format developed by Adobe and is similar in many ways to SVG. Nice article here by James Whittaker looking at FXG and Degrafa. See quick tutorial and article on Creating Visual Experiences with Flex 3.0.
FlexMonster Pivot Table and Charts
Flexmonster provides Pivot table Flex/Flash components and rich internet application (RIA) development services.
FusionCharts
Animated flash charts for web apps.
Google Chart API
The Google Chart API lets you dynamically generate charts.
gRaphaël
gRaphaël is a Javascript library to help you create stunning charts on your website.
ILOG Elixir
Enhance data visualization within Flex and AIR applications with IBM ILOG Elixir.
JFreeChart
Creates charts such as bar charts, line charts, pie charts, time series charts, candlestick charts, high/low/open/close charts, wind plots, and meter charts. I wish these charts looked better out of the box, because the features and functionality are good, but the visual design really detracts from the graphs. JFreeChart guys- email me- together we can make the world of JFreeChart a prettier place.
JQuery Plugins
- Visualize by the Filament Group
- JQChart
- Flot
- Sparklines
- TufteGraph
JPowered
The PHP graphing scripts provide a very easy way to embed dynamically generated graphs and charts into PHP applications and HTML web pages.
JSCharts
JS Charts is a JavaScript chart generator that requires little or no coding. JS Charts allows you to easily create charts in different templates like bar charts, pie charts or simple line graphs.
Kap IT Labs Diagrammer and Visualizer
Kap Lab's Diagrammer provides ready-to-use yet highly customizable multi-layout data visualization and diagramming for Adobe Flex and Air.
Visualizer displays data as graphs to better visualize connections. Kap Lab's Visualizer provides ready-to-use yet highly customizable multi-layout data visualization for Adobe Flex and Air.
MilkChart
A simple to use, yet robust library for transforming table data into a chart. This library uses the HTML5 <canvas> tag and is only supported on browsers other than IE until ExCanvas gets proper text support.
Open Flash Charts
Open source Flash charts.
PlotKit
PlotKit is a Chart and Graph Plotting Library for Javascript. It has support for HTML Canvas and also SVG via Adobe SVG Viewer and native browser support.
Protovis
Protovis composes custom views of data with simple marks such as bars and dots. Unlike low-level graphics libraries that quickly become tedious for visualization, Protovis defines marks through dynamic properties that encode data, allowing inheritance, scales and layouts to simplify construction.
Silverlight
Microsoft Silverlight comes with the bar, line, pie, column, and scatter charts.
Telerik Charts for Silverlight, WPF, ASP.NET
Telerik Charts offers rich functionality and data presentation capabilities.
VisiFire
Visifire is a set of open source data visualization controls - powered by Microsoft® Silverlight™ & WPF.
yFiles for Ajax, .NET or Flex
The yFiles product family means state-of-the-art software components for the visualization of networks and diagrams. Unequaled automatic diagram layout, cutting edge graph analysis, and extraordinary visualization.
A few more resources:
Gephi, the open graph Viz platform
XML/SWF Charts is a simple, yet powerful tool to create attractive charts and graphs from XML data.
Bime, Browser-based business intelligence solution (based on Flex)
75+ tools to visualize your data
jQuery-based Highcharts
Flare is an ActionScript library for creating visualizations that run in the Adobe Flash Player
Microsoft Chart Controls, ASP.NET and Windows Forms Chart Controls for .NET Framework 3.5 SP1
Periodic table of visualization methods (data, information,concept, strategy, metaphor)
Popular and Creative visualization methods
jqPlot Charts and Graphs for jQuery
jQuery Sparklines
AnyChart Flash Chart Component
Chart FX from Software FX has lots of great products for every platform. Also check out MIT's SIMILE project
Google Visualization API with maps and magic-tables
InstantAtlas™ enables information analysts and researchers to create highly-interactive online reporting solutions that combine statistics and map data to improve data visualization, enhance communication, and engage people in more informed decision making.
Dygraph is an open source JavaScript library that produces interactive, zoomable charts of time series. It is designed to display dense data sets and enable users to explore and interpret them.
Mindsetgeometrics lets you build custom charts using predefined SVG files or Adobe Catalyst models.
Seer is a lightweight, semantically rich Ruby on Rails wrapper that provides a seamless interface for the Google Visualization API. It lets you easily create a visualization of data in a variety of formats with a single line of code.
YUI Charts from Yahoo
Silverlight Pivot Grid
AmCharts is a set of Flash charts for your websites and Web-based products. AmCharts can extract data from simple CSV or XML files, or they can read dynamic data generated with PHP, .NET, Java, Ruby on Rails, Perl, ColdFusion, and many other programming languages.