Sunday, March 21, 2010

Mining Data Mining Camp Impressions

Data Mining Camp, organized by Patricia Hoffman and the San Francisco Bay Area Chapter of the ACM (Association for Computing Machinery), uses an Open Space Technology (OST) approach: there is no formal agenda beyond the overall data mining theme. Except for the expert panel, sessions are compiled on the fly, based on real-time interest and participation.

Overall, the unconference, with a new location and almost doubled attendance, was a success, even if judged only by the results of Twitter sentiment analysis tools (themselves the subject of a not-so-successful camp session): tweetfeel, twitrratr and twendz.

Obviously, completely ad-hoc sessions could be a bit chaotic. Even though organizers briefly presented their topics and rooms were assigned after counting a show of hands, there were surprises and unmet expectations. Here are sample quotes:
DataJunkie: OMFG Chaos trying to set up and plan which sessions to attend. Idea: have an online vote, and use sim annealing for scheduling. #DMCAMP

ihat: some sessions at #dmcamp have very low signal-to-noise... feature selection referenced tibshirani and boyd. and ppl butchered their work...

Many people preferred traditional formats to round table discussions: tutorials were the most attended sessions, while discussions were either the most or the least liked sessions. The arrangement of chairs and the look of the room set participants' expectations in advance, so some organizers who did not really plan to present had to come up with slides or tutorials. Great observation by Dominique Levin:

Room shape impacts success of un-conference: Circle of chairs works wonders to solicit audience participation at #dmcamp. Circle time!

Another interesting observation was that LinkedIn turned out to be the most effective marketing tool for the conference. Twitter and other social networks did not seem to have an impact. The explanation could be very simple though: the age group and education level of the target audience.

The retweet winner was Chris Wensel (interpreted as an "influencer" by Twitter data mining tools): his message "Facebook dropped Cassandra for inbox search and hired HBase person to switch" was retweeted 18 times.

