Thursday, December 31, 2009
According to Bill Vaughan, an optimist stays up until midnight to see the new year in. A pessimist stays up to make sure the old year leaves.
"Goodbye 2009" was tweeted more frequently as 2010 swept across the globe, but there were no pessimists left shortly after 2010 arrived on the islands of the Western Hemisphere.
2009 was a year to remember but not repeat. Everyone hopes for a better year, so "Happy New Year" will continue to be a trending topic for a while - even if not among the top 10.
Monday, December 21, 2009
The figure was generated by (i) clustering the original networks observed at each time point; (ii) generating and clustering the bootstrap replicate networks for each time point; (iii) determining the significance of the clustering at each time point; and finally (iv) generating an alluvial diagram to reveal the stories in the network data and connect the changes between time points.
The technology used to generate biological networks is not, and cannot be, very accurate: yeast two-hybrid screening of protein-protein interactions, for example, has only "medium" accuracy, with false positives on the order of 50%. A recent article in PNAS titled "Missing and Spurious Interactions and the Reconstruction of Complex Networks" by Guimera et al. outlines a method to accurately analyze different types of complex networks. It relies strongly on the fact that the nodes in networks can be organized into groups, and that group membership governs which nodes can interact.
Many other, simpler data visualization tools and methods exist that suit specific data-rich applications. IBM's Many Eyes site provides a variety of ways to analyze text, maps, trends and relationships among data points.
Bubble diagrams (also known as bubble charts and spray diagrams) are centered around one (like mind maps) or more concepts. They can then be used to compare and contrast concepts, and identify the common ground and the areas of difference. Diagramic builds such diagrams from spreadsheets and text files.
The World Wide Web makes huge quantities of information readily available. "How do you take a big collection of things and make sense out of it?" asks Gary Flake, founder and director of Microsoft Live Labs, which designs experimental Web tools. The lab's answer to this question is Pivot, a tool recently released to the public - see the demonstration Flake gave at the TED conference in Long Beach, CA.
AnyChart is a flexible Flash-based solution that allows you to create interactive and great-looking Flash charts.
Axiis is a Data Visualization Framework for Flex. It has been designed to be a concise, expressive, and modular framework that lets developers and designers create compelling data visualization solutions.
BirdEye is a community project to advance the design and development of a comprehensive open source information visualization and visual analytics library for Adobe Flex. The ActionScript-based library enables users to create multi-dimensional data visualization interfaces for the analysis and presentation of information.
Degrafa is a declarative graphics framework for creating rich user interfaces, data visualization, mapping, graphics editing and more.
DojoX Data Chart
An addition in the Dojo 1.3 release is the new dojox.charting class. Its primary purpose is to make connecting a chart to a Data Store a simple process.
Dundas provides a wide range of data visualization solutions for Microsoft technologies. They offer a number of data visualization tools, including Chart, Gauge, Map and Calendar for .NET and Dashboards for Silverlight.
Flex has built-in chart controls: area, bar, bubble, candlestick, column, HLOC, line, pie, plot.
Flex uses FXG, a graphical interchange format developed by Adobe that is similar in many ways to SVG. Nice article here by James Whittaker looking at FXG and Degrafa. See also the quick tutorial and the article on Creating Visual Experiences with Flex 3.0.
FlexMonster Pivot Table and Charts
Flexmonster provides Pivot table Flex/Flash components and rich internet application (RIA) development services.
Animated flash charts for web apps.
Google Chart API
The Google Chart API lets you dynamically generate charts.
Enhance data visualization within Flex and AIR applications with IBM ILOG Elixir.
Creates charts such as bar charts, line charts, pie charts, time series charts, candlestick charts, high/low/open/close charts, wind plots, and meter charts. I wish these charts looked better out of the box, because the features and functionality are good, but the visual design really detracts from the graphs. JFreeChart guys, email me: together we can make the world of JFreeChart a prettier place.
The PHP graphing scripts provide a very easy way to embed dynamically generated graphs and charts into PHP applications and HTML web pages.
Kap IT Labs Diagrammer and Visualizer
Kap Lab's Diagrammer provides ready-to-use yet highly customizable multi-layout data visualization and diagramming for Adobe Flex and Air.
Visualizer displays data as graphs to better visualize connections. Kap Lab's Visualizer provides ready-to-use yet highly customizable multi-layout data visualization for Adobe Flex and Air.
A simple-to-use, yet robust library for transforming table data into a chart. This library uses the HTML5 canvas tag and is only supported on browsers other than IE until ExCanvas gets proper text support.
Open Flash Charts
Open source Flash charts.
Protovis composes custom views of data with simple marks such as bars and dots. Unlike low-level graphics libraries that quickly become tedious for visualization, Protovis defines marks through dynamic properties that encode data, allowing inheritance, scales and layouts to simplify construction.
Microsoft Silverlight comes with bar, line, pie, column, and scatter charts.
Telerik Charts for Silverlight, WPF, ASP.NET
Telerik Charts offers rich functionality and data presentation capabilities.
Visifire is a set of open source data visualization controls - powered by Microsoft® Silverlight™ & WPF.
yFiles for Ajax, .NET or Flex
The yFiles product family means state-of-the-art software components for the visualization of networks and diagrams. Unequaled automatic diagram layout, cutting edge graph analysis, and extraordinary visualization.
A few more resources:
Gephi, the open graph Viz platform
XML/SWF Charts is a simple, yet powerful tool to create attractive charts and graphs from XML data.
Bime, Browser-based business intelligence solution (based on Flex)
75+ tools to visualize your data
Flare is an ActionScript library for creating visualizations that run in the Adobe Flash Player
Microsoft Chart Controls, ASP.NET and Windows Forms Chart Controls for .NET Framework 3.5 SP1
Periodic table of visualization methods (data, information, concept, strategy, metaphor)
Popular and Creative visualization methods
Software FX has lots of great products for every platform.
Also check out MIT's SIMILE project
Google Visualization API with maps and magic-tables
InstantAtlas™ enables information analysts and researchers to create highly-interactive online reporting solutions that combine statistics and map data to improve data visualization, enhance communication, and engage people in more informed decision making.
Mindsetgeometrics lets you build custom charts using predefined SVG files or Adobe Catalyst models.
Seer is a lightweight, semantically rich Ruby on Rails wrapper that provides a seamless interface for the Google Visualization API. It makes it easy to create a visualization of data in a variety of formats with a single line of code.
YUI Charts from Yahoo
Silverlight Pivot Grid
AmCharts is a set of Flash charts for your websites and Web-based products. AmCharts can extract data from simple CSV or XML files, or they can read dynamic data generated with PHP, .NET, Java, Ruby on Rails, Perl, ColdFusion, and many other programming languages.
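Of the tools listed above, the Google Chart API is probably the simplest to illustrate, because a chart is requested with nothing more than a URL. The sketch below is a minimal example, assuming the image-chart endpoint and its `cht`, `chs`, `chd` and `chl` query parameters; Google has since deprecated that endpoint, so the URL should be treated as illustrative only. The helper name `pie_chart_url` and the sample data are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the (since deprecated) Google image-chart service.
CHART_ENDPOINT = "https://chart.googleapis.com/chart"


def pie_chart_url(values, labels, width=400, height=200):
    """Build a request URL for a pie chart (hypothetical helper)."""
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    params = {
        "cht": "p",                                      # chart type: pie
        "chs": f"{width}x{height}",                      # size in pixels
        "chd": "t:" + ",".join(str(v) for v in values),  # data, text encoding
        "chl": "|".join(labels),                         # slice labels
    }
    # urlencode percent-escapes ':', ',' and '|' in the parameter values.
    return f"{CHART_ENDPOINT}?{urlencode(params)}"


# Made-up data, only to show the shape of the resulting URL.
url = pie_chart_url([60, 40], ["Flash", "HTML5"])
print(url)
```

The point of the design is that the whole chart specification lives in the query string, so any language that can build a URL can produce a chart.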
Thursday, December 17, 2009
Richard Gasquet was cleared of doping today, on the grounds that cocaine had been passed into his system from a nightclub kiss.
What else could we get from an innocent kiss?
Remember the tragic story of a peanut-allergic teenager from Quebec? Christina Desforges went into anaphylactic shock and died after being kissed by her boyfriend, who had just eaten a peanut butter sandwich. An almost immediately administered adrenaline shot didn’t help.
Though not widely recognized, food hypersensitivity by inhalation can cause a lot of problems. The eight top allergens (TEMPS WFS: Tree nuts, Eggs, Milk, Peanuts, Shellfish (crab, lobster, shrimp), Fish (bass, cod, flounder), Wheat, Soy) cause 90% of food allergies. Food-related allergies have been on the rise in recent years. Whether this is due to the use of baby creams and lotions (e.g., with a peanut oil content) in early childhood, 20th century hygiene, synthetic and fortified foods or other issues of Western civilization, we need to be careful and think about all possible ways of being exposed.
Epstein-Barr virus (EBV) (causing mononucleosis - infamously known as the kissing disease), Herpes Simplex Virus-1 (causing cold sores), the bacterium Streptococcus (causing various infections such as gum disease and strep throat), H. pylori, Candida or other yeast species, Hepatitis B Virus and cytomegalovirus (CMV) are spread via oral transmission from microbe-containing saliva. Many sexually transmitted diseases follow this route too.
Even cigarettes - besides their harmful toxic chemicals - host a bacterial bonanza: hundreds of different germs, including those responsible for many human illnesses, according to a new study. Examples include Campylobacter, which can cause food poisoning and Guillain-Barre Syndrome; Clostridium, which causes food poisoning and pneumonias; Corynebacterium, also associated with pneumonias and other diseases; E. coli; Klebsiella, Pseudomonas aeruginosa and Stenotrophomonas maltophilia, all of which are associated not only with pneumonia but also with urinary tract infections; and a number of Staphylococcus species that underlie the most common and serious hospital-associated infections.
One of the best natural defenses in our saliva is the “good” bacteria and the molecular environment supporting them, which protect us from settlements of pathogenic microbes and prevent their growth. Some people are more susceptible than others, but everybody should be careful... about whom they kiss.
Sunday, December 13, 2009
According to a recent study reported at the annual meeting of the American Society of Nephrology in San Diego, primary care physicians are failing to diagnose chronic kidney disease, especially in women.
Study leader Dr. Maya Rao of Columbia University in New York says primary care doctors typically order a blood test called creatinine to measure kidney function, but this alone is not a particularly accurate measure of kidney function.
Glomerular filtration rate (GFR) is the best test to measure your level of kidney function and determine your stage of kidney disease.
You can calculate your Glomerular Filtration Rate (GFR) by entering the results of your blood creatinine test, your age, race, gender and other factors.
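As a rough illustration of what such a calculator does, here is a minimal sketch. It assumes the 4-variable MDRD study equation (the form with the leading constant 175, calibrated for IDMS-traceable creatinine assays); the post does not say which equation a given calculator uses, and the function name `egfr_mdrd` and the sample inputs are hypothetical.

```python
def egfr_mdrd(creatinine_mg_dl, age_years, female=False, black=False):
    """Estimate GFR in mL/min/1.73 m^2 with the 4-variable MDRD equation.

    Assumption: serum creatinine is given in mg/dL and was measured with
    an IDMS-traceable assay (hence the leading constant 175).
    """
    if creatinine_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    egfr = 175.0 * creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742   # sex coefficient of the equation
    if black:
        egfr *= 1.212   # race coefficient of the equation
    return egfr


# Hypothetical patients with the same creatinine but different age and sex.
for label, kwargs in [
    ("man, 50", {"age_years": 50}),
    ("woman, 50", {"age_years": 50, "female": True}),
    ("woman, 80", {"age_years": 80, "female": True}),
]:
    print(f"{label}: {egfr_mdrd(1.0, **kwargs):.0f} mL/min/1.73 m^2")
```

The same creatinine value maps to a lower estimated GFR for women and for older patients, which is why creatinine alone can miss kidney disease in exactly those groups.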
The distribution of creatinine test results in 6434 people is shown on the right (NHANES 2006 study, weighted for USA demographics, graph by WolframAlpha)
The typical reference ranges in adult males are 0.7 to 1.2 mg/dL according to Wikipedia and 0.6 to 1.2 mg/dL according to MedicineNet. In adult females, the range is 0.5 to 1.1 mg/dL according to both Wikipedia and MedicineNet. LabCorp lists normal values between 0.57 and 1.00 mg/dL (see also reference values for other medical tests and the Aurametrix blog on liver tests). Men tend to have higher levels of creatinine because they generally have more skeletal muscle mass than women. Muscular young or middle-aged adults may have more creatinine in their blood than the norm for the general population. Vegetarians have been shown to have lower creatinine levels. Creatinine tends to be slightly lower in pregnancy. It increases with height and body weight. While a baseline serum creatinine of 2.0 mg/dL (about 177 μmol/l) may indicate normal kidney function in a male body builder, a serum creatinine of 0.7 mg/dL (about 62 μmol/l) can indicate significant renal disease in a frail old woman.
Infants have normal levels of about 0.2 or more, depending on their muscle development. In people with malnutrition, severe weight loss, and long standing illnesses the muscle mass tends to diminish over time and, therefore, their creatinine level may be lower than expected for their age.
A person with only one kidney may have a normal level of about 1.8 or 1.9. Creatinine levels that reach 2.0 or more in babies and 10.0 or more in adults may indicate severe kidney impairment and the need for a dialysis machine to remove wastes from the blood. In the United States, creatinine is typically reported in mg/dL, while in Canada and Europe μmol/litre may be used. 1 mg/dL of creatinine is 88.4 μmol/l.
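The conversion between the two reporting units is a single multiplication. The sketch below only applies the 88.4 factor quoted above; the function names are made up for illustration.

```python
# Conversion factor quoted in the text: 1 mg/dL of creatinine = 88.4 umol/L.
UMOL_L_PER_MG_DL = 88.4


def mg_dl_to_umol_l(value_mg_dl):
    """Convert a creatinine value from mg/dL to umol/L."""
    return value_mg_dl * UMOL_L_PER_MG_DL


def umol_l_to_mg_dl(value_umol_l):
    """Convert a creatinine value from umol/L to mg/dL."""
    return value_umol_l / UMOL_L_PER_MG_DL


# Values mentioned in the post, printed in both units.
for mg_dl in (0.7, 1.2, 2.0, 10.0):
    print(f"{mg_dl:5.1f} mg/dL = {mg_dl_to_umol_l(mg_dl):6.1f} umol/L")
```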
Aurametrix is developing decision support systems to help you evaluate personal health risks, decide on preventative measures, estimate cost/benefits of performing diagnostic tests. Better tools for a healthier world.
Saturday, December 12, 2009
The Vinegar and Water Diet was made popular in 1820 by Lord Byron - so says the Fad Diet Timeline (Fad Diets Throughout the Years) article by the American Dietetic Association.
Some people found this diet can do miracles, while others had side effects without any positive results and called it a scam.
The primary reason that this diet works could be that you are told to eat moderate portions, watch the nutritional composition of the food you eat, and get exercise. Just doing those alone is often enough to stimulate your body to maintain a healthy weight, if not lose weight.
Yet, there could be properties in the vinegar that will help you lose weight.
One of the premises of the Apple Cider Vinegar Diet is that taking small amounts of apple cider vinegar daily suppresses appetite and assists in weight loss. Apple cider vinegar is also said to reduce glucose levels, treat acid reflux disease and cure acne, among other things. There were also unfounded claims that cider vinegar is a special source of various B vitamins and amino acids.
Clinical studies, on both rats and humans, suggest that apple cider vinegar does make one feel fuller. Many people agree about apple cider vinegar’s ability to suppress appetite (if only because of the taste), but more research is needed. Various other claimed benefits are not supported by research, but there is some support for apple cider vinegar’s role in lowering glucose and cholesterol levels and increasing HDL (good) cholesterol in people. Apple cider vinegar was also shown to reduce serum triglyceride (TG) levels and increase HDL cholesterol in diabetic animals.
Another interesting fact is that vinegar was shown to reduce counts of not-so-beneficial bacteria. Diluted solutions of various household sanitizers (apple cider vinegar, white vinegar, bleach, and a reconstituted lemon juice product) were tested for their effectiveness in reducing counts of inoculated Escherichia coli and naturally present aerobic, mesophilic bacteria on lettuce (J Food Prot 2002 Oct;65(10):1646-50; PMID: 12380754). Of the sanitizers tested, 35% white vinegar (1.9% acetic acid) was the most effective in reducing E. coli levels (with a 5-log10 reduction after 5 min with agitation and after 10 min without agitation) and in reducing aerobic plate counts (with a >2-log10 reduction after 10 min with agitation).
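For readers unfamiliar with the notation, a "5-log10 reduction" means the bacterial count dropped by a factor of 10^5. The short sketch below shows that arithmetic; the function names and the example counts are hypothetical and are not taken from the cited study.

```python
import math


def log10_reduction(count_before, count_after):
    """Log10 reduction between two bacterial counts (e.g. CFU per gram)."""
    if count_before <= 0 or count_after <= 0:
        raise ValueError("counts must be positive")
    return math.log10(count_before / count_after)


def percent_killed(log_reduction):
    """Share of the original population removed, in percent."""
    return (1.0 - 10.0 ** -log_reduction) * 100.0


# Hypothetical counts: 10 million cells before treatment, 100 after.
print(log10_reduction(1e7, 1e2))   # number of log10 steps removed
print(percent_killed(5))           # the 5-log10 reduction reported for E. coli
print(percent_killed(2))           # the 2-log10 bound for aerobic plate counts
```

So the E. coli result corresponds to killing 99.999% of the cells, while the aerobic plate count result corresponds to more than 99%.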
Could apple cider vinegar be replaced with anything else to achieve the same effects?
Most dieters would answer no to this question. They would tell you to "be careful with which vinegar you use. White vinegar is good for many, many things, but don't drink it; it will remove all of the minerals and nutrients from your body. Apple cider vinegar is the only one you want to ingest".
I would say this statement is rather too strong, but what about the actual chemical content?
A comparison of various vinegars shows that no chemical in particular stands out - apple cider vinegar has a higher manganese content, but that's about it.
Typical white distilled vinegar is at least 4% acidity and not more than 7%. Cider and wine vinegars are typically slightly more acidic with approximately 5-6% acidity.
Of course, there also could be a yeast and bacterial content - at least dead Acetic acid bacteria (these critters derive their energy from the oxidation of ethanol to acetic acid during respiration. They are Gram-negative, aerobic, and rod-shaped).
Commercially available vinegars, however, are well filtered (no mother of vinegar) and were reported to work in people's diets even when they are not the organic, expensive types.
Usual apple cider vinegar substitutes are:
malt vinegar OR white vinegar (a good choice for pickles) OR wine vinegar (not for pickles)
lemon juice (as a flavoring or for acidulating water) OR lime juice (as a flavoring or for acidulating water) OR brandy (for deglazing pans) OR fortified wine (for deglazing pans and perking up sauces) OR wine (for deglazing pans and perking up sauces) OR ascorbic acid (mixed with water) OR amchoor OR tamarind paste
Since vinegar can be made from anything with sugar, there are probably too many different types to count made in countries throughout the world. Each country may use starting materials native to their area and tailored to the specific tastes of the region.
Typical retail varieties of vinegar include white distilled, cider, wine (white and red), rice, balsamic, malt and sugar cane. Other, more specialized types include banana, pineapple, raspberry, flavored and seasoned (e.g., garlic, tarragon).
Are there Formal Standards for Vinegar?
The following varieties of vinegar are classified by a U.S. Food and Drug Administration (FDA) Compliance Policy Guide for labeling purposes according to their starting material and method of manufacturing:
•Cider vinegar or Apple vinegar is made from the two-fold fermentation of the juices of apples. Vinegar can be made from other fruits such as peaches and berries with the labels describing starting materials.
•Wine vinegar or Grape vinegar is made from the two-fold fermentation of the juice of grapes.
•Malt vinegar, made by the two-fold fermentation of barley malt or other cereals where starch has been converted to maltose.
•Sugar vinegar, made by the two-fold fermentation of solutions of sugar syrup or molasses.
•Spirit or distilled vinegar, made by the acetic fermentation of dilute distilled alcohol.
•Blended Vinegar made from a mixture of Spirit vinegar and Cider vinegar is considered a combination of the products that should be labeled with the product names in the order of predominance. It is also the product made by the two-fold fermentation of a mixture of alcohol and cider stock.
•Rice or Rice Wine vinegar (although not part of FDA’s Compliance Policy Guide) has increased in popularity over the past several years and is made by the two-fold fermentation of sugars from rice or a concentrate of rice without distillation. Seasoned rice or rice wine vinegars are made from rice with the “seasoning” ingredients noted on the label.
•Balsamic vinegar (also not a part of FDA’s Compliance Policy Guide) continues to grow in market share and “traditional” and “commercial” forms are available. The products are made from the juice of grapes, and some juice is subjected to an alcoholic and subsequent acetic fermentation and some to concentration or heating. See the “Today’s Vinegar” section of the Web site for more information regarding Traditional and Commercial Balsamic Vinegar.
Aurametrix is developing decision support systems to help you manage your tests. Better tools for a healthier world.
Wednesday, December 9, 2009
Monday, November 30, 2009
Slowly but inevitably we are moving from "Dr. knows Best" to "everybody is a little bit doctor", to next generation diagnostics, better software tools and databases, personal devices, wireless health tools, and new healthcare models.
The first milestone on this road is known as Health 2.0:
Monday, November 23, 2009
The recent mammogram debate shows why reform will fail - most women don't like the idea of getting mammograms later in life and less frequently. Of course, this furious reaction is fueled by those who would face payment cuts if the new guidelines are implemented - radiologists, for example.
How many medical tests do you need to sleep sound at night knowing that you are healthy enough? Billions, trillions, googols?
How many out of a few thousand existing medical tests are redundant or irrelevant to individual consumers?
Mammograms and colonoscopies are perfect examples of overused diagnostic procedures (see also the Consumer Reports article on overused tests and treatments).
Colonoscopy accounts for up to 75% of the costs for an IBS patient, while the probability of it yielding meaningful results is less than 3%.
Citing Dr. Rex: Low-risk patients might undergo too many colonoscopies; high-risk patients, too few. When given a range of hypothetical findings on initial colonoscopy, most physicians would recommend repeat colonoscopy earlier than is indicated by the U.S. Multi-Society Task Force guidelines.
If 2000 women are screened regularly for 10 years, one will benefit from the screening, as she will avoid dying from breast cancer. Ten healthy women, however, will have either a part of their breast or the whole breast removed. Even statistics deny that early screening for breast cancer saves lives.
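The figures in this paragraph can be restated as rates, which makes the trade-off easier to compare. The sketch below uses only the three numbers quoted above; the variable and function names are made up for illustration.

```python
from fractions import Fraction

# Figures quoted in the text (regular screening over 10 years).
WOMEN_SCREENED = 2000
DEATHS_AVOIDED = 1
HEALTHY_WOMEN_TREATED = 10


def per_thousand(events, population):
    """Events per 1000 people screened, kept as an exact fraction."""
    return 1000 * Fraction(events, population)


benefit_per_1000 = per_thousand(DEATHS_AVOIDED, WOMEN_SCREENED)
harm_per_1000 = per_thousand(HEALTHY_WOMEN_TREATED, WOMEN_SCREENED)
number_needed_to_screen = Fraction(WOMEN_SCREENED, DEATHS_AVOIDED)
harm_to_benefit = Fraction(HEALTHY_WOMEN_TREATED, DEATHS_AVOIDED)

print(f"deaths avoided per 1000 screened:       {float(benefit_per_1000)}")
print(f"healthy women treated per 1000 screened: {float(harm_per_1000)}")
print(f"number needed to screen per death avoided: {number_needed_to_screen}")
print(f"unnecessary treatments per death avoided:  {harm_to_benefit}")
```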
Early and frequent screenings often lead to false alarms and unneeded biopsies, without substantially improving women's odds of survival. About 90 percent of abnormal mammogram findings are benign. A 2009 study in the British Medical Journal estimated that roughly one in three breast cancers detected by mammograms would never have caused harm. Earlier studies were suggesting this too, raising (along with harmful consequences of false positive results)
Perhaps the biggest problem with performing too many screening tests in healthy people is a phenomenon called over-diagnosis: Screening can detect slow-growing, harmless cancers that would never have killed you. But even if a mammogram "involves a tiny dose of radiation", a bigger dose used in radiotherapy is harmful when given to healthy people. Breast cancer radiotherapy regimens can increase mortality from heart disease and lung cancer 10-20 years afterwards.
How often does this happen with breast cancer?
Dr. Kevin Pho, an internist in Nashua, N.H., thinks the rebellion indicates a larger problem with ObamaCare. "The fact that [the administration] is distancing itself from what I consider to be very robust guidelines portends a very poor future for comparative effectiveness," he says. "If they back down now, what's going to happen when a comparative effectiveness body says there's no difference between angioplasty and medical management of heart disease?"
The American College of Obstetricians and Gynecologists (ACOG) has just revised its guidelines for Pap smears under some pressure. This resulted from an Annals of Internal Medicine article which documented that only 16.4% of gynecologists followed the College’s prior guidelines. Most did more screenings than indicated, the worst record of the specialties tested. But the ACOG still recommends that nearly all women obtain regular screening at intervals of 1-3 years.
Cervical cancer is a rare disease in the US: just over 11,000 cases are predicted in 2009. There will be nearly as many cases of testicular cancer, 8,400. In comparison, both breast and prostate cancer are just under 200,000. Most women have been led to believe that cervical cancer is rampant and that they need yearly screening to prevent it. Testicular cancer, however, is rarely mentioned. Most physicians don’t even bother to recommend that young men self-examine.
Here is a different example. Man sues over “botched” testicular surgery. Doctors later discovered that the tumor was not malignant and did not need to be removed. Was it possible to offer better testing, say a biopsy? Urologists can tell you that testes should never be biopsied prior to removal if cancer is suspected, as it significantly increases the risk of tumor dissemination. Studies show that those who had scrotal incisions for biopsy have a higher local recurrence rate as well as a higher relapse rate. If there is a solid growth in the testes, there is a 95-97% probability of the growth being malignant. It sounds very reasonable: in this case doctors are being sued for proper care. But was another cost - the cost of inconvenience, perhaps even mental trauma - ever taken into consideration?
Medicine is as much art as science – there are many cases where there are no “right” answers. Health care providers can prescribe as many procedures as they want and charge whatever they like. It's time for this to be changed.
The personal risk of cervical cancer, for example, can be easily estimated by every woman, as it depends on whether she has had multiple sexual partners, prior negative Paps, or a long-term mutually monogamous relationship. Tests for HIV (which has a five times greater incidence than cervical cancer) are not administered to people without risks, right? Well, not quite right, but that's another story.
Dr. Joel Sherman (Medical Privacy, A Patient Oriented Discussion) has seen many women who are angry that the facts on cervical cancer have been hidden from them. They are pushed into getting Paps, but never told the pros and cons of screening. Never mentioned is the high incidence of abnormalities that resolve spontaneously, or of negative biopsies and colposcopies.
And last, but not least - another reason why healthcare is so costly - the insurers' overestimates, in a way their overdiagnosis of our health problems.
If the law says insurers have to treat every person the same, without taking into account whether they’re sick or healthy, young or old, a rational insurer will do some rational things. For example, it will assume disproportionate numbers of people who buy a policy from them will be sick and old. Of course, when they do this, the product becomes expensive, and young, healthy people start to wonder if they should even buy it in the first place. After all, they don’t really need insurance, right? Coupled with the overall rise in the cost of health care, insurers now push through new rounds of price increases, which, in turn, create more uninsured people. It is a very nasty cycle.
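The cycle described above can be shown with a toy simulation. This is only a sketch under loudly simplified assumptions: every number is invented, the insurer prices at the average expected cost of whoever is enrolled, and a person stays enrolled only while the premium is at most `willingness` times their own expected cost. None of these rules come from a real insurance model.

```python
def simulate_premium_spiral(expected_costs, willingness=1.5, max_rounds=20):
    """Return a list of (premium, enrolled_count) pairs, one per pricing round.

    Assumptions (hypothetical): one community-rated premium equal to the
    mean expected cost of current enrollees; a person drops out once the
    premium exceeds willingness * their own expected yearly cost.
    """
    enrolled = list(expected_costs)
    history = []
    for _ in range(max_rounds):
        if not enrolled:
            break
        premium = sum(enrolled) / len(enrolled)
        history.append((premium, len(enrolled)))
        staying = [cost for cost in enrolled if cost * willingness >= premium]
        if len(staying) == len(enrolled):
            break  # nobody left this round, so the premium has stabilized
        enrolled = staying
    return history


# Invented expected yearly costs, from young and healthy to old and sick.
costs = [500, 1000, 2000, 4000, 8000, 16000]
history = simulate_premium_spiral(costs)
for round_number, (premium, count) in enumerate(history, start=1):
    print(f"round {round_number}: premium {premium:.0f}, enrolled {count}")
```

Each round the cheapest-to-insure people leave, the average cost of the remaining pool rises, and the premium follows it upward.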
Wednesday, November 18, 2009
Here are my brief meeting minutes from the Cofounders Wanted November Meetup, organized by @alain94040
(see also FairSoftware's blog on this event and Aurametrix' analysis of startups presenting at Silicon Valley New Technologies meetup)
- An interesting idea of hotels for everyone and by everyone was presented by russ.hearl at staysherpa.com. The startup is looking for a CTO-to-be PHP programmer. http://www.sherpatravelexchange.com/ & http://twitter.com/sherpatravelx
- Greg Gentschev is looking for a technical cofounder for a Business Search & Workflows project: http://twitter.com/gentschev
- Justin at binarynomads.com is looking for a technical cofounder too: http://binarynomads.com/
- Peter Kazanjy is building a site to outperform LinkedIn and Yelp; his LinkedIn profile is at http://www.linkedin.com/in/kazanjy, see also http://getunvarnished.com/
- Ira Chayut is building a Fun Site with Sound and Voices: http://www.linkedin.com/in/irachayut
- Sujit Kirpekar (kirpekar at gmail.com) has a great Autolicious idea, follow him on twitter: http://twitter.com/discoganya
- Kristen at Hazard.com presented her Timesarrow project: http://www.hazardbio.com/, see also http://twitter.com/timesarrow. She is a yoga buff and is looking for a business cofounder.
- Georgi Dagnall at geogad hopes people will travel more, with their personal mobile tour guide. See http://www.geogad.com/
- Junk Cloud brings the power of auctions to online classified ads: http://www.junkcloud.com/ and http://twitter.com/JunkCloud. Contact Ryan at junkcloud.com for more information
- Al Brown, Deep Therapeutics, is looking for mechanical, electronics, robotics engineers to help implement the prototype.
- http://fairsoftware.net/ - legal framework for early-stage projects, finding cofounders
- http://www.chagora.com/ - microdonations
- http://revenzy.com/ - social auctions site
- As to Aurametrix, it's all hush hush now. The public site is for general informational purposes only. The company may be looking for a cofounder at a later stage.
Thursday, November 5, 2009
Microbiota is specific to every individual, and varies systematically across body habitats and time, as well as geographical location, preventing or causing a disease after exposures to infectious agents.
Some human skin locations harbor even more diverse bacterial communities than the gut that we were thoughtfully nourishing with probiotics.
A new analysis published in Science Express adds more information to the earlier results (from May 2009, for example), showing how diverse the microbiota is and how easy it is to re-colonize the skin.
We mapped some of the findings as shown in the Figure (on the right; the figure on the left maps bacteria in the GI tract, from Dr. Richard Lord’s presentation at the 2008 Functional Medicine Symposium in Carlsbad, CA). Moist sites, such as inside the nose, the armpits and the navel, are shown with blue arrows; dry areas, such as the forearm, are shown with green arrows; and oily sites are shown with yellow arrows: inside the ear, between the eyebrows, the forehead, the back of the scalp.
The sites of greatest bacterial diversity were the index finger, back of the knee, forearm, palm and sole of the foot. The forehead displayed the least diversity (with bacterial populations strongly preferring this site and not letting other bacteria cohabit the space), but there were differences between individuals. The mouth cavity showed the least variation in diversity both within individuals and between people. Studies of other microbes such as viruses and bacteriophages show low diversity in the airways as well, even though the human respiratory tract is constantly exposed to a wide variety of microbes and environmental agents. There is a difference between diseased and non-diseased individuals, though: in cystic fibrosis (CF), for example, viromes are enriched in aromatic amino acid metabolism. Note that this disease causes a distinct acidic breath - the more severe the condition is in an individual, the more acidic the breath becomes. The microbes were especially sensitive to amino-acid starvation, indicating that therapeutic measures may be more effective if used to change the respiratory environment, as opposed to shifting the taxonomic composition of the resident microbiota.
Altered breath resulting from changed microflora is a known phenomenon, and it can be detected not only by complex mass spec machines, but also by devices used in QA testing of foods (e.g., the Cyranose picks up the scent of pentane, isoprene, acetone, and benzene in the breath of lung cancer patients) and by car air quality sensors, which can be used to study the human "fermentome". (See also ongoing clinical trials on chemicals in human breath for diagnostics of diseases.)
Altered bacterial populations could, indeed, be studied by metabonomic profiling. At present, however, the most accurate analysis has been performed based on microbial DNA or 16S rRNA.
The study subjects were sampled four times each over a three-month period, typically after showering an hour or two earlier. Microbial DNA was then isolated directly from the swabs used for sampling each body site. To recover bacteria from the skin surface, it was enough to swab it once with a wet cotton swab for 30 s.
Wednesday, November 4, 2009
The workflow, procedures and organization of diagnostic laboratories have changed little since the end of the 19th century. Technology has improved quality and safety, led to higher throughput and allowed, at least partially, private electronic access to patients' lab test results, but other than that not much has changed.
The introduction of first-generation transcriptome technologies in the mid 1990s (Schena et al., 1995; DeRisi et al., 1996) has led to a phenomenal ability to measure the expression of thousands of genes simultaneously and to create molecular profiles of cancers and many other diseases and conditions.
Proteomics - a term coined a couple of years later (James, 1997) - was accepted as an even more promising technique for effective diagnostics of diseases. The figure on the left, however, demonstrates that it too faces many challenges, from the discovery of biomarkers to their verification and approval. More than a decade after its introduction (Oliver et al., 1998), metabolomics has been accepted as - at least - an equally promising technique, but it is still lagging behind genomics and proteomics.
All these techniques will have a significant impact on the business model of diagnostics. Multiplexing (measuring multiple biomarkers at once) is obviously much more cost-effective. The diagnostics industry has historically been very resistant to disruptive technological change, but the potential cost advantages should outweigh this, leading to novel business models in health management.
A change will also come from the growing near patient testing (NPT) sector. NPT is already finding a role in wellness monitoring. Existing self tests - such as cholesterol kits - may not be very accurate, but the advent of inexpensive multiplexing assays should overcome this. Even when blood testing is done by trained professionals in a lab, there can be significant variability in test results. The same applies to blood pressure measurements - you may need to take three measurements per day for five days in order to get a decent baseline. This only reinforces the need for inexpensive tests that can be done more often.
But let's go back to metabolomics and its potential to provide noninvasive, inexpensive diagnostics. Are there any clinical trials attempting to translate it into clinical practice?
Here are our favorite ones:
NCT00757952: Diagnosing ovarian cancer in exhaled breath. (Pine Street Foundation & University of Maine)
NCT00898209: Diagnosing lung cancer in exhaled breath. (Vanderbilt-Ingram Cancer Center)
NCT00898209: Exhaled breath analyzed for lung cancer. (Vanderbilt-Ingram Cancer Center)
NCT00639067: Breath test for early detection of lung cancer (Menssana Research)
NCT00873366: Breath tests to assess effectiveness of breast cancer treatment (Mayo Clinic and National Cancer Institute (NCI))
NCT00330603: Metabolomic breath analysis to predict treatment for chronic cough (University of Virginia)
NCT00632307: Breath analysis to diagnose COPD; lung cancer; airway infection; interstitial lung disease; sleep apnea; pulmonary disorders with pleural effusions; sarcoidosis (Lung Clinic Hemer, Germany)
NCT00294489: Breath analysis to diagnose Hepatitis C (Hadassah Medical Organization, Jerusalem, Israel)
Schena M, Shalon D, Davis RW, Brown PO. 1995. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270: 467-470.
DeRisi J, Penland L, Brown PO, Bittner ML, Meltzer PS, Ray M, Chen Y, Su YA. 1996. Use of a cDNA microarray to analyze gene expression patterns in human cancer. Nat Genet 14: 457-460.
James P. 1997. Protein identification in the post-genome era: the rapid rise of proteomics. Quarterly Reviews of Biophysics 30 (4): 279-331. doi:10.1017/S0033583597003399. PMID 9634650.
Oliver SG, Winson MK, Kell DB, Baganz F. 1998. Systematic functional analysis of the yeast genome. Trends in Biotechnology 16: 373-378.
Aurametrix is conducting research to develop next-generation diagnostics to help you evaluate your personal health risks and benefits. Better tools for a healthier world.
Sunday, November 1, 2009
Image by juhansonin via Flickr
The ACM Silicon Valley Data Mining Camp on November 1, 2009 attracted more than 200 people with different backgrounds and interests. It was held at Hacker’s Dojo and sponsored by REvolution Computing, KXEN (Knowledge Extraction Engines), and LinkedIn. (See notes on this event by @Andraz of Zemanta, Ken's open source tools, and relevant #dmcamp tweets.)
The biomedical/healthcare data mining topic was suggested by Junling Hu and Irene Gabashvili and supported by A.J. Chen, Greg Makowski, Sukanta Ganguly, and 40 other participants of the Data Mining Camp. Below is a brief transcript of the discussion.
The session started with introductions; here are some of them:
- Irene, with background in biophysics, medical informatics and CS, pursuing a personal health management venture, interested in data mining to advance personalized medicine;
- Lawrence, with background in physics and software engineering and interest in health IT. He is the organizer of Google Wave meetup (you may know about Google Health Wave);
- Hua, formerly with Kaiser, interested in medical scheduling and web development;
- Liana, interested in Natural Language Processing for biomedical knowledge mining;
- Maura, interested in Health IT, medical engineering and security;
- Magnus, developing Medical Databases;
- Kevin, interested in medical startups;
- Watson, with background in genomics and machine learning;
- Peters, working on medical devices and software embedded systems;
- Steve, formerly of Applied Biosystems;
- Roy of Codexis, focusing on data mining and pattern recognition in multivariate time series;
- Jima, with background in medical informatics;
- Karsleep, interested in biomedical data mining;
- Deena, scientific analyst interested in how data mining technologies could be applied to healthcare;
- Junling Hu, scientist at Bosch, working on a device and software collecting and analyzing patients' information, based on daily questionnaires and other collected data.
Junling started the session by mentioning a recently published paper on computer technologies for healthcare that sets out strategic directions in the area. Irene also suggested checking out the mHealth Summit, which focuses on mobile technologies to improve research data collection, healthcare delivery, and health outcomes.
Junling described the project she was working on - an inexpensive device that collects data and sends it to a "coaching" nurse who monitors stay-at-home patients. The next step is to mine the data automatically, thus reducing the load on healthcare professionals without sacrificing patients' well-being. Junling also mentioned some of the challenges, such as the compliance of participants, who are typically not eager to fill out 20-question surveys. This is especially bad for obesity studies.
The data mining challenges mentioned during the session were:
(1) Missing Data
We are not talking about sparse data (discussed in one of the previous sessions on data mining with R), but about data that is actually missing. Data is sparse if only a small fraction of the attributes are non-null - for example, the number of items we typically buy in a grocery store is much smaller than the number of products it offers. Data is missing if the values were never entered or the member combination is not meaningful (for example, obstetrics/gynecology values are not meaningful for men). One of the suggestions from the experts in the audience was to utilize "multiple imputation". Other suggestions included "once-a-week" questioning instead of daily surveys. Irene mentioned the 7D-PAR (Seven-Day Physical Activity Recall), one of the standardized questionnaires developed in the 80s (1,2), and other established methods.
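To make the "multiple imputation" suggestion a bit more concrete, here is a minimal sketch in plain Python. It is not what anyone at the session presented: the function name, the made-up activity scores and the very simple imputation model (random draws from a normal distribution fitted to the observed values) are all assumptions for illustration. The idea is to fill the gaps several times, compute the estimate on each completed data set, and pool the results with Rubin's rules so that the uncertainty caused by the missing values shows up in the final variance.

```python
import random
import statistics


def multiple_imputation_mean(values, m=20, seed=0):
    """Estimate the mean of `values` (None marks a missing entry).

    Simplified multiple imputation: a real analysis would use a richer
    imputation model and also account for uncertainty in mu and sigma.
    Returns (pooled_mean, total_variance_of_the_pooled_mean).
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed values")
    mu = statistics.mean(observed)
    sigma = statistics.stdev(observed)
    n = len(values)

    estimates, variances = [], []
    for _ in range(m):
        # Fill every gap with a fresh random draw, so each completed
        # data set is slightly different from the others.
        completed = [v if v is not None else rng.gauss(mu, sigma)
                     for v in values]
        estimates.append(statistics.mean(completed))
        # Squared standard error of the mean for this completed data set.
        variances.append(statistics.variance(completed) / n)

    # Rubin's rules: average the estimates, then combine the
    # within-imputation and between-imputation variance.
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)
    total_variance = within + (1 + 1 / m) * between
    return pooled, total_variance


# Hypothetical weekly activity scores with two unanswered surveys.
scores = [3.0, None, 4.5, 5.0, None, 2.5, 4.0]
pooled_mean, total_var = multiple_imputation_mean(scores)
print(pooled_mean, total_var)
```

When nothing is missing, every completed data set is identical, the between-imputation variance is zero, and the result reduces to the ordinary mean and its squared standard error.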
Questions and comments from the audience:
- Data mining methods utilized for Chronic Disease Assessment and Elderly monitoring. Junling talked about unsupervised classification algorithms and the two supervised learning methods she found most useful for her work - SVM and logistic regression. Both were equally good at predicting hospitalization events.
- Indicators of Goodness of Model Predictions. Suggested events were hospitalizations, mortality... It was noted that good indicators are yet to be found.
This was another health data mining challenge emphasized during the discussion. All the standard measures can be applied, such as accuracy, precision, recall, true positives, false positives and especially combinations of the last two. Junling mentioned a breast cancer classifier developed by Siemens and other algorithms that predict emergency situations with 90% accuracy. Irene noted that one of the problems of digital mammography and other cancer predictors is a high rate of false positives. From 30 to 40% of cancers are overdiagnosed (3), thus increasing healthcare costs. This has to change.
Several people in the audience emphasized that existing methods average over the population. Medicine needs to be truly personalized; we need better methods and more data.
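A small sketch shows why accuracy alone can be misleading when the condition is rare. The patient counts below are invented for illustration and have nothing to do with the Siemens classifier or any other system mentioned at the session; the function name is hypothetical as well.

```python
def classification_metrics(y_true, y_pred):
    """Compute basic metrics for binary labels (1 = disease, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical screening of 100 people, 5 of whom have the disease.
# The classifier finds 4 of the 5 cases but also flags 8 healthy people.
y_true = [1] * 5 + [0] * 95
y_pred = [1] * 4 + [0] * 1 + [1] * 8 + [0] * 87

for name, value in classification_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.2f}")
```

In this made-up example the accuracy is 0.91 and the recall is 0.80, yet the precision is only 0.33: two out of every three positive calls are false alarms, which is the kind of problem Irene raised.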
(3) Large Number of Input Features
One of the main problems of health data mining is coping with a large number of input features. Obviously, a 20-question test is not sufficient. Should it rely on a thousand questions or a trillion inputs? And how do we select a subset of relevant features to build robust learning models? Junling's preferred approaches are logistic regression and singular value decomposition. She would add features one by one and check whether the overall prediction accuracy remains good.
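The "add features one by one" idea can be sketched as greedy forward selection. This is only an illustration under stated assumptions: Junling's models were logistic regression, while the stand-in below is a tiny nearest-centroid classifier so that the example runs with nothing but the standard library. The function names and the toy data are hypothetical.

```python
def nearest_centroid_accuracy(train, test, features):
    """Accuracy of a nearest-centroid classifier using only `features`.

    `train` and `test` are lists of (feature_vector, label) pairs.
    """
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += x[f]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    correct = 0
    for x, y in test:
        # Predict the class whose centroid is closest (squared distance).
        pred = min(centroids, key=lambda c: sum(
            (x[f] - centroids[c][i]) ** 2 for i, f in enumerate(features)))
        correct += (pred == y)
    return correct / len(test)


def forward_select(train, test, n_features):
    """Greedily add the feature that raises held-out accuracy the most.

    Stops as soon as no remaining feature improves the accuracy.
    """
    selected, best = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [(nearest_centroid_accuracy(train, test, selected + [f]), f)
                  for f in remaining]
        score, feature = max(scored, key=lambda s: s[0])
        if score <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best


# Toy data: feature 0 separates the classes, feature 1 is noise,
# feature 2 is informative but redundant once feature 0 is in.
train = [((1.0, 5.0, 0.2), 0), ((1.2, 1.0, 0.1), 0), ((0.8, 9.0, 0.3), 0),
         ((3.0, 4.0, 0.9), 1), ((3.2, 8.0, 0.8), 1), ((2.8, 2.0, 1.0), 1)]
test = [((1.1, 7.0, 0.2), 0), ((0.9, 2.0, 0.25), 0),
        ((3.1, 1.0, 0.9), 1), ((2.9, 9.0, 0.85), 1)]

print(forward_select(train, test, n_features=3))
```

One caveat: the same held-out set is used both to choose features and to report accuracy, so the reported number is optimistic. With thousands of candidate features, a separate validation set or cross-validation would be needed to keep over-fitting in check.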
Questions from the audience included:
- A 3-5 year Vision for Health Data Mining: what do we expect to achieve?
Participants expressed an optimistic outlook
- Ray: Are most input variables discrete or continuous? The answer was: mixed
The good thing about pattern recognition is that the more patterns you have, the better it performs. Google translator is a good proof of this assertion (although this translator needs even more patterns to do a decent job).
Biomedical data sets such as CT, MRI, PET scans and other image data, gene expression, genetic variation are very large scale in nature. The challenge for data miners is to integrate and extract information from data of such scale.
Questions from the audience:
- What are the other large-scale studies trying to mine patterns in health data, outside of the US?
Studies in China and Taiwan using similar devices and models; also in Europe
Adding to this interesting discussion that was unfortunately interrupted because of the lack of time, I'd like to mention a few other challenges facing biomedical data mining.
- We should not underestimate the complexity of relationships between causative and effect variables in human health. Simplistic approaches are doomed to fail. Over-fitting could be a problem too.
- Integration between heterogeneous data sources and types, and putting content in context (semantic integration), remains a challenge.
- Privacy Concerns associated with the Sharing of Individual Health Information.
- Blair S. How to assess exercise training habit and physical fitness. In: Behavioral Health, edited by Matarazzo JD. New York: Wiley, 1984, p. 424-447.
- Rauramaa R., Tuomainen P., Väisänen S., and Rankinen T. Physical activity and health-related fitness in middle-aged men. Med Sci Sports Exerc 27: 707-712, 1995.
- Gøtzsche, P.C., Jørgensen, K.J., Mæhlen, J. and Zahl, P.-H. Estimation of lead time and overdiagnosis in breast cancer screening. British Journal of Cancer (2009) 100, 219–219.