Summary: Most of the information that patients and care teams use to make health care decisions is effectively “locked” in clinical notes such as those written or dictated by physicians. Natural Language Processing techniques have been maturing to extract concepts from these notes, fill information gaps, and support health care’s clinical, operational, and financial objectives. I delivered this webinar to summarize some of Atrius Health’s work using the Linguamatics I2E platform.
data
Forget Big Data. It’s Time to Talk About Small Data.
With all of the talk of “big data,” it can be hard to remember that there was ever any other kind of data. If you’re not talking about big data — you know, the 4 V’s: volume, variety, velocity, and veracity — you should go back to running your little science fair experiments until you’re ready to get serious. Prevalent though this message may be, it has, at least in health care, stunted our ability to focus on and capture the hidden 5th V of big data: value.
Why the stakes are so high in the open data debate
It is hard to overstate just how much of a currency data has become in medicine. Whether talking about evidence-based medicine, precision medicine, or genomics, the ability to collect and distill data into information, transform it into knowledge, and use that knowledge to drive effective action is at the heart of what modern medicine seeks to accomplish. The centrality of data to this process has created well-entrenched stakeholders, which is why it comes as no surprise that the conversation around open sharing of research data following publication has shifted into controversial territory.
Seeking a diagnosis on the Internet: survey results

Testing design assumptions with users is a critical ingredient in user-centered design. In Symcat’s early stages (ca. 2012), we thought, for better or worse, that we would identify some eligible test users through Craigslist NYC. We were surprised by just how many people were willing to participate, and we collected some pretty interesting data in the process. I just stumbled upon it and I suspect much of it is still relevant, so I thought I would share. Get ready for some graphs.
On the Evaluation of symptom checkers for self diagnosis and triage: audit study
I should begin by acknowledging the authors’ important contribution to elucidating the gap between what symptom checkers may hope to provide and the existing state of the art. Semigran et al. adopt a pragmatic approach both by identifying which symptom checkers patients may reasonably find and by assessing them in the most intuitive way imaginable: making them take the standardized patient tests we all take in medical school.
Identification of High Risk Commercial Patients for Population Management (Epic XGM 2015)
Just got back from Epic XGM 2015, where I presented some of the work I have been doing at Atrius Health on predicting high risk patients.
Some of the session details (slides below):
Summary: Atrius Health expects a large proportion of commercially insured patients to shift into accountable care arrangements in the near future. The presenters will describe their work to develop new risk models for commercial patients, using both financial claims and Epic data, and compare these against other risk models.
The Non-Physician’s Guide to Hacking the Health Care System
Written with my friend and co-founder of Symcat, David.
We are residents and software developers. Before starting residency, we spent time as software developers in the startup community. We were witness to tremendous enthusiasm directed at solving problems and engaging people in their health. The number of startups trying to disrupt healthcare using data and technology has grown dramatically, and every day established healthcare companies appear eager to feed this frenzy through App and Design Competitions.
Continue reading The Non-Physician’s Guide to Hacking the Health Care System on THCB.
How the Big Data Trend will Support Medical Research
It is no secret that research relies critically on data collection. Whether you’re talking about pharmaceutical research, market research, or outcomes research, successful analysis can only be done with robust data that captures the metrics most relevant to the question at hand. Unfortunately, that degree of data collection can be an expensive proposition, especially when it comes to health care.
Continue reading How the Big Data Trend will Support Medical Research on the Symcat blog.
Symcat: A Consumer-side Expert System for Diagnosis (Johns Hopkins Informatics Grand Rounds)
I appreciate the opportunity to share, alongside David, some of my journey in conceiving and building Symcat during the Johns Hopkins Informatics Grand Rounds. In the talk, we cover some of the history of decision support, the technology behind Symcat, and some additional points about entrepreneurship and web development that excite us.
Symcat: Data-Driven Symptom Checker
Problem
When people get sick, they have several options for obtaining health care. These include going to the emergency room, visiting an urgent care center, or calling a doctor or nurse. However, 80% of people experiencing symptoms start with an Internet search. Unfortunately, searching on Google offers spotty results and frequently leads to undue concern. For example, people are 1000x more likely to encounter “brain tumor” in web search results for “headache” than they are to ever have the disease. Undue concern is a contributor to the 40% of emergency room visits and 70% of physician visits that are considered to be inappropriate.