Crowdsourcing voices to study Parkinson’s disease

Mathematician Max Little is launching a project that aims to literally give Parkinson’s disease (PD) patients a voice in their own diagnosis.

Patients Voice Analysis (PVA) is an open science project that uses phone-based voice recordings and self-reported symptoms, along with software Little designed, to track disease progression. Little, a TEDMED 2013 speaker and TED Fellow, is partnering with the online community PatientsLikeMe, co-founded by TEDMED 2009 speaker James Heywood, and Sage Bionetworks, a non-profit research organization, to conduct the research.


The new project is an extension of Little’s Parkinson’s Voice Initiative, which used speech analysis algorithms to diagnose Parkinson’s from voice records with the help of 17,000 volunteers. This time, he seeks to not only detect markers of PD, but also to add information reported by patients using PatientsLikeMe’s Parkinson’s Disease Rating Scale (PDRS), a tool that documents patients’ answers to questions that measure treatment effectiveness and disease progression.

“It’s a much more subtle problem,” Little says. Right now, PatientsLikeMe has some 6,500 PD patients using its platform, and he hopes response will mimic his earlier venture, which drew many thousands more volunteers than he had anticipated.

“Any way in which participants can use their misfortune to help others in future is something empowering. In diseases like Parkinson’s, because there’s no cure, people are generally told to sit tight. It’s awfully frustrating and it’s not good enough,” he says.

Another hoped-for end result is to give patients tools for self-reporting. That includes rethinking the current jargon-laden progression scale with language patients themselves might use. (The latter is part of a larger effort by PatientsLikeMe, as the company’s Vice President for Advocacy, Policy and Patient Safety, Sally Okun, explained at TEDMED 2013.)

“The classical scale is rarely used in clinical practice. It is technical jargon meant for the clinician to understand, but is hard for individual users to grasp. PatientsLikeMe has brought it to the masses,” Little says. With the PVA system, patient voice samples will be linked to their self-reported symptoms, so they can monitor changes over time, perhaps contributing to remote (and certainly less expensive) diagnosis.

As openly shared information, the collected data has potential to help vast numbers of individuals by tapping into collective ingenuity. Little has long argued that for science to progress, researchers need to democratize research and move past jostling for credit. Sage Bionetworks has designed a platform called Synapse to allow data sharing with collaborative version control, an effort led by open data advocate John Wilbanks.

“If you can’t share your data, how can you reproduce your science? One of the big problems we’re facing with this kind of medical research is the data is not open and getting access to it is a nightmare,” Little says.

With the PVA project, “Basically anyone can log on, download the anonymized data and play around with data mining techniques. We don’t really care what people are able to come up with. We just want the most accurate prediction we can get.”

“In research, you’re almost always constrained by what you think is the best way to do things. Unless you open it to the community at large, you’ll never know,” he says.

– Stacy Lu

Join TEDMED’s Google+ Hangout Tuesday, Feb. 25 at 12pm ET to discuss crowdsourcing and other research innovations with Dr. Little and other special guests.

What’s the new way to ask big questions in science?

Parkinson’s Voice Initiative founder and TEDMED 2013 speaker Max Little is an applied mathematician whose goal is to “see connections between subjects, not boundaries…to see how things are related, not how they are different” – which gives him an unusual perspective on how big data could change medicine. We interviewed him via e-mail to find out more.

You’ve been working to discover the practical value of abstract patterns in various fields, with surprising results in areas as varied as diagnosing Parkinson’s disease over the phone to predicting the weather. Can you explain your approach?

Max Little

As an applied mathematician, my training shows me patterns everywhere. Electricity flows like water in pipes, and flocks of birds behave like turbulent fluids. In my projects, I collate mathematical models from across disciplines, largely ignoring the assumptions of each discipline, and I also put in overly simple models. I use artificial intelligence to throw out inaccurate models. And this approach of exploiting abstract patterns has been surprisingly successful.

For example, during my PhD I stumbled across the rather niche discipline of biomedical voice analysis, originating in 1940s clinical work. With some new mathematical methods, and combining these with recent mathematics in artificial intelligence, I was able to make accurate medical predictions about voice problems. The clinician’s methods were not accurate. This sparked off research in detecting Parkinson’s disease from voice recordings – the basis of the Parkinson’s Voice Initiative.

But, success like this raises suspicions. So, with collaborators, I tried to make this approach fail. We assembled 30,000 data sets across a wide range of disciplines: exploration geophysics, finance, seismology, hydrology, astrophysics, space science, acoustics, biomedicine, molecular biology, meteorology and others. We wrote software for 9,000 mathematical models from a deep dive into the literature. We exhaustively applied each model to each data set.

When finished, a very revealing, big picture emerged. We found that many problems across the sciences could be accurately solved in this way. In many cases, the best models were not the ones that would be suggested by prevailing, disciplinary wisdom.
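The exhaustive comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the team's software: the three toy models, the three invented data sets and the choice of mean absolute one-step error as the score are all assumptions made here for the example.

```python
from statistics import mean

# Hypothetical toy models: each takes a history and returns a one-step forecast.
def last_value(history):
    return history[-1]

def historical_mean(history):
    return mean(history)

def linear_trend(history):
    # Extrapolate the most recent step forward.
    return history[-1] + (history[-1] - history[-2])

MODELS = {
    "last_value": last_value,
    "historical_mean": historical_mean,
    "linear_trend": linear_trend,
}

# Invented series standing in for data sets from different disciplines.
DATASETS = {
    "ramp": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "alternating": [0.0, 10.0, 0.0, 10.0, 0.0, 10.0],
    "step": [1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0],
}

def one_step_error(model, series, warmup=2):
    """Mean absolute error when forecasting each point from the points before it."""
    errors = [abs(model(series[:t]) - series[t]) for t in range(warmup, len(series))]
    return mean(errors)

def best_models(models, datasets):
    """Apply every model to every data set and keep the most accurate per data set."""
    winners = {}
    for data_name, series in datasets.items():
        scores = {name: one_step_error(fn, series) for name, fn in models.items()}
        winners[data_name] = min(scores, key=scores.get)
    return winners

if __name__ == "__main__":
    for data_name, model_name in best_models(MODELS, DATASETS).items():
        print(f"{data_name}: best model is {model_name}")
```

Even in this toy setting, a different model wins on each series, which is the kind of result that only shows up when every model is tried against every data set.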

Are you doing other research that might have implications for clinical diagnosis?

Here is another example. There is a decades-old problem in biomedical engineering: automatically identifying epileptic seizures from EEG recordings. We found over 150 models, some exceedingly simple, each of which, alone, could detect seizures with high accuracy.


This challenges quite a few assumptions – but it is not as if we are the first to find this. It happens often when new approaches to address old problems are attempted: for example, in obesity, a new, simple mathematical model revealed some surprising relationships about weight and diet.

You’ve also used fairly simple algorithms to successfully predict weather.

After my PhD, I teamed up with a hydrologist and an economist. We wanted to try weather forecasting using some fairly simple mathematics applied to rainfall data. Now, weather forecasting throws $10m supercomputers and ranks of atmospheric scientists together, and they crunch the equations of the atmosphere to make predictions. So, competing against this Goliath with only historical data and a laptop would seem foolhardy.

But after two years of hard work, I came up with mathematics that, when fed with rainfall data, could make predictions often as accurate as weather supercomputers. We even discovered that models as simple as calculating the historical average rainfall, and using this as a forecast, were sometimes more accurate than supercomputers. We were all surprised, but this finding seems to line up with results that others have found in climate science: it is actually possible to make forecasts of future global temperatures using simple statistical models that are as accurate as far more complex, general circulation models relied upon by the Intergovernmental Panel on Climate Change.
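As a rough illustration of the "historical average as forecast" idea, the sketch below averages past rainfall for each calendar month and uses that average as the prediction. The records are invented for the example, and grouping by calendar month is an assumption made here; the interview does not say how the averages were actually computed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, month, rainfall in mm) records, invented for illustration.
RECORDS = [
    (2010, 1, 80.0), (2010, 7, 20.0),
    (2011, 1, 100.0), (2011, 7, 30.0),
    (2012, 1, 90.0), (2012, 7, 10.0),
]

def monthly_climatology(records):
    """Average the historical rainfall separately for each calendar month."""
    by_month = defaultdict(list)
    for _year, month, rain_mm in records:
        by_month[month].append(rain_mm)
    return {month: mean(values) for month, values in by_month.items()}

def forecast(climatology, month):
    """Forecast rainfall for a month as that month's historical average."""
    return climatology[month]

climatology = monthly_climatology(RECORDS)

if __name__ == "__main__":
    print("January forecast (mm):", forecast(climatology, 1))
    print("July forecast (mm):", forecast(climatology, 7))
```

A baseline like this needs no physics and runs instantly on a laptop, which is why it makes a useful yardstick for more elaborate forecasting systems.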

Is this a new way of doing science?

If we divide science into three branches: experiment, theory and computer simulation, then what I am describing here doesn’t quite fit. These are not just simulations: the results are entirely reproducible with just the data and the mathematics. This approach mixes and matches models and data across disciplines, using recent advances in artificial intelligence.

The three branches of science. What happens when we add computational algorithms to the mix?

I don’t know what to call this approach, but I’m not the only one doing it. The most enthusiastic proponents are computer scientists, who do something like this regularly in mass-scale video analysis competitions or one-off prizes financed by big pharma for molecular drug discovery, as do statisticians working in forecasting.

In your TEDMED talk, you expressed concern that advances in science have stagnated. Can you explain?

Like many scientists, I’m concerned that science is becoming too fragmented. So many scientific papers are published each year that it is impossible to keep track of most new findings. Since most articles are never read, much new research has never been independently tested.

And, unfortunately, scientists are encouraged to ‘hyper-specialize’, working only in their narrow disciplines. It is alien to us applied mathematicians that a scientist who studies animal behavior might never read a scientific paper on fluid mechanics! In isolation from each other, could they just be duplicating each other’s mistakes?

What can we do to create a more unified approach?

First of all, open up the data. There is far too much politics, bureaucracy and lack of vision in sharing data among researchers and the public. Sharing data is the key to eliminating the lack of reproducibility that is becoming a serious issue. Second, don’t pre-judge. We need to have a renewed commitment to radical impartiality. Too often, favoured theories, models, or data persist (sometimes for decades), putting whole disciplines at risk of missing the forest for the trees.

More collaboration would also greatly speed advances. Is first-to-publish attribution of scientific findings really that productive? I think of science as a collaborative journey of discovery, not a competitive sport of lone geniuses and their teams.

Scientific theories that can withstand this “challenge” from other disciplines will have passed a very rigorous test. Not only will they be good explanatory theories, they will have practical, predictive power. And this is important because without this mixing of disciplinary knowledge, we will never know if science is really making progress, or merely rediscovering the same findings, time and again.

Follow Max Little @MaxALittle.