Why keeping up with data science matters for our health

Data without science is nothing: just 1s and 0s, floating around a cloud waiting for someone to make sense of them. Data science is the process of extracting value from data, using advanced analytics tools. Enormous amounts of health information are being gathered every second, and we are rapidly getting better at decoding it: turning bytes into insights that can be used to improve the lives of patients. But the pace, methods and ethics of data science adoption vary dramatically between countries and regions. Why should we care about keeping up?

This article by Amy LeBlanc was part of a series on data science in the lead-up to Science for health 2021.

Data science is all about creating value from data using advanced new tools. In the health sector, that means creating value for the patient. It enables the extraction of information that can’t be accessed with normal statistics because the data sets are too big, too disordered, or qualitative rather than quantitative (e.g., health records or medical imaging).

By bringing together AI tools – such as machine learning, natural language processing and advanced analytics – data science can create new insights, increase efficiency, and smooth processes in the health sector. Many countries around the world are already embracing this evolution in health science, and for good reason!

How can data science improve health?

There are many aspects of healthcare where data science is having a transformational effect. Some of these areas include:

Medical Imaging

It’s estimated that medical images – such as x-rays, MRIs and CT scans – now account for up to 90% of all medical data. Processing these images has traditionally been done manually, with doctors visually inspecting them for irregularities. Providing an accurate diagnosis this way takes up valuable time and requires well-trained physicians, and even experienced doctors may still miss microscopic deformities in the image. With the advent of deep learning technologies in data science, it is now possible to pick up on such details and to form an accurate diagnosis faster and with less variability than is possible with manual processing.


Genomics

Ever since the Human Genome Project, genomics research has been advancing rapidly. Before the widespread availability of powerful computers, analyzing gene sequences required an extraordinary amount of time and money. With advanced data science tools, it is now possible to derive insights from the human genome in a much shorter period of time and at a much lower cost. This in turn opens up the possibility of personalized healthcare, where medical decisions can be tailored to the individual patient based on their predicted response to certain interventions or their risk of developing specific conditions.

Early Discovery

Early discovery is often about finding patterns, something computers can do better than any person (particularly for data sets too large or complex for a human brain to process). By leveraging tools such as AI, we can optimize and improve research, shortening timelines and accelerating innovation and development pipelines. This is particularly important as early discovery is a slow and expensive process. By using data science, companies can reduce costs and get better solutions to patients faster.

Health Monitoring

In recent years, there has been a massive upsurge in the number of digital devices available on the market. In healthcare, these devices are often wearable health monitors, able to track the user’s heartbeat, temperature, and other medical parameters. The results can be used by the wearer themselves, or to inform physicians of their patient’s ongoing condition and to warn them of any acute events such as a heart attack. By analyzing the resulting real-world data sets, researchers can make huge breakthroughs in predictive medicine and disease prevention.
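To make the idea of an automated warning a little more concrete, here is a minimal sketch in Python of one very simple way a stream of heart-rate readings could be screened for sudden jumps. The function name `flag_spikes`, the window size, the 30 beats-per-minute threshold, and the readings themselves are all invented for illustration; none of them come from a real device or a clinical guideline.

```python
from collections import deque
from statistics import mean


def flag_spikes(readings, window=5, max_jump=30):
    """Return the positions of readings that differ from the average of
    the previous `window` readings by more than `max_jump`.

    `window` and `max_jump` are arbitrary illustrative values, not
    clinically validated thresholds.
    """
    recent = deque(maxlen=window)  # rolling buffer of the latest readings
    flagged = []
    for i, bpm in enumerate(readings):
        # Only compare once a full window of history is available.
        if len(recent) == window and abs(bpm - mean(recent)) > max_jump:
            flagged.append(i)
        recent.append(bpm)
    return flagged


# Made-up heart-rate readings in beats per minute.
heart_rate = [72, 74, 71, 73, 75, 74, 130, 76, 73]
alerts = flag_spikes(heart_rate)
print(alerts)
```

In this toy series only the jump to 130 is flagged. Real monitoring systems are far more sophisticated, but the underlying principle of comparing each new reading with a recent baseline is the same.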

Predictive Analytics

Predictive models use historical data, learn from it, find patterns, and generate predictions which can be used to inform decisions. By creating and tweaking a ‘digital twin’ (a parallel virtual version of the patient, process, drug, or machine), predictive analytics can be used to improve everything from patient care and chronic disease management to the efficiency of supply chains and hospital logistics.
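The learn-then-predict loop described above can be sketched in a few lines of Python. This is a deliberately tiny example, assuming a straight-line trend fitted by ordinary least squares; real predictive models in health are far richer. The names `fit_line` and `predict` and the bed-occupancy numbers are made up for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Learn a straight-line trend (slope, intercept) from historical
    data using ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply the learned trend to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Synthetic, noise-free toy history: week number and occupied hospital beds.
weeks = [1, 2, 3, 4, 5]
beds = [102, 104, 106, 108, 110]

model = fit_line(weeks, beds)   # "learn from historical data"
forecast = predict(model, 6)    # "generate a prediction" for week 6
print(forecast)
```

Because the toy data rise by exactly two beds per week, the forecast for week 6 is 112 beds. With real, noisy data the fitted line would only approximate the trend, which is why predictive models are continually re-checked against new observations.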

These are just a few of the most common data science applications in health. There are many other areas, including clinical trials, epidemiology, and pandemic preparedness, where data science can be used to improve on other analysis methods.

Don’t be left in the data dust!

So, if data science is so great, why isn’t everyone already using it? This field is growing unbelievably fast, but the pace of progress varies dramatically even between similarly well-developed countries. There are a few key aspects that regions need to prioritize if they want to keep up with the rapid acceleration of data science in health.

Firstly, we need more data scientists. Currently, they are few and far between, and highly sought after. Nearly 75% of the world’s data scientists work in just five industries: Information Technology (26%), Education/Science (14%), Consulting (13%), Financial Services (11%) and Healthcare & Medical (9%). Despite the importance of data science in health, the sector is lacking the skilled individuals it needs to incorporate and make full use of the new tools being developed.

We also need more scientists who are themselves adept at using data science tools. In the same way that a researcher regularly uses statistics without being a statistician, they can use data science without it necessarily being their main expertise. By building complementary teams of researchers and data scientists, vital skills can be shared between individuals, strengthening the whole ecosystem for the future.

Finally, we need greater investments in data science by companies. Driven by the increased demand for data science, the industry’s market value is rising fast. The global data analytics market was estimated to be worth $24.63 billion in 2021, with an estimated compound annual growth rate of 25% between now and 2030. This growth rate varies dramatically by region, however, with countries like the USA speeding ahead of many others. With the rapid pace of progress, regions like the EU risk being left behind if they ignore this space – not just in terms of profits, but also regarding access to vital tools and competencies necessary to keep up with international competition.
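To get a feel for what a 25% compound annual growth rate implies, the arithmetic can be sketched in Python. This assumes, purely for illustration, that growth compounds once a year starting from the 2021 estimate and continues for nine years to 2030; the article’s "between now and 2030" does not pin down the exact start year, and the function name `project_market_value` is our own.

```python
def project_market_value(base_value, annual_growth, years):
    """Compound `base_value` at `annual_growth` for a number of years."""
    return base_value * (1 + annual_growth) ** years


base_2021 = 24.63  # USD billions, the 2021 estimate quoted above
cagr = 0.25        # 25% compound annual growth rate

# Assumption: nine full years of compounding, from 2021 to 2030.
projected_2030 = project_market_value(base_2021, cagr, 2030 - 2021)
print(f"{projected_2030:.1f} billion USD")
```

Under these assumptions the market would reach roughly $183.5 billion by 2030, more than seven times its 2021 size. A region growing even a few percentage points slower each year would end the decade far behind.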

Privacy, policy, and protection: the role of politicians

As with companies and investors, government bodies also need to start prioritizing data science. This is important not just for the health of their citizens, but also from the perspective of the economy and international security – governments have many compelling reasons to invest in data science in their region. They also have important roles to play in many of the practical and ethical challenges of digitalized healthcare.

The sheer volume of health data has become overwhelming, far outstripping global storage and maintenance capacities. Federal support is key to proper data storage, cleaning, and harmonization, which is necessary to enable researchers to access useable data in a manner that protects patient privacy.

Federal networks or consortia of organizations are needed to combine datasets which are often fragmented. In Europe, information is usually controlled by different institutions, with scientific data held by researchers and hospitals in control of health records. This often makes it impossible for researchers to bring the information together in order to glean meaningful medical insights.

Regulatory clarification is also vital to ensuring that ethical requirements surrounding data use are met. Government stances on data currently vary dramatically across the world: the EU General Data Protection Regulation (GDPR), for example, maintains that data is owned by the individual, whereas China posits that data is owned by the state. In many countries, it is not the individual or the state but instead the companies that control the data. No matter the system, we need to balance the protection of patients’ privacy and data ownership rights with the benefits that big data brings to medical research (and, therefore, back to the patients themselves).

Dive into data!

Progress in data science has been inspired by other industries, such as internet retailers and online streaming platforms: companies that use big data to improve their services. The healthcare industry has collected mountains of data over the years, but across the world we haven’t been leveraging them to the full extent.

As biological, chemical, and real-world data increases in both quantity and quality, we need to take full advantage of the new tools at our disposal. By turning data into insights, we can create new health solutions and improve informed decision making. Companies and governments have to invest more – in programs, initiatives, and training data scientists – to make sure that we get this right and don’t fall behind global progress. Data science has an enormous transformational potential in health: for everyone’s benefit, we must embrace it!