Leveraging terabytes of single cell data

Yvan Saeys is a young professor at VIB-UGent. As the group leader of the department of Data Mining and Modeling for Biomedicine, he focuses on developing computational models for single-cell analysis. At the Janssen Pharmaceuticals – Flanders.Bio Partner day last month, he gave a talk on partnerships between academia and pharmaceutical companies from the perspective of his research.

How the collaboration with Janssen started

Saeys’ group started 4 years ago and participated in the FlowCap challenge, a competition in which scientists develop alternative modeling techniques to analyze flow cytometry data. They had to come up with a predictive model to forecast when an HIV patient would develop AIDS, based on data from a blood sample of the patient obtained with flow cytometry, a high-throughput technique for single-cell analysis. They presented their results at a flow cytometry conference in the US, where their research was picked up by Janssen Pharmaceutica.

Don’t take the money and run

A collaboration between academia and industry is a win-win situation, as the strengths of both partners can be combined. By collaborating with academia, pharmaceutical companies gain access to basic research and the scientific rigor of intelligent minds, from students as well as professors. Furthermore, academic parties are usually not financially or organizationally able to set up their own large-scale projects. Thanks to the collaboration with Janssen, Saeys’ lab gets access to tremendous amounts of cell data that it couldn’t produce by itself.

Of course, academia and pharmaceutical companies might have different interests in the long run. Whereas academic institutes want to publish their research as early as possible, pharmaceutical companies want to keep it secret as long as possible. According to Saeys, you have to find a middle ground: “We can publish almost everything we want, while still preserving the interest of the company. You do have to be prepared to deliver, even as an academic partner. If not, then you have a problem. You cannot be opportunistic and just take the money and run.”

Machine learning thanks to terabytes of data

Saeys’ lab analyzes huge datasets of single cells: over one billion cells, amounting to tens of terabytes of data. Using AI and predictive modeling techniques, the group comes up with answers that cannot be obtained with classical methods. They screen single-cell images, segment them, and extract information. Then they use machine learning techniques to build a computational model that predicts compound activity profiles.
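
To make the segment, extract, and predict steps concrete, here is a minimal Python sketch on a toy image. It is an illustration only and not the lab's actual pipeline: the article does not say which methods the group uses. The threshold-based segmentation, the two features (area and mean intensity), the nearest-centroid classifier, the "active"/"inactive" labels, and the image itself are all assumptions made for this example.

```python
from collections import deque


def segment_cells(image, threshold):
    """Label 4-connected regions of pixels brighter than the threshold."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    n_cells = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > threshold and labels[r][c] == 0:
                n_cells += 1
                labels[r][c] = n_cells
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] > threshold
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = n_cells
                            queue.append((ny, nx))
    return labels, n_cells


def extract_features(image, labels, n_cells):
    """Return one (area, mean intensity) tuple per segmented region."""
    area = [0] * (n_cells + 1)
    total = [0.0] * (n_cells + 1)
    for row_pixels, row_labels in zip(image, labels):
        for value, label in zip(row_pixels, row_labels):
            if label:
                area[label] += 1
                total[label] += value
    return [(area[k], total[k] / area[k]) for k in range(1, n_cells + 1)]


def fit_centroids(features, classes):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    sums = {}
    for feat, cls in zip(features, classes):
        acc = sums.setdefault(cls, [0.0] * len(feat) + [0])
        for i, v in enumerate(feat):
            acc[i] += v
        acc[-1] += 1  # last slot counts the examples of this class
    return {cls: tuple(s / acc[-1] for s in acc[:-1])
            for cls, acc in sums.items()}


def predict(centroids, feat):
    """Return the class whose centroid is closest to the feature vector."""
    return min(centroids,
               key=lambda cls: sum((a - b) ** 2
                                   for a, b in zip(feat, centroids[cls])))


# Toy grayscale image with two bright blobs standing in for cells.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 3, 0],
    [0, 0, 0, 0, 3, 0],
    [0, 0, 0, 0, 0, 0],
]
labels, n_cells = segment_cells(image, threshold=1)
features = extract_features(image, labels, n_cells)

# Hypothetical labels: pretend the first blob came from an active compound.
centroids = fit_centroids(features, ["active", "inactive"])
prediction = predict(centroids, (3, 8.0))
```

At the scale described in the article, each step would be replaced by far more capable tooling and run across many machines, but the overall shape (images in, per-cell features out, a model trained on those features) stays the same.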

Reprogramming the immune system

The Saeys group is also focused on understanding the dynamics of the immune system. The group is trying to generate single-cell transcriptomics data to identify the different cell populations present in a sample, including very rare cell types such as circulating tumor cells. Saeys concludes: “We want to get a better understanding of the different stages a cell goes through and which stages result in a cancer phenotype. The immune system is amazingly complex. If we can grasp how cells transition between states, we may have the potential to try and reprogram them.”