Janssen Pharmaceutica uses the VSC Tier-1 infrastructure to analyse large data sets with the aim of reducing the failure rate in phase II clinical trials during the development of new medicines.
Janssen Pharmaceutica in a few words
In 1953, Dr. Paul Janssen founded Janssen Pharmaceutica. He had one goal in life: better quality of life through the development of better medicines. In 1961 the company joined Johnson & Johnson, the global healthcare market leader. Today, the group has more than 250 operating companies in 57 countries and more than 128,000 employees worldwide. Janssen in Belgium is the largest Johnson & Johnson site outside the USA and holds a major position within the American group. Janssen Belgium is the flagship R&D site of Janssen in Europe and the largest Johnson & Johnson R&D center worldwide for small molecules.
HPC at Janssen Pharmaceutica
Jörg Wegner, Senior Scientist at Janssen Pharmaceutica, points out how computation requirements change when confronted with huge amounts of data: “As in any other scientific environment, our R&D is facing exponential data growth. It may be easy to count the entries in a file with 10,000 lines, but it is really difficult to do so in a file with 100 million entries. And, typically, tasks in R&D are far more complex.
Computation in itself becomes more critical since most computational tasks do not scale linearly: many scale in a quadratic, cubic, or even exponential way with the growing amount of data. In other words, a doubling in the amount of data does not increase the computation requirements by a factor of 2, but by a factor of 4 (quadratic), 8 (cubic), or by a factor that itself keeps growing with the data size (exponential). This requires careful analysis and changes in the way data is processed, drawing on the expertise of large-scale computing experts.”
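The scaling argument above can be made concrete with a short sketch. The cost functions below are purely illustrative stand-ins for complexity classes, not Janssen's actual workloads:

```python
# Illustrative sketch: how much more work is needed when the data volume doubles,
# for hypothetical cost functions in different complexity classes.

def growth_factor(cost, n):
    """Ratio of the work needed for 2n items versus n items."""
    return cost(2 * n) / cost(n)

n = 1_000_000  # e.g. entries in a data file

print(growth_factor(lambda m: m,    n))   # linear:    2.0
print(growth_factor(lambda m: m**2, n))   # quadratic: 4.0
print(growth_factor(lambda m: m**3, n))   # cubic:     8.0

# For an exponential cost like 2**m, the ratio 2**(2n) / 2**n = 2**n is not
# a fixed factor at all; it grows with n (evaluated at a small n here).
print(growth_factor(lambda m: 2**m, 20))  # 2**20, over a million
```

The exponential case shows why such tasks quickly become infeasible: the cost of doubling the data is not a constant multiplier but one that explodes with the data size itself.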
At least three supercomputing projects are ongoing at Janssen Pharmaceutica in collaboration with multiple partners, including the VSC (Flemish Supercomputer Center), IMEC (Interuniversity Microelectronics Center), Intel, and academic partners:
- Large-scale next-generation sequencing (NGS) analysis
- High-content imaging (HCI) analysis
- Large-scale machine learning (ML)
Jörg Wegner: “Progress in sequencing is producing terabytes of NGS and HCI data. Both will help us gain a better understanding of patients’ biological details and of complex biological phenotypic systems. To judge the relevance and impact of these data sets for R&D, large-scale machine-learning approaches can be used to understand and support novel experimental designs and, especially, their risk estimations. A simple example is the attrition rate of drugs in phase II clinical trials, which everyone in the pharma industry is facing.”
Janssen Pharmaceutica and the VSC
Jörg Wegner: “The HPC efforts with the VSC aim to reduce the risks in progressing lead compounds to drugs by understanding and analyzing large data sets, such as patient-disease sequencing data. By running large-scale computations on the data available to and produced by Janssen and all of our partners, we ultimately hope to reduce the failure rate in phase II clinical trials. We also aim to open new paths for patients with unmet medical needs for whom no treatment options may exist at this point in time.”