2 May 2019
A group of biostatisticians and genomics researchers has completed a one-day workshop on ‘Introduction to parallel programming in R on the High Performance Computing platform (Mendy)’ at the Fajara campus of the MRC Unit The Gambia at LSHTM.
The workshop, organised by the IT and Statistics departments, was presented by Dr. David Jeffries, Lead Statistician and Head of Statistics & Bioinformatics at the MRCG at LSHTM, and Archibald Worwui, HPC Systems Administrator, assisted by Karim Mane and Bankole Ahadzie.
“This is just the beginning of more trainings on how to use the new tools and facilities within the Unit for genomic research. We hope to continue these trainings for better science”, said Mr. Worwui.
The workshop aimed to build capacity in parallel processing on the HPC platform, an important skill for ‘Big Data’ analysis in genomics and for other workloads that demand substantial computational resources, including processor cores and memory.
Badou Gaye, the Head of Information and Communication Technology at the MRCG at LSHTM, remarked that the HPC platform gives the Unit a powerful infrastructure for analysing very large life sciences datasets with a quick turnaround time. Analyses that previously took several days to process can now be done in only a few hours. The HPC platform should, therefore, significantly reduce the time leading up to publications and minimise the Unit’s reliance on external collaborators to process and analyse its data.
Participants were introduced to facilities for logging into the HPC system and using it for basic parallel processing from an R code perspective. Related topics included the basics of Unix-like command-line usage, basic shell scripts, batch processing, and copying source code and data between users’ PCs and their home folders on the HPC using various tools.
Expressing his appreciation, Abdoulie Kanteh, one of the trainees, remarked that “the training could not have come at a better time. The MRCG at LSHTM has invested in genome sequencing, and the genomics platform generates huge sequencing data which requires powerful infrastructure like the HPC for data storage and processing. Therefore, logging into Mendy and running bioinformatic jobs in parallel at the Unit is a step in the right direction for huge data analysis”.