Big data makes coding on a supercomputer like developing embedded code

by Matthew Pocock

2020-02-25 16:00 (60 min) in USB 2.022

IDRIS is an application for epitope identification (developed by Dr Keith Flannagan as part of his PhD). It uses a tokenization approach to discover polypeptides that are found across all species within a group of interest and in no other species. Some of these targets have gone on to be experimentally validated and shown to work in practice. Since IDRIS was first developed, the number of available bacterial genomes has exploded, breaking the code.
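To make the "found in every species of the group, found in no other species" idea concrete, here is a minimal Python sketch. It is not the IDRIS implementation: it assumes tokens are fixed-length substrings (k-mers) of protein sequences, and the names `tokenize`, `species_tokens` and `group_specific_peptides` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set


def tokenize(protein: str, k: int) -> Set[str]:
    """Return every length-k substring (token) of one protein sequence."""
    return {protein[i:i + k] for i in range(len(protein) - k + 1)}


def species_tokens(proteins: List[str], k: int) -> Set[str]:
    """Union of the tokens from all proteins of one species."""
    tokens: Set[str] = set()
    for protein in proteins:
        tokens |= tokenize(protein, k)
    return tokens


def group_specific_peptides(
    in_group: Dict[str, List[str]],
    out_group: Dict[str, List[str]],
    k: int,
) -> Set[str]:
    """Tokens present in every in-group species and absent from all others."""
    in_sets = [species_tokens(proteins, k) for proteins in in_group.values()]
    if not in_sets:
        return set()
    # Keep only the tokens shared by every species in the group of interest.
    shared = set.intersection(*in_sets)
    # Discard any token that also occurs in a species outside the group.
    for proteins in out_group.values():
        shared -= species_tokens(proteins, k)
        if not shared:
            break
    return shared


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    in_group = {"species_a": ["MKTAYIA"], "species_b": ["QQKTAYW"]}
    out_group = {"species_c": ["TAYPP"]}
    print(group_specific_peptides(in_group, out_group, k=3))
```

This sketch holds every token set in memory at once, so its footprint grows with the number and size of the genomes it is given.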

In this talk I will recount my attempt to resurrect this code on a high-spec server, and how pushing even very substantial hardware to its limits begins to feel like embedded coding on a tiny, resource-limited device.