Knowledge representation in synthetic biology

by James McLaughlin

16:00 (40 min) in USB 4.005

Synthetic biology is a relatively new field concerned with formalising genetic engineering into a design, build, test, learn lifecycle common to other engineering disciplines. This lifecycle can be used to systematically develop biological systems, such as genetic circuits - where transcriptional machinery is repurposed to implement familiar electronic circuit concepts such as logic gates - and other engineered devices such as biosensors or drug production factories.

Synthetic biological systems are typically designed by repurposing existing natural and synthetic biological parts. This design process is made possible by knowledge about part structure and function, which can be experimentally derived or predicted using bioinformatics methodologies. However, gathering such knowledge is arduous, as the knowledge is often computationally intractable (i.e. described in a publication as free text), distributed across multiple disparate databases with semantic and syntactic heterogeneity, or not recorded at all.

There are both short-term and long-term solutions to this problem. The short-term solution is to improve access to existing knowledge through data integration and harmonisation. The long-term solution is to establish the software and data infrastructure necessary to enable future parts to be documented in a well-defined and tractable manner. This talk explores approaches to both, culminating in the development of a novel suite of tools for informed synthetic biology design.