Books

Hallam Stevens’ Life Out of Sequence

Life Out of Sequence: A Data-driven History of Bioinformatics

by Hallam Stevens

The University of Chicago Press, 2013. 294 pages.

 

Life Out of Sequence is a lucid ethnographic and historical account of how computational tools changed how biologists think about and engage with living systems. In it, Hallam Stevens tells a captivating story about how genes and genomes become meaningful through the emerging field of bioinformatics. The book takes the reader through a series of “data-driven” studies of the key actors and locations of a new material culture in which data is at the centre. Unsatisfied with simple proclamations about the digitization of life, Stevens carefully describes how the virtualization of nucleic acids has changed epistemic practices in biology.

 

Chapter 1 starts with the development of digital computers, originally envisioned for military applications, which later came to be trusted for bio-scientific information management and analysis. In a recent interview Stevens observes that this historical backdrop complements Joseph November’s postwar account (Biomedical Computing, 2012) by continuing the story from the 1960s until the present day. Stevens argues that while initial attempts at computerizing biology failed (because they tried to shape computers to solve biological problems), biologists eventually came to pursue the kind of questions that computers were particularly good at solving. Through accounts of bioinformatics pioneers such as Margaret Dayhoff (a physical chemist and the ‘mother’ of bioinformatics), Walter Goad (a postwar physicist who introduced computing into biology and helped found GenBank), and James Ostell (an early innovator of nucleic acid analysis software), we learn how these tools gradually became trusted, and eventually ubiquitous, in current biology. Stevens shows that despite the pressing data management problems posed by massively collaborative undertakings like the Human Genome Project, bioinformatics has an interesting historical trajectory independent of the HGP’s organizational demands. Importantly, the book shows how epistemic subjects in the life sciences have morphed from the stereotypical individual scientist-entrepreneur working in small laboratory teams into participants in Big Science. In the latter form, production of biological knowledge is distributed very differently than in the former.

Chapter 2 describes how post-genomic biologists can do biology without interacting with wet biological material and what new divisions of labor result from this. In the epistemologically controversial terrain of data-driven or hypothesis-free biology, computers take on roles as “induction machines”: “wide instruments” that can tame massive amounts of digital data in order to reach novel insights[1]. Stevens describes how biologists come to trust what computers tell them and perceive themselves as “setting the data free to tell their own story” (p. 69). He also draws attention to ongoing debates on epistemic norms, the logic of scientific discovery, and the status of Popperian ideals about hypothetico-deductive methods in the age of big data.

According to Stevens, data are “constrained by the physical and virtual structures that create and store them” (p. 70). What consequences does this have? Chapter 3 shows that new modes of producing biological knowledge entail alternative ways of organizing scientists and their workspaces. The chapter takes us into an ethnography of the Broad Institute of MIT/Harvard fame, where epistemic credit is differentially distributed between producers and consumers of biological data. In this new knowledge economy, the consumption and analysis of data carry more prestige than its production, a division reflected in physical workspaces. New spaces for bioinformatic knowledge are arranged so that laboratory work itself can be “managed as data”, implying a form of “quantification and control of space and work”. Large-scale sequencing facilities leverage “lean production” principles from the Japanese auto industry to manage contemporary genomics research, improve accuracy and become more efficient in the quest for more and better sequence data. In these environments principles from operations management recombine with conventional modes of biological knowledge production. Economization and efficiency, as well as performance indicators and control through automation and barcode-tracking of biological samples and materials, all become integrated into laboratory environments. In these hybrid facilities, space and value take on new meaning as a novel kind of knowledge worker emerges. This epistemic subject is neither a Fordist automaton nor a traditional lab bench scientist, but a “lean biologist”. Lean biology is central to commoditizing life in the biotech age.

Chapters 4, 5 and 6 follow biological objects as they transform from wet materials into data via “pipelines”, get ordered in databases, classified and standardized in ontologies, networked, and finally visualized and analyzed on computer screens. We learn that biological databases, such as NCBI or Ensembl, are not simply archives, but devices more “oriented to the future than the past” as they structure and constrain future biomedical knowledge-making (p. 138). Databases are structured and connected according to underlying theories about biological mechanisms and pathways. As objects of material culture, the tools embody theoretical biology. The co-evolution of biological theories and bioinformatic systems is reflected in their parallel trajectories, from former assumptions about ‘one gene, one protein’ interactions, to the complexity of contemporary federated databases and current developments in ‘omics’[2].

Visualization and manipulation of data from databases through genome browsers such as NCBI and Ensembl are an integral part of bioinformatic work, and according to Stevens, the scientific visuals in computational biology act as theoretical models with distinct inferential properties. Doing bioinformatics is to a large extent practical problem-solving: how do you translate and manipulate biological objects into representations that can be used by biologists? Genomes in the wild are curled-up macromolecules with structural elements we cannot see, not even through powerful microscopes. Since we have no intuitive ontological understanding of genomes, Stevens argues that computational representations come to define what genomes are. The computational biologist does not interact directly with genomes, but with computations. Translating between these entities becomes a question of representation and re-representation. Pictures have different semiotic properties than numbers, and biologists grapple with the constraints of these representations daily. On the question of how genomes become meaningful, Stevens’ account could have benefited from engagement with recent cognitive studies of the material and visual culture of science (see for instance Morana Alač’s Handling Digital Brains (2011)). From such a perspective, genomes, as meaningful visual representations of numerical data, are produced by large-scale distributed cognitive networks that enter into new, extended cognitive systems assembled on the spot by canny, embodied cognizers in front of computer screens.

In the conclusion, Stevens looks at how the Web’s future is closely linked to that of bioinformatics. Web 3.0 promises to systematically connect massive amounts of data by pulling heterogeneous elements together in networked representations that not only humans, but also machines, can make meaning out of. Web 3.0 is also likely to become a “wet web, existing at the interface between the biological and digital” (p. 218). In ‘biology 3.0’, the boundaries between the biological and digital are erased, and bioinformatics will become biology as usual. The book ends by considering Homo statisticus, a new post-genomic vision of the human based on beliefs about the statistical properties of individual genomes, and what this vision entails for our conception of the self.

Life Out of Sequence is not structured chronologically, but as a series of explicitly data-driven parallel accounts. This is a successful move. Theoretical elaborations are brief and succinct compared to the sometimes heavily theory-driven STS literature. There are extensive footnotes. The result is well-written, clear prose of interdisciplinary relevance. As an anthropologist doing a cognitive ethnography in a community of marine biologists performing functional genomic studies of an economically important parasite for salmon aquaculture, my own observations suggest a less radical shift away from wet-work than Life Out of Sequence argues. Although my interlocutors use many of the tools described in the book, laboratory wet-work (and particularly stabilizing ‘wet’ experimental systems) is central to their ongoing activities. Nonetheless, for comparative studies of scientific meaning-making, this work offers a goldmine of insights into the distributed and situated nature of computational, material and cognitive practice in contemporary biology. Stevens’ account also offers exciting opportunities for comparative work on the semiotic properties of scientific visuals, and how these epistemic artifacts enter into the multimodal, distributed cognitive ecosystems of contemporary science. Highly recommended.

 

Mads Solberg is a doctoral fellow at the University of Bergen, Norway. He works on a cognitive ethnography of knowledge-making and technological innovation in marine science, in particular the development of novel solutions for managing sea lice, a persistent threat to salmon farming.

[1] In contrast to narrow instruments that make few measurements to test specific hypotheses, wide instruments in modern genomics can make hundreds of thousands of measurements.

[2] Omics is a suffix for approaches that aim to capture the entirety of interactions between genetic elements and their products, including epigenomics, proteomics, metabolomics. For a review of current developments and proliferations of ‘omics’ see the popular Nature piece Big biology: the ‘omes puzzle (2013).

