Tracking and Segmentation for Cell Lineage Reconstruction

Research Area: A.2-Light Microscopy
Status: In progress  
Project leaders: Collaborators:
  • Eugene Myers
  • Raju Tomer
  • Khaled Khairy
Proposed start date: 2010-09-07
Proposed end date: 2013-09-02

Our ultimate goal is to reconstruct full lineage trees, including morphology information for every individual cell, across entire developing complex model organisms such as Drosophila and zebrafish. Such information would open new doors to quantitative analyses of cellular dynamics, such as comprehensive mapping of gene expression dynamics, automated cellular phenotyping, and biophysical analyses of cell shape changes and cellular forces, to mention a few.

In recent years, light sheet microscopy has emerged as an essential tool for quantitative imaging of cellular dynamics in the embryonic development of complex organisms [1,2]. Briefly, it allows three-dimensional (3D) recordings of entire embryos in vivo at high temporal resolution over long periods of time. At each time point, multiple images from different viewpoints are recorded to obtain full coverage, resulting in terabytes of 3D image data. Thus, a comprehensive high-throughput computational pipeline is needed to process and analyze such recordings efficiently.

To transform these images into biological knowledge, accurate segmentation and tracking are indispensable before any relevant scientific question can be answered. Moreover, since complex systems comprise thousands of cells, the problem cannot be approached with manual annotation tools alone. Our goal is therefore to develop joint tracking and segmentation algorithms that recover full lineage trees, with morphology information, for each individual cell.
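
To make the target output concrete, the following is a minimal sketch of how a lineage tree with per-cell morphology might be represented. The class name `CellNode`, its fields, and the use of a single `volume` value as a stand-in for morphology are all assumptions made for illustration; they are not taken from the project's software.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)  # compare nodes by identity; parent/child links are cyclic
class CellNode:
    """One cell observed at one time point (all field names are hypothetical)."""
    time: int
    centroid: Tuple[float, float, float]  # (x, y, z) position in the 3D stack
    volume: float                         # stand-in for morphology information
    parent: Optional["CellNode"] = field(default=None, repr=False)
    children: List["CellNode"] = field(default_factory=list, repr=False)

    def add_child(self, child: "CellNode") -> "CellNode":
        # One child means the cell persists; two children mean it divided.
        if len(self.children) >= 2:
            raise ValueError("a cell can have at most two children")
        child.parent = self
        self.children.append(child)
        return child

    def is_division(self) -> bool:
        return len(self.children) == 2


def count_divisions(root: CellNode) -> int:
    """Count division events in the lineage rooted at `root` (iterative walk)."""
    total, stack = 0, [root]
    while stack:
        node = stack.pop()
        if node.is_division():
            total += 1
        stack.extend(node.children)
    return total


# Usage: one cell persists for a time point, then divides into two daughters.
root = CellNode(time=0, centroid=(0.0, 0.0, 0.0), volume=100.0)
mother = root.add_child(CellNode(time=1, centroid=(0.5, 0.0, 0.0), volume=110.0))
mother.add_child(CellNode(time=2, centroid=(0.0, 1.0, 0.0), volume=55.0))
mother.add_child(CellNode(time=2, centroid=(1.0, -1.0, 0.0), volume=55.0))
```

Restricting each node to at most two children encodes the biological constraint directly in the data structure, so an invalid association (for example, three daughters) fails early instead of silently corrupting the tree.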

Compared with traditional segmentation and tracking applications in computer vision, our setting has some advantages: our 3D datasets contain no occlusions, and cell movement is more tightly constrained by logical rules than human motion in a crowded environment. Moreover, we do not need real-time performance. However, the number of objects to track is several orders of magnitude larger, and many neighboring cells look alike. A key distinction is that cells divide (unlike humans), which extends the one-to-one or one-to-nothing set of hypotheses traditionally considered in tracking problems with a one-to-two case and thereby increases the combinatorial complexity of the solution. Finally, the error metric for lineage reconstruction is very demanding, since a single data association mistake between adjacent time points invalidates the whole branch of the tree downstream of that mistake. Thus, tracking and segmentation accuracies above 99% are required to solve the problem satisfactorily.
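
The accuracy requirement can be illustrated with a short calculation. The sketch below assumes that linking errors at different time points are independent, which is a simplification, and uses a hypothetical recording of 101 time points; both function names are invented for this example.

```python
def branch_survival(per_link_accuracy: float, n_time_points: int) -> float:
    """Probability that one branch is linked correctly across all time points.

    Assumes independent errors; a branch spanning n time points has n - 1 links.
    """
    return per_link_accuracy ** (n_time_points - 1)


def required_accuracy(target_survival: float, n_time_points: int) -> float:
    """Per-link accuracy needed for a branch to survive with the given probability."""
    return target_survival ** (1.0 / (n_time_points - 1))


# Usage with a hypothetical 101-time-point recording (100 links per branch).
for accuracy in (0.95, 0.99, 0.999):
    print(f"{accuracy:.3f} per link -> {branch_survival(accuracy, 101):.3f} per branch")
```

Under these assumptions, even 99% accuracy per link leaves only roughly 37% of branches error-free after 100 links, which is why per-link accuracy has to sit well above the level that would be acceptable in many other tracking applications.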

Drosophila lineage

Figure 1: (a) Global nuclei tracking in the entire Drosophila syncytial blastoderm. Raw image data from Supplementary Video 1 were superimposed with automated tracking results using our sequential Gaussian mixture model approach. Images show snapshots before the 12th mitotic wave and after the 13th mitotic wave. Nuclei are assigned random colors at the first time point, and these colors are propagated to daughter nuclei using tracking information. (b) Global detection of nuclear divisions during the 13th mitotic wave in the Drosophila syncytial blastoderm. Non-dividing nuclei are shown in cyan and dividing nuclei in magenta. The color of dividing nuclei progressively fades back to cyan within 5 time points. (c) Enlarged view of a reconstructed embryo with nuclei tracking information (left) and morphological nuclei segmentation (right). From [3].

Our proposed approach is a two-step solution. First, we develop greedy but efficient algorithms to home in on the right solution. Since many cases of segmentation and tracking are easy (for example, when cells neither move, divide, nor touch each other), we expect to reach around 95% accuracy between adjacent time points in less than 5 minutes of CPU time per 3D stack. Second, we perform inference in large-scale probabilistic models in which all the biological priors can be properly included, in order to solve the problem with the required accuracy. Inference should focus on areas where simple metrics give reason to doubt the solution from the greedy approach, and it should use that previous solution to derive robust statistics for the different parameters in the model. This “heavy algorithm artillery” should bring the solution to the desired accuracy.

However, it is impossible to conceive of a fully automatic solution for such a complex problem. Thus, we also work on visualization and annotation tools that provide a tight, efficient loop between the algorithm results and the user annotations. This sort of interaction can easily be incorporated into the inference model. Developing efficient visualization schemes and editing tools to check results in 5D datasets (3D space + color channels + time) in a crowded cell environment, however, is very challenging.
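
The first step of the approach, together with the idea of flagging doubtful regions for later inference, can be sketched as follows. This is a deliberately simple nearest-neighbor linker and not the sequential Gaussian mixture model used for Figure 1; the function name `greedy_link`, the thresholds `max_dist` and `ratio`, and the doubt criteria are all assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def greedy_link(prev: Sequence[Point], curr: Sequence[Point],
                max_dist: float = 10.0, ratio: float = 0.7
                ) -> Tuple[Dict[int, int], List[int]]:
    """Link each cell in `curr` to its nearest cell in `prev`.

    Returns (parent_of, doubtful): parent_of maps a curr index to a prev index,
    and doubtful lists curr indices whose link should be re-examined later.
    """
    parent_of: Dict[int, int] = {}
    doubtful: List[int] = []
    for j, c in enumerate(curr):
        dists = sorted((math.dist(c, p), i) for i, p in enumerate(prev))
        if not dists or dists[0][0] > max_dist:
            doubtful.append(j)  # no plausible parent within range
            continue
        best_d, best_i = dists[0]
        parent_of[j] = best_i
        # Ambiguous if the runner-up is almost as close as the best candidate.
        if len(dists) > 1 and best_d > ratio * dists[1][0]:
            doubtful.append(j)
    # A parent claimed by more than two children cannot be a valid division.
    children: Dict[int, List[int]] = {}
    for j, i in parent_of.items():
        children.setdefault(i, []).append(j)
    for js in children.values():
        if len(js) > 2:
            doubtful.extend(j for j in js if j not in doubtful)
    return parent_of, sorted(doubtful)


# Usage: two cells at time t, four detections at time t + 1.
prev_cells = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
curr_cells = [(1.0, 0.0, 0.0), (19.0, 0.0, 0.0), (21.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
links, flagged = greedy_link(prev_cells, curr_cells)
```

In this toy input, detections 1 and 2 both link to parent 1, which is a candidate one-to-two division, while detection 3 is equidistant from both parents and is flagged. Only flagged detections would then need to be passed to the more expensive probabilistic inference or shown to a human annotator.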


[1] P. J. Keller, A. D. Schmidt, J. Wittbrodt, and E. H. K. Stelzer, “Reconstruction of Zebrafish Early Embryonic Development by Scanned Light Sheet Microscopy,” Science, vol. 322, no. 5904, pp. 1065-1069, Nov. 2008.

[2] A. McMahon, W. Supatto, S. E. Fraser, and A. Stathopoulos, “Dynamic Analyses of Drosophila Gastrulation Provide Insights into Collective Cell Migration,” Science, vol. 322, no. 5907, pp. 1546-1550, Dec. 2008.

[3] R. Tomer, K. Khairy, F. Amat, and P. J. Keller, “Quantitative high-speed imaging of entire developing embryos with simultaneous multiview light-sheet microscopy,” Nature Methods, vol. 9, no. 7, pp. 755–763, 2012.