Self-Classifying MNIST Digits: Michael Levin Research Paper Summary

What is the Goal of This Study?

  • The goal is to show how a group of simple agents (cells) can work together to classify digits using local communication between neighboring agents.
  • The agents are placed on a grid and each agent decides its own color, based on the collective shape it forms with its neighbors. The aim is for all agents to agree on the label of the digit they form.

What Are Cellular Automata (CAs)?

  • A Cellular Automaton (CA) is a computational model made of cells that interact with their neighbors to create complex patterns.
  • Each cell follows simple rules based on its neighbors’ states, but when combined, these simple rules can lead to complex behavior and shapes.
  • This study uses Cellular Automata as a model for how cells might communicate and classify patterns like digits in a group.
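As a point of reference for what "simple rules based on neighbors' states" means, here is a minimal sketch of a classic cellular automaton, Conway's Game of Life. It is not the learned rule from the study; it only illustrates that every cell is updated from its immediate neighborhood. The function name `life_step` and the dead-border assumption are choices made for this example.

```python
def life_step(grid):
    """One synchronous update of Conway's Game of Life on a 2D list of 0/1."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count live cells among the 8 neighbors; off-grid cells count as dead.
            live = sum(
                grid[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
            )
            # Local rule: a live cell survives with 2 or 3 live neighbors,
            # a dead cell becomes alive with exactly 3.
            if grid[r][c] == 1 and live in (2, 3):
                new_grid[r][c] = 1
            elif grid[r][c] == 0 and live == 3:
                new_grid[r][c] = 1
    return new_grid


blinker = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(life_step(blinker))  # the horizontal bar becomes a vertical bar
```

No cell sees the whole grid, yet the pattern as a whole oscillates. The study relies on the same kind of locality, with a trained rule in place of a hand-written one.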

How Does This Model Work?

  • The cells do not know where they are located but are aware of directions (up, down, left, right) on the grid.
  • The cells communicate with neighbors to share information about their shape and label.
  • The model assigns each digit a label (0-9), and the goal is for the group of cells to work out which digit they are forming using only local messages from neighbors.

How Do the Cells Classify Digits?

  • The MNIST dataset is used for this task, where each image of a digit is represented as a 28×28 grid of pixels.
  • Each cell in the grid receives information about the pixel value it represents, and depending on the pixel intensity, a cell is either “alive” or “dead”.
  • Cells communicate with their neighbors to decide on the overall label of the digit they are forming. The label is taken to be the one that the majority of cells agree on.

Key Components of the Model

  • Target Labels: The model uses 10 channels to represent the 10 possible digit labels (0-9), with the most active channel corresponding to the correct digit.
  • Alive vs Dead Cells: Cells are “alive” if the corresponding pixel value in the MNIST image is above 0.1, and they perform updates. “Dead” cells do not update but remain visible to their neighbors.
  • Perception: The model uses convolutional layers to process information about the cell’s neighbors and make decisions about the digit label.
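The alive/dead rule and the label readout can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's code: the 0.1 threshold and the 10 label channels come from the summary, while the function names, the list-of-lists layout, and the use of `None` for dead cells are hypothetical choices for the example.

```python
ALIVE_THRESHOLD = 0.1  # from the summary: pixels above 0.1 are "alive"
NUM_LABELS = 10        # one channel per digit 0-9


def alive_mask(image):
    """Return a grid of booleans: True where the pixel exceeds the threshold."""
    return [[pixel > ALIVE_THRESHOLD for pixel in row] for row in image]


def cell_label(label_channels):
    """Return the index of the most active of a cell's 10 label channels."""
    return max(range(NUM_LABELS), key=lambda k: label_channels[k])


def read_labels(image, states):
    """Label guess for every alive cell, None for dead cells.

    states[r][c] is assumed to hold that cell's 10 label activations.
    """
    mask = alive_mask(image)
    return [
        [cell_label(states[r][c]) if mask[r][c] else None
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def one_hot(k):
    """Hypothetical helper: activations with only channel k switched on."""
    return [1.0 if i == k else 0.0 for i in range(NUM_LABELS)]


image = [[0.0, 0.8], [0.05, 0.3]]
states = [[one_hot(0), one_hot(3)], [one_hot(1), one_hot(7)]]
print(read_labels(image, states))
```

In this toy 2×2 example only the right-hand column is alive, so only those two cells report a label.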

Experiment 1: Self-Classify, Persist & Mutate

  • The model is trained to classify a digit, after which the digit is mutated. The cells then have to adjust and reclassify the new shape.
  • This experiment tests the model’s ability to adapt to changes in the digit and keep reclassifying correctly.
  • Because the digit is mutated during training, the cells are forced to update their classification, and this is how the model learns to process new information.
  • Cross-entropy loss is used during training to measure how well the cells classify the digit.
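A per-cell cross-entropy loss of the kind mentioned above can be sketched as follows. The assumptions here are that each alive cell's 10 label channels are treated as logits, that every cell shares the same target digit, and that the loss is averaged over alive cells; the function names are hypothetical and the study's exact formulation may differ.

```python
import math


def softmax(logits):
    """Convert raw label activations into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cell_cross_entropy(logits, target):
    """Cross-entropy for one cell: negative log-probability of the true digit."""
    return -math.log(softmax(logits)[target])


def mean_cross_entropy(cell_logits, target):
    """Average the per-cell loss over a list of alive cells' logits."""
    losses = [cell_cross_entropy(logits, target) for logits in cell_logits]
    return sum(losses) / len(losses)


undecided = [0.0] * 10   # a cell with no preference among the 10 digits
confident = [0.0] * 10
confident[4] = 5.0       # a cell that strongly favors digit 4
print(mean_cross_entropy([undecided, confident], target=4))
```

A cell with no preference contributes a loss of ln(10), and the loss shrinks as the cell puts more weight on the correct channel.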

Experiment 2: Stabilizing Classification

  • The main problem observed is that after mutation, the cells often disagree on the correct digit, leading to flickering or instability in the classification.
  • To fix this, the researchers track the “total agreement” among cells to measure how stable the classification is over time.
  • The model was trained with different loss functions to see how they affect the stability of the classification.
  • One of the new methods used is L2 loss, which helps reduce the instability by keeping the internal states of the cells more balanced.
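The two quantities in this experiment can be illustrated with a small sketch. Both definitions are assumptions made for the example: total agreement is taken to be the fraction of alive cells that share the most common label, and the L2 loss is the squared distance between a cell's 10 label channels and a one-hot target. The study's own definitions may differ in detail.

```python
from collections import Counter


def total_agreement(labels):
    """Fraction of cells sharing the most common label (1.0 means unanimous)."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def l2_loss(label_channels, target, num_labels=10):
    """Squared distance between a cell's label channels and a one-hot target."""
    one_hot = [1.0 if k == target else 0.0 for k in range(num_labels)]
    return sum((a - t) ** 2 for a, t in zip(label_channels, one_hot))


# Four alive cells: three say "3", one still says "8".
print(total_agreement([3, 3, 3, 8]))

# A cell whose channels are all zero, scored against target digit 5.
print(l2_loss([0.0] * 10, target=5))
```

Unlike cross-entropy, this L2 loss is zero once the channels exactly match the one-hot target, so it gives cells no incentive to keep growing their activations.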

Key Findings from Experiment 2

  • Using L2 loss made the model more stable by reducing flickering and increasing total agreement among cells.
  • Noise was also added to the residual updates, which helped the system become more robust and less prone to instability.
  • The total agreement increased significantly when noise was added to the updates, showing better stability over time.
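Adding noise to a residual update can be sketched as below. In a residual update the new state is the old state plus a computed change; the noise is an extra random term added on top. The Gaussian form, the `noise_std` parameter, and the function name are assumptions for illustration, since the summary does not give the noise distribution or its scale.

```python
import random


def noisy_residual_update(state, update, noise_std, rng):
    """Return state + update + Gaussian noise, element by element.

    state and update are lists of floats of equal length; rng is a
    random.Random instance so that results are reproducible.
    """
    return [s + u + rng.gauss(0.0, noise_std) for s, u in zip(state, update)]


rng = random.Random(0)
state = [1.0, 2.0, 3.0]
update = [0.5, -0.5, 0.0]

print(noisy_residual_update(state, update, noise_std=0.0, rng=rng))   # no noise
print(noisy_residual_update(state, update, noise_std=0.02, rng=rng))  # small noise
```

With `noise_std=0.0` this reduces to the plain residual update, which makes the effect of the noise term easy to compare.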

Robustness of the Model

  • The model is robust to changes in the digit’s shape, meaning that if you draw a digit differently (e.g., using a thicker or thinner line), the model can still classify it correctly.
  • This is similar to how biological systems, like planaria (a type of flatworm), can regenerate correctly even after many mutations or changes.
  • The model is also tested on digits that were not part of the MNIST dataset to see how it performs on out-of-distribution data. The model can generalize to new shapes but is not perfect for extreme changes.

What Are the Implications for Biology?

  • This model helps us understand how simple rules followed by cells can lead to complex behaviors, such as classification, similar to biological processes like regeneration.
  • The findings are important because they show how a group of cells can collectively achieve a goal (like classifying a digit) that individual cells could not achieve on their own.
  • This approach could be applied to regenerative medicine, where instead of editing genes of individual cells, cells could be taught to work together to achieve desired outcomes, like regenerating a missing limb.

Conclusion

  • This study demonstrates that a simple, self-organizing system of cells can be used for classification tasks by allowing them to communicate and adapt to changes.
  • By training the cells to classify digits and adapt to mutations, the researchers show that such a model can be a powerful tool for understanding complex biological processes like tissue repair and regeneration.

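Goal of the Study

  • The goal is to show how a group of simple agents (cells) can jointly classify digits through local communication with neighboring agents.
  • The agents are arranged on a grid, and each agent chooses its color according to the overall shape it forms with its neighbors. The aim is for all agents to agree on the label of the digit they form.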

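What Are Cellular Automata (CAs)?

  • A cellular automaton (CA) is a computational model made up of interacting cells that create complex patterns through simple rules.
  • Each cell applies simple rules based on its neighbors’ states, but in combination these simple rules can produce complex behavior and shapes.
  • This study uses a cellular automaton to model how cells communicate and classify patterns from local information.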

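How Does the Model Work?

  • The cells do not know their position on the grid, but they do know the directions up, down, left, and right.
  • The cells share information by communicating with their neighbors and decide the label of the digit they form.
  • The model’s goal is for the cells to identify the label of the digit they form from the information they receive.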

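How Do the Cells Classify Digits?

  • This study uses the MNIST dataset, in which each digit image is a 28×28 grid of pixels.
  • Each cell in the grid receives the value of the pixel it represents, and depending on the pixel intensity the cell is considered “alive” or “dead”.
  • The cells decide the digit label by communicating with their neighbors. The label is decided by agreement among the majority of cells.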

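Key Components of the Model

  • Target labels: The model uses 10 channels to represent the 10 digit labels (0-9), and the most active channel determines the digit.
  • Alive and dead cells: A cell is “alive” and performs updates if its pixel value is greater than 0.1; otherwise it is “dead” and does not update.
  • Perception: The model uses convolutional layers to process information from a cell’s neighbors and make decisions.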

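Experiment 1: Self-Classify, Persist & Mutate

  • The goal of this experiment is to train the model to reclassify a digit after it mutates. During training, the model repeatedly adapts to new digit shapes and reclassifies them.
  • The experiment tests how well the model adapts to changes in the digit’s shape.
  • Because digits are mutated during training, the model learns to take in new information and update its classification to match the shape after mutation.
  • Training uses cross-entropy loss, which measures how accurately the cells classify the digit.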

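Experiment 2: Stabilizing Classification

  • The problem observed is that after a mutation the cells often disagree about the classification, which causes it to flicker or become unstable.
  • To address this, the researchers tracked the proportion of cells in “total agreement” as a measure of classification stability.
  • The model was trained with different loss functions to see how they affect classification stability.
  • One new approach is L2 loss, which helps reduce instability by keeping the cells’ internal states more balanced.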

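Key Findings from Experiment 2

  • L2 loss made the model more stable, reduced flickering, and increased agreement among cells.
  • Adding noise to the residual updates made the system more robust and reduced instability.
  • After noise was added, total agreement rose significantly, indicating greater stability over time.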

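Robustness of the Model

  • The model is robust to changes in a digit’s shape: even if you change the shape (for example, by using thicker or thinner lines), the model can still classify it correctly.
  • This resembles biological systems such as planaria (flatworms), which can still regenerate correctly after many mutations.
  • The model was also tested on digits outside the MNIST dataset to see how it handles unseen digit shapes. It can generalize to new shapes, but it does not classify extreme changes perfectly.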

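Biological Significance of the Model

  • The model helps us understand how cells can carry out complex behaviors, such as classification, through simple rules and local interactions, much like biological processes such as regeneration.
  • These findings matter because they show how a group of cells can jointly accomplish a task (such as classification) that a single cell cannot.
  • This approach could be applied in regenerative medicine, where cells are made to work together to achieve a desired goal (such as regenerating a missing limb).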

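Summary

  • This study shows how a simple, self-organizing system of cells can carry out a classification task through local communication and adaptation to change.
  • By training cells to classify digits and adapt to mutations, the researchers show how this model can become a powerful tool for understanding complex biological processes such as tissue repair and regeneration.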