CLANet: A Comprehensive Framework for Cross-Batch Cell Line Identification Using Brightfield Images
Abstract: Cell line authentication plays a crucial role in biomedical research, ensuring that researchers work with accurately identified cells. Supervised deep learning has made notable progress in cell line identification by learning morphological features from cell images. However, batch effects, which arise because data are generated at different times, cause substantial shifts in the underlying data distribution and complicate reliable differentiation between cell lines from distinct batch cultures. To address this challenge, we introduce CLANet, a framework for cross-batch cell line identification using brightfield images, designed to tackle three distinct batch effects: variations in cell density, image quality, and incubation time. We propose a cell cluster-level selection method to efficiently capture cell density variations, and a self-supervised learning strategy to manage image quality variations, thus producing reliable patch representations. Additionally, we adopt multiple instance learning (MIL) to aggregate instance-level features for cell line identification. Our time-series segment sampling module further enhances MIL's feature-learning capabilities, mitigating biases from varying incubation times across batches. We validate CLANet using data from 32 cell lines across 93 experimental batches from the AstraZeneca Global Cell Bank. Our results show that CLANet outperforms related approaches (e.g., domain adaptation, MIL), demonstrating its effectiveness in addressing batch effects in cell line identification.
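The abstract names two mechanisms without giving details: sampling along the incubation time series, and MIL aggregation of instance-level features. The following is a minimal illustrative sketch of how such steps could look, not the CLANet implementation. It assumes a segment-style sampler that splits a time-ordered image sequence into equal segments and draws one image per segment, and a simplified attention pooling in which each instance score is tanh(w·h) followed by a softmax. The function names `sample_segments` and `attention_pool`, and the weight vector `w`, are hypothetical.

```python
import math
import random


def sample_segments(timestamps, num_segments, rng):
    """Pick one image index per time segment (assumed sampling scheme).

    The sequence is ordered by timestamp and split into contiguous
    segments of near-equal size; one index is drawn uniformly from each.
    """
    n = len(timestamps)
    if n == 0:
        return []
    # Indices of the images sorted by acquisition time.
    order = sorted(range(n), key=lambda i: timestamps[i])
    k = min(num_segments, n)  # never more segments than images
    picks = []
    for s in range(k):
        lo = s * n // k
        hi = (s + 1) * n // k  # exclusive bound; hi > lo because k <= n
        picks.append(order[rng.randrange(lo, hi)])
    return picks


def attention_pool(features, w):
    """Aggregate instance features into one bag-level vector.

    Simplified attention: score_i = tanh(w . h_i), weights = softmax(scores),
    output = weighted sum of the instance features.
    """
    scores = [math.tanh(sum(wi * hi for wi, hi in zip(w, h))) for h in features]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, features)) for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    rng = random.Random(0)
    hours = [5, 1, 3, 2, 4, 0, 7, 6]  # made-up incubation times
    idx = sample_segments(hours, 4, rng)
    bag = [[float(hours[i]), 1.0] for i in idx]  # toy 2-D instance features
    pooled, weights = attention_pool(bag, [0.1, 0.0])
    print(idx, pooled, weights)
```

Drawing one image per segment keeps early and late incubation stages represented in every bag, which is one plausible way to reduce bias from batches imaged over different time spans. In a trained model the attention parameters would be learned rather than fixed.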