Knowledge Sharing in Manufacturing using Large Language Models: User Evaluation and Model Benchmarking

Published 10 Jan 2024 in cs.HC, cs.AI, and cs.IR | (2401.05200v2)

Abstract: Recent advances in natural language processing enable more intelligent ways to support knowledge sharing in factories. In manufacturing, operating production lines has become increasingly knowledge-intensive, putting strain on a factory's capacity to train and support new operators. This paper introduces an LLM-based system designed to retrieve information from the extensive knowledge contained in factory documentation and from knowledge shared by expert operators. The system aims to answer operators' queries efficiently and to facilitate the sharing of new knowledge. We conducted a user study at a factory to assess its potential impact and adoption, eliciting several perceived benefits, namely quicker information retrieval and more efficient resolution of issues. However, the study also highlighted a preference for learning from a human expert when such an option is available. Furthermore, we benchmarked several commercial and open-source LLMs for this system. The current state-of-the-art model, GPT-4, consistently outperformed its counterparts. Open-source models trailed closely, which makes them an attractive option given their data privacy and customization benefits. In summary, this work offers preliminary insights and a system design for factories considering LLM tools for knowledge management.
