Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models

Published 20 Mar 2024 in cs.CL | (2403.13250v1)

Abstract: Pornographic content in human-machine interaction dialogues can cause severe side effects for users of open-domain dialogue systems. However, detecting pornographic language in such dialogues is an important problem that has rarely been studied. To advance in this direction, we introduce CensorChat, a dialogue monitoring dataset aimed at detecting whether a dialogue session contains pornographic content. To this end, we collect real-life human-machine interaction dialogues in the wild and break them down into single utterances and single-turn dialogues, with the last utterance spoken by the chatbot. We propose using knowledge distillation of LLMs to annotate the dataset. First, the raw dataset is annotated by four open-source LLMs, with the majority vote determining the label. Second, we use ChatGPT to fill in the labels left empty by the first step. Third, to ensure the quality of the validation and test sets, we use GPT-4 for label calibration: if the current label does not match the one generated by GPT-4, we employ a self-criticism strategy to verify its correctness. Finally, to facilitate the detection of pornographic text, we develop a series of text classifiers using the pseudo-labeled dataset. Detailed data analysis demonstrates that knowledge distillation with LLMs provides a practical and cost-efficient method for developing pornographic text detectors.
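The first two annotation steps described in the abstract (a majority vote over four LLM annotators, then a second model filling in empty labels) can be illustrated with a minimal sketch. This is not the authors' code. The names `majority_vote` and `annotate` are hypothetical, the annotators are plain callables standing in for real LLM calls, and the rule that a label stays empty unless more than half of the annotators agree is an assumption; the abstract does not say how empty labels arise.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# 1 = pornographic, 0 = not pornographic, None = empty label (no usable answer).
Label = Optional[int]
Annotator = Callable[[str], Label]


def majority_vote(votes: Sequence[Label]) -> Label:
    """Return the label chosen by more than half of the annotators, else None.

    Assumption: ties (for example 2 vs 2) and unusable answers leave the
    label empty so that a later step can fill it in.
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    label, count = counts.most_common(1)[0]
    return label if count > len(votes) / 2 else None


def annotate(text: str, annotators: Sequence[Annotator], fallback: Annotator) -> Label:
    """Step 1: vote across annotators. Step 2: ask the fallback only if empty."""
    label = majority_vote([annotator(text) for annotator in annotators])
    if label is None:
        label = fallback(text)
    return label


if __name__ == "__main__":
    # Stub annotators standing in for four open-source LLMs (hypothetical).
    stubs = [lambda t: 1, lambda t: 1, lambda t: 0, lambda t: 0]
    # The stub vote is tied, so the fallback (standing in for ChatGPT) decides.
    print(annotate("example utterance", stubs, fallback=lambda t: 0))
```

The GPT-4 calibration and self-criticism step for the validation and test sets is not covered by this sketch.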
