
Skeleton: A New Framework for Accelerating Language Models via Task Neuron Localized Prompt Tuning

Published 18 Apr 2024 in cs.CL and cs.AI (arXiv:2404.11916v2)

Abstract: Prompt tuning methods achieve performance comparable to full fine-tuning as parameter-efficient fine-tuning (PEFT) methods on various natural language understanding tasks. However, existing prompt tuning methods still run the entire model architecture even when solving a specific task, which prevents them from accelerating inference at application time. In this paper, we propose Skeleton, a novel prompt tuning framework that uses an LLM efficiently in both memory and time complexity when solving various tasks, retaining only task-relevant neurons identified with an explainability method. Within this framework, a single LLM solves multiple tasks efficiently by activating only the neurons relevant to each task and prepending the corresponding task-specific prompt tokens. Experiments show that our method significantly improves inference efficiency (up to a 1.73× speedup) on widely used benchmarks while matching the performance of standard prompt tuning. Moreover, the method applies across various transformer-based architectures, confirming its practicality and scalability.
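The pipeline the abstract describes (score neurons for a task with an explainability method, keep only the task-relevant ones, then run inference with task-specific prompt tokens prepended) can be sketched roughly as below. This is a minimal toy illustration, not the paper's implementation: the weights are random, the shapes are arbitrary, and the activation-times-sensitivity score is a simple stand-in for the paper's actual attribution method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feed-forward (FFN) sublayer: hidden -> intermediate neurons -> hidden.
d_model, d_ff = 8, 32
W_in = rng.normal(size=(d_model, d_ff))   # up-projection
W_out = rng.normal(size=(d_ff, d_model))  # down-projection

def ffn(x, w_in, w_out):
    h = np.maximum(x @ w_in, 0.0)  # one ReLU activation per intermediate neuron
    return h @ w_out

# 1) Attribution: score each intermediate neuron on task data using a crude
#    activation-times-sensitivity proxy (placeholder for an explainability method).
X = rng.normal(size=(16, d_model))       # toy task inputs
H = np.maximum(X @ W_in, 0.0)            # (16, d_ff) neuron activations
sensitivity = np.abs(W_out).sum(axis=1)  # per-neuron output sensitivity proxy
scores = (np.abs(H) * sensitivity).mean(axis=0)

# 2) Localize: keep only the top-k task-relevant neurons, shrinking the FFN.
k = 8
keep = np.argsort(scores)[-k:]
W_in_small, W_out_small = W_in[:, keep], W_out[keep, :]

# 3) Inference: prepend task-specific soft-prompt embeddings (tuned in practice,
#    random here) and run the pruned "skeleton" layer.
prompt = rng.normal(size=(4, d_model))
seq = np.vstack([prompt, X[:2]])         # 4 prompt tokens + 2 input tokens
out = ffn(seq, W_in_small, W_out_small)
print(out.shape)  # (6, 8): same hidden size, 4x fewer FFN neurons
```

The speedup in the abstract comes from step 2: the pruned layer multiplies against `d_ff = 8` columns instead of 32, while step 3 lets one shared backbone serve many tasks by swapping the prompt and the kept-neuron index set.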
