Annotated Job Ads with Named Entity Recognition

Published 18 Oct 2023 in cs.CL (arXiv:2310.11769v1)

Abstract: We have trained a named entity recognition (NER) model that screens Swedish job ads for different kinds of useful information (e.g., the skills required of a job seeker). It was obtained by fine-tuning KB-BERT. The biggest challenge we faced was the creation of a labelled dataset, which required manual annotation. This paper gives an overview of the methods we employed to make the annotation process more efficient and to ensure high-quality data. We also report on the performance of the resulting model.
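
For concreteness, below is a minimal sketch of the kind of setup the abstract describes, written with the Hugging Face transformers library (reference 10) and the public KB-BERT checkpoint (KB/bert-base-swedish-cased). This is not the authors' actual training code (the references suggest they used the nerblackbox package), and the BIO label set and example sentence are hypothetical placeholders, not the paper's entity types.

```python
# Sketch: a KB-BERT token-classification model for Swedish job ads.
# Assumptions: the label set below is illustrative, not the paper's;
# the classification head is randomly initialised and only produces
# useful predictions after fine-tuning on the manually annotated ads.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

MODEL_NAME = "KB/bert-base-swedish-cased"  # public KB-BERT checkpoint

# Hypothetical BIO tag set for job-ad entities such as required skills
labels = ["O", "B-SKILL", "I-SKILL"]
id2label = dict(enumerate(labels))
label2id = {label: i for i, label in id2label.items()}

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME, num_labels=len(labels), id2label=id2label, label2id=label2id
)

# Wrap model and tokenizer in an inference pipeline that merges
# subword predictions into whole-entity spans.
ner = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

# Swedish: "We are looking for a developer with experience in Python and SQL."
print(ner("Vi söker en utvecklare med erfarenhet av Python och SQL."))
```

In practice, the classification head would first be trained on the labelled dataset (token sequences aligned to BIO tags) before the pipeline call above returns meaningful entity spans.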

References (13)
  1. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding.
  2. Martin Malmsten, Love Börjeson, and Chris Haffenden. 2020. Playing with words at the National Library of Sweden – making a Swedish BERT.
  3. Marius Mosbach, Maksym Andriushchenko, and Dietrich Klakow. 2021. On the stability of fine-tuning BERT: Misconceptions, explanations, and strong baselines.
  4. Anna Nyqvist. 2021. Bootstrapping annotated job ads using named entity recognition and Swedish language models. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-305901.
  5. Fredrik Olsson. 2009. A literature survey of active machine learning in the context of natural language processing.
  6. Andrew Schein and Lyle Ungar. 2007. Active learning for logistic regression: An evaluation. Machine Learning, 68:235–265.
  7. Burr Settles. 2009. Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin–Madison.
  8. Felix Stollenwerk. 2021. nerblackbox: A Python package to fine-tune transformer-based language models for named entity recognition. https://github.com/flxst/nerblackbox.
  9. Felix Stollenwerk. 2022. Adaptive fine-tuning of transformer-based language models for named entity recognition.
  10. Thomas Wolf et al. 2020. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online. Association for Computational Linguistics. https://www.aclweb.org/anthology/2020.emnlp-demos.6.
  11. Michelle Yuan, Hsuan-Tien Lin, and Jordan Boyd-Graber. 2020. Cold-start active learning through self-supervised language modeling.
  12. Tianyi Zhang, Felix Wu, Arzoo Katiyar, Kilian Q. Weinberger, and Yoav Artzi. 2021. Revisiting few-sample BERT fine-tuning.
  13. Joey Öhman. 2021. Active learning for named entity recognition with Swedish language models. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-303866.