Abstract
Keyphrase extraction is important in many applications, such as search engine optimization, clustering, summarization, and sentiment analysis. Keyphrases matter because they carry semantic meaning and can serve as descriptors for documents. In this paper we compare four approaches for extracting keyphrases from Arabic documents. The first method uses the KP-Miner keyphrase extraction system. The second method uses Arabic natural language processing tools (a stemmer and a part-of-speech tagger) to filter candidate patterns, which are then weighted by the term frequency-inverse document frequency (TF-IDF) algorithm. The third method uses Google's Word2Vec library to weight the resulting patterns by measuring the similarity between each candidate pattern and the document title. The fourth method combines the weightings produced by the second and third methods.
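The weighting ideas named in the abstract (TF-IDF, similarity to the title, and a combination of the two) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the smoothed IDF formula, the averaging of word vectors, and the linear combination with weight `alpha` are all assumptions made for illustration, and the tiny two-dimensional vectors stand in for embeddings that would normally come from a trained Word2Vec model. Tokens are assumed to be already stemmed and filtered by part of speech.

```python
import math


def count_phrase(phrase, tokens):
    """Count occurrences of a candidate phrase (tuple of tokens) in a token list."""
    n = len(phrase)
    return sum(
        1 for i in range(len(tokens) - n + 1) if tuple(tokens[i:i + n]) == phrase
    )


def tf_idf(phrase, doc_tokens, corpus):
    """TF-IDF weight of a candidate phrase (assumed smoothed-IDF variant)."""
    if not doc_tokens:
        return 0.0
    tf = count_phrase(phrase, doc_tokens) / len(doc_tokens)
    df = sum(1 for doc in corpus if count_phrase(phrase, doc) > 0)
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1
    return tf * idf


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(tokens, vectors):
    """Average the vectors of the tokens that have an embedding; None if none do."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def title_similarity(phrase, title_tokens, vectors):
    """Similarity between a candidate phrase and the title (assumed: mean vectors)."""
    p = mean_vector(phrase, vectors)
    t = mean_vector(title_tokens, vectors)
    if p is None or t is None:
        return 0.0
    return cosine(p, t)


def combined_score(tfidf_weight, similarity, alpha=0.5):
    """Hypothetical combination: linear interpolation of the two weights."""
    return alpha * tfidf_weight + (1 - alpha) * similarity


# Toy data (English placeholders; hypothetical 2-d embeddings).
corpus = [
    ["keyphrase", "extraction", "from", "arabic", "documents"],
    ["sentiment", "analysis", "of", "arabic", "reviews"],
]
title = ["arabic", "keyphrase", "extraction"]
vectors = {
    "keyphrase": [1.0, 0.0],
    "extraction": [0.9, 0.1],
    "arabic": [0.5, 0.5],
    "sentiment": [0.0, 1.0],
}

candidate = ("keyphrase", "extraction")
weight = tf_idf(candidate, corpus[0], corpus)
similarity = title_similarity(candidate, title, vectors)
print(candidate, combined_score(weight, similarity))
```

Ranking candidates by `combined_score` and keeping the top few would give one possible keyphrase list; the choice of `alpha` is a free parameter in this sketch and is not taken from the paper.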
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 1st International Conference on Arabic Computational Linguistics |
Subtitle of host publication | Advances in Arabic Computational Linguistics, ACLing 2015 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 133-137 |
Number of pages | 5 |
ISBN (Electronic) | 9781467391559 |
State | Published - Feb 29 2016 |
Event | 1st International Conference on Arabic Computational Linguistics, ACLing 2015 - Cairo, Egypt; Duration: Apr 17 2015 → Apr 20 2015 |
Keywords
- Keyphrases extraction
- POS tagging
- Stemming
ASJC Scopus subject areas
- Computer Science Applications
- Signal Processing
- Linguistics and Language