Amber Baig, Mutee U Rahman, Sehrish Abrejo, Khalid H Mohamadani, Ahsanullah Baloch
Twitter, a social media platform, has experienced substantial growth over the last few years. As a result, a huge number of tweets from various communities is available and is used for NLP applications such as opinion mining, information extraction, and sentiment analysis. One of the key pre-processing steps in such applications is part-of-speech (POS) tagging. POS tagging of Twitter data (also called noisy text) differs from conventional POS tagging because of the informal nature of the text and the presence of Twitter-specific elements. Resources for POS tagging of tweet-specific data are mostly available for English. However, the availability of tagset- and language-independent statistical taggers gives resource-poor languages such as Urdu an opportunity to extend the coverage of NLP tools to this new domain of POS tagging, for which little effort has been reported.
Amber Baig, Mutee U Rahman, Sehrish Abrejo, Khalid H Mohamadani, Ahsanullah Baloch (2021); Performance Comparison of Bootstrapped Statistical Taggers on Urdu Tweets; International Journal of Scientific and Research Publications (IJSRP) 11(7) (ISSN: 2250-3153), DOI: http://dx.doi.org/10.29322/IJSRP.11.07.2021.p11559