Hybrid Hashtag Sub-Corpus


Download the HH Sub-Corpus

The Hybrid Hashtag Sub-Corpus (HH Sub-Corpus) is a subset of tweets in the MLT corpus containing hashtags made up of Māori and English words (so-called "hybrid hashtags"). There are 81 hybrid hashtags in this dataset, used in 5,684 tweets posted to Twitter by 3,771 distinct users.