Gathering a Dataset of Nepali Chat Messages Using Roman Characters to Express Nepali

I am trying to gather a dataset of Nepali chat messages written in Roman characters (Romanized Nepali), such as ‘Ke cha’ or ‘k xa’ instead of ‘के छ’. Every Nepali speaker has their own way of spelling these words. I am currently using my own chat history as the source, but it is very limited. I thought a public Discord or WhatsApp group where many people are talking might give me a large enough dataset.

So, my question is: what would be a good Discord or WhatsApp group where many Nepali people of different age groups talk mostly in Nepali, but written in Roman script?


View on r/Nepal by Pipalbot