Automatic image caption generation is a major research area in deep learning, yet no current publication focuses on caption generation in Urdu. To fill this gap, we developed an attention-based deep learning model for caption generation in Urdu. We prepared a dataset by translating the captions of Flickr8k, a standard benchmark dataset for image captioning, into Urdu. We evaluated the proposed technique on this dataset; it generates captions with an average BLEU score of 0.83.
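The reported score does not specify the n-gram order or the implementation used. As a rough illustration only, the sketch below computes a sentence-level BLEU score with clipped n-gram precision and a brevity penalty. It assumes whitespace tokenization and uniform n-gram weights; the function name `sentence_bleu`, the `max_n` parameter, and the candidate caption are hypothetical and not taken from the project notebooks.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_bleu(references, candidate, max_n=1):
    """BLEU for one tokenized candidate against several tokenized references.

    Assumption: uniform weights over n-gram orders 1..max_n, no smoothing.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        if not cand_counts:
            return 0.0
        # Clip each n-gram count by its highest count in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / sum(cand_counts.values())))
    # Brevity penalty: use the reference length closest to the candidate length.
    c_len = len(candidate)
    r_len = min((len(r) for r in references),
                key=lambda ref_len: (abs(ref_len - c_len), ref_len))
    bp = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)
    return bp * math.exp(sum(log_precisions) / max_n)


# Reference caption taken from the samples in this README;
# the candidate is a made-up model output for illustration.
reference = "ایک آدمی پانی کے پاس گھاٹ پر بیٹھا ہوا پڑھ رہا ہے".split()
candidate = "ایک آدمی گھاٹ پر بیٹھا ہے".split()
score = sentence_bleu([reference], candidate, max_n=1)
print(f"BLEU-1: {score:.3f}")
```

An average score over a test set would be the mean of this value across all generated captions, each compared against its reference captions. A library implementation such as NLTK's `nltk.translate.bleu_score` is likely a better choice in practice.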
The Flickr8k dataset was used. It includes 5 captions per image, which can be used to train a caption generation model.
We then translated these captions into Urdu and used the Urdu captions for model training.
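To make the data layout concrete, here is a minimal sketch of grouping translated captions by image. It assumes the Urdu captions are stored in the same line format as the original Flickr8k caption file (`<image>#<index><TAB><caption>`); that format for the translated file, the function name `group_captions`, and the image file names are assumptions for illustration.

```python
from collections import defaultdict


def group_captions(lines):
    """Map each image file name to the list of its captions.

    Assumed line format: "<image>#<index>\t<caption>".
    """
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, caption = line.split("\t", 1)
        image_id = key.split("#", 1)[0]  # drop the "#<index>" suffix
        captions[image_id].append(caption.strip())
    return dict(captions)


# Hypothetical file names paired with sample captions from this README.
sample_lines = [
    "image_001.jpg#0\tایک آدمی عوامی ٹرانسپورٹ پر اپنی گود میں بیگ لے کر سو رہا ہے",
    "image_002.jpg#0\tایک آدمی تیز پانی میں واٹر بورڈنگ کر رہا ہے",
    "image_002.jpg#1\tایک آدمی پانی کے پاس گھاٹ پر بیٹھا ہوا پڑھ رہا ہے",
]

grouped = group_captions(sample_lines)
for image_id, image_captions in grouped.items():
    print(image_id, len(image_captions))
```

With the full dataset, each image would map to a list of 5 Urdu captions, which can then be tokenized and paired with image features for training.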
All the notebooks used for this project can be found at the GitHub link below.
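ایک آدمی عوامی ٹرانسپورٹ پر اپنی گود میں بیگ لے کر سو رہا ہے (A man is sleeping on public transport with a bag in his lap.)
ایک آدمی تیز پانی میں واٹر بورڈنگ کر رہا ہے (A man is water boarding in fast-moving water.)
ایک آدمی پانی کے پاس گھاٹ پر بیٹھا ہوا پڑھ رہا ہے (A man is sitting on a pier near the water, reading.)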
MSDS-18045
MSDS-19042