Your goal is to predict whether a meme is hateful or non-hateful. This is a binary classification problem with multimodal input data consisting of the meme image itself (the image mode) and a string representing the text in the meme image (the text mode).
Given a meme id, meme image file, and a string representing the text in the meme image, your trained model should output the probability that the meme is hateful.