OK, we’ve been playing around with CaptionBot since it came out last week, and it looks like Microsoft purposefully left out recognition for gorillas, apes, and monkeys. The apparent goal is to avoid a repeat of the Google Photos fiasco, in which Google mislabelled black people as gorillas.
Take a look at the evidence:
Seriously, a black hat, Microsoft? Two giraffes near a tree? A cat wearing a tie??? We get it – image recognition is hard. No one wants to pull a Google Photos. But does that make it OK to forgo teaching your model an entire concept?
Look at Clarifai’s image recognition results for the exact same images:
Clarifai demonstrates that teaching visual recognition to be smart IS possible. Instead of omitting concepts that are difficult to teach computers, let’s find ways to make our technology smarter!