5 Comments
Seby

That is interesting data. I ran into this issue months ago when I was having 4o create an image for an article that included two dials. For one of the dials, the image showed the pointer in the wrong “zone”. I tried several times to correct it, even using different models, but the issue persisted. Of course, now I know that it was a failure of the image translation layer, so swapping models wouldn’t make a difference. I eventually gave up. The dial on the left is supposed to point to “low”.

I’m not sure why the translation layer has such a hard time with dials, but until that is fixed, I guess it’s better to use AI only with digital or text readouts for such things.

T.D. Inoue

That’s a cool example. I’ve taken to using subregion editing for things like that. AIs have a hard time doing two things at once.

Seby

I thought I had attached the image, but oh well. Yes, I thought about trying that, but I don't have a lot of experience creating AI-prompted art. The AI art I have is mostly autogenic. LOL

The Logosmitten

This study seems to assume the models were trained on this specific kind of analog perception. They probably weren't, to any great degree; if they had been, they would get it right. This in no way proves you wrong. It actually shows that if you were to extrapolate this across a wide range of similar concepts, AI would fail to replace humans. Reading gauges is important, but reading analogous concepts that were not the focus of training is why I have no interest in using agents.

T.D. Inoue

Totally agree.

The primary purpose of the study was to characterize some of the underlying capabilities of the vision model, much as I did with my color perception studies. As you noted, it should serve as a cautionary tale on a couple of dimensions. The take-home message is that we can’t take it for granted that AIs are competent at tasks without testing them thoroughly in those domains (your point).