Hacker News

But how do you know you're getting the correct picture from that throwaway UI? A little while back a blog post was shared here in which the author praised AI for his vibe-coded earth-viewer app, which supposedly used Vulkan to render inside a GUI window. Unfortunately, that wasn't the case: the AI had just copied code from somewhere and inserted a rudimentary software renderer. The AI couldn't do what was asked because it had seldom been done. Nobody on the internet ever discussed that particular objective, so it wasn't in the training set.

The lesson to learn is that these are "large language models." That means they can regurgitate, textually, what someone else has done before, but they can't actually create something novel. So it's fine if someone on the internet has posted or talked about a quick UI in whatever particular toolkit you're using to analyze data. But they'll throw out BS if you ask for something brand new. I suspect a lot of AI users are web developers who write a lot of repetitive rote boilerplate, and that's the kind of thing these LLMs really thrive on.





> But how do you know you're getting the correct picture from that throwaway UI?

You get the AI to generate code that lets you spot-check individual data points :-)

Most of my work these days is in fact that kind of code. I'm working on something research-y that requires a lot of visualization, and at this point I've actually produced more throwaway code than code in the project.

Here's an example: I had ChatGPT generate some relatively straightforward but cumbersome geometric code. That saved me 30-60 minutes right there, but to be sure, I had it generate tests, which all passed. Another 30 minutes saved.

I reviewed the code and the tests and felt it needed more edge cases, which I added manually. However, these started failing and it was really cumbersome to make sense of a bunch of coordinates in arrays.

So I had it generate code to visualize my test cases! That instantly showed me that some assertions in my manually added edge cases were incorrect, which became a quick fix.
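To give a rough idea of what such a throwaway visualizer can look like, here's a minimal sketch. It is not the code from my project: the point-in-polygon check, the names `point_in_polygon` and `render_case`, and the ASCII output are all assumptions made up for illustration. It draws one test case as text and marks the test point with `O` when the expected value matches what the code computes, or `X` when it doesn't.

```python
def point_in_polygon(pt, poly):
    """Ray-casting test. Points exactly on an edge get no special handling."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def render_case(poly, pt, expected, width=21, height=11):
    """Return a text picture of one test case (hypothetical helper).

    '.' = cell the check considers inside, '#' = polygon vertex,
    'O' = test point where expected matches actual, 'X' = mismatch.
    """
    xs = [p[0] for p in poly] + [pt[0]]
    ys = [p[1] for p in poly] + [pt[1]]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    span_x = (max_x - min_x) or 1  # avoid dividing by zero for degenerate input
    span_y = (max_y - min_y) or 1

    def to_cell(x, y):
        col = round((x - min_x) / span_x * (width - 1))
        row = round((max_y - y) / span_y * (height - 1))  # y grows upward
        return row, col

    grid = [[" "] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            x = min_x + col / (width - 1) * span_x
            y = max_y - row / (height - 1) * span_y
            if point_in_polygon((x, y), poly):
                grid[row][col] = "."
    for vx, vy in poly:
        r, c = to_cell(vx, vy)
        grid[r][c] = "#"

    actual = point_in_polygon(pt, poly)
    r, c = to_cell(*pt)
    grid[r][c] = "O" if actual == expected else "X"
    header = f"expected={expected} actual={actual}"
    return header + "\n" + "\n".join("".join(line) for line in grid)


if __name__ == "__main__":
    square = [(0, 0), (4, 0), (4, 4), (0, 4)]
    print(render_case(square, (2, 2), expected=True))
    # A deliberately wrong expectation, like a bad hand-written assertion:
    print(render_case(square, (5, 2), expected=True))
```

The second call is the interesting one: the `X` sits visibly outside the shape, so you can tell at a glance that the assertion is wrong, not the code under test. With real data you'd probably reach for a plotting library instead of ASCII, but the idea is the same.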

The answer to "how do you trust AI" is human in the loop... AND MOAR AI!!! ;-)



