
> why? they trained on my code without my consent, why is user data any different?

It's different. Code and other content that you shared online - no matter what license you shared it under - is still fundamentally different from user data that you never shared anywhere at all.

That difference is really important, to me at least.

I would be absolutely furious if I found that someone had trained an AI model on my private data in Dropbox. I personally have no problem at all with someone training an AI model on content I have posted to my blog.



A huge problem in the art community right now is that if you're a professional artist, you basically need to maintain a public portfolio and social media presence to find work. That ecosystem developed before image generation AI was a commercial thing, yet artists are finding out after the fact that their work got pulled into these training sets without their knowledge or consent. Even if it's legal, it's still pretty gross imo.


The two things aren't the same, and one is more egregious than the other. But I do think that hoovering up all that web data to train models was way over the line. Worse, there's nothing I can do to prevent it. That's why I felt forced to remove my websites from the public web. I can't think of any other way to defend myself from these entities.



