
If str is UTF-8, as it most likely is nowadays outside of certain OS APIs, then that code is fine. You can't grab an arbitrary character from the str, move it around, and expect it to retain its meaning. But you couldn't necessarily do that before either, since you could be breaking up a word, for example. To handle Unicode in C while pretending it's ASCII, you look for ASCII characters you recognize, such as punctuation, and split the string on those. This is safe because every byte of a multi-byte UTF-8 sequence has its high bit set, so an ASCII byte never appears inside another character. You then treat every other substring as a black box of characters that can only be moved around as a unit.
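For illustration, here is a rough sketch of that approach, assuming the input is valid NUL-terminated UTF-8. The names `struct token` and `split_on_ascii` are made up for this example and don't come from any library; only `strspn` and `strcspn` are standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: one opaque run of bytes between ASCII delimiters. */
struct token {
    const char *start; /* points into the original string, not a copy */
    size_t len;        /* length in bytes, not in code points */
};

/*
 * Hypothetical helper: split `s` on any byte found in `delims`, which is
 * assumed to contain only ASCII characters. Stores up to `max` tokens in
 * `out` and returns how many were stored.
 *
 * Every byte of a multi-byte UTF-8 sequence is >= 0x80, so it can never
 * match an ASCII delimiter. Tokens therefore always begin and end on
 * character boundaries, even though the code only ever looks at bytes.
 */
size_t split_on_ascii(const char *s, const char *delims,
                      struct token *out, size_t max)
{
    assert(s != NULL && delims != NULL && out != NULL);

    size_t n = 0;
    while (*s != '\0' && n < max) {
        s += strspn(s, delims);          /* skip a run of delimiters */
        size_t len = strcspn(s, delims); /* bytes until next delimiter */
        if (len == 0)
            break;                       /* only delimiters were left */
        out[n].start = s;
        out[n].len = len;
        n++;
        s += len;
    }
    return n;
}

/*
 * Example use:
 *
 *     struct token toks[8];
 *     size_t n = split_on_ascii("na\xc3\xafve, caf\xc3\xa9", " ,;", toks, 8);
 *     // toks[i] can now be reordered or copied whole, but never cut in
 *     // the middle, since a cut could land inside a multi-byte sequence.
 */
```

The tokens are only ever handled whole, so the code never needs to know where one code point ends and the next begins.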

