Most likely, and probably inferring the structure on texts with "similar" writing forms. Tried with my handwriting (in Italian) and the performance wasn't that stellar.
More annoyingly, it is still an LLM and not a "pure" OCR, so some sentences were partially rephrased with different words than the ones in the text.
This is a serious problem if these models are going to be used to transcribe historical documents.
> Tried with my handwriting (in Italian) and the performance wasn't that stellar.
Same here, for diaries/journals written in mixed Swedish/English/Spanish and with absolutely terrible handwriting.
I'd love to see the day when the writing is on the wall for handwriting recognition, which is something I bet on when I started my journals, but it seems that day has yet to come. I'm eager to get there, though, so I can archive all of it!
When does a character model become a language model?
If you're looking at block text with no connections between letter forms, each character mostly stands on its own. Except capital letters are much more likely at the beginning of a word or sentence than elsewhere, so you probably get a performance boost if you incorporate that.
Now we're considering two-character chunks. Cursive script connects the letterforms, and the connection changes based on both the source and target. We can definitely get a performance boost from looking at those.
Hmm, you know, these two-letter groupings aren't random. "ng" is much more likely if we just saw an "i". Maybe we need to take that into account.
Hmm, actually, whole words are related to each other! I can make a pretty good guess at what word that four-letter-wide smudge is if I can figure out the words before and after...
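To make that slide from character model to language model concrete, here is a minimal sketch of the two-letter step. Everything in it is an assumption for illustration: the tiny corpus, the `rerank` helper, and the made-up recognizer scores for the smudged letter. It is not how any real OCR or LLM pipeline is known to work, and it conditions on one previous letter only, so it captures "g is likely after n" rather than the full "ng after i" case.

```python
from collections import Counter
import math

# Tiny made-up corpus; a real system would count over far more text.
CORPUS = "the king is singing and bringing a ring to the thing"

def train_bigrams(text):
    """Count adjacent character pairs and how often each character starts a pair."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return pairs, firsts

def bigram_logprob(prev, cur, pairs, firsts, alphabet_size=27):
    """Add-one smoothed log P(cur | prev); alphabet_size is an assumed a-z plus space."""
    return math.log((pairs[(prev, cur)] + 1) / (firsts[prev] + alphabet_size))

def rerank(prev, candidates, pairs, firsts):
    """Pick the candidate with the best combined shape score and bigram score.

    candidates maps each possible character to the recognizer's probability.
    """
    scored = {
        ch: math.log(p) + bigram_logprob(prev, ch, pairs, firsts)
        for ch, p in candidates.items()
    }
    return max(scored, key=scored.get)

pairs, firsts = train_bigrams(CORPUS)

# Hypothetical recognizer output for a smudged letter that follows "n":
# judged on shape alone, "q" narrowly beats "g".
smudge = {"q": 0.55, "g": 0.45}

shape_only = max(smudge, key=smudge.get)
with_context = rerank("n", smudge, pairs, firsts)
print(shape_only, with_context)
```

On shape alone the smudge reads as "q"; once the preceding "n" is taken into account it reads as "g". Swap the counts over letters for counts over words and you have the next step in the comment above, which is also where the "rephrasing" complaint starts to become possible.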
I call out the Lindy effect. Handwriting survived printed characters, typewriters, and the last 50-70 years of computers and keyboards; it will survive this too.
It is utter disrespect to use the term "biology" in relation to LLMs. No one would call the analysis of a mechanical engine "car biology".
It's an artificial system, call it system analysis.
The analogy stems from the notion that neural nets are "grown" rather than "engineered". Chris Olah has an old, but good post with some specific examples: https://colah.github.io/notes/bio-analogies/
"not designed by humans"? Since when? Unless you count cortical organoids/wetware (grown in some instrumented petri dish), every artificial neural network, no matter how complicated, is designed by humans, with equations and rules designed by humans. Backpropagation, optimization algorithms, genetic selection, etc. are all designed by humans.
There is no biology here, and there are so many other words that describe perfectly what they are doing here, without twisting the meaning of another word.
Still designed by humans. The loss function, backpropagation and all other mechanisms didn't just appear magically in the neural network. Someone decided which loss function to use, which architecture or which optimization techniques.
Just because it takes a big GPU and a lot of number crunching to assign those weights doesn't mean it's biological.
In the same way, a weather forecast model using a lot of complicated differential equations is not biological.
A finite element model analyzing some complicated electromagnetic field, or the aerodynamics of a car is not biological.
Just because someone around 70-75 years ago called them 'perceptrons' or 'neurons' instead of thingamajigs does not make them biology.
"Still designed by humans." No they are not. They are learned via backpropagation. This is the entire reason why neural networks work so well and why we have no idea how they work when they get big.
And who designed backpropagation? It is not a magical property of artificial neurons or some law of nature or god's miracle. A bunch of mathematicians banged their heads on the problem of backpropagation, tossed it to a computer, and voilà, neural networks made sense.
Neural networks work so well because someone chooses the right loss function for the right problem. Wrong loss function -> wrong results. It's not magic. Nor is it biology.
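For anyone following the "designed" versus "learned" back-and-forth, a minimal sketch may help separate the two. It is plain gradient descent on a one-parameter-pair linear model, not a full backpropagation implementation, and the data, learning rate, and step count are all assumed for illustration. The model form, the loss, and the hyperparameters are picked by a person; the values of `w` and `b` are never written down by anyone and only come out of the optimization.

```python
# Human-chosen pieces: the model form (a line), the loss (mean squared error),
# the learning rate, and the number of steps.
def mse_gradients(w, b, xs, ys):
    """Gradients of mean squared error for the model y = w*x + b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def fit(xs, ys, lr=0.05, steps=2000):
    """Run gradient descent from zero-initialised parameters."""
    w, b = 0.0, 0.0  # the learned pieces start with no useful information
    for _ in range(steps):
        dw, db = mse_gradients(w, b, xs, ys)
        w -= lr * dw
        b -= lr * db
    return w, b

# Assumed toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

w, b = fit(xs, ys)
print(round(w, 4), round(b, 4))
```

Both sides of the argument show up here: a human chose every line of the procedure, and yet the final numbers are found rather than specified. With two parameters you can read them off; with billions, that is the part people struggle to inspect.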
Sure, but it makes no sense at all if you define biology as “the smell of a freshly opened can of tennis balls.” The original comment is probably better understood using a standard definition of the words it used, rather than either of our definitions.
I have a framework: don't use it; if you've never used it, don't start using it; publicly shame people; stop talking about it.
Slow down. Think long and deep about your problems. Write less code.
"This will be a big business"
No. It shouldn't be a "business", it should be laws that are enforced fast, education, public shaming of companies putting poison in their products.
Volatile organic compounds in paint have been known to be poisonous since the 17th century (see Bernardino Ramazzini's works). Just listen to the goddamn scientists for once.
You can't solve a problem caused by capitalist corner-cutting with more capitalism.
Damage from lead. The 'obsession' here is that the right level of lead in any product should be ZERO.
There should be international pacts, like what was done for gases destroying the ozone layer, to remove lead entirely from products.
The error was to buy a second one after "the first one was just poor manufacturing".
I have never seen car companies improve manufacturing quality over time.
After my Nissan started to have transmission problems that would cost thousands of dollars to fix (among various other small issues), I sold it as quickly as possible and swore I'd never touch the make again.
Subaru burned me on this. I bought my wife an Outback. It started to have transmission issues, with a full transmission failure at about 145k miles. This came after a life of small problems here and there that didn't really impact performance.
It was a known issue between 125 and 150k miles. Subaru's solution was to extend the warranty to 100k, as if that did anything at all.
We got rid of the broken one, and the one that I drove as well. I'll never go back. I loved those cars, but that's so shady.
There are distros slightly optimized for games (e.g. Garuda is based on Arch) and the support from Valve and Proton is quite good at the moment.
Problems appear only with game launchers that insist on being Windows-only (anything from EA and Ubisoft, the Epic Games Launcher, etc.).