That’s built on a dataset and paper I wrote called CommonForms, where I scraped CommonCrawl for hundreds of thousands of fillable form pages and used that as a training set:
Next step is training and releasing some DETRs, which I think will drive quality even higher. But the ultimate end goal is working on automatic form accessibility.
That’s built on a dataset and paper I wrote called CommonForms, where I scraped CommonCrawl for hundreds of thousands of fillable form pages and used that as a training set:
https://arxiv.org/abs/2509.16506
Next step is training and releasing some DETRs, which I think will drive quality even higher. But the ultimate end goal is working on automatic form accessibility.