As noted, recent changes to OpenBSD TCP handling[1] may improve performance.
On a 4 core machine I see between 12% to 22% improvement with 10
parallel TCP streams. When testing only with a single TCP stream,
throughput increases between 38% to 100%.
I'm not sure that directly translates to better pf performance, and four cores is hardly remarkable these days but might be typical on a small low-power router?
Would be interesting if someone had a recent benchmark comparison of OpenBSD 7.8 PF vs. FreeBSD's latest.
That particular change improves throughput received locally. Though over the past few years there's been a ton of work on unlocking the network layer generally to support more parallelism.
For a firewall I guess the critical question is the degree of parallelism supported by OpenBSD's PF stack, especially as it relates to common features like connection statefulness, NAT, etc.
On a 4 core machine I see between 12% to 22% improvement with 10 parallel TCP streams. When testing only with a single TCP stream, throughput increases between 38% to 100%.
I'm not sure that directly translates to better pf performance, and four cores is hardly remarkable these days but might be typical on a small low-power router?
Would be interesting if someone had a recent benchmark comparison of OpenBSD 7.8 PF vs. FreeBSD's latest.
[1] https://undeadly.org/cgi?action=article;sid=20250508122430