>SQLITE_DEFAULT_SYNCHRONOUS=<0-3> This macro determines the default value of the PRAGMA synchronous setting. If not overridden at compile-time, the default setting is 2 (FULL).
>SQLITE_DEFAULT_WAL_SYNCHRONOUS=<0-3> This macro determines the default value of the PRAGMA synchronous setting for database files that open in WAL mode. If not overridden at compile-time, this value is the same as SQLITE_DEFAULT_SYNCHRONOUS.
Many wrappers for sqlite take this advice and change the default, but the default is FULL.
Which is also quite nasty. I want my databases to be fully durable by default, and not lose anything once they have acknowledged a transaction. The typical example for ACID DBs is bank transactions; imagine a bank accidentally undoing a transaction upon server crash, after already having acknowledged it to a third party over the network.
> EXTRA synchronous is like FULL with the addition that the directory containing a rollback journal is synced after that journal is unlinked to commit a transaction in DELETE mode
So it only has an effect in DELETE mode; WAL mode doesn't use a rollback journal.
That said, the documentation about this is pretty confusing.
Yes, I'm talking about the fact that sqlite in its default (journal_mode = DELETE) is not durable.
Which in my opinion is worse than whatever may apply to WAL mode, because WAL is something a user needs to explicitly enable.
If it is true as stated, then I also don't find it very confusing, but I would definitely appreciate it if the docs were more explicit, replacing "will not corrupt the database" with "will not corrupt the database (but may still lose committed transactions on power loss)", and I certainly find that a very bad default.
> sqlite in its default (journal_mode = DELETE) is not durable.
Not true. In its default configuration, SQLite is durable.
If you switch to WAL mode, the default behavior is that transactions are durable across application crashes (or SIGKILL or similar) but are not necessarily durable across OS crashes or power failures. Transactions are atomic across OS crashes and power failures. But if you commit a transaction in WAL mode and take a power loss shortly thereafter, the transaction might be rolled back after power is restored.
This behavior is what most applications want. You'll never get a corrupt database, even on a power loss or similar. You might lose a transaction that happened within the past second or so. So if your cat trips over the power cord a few milliseconds after you set a bookmark in Chrome, that bookmark might not be there after you reboot. Most people don't care. Most people would rather have the extra day-to-day performance and reduced SSD wear. But if you have some application where preserving the last moment of work is vital, then SQLite provides that option, at run-time, or at compile-time.
When WAL mode was originally introduced, it was guaranteed durable by default, just like DELETE mode. But people complained that they would rather have increased performance and didn't really care if a recent transaction rolled back after a power-loss, just as long as the database didn't go corrupt. So we changed the default. I'm sorry if that choice offends you. You can easily restore the original behavior at compile-time if you prefer.
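(For anyone looking for the knobs being described: per the compile.html quotes at the top of the thread, building with -DSQLITE_DEFAULT_WAL_SYNCHRONOUS=2 keeps FULL as the WAL default, or you can set it per connection at run time. A minimal sketch using the C API is below; the database file name is made up, and the C API is just one way to issue the pragmas.)

    /* Hypothetical example: open a database, switch it to WAL, and keep
     * synchronous=FULL so a commit does not return until the WAL is fsync'd. */
    #include <sqlite3.h>
    #include <stdio.h>

    int main(void) {
        sqlite3 *db;
        if (sqlite3_open("app.db", &db) != SQLITE_OK) {
            fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
            return 1;
        }
        char *err = NULL;
        if (sqlite3_exec(db,
                "PRAGMA journal_mode=WAL;"
                "PRAGMA synchronous=FULL;",
                NULL, NULL, &err) != SQLITE_OK) {
            fprintf(stderr, "pragma failed: %s\n", err);
            sqlite3_free(err);
        }
        sqlite3_close(db);
        return 0;
    }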
>If you switch to WAL mode, the default behavior is that transactions are durable across application crashes (or SIGKILL or similar) but are not necessarily durable across OS crashes or power failures. Transactions are atomic across OS crashes and power failures. But if you commit a transaction in WAL mode and take a power loss shortly thereafter, the transaction might be rolled back after power is restored.
How is this behavior reconciled with the documentation cited in my comment above? Are the docs just out of date?
> Not true. In its default configuration, SQLite is durable.
Could you explain why this is?
I have quoted above the documentation that suggests it's not durable (summarising: DELETE is the default, and DELETE appears durable only under EXTRA synchronous, which is not the default).
For DELETE (rollback) mode, and given the way fsync works, FULL should not lose multiple committed transactions, but it might lose the single most recently committed one.
Because DELETE depends on deleting a file (not modifying file contents), it depends very much on the specific file system's (not SQLite's) journaling behavior.
I don't see how the journal delete depends any more on a specific file system's behaviour than the main data write: if a specific file system can decide to automatically fsync unlink(), it can equally decide to automatically fsync write().
In either case it is clear that (on Linux), if you want guarantees, you need to fsync (the file and the dir respectively).
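(To make the distinction concrete, here is a minimal sketch of those two fsyncs on Linux, in C. The /tmp/demo path is made up and assumed to exist: file contents need an fsync on the file itself, while making an unlink durable needs an fsync on the containing directory.)

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Durable file contents: write, then fsync the file descriptor. */
        int fd = open("/tmp/demo/journal", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open file"); return 1; }
        if (write(fd, "x", 1) != 1) perror("write");
        if (fsync(fd) != 0) perror("fsync file");
        close(fd);

        /* Durable removal: after unlink()ing the journal, fsync the directory
         * so the deletion of the entry itself survives a power loss. */
        if (unlink("/tmp/demo/journal") != 0) perror("unlink");
        int dfd = open("/tmp/demo", O_RDONLY | O_DIRECTORY);
        if (dfd < 0) { perror("open dir"); return 1; }
        if (fsync(dfd) != 0) perror("fsync dir");
        close(dfd);
        return 0;
    }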
> The typical example for ACID DBs are bank transactions; imagine a bank accidentally undoing a transaction upon server crash
That's why they don't try to do it that way! But it's still an informative way to think about it.
Also, while we're discussing defaults, your ACID DB is probably running at READ COMMITTED by default, meaning that your bank transactions can vanish or create money:
* You read accounts A and B ($30) in order to move $5 between them. The new balance for B should be $35. Just before you write the $35, someone else's transaction sets B to $100. Your transaction proceeds and blindly sets it to $35 anyway, silently discarding the other update.
But to your overall point, I'm also frustrated that these systems aren't as safe as they look on the box.
It isn’t worth it. Mostly financial transactions are done via append only ledgers, not updating; two-phase auth and then capture; a settlement process to actually move money; and a reconciliation process to check all the accounts and totals. Even without DB corruption they have enough problems (fraud and buggy code) with creating money and having to go back and ask people to return money or to give them more money, so they have those systems in place anyway.
> Mostly financial transactions are done via append only ledgers, not updating;
Well, financial institutions will act as you describe, I presume, but lowly web shops will update 'shopping cart' and 'inventory' using the default settings of whatever DBMS the system came with.
Which is reasonable, I guess, except that even with modern hardware, updating the same record in an ACID database has surprisingly low capacity in terms of txn/second; if you have some popular item in your inventory (or a popular merchant in your 2-sided shopping thing), you'll be forced to create multiple receiving accounts or other ugly stuff if you do balance-based transactionality.
Yes. Most folks don't seem to understand this. But you can get something approaching such guarantees if you are able to limit yourself to something as (seemingly) simple as updating a ledger. This approach is used in a lot of places where high performance and strong consistency are needed (see e.g. the LMAX disruptor for something similar).
https://tigerbeetle.com/
SERIALIZABLE isn't zero-surprise, since applications must be prepared to retry transactions under that isolation level. There is no silver bullet here.
A CRUD architecture with proper ACID is an OK contender against other possible architectures. Personally I always go for event-sourcing (a.k.a. WAL per the article's title).
But a CRUD that doesn't do ACID properly is crap. And since the people making these decisions don't understand event-sourcing or that they're not being protected by ACID, CRUD gets chosen every time.
The DB also won't be set to SERIALIZABLE because it's too slow.
>EXTRA synchronous is like FULL with the addition that the directory containing a rollback journal is synced after that journal is unlinked to commit a transaction in DELETE mode. EXTRA provides additional durability if the commit is followed closely by a power loss.
It depends on your filesystem whether this is necessary. In any case I'm pretty sure it's not relevant for WAL mode.
No corruption doesn't imply no data loss. Reverting to an earlier, consistent, state is in some situations acceptable (think Unix fsck), in others one might depend on committed transactions to have indeed been recorded as such.
I'd think SQLite isn't typically used for bank applications, but rather to keep your web browser's bookmarks and such.
This makes me so sick. For years and years I’ve gotten the vibe from SQLite that it never took being a reliable database seriously, but I bought into the hype for the past several years that it was finally a great DB for using in production, and then this. I swear. Sure, change the default config for now and make it actually behave in a sane way so that it doesn’t lose your data. But later- use a real database.
The one which avinassh shows is macOS's SQLite under /usr/bin/sqlite3. In general it also has some other weird settings, like not having the concat() function, last I checked.
Another oddity: mysteriously reserving 12 bytes per page for whatever reason, making databases created with it forever incompatible with the checksum VFS.
Other: having 3 different layers of fsync to avoid actually doing any F_FULLFSYNC ever, even when you ask it for a fullfsync (read up on F_BARRIERFSYNC).
You also can't load extensions with `.load` (presumably security but a pain in the arse.)
user ~ $ echo | /opt/homebrew/opt/sqlite3/bin/sqlite3 '.load'
[2025-08-25T09:27:54Z INFO sqlite_zstd::create_extension] [sqlite-zstd] initialized
user ~ $ echo | /usr/bin/sqlite3 '.load'
Error: unknown command or invalid arguments: "load". Enter ".help" for help
SQLite's defaults are in many ways perfectly fine; you get the footguns when you need the performance. Read the article rather than just commenting on HN, because WAL is not the default.
There's some nuance here. The compile-time options SQLITE_DEFAULT_SYNCHRONOUS and SQLITE_DEFAULT_WAL_SYNCHRONOUS are set to FULL by default, which does fsync on each commit.
But there is a thing called NORMAL mode which, in WAL and non-WAL mode, does not fsync on each commit. In WAL mode, at least this doesn't cause corruption, but it can still lose data.
https://www.sqlite.org/pragma.html#pragma_synchronous is very explicit that the thing called NORMAL does have these risks. But it's still called NORMAL, and I'd think that's something of a foot-slingshot for newcomers, if not a full-fledged footgun.
FULL can also lose data if you lose power or crash before the fsync. This is just a simple trade of losing slightly more data (possibly) in return for better performance.
Fsync is relatively expensive. Recovery price is not going to differ much between the two settings.
It's like 1 in 1000 loss for 999 in 1000 gain. Makes perfect sense to me.
That's why you should read the documentation and not comments from random bozos telling you to do a thing without you, or probably they, even knowing what it actually does. Read the documentation and learn how your tools work. Don't cargo cult.
I swear I think people choose WAL mode because they read something about it online, where that something obviously isn't the documentation. This behavior shouldn't be catching any engineer by surprise.
Yes, yes it does [0]. While I fully understand the need for backwards-compatibility given the sheer number of SQLite installations, I find their attitude towards flexible types [1] appalling. At least they’ve added STRICT.
Similarly, the fact that FKs aren’t actually enforced by default is a horrifying discovery if you’re new to it, but tbf (a) MySQL until quite recently allowed CHECK constraints to be declared, which it then ignored, and (b) the FK enforcement decision is for backwards-compatibility.
I disagree about flexible types. SQLite bills itself as a better fopen, not as a full on database engine. If you have data sitting outside your application, trusting it to be well typed is just going to cause trouble. If your database is the application, that’s a different story.
This is perhaps my favorite one. By default, SQLite doesn’t actually delete data:
The default setting for secure_delete is determined by the SQLITE_SECURE_DELETE compile-time option and is normally off. The off setting for secure_delete improves performance by reducing the number of CPU cycles and the amount of disk I/O.
If Apple compiled your SQLite library, not even the fullfsync PRAGMAs will do F_FULLFSYNC. Apple has “secretly” patched their SQLite to do F_BARRIERFSYNC instead.
I am happy to lose 5 or 10 seconds of data in a power failure. However I'm not okay with a file becoming so corrupted that it is unmountable when the power recovers.
Half arsed fsync provides exactly that - and considering you get way more performance this seems like a good tradeoff.
You need write barriers for the ordering guarantees of a WAL. That’s why Apple uses barrier sync and not full sync. AFAIK other operating systems do not have this distinction.
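(For readers unfamiliar with the macOS distinction, a rough sketch of the three levels is below, assuming an SDK that defines F_FULLFSYNC and F_BARRIERFSYNC; the file name is made up.)

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("example.db-wal", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) { perror("open"); return 1; }
        const char rec[] = "commit record\n";
        if (write(fd, rec, strlen(rec)) < 0) perror("write");

        /* 1. Plain fsync: on macOS this pushes data to the drive, which may
         *    still hold it in its volatile cache. */
        if (fsync(fd) != 0) perror("fsync");

        /* 2. Barrier: orders this write relative to later ones without
         *    forcing the drive cache to flush. */
        if (fcntl(fd, F_BARRIERFSYNC) == -1) perror("F_BARRIERFSYNC");

        /* 3. Full flush: asks the drive to flush its cache to stable storage;
         *    this is the expensive call the fullfsync pragmas are meant to
         *    trigger. */
        if (fcntl(fd, F_FULLFSYNC) == -1) perror("F_FULLFSYNC");

        close(fd);
        return 0;
    }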
Don't assume sqlite is doing an fsync on each commit.
While that's the "default default", you may be using a sqlite3 compiled by someone else or using non-default build options, or using a wrapper that sets its own defaults.
More generally, be careful about assuming any of the various levers sqlite3 has are set in a specific way. That is, take control of the ones that are important to your use case (well, and before that, review the levers so you know which ones those are).
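(As a sketch of what "taking control of the levers" can look like with the C API: query what the build or wrapper actually gave you, then pin the pragmas you care about. The file name and the particular pragmas here are illustrative, not a recommendation.)

    #include <sqlite3.h>
    #include <stdio.h>

    /* Print one PRAGMA result row; the pragma name is passed as user data. */
    static int print_row(void *name, int n, char **vals, char **cols) {
        (void)n; (void)cols;
        printf("%s = %s\n", (const char *)name, vals[0] ? vals[0] : "NULL");
        return 0;
    }

    int main(void) {
        sqlite3 *db;
        if (sqlite3_open("app.db", &db) != SQLITE_OK) return 1;

        /* See what the defaults of this build/wrapper actually are... */
        sqlite3_exec(db, "PRAGMA journal_mode;", print_row, "journal_mode", NULL);
        sqlite3_exec(db, "PRAGMA synchronous;",  print_row, "synchronous",  NULL);
        sqlite3_exec(db, "PRAGMA foreign_keys;", print_row, "foreign_keys", NULL);

        /* ...then set the levers that matter to your use case. */
        sqlite3_exec(db, "PRAGMA foreign_keys=ON;",  NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA synchronous=FULL;", NULL, NULL, NULL);

        sqlite3_close(db);
        return 0;
    }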
The main reason you would attach a database and then jump through hoops like qualifying tables is to have transactions cover all the attached databases. If you don't need that, then you can just open separate connections to each database without needing to jump through any hoops. So the fact that WAL does not provide that is a big drawback.
This is true for any database, for any concurrency or durability settings. You must understand the implications of the defaults and your choices if you change them.
This is popping up because SurrealDB was found to turn fsync off by default. But there are important differences:
- SurrealDB provides poor documentation about this default
- SQLite is typically run client side, while SurrealDB is typically run as a remote server
- SQLite is actually full sync by default, but distros may package it with other defaults
- SurrealDB explicitly did this for benchmarking reasons (for comparison fairness), while SQLite distros that turn off fsync do so for practical reasons, as it's typically run purely client-side.
One of the things I like about SQLite is how easy it is to understand its behavior if you read the docs. This is one of the potentially surprising defaults - but it seems reasonable for an embedded database, where a power loss is likely to cause the application to "forget" that something was committed at the same time that SQLite does.
However, it depends on the application - hence the need for clear docs and an understandable model.
Does it matter? For all we know, it keeps serializability. At this point (of computer hardware history), you would care more about serializability than about making sure data is actually on disk after a power loss; the latter depends on so many layers of drivers doing the correct thing (something that is hard to test correctly).
If the app confirms to me my crypto transaction has been reliably queued, I probably don’t want to hear that it was unqueued because a node using SQLite in the cluster had died at an inconvenient specific time.
If you had a power failure between when the transaction was queued and when the SQLite transaction was committed, no amount of fsync will save you.
If that is the threat you want to defend against, this is not the right setting. Maybe it would reduce the window for it a little bit, but power failures are basically a nonexistent threat anyway; does a solution that mildly reduces but doesn't eliminate the risk really matter when the risk is negligible?
> but power failures are basically a nonexistent threat anyway
Not in the contexts sqlite3 is often used. Remember, this is an embedded database, not a fat MySQL server sitting in a comfy datacenter with redundant power backups, RAID 6 and AC regulated to the millidegree. More like embedded systems with unreliable or no power backup. Like Curl, you can find it in unexpected places.
I *really* want to be sure that 1 is persisted. Because if they for example send me $1M worth of crypto it will really suck if I don't have the key anymore. There are definitely cases where it is critical to know that data has been persisted.
This is also assuming that what you are syncing to is more than one local disk; ideally you are running the fsync on multiple geographically distant disks. But there are also cryptography-related applications where you must never reuse state, otherwise very bad things happen. This can apply even with one local disk (like a laptop). In this case, if you did something like 1. encrypt some data, 2. commit that this nonce, key, OTP, whatever has been used, 3. send that data somewhere, then you want to be sure that either the commit was persisted or the disk was permanently destroyed (or at least somehow wouldn't accidentally be used to encrypt more data).
If you are doing crypto you really ought to have a different way of checking that your tx has gone through that is the actual source of truth, like, for example, the blockchain.
I knew I shouldn’t have said crypto, but it is why I said queued. I knew a pedant was going to nitpick. Probably subconsciously was inviting it. I think my point still stands.
Durability just guarantees that you don't return that a write transaction has completed successfully until after all the layers are done writing to disk. fsync is the high level abstraction that file systems implement to mean "this data has actually gone to disk" (although handling errors is a rabbit hole worth reading about). It absolutely has a performance cost which is why applications that can live without durability sometimes get away with it.
If your application can tolerate writes silently failing then you can live without it. But a lot of applications can't, so it does matter.
It depends on whether there will be holes and whether you communicate "externally". If neither of these is a concern (for WAL with SQLite used locally, neither is), it is OK.
To elaborate: for a local app using WAL, if a transaction commits locally and is then reverted *along* with everything after it, the app restarts and will continue to function as expected, with no ill-defined states.
If you use WAL within a quorum, sure, durability is a concern and I think you would be better off to have ways above SQLite to maintain that durability rather than relying on fsync solely (your SSD can break).
Also, to add, WAL mode uses checksums to make sure there are no holes, so even if your SSD re-orders writes, I think "no holes in your writes" is a pretty safe assumption.
As I mentioned in the other comment, https://www.sqlite.org/fileformat2.html#the_write_ahead_log WAL uses checksums to make sure there are no holes in its writes. I need to do more analysis, but it goes beyond just relying on fwrite to do the right thing for serializing the writes (I think it is safe against re-ordered writes, but I cannot guarantee that without thinking more about it).
D1 is built on Durable Objects, where SQLite is used on top of a custom distributed storage layer. This blog post describes in detail how the system works:
I hit some issues where they worked in localhost but not in production. Fortunately, they're not dealbreakers. Unfortunately, my mind is very good at forgetting bad experiences (especially if they have a workaround), so I don't remember the issues.
I do know the transaction is handled "differently".
I don't know why SQLite has this default, I suspect primarily performance, but I think even with modern SSDs there is an fsync frequency at which it starts to hurt drive lifetime.
Also, you probably do not gain much with synchronous=FULL if the SSD does not really honor flushes correctly.
Some SSDs historically lied - acknowledging flushes before actually persisting them.
I don't know if this is still true.
Related: I use Litestream, and their documentation[1] actually suggests using synchronous=NORMAL. Any idea if this is a wise change? Should I revert back to the default of FULL when using WAL + Litestream?
If you're using Litestream you still might lose transactions since the last checkpoint if the server goes down, so I don't think it makes sense to make sure every single transaction is persisted to disk before success is returned. If my understanding is correct it will just slow the app down without reducing your chances of data loss.
You can enqueue fsync as an op in io_uring, using link flag to sequence it after previous ops and drain flag to ensure no new ops are started until it completes.
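(A rough liburing sketch of that pattern, built with -luring; the file name and payload are made up. The write is linked to the fsync so the fsync starts only after the write completes, and the drain flag holds back later submissions until the fsync is done.)

    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        struct io_uring ring;
        if (io_uring_queue_init(8, &ring, 0) < 0) return 1;

        int fd = open("wal.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        static const char buf[] = "commit record\n";

        /* Write op, linked so the following SQE runs only after it succeeds. */
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_write(sqe, fd, buf, sizeof buf - 1, 0);
        sqe->flags |= IOSQE_IO_LINK;

        /* Fsync op; IOSQE_IO_DRAIN additionally keeps later submissions from
         * starting until this one completes. */
        sqe = io_uring_get_sqe(&ring);
        io_uring_prep_fsync(sqe, fd, 0);
        sqe->flags |= IOSQE_IO_DRAIN;

        io_uring_submit(&ring);

        struct io_uring_cqe *cqe;
        for (int i = 0; i < 2; i++) {        /* reap both completions */
            io_uring_wait_cqe(&ring, &cqe);
            if (cqe->res < 0) fprintf(stderr, "op failed: %d\n", cqe->res);
            io_uring_cqe_seen(&ring, cqe);
        }
        io_uring_queue_exit(&ring);
        close(fd);
        return 0;
    }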
https://sqlite.org/compile.html#default_synchronous
>SQLITE_DEFAULT_SYNCHRONOUS=<0-3> This macro determines the default value of the PRAGMA synchronous setting. If not overridden at compile-time, the default setting is 2 (FULL).
>SQLITE_DEFAULT_WAL_SYNCHRONOUS=<0-3> This macro determines the default value of the PRAGMA synchronous setting for database files that open in WAL mode. If not overridden at compile-time, this value is the same as SQLITE_DEFAULT_SYNCHRONOUS.
Many wrappers for sqlite take this advice and change the default, but the default is FULL.