I develop on Windows 10 because I’m a .NET developer, and I really like the idea of WSL. But the more I use it, the more frustrated I become by its file access performance. I started off using it for Git, but now I just use Git Bash in PowerShell (which also annoys me with its slowness).
I haven’t developed on an actual *NIX machine in years, but recently I deployed something to my DO VPS and it “built” (PHP Composer) in what felt like 1/100 of the time it was taking on my computer, whether running the Windows binaries in PowerShell/CMD or the Linux binaries in WSL. Although I will say WSL is slower.
In fact, it was so fast that I’m about to call Microcenter and see about picking up a Thinkpad to install Linux on.
A major performance improvement is adding the working folders for my coding projects to the exclusion list of whatever antivirus solution is running. On low-specced machines I disable AV entirely, because I feel these days it is mostly snake oil anyway with zero days being commonplace.
UEFI Secure Boot, mandatory signed binaries, and Windows Defender (XProtect on macOS) have contributed more to protecting against malware than third-party antivirus. Although I think the existence, cost, and PITAness of third-party antivirus might very well have helped motivate the OS vendors to secure their products better.
It should be noted that I believe the parent comments included Windows Defender as an antivirus. Third-party was never specified, and disabling Windows Defender can indeed improve file access performance.
Can confirm. I usually have to turn off Windows Defender whenever I'm doing anything with Docker, or node modules, or something similar. If I don't, my computer slows to a crawl.
Source? I thought UEFI was just a way to make Linux a pain in the ass to dual boot with Windows? What's your evidence that it's effective against malware? I am biased here, and hate uefi.
UEFI is not the same thing as UEFI Secure Boot. UEFI booting in general makes dual-booting far easier than BIOS-based booting where operating systems have to fight over who owns the MBR. Secure Boot makes it harder to set up a multi-boot system because you need a signed bootloader for your Linux system.
I do remember, but correlation != causation. The major improvements that have made software so much more secure are not AV; they are things like ASLR, non-executable stacks, stack canaries, a shift to less-privileged code and having more functions in user space, memory-safe(r) languages becoming more commonplace, and an increase in general security awareness. If anything anti-virus is much less useful now that polymorphic shell code is commonplace, as well as the fact that user error (such as falling for a phishing attack) is by far the largest cause of security failings.
> If anything anti-virus is much less useful now that polymorphic shell code is commonplace
Source? I disagree with this statement. Polymorphic viruses have been commonplace for decades. I don't think that diminishes the importance of AV. AV software isn't restricted to comparing file hashes with known threats, there's so much more that can be done for security.
Are you asking for a source for only that statement or for my post in general? Source is myself. I have a master's in Cyber Security and have worked in the field for 15 years. I've written numerous exploits and have actively evaded antivirus in the past. I can tell you from experience that ASLR is 10 times the pain in the ass that AV is, and NX bits/DEP are maybe 100 times more. Not trying to have a dick measuring contest, just justifying why I don't mind citing myself :-D
Regarding:
> Polymorphic viruses have been commonplace for decades
I disagree. I wouldn't describe them as "commonplace" until maybe the last decade or so. Regardless, this is probably the weakest of the arguments that I made.
> AV software isn't restricted to comparing file hashes with known threats, there's so much more that can be done for security.
With this I agree, though I would contend that even the most advanced heuristics and things like the hook interceptions Comodo experimented with in the late 2000s are still not what has made us so much more secure. At best AV is a small layer of a defense-in-depth strategy. At worst it's a bloated, unnecessary layer that eats cycles and robs system resources that could be devoted to useful activities.
That said, if I had any Windows machines in my home (been on Linux exclusively for a bit over 10 years now), I would likely run Defender on them. I'm not suggesting that AV is worthless, just that it isn't the reason things are much more secure these days.
To give another data point, for the software I'm developing (https://ossia.io/, a few hundred thousand LOC of modern C++ / Qt), a full build, with the same compiler version (clang-7), on the same machine, on the same PCI Express NVMe disk, takes 2 to 3 times longer on Windows than on Linux.
Every time I have to build on Windows I feel sad for all the developers forced to use this wretched environment.
For C++ on Windows I use the MSVC compiler, and I'm usually happy with its performance. If your clang supports precompiled headers, try enabling them. This saves CPU time parsing the standard headers, but it also saves lots of IO accessing all those random small include files, replacing it with one large sequential read from the .pch.
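If it helps, here is a minimal sketch of the clang side of that (file names and header contents are just illustrative):

    // pch.h - a hypothetical precompiled header pulling in the expensive,
    // rarely-changing standard includes once.
    #pragma once
    #include <algorithm>
    #include <map>
    #include <memory>
    #include <string>
    #include <vector>

    // Build the PCH once, then reuse it for every translation unit:
    //   clang++ -std=c++17 -x c++-header pch.h -o pch.h.pch
    //   clang++ -std=c++17 -include-pch pch.h.pch -c main.cpp -o main.o

For MSVC the rough equivalent is /Yc to create and /Yu to use a shared precompiled header.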
I have a similar experience. I'm a developer on a product that has both C and C# components. The C component runs on both Windows and Linux and is by far the larger of the two while the C# GUI component is Windows-only.
Our main workstations are Windows but we also have a shared Linux development VM on our group's VM host. The VM host is not very powerful, it's a hand-me-down machine from another group that's around 7 years old at this point, with a low-clocked Bulldozer-based AMD Opteron CPU and spinning rust storage. In comparison my workstation has a 4th-gen Core i7 processor and a SATA SSD.
Despite the fact that my workstation should be around 2-3x as fast per thread and have massively better disk I/O, the VM host still builds our C code in less than half the time. If I didn't have to maintain the C# code and use MS Office I'd switch to Linux as my main workstation OS in a heartbeat. (Note that the compilation process doesn't use parallel compilation on either platform, so it's not that our VM is farming the compile out to more cores.)
I'm in a similar situation: I've had reasonable success on Linux developing a .NET Framework C# app in VS Code with Mono from the Ubuntu PPA. Almost everything works but for the odd project calling native Windows DLLs, so I keep Windows around for manipulating VS projects and compiling these oddities. Most dev can happen in Linux though.
In my case we use a closed-source commercial UI library very extensively whose vendor has previously stated they have no interest in supporting Linux or WINE so doing C# dev on Linux is totally out of the question for me, unfortunately. The company I work for has a long-term strategic initiative to transition to an entirely web-based UI so eventually we may be able to remove the C# code but that's still years off for my group.
(In retrospect tying ourselves to a closed-source library like that was a mistake; if I could go back in time and put us on a different path (using the technologies available at the time) I would have gone with C++ and Qt instead which would have allowed for cross-platform development and deployment. Not to mention that because Qt is open-source (even if we would have had to buy a proprietary license for it at the time) we could fix any bugs we encountered ourselves, unlike our current UI library where we just need to come up with work-arounds until the vendor can get around to fixing them. But these decisions were made before I was hired so I just have to live with them.)
So you have your core business logic and algorithms written in C, and only the front end/GUI uses C#? Can you cleanly separate the two (via MVVM or a variation thereof)?
Does the whole application run on C#, with RPCs between the two? What does that mean for performance?
Cheers, and thanks for any insights about your setup.
It's a pretty classic client/server setup (not too different from modern web apps, actually). The C code is the server and the C# code (which is actually a plugin to another application written in C#) is the client. The C server can run independently of the C# clients but isn't very useful (to our customers at least) by itself.
Most of the C# code's responsibility is taking user commands from the plugin's host application and converting them to messages sent to the C server for further processing, which isn't very speed/latency sensitive most of the time since it only has to operate at user-perceivable timescales and on small amounts of data. The remainder of the C# code is some custom UI controls and dialogs.
The network communications (if you squint hard enough) somewhat resembles REST with custom binary protocols instead of HTTP and JSON; if REST were in style when this code was first written (around 10 years ago) it almost certainly would have been used for the network/message layer instead.
Do you take the buffer cache into account? If the server has more RAM unused by processes, it probably uses it as a cache, which can be 3x quicker than an SSD.
Windows Defender is an abomination. It makes ordinary file operations on Windows extremely slow too. For example, unpacking the Go 1.11 zip, which is 8,700 files, takes a second on my PCIe SSD with Windows Defender disabled. Enable it and the extraction time rises to several minutes.
Can I plug System76 as an alternative? I feel like it's important to purchase Linux-native hardware. Microsoft has largely prevented competition in this market, and I think more viable options would benefit consumers. Also, S76 has pretty good hardware.
Thanks for the heads-up (S76), but Linux runs on pretty much anything these days. This laptop is a Dell Inspiron with a 17" touch screen and lots of sensors. I'm running Arch Linux on it and everything is supported out of the box. The only hardware tweak I have made is changing the driver in use for the Synaptics mouse, which dmesg mentioned and which will probably become the norm soon anyway.
There really is no such thing as MS native only anymore. I got Linux on here without accepting any obnoxious licenses and my laptop's price was partially subsidised by all the crap that I never even saw. To be honest, I'm not sure what the exact price breakdown really is on this thing but I do know that MS did not get in my way.
Dell is pretty Linux friendly. For example, to update firmware I copy the new image to my /boot (EFI) partition and then use the built-in option at boot to update it - simples! No more farting around with temporarily turning swap into FAT32 and a FreeDOS boot disc.
You can buy Dell laptops these days that ship directly with Ubuntu (such as XPS and Precision), and you explicitly see the Microsoft tax fall off when making the selection. It feels so good.
Intel's NUC bricks come in two varieties: prebuilt, and bring your own memory, storage and OS. The latter does not impose a Windows license fee, and as long as you're running a 4.x kernel all the hardware is generally supported, although I see that running Ubuntu combined with a Thunderbolt switch doesn't work.
Sporting laptop-class processors, these are not powerhouse machines and fall below the latest Mac Minis in performance. But as a development platform they can be plenty, particularly the newer models that ditch classic SATA for two NVMe slots.
Not too sure what the breakdown on my lappy is wrt bundles. As far as I can tell I paid a fair price for it and I did not accept any licenses that I did not want. When I say fair price, I think it is pretty decent. I don't know what a 17" Apple laptop with a touch screen would cost, but probably more than the £950ish I paid for this beast.
I'll plug the Librem 13/15. I have the 13 and I'm very happy with it (initially it had a bug in the firmware so my NVMe SSD that I bought separately wasn't bootable, but they fixed it very quickly).
I got a Galago Pro recently, because my previous netbook fell apart. The construction on the Galago Pro is quite solid, I believe; getting it apart was a bit more challenging than I am used to. However, the laptop is still being manufactured by CLEVO, and as one consequence of this, the battery life is pretty limited (I was aware of this drawback before purchase). I believe that, as the sibling poster mentions, they are bringing their design and manufacture in-house, and that subsequent revisions will be an improvement. I don't find the issue to be hugely limiting, and I am personally willing to forgive quite a lot to have (1) no Windows key, and (2) non-soldered RAM.
I know SQL Server for Linux's Docker image has been a breeze to work with, even in Docker for Windows. Also, SQL Operations Studio (Electron-based) is catching up to SSMS.
JetBrains' Rider IDE is now so mature that I use it as a better Visual Studio even on Windows. It also runs on Mac and Linux. Great code manipulation, navigation and refactoring tools, and great support for adjacent technologies like build and test tools for both the .NET code and web front-ends.
Can you code on Linux using JetBrains' IDE to create .NET 3.5 apps for a Windows target? 3.5 is just an example, because the matching Visual Studio works more or less in Wine, while I haven't tested 4.0 and more recent versions.
Is this purely about convenience tools in an IDE or are there some things actually locked into the Windows environment? Is .NET a lot like Xcode where it's not like you can just download the libraries + compiler + a text editor and have all you need? The latter is broadly true for every language I've really dove into so this feels like a foreign concept.
Edit: what I could stand to gain is that I work weekly in four or five languages. Already well tooled up in Ubuntu and vscode. Nothing frustrates me more than having to keep multiple IDEs consistent. Imagine driving two cars all day that have their controls in all different places.
Depends on what you're doing. Some technologies are only supported on Windows (like the official UI frameworks), but most things like webdev and gamedev libraries are supported on all platforms. Giving up Visual Studio can be a hard sell, as from my experience C# + Visual Studio (+ ReSharper) is one of the most productive programming environments you can have.
You can download the SDK from https://dotnet.microsoft.com/ and use it on any platform with your editor of choice, including JetBrains Rider, which is a cross-platform .NET IDE.
Depending on your UI needs, may want to look at Eto.Forms and MonoGame. :-) That said, I agree on productivity for C# + VS. I find I'm more productive with node + npm + vs code though.
.NET Core hasn't been too bad outside VS... I do wish they'd stuck with the JSON project format. I also wish dotnet had a task runner like npm's built in.
The question here is - what do you gain? You're giving up arguably one of the best developer tools available for what? A slightly different desktop skin? Remapped shortcuts? Using slightly different command line commands?
I don't doubt that. But that's not the point - I'm sure plenty of people get by with Paint or Paint.NET, but no one sane would call them a replacement for Photoshop and its workflows.
Same with VS vs. VSCode - I'm happy that it works great for you, but I'm not sure why you'd think they're comparable tools.
Reasonable. .NET Core is obviously fine, but even .NET Framework stuff is largely runnable with up to date Mono, as MS is slowly pushing lots of previously 'system' libraries into NuGet. WPF is the only notable big dealbreaker.
> I started off using it for Git, but now I just use Git Bash in PowerShell (which also annoys me with its slowness).
This may not be a WSL issue. You don't want to mix Windows git (what you're calling git-bash) with Linux git (or MSYS2 git). Even a git status will cause them to trample on each other in ways I don't yet fully understand, and that will also slow them down very significantly as one tries to process a repo the other one has previously accessed. Pick one and stick with it for any given repo.
You may actually see better performance via Docker or a Linux VM under Windows. Also, if you're using or can migrate to .NET Core, it works pretty well there.
I'm using containers for local services I'm not actively working on, even though the application is being deployed to Windows, because it's been easier for me. I'd actually prefer a Linux host at work, but there's legacy crap I need.
I do work remotely sometimes on my home hackintosh though.
I remember this issue (I commented in it a few years ago).
On the bright side, WSL as a development environment is no longer slow when it comes to real-world usage. I've been using it for full-time web app development for the last year and even made a video about my entire setup and configuration a week ago[0].
For example with WSL you can easily get <= 100ms code reloads even through a Docker for Windows volume on really big applications with thousands of files, such as a Rails app.
Even compiling ~200kb of SCSS through a few Webpack loaders (inside of Docker with a volume) takes ~600ms with Webpack's watcher. It's glorious.
I haven't had a single performance issue with a bunch of different Rails, Flask, Node, Phoenix and Jekyll apps (with and without Webpack). This is with all source code being mounted in from a spinning disk drive too, so it can get a lot faster.
So while maybe in a micro-benchmark, the WSL file system might be an order of magnitude slower (or worse) than native Linux, it doesn't really seem to matter much when you use it in practice.
> with WSL you can easily get <= 100ms code reloads even through a Docker for Windows volume
(Edited after watching your video.)
In your video it looks like you're running things in Docker containers. Even if you start containers using WSL, they still run in a separate Hyper-V virtual machine with a true Linux kernel, whereas WSL shares the Windows kernel and works by mapping Linux system calls directly to Windows kernel calls. When you run the "docker" command in WSL, it's just communicating with the Docker daemon running outside of WSL.
Docker runs this way on Windows because WSL does not implement all the Linux kernel system calls, only the most important ones needed by most applications, and the missing ones include some needed to run the Docker daemon.
All in all, this means that what you're talking about is not affected by the linked issue because it uses a different mechanism to access files (the Hyper-V driver rather than the WSL system call mapping). Although, if anything, I would expect Hyper-V to be even slower.
(Your edit makes my reply make a lot less sense since you removed all of your original questions, but I'll leave my original reply; read the part after number 7.)
My set up is basically this:
1. I use WSL as my day to day programming environment with the Ubuntu WSL terminal + tmux[0]. It's where I run a bunch of Linux tools and interact with my source code.
2. I have Docker for Windows installed (since the Docker daemon doesn't run directly in WSL yet due to missing Linux kernel features like iptables, etc.).
3. I installed Docker and Docker Compose inside of WSL but the daemon doesn't run in WSL. I just use the Docker CLI to communicate with Docker for Windows using DOCKER_HOST, so docker and docker-compose commands seamlessly work inside of WSL from my point of view[1].
4. All of my source code lives on a spinning disk drive outside of WSL which I edit with VSCode which is installed on Windows.
5. That source code drive is mounted into WSL using /etc/wsl.conf at / (but fstab works just as well)[2].
6. That source code drive is also shared with Docker for Windows and available to be used as a volume in any container.
7. All of my Dockerized web apps are running in Linux containers, but using this set up should be no problem if you use Windows containers I guess? I never used Windows containers, but that seems out of scope for WSL / Docker CLI. That comes down to Docker for Windows.
But, it's worth mentioning I have installed Ruby directly in WSL and didn't use Docker, and things are still just as fast as with Docker volumes. In fact, I run Jekyll directly in WSL without Docker because I really like live reload and I couldn't get that to work through Docker. My blog has like 200+ posts and 50+ drafts, and if I write a new blog post, Jekyll reloads the changes in about 3 seconds, and that's with Jekyll-Assets too. I have a feeling it wouldn't be that much faster even on native Linux since Jekyll is kind of slow, but I'm ok with a 3 second turn around considering it does so much good stuff for me.
It's worth noting that copying in/out of the Docker for Windows container can be interesting. Docker for Windows mirrors the content of the volume directory on Windows to a native one inside the container... for some editing, it's fine... but try something like a SQLite database with a GUI on Windows and a container in Linux both connected, and it will blow up on you.
The mount in WSL is really adaptive calls to the native OS, so that side will work without issue... the sync into the container while editing is fast enough, and running/building in the container is faster than it would be in WSL itself, as described by the GP.
Are you talking about running it from the command line?
I use Git almost every day and have it installed directly inside of WSL.
It's not slow for any projects I've worked on, but it's worth mentioning I'm not dealing with code bases with millions of lines of code and many tens of thousands of files.
Most of the code bases I work on have ~100k LOC or less (most are much less), and hundreds or low thousands of files.
Grepping through 50,000+ files through a WSL mount on a non-SSD is pretty slow, but I haven't tried doing that on a native Linux system in a long time so I can't really say if that slowness is mostly WSL or grepping through a lot of files in general.
If grep is hitting 100% CPU utilization and isn't just file-system bound, and you are dealing with ASCII stuff, you can often speed it up a lot by prepending 'LANG=C', so that it doesn't have to deal with Unicode.
It rather depends on what "in practice" means in practice, though. I occasionally look at it with a test dominated by building small C programs, and WSL remains several times slower than a Linux/ext4 install on the same hardware.
Runtime on a Linux desktop machine is about 20 minutes. The point is that it's a workload dominated by small file operations, and in particular small file stat operations, that WSL is particularly bad at. And really it's not so crazy a workload.
My test is that W10 takes half an hour every time I start it, hitting the hard disk very hard and being barely usable.
However, I'm ready to work on Ubuntu in less than a minute.
I spent many years optimizing "stat-like" APIs for Picasa - Windows just feels very different than Linux once you're benchmarking.
It turns out Windows/SMB is very good at "give me all metadata over the wire for a directory" and not so fast at single file stat performance. On a high-latency network (e.g. Wi-Fi) the Windows approach is faster, but on a local disk (e.g., compiling code), Linux stat is faster.
This is off-topic, but is there any chance of bringing the Picasa desktop client back to the masses?
There's nothing out there that matches Picasa in speed for managing large collections (especially on Windows). The Picasa Image Viewer is lightning-fast, and I still use them both daily.
There are, however, some things that could be improved (besides the deprecated functionality that was gone when Picasa Online was taken away); e.g. "Export to Folder" takes its sweet time. But with no source out there, and no support from the developers, this will not, sadly, happen.
I'm mostly clueless about Windows, so bear with me, but that makes no sense to me.
If SMB has some "give me stat info for all stuff in a directory" API call then that's obviously faster over the network since it eliminates N roundtrips, but I'd still expect a Linux SMB host to beat a Windows SMB host at that, since FS operations are faster and the Linux host would also understand that protocol.
Unless what you mean is that Windows has some kernel-level "stat N" interface, so it beats Linux by avoiding the syscall overhead, or having a FS that's more optimized for that use-case. But then that would also be faster when using a SMB mount on localhost, and whether it's over a high-latency network wouldn't matter (actually that would amortize some of the benefits).
I think the idea is that you're accessing files sparsely and/or randomly.
With the Linux approach you avoid translating (from disk representation to syscall representation) metadata you don't need, and the in-memory disk cache saves having to re-read it (and some filesystems require a seek for each directory entry to read the inode data structure, which can also be avoided if you don't care about that particular stat).
With the Windows approach, the kernel knows you want multiple files from the same directory, so it can send a (slightly more expensive) bulk stat request, using only one round trip[0]. On Linux, the kernel doesn't know whether you're grabbing a.txt,b.txt,... (single directory-wide stat) or foo/.git,bar/.git,... (multiple single stats that could be pipelined) or just a single file, so it makes sense to use the cheapest request initially. If it then sees another stat in the same directory, it might make a bulk request, but that still incurred an extra round trip, and may have added useless processing overhead if you only needed two files.
TLDR: Access to distant memory is faster if assumptions can be made about your access patterns; access to local memory is faster if you access less of the local memory.
0: I'm aware of protocol-induced round trips, but I don't think it affects the reasoning.
Linux: cd /dir (no info)
Windows: open a directory ... all the info, and different views depending on your current selection, like image file thumbnails
In Windows you are always accessing this metadata, so it makes sense to speed it up, while in Linux even ls doesn't give you metadata unless you add the extra options, so it doesn't make sense to speed it up and waste storage on something that is infrequent.
> If SMB has some "give me stat info for all stuff in a directory" API call
It does, it supports FindFirstFile/FindNextFile[1], which returns a struct of name, attributes, size and timestamps per directory entry.
Now I'm not sure how Linux does things, but for NTFS, the data from FindFirstFile is pulled from the cached directory metadata, while the handle-based stat-like APIs operate on the file metadata. When the file is opened[2], the directory metadata is updated from the file metadata.
So while it does not have a "stat N" interface per se, the fact that it returns cached metadata in an explicit enumeration-style API should make it quite efficient.
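To make that concrete, here's a minimal sketch (the directory path is made up) of reading the size and last-write time of every entry straight from the enumeration records, with no per-file open or stat:

    // Enumerate a directory and read the size and last-write time of every
    // entry straight from the WIN32_FIND_DATA records returned by the
    // enumeration -- no per-file open, no per-file stat.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        WIN32_FIND_DATAW fd;
        // FindExInfoBasic skips the short (8.3) alternate names, saving a little work.
        HANDLE h = FindFirstFileExW(L"C:\\src\\myproject\\*", FindExInfoBasic, &fd,
                                    FindExSearchNameMatch, nullptr, 0);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        do {
            ULARGE_INTEGER size;
            size.LowPart  = fd.nFileSizeLow;
            size.HighPart = fd.nFileSizeHigh;
            wprintf(L"%-40ls %10llu bytes  (last write: %08lx%08lx)\n",
                    fd.cFileName, size.QuadPart,
                    fd.ftLastWriteTime.dwHighDateTime, fd.ftLastWriteTime.dwLowDateTime);
        } while (FindNextFileW(h, &fd));

        FindClose(h);
        return 0;
    }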
I'm not sure how FindFirstFile/FindNextFile is going to be better than readdir(3) on Unix.
At the NT layer, beneath FindFirstFile/FindNextFile, there is a call that says "fill this buffer with directory entry metadata." - https://docs.microsoft.com/en-us/windows/desktop/devnotes/nt... - I know FindFirstFileEx for example can let you ask for a larger buffer size to pass to that layer, thereby reducing syscall overhead in a big directory.
If you look at getdirentries(2) on FreeBSD for example - https://www.freebsd.org/cgi/man.cgi?query=getdirentries - it's a very similar looking API. I think I recall hearing that in the days before readdir(3) the traditional approach was to open(2) a dir and read(2) it, but I cannot find a source for that claim. At any rate you can imagine something pretty identical in the layer beneath readdir(3) on a modern Unix-like system and it being essentially the same as what Windows does.
I guess file size needs an extra stat(2) in Unix, since it is not in struct dirent, so if you do care about that or some of the other WIN32_FIND_DATA members the Windows way will be faster.
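For comparison, a minimal POSIX-side sketch (directory path made up) of that pattern: readdir() hands you the names, and learning the size (or timestamps) costs one extra stat per entry:

    // readdir() only hands back the name (and usually d_type); learning the
    // size or timestamps costs one extra stat per entry.
    #include <sys/stat.h>
    #include <dirent.h>
    #include <fcntl.h>      // AT_SYMLINK_NOFOLLOW
    #include <cstdio>

    int main()
    {
        const char* path = "/home/user/src/myproject";   // hypothetical directory
        DIR* dir = opendir(path);
        if (!dir)
            return 1;

        while (dirent* entry = readdir(dir)) {
            struct stat st;
            // One additional syscall per entry, just to get the size.
            if (fstatat(dirfd(dir), entry->d_name, &st, AT_SYMLINK_NOFOLLOW) == 0)
                printf("%-40s %10lld bytes\n", entry->d_name, (long long)st.st_size);
        }

        closedir(dir);
        return 0;
    }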
You can see here that in UNIX v6 that /bin/ls had to implement its own readdir() function that calls fopen() on the directory and then getc() 16 times to read each 16-byte dirent:
If by "host" you mean the client rather than the server, and if I understand correctly, the problem I anticipate would be that the API doesn't allow you to use that cached metadata, even if the client has already received it, because there's no guarantee that when you query a file inside some folder, it'll be the same as it was when you enumerated that folder, so I'd assume you can't eliminate the round trip without changing the API. Not sure if I've understood the scenario correctly but that seems to be the issue to me.
Anecdotally[1], javac does full builds because reading everything is faster than statting every file and comparing it against its compiled version. Eclipse works around this by keeping a change list in memory, which has its own drawbacks, with external changes pushing the workspace out of sync.
[1] I can't find a source on this but I remember having read it a long time ago, so I'll leave it at that unless I can find an actual authoritative source.
You know any alternative to Picasa? Especially in regards to face recognition? Google Photos is objectively shit, as you need to upload all photos for that.
There is digiKam for Linux (KDE) with facial recognition. I just started playing with it last night and tested it on a small group of photos, and it's good so far.
I can't quite recall the exact number, but wasn't the packet count for an initial listing over Windows SMB something like ~45 packets/steps in the transaction for each file?
Like I said - it was years ago, but I recall it being super chatty...
This is interesting, and to some extent, expected. I'd expect that emulating one OS on top of another is going to have performance challenges and constraints in general.
But the deep dive into aspects of Windows and Linux I/O reminded me that I'd love to see a new generation of clean sheet operating systems. Not another Unix-like OS. Not Windows. Something actually new and awesome. It's been a long time since Unix/BSD, Linux, and Windows NT/XP were introduced.
A few years ago, Microsoft developed a new operating system called Midori, but sadly they killed it. MS, Apple, and Google are each sitting on Carl Sagan money – billions and billions in cash. Would it hurt MS to spend a billion to build an awesome, clean-sheet OS? They could still support and update Windows 10 for as many years as they deemed optimal for their business, while also offering the new OS.
Would it cost more than a billion dollars to assemble an elite team of a few hundred people to build a new OS? Would Apple, Microsoft, or Google notice the billion dollar cost? Or even two billion?
If you think there's no point in a new OS, oh I think there's a lot of room for improvement right now. For one thing, we really ought to have Instant Computing in 2018. Just about everything we do on a computer should be instant, like 200 ms, maybe 100 ms. Opening an application and having it be fully ready for further action should be instant. Everything we do in an application should be instant, except for things like video transcoding or compressing a 1 GiB folder. OSes today are way, way too slow, which comes back to the file access performance issue on Windows. Even a Mac using a PCIe SSD fails to be instant for all sorts of tasks. They're all too damn slow.
We also need a fundamentally new security model. There's no way data should be leaving a user's computer as freely and opaquely as is the case now with all consumer OSes. Users should have much more control and insight into data leaving their machine. A good OS should also differentiate actual human users from software – the user model on *nix is inadequate. Computers should be able to know when an action or authorization was physically committed by a human. That's just basic, and we don't have it. And I'm just scratching the surface of how much better security could be on general purpose OSes. There's so much more we could do.
Google is doing this, very publicly, with Fuchsia. Brand new kernel, not even POSIX compliant.
Microsoft is also doing this, in a different and substantially more expensive way [1]. Over the past several years they've been rewriting and unifying their disparate operating systems (Windows, Phone (before the fall), Xbox, etc.) into a single modular kernel they're calling OneCore. It's more than likely that this work is based off of, if not totally contains, much of the NT kernel, but it's the same line of thinking.
There is one massive rule when it comes to engineering management we see repeated over and over, yet no one listens: Do Not Rewrite. Period.
Apple is exemplary in this. We don't know how many changes they've made to iOS since its fork from MacOS long ago, which was based on BSD even longer ago. But have you used an iPad in recent history? Instant app starts. No lag. No-stutter rendering at 120fps. When HFS started giving them issues, they swapped it out with APFS. Apps are sandboxed completely from one-another, and have no way to break their sandbox even if the user wants them to. Etc. Comparing an iPad Pro's performance to most brand new Windows laptops is like fighting a low-orbit laser cannon with a civil war era musket. They've succeeded, however they managed to do that.
Point being, you don't rewrite. You learn, you adapt, and you iterate. We'll get there.
(And if you've read all this and then wondered "but isn't Fuchsia a rewrite?" you'd be right, and we should all have serious concerns about that OS ever seeing the light of day on a real product, and about its quality once that happens. It won't be good. They can't even release a passable ChromeOS device [2])
> Would it cost more than a billion dollars to assemble an elite team of a few hundred people to build a new OS? Would Apple, Microsoft, or Google notice the billion dollar cost? Or even two billion?
It's not a matter of money, resources, or talent. For reference: The Mythical Man-Month, waterfall process, Windows Vista.
Building something of this scale and complexity from scratch will take a lot longer than even the most conservative estimate and will take many more years to shake out the bugs. Again, remember Windows Vista? And that was not nearly as revolutionary as the things you suggest.
Consider also that basically every single ground up "we are rethinking everything, and doing it right this time" OS rebuild from scratch has been a failure. There have been dozens, if not hundreds of examples.
I would argue that BeOS was not a failure as an OS, but it was a failure in the market where it couldn't find a clear place for itself, and was attempting to break into a market pretty much totally dominated by MS at the time.
Remember that in the era of home computers there were many good systems that were pretty much from-scratch implementations (Amiga, GEM and Archimedes), so the idea of creating a totally new OS against the incumbents (Windows and OS X) is not totally pointless.
I think the biggest problem is building the application ecosystem. iOS and Android were able to attract developers because they were the first movers on the smartphone platform. Convincing people to use a new OS for desktop (or any existing) computing without any apps would be difficult, and vice versa, convincing devs to support a platform with no users is also tough.
I think it's actually much easier now to get people onto a new OS than it was a decade ago, or in the 90s, or the 80s. The web is so central that a new OS with a great browser has won half the battle. (I could write five pages on how massively better a browser could be than Chrome, Edge, Firefox, Safari, Opera, etc., and I love Firefox and Opera and Vivaldi and Brave.)
ChromeOS has partly proved the point, especially among college students and younger. Anyway, a serious effort at a new OS, a whole new stack, along with a new office suite and other applications, could succeed with the right (large) team and funding. People aren't very loyal to Microsoft – they're not loved. And Apple users have no idea how much better computers could be, but they would if someone showed them.
> Would it cost more than a billion dollars to assemble an elite team of a few hundred people to build a new OS? Would Apple, Microsoft, or Google notice the billion dollar cost? Or even two billion?
This gives some idea of what is required to keep Linux moving forward. It would be nice if we could see similar stats for MS and Apple.
Yes, I like them a lot. A formally verified OS should be the standard these days (and formally verified compilers). But for that to be so, we'll need new programming languages (not Rust) and toolchains I guess.
So I'm coming at this from a social science perspective. Why did I say "not Rust"? I'm working on a research program to get at what happens when people engage with programming for the first time, aspects of language design that attract or repel learners, and as sort of a related offshoot, why smart women are so disproportionately less interested in programming.
One of my working hypotheses is that programming is terrible. More precisely, the majority of (smart) people who are introduced to programming have a terrible experience and do not pursue it. I think many smart people walk away thinking that it's stupid – that the design and workings of prevailing programming languages are absurd, tedious, and nonsensical. (Note that most men aren't interested in a programming career either. So it's not just smart women.)
Most intro to CS or programming courses seem to use Java these days. That's an unfortunate choice. To veteran programmers, Rust likely seems very different from other languages. Its default memory safety, ownership and borrowing, and other features are innovative. (Though the amazing performance of Go's garbage collector seems to weaken the case for Rust's approach somewhat.) However, to new learners, I'm pretty sure Rust will be awful. It's fully traditional in its mass of punctuation noise, just saturated with bizarre (to outsiders) curly braces and semicolons. It also has, in my opinion, extremely unintuitive syntax. I see it as somewhat worse than Java in that respect.
Honestly, Rust is brilliant in more ways than one. I just think they punted on syntax and punctuation noise. I think our timeline is just weird and tragic when it comes to programming languages. They're almost all terrible and just so poorly designed. There's hardly any real scientific research on PL design from a human factors perspective, and what research there is is not applied by PL designers. PLs are just the result of some dude's preferences, and no one seems to be thinking outside of the box when it comes to PL design. In the end, to really make progress in computing and programming, we might need outsiders and non-programmers to design PLs. I'll say that the now-canceled Eve programming language project is a notable exception to the consistent horribleness of modern PLs. We need a few dozen Chris Grangers.
Is it really important how easy it is to learn a language when we are talking about formally verified operating system kernels? Beginners are probably not going to write formally verified code, let alone formally verified system code, so this area is already only accessible to veteran programmers who may not have a problem with these aspects of Rust.
I don't think it's a contradiction to introduce people to programming with more "elegant" or uniform languages that are closer to math (like Scheme or Haskell), or "easy" languages like Python, and then have them move on to Rust if they want to do system-level programming.
I see your point on beginners and formally verified systems code. However, I'm imagining a somewhat different universe.
Formally verified systems code is a nightmare right now. It's almost impossible, and hardly anyone does it for pay. Obscure tools like Isabelle or Coq are needed. People try to shoehorn formal verification onto C, which is a terrible lack of imagination and use of time. Maybe it's easier with Ada or Haskell or something – not sure.
I want a revolution. I want a whole suite of new, clean-sheet programming languages that are the result of furiously brilliant multifaceted design. I want formal verification to be easy, or easier, and baked in to systems languages. I want new operating systems built in these new languages. Some of the dissenters in this thread are assuming that building a new operating system involves firing up your C compiler of choice, and dismiss the whole enterprise as a multiyear bug-ridden nightmare ("never rewrite!" they say) – that is definitely not what I'm thinking about. To hell with C.
It's time to move forward. This is supposed to be a technology industry. It's supposed to be about technological progress. PL design is not progressing. It's awful and it repels newcomers, especially women. We need to drop all these assumptions about programming being done in primitive plain text editors on primitive steampunk Unix-like OSes using primitive command line interfaces with languages as intractable as Linear A. That whole package, which is often what people are introduced to in intro CS courses, is about the worst way to introduce people to the power and wonder of computing that we could have devised.
We need to think more deeply about how to handle and represent various abstractions, functions, and imperative code. Functions should not have bizarre void statements and empty parens, for example. That's so unintuitive. At this point, functions should probably be like equation editors. The pure mathematical function in its normal book representation could be what we see in our IDEs. Visual/graphical programming never really got out of the gate because they just converted code to boxes and arrows. There's an opportunity there to rethink that whole approach.
But if we don't get people interested in programming, if we don't make it less terrible, they never get to systems work. And systems work should not look like Rust or C++. There is no reason why a modern IDE or even text editor should need curly braces or semicolons to understand or compile code at this point in the history of the software industry. Text is a solved problem. Indentation and spacing and line breaks are all trivial to parse. Colors could be semantic. Emoji could be code. Icons, badges like we see for build passing/failing could also be used as code. There's a lot we could do to make programming make a lot more sense.
One of the reasons I want to get more women into programming is that I think they'll be better programmers than men in some respects. Think of archetypes or stereotypes like Hermione vs Harry Potter, Lisa Simpson vs Bart, Wendy Testaburger vs Stan, Kim Wexler vs Jimmy McGill. There is anecdotal, cultural, and scientific evidence to suggest that women – or some subset of women – are more responsible and meticulous than some of the male personality types we often see in tech.
I'll have more on this this week in my report on Medium on Google's firing of developer James Damore for citing well-established social and personality psychology research (my field) on sex differences. I get into some of the reasons programming is generally terrible, and why women might be disproportionately repelled by it. Dart is my victim/example in this report, but I think pretty much all languages are awful. (Eiffel and F# should be the worst case, the bare minimum starting points...)
We did think a lot about the syntax, but we wanted to choose something that'd be familiar to existing systems programmers. It's largely based on C++ and Java, with a little bit of ML thrown in. This is because the language is, in many senses, C++ with some ML thrown in.
AIUI, proper program verification is still a very long-term goal for Rust (and most of the related short-term effort is currently going into building a formal model for the language, including its "unsafe" fragment). So, "not Rust" in that Rust simply doesn't cut it at present, whereas other solutions (Idris, Coq, even Liquid Haskell) might be closer to what's needed.
> The NT file system API is designed around handles, not paths. Almost any operation requires opening the file first, which can be expensive. Even things that on the Win32 level seem to be a single call (e.g. DeleteFile) actually open and close the file under the hood. One of our biggest performance optimizations for DrvFs which we did several releases ago was the introduction of a new API that allows us to query file information without having to open it first.
Ouch, that sounds painful... Is this why deleting gigs' worth of files takes a while? I could have sworn it's not a huge difference on Linux, at least when using the GUI; maybe when doing a straight rm it's quicker.
Reminds me of a performance optimization I did at work. A previous developer had implemented a routine to clean up old files. The routine would enumerate all the files in a directory, and then ask for the file age, and if old enough delete the file.
The problem was that asking the file age given a path caused an open/close, as GetFileSizeEx expects a handle.
Now, at least on Windows, enumerating a directory gets[1] you not just the filename, but a record containing filename, attributes, size, creation and access timestamps. So all I had to do was simply merge the directory enumeration and age checking.
The result was several orders of magnitudes faster, especially if the directory was on a network share.
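Not the actual code, of course, but a minimal sketch of what the merged version looks like (directory, file pattern and cutoff are all made up):

    // Delete files older than a cutoff using only the metadata that the
    // directory enumeration itself returns -- no per-file open just to read
    // a timestamp.
    #include <windows.h>
    #include <string>

    void DeleteOldFiles(const std::wstring& dir, const FILETIME& cutoff)
    {
        WIN32_FIND_DATAW fd;
        HANDLE h = FindFirstFileW((dir + L"\\*.log").c_str(), &fd);   // hypothetical pattern
        if (h == INVALID_HANDLE_VALUE)
            return;

        do {
            // ftLastWriteTime arrives with the directory entry; no extra call needed.
            if (!(fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) &&
                CompareFileTime(&fd.ftLastWriteTime, &cutoff) < 0)    // older than the cutoff
            {
                DeleteFileW((dir + L"\\" + fd.cFileName).c_str());
            }
        } while (FindNextFileW(h, &fd));

        FindClose(h);
    }

Everything the age check needs comes back with the directory entry, which is why it pays off so dramatically on a network share.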
POSIX stuff is terrible for this sort of thing, but luckily it's often a straightforward fix once you know to look for it. Here's a similar fix for the old Win32 version of the silver searcher: https://github.com/kjk/the_silver_searcher/pull/7
Looks like the active Windows port (https://github.com/k-takata/the_silver_searcher-win32) doesn't have this change, but it's possible Mingw32 or whatever exposes this flag itself now (as it probably should have done originally).
Path-based I/O seems quite dangerous to me. If everything was path-based, you'd easily have inherent race conditions. You want to delete a directory? You stat() all the files, they look empty, so you delete them... but in between, another process writes to some file (or maybe the user forgets the file is being deleted and saves to something there), and suddenly you've deleted data you didn't expect. When you do things in a handle-based fashion, you know you're always referring to the same file (and can lock it to prevent updates, etc.), even if files are being moved around.
However, to answer your question of why removing a directory is slow: if you mean it's slow inside Explorer, a lot of it is shell-level processing (shell hooks etc.), not the file I/O itself. Especially if they're slow hooks -- e.g. if you have TortoiseGit with its cache enabled, it can easily slow down deletions by a factor of 100x.
But regarding the file I/O part, if it's really that slow at all, I think it's partly because the user-level API is path-based (because that's what people find easier to work with), whereas the system calls are mostly handle-based (because that's the more robust thing, as explained above... though it can also be faster, since a lot of work is already done for the handle and doesn't need to be re-performed on every access). So merely traversing the directory a/b/c requires opening and closing a, then opening and closing a/b, then opening and closing a/b/c, but even opening a/b/c requires internally processing a and b again, since they may no longer be the same things as before... this is O(n^2) in the directory depth. If you reduce it to O(n) by using the system calls and providing parent directory handles directly (NtOpenFile() with OBJECT_ATTRIBUTES->RootDirectory) then I think it should be faster and more robust.
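To illustrate that last point, a rough, untested sketch of a relative open via the native API (the open options aren't exposed by winternl.h, so their values are defined by hand; treat this as an assumption-laden illustration rather than production code):

    // Open "childName" relative to an already-open parent directory handle,
    // so the kernel does not have to re-resolve the earlier path components.
    #include <windows.h>
    #include <winternl.h>
    #pragma comment(lib, "ntdll.lib")

    // Open options below are not in winternl.h; values taken from the DDK headers.
    #ifndef FILE_DIRECTORY_FILE
    #define FILE_DIRECTORY_FILE          0x00000001
    #endif
    #ifndef FILE_SYNCHRONOUS_IO_NONALERT
    #define FILE_SYNCHRONOUS_IO_NONALERT 0x00000020
    #endif
    #ifndef OBJ_CASE_INSENSITIVE
    #define OBJ_CASE_INSENSITIVE         0x00000040
    #endif

    HANDLE OpenChildDir(HANDLE parentDir, const wchar_t* childName)
    {
        UNICODE_STRING name;
        RtlInitUnicodeString(&name, childName);            // relative name, e.g. L"c"

        OBJECT_ATTRIBUTES attr;
        InitializeObjectAttributes(&attr, &name, OBJ_CASE_INSENSITIVE,
                                   parentDir,              // RootDirectory = parent handle
                                   nullptr);

        IO_STATUS_BLOCK iosb;
        HANDLE child = nullptr;
        NTSTATUS status = NtOpenFile(&child,
                                     FILE_LIST_DIRECTORY | SYNCHRONIZE,
                                     &attr, &iosb,
                                     FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                                     FILE_DIRECTORY_FILE | FILE_SYNCHRONOUS_IO_NONALERT);
        return (status >= 0) ? child : nullptr;            // i.e. NT_SUCCESS(status)
    }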
> You stat() all the files, they look empty, so you delete them... but in between, another process writes to some file (or maybe the user forgets the file is being deleted and saves to something there), and suddenly you've deleted data you didn't expect
This is fundamentally not any different between the systems, race conditions can happen either way. The user could write data to file right before the deletion recurses to the same directory and the handle-based deletion happens. Similarly the newly written data would be wiped out unintentionally.
For files for which access from different processes must be controlled explicitly there is locking. No filesystem or VFS is going to protect you from accidentally deleting stuff you're still using in another context.
> [...] The user could write data to file right before the deletion recurses to the same directory and the handle-based deletion happens. Similarly the newly written data would be wiped out unintentionally. [...] No filesystem or VFS is going to protect you from accidentally deleting stuff you're still using in another context.
...what? No file system is going to protect you from accidentally deleting in-use files? But that's exactly what Windows does: it prevents you from deleting in-use files. That's what everyone here has been complaining about. File sharing modes let you lock files to make sure they're not written to (and/or read from) before being deleted; it very much need not be the case that the user could write to a file before it's deleted.
There is an inherent race condition if one program is using a file and another program is deleting it without caring about whether the file is being accessed by other programs.
At that point, all bets are off there regardless of whether the files are accessed by paths or handles.
Windows protects the file from deletion at the exact moment it is being accessed, but does not protect the file from being deleted after it has been accessed. In wall-clock terms, the latter is by far the more likely scenario.
So, if an editor saves a document to disk, and another program then deletes the document the editor will happily exit without saving it again thinking that it hasn't been changed.
It doesn't particularly matter whether the two programs clash exactly at the time of saving/deletion or not. The problem exists in the lack of information between the programs and no file system is indeed going to protect you from that.
> So, if an editor saves a document to disk, and another program then deletes the document the editor will happily exit without saving it again thinking that it hasn't been changed.
I'm trying to explain to you that your understanding of Windows is wrong and that this is impossible. As long as that editor has the document open, unless it has explicitly specified FILE_SHARE_WRITE and FILE_SHARE_DELETE, Windows will not allow another program to alter that file. So the editor would very rightly assume upon exiting that nobody has touched that file while it's had that file open.
I know how opening a file affects the exclusivity of access but commonly applications don't seem to keep the file continuously open except during reading or writing. Maybe some Microsoft applications use that pattern extensively but it generally works to save a file in one program and then open it in another program without closing the file in the first program or quitting it.
Nevertheless, this is getting off topic, as the thread started with the question of path vs handle access. I still don't see much value in this exclusivity in the latter case, because if you have a conflict to begin with it's just a matter of Murphy's law before that conflict manifests in actually losing data.
Even if your editor keeps the file open for the whole day, and you have this second program that is on the trajectory to delete the file it will eventually get to it at a time when the file is not open and thus not protected by the guarantee of exclusive access.
> This is fundamentally not any different between the systems, race conditions can happen either way. The user could write data to file right before the deletion recurses to the same directory and the handle-based deletion happens.
When you hold a handle to a file or directory, you get to decide on the degree of shared access granted to any other users for the duration that handle is held (FILE_SHARE_*). So this does solve the concurrency problem, by allowing you to effectively hold a lock on the file until you're done.
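A minimal sketch (file path made up) of what that looks like: as long as the first handle is open without FILE_SHARE_DELETE, an attempt to delete the file fails with a sharing violation:

    // Hold a handle with no FILE_SHARE_DELETE: while it stays open, any
    // attempt to delete the file fails with a sharing violation.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        // Hypothetical file; others may read it, but not write to or delete it.
        HANDLE h = CreateFileW(L"C:\\temp\\report.docx",
                               GENERIC_READ,
                               FILE_SHARE_READ,             // no FILE_SHARE_WRITE / FILE_SHARE_DELETE
                               nullptr, OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL, nullptr);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        // DeleteFileW internally opens the file with DELETE access, which
        // conflicts with the sharing mode above, so this fails (error 32,
        // ERROR_SHARING_VIOLATION) for as long as h is open.
        if (!DeleteFileW(L"C:\\temp\\report.docx"))
            printf("DeleteFile failed with error %lu\n", GetLastError());

        CloseHandle(h);
        return 0;
    }

This is the flip side of the "can't delete a file because it is in use" complaints elsewhere in the thread: the same mechanism is what provides the guarantee being discussed here.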
> I could have sworn it's not a huge difference on Linux, at least when using the GUI; maybe when doing a straight rm it's quicker.
Using Linux file management GUIs can be a disaster. Some years ago, I was massaging many GiB of text, using a crude hack, with scripts that did grep, sed, awk, etc. And I had many working directories with >10^4 tiny files each (maybe 10^5 even).
In terminal, everything was fine. But opening a GUI would swap out so much that the box would freeze. I don't know why. I just learned to avoid doing it.
Heh, that's fine, I don't mind the terminal or the GUI, as long as you're rationally using either and aren't blaming either for your problems. I see a lot of "I only use the terminal" type of people who screw up GUIs and the reverse can be said, GUI users who screw up the terminal, it's all about RTFM'ing sometimes, or just reading what's right in front of you.
Not only is it extremely surprising that there's no way to get metadata without opening each file (and since DrvFs is only the WSL layer, apparently the system itself still doesn't have such a feature to this day).
But I'm now additionally baffled by how Total Commander managed to feel hella snappy ten years ago while navigating lots of dirs, whereas on MacOS all double-panel managers are meh, not least due to the rather slow navigation.
Ten years ago there was less file-system integration, userland virus scanning, kernel-level virus scanning, OS hooks, OS-compatibility redirection, and 32/64-bit compatibility checking.
This was mostly added during the NT 6.0 era, which occurred ~12 years ago. Vista was the first OS using NT 6.0, and Vista was VERY much not in vogue ~12 years ago. In fact it was avoided like the plague as of 2008 (unless you were using 64-bit and had >=4 GiB of RAM).
So many were using Windows XP 32-bit, or the NT 5.2 kernel. Even those with >=4 GiB of RAM were on Windows XP 64-bit, as Vista had a ton of driver problems.
Thanks for sharing, that was an interesting read. I know the IO performance has always been the main gripe with WSL.
It makes me think more broadly about some of the trade-offs with the discussions that happen on GitHub Issues.
It's great that insights like this get pulled out in the discussions, and this could serve as excellent documentation. However, discoverability will be difficult, and it always takes some weeding through the low-quality comments to piece the insights from contributors together.
I wonder how GitHub (and Microsoft) could better handle this. They could allow flagging in-depth comments like this into a pinned section, and those could be updated collaboratively in the future (a wiki?).
It also feels like a reputation system could help to motivate healthier discussions and could bury low quality “me too” comments and gripes. It’s not so bad in the linked example but I often encounter many rude comments aimed against volunteer OSS developers.
This particular post is pretty famous if you've been following the development of WSL. It's constantly linked to and referenced from other GitHub issues regarding WSL performance. So I think GitHub's current system is succeeding in that regard, although there are so many good points raised here that I wish it could get turned into a standalone blog post.
I can’t see MFT contention mentioned once. That’s what absolutely and totally destroys small file write performance. This affects source operations, WSL, compilers, file storage, everything.
And that’s so architecturally tied to the guts of NT you can’t fix it without pushing a new NTFS revision out. Which is risky and expensive.
Which is incidentally why, I suspect, no one at Microsoft even seems to mention it, and instead just chips away at trivial issues around the edges.
Bad show. NTFS is fundamentally broken. Go fix it.
Edit: my experience comes from nearly two decades of trying to squeeze the last bit of juice out of windows unsuccessfully. My conclusion is don’t bother. Ext4, ZFS and APFS are at least an order of magnitude more productive and this is a measurable gain.
Perhaps we didn't read the same article. What it says is that the root of the problem is the Windows IO subsystem architecture. Swap NTFS for anything else and you will get the same problem.
But that’s not the case. The root cause is the MFT and NTFS architecture. People fail to mention that because the problem is harder to fix. It’s almost as if there’s a “do not speak ill of NTFS” rule going on.
You can demonstrate this by using a third-party file system driver for NT. When NTFS on its own is eliminated, the performance is much, much better. This is a neat little differential analysis, and it’s conclusive. I can’t remember the product I used when I evaluated this about 8 years ago, unfortunately.
I think this is a very good example of how Windows differs from Linux in its goals and design. I have a feeling this is the reason Linux has had a hard time catching on on the desktop. It's easy to complain about a slow filesystem, but Microsoft lives in a different world, where other priorities exist. For someone building a server running a database serving loads of people, Linux is a no-brainer: you can pick the parts you want, shrink-wrap it and fire it up without any fat. On a desktop machine, you want to be able to update drivers in the background without rebooting, you want virus scanners, and you want to have a driver ready the moment the user plugs in a new device. Both Windows and Linux are for the most part very well engineered, but with very different priorities.
I’m very confused by your post. You start off talking about desktop machines but NT was actually engineered for servers and then later ported to the desktop. You then describe a bunch of features Linux does better than Windows (eg updating drivers without a reboot).
I think a more reasonable argument to make is just that Windows is engineered differently from Linux. There are definitely advantages and disadvantages to each approach, but ultimately it’s a question of personal preference.
NT is engineered for a different category of servers, though - it's a workgroup server first (originally its chief competitor was NetWare), and a Web/Internet server second. That drives a different set of priorities.
For example, as someone elsewhere in the comments pointed out, NT does file access in a way that works very well when accessing network shares. That's a pretty core use case for Windows on business workstations, where it's common for people to store all the most important files they work with on a network share, for easier collaboration with other team members.
NT was architected to handle high-end workstations from day one — there’s a reason why running the GUI was mandatory even when the resource costs were fairly substantial.
Check out e.g. https://en.wikipedia.org/wiki/Windows_NT_3.1 for the history of that era. The big selling point was that your business could code against one API everywhere, rather than having DOS PCs and expensive Unix, VAX, etc. hardware which was completely different and only a few people on staff were comfortable with.
OS/2 was a high-end desktop OS, but NT diverged a little, took some heavy design principles from VMS (hence its name, WNT), and was thus pivoted towards back-office systems rather than desktop usage.
At that time Microsoft’s server offering was a UNIX platform, Xenix, but it was becoming clear that there needed to be a platform to serve workstations that wasn’t a full-blown mainframe.
So Microsoft handed Xenix to SCO to focus on their collaboration with IBM; the intent there was always to build something more than just a high-end workstation OS. And given it was intended to be administered by people who were Windows users rather than UNIX greybeards (like myself), it clearly made sense to make the GUI a first-class citizen; but that doesn’t mean it was sold as a desktop OS.
My point was that it is misleading to say it was billed as a server OS when all of their messaging was that it was both — maybe not as far down as low-end desktops but they were very clear that any serious work could be done on NT, going after the higher end PC and lower end workstation business.
In that era workstations weren’t the same thing as desktops. They were an entirely different class of computer, and often workstations just ran server OSs with a nicer UI (NeXT, SGI, etc). So your point about workstations doesn’t invalidate what I was saying about NT not originally targeting desktops.
Drivers in Linux live in the kernel. Whenever the kernel is updated, a reboot is required (in most distros). Hence your assertion that Linux handles driver updates without a reboot better than Windows is questionable.
You only need to restart if there has been a kernel update (on any platform, not just “some distros”). For regular driver updates within the same kernel ABI you can use modprobe to unload and reload the drivers. This works because while drivers share the same kernel memory space (as you alluded to), they aren’t (generally) part of the same kernel binary. They normally get bundled in the same compressed archive but are separate files with a .ko extension.
This isn’t a system that is unique to Linux either. Many UNIX platforms adopt similar mechanisms and Windows obviously also has its drivers as separate executables too.
It just so happens that rebooting is an easier instruction to give users than “unload and reload the device driver”, which might also be dangerous to “hot-unload” for some devices while they’re in use. So a reboot tends to be common practice on all platforms. But at least on Linux it’s not mandatory the way it is on Windows (for reasons other than the ability to reload drivers on a live system).
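Not authoritative, but the reload dance is roughly the following; the module name e1000e is just a placeholder for whatever driver you're actually swapping, and it will refuse to unload if the device is busy or the driver has no exit path:

    import subprocess

    MODULE = "e1000e"  # placeholder: whichever driver you're updating

    # See what's currently loaded and what depends on it.
    subprocess.run(["lsmod"], check=True)

    # Unload the old module, then load the freshly installed .ko (needs root).
    subprocess.run(["sudo", "modprobe", "-r", MODULE], check=True)
    subprocess.run(["sudo", "modprobe", MODULE], check=True)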
It's not mandatory on Windows either. I've updated various drivers for a wide range of devices over the years without needing a reboot. From what you describe, it seems the situation on Windows is similar to that on Linux.
Windows is a little different: to unload a driver in Windows it needs to support an unload method, which not all drivers do. And without that you cannot even write the updates to disk (due to the file locking vs inode differences which have already been discussed in this thread) let alone load them into the kernel.
That said, if a kernel module is in use on Linux then it’s sometimes no easy task finding the right procedure to follow to do an unload and reload of it.
Ultimately this is all academic though. These things mattered more on monolithic systems with little redundancy but these days it’s pretty trivial to spin up new nodes for most types of services so you wouldn’t get downtime (generally speaking. There are notable exceptions like DB servers on smaller setups where live replication isn’t cost beneficial)
That raises the question of why Microsoft did not implement a standard convention (not an API) for unloading a driver. They could have specified that a driver does this and that on unload, and performed an as-if-shutdown unload if the driver fails to follow the convention.
> On a desktop machine, you want to be able to update drivers in the background without rebooting, you want virus scanners, and you want to have driver ready the moment the user plug's in a new device.
With the exception of the virus scanner these actually sound like arguments in favour of Linux, in my experience.
(Although there are also excellent virus scanners available for Linux anyway)
I’m pretty confused by this post. What would you identify their priorities as?
Regardless, as someone who is not a fan of Windows (but doesn’t find it worth learning a Unix alternative), I would argue it’s the polish that makes the experience worth it, not some better-engineered experience. For instance: working with variable DPI seems to be trivial on Windows, whereas it still seems years off in Linux. Same with printers, notifications, internet access and almost everything else. These aren’t feats of engineering per se, but they do indicate forethought I deeply appreciate when I do use Windows.
I would hesitate to ascribe too much purpose to everything you see. Microsoft is a huge company with conflicting political factions and a deep ethos of maintaining backwards compatibility so there are plenty of things which people didn’t anticipate ending up where they are but which are risky to change.
One big factor is the lost era under Ballmer. Stack ranking meant that the top n% of workers got bonuses and the bottom n% were fired, and management reportedly heavily favored new features over maintenance. Since the future was WinFS and touching something core like NTFS would be a compatibility risk, you really wouldn’t have an incentive to make a change without a lot of customer demand.
As a C# dev, I am constantly annoyed that Windows updates (and sometimes installs) require reboots or stopping all user activity, while I've never had to reboot or block during an upgrade on Ubuntu.
To be fair, a lot of Linux updates require a reboot or at least a logout to properly take effect, too. Windows is just very aggressive about forcing you to upgrade and reboot, which undeniably has security benefits when you consider that Windows has a lot of non-technical users and a huge attack surface. At least they have relaxed it a bit; the frequent forced reboots caused me some serious problems on a Windows machine I had controlling a CNC machine.
Windows also requires rebooting for the actual upgrading process. A Linux update might need a reboot to take effect, but the reboot is still a normal reboot; it won't take longer because it's trying to install something.
Both Windows and macOS suffer from this. Big updates to both systems can render the computer unusable for 30 minutes while they are installing.
This is true. Fedora now only has a Reboot and Update button in the GNOME Software GUI because some software like Firefox and some GNOME components crash if you update them while they're running (although this seems to happen more often with Wayland than Xorg for some reason). At least Linux and the BSDs give you a choice whether to do offline or online updates.
Most of these things are coincidental byproducts of how Windows (NT) is designed, not carefully envisioned trade offs that are what make Windows Ready for the Desktop (tm).
For some counterexamples of how those designs make things harder and more irritating, look at file locking and how essentially every Windows update forces a reboot; that is pretty damn user-unfriendly.
Even without file locking, how would live updates work when processes communicate with each other and potentially share files/libraries? I feel like file locking isn't really the core problem here.
Everything that is running keeps using the old libraries. The directory entries for the shared libraries or executables are removed but as long as a task holds a live file descriptor the actual shared library or executable is not deleted from the disk. New processes will have the dynamic linker read the new binaries for the updated libraries. Unless the ABI or API somehow changes during the update (and they don't, big updates usually bump the library version) things work pretty fine.
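You can watch this happen with a throwaway file; a minimal sketch (the filename is made up):

    import os

    # Create a stand-in "old library", open it, then remove its directory entry.
    with open("libfake.so.1", "w") as f:
        f.write("old version")

    fd = os.open("libfake.so.1", os.O_RDONLY)
    os.unlink("libfake.so.1")        # directory entry gone, inode still referenced

    print(os.read(fd, 100))          # b'old version' -- the old data is still readable
    os.close(fd)                     # only now can the kernel actually free the blocks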
1. On the one hand I see folks accessing files over and over by paths/names, and on the other hand they demand features that would break unless they switched their fundamental approach to handles/descriptors. Which is it? You can't claim descriptors would fix a problem and simultaneously insist on path-based approaches being perfectly fine. Most programs use paths to access everything (and this goes beyond shared libraries) and assume files won't have changed in between. You can blame it on the program not using fds if that makes you feel better, but the question is how do you magically fix this for the end user?
2. Do you actually see this working smoothly on a Linux desktop environment in practice, or do you just mean this is possible in a theoretical sense? Do you not e.g. get errors/crashes after an apt-get upgrade that presumably upgraded a package your desktop environment depended on (say GTK or whatever)? That happens to me frequently (and I'm practically guaranteed to see a problem if I open a new window in some program in the middle of an update), and it scares me what might be getting corrupted on the way -- makes me wish it would reboot instead of crashing and stop giving me errors.
1. In general, updating the same files at the same time is not that common a problem in any practical sense. The user generally won't be editing the same document in two different editors at the same time. Programs use flock(2) or something similar if they have to update a shared file (a tiny sketch follows at the end of this comment), or they have a directory structure that allows different instances of the program to update simultaneously by using many small files instead of requiring mutually exclusive access to a single one.
I think the most common real-life problem is editing a shell script while it's still running: this happens often during development if the shell script takes a bit longer to run. You edit the file and hit save before the previous run has finished. The on-disk data changes, which is reflected in the shell process that mapped the script file, and eventually the contents that changed or shifted will break the shell's parser.
2. I have 106 days of uptime on my laptop. It has gone through several apt upgrades and I don't think I've shut down my X11 session for 106 days either. Firefox sometimes restarts itself after an update because it apparently knows it needs to do that but other than that I generally never restart X or reboot the machine because of updates. This has basically been the case for years, even decades. The scheme probably has to break eventually but I generally bump into other stuff such as important kernel updates before that. Fair enough for me, never really ran into any issues because of it.
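Back to the flock(2) point in 1.: the cooperative pattern is tiny. A rough sketch (the lock/state file name is arbitrary):

    import fcntl

    # Advisory locking: every writer agrees to take the lock before touching the file.
    with open("shared.state", "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)    # blocks until no other process holds it
        f.write("one more update\n")     # safe: we are the only writer right now
        f.flush()
        fcntl.flock(f, fcntl.LOCK_UN)    # closing the file would release it anyway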
1. User opens a document. User moves a higher-level directory not realizing it was an ancestor of that file. Then user goes back to the program and it can no longer find the file because it was using file paths. What should happen? Should the OS play any role?
2. You manage to keep X11 open, but that's hardly the point I was making. Do you also keep everything open and use your GUI programs as normal when going through an upgrade, or do you change your behavior in some way to avoid it messing up what you're running? And/or are you selective about which updates you apply to minimize their effects on what you're running?
Furthermore, are you familiar at all with the kinds of errors I referred to? Or have you never seen them in your life and don't know what I'm even talking about? If you don't think I'm just making things up when I say updates frequently cause me to get crash and error messages ("Report to Canonical?" etc.), then in your mind, why does this happen right when I update? Is it just some random cosmic bit flip or disk corruption on my new computer that pops up exactly when I update? Is it not possible that it's the update changing files when programs didn't expect them?
1) No, I think the assumption has to be that the user should know what s/he's doing. However, how would using handles even help there? If you close the file you will need to access it by a path even on Windows, and the very path has changed. Or instead, if you keep the file open and do not try to reopen it, even Linux lets you keep the file descriptor and have the program access the file as before even if it's moved around in the directory tree.
2) Yes, I generally keep stuff running as usual. I don't screen any updates, I just run them whenever I remember to. I think I've seen similar things to what you described. They're a rare exception though.
Obviously, doing something like a major update to a new Ubuntu version would make me close all programs and reboot the machine after the update. But any normal updates I just let through without thinking twice.
There will be problems eventually, but the version mismatches become rather evident at that point: a configuration file format has changed, or some scripts have moved, or a library has moved. I've seen GNOME Panel get messed up a couple of times as GNOME gets notified of configuration file changes and the old Panel tries to load stuff meant for the new version. I keep Emacs running all the time and I've seen it fail to find its lazy-loading Lisp files some time after an update, being unable to enter a major or minor mode. I've seen Nautilus go wonky and unresponsive some hours or days after an update, but killing the process fixed it. Chrome doesn't seem to mind, but it gets a bit slow after a few weeks of use so it tends to get an occasional restart even without updates. I've seen crash dialogs which don't come back after I restart the program, but again those are a handful across several years and were mostly about long-running panel items like calendars or notification tickers.
However, all these are rare enough that I don't really feel any particular pain. It's quite indistinguishable from these complex programs rarely but sometimes still crashing on their own, all even without updates.
It generally takes a really long-running session, with enough accumulated big updates that majorly change things underneath, before you can't just keep running the old binaries as they are. When something eventually misbehaves or crashes after the tenth or so update, I'll just restart that particular program. Most of the time the desktop itself keeps running like before. I don't recall ever losing data because of live updates, and I've used Linux since 1994 or so.
The live updates are much more convenient than restarting the whole system after each and every update just to make sure. I only restart one program when that program stops working, and like I said above even that is quite rare indeed.
As for you, you probably run programs that do suffer from this more.
I have the GNOME desktop with all its stuff running in the background, I keep Emacs running continuously, a couple of browsers whose uptimes are generally around 1-2 weeks anyway, a tmux session, and then a lot of other programs which I don't keep open all the time.
But as most of the desktop still churns along as usual, it's pretty easy to quit and relaunch a single program.
You can always restart processes; on Windows it is fundamentally impossible to overwrite a running DLL or EXE file. So, for example, if some services are needed to apply updates, they can never be updated without a reboot.
Yes, I'm aware of how Windows file locking works -- in fact you can sometimes rename running executables -- it depends.
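The "it depends" is mostly about the share mode the existing handles were opened with. A minimal Windows-only ctypes sketch, using a throwaway file rather than a real loaded DLL (victim.txt is made up):

    import ctypes, os

    GENERIC_READ      = 0x80000000
    FILE_SHARE_READ   = 0x00000001
    FILE_SHARE_DELETE = 0x00000004
    OPEN_EXISTING     = 3

    k32 = ctypes.windll.kernel32
    k32.CreateFileW.restype = ctypes.c_void_p

    with open("victim.txt", "w") as f:
        f.write("pretend I'm a running binary")

    # Hold the file open WITHOUT FILE_SHARE_DELETE: renames and deletes now fail.
    h = k32.CreateFileW("victim.txt", GENERIC_READ, FILE_SHARE_READ,
                        None, OPEN_EXISTING, 0, None)
    try:
        os.rename("victim.txt", "victim.old")   # sharing violation
    except PermissionError as e:
        print("rename blocked:", e)
    k32.CloseHandle(ctypes.c_void_p(h))

    # Reopen WITH FILE_SHARE_DELETE and the rename goes through even while it's open.
    h = k32.CreateFileW("victim.txt", GENERIC_READ,
                        FILE_SHARE_READ | FILE_SHARE_DELETE,
                        None, OPEN_EXISTING, 0, None)
    os.rename("victim.txt", "victim.old")
    k32.CloseHandle(ctypes.c_void_p(h))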
Your solution to rebooting the system being user-unfriendly is... restarting processes? How is that so much more user-friendly? From a user standpoint it's almost the same; you might as well actually lock down the system and reboot to make sure the user doesn't try to mess with the system during the update.
And on top of all that, if you're actually willing to kill processes, then they won't be locking files anymore in the first place, so now you can update the files normally...
So yeah, I really don't understand how file locking is the actual problem here, despite Linux folks always trying to blame lack of live updates on that. I know I for one easily get errors after updating libraries on e.g. Ubuntu making programs or the desktop constantly crash until I reboot... if anything, that's far less user-friendly.
Not all applications need to restart; most updates will affect things that are not the running application (office suite/web browser/game/whatever). Meanwhile your entire system has to restart.
"Most updates" won't affect running applications? What DLLs do you imagine "most updates" affect that are not in use by MS Office, Chrome, games, etc.? Pretty much everything I can imagine would be used all over the system, not merely coincidentally by desktop applications, but especially by desktop applications... if anything, it'd usually be the other way around, where some background services wouldn't need to be killed (because they sometimes only depend on a handful of DLLs), but many applications would (which can have insane dependency graphs). But both applications and background services also use IPC to interact with other processes (sometimes internally through Windows DLLs, not necessarily explicitly coded by them) which could well mean that they would need to be restarted if those processes need to be updated...
> What DLLs do you imagine "most updates" affect that are not in use by MS Office, Chrome, games, etc.?
Yeah, you can't update libc this way.
But outside of a short list of DLLs that are used by everything, files are mostly specific to a single program, and 90% of my programs are trivial to update by virtue of the fact that they aren't running.
And most of the background services on both linux and windows can be restarted transparently.
> But outside of a short list of DLLs that are used by everything, files are mostly specific to a single program, and 90% of my programs are trivial to update by virtue of the fact that they aren't running.
Are we talking about the same thing? We're talking about Windows updates, not Chrome updates or something. Windows doesn't force you to reboot when programs update themselves. It forces you to reboot when it updates itself. Which generally involves updating system DLLs that are used all over the place.
I don't think most updates touch those DLLs. Most have a modified date of my last reinstall. Some updates do, but a whole lot more could install without a restart if microsoft cared at all (like if it cost them ten cents).
>So for example if some services are needed to apply updates, they can never be updated without a reboot
I wouldn't say never. Hotpatching was introduced in Windows Server 2003[1]. However, it's seldom available for Windows Update patches, and even when it is, you have to opt in (using a command-line flag) to actually use it.
Yeah, it's amazing there is just a brief flash and everything is up and running again.
When I had slightly more unstable drivers, Windows could recover from that as well. The driver would crash, screen goes black, and then back up and running again without most apps noticing (excluding games and video playback).
Indeed, I've updated many a driver on Windows (including graphics, as you mention) without a reboot required. I've always needed a reboot to do the equivalent kind of update under Linux.
They mention that file performance decreases with the number of filters that are attached to the NTFS drive. Is there a way to list the filters to determine which ones (and how many) are applied on your system?
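I believe the built-in fltmc tool shows this (elevated prompt required); something like the following, with the exact flags from memory, so double-check them:

    import subprocess

    # All loaded filter-manager minifilters and their altitudes.
    print(subprocess.run(["fltmc", "filters"],
                         capture_output=True, text=True).stdout)

    # Which filters are attached to a particular volume, e.g. C:.
    print(subprocess.run(["fltmc", "instances", "-v", "C:"],
                         capture_output=True, text=True).stdout)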
Interesting that the article never says "poorly architected", though the conclusion that the issue is peanut-buttered across the system points at that. Instead of looking through the system for things to optimize, is there any proposal or initiative to rework it at a higher level?
> Noting, as an aside, that it isn't really all that necessary for MSFT to work on the project, because I gather there are at least 684 FOSS NT kernel developers available, and qualified, and willing, to volunteer their time to work on projects like that. I assume that's why all those people upvoted, anyway. With a team that size, stabilizing WinBtrfs will take no time at all.
Along similar lines, one of the usual gripes with Windows is the horrible file copy estimates; Raymond Chen wrote an article explaining why that is[1].
The Old New Thing is a fantastic blog, especially if you develop on Windows. I used to read it religiously until about five years back; I stopped around the time Google Reader was retired and somehow never set it up in my new flow of reading blogs. Thanks for reminding me about this blog and article.
You're being downvoted, but it's true about linux for at least two reasons to this day:
1. dm-crypt threads are unfair to the rest of the system's processes [1]. On dmcrypt systems, regular user processes can effectively raise their process scheduling priority in a multithreaded fashion by generating heavy IO on dmcrypt storage.
2. Under memory pressure, the VM system in linux will enter a thrashing state even when there is no swap configured at all. I don't have a reference on hand, but it's been discussed on lkml multiple times without solution. I suspect the recent PSI changes are intended as a step towards a solution though. What happens is clean, file-backed pages for things like shared libraries and executable programs become a thrashing set resembling anonymous memory swapping under memory pressure. As various processes get scheduled, they access pages which were recently discarded from the page cache as evictable due to their clean file-backed status when other processes ran under pressure, and now must be read back in from the backing store. This ping-ponging continues dragging everything down until either an OOM occurs or pressure is otherwise relieved. This often manifests as a pausing/freezing desktop with the disk activity light blazing, and it's only made worse by the aforementioned dmcrypt problem if these files reside on such volumes.
This behaviour, and perhaps other causes with similar effects for desktop users, is what drove me away from helping to make Linux "ready for the desktop". The kernel philosophy was incompatible with the needs of desktop users. Those developing for the desktop (like myself) didn't have the required expertise to make the kernel do the Right Thing, and couldn't find enough people willing or able to help.
Things may have changed over the years - I’ve been running a Linux desktop recently and haven’t seen this kind of issue yet (the kind where you need to use magic keys to ask the kernel to kill all, sync and reboot) but reading your post, perhaps this is because RAM is much more plentiful these days.
It's probably the plentiful-RAM thing. I haven't looked recently into whether the Arch mainline kernel kconfig is just bad or not, but the OOM killer is trash for me. I used to have the "recommended" swap == RAM size, but then the memory manager never even tried to cull pages until it was out of memory and froze up trying to swap constantly. I'm currently running a 16/4 split and will probably drop to 16/1, because any time I hit memory limits everything just freezes permanently rather than the OOM killer getting invoked. I've hit it twice this week trying to render in Kdenlive and run a debug build of Krita...
If more people reproduced the dmcrypt issue and commented in the bugzilla issue it'd put more pressure on upstream to revert the known offending commit.
For some reason they seem to be prioritizing the supposedly improved dmcrypt performance over fairness under load, even though it makes our modern machines behave like computers from the 90s; unable to play MP3s and access the disk without audio underruns.
I assume it's because they're not hearing enough complaints.
I’ve attempted to use Linux and desktop freezing is the norm even on machines that run fine under windows. Admittedly the machines might be underpowered but that does call into question the commonly held belief that desktop Linux is better for low spec machines.
I have had issues at times in the past, but with things like core dumps on systems with spinning disks and 128 GB of RAM. The OOM behaviour on Linux can be brutally frustrating. But it's still light years ahead of Windows for development...
> At least windows doesn't freeze your whole desktop under heavy IO.
It happens on Windows as well, and unlike Linux, Windows still carries a risk of permanent damage when available space runs low. It was even worse in the XP days, but these issues are still there.
And it's a particularly interesting issue, because this problem mirrors the congestion control failure observed on most networks in recent years. We have all seen it: on a busy network, latency increases by two orders of magnitude, ruining other network activities like web browsing, even though they themselves require only a little bandwidth. The simplest demo is uploading a large file while watching the ping latency; it jumps from 100ms to 2000ms. But that should not happen, because TCP congestion control was designed precisely to solve this.
It turns out that the cause of this problem, known as bufferbloat, is the accumulated effect of excessive buffering in the network stack, mostly the system packet queue, but also the routers, switches, drivers and hardware, since RAM is cheap nowadays. TCP congestion control works like this: if packet loss is detected, send at a lower rate. But when there are large buffers on the path for "improving performance", packets are never lost even if the path is congested; instead, they are put into a huge buffer, so TCP never slows down properly as designed, and during slow-start it believes it's on its way to the moon. On top of that, all the buffers are FIFO, which means that by the time your new packet finally gets out it's probably no longer relevant: it takes seconds to move from the tail of the queue to the head, and the connection has already timed out.
Solutions include shrinking buffers and limiting their length (byte queue limits, TCP small queues). Another innovation is new queue management algorithms: we don't have to use a mindless FIFO queue, we can make queues smarter. As a result, CoDel ("controlled delay") and fq_codel were invented: they are designed to prioritize packets that have just arrived, and to drop old packets to keep your traffic flowing.
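On the network side the mitigation is basically a one-liner nowadays; a rough sketch, with eth0 as a placeholder interface name (needs root, and many distros already default to fq_codel):

    import subprocess

    IFACE = "eth0"  # placeholder: your actual uplink interface

    # Show the current queueing discipline, then switch it to fq_codel.
    subprocess.run(["tc", "qdisc", "show", "dev", IFACE], check=True)
    subprocess.run(["sudo", "tc", "qdisc", "replace", "dev", IFACE, "root", "fq_codel"],
                   check=True)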
And people realized the Linux I/O freeze is a variant of bufferbloat, and the very same ideas of the CoDel algorithm can be applied to the Linux I/O freeze problem.
Another interesting aspect is that the problem is NOT OBSERVABLE if the network is fast enough or the traffic is low, because the buffering does not occur, so it will never be caught in many benchmarks. On the other hand, when you start uploading a large file over a slow network, or start copying a large file to a USB thumb drive on Linux...
> And people realized the Linux I/O freeze is a variant of bufferbloat, and the very same ideas of the CoDel algorithm can be applied to the Linux I/O freeze problem.
There are myriad causes for poor interactivity on Linux systems under heavy disk IO. I've already described the two I personally observe the most often in another post here [1], and they have nothing at all in common with bufferbloat.
Linux doesn't need to do less buffering. It needs to be less willing to evict recently used buffers even under pressure, more willing to let processes OOM, and a bridging of the CPU and IO scheduling domains so arbitrary processes can't hog CPU resources via plain IO on what are effectively CPU-backed IO layers like dmcrypt.
But it gets complicated very quickly; there are reasons why this isn't fixed already.
One obvious problem is the asynchronous, transparent nature of the page cache. Behind the scenes pages are faulted in and out on demand as needed, this generates potentially large amounts of IO. If you need to charge the cost of this IO to processes for informing scheduling decisions, which process pays the bill? The process you're trying to fault in or the process that was responsible for the pressure behind the eviction you're undoing? How does this kind of complexity relate to bufferbloat?
> It needs to be less willing to evict recently used buffers even under pressure, more willing to let processes OOM
I've had similar experiences. On Windows, either through bugs or poor coding, an application sometimes requests way too much memory, leading to an unresponsive system while the kernel is busy paging away.
On Linux I've experienced the system killing system processes when under memory pressure, leading to crashes or an unusable system.
I don't understand why the OS would allow a program to allocate more than available physical memory, at least without asking the user, given the severe consequences.
Overcommit is a very deliberate feature, but its time may have passed. Keep in mind this is all from a time when RAM was so expensive swapping to spinning disks was a requirement just to run programs taking advantage of a 32-bit address space.
You can tune the overcommit ratio on Linux, but if memory serves (no pun intended) the last time I played with eliminating overcommit, a bunch of programs that liked to allocate big virtual address spaces ceased functioning.
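For reference, the knobs live in /proc; a quick way to see what your box is currently doing (0 is the heuristic default, 1 always overcommits, 2 enforces the ratio):

    # Peek at the current overcommit policy and ratio.
    for knob in ("overcommit_memory", "overcommit_ratio"):
        with open(f"/proc/sys/vm/{knob}") as f:
            print(knob, "=", f.read().strip())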
Yeah, I know it was a feature at one point... but at least the OS should punish the program overcommitting, rather than bringing the rest of the system down (either by effectively grinding to a halt or killing important processes).
What is the "correct" handling of swap on an HDD supposed to be like? It is going to be slow no matter what you do. Windows also locks up for long periods of time if you use up almost all the RAM and it has to swap to HDD.
When I say lock up I mean the UI completely stops. As in, my i3 bar stops updating for several minutes. Not even the linux magic commands let me recover.
On Windows things may be unresponsive, but at least Ctrl-Alt-Del responds, and at least the mouse moves!
The main difficulty is I can't tell if my machine has crashed vs is overloaded if the UI doesn't do anything for several minutes.
> Not even the linux magic commands let me recover.
Are you sure you've set up the magic sysctls correctly? Ubuntu ships with magic oomkill disabled, by default.
On all machines I've tried, whenever I've needed it, magic oomkill has always worked, and I've been thankful of the fact that it's implemented down in the kernel.
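If I recall the bit layout correctly, you can check whether your box even allows it; the sysrq value is a bitmask and bit 64 is the one that gates signalling (and thus the manual OOM kill):

    # kernel.sysrq is a bitmask; 1 means "everything enabled",
    # and bit 64 enables signalling processes (Alt+SysRq+F = manual OOM kill).
    with open("/proc/sys/kernel/sysrq") as f:
        mask = int(f.read())
    print("sysrq =", mask, "| oom-kill key enabled:", mask == 1 or bool(mask & 64))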
If you renice the UI processes to a higher-than-normal priority, it should work more like it does in Windows. (This used to be somewhat risky, but today Linux UI is not going to hog your system resources.) The underlying problem is that when memory is tight, Linux starts evicting "clean" pages from the page cache that actually will need to be accessed shortly afterwards (i.e. they're part of the working set of some running process), and thrashing occurs. There's no easy solution to this issue, other than making user programs more responsive to memory pressure in the first place. (This could extend as far as some sort of seamless checkpoint+resume support, like what you see in mobile OS's today.)
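A hedged sketch of the renice idea (the PID is hypothetical, and a negative nice value needs root):

    import os

    UI_PID = 1234  # hypothetical: the PID of Xorg / your compositor

    # Give the UI process a higher-than-normal priority (lower nice value).
    os.setpriority(os.PRIO_PROCESS, UI_PID, -5)
    print("new nice value:", os.getpriority(os.PRIO_PROCESS, UI_PID))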
> Well, why can't distributions that come with recommended GUIs just set that higher-than-normal priority by default?
Feel free to file bugs for your preferred distro. It would be especially appropriate to do this for critical UI processes like xorg/wayland or lightdm, and for "lightweight" desktops like xfce/lxde that aren't going to cause resource pressure under foreseeable conditions, even when run at higher-than-normal priority.
"Correct" handling of swap would mean mostly leaving the window manager and its dependencies in memory. Individual application windows may stop responding, but everything else should be pretty quick. And small things like terminal emulators should get priority to stay in ram too.
Not that I'd recommend this, but my work MBP has 16GB of RAM, and my typical software development setup (JVM, IntelliJ, Xcode, Gradle) easily uses up 30GB. It swaps a lot, but generally macOS does a good job of keeping the window manager and foreground applications at priority, so I can still use my machine while this is happening.
I attribute this to the fact that the darwin kernel has a keen awareness of what threads directly affect the user interface and which do not (even including the handling of XPC calls across process boundaries... if your work drives the UI, you get scheduling/RAM priority). I don't think the linux kernel has nearly this level of awareness.
> ... the darwin kernel has a keen awareness of what threads directly affect the user interface and which do not (even including the handling of XPC calls across process boundaries... if your work drives the UI, you get scheduling/RAM priority). I don't think the linux kernel has nearly this level of awareness.
You're talking about priority inheritance in the kernel. In Linux, this is in development as part of the PREEMPT_RT ("real-time") patches, already available experimentally in a number of distributions.
I've experienced a total freeze before once one of my programs started swapping/thrashing. The entire desktop froze, not just the one program. This was in the past two years or so, so it's not a solved problem.
Case in point: I recently tried unzipping the Boost library on an up-to-date Windows 10, and after trying to move the frozen progress window after a minute, the whole desktop promptly crashed. I have to say, the experience is better than it used to be, because at least the taskbar reappeared on its own. (Decompression succeeded on the second attempt after leaving the whole computer well alone ... but it certainly took its time even on a high-end desktop computer.)
Could you please leave personal swipes out of your comments here? They have a degrading effect on discussion and evoke worse from others. Your comment would be just fine without the second sentence.
The issue seems to be with how Windows handles file system operations. It allows filters to operate on each request. These filters are like hardware drivers in that they are created and maintained by third parties. So MS has no real ability to change how things work, because it doesn't have absolute control over the operations in the same way that Linux does (Linux device drivers are part of the kernel).
Not sure if I'm being naive, but couldn't they make it so that if there are no filters present on a given volume, then the entire filter stack is bypassed and the directory nodes are served from a cache managed by the file system at the root of the volume? That way developers who really want performance can make sure they have no filter drivers installed on that volume.
The linked post notes that, even on a default Windows install, you already have a sizable stack of filter drivers active. Not sure whether that only applies to the C: volume or to other ones as well.
If they really wanted to, they could mount an ext4 fs stored as a single blob on disk directly into WSL. That won't really help if you want to run a GUI editor in Windows against that same directory, though.
That looks awesome (thanks for sharing), but it seems to be a different thing? People are looking for a way to run existing Linux FUSE filesystems in WSL, not develop new virtual file system implementations. They haven't responded to that request at all. https://wpdev.uservoice.com/forums/266908-command-prompt-con...
NTOS (Windows NT's kernel) was originally designed to be backward compatible with Windows (DOS era), OS/2 and POSIX. Designing such a kernel was considered quite a feat at the time, but it probably cost its fair share of bloat, which kept growing over the years. Also, it's not surprising that Linux is optimized for operating on files, since the UNIX philosophy is about treating everything as a file, something that Dave Cutler (NTOS's main designer) was known for criticizing[1].
I pray daily that more of my fellow-programmers may find the means of freeing themselves from the curse of compatibility. (Dijkstra)
As another data point here: I had to add a Tkinter-based loading screen to a PyQt5 app I wrote for work because the app, which takes half a second to start up on Linux, takes nearly a minute on Windows the first time around (and several dozen seconds on subsequent runs), and I know firsthand that my end users are unlikely to wait that long unless they can see something loading. I suspect it has to do with the sheer number of file accesses for each of the Python libraries and underlying DLLs and such.
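For what it's worth, the splash trick is only a handful of lines; a rough sketch of what I mean (not a polished loader, and it assumes PyQt5 is installed):

    import tkinter as tk

    # Throwaway Tk window, painted before the expensive imports start hitting the disk.
    splash = tk.Tk()
    splash.overrideredirect(True)                       # no title bar or borders
    tk.Label(splash, text="Loading...", padx=60, pady=30).pack()
    splash.update()                                     # force a paint before we block

    from PyQt5.QtWidgets import QApplication, QLabel    # the slow part on a cold cache

    splash.destroy()
    app = QApplication([])
    window = QLabel("Ready")
    window.show()
    app.exec_()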
> A Win32 path like C:\dir\file gets translated to an NT path like \??\C:\dir\file, where \??\C: is a symlink in Object Manager to a device object like \Device\HarddiskVolume4
Cool! Can the "NT paths" be used directly on Windows? As far as I know the ordinary Win32/NTFS system doesn't even have symlinks, and that feels like quite a handicap.
Note that this is behind a "developer flag", i.e. it needs elevation if you have not enabled "developer mode" in the Windows 10 settings. (I guess they don't want the cost of supporting ordinary people building circular dependencies.)
It depends on which API the application uses to pass paths to the OS and whether its path handling library does internal parsing that is or is not aware of those prefixes.
For example, Rust has support[0] for the magical file path prefixes[1] that can be used to escape path length restrictions, reserved identifiers and some other limitations. This is used by default whenever you normalize a path, which means it has fewer restrictions than Windows Explorer has by default.
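The same escape hatch is easy to poke at from Python too; a small Windows-only sketch with made-up directory names:

    import os

    # The \\?\ prefix tells Win32 to skip most path parsing,
    # which lifts the ~260-character MAX_PATH cap.
    base = "C:\\temp\\" + "\\".join(["a_rather_long_directory_name"] * 12)
    long_path = "\\\\?\\" + base                 # i.e. \\?\C:\temp\...
    os.makedirs(long_path, exist_ok=True)
    print(len(base), "characters deep, created fine")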
These require user-mode round trips for path traversal, so comparable to FUSE, I believe.
So it is relatively easy to add hooks of your own that get called as if they were file reads; NTFS calls these "Reparse Points". Linux would do this with named pipes.
I am just a user of these things, haven't dug into any implementation details recently, but I guess that the stacks to support these features in Linux and Windows each have their cache etc design decisions that have led to current state of filesystem performance under typical use-cases these days.
I'm not a super expert, but I think that is the case, and you even need them if you want to use some advanced features like paths that are longer than MAX_PATH or these virtual-mount kinds of things.
But IMO it's ugly as hell and just begging for compatibility problems. So don't go there, at least not to get super long paths. 260 bytes should be enough for anybody.
How did you manage? I've never hit the limit in like 20 years. Maybe 260 bytes means only 130 2-byte characters (I don't even know), and that I might find a little restrictive. In any case I'm tempted to say: just fix your damn paths (if possible). I guess at 80 characters it starts to get unreadable anyway.
Of course I have only encountered filenames this long a couple of times, but as for a full path like C:\something\something, this is very easy.
E.g. "How did you manage? I've never hit the limit in like 20 years." is 63 characters and it still looks like a reasonable title for a book. You may have written or downloaded a book and stored it in a folder named this way, in multiple formats (PDF, EPUB, etc.), and this can already produce something like "C:\Documents and Settings\John Doe\My Documents\My Projects\Books\How did you manage - I've never hit the limit in like 20 years\How did you manage - I've never hit the limit in like 20 years.epub", which is already 199 characters. Just 60 characters are left, and 60 characters is as short as "C:\Documents and Settings\John Doe\My Documents", and you might in fact have your book file named a longer way, like "John Doe - How did you manage? I've never hit the limit in like 20 years. 2nd edition draft". This is a bit of a clumsy example, but it demonstrates how easily reachable the limit actually is in real life.
This actually forced me to store my data in c:\a\b\c... (where a, b, and c actually were 1-4-letter acronyms) instead of C:\Documents and Settings\John Doe\My Documents\
Yes, this is how you end up with huge file paths. So don't do that. Obviously. It's not only long in storage terms. Nobody can even read it without getting dizzy.
Name it filepathlimit/filepathlimit.epub or something. Done. You can actually read that. Now if you want the prose, open the damn file. Or use the file explorer which might already show you a more complete title based on the file metadata.
And don't do that "C:\Documents and Settings\John Doe\My Documents\My Projects\Books\". There is no point in storing your things deep in a thousand rabbit holes. It's overzealous hierarchy fetishism. There is no point in creating unreadable paths that wrap around lines like wild. Use D:\Books or something. Or D:\John\Books if you must. Use basic common sense.
I don't do that, but I work with other people's stuff, and almost every non-geek does it.
> And don't do that "C:\Documents and Settings\John Doe\My Documents\My Projects\Books\". There is no point in storing your things deep in a thousand rabbit holes
That's the standard way Windows users are meant to store their data (although I don't). What is "\home\jdoe\" on Linux is "C:\Documents and Settings\John Doe\My Documents\" on Windows ("C:\Documents and Settings\John Doe\" actually but not from an ordinary user point of view).
> It's overzealous hierarchy fetishism.
I actually find it sad that we are still using hierarchies for that when we could just be using tags and semantic attributes instead (semantically, a document file doesn't even need a "file name"; the actual document title stored as an attribute or as a record within the file is enough, and actually better). Apparently ordinary people are more comfortable with the folder metaphor and hierarchies, so WinFS was cancelled and third-party tagging and semantic desktop systems remain little known.
It is actually surprisingly common to hit the max path length with build systems. Why?
You're building a project. Your source code is under C:\Users\twenty-ish character username\Documents\Projects\. The project is a released build which got extracted to something like network-configurator-1.4.5, which builds into a build directory underneath that. That's up to 80 characters right there, only 180 remaining.
If you have any process that starts doing something like "place the artifacts needed to build <result>.foo in result/" (say, cmake), and throw in some hierarchy nesting, you can chew up 180 characters remarkably quickly. Something like node.js produces this monstrosity of a path name:
Names that seem reasonable within the context of their folder may turn out to be highly redundant in the context of the entire pathname, but many people would rather have readable names at each level of the folder hierarchy, which makes it hard to keep full pathnames reasonable. Sometimes these names are required by a language implementation or other build system that you don't control.
The unreasonable thing here is not any of the components' names, and the problem fundamentally can not be fixed by shortening them. The problem is the insane uncontrolled nesting.
Hardly. "Insane uncontrolled nesting" seems quite a natural way to manage complexity in an automated way. I can see no principle other than Miller's law (the 7±2 "magic number") for labeling some level of nesting "insane", and it doesn't apply to automated processing. What are the good reasons, if any, to limit a path length so severely, other than human readability? If long paths put a non-negligible penalty on performance, I'd say the file system is defective by design.
Good reasons not to have many modules: compile times, website load times and sizes, project maintainability.
Nested dependencies have negative effects on all of those. They encourage uncontrolled addition of modules, and even addition of modules in multiple versions. They lead to wrong "isolationist" thinking.
In other words, they do not manage complexity but produce unneeded complexity.
Not having many modules means having big modules. The bigger a module is (anything bigger than half a screen) the lower my productivity is. Every module should fit into your mind easily.
Yes they can, though one problem is that because NT paths are less restrictive than Win32 paths, you can create files/directories that you then have trouble doing things with using standard tools.
Notably, the Windows shell does not support very long file names, like, at all. The Windows shell and Win32 APIs are also easily confused by files using the NTFS POSIX namespace, which may have \ in their names.
I still scratch my head as to why doing a kernel update of Ubuntu running under Hyper-V on a spinning disk is so horrifically slow. If I migrate the VM to SSD while it's running, let the update finish, then migrate it back to spinning disk, that's faster than just letting it update in place. This is my dev desktop, otherwise I would put it on SSD full time.
The comments above make me think something like the stat calls may be the issue, and moving the running VM to SSD hides the problem. It obviously isn't raw disk rate at that point.
Is the Ubuntu VM's / on a fixed or dynamic VHD? The former provides (modulo filesystem fragmentation) linear addressing of virtual LBAs to physical LBAs, and the latter can be heavily fragmented and result in sequential read/write operations becoming random access, which on a HDD kills performance.
My advice for anyone running VMs is if they're on HDDs or they're highly tuned for disk usage (databases for example), use fixed VHDs, otherwise use dynamic.
> I still scratch my head as to why doing a kernel update of Ubuntu running under Hyper-V on a spinning disk is so horrifically slow.
Define slow please.
For a laugh I picked a random VM (VMWare) at work and ran (I did apt update first):
# time apt upgrade
...
82 to upgrade, 5 to newly install, 0 to remove and 0 not to upgrade
...
real 6m16.015s
user 2m38.936s
sys 0m55.216s
The updates included two client-server DB engines (ClickHouse and PostgreSQL), the file-serving thingie (Samba) and a few other bits. The reboot takes about 90 seconds before the web interface for rspamd appears.
Specifics: host OS is Windows 10 Pro, the virtual machine is Hyper-V, guest OS is Ubuntu, dynamic VHDX, Storage Spaces pool with 4 HDDs and 1 SSD, host filesystem is ReFS.
I only observe this specifically with kernel updates. Everything else updates as I would expect on an HDD.
Slow is ~10-20 minutes. Which is why it is faster to migrate the running VM from the storage pool onto a single SSD, complete the Ubuntu kernel update, and migrate back.
Have you cleaned away the old kernel packages? Kernel package operations on Debian and Ubuntu are accidentally quadratic with a high time constant (there is a package script for each kernel that iterates over all installed kernels) so you want to make sure not to let them accumulate.
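If you haven't already, it's worth checking how many are installed; roughly this (the autoremove step prompts before deleting anything):

    import subprocess

    # List every installed kernel image package...
    subprocess.run(["dpkg", "--list", "linux-image-*"])

    # ...and let apt drop the ones the running system no longer needs.
    subprocess.run(["sudo", "apt", "autoremove", "--purge"])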
The build speed difference for Android (Gradle build system) between Windows and Linux is extremely noticeable, especially for large projects, where you can see almost 50% faster builds.
Keep in mind that Linux filesystems are all implemented in the kernel (FUSE aside), and WSL doesn't run a Linux kernel - it just emulates syscalls for userland to work. So there's no ext drivers in there, or any other standard Linux FS drivers.
tldr: discussion is about why disk access under WSL is slower than under Linux, mostly due to the different design constraints of Linux vs Windows for file systems.
An interesting comment from an insider. We all know from various benchmarks that Windows' filesystem access performance is far worse than the Linux kernel's, not only in WSL but in the Win32 subsystem too.
Also, process creation performance is worse on Windows. I wonder whether that is also a case of "death by a thousand cuts".
Windows performance is abysmal. Instead of beating around the bush, they should just state that it is the combination of everything that makes Windows dog slow compared to either Mac or Linux. Someone said it on that thread, but Windows really should be a DE on Linux now instead of being its own OS.