I develop on Windows 10 because I’m a .NET developer, and I really like the idea of WSL. But the more I use it, the more frustrated I become by its file access performance. I started off using it for Git, but now I just use Git Bash in PowerShell (which also annoys me with its slowness).
I haven’t developed on an actual *NIX machine in years, but recently I deployed something to my DO VPS and it “built” (PHP Composer) in what felt like 1/100 of the time it was taking on my computer, whether running the Windows binaries in PowerShell/CMD or the Linux binaries in WSL. Although I will say WSL is slower.
In fact, it was so fast that I’m about to call Microcenter and see about picking up a Thinkpad to install Linux on.
A major performance improvement is adding the working folders for my coding projects to the exclusion list of whatever antivirus solution is running. On low-specced machines I disable AV entirely, because I feel these days it is mostly snake oil anyway with zero days being commonplace.
UEFI Secure Boot, mandatory signed binaries, and Windows Defender (XProtect on macOS) have contributed more to protecting against malware than third-party antivirus. Although I think the existence, cost, and PITAness of third-party antivirus might very well have helped motivate the OS vendors to secure their products better.
It should be noted that I believe the parent comments included Windows Defender as an antivirus. Third-party was never specified, and disabling Windows Defender can indeed improve file access performance.
Can confirm. I usually have to turn off Windows Defender whenever I'm doing anything with Docker, or node modules, or something similar. If I don't, my computer slows to a crawl.
Source? I thought UEFI was just a way to make Linux a pain in the ass to dual boot with Windows? What's your evidence that it's effective against malware? I am biased here, and hate uefi.
UEFI is not the same thing as UEFI Secure Boot. UEFI booting in general makes dual-booting far easier than BIOS-based booting where operating systems have to fight over who owns the MBR. Secure Boot makes it harder to set up a multi-boot system because you need a signed bootloader for your Linux system.
I do remember, but correlation != causation. The major improvements that have made software so much more secure are not AV; they are things like ASLR, non-executable stacks, stack canaries, a shift to less-privileged code and having more functions in user space, memory-safe(r) languages becoming more commonplace, and an increase in general security awareness. If anything anti-virus is much less useful now that polymorphic shell code is commonplace, as well as the fact that user error (such as falling for a phishing attack) is by far the largest cause of security failings.
> If anything anti-virus is much less useful now that polymorphic shell code is commonplace
Source? I disagree with this statement. Polymorphic viruses have been commonplace for decades. I don't think that diminishes the importance of AV. AV software isn't restricted to comparing file hashes with known threats, there's so much more that can be done for security.
Are you asking for a source for only that statement or for my post in general? Source is myself. I have a master's in Cyber Security and have worked in the field for 15 years. I've written numerous exploits and have actively evaded antivirus in the past. I can tell you from experience that ASLR is 10 times the pain in the ass that AV is, and NX bits/DEP are maybe 100 times more. Not trying to have a dick measuring contest, just justifying why I don't mind citing myself :-D
Regarding:
> Polymorphic viruses have been commonplace for decades
I disagree. I wouldn't describe them as "commonplace" until maybe the last decade or so. Regardless, this is probably the weakest of the arguments that I made.
> AV software isn't restricted to comparing file hashes with known threats, there's so much more that can be done for security.
With this I agree, though I would contend that even the most advanced heuristics and things like the hook interceptions Comodo experimented with in the late 2000s are still not what has made us so much more secure. At best AV is a small layer of a defense-in-depth strategy. At worst it's a bloated, unnecessary layer that eats cycles and robs system resources that could be devoted to useful activities.
That said, if I had any Windows machines in my home (been on Linux exclusively for a bit over 10 years now), I would likely run Defender on them. I'm not suggesting that AV is worthless, just that it isn't the reason things are much more secure these days.
To give another data point, for the software I'm developing (https://ossia.io/, a few hundred thousand LOC of modern C++ / Qt), a full build, with the same compiler version (clang-7), on the same machine, on the same PCI Express NVMe disk, takes 2 to 3 times longer on Windows than on Linux.
Every time I have to build on Windows I feel sad for all the developers forced to use this wretched environment.
For C++ on Windows I use the MSVC compiler, and I'm usually happy with its performance. If your clang supports precompiled headers, try enabling them. This saves CPU time parsing the standard headers, but it also saves lots of IO accessing all those random small include files, replacing it with one large sequential read from the .pch.
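If it helps, here is a minimal sketch of the clang side of that (file names and header contents are just illustrative):

    // pch.h - a hypothetical precompiled header pulling in the expensive,
    // rarely-changing standard includes once.
    #pragma once
    #include <algorithm>
    #include <map>
    #include <memory>
    #include <string>
    #include <vector>

    // Build the PCH once, then reuse it for every translation unit:
    //   clang++ -std=c++17 -x c++-header pch.h -o pch.h.pch
    //   clang++ -std=c++17 -include-pch pch.h.pch -c main.cpp -o main.o

For MSVC the rough equivalent is /Yc to create and /Yu to use a shared precompiled header.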
I have a similar experience. I'm a developer on a product that has both C and C# components. The C component runs on both Windows and Linux and is by far the larger of the two while the C# GUI component is Windows-only.
Our main workstations are Windows but we also have a shared Linux development VM on our group's VM host. The VM host is not very powerful, it's a hand-me-down machine from another group that's around 7 years old at this point, with a low-clocked Bulldozer-based AMD Opteron CPU and spinning rust storage. In comparison my workstation has a 4th-gen Core i7 processor and a SATA SSD.
Despite the fact that my workstation should be around 2-3x as fast per thread and have massively better disk I/O, the VM host still builds our C code in less than half the time. If I didn't have to maintain the C# code and use MS Office I'd switch to Linux as my main workstation OS in a heartbeat. (Note that the compilation process doesn't use parallel compilation on either platform, so it's not that our VM is farming the compile out to more cores.)
I'm in a similar situation: I've had reasonable success on Linux developing a .NET Framework C# app in VS Code with Mono from the Ubuntu PPA. Almost everything works but for the odd project calling native Windows DLLs, so I keep Windows around for manipulating VS projects and compiling these oddities. Most dev can happen in Linux though.
In my case we use a closed-source commercial UI library very extensively whose vendor has previously stated they have no interest in supporting Linux or WINE so doing C# dev on Linux is totally out of the question for me, unfortunately. The company I work for has a long-term strategic initiative to transition to an entirely web-based UI so eventually we may be able to remove the C# code but that's still years off for my group.
(In retrospect tying ourselves to a closed-source library like that was a mistake; if I could go back in time and put us on a different path (using the technologies available at the time) I would have gone with C++ and Qt instead which would have allowed for cross-platform development and deployment. Not to mention that because Qt is open-source (even if we would have had to buy a proprietary license for it at the time) we could fix any bugs we encountered ourselves, unlike our current UI library where we just need to come up with work-arounds until the vendor can get around to fixing them. But these decisions were made before I was hired so I just have to live with them.)
So you have your core business logic and algorithms written in C, and only the front end/GUI uses C#? Can you cleanly separate the two (via MVVM or a variation thereof)?
Does the whole application run on C#, with RPCs between the two? What does that mean for performance?
Cheers, and thanks for any insights about your setup.
It's a pretty classic client/server setup (not too different from modern web apps, actually). The C code is the server and the C# code (which is actually a plugin to another application written in C#) is the client. The C server can run independently of the C# clients but isn't very useful (to our customers at least) by itself.
Most of the C# code's responsibility is taking user commands from the plugin's host application and converting them to messages sent to the C server for further processing, which isn't very speed/latency sensitive most of the time since it only has to operate at user-perceivable timescales and on small amounts of data. The remainder of the C# code is some custom UI controls and dialogs.
The network communications (if you squint hard enough) somewhat resembles REST with custom binary protocols instead of HTTP and JSON; if REST were in style when this code was first written (around 10 years ago) it almost certainly would have been used for the network/message layer instead.
Do you take the buffer cache into account? If the server has more RAM unused by processes, it probably uses it as a cache, which can be 3x quicker than an SSD.
Windows Defender is an abomination. It makes ordinary file operations on Windows extremely slow too. For example, unpacking the Go 1.11 zip, which is 8,700 files, takes a second on my PCIe SSD with Windows Defender disabled. Enable it and the extraction time rises to several minutes.
Can I plug System76 as an alternative? I feel like it's important to purchase Linux-native hardware. Microsoft has largely prevented competition in this market, and I think more viable options would benefit consumers. Also, S76 has pretty good hardware.
Thanks for the heads-up (S76), but Linux runs on pretty much anything these days. This laptop is a Dell Inspiron with a 17" touch screen and lots of sensors. I'm running Arch Linux on it and everything is supported out of the box. The only hardware tweak I have made is changing the driver in use for the Synaptics mouse, which dmesg mentioned and which will probably become the norm soon anyway.
There really is no such thing as MS native only anymore. I got Linux on here without accepting any obnoxious licenses and my laptop's price was partially subsidised by all the crap that I never even saw. To be honest, I'm not sure what the exact price breakdown really is on this thing but I do know that MS did not get in my way.
Dell is pretty Linux friendly. For example, to update firmware I copy the new image to my /boot (EFI) partition and then use the built-in option at boot to update it - simples! No more farting around with temporarily turning swap into FAT32 and a FreeDOS boot disc.
You can buy Dell laptops these days that ship directly with Ubuntu (such as XPS and Precision), and you explicitly see the Microsoft tax fall off when making the selection. It feels so good.
Intel's NUC bricks come in two varieties: prebuilt, and bring your own memory, storage and OS. The latter does not impose a Windows license fee, and as long as you're running a 4.x kernel all the hardware is generally supported, although I see that running Ubuntu combined with a Thunderbolt switch doesn't work.
Sporting laptop-class processors, these are not powerhouse machines and fall below the latest Mac Minis in performance. But as a development platform they can be plenty, particularly the newer models that ditch classic SATA for two NVMe slots.
Not too sure what the breakdown on my lappy is wrt bundles. As far as I can tell I paid a fair price for it and I did not accept any licenses that I did not want. When I say fair price, I think it is pretty decent. I don't know what a 17" Apple laptop with a touch screen would cost, but probably more than the £950ish I paid for this beast.
I'll plug the Librem 13/15. I have the 13 and I'm very happy with it (initially it had a bug in the firmware so my NVMe SSD that I bought separately wasn't bootable, but they fixed it very quickly).
I got a Galago Pro recently, because my previous netbook fell apart. The construction on the Galago Pro is quite solid, I believe; getting it apart was a bit more challenging than I am used to. However, the laptop is still being manufactured by CLEVO, and as one consequence of this, the battery life is pretty limited (I was aware of this drawback before purchase). I believe that, as the sibling poster mentions, they are bringing their design and manufacture in-house, and that subsequent revisions will be an improvement. I don't find the issue to be hugely limiting, and I am personally willing to forgive quite a lot to have (1) no Windows key, and (2) non-soldered RAM.
I know SQL Server for Linux's Docker image has been a breeze to work with, even in Docker for Windows. Also, SQL Operations Studio (Electron-based) is catching up to SSMS.
JetBrains' Rider IDE is now so mature that I use it as a better Visual Studio even on Windows. It also runs on Mac and Linux. Great code manipulation, navigation and refactoring tools, and great support for adjacent technologies like build and test tools for both the .NET code and web front-ends.
Can you code on Linux using JetBrains' IDE to create .NET 3.5 apps for a Windows target? 3.5 is just an example, because the matching Visual Studio works more or less in Wine, while I haven't tested 4.0 and more recent versions.
Is this purely about convenience tools in an IDE or are there some things actually locked into the Windows environment? Is .NET a lot like Xcode where it's not like you can just download the libraries + compiler + a text editor and have all you need? The latter is broadly true for every language I've really dove into so this feels like a foreign concept.
Edit: what I could stand to gain is that I work weekly in four or five languages. Already well tooled up in Ubuntu and vscode. Nothing frustrates me more than having to keep multiple IDEs consistent. Imagine driving two cars all day that have their controls in all different places.
Depends on what you're doing. Some technologies are only supported on Windows (like the official UI frameworks), but most things like webdev and gamedev libraries are supported on all platforms. Giving up Visual Studio can be a hard sell, as from my experience C# + Visual Studio (+ ReSharper) is one of the most productive programming environments you can have.
You can download the SDK from https://dotnet.microsoft.com/ and use it on any platform with your editor of choice, including JetBrains Rider, which is a cross-platform .NET IDE.
Depending on your UI needs, may want to look at Eto.Forms and MonoGame. :-) That said, I agree on productivity for C# + VS. I find I'm more productive with node + npm + vs code though.
.NET Core hasn't been too bad outside VS... I do wish they'd stuck with the JSON project format. I also wish dotnet had a task runner like npm's built in.
The question here is - what do you gain? You're giving up arguably one of the best developer tools available for what? A slightly different desktop skin? Remapped shortcuts? Using slightly different command line commands?
I don't doubt that. But that's not the point - I'm sure plenty of people get by with Paint or Paint.NET, but no one sane would call them a replacement for Photoshop and its workflows.
Same with VS vs. VSCode - I'm happy that it works great for you, but I'm not sure why you'd think they're comparable tools.
Reasonable. .NET Core is obviously fine, but even .NET Framework stuff is largely runnable with up to date Mono, as MS is slowly pushing lots of previously 'system' libraries into NuGet. WPF is the only notable big dealbreaker.
> I started off using it for Git, but now I just use Git Bash in PowerShell (which also annoys me with its slowness).
This may not be a WSL issue. You don't want to mix Windows git (what you're calling git-bash) with Linux git (or MSYS2 git). Even a git status will cause them to trample on each other in ways I don't yet fully understand, and that will also slow them down very significantly as one tries to process a repo the other one has previously accessed. Pick one and stick with it for any given repo.
You may actually see better performance via Docker or a Linux VM under Windows. Also, if you're using or can migrate to .NET Core, it works pretty well there.
I'm using containers for local services I'm not actively working on, even though the application is being deployed to Windows, because it's been easier for me. I'd actually prefer a Linux host at work, but there's legacy crap I need.
I do work remotely sometimes on my home hackintosh though.
I remember this issue (I commented in it a few years ago).
On the bright side, WSL as a development environment is no longer slow when it comes to real-world usage. I've been using it for full-time web app development for the last year and even made a video about my entire setup and configuration a week ago[0].
For example with WSL you can easily get <= 100ms code reloads even through a Docker for Windows volume on really big applications with thousands of files, such as a Rails app.
Even compiling ~200kb of SCSS through a few Webpack loaders (inside of Docker with a volume) takes ~600ms with Webpack's watcher. It's glorious.
I haven't had a single performance issue with a bunch of different Rails, Flask, Node, Phoenix and Jekyll apps (with and without Webpack). This is with all source code being mounted in from a spinning disk drive too, so it can get a lot faster.
So while maybe in a micro-benchmark, the WSL file system might be an order of magnitude slower (or worse) than native Linux, it doesn't really seem to matter much when you use it in practice.
> with WSL you can easily get <= 100ms code reloads even through a Docker for Windows volume
(Edited after watching your video.)
In your video it looks like you're running things in Docker containers. Even if you start containers using WSL, they still run in a separate Hyper-V virtual machine with a true Linux kernel, whereas WSL shares the Windows kernel and works by mapping Linux system calls directly to Windows kernel calls. When you run the "docker" command in WSL, it's just communicating with the Docker daemon running outside of WSL.
Docker runs this way on Windows because WSL does not implement all the Linux kernel system calls, only the most important ones needed by most applications, and the missing ones include some needed to run the Docker daemon.
All in all, this means that what you're talking about is not affected by the linked issue because it uses a different mechanism to access files (the Hyper-V driver rather than the WSL system call mapping). Although, if anything, I would expect Hyper-V to be even slower.
(Your edit makes my reply make a lot less sense since you removed all of your original questions, but I'll leave my original reply; read the part after number 7.)
My set up is basically this:
1. I use WSL as my day to day programming environment with the Ubuntu WSL terminal + tmux[0]. It's where I run a bunch of Linux tools and interact with my source code.
2. I have Docker for Windows installed (since the Docker daemon doesn't run directly in WSL yet due to missing Linux kernel features like iptables, etc.).
3. I installed Docker and Docker Compose inside of WSL but the daemon doesn't run in WSL. I just use the Docker CLI to communicate with Docker for Windows using DOCKER_HOST, so docker and docker-compose commands seamlessly work inside of WSL from my point of view[1].
4. All of my source code lives on a spinning disk drive outside of WSL which I edit with VSCode which is installed on Windows.
5. That source code drive is mounted into WSL using /etc/wsl.conf at / (but fstab works just as well)[2].
6. That source code drive is also shared with Docker for Windows and available to be used as a volume in any container.
7. All of my Dockerized web apps are running in Linux containers, but using this set up should be no problem if you use Windows containers I guess? I never used Windows containers, but that seems out of scope for WSL / Docker CLI. That comes down to Docker for Windows.
But, it's worth mentioning I have installed Ruby directly in WSL and didn't use Docker, and things are still just as fast as with Docker volumes. In fact, I run Jekyll directly in WSL without Docker because I really like live reload and I couldn't get that to work through Docker. My blog has like 200+ posts and 50+ drafts, and if I write a new blog post, Jekyll reloads the changes in about 3 seconds, and that's with Jekyll-Assets too. I have a feeling it wouldn't be that much faster even on native Linux since Jekyll is kind of slow, but I'm ok with a 3 second turn around considering it does so much good stuff for me.
It's worth noting that copying in/out of the Docker for Windows container can be interesting. Docker for Windows mirrors the content of the volume directory on Windows to a native one inside the container... for some editing, it's fine... but try something like a SQLite database with a GUI on Windows and a container in Linux both connected, and it will blow up on you.
The mount in WSL is really adaptive calls to the native OS, so that side will work without issue... the sync into the container while editing is fast enough, and running/building in the container is faster than it would be in WSL itself, as described by the GP.
Are you talking about running it from the command line?
I use Git almost every day and have it installed directly inside of WSL.
It's not slow for any projects I've worked on, but it's worth mentioning I'm not dealing with code bases with millions of lines of code and many tens of thousands of files.
Most of the code bases I work on have ~100k LOC or less (most are much less), and hundreds or low thousands of files.
Grepping through 50,000+ files through a WSL mount on a non-SSD is pretty slow, but I haven't tried doing that on a native Linux system in a long time so I can't really say if that slowness is mostly WSL or grepping through a lot of files in general.
If grep is hitting 100% CPU utilization and isn't just file-system bound, and you are dealing with ASCII stuff, you can often speed it up a lot by prepending 'LANG=C', so that it doesn't have to deal with Unicode.
It rather depends on what "in practice" means in practice, though. I occasionally look at it with a test dominated by building small C programs, and WSL remains several times slower than a Linux/ext4 install on the same hardware.
Runtime on a Linux desktop machine is about 20 minutes. The point is that it's a workload dominated by small file operations, and in particular small file stat operations, that WSL is particularly bad at. And really it's not so crazy a workload.
My test is that W10 takes half an hour every time I start it, hitting the hard disk very hard and being barely usable.
However, I'm ready to work on Ubuntu in less than a minute.
I spent many years optimizing "stat-like" APIs for Picasa - Windows just feels very different than Linux once you're benchmarking.
It turns out Windows/SMB is very good at "give me all metadata over the wire for a directory" and not so fast at single file stat performance. On a high-latency network (e.g. Wi-Fi) the Windows approach is faster, but on a local disk (e.g., compiling code), Linux stat is faster.
This is off-topic, but is there any chance of bringing the Picasa desktop client back to the masses?
There's nothing out there that matches Picasa in speed for managing large collections (especially on Windows). The Picasa Image Viewer is lightning-fast, and I still use them both daily.
There are, however, some things that could be improved (besides the deprecated functionality that was gone when Picasa Online was taken away); e.g. "Export to Folder" takes its sweet time. But with no source out there, and no support from the developers, this will not, sadly, happen.
I'm mostly clueless about Windows, so bear with me, but that makes no sense to me.
If SMB has some "give me stat info for all stuff in a directory" API call then that's obviously faster over the network since it eliminates N roundtrips, but I'd still expect a Linux SMB host to beat a Windows SMB host at that, since FS operations are faster and the Linux host would also understand that protocol.
Unless what you mean is that Windows has some kernel-level "stat N" interface, so it beats Linux by avoiding the syscall overhead, or having a FS that's more optimized for that use-case. But then that would also be faster when using a SMB mount on localhost, and whether it's over a high-latency network wouldn't matter (actually that would amortize some of the benefits).
I think the idea is that you're accessing files sparsely and/or randomly.
With the Linux approach you avoid translating (from disk representation to syscall representation) metadata you don't need, and the in-memory disk cache saves having to re-read it (and some filesystems require a seek for each directory entry to read the inode data structure, which can also be avoided if you don't care about that particular stat).
With the Windows approach, the kernel knows you want multiple files from the same directory, so it can send a (slightly more expensive) bulk stat request, using only one round trip[0]. On Linux, the kernel doesn't know whether you're grabbing a.txt,b.txt,... (single directory-wide stat) or foo/.git,bar/.git,... (multiple single stats that could be pipelined) or just a single file, so it makes sense to use the cheapest request initially. If it then sees another stat in the same directory, it might make a bulk request, but that still incurred an extra round trip, and may have added useless processing overhead if you only needed two files.
TLDR: Access to distant memory is faster if assumptions can be made about your access patterns; access to local memory is faster if you access less of the local memory.
0: I'm aware of protocol-induced round trips, but I don't think it affects the reasoning.
Linux: cd /dir (no info)
Windows: open a directory ... all the info, and different views depending on your current selection, like image file thumbnails
In Windows you are always accessing this metadata, so it makes sense to speed it up, while in Linux even ls doesn't give you metadata unless you add the extra options, so it doesn't make sense to speed it up and waste storage on something that is infrequent.
> If SMB has some "give me stat info for all stuff in a directory" API call
It does, it supports FindFirstFile/FindNextFile[1], which returns a struct of name, attributes, size and timestamps per directory entry.
Now I'm not sure how Linux does things, but for NTFS, the data from FindFirstFile is pulled from the cached directory metadata, while the handle-based stat-like APIs operate on the file metadata. When the file is opened[2], the directory metadata is updated from the file metadata.
So while it does not have a "stat N" interface per se, the fact that it returns cached metadata in an explicit enumeration-style API should make it quite efficient.
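To make that concrete, here's a minimal sketch (the directory path is made up) of reading the size and last-write time of every entry straight from the enumeration records, with no per-file open or stat:

    // Enumerate a directory and read the size and last-write time of every
    // entry straight from the WIN32_FIND_DATA records returned by the
    // enumeration -- no per-file open, no per-file stat.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        WIN32_FIND_DATAW fd;
        // FindExInfoBasic skips the short (8.3) alternate names, saving a little work.
        HANDLE h = FindFirstFileExW(L"C:\\src\\myproject\\*", FindExInfoBasic, &fd,
                                    FindExSearchNameMatch, nullptr, 0);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        do {
            ULARGE_INTEGER size;
            size.LowPart  = fd.nFileSizeLow;
            size.HighPart = fd.nFileSizeHigh;
            wprintf(L"%-40ls %10llu bytes  (last write: %08lx%08lx)\n",
                    fd.cFileName, size.QuadPart,
                    fd.ftLastWriteTime.dwHighDateTime, fd.ftLastWriteTime.dwLowDateTime);
        } while (FindNextFileW(h, &fd));

        FindClose(h);
        return 0;
    }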
I'm not sure how FindFirstFile/FindNextFile is going to be better than readdir(3) on Unix.
At the NT layer, beneath FindFirstFile/FindNextFile, there is a call that says "fill this buffer with directory entry metadata." - https://docs.microsoft.com/en-us/windows/desktop/devnotes/nt... - I know FindFirstFileEx for example can let you ask for a larger buffer size to pass to that layer, thereby reducing syscall overhead in a big directory.
If you look at getdirentries(2) on FreeBSD for example - https://www.freebsd.org/cgi/man.cgi?query=getdirentries - it's a very similar looking API. I think I recall hearing that in the days before readdir(3) the traditional approach was to open(2) a dir and read(2) it, but I cannot find a source for that claim. At any rate you can imagine something pretty identical in the layer beneath readdir(3) on a modern Unix-like system and it being essentially the same as what Windows does.
I guess file size needs an extra stat(2) in Unix, since it is not in struct dirent, so if you do care about that or some of the other WIN32_FIND_DATA members the Windows way will be faster.
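For comparison, a minimal POSIX-side sketch (directory path made up) of that pattern: readdir() hands you the names, and learning the size (or timestamps) costs one extra stat per entry:

    // readdir() only hands back the name (and usually d_type); learning the
    // size or timestamps costs one extra stat per entry.
    #include <sys/stat.h>
    #include <dirent.h>
    #include <fcntl.h>      // AT_SYMLINK_NOFOLLOW
    #include <cstdio>

    int main()
    {
        const char* path = "/home/user/src/myproject";   // hypothetical directory
        DIR* dir = opendir(path);
        if (!dir)
            return 1;

        while (dirent* entry = readdir(dir)) {
            struct stat st;
            // One additional syscall per entry, just to get the size.
            if (fstatat(dirfd(dir), entry->d_name, &st, AT_SYMLINK_NOFOLLOW) == 0)
                printf("%-40s %10lld bytes\n", entry->d_name, (long long)st.st_size);
        }

        closedir(dir);
        return 0;
    }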
You can see here that in UNIX v6 that /bin/ls had to implement its own readdir() function that calls fopen() on the directory and then getc() 16 times to read each 16-byte dirent:
If by "host" you mean the client rather than the server, and if I understand correctly, the problem I anticipate would be that the API doesn't allow you to use that cached metadata, even if the client has already received it, because there's no guarantee that when you query a file inside some folder, it'll be the same as it was when you enumerated that folder, so I'd assume you can't eliminate the round trip without changing the API. Not sure if I've understood the scenario correctly but that seems to be the issue to me.
Anecdotally[1], javac does full builds because reading everything is faster than statting every file and comparing it against its compiled version. Eclipse works around this by keeping a change list in memory, which has its own drawbacks, with external changes pushing the workspace out of sync.
[1] I can't find a source on this but I remember having read it a long time ago, so I'll leave it at that unless I can find an actual authoritative source.
You know any alternative to Picasa? Especially in regards to face recognition? Google Photos is objectively shit, as you need to upload all photos for that.
There is digiKam for Linux (KDE) with facial recognition. I just started playing with it last night and tested it on a small group of photos, and it's good so far.
I can't quite recall the exact number, but wasn't the packet count for an initial listing over Windows SMB something like ~45 packets/steps in the transaction for each file?
Like I said - it was years ago, but I recall it being super chatty...
This is interesting, and to some extent, expected. I'd expect that emulating one OS on top of another is going to have performance challenges and constraints in general.
But the deep dive into aspects of Windows and Linux I/O reminded me that I'd love to see a new generation of clean sheet operating systems. Not another Unix-like OS. Not Windows. Something actually new and awesome. It's been a long time since Unix/BSD, Linux, and Windows NT/XP were introduced.
A few years ago, Microsoft developed a new operating system called Midori, but sadly they killed it. MS, Apple, and Google are each sitting on Carl Sagan money – billions and billions in cash. Would it hurt MS to spend a billion to build an awesome, clean-sheet OS? They could still support and update Windows 10 for as many years as they deemed optimal for their business, while also offering the new OS.
Would it cost more than a billion dollars to assemble an elite team of a few hundred people to build a new OS? Would Apple, Microsoft, or Google notice the billion dollar cost? Or even two billion?
If you think there's no point in a new OS, oh I think there's a lot of room for improvement right now. For one thing, we really ought to have Instant Computing in 2018. Just about everything we do on a computer should be instant, like 200 ms, maybe 100 ms. Opening an application and having it be fully ready for further action should be instant. Everything we do in an application should be instant, except for things like video transcoding or compressing a 1 GiB folder. OSes today are way, way too slow, which comes back to the file access performance issue on Windows. Even a Mac using a PCIe SSD fails to be instant for all sorts of tasks. They're all too damn slow.
We also need a fundamentally new security model. There's no way data should be leaving a user's computer as freely and opaquely as is the case now with all consumer OSes. Users should have much more control and insight into data leaving their machine. A good OS should also differentiate actual human users from software – the user model on *nix is inadequate. Computers should be able to know when an action or authorization was physically committed by a human. That's just basic, and we don't have it. And I'm just scratching the surface of how much better security could be on general purpose OSes. There's so much more we could do.
Google is doing this, very publicly, with Fuchsia. Brand new kernel, not even POSIX compliant.
Microsoft is also doing this, in a different and substantially more expensive way [1]. Over the past several years they've been rewriting and unifying their disparate operating systems (Windows, Phone (before the fall), Xbox, etc.) into a single modular kernel they're calling OneCore. It's more than likely that this work is based off of, if not totally contains, much of the NT kernel, but it's the same line of thinking.
There is one massive rule when it comes to engineering management we see repeated over and over, yet no one listens: Do Not Rewrite. Period.
Apple is exemplary in this. We don't know how many changes they've made to iOS since its fork from MacOS long ago, which was based on BSD even longer ago. But have you used an iPad in recent history? Instant app starts. No lag. No-stutter rendering at 120fps. When HFS started giving them issues, they swapped it out with APFS. Apps are sandboxed completely from one-another, and have no way to break their sandbox even if the user wants them to. Etc. Comparing an iPad Pro's performance to most brand new Windows laptops is like fighting a low-orbit laser cannon with a civil war era musket. They've succeeded, however they managed to do that.
Point being, you don't rewrite. You learn, you adapt, and you iterate. We'll get there.
(And if you've read all this and then wondered "but isn't Fuchsia a rewrite?" you'd be right, and we should all have serious concerns about that OS ever seeing the light of day on a real product, and about its quality once that happens. It won't be good. They can't even release a passable ChromeOS device [2])
> Would it cost more than a billion dollars to assemble an elite team of a few hundred people to build a new OS? Would Apple, Microsoft, or Google notice the billion dollar cost? Or even two billion?
It's not a matter of money, resources, or talent. For reference: The Mythical Man-Month, waterfall process, Windows Vista.
Building something of this scale and complexity from scratch will take a lot longer than even the most conservative estimate and will take many more years to shake out the bugs. Again, remember Windows Vista? And that was not nearly as revolutionary as the things you suggest.
Consider also that basically every single ground up "we are rethinking everything, and doing it right this time" OS rebuild from scratch has been a failure. There have been dozens, if not hundreds of examples.
I would argue that BeOS was not a failure as an OS, but it was a failure in the market where it couldn't find a clear place for itself, and was attempting to break into a market pretty much totally dominated by MS at the time.
Remember that in the era of home computers there were many good systems that were pretty much from-scratch implementations (Amiga, GEM and Archimedes), so the idea of creating a totally new OS against the incumbents (Windows and OS X) is not totally pointless.
I think the biggest problem is building the application ecosystem. iOS and Android were able to attract developers because they were the first movers on the smartphone platform. Convincing people to use a new OS for desktop (or any existing) computing without any apps would be difficult, and vice versa, convincing devs to support a platform with no users is also tough.
I think it's actually much easier now to get people onto a new OS than it was a decade ago, or in the 90s, or the 80s. The web is so central that a new OS with a great browser has won half the battle. (I could write five pages on how massively better a browser could be than Chrome, Edge, Firefox, Safari, Opera, etc., and I love Firefox and Opera and Vivaldi and Brave.)
ChromeOS has partly proved the point, especially among college students and younger. Anyway, a serious effort at a new OS, a whole new stack, along with a new office suite and other applications, could succeed with the right (large) team and funding. People aren't very loyal to Microsoft – they're not loved. And Apple users have no idea how much better computers could be, but they would if someone showed them.
> Would it cost more than a billion dollars to assemble an elite team of a few hundred people to build a new OS? Would Apple, Microsoft, or Google notice the billion dollar cost? Or even two billion?
This gives some idea of what is required to keep Linux moving forward. It would be nice if we could see similar stats for MS and Apple.
Yes, I like them a lot. A formally verified OS should be the standard these days (and formally verified compilers). But for that to be so, we'll need new programming languages (not Rust) and toolchains I guess.
So I'm coming at this from a social science perspective. Why did I say "not Rust"? I'm working on a research program to get at what happens when people engage with programming for the first time, aspects of language design that attract or repel learners, and as sort of a related offshoot, why smart women are so disproportionately less interested in programming.
One of my working hypotheses is that programming is terrible. More precisely, the majority of (smart) people who are introduced to programming have a terrible experience and do not pursue it. I think many smart people walk away thinking that it's stupid – that the design and workings of prevailing programming languages are absurd, tedious, and nonsensical. (Note that most men aren't interested in a programming career either. So it's not just smart women.)
Most intro to CS or programming courses seem to use Java these days. That's an unfortunate choice. To veteran programmers, Rust likely seems very different from other languages. Its default memory safety, ownership and borrowing, and other features are innovative. (Though the amazing performance of Go's garbage collector seems to weaken the case for Rust's approach somewhat.) However, to new learners, I'm pretty sure Rust will be awful. It's fully traditional in its mass of punctuation noise, just saturated with bizarre (to outsiders) curly braces and semicolons. It also has, in my opinion, extremely unintuitive syntax. I see it as somewhat worse than Java in that respect.
Honestly, Rust is brilliant in more ways than one. I just think they punted on syntax and punctuation noise. I think our timeline is just weird and tragic when it comes to programming languages. They're almost all terrible and just so poorly designed. There's hardly any real scientific research on PL design from a human factors perspective, and what research there is is not applied by PL designers. PLs are just the result of some dude's preferences, and no one seems to be thinking outside of the box when it comes to PL design. In the end, to really make progress in computing and programming, we might need outsiders and non-programmers to design PLs. I'll say that the now-canceled Eve programming language project is a notable exception to the consistent horribleness of modern PLs. We need a few dozen Chris Grangers.
Is it really important how easy it is to learn a language when we are talking about formally verified operating system kernels? Beginners are probably not going to write formally verified code, let alone formally verified system code, so this area is already only accessible to veteran programmers who may not have a problem with these aspects of Rust.
I don't think it's a contradiction to introduce people to programming with more "elegant" or uniform languages that are closer to math (like Scheme or Haskell), or "easy" languages like Python, and then have them move on to Rust if they want to do system-level programming.
I see your point on beginners and formally verified systems code. However, I'm imagining a somewhat different universe.
Formally verified systems code is a nightmare right now. It's almost impossible, and hardly anyone does it for pay. Obscure tools like Isabelle or Coq are needed. People try to shoehorn formal verification onto C, which is a terrible lack of imagination and use of time. Maybe it's easier with Ada or Haskell or something – not sure.
I want a revolution. I want a whole suite of new, clean-sheet programming languages that are the result of furiously brilliant multifaceted design. I want formal verification to be easy, or easier, and baked in to systems languages. I want new operating systems built in these new languages. Some of the dissenters in this thread are assuming that building a new operating system involves firing up your C compiler of choice, and dismiss the whole enterprise as a multiyear bug-ridden nightmare ("never rewrite!" they say) – that is definitely not what I'm thinking about. To hell with C.
It's time to move forward. This is supposed to be a technology industry. It's supposed to be about technological progress. PL design is not progressing. It's awful and it repels newcomers, especially women. We need to drop all these assumptions about programming being done in primitive plain text editors on primitive steampunk Unix-like OSes using primitive command line interfaces with languages as intractable as Linear A. That whole package, which is often what people are introduced to in intro CS courses, is about the worst way to introduce people to the power and wonder of computing that we could have devised.
We need to think more deeply about how to handle and represent various abstractions, functions, and imperative code. Functions should not have bizarre void statements and empty parens, for example. That's so unintuitive. At this point, functions should probably be like equation editors. The pure mathematical function in its normal book representation could be what we see in our IDEs. Visual/graphical programming never really got out of the gate because they just converted code to boxes and arrows. There's an opportunity there to rethink that whole approach.
But if we don't get people interested in programming, if we don't make it less terrible, they never get to systems work. And systems work should not look like Rust or C++. There is no reason why a modern IDE or even text editor should need curly braces or semicolons to understand or compile code at this point in the history of the software industry. Text is a solved problem. Indentation and spacing and line breaks are all trivial to parse. Colors could be semantic. Emoji could be code. Icons, badges like we see for build passing/failing could also be used as code. There's a lot we could do to make programming make a lot more sense.
One of the reasons I want to get more women into programming is that I think they'll be better programmers than men in some respects. Think of archetypes or stereotypes like Hermione vs Harry Potter, Lisa Simpson vs Bart, Wendy Testaburger vs Stan, Kim Wexler vs Jimmy McGill. There is anecdotal, cultural, and scientific evidence to suggest that women – or some subset of women – are more responsible and meticulous than some of the male personality types we often see in tech.
I'll have more on this this week in my report on Medium on Google's firing of developer James Damore for citing well-established social and personality psychology research (my field) on sex differences. I get into some of the reasons programming is generally terrible, and why women might be disproportionately repelled by it. Dart is my victim/example in this report, but I think pretty much all languages are awful. (Eiffel and F# should be the worst case, the bare minimum starting points...)
We did think a lot about the syntax, but we wanted to choose something that'd be familiar to existing systems programmers. It's largely based on C++ and Java, with a little bit of ML thrown in. This is because the language is, in many senses, C++ with some ML thrown in.
AIUI, proper program verification is still a very long-term goal for Rust (and most of the related short-term effort is currently going into building a formal model for the language, including its "unsafe" fragment). So, "not Rust" in that Rust simply doesn't cut it at present, whereas other solutions (Idris, Coq, even Liquid Haskell) might be closer to what's needed.
> The NT file system API is designed around handles, not paths. Almost any operation requires opening the file first, which can be expensive. Even things that on the Win32 level seem to be a single call (e.g. DeleteFile) actually open and close the file under the hood. One of our biggest performance optimizations for DrvFs which we did several releases ago was the introduction of a new API that allows us to query file information without having to open it first.
Ouch, that sounds painful... Is this why deleting gigs' worth of files takes a while? I could have sworn it's not a huge difference on Linux, at least when using the GUI; maybe when doing a straight rm it's quicker.
Reminds me of a performance optimization I did at work. A previous developer had implemented a routine to clean up old files. The routine would enumerate all the files in a directory, and then ask for the file age, and if old enough delete the file.
The problem was that asking the file age given a path caused an open/close, as GetFileSizeEx expects a handle.
Now, at least on Windows, enumerating a directory gets[1] you not just the filename, but a record containing filename, attributes, size, creation and access timestamps. So all I had to do was simply merge the directory enumeration and age checking.
The result was several orders of magnitudes faster, especially if the directory was on a network share.
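Not the actual code, of course, but a minimal sketch of what the merged version looks like (directory, file pattern and cutoff are all made up):

    // Delete files older than a cutoff using only the metadata that the
    // directory enumeration itself returns -- no per-file open just to read
    // a timestamp.
    #include <windows.h>
    #include <string>

    void DeleteOldFiles(const std::wstring& dir, const FILETIME& cutoff)
    {
        WIN32_FIND_DATAW fd;
        HANDLE h = FindFirstFileW((dir + L"\\*.log").c_str(), &fd);   // hypothetical pattern
        if (h == INVALID_HANDLE_VALUE)
            return;

        do {
            // ftLastWriteTime arrives with the directory entry; no extra call needed.
            if (!(fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) &&
                CompareFileTime(&fd.ftLastWriteTime, &cutoff) < 0)    // older than the cutoff
            {
                DeleteFileW((dir + L"\\" + fd.cFileName).c_str());
            }
        } while (FindNextFileW(h, &fd));

        FindClose(h);
    }

Everything the age check needs comes back with the directory entry, which is why it pays off so dramatically on a network share.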
POSIX stuff is terrible for this sort of thing, but luckily it's often a straightforward fix once you know to look for it. Here's a similar fix for the old Win32 version of the silver searcher: https://github.com/kjk/the_silver_searcher/pull/7
Looks like the active Windows port (https://github.com/k-takata/the_silver_searcher-win32) doesn't have this change, but it's possible Mingw32 or whatever exposes this flag itself now (as it probably should have done originally).
Path-based I/O seems quite dangerous to me. If everything was path-based, you'd easily have inherent race conditions. You want to delete a directory? You stat() all the files, they look empty, so you delete them... but in between, another process writes to some file (or maybe the user forgets the file is being deleted and saves to something there), and suddenly you've deleted data you didn't expect. When you do things in a handle-based fashion, you know you're always referring to the same file (and can lock it to prevent updates, etc.), even if files are being moved around.
However, to answer your question of why removing a directory is slow: if you mean it's slow inside Explorer, a lot of it is shell-level processing (shell hooks etc.), not the file I/O itself. Especially if they're slow hooks -- e.g. if you have TortoiseGit with its cache enabled, it can easily slow down deletions by a factor of 100x.
But regarding the file I/O part, if it's really that slow at all, I think it's partly because the user-level API is path-based (because that's what people find easier to work with), whereas the system calls are mostly handle-based (because that's the more robust thing, as explained above... though it can also be faster, since a lot of work is already done for the handle and doesn't need to be re-performed on every access). So merely traversing the directory a/b/c requires opening and closing a, then opening and closing a/b, then opening and closing a/b/c, but even opening a/b/c requires internally processing a and b again, since they may no longer be the same things as before... this is O(n^2) in the directory depth. If you reduce it to O(n) by using the system calls and providing parent directory handles directly (NtOpenFile() with OBJECT_ATTRIBUTES->RootDirectory) then I think it should be faster and more robust.
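To illustrate that last point, a rough, untested sketch of a relative open via the native API (the open options aren't exposed by winternl.h, so their values are defined by hand; treat this as an assumption-laden illustration rather than production code):

    // Open "childName" relative to an already-open parent directory handle,
    // so the kernel does not have to re-resolve the earlier path components.
    #include <windows.h>
    #include <winternl.h>
    #pragma comment(lib, "ntdll.lib")

    // Open options below are not in winternl.h; values taken from the DDK headers.
    #ifndef FILE_DIRECTORY_FILE
    #define FILE_DIRECTORY_FILE          0x00000001
    #endif
    #ifndef FILE_SYNCHRONOUS_IO_NONALERT
    #define FILE_SYNCHRONOUS_IO_NONALERT 0x00000020
    #endif
    #ifndef OBJ_CASE_INSENSITIVE
    #define OBJ_CASE_INSENSITIVE         0x00000040
    #endif

    HANDLE OpenChildDir(HANDLE parentDir, const wchar_t* childName)
    {
        UNICODE_STRING name;
        RtlInitUnicodeString(&name, childName);            // relative name, e.g. L"c"

        OBJECT_ATTRIBUTES attr;
        InitializeObjectAttributes(&attr, &name, OBJ_CASE_INSENSITIVE,
                                   parentDir,              // RootDirectory = parent handle
                                   nullptr);

        IO_STATUS_BLOCK iosb;
        HANDLE child = nullptr;
        NTSTATUS status = NtOpenFile(&child,
                                     FILE_LIST_DIRECTORY | SYNCHRONIZE,
                                     &attr, &iosb,
                                     FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                                     FILE_DIRECTORY_FILE | FILE_SYNCHRONOUS_IO_NONALERT);
        return (status >= 0) ? child : nullptr;            // i.e. NT_SUCCESS(status)
    }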
> You stat() all the files, they look empty, so you delete them... but in between, another process writes to some file (or maybe the user forgets the file is being deleted and saves to something there), and suddenly you've deleted data you didn't expect
This is fundamentally not any different between the systems, race conditions can happen either way. The user could write data to file right before the deletion recurses to the same directory and the handle-based deletion happens. Similarly the newly written data would be wiped out unintentionally.
For files for which access from different processes must be controlled explicitly there is locking. No filesystem or VFS is going to protect you from accidentally deleting stuff you're still using in another context.
> [...] The user could write data to file right before the deletion recurses to the same directory and the handle-based deletion happens. Similarly the newly written data would be wiped out unintentionally. [...] No filesystem or VFS is going to protect you from accidentally deleting stuff you're still using in another context.
...what? No file system is going to protect you from accidentally deleting in-use files? But that's exactly what Windows does: it prevents you from deleting in-use files. That's what everyone here has been complaining about. File sharing modes let you lock files to make sure they're not written to (and/or read from) before being deleted; it very much need not be the case that the user could write to a file before it's deleted.
There is an inherent race condition if one program is using a file and another program is deleting it without caring about whether the file is being accessed by other programs.
At that point, all bets are off there regardless of whether the files are accessed by paths or handles.
Windows protects the file from deletion at the exact moment it is being accessed, but does not protect the file from being deleted after it has been accessed. In wall-clock terms, the latter is by far the more likely scenario.
So, if an editor saves a document to disk, and another program then deletes the document the editor will happily exit without saving it again thinking that it hasn't been changed.
It doesn't particularly matter whether the two programs clash exactly at the time of saving/deletion or not. The problem exists in the lack of information between the programs and no file system is indeed going to protect you from that.
> So, if an editor saves a document to disk, and another program then deletes the document the editor will happily exit without saving it again thinking that it hasn't been changed.
I'm trying to explain to you that your understanding of Windows is wrong and that this is impossible. As long as that editor has the document open, unless it has explicitly specified FILE_SHARE_WRITE and FILE_SHARE_DELETE, Windows will not allow another program to alter that file. So the editor would very rightly assume upon exiting that nobody has touched that file while it's had that file open.
I know how opening a file affects the exclusivity of access but commonly applications don't seem to keep the file continuously open except during reading or writing. Maybe some Microsoft applications use that pattern extensively but it generally works to save a file in one program and then open it in another program without closing the file in the first program or quitting it.
Nevertheless, this is getting off topic, as the thread started with the question of path vs handle access. I still don't see much value in this exclusivity in the latter case, because if you have a conflict to begin with it's just a matter of Murphy's law before that conflict manifests in actually losing data.
Even if your editor keeps the file open for the whole day, and you have this second program that is on the trajectory to delete the file it will eventually get to it at a time when the file is not open and thus not protected by the guarantee of exclusive access.
> This is fundamentally not any different between the systems, race conditions can happen either way. The user could write data to file right before the deletion recurses to the same directory and the handle-based deletion happens.
When you hold a handle to a file or directory, you get to decide on the degree of shared access granted to any other users for the duration that handle is held (FILE_SHARE_*). So this does solve the concurrency problem, by allowing you to effectively hold a lock on the file until you're done.
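A minimal sketch (file path made up) of what that looks like: as long as the first handle is open without FILE_SHARE_DELETE, an attempt to delete the file fails with a sharing violation:

    // Hold a handle with no FILE_SHARE_DELETE: while it stays open, any
    // attempt to delete the file fails with a sharing violation.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        // Hypothetical file; others may read it, but not write to or delete it.
        HANDLE h = CreateFileW(L"C:\\temp\\report.docx",
                               GENERIC_READ,
                               FILE_SHARE_READ,             // no FILE_SHARE_WRITE / FILE_SHARE_DELETE
                               nullptr, OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL, nullptr);
        if (h == INVALID_HANDLE_VALUE)
            return 1;

        // DeleteFileW internally opens the file with DELETE access, which
        // conflicts with the sharing mode above, so this fails (error 32,
        // ERROR_SHARING_VIOLATION) for as long as h is open.
        if (!DeleteFileW(L"C:\\temp\\report.docx"))
            printf("DeleteFile failed with error %lu\n", GetLastError());

        CloseHandle(h);
        return 0;
    }

This is the flip side of the "can't delete a file because it is in use" complaints elsewhere in the thread: the same mechanism is what provides the guarantee being discussed here.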
> I could have sworn it's not a huge difference on Linux, at least when using the GUI; maybe when doing a straight rm it's quicker.
Using Linux file management GUIs can be a disaster. Some years ago, I was massaging many GiB of text, using a crude hack, with scripts that did grep, sed, awk, etc. And I had many working directories with >10^4 tiny files each (maybe 10^5 even).
In terminal, everything was fine. But opening a GUI would swap out so much that the box would freeze. I don't know why. I just learned to avoid doing it.
Heh, that's fine, I don't mind the terminal or the GUI, as long as you're rationally using either and aren't blaming either for your problems. I see a lot of "I only use the terminal" type of people who screw up GUIs and the reverse can be said, GUI users who screw up the terminal, it's all about RTFM'ing sometimes, or just reading what's right in front of you.
Not only is it extremely surprising that there's no way to get metadata without opening each file (and since DrvFs is only the WSL layer, apparently the system itself still doesn't have such a feature to this day).
But I'm now additionally baffled by how Total Commander managed to feel hella snappy ten years ago while navigating lots of dirs, whereas on MacOS all double-panel managers are meh, not least due to the rather slow navigation.
Ten years ago there was less file-system integration, userland virus scanning, kernel-level virus scanning, OS hooks, OS-compatibility redirection, and 32/64-bit compatibility checking.
This was mostly added during the NT 6.0 era, which occurred ~12 years ago. Vista was the first OS using NT 6.0, and Vista was VERY much not in vogue ~12 years ago. In fact it was avoided like the plague as of 2008 (unless you were using 64-bit and had >=4 GiB of RAM).
So many were using Windows XP 32-bit, or the NT 5.2 kernel. Even those with >=4 GiB of RAM were on Windows XP 64-bit, as Vista had a ton of driver problems.
Thanks for sharing, that was an interesting read. I know the IO performance has always been the main gripe with WSL.
It makes me think more broadly about some of the trade-offs with the discussions that happen on GitHub Issues.
It's great that insights like this get pulled out in the discussions, and this could serve as excellent documentation. However, discoverability will be difficult, and it always takes some weeding through the low-quality comments to piece the insights from contributors together.
I wonder how GitHub (and Microsoft) could better handle this. They could allow flagging in-depth comments like this into a pinned section, and those could be updated collaboratively in the future (a wiki?).
It also feels like a reputation system could help to motivate healthier discussions and could bury low quality “me too” comments and gripes. It’s not so bad in the linked example but I often encounter many rude comments aimed against volunteer OSS developers.
This particular post is pretty famous if you've been following the development of WSL. It's constantly linked to and referenced from other GitHub issues regarding WSL performance. So I think GitHub's current system is succeeding in that regard, although there are so many good points raised here that I wish it could get turned into a standalone blog post.
I can’t see MFT contention mentioned once. That’s what absolutely and totally destroys small file write performance. This affects source operations, WSL, compilers, file storage, everything.
And that’s so architecturally tied to the guts of NT you can’t fix it without pushing a new NTFS revision out. Which is risky and expensive.
Which is incidentally why, I suspect, no one at Microsoft even seems to mention it, and instead just chips away at trivial issues around the edges.
Bad show. NTFS is fundamentally broken. Go fix it.
Edit: my experience comes from nearly two decades of trying to squeeze the last bit of juice out of windows unsuccessfully. My conclusion is don’t bother. Ext4, ZFS and APFS are at least an order of magnitude more productive and this is a measurable gain.
Perhaps we didn't read the same article. What it says is that the root of the problem is the Windows IO subsystem architecture. Swap NTFS for anything else and you will get the same problem.
But that’s not the case. The root cause is the MFT and NTFS architecture. People fail to mention that because the problem is harder to fix. It’s almost as if there’s a “do not speak ill of NTFS” rule going on.
You can demonstrate this by using a third-party file system driver for NT. When NTFS on its own is eliminated, the performance is much, much better. This is a neat little differential analysis, and it’s conclusive. I can’t remember the product I used when I evaluated this about 8 years ago, unfortunately.
I think this is a very good example of how Windows differs from Linux in its goals and design. I have a feeling this is the reason Linux has had a hard time catching on on the desktop. It's easy to complain about a slow filesystem, but Microsoft lives in a different world, where other priorities exist. For someone building a server running a database serving loads of people, Linux is a no-brainer: you can pick the parts you want, shrink-wrap it and fire it up without any fat. On a desktop machine, you want to be able to update drivers in the background without rebooting, you want virus scanners, and you want to have a driver ready the moment the user plugs in a new device. Both Windows and Linux are for the most part very well engineered, but with very different priorities.
I’m very confused by your post. You start off talking about desktop machines but NT was actually engineered for servers and then later ported to the desktop. You then describe a bunch of features Linux does better than Windows (eg updating drivers without a reboot).
I think a more reasonable argument to make is just that Windows is engineered differently from Linux. There are definitely advantages and disadvantages to each approach, but ultimately it’s a question of personal preference.
NT is engineered for a different category of servers, though - it's a workgroup server first (originally its chief competitor was NetWare), and a Web/Internet server second. That drives a different set of priorities.
For example, as someone elsewhere in the comments pointed out, NT does file access in a way that works very well when accessing network shares. That's a pretty core use case for Windows on business workstations, where it's common for people to store all the most important files they work with on a network share, for easier collaboration with other team members.
NT was architected to handle high-end workstations from day one — there’s a reason why running the GUI was mandatory even when the resource costs were fairly substantial.
Check out e.g. https://en.wikipedia.org/wiki/Windows_NT_3.1 for the history of that era. The big selling point was that your business could code against one API everywhere, rather than having DOS PCs and expensive Unix, VAX, etc. hardware which was completely different and only a few people on staff were comfortable with.
OS/2 was a high-end desktop OS, but NT diverged a little, took some heavy design principles from VMS (hence its name, WNT), and was thus pivoted towards back-office systems rather than desktop usage.
At that time Microsoft’s server offering was a UNIX platform, Xenix, but it was becoming clear that there needed to be a platform to serve workstations that wasn’t a full-blown mainframe.
So Microsoft handed Xenix to SCO to focus on their collaboration with IBM; the intent there was always to build something more than just a high-end workstation OS. And given it was intended to be administered by people who were Windows users rather than UNIX greybeards (like myself), it clearly made sense to make the GUI a first-class citizen; but that doesn’t mean it was sold as a desktop OS.
My point was that it is misleading to say it was billed as a server OS when all of their messaging was that it was both — maybe not as far down as low-end desktops but they were very clear that any serious work could be done on NT, going after the higher end PC and lower end workstation business.
In that era workstations weren’t the same thing as desktops. They were an entirely different class of computer, and often workstations just ran server OSs with a nicer UI (NeXT, SGI, etc). So your point about workstations doesn’t invalidate what I was saying about NT not originally targeting desktops.
Drivers in Linux live in the kernel. Whenever the kernel is updated, a reboot is required (in most distros). Hence your assertion that Linux handles driver updates without a reboot better than Windows is questionable.
You only need to restart if there has been a kernel update (on any platform, not just “some distros”). For regular driver updates within the same kernel ABI you can use modprobe to unload and reload the drivers. This works because while drivers share the same kernel memory space (as you alluded to), they aren’t (generally) part of the same kernel binary. They normally get bundled in the same compressed archive but are separate files with a .ko extension.
This isn’t a system that is unique to Linux either. Many UNIX platforms adopt similar mechanisms and Windows obviously also has its drivers as separate executables too.
It just so happens that rebooting is an easier instruction to give users than “unload and reload the device driver”, which might also be dangerous to “hot-unload” for some devices while they’re in use. So a reboot tends to be common practice on all platforms. But at least on Linux it’s not mandatory the way it is on Windows (for reasons other than the ability to reload drivers on a live system).
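Not authoritative, but the reload dance is roughly the following; the module name e1000e is just a placeholder for whatever driver you're actually swapping, and it will refuse to unload if the device is busy or the driver has no exit path:

    import subprocess

    MODULE = "e1000e"  # placeholder: whichever driver you're updating

    # See what's currently loaded and what depends on it.
    subprocess.run(["lsmod"], check=True)

    # Unload the old module, then load the freshly installed .ko (needs root).
    subprocess.run(["sudo", "modprobe", "-r", MODULE], check=True)
    subprocess.run(["sudo", "modprobe", MODULE], check=True)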
It's not mandatory on Windows either. I've updated various drivers for a wide range of devices over the years without needing a reboot. From what you describe, it seems the situation on Windows is similar to that on Linux.
Windows is a little different: to unload a driver in Windows it needs to support an unload method, which not all drivers do. And without that you cannot even write the updates to disk (due to the file locking vs inode differences which have already been discussed in this thread) let alone load them into the kernel.
That said, if a kernel module is in use on Linux then it’s sometimes no easy task finding the right procedure to follow to do an unload and reload of it.
Ultimately this is all academic though. These things mattered more on monolithic systems with little redundancy but these days it’s pretty trivial to spin up new nodes for most types of services so you wouldn’t get downtime (generally speaking. There are notable exceptions like DB servers on smaller setups where live replication isn’t cost beneficial)
That raises the question of why Microsoft did not implement a standard convention (not an API) for unloading a driver. They could have specified that a driver does this and that on unload, and performed an as-if-shutdown unload if the driver fails to follow the convention.
> On a desktop machine, you want to be able to update drivers in the background without rebooting, you want virus scanners, and you want to have driver ready the moment the user plug's in a new device.
With the exception of the virus scanner these actually sound like arguments in favour of Linux, in my experience.
(Although there are also excellent virus scanners available for Linux anyway)
I’m pretty confused by this post. What would you identify their priorities as?
Regardless, as someone who is not a fan of Windows (but doesn’t find it worth learning a Unix alternative), I would argue it’s the polish that makes the experience worth it, not some better-engineered experience. For instance: working with variable DPI seems to be trivial on Windows, whereas it still seems years off in Linux. Same with printers, notifications, internet access and almost everything else. These aren’t feats of engineering per se, but they do indicate forethought I deeply appreciate when I do use Windows.
I would hesitate to ascribe too much purpose to everything you see. Microsoft is a huge company with conflicting political factions and a deep ethos of maintaining backwards compatibility so there are plenty of things which people didn’t anticipate ending up where they are but which are risky to change.
One big factor is the lost era under Ballmer. Stack ranking meant that the top n% of workers got bonuses and the bottom n% were fired, and management reportedly heavily favored new features over maintenance. Since the future was WinFS and touching something core like NTFS would be a compatibility risk, you really wouldn’t have an incentive to make a change without a lot of customer demand.
As a C# dev, I am constantly annoyed that Windows updates (and sometimes installs) require reboots or stopping all user activity, while I've never had to reboot or block during an upgrade on Ubuntu.
To be fair, a lot of Linux updates require a reboot or at least a logout to properly take effect, too. Windows is just very aggressive about forcing you to upgrade and reboot, which undeniably has security benefits when you consider that Windows has a lot of non-technical users and a huge attack surface. At least they have relaxed it a bit; the frequent forced reboots caused me some serious problems on a Windows machine I had controlling a CNC machine.
Windows also requires rebooting for the actual upgrading process. A Linux update might need a reboot to take effect, but the reboot is still a normal reboot; it won't take longer because it's trying to install something.
Both Windows and macOS suffer from this. Big updates to both systems can render the computer unusable for 30 minutes while they are installing.
This is true. Fedora now only has a Reboot and Update button in the GNOME Software GUI because some software like Firefox and some GNOME components crash if you update them while they're running (although this seems to happen more often with Wayland than Xorg for some reason). At least Linux and the BSDs give you a choice whether to do offline or online updates.
Most of these things are coincidental byproducts of how Windows (NT) is designed, not carefully envisioned trade offs that are what make Windows Ready for the Desktop (tm).
For some counterexamples of how those designs make things harder and more irritating, look at file locking and how essentially every Windows update forces a reboot; that is pretty damn user-unfriendly.
Even without file locking, how would live updates work when processes communicate with each other and potentially share files/libraries? I feel like file locking isn't really the core problem here.
Everything that is running keeps using the old libraries. The directory entries for the shared libraries or executables are removed but as long as a task holds a live file descriptor the actual shared library or executable is not deleted from the disk. New processes will have the dynamic linker read the new binaries for the updated libraries. Unless the ABI or API somehow changes during the update (and they don't, big updates usually bump the library version) things work pretty fine.
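You can watch this happen with a throwaway file; a minimal sketch (the filename is made up):

    import os

    # Create a stand-in "old library", open it, then remove its directory entry.
    with open("libfake.so.1", "w") as f:
        f.write("old version")

    fd = os.open("libfake.so.1", os.O_RDONLY)
    os.unlink("libfake.so.1")        # directory entry gone, inode still referenced

    print(os.read(fd, 100))          # b'old version' -- the old data is still readable
    os.close(fd)                     # only now can the kernel actually free the blocks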
1. On the one hand I see folks accessing files over and over by paths/names, and on the other hand they demand features that would break unless they switched their fundamental approach to handles/descriptors. Which is it? You can't claim descriptors would fix a problem and simultaneously insist on path-based approaches being perfectly fine. Most programs use paths to access everything (and this goes beyond shared libraries) and assume files won't have changed in between. You can blame it on the program not using fds if that makes you feel better, but the question is how do you magically fix this for the end user?
2. Do you actually see this working smoothly on a Linux desktop environment in practice, or do you just mean this is possible in a theoretical sense? Do you not e.g. get errors/crashes after an apt-get upgrade that presumably upgraded a package your desktop environment depended on (say GTK or whatever)? That happens to me frequently (and I'm practically guaranteed to see a problem if I open a new window in some program in the middle of an update), and it scares me what might be getting corrupted on the way -- makes me wish it would reboot instead of crashing and stop giving me errors.
1. In general, updating the same files at the same time is not that common a problem in any practical sense. The user generally won't be editing the same document in two different editors at the same time. Programs use flock(2) or something similar if they have to update a shared file (a tiny sketch follows at the end of this comment), or they have a directory structure that allows different instances of the program to update simultaneously by using many small files instead of requiring mutually exclusive access to a single one.
I think the most common real-life problem is editing a shell script while it's still running: this happens often during development if the shell script takes a bit longer to run. You edit the file and hit save before the previous run has finished. The on-disk data changes, which is reflected in the shell process that mapped the script file, and eventually the contents that changed or shifted will break the shell's parser.
2. I have 106 days of uptime on my laptop. It has gone through several apt upgrades and I don't think I've shut down my X11 session for 106 days either. Firefox sometimes restarts itself after an update because it apparently knows it needs to do that but other than that I generally never restart X or reboot the machine because of updates. This has basically been the case for years, even decades. The scheme probably has to break eventually but I generally bump into other stuff such as important kernel updates before that. Fair enough for me, never really ran into any issues because of it.
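Back to the flock(2) point in 1.: the cooperative pattern is tiny. A rough sketch (the lock/state file name is arbitrary):

    import fcntl

    # Advisory locking: every writer agrees to take the lock before touching the file.
    with open("shared.state", "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)    # blocks until no other process holds it
        f.write("one more update\n")     # safe: we are the only writer right now
        f.flush()
        fcntl.flock(f, fcntl.LOCK_UN)    # closing the file would release it anyway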
1. User opens a document. User moves a higher-level directory not realizing it was an ancestor of that file. Then user goes back to the program and it can no longer find the file because it was using file paths. What should happen? Should the OS play any role?
2. You manage to keep X11 open, but that's hardly the point I was making. Do you also keep everything open and use your GUI programs as normal when going through an upgrade, or do you change your behavior in some way to avoid it messing up what you're running? And/or are you selective about which updates you apply to minimize their effects on what you're running?
Furthermore, are you familiar at all with the kinds of errors I referred to? Or have you never seen them in your life and don't know what I'm even talking about? If you don't think I'm just making things up when I say updates frequently cause me to get crash and error messages ("Report to Canonical?" etc.), then in your mind, why does this happen right when I update? Is it just some random cosmic bit flip or disk corruption on my new computer that pops up exactly when I update? Is it not possible that it's the update changing files when programs didn't expect them?
1) No, I think the assumption has to be that the user should know what s/he's doing. However, how would using handles even help there? If you close the file you will need to access it by a path even on Windows, and the very path has changed. Or instead, if you keep the file open and do not try to reopen it, even Linux lets you keep the file descriptor and have the program access the file as before even if it's moved around in the directory tree.
2) Yes, I generally keep stuff running as usual. I don't screen any updates, I just run them whenever I remember to. I think I've seen similar things to what you described. They're a rare exception though.
Obviously, doing something like a major update to a new Ubuntu version would make me close all programs and reboot the machine after the update. But any normal updates I just let through without thinking twice.
There will be problems eventually, but the version mismatches become rather evident at that point: a configuration file format has changed, or some scripts have moved, or a library has moved. I've seen GNOME Panel get messed up a couple of times as GNOME gets notified of configuration file changes and the old Panel tries to load stuff meant for the new version. I keep Emacs running all the time and I've seen it fail to find its lazy-loading Lisp files some time after an update, being unable to enter a major or minor mode. I've seen Nautilus go wonky and unresponsive some hours or days after an update, but killing the process fixed it. Chrome doesn't seem to mind, but it gets a bit slow after a few weeks of use so it tends to get an occasional restart even without updates. I've seen crash dialogs which don't come back after I restart the program, but again those are a handful across several years and were mostly about long-running panel items like calendars or notification tickers.
However, all these are rare enough that I don't really feel any particular pain. It's quite indistinguishable from these complex programs rarely but sometimes still crashing on their own, all even without updates.
It generally takes a really long-running session, with enough accumulated big updates that majorly change things underneath, before you can't just keep running the old binaries as they are. When something eventually misbehaves or crashes after the tenth or so update, I'll just restart that particular program. Most of the time the desktop itself keeps running like before. I don't recall ever losing data because of live updates, and I've used Linux since 1994 or so.
The live updates are much more convenient than restarting the whole system after each and every update just to make sure. I only restart one program when that program stops working, and like I said above even that is quite rare indeed.
As for you, you probably run programs that do suffer from this more.
I have the GNOME desktop with all its stuff running in the background, I keep Emacs running continuously, a couple of browsers whose uptimes are generally around 1-2 weeks anyway, a tmux session, and then a lot of other programs which I don't keep open all the time.
But as most of the desktop still churns along as usual, it's pretty easy to quit and relaunch a single program.
You can always restart processes; on Windows it is fundamentally impossible to overwrite a running DLL or EXE file. So, for example, if some services are needed to apply updates, they can never be updated without a reboot.
Yes, I'm aware of how Windows file locking works -- in fact you can sometimes rename running executables -- it depends.
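The "it depends" is mostly about the share mode the existing handles were opened with. A minimal Windows-only ctypes sketch, using a throwaway file rather than a real loaded DLL (victim.txt is made up):

    import ctypes, os

    GENERIC_READ      = 0x80000000
    FILE_SHARE_READ   = 0x00000001
    FILE_SHARE_DELETE = 0x00000004
    OPEN_EXISTING     = 3

    k32 = ctypes.windll.kernel32
    k32.CreateFileW.restype = ctypes.c_void_p

    with open("victim.txt", "w") as f:
        f.write("pretend I'm a running binary")

    # Hold the file open WITHOUT FILE_SHARE_DELETE: renames and deletes now fail.
    h = k32.CreateFileW("victim.txt", GENERIC_READ, FILE_SHARE_READ,
                        None, OPEN_EXISTING, 0, None)
    try:
        os.rename("victim.txt", "victim.old")   # sharing violation
    except PermissionError as e:
        print("rename blocked:", e)
    k32.CloseHandle(ctypes.c_void_p(h))

    # Reopen WITH FILE_SHARE_DELETE and the rename goes through even while it's open.
    h = k32.CreateFileW("victim.txt", GENERIC_READ,
                        FILE_SHARE_READ | FILE_SHARE_DELETE,
                        None, OPEN_EXISTING, 0, None)
    os.rename("victim.txt", "victim.old")
    k32.CloseHandle(ctypes.c_void_p(h))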
Your solution to rebooting the system being user-unfriendly is... restarting processes? How is that so much more user-friendly? From a user standpoint it's almost the same; you might as well actually lock down the system and reboot to make sure the user doesn't try to mess with the system during the update.
And on top of all that, if you're actually willing to kill processes, then they won't be locking files anymore in the first place, so now you can update the files normally...
So yeah, I really don't understand how file locking is the actual problem here, despite Linux folks always trying to blame lack of live updates on that. I know I for one easily get errors after updating libraries on e.g. Ubuntu making programs or the desktop constantly crash until I reboot... if anything, that's far less user-friendly.
Not all applications need to restart; most updates will affect things that are not the running application (office suite/web browser/game/whatever). Meanwhile your entire system has to restart.
"Most updates" won't affect running applications? What DLLs do you imagine "most updates" affect that are not in use by MS Office, Chrome, games, etc.? Pretty much everything I can imagine would be used all over the system, not merely coincidentally by desktop applications, but especially by desktop applications... if anything, it'd usually be the other way around, where some background services wouldn't need to be killed (because they sometimes only depend on a handful of DLLs), but many applications would (which can have insane dependency graphs). But both applications and background services also use IPC to interact with other processes (sometimes internally through Windows DLLs, not necessarily explicitly coded by them) which could well mean that they would need to be restarted if those processes need to be updated...
> What DLLs do you imagine "most updates" affect that are not in use by MS Office, Chrome, games, etc.?
Yeah, you can't update libc this way.
But outside of a short list of DLLs that are used by everything, files are mostly specific to a single program, and 90% of my programs are trivial to update by virtue of the fact that they aren't running.
And most of the background services on both linux and windows can be restarted transparently.
> But outside of a short list of DLLs that are used by everything, files are mostly specific to a single program, and 90% of my programs are trivial to update by virtue of the fact that they aren't running.
Are we talking about the same thing? We're talking about Windows updates, not Chrome updates or something. Windows doesn't force you to reboot when programs update themselves. It forces you to reboot when it updates itself. Which generally involves updating system DLLs that are used all over the place.
I don't think most updates touch those DLLs. Most have a modified date of my last reinstall. Some updates do, but a whole lot more could install without a restart if microsoft cared at all (like if it cost them ten cents).
>So for example if some services are needed to apply updates, they can never be updated without a reboot
I wouldn't say never. Hotpatching was introduced in Windows Server 2003[1]. However, it's seldom available for Windows Update patches, and even when it is, you have to opt in (using a command-line flag) to actually use it.
Yeah, it's amazing there is just a brief flash and everything is up and running again.
When I had slightly more unstable drivers, Windows could recover from that as well. The driver would crash, screen goes black, and then back up and running again without most apps noticing (excluding games and video playback).
Indeed, I've updated many a driver on Windows (including graphics, as you mention) without a reboot required. I've always needed a reboot to do the equivalent kind of update under Linux.
They mention that file performance decreases with the number of filters that are attached to the NTFS drive. Is there a way to list the filters to determine which ones (and how many) are applied on your system?
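I believe the built-in fltmc tool shows this (elevated prompt required); something like the following, with the exact flags from memory, so double-check them:

    import subprocess

    # All loaded filter-manager minifilters and their altitudes.
    print(subprocess.run(["fltmc", "filters"],
                         capture_output=True, text=True).stdout)

    # Which filters are attached to a particular volume, e.g. C:.
    print(subprocess.run(["fltmc", "instances", "-v", "C:"],
                         capture_output=True, text=True).stdout)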
Interesting that the article never says "poorly architected", though the conclusion that the issue is peanut-buttered across the system points at that. Instead of looking through the system for things to optimize, is there any proposal or initiative to rework it at a higher level?
> Noting, as an aside, that it isn't really all that necessary for MSFT to work on the project, because I gather there are at least 684 FOSS NT kernel developers available, and qualified, and willing, to volunteer their time to work on projects like that. I assume that's why all those people upvoted, anyway. With a team that size, stabilizing WinBtrfs will take no time at all.
Along similar lines, one of the usual gripes with Windows is the horrible file copy estimates; Raymond Chen wrote an article explaining why that is[1].
The Old New Thing is a fantastic blog, especially if you develop on Windows. I used to read it religiously until about five years back; I stopped around the time Google Reader was retired and somehow never set it up in my new flow of reading blogs. Thanks for reminding me about this blog and article.
You're being downvoted, but it's true about linux for at least two reasons to this day:
1. dm-crypt threads are unfair to the rest of the system's processes [1]. On dmcrypt systems, regular user processes can effectively raise their process scheduling priority in a multithreaded fashion by generating heavy IO on dmcrypt storage.
2. Under memory pressure, the VM system in linux will enter a thrashing state even when there is no swap configured at all. I don't have a reference on hand, but it's been discussed on lkml multiple times without solution. I suspect the recent PSI changes are intended as a step towards a solution though. What happens is clean, file-backed pages for things like shared libraries and executable programs become a thrashing set resembling anonymous memory swapping under memory pressure. As various processes get scheduled, they access pages which were recently discarded from the page cache as evictable due to their clean file-backed status when other processes ran under pressure, and now must be read back in from the backing store. This ping-ponging continues dragging everything down until either an OOM occurs or pressure is otherwise relieved. This often manifests as a pausing/freezing desktop with the disk activity light blazing, and it's only made worse by the aforementioned dmcrypt problem if these files reside on such volumes.
This behaviour, and perhaps other causes with similar effects for desktop users, is what drove me away from helping to make Linux "ready for the desktop". The kernel philosophy was incompatible with the needs of desktop users. Those developing for the desktop (like myself) didn't have the required expertise to make the kernel do the Right Thing, and couldn't find enough people willing or able to help.
Things may have changed over the years - I’ve been running a Linux desktop recently and haven’t seen this kind of issue yet (the kind where you need to use magic keys to ask the kernel to kill all, sync and reboot) but reading your post, perhaps this is because RAM is much more plentiful these days.
It's probably the plentiful-RAM thing. I haven't looked recently into whether the Arch mainline kernel kconfig is just bad or not, but the OOM killer is trash for me. I used to have the "recommended" swap == RAM size, but then the memory manager never even tried to cull pages until it was out of memory and froze up trying to swap constantly. I'm currently running a 16/4 split and will probably drop to 16/1, because any time I hit memory limits everything just freezes permanently rather than the OOM killer getting invoked. I've hit it twice this week trying to render in Kdenlive and run a debug build of Krita...
If more people reproduced the dmcrypt issue and commented in the bugzilla issue it'd put more pressure on upstream to revert the known offending commit.
For some reason they seem to be prioritizing the supposedly improved dmcrypt performance over fairness under load, even though it makes our modern machines behave like computers from the 90s; unable to play MP3s and access the disk without audio underruns.
I assume it's because they're not hearing enough complaints.
I’ve attempted to use Linux and desktop freezing is the norm even on machines that run fine under windows. Admittedly the machines might be underpowered but that does call into question the commonly held belief that desktop Linux is better for low spec machines.
I have had issues at times in the past, but with things like core dumps on systems with spinning disks and 128 GB of RAM. The OOM behaviour on Linux can be brutally frustrating. But it's still light years ahead of Windows for development...
> At least windows doesn't freeze your whole desktop under heavy IO.
It happens on Windows as well, and unlike Linux, Windows still carries a risk of permanent damage when available space runs low. It was even worse in the XP days, but these issues are still there.
And it's a particularly interesting issue, because this problem mirrors the congestion control failure observed on most networks in recent years. We have all seen it: on a busy network, latency increases by two orders of magnitude, ruining other network activities like web browsing, even though they themselves require only a little bandwidth. The simplest demo is uploading a large file while watching the ping latency; it jumps from 100ms to 2000ms. But that should not happen, because TCP congestion control was designed precisely to solve this.
It turns out that the cause of this problem, known as bufferbloat, is the accumulated effect of excessive buffering in the network stack, mostly the system packet queue, but also the routers, switches, drivers and hardware, since RAM is cheap nowadays. TCP congestion control works like this: if packet loss is detected, send at a lower rate. But when there are large buffers on the path for "improving performance", packets are never lost even if the path is congested; instead, they are put into a huge buffer, so TCP never slows down properly as designed, and during slow-start it believes it's on its way to the moon. On top of that, all the buffers are FIFO, which means that by the time your new packet finally gets out it's probably no longer relevant: it takes seconds to move from the tail of the queue to the head, and the connection has already timed out.
Solutions include shrinking buffers and limiting their length (byte queue limits, TCP small queues). Another innovation is new queue management algorithms: we don't have to use a mindless FIFO queue, we can make queues smarter. As a result, CoDel ("controlled delay") and fq_codel were invented: they are designed to prioritize packets that have just arrived, and to drop old packets to keep your traffic flowing.
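On the network side the mitigation is basically a one-liner nowadays; a rough sketch, with eth0 as a placeholder interface name (needs root, and many distros already default to fq_codel):

    import subprocess

    IFACE = "eth0"  # placeholder: your actual uplink interface

    # Show the current queueing discipline, then switch it to fq_codel.
    subprocess.run(["tc", "qdisc", "show", "dev", IFACE], check=True)
    subprocess.run(["sudo", "tc", "qdisc", "replace", "dev", IFACE, "root", "fq_codel"],
                   check=True)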
And people realized the Linux I/O freeze is a variant of bufferbloat, and the very same ideas of the CoDel algorithm can be applied to the Linux I/O freeze problem.
Another interesting aspect is that the problem is NOT OBSERVABLE if the network is fast enough or the traffic is low, because the buffering does not occur, so it will never be caught in many benchmarks. On the other hand, when you start uploading a large file over a slow network, or start copying a large file to a USB thumb drive on Linux...
> And people realized the Linux I/O freeze is a variant of bufferbloat, and the very same ideas of the CoDel algorithm can be applied to the Linux I/O freeze problem.
There are myriad causes for poor interactivity on Linux systems under heavy disk IO. I've already described the two I personally observe the most often in another post here [1], and they have nothing at all in common with bufferbloat.
Linux doesn't need to do less buffering. It needs to be less willing to evict recently used buffers even under pressure, more willing to let processes OOM, and a bridging of the CPU and IO scheduling domains so arbitrary processes can't hog CPU resources via plain IO on what are effectively CPU-backed IO layers like dmcrypt.
But it gets complicated very quickly; there are reasons why this isn't fixed already.
One obvious problem is the asynchronous, transparent nature of the page cache. Behind the scenes pages are faulted in and out on demand as needed, this generates potentially large amounts of IO. If you need to charge the cost of this IO to processes for informing scheduling decisions, which process pays the bill? The process you're trying to fault in or the process that was responsible for the pressure behind the eviction you're undoing? How does this kind of complexity relate to bufferbloat?
> It needs to be less willing to evict recently used buffers even under pressure, more willing to let processes OOM
I've had similar experiences. On Windows, either through bugs or poor coding, an application sometimes requests way too much memory, leading to an unresponsive system while the kernel is busy paging away.
On Linux I've experienced the system killing system processes when under memory pressure, leading to crashes or an unusable system.
I don't understand why the OS would allow a program to allocate more than available physical memory, at least without asking the user, given the severe consequences.
Overcommit is a very deliberate feature, but its time may have passed. Keep in mind this is all from a time when RAM was so expensive swapping to spinning disks was a requirement just to run programs taking advantage of a 32-bit address space.
You can tune the overcommit ratio on Linux, but if memory serves (no pun intended) the last time I played with eliminating overcommit, a bunch of programs that liked to allocate big virtual address spaces ceased functioning.
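For reference, the knobs live in /proc; a quick way to see what your box is currently doing (0 is the heuristic default, 1 always overcommits, 2 enforces the ratio):

    # Peek at the current overcommit policy and ratio.
    for knob in ("overcommit_memory", "overcommit_ratio"):
        with open(f"/proc/sys/vm/{knob}") as f:
            print(knob, "=", f.read().strip())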
Yeah, I know it was a feature at one point... but at least the OS should punish the program overcommitting, rather than bringing the rest of the system down (either by effectively grinding to a halt or killing important processes).
What is the "correct" handling of swap on an HDD supposed to be like? It is going to be slow no matter what you do. Windows also locks up for long periods of time if you use up almost all the RAM and it has to swap to HDD.
When I say lock up I mean the UI completely stops. As in, my i3 bar stops updating for several minutes. Not even the linux magic commands let me recover.
On Windows things may be unresponsive, but at least Ctrl-Alt-Del responds, and at least the mouse moves!
The main difficulty is I can't tell if my machine has crashed vs is overloaded if the UI doesn't do anything for several minutes.
> Not even the linux magic commands let me recover.
Are you sure you've set up the magic sysctls correctly? Ubuntu ships with magic oomkill disabled, by default.
On all machines I've tried, whenever I've needed it, magic oomkill has always worked, and I've been thankful of the fact that it's implemented down in the kernel.
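If I recall the bit layout correctly, you can check whether your box even allows it; the sysrq value is a bitmask and bit 64 is the one that gates signalling (and thus the manual OOM kill):

    # kernel.sysrq is a bitmask; 1 means "everything enabled",
    # and bit 64 enables signalling processes (Alt+SysRq+F = manual OOM kill).
    with open("/proc/sys/kernel/sysrq") as f:
        mask = int(f.read())
    print("sysrq =", mask, "| oom-kill key enabled:", mask == 1 or bool(mask & 64))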
If you renice the UI processes to a higher-than-normal priority, it should work more like it does in Windows. (This used to be somewhat risky, but today Linux UI is not going to hog your system resources.) The underlying problem is that when memory is tight, Linux starts evicting "clean" pages from the page cache that actually will need to be accessed shortly afterwards (i.e. they're part of the working set of some running process), and thrashing occurs. There's no easy solution to this issue, other than making user programs more responsive to memory pressure in the first place. (This could extend as far as some sort of seamless checkpoint+resume support, like what you see in mobile OS's today.)
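A hedged sketch of the renice idea (the PID is hypothetical, and a negative nice value needs root):

    import os

    UI_PID = 1234  # hypothetical: the PID of Xorg / your compositor

    # Give the UI process a higher-than-normal priority (lower nice value).
    os.setpriority(os.PRIO_PROCESS, UI_PID, -5)
    print("new nice value:", os.getpriority(os.PRIO_PROCESS, UI_PID))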
> Well, why can't distributions that come with recommended GUIs just set that higher-than-normal priority by default?
Feel free to file bugs for your preferred distro. It would be especially appropriate to do this for critical UI processes like xorg/wayland or lightdm, and for "lightweight" desktops like xfce/lxde that aren't going to cause resource pressure under foreseeable conditions, even when run at higher-than-normal priority.
"Correct" handling of swap would mean mostly leaving the window manager and its dependencies in memory. Individual application windows may stop responding, but everything else should be pretty quick. And small things like terminal emulators should get priority to stay in ram too.
Not that I'd recommend this, but my work MBP has 16GB of RAM, and my typical software development setup (JVM, IntelliJ, Xcode, Gradle) easily uses up 30GB. It swaps a lot, but generally macOS does a good job of keeping the window manager and foreground applications at priority, so I can still use my machine while this is happening.
I attribute this to the fact that the darwin kernel has a keen awareness of what threads directly affect the user interface and which do not (even including the handling of XPC calls across process boundaries... if your work drives the UI, you get scheduling/RAM priority). I don't think the linux kernel has nearly this level of awareness.
> ... the darwin kernel has a keen awareness of what threads directly affect the user interface and which do not (even including the handling of XPC calls across process boundaries... if your work drives the UI, you get scheduling/RAM priority). I don't think the linux kernel has nearly this level of awareness.
You're talking about priority inheritance in the kernel. In Linux, this is in development as part of the PREEMPT_RT ("real-time") patches, already available experimentally in a number of distributions.
I've experienced a total freeze before once one of my programs started swapping/thrashing. The entire desktop froze, not just the one program. This was in the past two years or so, so it's not a solved problem.
Case in point: I recently tried unzipping the Boost library on an up-to-date Windows 10, and after trying to move the frozen progress window after a minute, the whole desktop promptly crashed. I have to say, the experience is better than it used to be, because at least the taskbar reappeared on its own. (Decompression succeeded on the second attempt after leaving the whole computer well alone ... but it certainly took its time even on a high-end desktop computer.)
Could you please leave personal swipes out of your comments here? They have a degrading effect on discussion and evoke worse from others. Your comment would be just fine without the second sentence.
The issue seems to be with how Windows handles file system operations. It allows filters to operate on each request. These filters are like hardware drivers in that they are created and maintained by third parties. So MS has no real ability to change how things work, because it doesn't have absolute control over the operations in the same way that Linux does (Linux device drivers are part of the kernel).
Not sure if I'm being naive, but couldn't they make it so that if there are no filters present on a given volume, then the entire filter stack is bypassed and the directory nodes are served from a cache managed by the file system at the root of the volume? That way developers who really want performance can make sure they have no filter drivers installed on that volume.
The linked post notes that, even on a default Windows install, you already have a sizable stack of filter drivers active. Not sure whether that only applies to the C: volume or to other ones as well.
If they really wanted to, they could mount an ext4 fs stored as a single blob on disk directly into WSL. That won't really help if you want to run a GUI editor in Windows against that same directory, though.
That looks awesome (thanks for sharing), but it seems to be a different thing? People are looking for a way to run existing Linux FUSE filesystems in WSL, not develop new virtual file system implementations. They haven't responded to that request at all. https://wpdev.uservoice.com/forums/266908-command-prompt-con...
NTOS (Windows NT's kernel) was originally designed to be backward compatible with Windows (DOS era), OS/2 and POSIX. Designing such a kernel was considered quite a feat at the time, but it probably cost its fair share of bloat, which kept growing over the years. Also, it's not surprising that Linux is optimized for operating on files, since the UNIX philosophy is about treating everything as a file, something that Dave Cutler (NTOS's main designer) was known for criticizing[1].
I pray daily that more of my fellow-programmers may find the means of freeing themselves from the curse of compatibility. (Dijkstra)
As another data point here: I had to add a Tkinter-based loading screen to a PyQt5 app I wrote for work because the app, which takes half a second to start up on Linux, takes nearly a minute on Windows the first time around (and several dozen seconds on subsequent runs), and I know firsthand that my end users are unlikely to wait that long unless they can see something loading. I suspect it has to do with the sheer number of file accesses for each of the Python libraries and underlying DLLs and such.
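For what it's worth, the splash trick is only a handful of lines; a rough sketch of what I mean (not a polished loader, and it assumes PyQt5 is installed):

    import tkinter as tk

    # Throwaway Tk window, painted before the expensive imports start hitting the disk.
    splash = tk.Tk()
    splash.overrideredirect(True)                       # no title bar or borders
    tk.Label(splash, text="Loading...", padx=60, pady=30).pack()
    splash.update()                                     # force a paint before we block

    from PyQt5.QtWidgets import QApplication, QLabel    # the slow part on a cold cache

    splash.destroy()
    app = QApplication([])
    window = QLabel("Ready")
    window.show()
    app.exec_()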
> A Win32 path like C:\dir\file gets translated to an NT path like \??\C:\dir\file, where \??\C: is a symlink in Object Manager to a device object like \Device\HarddiskVolume4
Cool! Can the "NT paths" be used directly on Windows? As far as I know the ordinary Win32/NTFS system doesn't even have symlinks, and that feels like quite a handicap.
Note that this is behind a "developer flag", i.e. it needs elevation if you have not enabled "developer mode" in the Windows 10 settings. (I guess they don't want the cost of supporting ordinary people building circular dependencies.)
It depends on which API the application uses to pass paths to the OS and whether its path handling library does internal parsing that is or is not aware of those prefixes.
For example, Rust has support[0] for the magical file path prefixes[1] that can be used to escape path length restrictions, reserved identifiers and some other limitations. This is used by default whenever you normalize a path, which means it has fewer restrictions than Windows Explorer has by default.
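The same escape hatch is easy to poke at from Python too; a small Windows-only sketch with made-up directory names:

    import os

    # The \\?\ prefix tells Win32 to skip most path parsing,
    # which lifts the ~260-character MAX_PATH cap.
    base = "C:\\temp\\" + "\\".join(["a_rather_long_directory_name"] * 12)
    long_path = "\\\\?\\" + base                 # i.e. \\?\C:\temp\...
    os.makedirs(long_path, exist_ok=True)
    print(len(base), "characters deep, created fine")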
These require user-mode round trips for path traversal, so comparable to FUSE, I believe.
So it is relatively easy to add hooks of your own that get called as if they were file reads; NTFS calls these "Reparse Points". Linux would do this with named pipes.
I am just a user of these things, haven't dug into any implementation details recently, but I guess that the stacks to support these features in Linux and Windows each have their cache etc design decisions that have led to current state of filesystem performance under typical use-cases these days.
I'm not a super expert, but I think that is the case, and you even need them if you want to use some advanced features like paths that are longer than MAX_PATH or these virtual-mount kinds of things.
But IMO it's ugly as hell and just begging for compatibility problems. So don't go there, at least not to get super long paths. 260 bytes should be enough for anybody.
How did you manage? I've never hit the limit in like 20 years. Maybe 260 bytes means only 130 2-byte characters (I don't even know), and that I might find a little restrictive. In any case I'm tempted to say: just fix your damn paths (if possible). I guess at 80 characters it starts to get unreadable anyway.
Of course I have only encountered filenames this long a couple of times, but as for a full path like C:\something\something, this is very easy.
E.g. "How did you manage? I've never hit the limit in like 20 years." is 63 characters and it still looks like a reasonable title for a book. You may have written or downloaded a book and stored it in a folder named this way, in multiple formats (PDF, EPUB, etc.), and this can already produce something like "C:\Documents and Settings\John Doe\My Documents\My Projects\Books\How did you manage - I've never hit the limit in like 20 years\How did you manage - I've never hit the limit in like 20 years.epub", which is already 199 characters. Just 60 characters are left, and 60 characters is as short as "C:\Documents and Settings\John Doe\My Documents", and you might in fact have your book file named a longer way, like "John Doe - How did you manage? I've never hit the limit in like 20 years. 2nd edition draft". This is a bit of a clumsy example, but it demonstrates how easily reachable the limit actually is in real life.
This actually forced me to store my data in c:\a\b\c... (where a, b, and c actually were 1-4-letter acronyms) instead of C:\Documents and Settings\John Doe\My Documents\
Yes, this is how you end up with huge file paths. So don't do that. Obviously. It's not only long in storage terms. Nobody can even read it without getting dizzy.
Name it filepathlimit/filepathlimit.epub or something. Done. You can actually read that. Now if you want the prose, open the damn file. Or use the file explorer which might already show you a more complete title based on the file metadata.
And don't do that "C:\Documents and Settings\John Doe\My Documents\My Projects\Books\". There is no point in storing your things deep in a thousand rabbit holes. It's overzealous hierarchy fetishism. There is no point in creating unreadable paths that wrap around lines like wild. Use D:\Books or something. Or D:\John\Books if you must. Use basic common sense.
I don't do that, but I work with other people's stuff, and almost every non-geek does it.
> And don't do that "C:\Documents and Settings\John Doe\My Documents\My Projects\Books\". There is no point in storing your things deep in a thousand rabbit holes
That's the standard way Windows users are meant to store their data (although I don't). What is "\home\jdoe\" on Linux is "C:\Documents and Settings\John Doe\My Documents\" on Windows ("C:\Documents and Settings\John Doe\" actually but not from an ordinary user point of view).
> It's overzealous hierarchy fetishism.
I actually find it sad that we are still using hierarchies for that when we could just be using tags and semantic attributes instead (semantically, a document file doesn't even need a "file name"; the actual document title stored as an attribute or as a record within the file is enough, and actually better). Apparently ordinary people are more comfortable with the folder metaphor and hierarchies, so WinFS was cancelled and third-party tagging and semantic desktop systems remain little known.
It is actually surprisingly common to hit the max path length with build systems. Why?
You're building a project. Your source code is under C:\Users\twenty-ish character username\Documents\Projects\. The project is a released build which got extracted to something like network-configurator-1.4.5, which builds into a build directory underneath that. That's up to 80 characters right there, only 180 remaining.
If you have any process that starts doing something like "place the artifacts needed to build <result>.foo in result/" (say, cmake), and throw in some hierarchy nesting, you can chew up 180 characters remarkably quickly. Something like node.js produces this monstrosity of a path name:
Names that seem reasonable within the context of their folder may turn out to be highly redundant in the context of the entire pathname, but many people would rather have readable names at each level of the folder hierarchy, which makes it hard to keep full pathnames reasonable. Sometimes these names are required by a language implementation or other build system that you don't control.
The unreasonable thing here is not any of the components' names, and the problem fundamentally can not be fixed by shortening them. The problem is the insane uncontrolled nesting.
Hardly. "Insane uncontrolled nesting" seems quite a natural way to manage complexity in an automated way. I can see no principle other than Miller's law (the 7±2 "magic number") for labeling some level of nesting "insane", and it doesn't apply to automated processing. What are the good reasons, if any, to limit a path length so severely, other than human readability? If long paths put a non-negligible penalty on performance, I'd say the file system is defective by design.
Good reasons not to have many modules: compile times, website load times and sizes, project maintainability.
Nested dependencies have negative effects on all of those. They encourage uncontrolled addition of modules, and even addition of modules in multiple versions. They lead to wrong "isolationist" thinking.
In other words, they do not manage complexity but produce unneeded complexity.
Not having many modules means having big modules. The bigger a module is (anything bigger than half a screen) the lower my productivity is. Every module should fit into your mind easily.
Yes they can, though one problem is that because NT paths are less restrictive than Win32 paths, you can create files/directories that you then have trouble doing things with using standard tools.
Notably, the Windows shell does not support very long file names, like, at all. The Windows shell and Win32 APIs are also easily confused by files using the NTFS POSIX namespace, which may have \ in their names.
I still scratch my head as to why doing a kernel update of Ubuntu running under Hyper-V on a spinning disk is so horrifically slow. If I migrate the VM to SSD while it's running, let the update finish, then migrate it back to spinning disk, that's faster than just letting it update in place. This is my dev desktop, otherwise I would put it on SSD full time.
The comments above make me think something like the stat calls may be the issue, and moving the running VM to SSD hides the problem. It obviously isn't raw disk rate at that point.
Is the Ubuntu VM's / on a fixed or dynamic VHD? The former provides (modulo filesystem fragmentation) linear addressing of virtual LBAs to physical LBAs, and the latter can be heavily fragmented and result in sequential read/write operations becoming random access, which on a HDD kills performance.
My advice for anyone running VMs is if they're on HDDs or they're highly tuned for disk usage (databases for example), use fixed VHDs, otherwise use dynamic.
> I still scratch my head as to why doing a kernel update of Ubuntu running under Hyper-V on a spinning disk is so horrifically slow.
Define slow please.
For a laugh I picked a random VM (VMWare) at work and ran (I did apt update first):
# time apt upgrade
...
82 to upgrade, 5 to newly install, 0 to remove and 0 not to upgrade
...
real 6m16.015s
user 2m38.936s
sys 0m55.216s
The updates included two client-server DB engines (ClickHouse and PostgreSQL), the file-serving thingie (Samba) and a few other bits. The reboot takes about 90 seconds before the web interface for rspamd appears.
Specifics: host OS is Windows 10 Pro, the virtual machine is Hyper-V, guest OS is Ubuntu, dynamic VHDX, Storage Spaces pool with 4 HDDs and 1 SSD, host filesystem is ReFS.
I only observe this specifically with kernel updates. Everything else updates as I would expect on an HDD.
Slow is ~10-20 minutes. Which is why it is faster to migrate the running VM from the storage pool onto a single SSD, complete the Ubuntu kernel update, and migrate back.
Have you cleaned away the old kernel packages? Kernel package operations on Debian and Ubuntu are accidentally quadratic with a high time constant (there is a package script for each kernel that iterates over all installed kernels) so you want to make sure not to let them accumulate.
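If you haven't already, it's worth checking how many are installed; roughly this (the autoremove step prompts before deleting anything):

    import subprocess

    # List every installed kernel image package...
    subprocess.run(["dpkg", "--list", "linux-image-*"])

    # ...and let apt drop the ones the running system no longer needs.
    subprocess.run(["sudo", "apt", "autoremove", "--purge"])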
The build speed difference for Android (Gradle build system) between Windows and Linux is extremely noticeable, especially for large projects, where you can see almost 50% faster builds.
Keep in mind that Linux filesystems are all implemented in the kernel (FUSE aside), and WSL doesn't run a Linux kernel - it just emulates syscalls for userland to work. So there's no ext drivers in there, or any other standard Linux FS drivers.
tldr: discussion is about why disk access under WSL is slower than under Linux, mostly due to the different design constraints of Linux vs Windows for file systems.
An interesting comment from an insider. We all know from various benchmarks that Windows' filesystem access performance is far worse than the Linux kernel's, not only in WSL but in the Win32 subsystem too.
Also, process creation performance is worse on Windows. I wonder whether that is also a case of "death by a thousand cuts".
Windows performance is abysmal. Instead of beating around the bush, they should just state that it is the combination of everything that makes Windows dog slow compared to either Mac or Linux. Someone said it on that thread, but Windows really should be a DE on Linux now instead of being its own OS.