I recently had to repair a botched Windows 10 update, which changed the volume part of the target path of every NTFS junction point, breaking my environment.
To fix these junction points, I needed to catch them all. I figured I'd just walk the C:\ drive, but apparently, Python 3.6.4 and Python 3.7.0b1 do not fully support junctions.
Let's fix Python.
The first issue is that os.path.islink only tests for symbolic links.
ntpath.py (lines 247-255):
The Windows-only fix is straightforward. Instead of returning only stat.S_ISLNK(st.st_mode), we also want to return True if the path has the FILE_ATTRIBUTE_REPARSE_POINT attribute set.
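As a rough illustration of that idea, here is a standalone sketch rather than the actual patch to ntpath.py. The function name islink_or_reparse is hypothetical. It assumes that any reparse point should count as a link, which is broader than junctions alone, since other reparse tags exist.

```python
import os
import stat


def islink_or_reparse(path):
    """Return True for symbolic links and, on Windows, for reparse points.

    Hypothetical helper, not the stdlib implementation. Assumption: every
    reparse point is treated as a link, not only junctions.
    """
    try:
        st = os.lstat(path)
    except (OSError, ValueError):
        return False
    if stat.S_ISLNK(st.st_mode):
        return True
    # st_file_attributes is only populated on Windows; default to 0 elsewhere.
    attrs = getattr(st, "st_file_attributes", 0)
    return bool(attrs & stat.FILE_ATTRIBUTE_REPARSE_POINT)
```

On non-Windows platforms the attribute check is a no-op, so the function falls back to the plain symbolic link test.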
The cross-platform fix, if necessary, would be slightly more involved.
The second issue is that os.walk, when the followlinks parameter is False, does not yield the top-level path of a link. I want that path yielded so I can test whether the target path exists.
First, we need to make sure os.walk can identify junction points, not just symbolic links.
os.py (lines 378-394):
At line 8 (line 385), the is_symlink member is set to entry.is_symlink(). This method behaves like os.path.islink but is not an alias, so the previous fix for os.path.islink does not carry over. We need to set is_symlink to path.islink(entry.path).
This allows os.walk to properly test for junction points. We should keep in mind that nt.DirEntry.is_symlink() is cached per entry whereas os.path.islink is probably not. I have not benchmarked this change, so I cannot speak to its performance impact.
os.py (lines 402-409):
And here we see that os.walk will recurse into directory junctions if the followlinks parameter is True or if the path is not a symbolic link or, with my fix, a junction point. But if the followlinks parameter is False, then os.walk will not yield any results for that path.
So, we need to yield the new_path member in that case:
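To make the intended behavior concrete, here is a simplified, standalone generator rather than the patched os.walk itself. The names is_link_like and walk_yielding_links are hypothetical, and yielding (new_path, [], []) for an unfollowed link is my assumption about a reasonable shape for the result.

```python
import os
import stat


def is_link_like(path):
    """True for symlinks and, on Windows, reparse points (assumption)."""
    try:
        st = os.lstat(path)
    except OSError:
        return False
    attrs = getattr(st, "st_file_attributes", 0)
    return stat.S_ISLNK(st.st_mode) or bool(
        attrs & stat.FILE_ATTRIBUTE_REPARSE_POINT)


def walk_yielding_links(top, followlinks=False):
    """Top-down walk that also reports links it does not follow."""
    dirs, nondirs = [], []
    try:
        with os.scandir(top) as it:
            entries = list(it)
    except OSError:
        return
    for entry in entries:
        try:
            is_dir = entry.is_dir()
        except OSError:
            is_dir = False
        (dirs if is_dir else nondirs).append(entry.name)
    yield top, dirs, nondirs
    for name in dirs:
        new_path = os.path.join(top, name)
        if followlinks or not is_link_like(new_path):
            yield from walk_yielding_links(new_path, followlinks)
        else:
            # Report the link itself without descending into its target.
            yield new_path, [], []
```

A caller can then check each yielded path with the same link test and decide whether its target still exists.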
Finally, we have one more issue: os.path.abspath returns the absolute version of a path, or in the case of a symbolic link, the method nonrecursively returns the target path. However, in the case of a junction point, the method returns the path to the junction point.
We could try os.readlink, but that method does not support junction points and will raise a "not a symbolic link" error. We could use pathlib.Path().resolve(), which will return the target path recursively, but let's just fix the issue in the standard library.
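For comparison, the workaround that avoids touching the standard library might look something like the following sketch. The helper name link_target is hypothetical. Note that the two branches are not equivalent: os.readlink returns the immediate target, while Path.resolve() follows the whole chain.

```python
import os
from pathlib import Path


def link_target(path):
    """Best-effort target lookup: os.readlink first, then Path.resolve().

    Hypothetical helper. os.readlink raises for paths it does not
    consider symbolic links, so fall back to a full resolve.
    """
    try:
        return os.readlink(path)
    except (OSError, ValueError):
        return str(Path(path).resolve())
```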
ntpath.py (lines 537-554):
At line 8 (line 544), we see that path is set to nt._getfullpathname(path), which is a helper method for the GetFullPathName function in the Windows API. We really want to call the GetFinalPathNameByHandle function instead. Fortunately, there's already a helper method:
The problem with using only nt._getfinalpathname is that it returns extended-length paths, which are prefixed (e.g., \\?\D:\dev). We want a return value that is consistent with the normal behavior of os.path.abspath so we don't break everything that uses it.
We can resolve extended-length paths with pathlib.Path().resolve(), but we don't want to add a dependency on pathlib, so I opted for a simple string replacement in this fix.
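One possible form of that string handling is sketched below. It is not necessarily the exact replacement used in the fix. The function name strip_extended_prefix is hypothetical, and the \\?\UNC\ branch is an addition of mine to cover network paths.

```python
def strip_extended_prefix(path):
    """Remove the Windows extended-length prefix from a path string.

    Hypothetical helper: pure string handling, no filesystem access.
    """
    unc_prefix = "\\\\?\\UNC\\"   # \\?\UNC\server\share -> \\server\share
    ext_prefix = "\\\\?\\"        # \\?\D:\dev           -> D:\dev
    if path.startswith(unc_prefix):
        return "\\\\" + path[len(unc_prefix):]
    if path.startswith(ext_prefix):
        return path[len(ext_prefix):]
    return path
```

Paths without the prefix pass through unchanged, so the helper is safe to apply to any return value.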
D:\dev\Python364\python.exe -m find_junctions
INFO:root:Link: C:\test\dev (Target: D:\dev)
Outputting nonexistent target paths does not appear to be possible, with either Python or the Windows API, so os.path.abspath will return the path to the junction point instead.
D:\dev\Python364\python.exe -m find_junctions
ERROR:root:Link: C:\test\dev (Target: C:\test\dev)
Here are some related open issues on the Python issue tracker:
- Issue 29248: os.readlink fails on Windows (open since Jan 2017)
- Issue 23407: os.walk always follows Windows junctions (open since Feb 2015)
- Issue 14094: ntpath.realpath() should use GetFinalPathNameByHandle() (open since Feb 2012)