Skip to content

Conversation

@iseki0
Copy link

@iseki0 iseki0 commented Sep 28, 2025

Problem

  • The FileSystem API is not working correctly on non-English Windows

Memo

The Windows path related string should be accessed by Win32 API which with W suffix, don't use A suffix API or you will get trouble on non-English content. By mingw is also unrecommended, they always get trouble on encoding handling. So, just use the plain Win32 API, the most eaist choice.

And...
I refactor your project layout, remove the nativeNonAndroid sourceset. (The dir list code is the only code in this sourceset.)
I don't think seperate the world be xx and nonXx is a good choice. Now we need the third part......
If I make something wrong, let me know.

@iseki0
Copy link
Author

iseki0 commented Sep 28, 2025

@fzhinkin

@iseki0
Copy link
Author

iseki0 commented Sep 28, 2025

Another bug exists, for the long path handling. I will fix it in another PR.

@fzhinkin fzhinkin self-requested a review September 29, 2025 07:39
@fzhinkin fzhinkin added the fs label Sep 29, 2025
@fzhinkin
Copy link
Collaborator

@iseki0 thanks for opening the PR!

Is it possible to add tests reproducing issues that might occur when using mingw's opendir/readdir (and SomethingSomethingA Win32 functions)?
Or these encoding-specific issues occur only on Windows systems with a specific setup? In this case, we need to figure out how to test them anyway.

@iseki0
Copy link
Author

iseki0 commented Sep 29, 2025

Is it possible to add tests reproducing issues that might occur when using mingw's opendir/readdir (and SomethingSomethingA Win32 functions)?

I will have a try, the simpliest way is add some file/folder with non-English filename. I will add it to mingw test directory.

@iseki0
Copy link
Author

iseki0 commented Sep 29, 2025

Please use star-imports instead.

@fzhinkin
Because my IDEA codestyle is not default, I suggest you add it NOW.(Sorry for my poorly English.)

@iseki0
Copy link
Author

iseki0 commented Sep 29, 2025

Or these encoding-specific issues occur only on Windows systems with a specific setup? In this case, we need to figure out how to test them anyway.

When setup Windows with the default configuration(non-English version and, <=Windows 10). The A suffix API handling string in the ANSI encoding, the ANSI means current code page encoding.
For Simplified Chinese locale it's 936(GBK), for Japanese locale it's 932(Shift-JIS). Some program (most part from UNIX-world, maybe include MinGW) require the code page is 65001(UTF-8), but it's not the default configuration. Of course, if the user change the OS codepage to 65001, other program will unable to running...

@iseki0
Copy link
Author

iseki0 commented Sep 29, 2025

Please do squash merging. And please add a co-author which2who@gmail.com, he also made a patch but he's late😂. He notice me that the basename/dirname... should be fixed too.

@iseki0 iseki0 requested a review from fzhinkin September 29, 2025 19:05
@iseki0
Copy link
Author

iseki0 commented Sep 29, 2025

Many testcases failed, because the Windows separator is '\' but the posix is '/'. The testcase compare the string directly...

@iseki0
Copy link
Author

iseki0 commented Sep 29, 2025

We shouldn't compare string directly. The path string shouldn't be comparing directly. It's platform dependent.

@iseki0 iseki0 changed the title Fix Windows target path encoding problem Fix the Windows platform implementation of the FileSystem API Sep 29, 2025
@iseki0
Copy link
Author

iseki0 commented Sep 29, 2025

OK, I fixed the SystemPathSeparator val... Now all test passed.

@iseki0 iseki0 force-pushed the fix/windows-encoding branch from 115c841 to 4d7d065 Compare October 9, 2025 12:36
@iseki0
Copy link
Author

iseki0 commented Oct 9, 2025

@fzhinkin Now, I rebased, please review it again. At least it fix the encoding problem, on Windows

@fzhinkin
Copy link
Collaborator

fzhinkin commented Oct 9, 2025

@iseki0,

And please add a co-author which2who@gmail.com, he also made a patch but he's late

I will need an explicit consent from that person for that.

OK, I fixed the SystemPathSeparator val... Now all test passed.

Unfortunately, there are a few test failures: https://teamcity.jetbrains.com/buildConfiguration/KotlinTools_KotlinxIo_BuildAggregated/5556685?buildTab=tests&status=failed

@iseki0
Copy link
Author

iseki0 commented Oct 10, 2025

    @Test
    fun testA(){
        memScoped {
            println(dirname("C:\\foo\\bar".cstr)!!.toKString())
            println(dirname("C:\\foo\\bar".cstr)!!.toKStringFromUtf8())
            println(dirname("C:\\啊啊啊啊\\123".cstr)!!.toKString())
            println(dirname("C:\\啊啊啊啊\\123".cstr)!!.toKStringFromUtf8())
            println(dirname("C:\\irohauta\\いろはにほへと\\ちりぬるを".cstr)!!.toKString())
            println(dirname("C:\\irohauta\\いろはにほへと\\ちりぬるを".cstr)!!.toKStringFromUtf8())
        }
    }

output

C:\foo
C:\foo
C:\啊啊啊�
C:\啊啊啊�
.
.

My Windows:

OS Name:                   Microsoft Windows 10 专业版 (professional version)
OS Version:                10.0.19045 N/A Build 19045
System Locale:             zh-cn;Chinese (China)
Input Locale:              zh-cn;Chinese (China)

Default code page: 936, GBK

@iseki0 iseki0 requested a review from fzhinkin October 10, 2025 07:24
@iseki0
Copy link
Author

iseki0 commented Oct 10, 2025

I add a skip to the TeamCity test kotlinx.io.files.SmokeFileTestWindowsMinGW.mingwProblem[mingwX64].

@iseki0
Copy link
Author

iseki0 commented Oct 12, 2025

I think the Windows path handling still have many problems, after this PR. This only fix the encoding problem and make it basiclly available, on Windows platform. So we'd better merge it quickly. Because after that we'd better to begin to consider the following problems:

  • Path creating: Should we do a basic normalization during the Path object creation? Such as fix and remove redundent slash, but don't resolve the double and single dot. (Because we need it to represent relative path, in some cases. And, the Java do it like that.)
  • Roots: At least on Windows we have many roots.
  • Long path: On Windows for historical reasons, there's a MAX_PATH limitation of the path string, for long path we have to add a prefix \\?\ (or \\?\UNC\ for UNC path). Should we normalize it during the Path object creating?
  • Normalization: Of course we need a normalize method because in the common sense we shouldn't do any resolve about the double and single dot segment during the path creation.

Any, maybe many others...

@iseki0
Copy link
Author

iseki0 commented Oct 28, 2025

@fzhinkin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants