    Solved File Management removing unprintable characters

    IT Discussion
    users file management unprintable file system archiving
    • DustinB3403 @scottalanmiller

      @scottalanmiller said in File Management removing unprintable characters:

      That Regex should, in theory, allow you to pass in a file name, and resave it with a hyphen instead of any unprintable character.

      You just need to loop through the files.

      No way to not have to pass an individual file through this? Hoping to just be able to point this at a directory and it searches file and folder for unprintable characters replacing as it goes.

      (I know wish in one hand, crap in the other)

      • dafyre @DustinB3403

        @DustinB3403 said in File Management removing unprintable characters:

        @scottalanmiller said in File Management removing unprintable characters:

        That Regex should, in theory, allow you to pass in a file name, and resave it with a hyphen instead of any unprintable character.

        You just need to loop through the files.

        No way to not have to pass an individual file through this? Hoping to just be able to point this at a directory and it searches file and folder for unprintable characters replacing as it goes.

        (I know wish in one hand, crap in the other)

        Something like...

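        $myFiles = dir -recurse C:\Some\Folder

        foreach ($file in $myFiles) {
          # Use the name only; -replace on the object itself can act on the whole path
          $newName = $file.Name -replace '[^\p{L}\p{Nd}]', '-'
          ren -LiteralPath $file.FullName -NewName $newName
        }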
        
        • scottalanmiller @DustinB3403

          @DustinB3403 said in File Management removing unprintable characters:

          @scottalanmiller said in File Management removing unprintable characters:

          That Regex should, in theory, allow you to pass in a file name, and resave it with a hyphen instead of any unprintable character.

          You just need to loop through the files.

          No way to not have to pass an individual file through this? Hoping to just be able to point this at a directory and it searches file and folder for unprintable characters replacing as it goes.

          (I know wish in one hand, crap in the other)

          Well, that's what the loop would be for. One way or another, the files have to be iterated through.

          • scottalanmiller @dafyre

            @dafyre exactly

            • psophos @dafyre

              @dafyre You might want to throw in an if statement there to check that the names are different before you rename.
              I'm assuming the ren will rename the file even if the names are the same, but maybe it won't. Some very light testing suggests it may (no errors thrown).
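
              A rough sketch of that check, for what it's worth. The function names Get-CleanName and Rename-IfChanged are made up for illustration, the regex is the one from @dafyre's loop, and -WhatIf is passed through so nothing is renamed until you drop it. I believe a folder renamed to its own name does raise an error even where a file does not, so the check matters more once folders are in the mix.

              ```powershell
              # Hypothetical helper: swap anything that is not a letter or digit for a hyphen
              function Get-CleanName {
                  param([string]$Name)
                  return ($Name -replace '[^\p{L}\p{Nd}]', '-')
              }

              # Hypothetical helper: rename only when the cleaned name actually differs
              function Rename-IfChanged {
                  param(
                      [System.IO.FileSystemInfo]$Item,
                      [switch]$WhatIf
                  )
                  $newName = Get-CleanName -Name $Item.Name
                  if ($newName -ceq $Item.Name) {
                      return $false   # nothing to do, so no rename call at all
                  }
                  Rename-Item -LiteralPath $Item.FullName -NewName $newName -WhatIf:$WhatIf
                  return $true
              }
              ```

              Inside the loop that would be called as Rename-IfChanged -Item $file -WhatIf. Note this regex still turns the dot before the extension into a hyphen.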

              • 1337

                NTFS stores filenames as UTF-16. All characters are allowed in filenames on NTFS, except reserved characters like
                < (less than)
                > (greater than)
                : (colon)
                " (double quote)
                / (forward slash)
                \ (backslash)
                | (vertical bar or pipe)
                ? (question mark)
                * (asterisk)
                and chr(0) to chr(31).
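
                To make that list concrete, a small sketch that tests a name against exactly the characters listed above. The function name Test-ReservedCharacter is hypothetical, and the set is typed out by hand from the list rather than taken from any API.

                ```powershell
                # Hypothetical helper: $true when the name contains a reserved character
                function Test-ReservedCharacter {
                    param([string]$Name)
                    # Typed out by hand from the list above: nine symbols plus control codes 0 to 31
                    [char[]]$reserved = [char[]]'<>:"/\|?*' + [char[]](0..31)
                    return ($Name.IndexOfAny($reserved) -ge 0)
                }

                # A bullet (U+2022) is not in the reserved list
                Test-ReservedCharacter "report$([char]0x2022)final.txt"

                # A question mark is
                Test-ReservedCharacter 'what?.txt'
                ```

                The first call should come back False and the second True, which is the point: a bullet is a perfectly legal name character as far as this list goes.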

                If the backup system can't handle all allowed characters in a filename, then that is the problem that needs to be fixed.

                There is no such thing as an unprintable character. You just need a font that has that character defined.

                This is a screenshot from Windows showing valid file and folder names:
                valid_filenames.png

                • JaredBusch @1337

                  @Pete-S said in File Management removing unprintable characters:

                  valid_filenames.png

                  Unrelated to the OP....
                  Why do those kanji look familiar? Like seriously...

                  • Obsolesce @DustinB3403

                    @DustinB3403 said in File Management removing unprintable characters:

                    So long story short I have users who use unprintable characters in file and folder paths, such as  or the little floating dot.

                    Can anyone think of some quick way to replace all of these in every folder and sub folder and file with a normal hyphen?

                    There's some built-in cmdlets to do this pretty easily.

                    Building on @scottalanmiller's regex, I added the exclusion of punctuation characters, because in my testing, it was replacing the "dot" before the file extension. I did not go looking for a way to just exclude dots. Someone else can do that.

                    This line gets each item in a directory and its subdirectories (-Recurse) and replaces, with a hyphen, any character that is not a letter (in any language's alphabet), a number, or a punctuation mark.

                    Here's how I'd go about it:

                    
                    # Remove the -WhatIf when you are ready to make the changes.
                    (Get-ChildItem -Path "C:\test" -Recurse | Rename-Item -NewName {$_.Name -replace '[^\p{L}\p{Nd}\p{P}]','-'} -WhatIf)
                    
                    

                    Using the -WhatIf switch allows the code to be run while telling you exactly what will change, without actually doing it. Remove the -WhatIf when you are ready to make the changes.
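
                    For anyone picking up the "just exclude dots" part: a minimal sketch that cleans only the part of the name before the extension and then puts the extension back. Get-SafeName and its keep-list (letters, digits, space, underscore, hyphen) are assumptions for illustration, not something tested against a real share. One caveat on \p{P}: as far as I know the bullet (U+2022) and the middle dot (U+00B7) are themselves classed as punctuation in Unicode, so a class that keeps \p{P} leaves them in place.

                    ```powershell
                    # Hypothetical helper: clean the base name, keep the extension untouched
                    function Get-SafeName {
                        param([string]$Name)
                        $base = [System.IO.Path]::GetFileNameWithoutExtension($Name)
                        $ext  = [System.IO.Path]::GetExtension($Name)
                        # Assumed keep-list: letters, digits, space, underscore, hyphen
                        $clean = $base -replace '[^\p{L}\p{Nd} _-]', '-'
                        return "$clean$ext"
                    }

                    $bullet = [string][char]0x2022

                    # Check whether the bullet counts as punctuation for \p{P}
                    $bullet -match '\p{P}'

                    # Dots inside the base name are replaced, the final extension is kept
                    Get-SafeName "report${bullet}v1.2.docx"

                    $root = 'C:\test'   # same placeholder path as the one-liner above
                    if (Test-Path -LiteralPath $root) {
                        Get-ChildItem -Path $root -Recurse |
                            Rename-Item -NewName { Get-SafeName $_.Name } -WhatIf
                    }
                    ```

                    A folder name with a dot in it gets the same treatment, so whatever follows its last dot is kept as if it were an extension. The extension itself is never cleaned.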

                    • DustinB3403 @Obsolesce

                      @Obsolesce That is absolutely the perfect answer.

                      • DustinB3403 @1337

                        @Pete-S said in File Management removing unprintable characters:

                        If the backup system can't handle all allowed characters in a filename, then that is the problem that needs to be fixed.

                        While I would generally agree, I have no way to control the file system in my backup location, which complains about the files and just skips them.

                        So while I agree, "use a better backup tool" isn't the answer I'm looking for.

                        @Obsolesce this is great, thank you!

                        • 1337

                          @DustinB3403

                          A few things to think about:

                          • Expect some support calls when you rename customer files and they can't open them using "recent files".
                          • Also expect problems when you rename a file that someone has open. Normally you can't => script fail.
                          • Also expect problems if someone makes a new file with the original name. Now there will be two files that will have the same name after the renaming process => script fail.
                          • Also hope that there aren't any application files that will not pass the regex => application fail.

                          Is the backup something homemade or something very old perhaps?

                          I suggest spending some time finding out exactly which characters are supported and making sure the regex matches exactly that and is not any more restrictive than needed.
                          Then make sure the renaming script can handle errors mentioned above.
                          And I suggest some log file or email sent with files that can't be renamed so you know.
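
                          Roughly what that could look like, as a sketch and not a finished tool. Rename-WithLog, its parameters and the log wording are all invented for illustration; the default pattern is the regex posted earlier in the thread and should be swapped for whatever the backup target really rejects. It skips a rename when the new name is already taken, catches failures such as locked files, and writes both to a log file. It also works deepest path first, on the assumption that renaming a folder before its contents would leave the already-collected child paths stale.

                          ```powershell
                          # Hypothetical helper: rename with collision check, error handling and a log
                          function Rename-WithLog {
                              param(
                                  [string]$Root,
                                  [string]$LogPath,
                                  [string]$Pattern = '[^\p{L}\p{Nd}\p{P}]',   # regex from earlier in the thread
                                  [switch]$WhatIf
                              )
                              # Longest full path first, so children are handled before their parent folder
                              $items = Get-ChildItem -LiteralPath $Root -Recurse |
                                  Sort-Object { $_.FullName.Length } -Descending

                              foreach ($item in $items) {
                                  $newName = $item.Name -replace $Pattern, '-'
                                  if ($newName -ceq $item.Name) { continue }   # name is already fine

                                  $parent = [System.IO.Path]::GetDirectoryName($item.FullName)
                                  $target = [System.IO.Path]::Combine($parent, $newName)

                                  # Collision: another item already has the cleaned name
                                  if (Test-Path -LiteralPath $target) {
                                      Add-Content -LiteralPath $LogPath -Value "SKIPPED (name taken): $($item.FullName)"
                                      continue
                                  }
                                  try {
                                      # -ErrorAction Stop turns failures (for example a locked file) into catchable errors
                                      Rename-Item -LiteralPath $item.FullName -NewName $newName -WhatIf:$WhatIf -ErrorAction Stop
                                  }
                                  catch {
                                      Add-Content -LiteralPath $LogPath -Value "FAILED: $($item.FullName) : $($_.Exception.Message)"
                                  }
                              }
                          }
                          ```

                          A dry run would be something like Rename-WithLog -Root 'C:\test' -LogPath 'C:\logs\rename.log' -WhatIf, with both paths being placeholders. Emailing the log is left out here.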

                          • DustinB3403 @1337

                            @Pete-S said in File Management removing unprintable characters:

                            Expect some support calls when you rename customer files and they can't open them using "recent files".

                            Not my concern in the least

                            @Pete-S said in File Management removing unprintable characters:

                            Also expect problems when you rename a file that someone has open. Normally you can't => script fail.

                            Only targeting files that are marked for "archive" - so not an issue

                            @Pete-S said in File Management removing unprintable characters:

                            Also expect problems if someone makes a new file with the original name. Now there will be two files that will have the same name after the renaming process => script fail.

                            We already punch people in the back of the head for this

                            @Pete-S said in File Management removing unprintable characters:

                            Also hope that there aren't any application files that will not pass the regex => application fail.

                            Nope

                            @Pete-S said in File Management removing unprintable characters:

                            Is the backup something homemade or something very old perhaps?

                            Using B2 CLI to move stuff, so not old at all.

                            • Dashrender

                              This really sounds like an HR problem that you're kinda solving with tech. If it's not against company policy to use those characters in file names/paths - then what you're wanting to do is likely the wrong approach... and instead management should be approving you to find a new backup solution that works with those filenames.

                              • DustinB3403 @Dashrender

                                @Dashrender said in File Management removing unprintable characters:

                                This really sounds like an HR problem that you're kinda solving with tech. If it's not against company policy to use those characters in file names/paths - then what you're wanting to do is likely the wrong approach... and instead management should be approving you to find a new backup solution that works with those filenames.

                                There is no policy on this, nor would there be one.

                                This is an insane way to think about that.

                                • DustinB3403

                                  Okay so with a bit more offline assistance from @Obsolesce

                                  This is what works

                                  (Get-ChildItem -Path "some\path" -Recurse | Rename-Item -NewName {$_.Name -replace '•','-'} -verbose)

                                  Obviously it's only hitting on that bullet point, and replacing it with a hyphen, but that's better than the operation failing over and over and not ever knowing it.

                                  • Obsolesce @DustinB3403

                                    @DustinB3403 said in File Management removing unprintable characters:

                                    Okay so with a bit more offline assistance from @Obsolesce

                                    This is what works

                                    (Get-ChildItem -Path "some\path" -Recurse | Rename-Item -NewName {$_.Name -replace '•','-'} -verbose)

                                    Obviously it's only hitting on that bullet point, and replacing it with a hyphen, but that's better than the operation failing over and over and not ever knowing it.

                                    If you want to know if there's an error, you can build that in like this:

                                    
                                    (Get-ChildItem -Path "C:\test" -Recurse | Rename-Item -NewName {$_.Name -replace '[^\p{L}\p{Nd}\p{P}]','-'} -WhatIf -ErrorAction SilentlyContinue -ErrorVariable daError)
                                    if ($daError) {
                                        Write-Output "ERROR - There was an error. Pay attention : [$daError]"
                                    }
                                    
                                    
                                    • Dashrender @DustinB3403

                                      @DustinB3403 said in File Management removing unprintable characters:

                                      @Dashrender said in File Management removing unprintable characters:

                                      This really sounds like an HR problem that you're kinda solving with tech. If it's not against company policy to use those characters in file names/paths - then what you're wanting to do is likely the wrong approach... and instead management should be approving you to find a new backup solution that works with those filenames.

                                      There is no policy on this, nor would there be one.

                                      This is an insane way to think about that.

                                      See, you wanting to change the way users work to suit IT is generally something @scottalanmiller would say is likely wrong for the business.

                                      • 1337 @DustinB3403

                                        @DustinB3403 said in File Management removing unprintable characters:

                                        @Pete-S said in File Management removing unprintable characters:
                                        Is the backup something homemade or something very old perhaps?

                                        Using B2 CLI to move stuff, so not old at all.

                                        You have to realize that there are multi-national companies everywhere. So when you have problems with filenames, it's extremely unlikely that something modern will not support whatever it is you are doing - as long as the filename is valid on the OS you are using. And Backblaze of course supports all characters.

                                        https://www.backblaze.com/b2/docs/string_encoding.html

                                        Maybe there is a bug somewhere or maybe you are using it the wrong way. But this is the actual problem, not that users use whatever filename they want.

                                        • scottalanmiller @DustinB3403

                                          @DustinB3403 said in File Management removing unprintable characters:

                                          @Dashrender said in File Management removing unprintable characters:

                                          This really sounds like an HR problem that you're kinda solving with tech. If it's not against company policy to use those characters in file names/paths - then what you're wanting to do is likely the wrong approach... and instead management should be approving you to find a new backup solution that works with those filenames.

                                          There is no policy on this, nor would there be one.

                                          This is an insane way to think about that.

                                          Well, let's back up. Why are people using those characters? Chances are they are real characters. They are getting in there somehow, and not likely from someone trying to be weird, but likely from an alternative keyboard or something.

                                          • DustinB3403 @1337

                                            @Pete-S Using a bullet point in a damn folder or file name is not at all normal!

                                            FFS.

                                            If you think a file or folder named

                                            • some crap

                                            Is used globally, you're just wrong.
