SOLVED

Powershell is corrupting comma delimited files

%3CLINGO-SUB%20id%3D%22lingo-sub-2464555%22%20slang%3D%22en-US%22%3EPowershell%20is%20corrupting%20comma%20delimited%20files%3C%2FLINGO-SUB%3E%3CLINGO-BODY%20id%3D%22lingo-body-2464555%22%20slang%3D%22en-US%22%3E%3CP%3E%3CFONT%20size%3D%222%22%3EMy%20PowerShell%20script%20is%20adding%20double%20quotes%20and%20formatting%20the%20text%20correctly.%26nbsp%3B%20I%20am%20removing%20lines%20that%20have%20a%20particular%20string.%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%222%22%3E%3CSTRONG%3EThis%20line%20created%20by%20PowerShell%20script%3C%2FSTRONG%3E%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%222%22%3E%22RECTYPE%2CCNTBTCH%2CCNTITEM%2CIDVEND%2CIDINVC%2CTEXTTRX%2CDATEINVC%2CSWCALCTX%2CAMTGROSTOT%2CAMTINVCTOT%22%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%222%22%3E%3CSTRONG%3EThis%20line%20is%20the%20original.%20No%20double%20quotes.%3C%2FSTRONG%3E%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%222%22%3ERECTYPE%2CCNTBTCH%2CCNTITEM%2CIDVEND%2CIDINVC%2CTEXTTRX%2CDATEINVC%2CSWCALCTX%2CAMTGROSTOT%2CAMTINVCTOT%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CSTRONG%3E%3CFONT%20size%3D%222%22%3EHere%20they%20are%20for%20back-to-back%20comparison%3C%2FFONT%3E%3C%2FSTRONG%3E%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%222%22%3E%E2%80%9CRECTYPE%2CCNTBTCH%2CCNTITEM%2CIDVEND%2CIDINVC%2CTEXTTRX%2CDATEINVC%2CSWCALCTX%2CAMTGROSTOT%2CAMTINVCTOT%E2%80%9D%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%222%22%3ERECTYPE%2CCNTBTCH%2CCNTITEM%2CIDVEND%2CIDINVC%2CTEXTTRX%2CDATEINVC%2CSWCALCTX%2CAMTGROSTOT%2CAMTINVCTOT%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%222%22%3EThe%20files%20are%20CSV%20but%20Excel%20puts%20the%20new%20file%E2%80%99s%20complete%20line%20in%20one%20cell%20as%20if%20it%20does%20not%20recognize%20a%20delimiter%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%3CSTRONG%3E%3CFONT%20size%3D%222%22%3EThis%20is%20the%20new%20line%20in%20Hex.%3C%2FFONT%3E%3C%2FSTRONG%3E%3C%2FP%3E%3CP%3E%3CSPAN%20class%3D%22lia-inline-image-display-wrapper%20lia-image-align-inline%22%20image-alt%3D%22JoyriderBC_0-1624051322008.png%22%20style%3D%22width%3A%20400px%3B%22%3E%3CIMG%20src%3D%22https%3A%2F%2Ftechcommunity.microsoft.com%2Ft5%2Fimage%2Fserverpage%2Fimage-id%2F290055i750A698CF040FF28%2Fimage-size%2Fmedium%3Fv%3Dv2%26amp%3Bpx%3D400%22%20role%3D%22button%22%20title%3D%22JoyriderBC_0-1624051322008.png%22%20alt%3D%22JoyriderBC_0-1624051322008.png%22%20%2F%3E%3C%2FSPAN%3E%3C%2FP%3E%3CP%3E%3CSTRONG%3E%3CFONT%20size%3D%222%22%3EThis%20is%20the%20original%20line%3C%2FFONT%3E%3C%2FSTRONG%3E%3C%2FP%3E%3CP%3E%3CSPAN%20class%3D%22lia-inline-image-display-wrapper%20lia-image-align-inline%22%20image-alt%3D%22JoyriderBC_1-1624051448167.png%22%20style%3D%22width%3A%20400px%3B%22%3E%3CIMG%20src%3D%22https%3A%2F%2Ftechcommunity.microsoft.com%2Ft5%2Fimage%2Fserverpage%2Fimage-id%2F290056iF32A6E25B365A205%2Fimage-size%2Fmedium%3Fv%3Dv2%26amp%3Bpx%3D400%22%20role%3D%22button%22%20title%3D%22JoyriderBC_1-1624051448167.png%22%20alt%3D%22JoyriderBC_1-1624051448167.png%22%20%2F%3E%3C%2FSPAN%3E%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%222%22%3EThis%20is%20the%20.PS1%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%3CFONT%20size%3D%221%202%203%204%205%206%207%22%3E(get-content%20c%3A%5Cexports%5C'Receiving%20Export%20SAGE.CSV'%7C%20select-string%20-pattern%20'%26nbsp%3B%26nbsp%3B%20%2C0%2C%2C%2C0%2C0'%20-notmatch)%20%7C%20Out-File%20c%3A%5CExports%5C'Receiving%20Export%20SAGE%20Zeroes%20Gone.CSV'%3C%2FFONT%3E%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3C%2FLINGO-BODY%3E%3CLINGO-LABS%20id%3D%22lingo-labs-2464555%22%20slang%3D%22en-US%22%3E%3CLINGO-LABEL%3EWindows%20PowerShell%3C%2FLINGO-LABEL%3E%3C%2FLINGO-LABS%3E
New Contributor

My PowerShell script is adding double quotes and formatting the text incorrectly.  I am removing lines that have a particular string.

This line created by PowerShell script

"RECTYPE,CNTBTCH,CNTITEM,IDVEND,IDINVC,TEXTTRX,DATEINVC,SWCALCTX,AMTGROSTOT,AMTINVCTOT"

This line is the original. No double quotes.

RECTYPE,CNTBTCH,CNTITEM,IDVEND,IDINVC,TEXTTRX,DATEINVC,SWCALCTX,AMTGROSTOT,AMTINVCTOT

 

Here they are for back-to-back comparison

“RECTYPE,CNTBTCH,CNTITEM,IDVEND,IDINVC,TEXTTRX,DATEINVC,SWCALCTX,AMTGROSTOT,AMTINVCTOT”

RECTYPE,CNTBTCH,CNTITEM,IDVEND,IDINVC,TEXTTRX,DATEINVC,SWCALCTX,AMTGROSTOT,AMTINVCTOT

 

The files are CSV but Excel puts the new file’s complete line in one cell as if it does not recognize a delimiter

This is the new line in Hex.

JoyriderBC_0-1624051322008.png

This is the original line

JoyriderBC_1-1624051448167.png

This is the .PS1

(get-content c:\exports\'Receiving Export SAGE.CSV'| select-string -pattern '   ,0,,,0,0' -notmatch) | Out-File c:\Exports\'Receiving Export SAGE Zeroes Gone.CSV'

 

5 Replies

@JoyriderBC 

 

Hi,
I try it from my side and its working perfect with no double quote when using the Out-File.
Are you saving the content from a custom PSObject.
I got the double quotes when I replace the Out-File with Export-CSV, and manage to remove it by using PowerShell 7
(get-content C:\Users\1.CSV ) | Export-Csv c:\Fixed.CSV -UseQuotes never

@farismalaeb The double quotes are minor problem.  It is the record corruption.  See the HEX dump where after every character there are unprintable codes.

Can you share part of the output file, just put dump data in it and let me have a closer look.
Thanks
best response confirmed by JoyriderBC (New Contributor)
Solution

@JoyriderBC You can specify the encoding of the files with the -Encoding parameter.

 

For example:

 

Out-File c:\Exports\'Receiving Export SAGE Zeroes Gone.CSV' -Encoding utf8

 

 

@diecknet Thank you.  Your suggestion fixed the two problems I was having.  The updated script put a blank line as the first line in the output file.  The input file does not have this blank line.  The import into my program fails.  How can the blank line be prevented?

Output file

JoyriderBC_0-1625519976161.png

Input file

JoyriderBC_1-1625520079164.png