SOLVED

Script to Convert fixed width txt to csv missing fields

Brass Contributor

Here is the script:

 

 

 

[cmdletbinding()]
Param(
    [parameter(Mandatory=$true)][string]$Path
)



function Get-ExportHeader([string]$Path)
{
    if (Test-Path -Path $Path)
    {
        # Check the first 10 rows of the text file only to prevent pointlessly reading a large file.
        $HeaderExtract = Get-Content -Path $Path -TotalCount 10;

        for ($index = 0; $index -lt $HeaderExtract.Count; $index++)
        {
            $Line = $HeaderExtract[$index];

            if (($Line.Length -gt 0) -and ([regex]::Replace($Line, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0))
            {
                # This is our row of dashes where each sequence indicates the field length and is separated by exactly one space.
                # The line prior to this is the header, though we need to be wary that there may not be one (unlikely but possible.)
                # Also, be wary that headers appear to be justified, just like the column values themselves. See XREF_NUM as one example.

                if ($index -gt 0)
                {
                    $CurrentStop = 0;

                    $Header = foreach ($Marker in $Line.Split(' '))
                    {
                        [PSCustomObject] @{
                            Name = if (($CurrentStop + $Marker.Length) -lt $Line.Length) { $HeaderExtract[$index-1].SubString($CurrentStop, $Marker.Length).Trim(); } else { $HeaderExtract[$index-1].SubString($CurrentStop).Trim(); };
                            Position = $CurrentStop;
                            Length = $Marker.Length;
                        }

                        $CurrentStop += $Marker.Length + 1;
                    }

                    return($Header);
                }
            }
        }
    }

    # Indicate we couldn't construct a header.
    return($null);
}

function Get-ExportData([string]$Path, $Header)
{
    $FoundHeader = $false;

    Get-Content -Path $Path | ForEach-Object {
        if (-not $FoundHeader)
        {
            $FoundHeader = ($_.Length -gt 0) -and ([regex]::Replace($_, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0);
        }
        else
        {
            if ($_.Length -gt 0)
            {
                [hashtable]$HashTable = [hashtable]::new();

                for ($index = 0; $index -lt $Header.Count; $index++)
                {
                    if ($Header[$index].Position -ge $_.Length)
                    {
                        $HashTable.Add($Header[$index].Name, $null);
                    }
                    elseif ($index -lt ($Header.Count - 1))
                    {
                        $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position, $Header[$index].Length).Trim());
                    }
                    else
                    {
                        $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position).Trim());
                    }
                }
            
                [PSCustomObject]$HashTable;
            }
        }
    }
}
#region Script body.
$Header = Get-ExportHeader -Path $Path;

if ($null -ne $Header)
{
    Get-ExportData -Path $Path -Header $Header;
}

 

 

 

Here is an example of the text I'm converting to a CSV. I've noticed that when the EXT4 and FAX_NUM fields are blank, the preceding field "EMP_Title" fails to show up in the generated csv file. So instead of "Chief of Security" the field will be blank.

 

 

FIRST_NAME                     MI LAST_NAME                      SUFFIX                SUPV   XREF_NUM PHONE          OFFICE                              LOCAT  OFFICE_STREET                  OFFICE_CITY                    ST OFF_ZIP    EMP_TITLE                            EXT4 FAX_NUM
------------------------------ -- ------------------------------ --------------- ---------- ---------- -------------- ----------------------------------- ------ ------------------------------ ------------------------------ -- ---------- ------------------------------------ ---- --------------
Saul                           J. Doenur                                              19539      12652 (234) 225-3650 Administrative	                  F561   280 East First ST., Ste. 110   Houston                        TX 34567      Human Resources                      1885 (234) 225-3661
Adam                           A. Doeeu                                                4825      13847 (123) 688-3497 Messenger Service                   M122   Rm. 122                        Austin                         TX 12345      Messenger                            3497 (123) 137-6903
Sam                            I. Doevedo                                             18631      14137 (123) 688-2529 First Floor                         M946   Rm. 946                        Austin                         TX 12345      Front Desk                           2529 (123) 137-6909
David                          V. Doems                                               18446      15326 (123) 688-2338 Security                            D319   Rm. 319                        Austin                         TX 12345      Chief of Security           
John                           P. Doeabbo                        JR                    5328       9398 (123) 688-2322 Second Floor                        M811   Rm. 811                        Austin                         TX 12345      Human Resources                      2322 (123) 137-6875

 

 

 


When I run the conversion script I see this error multiple times, I suspect it has something to do with the blank EXT4 fields.

 

 

Exception calling "Substring" with "2" argument(s): "Index and length must refer to a location within the string.
Parameter name: length"
At \\domain.local\dfs\Scripts\ConvertFixed-Test.ps1:73 char:25
+ ...             $HashTable.Add($Header[$index].Name, $_.SubString($Header ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : ArgumentOutOfRangeException

 

 

 

 Any help would be greatly appreciated as I'm not sure where to begin fixing this issue.

1 Reply
best response confirmed by Baron164 (Brass Contributor)
Solution

I made the following change which appears to have resolved the issue.

                for ($index = 0; $index -lt $Header.Count; $index++)
            {
                if ($Header[$index].Position -ge $_.Length)
                {
                    $HashTable.Add($Header[$index].Name, $null);
                }
                #elseif ($index -lt ($Header.Count - 1))
                elseif ($Header[$index].Position + $Header[$index].Length -lt ($_.length))
                {
                    $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position, $Header[$index].Length).Trim());
                }
                else
                {
                    $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position).Trim());
                }
            }
1 best response

Accepted Solutions
best response confirmed by Baron164 (Brass Contributor)
Solution

I made the following change which appears to have resolved the issue.

                for ($index = 0; $index -lt $Header.Count; $index++)
            {
                if ($Header[$index].Position -ge $_.Length)
                {
                    $HashTable.Add($Header[$index].Name, $null);
                }
                #elseif ($index -lt ($Header.Count - 1))
                elseif ($Header[$index].Position + $Header[$index].Length -lt ($_.length))
                {
                    $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position, $Header[$index].Length).Trim());
                }
                else
                {
                    $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position).Trim());
                }
            }

View solution in original post