Compare columns between 2 files and delete non common columns

Copper Contributor

Hello,

I have a bunch of files in folder A and their corresponding metadata files in folder B. I want to loop though the data files and check if the columns are the same in the metadata file. If the columns in both match, no action to is to be taken. If Data file has more columns than metadata file, then those columns should be deleted from incoming data file.

ex

Data file is ps_job.dat

"empid"|"name"|"deptid"|"zipcode"|"salary"|"gender"
"1"|"Tom"|"10"|"11111"|"1000"|"M"
"2"|"Ann"|"20"|"22222"|"2000"|"F"

 

Meta data file is ps_job_metadata.dat

"empid"|"name"|"zipcode"|"salary"

 

I would like my output to be 

"empid"|"name"|"zipcode"|"salary"

"1"|"Tom"|"11111"|"1000"
"2"|"Ann"|"22222"|"2000"

 

Thanks!

1 Reply

 

param(
[Parameter(Mandatory=$true)]
[string]$DataFile,
[Parameter(Mandatory=$true)]
[string]$MetaDataFile,
[string]$Separator = "|"
)
begin{

}
process{
#Read Meta Data File
$metacontent = Get-Content $MetaDataFile
#Get Columns
$metacolumns = $metacontent.Replace('"',"").Split($Separator)

#Parsing Data File
$datacontent = Get-Content $DataFile
$datacolumns = $datacontent[0].Replace('"',"").Split($Separator)
$data = @()
for($i = 1; $i -lt $datacontent.Length; $i++){
$values = $datacontent[$i].Replace('"',"").Split($Separator)
$obj = [ordered]@{}
for($j = 0; $j -lt $datacolumns.Length; $j++){
#$obj."$($datacolumns[$j])" = $values[$j]
$obj.Add($datacolumns[$j], $values[$j])
}
$data += New-Object -Property $obj -TypeName psobject
}

#Selecting columns
$filtered = $data | Select-Object -Property $metacolumns
 
#create file
$output = New-Item -Path . -Name "output.dat" -ItemType File -Force
$filtered | Export-Csv -Delimiter $Separator -Path $output
}
end{
Write-Output $filtered
}

@sykashif885264