Forum Discussion
Fred_Elmendorf
Jul 27, 2023
Brass Contributor
Read one data element from each of the first two rows of a text file, output to .csv
This new task requires recursing a file structure of hundreds of files to pull one data element from each of the first two rows in each of the appropriate text files, then exporting the results to a ...
LainRobertson
Jul 27, 2023
Silver Contributor
As with the last script, I'm going to assume this new requirement also only uses the first two lines.
Based on this sample data across two separate files:
File 1
Station=Station1 Name
RECORDER=V3.1.8
DFR=ONLINE
PC_TIME=07/26/2023-15:41:25
TIME_MARK_SOURCE=IRIG-B
TIME_MARK_TIME=07/26/2023-15:41:26.000000
Clock=SYNC(lock)
IEEE_1344=Yes
DATA_DISK_SIZE=999546736640
DATA_DRIVE=289GB/999GB
File 2
Station=Station2 Name
RECORDER=V3.1.8
DFR=ONLINE
PC_TIME=07/26/2023-15:41:25
TIME_MARK_SOURCE=IRIG-B
TIME_MARK_TIME=07/26/2023-15:41:26.000000
Clock=SYNC(lock)
IEEE_1344=Yes
DATA_DISK_SIZE=999546736640
DATA_DRIVE=289GB/999GB
We get this output:
"Station","Version","Filename","Created"
"Station1 Name","V3.1.8","D:\Data\Temp\Forum\forums.txt","20/03/2023 5:51:01 PM"
"Station2 Name","V3.1.8","D:\Data\Temp\Forum\forums2.txt","27/07/2023 9:49:17 PM"
From this example script:
# Specify our CSV output file name.
$TargetFile = "D:\Data\Temp\Forum\forum.csv";

# Instantiate a new HashTable outside the loop to minimise overhead, should this have to scale to tens of thousands of files (or more).
$HashTable = @{};

Get-ChildItem -Path "D:\Data\Temp\Forum\*" -Filter "forums*.txt" -Recurse -Depth 2 |
    ForEach-Object {
        $File = $_;

        # Read just the first two lines of the file.
        Get-Content -Path ($File.FullName) -TotalCount 2 |
            ForEach-Object {
                # Treat only the first "=" as the separator, so that any further "=" in the rest of the string does not break it into smaller substrings.
                $Parts = $_.Split([char[]]@("="), 2);

                # Some basic validation to ensure we're not reading an ineligible file: the first line must be the "Station" line.
                if (($HashTable.Count -gt 0) -or ("station" -eq $Parts[0]))
                {
                    # Add the key-value pairs to the HashTable. (The switch comparison is case-insensitive by default.)
                    switch ($Parts[0])
                    {
                        "station" {
                            $HashTable.Add("Station", $Parts[1]);   # Add the station value.
                            continue;
                        }

                        "recorder" {
                            $HashTable.Add("Version", $Parts[1]);   # Add the version.
                            break;
                        }

                        default {
                            # We're not going to do anything in this scenario, but it's good practice to include a default handler.
                            continue;
                        }
                    }
                }
            }

        # Time to output something useful, as long as both the values were obtained.
        if ($HashTable.Count -eq 2)
        {
            [PSCustomObject] @{
                Station = [string]$HashTable["Station"];
                Version = [string]$HashTable["Version"];
                Filename = $File.FullName;
                Created = $File.CreationTime;
            }
        }

        # Clean out the HashTable so it's ready for the next file (or simply to clean up after ourselves once finished).
        $HashTable.Clear();
    } | Export-Csv -NoTypeInformation -Path $TargetFile;
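To see why the split is limited to two substrings, here is a minimal sketch. The sample line is made up for illustration (it is not from the real data) and simply contains a second "=" in the value.

```powershell
# Hypothetical sample line whose value itself contains an "=".
$Line = "Station=North=East Yard";

# Limit the split to two substrings: the key, then everything after the first "=".
$Parts = $Line.Split([char[]]@("="), 2);

$Parts[0];   # Station
$Parts[1];   # North=East Yard
```

Without the count of 2, the value would be broken into "North" and "East Yard" and the second half would be lost when reading $Parts[1].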
Cheers,
Lain
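For anyone who wants to try the approach without touching real data, the following is a rough, self-contained sketch rather than the script above. It assumes a throwaway folder under the temp directory, made-up file names (forums1.txt, forums2.txt) and shortened sample contents, and it prints the CSV to the console instead of writing a file. It also builds a fresh hashtable per file, which is a simplification of the reuse-and-clear pattern in the original.

```powershell
# Build a throwaway folder holding two sample files (names and contents are assumptions).
$Root = Join-Path ([System.IO.Path]::GetTempPath()) ("forum-demo-" + [guid]::NewGuid().ToString("N"));
New-Item -ItemType Directory -Path $Root | Out-Null;
Set-Content -Path (Join-Path $Root "forums1.txt") -Value @("Station=Station1 Name", "RECORDER=V3.1.8", "DFR=ONLINE");
Set-Content -Path (Join-Path $Root "forums2.txt") -Value @("Station=Station2 Name", "RECORDER=V3.1.8", "DFR=ONLINE");

$Rows = Get-ChildItem -Path $Root -Filter "forums*.txt" -Recurse |
    ForEach-Object {
        $File = $_;
        $Values = @{};   # Hashtable keys created with @{} are case-insensitive.

        # Read only the first two lines and store each as a key-value pair.
        Get-Content -Path $File.FullName -TotalCount 2 |
            ForEach-Object {
                $Parts = $_.Split([char[]]@("="), 2);
                if ($Parts.Count -eq 2) { $Values[$Parts[0]] = $Parts[1]; }
            }

        # Emit a row only when both expected keys were found.
        if ($Values.ContainsKey("station") -and $Values.ContainsKey("recorder"))
        {
            [PSCustomObject] @{
                Station  = [string]$Values["station"];
                Version  = [string]$Values["recorder"];
                Filename = $File.Name;
            }
        }
    } | Sort-Object -Property Filename;

# Show the result as CSV text, then remove the sample folder.
$Rows | ConvertTo-Csv -NoTypeInformation;
Remove-Item -Path $Root -Recurse -Force;
```

Once the console output looks right, swapping ConvertTo-Csv for Export-Csv with a -Path argument writes the same rows to a file, as in the original script.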
Fred_Elmendorf
Jul 27, 2023
Brass Contributor
Hi Lain,
This solution worked perfectly on the first try. I really appreciate your concise but thorough responses and the explanatory comments. I'm fairly new to PowerShell and I'm using it out of necessity because of the resources available. These real-world tasks are helping me learn as I go.
Thanks!
Fred