Forum Discussion
split a text file using a string as delimiter
Greetings to the forum...
I have a (huge) text file structured like this:
____________________________________
string blahblahblah(1)
blahblahblah(2)
string blahblahblah(3)
blahblahblah(4)
string blahblahblah(5)
blahblahblah(6)
..........................................
_________________________________________
I need to obtain n files, one for each of the n occurrences of string, laid out like this:
___________________________
string blahblahblah(1)
blahblahblah(2)
___________________________
___________________________
string blahblahblah(3)
blahblahblah(4)
___________________________
___________________________
string blahblahblah(5)
blahblahblah(6)
___________________________
and so on...
Obviously, the various blahblahblah(x) are texts of variable length...
I know it's possible to do this with PowerShell, but unfortunately I haven't mastered it, and the resources on the net didn't help me...
Can anyone help me?
Thank you.
- The script writes the new files to the location set in $baseOutputPath, naming them with a base name followed by a sequence number and the .txt extension (e.g., splitFile_1.txt, splitFile_2.txt, etc.).
    # Define your parameters
    $filePath = "C:\path\to\your\file.txt"            # Path to your huge text file
    $delimiter = "string"                             # Text that starts each new section
    $baseOutputPath = "C:\path\to\output\splitFile_"  # Base path and filename for output files

    # Match the delimiter only at the start of a line, treating it as literal text
    # (-match is case-insensitive; use -cmatch for a case-sensitive test)
    $pattern = '^' + [regex]::Escape($delimiter)

    # Initialize variables
    $fileCounter = 1
    $currentContent = [System.Collections.Generic.List[string]]::new()

    # Read the file line by line
    Get-Content -Path $filePath | ForEach-Object {
        if ($_ -match $pattern -and $currentContent.Count -gt 0) {
            # Write the accumulated section to its own file
            # (Windows PowerShell 5.1 writes UTF-16 by default; add -Encoding utf8 if you need UTF-8)
            $currentContent | Out-File -FilePath ($baseOutputPath + $fileCounter + ".txt")
            # Increment the file counter and reset the current content
            $fileCounter++
            $currentContent.Clear()
        }
        $currentContent.Add($_)
    }

    # Don't forget to output the last chunk if it exists
    if ($currentContent.Count -gt 0) {
        $currentContent | Out-File -FilePath ($baseOutputPath + $fileCounter + ".txt")
    }
Here's how to use this script:
Set $filePath to the full path of your text file.
Change $delimiter to the string that marks the start of each section (in your example, "string").
Set $baseOutputPath to the directory and base filename where you want to save the split files. The script will append numbers to this base name to create the individual filenames.
This script works by reading each line of the input file. Whenever it encounters the delimiter (indicating the start of a new section), it writes the accumulated lines to a new file and starts collecting lines afresh for the next file.
Remember to adjust the file paths and delimiter according to your specific needs before running the script.
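Since the file is described as huge, here is a rough sketch of a streaming variant of the same idea. It is not the script above, just one possible alternative: the function name Split-FileByDelimiter and its parameter names are made up for illustration. It assumes PowerShell 5 or later, full (not relative) paths, and that each section begins on a line that starts with the delimiter, compared case-sensitively. Each line is written to the current output file as soon as it is read, so no section is held in memory.

```powershell
# Hypothetical helper (name and parameters are assumptions, not from the original script).
# Streams the input and writes each section straight to its own numbered file.
function Split-FileByDelimiter {
    param(
        [Parameter(Mandatory)] [string] $FilePath,       # full path to the input file
        [Parameter(Mandatory)] [string] $Delimiter,      # text that starts each new section
        [Parameter(Mandatory)] [string] $BaseOutputPath  # e.g. C:\path\to\output\splitFile_
    )

    $fileCounter = 0
    $writer = $null
    try {
        # File.ReadLines enumerates the file lazily, one line at a time
        foreach ($line in [System.IO.File]::ReadLines($FilePath)) {
            # Start a new output file at every delimiter line, and also for
            # any leading lines that appear before the first delimiter
            $startsSection = $line.StartsWith($Delimiter, [System.StringComparison]::Ordinal)
            if ($startsSection -or $null -eq $writer) {
                if ($null -ne $writer) { $writer.Dispose() }
                $fileCounter++
                $writer = [System.IO.StreamWriter]::new($BaseOutputPath + $fileCounter + '.txt')
            }
            $writer.WriteLine($line)
        }
    }
    finally {
        # Close the last output file even if something failed part-way through
        if ($null -ne $writer) { $writer.Dispose() }
    }

    return $fileCounter   # number of files written
}

# Example call (paths are placeholders, adjust before running):
# Split-FileByDelimiter -FilePath 'C:\path\to\your\file.txt' -Delimiter 'string' -BaseOutputPath 'C:\path\to\output\splitFile_'
```

The return value is the number of files written, which should equal the number of lines starting with the delimiter (plus one if the file has text before the first delimiter). StreamWriter writes UTF-8 by default, so the output encoding may differ from what Out-File produces.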
6 Replies
- filigrana (Copper Contributor): Where can you learn all these great things about Windows PowerShell? I want to learn them too...
- Dalbir3 (Copper Contributor)
YouTube
Udemy.com
Microsoft Learn; google a few things
If you want to invest in it:
pluralsight.com
CBT Nuggets
Amazon books
Essentially, learn 1-5 commands and then mix and match them; there are tons of scripts on GitHub. Use Visual Studio Code or the PowerShell ISE.
I would take what you have there and extend it to learn more on the PowerShell side, for example by logging the output.
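As one possible take on the logging idea, a small sketch: the helper name Write-SplitLog, the log path and the line format are all assumptions for illustration, not something from the thread. It appends a timestamped line to a log file each time it is called, so it could be invoked right after each file is written by the split script.

```powershell
# Hypothetical logging helper (name, parameters and log format are assumptions)
function Write-SplitLog {
    param(
        [Parameter(Mandatory)] [string] $LogPath,  # log file to append to
        [Parameter(Mandatory)] [string] $Message   # text to record
    )
    # Prefix each entry with a sortable timestamp
    $stamp = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
    Add-Content -Path $LogPath -Value "$stamp  $Message"
}

# Example: call this right after each Out-File in the split script
# Write-SplitLog -LogPath 'C:\path\to\output\split.log' -Message "Wrote splitFile_1.txt"
```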
- Dalbir3 (Copper Contributor): The script writes the new files to the location set in $baseOutputPath, naming them with a base name followed by a sequence number and the .txt extension (e.g., splitFile_1.txt, splitFile_2.txt, etc.).