SOLVED

split a text file using a string as delimiter

Copper Contributor

Greetings to the forum...

I have a (huge) text file structured like this:

____________________________________

string blahblahblah(1)

blahblahblah(2)

 

string blahblahblah(3)

blahblahblah(4)

 

string blahblahblah(5)

blahblahblah(6)

..........................................

_________________________________________

I need to end up with n files, one for each of the n occurrences of string, like this:

___________________________

string blahblahblah(1)

blahblahblah(2)
___________________________

___________________________

string blahblahblah(3)

blahblahblah(4)
___________________________

___________________________

string blahblahblah(5)

blahblahblah(6)
___________________________

and so on...

Obviously the various blahblahblah(x) are texts of variable length...

I know it's possible to do this with PowerShell, but unfortunately I haven't mastered it and the resources on the net didn't help me...

Can anyone help me?

Thank you.

6 Replies
best response confirmed by filigrana (Copper Contributor)
Solution
The script writes the new files to the location given by $baseOutputPath, naming them with a base name followed by a sequence number and the .txt extension (e.g., splitFile_1.txt, splitFile_2.txt, etc.).

# Define your parameters
$filePath = "C:\path\to\your\file.txt"            # Path to your huge text file
$delimiter = "string"                             # Your delimiter
$baseOutputPath = "C:\path\to\output\splitFile_"  # Base path and filename for output files (the folder must already exist)

# -match treats its right-hand side as a case-insensitive regular expression,
# so escape the delimiter to make sure it is matched literally
$pattern = [regex]::Escape($delimiter)

# Initialize variables
$fileCounter = 1
$currentContent = @()

# Read the file line by line
Get-Content -Path $filePath | ForEach-Object {
    # A line containing the delimiter starts a new chunk
    if ($_ -match $pattern -and $currentContent.Count -gt 0) {
        # Output the current content to a file
        $currentContent | Out-File -FilePath ($baseOutputPath + $fileCounter + ".txt")
        # Increment the file counter and reset the current content
        $fileCounter++
        $currentContent = @()
    }
    $currentContent += $_
}

# Don't forget to output the last chunk if it exists
if ($currentContent.Count -gt 0) {
    $currentContent | Out-File -FilePath ($baseOutputPath + $fileCounter + ".txt")
}
Here's how to use this script:

1. Set $filePath to the full path of your text file.
2. Change $delimiter to the string that marks the start of each section (it appears you're using "string" as your delimiter).
3. Set $baseOutputPath to the directory and base filename where you want to save the split files. The script appends a number to this base name to create each individual filename.

This script works by reading each line of the input file. Whenever it encounters a line containing the delimiter (indicating the start of a new section), it writes the accumulated lines to a new file and starts collecting lines afresh for the next file.

Remember to adjust the file paths and delimiter according to your specific needs before running the script.
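
For a really huge file, the same idea can also be written so that each line goes straight to disk instead of being collected in an array first. The following is only a rough sketch under a few assumptions, not a replacement for the script above: the function name Split-TextByDelimiter and its parameter names are made up for illustration, the input is assumed to be UTF-8 or to carry a byte-order mark, $BaseOutputPath should be a full path to an existing folder, and the output is written as UTF-8 without a BOM (which differs from what Out-File produces in Windows PowerShell 5.1).

```powershell
# Hypothetical helper (name and parameters are illustrative, not from the thread).
function Split-TextByDelimiter {
    param(
        [Parameter(Mandatory)] [string] $Path,            # input text file
        [Parameter(Mandatory)] [string] $Delimiter,       # literal text that starts a new chunk
        [Parameter(Mandatory)] [string] $BaseOutputPath   # full path + base name, e.g. C:\out\splitFile_
    )

    $pattern = [regex]::Escape($Delimiter)   # match the delimiter literally
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $fileCounter = 0
    $writer = $null
    try {
        # ReadLines enumerates the file lazily, one line at a time
        foreach ($line in [System.IO.File]::ReadLines($fullPath)) {
            # Open a new output file on the first line and on every delimiter line
            if ($line -match $pattern -or $null -eq $writer) {
                if ($null -ne $writer) { $writer.Dispose() }
                $fileCounter++
                $writer = [System.IO.StreamWriter]::new($BaseOutputPath + $fileCounter + ".txt")
            }
            $writer.WriteLine($line)
        }
    }
    finally {
        # Close the last output file even if something failed midway
        if ($null -ne $writer) { $writer.Dispose() }
    }
    $fileCounter   # number of files written
}

# Example call with the placeholder paths used in the script above:
# Split-TextByDelimiter -Path "C:\path\to\your\file.txt" -Delimiter "string" -BaseOutputPath "C:\path\to\output\splitFile_"
```

As in the original script, any lines that appear before the first delimiter end up in the first output file.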

@Dalbir3

Uhm... I am an animal...
I have to save your script as split.ps1, put it in the folder with the huge text file, open PowerShell, go to that folder and run .\split.ps1. Is this correct?

IT WORKS!!!!!!!!!!!!
You are great!!!!!!!
Thank you!!!
Where can you learn all these great things about Windows PowerShell? I want to learn them too...

@filigrana 

 

YouTube

Udemy.com

Microsoft Learn, and google a few things.

If you want to invest in it:

pluralsight.com

CBT Nuggets

Amazon books

Essentially, learn 1-5 commands, then mix and match them. There are tons of scripts on GitHub.

Use Visual Studio Code or the PowerShell ISE.

I would take what you have there and add more scope to it to learn more on the PowerShell side, like logging the output.
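
As a starting point for the logging idea, something like the following could be dropped into the script. It is just a minimal sketch: the Write-SplitLog name and the log file location are assumptions for illustration, and the message text is only an example.

```powershell
# Hypothetical log location; any writable path works.
$logPath = Join-Path ([System.IO.Path]::GetTempPath()) "split.log"

# Hypothetical helper: prefix a message with a timestamp, append it to the log, and echo it.
function Write-SplitLog {
    param(
        [string] $Message,
        [string] $Path = $logPath
    )
    $line = "{0} {1}" -f (Get-Date -Format "yyyy-MM-dd HH:mm:ss"), $Message
    Add-Content -Path $Path -Value $line   # append to the log file
    Write-Host $line                       # also show it in the console
}

# Example: call this right after each Out-File in the split script.
Write-SplitLog "Wrote splitFile_1.txt"
```

From there you could log the line count of each chunk, or the total number of files at the end.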

 

The PowerShell Master Class on YouTube is a multi-part class that should be viewed as a playlist, starting with the first module: https://www.youtube.com/playlist?list=PLlVtbbG169nFq_hR7FcMYg32xsSAObuq8 Materials for the class are available at https://github.com/johnthebrit/PowerShellMC