Forum Discussion
Identify Excel files sans extensions
I created a script for you, did some quick test and it seemed to work but only for XLSX files. XLSX files are in fact archive files so you can open them for example whit 7-zip or even better whit powershell. I will use this possibility to check if there is a specific folder (named xl) inside the (archive)file. If the folder is inside the archive I know pretty sure that we are dealing with an XLSX file. Use this script whit conscience and first whit a small subset of your data. Use this script at your own risk. The script does the following:
Checks (recursively) if there are without extension in the sourcefolder. In sequence copy the files to the processfolder and add the .zip extension. Then expand the archive to the archive folder and check if a folder named xl is there. If it’s there I set type to XLSX, if not I set type to unknow and if the file is to large I set type to file_to_big. Then I add the results to the outputarray and convert that array to a nice CSV output. Hopefully you can use this CSV to plan your next move!
Once again be careful using this script. Hope it’s useful to you! Take care!
Grtz, Manfred de Laat
The script:
# Use script whit caution!
# Variables
$sourcefolder = "c:\temp"
$processfolder = "c:\temp\process"
$maxfilesize = "50000" # 50 MB
# Get all files in $sourcefolder (and subfolders), whitout extension
$items = Get-ChildItem -Path $sourcefolder -File | ?{$_.Extension -eq ""}
$outputarray = @()
foreach($item in $items){
# Create and/or use the processfolder
if(-NOT(Test-Path $processfolder)){
New-Item -Path $processfolder -ItemType Directory | Out-Null
}
# Create Archive folder to extract possible XLSX file
$archivefolder = ($processfolder + "\archive")
New-Item -ItemType Directory $archivefolder | Out-Null
if($item.LinkType -lt $maxfilesize){
# Copy and Extract file
$target = ($processfolder + "\" + $item.BaseName + ".zip")
Copy-Item $item.FullName $target
Expand-Archive $target -DestinationPath $archivefolder -ErrorAction SilentlyContinue
# Check if Archive folder contains a childfolder named "xl"
if(Test-Path ($archivefolder + "\xl")){
$type = "XLSX"
} else {
$type = "unknown"
}
} else {
$type = "file_to_large"
}
$outputarray += New-Object PSOBJECT -Property @{Name = $item.FullName; Type = $type}
Remove-Item $target -Force
Remove-Item $archivefolder -Recurse -Force
}
Remove-Item $processfolder -Recurse -Force
$outputarray | ConvertTo-Csv -NoTypeInformation