"The input couldn't be recognized as a valid Excel document."

MVP

Such alert could appear when you try to use Power BI connector on Excel file. It's understandable if the source file is corrupted and can't be opened in Excel. However, it looks strange if Excel opens the file in question and shows nothing wrong.

 

Based on our experience above is usually means what something is wrong with XML scheme of the Excel workbook.

 

Mushup trace (Data->New Query->Query Options->Diagnostics->Enable tracing) could give some additional information, but often not enough to find the reason.

We had two main scenarios

 

  • XML scheme is not complete

Usually if Excel file was generated by third-party tool. Such tool could generate quite limited XML scheme which is enough to open the file in Excel and to work with it, but not enough for Power BI connector. As an example, trace log shows

[DataFormat.Error] The input couldn't be recognized as a valid Excel document.\r\nStackTrace:\n…
…
[DataFormat.Error] We couldn't find a part named '/xl/sharedStrings.xml' in the Excel package.\r\nStackTrace:\n…

Such case is easy to fix – it's enough to open the file in Excel and save it (without any changes) – Excel is clever enough to fix the scheme. For the routine regular tasks we use poweshell script which does exactly the same in background.

 

  • There is the link within Excel file which is not recognizable as valid

Usually if Excel file is synced/kept with some cloud storage. One of the variants, wrong link could appear with copy/paste from another such file. That could be active link in one of the cells; or the link within conditional formatting formula; or even the link which actually isn't used by Excel but kept somewhere inside the scheme. For example, in one of the files I found in Data->Consolidate->All references the link like

'\\drive.tresorit.com@7235\Tresors….\[file.xlsx]Sheet'!$AC$6:$AC$357

on the file which was deleted long ago and isn't used, but for some strange reason the link was kept within the scheme.

 

Unfortunately for such case trace log doesn't give enough information to localize the issue, it looks like

[DataFormat.Error] The input couldn't be recognized as a valid Excel document.\r\nStackTrace:\n…
…
nExceptionType: System.UriFormatException, System, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\r\nMessage: Invalid URI: The hostname could not be parsed.\r\nStackTrace:\n

Perhaps I have not enough knowledge for more straight forward localization of the problem, but the only way is to exclude Excel file parts one by one and check if the issue disappeared. Another way could be to unzip Excel file and check if wookbook.xml or sheetNN.xml have something suspicious inside.

6 Replies

I have this error but it seems to be computer specific, or more accurately Excel version specific. Using the same source files, a Power Query | Combine Binaries will work from one computer but not another.

 

Using Excel 365 ProPlus version 1701 build 7766.2071 (First Release for Deferred Channel) it works fine.

 

Using Excel 365 ProPlus version 1609 build 7369.2127 (Deferred Channel updates) I get the "The input couldn't be recognized as a valid Excel document." error. Updated from Current Channel to version 1702 build 7870.2038 and the Combine Binaries worked without error.

  

Scope: Combine Excel rows from multiple similarly formatted excel sheets into one.

Hi Todd,

 

Combine Binaries starts working with Excel workbook formats from January this year https://blogs.office.com/2017/02/08/january-2017-updates-for-get-transform-in-excel-2016-and-the-pow..., i guess from build 7668.2066. On earlier builds it worked only with CSV (or txt) formats.

 

I have no old builds available and can't check how Combine Binaries worked before. Now it in most cases correctly recognizes file format and use Excel.Workbook or Csv.Document functions depends on file format. Perhaps on earlier version connector was not so clever and gave an error what can't recognize the format.

 

That's easy to check if try to open Excel workbook by From File->From CSV, exactly the same connector as within Combine Binaries.

 

Anyway, that's bit different story. In brief, in your situation PQ analyzes which actual format the file has and, if can, use proper connector independently on which UI menu you use to connect the file (latest builds) or give an error (older builds).

 

In my case we connect to the file with proper connector which has to work with our file format, but PQ can't parse the file due to internal errors in XML structure of the file (more exactly, non-standard XML configuration from PQ point of view). 

 

 

 

 

errore.png

 

I face the same problem. I don't know why first 3 tables are ok, but the next two no. How can I fix this?

Did you try the solution provided by Sergei in his first post?

"it's enough to open the file in Excel and save it (without any changes) – Excel is clever enough to fix the scheme."

I realize this is an older post, but relevant nonetheless. I got the error of input not recognized as valid Excel document, and was able thanks to your instructions to save the file without changes and open the file in Power BI. 

 

The challenge I have is that I have a server folder full of these reports that I would like to transform and combine within Power BI based on folder location, without having to open and save each individual file. Do you know of a way within Power BI to resolve the XML issue, or perhaps something I can do with powershell to import the "restored" files?

 

Some background: the Excel files I'm trying to import are created and stored through an SAP batch job. Here's some information I copied from the Tracer:

 

04T15:20:36.0872649Z","Action":"ExcelReaderOpenXml/ExceptionHandler","HostProcessId":"14184","PartitionKey":"Section1/EXLIST_500_APAC_20200803030134/Source","Exception":"Exception:\r\nExceptionType: System.InvalidOperationException, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\r\nMessage: Specified part does not exist in the package.\r\nStackTrace:\n   at System.IO.Packaging.Package.GetPart(Uri partUri)\r\n   at Microsoft.Mashup.Engine1.Library.Excel.ExcelReaderOpenXml.RelationshipHidingPackage.GetPartCore(Uri partUri)\r\n   at System.IO.Packaging.Package.GetPartHelper(Uri partUri)\r\n   at System.IO.Packaging.Package.GetPart(Uri partUri)\r\n   at DocumentFormat.OpenXml.Packaging.OpenXmlPart.Load(OpenXmlPackage openXmlPackage, OpenXmlPart parent, Uri uriTarget, String id, Dictionary`2 loadedParts)\r\n   at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.LoadReferencedPartsAndRelationships(OpenXmlPackage openXmlPackage, OpenXmlPart sourcePart, PackageRelationshipCollection relationshipCollection, Dictionary`2 loadedParts)\r\n   at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.LoadReferencedPartsAndRelationships(OpenXmlPackage openXmlPackage, OpenXmlPart sourcePart, PackageRelationshipCollection relationshipCollection, Dictionary`2 loadedParts)\r\n   at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.LoadReferencedPartsAndRelationships(OpenXmlPackage openXmlPackage, OpenXmlPart sourcePart, PackageRelationshipCollection relationshipCollection, Dictionary`2 loadedParts)\r\n   at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.Load()\r\n   at DocumentFormat.OpenXml.Packaging.SpreadsheetDocument.Open(Package package, OpenSettings openSettings)\r\n   at Microsoft.Mashup.Engine1.Library.Excel.ExcelReaderOpenXml.ReadTablesUnguarded()\r\n   at Microsoft.Mashup.Engine1.Library.Excel.ExcelReaderOpenXml.ReadTables()\r\n\r\n\r

 

Thanks in advance for any tips/advice!

this error is specific to excel 2016 whereas i did not observe the same in excel 365.

Any suggestions?