How to import text file with 20,000 columns?

Copper Contributor

How to import txt file of 500 rows x 20K-25K columns?  I have Excel 2010. Excel has this message after attempting to import by Data>From Text>Import text file, and using the Text Import Wizard:

This message can appear due to one of the following:

  • The file contains more than 1,048,576 rows or 16,384 columns. To fix this problem, open the source file in a text editor such as Microsoft Word. Save the source file as several smaller files that conform to this row and column limit, and then open the smaller files in Microsoft Excel. If the source data cannot be opened in a text editor, try importing the data into Microsoft Access, and then exporting subsets of the data from Access to Excel.
  • The area that you are trying to paste the tab-delineated data into is too small. To fix this problem, select an area in the worksheet large enough to accommodate every delimited item.

Notes

  • Excel cannot exceed the limit of 1,048,576 rows and 16,384 columns.
  • By default, Excel places three worksheets in a workbook file. Each worksheet can contain 1,048,576 rows and 16,384 columns of data, and workbooks can contain more than three worksheets if your computer has enough memory to support the additional data.

 I don't think it is feasible to reduce the column # in Word or Notepad.  I can't see any columns so I can't remove columns.

 

Excel recommends that the user save the first import that is truncated table of the txt file to one sheet, and then import a second time to a second sheet only those columns needed to complete the whole table.  The problem is that Excel does not present a feasible option to preview all the columns in the Text Import Wizard, so there is no way to remove the unwanted columns before importing into the second sheet.

 

I also tried Power Query, but that locks up and is unable to work with the data in preview mode, to e.g. first split the single column into the 20K columns and then to remove columns.  Also, it is not possible to transpose the text file contents with PQ before importing.

 

Maybe import the data into the Data Model i.e. PowerPivot via Power Query would work?  I can try.

Thank you,

Ralph

5 Replies

Hi Ralph,

 

Power Query has exactly the same limit on number of colums https://support.office.com/en-us/article/power-query-specifications-and-limits-5fb2807c-1b16-4257-aa...

 

Perhaps you may import your txt with dividing on columns not by delimiter, but using Fixed width, 3-4 separate imports and split on columns each of them with some combining after that. But that very depends on your data structure.

Dear Sergei,

If the original data had been transposed prior to saving in the text file, there would be no problem; 20,000 rows x 500 columns is easy to import.  The data set would be ready to mine.

 

I tried your recommendation, but it may not be suitable for the structure of my data set.  I am able to import big blocks of merged columns using the text import wizard based on fixed width, but defining the fixed widths is tough to do in the tiny preview window, and then deleting some blocks so that the total is less than the 16,384 excel limit requires some guessing, and then matching the data columns (not in quotes) to the column names (are in quotes), because they are not aligned, makes it even more difficult.  Separating the blocks of columns into columns in excel worked well except for the cells on the left and right edges of the blocks.  These cells lost data to adjacent blocks.  There must be another solution. 

 

It might be best to (1) split the original text file and import into excel into two different sheets, transpose each, and then combine, or (2) use a macro in excel to import, or (3) maybe use a pivot table if this is possible, to connect to the remote text file with 20K columns, and import or add to the destination sheet's pivot table just 10K columns or transpose them to rows, but I am not sure how to automate this.  The main point of (3) is to have the excel file connect to a source text file or database to extract just what is called up to avoid exceeding the 16K limit on columns.

 

What do you recommend?

Thanks,

Ralph

Hi Ralph,

 

From my point of view the best way is to connect directly to the database and filter the data on query level. The only most probably some coding will be required, not sure you may filter data with UI due to columns limit.

 

If you have Power Pivot in your version of Excel you may try to connect to database or to text file directly from Power Pivot, data model has another structure in background and no direct limits on columns. If works, result could be landed to Excel sheet as normal table, not as Pivot Table. - if you reduce number of columns within the data model.

@Sergei Baklan 

Hi Sergei,

Yes, I upgraded to Excel 2019, so have access to Power Pivot, but I can't see an option for importing part of the text file into the data model, and don't understand the method for dividing the data in the model even if I knew how to import the entire text file into the data model.

Can you explain or point me to a resource that describes this?

Other possible options include dividing the text file in a text editor, but I don't know how to use an editor like Word to begin dividing a file with e.g. 24,000 columns into two files.

Can R or the command line be used to transpose or divide text files?

Thank you,

Ralph

@Ralph Puchalski Why can't the data be received normalized from the source? Wherever the data exists, it must be stored in tables that are normalized.

Now there's an opportunity for adding value ...