Forum Discussion

W W's avatar
W W
Copper Contributor
Aug 01, 2017
Solved

Transform Text File to Excel

Hi,

I have a text file with millions of numbers separated by comma and semicolon.

How do I get the text file transform to excel file where it will identify number before/after comma to put in column and number before/after semicolon to put in row, automatically?

  • Hi,

     

    as a first step, please check which version of MS Office you have. If it is at least Office 2016, then Power Query is embedded in Excel, and you can find it under the 'Data' menu.  Otherwise, you need to download and install it from https://www.microsoft.com/en-au/download/details.aspx?id=39379. In this case, you will see Power Query menu at the top.

     

    To connect to the file data.txt I sent you in my previous post, please save it on a local or network drive, then:

    1. Go to Data > New Query > From File > From Text, then navigate to data.txt. This should open a window showing all data in one cell of the Column1. Click 'Edit' to open Power Query editor

     

    2. Right click on the cell with your data, and select 'Drill Down' from the dropdown menu. This will convert the cell contents into text.

     

    3. Go to Transform -> Split Text. In the 'Value' field, put ";". This will create a list, splitting you data by rows using the semicolumn as the delimiter

     

    4. Select the header of the list, so that the entire list column is highlighted, right cilck and select 'To Table'. In the field 'Select or enter delimiter' please select 'Comma' from the dropdown menu, then click 'OK'. This will create the required end result table

     

    5. You can load the result into an excel spreadsheet by selecting: Home -> Close & Load, then select 'Table' at the top of the 'Load To' window, and then specify, which sheet you wish to load the data to. For now, you can unclick the 'Add this data to the Data Model'. However, you need to consider the size of your data and weather the excel sheet can handle it. If it is too large, your only option is to load to the Data Model (while selecting the 'Only Create Connection' option instead of lading into Table) and then create a pivot table from the dataset. 

     

    To refresh the table on a sheet, right lick then select 'Refresh' or go to Data->Refresh All

     

    If you wish to manage the source file parameters in an excel sheet similarily to what I did in the example file, you can create the File Name and File Path ranges, go to Power Query Editor ->Home ->Advanced Editor, and replace the line starting with 'Source =' with the following

     

        //Input File Parameters
        FileName = Excel.CurrentWorkbook(){[Name="setFileName"]}[Content]{0}[Column1],
        FilePath = Excel.CurrentWorkbook(){[Name="setFilePath"]}[Content]{0}[Column1],
        Separator = if Text.End(FileName,1) = "\" then "" else "\",
        FullPath = FilePath & Separator & FileName,

        Source = Table.FromColumns({Lines.FromBinary(File.Contents(FullPath), null, null, 1252)}),

     

    Note that setFileName and setFilePath are the names of the above mentioned ranges.

     

    Once you have done it, you can enter file name and path on the spreadsheet, then click 'Refresh'. This will pull data from the specified file.

     

    I hope this is helpful and you like my explanation. If this is the case, I would appreciate you hitting the 'Like' button.

     

    Please let me know if there is anything you wish me to clarify.

     

    Regards

    Yury

     

     

     

     

  • Yury Tokarev's avatar
    Yury Tokarev
    Steel Contributor

    Hi,

     

    you can use Power Query for this purpose. If you put the data file name in the range 'setFileName' and the file path in the range 'setFilePath', then you can use the following query:
       
        //Input File Parameters
        FileName = Excel.CurrentWorkbook(){[Name="setFileName"]}[Content]{0}[Column1],
        FilePath = Excel.CurrentWorkbook(){[Name="setFilePath"]}[Content]{0}[Column1],
        Separator = if Text.End(FileName,1) = "\" then "" else "\",
        FullPath = FilePath & Separator & FileName,

        //Transform input data
        Source = Csv.Document(File.Contents(FullPath),[Delimiter="#(tab)", Columns=1, Encoding=1252, QuoteStyle=QuoteStyle.None]),
        Column1 = Source{0}[Column1],
        Split = Text.Split(Column1,";"),
        #"Converted to Table" = Table.FromList(Split, Splitter.SplitTextByDelimiter(","), null, null, ExtraValues.Error)
    in
        #"Converted to Table"

     

    Please see the attached file for an example with sample data and the solution file.

     

    Please let me know if it has worked with your data

     

    Regards

    Yury

     

    • W W's avatar
      W W
      Copper Contributor

      Hi Yury,

      Thanks for the details.

      But I don't really use power query. Is there any other 'more simple' way to do it?

      • Yury Tokarev's avatar
        Yury Tokarev
        Steel Contributor

        Alternatively, you can copy and paste you data into a cell, add "=" at the beginning, encircle the data with '{}' (for example, it can look like  ={1,2,3;4,5,6;7,8,9}   ), then array enter over the required number of rows/column. To array enter, select the required range starting with the data cell mentioned above, press F2, then Ctrl+Shift+Enter.

         

        However, the Power Query option is better in case you wish to repeat this process multiple times with different sets of data

  • Does that mean each number in your text file has either comma or semi-colon embedded but not both? Can you share some sample data? May be 4 or 5 numbers...

    • W W's avatar
      W W
      Copper Contributor

      Hi,

      My data will have the mixture of both comma and semicolon.

      For example: 2,4;6,8,10;12.

      I need this set of data to be converted in one same table in excel, with number after comma (2,6 and 8) to be appeared in column and number after semicolon (4 and 10) to be appeared in row.

      Is that possible?

Resources