Formatting Issue (Getting data from 1 column into 7 columns)

Copper Contributor

Hello. I'm having a devil of a problem trying to import data from the web and getting it into Excel in roughly the same columned format.  I'm copying the relevant data (which is information that I've put together using various filter selections on a specific web site) and then pasting it into Excel.  When I do this, all of the data is shown in one long column.  It looks roughly like this (using the info from just 2 companies):

 

Loving Care Agency, Inc.

Health Services, Hospital & Health Care

$200 Million - $1 Billion

(201) 403-9300

Hasbrouck Heights, NJ

lovingcareagency.com

1971

Frontline Consulting Services Inc.

Business Services, Information Technology and Services

$100 Million - $200 Million

(704) 510-1998

Charlotte, NC

fcs-inc.com

1996

 

I need to get this data into 7 columns (Company Name; Industry; Yearly Revenue; Phone #; Location; Web Site; Year Founded). Using "Text to Columns" is no help as there are no delimiters (commas, tabs, spaces, etc.).  Is there any way to get this data into the same columns as they are presented on the original web site, or do I have to manually insert a delimiter after each line of data?  Any help will be GREATLY appreciated.  Thank you.

9 Replies

@JeffJ625 

You may use Power Query to get the data in the desired format.

To do so, follow these steps...

 

  1. Convert your one column Web Data into an Excel Table if not already formatted as an Excel Table and to do that select your data and press Ctrl+T and uncheck the CheckBox for "My table has headers" and click OK.
  2. Rename the Excel Table you just created with "WebData".
  3. Now go to Data Tab and  under Get & Transform Data group, click on Get Data --> From Other Sources --> Blank Query. This will open Power Query Editor Window.
  4. On Power Query Editor Window --> View --> and click on Advanced Editor and delete whatever you see that and paste the query given below in the Advanced Editor and click on OK.
  5. Now go to Home Tab on Power Query Editor Window and click on Close & Load. This will load the table with data in the desired format into a new sheet.

 

Query:

 

let
    Source = Excel.CurrentWorkbook(){[Name="WebData"]}[Content],
    #"Removed Blank Rows" = Table.SelectRows(Source, each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Blank Rows",{{"Column1", type any}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    #"Inserted Modulo" = Table.AddColumn(#"Added Index", "Modulo", each Number.Mod([Index], 7), type number),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Inserted Modulo", {{"Modulo", type text}}, "en-IN"), List.Distinct(Table.TransformColumnTypes(#"Inserted Modulo", {{"Modulo", type text}}, "en-IN")[Modulo]), "Modulo", "Column1"),
    #"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"Index"}),
    #"Filled Up" = Table.FillUp(#"Removed Columns",{"1", "2", "3", "4", "5", "6"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Up", each ([0] <> null)),
    #"Renamed Columns" = Table.RenameColumns(#"Filtered Rows",{{"0", "Company Name"}, {"1", "Industry"}, {"2", "Yearly Revenue"}, {"3", "Phone#"}, {"4", "Location"}, {"5", "Website"}, {"6", "Year Founded"}})
in
    #"Renamed Columns"

 

 

Also, if you are getting the web data through the From Web under Get & Transformed Data Group, you may tweak the existing query to get the data in the desired format.

@Subodh_Tiwari_sktneer 

Bit another transformation

let

    ColumnNames = {"Company Name", "Industry", "Yearly Revenue", "Phone#", "Location", "Website", "Year Founded"},
    ItemsInGroup =7,

    Source = Excel.CurrentWorkbook(){[Name="WebData"]}[Content],
    RemoveNulls = Table.SelectRows(
        Source,
        each not List.IsEmpty(
            List.RemoveMatchingItems(Record.FieldValues(_), {"", null})
        )
    ),
    GroupsToLists = Table.FromList(
        List.Split(
            RemoveNulls[Column1],
            ItemsInGroup
        ),
        Splitter.SplitByNothing()
    ),
    GroupsToRecords = Table.ReplaceValue(
        GroupsToLists,
        each [Column1] ,
        each Record.FromList([Column1],ColumnNames),
        Replacer.ReplaceValue,{"Column1"}
    ),
    ExpandRecords = Table.ExpandRecordColumn(
        GroupsToRecords,
        "Column1", ColumnNames, ColumnNames
    )
in
    ExpandRecords

@Subodh_Tiwari_sktneer 

 

Subodh (and Sergei)...thank you very much for responding.  I've tried both of your Power Querys and get the same message:

 

Expression.Error: We couldn't find an Excel table named 'WebData".

Details:

     WebData

 

What am I doing wrong?

- Jeff

@JeffJ625 

WebData is the table with raw data which you shall create. Select all raw data in column A, press Ctrl+T and create the table.

On ribbon you will stay on Table Design

image.png

Type here table name now

image.png

@JeffJ625 

Hi Jeff,

 

If you look at the points#1 and 2 in my proposed solution, you are supposed to convert your data into an Excel Table and then rename the Excel Table as 'WebData" and then the query will work for you as desired as this query assumes that in your file there is an Excel Table called "WebData" which must be missing at present in your actual file.

To convert your data into an Excel Table, select all your data and then press Ctrl+T and it will convert your data into an Excel Table. Once your data is converted into an Excel Table, a Table Design Tab will appear in the Ribbon and if you click that and look at top right of the ribbon, you will have a TextBox to have Table Name and you can type the WebData in there and hit Enter, that will rename your Table as WebData.

@JeffJ625 

And much better, if you don't copy/paste from web but query the web, combine transformations suggested here with the initial query. If that's copy/paste perhaps it's worth to try query the web, finally it'll be one click solution - open the file, refresh all.

@Subodh_Tiwari_sktneer 

 

Subodh/Sergei,

 

Your suggestions have been most helpful. I was finally able to get the data into the right format but it is a tremendous amount of work.  I'm working off of a prospect list web site where I have to input the specific parameters for the companies I want to discover (for example, location, industry, yearly revenue, etc.).  For large listings, I'm able to see only 25 companies at a time.  It looks like this:

clipboard_image_0.png

I have to copy and paste each grouping of 25 into Excel and all of this info. goes into just one column.  When I've finally created a large enough list, I then run the Power Query that you've designed.  This is a lot of work because if each company doesn't have the exact same number of data points (in this case 7) the report is all jumbled and I have to go back and find where the missing data should go.  

 

Do you think I'm doing this in the most efficient way or is there a more economical way to go about this task.  I don't see how I can pull the info. right off the web site as I work because the web address just shows something like:  https://app.uplead.com/#.   I'm unsure if Excel can pull the exact info. that I've specifically collated on the site.

 

You gentlemen have been so very kind in your guidance and I sincerely appreciate it.

 

Regards,

Jeff

 

 

 

;lkjasdjf

 

@JeffJ625 

Jeff, that's not a free resource and it's hard to say what and how could be automated without playing with web site. It depends on many things, for example how the pagination is organised, by URL parameter or Java script changes pages without changing the URL. It could be some protection against connecting to their data. Etc., etc.

 

But in general yes, it could be possible to automate data pick-up from web site using Power Query. The evil is in the detail.