Forum Discussion

SaraVG's avatar
SaraVG
Copper Contributor
Mar 03, 2020
Solved

Cleaning data to import and transposing

I need to download a long list of citations from an online library database. The data downloads only in plain text. It contains 11 pieces of information such as name, title of article, journal name, volume, year, etc. for each citation. One piece of information is the abstract for the article and this paragraph contains commas and other punctuation marks.

Each piece of information (total 11) is separated by a paragraph return that shows when opened in MS Word.

The next citation is separated from the next one by a paragraph return. It appears that there are two returns after each citation - one at the end of the last entry and one separating the citations.

I want to import this data into Excel and put each complete citation in one row which means the 11 pieces of information in each citation would be in 11 columns.

Then I want the next entry to start on a new row.

I can copy and paste one entry at a time and then transpose it to the horizontal format, but I may have as many as 2,000 citations so I am looking for a way to automate the process if I can.

I've attached a sample of the file (now saved in Word since I can't upload plain txt here) with 25 entries. It should end up as 25 rows with 11 columns.

Thank you for any assistance.

  • Hi SaraVG 

     

    Solution for your query by using the Power Query, you must have Excel 2016 or above to run this query 

    1. I have created table just paste your data in the file 

    2. Go to Sheet2 right click & refresh details will get update 

     

    Note i have observed some paragraph "Hilary E. Jacobsen" has 12 line which will allocate into the next column, to adjust that you just need to manually adjust the records. Make sure to have 11 columns.

     

    Regards, Faraz Shaikh | MCT, MIE, MOS Master, Excel Expert | www.ExcelExciting.com

    If you find the above solution resolved your query don't forget mark as Official Answer.

     

     

4 Replies

  • Hi SaraVG 

     

    Solution for your query by using the Power Query, you must have Excel 2016 or above to run this query 

    1. I have created table just paste your data in the file 

    2. Go to Sheet2 right click & refresh details will get update 

     

    Note i have observed some paragraph "Hilary E. Jacobsen" has 12 line which will allocate into the next column, to adjust that you just need to manually adjust the records. Make sure to have 11 columns.

     

    Regards, Faraz Shaikh | MCT, MIE, MOS Master, Excel Expert | www.ExcelExciting.com

    If you find the above solution resolved your query don't forget mark as Official Answer.

     

     

    • SaraVG's avatar
      SaraVG
      Copper Contributor

      Thank you! This is terrific. I will explore Power Query. I appreciate your help.

      Sara

  • JKPieterse's avatar
    JKPieterse
    Silver Contributor

    I used PowerQuery to add numbering next to all rows which belong to a single reference and then fed that query result to a pivottable, see attached.

Resources