Forum Discussion

samuel_kodjoe's avatar
samuel_kodjoe
Copper Contributor
May 14, 2021
Solved

power query - data cleaning and analysis

Hi Team,

Please the attached excel workbook contains two sheet namely "system data" which was generated from a system and the second sheet "transformed" which was generated using "text to column". i would like to know if i can achieve the same transformation using power query.

  • noushad1ali2000's avatar
    noushad1ali2000
    Copper Contributor

    Hi, 

    I am trying to use the same query what you have provided me previously for this file and trying to fix the columns as per requirement but unfortunately, I am unable to fixed it.

    could you please check once, it would be much helpful

    • noushad1ali2000's avatar
      noushad1ali2000
      Copper Contributor

      Greetings,

      I am regular watching your vlog related excel, and now i need your help to clean a complex type of data extracted from my company system which i need to clean and provide them in simple form to create some insight from that.

      1st sheet as sample extracted

           i have given two sets of data as sample, whereas in my co file having 25860 rows of data for almost 1185-person

      2nd sheets need to be normalized

      it would be much appreciated if you could help me out

      Thanks

      Naushad Ali

      1st table

      Extracted messy data11/11/2012           
      ID   818181   role   data practioner 
      school              
      grade 12            
      try              
                     
      fry              
      sum              
      code         title    
       121212      jsr      
                     
                     
                     
                     
                     
      ________________________________________________________________________________________________________   
      ID   919191   role   Sr Data player 
      school              
      grade 14            
      try              
                     
      fry              
      sum              
      code         title    
       121212      jsr      
       212121      kel      
                     
                     
                     
                     
      ________________________________________________________________________________________________________   
                     

       

      2nd table - needed

      IDRolegradecodetitle
      818181data practioner12121212jsr
      919191Sr Data player14121212jsr
         212121kel
    • samuel_kodjoe's avatar
      samuel_kodjoe
      Copper Contributor
      WOW!!! that was very impressive. How do i accurately specify the positions? it seems to me like a tedious task
      • SergeiBaklan's avatar
        SergeiBaklan
        MVP

        samuel_kodjoe 

        I did that by couple of iteration and approximately, since we use TRIM() exact position most probably is not required.

        However, if you are on Excel with dynamic arrays you add in couple of cells one under another

        =SEQUENCE(,LEN(A2),0)
        and
        =MID(A2,SEQUENCE(,LEN(A2)),1)

        It looks like

        Positions start from 0 and each next position is first space after the previous field. That's one time job if only your system generates the file always in exactly the same format.

         

        If format is different from time time I guess removing of excess spaces could be done by Power Query but that will require M-script coding.

Resources