Forum Discussion

Jeff_Lipschultz's avatar
Jeff_Lipschultz
Copper Contributor
Jun 19, 2023
Solved

Converting PDF to Excel

I'm a struggling with the should-be-simple task of converting a pdf to Excel table.  I can use the Get Data ->From PDF (Select Multiple Items) function to Query but it puts all the tables (there are ...
  • Lorenzo's avatar
    Jun 19, 2023

    Jeff_Lipschultz 

     

    Thanks for sharing a file. Combining all Tables in the Pdf you shared seems to do it (attached), just edit the query to change the path to the file - line File.Contents("ToBeReplaceWithYourPath")

    let
        Source = Pdf.Tables(
            File.Contents("D:\Lorenzo\Downloads\2023_Participant_List_70.3_Waco_4.27.23.pdf"),
            [Implementation="1.3"]
        ),
        KindTable = Table.SelectRows(Source, each ([Kind] = "Table")),
        RemovedOtherColumns = Table.SelectColumns(KindTable,{"Data"}),
        CombinedTables = Table.Combine(RemovedOtherColumns[Data]),
        PromotedHeaders = Table.PromoteHeaders(CombinedTables, [PromoteAllScalars=true]),
        ChangedTypes = Table.TransformColumnTypes(PromotedHeaders,
            {{"LAST NAME", type text}, {"FIRST NAME", type text}, {"GENDER", type text}, {"AGE AS OF#(lf)12/31/2023", Int64.Type}, {"COUNTRY", type text}}
        )
    in
        ChangedTypes

      

Resources