Forum Discussion

MikeRD's avatar
MikeRD
Copper Contributor
Apr 06, 2022
Solved

Convert Json With Nested Arrays to CSV

We are trying to get data that is exported through an API converted to a CSV. Each "columns" array is the data we need, and ALL the "columns" arrays are nested in a single "rows" array.

 

To give you the entire workflow, we call for the API, the JSON output is saved to a JSON file. That file is then recalled to process the CSV export.

 

Here is a code sample. Keep in mind that this is just one "columns" worth of data, and we can expect there to be ~1000 "columns" arrays. The "name" represents the headers and "value" represents the data, and each "columns" array should be a new row.

 

 

"rows": [
            {
                "columns":  [
                                {
                                "name":  "student_user_id",
                                "value":  "123456"
                                },
                                {
                                "name":  "ec_firstname"
                                },
                                {
                                "name":  "ec_lastname"
                                },
                                {
                                "name":  "ec_phonetype"
                                },
                                {
                                "name":  "ec_phone"
                                },
                                {
                                "name":  "ec_sortorder"
                                }
                            ]
            }
        ]

 

 

Some other details that might be relevant: "name is the column header. "student_user_id" will always be populated with a "value" however not all "name" headers will have a "value".

 

Thanks for the input!

  • MikeRD 

     

    It's pretty much in line with what I had above.

     

    Here's a more complete, yet still basic script. I simply named this Get-StudentsFromJson.ps1 so I could provide some use examples below but you can call it what you like.

     

    Use example 1: Providing the path to a properly-formatted JSON file.

     

     

    .\Get-StudentsFromJson.ps1 -Path .\json1.txt

     

     

     

    Use example 2: Piping in the text.

    Here, I'm taking it from a file but it could just as easily come from any command that returns peoperly-formatted JSON, such as Invoke-Method.

     

     

    Get-Content -Path .\json.txt -Raw | .\Get-StudentsFromJson.ps1

     

     

     

    The script itself (again, this is nothing fancy and I'm stopping at converting it to object notation as I fully expect you know how to then further convert it to CSV or whatever else you'd like to do with the object(s).)

     

     

     

    [cmdletbinding(DefaultParameterSetName="ByPath")]
    param(
        [parameter(Mandatory=$true, ParameterSetName="ByPath")][ValidateNotNullOrEmpty()][string] $Path
        , [parameter(Mandatory=$true, ParameterSetName="ByPipe", ValueFromPipeline=$true, Position=0)][ValidateNotNullOrEmpty()][string[]] $RawText
    )
    
    # If ByPath is the entry point, check that the specified file actually exists.
    if ($Path)
    {
        if (Test-Path -Path $Path)
        {
            (Get-Content -Path $Path -Raw | ConvertFrom-Json).results.rows | ForEach-Object {
                $ht = [System.Collections.Hashtable]::new();
                $_.columns | ForEach-Object { $ht.Add($_.name, $_.value); }
                [PSCustomObject]$ht;
            };
        }
        else
        {
            throw [System.IO.FileNotFoundException]::new();
        }
    
    }
    else
    {
        ($RawText | ConvertFrom-Json).results.rows | ForEach-Object {
            $ht = [System.Collections.Hashtable]::new();
            $_.columns | ForEach-Object { $ht.Add($_.name, $_.value); }
            [PSCustomObject]$ht;
        };
    }

     

     

     

    It's worth noting I've tried to chain pipes together as much as possible to promote early object release. You can store conversions in things like variables but that can manifest as memory issues where very large input sources are used.

     

    That's not relevant where you're only talking about 500 - or even thousands - of rows, but if you get into the hundreds of thousands, it's a different story. Anyway, I digress.

     

    Note

    With your JSON example, it's missing the outer prefixed "{" and closing "}", which in turn causes ConvertFrom-Json to throw a hissy-fit.

     

    If you add that in to get a properly formatted response then you get this as example output.

     

     

     

    ec_phonetype    :
    ec_lastname     :
    ec_phone        :
    ec_sortorder    :
    ec_firstname    :
    student_user_id : 1234
    
    ec_phonetype    :
    ec_lastname     :
    ec_phone        :
    ec_sortorder    :
    ec_firstname    :
    student_user_id : 5678

     

     

     

    Cheers,

    Lain

5 Replies

  • LainRobertson's avatar
    LainRobertson
    Silver Contributor

    MikeRD 

     

    Here's a sample based on only caring about the contents of the "columns" array rather than the parent JSON structures.

     

    Hopefully this gives you some ideas on how to get what's effectively a hash table into object notation.

     

    $jsontext = '
    {
        "rows": [
                {
                    "columns":  [
                                    {
                                    "name":  "student_user_id",
                                    "value":  "123456"
                                    },
                                    {
                                    "name":  "ec_firstname"
                                    },
                                    {
                                    "name":  "ec_lastname"
                                    },
                                    {
                                    "name":  "ec_phonetype"
                                    },
                                    {
                                    "name":  "ec_phone"
                                    },
                                    {
                                    "name":  "ec_sortorder"
                                    }
                                ]
                }
            ]
    }
    ';
    
    $json = ConvertFrom-Json -InputObject $jsontext;
    $ht = [System.Collections.Hashtable]::new();
    $json.rows.columns | ForEach-Object { $ht.Add($_.name, $_.value); }
    [PSCustomObject]$ht;

     

    The output produced from using your sample looks like this (Format-List presentation.)

     

    ec_phonetype :
    ec_lastname :
    ec_phone :
    ec_sortorder :
    ec_firstname :
    student_user_id : 123456

     

    Cheers,

    Lain

    • MikeRD's avatar
      MikeRD
      Copper Contributor

      LainRobertson Thanks! This certainly did help. However getting this into production has some other issues.  have tried the last few days to come up with a solution, but am not successful. 

       

      The issue is, the file we want to convert actually has around 500 of the nested arrays. From what I understand about hashtables, you cant have duplicate keys. I have tried to loop the hash to object portion of the script, but has not been successful. Here is a sample:

       

       "results":  {
              "rows":  [
                           {
                               "columns":  [
                                               {
                                                   "name":  "student_user_id",
                                                   "value":  "1234"
                                               },
                                               {
                                                   "name":  "ec_firstname"
                                               },
                                               {
                                                   "name":  "ec_lastname"
                                               },
                                               {
                                                   "name":  "ec_phonetype"
                                               },
                                               {
                                                   "name":  "ec_phone"
                                               },
                                               {
                                                   "name":  "ec_sortorder"
                                               }
                                           ]
                           },
                           {
                               "columns":  [
                                               {
                                                   "name":  "student_user_id",
                                                   "value":  "5678"
                                               },
                                               {
                                                   "name":  "ec_firstname"
                                               },
                                               {
                                                   "name":  "ec_lastname"
                                               },
                                               {
                                                   "name":  "ec_phonetype"
                                               },
                                               {
                                                   "name":  "ec_phone"
                                               },
                                               {
                                                   "name":  "ec_sortorder"
                                               }
                                           ]
                           }
                      ]
                  }

       

      With all the resource I found, I either got errors or it returns a blank object.

      • LainRobertson's avatar
        LainRobertson
        Silver Contributor

        MikeRD 

         

        It's pretty much in line with what I had above.

         

        Here's a more complete, yet still basic script. I simply named this Get-StudentsFromJson.ps1 so I could provide some use examples below but you can call it what you like.

         

        Use example 1: Providing the path to a properly-formatted JSON file.

         

         

        .\Get-StudentsFromJson.ps1 -Path .\json1.txt

         

         

         

        Use example 2: Piping in the text.

        Here, I'm taking it from a file but it could just as easily come from any command that returns peoperly-formatted JSON, such as Invoke-Method.

         

         

        Get-Content -Path .\json.txt -Raw | .\Get-StudentsFromJson.ps1

         

         

         

        The script itself (again, this is nothing fancy and I'm stopping at converting it to object notation as I fully expect you know how to then further convert it to CSV or whatever else you'd like to do with the object(s).)

         

         

         

        [cmdletbinding(DefaultParameterSetName="ByPath")]
        param(
            [parameter(Mandatory=$true, ParameterSetName="ByPath")][ValidateNotNullOrEmpty()][string] $Path
            , [parameter(Mandatory=$true, ParameterSetName="ByPipe", ValueFromPipeline=$true, Position=0)][ValidateNotNullOrEmpty()][string[]] $RawText
        )
        
        # If ByPath is the entry point, check that the specified file actually exists.
        if ($Path)
        {
            if (Test-Path -Path $Path)
            {
                (Get-Content -Path $Path -Raw | ConvertFrom-Json).results.rows | ForEach-Object {
                    $ht = [System.Collections.Hashtable]::new();
                    $_.columns | ForEach-Object { $ht.Add($_.name, $_.value); }
                    [PSCustomObject]$ht;
                };
            }
            else
            {
                throw [System.IO.FileNotFoundException]::new();
            }
        
        }
        else
        {
            ($RawText | ConvertFrom-Json).results.rows | ForEach-Object {
                $ht = [System.Collections.Hashtable]::new();
                $_.columns | ForEach-Object { $ht.Add($_.name, $_.value); }
                [PSCustomObject]$ht;
            };
        }

         

         

         

        It's worth noting I've tried to chain pipes together as much as possible to promote early object release. You can store conversions in things like variables but that can manifest as memory issues where very large input sources are used.

         

        That's not relevant where you're only talking about 500 - or even thousands - of rows, but if you get into the hundreds of thousands, it's a different story. Anyway, I digress.

         

        Note

        With your JSON example, it's missing the outer prefixed "{" and closing "}", which in turn causes ConvertFrom-Json to throw a hissy-fit.

         

        If you add that in to get a properly formatted response then you get this as example output.

         

         

         

        ec_phonetype    :
        ec_lastname     :
        ec_phone        :
        ec_sortorder    :
        ec_firstname    :
        student_user_id : 1234
        
        ec_phonetype    :
        ec_lastname     :
        ec_phone        :
        ec_sortorder    :
        ec_firstname    :
        student_user_id : 5678

         

         

         

        Cheers,

        Lain

    • tumtum1973's avatar
      tumtum1973
      Copper Contributor

      change the last line to accomplish the CSV output you mentioned in your original post.

       

       

      [PSCustomObject]$ht | ConvertTo-Csv -NoTypeInformation

       

       

       

       

      The output should look like this:

       

      "ec_phonetype","ec_lastname","ec_phone","ec_sortorder","ec_firstname","student_user_id"
      ,,,,,"123456"

       

      @LainRobertson

      Great job with your conversion process.

       

      Thanks,

      tumtum

       

       

Resources