Using captured Cookies in Power Query

Unfortunately, Power Query often does not reliably maintain the cookies needed to parse data from complex sites. However, it is possible to sidestep this issue by passing through the cookies manually.

In a previous post, I demonstrated how to parse Internet Explorer cookie files and bring them into Excel.

In this post, we’ll transform those cookies into a cookie string and add to a query.

Cookies need to be specified in the following format:

Cookie1=value1; Cookie2=value2; Cookie3=value3;

This is quite different from the table of cookies that we had at the conclusion of my previous post.

But we can use Power Query to fix that.

Start with a new Blank Query and refer to the #”IE Cookies” query that we had at the end of the previous exercise.

Source = #"IE Cookies"

Then filter to just the cookies needed.

Do this by filtering the Host/path column for those cookies corresponding with the domain of the website that you are parsing e.g. flipboard.com.

(NB. This is only going to work for sites that maintain cookies between sessions.  Often these sites will have a “Remember Me” box that you can tick.)

2016-01-31 10_07_20-Query1 - Query Editor

Having done this, remove other columns except for Cookie Name and Cookie value.

2016-01-31 10_07_59-Query1 - Query Editor

Now to convert our table into a cookie string for our query:

Step 1:

Merge the columns using “=” as our separator and remove other columns

2016-01-31 10_12_02-

 

Resulting in:

2016-01-31 10_08_58-Query1 - Query Editor

Step 2:

Transpose the remaining Merged column into a single row

Step 3

Merge these columns using “; ” (note the space) as the separator and remove other columns

2016-01-31 10_12_54-Merged - Query Editor

Step 4

Drill down:

2016-01-31 10_10_59-Merged - Query Editor

So now we have a cookie text string in the format we’ll need for a query, using cookies that we have harvested from Internet Explorer.

Using the cookies in an actual Web query is quite straight forward. e.g.

In the next post, I’ll demonstrate how to use this with a real web site.

11 thoughts on “Using captured Cookies in Power Query”

  1. Hi Johan,

    Apologies for the delayed response.

    I haven’t had an opportunity to determine why application of my blog posts did not work for you.

    However, here is some M code that I have produced just now that appears to have the desired results:

    let

    // Load page passing in the Cookie needed to avoid the pop-up

    Source = Web.Contents("https://www.toernooi.nl/sport/playerstats.aspx?id=026A49D3-9C36-44CD-81D6-8D3731F37E29&draw=46",
    [Headers=[Cookie="st=c=1&s=2"]]),

    // Convert the output to Text
    #"Imported Text" = Table.FromColumns({Lines.FromBinary(Source,null,null,1252)}),

    // Find start of the table we want
    #"Added Custom" = Table.AddColumn(#"Imported Text", "Custom", each if Text.Contains([Column1], "

    ") then "Start Table" else null),

    // Find end of the table we want
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each if Text.Contains([Column1], "

    To use it, create a Blank Query and copy and paste it into the Advanced Editor.

    Let me know how you go.

    Cheers

    Ken

    1. Hi, All
      Can you help me with how to use Cookie line, received in another query, not direct string like this
      Source = Web.Contents(URL, [Headers=[Cookie=”st=c=1&s=2″]]), but I need smth like
      Source = Web.Contents(URL, [Headers=[Cookie=#”Query for cookie]]). This code has two external source, as i understand. How to divide it?
      Thanks

      1. Hi,

        The easiest solution would be to tell Power Query to ignore Privacy Settings. This avoid issue with multiple external sources. Often there is no good reason to maintain privacy settings.

  2. Ken,
    Thank you very much!
    It works!
    Now I can rebuild your query to get the results for the entire Club.

    Johan

  3. Hi,

    Any idea how to process dynamic cookies?

    I have a site (gooddata.com) that creates a short term authentication cookie that you need to use for subsequent quries.

    The picture gets complicated because I need to capture the results from a POST (a dynamic URI) with a JSON body which I then need to GET the result that holds data i actually need.

    So the whole process is
    Logon to site. Get AUTH cookies
    Use AUTH cookies to populate the report (POST with JSON body). Get resulting URI
    Use AUTH and URI information to GET the information.

    Ideas?
    Many thanks

    1. Hi Oren,

      I’ve successfully navigated similar authentication processes in the past using PowerQuery. I’d need to take a closer look at this one to work out exactly how to deal with it. In the absence of that I could show a similar example.

Leave a Reply

Your email address will not be published. Required fields are marked *