Custom web proxies and Power Query – decrypting Google Chrome cookies

It turns out that decrypting Google Chrome cookies is a non-trivial process, so I thought I’d use it as an introduction to the use of custom web proxies.

There are functions available in the .NET framework that allow you to decrypt Google Chrome cookies, however, they are not directly accessible via Power Query (as far as I can tell) and even with VBA you would need to write a COM wrapper.

Instead, I decided to write a very short script in Python which would decrypt the cookies and send the result back to Excel/Power Query.  How to do that?  Well, by calling the running Python script from a Power Query Web contents query.

The solution below has two parts, a Python web server and Power Query.  Power Query imports the Google cookies database.  It then executes a Web query to contact the Python web server and send it a list of encrypted cookies.  The Python script receives the encrypted cookies, decrypts them and send back the result.  Power Query takes the decrypted cookies and combines it with the other data it has for the cookies.

To make this work, you need to have Python (both 2.x and 3.x should work) installed.  You’ll also need the Python extensions for Windows (for the win32crypt library) and the bottle web framework.

I won’t cover the installation of these here, but happy to assist if you run into trouble.

So basically, this short script starts a tiny local web server.

Assuming you run it from your desktop PC it will respond to requests sent to http://localhost:8080/decryptCookie

And if we send it a list of encrypted cookies, it will decrypt them using Microsoft’s libraries and send back the decrypted cookies.

For the purposes of this exercise, we will manually launch it, but we could trigger it in a more sophisticated fashion directly from Excel.

The M code on the Power Query side of things, builds on the post from yesterday which showed how to access the Google Chrome cookie SQLite database.

I did run into a strange issue where the query I put together yesterday was truncating the encrypted cookie field from the cookies database.   For some reason, this was corrected by adding a SQL query – select * from cookies rather than simply connecting to the table.  I would have expected this to produce identical results but it didn’t.

And here is the annotated M query:

Happy to walkthrough this in more detail if necessary or respond to questions.

This data can now be used much like the “harvested” Internet Explorer cookies from other posts.

Bear in mind, that there are countless other uses for custom Web proxies with Power Query. For example, you can use them to perform more complex web authentication, web scraping with BeautifulSoup (no more crazy HTML/XML guesswork), data manipulation using Python libraries such as pandas, execute other applications on your computer,  etc. etc.

Of course, on the flipside, you need to have Python installed and have the Python script running before it will work.  On your home PC, this probably doesn’t matter.  In a corporate environment, you would want this to be hosted on a proper web server.

How to use Youtube API from Power Query – Part 1

2016-01-24 11_59_54-Book1 - Excel

Google offer quite a comprehensive API for Youtube but I haven’t seen any examples of people using Power Query to explore.

So let’s do that – we’re going to use Power Query to access Youtube’s API and display a simple table of popular videos such as the above.

Before you can do this, you will need an API key for accessing the Youtube API.

Google provide instructions for this here.

The key steps are:

  1. Create a new project in the Google Developers Console.
  2. Turn on access for the YouTube Data API v3.
  3. Create a browser API key.

The end result should be a long string of random characters constituting your key for accessing the Youtube Data API.

Now, we can get into Excel and Power Query.

Usually when you are accessing an API in Power Query you would use Web API authentication.  This doesn’t seem to work for Youtube Data API, as it expects the API key to be provided in the query not the header, so we will need to specify our API key manually.

Start with a Blank Query and paste the following into the Advanced Editor:

Update 28/3/2016

You may get a response indicating that you need to specify how to connect or otherwise enter credentials.  If you have already entered credentials you may need to edit them in order for future steps to work.

2016-03-28 10_00_00-Query2 - Query Editor

Click to Edit Credentials.

Then you need to indicate that you want to connect anonymously.  This is because we are providing our credentials (our API key) as part of our Query string not via the Query header.

So when the below window appears, make sure you select Anonymous on the left and select just the Youtube search hyperlink radio button on the right and finally hit Connect.

2016-03-28 10_00_25-Query2 - Query Editor

Expected result:

yt 1

The Youtube API documentation explains in detail what our query actually does, but essentially we are performing a search for the top 50 (by number of views) videos containing the term “power query”.

You could actually enter this directly into your browser if you wanted to. It is equivalent to:

https://www.googleapis.com/youtube/v3/search?key=<your api key here>&q=power%20query&part=snippet,id&maxResults=50&order=viewCount

 

Now, if you click on the items List.  Power Query will return a list of the results as records.

2016-01-24 11_36_30-Youtube Search List Merge - Query Editor

They will be easier to work with as Table so next we’ll convert to a Table.

2016-01-24 11_42_07-Youtube Search List Merge - Query Editor

Returning this:2016-01-24 11_36_52-Youtube Search List Merge - Query Editor

If you then expand out the column, you can start to see all of our data:

2016-01-24 11_37_15-Youtube Search List Merge - Query EditorNext expand out the id and snippet:

2016-01-24 11_37_35-Youtube Search List Merge - Query EditorAs you can see, we now have a list of videos that all look to be related to Power Query!

If you remove columns except for video id, publishedAt, title and channelTitle, you should end up with a result something like this:

2016-01-24 11_59_54-Book1 - Excel

We can immediately learn that ExcelIsFun is a very popular Youtube channel among Power Query enthusiasts – it has 4 videos in the top 10.  I definitely recommend checking it out.

Unfortunately, we don’t have any information about the number of views, or likes, or comments.

In order to add this information we will need to perform another query to the API asking for statistics.

I’ll be covering this in a future post.

Here’s the full code to date:

A follow up post has been added here.