Power BI is expanding self-service data prep to help business analysts extract insights from big data and introducing enterprise BI platform capabilities. With recent updates, Power BI has enabled connectivity to more data sources than ever before. That said, no product is able to do everything, which is where the CData Power BI Connectors come in.
With CData, you get live connectivity to data in Power BI (meaning DirectQuery) from any of the 120+ supported sources, ranging from CRM and marketing automation to big data and NoSQL. With the CData connectors, you can access MongoDB data quickly (faster than you can with any other connector) and easily, leveraging the built-in modeling and data flattening features of the connector to create a table-like model of your schema-less data, ready to be viewed, reported and analyzed from Power BI — no code or data curation required.
In Part 1 of this series, we looked at how we can collect data from SmartThings and send it over to an Azure Event Hub. We’re now going to take the next step and work on storing that data so that we can use it in our reports and do historical analysis. For this, we are going to be making use of Azure Cosmos DB for storage and Azure Stream Analytics to move the data.
The first thing we need to do is create a Cosmos DB account to store the data. Head over to the Azure Portal and create a Cosmos DB account instance. When you do this, it will ask you to select the API to use. I chose SQL, but pick whatever you prefer. You also want to put this in the same region as your Event Hub to reduce latency.
At this point, you have created an account that will contain your database. This account is in a single region using the default Session consistency. This was fine for my needs, but if you want to replicate your data to more regions or alter consistency, you can do so now.
We now need to create a database and a collection in Cosmos to store our data. The easiest way to do this is to scroll down to the Collections section, then click Browse. Click on the Add Collection option at the top. For the database, select Create New and then enter a name for the database and for the collection. You then need to pick a size, either fixed or unlimited. The fixed option uses a single partition, which imposes a limit of 10GB, but has the advantage that you don't have to manage partition keys. To give you an idea of the data volume, I have had this running for around four months and have generated 80MB of data, so I would recommend the fixed option to keep things simple. Finally, we need to select the amount of throughput we want the database to support, measured in Request Units (RUs). The lowest you can go is 400 RUs, and that has been more than enough for my setup, which has around 20 sensors reporting in. Once you have completed all that, create the database and collection.
The great thing about Cosmos is that it is NoSQL-based, so we don’t need to worry about creating schemas and tables. All we need to do now is send the data across.
At the moment, our sensor data has arrived at the Event Hub and is sitting there doing nothing. Event Hub is not intended as a long-term storage mechanism and will only retain this data for up to seven days. We need to create a process to move this data over to our new Cosmos DB.
In this example, I have chosen to use Azure Stream Analytics, Microsoft's tool for processing and analyzing streams of data. I chose this option because it is simple to set up, it processes data quickly, and it can perform data transformations if required. It also supports sending data to many different destinations, including streaming data to Power BI for real-time reporting. We will be looking at historical reporting here, but the real-time option is available if you need it.
There is a pretty significant downside to using Stream Analytics for a home project: the cost. Running a single streaming unit full-time costs around $80 a month. In a future post, we will look at ways to reduce this cost, such as running Stream Analytics jobs as scheduled batches or using other tools like Azure Data Factory and Azure Functions to process the data instead.
Using the Azure portal, create a new Stream Analytics job. Again, it should be in the same region as the Event Hub and Cosmos DB, and you should select the Cloud hosting option. We then need to define how many streaming units we need. Streaming units are instances of the job; more units mean more parallel processing of data. As we are running this full-time, the processing of the data will be almost immediate. Given this, and the low volume of data, we should only need a single streaming unit. When we look at batch processing later, we may want to use more units to increase throughput.
After creating the Stream Analytics job, we need to define the query that will tell it how to process the data. Before we can do that, we need to define our inputs and outputs.
In the Stream Analytics job, select the Inputs option under Job Topology and then click Add Stream Input. In the drop-down, select Event Hub. The window that opens should populate with the details of existing Event Hubs. Provide an alias for this input and then select the Namespace, Event Hub, and Policy Name that we created in the previous article. Leave the rest of the options as default, including an empty consumer group.
We’ll follow a similar process to create the output. Under the same Job Topology menu, select Outputs. Click Add, and then pick Cosmos DB from the drop-down. Again, this will fill in the Cosmos DB account, Key, and Database for you. The value to use for the Collection Pattern Name field depends on the size you chose for the Cosmos DB collection. If you selected the fixed 10GB size, you can enter the collection name here. If you selected the unlimited option, you need to enter a pattern to locate the appropriate collection partitions. Finally, the Document ID field allows you to specify a field in the records to use as the document ID. For simplicity's sake, I chose to leave this blank and use the default.
Now that we have our input and output, we need to create the query that will transfer data from one to the other. Again, under the Job Topology menu, select Query. This will open the visual query editor, which should look like this:
In the editor, select [YourOutputAlias] and replace it with the alias of the Cosmos DB output we created. Then, select [YourInputAlias] and replace it with the alias of the Event Hub input. There is a list of the aliases on the left if you need it. We will leave the select clause as SELECT * for now so that we send all data to Cosmos. If you want to, you can refine this query to reduce the amount of data sent.
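The Stream Analytics query language supports standard SELECT, WHERE, and INTO clauses, so trimming the stored data is just a matter of editing this query. As a sketch, a refined version might look like the following; the [cosmosdb-output] and [eventhub-input] aliases and the deviceId, name, and value fields are placeholders for your own aliases and event properties (EventEnqueuedUtcTime is a built-in property Stream Analytics attaches to Event Hub input):

```sql
-- Store only the fields we need, rather than the full event payload.
SELECT
    deviceId,                           -- hypothetical event property
    name,                               -- hypothetical event property
    value,                              -- hypothetical event property
    EventEnqueuedUtcTime AS receivedAt  -- when the event reached the Event Hub
INTO
    [cosmosdb-output]                   -- your Cosmos DB output alias
FROM
    [eventhub-input]                    -- your Event Hub input alias
WHERE
    value IS NOT NULL                   -- drop events with no reading
```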
Now that we have our query defined, all we need to do is start the job. Click back on the Overview tab and then click on the Start option. This will take a few minutes to start. When it does, you should start to see events appear in the metrics charts.
At this point, we can go back to our Cosmos DB instance and check that data is arriving. Select your Cosmos DB account and then click on the Data Explorer option. Expanding the database and collection, we can then click on the Documents option and see a list of all the documents in our collection.
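Besides browsing documents, Data Explorer can run ad-hoc SQL queries against the collection. A quick way to confirm that new data is flowing is to sort by the built-in _ts (last-modified timestamp) property, which Cosmos DB adds to every document:

```sql
-- Show the ten most recently written documents.
SELECT TOP 10 * FROM c ORDER BY c._ts DESC
```

While the job is running, re-running this query should show fresh documents at the top of the results.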
Click on a document and we can view its content and confirm that we are seeing the data from SmartThings. We now have a process in place to take the data from the Event Hub and store it in Cosmos DB, ready for reporting. As long as the Stream Analytics job is running, it will transfer the data as it arrives at the Event Hub. If you wish to stop the charges for the Stream Analytics job, you can select the option to stop it; Stream Analytics jobs only cost money whilst they are running.
In Part 3 of this series, we are going to look at taking the data in Cosmos DB and building some reports in Power BI.
The CData ODBC Driver for Dynamics NAV provides a streamlined, secure way to visualize your Dynamics NAV data in Power BI. The CData ODBC Driver for Dynamics NAV links your dashboards and reports to the live Dynamics NAV data. By scheduling refreshes and refreshing on demand, you can create dashboards that reflect changes to your data in real time. This article details how to use the ODBC driver to create dashboards featuring Dynamics NAV data in the Microsoft Power BI Designer.
Follow the steps below to connect to Dynamics NAV data, create a visualization, and interact with it in the Editing View.
You can configure the DSN in the built-in Microsoft ODBC Data Source Administrator. This is the last step of the driver installation. See the “Getting Started” chapter in the help documentation for a guide to using the Microsoft ODBC Data Source Administrator to create and configure a DSN.
Provider=;Persist Security Info=False;DSN=CData DynamicsNAV Source
Note: Leave “Provider=” empty to use the default .NET OLE DB Provider installed on your system.
SELECT * FROM Customer
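If you only need a subset of the table, the driver accepts a more selective query. For example, the following sketch assumes the Customer table exposes Name and City columns (standard fields in Dynamics NAV, but adjust to your own schema):

```sql
-- Pull only two columns and filter server-side to reduce the data imported.
SELECT Name, City FROM Customer WHERE City = 'London'
```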
Click Refresh to synchronize your report with any changes to the data.
You can share reports based on ODBC data sources with other Power BI users in your organization. To upload a dashboard or report, log into PowerBI.com, click Get Data -> Files, and navigate to a Power BI Designer file or Excel workbook. You can then view and edit the report in the Reports section.
You can use the Power BI Personal Gateway to automatically refresh the dataset associated with your report. You can also refresh the dataset on demand. After installing the Personal Gateway, follow the steps to configure the Personal Gateway for an ODBC DSN:
After configuring the Personal Gateway, you can refresh on demand and schedule refreshes.
To schedule refreshes, expand the Schedule Refresh node and select Yes for the Keep Your Data Up-To-Date option. After specifying the refresh interval, click Apply.
To refresh on demand, right-click the Dynamics NAV dataset in the Datasets section and click Refresh Now.