1. Overview
When gathering SharePoint data through Microsoft Graph Data Connect (MGDC), you are billed through Azure. You can find the official MGDC pricing information at Pricing – Microsoft Graph Data Connect. As I write this blog, the price to pull 1,000 objects from MGDC in the US is $0.375. Since the MGDC basic rate is $0.75 per 1,000 objects, you are currently getting a 50% discount.
Keep in mind that the SharePoint datasets are currently in private preview. During this private preview, users do not pay for the SharePoint datasets in MGDC. You do still have to pay for the other Azure infrastructure like Azure Storage, Azure Data Factory and Azure Synapse. The private preview for SharePoint datasets will eventually end, and when it goes public the regular MGDC rates will apply.
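To make the pricing arithmetic above concrete, here is a minimal Python sketch that estimates the MGDC charge for a given number of objects. The rates are the US figures quoted in this post at the time of writing. The function and constant names are only illustrative, and the official pricing page remains the source of truth. The sketch leaves out the other Azure costs (Storage, Data Factory, Synapse) and the fact that SharePoint datasets are free during the private preview.

```python
# Rates quoted in this post for the US at the time of writing
# (USD per 1,000 objects). Check the official pricing page for current values.
BASIC_RATE_PER_1000 = 0.75
CURRENT_RATE_PER_1000 = 0.375


def estimate_mgdc_cost(object_count, rate_per_1000=CURRENT_RATE_PER_1000):
    """Estimate the MGDC charge in USD for a number of extracted objects."""
    if object_count < 0:
        raise ValueError("object_count must not be negative")
    return object_count * rate_per_1000 / 1000


if __name__ == "__main__":
    # Two million objects at the discounted rate.
    print(estimate_mgdc_cost(2_000_000))  # 750.0
    # The same pull at the basic rate, for comparison.
    print(estimate_mgdc_cost(2_000_000, BASIC_RATE_PER_1000))  # 1500.0
```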
With all that said, there is still a question that comes up frequently. What counts as an object? Well, that’s what we will cover in this blog post.
2. What MGDC provides
What MGDC delivers to you are datasets. After you run a pipeline with a copy data action, you end up with a collection of files in your Azure Storage account. Each of these files will contain objects. It’s an interesting file format that contains text using JavaScript Object Notation, also known as JSON. Here is what the contents of the file would look like:
{"property1":"valuea","property2":"valueb","property3":"valuec"}
{"property1":"valued","property2":"valuee","property3":"valuef"}
{"property1":"valueg","property2":"valueh","property3":"valuei"}
{"property1":"valuej","property2":"valuek","property3":"valuel"}
In the example above, you have a file with 4 JSON objects, each with 3 properties. The file contains one line per object and these lines can get quite long.
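If you want to check this yourself, a short Python sketch like the one below can parse that kind of content. It assumes one JSON object per line, as in the sample above. The property names are the placeholder ones from the sample, not real MGDC schema fields, and `parse_json_lines` is just a helper name made up for this illustration.

```python
import json

# The same four sample lines shown above, one JSON object per line.
SAMPLE = """\
{"property1":"valuea","property2":"valueb","property3":"valuec"}
{"property1":"valued","property2":"valuee","property3":"valuef"}
{"property1":"valueg","property2":"valueh","property3":"valuei"}
{"property1":"valuej","property2":"valuek","property3":"valuel"}
"""


def parse_json_lines(text):
    """Parse text holding one JSON object per line into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


objects = parse_json_lines(SAMPLE)
print(len(objects))        # 4 objects
print(sorted(objects[0]))  # ['property1', 'property2', 'property3']
```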
3. Multiple JSON objects per file
Even though the files have a JSON file extension, the files you get are not proper JSON files. First, you typically don’t want your JSON content to be one long line. The proper formatting would be something like this:
{
    "property1": "valuea",
    "property2": "valueb",
    "property3": "valuec"
}
{
    "property1": "valued",
    "property2": "valuee",
    "property3": "valuef"
}
{
    "property1": "valueg",
    "property2": "valueh",
    "property3": "valuei"
}
{
    "property1": "valuej",
    "property2": "valuek",
    "property3": "valuel"
}
That’s more readable, but this is still not a proper JSON file. A proper JSON file would hold a single top-level value (for example, one object), not multiple objects one after another. But if you have lots of objects, having one file for each object would be far less efficient to process. That’s why MGDC packs lots of JSON objects into a single file with the “json” extension.
Also, if you have lots and lots of objects, MGDC will pack the results as multiple “json” files, each containing many JSON objects packed together.
Most data tools have no problem loading this kind of file. Power BI, for example, can not only load files with multiple JSON objects, but can also load multiple files in a single Power Query.
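Outside of Power BI, the same idea can be sketched in a few lines of Python. This is only an illustration under some assumptions: the output files have already been copied to a local folder, they sit directly in that folder with a “json” extension, and each non-empty line is one object. The function names are made up for this example.

```python
import json
from pathlib import Path


def load_objects(folder):
    """Yield every JSON object found in the *.json files in a folder.

    Assumes each non-empty line of each file is one JSON object.
    """
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    yield json.loads(line)


def count_objects(folder):
    """Count the JSON objects across all *.json files in a folder."""
    return sum(1 for _ in load_objects(folder))
```

Calling `count_objects` on the folder that holds one dataset's output gives the total number of objects across all the files in that folder.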

4. What constitutes a SharePoint object in MGDC
With all that information, we’re ready to state what counts as an object for MGDC. Each (long) line in those “json” files is an object, matching the schema published at Datasets, regions, and sinks supported by Microsoft Graph Data Connect.
For the SharePoint datasets currently available, you have:
- Sites: One object is one site collection. This includes Team sites and OneDrive sites.
- Groups: One object is one group. These groups could have multiple members, but those members are all included in that single group object.
- Sharing: One object is a permission granted to a specific scope (site, web, library, folder or file). This single object includes a set of users granted a permission in that scope.
5. Sharing Objects
The SharePoint datasets above are easy to grasp, but Sharing needs further explanation. Sharing captures a more complex concept commonly referred to as an Access Control List (ACL). That is how SharePoint stores permissions granted to users.
Sharing includes different types of permissions (Full Control, Contribute, Read, etc.) that are granted at different scopes (site, web, library, folder or file). This covers permissions granted directly to users and groups, plus permissions granted through sharing links. Each unique combination of scope and permission gets its own Sharing object, which can include multiple users and groups in the “shared with” list.
For instance, if you grant full control on a file to 10 users, that is a single Sharing object where the “Full Control” role definition for the scope of that file is granted to a set of 10 users. All of that information is captured in one Sharing object.
If you grant 5 users read/write permission on a file and another 5 users read-only permission on the same file, you need 2 Sharing objects: one with the “Contribute” role definition for the file granted to 5 users, and another with the “Read” role definition for that file granted to the other 5 users.
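The counting rule in these two examples can be sketched in Python. This is a simplified model, not how MGDC builds the dataset: each grant is a hypothetical (scope, role, user) tuple, the file path is invented, and these are not the field names from the published Sharing schema. The point is only that the number of Sharing objects follows the number of unique scope and role combinations, not the number of users.

```python
def count_sharing_objects(grants):
    """Count the unique (scope, role) combinations in a list of grants.

    Each grant is a (scope, role, user) tuple. These names are
    illustrative only and are not the Sharing dataset's schema fields.
    """
    return len({(scope, role) for scope, role, _user in grants})


# Hypothetical file used as the scope in both examples.
FILE_SCOPE = "file:/sites/demo/Shared Documents/plan.docx"

# 10 users with Full Control on one file.
ten_full_control = [(FILE_SCOPE, "Full Control", f"user{i}") for i in range(10)]

# 5 users with Contribute and 5 other users with Read on the same file.
mixed = [(FILE_SCOPE, "Contribute", f"user{i}") for i in range(5)]
mixed += [(FILE_SCOPE, "Read", f"user{i}") for i in range(5, 10)]

print(count_sharing_objects(ten_full_control))  # 1
print(count_sharing_objects(mixed))             # 2
```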
6. Can I predict how many Sharing objects?
The exact number of Sharing objects for a given SharePoint tenant is hard to predict.
If you have 100 sites, for instance, you could reasonably assume that there will be at the very least 300 Sharing objects. That’s because each site, by default, gets Owners, Members and Visitors groups, and each of these 3 SharePoint groups is granted its own specific permission. So that’s 3 Sharing objects per site, even if you don’t grant any other permissions after creating the site.
In addition to those, you could grant further permissions at other levels. There is also the common scenario of using sharing links. The permissions for each of those links are captured in another Sharing object.
Obviously, the more sharing happens in your company, the more Sharing objects you will have. I have seen an average as high as 130 Sharing objects per site in a company with heavy usage of SharePoint and its collaboration capabilities. I have also seen companies with less collaboration activity that have 45, 30 or 15 Sharing objects per site on average.
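For a rough back-of-the-envelope number, you can multiply your site count by an assumed average. The Python sketch below does just that, using the default of 3 Sharing objects per site and the averages mentioned above. The function name is made up for this example, and the result is only as good as the average you plug in, so treat it as an estimate rather than a prediction.

```python
# Each site gets 3 Sharing objects by default (Owners, Members, Visitors).
DEFAULT_SHARING_OBJECTS_PER_SITE = 3


def estimate_sharing_objects(site_count,
                             average_per_site=DEFAULT_SHARING_OBJECTS_PER_SITE):
    """Rough estimate of Sharing objects: sites times an assumed average."""
    return site_count * average_per_site


if __name__ == "__main__":
    # Default minimum first, then the per-site averages mentioned above.
    for average in (3, 15, 30, 45, 130):
        print(average, estimate_sharing_objects(100, average))
```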
7. Objects in Delta datasets
An important topic for those concerned with a high number of objects is Delta Datasets. The idea is simple: instead of pulling all objects every day or every week, you can ask SharePoint on MGDC to deliver just what has changed. This mechanism will drastically reduce the number of objects delivered, by providing only objects that were created, updated, or deleted.
For more details about Delta Datasets, read the blog at SharePoint on MGDC FAQ: How can I use Delta State Datasets?
8. Summary
In summary, SharePoint on MGDC delivers data to you as JSON objects, packed into files that are pulled into your Azure Storage account. The MGDC objects transferred will show up on your Azure bill. During the private preview, SharePoint objects in MGDC are free. The number of SharePoint objects depends on your number of sites/groups/files, as well as the amount of collaboration in your tenant.
I hope this blog post helped you understand what constitutes an object in SharePoint on MGDC. For more information about SharePoint Data in MGDC, please visit the collection of links I keep at https://aka.ms/SharePointData.