When it comes to creating custom solutions (a.k.a. Sandboxed Solutions) for SharePoint Online, there are a few things you need to be aware of. One of those is the underlying architecture that drives the whole SharePoint Online environment. When you peel all the sales talk away, you still have to have a Big-Ass ServerFarm (BASF) and some way to split up the customers on your physical environment.
The BASF Setup
Let me just clarify that I’m really guessing on this part of the SharePoint Online BASF setup (yes, it’s still the Big-Ass ServerFarm and not the good old 8-track tape supplier). I’m assuming that this is more or less the architecture for the SharePoint Online environment.
- The BASF
- Loads of SharePoint Farms serve the customers. My guess is that each Enterprise customer (Plan E1 to E4) gets a WebApplication with the possibility to create up to 300 SiteCollections, all housed in a single Content Database.
- The P1 customers are housed on their own SharePoint Farm, where they are assigned a single SiteCollection with a limit of 10 GB of storage.
- All or a number of customers share the CPU and RAM resources on the SharePoint Farm that is serving them.
For a Cloud Service Provider to make any money, they have to get the most out of their investment. That means cramming as many customers together as possible. That way the BASF gets the most performance per buck. The problem with this scenario has always been that you end up having some rogue customers who ride the BASF until the sun goes down. Those customers are draining the BASF, and all the other customers are feeling the reduced performance.
The traditional way to handle this has always been to build some barriers into the environment, thus preventing the BASF from blowing up. Common to all of them is that the customers share the CPU cores, which results in performance varying during the day. It’s inevitable. These barriers can vary somewhat, but in most cases there is only one. In Office 365 there are 2:
- Resource Points
You have a certain amount of daily Resource Points. Don’t ask me how they’re calculated, but whenever a custom solution of yours runs, it’s using Resource Points. The more users you have on your subscription, the more Resource Points you will have available.
- Time Limit
You have 30 seconds to get your Sandboxed Solution airborne and down again. And remember to take into account the 7-8 seconds it takes the solution to begin when it needs to compile for the first time after a period of not being used (i.e. every morning).
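To put numbers on that, here is a minimal sketch in Python of what is left of the time limit after a cold start. The 30 seconds and the 7-8 seconds are the figures from above; the function name is just made up for the illustration.

```python
# Figures taken from the text above; everything else is illustrative.
TIME_LIMIT_S = 30.0
COLD_START_RANGE_S = (7.0, 8.0)  # first run after a period of not being used


def usable_budget(time_limit_s: float, cold_start_s: float) -> float:
    """Seconds left for your own code once the warm-up has eaten its share."""
    return time_limit_s - cold_start_s


best_case = usable_budget(TIME_LIMIT_S, COLD_START_RANGE_S[0])
worst_case = usable_budget(TIME_LIMIT_S, COLD_START_RANGE_S[1])
print(f"Cold run leaves between {worst_case:.0f} and {best_case:.0f} seconds")
```

So on the first run of the day you should probably plan for roughly 22 seconds of real work, not 30.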
The Issue with using two barriers
Actually, the Resource Points are not an issue. It’s the Time Limit that is the real joker. Why? Because whenever you set a Time Limit on the execution of some custom code in any BASF, you have to think about what can influence the performance while the code is running. The amount of RAM you have available is one thing, the CPU power is another. I don’t really see storage as an influence on performance, but hey, let’s just throw that into the pile of variables. Maybe your custom code is writing some heavy-duty files that take a long time to write to the hard drive. Maybe it’s rush hour over at the neighboring tenant and they are hogging all the CPU power. RAM is a little easier to control and reserve when the code is started (I think. I really don’t know, I just assume. I’m a software guy, not hardware). Then on top of all that you have to remember the caching, as in your code will not be compiled until the first run.
Put all those variables together, add those I don’t know about into the mix, and it’s pretty much impossible for you to calculate how long your custom code will take from start to finish. That makes the Time Limit barrier the biggest and most unpredictable variable when you’re coding Sandboxed Solutions for Office 365.
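A toy model makes the point. The sketch below is in Python and assumes that wall-clock time is simply the CPU work divided by the share of a core you happen to get, plus the cold start. That is my own simplification and not how SharePoint Online actually schedules anything; the function names and the 10 seconds of CPU work are made up for the example.

```python
def wall_clock_seconds(cpu_work_s: float, cpu_share: float,
                       cold_start_s: float = 0.0) -> float:
    """Estimated run time when you only get a fraction of a CPU core.

    cpu_share is the assumed fraction of a core (0 < cpu_share <= 1).
    """
    if not 0 < cpu_share <= 1:
        raise ValueError("cpu_share must be in the range (0, 1]")
    return cold_start_s + cpu_work_s / cpu_share


def finishes_in_time(cpu_work_s: float, cpu_share: float,
                     cold_start_s: float = 0.0,
                     time_limit_s: float = 30.0) -> bool:
    """True if the estimated run time stays within the time limit."""
    return wall_clock_seconds(cpu_work_s, cpu_share, cold_start_s) <= time_limit_s


# The same 10 seconds of CPU work under different neighbor load.
for share in (1.0, 0.5, 0.25):
    seconds = wall_clock_seconds(10.0, share)
    print(f"share={share:.2f}: {seconds:.0f}s, ok={finishes_in_time(10.0, share)}")
```

In this model the exact same code passes with a full core or half a core and fails with a quarter of one, and you have no say in which of those you get.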
Requirements for barriers to function properly each time
- The Time Limit requires reserved resources to perform as expected every time
- The Resource Points threshold requires no Time Limit to perform as expected
- The reservation of a CPU core requires neither Resource Points nor a Time Limit, but is expensive
If you read Andy Burns’ observations from his first project with Sandboxed Solutions in SharePoint Online, you will find the same frustrations I’ve felt. When you’re planning your SharePoint Online project, you have to think very carefully about using Sandboxed Solutions. In my experience, it’s still pretty unstable when the Time Limit is set so low.
There are 3 ways to get around this (as I see it, but there could easily be more):
- Disable the time limit on custom solutions, but enforce the Resource Points in real-time or set the Time Limit to 10 minutes. It doesn’t matter if it’s slow to complete, as long as it completes 100%. (The easy fix)
- Reserve the exact same amount of CPU core per customer, so the customer can count on the solutions taking the same amount of time to complete each time. (The way-too-expensive solution)
- Give the customer an option to reserve a CPU core if they have critical custom solutions that extend beyond the 30 seconds. (The most flexible solution. Let the customers that need more power pay for it.)
It can be a downright pain to try to figure out why your Sandboxed Solution is not working, because there is no access to the logs. All you get is a “Correlation ID”, but you don’t have any way of looking that ID up to go bug-hunting in your code. You could try wrapping your code in try-catch-finally blocks, but I haven’t seen SharePoint accept it yet. Sometimes you can actually get your code to run without using try-catch. SharePoint Online just ignores the construct altogether.
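For what it’s worth, the idea behind the try-catch approach is to turn the error into text you can show on the page yourself, since you can’t get to the logs. Here is a minimal sketch of that pattern in Python, only to show the shape of it. A real Sandboxed Solution is .NET code, the helper name is made up, and as said above I can’t promise SharePoint Online will honor it.

```python
import traceback


def run_with_visible_errors(work):
    """Run work() and return (ok, text) instead of letting the error vanish.

    On failure, text holds the exception type, its message and the place it
    was raised, so it can be rendered on the page rather than hidden in a log.
    """
    try:
        return True, str(work())
    except Exception as exc:
        # The last traceback entry is the frame where the exception was raised.
        last_frame = traceback.extract_tb(exc.__traceback__)[-1]
        return False, (f"{type(exc).__name__}: {exc} "
                       f"(in {last_frame.name}, line {last_frame.lineno})")


def load_list_items():
    # Stand-in for the real work; fails on purpose for the demonstration.
    raise ValueError("list 'Orders' not found")


ok, text = run_with_visible_errors(load_list_items)
print(text)
```

If the platform kills the request at the time limit, nothing like this will help, because your code never gets the chance to report anything.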
Final thoughts around Sandboxed Solutions in Office 365