In trying a “bring the kitchen sink” approach to market, Sharepoint ends up almost failing to recognise that the kitchen is for cooking. My workplace has been implementing a relatively complex solution that involves creating more than 20000 document libraries, each containing up to 10000 documents. As such, we have constantly run up against the inherent limitations in Sharepoint’s ability to handle large lists and large numbers of lists.

You might rightly ask, “why not just change the solution design to fit nicely within the Sharepoint limitations?” True, that would be a reasonable suggestion if flexibility weren’t part of the whole Sharepoint rationale in the first place. My issue is that many of these limitations seem to stem from the underlying database schema and could easily be overcome if Microsoft dedicated resources to ‘fixing’ (IMHO) that schema.

Sharepoint stores every list in the AllLists table of the content database and every ListItem (documents and so on) in the AllUserData table. The AllUserData table has a fixed number of varchar, integer, and other generically typed columns, and each ListItem is jammed into one or more of these rows, with its field values scattered across whichever generic columns match their types. It’s a highly flexible design, but a highly inefficient one.
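To make that packing concrete, here is a rough sketch in Python (purely illustrative, not Sharepoint’s actual code). The “Invoice” fields and the mapping logic are my own invention; only the style of the generic column names (nvarchar1, int1, float1, …) mirrors the real AllUserData table.

```python
from itertools import count

# Hypothetical content type: an "Invoice" document with three typed fields.
invoice_fields = {
    "Title": ("nvarchar", "ACME supply contract"),
    "InvoiceNumber": ("int", 20071103),
    "Amount": ("float", 14950.00),
}

def to_wide_row(item_fields):
    """Pack typed field values into generic slot columns (nvarchar1, int1, ...)."""
    next_slot = {}  # next free slot counter per SQL type
    row = {}
    for _field_name, (sql_type, value) in item_fields.items():
        counter = next_slot.setdefault(sql_type, count(1))
        row[f"{sql_type}{next(counter)}"] = value
    return row

print(to_wide_row(invoice_fields))
# -> {'nvarchar1': 'ACME supply contract', 'int1': 20071103, 'float1': 14950.0}
```

The column names carry no meaning on their own: without the field-to-slot mapping stored elsewhere, nvarchar1 could be anything.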

The downside of this design is that all the data for our 20000+ lists and 650000+ documents sits in two tables. One of those tables is littered with columns we don’t need, and our data is effectively inaccessible via plain SQL because of the convoluted schema. So what could be done? Microsoft could take a CRM-like approach and instead create a new ‘ContentTypeData’ table for each content type, with columns that actually match the content type’s schema.
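For illustration, here is a minimal sketch of what that could look like, again in Python. The content type definition and the ‘ContentTypeData_’ naming are hypothetical, not anything Microsoft has announced; the point is simply that a dedicated table can be derived mechanically from a content type’s field list.

```python
# Hypothetical definition of an "Invoice" content type.
invoice_content_type = {
    "name": "Invoice",
    "fields": [
        ("Title", "nvarchar(255)"),
        ("InvoiceNumber", "int"),
        ("Amount", "money"),
        ("DueDate", "datetime"),
    ],
}

def content_type_ddl(content_type):
    """Emit a CREATE TABLE statement whose columns match the content type schema."""
    columns = ",\n    ".join(
        f"[{name}] {sql_type} NULL" for name, sql_type in content_type["fields"]
    )
    return (
        f"CREATE TABLE [dbo].[ContentTypeData_{content_type['name']}] (\n"
        f"    [ItemId] uniqueidentifier NOT NULL PRIMARY KEY,\n"
        f"    [ListId] uniqueidentifier NOT NULL,\n"
        f"    {columns}\n"
        f");"
    )

print(content_type_ddl(invoice_content_type))
```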

Advantages: no more index tables (ordinary SQL Server indexing would do the job), and tables/views that SSRS could query directly, instead of needing to create some XML service or a data warehouse just to get data into SSRS. A hedged example of what that direct access could look like follows below.
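Continuing the hypothetical ContentTypeData_Invoice sketch above, indexing and reporting become plain SQL: an ordinary nonclustered index replaces a separate index table, and an SSRS dataset is just a SELECT statement.

```python
# Index and report query against the hypothetical per-content-type table.
index_ddl = """
CREATE NONCLUSTERED INDEX IX_Invoice_DueDate
    ON [dbo].[ContentTypeData_Invoice] ([DueDate])
    INCLUDE ([InvoiceNumber], [Amount]);
"""

ssrs_dataset_query = """
SELECT  InvoiceNumber, Amount, DueDate
FROM    [dbo].[ContentTypeData_Invoice]
WHERE   DueDate < GETDATE()
ORDER BY DueDate;
"""

print(index_ddl)
print(ssrs_dataset_query)
```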

Disadvantages: because changing the underlying data structure need not impact the object model at all, I can’t see any real disadvantages beyond the large upgrade process a schema change would require. Can you?
