Designing database-driven applications is something many of us have been doing for a long time. It is the bread and butter of those of us who build software for the web, whether that means web applications accessing data, frameworks built to let those applications manage their database interactions more easily, or server products that sit as intermediaries between application logic and back-end services. I had really begun to think that this was all figured out already and that the patterns for it were essentially part of the software industry's DNA by now. There is, however, an interesting facet to this that I've discovered at Microsoft: products built from the ground up for the desktop, essentially single-user applications, that are now being migrated to the MS vision of Software + Services.
Many of these applications have multiple versions' worth of built-in assumptions about being the sole owner of the data they work on, and they now have to deal with a multi-user access model for that data. Just migrating the data from the local filesystem to a central database presents challenges to these apps in scalability and performance, not to mention the many transactional issues that now have to be considered. Many of these applications have historically brought the data they use into local memory and then used in-memory query and sorting methods to prepare it for display to the user. We find that these applications can fall down quickly when migrated to a remote data store, both in data consistency and in query performance across the potentially much larger data sets.
Data consistency is a well-known problem with many well-documented patterns for solving it. Jim Gray's book Transaction Processing goes into exhaustive detail on the subject, and many patterns in Martin Fowler's book Patterns of Enterprise Application Architecture cover the topic as well. When moving forward software that was previously single-user in scope, the issue simply requires concerted thought about applying these patterns and reworking the pieces of code that make single-user assumptions.
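As one illustration of the kind of pattern those books describe, here is a minimal sketch of optimistic locking with a version column (Fowler calls this Optimistic Offline Lock). The `documents` table, the function names, and the use of an in-memory SQLite database as a stand-in for the central store are all assumptions made for the example, not something taken from any of the products discussed here.

```python
import sqlite3


class StaleRecordError(Exception):
    """Raised when another writer changed the row after we read it."""


def create_schema(conn):
    # Hypothetical table; the version column is what makes the check possible.
    conn.execute(
        "CREATE TABLE documents ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " version INTEGER NOT NULL DEFAULT 0)"
    )


def load_document(conn, doc_id):
    row = conn.execute(
        "SELECT id, title, version FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    return {"id": row[0], "title": row[1], "version": row[2]}


def save_document(conn, doc):
    # The UPDATE only matches if nobody bumped the version since we read it.
    with conn:
        cur = conn.execute(
            "UPDATE documents SET title = ?, version = version + 1"
            " WHERE id = ? AND version = ?",
            (doc["title"], doc["id"], doc["version"]),
        )
    if cur.rowcount != 1:
        raise StaleRecordError(f"document {doc['id']} was modified by another user")
    doc["version"] += 1


def demo():
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    with conn:
        conn.execute("INSERT INTO documents (id, title) VALUES (1, 'Draft')")

    # Two users read the same row, as a single-user app never had to expect.
    alice = load_document(conn, 1)
    bob = load_document(conn, 1)

    alice["title"] = "Alice's edit"
    save_document(conn, alice)

    bob["title"] = "Bob's edit"
    try:
        save_document(conn, bob)
        conflict = False
    except StaleRecordError:
        conflict = True

    final = load_document(conn, 1)
    conn.close()
    return conflict, final
```

In this sketch the second writer's save is rejected rather than silently overwriting the first. What the application then does with the conflict (reload, merge, ask the user) is exactly the kind of decision a single-user code base never had to make.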
Performance tends to end up being a larger issue for us on these types of conversions. This is not simply because of the larger data sets that a shared data store might entail, but also because many of the typical ways a single-user application addresses performance cause the consistency issues mentioned above once we move into a shared data environment. If I have an application that stores its data in an embedded SQL database for a single user, I can use the built-in optimizations of the database directly and simply hold a full database lock for my application. Once I move that application to a central database, I need to be much more cognizant of the transactions and locking I am doing. One way to address this is to do more querying in application memory, but then I lose many of the optimizations that the database vendors have spent years tuning, and trying to implement those kinds of optimizations at the application framework level is very difficult. It is also possible to fold the queries back to the database to take advantage of its optimizations, and this is a typical approach we see. It tends to work most of the time, but it is not without its own set of issues.
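To make the contrast concrete, here is a small sketch of the two approaches: pulling every row into application memory and filtering there, versus folding the filter, sort, and limit back into the query so the database can use its index. The `orders` table, its columns, and the in-memory SQLite database standing in for a central store are assumptions for illustration only.

```python
import sqlite3


def build_store(n=1000):
    # Hypothetical data set; SQLite stands in for the shared database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
    with conn:
        conn.executemany(
            "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)",
            [(i, f"customer-{i % 10}", float(i)) for i in range(n)],
        )
    return conn


def top_orders_in_memory(conn, customer, limit):
    # Desktop-style: fetch every row, then filter and sort in the application.
    rows = conn.execute("SELECT id, customer, total FROM orders").fetchall()
    matching = [r for r in rows if r[1] == customer]
    matching.sort(key=lambda r: r[2], reverse=True)
    return [(r[0], r[2]) for r in matching[:limit]]


def top_orders_pushed_down(conn, customer, limit):
    # Folded back: the database filters, sorts, and limits before returning rows.
    return conn.execute(
        "SELECT id, total FROM orders"
        " WHERE customer = ? ORDER BY total DESC LIMIT ?",
        (customer, limit),
    ).fetchall()
```

Both functions return the same rows, but the first moves the whole table to the client on every call, while the second moves only `limit` rows. Against a local embedded database the difference is easy to miss; against a remote shared store it is usually the first thing to hurt.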
There are many additional issues that we run into, all of which are easy to solve on a one-off basis for each application. The important (and more difficult) thing is to solve them at the framework level so that all applications get the solutions for free. Building these kinds of solutions into frameworks and tools becomes an essential part of increasing the speed at which we can migrate to this new vision.
Friday, October 26, 2007