From Book News, Inc.
Explains how SQL Server and its core technologies are designed, how they work, and how they interoperate. As a complement to his other two books on SQL, Henderson explores Windows' process and threading architecture, memory management, and foundational I/O facilities, then looks inside SQL Server's user mode scheduler, XML facilities, and data transformation services. Most of the examples are written in C++. The CD-ROM contains the VBODSOLE library, DTS package guru, and DTSDIAG.Copyright © 2004 Book News, Inc., Portland, OR
From the Back Cover
"I can pretty much guarantee that anyone who uses SQL Server on a regular basis (even those located in Redmond working on SQL Server) can learn something new from reading this book."
--David Campbell, Product Unit Manager,
Relational Server Team, Microsoft Corporation
The latest book from the highly regarded and best-selling author Ken Henderson, The Guru's Guide to SQL Server Architecture and Internals is the consummate reference to Microsoft SQL Server. Picking up where documentation and white papers leave off, this book takes an all-inclusive approach to provide the most depth and breadth of coverage of any book on SQL Server architecture, internals, and tuning.
Blending in-depth discussion with practical application, the guide begins with several chapters on the fundamental Windows technologies behind SQL Server, including processes and threads, memory management, Windows I/O, and networking. The focus then moves on to the architectural details of SQL Server and how to practically apply them.
The entire SQL Server product is covered--not just the functionality that resides within the core executable or product features that have been in place for years. SQL Server has matured and broadened substantially with each release, and the author explores the "fringe" technologies that have yet to be covered elsewhere, including Notification Services, Full Text Search, SQLXML, replication, DTS, and a host of others.
Throughout the book, the author uses WinDbg, Microsoft's free downloadable symbolic debugger, to look under the hood of SQL Server. Armed with new debugging and coding skills, readers will be ready to master SQL Server on their own.
The accompanying CD-ROM is packed with additional material, including full source code for the book's 900+ examples, as well as three invaluable tools: DTSDIAG, the VBODSOLE Library, and DTS Package Guru. DTSDIAG allows developers and administrators to simultaneously collect Profiler traces, perform logs, blocking script output, system event logs, and SQLDIAG reports from a specified SQL Server. The VBODSOLE Library features more than twenty new COM-based functions for Transact-SQL, including T-SQL enhancements such as array-manipulation routines, financial functions, string-manipulation functions, and system functions. DTS Package Guru is a .NET-based package editor for SQL Server's Data Transformation Services that allows editing of any modifiable package and supports the automation of mass package changes.
The Guru's Guide to SQL Server Architecture and Internals is the essential guide for database developers and admin- istrators alike, regardless of skill level.
0201700476B10012003
About the Author
Ken Henderson is a leading DBMS practitioner whose clients have included the U.S. Air Force, the U.S. Navy, H&R Block, Travelers Insurance, J. P. Morgan, the CIA, and many others. He has been a regular magazine contributor for many years and is the author of seven previous books, including the best-selling The Guru's Guide to Transact-SQL (Addison-Wesley, 2000) and The Guru's Guide to SQL Server™ Stored Procedures, XML, and HTML (Addison-Wesley, 2002).
0201700476AB08142003
Excerpt. © Reprinted by permission. All rights reserved.
One day I started writing, not knowing that I had chained myself for life to a noble but merciless master. When God hands you a gift, he hands you a whip; and the whip is intended solely for self-flagellation. . . . I'm here alone in my dark madness, all by myself with my deck of cards--and, of course, the whip God gave me.
--Truman Capote (In Music for Chameleons. New York: Vintage Books, 1994)
I wrote this book to get inside SQL Server. I wanted to see what we could learn about the product and the technologies on which it's based through the use of a freely downloadable debugger, a few well-placed xprocs, and a lot of tenacity. The book you're reading is the result of that experiment.
In my two previous SQL Server books, I focused more on the pragmatic aspects of SQL Server--how to program it and how to make practical use of its many features. As the title suggests, this book focuses more on the architectural design of the product. Here, we dwell on the technical underpinnings of the product more than on how to use it. It's my belief that understanding how the product works will make you a better SQL Server practitioner. You will use the product better and leverage its many features more successfully in your work because you will have a deeper understanding of how those features work and how they were intended to be used.About Books Online
As with my previous books, one of the design goals of this book was to avoid needlessly repeating the information in Books Online. This necessitated omitting certain subjects that you might expect to find in a book like this. For example, I had originally planned to include an overview chapter that covered the architectural layout of the product from a high-level point of view. I had also planned to have a chapter on the architecture of the storage engine. However, on rereading the coverage of these subjects in Books Online (see the topic SQL Server Architecture Overview and the subtopics it links) and in other sources, I didn't feel I could improve on it substantially.
My purpose isn't to fill these pages with information that is already readily available to you; it is to pick up where the product documentation (and other books and whitepapers) leave off and take the discussion to the next level. As such, in this book I assume that you've read through Books Online and that you understand the basic concepts it relates.About WinDbg
This book features a good deal of work with WinDbg, Microsoft's freely downloadable symbolic debugger. You may be wondering why we need a debugger to explore SQL Server in the first place. After all, we obviously aren't going to "debug" SQL Server, and we certainly don't have source code for it, so we won't be stepping through code as is typically the case with a debugger.
The reason we use a debugger is that it gives us the ability to look under the hood of a running process in ways no other tool can. A debugger lets us see the threads currently running inside the process, their current call stacks, the state of virtual memory and heaps within the process, and various other important process-wide and thread-specific data. It lets us set breakpoints, view registers, and see when DLLs are loaded by the process or rebased by Windows. It lets us pause execution, dump memory regions, and save and restore the complete process state. In short, a debugger provides a kind of "X-ray" facility--a tool that lets us peer inside a process and see what's really going on within it. In this case, the object of our interest is SQL Server, but the basic debugging skills you'll learn in this book could be used to investigate any Win32 application. One of the chief goals of this book is to equip you with some basic coding and debugging skills so that you can continue the exploration of SQL Server on your own.
If we are to truly get inside the product and understand how it works, using a debugger is a must. Trying to understand the internal workings of a technology by merely reading about it in books or whitepapers is like trying to learn about a foreign country without actually visiting it--there's no substitute for just going there.
Given that WinDbg is freely downloadable from the Microsoft Web site, has the features we need, and is relatively easy to use, it seems the obvious choice. A symbolic debugger, it can use the symbols that ship with SQL Server and that are publicly available over the Internet, so it's a suitable choice for exploring the inner workings and architectural design of the product.About the Fundamentals
You'll notice an emphasis in this book on understanding the technologies behind SQL Server in order to understand how it works. I spend several chapters going through the fundamentals of processes and threads, memory management, Windows I/O, networking, and several other topics. To the uninitiated, these topics may seem only tangentially related at best. After all, why do you need to know about asynchronous I/O to understand SQL Server? You need to know something about it and the other fundamental technologies on which SQL Server is based in order to have a proper frame of reference and to gain a deep understanding of how the product itself works. You need to understand the fundamental Windows concepts on which SQL Server, a complex Windows application, is based for the same reason that a medical student needs to understand basic biology in order to get into medical school: Without this fundamental knowledge, you lack the perspective and foundation necessary to properly root and ground the more advanced concepts you will be attempting to learn. Humans learn by association--by associating new data with knowledge already acquired. Without a solid grounding in the fundamentals of Windows application design, you lack the basic knowledge required to systematically associate the details of how a complex Windows application such as SQL Server works.
To be sure, you can gain a superficial idea of how SQL Server works (for example, by reading that it makes use of scatter-gather I/O) without really understanding what the details mean. If you really want to master the product--if you really want to know it literally inside-out--you have to have some understanding of the technologies from which it's composed. Knowing how scatter-gather I/O works will give you immediate insight into why SQL Server uses it and why it enhances performance. The same is true for virtual memory, thread synchronization, networking, and the many other foundational topics we explore in this book. Not only are they relevant; having a basic understanding of them is essential to truly understanding SQL Server. Without a basic understanding of the fundamental technologies on which SQL Server is based--Win32 processes and threads, virtual memory, asynchronous I/O, COM, Windows networking, and various others--you have neither the tools nor the frame of reference to truly grasp how the product works or to master how to use it.
I fully realize that not every reader will be interested in the Windows technologies and APIs behind SQL Server's functionality. That's okay. If the nitty-gritty details of the Win32 APIs, how to use them, and how applications such as SQL Server typically employ them don't interest you, feel free to skip the Foundations section (Part I) of this book. There's still plenty of useful information in the rest of the book, and you don't have to understand every detail of every API to benefit from it.About the "How-To"
I've tried very hard to provide the architectural details behind how the various components of SQL Server work without neglecting the discussion of how to apply them in practical use. I am still a coder at heart, and there is still plenty of "how-to" information in this book. At last count, there were some 900 source code files slated for inclusion on the book's CD. That's more than either of my last two books, both of which were very focused on putting SQL Server to practical use, as I've said.
In terms of the central topic of all three of my SQL Server books--namely, getting the most out of the product--I've attempted to elevate the discussion to an exploration of the architectural design behind the product without leaving behind my core reader base. Regardless of whether you came to this book expecting the mother lode of code and practical use information that you typically find in my books or you agree with me that understanding how the product works is key to using it effectively, I hope you won't be disappointed with what you find here.About the Breadth of Topics
You will notice that this book covers a wide range of product features and technologies. It is not limited merely to the functionality provided within sqlservr.exe--it tries to cover the entire product. It's my opinion that a book that purports to discuss the internal workings and architectural design of a complex product such as SQL Server should cover the whole product, not just the functionality that resides within the core executable or product features that have been in place for many years. The world of SQL Server is a lot bigger than just a single executable. Prior to the 7.0 release of the product, I suppose you could get away with just covering the functionality provided by the main executable, but that's no longer the case and hasn't been for years. The product has matured and has broadened substantially with each new release.
This book isn't titled The Guru's Guide to sqlservr.exe--it's about all of SQL Server and how its many component pieces work and fit together. So, you'll see coverage in this book of what might seem like fringe SQL Server technologies such as Full Text Search, Notification Services, and SQLXML. We'll explore replication, DTS, and a host of other SQL Server technologies that are not implemented in the main SQL Server executable. Of necessity, I can't cover every feature in the product or even as many as I'd like. The book would take ten years to write and would be 5,000 pages long. However, I've tried to strike a balance between covering topics in the depth that people have come to expect from my books and exploring a sufficient breadth of features and technologies such that you can get a good feel for the overall design and architecture of SQL Server as a product.About C++
I'm fully aware that many SQL Server people are more comfortable in Visual Basic than in any C or C++ dialect. I used C and C++ to cover Windows programming fundamentals and elsewhere in the book for a couple of reasons.
First, the Win32 API itself is written in C. Although whole books have been written on accessing the Win32 API from VB, it has been my experience that this ranges from clunky to outright impossible in some circumstances, depending on the API function in question. The Win32 API was originally written in C, and therefore C and C++ are the purest and most direct methods of accessing it. Any other approach--be it from VB, Delphi, C#, or some other language or tool--adds a layer of indirection that can cloud the discussion.
Second, I used C++ because I happen to believe that the language is not that hard to learn and that most VB people are more than capable of developing basic C++ programming skills and effectively reading C++ code, regardless of whether they believe that themselves. There seems to be a natural aversion or fear of all things C++ among those in the VB community. It's my belief that these concerns are largely unfounded and that they needlessly limit people's ability to really understand Windows and complex Windows apps such as SQL Server. My advice: Even if you don't know C++ and feel you're out of your depth when reading through C++ code, don't be afraid of it. Work through the examples in this book, follow the instructions I provide, and see where your exploration leads you. Pick up an introductory book on the language if it suits you. You may find that the language isn't nearly as hard to get around in as you thought, and you may benefit--perhaps immensely--from the experience.
All that said, C++ is far from the only language used in this book. I know that no one language is used by everyone so I've tried to keep the book balanced in terms of the language tools used. A good deal of the example code used throughout the book is some flavor of Visual Basic--VB6, VBScript, or VB.NET. In the ODSOLE chapter, for example, I show you how to build COM objects in VB6. In the SQLXML chapter, I show you how to access SQLXML using VBScript. And in the Notification Services chapter, I show you how to implement a subscription management application using VB.NET. There's also a healthy helping of C#, Delphi, CMD files, and even a discussion or two of assembly language. And, of course, there's a wealth of Transact-SQL code throughout the book. Regardless of your preferred language(s), you should find code of interest to you in this book.About Visual C++ 6.0
Some of you may question the decision to use Visual C++ 6.0 for most of the C++ code examples in this book. I chose VC6 over Visual Studio .NET for two reasons: (1) having been around considerably longer, VC6 is much more pervasive, and (2) Visual Studio .NET (both the 2001 and the 2003 releases) will automatically upgrade VC6 projects when they are first opened. So, regardless of whether you have Visual Studio 6 or Visual Studio .NET, the C++ projects on the CD accompanying this book should open just fine for you. You should be able to compile and run them without incident. Also, when teaching basic Windows concepts such as thread synchronization and memory management, I do not use any version-specific features, so there's no advantage to using Visual Studio .NET over VC6.About the Terms and Knowledge Measures
Readers of my previous books may notice a significant amount of "supplementary" material in several of the chapters. You'll likely notice the term definitions that precede some of the chapter discussions and the knowledge measures at the end of each discussion. Don't worry: I still hate filler material and have gone to great lengths to avoid unnecessary screen shots, summaries, and other devices commonly used to lengthen technical books.
Though I personally don't like putting together term definition tables, knowledge measures, and the like and have avoided them in previous books, a growing number of readers have asked for additions such as these in order to make my books more suitable for classroom use. Several of my previous books are regularly used in classroom settings even though, admittedly, those books do not lend themselves well to it. Therefore, I've finally decided to try to do something about that. If you do not find these sections particularly useful, feel free to skip over them. All of the data contained in the term definitions is also in the chapter text--you won't miss anything by skipping them. That said, you may find that having a basic understanding of some of the terms and concepts before we get into them in depth may be useful to you. It really comes down to your individual preferences.
I have intentionally not included the answers to the questions in the knowledge measure sections in order to get a feel for how much they are used. Again, this is an adaptation intended to make the book more usable in classroom scenarios. I may or may not continue it in future books, depending on how useful it proves to be. If you want the answers to the knowledge measure questions, e-mail me at khen@khen.com, and I'll provide them.About SQL Server Versions
This book targets the latest release of SQL Server currently available, SQL Server 2000. Throughout the book, when you see a reference to SQL Server, you can assume that it definitely applies to SQL Server 2000 and probably to other releases as well. I rarely mention SQL Server's version number because I've found it to be a little cumbersome. That said, when in doubt, assume what you read in this book applies to SQL Server 2000.About Master Programming
With the sheer volume of code and code-related discussions in this book, it might appear to some that I'm trying to turn you into a master programmer rather than a master SQL Server practitioner. Nothing could be further from the truth. In order to really address that concern, let's first define what a master programmer is.
To begin with, a master programmer is someone who likely codes for a living. You cannot develop expert-level coding skills and keep them sharp by merely studying other people's code or reading programming books. You have to get in there and get your hands dirty, and you have to keep doing it. Technology changes and software engineering evolves quickly enough that there's simply no substitute for coding every day.
Second, a master programmer is someone who doesn't just know how to churn out source code. A person I worked with once suggested that the defining characteristic of an expert coder is great typing skill! I laughed out loud at that assertion because being an expert coder has nothing to do with typing--I know expert coders who don't type well at all. That notion reminds me of what Truman Capote said when asked about Jack Kerouac's work: "That isn't writing at all, it's typing." (As quoted in New Republic, Feb. 9, 1959.) Just as good writing amounts to a lot more than typing, so does expert-level coding. Cranking out reams of source code does not a master programmer make. In fact, there's a paucity and efficacy about the code of the programming masters that often accomplishes an astonishing amount of work with a surprisingly small amount of code. The idea isn't to write lots of code; it is to write good code. It's a question of quality versus quantity.
Third, a master programmer is well rounded. A master programmer knows a number of languages and works on multiple operating systems and platforms. He does not use one language at the expense of all others regardless of the problem. He uses the right tool for the job and constantly seeks to broaden his horizons and immerse himself in the art and science of computer language mastery. A master programmer is not a "jack of all trades and master of none" but maintains expert-level skills in several areas at once.
Fourth, an expert programmer masters the operating system environment and fundamental technologies with which he works just as much as he masters programming languages. He knows that simply mastering the language with which he happens to be working on a particular project is not enough; he must also know a good deal about the operating system and the foundational components with which he will construct applications. Whether this is COM or EJB, Windows or Linux, the master programmer knows that he must also have expert-level knowledge of the environment in which his code will run and the components from which it will be constructed in order to produce software that is robust, efficient, and extensible.
Fifth, a master programmer keeps up with the technology and developments in software engineering. A master programmer can tell you the difference between the decorator design pattern and the facade design pattern. He can tell you why COM is preferable to plain DLL use and about the advantages the .NET Framework offers over COM. He can tell you where Java fits in the grand scheme of things and how it compares to other languages. You can mention the term "refactoring" to him without getting a blank stare, and he can describe the relationship between eXtreme Programming and Aspect-Oriented Programming. He may not work every day with these concepts and technologies, but he stays current enough with the industry in which he works to understand them conceptually, to be able to explain the relationships between them, and to be able to discuss them articulately.
Sixth, a master programmer is well read. He knows who Martin Fowler is. He reads Kent Beck, and he's well versed in Erich Gamma's work. He reads both technology-specific books as well as those related to software engineering as a discipline. He reads Steve McConnell, and he also reads Donald Knuth. He knows who Jon Bentley is, and he also knows Brian Kernighan's work. He is well versed in Grady Booch's work and also reads Charles Petzold. In a day and age in which technology and the engineering required to master and put it to practical use seem to evolve at the speed of light, one can't read too much or stay too current with the latest developments in the industry. A master programmer knows this and dedicates himself to a lifelong course of continuing education.
So, with this in mind, I hope it's obvious that I'm not trying to turn anyone into a master programmer. This book isn't about software development; it's about SQL Server. To the extent that I delve into subjects seemingly more related to coding than to SQL Server, there is a method to the madness: I am trying to help develop basic coding and debugging skills in those who may lack them so that they can better understand how and why SQL Server is designed the way it is and so that they can continue the exploration of SQL Server on their own. The whole thrust of this book is about gaining as deep an understanding of SQL Server as possible so that we can put it to better use in the real world.
0201700476P10012003
The Guru's Guide to SQL Server Architecture and Internals FROM THE PUBLISHER
The latest book from the highly regarded and best-selling author Ken
Henderson, The Guru's Guide to SQL Server Architecture and Internals is the
consummate reference to Microsoft SQL Server. Picking up where documentation
and white papers leave off, this book takes an all-inclusive approach to
provide the most depth and breadth of coverage of any book on SQL Server
architecture, internals, and tuning.
Blending in-depth discussion with practical application, the guide begins
with several chapters on the fundamental Windows technologies behind SQL
Server, including processes and threads, memory management, Windows I/O, and
networking. The focus then moves on to the architectural details of SQL
Server and how to practically apply them.
The entire SQL Server product is covered--not just the functionality that
resides within the core executable or product features that have been in
place for years. SQL Server has matured and broadened substantially with
each release, and the author explores the "fringe" technologies that have
yet to be covered elsewhere, including Notification Services, Full Text
Search, SQLXML, replication, DTS, and a host of others.
Throughout the book, the author uses WinDbg, Microsoft's free downloadable
symbolic debugger, to look under the hood of SQL Server. Armed with new
debugging and coding skills, readers will be ready to master SQL Server on
their own.
The accompanying CD-ROM is packed with additional material, including full
source code for the book's 900+ examples, as well as three invaluable tools:
DTSDIAG, the VBODSOLE Library, and DTS Package Guru. DTSDIAG allows
developers and administrators to simultaneously collect Profiler traces,
perform logs, blocking script output, system event logs, and SQLDIAG reports
from a specified SQL Server. The VBODSOLE Library features more than twenty
new COM-based functions for Transact-SQL, including T-SQL enhancements such
as array-manipulation routines, financial functions, string-manipulation
functions, and system functions. DTS Package Guru is a .NET-based package
editor for SQL Server's Data Transformation Services that allows editing of
any modifiable package and supports the automation of mass package changes.
The Guru's Guide to SQL Server Architecture and Internals is the essential
guide for database developers and admin- istrators alike, regardless of
skill level.
SYNOPSIS
Explains how SQL Server and its core technologies are designed, how they work, and how they interoperate. As a complement to his other two books on SQL, Henderson explores Windows' process and threading architecture, memory management, and foundational I/O facilities, then looks inside SQL Server's user mode scheduler, XML facilities, and data transformation services. Most of the examples are written in C++. The CD-ROM contains the VBODSOLE library, DTS package guru, and DTSDIAG. Annotation ©2004 Book News, Inc., Portland, OR
WHAT PEOPLE ARE SAYING
I can pretty much guarantee that anyone who uses SQL Server on a regular
basis (even those located in Redmond working on SQL Server) can learn
something new from reading this book.
—David Campbell, Product Unit Manager,
Relational Server Team, Microsoft Corporation David Campbell