Not actual ghosts, but ghost rows. Before becoming a DBA I had never heard of them before. So what are ghost rows? A description I grabbed off this page here describes them as “The number of rows that are marked as deleted but not yet removed. These rows will be removed by a clean-up thread, when the server is not busy”
In my 3 month DBA career so far I’ve run into 2 msdb databases exhausting drive space. While there have been SQL Agent Jobs set in place to maintain the msdb database, these don’t seem to keep the size down, even after manually purging job history via the sp_purge_jobhistory stored procedure. While I could confirm the row count was dropping, I was confused as to why the database size did not. After some investigation around the indexes in msdb I discovered a high count of ghost rows (as well high fragmentation).
Why hadn’t the ghost clean-up thread cleared these out yet? Is it because the server is continuously busy so it does not call the clean-up thread. I think I read rebuilding the index can trigger the clean-up thread. Whether this is true or not, the one time I tried it, it worked!
Now the question is, do we need to implement index maintenance jobs on system databases? Or do we accept that the msdb database may grow out of control sometimes and wait for the clean-up thread to eventually clear the ghost rows and free the space? When that happens is anybody’s guess.
The ghost cleanup is a timer based background task which runs every 5 seconds and goes through a cyclic loop to cleanup ghost records present in the database tables. It doesn’t clean-up everything at once. More details about it in Paul’s blog post: http://www.sqlskills.com/BLOGS/PAUL/post/Inside-the-Storage-Engine-Ghost-cleanup-in-depth.aspx