Since learning about Source Server and how it works with debuggers such as Microsoft’s Visual Studio or WinDbg from the Debugging Tools for Windows, I’ve become an avid fan of adding version control information to my debug and release-mode PDB symbol files. This information has proven its value when debugging programs that I’ve released to users who then ran into problems. It has also become invaluable for debugging minidumps, to the point that I simply cannot live without it.
Recently, I’ve been exploring Git, a distributed version control system that was developed by the Linux kernel crowd. Git and similar version control programs offer the same version control functions as other software such as Subversion or Team Foundation Server, but instead of the repository being maintained on a central server, each user has his own copy of the repository and performs commits locally. When they are ready, users push their commits to a remote repository, thereby “publishing” their changes.
I became interested in Git because, short on time, I’ve wanted to work on my projects while away from my home network, where my personal Subversion repository is stored. Without access to it, I could not commit changes or perform my normal version control operations, and I struggled to find a way to keep working without risking my work. Hence I found Git. But to make Git really usable in my software development world, I needed it to support Source Server so that I can add my Git repository information to my PDB files for later use when debugging. With no built-in support for Git in Source Server, I started down the path of writing my own extension for Source Server to support Git. What made this even more fun is that Source Server is really a set of Perl scripts, and it has been so long since I wrote Perl that I’ve forgotten almost everything about the language. But with my subscription to Safari, I was able to conquer the beast that is Perl and found a solution that worked.
One disclaimer that I have on this solution is to point out that I am not a Perl developer. I’ve toyed around with it here and there, but I honestly cannot remember the date that I actually last wrote a Perl application. In terms of a genuine software solution, this is probably the first Perl solution that I’ve ever written, so please excuse me, for those of you that live in the Perl world, if my code is exceptionally bad in this one instance.
For those not familiar with Source Server, it’s a component that is installed with Microsoft’s Debugging Tools for Windows. Source Server is a Perl script and a set of command line tools that take the PDB files that compilers generate with debugging information and add version control information for each of the source files used to build an .exe or .dll module. The value of Source Server is that, even without the source code in front of you, you can still debug production .exe and .dll files, or debug minidumps, as long as the PDB symbol files are available. Source Server works alongside another Debugging Tools for Windows component called a symbol server, which is a repository of PDB files. By using Source Server to index my PDB files and then storing them in a symbol server’s database, I can debug any of my applications without having the exact copy of the source code in my development workspace on my hard drive. Instead, the debugger uses the symbol server database to get my PDB symbol files, and then uses the Source Server information in them to extract the source code from my version control repository.
The Perl script provides the overall framework for indexing PDB files, but doesn’t provide the actual integration with the version control servers. Instead, Source Server relies on external modules that handle the integration. Out of the box, Debugging Tools for Windows provides integration for TFS, Subversion, Perforce, and Visual SourceSafe. It also includes a Microsoft Word document describing how to create a new module for integration with another version control system. I used this to build the integration with Git.
To start off, Source Server will try to create a new instance of the provider module using the constructor:
sub new {
    my $proto = shift;
    my $class = ref($proto) || $proto;
    my $self = {};
    bless($self, $class);
    # Maps lowercased local file paths to their version control information.
    $$self{'FILE_LOOKUP_TABLE'} = {};
    return($self);
}
The FILE_LOOKUP_TABLE field will be used to store the version control information for the files in the project workspace. This will be populated and used later in the indexing process.
The next phase in the indexing process is that Source Server will call the module to gather information about the source files in the project workspace that could be indexed in the PDB files. The Git module uses this call to retrieve the version control information for the files in the project workspace. First, the module will call Git to get the SHA-1 identifier for the tree that represents the current commit that the project workspace is based on. The following command gets executed:
git --no-pager log -1 --pretty=format:%T -- (sourcePath)
The sourcePath parameter is passed to the module by the Source Server script. Once the tree identifier is retrieved, then the module will get the object identifiers for all of the objects in the tree and its subtrees:
git --no-pager ls-tree -r --full-name (treeId) (sourcePath)
This command walks recursively through the root tree and its subtrees and outputs the object identifiers and relative paths of all of the files (or content objects) in the tree hierarchy. The module collects this information and saves it in the FILE_LOOKUP_TABLE field that I mentioned earlier. Here’s the code for the GatherFileInformation subroutine:
sub GatherFileInformation {
    my $self = shift;
    my $sourcePath = shift;
    my $serverHashReference = shift;

    # Get the SHA-1 of the tree for the latest commit that touched $sourcePath.
    my $hProcess;
    if (!open($hProcess, "git --no-pager log -1 --pretty=format:\%T -- $sourcePath |")) {
        ::warn_message("Unable to get the log for $sourcePath");
        return();
    }
    my $treeId = <$hProcess>;
    close($hProcess);
    if (!defined $treeId) {
        ::warn_message("No commits found for $sourcePath");
        return();
    }
    chomp($treeId);

    # List every object in the tree, recursing into subtrees.
    if (!open($hProcess, "git --no-pager ls-tree -r --full-name $treeId $sourcePath |")) {
        ::warn_message("Unable to get the tree $treeId for $sourcePath");
        return();
    }

    # Convert the source path to an absolute path that uses backslashes.
    $sourcePath = abs_path($sourcePath);
    $sourcePath =~ s/\//\\/g;

    # Walk up the directory tree until a path listed in srcsrv.ini is found.
    my $repositoryPath = $sourcePath . '\\';
    while ($repositoryPath) {
        if (defined $$serverHashReference{uc $repositoryPath}) {
            last;
        }
        # Drop the trailing backslash, then trim back to the parent directory.
        $repositoryPath = substr($repositoryPath, 0, length($repositoryPath) - 1);
        my $index = rindex($repositoryPath, '\\');
        if ($index > 0) {    # numeric comparison; the original used the string operator gt
            $repositoryPath = substr($repositoryPath, 0, $index + 1);
        } else {
            $repositoryPath = "";
        }
    }
    if (!$repositoryPath) {
        ::fatal_error("$sourcePath missing from srcsrv.ini\n");
    }
    my $repositoryName = $$serverHashReference{uc $repositoryPath};

    # Each output line has the form "<mode> <type> <object id>\t<path>".
    while (my $currentLine = <$hProcess>) {
        chomp($currentLine);
        if ($currentLine =~ m/^(\S+)\s(\S+)\s(\S+)\t(.*)$/) {
            my ($mode, $type, $objectId, $path) = ($1, $2, $3, $4);
            $path =~ s/\//\\/g;
            my $localPath = $repositoryPath . $path;
            @{$$self{'FILE_LOOKUP_TABLE'}{lc $localPath}} =
                ( { $repositoryName => $repositoryPath },
                  "$localPath*$repositoryName*$path*$objectId" );
        }
    }
    close($hProcess);
    return(scalar(keys %{$$self{'FILE_LOOKUP_TABLE'}}) != 0);
}
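Two steps in that subroutine are easy to get wrong: parsing the ls-tree output and finding the repository root. The following is a minimal standalone sketch of both, not part of the module itself. The helper names parse_ls_tree_line and find_repository_root, the sample paths, and the object id are all made up for illustration, and the shape of the server hash (uppercased paths with a trailing backslash as keys) is an assumption based on how the subroutine uses it.

```perl
use strict;
use warnings;

# Parse one line of `git ls-tree -r` output, which has the form
#   <mode> SP <type> SP <object id> TAB <path>
# Returns (mode, type, object id, path with backslashes), or an empty
# list when the line does not match.
sub parse_ls_tree_line {
    my ($line) = @_;
    chomp $line;
    return () unless $line =~ /^(\d+) (\w+) ([0-9a-f]+)\t(.+)$/;
    my ($mode, $type, $objectId, $path) = ($1, $2, $3, $4);
    $path =~ tr{/}{\\};    # Git reports forward slashes; PDBs use backslashes
    return ($mode, $type, $objectId, $path);
}

# Walk up from $dir until an uppercased prefix with a trailing backslash
# is a key in %$servers. Returns that prefix, or '' if nothing matches.
sub find_repository_root {
    my ($dir, $servers) = @_;
    my $candidate = $dir . '\\';
    while ($candidate ne '') {
        return $candidate if exists $servers->{uc $candidate};
        chop $candidate;                           # drop the trailing backslash
        my $index = rindex($candidate, '\\');      # last remaining separator
        $candidate = $index > 0 ? substr($candidate, 0, $index + 1) : '';
    }
    return '';
}

# Hypothetical sample data.
my %servers = ('C:\\PROJECTS\\MYAPP\\' => 'MYAPP');
my $root = find_repository_root('C:\\Projects\\MyApp\\src\\util', \%servers);
my @fields = parse_ls_tree_line(
    "100644 blob 0123456789abcdef0123456789abcdef01234567\tsrc/util/io.cpp\n");

print "root: $root\n";
print "path: $fields[3]\n";
```

Anchoring the regular expression on the single tab keeps paths that contain spaces intact, since the tab is the only separator between the object id and the path.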
Note that the GatherFileInformation operation gathers the information for every file in the tree associated with the source path, not just the source code files. That’s OK, however, because the information collected here is not yet put into the PDB files. In the next step, Source Server calls the GetFileInfo operation for each source file that is referenced in a PDB file to get the version control information to be added to the PDB file:
sub GetFileInfo {
    my $self = shift;
    my $localFile = shift;
    if (defined $$self{'FILE_LOOKUP_TABLE'}{lc $localFile}) {
        return(@{$$self{'FILE_LOOKUP_TABLE'}{lc $localFile}});
    } else {
        return(undef);
    }
}
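To make the shape of the lookup table and the behavior of the lookup concrete, here is a small self-contained sketch. The LookupDemo package and its add_file helper are hypothetical stand-ins for the real module, and the paths, repository name, and object id are invented sample values. Only the GetFileInfo logic mirrors the subroutine above.

```perl
use strict;
use warnings;

package LookupDemo;    # hypothetical stand-in for the provider module

sub new {
    my ($class) = @_;
    return bless { FILE_LOOKUP_TABLE => {} }, $class;
}

# Store an entry in the same shape that GatherFileInformation uses:
# a hash mapping the repository name to its root, followed by the
# '*'-delimited string destined for the PDB file.
sub add_file {
    my ($self, $repoName, $repoPath, $path, $objectId) = @_;
    my $localPath = $repoPath . $path;
    $self->{FILE_LOOKUP_TABLE}{lc $localPath} = [
        { $repoName => $repoPath },
        "$localPath*$repoName*$path*$objectId",
    ];
    return;
}

# Same lookup as above: keys are lowercased, so casing does not matter.
sub GetFileInfo {
    my ($self, $localFile) = @_;
    my $entry = $self->{FILE_LOOKUP_TABLE}{lc $localFile};
    return $entry ? @$entry : undef;
}

package main;

my $demo = LookupDemo->new;
$demo->add_file('MYAPP', 'C:\\Projects\\MyApp\\', 'src\\main.cpp',
                '0123456789abcdef0123456789abcdef01234567');

# The PDB may record the path with different casing; the lookup still hits.
my ($repoInfo, $streamEntry) =
    $demo->GetFileInfo('c:\\projects\\myapp\\SRC\\Main.cpp');
print "$streamEntry\n";
```

Lowercasing the key on both the write and the read side is what makes the lookup tolerant of the casing differences that are common in Windows paths.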
Finally, the last piece that we need is to tell Source Server how to use the information that we’ve inserted into the PDB file to get to the source code. What we’ve stored is the relative path and SHA-1 object id of the file that is stored in the repository. To actually get the source code out of the repository, we can use the git show command to dump the contents of the object to the console, or in Source Server’s case, a temporary file. To tell Source Server how to get to the file using Git, we define a couple of variables that will be added to the PDB file for Source Server to use:
sub SourceStreamVariables {
    my $self = shift;
    my @stream;
    push(@stream, "GIT_EXTRACT_TARGET=%targ%\\%fnbksl%(%var3%)\\%var4%\\%fnfile%(%var1%)");
    # The command must end up as a single line in the source stream.
    push(@stream, "GIT_EXTRACT_CMD=cmd /c git --no-pager "
        . "\"--git-dir=%fnvar%(%var2%).git\" show %var4% > "
        . "\"%git_extract_target%\"");
    return (@stream);
}
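As a rough illustration of what the debugger ends up running, the sketch below expands those two variables by hand for a single file entry. It rests on my understanding of the Source Server functions: %fnvar% looks up the variable named by its argument, %fnbksl% turns forward slashes into backslashes, and %fnfile% keeps only the file name. It also assumes that the repository name is written to the stream as a variable holding the repository path. The entry, the cache directory, and the object id are invented sample values.

```perl
use strict;
use warnings;

# Hypothetical inputs: one source-file entry as returned by GetFileInfo,
# the repository variable from the stream, and the debugger's cache
# directory (%targ%).
my $entry = 'C:\\Projects\\MyApp\\src\\main.cpp*MYAPP*src\\main.cpp'
          . '*0123456789abcdef0123456789abcdef01234567';
my %streamVars = (MYAPP => 'C:\\Projects\\MyApp\\');
my $targ       = 'C:\\SrcCache';

# The '*'-separated fields become %var1% through %var4%.
my ($var1, $var2, $var3, $var4) = split /\*/, $entry;

# %fnfile%(%var1%): the file name portion of the local path.
my ($fileName) = $var1 =~ /([^\\\/]+)$/;

# %fnbksl%(%var3%): the relative path with backslashes only.
(my $relativePath = $var3) =~ tr{/}{\\};

# %fnvar%(%var2%): the value of the variable named by %var2%.
my $gitDir = $streamVars{$var2} . '.git';

# GIT_EXTRACT_TARGET and GIT_EXTRACT_CMD after expansion.
my $target  = join('\\', $targ, $relativePath, $var4, $fileName);
my $command = sprintf('cmd /c git --no-pager "--git-dir=%s" show %s > "%s"',
                      $gitDir, $var4, $target);

print "$target\n";
print "$command\n";
```

Including the object id in the target path means that each revision of a file lands in its own cache directory, so different builds of the same file do not overwrite one another.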
The full source code for the Git module is below:
# ------------------------------------------------------------------------
# git.pm
#
# Source Server module to handle adding version control information to
# PDB symbol files for source code that is stored in a Git repository.
#
# Copyright (c) 2009 ImaginaryRealities Software Company
# ------------------------------------------------------------------------
package GIT;
require Exporter;
use strict;
use Cwd 'abs_path';

sub new {
    my $proto = shift;
    my $class = ref($proto) || $proto;
    my $self = {};
    bless($self, $class);
    # Maps lowercased local file paths to their version control information.
    $$self{'FILE_LOOKUP_TABLE'} = {};
    return($self);
}

sub GatherFileInformation {
    my $self = shift;
    my $sourcePath = shift;
    my $serverHashReference = shift;

    # Get the SHA-1 of the tree for the latest commit that touched $sourcePath.
    my $hProcess;
    if (!open($hProcess, "git --no-pager log -1 --pretty=format:\%T -- $sourcePath |")) {
        ::warn_message("Unable to get the log for $sourcePath");
        return();
    }
    my $treeId = <$hProcess>;
    close($hProcess);
    if (!defined $treeId) {
        ::warn_message("No commits found for $sourcePath");
        return();
    }
    chomp($treeId);

    # List every object in the tree, recursing into subtrees.
    if (!open($hProcess, "git --no-pager ls-tree -r --full-name $treeId $sourcePath |")) {
        ::warn_message("Unable to get the tree $treeId for $sourcePath");
        return();
    }

    # Convert the source path to an absolute path that uses backslashes.
    $sourcePath = abs_path($sourcePath);
    $sourcePath =~ s/\//\\/g;

    # Walk up the directory tree until a path listed in srcsrv.ini is found.
    my $repositoryPath = $sourcePath . '\\';
    while ($repositoryPath) {
        if (defined $$serverHashReference{uc $repositoryPath}) {
            last;
        }
        # Drop the trailing backslash, then trim back to the parent directory.
        $repositoryPath = substr($repositoryPath, 0, length($repositoryPath) - 1);
        my $index = rindex($repositoryPath, '\\');
        if ($index > 0) {    # numeric comparison; the original used the string operator gt
            $repositoryPath = substr($repositoryPath, 0, $index + 1);
        } else {
            $repositoryPath = "";
        }
    }
    if (!$repositoryPath) {
        ::fatal_error("$sourcePath missing from srcsrv.ini\n");
    }
    my $repositoryName = $$serverHashReference{uc $repositoryPath};

    # Each output line has the form "<mode> <type> <object id>\t<path>".
    while (my $currentLine = <$hProcess>) {
        chomp($currentLine);
        if ($currentLine =~ m/^(\S+)\s(\S+)\s(\S+)\t(.*)$/) {
            my ($mode, $type, $objectId, $path) = ($1, $2, $3, $4);
            $path =~ s/\//\\/g;
            my $localPath = $repositoryPath . $path;
            @{$$self{'FILE_LOOKUP_TABLE'}{lc $localPath}} =
                ( { $repositoryName => $repositoryPath },
                  "$localPath*$repositoryName*$path*$objectId" );
        }
    }
    close($hProcess);
    return(scalar(keys %{$$self{'FILE_LOOKUP_TABLE'}}) != 0);
}

sub GetFileInfo {
    my $self = shift;
    my $localFile = shift;
    if (defined $$self{'FILE_LOOKUP_TABLE'}{lc $localFile}) {
        return(@{$$self{'FILE_LOOKUP_TABLE'}{lc $localFile}});
    } else {
        return(undef);
    }
}

sub LongName {
    return("Git");
}

sub SourceStreamVariables {
    my $self = shift;
    my @stream;
    push(@stream, "GIT_EXTRACT_TARGET=%targ%\\%fnbksl%(%var3%)\\%var4%\\%fnfile%(%var1%)");
    push(@stream, "GIT_EXTRACT_CMD=cmd /c git --no-pager "
        . "\"--git-dir=%fnvar%(%var2%).git\" show %var4% > "
        . "\"%git_extract_target%\"");
    return (@stream);
}