PrintFVDiff - compare free variables of two cache entries
PrintFVDiff [--verbose | -v] primary-key cache-index-1 cache-index-2
Most of the time, users trust Vesta's caching system to do the right thing. However, sometimes they want to know why a particular evaluation had a cache miss rather than a cache hit. This kind of investigation is common when looking for ways to improve build performance by reducing cache misses.
In general, the Vesta cache is designed to efficiently determine whether a previous build result can be re-used. It is not designed to efficiently answer user queries about past builds. However, in some specific cases, answers can be obtained by inspecting what's stored in the cache.
Two different cache entries with the same primary key (aka "pk") can be compared to see which of their free variables had different values. (Free variables are also often called secondary dependencies.) PrintFVDiff will compare the free variables of two cache entries and print those that had different values. This helps answer the common question "Why was entry B added instead of getting a cache hit on entry A?" The free variables with different values are the inputs that the function used in both calls but whose values differed between the two calls.
In order to perform the comparison you need to know the primary key and the cache index (aka "ci") for both of the entries to be compared. The primary key should be given in hex and may have whitespace between the two 64-bit halves (as it often does in printed representations from other programs). There are a variety of ways you can get these pieces of information:
- The evaluator's -trace command-line option will print the cache index of every cache hit and every new entry added to the cache
- PrintCallGraph, a utility for searching for cache entries and showing their caller/callee relationships, prints primary keys and cache indices.
- VCacheStats gathers statistics about the entries stored in the Vesta cache. It can be used to identify optimization opportunities in SDL functions. It can print primary keys and cache indices, and may be particularly useful in identifying them when its -report command-line option is used.
- PrintGraphLog will print the primary key and cache index of every entry in the cache (though it doesn't provide much information to help humans understand what the cache entries represent). If you don't already know it, the primary key of a cache index can be found by searching its output.
- PrintCacheLog will print information about recently added cache entries, including the primary key and cache index of each.
- The evaluator will print both primary keys and cache indices when using the -cdebug command-line option (though it produces a large amount of output including many other pieces of information).
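The primary key format described above (hex, two 64-bit halves, optional whitespace between them) can be illustrated with a small sketch. The helper name parse_primary_key is hypothetical, and the strict 32-digit check is an assumption made for illustration; it is not taken from PrintFVDiff's own argument parsing.

```python
import re


def parse_primary_key(text):
    """Split a hex primary key into its two 64-bit halves.

    Hypothetical helper: the rules PrintFVDiff itself applies are not
    documented here, so the exact validation is an assumption.
    """
    # Drop any whitespace, such as the space often printed between the halves.
    digits = re.sub(r"\s+", "", text)
    # Two 64-bit halves are 128 bits, which is 32 hex digits (assumed strict).
    if not re.fullmatch(r"[0-9a-fA-F]{32}", digits):
        raise ValueError(f"expected 32 hex digits, got {text!r}")
    return int(digits[:16], 16), int(digits[16:], 16)


high, low = parse_primary_key("2b1f8372382297bc 7b16a71ef275c5a5")
print(f"{high:016x} {low:016x}")  # prints "2b1f8372382297bc 7b16a71ef275c5a5"
```

A check like this can be handy when copying primary keys out of the output of the tools listed above, since a key with a missing digit would otherwise only be noticed when the lookup fails.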
PrintFVDiff shows the dependency types and paths as the evaluator sent them to the cache. It's up to the user to make sense of what each free variable means in the context of the corresponding function.
By default, PrintFVDiff shows only free variables which both cache entries depended upon. Adding --verbose to the command line will make it show dependencies that one entry depended upon but the other didn't. (Usually the user is more interested in the free variables the two entries have in common but which have different values.)
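The difference between the default and verbose reports can be sketched as a comparison of two mappings from free variable name to fingerprint. This is only an illustration of the idea, not PrintFVDiff's implementation: the function name, the dictionary representation, and the fingerprint strings are all assumptions, and the "Unchanged.o" name is invented for the example.

```python
def diff_free_variables(entry_a, entry_b):
    """Compare two {free variable name: fingerprint} mappings.

    Returns (differs, only_a, only_b). The first list corresponds to the
    default report; the other two would only appear with --verbose.
    """
    common = entry_a.keys() & entry_b.keys()
    differs = sorted(name for name in common if entry_a[name] != entry_b[name])
    only_a = sorted(entry_a.keys() - entry_b.keys())
    only_b = sorted(entry_b.keys() - entry_a.keys())
    return differs, only_a, only_b


# Fingerprint values here are made-up placeholders.
entry_a = {
    "N/./root/.WD/PrintFVDiff.o": "fp-1",
    "N/./root/.WD/Unchanged.o": "fp-2",  # hypothetical name
    "!/./root/.WD/ccOnYtVh.o": "fp-3",
}
entry_b = {
    "N/./root/.WD/PrintFVDiff.o": "fp-9",
    "N/./root/.WD/Unchanged.o": "fp-2",  # hypothetical name
    "!/./root/.WD/cc8TK9cl.o": "fp-3",
}

differs, only_a, only_b = diff_free_variables(entry_a, entry_b)
print(differs)  # ['N/./root/.WD/PrintFVDiff.o']
print(only_a)   # ['!/./root/.WD/ccOnYtVh.o']
print(only_b)   # ['!/./root/.WD/cc8TK9cl.o']
```

Free variables with equal fingerprints in both entries are never reported, which is why the unchanged name is absent from all three lists.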
Note that it is possible to have multiple cache entries with no free variable differences at all. Suppose two users simultaneously evaluate builds that are similar or even the same. They could both get a cache miss on the same function call, both do the work of the function, and both add cache entries. If the function calls had identical arguments, this would produce two identical cache entries.
- --verbose | -v
- In addition to free variables that the two cache entries have in common but had different values, show any free variables that one cache entry depends upon but the other doesn't. Such free variables can be interesting when trying to understand the caching behavior of your builds. Reducing or eliminating uncommon free variables can improve build efficiency. They're not shown by default because such free variables usually don't help answer the question "Why was entry B added instead of getting a cache hit on entry A?"
Also, print the text "sourceFunc" annotation in the PKFile, which indicates the function call that the cache entries correspond to.
Suppose the user is interested in comparing two similar builds. They might run PrintCallGraph on the two top-level models and compare the output. In this case, suppose the user selected this pair of cache entries (for the _run_tool linking the PrintFVDiff executable):
    % diff -u /tmp/call_graph_1 /tmp/call_graph_2
    ...
    - ci = 31581
    + ci = 31382
      pk = 2b1f8372382297bc 7b16a71ef275c5a5
      sourceFunc = _run_tool, command line: /usr/bin/g++-3.4 -L -L. -o PrintFVDiff ...
    ...

Here's the output of PrintFVDiff called to compare these two cache entries:

    % PrintFVDiff '2b1f8372382297bc 7b16a71ef275c5a5' 31581 31382
    N/./root/.WD/PrintFVDiff.o

This shows us that the file /.WD/PrintFVDiff.o had different contents in the encapsulated filesystem (which is ./root in the SDL code) for these two different tool runs.

If the user wants more information they can add the --verbose flag:
    % PrintFVDiff -v '2b1f8372382297bc 7b16a71ef275c5a5' 31581 31382
    sourceFunc = _run_tool, command line: /usr/bin/g++-3.4 -L -L. -o PrintFVDiff ...
    ------------------------------
    ~ : differs
    < : only in 31581
    > : only in 31382
    ------------------------------
    ~ N/./root/.WD/PrintFVDiff.o
    > !/./root/.WD/ccpYjO0q.ld
    > !/./root/.WD/ccnq4HAf.c
    > !/./root/.WD/cc8TK9cl.o
    < !/./root/.WD/ccOnYtVh.o
    < !/./root/.WD/ccTMrSVl.ld
    < !/./root/.WD/ccjDS15d.c

Here we can see that there are some temporary files generated by the tool in /.WD which had different names in each run.

(Note that the sourceFunc annotation in both the PrintFVDiff output and the comparison of the PrintCallGraph output above has been truncated for brevity.)
The following exit values are returned:
- 0 : Successful completion
- 1 : Command line parsing error or configuration error (e.g. trouble reading configuration file or missing settings)
- 2 : Error opening or reading the MultiPKFile or other on-disk cache files
- 3 : The PKFile or the cache entries couldn't be found after successfully opening the MultiPKFile
- 4 : Any other errors
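A script that wraps PrintFVDiff might translate these exit values into messages. The following is a minimal sketch under stated assumptions: the names EXIT_MEANINGS, describe_exit, and run_print_fv_diff are hypothetical, and the wrapper assumes PrintFVDiff is on the PATH and is invoked as in the synopsis.

```python
import subprocess

# Meanings copied from the exit value list above.
EXIT_MEANINGS = {
    0: "Successful completion",
    1: "Command line parsing error or configuration error",
    2: "Error opening or reading the MultiPKFile or other on-disk cache files",
    3: "The PKFile or the cache entries couldn't be found",
    4: "Any other errors",
}


def describe_exit(code):
    """Return a human-readable meaning for a PrintFVDiff exit value."""
    return EXIT_MEANINGS.get(code, f"Unexpected exit status {code}")


def run_print_fv_diff(primary_key, ci_1, ci_2):
    """Run PrintFVDiff and return (stdout, meaning of the exit value).

    Assumes the PrintFVDiff executable can be found on the PATH.
    """
    result = subprocess.run(
        ["PrintFVDiff", primary_key, str(ci_1), str(ci_2)],
        capture_output=True,
        text=True,
    )
    return result.stdout, describe_exit(result.returncode)


print(describe_exit(3))  # prints "The PKFile or the cache entries couldn't be found"
```

Because the primary key is passed as a single list element, no shell quoting is needed even when it contains a space between its two halves.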
PrintFVDiff locates the MultiPKFile containing the cache entries by reading site-specific configuration information from a Vesta configuration file. (See the vesta.cfg(5) man page for an overview.)
The variables used by PrintFVDiff are in the section denoted by [CacheServer]. Here are the variables it uses and their meanings; the types of the variables are shown in parentheses:
- MetaDataRoot (string)
- The pathname of the directory in which the Vesta cache's metadata is stored. If this variable is undefined, the current directory is used. If defined, this path should end in a slash (/) character. Other configuration variables are interpreted relative to this path.
- MetaDataDir (string)
- The directory (relative to the MetaDataRoot) in which the cache server's metadata is stored. This directory should end in a slash (/) character.
- SCacheDir (string)
- The directory (relative to the MetaDataRoot/MetaDataDir) in which the function cache stores cache entries.
- $MetaDataRoot/$MetaDataDir/$SCacheDir/
- The root of the sub-tree in which stable cache entry files (also known as MultiPKFiles) are stored. The files are stored under a pathname formed from their respective primary keys.
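How the three settings combine into the stable cache root can be sketched as follows. This is an illustration only: the function name stable_cache_root and the example values are hypothetical, and the code does not read vesta.cfg itself or reproduce how individual MultiPKFile pathnames are derived from primary keys.

```python
import posixpath


def stable_cache_root(meta_data_dir, scache_dir, meta_data_root=None):
    """Join the [CacheServer] settings into the stable cache root.

    An undefined MetaDataRoot falls back to the current directory, as
    described above. Note that posixpath.join discards earlier components
    if a later one is absolute, so the relative settings are assumed to
    really be relative.
    """
    root = meta_data_root if meta_data_root else "."
    path = posixpath.join(root, meta_data_dir, scache_dir)
    # Normalize duplicate or trailing slashes, then end with a single slash.
    return posixpath.normpath(path) + "/"


# All values below are made-up examples, not defaults from any real site.
print(stable_cache_root("cache/", "sCache/", "/hypothetical/meta/"))
# prints "/hypothetical/meta/cache/sCache/"
print(stable_cache_root("cache/", "sCache/"))
# prints "cache/sCache/"
```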
PrintFVDiff can't tell you what the values of the free variables were for the two calls. It can only tell you that they were different. The cache only records the fingerprints of the values (similar to a checksum or a cryptographic hash function like MD5 or SHA1). It compares those fingerprints to determine whether the value was the same or different, which is the same way cache hits or misses are determined when building.
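The consequence of storing only fingerprints can be shown with a short sketch. SHA-1 from Python's hashlib is used purely as a stand-in here; Vesta has its own fingerprint algorithm, and the byte strings are invented sample values.

```python
import hashlib


def fingerprint(value: bytes) -> str:
    """Stand-in fingerprint (SHA-1); not the algorithm Vesta uses."""
    return hashlib.sha1(value).hexdigest()


# Invented sample values standing in for a free variable's value in two calls.
old = fingerprint(b"object file contents from the first build")
new = fingerprint(b"object file contents from the second build")

# Only equality of fingerprints can be tested; the original values
# cannot be recovered from them.
print("same" if old == new else "different")  # prints "different"
```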
PrintFVDiff only accesses information that the VCache daemon has committed to disk. You may need to use FlushCache to get new entries committed to disk before running PrintFVDiff, especially if you're interested in cache entries added by recent builds.
The user must supply the primary key. It would be possible to make the primary key optional and have PrintFVDiff search the cache's graph log for the primary keys of the two specified CIs. This could be slow, but might be useful. (Of course, it would need to detect the case where the two CIs have different PKs and report that to the user as an error.)
In some cases there's no easy answer that can be obtained from the information stored in the Vesta cache. If two cache entries are logically related but have different primary keys, the cache has no way to know which values incorporated into the primary key were different. The construction of the primary key is done by the evaluator, and only it knows what values were incorporated into the primary key used for each function call. (There is documentation about how the evaluator forms primary keys for different function calls, but the topic is beyond the scope of this man page.)
vesta(1), VCache(1), PrintCallGraph(1), VCacheStats(1), FlushCache(1), PrintMPKFile(1), VCacheImpl(7), MultiPKFile(5), PrintGraphLog(1), PrintCacheLog(1), vesta.cfg(5)
Kenneth C. Schalk <ken AT xorian DOT net>