The Storage subtab displays information about your UCAR (unique content-addressable repository) garbage collection system.
For clients using deduplication, the UCAR system runs a garbage collection process every day to find and purge files that are no longer referenced. Data can become unreferenced "garbage" when, for example, clients are deleted without their jobs being purged, or when old jobs were not removed completely. We recommend running garbage collection after deleting jobs to ensure the data is cleared completely. This is similar to jobs with unreferenced data (see Unreferenced Data).
Garbage collection is deferred while deduplication jobs are running, for up to 12 hours, after which the process terminates. If the process times out, it is retried at its next scheduled time. There is one exception: if the system is running low on space, garbage collection proceeds whether or not any jobs are deduplicating.
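The cleanup described above can be pictured as a simple mark-and-sweep over a content-addressable block store: collect every block hash still referenced by a live job, then purge the rest. The sketch below is a hypothetical illustration of that idea, not the actual UCAR implementation:

```python
# Hypothetical sketch: blocks are keyed by content hash; jobs hold
# references to the blocks they use. Anything unreferenced is garbage.

def collect_garbage(block_store: dict, jobs: list) -> int:
    """Purge blocks not referenced by any remaining job; return count purged."""
    # Mark: gather every block hash still referenced by a live job.
    referenced = set()
    for job in jobs:
        referenced.update(job["blocks"])
    # Sweep: delete anything in the store that is no longer referenced.
    garbage = [h for h in block_store if h not in referenced]
    for h in garbage:
        del block_store[h]
    return len(garbage)

store = {"aa11": b"...", "bb22": b"...", "cc33": b"..."}
jobs = [{"name": "backup-1", "blocks": {"aa11"}}]  # "bb22"/"cc33" are orphaned
purged = collect_garbage(store, jobs)
print(purged)         # 2
print(sorted(store))  # ['aa11']
```

Deleting a client without purging its jobs corresponds to dropping an entry from `jobs` while its blocks remain in the store; the next collection pass reclaims them.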
The Storage screen has the following settings:
|Garbage Collection||Starts garbage collection manually|
|Garbage Collection Time of Day||Sets the time of day when garbage collection runs automatically|
|Compact Online||Starts Online DDFS Compact manually|
|Compact Online Time of Day||Sets the time of day when Online DDFS Compact starts automatically|
|Verify UCAR||Verifies UCAR integrity by systematically reading every file in UCAR and checking that its computed signature matches the recorded one. If it does not, the file is quarantined. The process is extremely I/O intensive and can take weeks to complete on systems with large amounts of stored data. Use only when instructed by Infrascale Support|
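The Verify UCAR check amounts to re-hashing each stored file and comparing the result against its recorded signature, quarantining any mismatch. A minimal sketch of that pattern, assuming SHA-256 signatures and hypothetical store layout (the real hash algorithm and storage format are not specified here):

```python
import hashlib

def verify_store(store: dict) -> list:
    """Return the keys of entries whose content no longer matches its recorded signature."""
    quarantined = []
    for key, (data, recorded_sig) in store.items():
        computed = hashlib.sha256(data).hexdigest()
        if computed != recorded_sig:  # corruption detected: flag for quarantine
            quarantined.append(key)
    return quarantined

good_sig = hashlib.sha256(b"payload").hexdigest()
store = {
    "file-1": (b"payload", good_sig),      # signature matches
    "file-2": (b"payload", "deadbeef"),    # recorded signature is wrong
}
print(verify_store(store))  # ['file-2']
```

Because every byte of every file must be read and hashed, the cost grows linearly with the total stored data, which is why the full verification can take weeks on large systems.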
Scrolling down the same page, you will also find the Block Deduplication Statistics (raw) section, which looks similar to the following:
|Data name||Data description|
|Blocks Written||The total number of full blocks written to DDFS since it was configured initially|
|Block Size||The size of the blocks files are divided into during the deduplication process. This value is not configurable|
|Total Blocks||The total number of blocks written to DDFS since it was configured initially, including both full and partial blocks|
|Total Bytes||The total number of bytes written to DDFS since it was configured initially|
|Partial Blocks||The number of partial blocks written to DDFS. A partial block occurs at the end of a file that does not divide evenly into blocks. For example, a 96 KB file is divided into one 64 KB full block and one 32 KB partial block|
|Partial Bytes||The total size of all partial blocks written to DDFS|
|Duplicate Blocks||The number of times a block already existed in the block store and did not need to be written again, thus saving space|
|Duplicate Bytes||The number of bytes that did not have to be written to the RAID because a copy of the block already existed|
|Free Blocks||The number of blocks marked as free in the block store|
|Free Bytes||The sum of the sizes of all blocks marked as free in the block store|
|Blocks Read||A counter of the times blocks have been read back from DDFS|
|Allocated Bytes||The size of the block stores, including both used and free blocks|