Formatting Data Sizes
When we present file-based data to a user, we usually have to show file sizes as well. While you could get away with displaying the raw number of bytes, for large files this loses quite a bit of meaning. Ideally, the size is instead scaled to a unit (KB, MB, and so on) selected based on the actual value.
In order to implement this functionality, let’s create a “ByteSizeFormatter” class. We’ll make it static and add string arrays for both the standard suffixes and the binary suffixes (KiB, MiB, and so on), which this code refers to as ISO:
[code]
public static class ByteSizeFormatter
{
    // No leading space on "Bytes": the formatting methods add the separator themselves.
    private static String[] stdbytesuffixes = new String[]
    {
        "Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"
    };
    private static String[] isobytesuffixes = new String[]
    {
        "Bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"
    };
}
[/code]
Why use an array? Because the index of the appropriate element can be computed from a given byte size with a bit of math, namely the floor of the base-1024 logarithm of the size:
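[code]
private static int getbyteprefixindex(long bytevalue)
{
    // Math.Log gives -Infinity for 0 and NaN for negatives; show those as plain bytes.
    if (bytevalue < 1) return 0;
    return (int)(Math.Floor(Math.Log(bytevalue) / Math.Log(1024)));
}
[/code]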
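To see what the logarithm buys us, here is a minimal sketch that computes the index for a few sample sizes. The class name `PrefixIndexDemo` and the method name `GetIndex` are made up for illustration, and the guard for values below 1 is an assumption on my part (without it, 0 and negative inputs do not produce a usable array index).

```csharp
using System;

public static class PrefixIndexDemo
{
    // Hypothetical stand-in for getbyteprefixindex, with an added guard.
    public static int GetIndex(long byteValue)
    {
        // Assumption: anything below 1 byte is displayed in plain bytes (index 0).
        if (byteValue < 1) return 0;
        // floor(log base 1024 of the value) counts the whole 1024-steps in the value.
        return (int)Math.Floor(Math.Log(byteValue) / Math.Log(1024));
    }
}
```

With this, `GetIndex(1023)` is 0 (bytes), `GetIndex(1536)` is 1 (KB), and `GetIndex(1500000)` is 2 (MB), since ln(1500000) / ln(1024) is roughly 2.05.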
And students complain that they’ll never use logarithms in real life. If they became programmers, they would probably have implemented the above like this:
[code]
private static int getbyteprefixindex(long bytevalue)
{
    int currindex = 0;
    long reduceit = bytevalue;
    // >= rather than >, so that exactly 1024 bytes counts as 1 KB.
    while (reduceit >= 1024)
    {
        reduceit /= 1024;
        currindex++;
    }
    return currindex;
}
[/code]
Tsk! I say to that. Anyway, the idea here is to get the index of the suffix that, when applied, gives a number that is at least 1 but less than 1024. Naturally, this routine is meant to be used within another:
[code]
public static String FormatSize(long amount, int numdecimalplaces = 2, bool useISO = false)
{
    String[] usesuffixes = useISO ? isobytesuffixes : stdbytesuffixes;
    int gotindex = getbyteprefixindex(amount);
    double calcamount = amount;
    calcamount = calcamount / (Math.Pow(1024, gotindex));
    // "F2", "F3", ... is the fixed-point format with that many decimal places.
    return calcamount.ToString("F" + numdecimalplaces.ToString(CultureInfo.InvariantCulture),
        CultureInfo.CurrentCulture) + " " + usesuffixes[gotindex];
}
[/code]
FormatSize() formats a given byte amount into an “optimal” value with an appropriate prefix. It does this by generating the appropriate format string to pass to the ToString() method based on the passed number of decimal places.
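As a rough illustration of how the pieces fit together, here is a self-contained sketch of the same idea. The names `FormatSizeDemo` and `Format` are hypothetical, the suffix list is written out in full for this sketch, and it formats with `CultureInfo.InvariantCulture` instead of the current culture, purely so that the output is the same on every machine.

```csharp
using System;
using System.Globalization;

public static class FormatSizeDemo
{
    // Hypothetical suffix list for this sketch.
    private static readonly string[] Suffixes =
        { "Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB" };

    public static string Format(long amount, int decimals = 2)
    {
        // Assumption: values below 1 are shown in plain bytes.
        int index = amount < 1 ? 0 : (int)Math.Floor(Math.Log(amount) / Math.Log(1024));
        // Scale the value down by 1024^index.
        double scaled = amount / Math.Pow(1024, index);
        // Build the standard fixed-point specifier, e.g. "F2" for two decimals.
        string format = "F" + decimals.ToString(CultureInfo.InvariantCulture);
        return scaled.ToString(format, CultureInfo.InvariantCulture) + " " + Suffixes[index];
    }
}
```

Here `Format(1536)` gives "1.50 KB", `Format(1500000)` gives "1.43 MB", and `Format(512, 0)` gives "512 Bytes".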
What else?
So we have a method for formatting a single size. However, we might want to format several sizes with the same “format”, so that a set of values is comparable at a glance. 512K can look larger than 3MB when you skim a list, and it is a lot easier to judge and compare file sizes when they all use the same unit. The first step is a helper method that accepts a precalculated index:
[code]
private static String FormatSizeDirect(long amount, int index, int digitsafterdecimal = 2, bool useISO = false)
{
    String[] usesuffixes = useISO ? isobytesuffixes : stdbytesuffixes;
    String buildresult;
    double amountuse = amount;
    amountuse = amountuse / (Math.Pow(1024, index));
    // Builds a composite format such as "{0:0.00}" for two digits after the decimal point.
    String formatstring = "{0:0." + String.Join("", Enumerable.Repeat("0", digitsafterdecimal).ToArray()) + "}";
    Debug.Print(formatstring);
    buildresult = String.Format(formatstring, amountuse);
    buildresult += " " + usesuffixes[index];
    return buildresult;
}
[/code]
“But WAAAIT!” I hear you screaming. “That’s duplicating code!” Of course it is. That’s why we can replace the FormatSize() method with a delegated call to this one, like so:
[code]
public static String FormatSize(long amount, int numdecimalplaces = 2, bool useISO = false)
{
    return FormatSizeDirect(amount, getbyteprefixindex(amount), numdecimalplaces, useISO);
}
[/code]
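The helper builds its format string by repeating the "0" placeholder once per digit. That step can be tried in isolation with a small sketch like the following. `FormatStringDemo`, `BuildFormat` and `Apply` are hypothetical names, `new string('0', n)` is swapped in as a shorter way to repeat the character, and the invariant culture is an assumption made only to keep the output predictable.

```csharp
using System;
using System.Globalization;

public static class FormatStringDemo
{
    // Builds a composite format string such as "{0:0.000}" for three digits.
    public static string BuildFormat(int digitsAfterDecimal)
    {
        // new string('0', n) repeats the zero placeholder n times.
        return "{0:0." + new string('0', digitsAfterDecimal) + "}";
    }

    // Applies the generated format to a value using the invariant culture.
    public static string Apply(double value, int digitsAfterDecimal)
    {
        return string.Format(CultureInfo.InvariantCulture, BuildFormat(digitsAfterDecimal), value);
    }
}
```

For example, `BuildFormat(2)` returns "{0:0.00}", so `Apply(1.5, 2)` produces "1.50" and `Apply(1.5, 3)` produces "1.500".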
Then, we can use FormatSizeDirect() to create the method imagined earlier:
[code]
public static IEnumerable<string> FormatSizes(IEnumerable<long> bytesizes)
{
    //iterate through all the elements, and find the lowest byteprefixindex...
    int currlowest = stdbytesuffixes.Length + 1;
    currlowest = bytesizes.Select(getbyteprefixindex).Concat(new[] {currlowest}).Min();
    return bytesizes.Select(iteratesize => FormatSizeDirect(iteratesize, currlowest));
}
[/code]
Tada! LINQ is very helpful here, since it keeps the code to three statements where explicit loops would take more. The method also accepts and returns an IEnumerable rather than an array, simply because nothing in it requires anything specific to arrays. One consequence is that the formatted strings are produced lazily, when the caller enumerates the result.
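To make the “same unit for the whole set” behaviour concrete, here is a self-contained sketch of the idea. `CommonScaleDemo` and `FormatAll` are hypothetical names. Unlike the method above, this sketch materializes the input with `ToList()` so it is only enumerated once, uses `DefaultIfEmpty(0)` to cope with an empty input, and formats with the invariant culture; all three are my assumptions rather than part of the original class.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class CommonScaleDemo
{
    // Hypothetical suffix list for this sketch.
    private static readonly string[] Suffixes =
        { "Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB" };

    private static int GetIndex(long value)
    {
        // Assumption: values below 1 are shown in plain bytes.
        return value < 1 ? 0 : (int)Math.Floor(Math.Log(value) / Math.Log(1024));
    }

    public static List<string> FormatAll(IEnumerable<long> sizes)
    {
        // Materialize once so the input is not enumerated twice.
        List<long> list = sizes.ToList();
        // Smallest index in the set; DefaultIfEmpty keeps Min() from throwing on empty input.
        int lowest = list.Select(GetIndex).DefaultIfEmpty(0).Min();
        double divisor = Math.Pow(1024, lowest);
        return list
            .Select(s => (s / divisor).ToString("F2", CultureInfo.InvariantCulture) + " " + Suffixes[lowest])
            .ToList();
    }
}
```

Passing 524288 and 3145728 (512 KB and 3 MB) yields "512.00 KB" and "3072.00 KB", so the larger file now visibly has the larger number. Passing 23 and 1440 yields "23.00 Bytes" and "1440.00 Bytes", because the smallest value decides the unit.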
To finish off, here is the full source of this class from my BASeCamp.Updating library:
[code]
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.Linq;
using System.Text;

namespace BASeCamp.Updating
{
    public static class ByteSizeFormatter
    {
        // No leading space on "Bytes": the formatting methods add the separator themselves.
        private static String[] stdbytesuffixes = new String[]
        {
            "Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"
        };
        private static String[] isobytesuffixes = new String[]
        {
            "Bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"
        };

        /// <summary>
        /// gets the byte prefix index of the private arrays to use for the given value.
        /// </summary>
        /// <param name="bytevalue"></param>
        /// <returns></returns>
        private static int getbyteprefixindex(long bytevalue)
        {
            // Math.Log gives -Infinity for 0 and NaN for negatives; show those as plain bytes.
            if (bytevalue < 1) return 0;
            return (int)(Math.Floor(Math.Log(bytevalue) / Math.Log(1024)));
        }

        public static String FormatSize(long amount, int numdecimalplaces = 2, bool useISO = false)
        {
            return FormatSizeDirect(amount, getbyteprefixindex(amount), numdecimalplaces, useISO);
        }

        private static String FormatSizeDirect(long amount, int index, int digitsafterdecimal = 2, bool useISO = false)
        {
            String[] usesuffixes = useISO ? isobytesuffixes : stdbytesuffixes;
            String buildresult;
            double amountuse = amount;
            amountuse = amountuse / (Math.Pow(1024, index));
            String formatstring = "{0:0." + String.Join("", Enumerable.Repeat("0", digitsafterdecimal).ToArray()) + "}";
            Debug.Print(formatstring);
            buildresult = String.Format(formatstring, amountuse);
            buildresult += " " + usesuffixes[index];
            return buildresult;
        }

        /// <summary>
        /// formats a set of byte values to use the most honest display; that is, if we have 23 bytes and 1440 bytes, both will be displayed as bytes, but if it is 1330 and 1440, it shows as KB.
        /// </summary>
        /// <param name="bytesizes"></param>
        /// <returns></returns>
        public static IEnumerable<String> FormatSizes(IEnumerable<long> bytesizes)
        {
            //iterate through all the elements, and find the lowest byteprefixindex...
            int currlowest = stdbytesuffixes.Length + 1;
            currlowest = bytesizes.Select(getbyteprefixindex).Concat(new[] {currlowest}).Min();
            return bytesizes.Select(iteratesize => FormatSizeDirect(iteratesize, currlowest));
        }
    }
}
[/code]