Commifying Numbers in Linux: Not as Easy as it Looks

Jul 07, 2022 @ 11:41 am

Today I needed to process a large number of PostgreSQL database tables and report how many rows were in each. Once I had that information, I needed to do some additional math on the data and then print it out in a nicely formatted manner. And for reasons I won’t go into, I had to use shell.

But it was all very trivial…until I got to that last part. It involves commification.

Is that even a word? My browser’s spell checker thinks not and it doesn’t know commify either. While you might think commification is a 1950s word for the spread of Marxism, it means “adding commas to numbers”. In other words, turning 100000000 into 100,000,000.

Of course, in some countries they’d look at 100,000,000 and ask why you have two decimal points. And in India they’d expect this to be formatted as 10,00,00,000, which is as confusing to me as 100,000,000 would be to an Indian. But regardless of your locale, the issue is the same.

There are Linux tools to transform, chop, parse, split, join, fold, bend, substitute, replace, and apply just about any contortion to a string of letters, numbers, and symbols. Text processing is one of Unix’s greatest strengths. Indeed, early on that was what Unix was mainly used for.

However, at first glance, I was unable to find any simple commifier for Linux. I have little snippets on how to do it in Python and Perl but there should be an easy way to do this on the command line. Right?

Wrong.

Some people have written extensive scripts to do this. One such script, stripped of comments and blank lines, still comes to 20 lines. 20 lines of interpreted shell code just to format a number!

Fortunately, we don’t need to do this. We can do it in a single line:

     $ echo 100000000 | sed ':a;s/\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\1\2,\3/g;ta'
     100,000,000

There, that’s simple enough, right? Let’s walk through that sed command letter by letter…wait, don’t go! I was just kidding. BTW, that only works for GNU sed – there are other options discussed in the sed FAQ.
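For the curious, here is a rough sketch of one of those other options: a variant along the lines of the one found in the well-known sed one-liner collections, which avoids the GNU-only `\|` and `\+`. The function name `commify_sed` is made up for illustration, and this simpler pattern assumes plain integers; it would happily add commas to the digits after a decimal point.

```shell
# Hypothetical helper: commify integers read from stdin, one per line,
# using only basic regular expressions (no GNU extensions).
commify_sed() {
    # Loop: keep inserting a comma before the last run of three digits
    # that still has a digit in front of it, until nothing changes.
    sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e ta
}

echo 100000000 | commify_sed
```

Each pass adds one comma, working from right to left, and `ta` jumps back to the `:a` label for as long as a substitution succeeded.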

But please don’t waste your time. Let me show you the simple way that’s small enough to easily commit to memory.

Check this out:

     $ printf "%'d\n" 100000000
     100,000,000

Nice! This works in sufficiently modern versions of bash and also ksh (93+ – if you’re still using ksh88 in 2022 you need counseling).

It’s the apostrophe after the % sign that tells printf to commify.
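Two caveats are worth sketching out. The apostrophe flag asks for the current locale's grouping, so in the plain C/POSIX locale you will typically get no separators at all. And printf takes its numbers from its arguments, not from standard input. The wrapper below is a minimal sketch that handles both; the name `commify` and the choice of `en_US.UTF-8` are assumptions for illustration (that locale has to be installed on your system), not anything built in.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: commify integers given as arguments,
# or read one per line from standard input when there are none.
commify() {
    # Assumption: the en_US.UTF-8 locale is installed. If it is not,
    # printf prints the digits without any separators.
    local loc="en_US.UTF-8" n
    if [ "$#" -gt 0 ]; then
        LC_ALL="$loc" printf "%'d\n" "$@"
    else
        while IFS= read -r n; do
            LC_ALL="$loc" printf "%'d\n" "$n"
        done
    fi
}

commify 100000000          # number passed as an argument
echo 100000000 | commify   # number piped in on stdin
```

Setting `LC_ALL` only on the printf call keeps the rest of the script in whatever locale it was already using.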

Any suggestion that I wrote this article so I could find it the next time I google how to do this is probably accurate.

raindog308

2 Comments

Jatin:
Figured I’d point this out – we’d write 100000000 in Indian system as 10,00,00,000 and not 1,00,00,00,00, and read it as 10 crore.
July 9, 2022 @ 2:06 pm

Dave:
This…
     echo 100000000 | printf "%'d\n" 100000000
…doesn’t actually work because printf is getting the number from the end, not the one being piped in. If you want to pipe it in you have to do it via xargs, like so.
     grep MemTotal /proc/meminfo | tr -s ' ' | cut -d ' ' -f 2 | xargs printf "%'d\n"
Any suggestion that I wrote this comment so I could find it the next time I google how to do this is probably accurate.
September 25, 2023 @ 4:08 pm