The roots of Unix go all the way back to the 1970s and Bell Labs. Originally intended for use at Bell and AT&T, Unix was licensed to outside parties starting in the late 1970s, and soon it was in use at UC Berkeley, Sun Microsystems, Microsoft and IBM. Over the years, the rights to Unix changed hands among Novell, Santa Cruz Operation and The Open Group. Today, the single biggest use of Unix is in Apple’s macOS.
Unix was originally intended as a platform for programmers developing software to run on it and on other systems. It began to catch on in academic circles and grew larger as users added their own tools to the system. Unix uses plain text for storing data, with a hierarchical file system and the capacity for treating devices and some kinds of interprocess communication as files. By the early 1980s, users began to regard Unix as an operating system that could be suitable for computers of all sizes.
Understanding Unix
Unix can be thought of as the link between the user and the computer, in the form of a set of programs. These programs allocate and manage the computer’s resources, with the shell translating commands from the user into requests that are understood by the “kernel,” the core of the operating system.
The system also runs on commands and utilities, which initiate everyday tasks for the system and are usually delivered through third-party software. Its information and data are all organized into files, which are further organized into directories. This tree-like design is called the filesystem.
WC for Linux
In Linux and Unix operating systems, the wc (word count) command is used to determine the number of newlines, words, bytes and characters in each file named in its arguments. Let’s break down these definitions a little bit, as they pertain to Linux and Unix:
- Newline: a control character that marks the end of a line of text and the beginning of a new one. On Linux and Unix it is the single line feed character (LF, written \n). Because wc counts newline characters, a final line with no trailing newline is not included in the line count.
- Byte: a sequence of a certain number of bits, which can be used as a unit of memory or storage. On modern systems a byte is eight bits.
- Word: a “word” is considered to be any non-empty sequence of characters delimited by white space.
Counts are always printed in the following order: newline, word, character, byte, maximum line length. These are the options that select each count:
wc -l: Prints the number of lines in a file
wc -w: Prints the number of words in a file
wc -c: Displays the count of bytes in a given file
wc -m: Prints the count of characters in a file
wc -L: Prints only the length of the longest line in a file
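To try the options above without touching a real system file, here is a minimal sketch that builds a small throwaway file and runs each option against it. The file name sample.txt and its contents are made up for illustration. Note that -L is not part of the POSIX standard; it is available in GNU wc and some other implementations, so treat that last line as an assumption about your system.

```shell
# Build a small, hypothetical sample file: three lines of plain ASCII text.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf 'one two\nthree four five\nsix\n' > "$sample"

wc -l "$sample"   # newline count: 3 lines
wc -w "$sample"   # word count: 6 words
wc -c "$sample"   # byte count: 28 bytes
wc -m "$sample"   # character count: equals the byte count for ASCII-only text
wc -L "$sample"   # longest line ("three four five" is 15 characters); not in POSIX
```

Each command prints its count followed by the file name. The exact spacing in front of the number varies between wc implementations.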
WC In Use
wc is handy whenever you need to know the number of characters, lines, bytes or words in a given file, or the length of its longest line. So, let’s take a look at how it’s actually done.
A basic command could be:
$ wc /etc/passwd
65 185 3667 /etc/passwd
In a case such as this, 65 is the line count, 185 the word count and 3667 the byte count. Now, should you just need to know the number of lines in a given file, add the -l option:
$ wc -l /etc/passwd
65 /etc/passwd
To determine the number of words in a file, add the -w option:
$ wc -w /etc/passwd
185 /etc/passwd
Using Pipes in Linux WC Commands
Pipelines, in Unix and Linux systems, can be thought of as a series of processes that are linked by standard streams, with the output of one process fed directly as input to the next process. The shell syntax used for pipelines lists a series of commands, each separated by a vertical bar, or “pipe” in Unix/Linux parlance. Pipes and pipelines are also used in other operating systems, such as Microsoft Windows, DOS, OS/2 and BeOS.
As a command, wc reads input from standard input (stdin) and writes output to standard output (stdout), so it can be used in Linux command pipelines.
For instance, here’s a command that shows the number of login sessions currently open on your Linux system:
who | wc -l
In this instance, the output of the “who” command is piped into the input of the “wc” command, which, in this case, is being used to tabulate the number of lines in the output of the “who” command. Along the same lines, this example shows how to use pipelines and WC to indicate the number of processes running on a Linux system at a given time:
ps -e | wc -l
This works the same way as the previous example. The “ps -e” command lists every process, one per line, and “wc -l” counts the lines of that output. Because ps prints a header line first, the count is one higher than the number of processes. In other words, piping is a useful way to connect programs, passing the output of one directly to the input of the next.
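Since the output of who and ps changes from machine to machine, here is a small sketch of the same idea with fixed input, so the result is predictable. The user names and terminal names fed in by printf are invented stand-ins for real command output.

```shell
# Hypothetical stand-in for the output of a command such as who:
# one line per login session.
sessions=$(printf 'alice tty1\nbob pts/0\ncarol pts/1\n' | wc -l)
echo "sessions: $sessions"

# A pipeline can have more than two stages. Here grep keeps only the
# lines that mention a pts terminal, and wc -l counts what is left.
pts_sessions=$(printf 'alice tty1\nbob pts/0\ncarol pts/1\n' | grep 'pts/' | wc -l)
echo "pts sessions: $pts_sessions"
```

The first pipeline counts three lines and the second counts two, because grep filters out the tty1 line before wc ever sees it.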
Redirecting From a File
Let’s look at this example:
user@bash: wc -l myoutput
8 myoutput
user@bash: wc -l < myoutput
8
user@bash:
Many programs accept a file name as a command line argument, and will then read and process the contents of that file. When wc is given the file as a command line argument, its output includes the name of the file that was processed. The subtle difference in this example is that when we redirect the contents of the file into wc, the file name isn’t printed, because data sent through redirection or a pipe arrives anonymously. Here, wc receives some content to process but can’t print a file name, since it has no knowledge of where the content came from. This mechanism is useful when you want only the number, without ancillary data such as the file name.
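One place this difference matters is in scripts, where you usually want the bare number. The following is a rough sketch, assuming a throwaway eight-line file created just for the demonstration. The variable names are illustrative.

```shell
# Create a hypothetical eight-line file to count.
tmpdir=$(mktemp -d)
file="$tmpdir/myoutput"
printf '1\n2\n3\n4\n5\n6\n7\n8\n' > "$file"

# File given as an argument: wc knows the name and prints it after the count.
with_name=$(wc -l "$file")
echo "argument:    $with_name"

# File redirected to standard input: wc sees only anonymous data,
# so it prints the count alone.
count_only=$(wc -l < "$file")
echo "redirection: $count_only"

# The bare number can be used directly in a numeric test.
if [ "$count_only" -gt 5 ]; then
  echo "more than five lines"
fi
```

With the argument form you would have to strip the file name off before doing arithmetic, which is why the redirected form is the more common choice inside scripts.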
Now, we combine two different forms of redirection into a single command, still using wc. The input comes from barry.txt and the output is saved to myoutput:
user@bash: wc -l < barry.txt > myoutput
user@bash: cat myoutput
7
user@bash:
Let’s Recap
So we’ve talked about how to use the wc command in Linux and Unix. Just to reiterate:
wc -l: prints the number of lines in a file
wc -w: prints the number of words in a file
wc -c: displays the count of the bytes in a file
wc -m: prints the count of characters in a file
wc -L: prints ONLY the length of the longest line in a file
To find more information and get help for the wc command, run “man wc” from the command line (on systems with GNU coreutils, “wc --help” also works). Remember that when more than one file name is given as an argument, the command prints one row per file, with four columns: lines, words, bytes and the file name. An extra final row, labelled with the keyword “total,” shows the sums of lines, words and bytes across all the files specified.
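Here is a short sketch of the multi-file behavior, using two made-up files (first.txt and second.txt) in a temporary directory. The last command pulls the number out of the “total” row, on the assumption that it is the final line of output, as it is in common wc implementations.

```shell
# Two hypothetical files: one with two lines, one with three.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first.txt"
printf 'c\nd\ne\n' > "$dir/second.txt"

# One row per file, then a final row labelled "total".
wc "$dir/first.txt" "$dir/second.txt"

# Keep just the line count from the final "total" row.
total_lines=$(wc -l "$dir/first.txt" "$dir/second.txt" | tail -n 1 | awk '{print $1}')
echo "total lines: $total_lines"
```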
The wc command can also be used to total the number of rows or records in CSV files, when used along with pipes. Here’s an example with five CSV files, where the objective is to find the number of records across all five. Here, we’ll pipe the output of the “cat” command to wc:
cat *.csv | wc -l
1866
Thus, 1866 lines are counted across all five files. Keep in mind that wc -l counts every line, so any header rows are included in that figure.
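If your CSV files start with a header row, the plain line count overstates the number of records by one per file. One possible way around that, sketched here with two invented files, is to skip the first line of each file with tail before counting. This assumes every file has exactly one header line and that no field contains an embedded newline.

```shell
# Two hypothetical CSV files, each with a header row.
dir=$(mktemp -d)
printf 'id,name\n1,ann\n2,bob\n' > "$dir/a.csv"
printf 'id,name\n3,cy\n' > "$dir/b.csv"

# Every line, headers included.
all_lines=$(cat "$dir"/*.csv | wc -l)
echo "all lines: $all_lines"

# Skip the first line of each file (tail -n +2 starts at line 2),
# then count what remains.
records=$(for f in "$dir"/*.csv; do tail -n +2 "$f"; done | wc -l)
echo "records:   $records"
```

With these two files the first count is 5 and the second is 3, the difference being the two header rows.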
wc can also be used to get a total of folders and files in a directory, when combined with the ls command. It involves passing the -1 (digit one) option to ls so that each folder or file is listed on its own line, then piping the result to wc -l (lowercase L) to count the lines:
ls -1 | wc -l
21
And we see a total of 21 folders and files in that directory.
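As a quick sanity check of the ls and wc combination, here is a minimal sketch using a temporary directory with invented contents. It also shows one caveat: plain ls skips names that begin with a dot, so hidden entries are only counted if you add the -A option. Names that contain a newline character would also throw the count off, which this sketch assumes you don’t have.

```shell
# Hypothetical directory: one folder, two regular files, one hidden file.
dir=$(mktemp -d)
mkdir "$dir/docs"
touch "$dir/notes.txt" "$dir/todo.txt" "$dir/.hidden"

# One entry per line, hidden entries skipped: docs, notes.txt, todo.txt.
visible=$(ls -1 "$dir" | wc -l)
echo "visible entries: $visible"

# -A also lists dot files (but not . and ..), so .hidden is counted too.
all_entries=$(ls -1A "$dir" | wc -l)
echo "all entries:     $all_entries"
```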
Learning your way around Unix and Linux is a lot like learning a new language, and the WC command is just another part of the syntax that you’ll need to grasp to do well with Unix.