CUT command


The cut command takes a vertical slice of a file, printing only the specified columns or fields. It's easiest to think of a column as just the nth character on each line. In other words, "column 5" consists of the fifth character of each line. A field is a run of characters set off by a delimiter, which is a TAB unless you specify your own delimiter.

Consider a slight variation on the company.data file we've been playing with in this section:

406378:Sales:Itorre:Jan
031762:Marketing:Nasium:Jim
636496:Research:Ancholie:Mel
396082:Sales:Jucacion:Ed


If you want to print just columns 1 to 6 of each line (the employee serial numbers), use the -c1-6 flag, as in this command:

cut -c1-6 company.data
406378
031762
636496
396082


If you want to print just columns 4 and 8 of each line (the fourth digit of the serial number and the first letter of the department), use the -c4,8 flag, as in this command:

cut -c4,8 company.data
3S
7M
4R
0S


And since this file obviously has fields delimited by colons, we can pick out just the last names by specifying the -d: and -f3 flags, like this:

cut -d: -f3 company.data
Itorre
Nasium
Ancholie
Jucacion
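
The same -d and -f flags accept more than one field at a time. As a small sketch (the temporary file is only an illustration and is not part of the original example), this recreates the sample data and prints the department and last name together:

```shell
# Recreate the sample data in a temporary file (path chosen only for illustration).
data=$(mktemp)
cat > "$data" <<'EOF'
406378:Sales:Itorre:Jan
031762:Marketing:Nasium:Jim
636496:Research:Ancholie:Mel
396082:Sales:Jucacion:Ed
EOF

# Fields 2 and 3; cut keeps the colon between the selected fields.
cut -d: -f2,3 "$data"
# Sales:Itorre
# Marketing:Nasium
# Research:Ancholie
# Sales:Jucacion
```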


It's often the case that you want to use a space as the delimiter. To do so, you must put the delimiter in single quotes, like this: -d' '

Also, when you want to cut from a starting point to the end of the line, just leave off the final field number, as shown in the example below.

Let's say this is your test.txt file:
abc def ghi jkl
mno pqr stu vwx
yz1 234 567 890

To cut only fields 2 through the end of the line, do this: cut -d' ' -f2- test.txt

And the results are:
def ghi jkl
pqr stu vwx
234 567 890
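
The open-ended range also works in the other direction. A minimal sketch, assuming the same three-line test.txt content written to a temporary file: leaving off the first number cuts from the start of the line up to the given field.

```shell
# Same content as test.txt, written to a temporary file for this sketch.
txt=$(mktemp)
printf '%s\n' 'abc def ghi jkl' 'mno pqr stu vwx' 'yz1 234 567 890' > "$txt"

# Fields from the start of the line through field 2.
cut -d' ' -f-2 "$txt"
# abc def
# mno pqr
# yz1 234
```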


Here is a summary of the most common flags for the cut command:

-c [n | n,m | n-m] Specify a single column, multiple columns (separated by a comma), or range of columns (separated by a dash).
-f [n | n,m | n-m] Specify a single field, multiple fields (separated by a comma), or range of fields (separated by a dash).
-d Specify the field delimiter (the default is TAB).
-s Suppress (don't print) lines not containing the delimiter. Used together with -f.
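
The effect of -s is easiest to see on input where one line lacks the delimiter. The sample lines here are invented for illustration:

```shell
# Three lines; the middle one contains no colon.
mixed=$(mktemp)
printf '%s\n' 'a:b:c' 'no delimiter here' 'x:y:z' > "$mixed"

# Without -s, a line that has no delimiter is printed whole.
cut -d: -f2 "$mixed"
# b
# no delimiter here
# y

# With -s, that line is dropped.
cut -s -d: -f2 "$mixed"
# b
# y
```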


If you think that you can do Linux system administration without the cut command, then you are absolutely right. However, mastering this fairly simple command-line tool will give you a great advantage in the efficiency of your work, at the user level as well as the administration level. Simply put, the cut command is one of the many text-filtering command-line tools that the Linux operating system has to offer. It filters standard input (STDIN) from another command, or an input file, and sends the filtered output to STDOUT.

2. Frequently used options

Without further ado, let's start by introducing the main and most commonly used cut command-line options.

    * -b, --bytes=LIST
      Selects the bytes of each input line named in LIST.

    * -c, --characters=LIST
      Selects the characters of each input line named in LIST.

    * -f, --fields=LIST
      Selects the fields of each input line named in LIST. The default field delimiter is TAB. This default can be overridden with the -d option.

    * -d, --delimiter=DELIMITER
      Specifies the character that separates fields. As mentioned previously, the default delimiter is TAB, and this option overrides that default.
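
The short and long spellings are interchangeable. The sketch below assumes GNU cut, since other implementations may accept only the short forms, and the sample line is made up for illustration:

```shell
# Sample line invented for illustration.
line='406378:Sales:Itorre:Jan'

# Short options.
printf '%s\n' "$line" | cut -d : -f 1,3

# Equivalent long options (assumes GNU cut).
printf '%s\n' "$line" | cut --delimiter=: --fields=1,3

# Both commands print: 406378:Itorre
```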



3. Using LIST

LIST can consist of a single byte, character or field number, or of a range of them. For example, to display only the second byte, LIST is the single number 2.

Therefore:

    * 2 will display only the second byte, character or field, counting from 1
    * 2-5 will display all bytes, characters or fields from the 2nd through the 5th
    * -3 will display all bytes, characters or fields from the 1st through the 3rd
    * 5- will display all bytes, characters or fields from the 5th to the end of the line
    * 1,3,6 will display only the 1st, 3rd and 6th byte, character or field
    * 1,3- will display the 1st and all bytes, characters or fields starting with the 3rd
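
The LIST forms above can be tried on a single line. This is a minimal sketch with an invented six-field line. It also shows that the order of numbers inside LIST does not reorder the output, because cut always prints the selected parts in input order:

```shell
# Six colon-separated fields, invented for illustration.
line='a:b:c:d:e:f'

printf '%s\n' "$line" | cut -d: -f2-4    # b:c:d
printf '%s\n' "$line" | cut -d: -f-3     # a:b:c
printf '%s\n' "$line" | cut -d: -f1,3-   # a:c:d:e:f

# Both of these print a:c, regardless of the order given in LIST.
printf '%s\n' "$line" | cut -d: -f1,3
printf '%s\n' "$line" | cut -d: -f3,1
```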

Let's see how this works in practice.


4. Cut by Character

The following examples are rather self-explanatory. We use cut's -c option to print only a specific range of characters from the cut.txt file.

$ echo cut-command > cut.txt
$ cut -c 2 cut.txt 
u
$ cut -c -3 cut.txt
cut
$ cut -c 2-5 cut.txt
ut-c
$ cut -c 5- cut.txt
command


5. Cut By Byte

The principle behind the -b (by byte) option is similar to the one described previously. In plain ASCII text a single character occupies exactly 1 byte, so running the previous commands with the -b option gives exactly the same results:

$ cut -b 2 cut.txt
u
$ cut -b -3 cut.txt
cut
$ cut -b 2-5 cut.txt
ut-c
$ cut -b 5- cut.txt
command


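The cut.txt file is a simple ASCII text file. The difference shows up only with multi-byte encodings such as UTF-8. Note that some cut implementations, including many builds of GNU cut, treat -c the same as -b, so the -c result shown here depends on your cut version and locale. For example: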

$ echo Ľuboš > cut.txt
$ file cut.txt 
cut.txt: UTF-8 Unicode text
$ cut -b 1-3 cut.txt 
Ľu
$ cut -c 1-3 cut.txt 
Ľub
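
To see where the byte positions fall, it helps to count the bytes. This sketch builds the same name from explicit octal escapes (an assumption made so that the result does not depend on how the script file itself is encoded):

```shell
# The octal escapes spell out the UTF-8 bytes of the name:
# \304\275 is the two-byte first letter, \305\241 the two-byte last letter.
name=$(printf '\304\275ubo\305\241')

# Five characters, but seven bytes.
printf '%s' "$name" | wc -c

# Byte 3 is the plain ASCII letter that follows the two-byte first letter.
printf '%s\n' "$name" | cut -b 3
# u
```

Because -b counts bytes, a range that ends in the middle of a multi-byte character would print an incomplete character.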


6. Cut by Field

As mentioned previously, the default field delimiter used by the cut command is TAB. For example, let's create a file in which the fields are separated by TAB.

Hint: If you struggle to insert a TAB on the command line, press ^V (CTRL + V) before you hit TAB.

$ echo "1        2       3" > cut.txt 
$ echo "4        5       6" >> cut.txt 
$ cat cut.txt 
1       2       3
4       5       6
$ cut -f2- cut.txt 
2       3
5       6


The example above printed only the 2nd and 3rd fields because the fields were separated by TAB, and TAB is what cut uses as its default delimiter. To make sure that you used TAB instead of SPACE, use the od command:

$ echo "1        2" > tab.txt
$ echo "1        2" > space.txt
$ od -a tab.txt 
0000000   1  ht   2  nl
0000004
$ od -a space.txt 
0000000   1  sp  sp  sp  sp  sp  sp  sp  sp   2  nl
0000013
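
Another way to be sure the file really contains TABs is to let printf write them. A minimal sketch, using a temporary file chosen only for illustration:

```shell
# printf expands \t to a real TAB, so nothing has to be typed with CTRL + V.
tabbed=$(mktemp)
printf '1\t2\t3\n4\t5\t6\n' > "$tabbed"

# TAB is the default delimiter, so no -d is needed.
cut -f2 "$tabbed"
# 2
# 5
```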


If we need to override the default behavior and instruct the cut command to use a different delimiter, the -d option comes in very handy.

$ echo 1-2-3-4 > cut.txt 
$ echo 5-6-7-8 >> cut.txt 
$ cat cut.txt 
1-2-3-4
5-6-7-8
$ cut -d - -f-2,4 cut.txt 
1-2-4
5-6-8

The classic example where we need the -d option is extracting the list of users on the current system from the /etc/passwd file:

$ cut -d : -f 1 /etc/passwd
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy
www-data
...
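
The same file can yield more than the user name. In /etc/passwd the seventh field is the login shell, so a sketch like the following pairs each user with a shell. The two sample lines are hypothetical and stand in for the real file so that the output does not depend on your system:

```shell
# Hypothetical passwd-style lines written to a temporary file.
passwd_sample=$(mktemp)
cat > "$passwd_sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
EOF

# Field 1 is the user name, field 7 the login shell.
cut -d : -f 1,7 "$passwd_sample"
# root:/bin/bash
# daemon:/usr/sbin/nologin
```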

It is worth mentioning that, to get uniform output, the delimiter must be used consistently on every line of the input. For example, it would be hard to use SPACE as the delimiter in the following example, because cut treats every single space as a delimiter, so a run of several spaces produces empty fields:

$ cat cut.txt 
cut command
w   command
awk command
wc  command
$ cut -d " " -f2 cut.txt 
command

command


In this case it would be much easier to use the awk command, or to use the sed command first to replace the run of spaces with a single delimiter such as ",":

$ sed 's/\s\+/,/' cut.txt | cut -d , -f2
command
command
command
command
$ awk '{ print $2; }' cut.txt 
command
command
command
command
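
A third option, sketched here as an alternative to the sed and awk commands above rather than something from the original examples, is to squeeze each run of spaces into one with tr -s before handing the line to cut:

```shell
# The same four lines as in cut.txt, written to a temporary file for this sketch.
spaced=$(mktemp)
printf '%s\n' 'cut command' 'w   command' 'awk command' 'wc  command' > "$spaced"

# tr -s ' ' collapses repeated spaces, so field 2 lines up on every row.
tr -s ' ' < "$spaced" | cut -d ' ' -f2
# command
# command
# command
# command
```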


7. Excluding data using complement

The cut command lets you select the data to include in its output. If you would rather name the data to exclude, the --complement option (a GNU extension) comes in very handy.

For example:

$ echo 12345678 > cut.txt 
$ cat cut.txt 
12345678
$ cut --complement -c -2,4,6- cut.txt 
35
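
The option works with fields as well as characters. This sketch assumes GNU cut and uses an invented sample line. It drops the second field and shows the equivalent selection written without --complement:

```shell
# Sample line invented for illustration.
line='406378:Sales:Itorre:Jan'

# Print everything except field 2 (--complement assumes GNU cut).
printf '%s\n' "$line" | cut --complement -d : -f 2

# The same result written as a plain inclusion list.
printf '%s\n' "$line" | cut -d : -f 1,3-

# Both commands print: 406378:Itorre:Jan
```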


8. Examples
Learning Linux cut command with examples
Linux command syntax        Linux command description

free | grep Mem | sed 's/\s\+/,/g' | cut -d , -f2

          Display total memory on the current system

cat /proc/cpuinfo | grep "name" | cut -d : -f2 | uniq

          Retrieve a CPU type

wget -q -O X http://ipchicken.com/
grep '^ \{8\}[0-9]' X | sed 's/\s\+/,/g' | cut -d , -f2

          Retrieve my external IP address


cut -d : -f 1 /etc/passwd

          Extract the list of users on the current system

ifconfig eth0 | grep HWaddr | cut -d " " -f 11

          Get the MAC address of a network interface

who | cut -d ' ' -f1

          List the users logged in to the current system

grep -w <n> /etc/services | cut -f 1 | uniq

          Find which service uses port <n>
