Friday, May 10, 2013

Cut Command


1. Introduction

If you think that you can do Linux system administration without the cut command, then you are absolutely right. However, mastering this fairly simple command line tool will give you a great advantage when it comes to the efficiency of your work at both the user and the administration level. Simply put, cut is one of the many text-filtering command line tools that the Linux operating system has to offer. It filters its input, taken either from standard input (STDIN, typically the output of another command) or from a file, and sends the filtered output to standard output (STDOUT).
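As a quick illustration of the two input sources, the sketch below feeds the same text to cut once through a pipe and once from a file. The temporary file created with mktemp is only an assumption made for this illustration, and the -c option used here is introduced in the sections that follow.

```shell
# Create a throwaway input file (its path comes from mktemp, it is not part of the article).
tmp=$(mktemp)
echo "cut-command" > "$tmp"

# Input taken from another command through a pipe (STDIN).
echo "cut-command" | cut -c 1-3

# Input taken from a file named on the command line.
cut -c 1-3 "$tmp"

# Clean up the temporary file.
rm -f "$tmp"
```

Both invocations print the same three characters, because cut does not care where its input comes from.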

2. Frequently used options

Without too much talk, let's start by introducing the main and most commonly used cut command line options.
  • -b, --bytes=LIST
    Cuts the input using the list of bytes specified by this option
  • -c, --characters=LIST
    Cuts the input using the list of characters specified by this option
  • -f, --fields=LIST
    Cuts the input using the list of fields specified by this option. The default field delimiter is TAB. This default can be overridden with the -d option.
  • -d, --delimiter=DELIMITER
    Specifies the delimiter that separates fields. As mentioned previously, the default delimiter is TAB; this option overrides it.

3. Using LIST

LIST in this case can consist of a single number or a range of bytes, characters or fields. For example, to display only the second byte, the list will include the single number 2.
Therefore:
  • 2 will display only the 2nd byte, character or field, counted from 1
  • 2-5 will display all bytes, characters or fields from the 2nd through the 5th
  • -3 will display all bytes, characters or fields up to and including the 3rd
  • 5- will display all bytes, characters or fields starting with the 5th
  • 1,3,6 will display only the 1st, 3rd and 6th byte, character or field
  • 1,3- will display the 1st and all bytes, characters or fields starting with the 3rd
Let's see how this works in practice.

4. Cut by Character

The following examples are rather self-explanatory. We use cut's -c option to print only a specific range of characters from the cut.txt file.
$ echo cut-command > cut.txt
$ cut -c 2 cut.txt 
u
$ cut -c -3 cut.txt
cut
$ cut -c 2-5 cut.txt
ut-c
$ cut -c 5- cut.txt
command
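The comma-separated forms of LIST work the same way. The short sketch below reuses the string cut-command from the example above; the variable name line is only chosen for this illustration.

```shell
line='cut-command'

# 1st, 3rd and 6th characters only.
printf '%s\n' "$line" | cut -c 1,3,6     # prints: cto

# 1st character plus everything from the 3rd onward.
printf '%s\n' "$line" | cut -c 1,3-      # prints: ct-command

# The order of the numbers in the list does not change the order of the output.
printf '%s\n' "$line" | cut -c 6,3,1     # prints: cto
```

The last command shows that cut always writes the selected pieces in the order in which they appear in the input, so it cannot be used to reorder characters or fields.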

5. Cut By Byte

The principle behind the -b (by byte) option is similar to the one described previously. In plain ASCII text every character occupies exactly 1 byte, and therefore the result of executing the previous commands with the -b option will be exactly the same:
$ cut -b 2 cut.txt
u
$ cut -b -3 cut.txt
cut
$ cut -b 2-5 cut.txt
ut-c
$ cut -b 5- cut.txt
command
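The cut.txt file is a simple ASCII text file. A difference only appears with files in a multi-byte encoding such as UTF-8, where a character like Ľ takes 2 bytes. Note that the -c result shown here requires a cut implementation with multi-byte support; some versions of GNU cut treat -c the same as -b. For example: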
$ echo Ľuboš > cut.txt
$ file cut.txt 
cut.txt: UTF-8 Unicode text
$ cut -b 1-3 cut.txt 
Ľu
$ cut -c 1-3 cut.txt 
Ľub
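To see where the difference comes from, it helps to look at the raw bytes. The sketch below assumes that the script itself is saved as UTF-8, so that printf emits the UTF-8 encoding of each character; od and wc are used only to inspect the data.

```shell
# Ľ occupies two bytes in UTF-8; od prints them in hexadecimal (c4 bd).
printf 'Ľ' | od -An -tx1

# The whole word has 5 characters but 7 bytes, since Ľ and š take 2 bytes each.
printf 'Ľuboš' | wc -c

# Selecting the first two bytes therefore returns the complete first character.
printf 'Ľuboš\n' | cut -b 1-2
```

A byte range that ends in the middle of a multi-byte character (for example -b 1) produces an incomplete character, which is why -b should be used with care on non-ASCII text.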

6. Cut By Field

As mentioned previously, the default field delimiter used by the cut command is TAB. For example, let's create a file in which the fields are separated by TAB.
Hint: In case you struggle to insert a TAB on the command line, press ^V (CTRL + V) before you hit TAB.
$ echo "1        2       3" > cut.txt 
$ echo "4        5       6" >> cut.txt 
$ cat cut.txt 
1       2       3
4       5       6
$ cut -f2- cut.txt 
2       3
5       6
The example above printed only the 2nd and 3rd columns because the fields were separated by TAB, which cut uses as its default delimiter. To make sure that you used TAB instead of SPACE, use the od command:
$ echo "1        2" > tab.txt
$ echo "1        2" > space.txt
$ od -a tab.txt 
0000000   1  ht   2  nl
0000004
$ od -a space.txt 
0000000   1  sp  sp  sp  sp  sp  sp  sp  sp   2  nl
0000013
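If typing a literal TAB is inconvenient, printf can generate it instead. This is only a sketch of an alternative to the CTRL + V approach, not something the examples above depend on.

```shell
# printf expands \t to a real TAB and \n to a newline,
# so the two-line, three-column input needs no special key sequence.
printf '1\t2\t3\n4\t5\t6\n' | cut -f 2-

# The same data piped through od confirms that the separators really are TABs (shown as ht).
printf '1\t2\n' | od -a
```

The first command prints the 2nd and 3rd columns of both lines, still separated by TAB.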
If we need to override the default behavior and instruct the cut command to use a different delimiter, the -d option becomes very handy.
$ echo 1-2-3-4 > cut.txt 
$ echo 5-6-7-8 >> cut.txt 
$ cat cut.txt 
1-2-3-4
5-6-7-8
$ cut -d - -f-2,4 cut.txt 
1-2-4
5-6-8
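One behaviour worth knowing when working with -d and -f: a line that does not contain the delimiter at all is printed unchanged. The sketch below uses made-up input to show this, together with the standard -s option, which suppresses such lines.

```shell
# Three input lines; the second one contains no "-" at all.
# Without -s the middle line is passed through unchanged.
printf '1-2-3\nno delimiter here\n4-5-6\n' | cut -d - -f 2

# With -s only lines that contain the delimiter are printed.
printf '1-2-3\nno delimiter here\n4-5-6\n' | cut -d - -f 2 -s
```

The first command prints three lines (2, the untouched middle line, and 5), the second prints only 2 and 5.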
The classic example where we need the -d option is extracting the list of users on the current system from the /etc/passwd file:
$ cut -d : -f 1 /etc/passwd
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy
www-data
...
It needs to be mentioned that to get uniform output the delimiter must be used consistently on every line of the input. For example, it would be hard to use SPACE as the delimiter in the following example:
$ cat cut.txt 
cut command
w   command
awk command
wc  command
$ cut -d " " -f2 cut.txt 
command

command

In this case it would be much easier to use the awk command, or to use the sed command to first replace multiple spaces with a single delimiter such as ",":
$ sed 's/\s\+/,/' cut.txt | cut -d , -f2
command
command
command
command
$ awk '{ print $2; }' cut.txt 
command
command
command
command
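Another possibility, sketched here as an alternative to the sed and awk commands above, is to squeeze every run of spaces into a single space with tr -s before handing the text to cut. The input is the same four lines as in cut.txt above, generated with printf so that the example is self-contained.

```shell
# tr -s ' ' collapses each run of spaces into a single space,
# so every line ends up with exactly one delimiter between the columns.
printf 'cut command\nw   command\nawk command\nwc  command\n' | tr -s ' ' | cut -d ' ' -f 2
```

Unlike the sed expression shown above, which replaces only the first run of whitespace on each line, tr -s squeezes every run, so this also works for input with more than two columns.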

7. Excluding data using complement

The cut command normally selects the data to include in its output. In case you need to specify the data to exclude instead, the --complement option may become very handy.
For example:
$ echo 12345678 > cut.txt 
$ cat cut.txt 
12345678
$ cut --complement -c -2,4,6- cut.txt 
35
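The same option also works with fields, which is convenient for dropping a single column. The comma-separated input below is made up for this illustration, and --complement is assumed to be available as in the example above (it is a GNU cut option).

```shell
# Drop the 2nd field and keep everything else.
printf 'a,b,c,d\n' | cut -d , --complement -f 2     # prints: a,c,d

# The equivalent selection written without --complement.
printf 'a,b,c,d\n' | cut -d , -f 1,3-               # prints: a,c,d
```

With only a few columns both forms are equally readable; --complement pays off when it is easier to name the columns to remove than the ones to keep.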

8. Examples

Learning Linux cut command with examples

Linux command syntax, followed by the command description:
  • free | grep Mem | sed 's/\s\+/,/g' | cut -d , -f2
    Display total memory on the current system
  • cat /proc/cpuinfo | grep "name" | cut -d : -f2 | uniq
    Retrieve the CPU type
  • wget -q -O X http://ipchicken.com/
    grep '^ \{8\}[0-9]' X | sed 's/\s\+/,/g' | cut -d , -f2
    Retrieve my external IP address
  • cut -d : -f 1 /etc/passwd
    Extract the list of users on the current system
  • ifconfig eth0 | grep HWaddr | cut -d " " -f 11
    Get the MAC address of my network interface
  • who | cut -d " " -f1
    List users logged in to the current system
  • grep -w <n> /etc/services | cut -f 1 | uniq
    Show which service is using port <n>
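Most of the commands above depend on the machine they run on. For experimenting, the sketch below uses two made-up lines in /etc/passwd format (hypothetical sample data, not taken from a real system) and selects more than one field at once.

```shell
# Hypothetical sample in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# Field 1 is the login name, field 7 the login shell.
printf '%s\n' "$sample" | cut -d : -f 1,7
```

When several fields are selected, cut joins them in the output with the same delimiter that was given with -d, so each output line here has the form name:shell.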
