Title: Linux Command Line for Beginners
Description: Linux command-line notes for beginners; you can learn the Linux command line very easily.

Document Preview

Extracts from the notes are shown below.


The Linux Command Line
Second Internet Edition

William E. Shotts, Jr.

A LinuxCommand.org Book

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.

Linux® is the registered trademark of Linus Torvalds.

This book is part of the LinuxCommand.org project. You may contact the LinuxCommand.org project at LinuxCommand.org.

No Starch Press also offers this book in electronic formats for most popular e-readers: http://nostarch...htm
Release History

(The version/date table is not legible in this extract.)
Table of Contents

Introduction
What This Book Is About
What's In This Book
Prerequisites
Acknowledgments
What's New In The Second Internet Edition
Colophon
1 – What Is The Shell?
Your First Keystrokes
Cursor Movement
Try Some Simple Commands
The Console Behind The Curtain
Further Reading
Understanding The File System Tree
Listing The Contents Of A Directory
Absolute Pathnames
Some Helpful Shortcuts
Summing Up
More Fun With ls
A Longer Look At Long Format
Viewing File Contents With less
Less Is More
Symbolic Links
Summing Up
4 – Manipulating Files And Directories
Character Ranges
mkdir – Create Directories
Useful Options And Examples
Useful Options And Examples
Useful Options And Examples
ln – Create Links
Symbolic Links
Creating Directories
Moving And Renaming Files
Creating Symbolic Links
Creating Symlinks With The GUI
Further Reading
What Exactly Are Commands?
type – Display A Command's Type
Getting A Command's Documentation
--help – Display Usage Information
apropos – Display Appropriate Commands
The Most Brutal Man Page Of Them All
README And Other Program Documentation Files
Summing Up
6 – Redirection
Redirecting Standard Output
Redirecting Standard Output And Standard Error To One File
/dev/null In Unix Culture
cat – Concatenate Files
The Difference Between > and |
uniq – Report Or Omit Repeated Lines
grep – Print Lines Matching A Pattern
tee – Read From Stdin And Output To Stdout And Files
Linux Is About Imagination
Expansion
Pathname Expansion Of Hidden Files
Arithmetic Expansion
Parameter Expansion
Quoting
Single Quotes
Backslash Escape Sequences
Further Reading
Command Line Editing
Modifying Text
The Meta Key
Programmable Completion
Searching History
script
Further Reading
Owners, Group Members, And Everybody Else
chmod – Change File Mode
Setting File Mode With The GUI
Some Special Permissions
su – Run A Shell With Substitute User And Group IDs
Ubuntu And sudo
chgrp – Change Group Ownership
Changing Your Password
Further Reading
How A Process Works
Viewing Processes Dynamically With top
Interrupting A Process
Returning A Process To The Foreground
Signals
Sending Signals To Multiple Processes With killall
Summing Up
11 – The Environment
Examining The Environment
How Is The Environment Established?
Modifying The Environment
Text Editors
Why Comments Are Important
Summing Up
12 – A Gentle Introduction To vi
A Little Background
Compatibility Mode
Entering Insert Mode
Moving The Cursor Around
Appending Text
Deleting Text
Joining Lines
Searching Within A Line
Global Search-And-Replace
Switching Between Files
Copying Content From One File Into Another
Saving Our Work
Further Reading
Anatomy Of A Prompt
Adding Color
Moving The Cursor
Summing Up
Part 3 – Common Tasks And Essential Tools
Packaging Systems
Package Files
Dependencies
Common Package Management Tasks
Installing A Package From A Repository
Removing A Package
Upgrading A Package From A Package File
Determining If A Package Is Installed
Finding Which Package Installed A File
The Linux Software Installation Myth
15 – Storage Media
Viewing A List Of Mounted File Systems
Determining Device Names
Manipulating Partitions With fdisk
Testing And Repairing File Systems
Formatting Floppy Disks
Creating CD-ROM Images
Creating An Image From A Collection Of Files
Writing CD-ROM Images
Blanking A Re-Writable CD-ROM
Summing Up
Extra Credit
Examining And Monitoring A Network
traceroute
Transporting Files Over A Network
lftp – A Better ftp
Secure Communication With Remote Hosts
Tunneling With SSH
An SSH Client For Windows?
Further Reading
locate – Find Files The Easy Way
find – Find Files The Hard Way
Operators
User-Defined Actions
xargs
A Return To The Playground
Summing Up
18 – Archiving And Backup
gzip
Don't Be Compressive Compulsive
tar
Synchronizing Files And Directories
Summing Up
19 – Regular Expressions
grep
The Any Character
A Crossword Puzzle Helper
Negation
POSIX Character Classes
POSIX Basic Vs.
POSIX
Quantifiers
* – Match An Element Zero Or More Times
{ } – Match An Element A Specific Number Of Times
Validating A Phone List With grep
Searching For Files With locate
Summing Up
20 – Text Processing
Documents
Email
Program Source Code
cat
Unix Text
uniq
cut
paste
Comparing Text
diff
Editing On The Fly
ROT13: The Not-So-Secret Decoder Ring
People Who Like sed Also Like
Summing Up
Extra Credit
Simple Formatting Tools
fold – Wrap Each Line To A Specified Length
pr – Format Text For Printing
Document Formatting Systems
Summing Up
22 – Printing
Printing In The Dim Times
Graphical Printers
Preparing Files For Printing
Sending A Print Job To A Printer
lp – Print Files (System V Style)
Monitoring And Controlling Print Jobs
lpq – Display Printer Queue Status
Summing Up
23 – Compiling Programs
Are All Programs Compiled?
Obtaining The Source Code
Building The Program
Summing Up
Part 4 – Writing Shell Scripts
What Are Shell Scripts?
Script File Format
Script File Location
More Formatting Tricks
Indentation And Line-Continuation
Summing Up
25 – Starting A Project
Second Stage: Adding A Little Data
Assigning Values To Variables And Constants
Summing Up
26 – Top-Down Design
Local Variables
Shell Functions In Your .bashrc
Summing Up
27 – Flow Control: Branching With if
Exit Status
File Expressions
Integer Expressions
(( )) – Designed For Integers
Portability Is The Hobgoblin Of Little Minds
Summing Up
28 – Reading Keyboard Input
Options
You Can't Pipe read
Menus
Extra Credit
29 – Flow Control: Looping With while / until
while
until
Summing Up
30 – Troubleshooting
Missing Quotes
Unanticipated Expansions
Defensive Programming
Design Is A Function Of Time
Test Cases
Finding The Problem Area
Examining Values During Execution
Further Reading
case
Performing Multiple Actions
Further Reading
Accessing The Command Line
shift – Getting Access To Many Arguments
Using Positional Parameters With Shell Functions
A More Complete Application
Further Reading
for: Traditional Shell Form
for: C Language Form
Further Reading
Parameter Expansion
Expansions To Manage Empty Variables
String Operations
Arithmetic Evaluation And Expansion
Unary Operators
Assignment
Logic
Using bc
Summing Up
Further Reading
What Are Arrays?
Assigning Values To An Array
Array Operations
Determining The Number Of Array Elements
Adding Elements To The End Of An Array
Deleting An Array
Summing Up
36 – Exotica
Process Substitution
Temporary Files
wait
Setting Up A Named Pipe
Summing Up
Index

No, not the story of how, in 1991, Linus Torvalds wrote the first version of the Linux kernel. Nor am I going to tell you the story of how, some years earlier, Richard Stallman began the GNU Project to create a free Unix-like operating system.

No, I want to tell you the story of how you can take back control of your computer.

The invention of the microprocessor had made it possible for ordinary people like you and me to actually own a computer. ... Let's just say, you couldn't get much done. ...

Computers are everywhere, from tiny wristwatches to giant data centers to everything in between. ... This has created a wondrous new age of personal empowerment and creative freedom, but over the last couple of decades something else has been happening. ...

Fortunately, people from all over the world are doing something about it. ... They are building Linux. ...

Freedom is the power to decide what your computer does, and the only way to have this freedom is to know what your computer is doing.


Why Use The Command Line?
Have you ever noticed in the movies when the "super hacker" — you know, the guy who can break into the ultra-secure military computer in under thirty seconds — sits down at the computer, he never touches a mouse? It's because movie makers realize that we, as human beings, instinctively know the only way to really get anything done on a computer is by typing on a keyboard!

Most computer users today are only familiar with the graphical user interface (GUI) and have been taught by vendors and pundits that the command line interface (CLI) is a terrifying thing of the past. ... It's been said that "graphical user interfaces make easy tasks easy, while command line interfaces make difficult tasks possible," and this is still very true today. ... Unix came into prominence during the early 1980s (although it was first developed a decade earlier), before the widespread adoption of the graphical user interface and, as a result, developed an extensive command line interface instead.


What This Book Is About
This book is a broad overview of "living" on the Linux command line. ... How does it all work? What can it do? What's the best way to use it?

This is not a book about Linux system administration. ... It will, however, prepare the reader for additional study by providing a solid foundation in the use of the command line, an essential tool for any serious system administration task. ...

Many other books try to broaden their appeal by including other platforms such as generic Unix and OS X. ... This book, on the other hand, only covers contemporary Linux distributions.


Who Should Read This Book
This book is for new Linux users who have migrated from other platforms. ... Perhaps your boss has told you to administer a Linux server, or maybe you're just a desktop user who is tired of all the security problems and wants to give Linux a try. ... All are welcome here. ...

Learning the command line is challenging and takes real effort. ... The average Linux system has literally thousands of programs you can employ on the command line. ...

On the other hand, learning the Linux command line is extremely rewarding. ... You don't know what real power is — yet. ... The skills learned today will still be useful ten years from now.

It is also assumed that you have no programming experience, but not to worry, we'll start you down that path as well. ...

Many authors treat this material in a "systematic" fashion, which makes sense from a writer's perspective but can be very confusing to new users. ... Along the way, we'll go on a few side trips to help you understand why certain things work the way they do and how they got that way. ... I might throw in a rant or two, as well.




Part 2 – Configuration And The Environment covers editing configuration files that control the computer's operation from the command line. ...

Unix-like operating systems, such as Linux, contain many "classic" command line programs that are used to perform powerful operations on data. ...

By learning shell programming, you will become familiar with concepts that can be applied to many other programming languages. ...

It isn't written as a reference work; it's really more like a story with a beginning, middle, and an end.
You can get this in one of two ways:

1. Install Linux on your computer. ... It doesn't matter which distribution you choose, though most people today start out with either Ubuntu, Fedora, or OpenSUSE. ... Installing a modern Linux distribution can be ridiculously easy or ridiculously difficult depending on your hardware. ... Avoid laptops and wireless networks if at all possible, as these are often more difficult to get working.

2. Use a "Live CD." ... Just go into your BIOS setup and set your computer to "Boot from CD-ROM," insert the live CD, and reboot. ... The disadvantage of using a live CD is that it may be very slow compared to having Linux installed on your hard drive.

Regardless of how you install Linux, you will need to have occasional superuser (i.e., administrative) privileges to carry out the lessons in this book. ...

Most of the material in this book is "hands on," so sit down and get typing!

Why I Don't Call It "GNU/Linux"
In some quarters, it's politically correct to call the Linux operating system the "GNU/Linux operating system." ... Technically speaking, Linux is the name of the operating system's kernel, nothing more. ...

Enter Richard Stallman, the genius-philosopher who founded the Free Software movement, started the Free Software Foundation, formed the GNU Project, wrote the first version of the GNU C Compiler (gcc), created the GNU General Public License (the GPL), etc., etc., etc. ... While the GNU Project predates the Linux kernel, and the project's contributions are extremely deserving of recognition, placing them in the name is unfair to everyone else who made significant contributions. ...

In popular usage, "Linux" refers to the kernel and all the other free and open source software found in the typical Linux distribution; that is, the entire Linux ecosystem, not just the GNU components. ... I have chosen to use the popular format. ... I won't mind.

John C. Dvorak ... In an episode of his video podcast, "Cranky Geeks," Mr. Dvorak ... "Write 200 words a day and in a year, you have a novel." ...

Dmitri Popov wrote an article in Free Software Magazine titled "Creating a book template with Writer," which inspired me to use OpenOffice.org Writer. ... As it turned out, it worked wonderfully. ...

Jesse Becker, Tomasz Chrzczonowicz, Michael Levin, and Spence Miner also tested and reviewed portions of the text. ... Shotts contributed a lot of hours, polishing my so-called English by editing the text. ...

... of LinuxCommand.org, who have sent me so many kind emails. ... If you find a technical error, drop me a line at: bshotts@users.sourceforge.net. Your changes and suggestions may get into future releases. ...

In particular, bash version 4. ... The chapter numbers in the Second Internet Edition now align with those in the No Starch Press edition.

Special thanks go out to the following individuals who provided valuable feedback on the
first edition: Adrian Arpidez, Hu Bo, Heriberto Cantú, Joshua Escamilla, Bruce Fowler,
Ma Jun, Seth King, Mike O'Donnell, Parviz Rasoulipour, Gabriel Stutzman, and Christian Wuethrich
...
http://www.gnu.org/gnu/gnu-linux-faq.html
... OpenOffice.org Writer in Liberation Serif and Sans fonts on a Dell Inspiron 530N, factory configured with Ubuntu 8. ... The PDF version of the text was generated directly by OpenOffice.org Writer. ... The Second Internet Edition was produced on the same computer using LibreOffice Writer on Ubuntu 12.

Part 1 – Learning The Shell

1 – What Is The Shell?

When we speak of the command line, we are really referring to the shell. ... Almost all Linux distributions supply a shell program from the GNU Project called bash.

Terminal Emulators
When using a graphical user interface, we need another program called a terminal emulator to interact with the shell. ... KDE uses konsole and GNOME uses gnome-terminal, though it's likely called simply "terminal" on our menu. ...

You will probably develop a preference for one or another based on the number of bells and whistles it has. ...

Launch the terminal emulator! Once it comes up, we should see something like this:
[me@linuxbox ~]$

This is called a shell prompt, and it will appear whenever the shell is ready to accept input. ...

If the last character of the prompt is a pound sign ("#") rather than a dollar sign, the terminal session has superuser privileges. ...
Assuming that things are good so far, let's try some typing. ... This is called command history. ... Press the down-arrow key and the previous command disappears. ...

Now try the left and right-arrow keys.


A Few Words About Mice And Focus
While the shell is all about the keyboard, you can also use a mouse with your terminal emulator. ... If you highlight some text by holding down the left mouse button and dragging the mouse over it (or double-clicking on a word), it is copied into a buffer maintained by X. ... Try it. ... They don't work.


Your graphical desktop environment (most likely KDE or GNOME), in an effort to behave like Windows, probably has its focus policy set to "click to focus." ... This is contrary to the traditional X behavior of "focus follows mouse," which means that a window gets focus just by passing the mouse over it. ... Setting the focus policy to "focus follows mouse" will make the copy-and-paste technique even more useful. ... I think if you give it a chance you will prefer it.


Try Some Simple Commands
Now that we have learned to type, let's try a few simple commands. ...

[me@linuxbox ~]$ date

This command displays the current time and date.

[me@linuxbox ~]$ cal
    October 2007
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

To see the current amount of free space on your disk drives, enter df:

[me@linuxbox ~]$ df
Filesystem      1K-blocks     Used Available Use% Mounted on
/dev/sda2        15115452  5012392   9949716  34% /
/dev/sda5        59631908 26545424  30008432  47% /home
/dev/sda1          147764    17370    122765  13% /boot
tmpfs              256856        0    256856   0% /dev/shm
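As an aside (the option is not introduced at this point in the book, but is widely supported), most df implementations also accept -h to report the same figures in human-readable units:

```shell
# Sketch: report free space on the root file system in units like "15G".
df -h /
```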

Likewise, to display the amount of free memory, enter the free command. ...

Called virtual terminals or virtual consoles, these sessions can be accessed on most Linux distributions by pressing Ctrl-Alt-F1 through Ctrl-Alt-F6. ... To switch from one virtual console to another, press Alt and F1-F6.


Summing Up
As we begin our journey, we are introduced to the shell and see the command line for the first time and learn how to start and end a terminal session. ... That wasn't so scary, was it?

Further Reading

To learn more about Steve Bourne, father of the Bourne Shell, see this Wikipedia article:
http://en.wikipedia.org/wiki/Steve_Bourne

Here is an article about the concept of shells in computing:
http://en.wikipedia.org/wiki/Shell_(computing)

2 – Navigation

The first thing we need to learn (besides just typing) is how to navigate the file system on our Linux system. ... This means that they are organized in a tree-like pattern of directories (sometimes called folders in other systems), which may contain files and other directories. ... The root directory contains files and subdirectories, which contain more files and subdirectories, and so on. ...

Storage devices are attached (or more correctly, mounted) at various points on the tree according to the whims of the system administrator, the person (or persons) responsible for the maintenance of the system. ... Notice that the tree is usually shown upended, that is, with the root at the top and the various branches descending below.
Imagine that the file system is a maze shaped like an upside-down tree and we are able to stand in the middle of it. ...

Figure 1: File system tree as shown by a graphical file manager

The directory we are standing in is called the current working directory. ... To display the current working directory, we use the pwd (print working directory) command:

[me@linuxbox ~]$ pwd
/home/me

When we first log in to our system (or start a terminal emulator session) our current working directory is set to our home directory.


Listing The Contents Of A Directory
To list the files and directories in the current working directory, we use the ls command. ... We'll spend more time with ls in the next chapter. ...

To do this, type cd followed by the pathname of the desired working directory. ... Pathnames can be specified in one of two different ways: as absolute pathnames or as relative pathnames.


Absolute Pathnames
An absolute pathname begins with the root directory and follows the tree branch by branch until the path to the desired directory or file is completed. ... The pathname of the directory is /usr/bin. ...

[me@linuxbox ~]$ cd /usr/bin
[me@linuxbox bin]$ pwd
/usr/bin
[me@linuxbox bin]$ ls
...

Now we can see that we have changed the current working directory to /usr/bin and that it is full of files.


Relative Pathnames
Where an absolute pathname starts from the root directory and leads to its destination, a relative pathname starts from the working directory. ... These special symbols are "." (dot) and ".." (dot dot). ...

The "." symbol refers to the working directory and the ".." symbol refers to the working directory's parent directory. ... Here is how it works. ... We could do that two different ways.

[me@linuxbox usr]$ pwd
/usr

Two different methods with identical results. ... Either using an absolute pathname:

[me@linuxbox usr]$ cd /usr/bin
[me@linuxbox bin]$ pwd
/usr/bin

Or, with a relative pathname:

[me@linuxbox usr]$ cd ./bin
[me@linuxbox bin]$ pwd
/usr/bin

In almost all cases, you can omit the "./". ... Typing:

[me@linuxbox usr]$ cd bin

does the same thing.
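The same round trip can be scripted. This sketch (not from the book) uses a throwaway directory built with mktemp so it can run anywhere without touching the real /usr:

```shell
# Sketch: absolute vs. relative pathnames in a scratch directory.
scratch=$(mktemp -d)            # throwaway playground
mkdir -p "$scratch/usr/bin"

cd "$scratch/usr/bin"           # absolute pathname
pwd

cd ../..                        # relative: up two levels, back to $scratch
cd usr/bin                      # relative: down again (the "./" is implied)
pwd
```

Both pwd commands print the same directory, confirming the two methods are equivalent.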


Some Helpful Shortcuts
In Table 2-1 we see some useful ways the current working directory can be quickly changed.

Table 2-1: cd Shortcuts
Shortcut         Result
cd -             Changes the working directory to the previous working directory.
cd ~user_name    Changes the working directory to the home directory of user_name. For example, cd ~bob will change the directory to the home directory of user "bob."
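The cd - shortcut is easy to see in action. This sketch (not from the book) hops between two scratch directories:

```shell
# Sketch: "cd -" returns to the previous working directory.
first=$(mktemp -d)
second=$(mktemp -d)

cd "$first"
cd "$second"
cd -            # back to $first; cd also prints the directory it switched to
pwd
```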
1. Filenames that begin with a period character are hidden. ... When your account was created, several hidden files were placed in your home directory to configure things for your account. ... In addition, some applications place their configuration and settings files in your home directory as hidden files. ...

2. Filenames and commands in Linux, like Unix, are case sensitive. ...

3. You may name files any way you like. ... Although Unix-like operating systems don't use file extensions to determine the contents/purpose of files, some application programs do. ...

4. Though Linux supports long filenames which may contain embedded spaces and punctuation characters, limit the punctuation characters in the names of files you create to period, dash, and underscore. ... If you want to represent spaces between words in a filename, use underscore characters.
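The hidden-file convention is easy to verify for yourself. This sketch (not from the book) works in a scratch directory:

```shell
# Sketch: files whose names start with "." are hidden from a plain ls.
scratch=$(mktemp -d)
cd "$scratch"
touch visible .hidden

ls              # lists only "visible"
ls -a           # also lists ".hidden" (plus "." and "..")
```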


Summing Up
In this chapter we saw how the shell treats the directory structure of the system. ... In the next chapter we will use this knowledge to go on a tour of a modern Linux system. ...

Before we start however, we're going to learn some more commands that will be useful along the way:

ls – List directory contents
file – Determine file type
less – View file contents

More Fun With ls
The ls command is probably the most used command, and for good reason. ... As we have seen, we can simply enter ls to see a list of files and subdirectories contained in the current working directory:

[me@linuxbox ~]$ ls
Desktop  Documents  Music  Pictures  Public  Templates  Videos

Besides the current working directory, we can specify the directory to list, like so:

[me@linuxbox ~]$ ls /usr
bin  etc  games  include  kerberos  lib  libexec  local  sbin  share  src  tmp

Or even specify multiple directories.

Options And Arguments
This brings us to a very important point about how most commands work. ... So most commands look kind of like this:

command -options arguments

Most commands use options consisting of a single character preceded by a dash, for example, "-l", but many commands, including those from the GNU Project, also support long options, consisting of a word preceded by two dashes. ... For example:

[me@linuxbox ~]$ ls -lt

In this example, the ls command is given two options, the "l" option to produce long format output, and the "t" option to sort the result by the file's modification time. ...

The ls command has a large number of possible options.
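The effect of -t (and of reversing it) can be verified directly. This sketch (not from the book) builds two files with different timestamps:

```shell
# Sketch: -t sorts by modification time, newest first; adding -r reverses it.
scratch=$(mktemp -d)
cd "$scratch"
touch older
sleep 1
touch newer

ls -t           # "newer" is listed first
ls -tr          # reversed: "older" is listed first
```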

Table 3-1: Common ls Options

-a, --all
    List all files, even those with names that begin with a period, which are normally not listed (i.e., hidden).

-A, --almost-all
    Like the -a option above except it does not list . (current directory) and .. (parent directory).

-d, --directory
    Ordinarily, if a directory is specified, ls will list the contents of the directory, not the directory itself. Use this option in conjunction with the -l option to see details about the directory rather than its contents.

-F, --classify
    This option will append an indicator character to the end of each listed name. For example, a "/" if the name is a directory.

-l
    Display results in long format.

-r, --reverse
    Display the results in reverse order.

-S
    Sort results by file size.

-t
    Sort by modification time.

3 – Exploring The System

A Longer Look At Long Format
As we saw before, the "-l" option causes ls to display its results in long format. ... Here is the Examples directory from an Ubuntu system:

-rw-r--r-- 1 root root 3576296 2007-04-03 11:05 Experience ubuntu.ogg
-rw-r--r-- 1 root root   47584 2007-04-03 11:05 logo-Edubuntu.png
-rw-r--r-- 1 root root   34391 2007-04-03 11:05 logo-Ubuntu.png
-rw-r--r-- 1 root root  159744 2007-04-03 11:05 oo-derivatives.doc
-rw-r--r-- 1 root root   98816 2007-04-03 11:05 oo-trig.xls
-rw-r--r-- 1 root root  358374 2007-04-03 11:05 ubuntu Sax.ogg
Let's examine the fields of one of the listing lines:

-rw-r--r--
    Access rights to the file. The first character indicates the type of file. ... The next three characters are the access rights for the file's owner, the next three are for members of the file's group, and the final three are for everyone else.

1
    File's number of hard links. See the discussion of links later in this chapter.

root
    The username of the file's owner.

root
    The name of the group which owns the file.

2007-04-03 11:05
    Date and time of the file's last modification.

oo-cd-cover.odf
    Name of the file.


Determining A File's Type With file
As we explore the system it will be useful to know what files contain. ... As we discussed earlier, filenames in Linux are not required to reflect a file's contents. ... While a filename like "picture.jpg" would normally be expected to contain a JPEG compressed image, it is not required to in Linux. ...

For example:

[me@linuxbox ~]$ file picture.jpg
picture.jpg: JPEG image data, JFIF standard 1.01

... In fact, one of the common ideas in Unix-like operating systems such as Linux is that "everything is a file." ...

While many of the files on your system are familiar, for example MP3 and JPEG, there are many kinds that are a little less obvious and a few that are quite strange. ... Throughout our Linux system, there are many files that contain human-readable text.
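The point that file inspects content rather than the filename can be demonstrated with a text file in disguise. This sketch (not from the book) assumes the file utility is installed, which it usually is:

```shell
# Sketch: file examines content, not the filename extension.
scratch=$(mktemp -d)
printf 'hello, world\n' > "$scratch/photo.jpg"   # a text file in disguise

file "$scratch/photo.jpg"   # reports ASCII text, despite the .jpg name
```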


What Is "Text"?
There are many ways to represent information on a computer. ... Computers, after all, only understand numbers and all data is converted to numeric representation. ...

One of the earliest and simplest is called ASCII text. ... This is a simple encoding scheme that was first used on Teletype machines to map keyboard characters to numbers. ... It is very compact. ... It is important to understand that text only contains a simple mapping of characters to numbers. It is not the same as a document created by a word processor such as OpenOffice.org Writer. ...

Plain ASCII text files contain only the characters themselves and a few rudimentary control codes like tabs, carriage returns and line feeds. ... Even Windows recognizes the importance of this format. ... The well-known NOTEPAD.EXE program is an editor for plain ASCII text files. ... In addition, many of the actual programs that the system uses (called scripts) are stored in this format.

The less command is used like this:

less filename

Once started, the less program allows you to scroll forward and backward through a text file. ... If the file is longer than one page, we can scroll up and down. ...

The table below lists the most common keyboard commands used by less. ...

Less Is More
The name "less" is a play on the phrase "less is more" — a motto of modernist architects and designers. ... Whereas the more program could only page forward, the less program allows paging both forward and backward and has many other features as well.
The design is actually specified in a published standard called the Linux Filesystem Hierarchy Standard. ...

Next, we are going to wander around the file system ourselves to see what makes our Linux system tick. ... One of the things we will discover is that many of the interesting files are in plain human-readable text. ... As we go, try the following:

1. cd into a given directory
2. List the directory contents with ls -l
3. If you see an interesting file, determine its contents with file
4. If it looks like it might be text, try viewing it with less

As we wander around, don't be afraid to look at stuff. ... That's the system administrator's job! If a command complains about something, just move on to something else. ... The system is ours to explore. ... Feel free to try more!
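One pass through the tour steps might look like this sketch (not from the book), using /etc/hosts as the "interesting file" since it exists on virtually every system; head stands in for the interactive less:

```shell
# Sketch of the four tour steps, applied to /etc.
cd /etc                   # 1. cd into a given directory
ls -l | head -n 5         # 2. list the contents (first few entries shown)
file hosts                # 3. determine a file's contents
head -n 3 hosts           # 4. view it (less is interactive; head gives a taste)
```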
Table 3-4: Directories Found On Linux Systems

/
    The root directory. ...

/bin
    Contains binaries (programs) that must be present for the system to boot and run.

/boot
    ... Interesting files: /boot/grub/grub.conf or menu.lst, which are used to configure the boot loader.

/dev
    ... "Everything is a file" also applies to devices.

/etc
    The /etc directory contains all of the system-wide configuration files. ... Everything in this directory should be readable text. Interesting files: /etc/fstab, a table of storage devices and their associated mount points.

/home
    In normal configurations, each user is given a directory in /home. ... This limitation protects the system from errant user activity.

/lib
    Contains shared library files used by the core system programs. These are similar to DLLs in Windows.

/lost+found
    ... It is used in the case of a partial recovery from a file system corruption event.

/media
    On modern Linux systems the /media directory will contain the mount points for removable media such as USB drives, CD-ROMs, etc.

/mnt
    On older Linux systems, the /mnt directory contains mount points for removable devices that have been mounted manually.

/opt
    This is mainly used to hold commercial software products that may be installed on your system.

/proc
    ... It's not a real file system in the sense of files stored on your hard drive. ... The "files" it contains are peepholes into the kernel itself.

/root
    This is the home directory for the root account.

/sbin
    ... These are programs that perform vital system tasks that are generally reserved for the superuser.

/tmp
    ... Some configurations cause this directory to be emptied each time the system is rebooted.

/usr
    ... It contains all the programs and support files used by regular users.

/usr/bin
    ... It is not uncommon for this directory to hold thousands of programs.

/usr/local
    The /usr/local tree is where programs that are not included with your distribution but are intended for system-wide use are installed. ... On a newly installed Linux system, this tree exists, but it will be empty until the system administrator puts something in it.

/usr/share
    /usr/share contains all the shared data used by programs in /usr/bin.

/usr/share/doc
    Most packages installed on the system will include some kind of documentation. ...

/var
    With the exception of /tmp and /home, the directories we have looked at so far remain relatively static; that is, their contents don't change. ... Various databases, spool files, user mail, etc. ...

/var/log
    /var/log contains log files, records of various system activity. ... The most useful one is /var/log/messages.


Symbolic Links
As we look around, we are likely to see a directory listing with an entry like this:

lrwxrwxrwx 1 root root     11 2007-08-11 07:34 libc.so.6 -> libc-2.6.so

Notice how the first letter of the listing is "l" and the entry seems to have two filenames? This is a special kind of file called a symbolic link (also known as a soft link or symlink). In most Unix-like systems it is possible to have a file referenced by multiple names. While the value of this may not be obvious, it is really a useful feature.

Consider this scenario: a program requires the use of a shared resource of some kind contained in a file named "foo," but "foo" has frequent version changes. It would be good to include the version number in the filename so the administrator or other interested party could see what version of "foo" is installed. This presents a problem. If we change the name of the shared resource, we have to track down every program that might use it and change it to look for a new resource name every time a new version of the resource is installed. That doesn't sound like fun at all.

Here is where symbolic links save the day. Let's say we install version 2.6 of "foo," which has the filename "foo-2.6", and then create a symbolic link simply called "foo" that points to "foo-2.6". This means that when a program opens the file "foo", it is actually opening the file "foo-2.6". Now everybody is happy. The programs that rely on "foo" can find it and we can still see what actual version is installed. When it is time to upgrade to "foo-2.7," we just add the file to our system, delete the symbolic link "foo" and create a new one that points to the new version. Not only does this solve the problem of the version upgrade, but it also allows us to keep both versions on our machine. Imagine that "foo-2.7" has a bug (damn those developers!) and we need to revert to the old version. Again, we just delete the symbolic link pointing to the new version and create a new symbolic link pointing to the old version.

The directory listing above shows a symbolic link called "libc.so.6" that points to a shared library file called "libc-2.6.so". This means that programs looking for "libc.so.6" will actually get the file "libc-2.6.so". We will learn how to create symbolic links in the next chapter.

Hard Links
While we are on the subject of links, we need to mention that there is a second type of link called a hard link. Hard links also allow files to have multiple names, but they do it in a different way. We'll talk more about the differences between symbolic and hard links in the next chapter.

Summing Up
In this chapter we have seen how the shell can be used to explore the system. We've seen various files and directories and their contents. One thing we should take away from this chapter is that in Linux there are many important files that are plain human-readable text, which makes commands such as file and less especially useful for exploration.


Further Reading

- An article about the directory structure of Unix and Unix-like systems:
  http://en.wikipedia.org/wiki/Unix_directory_structure

- The full version of the Linux Filesystem Hierarchy Standard can be found here:
  http://www.pathname.com/fhs/

- A detailed description of the ASCII text format:
  http://en.wikipedia.org/wiki/ASCII

4 – Manipulating Files And Directories
At this point, we are ready for some real work! This chapter will introduce the following commands:

- cp – Copy files and directories
- mv – Move/rename files and directories
- mkdir – Create directories
- rm – Remove files and directories
- ln – Create hard and symbolic links

These five commands are among the most frequently used Linux commands. They are used for manipulating both files and directories.

Now, to be frank, some of the tasks performed by these commands are more easily done with a graphical file manager. With a file manager, we can drag and drop a file from one directory to another, cut and paste files, delete files, etc. So why use these old command line programs?

The answer is power and flexibility. While it is easy to perform simple file manipulations with a graphical file manager, complicated tasks can be easier with the command line programs. For example, how could we copy all the HTML files from one directory to another, but only copy files that do not exist in the destination directory or are newer than the versions in the destination directory? Pretty hard with a file manager. Pretty easy with the command line:

cp -u *.html destination

Wildcards
Before we begin using our commands, we need to talk about a shell feature that makes these commands so powerful. Since the shell uses filenames so much, it provides special characters to help you rapidly specify groups of filenames. These special characters are called wildcards. Using wildcards (which is also known as globbing) allows you to select filenames based on patterns of characters. The table below lists the wildcards and what they select:
Table 4-1: Wildcards

Wildcard        Meaning

*               Matches any characters

?               Matches any single character

[characters]    Matches any character that is a member of the set
                characters

[!characters]   Matches any character that is not a member of the set
                characters

[[:class:]]     Matches any character that is a member of the specified
                class

Table 4-2 lists the most commonly used character classes:
Table 4-2: Commonly Used Character Classes

Character Class   Meaning

[:alnum:]         Matches any alphanumeric character

[:alpha:]         Matches any alphabetic character

[:digit:]         Matches any numeral

[:lower:]         Matches any lowercase letter

[:upper:]         Matches any uppercase letter

Using wildcards makes it possible to construct very sophisticated selection criteria for filenames. Here are some examples of patterns and what they match:

Table 4-3: Wildcard Examples

Pattern                   Matches

b*.txt                    Any file beginning with "b" followed by any
                          characters and ending with ".txt"

BACKUP.[0-9][0-9][0-9]    Any file beginning with "BACKUP." followed by
                          exactly three numerals

Character Ranges
If you are coming from another Unix-like environment or have been reading some other books on this subject, you may have encountered the [A-Z] or [a-z] character range notations. These are traditional Unix notations and worked in older versions of Linux as well. They can still work, but you have to be very careful with them because they will not produce the expected results unless properly configured. For now, you should avoid using them and use character classes instead.


Wildcards Work In The GUI Too
Wildcards are especially valuable not only because they are used so frequently on the command line, but because they are also supported by some graphical file managers. Just enter a file selection pattern with wildcards and the files in the currently viewed directory will be highlighted for selection. In some file managers you can enter wildcards directly on the location bar. For example, if you want to see all the files starting with a lowercase "u" in the /usr/bin directory, enter "/usr/bin/u*" in the location bar and it will display the result.

Many ideas that originated in the command line interface also find their way into the graphical interface. It is one of the many things that make the Linux desktop so powerful.

mkdir – Create Directories
The mkdir command is used to create directories. It works like this:

mkdir directory...

A note on notation: when three periods follow an argument in the description of a command (as above), it means that the argument can be repeated. Thus:

mkdir dir1

would create a single directory named "dir1", while:

mkdir dir1 dir2 dir3

would create three directories named "dir1", "dir2" and "dir3".


cp – Copy Files And Directories
The cp command copies files or directories. It can be used two different ways:

cp item1 item2

to copy the single file or directory "item1" to file or directory "item2", and:

cp item... directory

to copy multiple items (either files or directories) into a directory.

Useful Options And Examples
Here are some of the commonly used options for cp:

Table 4-4: cp Options

Option             Meaning

-a, --archive      Copy the files and directories and all of their
                   attributes, including ownerships and permissions.
                   Normally, copies take on the default attributes of the
                   user performing the copy.

-i, --interactive  Before overwriting an existing file, prompt the user
                   for confirmation. If this option is not specified, cp
                   will silently overwrite files.

-r, --recursive    Recursively copy directories and their contents. This
                   option (or the -a option) is required when copying
                   directories.

-u, --update       When copying files from one directory to another, only
                   copy files that either don't exist, or are newer than
                   the existing corresponding files, in the destination
                   directory.

-v, --verbose      Display informative messages as the copy is performed.

Table 4-5: cp Examples

Command              Results

cp file1 file2       Copy file1 to file2. If file2 exists, it is
                     overwritten with the contents of file1. If file2
                     does not exist, it is created.

cp -i file1 file2    Same as above, except that if file2 exists, the user
                     is prompted before it is overwritten.

cp file1 file2 dir1  Copy file1 and file2 into directory dir1. dir1 must
                     already exist.

cp dir1/* dir2       Using a wildcard, copy all the files in dir1 into
                     dir2. dir2 must already exist.

cp -r dir1 dir2      Copy directory dir1 (and its contents) to directory
                     dir2. If directory dir2 does not exist, it is created
                     and, after the copy, will contain the same contents
                     as directory dir1.


mv – Move And Rename Files
The mv command performs both file moving and file renaming, depending on how it is used. In either case, the original filename no longer exists after the operation. mv is used in much the same way as cp:

mv item1 item2

to move or rename file or directory "item1" to "item2", or:

mv item... directory

to move one or more items from one directory to another.

Useful Options And Examples
mv shares many of the same options as cp:

Table 4-6: mv Options

Option             Meaning

-i, --interactive  Before overwriting an existing file, prompt the user
                   for confirmation. If this option is not specified, mv
                   will silently overwrite files.

-u, --update       When moving files from one directory to another, only
                   move files that either don't exist, or are newer than
                   the existing corresponding files in the destination
                   directory.

-v, --verbose      Display informative messages as the move is performed.

Table 4-7: mv Examples

Command              Results

mv file1 file2       Move file1 to file2. If file2 exists, it is
                     overwritten with the contents of file1. If file2
                     does not exist, it is created. In either case, file1
                     ceases to exist.

mv -i file1 file2    Same as above, except that if file2 exists, the user
                     is prompted before it is overwritten.

mv file1 file2 dir1  Move file1 and file2 into directory dir1. dir1 must
                     already exist.

mv dir1 dir2         If directory dir2 does not exist, rename directory
                     dir1 to dir2. If directory dir2 does exist, move
                     directory dir1 (and its contents) into directory
                     dir2.


rm – Remove Files And Directories
The rm command is used to remove (delete) files and directories:

rm item...

where "item" is one or more files or directories.

Useful Options And Examples
Here are some of the common options for rm:

Table 4-8: rm Options

Option             Meaning

-i, --interactive  Before deleting an existing file, prompt the user for
                   confirmation. If this option is not specified, rm will
                   silently delete files.

-r, --recursive    Recursively delete directories. This means that if a
                   directory being deleted has subdirectories, delete them
                   too. To delete a directory, this option must be
                   specified.

-f, --force        Ignore nonexistent files and do not prompt. This
                   overrides the --interactive option.

-v, --verbose      Display informative messages as the deletion is
                   performed.

Table 4-9: rm Examples

Command            Results

rm file1           Delete file1 silently.

rm -i file1        Same as above, except that the user is prompted for
                   confirmation before the deletion is performed.

rm -r file1 dir1   Delete file1 and dir1 and its contents.

rm -rf file1 dir1  Same as above, except that if either file1 or dir1 do
                   not exist, rm will continue silently.

Be Careful With rm!
Unix-like operating systems such as Linux do not have an undelete command. Once you delete something with rm, it's gone.

Be particularly careful with wildcards. Consider this classic example. Let's say you want to delete just the HTML files in a directory. To do this, you type:

rm *.html

which is correct, but if you accidentally place a space between the "*" and the ".html" like so:

rm * .html

the rm command will delete all the files in the directory and then complain that there is no file called ".html".

Here is a useful tip: whenever you use wildcards with rm (besides carefully checking your typing!), test the wildcard first with ls. This will let you see the files that will be deleted. Then recall the command and replace the ls with rm.

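The tip above can be sketched like this, using a scratch directory and invented filenames:

```shell
dir=$(mktemp -d)
cd "$dir"
touch index.html about.html notes.txt

# First, preview what the wildcard will match...
ls *.html               # about.html index.html

# ...then reuse the exact same pattern with rm.
rm *.html
ls                      # notes.txt
```

Because the shell performs the wildcard expansion, ls and rm are guaranteed to see the same list of names.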

ln – Create Links
The ln command is used to create either hard or symbolic links. It is used in one of two ways:

ln file link

to create a hard link, and:

ln -s item link

to create a symbolic link, where "item" is either a file or a directory.

Hard Links
Hard links are the original Unix way of creating links, compared to symbolic links, which are more modern. By default, every file has a single hard link that gives the file its name. When we create a hard link, we create an additional directory entry for a file. Hard links have two important limitations:

1. A hard link cannot reference a file outside its own file system. This means a link cannot reference a file that is not on the same disk partition as the link itself.

2. A hard link may not reference a directory.

A hard link is indistinguishable from the file itself. Unlike a symbolic link, when you list a directory containing a hard link you will see no special indication of the link. When a hard link is deleted, the link is removed but the contents of the file itself continue to exist (that is, its space is not deallocated) until all links to the file are deleted.

Symbolic Links
Symbolic links were created to overcome the limitations of hard links. Symbolic links work by creating a special type of file that contains a text pointer to the referenced file or directory. In this regard, they operate in much the same way as a Windows shortcut, though of course they predate the Windows feature by many years ;-)

A file pointed to by a symbolic link, and the symbolic link itself, are largely indistinguishable from one another. For example, if you write something to the symbolic link, the referenced file is written to. However, when you delete a symbolic link, only the link is deleted, not the file itself. If the file is deleted before the symbolic link, the link will continue to exist, but will point to nothing. In this case, the link is said to be broken.

The concept of links can seem very confusing, but hang in there. We're going to try all this stuff and, hopefully, it will become clear.


Let's Build A Playground
Since we are going to do some real file manipulation, let's build a safe place to "play" with our file manipulation commands. First we need a directory to work in. We'll create one in our home directory and call it "playground."

Creating Directories
To create our playground directory we will first make sure we are in our home directory and will then create the new directory:

[me@linuxbox ~]$ cd
[me@linuxbox ~]$ mkdir playground

To make our playground a little more interesting, let's create a couple of directories inside it called "dir1" and "dir2". To do this, we will change our current working directory to playground and execute another mkdir:

[me@linuxbox ~]$ cd playground
[me@linuxbox playground]$ mkdir dir1 dir2

Notice that the mkdir command accepted multiple arguments, allowing us to create both directories with a single command.

Copying Files
Next, let's get some data into our playground. We'll do this by copying a file. Using the cp command, we'll copy the passwd file from the /etc directory to the current working directory:

[me@linuxbox playground]$ cp /etc/passwd .

So now if we perform an ls, we will see our file:

[me@linuxbox playground]$ ls -l
total 12
drwxrwxr-x 2 me me 4096 2008-01-10 16:40 dir1
drwxrwxr-x 2 me me 4096 2008-01-10 16:40 dir2
-rw-r--r-- 1 me me 1650 2008-01-10 16:07 passwd

Now, just for fun, let's repeat the copy using the "-v" option (verbose) to see what it does:

[me@linuxbox playground]$ cp -v /etc/passwd .
`/etc/passwd' -> `./passwd'

The cp command performed the copy again, but this time displayed a concise message indicating what operation it was performing. Notice that cp overwrote the first copy without any warning. Again, this is a case of cp assuming that you know what you're doing. To get a warning, we'll include the "-i" (interactive) option:

[me@linuxbox playground]$ cp -i /etc/passwd .
cp: overwrite `./passwd'?

Responding to the prompt by entering a "y" will cause the file to be overwritten; any other character (for example, "n") will cause cp to leave the file alone.


Moving And Renaming Files
Now, the name "passwd" doesn't seem very playful and this is a playground, so let's change it to something else:

[me@linuxbox playground]$ mv passwd fun

Let's pass the fun around a little by moving our renamed file to each of the directories and back again:

[me@linuxbox playground]$ mv fun dir1

to move it first to directory dir1, then:

[me@linuxbox playground]$ mv dir1/fun dir2

to move it from dir1 to dir2, then:

[me@linuxbox playground]$ mv dir2/fun .

to bring it back to the current working directory. Next, let's see the effect of mv on directories. First we will move our data file into dir1 again, then move dir1 into dir2. Note that since dir2 already existed, mv moved dir1 into dir2. If dir2 had not existed, mv would have renamed dir1 to dir2. Finally, let's put everything back where it was:

[me@linuxbox playground]$ mv dir2/dir1 .
[me@linuxbox playground]$ mv dir1/fun .
Creating Hard Links
Now let's try some links. First the hard links. We'll create some links to our data file like so:

[me@linuxbox playground]$ ln fun fun-hard
[me@linuxbox playground]$ ln fun dir1/fun-hard
[me@linuxbox playground]$ ln fun dir2/fun-hard

So now we have four instances of the file fun. Let's take a look at our playground directory:

[me@linuxbox playground]$ ls -l
total 16
drwxrwxr-x 2 me me 4096 2008-01-14 16:17 dir1
drwxrwxr-x 2 me me 4096 2008-01-14 16:17 dir2
-rw-r--r-- 4 me me 1650 2008-01-10 16:33 fun
-rw-r--r-- 4 me me 1650 2008-01-10 16:33 fun-hard

One thing we notice is that the second field in the listing for fun and fun-hard both contain a "4", which is the number of hard links that now exist for the file. Remember that a file will always have at least one link, because the file's name is created by a link. So, how do we know that fun and fun-hard are, in fact, the same file? In this case, ls is not very helpful. While we can see that fun and fun-hard are both 1650 bytes in size, our listing provides no way to be sure they are the same file. To solve this problem, we're going to have to dig a little deeper.

When thinking about hard links, it is helpful to imagine that files are made up of two parts: the data part containing the file's contents and the name part which holds the file's name. When we create hard links, we are actually creating additional name parts that all refer to the same data part. The system assigns a chain of disk blocks to what is called an inode, which is then associated with the name part. Each hard link therefore refers to a specific inode containing the file's contents.

The ls command has a way to reveal this information. It is invoked with the "-i" option:

[me@linuxbox playground]$ ls -li
total 16
12353539 drwxrwxr-x 2 me me 4096 2008-01-14 16:17 dir1
12353540 drwxrwxr-x 2 me me 4096 2008-01-14 16:17 dir2
12353538 -rw-r--r-- 4 me me 1650 2008-01-10 16:33 fun
12353538 -rw-r--r-- 4 me me 1650 2008-01-10 16:33 fun-hard

In this version of the listing, the first field is the inode number and, as we can see, both fun and fun-hard share the same inode number, which confirms they are the same file.
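One consequence of the name-part/data-part model is worth verifying directly: deleting one name does not delete the data while another name remains. A sketch in a scratch directory:

```shell
dir=$(mktemp -d)
cd "$dir"
echo "hello" > fun
ln fun fun-hard

# Both names refer to the same inode.
ls -i fun fun-hard

# Removing one name leaves the data reachable through the other.
rm fun
cat fun-hard            # hello
```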
Creating Symbolic Links
Symbolic links are a special type of file that contains a text pointer to the target file or directory. Creating them is similar to creating hard links:

[me@linuxbox playground]$ ln -s fun fun-sym
[me@linuxbox playground]$ ln -s ../fun dir1/fun-sym
[me@linuxbox playground]$ ln -s ../fun dir2/fun-sym

The first example is pretty straightforward: we simply add the "-s" option to create a symbolic link rather than a hard link. But what about the next two? Remember, when we create a symbolic link, we are creating a text description of where the target file is relative to the symbolic link. It's easier to see if we look at the ls output:

[me@linuxbox playground]$ ls -l dir1
total 4
-rw-r--r-- 4 me me 1650 2008-01-10 16:33 fun-hard
lrwxrwxrwx 1 me me    6 2008-01-15 15:17 fun-sym -> ../fun

The listing for fun-sym in dir1 shows that it is a symbolic link by the leading "l" in the first field, and that it points to "../fun", which is correct. Relative to the location of fun-sym, fun is in the directory above it. Notice too, that the size of the symbolic link file is 6, the number of characters in the string "../fun", rather than the length of the file to which it is pointing.

When creating symbolic links, you can use either absolute pathnames or relative pathnames, as we did in our example. Using relative pathnames is more desirable because it allows a directory containing symbolic links to be renamed and/or moved without breaking the links. Symbolic links can also reference directories:

[me@linuxbox playground]$ ln -s dir1 dir1-sym
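The claim that relative symbolic links survive a rename of their containing directory can be checked with a short experiment. A sketch, with invented directory names:

```shell
base=$(mktemp -d)
mkdir "$base/play" "$base/play/dir1"
echo "data" > "$base/play/fun"
ln -s ../fun "$base/play/dir1/fun-sym"

# The link resolves through the relative path "../fun".
cat "$base/play/dir1/fun-sym"      # data

# Renaming the containing directory does not break the link,
# because "../fun" is still correct relative to the link's location.
mv "$base/play" "$base/game"
cat "$base/game/dir1/fun-sym"      # data
```

Had the link been created with an absolute pathname, the second cat would have failed with a broken link.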
Removing Files And Directories
As we covered earlier, the rm command is used to delete files and directories. We are going to use it to clean up our playground a little bit. First, let's delete one of our hard links:

[me@linuxbox playground]$ rm fun-hard
[me@linuxbox playground]$ ls -l

That worked as expected. The file fun-hard is gone and the link count shown for fun is reduced from four to three, as indicated in the second field of the directory listing. Next, we'll delete the file fun, and just for enjoyment, we'll include the "-i" option to show what that does:

[me@linuxbox playground]$ rm -i fun
rm: remove regular file `fun'?

Enter "y" at the prompt and the file is deleted. But let's look at the output of ls now. Notice what happened to fun-sym? Since it's a symbolic link pointing to a now-nonexistent file, the link is broken. Most Linux distributions configure ls to display broken links. On a Fedora box, broken links are displayed in blinking red text! The presence of a broken link is not, in and of itself, dangerous, but it is rather messy.

Let's clean up a little. We'll delete the symbolic links:

[me@linuxbox playground]$ rm fun-sym dir1-sym
[me@linuxbox playground]$ ls -l
total 8
drwxrwxr-x 2 me me 4096 2008-01-15 15:17 dir1
drwxrwxr-x 2 me me 4096 2008-01-15 15:17 dir2

One thing to remember about symbolic links is that most file operations are carried out on the link's target, not the link itself. rm is an exception. When you delete a link, it is the link that is deleted, not the target.

Finally, we will remove our playground. To do this, we will return to our home directory and use rm with the recursive option (-r) to delete playground and all of its contents, including its subdirectories:

[me@linuxbox playground]$ cd
[me@linuxbox ~]$ rm -r playground

Creating Symlinks With The GUI
The file managers in both GNOME and KDE provide an easy and automatic method of creating symbolic links. In KDE, a small menu appears whenever a file is dropped, offering a choice of copying, moving, or linking the file.

Summing Up
We've covered a lot of ground here and it will take a while for it all to fully sink in. Perform the playground exercise over and over until it makes sense. It is important to get a good understanding of basic file manipulation commands and wildcards. Feel free to expand on the playground exercise by adding more files and directories, using wildcards to specify files for various operations. The concept of links is a little confusing at first, but take the time to learn how they work. They can be a real lifesaver.

Further Reading

- A discussion of symbolic links:
  http://en.wikipedia.org/wiki/Symbolic_link
5 – Working With Commands
Up to this point, we have seen a series of mysterious commands, each with its own mysterious options and arguments. In this chapter, we will attempt to remove some of that mystery and even create some of our own commands.

What Exactly Are Commands?
A command can be one of four different things:

1. An executable program, like all those files we saw in /usr/bin. Within this category, programs can be compiled binaries, such as programs written in C and C++, or programs written in scripting languages such as the shell, Perl, Python, Ruby, etc.

2. A command built into the shell itself. bash supports a number of commands internally called shell builtins. The cd command, for example, is a shell builtin.

3. A shell function. These are miniature shell scripts incorporated into the environment. We will cover configuring the environment and writing shell functions in later chapters, but for now, just be aware that they exist.

4. An alias. Commands that we can define ourselves, built from other commands.

type – Display A Command's Type
The type command is a shell builtin that displays the kind of command the shell will
execute, given a particular command name. It works like this:

type command

where "command" is the name of the command you want to examine. Here are some examples:
[me@linuxbox ~]$ type type
type is a shell builtin
[me@linuxbox ~]$ type ls
ls is aliased to `ls --color=tty'
[me@linuxbox ~]$ type cp
cp is /bin/cp

Here we see the results for three different commands. Notice that the ls command (taken from a Fedora system) is actually an alias for the ls command with the "--color=tty" option added. Now we know why the output from ls is displayed in color!

which – Display An Executable's Location
Sometimes there is more than one version of an executable program installed on a system. While this is not very common on desktop systems, it's not unusual on large servers.

To determine the exact location of a given executable, the which command is used:
[me@linuxbox ~]$ which ls
/bin/ls

which only works for executable programs, not builtins nor aliases that are substitutes for actual executable programs. When we try to use which on a shell builtin, for example cd, we either get no response or an error message:

[me@linuxbox ~]$ which cd
/usr/bin/which: no cd in (/opt/jre1.6.0_03/bin:/usr/kerberos/bin:/opt/jre1.6.0_03/bin:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/home/me/bin)

which is a fancy way of saying "command not found."


help – Get Help For Shell Builtins
bash has a built-in help facility available for each of the shell builtins. To use it, type "help" followed by the name of the shell builtin. For example:

[me@linuxbox ~]$ help cd
cd: cd [-L|[-P [-e]]] [dir]
    Change the shell working directory.

    Change the current directory to DIR. The default DIR is the value of
    the HOME shell variable.

    The variable CDPATH defines the search path for the directory
    containing DIR. Alternative directory names in CDPATH are separated
    by a colon (:). If DIR begins with a slash (/), then CDPATH is not
    used.

    Exit Status:
    Returns 0 if the directory is changed, and if $PWD is set
    successfully when -P is used; non-zero otherwise.

A note on notation: when square brackets appear in the description of a command's syntax, they indicate optional items. A vertical bar character indicates mutually exclusive items.

While the output of help for the cd command is concise and accurate, it is by no means a tutorial and, as we can see, it also seems to mention a lot of things we haven't talked about yet! Don't worry. We'll get there.


--help – Display Usage Information
Many executable programs support a "--help" option that displays a description of the command's supported syntax and options. For example:

[me@linuxbox ~]$ mkdir --help
Usage: mkdir [OPTION] DIRECTORY...
Create the DIRECTORY(ies), if they do not already exist.

  -m, --mode=MODE   set file mode (as in chmod), not a=rwx – umask
  -p, --parents     no error if existing, make parent directories as
                    needed
  -v, --verbose     print a message for each created directory
      --help        display this help and exit
      --version     output version information and exit

Report bugs to ...

Some programs don't support the "--help" option, but try it anyway. Often it results in an error message that will reveal the same usage information.


man – Display A Program's Manual Page
Most executable programs intended for command line use provide a formal piece of documentation called a manual or man page. A special paging program called man is used to view them. It is used like this:

man program

where "program" is the name of the command to view. Man pages vary somewhat in format but generally contain a title, a synopsis of the command's syntax, a description of the command's purpose, and a listing and description of each of the command's options. Man pages, however, do not usually include examples, and are intended as a reference, not a tutorial.

The "manual" that man displays is broken into sections and not only covers user commands but also system administration commands, programming interfaces, file formats and more. Sometimes we need to refer to a specific section of the manual to find what we are looking for. This is particularly true if we are looking for a file format that is also the name of a command. To specify a section number, we use man like this:

man section search_term

For example:

[me@linuxbox ~]$ man 5 passwd

This will display the man page describing the file format of the /etc/passwd file.

apropos – Display Appropriate Commands
It is also possible to search the list of man pages for possible matches based on a search term. It's very crude but sometimes helpful. Note that the man command with the "-k" option performs the exact same function as apropos.
As we have seen, the manual pages supplied with Linux systems vary greatly in quality. Many man pages are hard to read, but I think that the grand prize for difficulty has got to go to the man page for bash. When printed, it's over 80 pages long and extremely dense, and its structure makes absolutely no sense to a new user. On the other hand, it is very accurate and complete. So check it out if you dare, and look forward to the day when you can read it and it all makes sense.

info – Display A Program's Info Entry
The GNU Project provides an alternative to man pages for its programs. Info pages are displayed with a reader program named, appropriately enough, info. Here is a sample:

File: coreutils.info,  Node: ls invocation

`ls': List directory contents
==================================

The `ls' program lists information about files (of any type,
including directories).

   For non-option command-line arguments that are directories, by
default `ls' lists the contents of directories, not recursively, and
omitting files with names beginning with `.'. If no non-option
argument is specified, `ls' operates on the current directory, acting
as if it had been invoked with a single argument of `.'.

The info program reads info files, which are tree structured into individual nodes, each containing a single topic. Info files contain hyperlinks that can move you from node to node. To invoke info, type "info" followed optionally by the name of a program. Commands used to control the reader while displaying an info page include:

Command   Action

Enter     Follow the hyperlink at the cursor location

q         Quit

Most of the command line programs we have discussed so far are part of the GNU Project's "coreutils" package, so typing:

[me@linuxbox ~]$ info coreutils

will display a menu page with hyperlinks to each program contained in the coreutils package.
README And Other Program Documentation Files
Many software packages installed on your system have documentation files residing in the /usr/share/doc directory. Most of these are stored in plain text format and can be viewed with less. We may encounter some files ending with a ".gz" extension. This indicates that they have been compressed with the gzip compression program.
Creating Your Own Commands With alias
Now for our very first experience with programming! We will create a command of our own using the alias command. But before we start, we need to reveal a small command line trick. It's possible to put more than one command on a line by separating each command with a semicolon character. It works like this:

command1; command2; command3...

Here's the example we will use:

[me@linuxbox ~]$ cd /usr; ls; cd -
bin   games    kerberos  lib64    local  share  src  tmp
etc   include  lib       libexec  sbin
/home/me
[me@linuxbox ~]$

As we can see, we have combined three commands on one line. First we change directory to /usr, then list the directory, and finally return to the original directory (by using "cd -"), so we end up where we started. Now let's turn this sequence into a new command using alias. The first thing we have to do is dream up a name for our new command. Let's try "test". Before we use the name, it would be a good idea to find out if it is already being used. To find out, we can use the type command again:
[me@linuxbox ~]$ type test
test is a shell builtin

Oops! The name "test" is already taken. Let's try "foo" instead. Since that name is not already in use, we can create our alias:
[me@linuxbox ~]$ alias foo='cd /usr; ls; cd -'

Notice the structure of this command:
alias name='string'

After the command "alias" we give alias a name, followed immediately (no whitespace allowed) by an equals sign, followed immediately by a quoted string containing the meaning to be assigned to the name. After we define our alias, it can be used anywhere the shell would expect a command. Let's try it:
[me@linuxbox ~]$ foo
bin   games    kerberos  lib64    local  share  src  tmp
etc   include  lib       libexec  sbin
/home/me
[me@linuxbox ~]$

We can also use the type command again to see our alias:
[me@linuxbox ~]$ type foo
foo is aliased to `cd /usr; ls ; cd -'

To remove an alias, the unalias command is used, like so:
[me@linuxbox ~]$ unalias foo
[me@linuxbox ~]$ type foo
bash: type: foo: not found

While we purposefully avoided naming our alias with an existing command name, it is not uncommon to do so. This is often done to apply a commonly desired option to each invocation of a common command. For instance, we saw earlier how the ls command is often aliased to add color support:
[me@linuxbox ~]$ type ls
ls is aliased to `ls --color=tty'

To see all the aliases defined in the environment, use the alias command without arguments. Here are some of the aliases defined by default on a Fedora system. Try and figure out what they all do:

[me@linuxbox ~]$ alias
alias l.='ls -d .* --color=tty'
alias ll='ls -l --color=tty'
alias ls='ls --color=tty'

There is one tiny problem with defining aliases on the command line: they vanish when your shell session ends. In a later chapter, we will see how to add our own aliases to the files that establish the environment each time we log on, but for now, enjoy the fact that we have taken our first, albeit tiny, step into the world of shell programming!

Summing Up
Now that we have learned how to find the documentation for commands, go and look up the documentation for all the commands we have encountered so far. Study what additional options are available and try them out!

Further Reading
There are many online sources of documentation for Linux and the command line. Here are some of the best:

- The Bash FAQ contains answers to frequently asked questions regarding bash. This list is aimed at intermediate to advanced users, but contains a lot of good information:
  http://mywiki.wooledge.org/BashFAQ

- The GNU Project provides extensive documentation for its programs, which form the core of the Linux command line experience:
  http://www.gnu.org/manual/manual.html
- The Bash Reference Manual is a reference guide to the bash shell:
  http://www.gnu.org/software/bash/manual/bashref.html

- Wikipedia has an interesting article on man pages:
  http://en.wikipedia.org/wiki/Man_page

6 – Redirection
In this lesson we are going to unleash what may be the coolest feature of the command line. It's called I/O redirection. The "I/O" stands for input/output and, with this facility, you can redirect the input and output of commands to and from files, as well as connect multiple commands together into powerful command pipelines. To show off this facility, we will introduce the following commands:

- cat - Concatenate files
- sort - Sort lines of text
- uniq - Report or omit repeated lines
- grep - Print lines matching a pattern
- wc - Print newline, word, and byte counts for each file
- head - Output the first part of a file
- tail - Output the last part of a file
- tee - Read from standard input and write to standard output and files

Standard Input, Output, And Error
Many of the programs that we have used so far produce output of some kind. This output often consists of two types. First, we have the program's results; that is, the data the program is designed to produce, and second, we have status and error messages that tell us how the program is getting along. If we look at a command like ls, we can see that it displays its results and its error messages on the screen.

Keeping with the Unix theme of "everything is a file," programs such as ls actually send their results to a special file called standard output (often expressed as stdout) and their status messages to another file called standard error (stderr). By default, both standard output and standard error are linked to the screen and not saved into a disk file.

In addition, many programs take input from a facility called standard input (stdin) which is, by default, attached to the keyboard.

I/O redirection allows us to change where output goes and where input comes from. Normally, output goes to the screen and input comes from the keyboard, but with I/O redirection, we can change that.

Redirecting Standard Output
I/O redirection allows us to redefine where standard output goes. To redirect standard output to another file instead of the screen, we use the ">" redirection operator followed by the name of the file. Why would we want to do this? It's often useful to store the output of a command in a file. For example, we could tell the shell to send the output of the ls command to the file ls-output.txt instead of the screen:

[me@linuxbox ~]$ ls -l /usr/bin > ls-output.txt

Here, we created a long listing of the /usr/bin directory and sent the results to the file ls-output.txt. Let's examine the redirected output of the command:

[me@linuxbox ~]$ ls -l ls-output.txt

Good; a nice, large, text file. If we look at the file with less, we will see that the file ls-output.txt does indeed contain the results from our ls command:

[me@linuxbox ~]$ less ls-output.txt

Now, let's repeat our redirection test, but this time with a twist. We'll change the name of the directory to one that does not exist:

[me@linuxbox ~]$ ls -l /bin/usr > ls-output.txt
ls: cannot access /bin/usr: No such file or directory

We received an error message. This makes sense since we specified the non-existent directory /bin/usr, but why was the error message displayed on the screen rather than being redirected to the file ls-output.txt? The answer is that the ls program does not send its error messages to standard output. Instead, like most well-written Unix programs, it sends its error messages to standard error. Since we redirected only standard output and not standard error, the error message was still sent to the screen. We'll see how to redirect standard error in just a minute, but first, let's look at what happened to our output file:

[me@linuxbox ~]$ ls -l ls-output.txt

The file now has zero length! This is because, when we redirect output with the ">" redirection operator, the destination file is always rewritten from the beginning. Since our ls command generated no results and only an error message, the redirection operation started to rewrite the file and then stopped because of the error, resulting in its truncation.

In fact, if we ever need to actually truncate a file (or create a new, empty file) we can use a trick like this:

[me@linuxbox ~]$ > ls-output.txt

Simply using the redirection operator with no command preceding it will truncate an existing file or create a new, empty file.
So, how can we append redirected output to a file instead of overwriting the file from the beginning? For that, we use the ">>" redirection operator, like so:

[me@linuxbox ~]$ ls -l /usr/bin >> ls-output.txt

Using the ">>" operator will result in the output being appended to the file. If the file does not already exist, it is created just as though the ">" operator had been used. Let's put it to the test:

[me@linuxbox ~]$ ls -l /usr/bin >> ls-output.txt
[me@linuxbox ~]$ ls -l /usr/bin >> ls-output.txt
[me@linuxbox ~]$ ls -l ls-output.txt

We repeated the command three times, resulting in an output file three times as large.
Redirecting Standard Error
Redirecting standard error lacks the ease of a dedicated redirection operator. To redirect standard error we must refer to its file descriptor. A program can produce output on any of several numbered file streams. While we have referred to the first three of these file streams as standard input, output and error, the shell references them internally as file descriptors 0, 1 and 2, respectively. The shell provides a notation for redirecting files using the file descriptor number. Since standard error is the same as file descriptor number 2, we can redirect standard error with this notation:

[me@linuxbox ~]$ ls -l /bin/usr 2> ls-error.txt

The file descriptor "2" is placed immediately before the redirection operator to perform the redirection of standard error to the file ls-error.txt.

Redirecting Standard Output And Standard Error To One File
There are cases in which we may wish to capture all of the output of a command to a single file. To do this, we must redirect both standard output and standard error at the same time. There are two ways to do this. First, the traditional way, which works with old versions of the shell:

[me@linuxbox ~]$ ls -l /bin/usr > ls-output.txt 2>&1

Using this method, we perform two redirections. First we redirect standard output to the file ls-output.txt, and then we redirect file descriptor 2 (standard error) to file descriptor 1 (standard output) using the notation 2>&1.

Notice that the order of the redirections is significant. The redirection of standard error must always occur after redirecting standard output or it doesn't work. In the example above,

>ls-output.txt 2>&1

redirects standard error to the file ls-output.txt, but if the order is changed to

2>&1 >ls-output.txt

standard error is directed to the screen.

Recent versions of bash provide a second, more streamlined method for performing this combined redirection:

[me@linuxbox ~]$ ls -l /bin/usr &> ls-output.txt

In this example, we use the single notation &> to redirect both standard output and standard error to the file ls-output.txt.
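The significance of the ordering can be checked directly. In this sketch, ls is given a nonexistent directory (assuming, as in the text, that /bin/usr does not exist) so that it produces only an error message:

```shell
cd "$(mktemp -d)"

# Correct order: stdout goes to the file first, then stderr joins it,
# so the error message ends up in the file.
ls -l /bin/usr > all.txt 2>&1 || true

# Wrong order: stderr is joined to the current stdout (the screen)
# *before* stdout is moved to the file, so the file stays empty.
msg=$(ls -l /bin/usr 2>&1 > wrong.txt) || true
echo "$msg"      # the error message, captured from the "screen"
```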

Disposing Of Unwanted Output
Sometimes "silence is golden," and we don't want output from a command; we just want to throw it away. This applies particularly to error and status messages. The system provides a way to do this by redirecting output to a special file called "/dev/null". This file is a system device called a bit bucket, which accepts input and does nothing with it. To suppress error messages from a command, we do this:

[me@linuxbox ~]$ ls -l /bin/usr 2> /dev/null

/dev/null In Unix Culture
The bit bucket is an ancient Unix concept and, due to its universality, it has appeared in many parts of Unix culture. When someone says he is sending your comments to /dev/null, now you know what it means. For more examples, see the Wikipedia article on "/dev/null".


cat – Concatenate Files
The cat command reads one or more files and copies them to standard output like so:

cat [file...]

You can use it to display files without paging, for example:

[me@linuxbox ~]$ cat ls-output.txt

will display the contents of the file ls-output.txt. cat is often used to display short text files. Since cat can accept more than one file as an argument, it can also be used to join files together. Say we have downloaded a large file that has been split into multiple parts, and we want to join them back together. If the files were named:

movie.mpeg.001 movie.mpeg.002 ... movie.mpeg.099

we could join them back together with this command:

cat movie.mpeg.0* > movie.mpeg

Since wildcards always expand in sorted order, the arguments will be arranged in the correct order.

This is all well and good, but what does this have to do with standard input? Nothing yet, but let's try something else. What happens if we enter "cat" with no arguments:

[me@linuxbox ~]$ cat

Nothing happens; it just sits there like it's hung. It may seem that way, but it's really doing exactly what it's supposed to do.

If cat is not given any arguments, it reads from standard input, and since standard input is, by default, attached to the keyboard, it's waiting for us to type something! Try adding the following text and pressing Enter:

[me@linuxbox ~]$ cat
The quick brown fox jumped over the lazy dog.

Now press Ctrl-d (i.e., hold down the Ctrl key and press "d") to tell cat that it has reached end of file (EOF) on standard input:

[me@linuxbox ~]$ cat
The quick brown fox jumped over the lazy dog.
The quick brown fox jumped over the lazy dog.

In the absence of filename arguments, cat copies standard input to standard output, so we see our line of text repeated. We can use this behavior to create short text files. Let's say we wanted to create a file called "lazy_dog.txt" containing the text in our example. We would do this:

[me@linuxbox ~]$ cat > lazy_dog.txt
The quick brown fox jumped over the lazy dog.

Type the command followed by the text we want to place in the file. Remember to type Ctrl-d at the end. To see our results, we can use cat to copy the file to stdout again:

[me@linuxbox ~]$ cat lazy_dog.txt
The quick brown fox jumped over the lazy dog.

Now that we know that cat accepts standard input in addition to filename arguments, let's try redirecting standard input:

[me@linuxbox ~]$ cat < lazy_dog.txt
The quick brown fox jumped over the lazy dog.

Using the "<" redirection operator, we change the source of standard input from the keyboard to the file lazy_dog.txt. We see that the result is the same as passing a single filename argument. This is not particularly useful compared to passing a filename argument, but it serves to demonstrate using a file as a source of standard input.

Before we move on, check out the man page for cat, as it has several interesting options.
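The file-joining example can be rehearsed with small text files standing in for the movie parts. A sketch, with invented contents:

```shell
cd "$(mktemp -d)"

# Two invented "parts" standing in for movie.mpeg.001, movie.mpeg.002 ...
printf 'part one\n' > movie.mpeg.001
printf 'part two\n' > movie.mpeg.002

# Wildcards expand in sorted order, so the parts are joined correctly.
cat movie.mpeg.0* > movie.mpeg
cat movie.mpeg
# part one
# part two
```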
Using the pipe operator “|” (vertical bar), the
standard output of one command can be piped into the standard input of another:
command1 | command2

To fully demonstrate this, we are going to need some commands
...
We can use less
to display, page-by-page, the output of any command that sends its results to standard
output:
[me@linuxbox ~]$ ls -l /usr/bin | less

This is extremely handy! Using this technique, we can conveniently examine the output
of any command that produces standard output
...
Simply put, the redirection
operator connects a command with a file while the pipeline operator connects the
output of one command with the input of a second command
...

command1 > command2

Answer: Sometimes something really bad.
Here is an actual example submitted by a reader who was administering a Linux-based server appliance. As the superuser, he did this:

# cd /usr/bin
# ls > less

The first command put him in the directory where most programs are stored, and the second command told the shell to overwrite the file less with the output of the ls command. Since the /usr/bin directory contained a file named less (the less program), the second command overwrote the less program file with the text from ls, thus destroying less on his system.
The lesson here is that the redirection operator silently creates or overwrites files, so you need to treat it with a lot of respect.
Filters
It is possible to put several commands together into a pipeline. Frequently, the commands used this way are referred to as filters. Filters take input, change it somehow, and then output it. The first one we will try is sort. Imagine we wanted to make a combined list of all the executable programs in /bin and /usr/bin, put them in sorted order, and view the list:

[me@linuxbox ~]$ ls /bin /usr/bin | sort | less

Since we specified two directories (/bin and /usr/bin), the output of ls would have consisted of two sorted lists, one for each directory. By including sort in our pipeline, we changed the data to produce a single, sorted list.


uniq – Report Or Omit Repeated Lines
The uniq command is often used in conjunction with sort. uniq accepts a sorted list of data from either standard input or a single filename argument (see the uniq man page for details) and, by default, removes any duplicates from the list. So, to make sure our list has no duplicates (that is, any programs of the same name that appear in both the /bin and /usr/bin directories), we will add uniq to our pipeline:

[me@linuxbox ~]$ ls /bin /usr/bin | sort | uniq | less

In this example, we use uniq to remove any duplicates from the output of the sort command. If we want to see the list of duplicates instead, we add the “-d” option to uniq:

[me@linuxbox ~]$ ls /bin /usr/bin | sort | uniq -d | less
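The interaction of sort and uniq is easy to try on synthetic input; the program names below are just sample text for the demonstration:

```shell
# Three lines, one duplicated; sort groups duplicates so uniq can drop them.
printf 'zip\ngzip\nzip\n' | sort | uniq      # prints: gzip, then zip
# With -d, uniq reports only the line that was duplicated.
printf 'zip\ngzip\nzip\n' | sort | uniq -d   # prints: zip
```

Note that uniq only collapses adjacent repeated lines, which is exactly why the sort step comes first.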
wc – Print Line, Word, And Byte Counts
The wc (word count) command is used to display the number of lines, words, and bytes contained in files. For example:

[me@linuxbox ~]$ wc ls-output.txt

In this case, wc prints out three numbers: the lines, words, and bytes contained in ls-output.txt. Like our previous commands, if executed without command line arguments, wc accepts standard input. The “-l” option limits its output to report only lines. Adding it to a pipeline is a handy way to count things. To see the number of items in our sorted list, we can do this:

[me@linuxbox ~]$ ls /bin /usr/bin | sort | uniq | wc -l
grep – Print Lines Matching A Pattern
grep is a powerful program used to find text patterns within files. It's used like this:

grep pattern [file...]

When grep encounters a “pattern” in the file, it prints out the lines containing it. The patterns that grep can match can be very complex, but for now we will concentrate on simple text matches. We will cover the more advanced patterns, called regular expressions, in a later chapter.

Let's say we wanted to find all the files in our list of programs that had the word “zip” embedded in the name. Such a search might give us an idea of which programs on our system had something to do with file compression. We would do this:

[me@linuxbox ~]$ ls /bin /usr/bin | sort | uniq | grep zip

bunzip2
bzip2
gunzip
gzip
unzip
zip
zipcloak
zipgrep
zipinfo
zipnote
zipsplit

There are a couple of handy options for grep: “-i”, which causes grep to ignore case when performing the search (normally searches are case sensitive), and “-v”, which tells grep to print only those lines that do not match the pattern.
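Both options are easy to try on a synthetic list (the names below are just sample text):

```shell
# -i matches regardless of case; -v inverts the match.
printf 'gzip\nZIPNOTE\nls\n' | grep -i zip   # prints: gzip, ZIPNOTE
printf 'gzip\nzipnote\nls\n' | grep -v zip   # prints: ls
```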
head / tail – Print First / Last Part Of Files
Sometimes you don't want all the output from a command. You may only want the first few lines or the last few lines. The head command prints the first ten lines of a file, and the tail command prints the last ten. By default, both commands print ten lines of text, but this can be adjusted with the “-n” option:

[me@linuxbox ~]$ head -n 5 ls-output.txt
total 343496
-rwxr-xr-x 1 root root ...  08:58 [
-rwxr-xr-x 1 root root ...  13:39 411toppm
-rwxr-xr-x 1 root root ...  14:27 a2p
-rwxr-xr-x 1 root root ...  20:16 a52dec
[me@linuxbox ~]$ tail -n 5 ls-output.txt
-rwxr-xr-x 1 root root 5234 2007-06-27 10:56 znew
-rwxr-xr-x 1 root root  691 2005-09-10 04:21 zonetab2pot.py
-rw-r--r-- 1 root root  930 2007-11-01 12:23 zonetab2pot.pyc
-rw-r--r-- 1 root root  930 2007-11-01 12:23 zonetab2pot.pyo
lrwxrwxrwx 1 root root    6 2008-01-31 05:22 zsoelim
tail has an option that allows you to view files in real time. This is useful for watching the progress of log files as they are being written. In the following example, we will look at the messages file in /var/log (or the /var/log/syslog file if messages is missing). Superuser privileges are required to do this on some Linux distributions, since the /var/log/messages file may contain security information:

[me@linuxbox ~]$ tail -f /var/log/messages
Feb 8 13:40:05 twin4 dhclient: bound to 192.168.1...
Feb 8 13:55:32 twin4 mountd[3953]: /var/NFSv4/musicbox exported to both 192.168.1... and twin4.localdomain in 192.168.1..., twin4.localdomain
Feb 8 14:07:37 twin4 dhclient: DHCPREQUEST on eth0 to 192.168.1...
Feb 8 14:07:37 twin4 dhclient: bound to 192.168.1...
Feb 8 14:09:56 twin4 smartd[3468]: Device: /dev/hda, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 237 to 236
Feb 8 14:10:37 twin4 mountd[3953]: /var/NFSv4/musicbox exported to both 192.168.1... and twin4.localdomain in 192.168.1..., twin4.localdomain
Feb 8 14:25:07 twin4 sshd(pam_unix)[29234]: session opened for user me by (uid=0)
Feb 8 14:25:36 twin4 su(pam_unix)[29279]: session opened for user root by me(uid=500)

Using the “-f” option, tail continues to monitor the file, and when new lines are appended, they immediately appear on the display. This continues until you type Ctrl-c.


tee – Read From Stdin And Output To Stdout And Files
In keeping with our plumbing metaphor, Linux provides a command called tee which creates a “tee” fitting on our pipe. The tee program reads standard input and copies it to both standard output (allowing the data to continue down the pipeline) and to one or more files. This is useful for capturing a pipeline's contents at an intermediate stage of processing. Here we repeat one of our earlier examples, this time including tee to capture the entire directory listing to the file ls.txt before grep filters the pipeline's contents:

[me@linuxbox ~]$ ls /usr/bin | tee ls.txt | grep zip
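The same plumbing can be sketched self-containedly; a scratch file created with mktemp stands in for ls.txt, and the input lines are invented for the demo:

```shell
# tee writes a copy of the full stream to the file, then grep filters what flows on.
tmp=$(mktemp)
printf 'zip\nls\ngzip\n' | tee "$tmp" | grep zip   # prints: zip, gzip
wc -l < "$tmp"                                     # the file captured all 3 lines
rm -f "$tmp"
```

The key point: grep saw the filtered two lines, while the file received the unfiltered three.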
Summing Up
As always, check out the documentation for each of the commands we have covered in this chapter. We have seen only their most basic usage; they all have a number of interesting options. As we gain Linux experience, we will see that the redirection feature of the command line is extremely useful for solving specialized problems.


Linux Is About Imagination
When I am asked to explain the difference between Windows and Linux, I often use a toy analogy.
Windows is like a Game Boy. You go to the store and buy one all shiny new in the box. You take it home, turn it on, and play with it. Pretty graphics, cute sounds. After a while, though, you get tired of the game that came with it, so you go back to the store and buy another one. This cycle repeats over and over. Finally, you go back to the store and say to the person behind the counter, “I want a game that does this!” only to be told that no such game exists because there is no “market demand” for it. Then you say, “But I only need to change this one thing!” The person behind the counter says you can't change it. The games are all sealed up in their cartridges. You discover that your toy is limited to the games that others have decided you need and no more.
Linux, on the other hand, is like the world's largest Erector Set. You open it up and it's just a huge collection of parts: a lot of steel struts, screws, nuts, gears, pulleys, motors, and a few suggestions on what to build. So you start to play with it. You build one of the suggestions and then another. After a while you discover that you have your own ideas of what to make. You don't ever have to go back to the store, as you already have everything you need. The Erector Set takes on the shape of your imagination. It does what you want.

Your choice of toys is, of course, a personal thing, so which toy would you find more satisfying?


7 – Seeing The World As The Shell Sees It
In this chapter we are going to look at some of the “magic” that occurs on the command line when you press the enter key. While we will examine several interesting and complex features of the shell, we will do it with just one new command:

echo – Display a line of text

Expansion
Each time you type a command line and press the enter key, bash performs several processes upon the text before it carries out your command. We have seen a couple of cases of how a simple character sequence, for example “*”, can have a lot of meaning to the shell. The process that makes this happen is called expansion. With expansion, you enter something and it is expanded into something else before the shell acts upon it. To demonstrate what we mean by this, let's take a look at the echo command. echo is a shell builtin that performs a very simple task: it prints its text arguments on standard output:

[me@linuxbox ~]$ echo this is a test
this is a test

That's pretty straightforward. Any argument passed to echo gets displayed. Let's try another example:

[me@linuxbox ~]$ echo *
Desktop Documents ls-output.txt Music Pictures Public Templates Videos

So what just happened? Why didn't echo print “*”? As you recall from our work with wildcards, the “*” character means match any characters in a filename, but what we didn't see in our original discussion was how the shell does that. The simple answer is that the shell expands the “*” into something else (in this instance, the names of the files in the current working directory) before the echo command is executed. When the enter key is pressed, the shell automatically expands any qualifying characters on the command line before the command is carried out, so the echo command never saw the “*”, only its expanded result.


Pathname Expansion
The mechanism by which wildcards work is called pathname expansion. If we try some of the techniques that we employed in earlier chapters, we will see that they are really expansions. Given a home directory that looks like this:

[me@linuxbox ~]$ ls
Desktop Documents ls-output.txt Music Pictures Public Templates Videos

Pathname Expansion Of Hidden Files
As we know, filenames that begin with a period character are hidden. Pathname expansion also respects this behavior. An expansion such as:

echo *

does not reveal hidden files.
It might appear at first glance that we could include hidden files in an expansion by starting the pattern with a leading period, like this:

echo .*

However, if we examine the results closely, we will see that the names “.” and “..” will also appear in the results. Since these names refer to the current working directory and its parent directory, using this pattern will likely produce an incorrect result. We can see this if we try the command:

ls -d .[!.]*

This pattern expands into every filename that begins with a period, does not include a second period as its second character, and is followed by any other characters. The ls command with the -A option (“almost all”) will provide a correct listing of hidden files:

ls -A
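A disposable directory makes the difference easy to see; the file names below are invented for the demo:

```shell
# Build a scratch directory containing one visible and one hidden file.
dir=$(mktemp -d)
cd "$dir"
touch visible .hidden
echo *          # pathname expansion skips the hidden file: visible
echo .[!.]*     # a leading-period pattern that avoids "." and "..": .hidden
ls -A           # lists both names, without "." and ".."
```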

Tilde Expansion
As you may recall from our introduction to the cd command, the tilde character (“~”) has a special meaning. When used at the beginning of a word, it expands into the name of the home directory of the named user or, if no user is named, the home directory of the current user:

[me@linuxbox ~]$ echo ~
/home/me

Arithmetic Expansion
The shell allows arithmetic to be performed by expansion. This allows us to use the shell prompt as a calculator:

[me@linuxbox ~]$ echo $((2 + 2))
4

Arithmetic expansion uses the form:

$((expression))

where expression is an arithmetic expression consisting of values and arithmetic operators. Arithmetic expansion supports only integers (whole numbers with no decimal points).
Here are a few of the supported operators:
Table 7-1: Arithmetic Operators

Operator	Description
+	Addition
-	Subtraction
*	Multiplication
/	Division (but remember, since expansion supports only integer arithmetic, results are integers)
%	Modulo, which simply means “remainder”
**	Exponentiation

Spaces are not significant in arithmetic expressions, and expressions may be nested. For example, to multiply 5 squared by 3:

[me@linuxbox ~]$ echo $(($((5**2)) * 3))
75

Single parentheses may be used to group multiple subexpressions. With this technique, we can rewrite the example above and get the same result using a single expansion instead of two:

[me@linuxbox ~]$ echo $(((5**2) * 3))
75

Here is an example using the division and remainder operators. Notice the effect of integer division:

[me@linuxbox ~]$ echo Five divided by two equals $((5/2))
Five divided by two equals 2
[me@linuxbox ~]$ echo with $((5%2)) left over.
with 1 left over.


Brace Expansion
Perhaps the strangest expansion is called brace expansion. With it, you can create multiple text strings from a pattern containing braces. Here's an example:

[me@linuxbox ~]$ echo Front-{A,B,C}-Back
Front-A-Back Front-B-Back Front-C-Back

Patterns to be brace expanded may contain a leading portion called a preamble and a trailing portion called a postscript. The brace expression itself may contain either a comma-separated list of strings or a range of integers or single characters. The pattern may not contain embedded whitespace. Here is an example using a range of integers:

[me@linuxbox ~]$ echo Number_{1..5}
Number_1 Number_2 Number_3 Number_4 Number_5

Integers may also be zero-padded, like so:

[me@linuxbox ~]$ echo {001..15}
001 002 003 004 005 006 007 008 009 010 011 012 013 014 015

A range of letters in reverse order:

[me@linuxbox ~]$ echo {Z..A}
Z Y X W V U T S R Q P O N M L K J I H G F E D C B A

So what is this good for? The most common application is to make lists of files or directories to be created. For example, if we were photographers and had a large collection of images that we wanted to organize into years and months, the first thing we might do is create a series of directories named in numeric “Year-Month” format. This way, the directory names would sort in chronological order. We could type out a complete list of directories, but that's a lot of work, and it's error-prone too. Instead, we could do this:

[me@linuxbox ~]$ mkdir Photos
[me@linuxbox ~]$ cd Photos
[me@linuxbox Photos]$ mkdir {2007..2009}-{01..12}

Pretty slick!
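The same trick can be tried safely in a throwaway directory; this sketch assumes bash (zero-padded brace ranges need a reasonably recent version):

```shell
# 3 years x 12 months expands to 36 directory names in a single command.
dir=$(mktemp -d)
cd "$dir"
mkdir {2007..2009}-{01..12}
ls | wc -l    # prints: 36
```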
Parameter Expansion
We're only going to touch briefly on parameter expansion in this chapter; we'll cover it extensively later. It's a feature that is more useful in shell scripts than directly on the command line. Many of its capabilities have to do with the system's ability to store small chunks of data and to give each chunk a name. Many such chunks, more properly called variables, are available for your examination. For example, the variable named USER contains your username. To invoke parameter expansion and reveal the contents of USER, you would do this:

[me@linuxbox ~]$ echo $USER
me

To see a list of available variables, try this:

[me@linuxbox ~]$ printenv | less

You may have noticed that with other types of expansion, if you mistype a pattern, the expansion will not take place, and the echo command will simply display the mistyped pattern. With parameter expansion, if you misspell the name of a variable, the expansion will still take place but will result in an empty string.

Command Substitution
Command substitution allows us to use the output of a command as an expansion:

[me@linuxbox ~]$ echo $(ls)
Desktop Documents ls-output.txt Music Pictures Public Templates Videos

One of my favorites goes something like this:

[me@linuxbox ~]$ ls -l $(which cp)
-rwxr-xr-x 1 root root 71516 2007-12-05 08:58 /bin/cp

Here we passed the results of which cp as an argument to the ls command, thereby getting the listing of the cp program without having to know its full pathname. We are not limited to just simple commands. Entire pipelines can be used (only partial output shown):

[me@linuxbox ~]$ file $(ls -d /usr/bin/* | grep zip)
/usr/bin/bunzip2:      symbolic link to `bzip2'
/usr/bin/bzip2:        ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped
/usr/bin/bzip2recover: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped
/usr/bin/funzip:       ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped
/usr/bin/gpg-zip:      Bourne shell script text executable
/usr/bin/gunzip:       symbolic link to `../bin/gzip'
/usr/bin/mzip:         symbolic link to `mtools'

In this example, the results of the pipeline became the argument list of the file command.
There is an older syntax for command substitution in older shell programs that is also supported in bash. It uses backquotes instead of the dollar sign and parentheses:

[me@linuxbox ~]$ ls -l `which cp`
-rwxr-xr-x 1 root root 71516 2007-12-05 08:58 /bin/cp

Quoting
Now that we've seen how many ways the shell can perform expansions, it's time to learn how we can control it. Take, for example:

[me@linuxbox ~]$ echo this is a    test
this is a test

or:

[me@linuxbox ~]$ echo The total is $100.00
The total is 00.00

In the first example, word-splitting by the shell removed extra whitespace from the echo command's argument list. In the second example, parameter expansion substituted an empty string for the value of “$1” because it was an undefined variable. The shell provides a mechanism called quoting to selectively suppress unwanted expansions.

Double Quotes
The first type of quoting we will look at is double quotes. If you place text inside double quotes, all the special characters used by the shell lose their special meaning and are treated as ordinary characters. The exceptions are “$”, “\” (backslash), and “`” (backquote). This means that word-splitting, pathname expansion, tilde expansion, and brace expansion are suppressed, but parameter expansion, arithmetic expansion, and command substitution are still carried out. Using double quotes, we can cope with filenames containing embedded spaces. Say we were the unfortunate victim of a file called two words.txt. If we tried to use it on the command line, word-splitting would cause it to be treated as two separate arguments rather than the desired single argument:

[me@linuxbox ~]$ ls -l two words.txt
ls: cannot access two: No such file or directory
ls: cannot access words.txt: No such file or directory

By using double quotes, we stop the word-splitting and get the desired result; further, we can even repair the damage:

[me@linuxbox ~]$ ls -l "two words.txt"
-rw-rw-r-- 1 me me 18 2008-02-20 13:03 two words.txt
[me@linuxbox ~]$ mv "two words.txt" two_words.txt

There! Now we don't have to keep typing those pesky double quotes.
Remember, parameter expansion, arithmetic expansion, and command substitution still take place within double quotes:

[me@linuxbox ~]$ echo "$USER $((2+2)) $(cal)"
me 4
   February 2008
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29

We should take a moment to look at the effect of double quotes on command substitution. First, let's look a little deeper at how word-splitting works. In our earlier example, we saw how word-splitting appears to remove extra spaces in our text:

[me@linuxbox ~]$ echo this is a    test
this is a test

By default, word-splitting looks for the presence of spaces, tabs, and newlines (linefeed characters) and treats them as delimiters between words. This means that unquoted spaces, tabs, and newlines are not considered to be part of the text. They serve only as separators. Since they separate the words into different arguments, our example command line contains a command followed by four distinct arguments. If we add double quotes:

[me@linuxbox ~]$ echo "this is a    test"
this is a    test

word-splitting is suppressed and the embedded spaces are not treated as delimiters; rather, they become part of the argument. Once the double quotes are added, our command line contains a command followed by a single argument.

The fact that newlines are considered delimiters by the word-splitting mechanism causes an interesting, albeit subtle, effect on command substitution. Compare the unquoted and quoted forms:

[me@linuxbox ~]$ echo $(cal)
February 2008 Su Mo Tu We Th Fr Sa 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
[me@linuxbox ~]$ echo "$(cal)"
   February 2008
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29

In the first instance, the unquoted command substitution resulted in a command line containing 38 arguments. In the second, we got a command line with one argument that includes the embedded spaces and newlines.
Single Quotes
If we need to suppress all expansions, we use single quotes. Here is a comparison of unquoted, double quotes, and single quotes:

[me@linuxbox ~]$ echo text ~/*.txt {a,b} $(echo foo) $((2+2)) $USER
text /home/me/ls-output.txt a b foo 4 me
[me@linuxbox ~]$ echo "text ~/*.txt {a,b} $(echo foo) $((2+2)) $USER"
text ~/*.txt {a,b} foo 4 me
[me@linuxbox ~]$ echo 'text ~/*.txt {a,b} $(echo foo) $((2+2)) $USER'
text ~/*.txt {a,b} $(echo foo) $((2+2)) $USER

As we can see, with each succeeding level of quoting, more and more of the expansions are suppressed.
Escaping Characters
Sometimes we want to quote only a single character. To do this, we can precede a character with a backslash, which in this context is called the escape character. Often this is done inside double quotes to selectively prevent an expansion:

[me@linuxbox ~]$ echo "The balance for user $USER is: \$5.00"
The balance for user me is: $5.00

It is also common to use escaping to eliminate the special meaning of a character in a filename. For example, it is possible to use characters in filenames that normally have special meaning to the shell, such as “$”, “!”, “&”, spaces, and others. To include a special character in a filename, you can do this:

[me@linuxbox ~]$ mv bad\&filename good_filename

To allow a backslash character to appear, escape it by typing “\\”. Note that within single quotes, the backslash loses its special meaning and is treated as an ordinary character.


Backslash Escape Sequences
In addition to its role as the escape character, the backslash is also used as part of a notation to represent certain special characters called control codes. The first 32 characters in the ASCII coding scheme are used to transmit commands to teletype-like devices. Some of these codes are familiar (tab, backspace, linefeed, and carriage return), while others are not (null, end-of-transmission, and acknowledge).

Escape Sequence	Meaning
\a	Bell (an alert that causes the computer to beep)
\b	Backspace
\n	Newline. On Unix-like systems, this produces a linefeed.
\r	Carriage return
\t	Tab

The idea behind this representation using the backslash originated in the C programming language and has been adopted by many others, including the shell.
Adding the “-e” option to echo will enable interpretation of escape sequences. You may also place them inside $' '.

Summing Up
As we move forward with using the shell, we will find that expansions and quoting will be used with increasing frequency, so it makes sense to get a good understanding of the way they work. In fact, it could be argued that they are the most important subjects to learn about the shell. Without a proper understanding of expansion, the shell will always be a source of mystery and confusion, and much of its potential power will be wasted.


Further Reading

The bash man page has major sections on both expansion and quoting, which cover these topics in a more formal manner.
The Bash Reference Manual also contains chapters on expansion and quoting: http://www.gnu.org/software/bash/manual/bashref.html

8 – Advanced Keyboard Tricks
I often kiddingly describe Unix as “the operating system for people who like to type.” Of course, the fact that it even has a command line is a testament to that. But command line users don't like to type that much. Why else would so many commands have such short names like cp, ls, mv, and rm? In fact, one of the most cherished goals of the command line is laziness: doing the most work with the fewest number of keystrokes. Another goal is never having to lift your fingers from the keyboard, never reaching for the mouse. In this chapter, we will look at bash features that make keyboard use faster and more efficient.

The following commands will make an appearance:

clear – Clear the screen
history – Display the contents of the history list

Command Line Editing
bash uses a library (a shared collection of routines that different programs can use) called Readline to implement command line editing. We have already seen some of this. We know, for example, that the arrow keys move the cursor, but there are many more features. Think of these as additional tools that we can employ in our work. It's not important to learn all of them, but many of them are very useful. Pick and choose as desired.

Note: Some of the key sequences below (particularly those that use the Alt key) may be intercepted by the GUI for other functions. All of the key sequences should work properly when using a virtual console.


Cursor Movement
The following table lists the keys used to move the cursor:

Table 8-1: Cursor Movement Commands

Key	Action
Ctrl-a	Move cursor to the beginning of the line.
Ctrl-e	Move cursor to the end of the line.
Ctrl-f	Move cursor forward one character; same as the right arrow key.
Ctrl-b	Move cursor backward one character; same as the left arrow key.
Alt-f	Move cursor forward one word.
Alt-b	Move cursor backward one word.
Ctrl-l	Clear the screen and move the cursor to the top left corner. The clear command does the same thing.


Modifying Text
Table 8-2 lists keyboard commands that are used to edit characters on the command line.

Table 8-2: Text Editing Commands

Key	Action
Ctrl-d	Delete the character at the cursor location.
Ctrl-t	Transpose (exchange) the character at the cursor location with the one preceding it.
Alt-t	Transpose the word at the cursor location with the one preceding it.
Alt-l	Convert the characters from the cursor location to the end of the word to lowercase.
Alt-u	Convert the characters from the cursor location to the end of the word to uppercase.
Cutting And Pasting (Killing And Yanking) Text
The Readline documentation uses the terms killing and yanking to refer to what we would commonly call cutting and pasting. Items that are cut are stored in a buffer called the kill-ring.

Table 8-3: Cut And Paste Commands

Key	Action
Ctrl-k	Kill text from the cursor location to the end of the line.
Ctrl-u	Kill text from the cursor location to the beginning of the line.
Alt-d	Kill text from the cursor location to the end of the current word.
Alt-Backspace	Kill text from the cursor location to the beginning of the current word. If the cursor is at the beginning of a word, kill the previous word.
Ctrl-y	Yank text from the kill-ring and insert it at the cursor location.
The Meta Key
If you venture into the Readline documentation, which can be found in the READLINE section of the bash man page, you will encounter the term “meta key.” On modern keyboards this maps to the Alt key, but it wasn't always so.
Back in the dim times (before PCs but after Unix), not everybody had their own computer. What they might have had was a device called a terminal. A terminal was a communication device that featured a text display screen and a keyboard and just enough electronics inside to display text characters and move the cursor around. It was attached (usually by serial cable) to a larger computer or the communication network of a larger computer. There were many different brands of terminals, and they all had different keyboards and display feature sets. Since they all tended to at least understand ASCII, software developers wanting portable applications wrote to the lowest common denominator. Since the developers of Readline could not be sure of the presence of a dedicated extra control key, they invented one and called it “meta.” While the Alt key serves as the meta key on modern keyboards, you can also press and release the Esc key to get the same effect as holding down the Alt key.


Completion
Another way that the shell can help you is through a mechanism called completion. Completion occurs when you press the tab key while typing a command. Let's see how this works. Given a home directory that looks like this:

[me@linuxbox ~]$ ls
Desktop Documents ls-output.txt Music Pictures Public Templates Videos

try typing the following, but don't press the Enter key:

[me@linuxbox ~]$ ls l

Now press the tab key:

[me@linuxbox ~]$ ls ls-output.txt

See how the shell completed the line for you? Let's try another one. Again, don't press Enter:

[me@linuxbox ~]$ ls D

Press tab:

[me@linuxbox ~]$ ls D

No completion, just a beep. This happened because “D” matches more than one entry in the directory. For completion to be successful, the “clue” you give it has to be unambiguous. If we go further, to “Do”, and press tab again, the completion succeeds and “Documents” is filled in.

While this example shows completion of pathnames, which is its most common use, completion will also work on variables (if the beginning of the word is a “$”), user names (if the word begins with “~”), commands (if the word is the first word on the line), and hostnames (if the beginning of the word is “@”). Hostname completion works only for hostnames listed in /etc/hosts.
There are a number of control and meta key sequences that are associated with completion:

Table 8-4: Completion Commands

Key	Action
Alt-?	Display a list of possible completions. On most systems you can also do this by pressing the tab key a second time, which is much easier.
Alt-*	Insert all possible completions. This is useful when you want to use more than one possible match.

There are quite a few more that are rather obscure. You can see a list in the bash man page under “READLINE”.

Programmable Completion
Recent versions of bash have a facility called programmable completion. Programmable completion allows you (or more likely, your distribution provider) to add additional completion rules. Usually this is done to add support for specific applications. For example, it is possible to add completions for the option list of a command or to match particular file types that an application supports. Ubuntu has a fairly large set defined by default. Programmable completion is implemented by shell functions, a kind of mini shell script that we will cover in later chapters. If you are curious, try:

set | less

and see if you can find them. Not all distributions include them by default.
Using History
As we discovered in Chapter 1, bash maintains a history of commands that have been entered. This list of commands is kept in your home directory in a file called .bash_history. The history facility is a useful resource for reducing the amount of typing you have to do, especially when combined with command line editing.

Searching History
At any time, we can view the contents of the history list by doing:

[me@linuxbox ~]$ history | less

By default, bash stores the last 500 commands you have entered. We will see how to adjust this value in a later chapter. Let's say we want to find the commands we used to list /usr/bin. One way we could do this:

[me@linuxbox ~]$ history | grep /usr/bin

And let's say that among our results we got a line containing an interesting command like this:

88  ls -l /usr/bin > ls-output.txt

The number “88” is the line number of the command in the history list. We could use this immediately with another type of expansion called history expansion. To use our discovered line, we could do this:

[me@linuxbox ~]$ !88

bash expanded “!88” into the contents of the 88th line in the history list.
There are other forms of history expansion that we will cover a little later.
In addition, bash also provides the ability to search the history list incrementally. This means that we can tell bash to search the history list as we enter characters, with each additional character further refining our search. To start an incremental search, type Ctrl-r followed by the text you are looking for. When you find it, you can either press Enter to execute the command or press Ctrl-j to copy the line from the history list to the current command line. To find the next occurrence of the text (moving “up” the history list), type Ctrl-r again. To quit searching, press either Ctrl-g or Ctrl-c. Here we see it in action:

[me@linuxbox ~]$

First press Ctrl-r:

(reverse-i-search)`':

The prompt changes to indicate that we are performing a reverse incremental search. It is “reverse” because we are searching from “now” to some time in the past. Next, we start typing our search text. In this example, “/usr/bin”:

(reverse-i-search)`/usr/bin': ls -l /usr/bin > ls-output.txt

Immediately, the search returns our result. With our result, we can execute the command by pressing Enter, or we can copy the command to our current command line for further editing by pressing Ctrl-j. Let's copy it. Press Ctrl-j:

[me@linuxbox ~]$ ls -l /usr/bin > ls-output.txt

Our shell prompt returns, and our command line is loaded and ready for action!
The table below lists some of the keystrokes used to manipulate the history list:

Table 8-5: History Commands

Key	Action
Ctrl-p	Move to the previous history entry. Same action as the up arrow.
Ctrl-n	Move to the next history entry. Same action as the down arrow.
Alt-<	Move to the beginning (top) of the history list.
Alt->	Move to the end (bottom) of the history list, i.e., the current command line.
Ctrl-r	Reverse incremental search. Searches incrementally from the current command line up the history list.
Alt-p	Reverse search, non-incremental. With this key, type in the search string and press enter before the search is performed.
Alt-n	Forward search, non-incremental.
Ctrl-o	Execute the current item in the history list and advance to the next one. This is handy if you are trying to re-execute a sequence of commands in the history list.

History Expansion
The shell offers a specialized type of expansion for items in the history list by using the “!” character. We have already seen how the exclamation point can be followed by a number to insert an entry from the history list. There are a number of other expansion features:

Table 8-6: History Expansion Commands

Sequence	Action
!!	Repeat the last command. It is probably easier to press the up arrow and enter.
!number	Repeat history list item number.
!string	Repeat the last history list item starting with string.
!?string	Repeat the last history list item containing string.

I would caution against using the “!string” and “!?string” forms unless you are absolutely sure of the contents of the history list items.
There are many more elements available in the history expansion mechanism, but this subject is already too arcane and our heads may explode if we continue. The HISTORY EXPANSION section of the bash man page goes into all the gory details. Feel free to explore!

script
In addition to the command history feature in bash, most Linux distributions include a program called script that can be used to record an entire shell session and store it in a file. The basic syntax of the command is:

script [file]

where file is the name of the file used for storing the recording. If no file is specified, the file typescript is used. See the script man page for a complete list of the program's options and features.


Summing Up
In this chapter we have covered some of the keyboard tricks that the shell provides to help hardcore typists reduce their workloads. As time goes by and you become more involved with the command line, you can refer back to this chapter to pick up more of these tricks. For now, consider them optional and potentially helpful.

Further Reading

The Wikipedia has a good article on computer terminals: http://en.wikipedia.org/wiki/Computer_terminal

9 – Permissions
Operating systems in the Unix tradition differ from those in the MS-DOS tradition in that they are not only multitasking systems, but also multi-user systems. What exactly does this mean? It means that more than one person can be using the computer at the same time. While a typical computer will likely have only one keyboard and monitor, it can still be used by more than one user. For example, if a computer is attached to a network or the Internet, remote users can log in via ssh (secure shell) and operate the computer. The X Window System supports this as part of its basic design.
The multiuser capability of Linux is not a recent “innovation,” but rather a feature that is deeply embedded into the design of the operating system. Considering the environment in which Unix was created, this makes perfect sense. Years ago, before computers were “personal,” they were large, expensive, and centralized. A typical university computer system, for example, consisted of a large central computer located in one building and terminals that were located throughout the campus, each connected to the large central computer. The computer would support many users at the same time.
In order to make this practical, a method had to be devised to protect the users from each other. After all, the actions of one user could not be allowed to crash the computer, nor could one user interfere with the files belonging to another user.

In this chapter we are going to look at this essential part of system security and introduce
the following commands:



id – Display user identity
chmod – Change a file's mode
umask – Set the default file permissions
su – Run a shell as another user
sudo – Execute a command as another user
chown – Change a file's owner
chgrp – Change a file's group ownership
passwd – Change a user's password

Owners, Group Members, And Everybody Else
When we were exploring the system back in Chapter 3, we may have encountered a problem when trying to examine a file such as /etc/shadow:

[me@linuxbox ~]$ file /etc/shadow
/etc/shadow: regular file, no read permission
[me@linuxbox ~]$ less /etc/shadow
/etc/shadow: Permission denied

The reason for this error message is that, as regular users, we do not have permission to read this file.
In the Unix security model, a user may own files and directories. When a user owns a file or directory, the user has control over its access. Users can, in turn, belong to a group consisting of one or more users who are given access to files and directories by their owners. In addition to granting access to a group, an owner may also grant some set of access rights to everybody, which in Unix terms is referred to as the world. To find out information about your identity, use the id command:

[me@linuxbox ~]$ id
uid=500(me) gid=500(me) groups=500(me)

Let's look at the output. When user accounts are created, users are assigned a number called a user ID, or uid, which is then, for the sake of the humans, mapped to a username. The user is assigned a primary group ID, or gid, and may belong to additional groups. The above example is from a Fedora system. On other systems, such as Ubuntu, the uid and gid numbers will be different. This is simply because Fedora starts its numbering of regular user accounts at 500, while Ubuntu starts at 1000. An Ubuntu user also typically belongs to a lot more groups. This has to do with the way Ubuntu manages privileges for system devices and services.
So where does this information come from? Like so many things in Linux, from a couple of text files. User accounts are defined in the /etc/passwd file, and groups are defined in the /etc/group file. When user accounts and groups are created, these files are modified along with /etc/shadow, which holds information about the user's password.
For each user account, the /etc/passwd file defines the user (login) name, uid, gid, the account's real name, home directory, and login shell. If we examine the contents of /etc/passwd and /etc/group, we notice that besides the regular user accounts, there are accounts for the superuser (uid 0) and various other system users.
In the next chapter, when we cover processes, you will see that some of these other “users” are, in fact, quite busy.
While many Unix-like systems assign regular users to a common group such as “users”, modern Linux practice is to create a unique, single-member group with the same name as the user. This makes certain types of permission assignment easier.
Reading, Writing, And Executing
Access rights to files and directories are defined in terms of read access, write access, and execution access. If we look at the output of the ls command, we can get some clue as to how this is implemented:

[me@linuxbox ~]$ > foo.txt
[me@linuxbox ~]$ ls -l foo.txt
-rw-rw-r-- 1 me me 0 2008-03-06 14:52 foo.txt

The first ten characters of the listing are the file attributes. The first of these characters is the file type. Here are the file types you are most likely to see:

Table 9-1: File Types

Attribute	File Type
-	A regular file.
d	A directory.
l	A symbolic link. Notice that with symbolic links, the remaining file attributes are always “rwxrwxrwx” and are dummy values. The real file attributes are those of the file the symbolic link points to.
c	A character special file. This file type refers to a device that handles data as a stream of bytes, such as a terminal or modem.
b	A block special file. This file type refers to a device that handles data in blocks, such as a hard drive or CD-ROM drive.
The remaining nine characters of the file attributes, called the file mode, represent the read, write, and execute permissions for the file's owner, the file's group owner, and everybody else:

Owner	Group	World
rwx	rwx	rwx
When set, the r, w, and x mode attributes have the following effects on files and directories:

Table 9-2: Permission Attributes

Attribute	Files	Directories
r	Allows a file to be opened and read.	Allows a directory's contents to be listed if the execute attribute is also set.
w	Allows a file to be written to or truncated; however, this attribute does not allow files to be renamed or deleted. The ability to delete or rename files is determined by directory attributes.	Allows files within a directory to be created, deleted, and renamed if the execute attribute is also set.
x	Allows a file to be treated as a program and executed. Program files written in scripting languages must also be set as readable to be executed.	Allows a directory to be entered, e.g., cd directory.
Here are some examples of file attribute settings:

Table 9-3: Permission Attribute Examples

File Attributes	Meaning
-rwx------	A regular file that is readable, writable, and executable by the file's owner. No one else has any access.
-rw-------	A regular file that is readable and writable by the file's owner.
-rw-r--r--	A regular file that is readable and writable by the file's owner. Members of the file's owner group may read the file. The file is world-readable.
-rwxr-xr-x	A regular file that is readable and writable by the file's owner. The file may be read and executed by everybody else.
lrwxrwxrwx	A symbolic link. All symbolic links have “dummy” permissions. The real permissions are kept with the actual file pointed to by the symbolic link.
drwxrwx---	A directory. The owner and the members of the owner group may enter the directory and create, rename, and remove files within the directory.
drwxr-x---	A directory. The owner may enter the directory and create, rename, and delete files within the directory. Members of the owner group may enter the directory but cannot create, delete, or rename files.


chmod – Change File Mode
To change the mode (permissions) of a file or directory, the chmod command is used. Be aware that only the file's owner or the superuser can change the mode of a file or directory. chmod supports two distinct ways of specifying mode changes: octal number representation and symbolic representation. We will cover octal number representation first.

What The Heck Is Octal?
Octal (base 8) and its cousin, hexadecimal (base 16), are number systems often used to express numbers on computers. We humans, owing to the fact that we (or at least most of us) were born with ten fingers, count using a base 10 number system. Computers, on the other hand, were born with only one finger and thus do all their counting in binary (base 2). Their number system has only two numerals, 0 and 1. So in binary, counting looks like this:

0, 1, 10, 11, 100, 101, 110, 111, 1000, 1001, 1010, 1011...

In octal, counting is done with the numerals zero through seven, like so:

0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15, 16, 17, 20, 21...

Hexadecimal counting uses the numerals zero through nine plus the letters “A” through “F”:

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10, 11, 12, 13...

While we can see the sense in binary (since computers only have one finger), what are octal and hexadecimal good for? The answer has to do with human convenience. Many times, small portions of data are represented on computers as bit patterns. Take, for example, an RGB color. On most computer displays, each pixel is composed of three color components: eight bits of red, eight bits of green, and eight bits of blue. Reading and writing twenty-four-digit binary numbers all day gets old fast. Here's where other number systems help: each hexadecimal digit represents four binary digits, and each octal digit represents three binary digits.
These days, hexadecimal notation (often spoken as “hex”) is more common than octal, but as we shall soon see, octal's ability to express three bits of binary will be very useful.
With octal notation, we use octal numbers to set the pattern of desired permissions. Since each digit in an octal number represents three binary digits, this maps nicely to the scheme used to store the file mode:

Octal	Binary	File Mode
0	000	---
1	001	--x
2	010	-w-
3	011	-wx
4	100	r--
5	101	r-x
6	110	rw-
7	111	rwx

Using three octal digits, we can set the file mode for the owner, group owner, and world:

[me@linuxbox ~]$ chmod 600 foo.txt
[me@linuxbox ~]$ ls -l foo.txt
-rw------- 1 me me 0 2008-03-06 14:52 foo.txt

By passing the argument “600”, we were able to set the permissions of the owner to read and write while removing all permissions from the group owner and world. Though remembering the octal to binary mapping may seem inconvenient, you will usually only have to use a few common ones: 7 (rwx), 6 (rw-), 5 (r-x), 4 (r--), and 0 (---).
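The mapping is easy to verify in a scratch directory; this sketch assumes GNU coreutils, whose stat -c %a prints a file's octal mode:

```shell
# Create an empty file, set its mode with octal chmod, and read it back.
dir=$(mktemp -d)
cd "$dir"
> foo.txt
chmod 600 foo.txt
stat -c %a foo.txt   # prints: 600  (rw-------)
chmod 644 foo.txt
stat -c %a foo.txt   # prints: 644  (rw-r--r--)
```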
chmod also supports a symbolic notation for specifying file modes. Symbolic notation is divided into three parts: who the change will affect, which operation will be performed, and what permission will be set. To specify who is affected, a combination of the characters “u”, “g”, “o”, and “a” is used:

Table 9-4: chmod Symbolic Notation

Symbol	Meaning
u	Short for “user” but means the file or directory owner.
g	Group owner.
o	Short for “others”, but means world.
a	Short for “all.” The combination of “u”, “g”, and “o”.

If no character is specified, “all” will be assumed. The operation may be a “+” indicating that a permission is to be added, a “-” indicating that a permission is to be taken away, or a “=” indicating that only the specified permissions are to be applied and that all others are to be removed.
Permissions are specified with the “r”, “w”, and “x” characters. Here are some examples of symbolic notation:

Table 9-5: chmod Symbolic Notation Examples

Notation	Meaning
u+x	Add execute permission for the owner.
u-x	Remove execute permission from the owner.
+x	Add execute permission for the owner, group, and world. Equivalent to a+x.
o-rw	Remove the read and write permissions from anyone besides the owner and group owner.
go=rw	Set the group owner and anyone besides the owner to have read and write permission. If either the group owner or the world previously had execute permission, it is removed.
u+x,go=rx	Add execute permission for the owner and set the permissions for the group and others to read and execute. Multiple specifications may be separated by commas.


Some people prefer octal notation, while others really like the symbolic form. Symbolic notation offers the advantage of allowing you to set a single attribute without disturbing any of the others.
Take a look at the chmod man page for more details and a list of options. A word of caution regarding the “--recursive” option: it acts on both files and directories, so it's not as useful as one would hope, since we rarely want files and directories to have the same permissions.


Setting File Mode With The GUI
Now that we have seen how the permissions on files and directories are set, we can better understand the permission dialogs in the GUI. In both Nautilus (GNOME) and Konqueror (KDE), right-clicking a file or directory icon will expose a properties dialog. Here is an example from KDE 3.5:

[Figure: KDE 3.5 File Properties Dialog]

Here we can see the settings for the owner, group, and world. In KDE, clicking on the “Advanced Permissions” button brings up another dialog that allows you to set each of the mode attributes individually. Another victory for understanding brought to us by the command line!

umask – Set Default Permissions
The umask command controls the default permissions given to a file when it is created. It uses octal notation to express a mask of bits to be removed from a file's mode attributes. Let's take a look:

[me@linuxbox ~]$ rm -f foo.txt
[me@linuxbox ~]$ umask
0002
[me@linuxbox ~]$ > foo.txt
[me@linuxbox ~]$ ls -l foo.txt
-rw-rw-r-- 1 me me 0 2008-03-06 14:53 foo.txt

We first removed any old copy of foo.txt to make sure we were starting fresh. Next, we ran the umask command without an argument to see the current value. It responded with the value 0002 (the value 0022 is another common default), which is the octal representation of our mask. We then created a new instance of the file foo.txt and observed its permissions.
We can see that both the owner and group get read and write permission, while everyone else gets only read permission. The reason that world does not have write permission is the value of the mask.
Let's repeat our example, this time setting the mask ourselves:

[me@linuxbox ~]$ rm foo.txt
[me@linuxbox ~]$ umask 0000
[me@linuxbox ~]$ > foo.txt
[me@linuxbox ~]$ ls -l foo.txt
-rw-rw-rw- 1 me me 0 2008-03-06 14:58 foo.txt

When we set the mask to 0000 (effectively turning it off), we see that the file is now world writable.
...
If
we take the mask and expand it into binary, and then compare it to the attributes we can
see what happens:
Original file mode

--- rw- rw- rw-

Mask

000 000 000 010

Result

--- rw- rw- r--

Ignore for the moment the leading zeros (we'll get to those in a minute) and observe that
where the 1 appears in our mask, an attribute was removed—in this case, the world write
   
permission
...
Everywhere a 1 appears in the binary value of the
mask, an attribute is unset
...
Play
with some values (try some sevens) to get used to how this works
...
txt; umask 0002

Most of the time you won't have to change the mask; the default provided by your distribution will be fine
...
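The effect is easy to verify without disturbing your interactive shell by running the umask in a subshell (GNU stat assumed):

```shell
# With a 0022 mask, a newly created file gets 666 minus the masked bits: 644.
dir=$(mktemp -d)
cd "$dir"
(umask 0022; > masked.txt)   # the subshell's mask change does not leak out
stat -c %a masked.txt        # prints: 644  (rw-r--r--)
```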


Some Special Permissions
Though we usually see an octal permission mask expressed as a three-digit number, it is more technically correct to express it in four digits. Why? Because, in addition to read, write, and execute permissions, there are some other, less used, permission settings.
The first of these is the setuid bit (octal 4000). When applied to an executable file, it changes the effective user ID from that of the real user (the user actually running the program) to that of the program's owner. Most often this is given to a few programs owned by the superuser. When an ordinary user runs a program that is “setuid root”, the program runs with the effective privileges of the superuser. This allows the program to access files and directories that an ordinary user would normally be prohibited from accessing. Clearly, because this raises security concerns, the number of setuid programs must be held to an absolute minimum.
The second less-used setting is the setgid bit (octal 2000), which, like the setuid bit, changes the effective group ID from the real group ID of the real user to that of the file's group owner. If the setgid bit is set on a directory, newly created files in the directory will be given the group ownership of the directory rather than the group ownership of the file's creator. This is useful in a shared directory when members of a common group need access to all the files in the directory, regardless of the file owner's primary group.
The third is called the sticky bit (octal 1000). This is a holdover from ancient Unix, where it was possible to mark an executable file as “not swappable.” On files, Linux ignores the sticky bit, but if applied to a directory, it prevents users from deleting or renaming files unless the user is either the owner of the directory, the owner of the file, or the superuser. This is often used to control access to a shared directory, such as /tmp.
Here are some examples of using chmod with symbolic notation to set these special permissions. First, assigning setuid to a program:

chmod u+s program

Next, assigning setgid to a directory:

chmod g+s dir

Finally, assigning the sticky bit to a directory:

chmod +t dir

When viewing the output from ls, you can determine the special permissions. Here are some examples. First, a program that is setuid:

-rwsr-xr-x

A directory that has the setgid attribute:

drwxrwsr-x

A directory with the sticky bit set:

drwxrwxrwt
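A safe way to observe the sticky bit is on a scratch directory (GNU stat assumed; %A prints the symbolic mode string and %a the octal one):

```shell
# 1777 = the sticky bit (1000) plus rwxrwxrwx; note the trailing "t"
# in the symbolic mode string, just as in ls -ld output for /tmp.
dir=$(mktemp -d)
chmod 1777 "$dir"
stat -c %a "$dir"   # prints: 1777
stat -c %A "$dir"   # prints: drwxrwxrwt
```

Setting setuid or setgid meaningfully requires appropriate ownership, so the sticky bit is the easiest of the three to experiment with as a regular user.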

Changing Identities
At various times, we may find it necessary to take on the identity of another user. Often we want to gain superuser privileges to carry out some administrative task, but it is also possible to “become” another regular user for such things as testing an account. There are three ways to take on an alternate identity:

1. Log out and log back in as the alternate user.
2. Use the su command.
3. Use the sudo command.

We will skip the first technique since we know how to do it and it lacks the convenience of the other two. From within our own shell session, the su command allows you to assume the identity of another user and either start a new shell session with that user's IDs or issue commands as that user. The sudo command allows an administrator to set up a configuration file called /etc/sudoers and define specific commands that particular users are permitted to execute under an assumed identity. The choice of which command to use is largely determined by which Linux distribution you use. Your distribution probably includes both commands, but its configuration will favor either one or the other. We'll start with su.


su – Run A Shell With Substitute User And Group IDs
The su command is used to start a shell as another user. The command syntax looks like this:

su [-[l]] [user]

If the "-l" option is included, the resulting shell session is a login shell for the specified user. This means that the user's environment is loaded and the working directory is changed to the user's home directory. This is usually what we want. If the user is not specified, the superuser is assumed. Notice that (strangely) the "-l" may be abbreviated as "-", which is how it is most often used. To start a shell for the superuser, we would do this:

[me@linuxbox ~]$ su -
Password:
[root@linuxbox ~]#

After entering the command, we are prompted for the superuser's password. If it is successfully entered, a new shell prompt appears, indicating that this shell has superuser privileges (the trailing "#" rather than a "$"). Once in the new shell, we can carry out commands as the superuser. When finished, enter exit to return to the previous shell.

It is also possible to execute a single command rather than starting a new interactive shell, by using su this way:

su -c 'command'

Using this form, a single command line is passed to the new shell for execution. It is important to enclose the command in quotes, as we do not want expansion to occur in our shell, but rather in the new shell:

[me@linuxbox ~]$ su -c 'ls -l /root/*'
Password:
-rw------- 1 root root 754 2007-08-11 03:19 /root/anaconda-ks.cfg
sudo – Execute A Command As Another User
The sudo command is like su in many ways, but has some important additional capabilities. The administrator can configure sudo to allow an ordinary user to execute commands as a different user (usually the superuser) in a very controlled way. In particular, a user may be restricted to one or more specific commands and no others. Another important difference is that the use of sudo does not require access to the superuser's password. Authenticating with sudo requires the user's own password. Let's say, for example, that sudo has been configured to allow us to run a fictitious backup program called "backup_script", which requires superuser privileges. With sudo it would be done like this:

[me@linuxbox ~]$ sudo backup_script
Password:

After entering the command, we are prompted for our password (not the superuser's) and, once the authentication is complete, the specified command is carried out. One important difference between su and sudo is that sudo does not start a new shell, nor does it load another user's environment. This means that commands do not need to be quoted any differently than they would be without using sudo. This behavior can be overridden by specifying various options. See the sudo man page for details.
Ubuntu And sudo
One of the recurrent problems for regular users is how to perform certain tasks that require superuser privileges. These tasks include installing and updating software, editing system configuration files, and accessing devices. In the Windows world, this is often done by giving users administrative privileges. This allows users to perform these tasks. However, it also enables programs executed by the user to have the same abilities. This is desirable in most cases, but it also permits malware (malicious software) such as viruses to have free reign of the computer.

In the Unix world, there has always been a larger division between regular users and administrators, owing to the multiuser heritage of Unix. The approach taken in Unix is to grant superuser privileges only when needed. To do this, the su and sudo commands are commonly used.

Up until a few years ago, most Linux distributions relied on su for this purpose. su didn't require the configuration that sudo did, and having a root account is traditional in Unix. This introduced a problem: users were tempted to operate as root unnecessarily. In fact, some users operated their systems as the root user exclusively, since it does away with all those annoying "permission denied" messages. This is how you reduce the security of a Linux system to that of a Windows system. Not a good idea.

When Ubuntu was introduced, its creators took a different tack. By default, Ubuntu disables logins to the root account (by failing to set a password for the account), and instead uses sudo to grant superuser privileges. The initial user account is granted full access to superuser privileges via sudo.


chown – Change File Owner And Group
The chown command is used to change the owner and group owner of a file or directory. Superuser privileges are required to use this command. The syntax of chown looks like this:

chown [owner][:[group]] file...

chown can change the file owner and/or the file group owner depending on the first argument of the command. Here are some examples:

Table 9-7: chown Argument Examples

bob
    Changes the ownership of the file from its current owner to user bob.

bob:users
    Changes the ownership of the file from its current owner to user bob and changes the file group owner to group users.

:admins
    Changes the group owner to the group admins. The file owner is unchanged.

bob:
    Changes the file owner from the current owner to user bob and changes the group owner to the login group of user bob.
Let's say that we have two users: janet, who has access to superuser privileges, and tony, who does not. User janet wants to copy a file from her home directory to the home directory of user tony. Since user janet wants tony to be able to edit the file, janet changes the ownership of the copied file from root to tony:

[janet@linuxbox ~]$ sudo cp myfile.txt ~tony
Password:
[janet@linuxbox ~]$ sudo ls -l ~tony/myfile.txt
-rw-r--r-- 1 root root 8031 2008-03-20 14:30 /home/tony/myfile.txt
[janet@linuxbox ~]$ sudo chown tony: ~tony/myfile.txt
[janet@linuxbox ~]$ sudo ls -l ~tony/myfile.txt
-rw-r--r-- 1 tony tony 8031 2008-03-20 14:30 /home/tony/myfile.txt

Here we see user janet copy the file from her directory to the home directory of user tony. Next, janet changes the ownership of the file from root (a result of using sudo) to tony. Using the trailing colon in the first argument, janet also changed the group ownership of the file to the login group of tony, which happens to be group tony.

Notice that after the first use of sudo, janet was not prompted for her password? This is because sudo, in most configurations, "trusts" you for several minutes until its timer runs out.

In older versions of Unix, the chown command only changed file ownership, not group ownership. For that purpose, a separate command, chgrp, was used. It works much the same way as chown, except for being more limited.
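Although changing a file's owner requires superuser privileges, an ordinary user may use the group-changing form of chown (or chgrp) on his or her own files, provided the target group is one the user belongs to. A minimal sketch using a throwaway file and the current user's primary group:

```shell
file=$(mktemp)

group=$(id -gn)             # the current user's primary group name

chown :"$group" "$file"     # equivalent to: chgrp "$group" "$file"
stat -c '%G' "$file"        # prints the group owner we just set

rm "$file"
```

The leading colon with no owner is the ":admins"-style form from Table 9-7: only the group owner changes.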


Exercising Our Privileges
Now that we have learned how this permissions thing works, it's time to show it off. We are going to demonstrate the solution to a common problem: setting up a shared directory. Let's imagine that we have two users named "bill" and "karen." They both have music CD collections and wish to set up a shared directory where they will each store their music files. User bill has access to superuser privileges via sudo.

The first thing that needs to happen is creating a group that will have both bill and karen as members. Using the graphical user management tool, bill creates a group called music and adds users bill and karen to it:

Figure 3: Creating A New Group With GNOME

Next, bill creates the directory for the music files:

[bill@linuxbox ~]$ sudo mkdir /usr/local/share/Music
Password:

Since bill is manipulating files outside his home directory, superuser privileges are required. To make this directory sharable, bill needs to change the group ownership and the group permissions to allow writing:
[bill@linuxbox ~]$ sudo chown :music /usr/local/share/Music
[bill@linuxbox ~]$ sudo chmod 775 /usr/local/share/Music
[bill@linuxbox ~]$ ls -ld /usr/local/share/Music
drwxrwxr-x 2 root music 4096 2008-03-21 18:05 /usr/local/share/Music

So what does this all mean? It means that we now have a directory, /usr/local/share/Music, that is owned by root and allows read and write access to group music. Group music has members bill and karen; thus, bill and karen can create files in directory /usr/local/share/Music. Other users can list the contents of the directory but cannot create files there.

But we still have a problem. With the current permissions, files and directories created within the Music directory will have the normal permissions of the users bill and karen:

[bill@linuxbox ~]$ > /usr/local/share/Music/test_file
[bill@linuxbox ~]$ ls -l /usr/local/share/Music
-rw-r--r-- 1 bill bill 0 2008-03-24 20:03 test_file

Actually, there are two problems. First, the default umask on this system is 0022, which prevents group members from writing files belonging to other members of the group. This would not be a problem if the shared directory only contained files, but since this directory will store music, and music is usually organized in a hierarchy of artists and albums, members of the group will need the ability to create files and directories inside directories created by other members. We are going to change the umask used by bill and karen to 0002.

Second, each file and directory created by one member will be set to the primary group of the user rather than the group music. This can be fixed by setting the setgid bit on the directory, so that files and directories created within it are given the group ownership of the directory:

[bill@linuxbox ~]$ sudo chmod g+s /usr/local/share/Music

Now we test the new permissions. bill sets his umask to 0002, removes the previous test file, and creates a new test file and directory:
[bill@linuxbox ~]$ umask 0002
[bill@linuxbox ~]$ rm /usr/local/share/Music/test_file
[bill@linuxbox ~]$ > /usr/local/share/Music/test_file
[bill@linuxbox ~]$ mkdir /usr/local/share/Music/test_dir
[bill@linuxbox ~]$ ls -l /usr/local/share/Music
drwxrwsr-x 2 bill music 4096 2008-03-24 20:24 test_dir
-rw-rw-r-- 1 bill music    0 2008-03-24 20:22 test_file

Both files and directories are now created with the correct permissions to allow all members of the group music to create files and directories inside the Music directory.

The one remaining issue is umask. The necessary setting only lasts until the end of the session and must be reset. In the next chapter, we'll look at making the change to umask permanent.
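One effect of the setgid bit can be observed even without a second user or group: on Linux, subdirectories created inside a setgid directory inherit the bit (and the group) themselves, which is what keeps a deep artist/album hierarchy shared. A small sketch with a throwaway directory:

```shell
dir=$(mktemp -d)
chmod 2775 "$dir"               # group-writable, with the setgid bit set

mkdir "$dir/artist"
stat -c '%A' "$dir/artist"      # the group execute slot shows "s": inherited

rm -r "$dir"
```

This is why setting g+s once on the top of the Music tree is enough; bill and karen never need to repeat it on the directories they create inside.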


Changing Your Password
The last topic we'll cover in this chapter is setting passwords for yourself (and for other users, if you have access to superuser privileges). To set or change a password, the passwd command is used. The command syntax looks like this:

passwd [user]

To change your password, just enter the passwd command. You will be prompted for your old password and your new password.

The passwd command will try to enforce use of "strong" passwords. This means it will refuse to accept passwords that are too short, too similar to previous passwords, are dictionary words, or are too easily guessed:

[me@linuxbox ~]$ passwd
(current) UNIX password:
New UNIX password:
BAD PASSWORD: is too similar to the old one
New UNIX password:
BAD PASSWORD: it is WAY too short
New UNIX password:
BAD PASSWORD: it is based on a dictionary word

If you have superuser privileges, you can specify a username as an argument to the passwd command to set the password for another user. Other options are available to the superuser to allow account locking, password expiration, and so on. See the passwd man page for details.
Summing Up
In this chapter we have seen how Unix-like systems such as Linux manage user permissions to allow read, write, and execution access to files and directories. The basic ideas of this system of permissions date back to the early days of Unix and have stood up pretty well to the test of time.


Further Reading

Wikipedia has a good article on malware:
http://en.wikipedia.org/wiki/Malware

There are a number of command line programs used to create and maintain users and groups; see their man pages for details.

10 – Processes
Modern operating systems are usually multitasking, meaning that they create the illusion of doing more than one thing at once by rapidly switching from one executing program to another. The Linux kernel manages this through the use of processes. Processes are how Linux organizes the different programs waiting for their turn at the CPU.

Sometimes a computer will become sluggish or an application will stop responding. In this chapter, we will look at some of the tools available at the command line that let us examine what programs are doing, and how to terminate processes that are misbehaving.

This chapter will introduce the following commands:

ps – Report a snapshot of current processes
top – Display tasks
jobs – List active jobs
bg – Place a job in the background
fg – Place a job in the foreground
kill – Send a signal to a process
killall – Kill processes by name
shutdown – Shutdown or reboot the system

How A Process Works
When a system starts up, the kernel initiates a few of its own activities as processes and launches a program called init. init, in turn, runs a series of shell scripts (located in /etc) called init scripts, which start all the system services. Many of these services are implemented as daemon programs, programs that just sit in the background and do their thing without having any user interface. So even if we are not logged in, the system is at least a little busy performing routine stuff.

The fact that a program can launch other programs is expressed in the process scheme as a parent process producing a child process.

The kernel maintains information about each process to help keep things organized. For example, each process is assigned a number called a process ID or PID. The kernel also keeps track of the memory assigned to each process, as well as the processes' readiness to resume execution.


Viewing Processes
The most commonly used command to view processes (there are several) is ps. As we can see, by default, ps doesn't show us very much, just the processes associated with the current terminal session. To see more, we need to add some options, but before we do that, let's look at the other fields produced by ps. TTY is short for "Teletype," and refers to the controlling terminal for the process. Unix is showing its age here. The TIME field is the amount of CPU time consumed by the process.

If we add an option, we can get a bigger picture of what the system is doing:

[me@linuxbox ~]$ ps x
  PID TTY   STAT   TIME COMMAND
 2799 ?     Ssl    0:00 /usr/libexec/bonobo-activation-server --ac
 2820 ?     Sl     0:01 /usr/libexec/evolution-data-server-1.10 --
...
15797 ?     S      0:00 dcopserver --nosid

and many more. Adding the "x" option (note that there is no leading dash) tells ps to show all of our processes regardless of what terminal (if any) they are controlled by. The presence of a "?" in the TTY column indicates no controlling terminal. Using this option, we see a list of every process that we own.

Since the system is running a lot of processes, ps produces a long list. It is often helpful to pipe the output from ps into less for easier viewing. Some option combinations also produce long lines of output, so maximizing the terminal emulator window may be a good idea, too.

A new column titled STAT has been added to the output. STAT is short for "state" and reveals the current status of the process:
Table 10-1: Process States

R
    Running. This means that the process is running or ready to run.

S
    Sleeping. The process is not running; rather, it is waiting for an event, such as a keystroke or a network packet.

D
    Uninterruptible Sleep. The process is waiting for I/O, such as a disk drive.

T
    Stopped. The process has been instructed to stop. More on this later.

Z
    A defunct or "zombie" process. This is a child process that has terminated, but has not been cleaned up by its parent.

<
    A high priority process. It's possible to grant more importance to a process, giving it more time on the CPU. This property of a process is called niceness. A process with high priority is said to be less nice because it's taking more of the CPU's time, which leaves less for everybody else.

N
    A low priority process. A process with low priority (a "nice" process) will only get processor time after other processes with higher priority have been serviced.

The process state may be followed by other characters. These indicate various exotic process characteristics. See the ps man page for more detail.
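The STAT column can also be requested by itself. The -o option tells ps to print only the fields we name (a trailing "=" suppresses the header), which is handy for checking the state of one particular process. A small sketch inspecting the shell's own process ($$):

```shell
# pid, process state, and command name for the current shell
ps -o pid=,stat=,comm= -p $$
```

An interactive shell will normally report state S, since it spends most of its time sleeping while waiting for our next keystroke.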

Another popular set of options is "aux" (without a leading dash). This gives us even more information:

[me@linuxbox ~]$ ps aux

The resulting listing shows every process on the system, one per line. This set of options displays the processes belonging to every user. Using the options without the leading dash invokes the command with "BSD style" behavior. The Linux version of ps can emulate the behavior of the ps program found in several different Unix implementations.
With these options, we get these additional columns:

Table 10-2: BSD Style ps Column Headers

USER
    User ID. This is the owner of the process.

%CPU
    CPU usage in percent.

%MEM
    Memory usage in percent.

VSZ
    Virtual memory size.

RSS
    Resident Set Size. The amount of physical memory (RAM) the process is using.

START
    Time when the process started.

Viewing Processes Dynamically With top
While the ps command can reveal a lot about what the machine is doing, it provides only a snapshot of the machine's state at the moment the ps command is executed. To see a more dynamic view of the machine's activity, we use the top command:

[me@linuxbox ~]$ top

The top program displays a continuously updating display of the system processes, listed in order of process activity. The name "top" comes from the fact that the top program is used to see the "top" processes on the system. The top display consists of two parts: a system summary at the top of the display, followed by a table of processes sorted by CPU activity. The system summary contains a lot of good stuff. Here's a rundown:
Table 10-3: top Information Fields

Row 1
    (time of day)    Current time of day.
    up               This is called uptime. It is the amount of time since the machine was last booted.
    2 users          There are two users logged in.
    load average:    Load average refers to the number of processes that are waiting to run; that is, the number of processes that are in a runnable state and are sharing the CPU. Three values are shown, each for a different period of time. The first is the average for the last 60 seconds, the next the previous 5 minutes, and finally the previous 15 minutes. Values under 1.0 indicate that the machine is not busy.

Row 2
    Tasks:           This summarizes the number of processes and their various process states.

Row 3
    Cpu(s):          This row describes the character of the activities that the CPU is performing.
    0.7%us           0.7% of the CPU is being used for user processes. This means processes outside of the kernel itself.
    1.0%sy           1.0% of the CPU is being used for system (kernel) processes.
    0.0%ni           0.0% of the CPU is being used by "nice" (low priority) processes.
    98.3%id          98.3% of the CPU is idle.
    0.0%wa           0.0% of the CPU is waiting for I/O.

Row 4
    Mem:             Shows how physical RAM is being used.

Row 5
    Swap:            Shows how swap space (virtual memory) is being used.
The top program accepts a number of keyboard commands. The two most interesting are h, which displays the program's help screen, and q, which quits top. Both major desktop environments provide graphical applications that display information similar to top, but top is faster and consumes far fewer system resources. After all, our system monitor program shouldn't be the source of the system slowdown that we are trying to track.

Controlling Processes
Now that we can see and monitor processes, let's gain some control over them.
For our experiments, we're going to use a little program called xlogo as our guinea pig. The xlogo program is a sample program supplied with the X Window System, which simply displays a resizable window containing the X logo. First, we'll get to know our test subject:

[me@linuxbox ~]$ xlogo

After entering the command, a small window containing the logo should appear somewhere on the screen.

Tip: If your system does not include the xlogo program, try using gedit or kwrite instead.

We can verify that xlogo is running by resizing its window. If the logo is redrawn in the new size, the program is running.

Notice how our shell prompt has not returned? This is because the shell is waiting for the program to finish, just like all the other programs we have used so far. If we close the xlogo window, the prompt returns.
Interrupting A Process
Let's observe what happens when we run xlogo again. First, enter the xlogo command and verify that the program is running. Next, return to the terminal window and press Ctrl-c:

[me@linuxbox ~]$ xlogo
[me@linuxbox ~]$

In a terminal, pressing Ctrl-c interrupts a program. This usually means the program has been politely asked to terminate. After we pressed Ctrl-c, the xlogo window closed and the shell prompt returned.


Putting A Process In The Background
Let's say we wanted to get the shell prompt back without terminating the xlogo program. We'll do this by placing the program in the background. Think of the terminal as having a foreground (with stuff visible on the surface, like the shell prompt) and a background (with hidden stuff behind the surface). To launch a program so that it is immediately placed in the background, we follow the command with an "&" character:

[me@linuxbox ~]$ xlogo &
[1] 28236
[me@linuxbox ~]$

After entering the command, the xlogo window appeared and the shell prompt returned, but some funny numbers were printed too. This message is part of a shell feature called job control. With this message, the shell is telling us that we have started job number 1 ("[1]") and that it has PID 28236. If we run ps, we can see our process:

[me@linuxbox ~]$ ps
  PID TTY      TIME     CMD
10603 pts/1    00:00:00 bash
28236 pts/1    00:00:00 xlogo
28239 pts/1    00:00:00 ps

The shell's job control facility also gives us a way to list the jobs that have been launched from our terminal. Using the jobs command, we can see this list:

[me@linuxbox ~]$ jobs
[1]+ Running    xlogo &


Returning A Process To The Foreground
A process in the background is immune from terminal keyboard input, including any attempt to interrupt it with Ctrl-c. To return a process to the foreground, use the fg command, this way:

[me@linuxbox ~]$ jobs
[1]+ Running    xlogo &
[me@linuxbox ~]$ fg %1
xlogo

The fg command followed by a percent sign and the job number (called a jobspec) does the trick. If we only have one background job, the jobspec is optional. To terminate xlogo, press Ctrl-c.


Stopping (Pausing) A Process
Sometimes we'll want to stop a process without terminating it. This is often done to allow a foreground process to be moved to the background. To stop a foreground process, press Ctrl-z. Let's try it. At the command prompt, type xlogo, the Enter key, then Ctrl-z:

[me@linuxbox ~]$ xlogo
[1]+ Stopped       xlogo
[me@linuxbox ~]$

After stopping xlogo, we can verify that the program has stopped by attempting to resize the xlogo window. We can either restore the program to the foreground, using the fg command, or move the program to the background with the bg command:

[me@linuxbox ~]$ bg %1
[1]+ xlogo &
[me@linuxbox ~]$

As with the fg command, the jobspec is optional if there is only one job.

Why would you want to launch a graphical program from the command line? There are two reasons. First, the program you wish to run might not be listed on the window manager's menus (such as xlogo). Second, by launching a program from the command line, you might be able to see error messages that would otherwise be invisible if the program were launched graphically. Sometimes, a program will fail to start up when launched from the graphical menu. By launching it from the command line instead, we may see an error message that will reveal the problem.

Signals
The kill command is used to "kill" processes. This allows us to terminate programs that need killing. Here's an example:

[me@linuxbox ~]$ xlogo &
[1] 28401
[me@linuxbox ~]$ kill 28401
[1]+ Terminated    xlogo

We first launch xlogo in the background. The shell prints the jobspec and the PID of the background process. Next, we use the kill command and specify the PID of the process we want to terminate. We could have also specified the process using a jobspec (for example, "%1") instead of a PID.

While this is all very straightforward, there is more to it than that. The kill command doesn't exactly "kill" processes; rather, it sends them signals. Signals are one of several ways that the operating system communicates with programs. We have already seen signals in action with the use of Ctrl-c and Ctrl-z. When the terminal receives one of these keystrokes, it sends a signal to the program in the foreground. In the case of Ctrl-c, a signal called INT (Interrupt) is sent; with Ctrl-z, a signal called TSTP (Terminal Stop) is sent. Programs, in turn, "listen" for signals and may act upon them as they are received.


Sending Signals To Processes With kill
The kill command is used to send signals to programs
...


If no signal is specified on the command line, then the TERM (Terminate) signal is sent by
default
...
This is a vestige of the good old days
when terminals were attached to remote
117

10 – Processes
computers with phone lines and modems
...
” The effect of
this signal can be demonstrated by closing a
terminal session
...

This signal is also used by many daemon
programs to cause a reinitialization
...
The
Apache web server is an example of a daemon
that uses the HUP signal in this way
...
Performs the same function as the
Ctrl-c key sent from the terminal
...


9

KILL

Kill
...
Whereas programs
may choose to handle signals sent to them in
different ways, including ignoring them all
together, the KILL signal is never actually sent to
the target program
...
When a
process is terminated in this manner, it is given no
opportunity to “clean up” after itself or save its
work
...


15

TERM

Terminate
...
If a program is still “alive”
enough to receive signals, it will terminate
...
This will restore a process after a STOP
signal
...
This signal causes a process to pause
without terminating
...


118

Signals
Let's try out the kill command:
[me@linuxbox ~]$ xlogo &
[1] 13546
[me@linuxbox ~]$ kill -1 13546
[1]+ Hangup
xlogo

In this example, we start the xlogo program in the background and then send it a HUP signal with kill. The xlogo program terminates and the shell indicates that the background process has received a hangup signal. You may need to press the Enter key a couple of times before you see the message. Note that signals may be specified either by number or by name, including the name prefixed with the letters "SIG". Remember, you can also use jobspecs in place of PIDs.
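A signal's effect can also be read from a process's exit status: by shell convention, a command killed by a signal exits with 128 plus the signal number. A minimal sketch using sleep as the victim:

```shell
sleep 60 &
pid=$!
kill -TERM "$pid"    # signal 15, specified by name
wait "$pid"
echo $?              # 143, that is, 128 + 15

sleep 60 &
pid=$!
kill -9 "$pid"       # KILL, specified by number
wait "$pid"
echo $?              # 137, that is, 128 + 9
```

This convention is how scripts can tell the difference between a program that exited on its own and one that was killed.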

In addition to the list of signals above, which are most often used with kill, there are other signals frequently used by the system. Here are some of the common ones:

Table 10-5: Other Common Signals

3  QUIT
    Quit.

11  SEGV
    Segmentation Violation. This signal is sent if a program makes illegal use of memory; that is, if it tried to write somewhere it was not allowed to write.

20  TSTP
    Terminal Stop. This is the signal sent by the terminal when Ctrl-z is pressed. Unlike the STOP signal, the TSTP signal is received by the program, but the program may choose to ignore it.

28  WINCH
    Window Change. This is a signal sent by the system when a window changes size. Some programs, such as top and less, will respond to this signal by redrawing themselves to fit the new window dimensions.
For the curious, a complete list of signals can be seen with the following command:
[me@linuxbox ~]$ kill -l
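Shell scripts can receive signals too. The bash builtin trap registers a command to run when a given signal arrives; here is a minimal sketch that sends a signal to the shell's own process:

```shell
got_signal=no

# Run the quoted command whenever SIGUSR1 is received.
trap 'got_signal=yes' USR1

# $$ is the PID of the current shell; signal ourselves.
kill -USR1 $$

echo "$got_signal"    # yes
```

Without the trap line, the default action for USR1 would have terminated the shell instead.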

Sending Signals To Multiple Processes With killall
It's also possible to send signals to multiple processes matching a specified program or username by using the killall command. Here is the syntax:

killall [-u user] [-signal] name...
To demonstrate, we will start a couple of instances of the xlogo program and then terminate them:
[me@linuxbox ~]$ xlogo &
[1] 18801
[me@linuxbox ~]$ xlogo &
[2] 18802
[me@linuxbox ~]$ killall xlogo
[1]- Terminated
xlogo
[2]+ Terminated
xlogo

Remember, as with kill, you must have superuser privileges to send signals to processes that do not belong to you
More Process Related Commands
Since monitoring processes is an important task, there are a lot of commands for it. Here are some to play with:

Table 10-6: Other Process Related Commands

pstree
    Outputs a process list arranged in a tree-like pattern showing the parent/child relationships between processes.

vmstat
    Outputs a snapshot of system resource usage including memory, swap, and disk I/O. To see a continuous display, follow the command with a time delay (in seconds) for updates; for example, vmstat 5. Terminate the output with Ctrl-c.

xload
    A graphical program that draws a graph showing system load over time.

tload
    Similar to the xload program, but draws the graph in the terminal. Terminate the output with Ctrl-c.


Summing Up
Most modern systems feature a mechanism for managing multiple processes, and Linux provides a rich set of tools for this purpose. Given that Linux is the world's most deployed server operating system, this makes a lot of sense. Though there are graphical process tools for Linux, the command line tools are greatly preferred because of their speed and light footprint.

Part 2 – Configuration And The Environment

11 – The Environment
As we discussed earlier, the shell maintains a body of information during our shell session called the environment. Data stored in the environment is used by programs to determine facts about our configuration. While most programs use configuration files to store program settings, some programs will also look for values stored in the environment to adjust their behavior.

In this chapter, we will work with the following commands:

printenv – Print part or all of the environment
set – Set shell options
export – Export environment to subsequently executed programs
alias – Create an alias for a command

What Is Stored In The Environment?
The shell stores two basic types of data in the environment, though, with bash, the types are largely indistinguishable. They are environment variables and shell variables. Shell variables are bits of data placed there by bash, and environment variables are basically everything else. In addition to variables, the shell also stores some programmatic data, namely aliases and shell functions. We covered aliases in Chapter 5, and shell functions (which are related to shell scripting) will be covered in Part 4.

Examining The Environment
To see what is stored in the environment, we can use either the set builtin in bash or the printenv program. The set command will show both the shell and environment variables, while printenv will only display the latter.
Since the list of the environment's contents will be fairly long, it is best to pipe the output of either command into less:

[me@linuxbox ~]$ printenv | less

Doing so, we should get something that looks like this:

SHELL=/bin/bash
TERM=xterm
HISTSIZE=1000
XDG_SESSION_COOKIE=6d7b05c65846c3eaf3101b0046bd2b001208521990
USER=me

and many more. What we see is a list of environment variables and their values. For example, we see a variable called USER, which contains the value "me". The printenv command can also list the value of a specific variable:

[me@linuxbox ~]$ printenv USER
me

The set command, when used without options or arguments, will display both the shell and environment variables, as well as any defined shell functions. Unlike printenv, its output is courteously sorted in alphabetical order:

[me@linuxbox ~]$ set | less
It is also possible to view the contents of a variable using the echo command, like this:

[me@linuxbox ~]$ echo $HOME
/home/me
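The practical difference between a shell variable and an environment variable shows up when a child process is started: only variables that have been exported are passed along. A minimal sketch (foo is a made-up variable name):

```shell
foo="hello"                            # a plain shell variable
bash -c 'echo "child sees: [$foo]"'    # child sees: []   (not inherited)

export foo                             # place it in the environment
bash -c 'echo "child sees: [$foo]"'    # child sees: [hello]
```

The single quotes matter: they keep $foo from being expanded by our own shell, so the child does the lookup itself.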

One element of the environment that neither set nor printenv displays is aliases. To see them, enter the alias command without arguments:

[me@linuxbox ~]$ alias
alias l.='ls -d .* --color=tty'
alias ll='ls -l --color=tty'


The environment contains quite a few variables, and though your environment may differ from the one presented here, you will likely see the following variables in your environment:

Table 11-1: Environment Variables

DISPLAY
    The name of your display if you are running a graphical environment. Usually this is ":0", meaning the first display generated by the X server.

EDITOR
    The name of the program to be used for text editing.

SHELL
    The name of your shell program.

HOME
    The pathname of your home directory.

LANG
    Defines the character set and collation order of your language.

OLDPWD
    The previous working directory.

PAGER
    The name of the program to be used for paging output. This is often set to /usr/bin/less.

PATH
    A colon-separated list of directories that are searched when you enter the name of an executable program.

PS1
    Prompt String 1. This defines the contents of your shell prompt.

PWD
    The current working directory.

TERM
    The name of your terminal type. Unix-like systems support many terminal protocols; this variable sets the protocol to be used with your terminal emulator.

TZ
    Specifies your timezone. Most Unix-like systems maintain the computer's internal clock in Coordinated Universal Time (UTC) and then display the local time by applying an offset specified by this variable.

USER
    Your username.

Don't worry if some of these values are missing. They vary by distribution.


How Is The Environment Established?
When we log on to the system, the bash program starts and reads a series of configuration scripts called startup files, which define the default environment shared by all users. The exact sequence depends on the type of shell session being started. There are two kinds: a login shell session and a non-login shell session.

A login shell session is one in which we are prompted for our username and password; when we start a virtual console session, for example.

Login shells read one or more startup files, as shown in Table 11-2:

Table 11-2: Startup Files For Login Shell Sessions

/etc/profile
    A global configuration script that applies to all users.

~/.bash_profile
    A user's personal startup file. Can be used to extend or override settings in the global configuration script.

~/.bash_login
    If ~/.bash_profile is not found, bash attempts to read this script.

~/.profile
    If neither ~/.bash_profile nor ~/.bash_login is found, bash attempts to read this file.

Non-login shell sessions read the following startup files:
Table 11-3: Startup Files For Non-Login Shell Sessions

/etc/bash.bashrc
    A global configuration script that applies to all users.

~/.bashrc
    A user's personal startup file. Can be used to extend or override settings in the global configuration script.

Take a look at your system and see which of these startup files you have. Remember, since most of the filenames listed above start with a period (meaning that they are hidden), you will need to use the "-a" option with ls.

The ~/.bashrc file is probably the most important startup file from the ordinary user's point of view, since it is almost always read. Non-login shells read it by default, and most startup files for login shells are written in such a way as to read the ~/.bashrc file as well.

What's In A Startup File?
If we take a look inside a typical .bash_profile, it looks something like this:

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

Lines that begin with a "#" are comments and are not read by the shell. The first interesting thing occurs on the fourth line, with the following code:

if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

This is called an if compound command, which we will cover fully when we get to shell scripting in Part 4, but for now we will translate:

If the file "~/.bashrc" exists, then read the "~/.bashrc" file.

We can see that this bit of code is how a login shell gets the contents of .bashrc.

Ever wonder how the shell knows where to find commands when we enter them on the command line? For example, when we enter ls, the shell does not search the entire computer to find /bin/ls (the full pathname of the ls command); rather, it searches a list of directories that are contained in the PATH variable.

The PATH variable is often set by a startup file with code like this:

PATH=$PATH:$HOME/bin

PATH is modified to add the directory $HOME/bin to the end of the list. This is an example of parameter expansion, which we touched on in Chapter 7. To demonstrate how this works, try the following:

[me@linuxbox ~]$ foo="This is some"
[me@linuxbox ~]$ echo $foo
This is some
[me@linuxbox ~]$ foo="$foo text."
[me@linuxbox ~]$ echo $foo
This is some text.

Using this technique, we can append text to the end of a variable's contents.

By adding the string $HOME/bin to the end of the PATH variable's contents, the directory $HOME/bin is added to the list of directories searched when a command is entered. This means that when we want to create a directory within our home directory for storing our own private programs, the shell is ready to accommodate us. All we have to do is call it bin, and we're ready to go.

Note: Many distributions provide this PATH setting by default. Some Debian-based distributions, such as Ubuntu, test for the existence of the ~/bin directory at login, and dynamically add it to the PATH variable if the directory is found.
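The effect of putting a directory on PATH is easy to try without touching any startup file, since an assignment made at the prompt lasts only for the current session. A minimal sketch using a throwaway directory in place of ~/bin (the program name hello is made up):

```shell
bindir=$(mktemp -d)     # stand-in for $HOME/bin

# Create a tiny program inside it...
printf '#!/bin/bash\necho hello from my private program\n' > "$bindir/hello"
chmod 755 "$bindir/hello"

# ...and append the directory to PATH, just as the startup file does.
PATH="$PATH:$bindir"

hello                   # prints: hello from my private program

rm -r "$bindir"
```

Once the change is made in .bash_profile instead, the same search happens automatically in every login session.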


Modifying The Environment
Since we know where the startup files are and what they contain, we can modify them to customize our environment.

Which Files Should We Modify?
As a general rule, to add directories to your PATH, or define additional environment variables, place those changes in .bash_profile (or the equivalent, according to your distribution; for example, Ubuntu uses .profile). For everything else, place the changes in .bashrc. It is certainly possible to change the files in /etc such as profile, and in many cases it would be sensible to do so, but for now, let's play it safe.

Text Editors
To edit (i.e., modify) the shell's startup files, as well as most of the other configuration files on the system, we use a program called a text editor.
A text editor is a program that is, in some ways, like a word processor in that it allows you to edit the words on the screen with a moving cursor. Text editors are the central tool used by software developers to write code, and by system administrators to manage the configuration files that control the system.

There are a lot of different text editors available for Linux; your system probably has several installed. Why so many different ones? Probably because programmers like writing them, and since programmers use them extensively, they write editors to express their own desires as to how they should work.

Text editors fall into two basic categories: graphical and text based. GNOME and KDE both include some popular graphical editors. GNOME ships with an editor called gedit, which is usually called "Text Editor" in the GNOME menu. KDE usually ships with three, which are (in order of increasing complexity) kedit, kwrite, and kate.

There are many text-based editors. The popular ones you will encounter are nano, vi, and emacs. The nano editor is a simple, easy-to-use editor designed as a replacement for the pico editor supplied with the PINE email suite. The vi editor (on most Linux systems replaced by a program named vim, which is short for "Vi IMproved") is the traditional editor for Unix-like systems. It will be the subject of our next chapter. The emacs editor was originally written by Richard Stallman. It is a gigantic, all-purpose, does-everything programming environment. While readily available, it is seldom installed on most Linux systems by default.
Using A Text Editor
Text editors may be invoked from the command line by typing the name of the editor followed by the name of the file you want to edit. If the file does not already exist, the editor will assume that you want to create a new file.

All graphical text editors are pretty self-explanatory, so we won't cover them here. Instead, we will concentrate on our first text-based editor, nano. Let's fire up nano and edit the .bashrc file. But before we do that, let's practice some "safe computing." Whenever we edit an important configuration file, it is always a good idea to create a backup copy of the file first. This protects us in case we mess the file up while editing. To create a backup of the .bashrc file, do this:

[me@linuxbox ~]$ cp .bashrc .bashrc.bak

It doesn't matter what you call the backup file; just pick an understandable name. The extensions ".bak", ".sav", ".old", and ".orig" are all popular ways of indicating a backup file.

Now that we have a backup file, we'll start the editor:

[me@linuxbox ~]$ nano .bashrc

Once nano starts, we'll get a screen like this:

GNU nano                        File: .bashrc

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# User specific aliases and functions

[ Read 8 lines ]
^G Get Help ^O WriteOut ^R Read Fil ^Y Prev Pag ^K Cut Text ^C Cur Pos
^X Exit     ^J Justify  ^W Where Is ^V Next Pag ^U UnCut Te ^T To Spell

Note: If your system does not have nano installed, you may use a graphical editor instead.

The screen consists of a header at the top, the text of the file being edited in the middle, and a menu of commands at the bottom. Since nano was designed to replace the text editor supplied with an email client, it is rather short on editing features.

The first command we should learn is how to exit the program. In the case of nano, you type Ctrl-x to exit. This is indicated in the menu at the bottom of the screen. The notation "^X" means Ctrl-x. This is a common notation for control characters used by many programs.

The second command we need to know is how to save our work: Ctrl-o, shown in the menu as "^O WriteOut". With this knowledge under our belts, we're ready to do some editing. Using the down arrow key and/or the PageDown key, move the cursor to the end of the file, then add the following lines to the .bashrc file:

umask 0002
export HISTCONTROL=ignoredups
export HISTSIZE=1000
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'

Note: Your distribution may already include some of these, but duplicates won't hurt anything.

Here is the meaning of our additions:

Table 11-4: Additions To Our .bashrc

umask 0002
    Sets the umask to solve the problem with shared directories we discussed in Chapter 9.

export HISTCONTROL=ignoredups
    Causes the shell's history recording feature to ignore a command if the same command was just recorded.

export HISTSIZE=1000
    Increases the size of the command history to 1000 lines.

alias l.='ls -d .* --color=auto'
    Creates a new command called "l." which displays all directory entries that begin with a dot.

alias ll='ls -l --color=auto'
    Creates a new command called "ll" which displays a long format directory listing.

As we can see, many of our additions are not intuitively obvious, so it would be a good idea to add some comments to our .bashrc file to help explain things to the humans.

Using the editor, change our additions to look like this:

# Change umask to make directory sharing easier
umask 0002

# Ignore duplicates in command history and increase
# history size to 1000 lines
export HISTCONTROL=ignoredups
export HISTSIZE=1000

# Add some helpful aliases
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'

Ah, much better! With our changes complete, press Ctrl-o to save our modified .bashrc file, and Ctrl-x to exit nano.


Why Comments Are Important
Whenever you modify configuration files, it's a good idea to add some comments to document your changes. Sure, you will remember what you changed tomorrow, but what about six months from now? Do yourself a favor and add some comments. While you're at it, it's not a bad idea to keep a log of what changes you make.

Shell scripts and bash startup files use a "#" symbol to begin a comment. Other configuration files may use other symbols. Most configuration files will have comments. Use them as a guide.

You will often see lines in configuration files that are commented out to prevent them from being used by the affected program. This is done to give the reader suggestions for possible configuration choices or examples of correct configuration syntax. For example, the .bashrc file of Ubuntu 8.04 contains commented-out alias definitions. If you remove the leading "#" symbols from these lines, a technique called uncommenting, you will activate the aliases. Conversely, if you add a "#" symbol to the beginning of a line, you can deactivate it while keeping the information it contains.

Activating Our Changes
The changes we have made to our .bashrc will not take effect until we close our terminal session and start a new one, because the .bashrc file is only read at the beginning of a session. However, we can force bash to re-read the modified .bashrc file with the following command:

[me@linuxbox ~]$ source .bashrc

After doing so, we should be able to see the effect of our changes. Try out one of the new aliases:

[me@linuxbox ~]$ ll

Summing Up
In this chapter we learned an essential skill: editing configuration files with a text editor. Moving forward, as we read man pages for commands, take note of the environment variables that commands support. There may be a gem or two.


Further Reading

The INVOCATION section of the bash man page covers the bash startup files in gory detail.

12 – A Gentle Introduction To vi
There is an old joke about a visitor to New York asking a passerby for directions to the city's famous Carnegie Hall. The reply: "Practice, practice, practice!" Learning the Linux command line, like becoming an accomplished pianist, is not something that we pick up in an afternoon. It takes years of practice. vi is somewhat notorious for its difficult user interface, but when we see a master sit down at the keyboard and begin to "play," we will indeed be witness to some great art. We will not become masters in this chapter, but when we are done, we will know how to "play chopsticks" in vi.


Why We Should Learn vi
In this modern age of graphical editors and easy-to-use text-based editors such as nano, why should we learn vi? There are three good reasons:

1. vi is always available. This can be a lifesaver if we have a system with no graphical interface, such as a remote server or a local system with a broken X configuration. nano, while increasingly popular, is still not universal.

2. vi is lightweight and fast. For many tasks, it's easier to bring up vi than it is to find the graphical text editor in the menus and wait for its multiple megabytes to load. In addition, vi is designed for typing speed. As we shall see, a skilled vi user never has to lift his or her fingers from the keyboard while editing.

3. We don't want other Linux and Unix users to think we are sissies.

Okay, maybe two good reasons.

A Little Background
The first version of vi was written in 1976 by Bill Joy, a University of California at Berkeley student who later went on to co-found Sun Microsystems. vi gets its name from the word "visual," because it was intended to allow editing on a video terminal with a moving cursor. Previous to visual editors, there were line editors, which operated on a single line of text at a time. With the advent of video terminals (rather than printer-based terminals like teletypes), visual editing became possible.

Most Linux distributions don't include real vi; rather, they ship with an enhanced replacement called vim (which is short for "vi improved") written by Bram Moolenaar. vim is usually symbolically linked (or aliased) to the name "vi" on Linux systems. In the discussions that follow, we will assume that we have a program called "vi" that is really vim.

Starting And Stopping vi
To start vi, we simply enter the following:

[me@linuxbox ~]$ vi

And a screen like this should appear:

~               VIM - Vi IMproved
~
~                  version 7.1
~            by Bram Moolenaar et al.
~    Vim is open source and freely distributable
~
~            Sponsor Vim development!
~   type  :help sponsor        for information
~
~   type  :q                   to exit
~   type  :help                for on-line help
~   type  :help version7       for version info
~
~         Running in Vi compatible mode
~   type  :set nocp            for Vim defaults
~   type  :help cp-default     for info on this

Just as we did with nano earlier, the first thing to learn is how to exit. To exit, we enter the following command (note that the colon character is part of the command):

:q

The shell prompt should return. If, for some reason, vi will not quit (usually because we made a change to a file that has not yet been saved), we can tell vi that we really mean it by adding an exclamation point to the command:

:q!

Tip: If you get "lost" in vi, try pressing the Esc key twice to find your way again.

Compatibility Mode: In the example startup screen above (taken from Ubuntu 8.04), we see the text "Running in Vi compatible mode." This means vim will run in a mode that is closer to the normal behavior of vi rather than the enhanced behavior of vim. For purposes of this chapter, we will want to run vim with its enhanced behavior. Try running the vim command instead of vi. If that works, consider adding alias vi='vim' to your .bashrc file. Alternatively, use this command to add a line to your vim configuration file:

echo "set nocp" >> ~/.vimrc

Note: Many Linux distributions install a minimal version of vim by default that only supports a limited set of vim features. While performing the lessons that follow, you may encounter missing features. If this is the case, install the full version of vim.
Editing Modes
Let's start up vi and edit a little file. This is how we can create a new file with vi:

[me@linuxbox ~]$ rm -f foo.txt
[me@linuxbox ~]$ vi foo.txt

If all goes well, we should get a screen like this:

~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
"foo
...
This shows that
we have an empty file
...
When vi starts up, it begins in command mode
...


139

12 – A Gentle Introduction To vi

Entering Insert Mode
In order to add some text to our file, we must first enter insert mode. To do this, we press the "i" key. Afterward, we should see the following at the bottom of the screen if vim is running in its usual enhanced mode (this will not appear in vi compatible mode):
-- INSERT --

Now we can enter some text. Try typing the following:

The quick brown fox jumped over the lazy dog.


To exit insert mode and return to command mode, press the Esc key.

To save the change we just made to our file, we must enter an ex command while in command mode. This is easily done by pressing the ":" key. After doing this, a colon character should appear at the bottom of the screen. To write our modified file, we follow the colon with a "w" and then Enter:

:w

The file will be written and we should get a confirmation message at the bottom of the screen, like this:

"foo.txt" [New] 1L, 46C written

Tip: If you read the vim documentation, you will notice that (confusingly) command mode is called normal mode and ex commands are called command mode. Beware.


Moving The Cursor Around
While in command mode, vi offers a large number of movement commands, some of which it shares with less. Here is a subset:

Table 12-1: Cursor Movement Keys
Key – Moves The Cursor
l or Right Arrow – Right one character.
h or Left Arrow – Left one character.
j or Down Arrow – Down one line.
k or Up Arrow – Up one line.
0 (zero) – To the beginning of the current line.
^ – To the first non-whitespace character on the current line.
$ – To the end of the current line.
w – To the beginning of the next word or punctuation character.
W – To the beginning of the next word, ignoring punctuation characters.
b – To the beginning of the previous word or punctuation character.
B – To the beginning of the previous word, ignoring punctuation characters.
Ctrl-f or Page Down – Down one page.
Ctrl-b or Page Up – Up one page.
numberG – To line number. For example, 1G moves to the first line of the file.
G – To the last line of the file.

Many commands in vi can be prefixed with a number, as with the "G" command listed above. By prefixing a command with a number, we may specify the number of times a command is to be carried out. For example, the command "5j" causes vi to move the cursor down five lines.
...
Basic Editing
Most editing involves a few basic operations: inserting text, deleting text, and moving text around by cutting and pasting. vi, of course, supports all of these operations in its own unique way. It also provides a limited form of undo. If we press the "u" key while in command mode, vi will undo the last change that we made. This will come in handy as we try out some of the basic editing commands.


Appending Text
vi has several different ways of entering insert mode. We have already used the i command to insert text.

Let's go back to our foo.txt file for a moment:

The quick brown fox jumped over the lazy dog.

If we wanted to add some text to the end of this sentence, we would discover that the i command will not do it, since we can't move the cursor beyond the end of the line. vi provides a command to append text, the sensibly named a command. If we move the cursor to the end of the line and type "a", the cursor will move past the end of the line and vi will enter insert mode. This will allow us to add some more text:

The quick brown fox jumped over the lazy dog. It was cool.

Remember to press the Esc key to exit insert mode.

Since we will almost always want to append text to the end of a line, vi offers a shortcut to move to the end of the current line and start appending. It's the "A" command. Let's try it and add some more lines to our file. First, we'll move the cursor to the beginning of the line using the 0 (zero) command.

Now we type "A" and add the following lines of text:

The quick brown fox jumped over the lazy dog. It was cool.

Line 2
Line 3
Line 4
Line 5

Again, press the Esc key to exit insert mode.

Opening A Line
Another way we can insert text is by "opening" a line. This inserts a blank line between two existing lines and enters insert mode. This has two variants:

Table 12-2: Line Opening Keys
Command – Opens
o – The line below the current line.
O – The line above the current line.

We can demonstrate this as follows: place the cursor on "Line 3" then press the o key:

The quick brown fox jumped over the lazy dog. It was cool.

Line 2
Line 3

Line 4
Line 5

A new line was opened below the third line and we entered insert mode. Exit insert mode by pressing the Esc key. Press the u key to undo our change.

Press the O key to open the line above the cursor:
The quick brown fox jumped over the lazy dog. It was cool.

Line 2


Line 3
Line 4
Line 5

Exit insert mode by pressing the Esc key and undo our change by pressing u
Deleting Text
As we might expect, vi offers a variety of ways to delete text, all of which contain one of two keystrokes. First, the x key will delete a character at the cursor location. x may be preceded by a number specifying how many characters are to be deleted. The d key is more general purpose. Like x, it may be preceded by a number specifying the number of times the deletion is to be performed. In addition, d is always followed by a movement command that controls the size of the deletion. Here are some examples:

Table 12-3: Text Deletion Commands
Command – Deletes
x – The current character.
3x – The current character and the next two characters.
dd – The current line.
5dd – The current line and the next four lines.
dW – From the current cursor position to the beginning of the next word.
d$ – From the current cursor location to the end of the current line.
d0 – From the current cursor location to the beginning of the line.
d^ – From the current cursor location to the first non-whitespace character in the line.
dG – From the current line to the end of the file.
d20G – From the current line to the twentieth line of the file.

Place the cursor on the word "It" on the first line of our text. Press the x key repeatedly until the rest of the sentence is deleted. Next, press the u key repeatedly until the deletion is undone.

Note: Real vi only supports a single level of undo. vim supports multiple levels.

Let's try the deletion again, this time using the d command. Again, move the cursor to the word "It" and press dW to delete the word:

The quick brown fox jumped over the lazy dog. was cool.

Line 2
Line 3
Line 4
Line 5

Press dG to delete from the current line to the end of the file:
~
~
~
~
~

Press u three times to undo the deletion.

Cutting, Copying And Pasting Text
The d command not only deletes text, it also "cuts" text. Each time we use the d command, the deletion is copied into a paste buffer (think clipboard) that we can later recall with the p command to paste the contents of the buffer after the cursor, or the P command to paste the contents before the cursor.

The y command is used to "yank" (copy) text in much the same way the d command is used to cut text. Here are some examples combining the y command with various movement commands:
Table 12-4: Yanking Commands
Command – Copies
yy – The current line.
5yy – The current line and the next four lines.
yW – From the current cursor position to the beginning of the next word.
y$ – From the current cursor location to the end of the current line.
y0 – From the current cursor location to the beginning of the line.
y^ – From the current cursor location to the first non-whitespace character in the line.
yG – From the current line to the end of the file.
y20G – From the current line to the twentieth line of the file.


Let's try some copy and paste. Place the cursor on the first line of the text and type yy to copy the line. Next, move the cursor to the last line (G) and type p to paste the line below the current line:
The quick brown fox jumped over the lazy dog. It was cool.

Line 2
Line 3
Line 4
Line 5
The quick brown fox jumped over the lazy dog. It was cool.


Just as before, the u command will undo our change. Try out some of the other y commands in the table above and get to know the behavior of both p and P. When you are done, return the file to its original state.
Joining Lines
vi is rather strict about its idea of a line. Normally, it is not possible to move the cursor to the end of a line and delete the end-of-line character to join one line with the one below it. Because of this, vi provides a specific command, J (not to be confused with j, which is for cursor movement), to join lines together.

If we place the cursor on line 3 and type the J command, here's what happens:
The quick brown fox jumped over the lazy dog. It was cool.

Line 2
Line 3 Line 4
Line 5

Search-And-Replace
vi has the ability to move the cursor to locations based on searches. It can do this on either a single line or over the entire file. It can also perform text replacements with or without confirmation from the user.

Searching Within A Line
The f command searches a line and moves the cursor to the next instance of a specified character. For example, the command fa would move the cursor to the next occurrence of the character "a" within the current line. After performing a character search within a line, the search may be repeated by typing a semicolon.


Searching The Entire File
To move the cursor to the next occurrence of a word or phrase, the / command is used. This works the same way as we learned earlier in the less program. When you type the / command, a "/" will appear at the bottom of the screen. Next, type the word or phrase to be searched for, followed by the Enter key. The cursor will move to the next location containing the search string. A search may be repeated using the previous search string with the n command. Here's an example:
The quick brown fox jumped over the lazy dog. It was cool.

Line 2
Line 3
Line 4
Line 5

Place the cursor on the first line of the file. Type:

/Line

followed by the Enter key. The cursor will move to line 2. Next, type n and the cursor will move to line 3. Repeating the n command will move the cursor down the file until it runs out of matches. While we have so far only used words and phrases for our search patterns, vi allows the use of regular expressions, a powerful method of expressing complex text patterns. We will cover regular expressions in some detail in a later chapter.
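A non-interactive cousin of vi's / search is grep with its -n option, which lists every matching line with its line number in one pass (a small sketch using our sample text):

```shell
# vi's /Line followed by n, n, ... visits matches one at a time;
# grep -n lists them all at once, with line numbers.
printf 'The quick brown fox jumped over the lazy dog.\nLine 2\nLine 3\n' |
    grep -n 'Line'
```

This prints each matching line prefixed with its line number, which is handy for locating text before opening a file in the editor.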
Global Search-And-Replace
vi uses an ex command to perform search-and-replace operations (called "substitution" in vi) over a range of lines or the entire file. To change the word "Line" to "line" for the entire file, we would enter the following command:

:%s/Line/line/g

Let's break this command down into separate items and see what each one does:

Table 12-5: An Example Of Global Search-And-Replace Syntax
Item – Meaning
: – The colon character starts an ex command.
% – Specifies the range of lines for the operation. % is a shortcut meaning from the first line to the last line. Alternately, the range could have been specified 1,5 (since our file is five lines long), or 1,$ which means "from line 1 to the last line in the file." If the range of lines is omitted, the operation is only performed on the current line.
s – Specifies the operation. In this case, substitution (search-and-replace).
/Line/line/ – The search pattern and the replacement text.
g – This means "global" in the sense that the search-and-replace is performed on every instance of the search string in the line. If omitted, only the first instance of the search string on each line is replaced.


After executing our search-and-replace command our file looks like this:

The quick brown fox jumped over the lazy dog. It was cool.

line 2
line 3
line 4
line 5
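The same substitution can be reproduced non-interactively with sed, which shares vi's s/old/new/ syntax, including the g flag (a sketch; sed itself is covered in a later chapter):

```shell
# Apply the equivalent of vi's :%s/Line/line/g outside the editor.
# sed's s command uses the same search-and-replace syntax as vi.
printf 'Line 2\nLine 3\n' | sed 's/Line/line/g'
```

Without the trailing g, only the first match on each line would be replaced, just as in vi.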

We can also specify a substitution command with user confirmation. This is done by adding a "c" to the end of the command. For example:

:%s/line/Line/gc

This command will change our file back to its previous form; however, before each substitution, vi stops and asks us to confirm the substitution with this message:

replace with Line (y/n/a/q/l/^E/^Y)?

Each of the characters within the parentheses is a possible choice as follows:

Table 12-6: Replace Confirmation Keys
Key – Action
y – Perform the substitution.
n – Skip this instance of the pattern.
a – Perform the substitution on this and all subsequent instances of the pattern.
q or Esc – Quit substituting.
l – Perform this substitution and then quit. Short for "last."
Ctrl-e, Ctrl-y – Scroll down and scroll up, respectively. Useful for viewing the context of the proposed substitution.

If you type y, the substitution will be performed; n will cause vi to skip this instance and move on to the next one.
Editing Multiple Files
It's often useful to edit more than one file at a time. You might need to make changes to multiple files or you may need to copy content from one file into another. With vi we can open multiple files for editing by specifying them on the command line:

vi file1 file2 file3...

Let's exit our existing vi session and create a new file for editing. Type :wq to exit vi, saving our modified text. Next, we'll create an additional file in our home directory that we can play with. We'll capture some output from the ls command to create the file:

[me@linuxbox ~]$ ls -l /usr/bin > ls-output.txt

Let's edit our old file and our new one with vi:

[me@linuxbox ~]$ vi foo.txt ls-output.txt

vi will start up and we will see the first file on the screen:
The quick brown fox jumped over the lazy dog. It was cool.

Line 2
Line 3
Line 4
Line 5

150

Editing Multiple Files

Switching Between Files
To switch from one file to the next, use this ex command:
:n

To move back to the previous file use:
:N

While we can move from one file to another, vi enforces a policy that prevents us from
switching files if the current file has unsaved changes
...

In addition to the switching method described above, vim (and some versions of vi) also
provide some ex commands that make multiple files easier to manage
The first of these is :buffers, which displays a list of the files being edited at the bottom of the display:

:buffers
  1 %a   "foo.txt"       line 1
  2      "ls-output.txt" line 0
Press ENTER or type command to continue

To switch to another buffer (file), type :buffer followed by the number of the buffer you wish to edit. For example, to switch from buffer 1 containing the file foo.txt to buffer 2 containing the file ls-output.txt, we would type this:

:buffer 2

and our screen now displays the second file.
...


Opening Additional Files For Editing
It's also possible to add files to our current editing session. The ex command :e (short for "edit") followed by a filename will open an additional file. Let's end our current editing session and return to the command line.

Start vi again with just one file:

[me@linuxbox ~]$ vi foo.txt

To add our second file, enter:

:e ls-output.txt
And it should appear on the screen. The first file is still present, as we can verify:

:buffers
  1 #    "foo.txt"       line 1
  2 %a   "ls-output.txt" line 0
Press ENTER or type command to continue

Note: You cannot switch to files loaded with the :e command using either the :n or :N command. To switch files, type :buffer followed by the buffer number.


Copying Content From One File Into Another
Often while editing multiple files, we will want to copy a portion of one file into another file that we are editing. This is easily done using the usual yank and paste commands we used earlier. We can demonstrate as follows. First, using our two files, switch to buffer 1 (foo.txt) by entering:

:buffer 1

which should give us this:

The quick brown fox jumped over the lazy dog. It was cool.

Line 2
Line 3
Line 4
Line 5

Next, move the cursor to the first line, and type yy to yank (copy) the line. Switch to buffer 2 (by entering :buffer 2) and the screen will contain some file listings like this (only a portion is shown here):

-rwxr-xr-x 1 root root  31316 2007-12-05 ...
-rwxr-xr-x 1 root root   8240 2007-12-09 ...
-rwxr-xr-x 1 root root 111276 2008-01-31 ...
-rwxr-xr-x 1 root root  25368 2006-10-06 ...
-rwxr-xr-x 1 root root  11532 2007-05-04 ...
-rwxr-xr-x 1 root root   7292 2007-05-04 ...

Move the cursor to the first line and paste the line we copied earlier by typing the p command:

It was cool.

Inserting An Entire File Into Another
It's also possible to insert an entire file into one that we are editing. To see this in action, let's end our vi session and start a new one with just a single file:

[me@linuxbox ~]$ vi ls-output.txt

The :r command (short for "read") inserts the specified file before the cursor position. Move the cursor to the first line of the file, and enter the following ex command:

:r foo.txt

When we press the Enter key, our file is inserted:

The quick brown fox jumped over the lazy dog. It was cool.

Line 2
Line 3
Line 4
Line 5
-rwxr-xr-x 1 root root  31316 2007-12-05 08:58 [
-rwxr-xr-x 1 root root   8240 2007-12-09 13:39 411toppm
...
...
Saving Our Work
Like everything else in vi, there are many different ways to save our edited files. We have already covered the ex command :w, but there are some others we may also find helpful.

In command mode, typing ZZ will save the current file and exit vi. Likewise, the ex command :wq will combine the :w and :q commands into one that will both save the file and exit.

The :w command may also specify an optional filename. This acts like "Save As." For example, if we were editing foo.txt and wanted to save an alternate version called foo1.txt, we would enter the following:

:w foo1.txt

Note: While the command above saves the file under a new name, it does not change the name of the file you are editing. As you continue to edit, you will still be editing foo.txt, not foo1.txt.


Summing Up
With this basic set of skills we can now perform most of the text editing needed to maintain a typical Linux system. Learning to use vi on a regular basis will pay off over time. Since vi-style editors are so deeply embedded in Unix culture, we will see many other programs that have been influenced by its design. less is a good example of this influence.


Further Reading
Even with all that we have covered in this chapter, we have barely scratched the surface of what vi and vim can do. Here are a couple of on-line resources you can use to continue your journey toward vi mastery:

Learning The vi Editor – A Wikibook offering a concise guide to vi. It's available at:
http://en.wikibooks.org/wiki/Vi

The Vim Book – The vim project has a 570-page book that covers (almost) all of the features in vim. You can find it at:
ftp://ftp.vim.org/pub/vim/doc/book/vimbook-OPL.pdf

A Wikipedia article on Bill Joy, the creator of vi:
http://en.wikipedia.org/wiki/Bill_Joy

A Wikipedia article on Bram Moolenaar, the author of vim:
http://en.wikipedia.org/wiki/Bram_Moolenaar

13 – Customizing The Prompt
In this chapter we will look at a seemingly trivial detail: our shell prompt. This examination will reveal some of the inner workings of the shell and the terminal emulator program itself.

Like so many things in Linux, the shell prompt is highly configurable, and while we have pretty much taken it for granted, the prompt is a really useful device once we learn how to control it.

Anatomy Of A Prompt
The prompt is defined by an environment variable named PS1 (short for "prompt string one"). We can view the contents of PS1 with echo:

[me@linuxbox ~]$ echo $PS1
[\u@\h \W]\$

Every Linux distribution defines the prompt string a little differently, some quite exotically. From the results, we can see that PS1 contains a few of the characters we see in our prompt, such as the brackets, the at-sign, and the dollar sign, but the rest are a mystery.

The astute among us will recognize these as backslash-escaped special characters like those we saw in Chapter 7. Here is a partial list of the characters that the shell treats specially in the prompt string:

Table 13-1: Escape Codes Used In Shell Prompts
Sequence – Value Displayed
\a – ASCII bell. This makes the computer beep when it is encountered.
\d – Current date in day, month, date format. For example, "Mon May 26."
\h – Hostname of the local machine minus the trailing domain name.
\H – Full hostname.
\j – Number of jobs running in the current shell session.
\l – Name of the current terminal device.
\n – A newline character.
\r – A carriage return.
\s – Name of the shell program.
\t – Current time in 24 hour hours:minutes:seconds format.
\T – Current time in 12 hour format.
\@ – Current time in 12 hour AM/PM format.
\A – Current time in 24 hour hours:minutes format.
\u – Username of the current user.
\v – Version number of the shell.
\V – Version and release numbers of the shell.
\w – Name of the current working directory.
\W – Last part of the current working directory name.
\! – History number of the current command.
\# – Number of commands entered during this shell session.
\$ – This displays a "$" character unless we have superuser privileges. In that case, it displays a "#" instead.
\[ – Signals the start of a series of one or more non-printing characters. This is used to embed non-printing control characters which manipulate the terminal emulator in some way, such as moving the cursor or changing text colors.
\] – Signals the end of a non-printing character sequence.

Trying Some Alternative Prompt Designs
With this list of special characters, we can change the prompt to see the effect. First, we'll back up the existing string so we can restore it later. To do this, we will copy the existing string into another shell variable that we create ourselves:

[me@linuxbox ~]$ ps1_old="$PS1"

We create a new variable called ps1_old and assign the value of PS1 to it. We can restore the original prompt at any time during our terminal session by simply reversing the process:

[me@linuxbox ~]$ PS1="$ps1_old"

Now, let's see what happens if we have an empty prompt string:

[me@linuxbox ~]$ PS1=

If we assign nothing to PS1, we get no prompt string at all! The prompt is still there, but displays nothing, just as we asked it to. Since this is kind of disconcerting to look at, we'll replace it with a minimal prompt:

PS1="\$ "

That's better. At least now we can see what we are doing. Notice the trailing space within the double quotes. This provides the space between the dollar sign and the cursor when the prompt is displayed.

Let's add a bell to our prompt:

$ PS1="\a\$ "

Now we should hear a beep each time the prompt is displayed. This could get annoying, but it might be useful if we needed notification when an especially long-running command has been executed. Note that since the ASCII bell (\a) does not "print," that is, it does not move the cursor, we will eventually need to tell bash about characters like this so it can correctly determine the length of the prompt; that is the job of the \[ and \] sequences.

Next, let's try to create an informative prompt with some hostname and time-of-day information:

$ PS1="\A \h \$ "
17:33 linuxbox $

Finally, we'll make a new prompt that is similar to our original:

17:37 linuxbox $ PS1="<\u@\h \W>\$ "
<me@linuxbox ~>$

Try out the other sequences listed in the table above and see if you can come up with a brilliant new prompt.
We'll cover cursor position in a little bit, but first we'll look at color.
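One way to experiment safely is to preview a prompt string without assigning it to PS1 at all. A sketch using the ${var@P} parameter expansion, which renders prompt escapes (this expansion is an assumption about your bash version; it requires bash 4.4 or later):

```shell
# Render prompt escapes (\u, \h, \W, \$) without touching the live prompt.
candidate='<\u@\h \W>\$ '
echo "${candidate@P}"    # shows the candidate prompt as bash would draw it
```

When the result looks right, assign the string to PS1 for real.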
Adding Color
Most terminal emulator programs respond to certain non-printing character sequences to control such things as character attributes and cursor position.

Terminal Confusion
Back in ancient times, when terminals were hooked to remote computers, there were many competing brands of terminals and they all worked differently. They had different keyboards and they all had different ways of interpreting control information. Unix and Linux systems have two rather complex subsystems (called termcap and terminfo) to deal with this babel of terminal control. If you look in the deepest recesses of your terminal emulator settings you may find a setting for the type of terminal emulation.

In an effort to make terminals speak some sort of common language, the American National Standards Institute (ANSI) developed a standard set of character sequences to control video terminals. Old time DOS users will remember the ANSI.SYS file that was used to enable interpretation of these codes.

Character color is controlled by sending the terminal emulator an ANSI escape code embedded in the stream of characters to be displayed. The control code does not "print out" on the display; rather it is interpreted by the terminal as an instruction. As we saw in the table above, the \[ and \] sequences are used to encapsulate non-printing characters. An ANSI escape code begins with an octal 033 (the code generated by the escape key), followed by an optional character attribute, followed by an instruction. For example, the code to set the text color to normal (attribute = 0), black text is:

\033[0;30m
Here is a table of available text colors. Notice that the colors are divided into two groups, differentiated by the application of the bold character attribute (1), which creates the appearance of "light" colors:

Table 13-2: Escape Sequences Used To Set Text Colors
Sequence – Text Color
\033[0;30m – Black
\033[0;31m – Red
\033[0;32m – Green
\033[0;33m – Brown
\033[0;34m – Blue
\033[0;35m – Purple
\033[0;36m – Cyan
\033[0;37m – Light Grey
\033[1;30m – Dark Grey
\033[1;31m – Light Red
\033[1;32m – Light Green
\033[1;33m – Yellow
\033[1;34m – Light Blue
\033[1;35m – Light Purple
\033[1;36m – Light Cyan
\033[1;37m – White
Using the table above, let's try to make a red prompt. We'll insert the escape code at the beginning:

$ PS1="\[\033[0;31m\]<\u@\h \W>\$ "
<me@linuxbox ~>$

That works, but notice that all the text that we type after the prompt is also red. To fix this, we will add another escape code at the end of the prompt that tells the terminal emulator to return to the previous color:

$ PS1="\[\033[0;31m\]<\u@\h \W>\$\[\033[0m\] "

It's also possible to set the text background color using a similar set of escape codes. The background colors do not support the bold attribute. Terminals can also render blinking text, though in the interests of good taste, many terminal emulators refuse to honor the blinking attribute.
Moving The Cursor
Escape codes can be used to position the cursor. This is commonly used to provide a clock or some other kind of information at a different location on the screen, such as an upper corner, each time the prompt is drawn.

We're going to construct a prompt that draws a red bar at the top of the screen containing a clock (rendered in yellow text) each time the prompt is displayed. The code for the prompt is this formidable looking string:

PS1="\[\033[s\033[0;0H\033[0;41m\033[K\033[1;33m\t\033[0m\033[u\]<\u@\h \W>\$ "

Let's take a look at each part of the string to see what it does:

Table 13-5: Breakdown Of Complex Prompt String
Sequence – Action
\[ – Begins a non-printing character sequence. The real purpose of this is to allow bash to correctly calculate the size of the visible prompt. Without an accurate calculation, command line editing features cannot position the cursor correctly.
\033[s – Save the cursor position. This is needed to return to the prompt location after the bar and clock have been drawn at the top of the screen.
\033[0;0H – Move the cursor to the upper left corner, which is line 0, column 0.
\033[0;41m – Set the background color to red.
\033[K – Clear from the current cursor location (the top left corner) to the end of the line. Since the background color is now red, the line is cleared to that color, creating our bar. Note that clearing to the end of the line does not change the cursor position, which remains at the upper left corner.
\033[1;33m – Set the text color to yellow.
\t – Display the current time. While this is a "printing" element, we still include it in the non-printing portion of the prompt, since we don't want bash to include the clock when calculating the true size of the displayed prompt.
\033[0m – Turn off color. This affects both the text and background.
\033[u – Restore the cursor position saved earlier.
\] – End the non-printing character sequence.
<\u@\h \W>\$ – Prompt string.
Try this prompt and see the bar and clock appear at the top of the screen.

Saving The Prompt
Obviously, we don't want to be typing that monster all the time, so we'll want to store our prompt someplace. We can make the prompt permanent by adding it to our .bashrc file. To do so, add these two lines to the file:

PS1="\[\033[s\033[0;0H\033[0;41m\033[K\033[1;33m\t\033[0m\033[u\]<\u@\h \W>\$ "

export PS1

Summing Up
Believe it or not, there is much more that can be done with prompts involving shell functions and scripts that we haven't covered here, but this is a good start. Not everyone will care enough to change the prompt, since the default prompt is usually satisfactory. But for those of us who like to tinker, the shell provides the opportunity for many hours of trivial fun.

Further Reading
The Bash Prompt HOWTO goes into excruciating detail on the subject of shell prompts. It is available at:
http://tldp.org/HOWTO/Bash-Prompt-HOWTO/

Wikipedia has a good article on ANSI escape codes:
http://en.wikipedia.org/wiki/ANSI_escape_code
14 – Package Management
If we spend any time in the Linux community, we hear many opinions as to which of the many Linux distributions is "best." Often, these discussions get really silly, focusing on such things as the prettiness of the desktop background (some people won't use Ubuntu because of its default color scheme!) and other trivial matters.

One of the most important determinants of distribution quality is the packaging system and the vitality of the distribution's support community. As we spend more time with Linux, we see that its software landscape is extremely dynamic. Things are constantly changing. Most of the top-tier Linux distributions release new versions every six months and many individual program updates every day. To keep up with this blizzard of software, we need good tools for package management.

Package management is a method of installing and maintaining software on the system. Today, most people can satisfy all of their software needs by installing packages from their Linux distributor. This contrasts with the early days of Linux, when one had to download and compile source code in order to install software. Not that there is anything wrong with compiling source code; in fact, having access to source code is the great wonder of Linux. It gives us (and everybody else) the ability to examine and improve the system. It's just that having a precompiled package is faster and easier to deal with.

In this chapter, we will look at some of the command line tools used for package management. While all of the major distributions provide powerful and sophisticated graphical programs for maintaining the system, it is important to learn about the command line programs, too. They can perform many tasks that are difficult (or impossible) to do with their graphical counterparts.
...
Packaging Systems
Different distributions use different packaging systems, and as a general rule, a package intended for one distribution is not compatible with another distribution. Most distributions fall into one of two camps of packaging technologies: the Debian ".deb" camp and the Red Hat ".rpm" camp. There are some important exceptions, but most distributions use one of these two basic systems.

Table 14-1: Major Packaging System Families
Packaging System – Distributions (Partial Listing)
Debian Style (.deb) – Debian, Ubuntu, Xandros, Linspire
Red Hat Style (.rpm) – Fedora, CentOS, Red Hat Enterprise Linux, OpenSUSE, Mandriva, PCLinuxOS

How A Package System Works
The method of software distribution found in the proprietary software industry usually entails buying a piece of installation media such as an "install disk" and then running an "installation wizard" to install a new application on the system.

Linux doesn't work that way. Virtually all software for a Linux system will be found on the Internet. Most of it will be provided by the distribution vendor in the form of package files, and the rest will be available in source code form that can be installed manually.

We'll talk a little about how to install software by compiling source code in a later chapter
Package Files
The basic unit of software in a packaging system is the package file. A package file is a compressed collection of files that comprise the software package. A package may consist of numerous programs and data files that support the programs. In addition to the files to be installed, the package file also includes metadata about the package, such as a text description of the package and its contents. Additionally, many packages contain pre- and post-installation scripts that perform configuration tasks before and after the package installation.

Package files are created by a person known as a package maintainer, often (but not always) an employee of the distribution vendor
...
Often, the package maintainer will apply modifications to the original source code to improve the program's integration with the other parts of the Linux distribution
...

Packages are made available to the users of a distribution in central repositories that may
contain many thousands of packages, each specially built and maintained for the distribution
...
For example, there will usually be a “testing” repository
that contains packages that have just been built and are intended for use by brave souls
who are looking for bugs before they are released for general distribution
...

A distribution may also have related third-party repositories
...
Perhaps the best known case is that of encrypted
DVD support, which is not legal in the United States
...
These
repositories are usually wholly independent of the distribution they support and to use
them, one must know about them and manually include them in the configuration files for
the package management system
Dependencies
Programs seldom stand alone; rather, they rely on the presence of other software components to get their work done. Common activities, such as input/output for example, are handled by routines shared by many programs. These routines are stored in what are called shared libraries, which provide essential services to more than one program. If a package requires a shared resource such as a shared library, it is said to have a dependency. Modern package management systems all provide some method of dependency resolution to ensure that when a package is installed, all of its dependencies are installed, too.


High And Low-level Package Tools
Package management systems usually consist of two types of tools: low-level tools which handle tasks such as installing and removing package files, and high-level tools that perform metadata searching and dependency resolution. While all Red Hat-style distributions rely on the same low-level program (rpm), they use different high-level tools. For our discussion, we will cover the high-level program yum. Other Red Hat-style distributions provide high-level tools with comparable features.

Table 14-2: Packaging System Tools
Distributions – Low-Level Tools – High-Level Tools
Debian-Style – dpkg – apt-get, aptitude
Fedora, Red Hat Enterprise Linux, CentOS – rpm – yum

Common Package Management Tasks
Many operations can be performed with the command line package management tools. We will look at the most common. In the discussion below, the term "package_name" refers to the actual name of a package, rather than the term "package_file," which is the name of the file that contains the package.

Finding A Package In A Repository
Using the high-level tools to search repository metadata, a package can be located based on its name or description.

Table 14-3: Package Search Commands
Style

Command(s)

Debian

apt-get update
apt-cache search search_string

Red Hat

yum search search_string

Example: To search a yum repository for the emacs text editor, this command could be
used:
yum search emacs
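The package_name versus package_file distinction matters in practice: a package file name like emacs-22.1-7.fc7.i386.rpm embeds the version, release, and architecture, while the package name is just the leading portion. A shell sketch of pulling the name out of a file name (the parsing rule here is a simplification for illustration):

```shell
# Derive a package *name* from a package *file* name by stripping
# everything from the first "-<digit>" onward (a simplified rule).
pkg_file='emacs-22.1-7.fc7.i386.rpm'
pkg_name=${pkg_file%%-[0-9]*}
echo "$pkg_name"    # emacs
```

High-level tools accept the short name (emacs); low-level tools like rpm and dpkg operate on the full file name.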

Installing A Package From A Repository
High-level tools permit a package to be downloaded from a repository and installed with full dependency resolution.

Table 14-4: Package Installation Commands
Style – Command(s)
Debian – apt-get update; apt-get install package_name
Red Hat – yum install package_name

Example: To install the emacs text editor from a repository on a Debian-style system:
apt-get update; apt-get install emacs

Installing A Package From A Package File
If a package file has been downloaded from a source other than a repository, it can be installed directly (though without dependency resolution) using a low-level tool.

Table 14-5: Low-Level Package Installation Commands
Style – Command(s)
Debian – dpkg --install package_file
Red Hat – rpm -i package_file

Example: If the emacs-22.1-7.fc7.i386.rpm package file had been downloaded from a non-repository site, it would be installed this way:
rpm -i emacs-22.1-7.fc7.i386.rpm

Note: Since this technique uses the low-level rpm program to perform the installation, no dependency resolution is performed. If rpm discovers a missing dependency, rpm will exit with an error.


Removing A Package
Packages can be uninstalled using either the high-level or low-level tools. The high-level tools are shown below.

Table 14-6: Package Removal Commands
Style – Command(s)
Debian – apt-get remove package_name
Red Hat – yum erase package_name

Example: To uninstall the emacs package from a Debian-style system:
apt-get remove emacs

Updating Packages From A Repository
The most common package management task is keeping the system up-to-date with the latest packages. The high-level tools can perform this vital task in a single step.

Table 14-7: Package Update Commands
Style
Debian

Command(s)
apt-get update; apt-get upgrade

Red Hat

yum update

Example: To apply any available updates to the installed packages on a Debian-style system:
apt-get update; apt-get upgrade

Upgrading A Package From A Package File
If an updated version of a package has been downloaded from a non-repository source, it
can be installed, replacing the previous version:
Table 14-8: Low-Level Package Upgrade Commands
Style – Command(s)
Debian – dpkg --install package_file
Red Hat – rpm -U package_file

Example: Updating an existing installation of emacs to the version contained in the package file emacs-22.1-7.fc7.i386.rpm on a Red Hat system:
rpm -U emacs-22.1-7.fc7.i386.rpm

Note: dpkg does not have a specific option for upgrading a package versus installing one, as rpm does.
Summing Up
In the chapters that follow, we will explore many different programs covering a wide range of application areas. While most of these programs are commonly installed by default, we may need to install additional packages if necessary programs are not already installed on our system.


The Linux Software Installation Myth
People migrating from other platforms sometimes fall victim to the myth that
software is somehow difficult to install under Linux and that the variety of packaging schemes used by different distributions is a hindrance
...

The Linux software ecosystem is based on the idea of open source code
...
This method ensures that the product is well integrated into the distribution
and the user is given the convenience of “one-stop shopping” for software, rather
than having to search for each product's web site
...
Generally speaking, there is no such thing as a "driver disk" in Linux. Either a device is supported by the kernel or it isn't, and the Linux kernel supports a huge variety of hardware devices. Many more, in fact, than Windows does. Still, it does happen occasionally that a particular device is not supported. When that happens, you need to look at the cause. It is usually one of three things:

1. The device is too new. Since many hardware vendors don't actively support Linux development, it falls upon a member of the Linux community to write the kernel driver code. This takes time.

2. The device is too exotic. Not all distributions include every possible device driver. Each distribution builds their own kernels, and since kernels are very configurable (which is what makes it possible to run Linux on everything from wristwatches to mainframes) they may have overlooked a particular device. By locating and downloading the source code for the driver, it is possible for you (yes, you) to compile and install the driver yourself. This process is not overly difficult, but it is rather involved.

3. The device is too secret. Some vendors have neither released source code for a Linux driver, nor have they released the technical documentation for somebody to create one for them. This means that the hardware vendor is trying to keep the interfaces to the product secret. Since we don't want secret devices in our computers, I suggest that you remove the offending hardware and pitch it into the trash with your other useless items.

Further Reading
Spend some time getting to know the package management commands for your distribution. Each distribution provides documentation for its package management tools. In addition, here are some sites for the major packaging systems:
The Debian GNU/Linux FAQ chapter on package management:
http://www.debian.org/doc/FAQ/ch-pkgtools.en.html
The home page for the RPM project:
http://www.rpm.org
The home page for the YUM project at Duke University:
http://linux.duke.edu/projects/yum/
For a little background, the Wikipedia article on metadata:
http://en.wikipedia.org/wiki/Metadata
15 – Storage Media
In previous chapters we've looked at manipulating data at the file level. In this chapter, we will consider data at the device level. Linux has amazing capabilities for handling storage devices, whether physical storage such as hard disks, network storage, or virtual storage devices like RAID (Redundant Array of Independent Disks) and LVM (Logical Volume Manager).

However, since this is not a book about system administration, we will not try to cover this entire topic in depth. What we will try to do is introduce some of the concepts and key commands that are used to manage storage devices.

To carry out the exercises in this chapter, we will use a USB flash drive, a CD-RW disc (for systems equipped with a CD-ROM burner) and a floppy disk (again, if the system is so equipped).
Mounting And Unmounting Storage Devices
Recent advances in the Linux desktop have made storage device management extremely easy for desktop users. For the most part, we attach a device to our system and it "just works." On non-desktop systems (i.e., servers) this is still a largely manual procedure, since servers often have extreme storage needs and complex configuration requirements.

The first step in managing a storage device is attaching the device to the file system tree. This process, called mounting, allows the device to participate with the operating system. Unix-like operating systems, like Linux, maintain a single file system tree with devices attached at various points. This contrasts with other operating systems such as MS-DOS and Windows that maintain separate file system trees for each device (for example C:\, D:\, etc.).

A file named /etc/fstab lists the devices (typically hard disk partitions) that are to be mounted at boot time. Here is an example /etc/fstab file from a Fedora 7 system:
LABEL=/            /          ext3    defaults        1 1
LABEL=/home        /home      ext3    defaults        1 2
LABEL=/boot        /boot      ext3    defaults        1 2
tmpfs              /dev/shm   tmpfs   defaults        0 0
devpts             /dev/pts   devpts  gid=5,mode=620  0 0
sysfs              /sys       sysfs   defaults        0 0
proc               /proc      proc    defaults        0 0
LABEL=SWAP-sda3    swap       swap    defaults        0 0
Most of the file systems listed in this example file are virtual and are not applicable to our discussion. The interesting ones are the first three, which are the hard disk partitions. Each line of the file consists of six fields, as follows:

Table 15-1: /etc/fstab Fields
Field – Contents – Description
1 – Device – Traditionally, this field contains the actual name of a device file associated with the physical device, such as /dev/hda1 (the first partition of the master device on the first IDE channel). But with today's computers, which have many devices that are hot pluggable (like USB drives), many modern Linux distributions associate a device with a text label instead. This label (which is added to the storage media when it is formatted) is read by the operating system when the device is attached to the system, so the device can be correctly identified no matter which device file is assigned to it.
2 – Mount Point – The directory where the device is attached to the file system tree.
3 – File System Type – Linux allows many file system types to be mounted. Most native Linux file systems are ext3, but many others are supported, such as FAT16 (msdos), FAT32 (vfat), NTFS (ntfs), CD-ROM (iso9660), etc.
4 – Options – File systems can be mounted with various options. It is possible, for example, to mount file systems as read-only, or to prevent any programs from being executed from them (a useful security feature for removable media).
5 – Frequency – A single number that specifies if and when a file system is to be backed up with the dump command.
6 – Order – A single number that specifies in what order file systems should be checked with the fsck command.
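Since each fstab line is six whitespace-separated fields, standard text tools can pick them apart. A sketch with awk, using a here-document sample so it runs anywhere without reading the real /etc/fstab:

```shell
# Print device, mount point, and file system type (fields 1-3)
# for each non-comment line of an fstab-style file.
awk '$1 !~ /^#/ && NF >= 3 { printf "%s -> %s (%s)\n", $1, $2, $3 }' <<'EOF'
# device      mount point  type  options   dump pass
LABEL=/boot   /boot        ext3  defaults  1    2
proc          /proc        proc  defaults  0    0
EOF
```

Replacing the here-document with `< /etc/fstab` applies the same extraction to the system's real table.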
Viewing A List Of Mounted File Systems
The mount command is used to mount file systems. Entering the command without arguments will display a list of the file systems currently mounted:
[me@linuxbox ~]$ mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda5 on /home type ext3 (rw)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
/dev/sdd1 on /media/disk type vfat (rw,nosuid,nodev,noatime,
uhelper=hal,uid=500,utf8,shortname=lower)
twin4:/musicbox on /misc/musicbox type nfs4 (rw,addr=192.168.1.4)

The format of the listing is: device on mount_point type file_system_type (options). For example, the first line shows that device /dev/sda2 is mounted as the root file system, is of type ext3, and is both readable and writable (the option "rw"). The next-to-last entry shows a 2 gigabyte SD memory card in a card reader mounted at /media/disk, and the last entry is a network drive mounted at /misc/musicbox.

To demonstrate, let's perform an experiment.
...
First, let's look at a system before a CD-ROM is inserted:
[me@linuxbox ~]$ mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

This listing is from a CentOS 5 system, which is using LVM (Logical Volume Manager) to create its root file system. Like many modern Linux distributions, this system will attempt to automatically mount the CD-ROM after insertion. After we insert the disc, we see the following:
[me@linuxbox ~]$ mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/hdc on /media/live-1.0.10-8 type iso9660 (ro,noexec,nosuid,
nodev,uid=500)

After we insert the disc, we see the same listing as before with one additional entry. At the end of the listing we see that the CD-ROM has been mounted on /media/live-1.0.10-8 and is of type iso9660. For purposes of our experiment, we're interested in the name of the device, /dev/hdc. When your turn comes to perform this experiment, pay close attention to the device name on your own system.

Warning: In the examples that follow, it is vitally important that you pay close attention to the actual device names in use on your system and do not use the names
used in this text!
Also note that audio CDs are not the same as CD-ROMs. Audio CDs do not contain file systems and thus cannot be mounted in the usual sense.

Now that we have the device name of the CD-ROM drive, let's unmount the disc and remount it at another location in the file system tree. To do this, we become the superuser (using the command appropriate for our system) and unmount the disc with the umount command (notice the spelling):

[root@linuxbox ~]# umount /dev/hdc

The next step is to create a new mount point for the disc. A mount point is simply a directory somewhere on the file system tree. Nothing special about it. It doesn't even have to be an empty directory, though if you mount a device on a non-empty directory, you will not be able to see the directory's previous contents until you unmount the device. For our purposes, we will create a new directory:

[root@linuxbox ~]# mkdir /mnt/cdrom

Finally, we mount the CD-ROM at the new mount point. The -t option is used to specify the file system type:

[root@linuxbox ~]# mount -t iso9660 /dev/hdc /mnt/cdrom

Afterward, we can examine the contents of the CD-ROM via the new mount point:
[root@linuxbox ~]# cd /mnt/cdrom

[root@linuxbox cdrom]# ls

Notice what happens when we try to unmount the CD-ROM:
[root@linuxbox cdrom]# umount /dev/hdc
umount: /mnt/cdrom: device is busy

Why is this? The reason is that we cannot unmount a device if the device is being used by
someone or some process
...
We can easily remedy the issue by changing the working directory to something other than the mount point:
[root@linuxbox cdrom]# cd
[root@linuxbox ~]# umount /dev/hdc

Now the device unmounts successfully.

Why Unmounting Is Important

If you look at the output of the free command, which displays statistics about memory usage, you will see a statistic called "buffers." Computer systems are designed to go as fast as possible. One impediment to system speed is slow devices. Printers are a good example. Even the fastest printers are extremely slow by computing standards. A computer would be very slow indeed if it had to stop and wait for a printer to finish printing a page. In the early days of PCs (before multitasking), this was a real problem. If you were working on a spreadsheet or text document, the computer would stop and become unavailable every time you printed.
This problem was solved by the advent of the printer buffer, a device containing some RAM memory that would sit between the computer and the printer. With the printer buffer in place, the computer would send the printer output to the buffer, where it would quickly be stored in the fast RAM so the computer could go back to work without waiting. Meanwhile, the printer buffer would slowly spool the data to the printer from the buffer's memory at the speed at which the printer could accept it. This idea of buffering is used extensively in computers to make them faster.
Don't let the need to occasionally read or write data to or from slow devices impede the speed of the system. Operating systems store data that has been read from, and is to be written to, storage devices in memory for as long as possible before actually having to interact with the slower device. On a Linux system, for example, you will notice that the system seems to fill up memory the longer it is used. This does not mean that Linux is "using up" all the memory; it means that Linux is taking advantage of all the available memory to do as much buffering as it can.

This buffering allows writing to storage devices to be done very quickly, because the writing to the physical device is being deferred to a future time. In the meantime, the data destined for the device piles up in memory. From time to time, the operating system will write this data to the physical device.

Unmounting a device entails writing all the remaining data to the device so that it can be safely removed. If the device is removed without unmounting it first, the possibility exists that not all the data destined for the device has been transferred. In some cases, this data may include vital directory updates, which will lead to file system corruption, one of the worst things that can happen to a computer.
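As a small aside (not part of the text above), the flushing described here can also be triggered by hand with the sync command, which asks the kernel to commit all buffered writes to the physical devices:

```shell
# Write a file, then ask the kernel to flush all dirty buffers to storage.
echo "important data" > note.txt
sync                      # blocks until buffered writes are committed
echo "buffers flushed"
```

This is what unmounting does implicitly for the device being unmounted.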


Determining Device Names

It's sometimes difficult to determine the name of a device. Back in the old days, it wasn't very hard. A device was always in the same place and it didn't change. Unix-like systems like it that way. Back when Unix was developed, "changing a disk drive" involved using a forklift to remove a washing machine-sized device from the computer room. In recent years, the typical desktop hardware configuration has become quite dynamic, and Linux has evolved to become more flexible than its ancestors.

In the examples above we took advantage of the modern Linux desktop's ability to "automagically" mount the device and then determine the name after the fact. But what if we are managing a server or some other environment where this does not occur? How can we figure it out?
If we list the contents of the /dev directory (where all devices live), we can see that there are lots and lots of devices:
[me@linuxbox ~]$ ls /dev

The contents of this listing reveal some patterns of device naming. Here are a few:

/dev/hd*

IDE (PATA) disks on older systems. Typical motherboards contain two IDE connectors, or channels, each with a cable with two attachment points for drives. The first drive on the cable is called the master device and the second is called the slave device. The device names are ordered such that /dev/hda refers to the master device on the first channel, /dev/hdb the slave device on the first channel, and so on. A trailing digit indicates the partition number on the device. For example, /dev/hda1 refers to the first partition on the first drive, while /dev/hda refers to the entire drive.


/dev/lp*

Printers.

/dev/sd*

SCSI disks. On recent Linux systems, the kernel treats all disk-like devices (including PATA/SATA hard disks, flash drives, and USB mass storage devices such as portable music players and digital cameras) as SCSI disks. The rest of the naming system is similar to the older /dev/hd* naming scheme described above.


/dev/sr*

Optical drives (CD/DVD readers and burners).

In addition, we often see symbolic links such as /dev/cdrom, /dev/dvd, and /dev/floppy, which point to the actual device files, provided as a convenience.

If you are working on a system that does not automatically mount removable devices, you can use the following technique to determine how the removable device is named when it is attached. First, start a real-time view of the /var/log/messages file (you may require superuser privileges for this):

[me@linuxbox ~]$ sudo tail -f /var/log/messages

The last few lines of the file will be displayed and then the display will pause. Next, plug in the removable device.
Almost immediately, the
kernel will notice the device and probe it:


15 – Storage Media
Jul 23 10:07:53 linuxbox kernel: usb 3-2: new full speed USB device
using uhci_hcd and address 2
Jul 23 10:07:53 linuxbox kernel: usb 3-2: configuration #1 chosen
from 1 choice
Jul 23 10:07:53 linuxbox kernel: scsi3 : SCSI emulation for USB Mass
Storage devices
Jul 23 10:07:58 linuxbox kernel: scsi scan: INQUIRY result too short
(5), using 36
Jul 23 10:07:58 linuxbox kernel: scsi 3:0:0:0: Direct-Access     Easy Disk    1...
(additional kernel messages referring to the new device "[sdb]" omitted)
The interesting parts
of the output are the repeated references to “[sdb]” which matches our expectation of a
SCSI disk device name
...
As we have seen, working with Linux is full of interesting detective work!
Tip: Using the tail -f /var/log/messages technique is a great way to
watch what the system is doing in near real-time
...


Creating New File Systems

Let's say that we want to reformat the flash drive with a Linux native file system, rather than the FAT32 system it has now. This involves two steps: 1. (optionally) create a new partition layout if the existing one is not to our liking, and 2. create a new, empty file system on the drive.

Warning! In the following exercise, we are going to format a flash drive. Use a drive that contains nothing you care about because it will be erased! Again, make absolutely sure you are specifying the correct device name for your system, not the one shown in this text. Failure to heed this warning could result in you formatting (i.e., erasing) the wrong drive!

Manipulating Partitions With fdisk
The fdisk program allows us to interact directly with disk-like devices (such as hard disk drives and flash drives) at a very low level. With this tool we can edit, delete, and create partitions on the device. To work with our flash drive, we must first unmount it (if needed) and then invoke the fdisk program as follows:
[me@linuxbox ~]$ sudo umount /dev/sdb1
[me@linuxbox ~]$ sudo fdisk /dev/sdb

Notice that we must specify the device in terms of the entire device, not by partition number. After the program starts up, we will see the following prompt:

Command (m for help):

The first thing we want to do is examine the existing partition layout. We do this by entering "p" to print the partition table for the device:
Command (m for help): p
Disk /dev/sdb: 16 MB, 16006656 bytes
1 heads, 31 sectors/track, 1008 cylinders
Units = cylinders of 31 * 512 = 15872 bytes

   Device Boot      Start        End      Blocks   Id  System
/dev/sdb1               2       1008      15608+    b  W95 FAT32

In this example, we see a 16 MB device with a single partition (1) that uses 1006 of the available 1008 cylinders on the device. The partition is identified as a Windows 95 FAT32 partition. Some programs will use this identifier to limit the kinds of operations that can be done to the disk, but most of the time it is not critical to change it. However, in the interest of demonstration, we will change it to indicate a Linux partition. To do this, we must first find out what ID is used to identify a Linux partition. To see a list of the available partition types, we refer back to the program menu, which offers a choice to list the known partition ID codes. Among them we see "b" for our existing partition type and "83" for Linux. After entering "t" at the prompt and specifying the new ID "83", our change is made. Up to this point, the device has been untouched (all the changes have been stored in memory, not on the physical device), so we will write the modified partition table to the device and exit by entering "w" at the prompt:

WARNING: If you have created or modified any DOS 6.x partitions, please see the fdisk manual page for additional information.
Syncing disks.

We can safely ignore the ominous-sounding warning message.
Creating A New File System With mkfs

With our partition editing done (lightly, we hope), it's time to create a new file system on our flash drive. To do this, we will use mkfs (short for "make file system"), which can create file systems in a variety of formats. To create an ext3 file system on the device, we use the -t option to specify the ext3 system type, followed by the name of the device containing the partition we wish to format:

[me@linuxbox ~]$ sudo mkfs -t ext3 /dev/sdb1
mke2fs 1.40...
...00%) reserved for the super user
First data block=1
Maximum filesystem blocks=15990784
2 block groups
8192 blocks per group, 8192 fragments per group
1952 inodes per group
Superblock backups stored on blocks:
        8193

Writing inode tables: done
Creating journal (1024 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 34 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

[me@linuxbox ~]$

The program will display a lot of information when ext3 is the chosen file system type. To re-format the device to its original FAT32 file system, specify "vfat" as the file system type:

[me@linuxbox ~]$ sudo mkfs -t vfat /dev/sdb1

This process of partitioning and formatting can be used anytime additional storage devices are added to the system. While we worked with a tiny flash drive, the same process can be applied to internal hard disks and other removable storage devices like USB hard drives.

Testing And Repairing File Systems

In our earlier discussion of the /etc/fstab file, we saw some mysterious digits at the end of each line.
Each time the system boots, it routinely checks the integrity of the file systems before mounting them. This is done by the fsck program (short for "file system check"). The last number in each fstab entry specifies the order in which the devices are to be checked. Devices with a zero as the last digit are not routinely checked.

In addition to checking the integrity of file systems, fsck can also repair corrupt file systems with varying degrees of success, depending on the amount of damage.
On Unix-like file systems, recovered portions of files are placed in the lost+found directory, located in the root of each file system. To check our flash drive (which should be unmounted first), we could do the following:

[me@linuxbox ~]$ sudo fsck /dev/sdb1
fsck 1.40...
/dev/sdb1: clean...

On most systems, file system corruption detected at boot time will cause the system to stop and direct you to run fsck before continuing.

What The fsck?

In Unix culture, the word "fsck" is often used in place of a popular word with which it shares three letters. This is especially appropriate, given that you will probably be uttering the aforementioned word if you find yourself in a situation where you are forced to run fsck.
Formatting Floppy Diskettes

For those of us still using computers old enough to be equipped with floppy diskette drives, we can manage those devices, too. Preparing a blank floppy for use is a two-step process. First, we perform a low-level format on the diskette; second, we create a file system. To accomplish the formatting, we use the fdformat program, specifying the name of the floppy device (usually /dev/fd0):
[me@linuxbox ~]$ sudo fdformat /dev/fd0
Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
Formatting ... done
Verifying ... done

Next, we apply a FAT file system to the diskette with mkfs:
[me@linuxbox ~]$ sudo mkfs -t msdos /dev/fd0

Notice that we use the "msdos" file system type to get the older (and smaller) style file allocation tables. After a diskette is prepared, it may be mounted like other devices.


Moving Data Directly To/From Devices

While we usually think of data on our computers as being organized into files, it is also possible to think of the data in "raw" form. If we look at a disk drive, for example, we see that it consists of a large number of "blocks" of data that the operating system sees as directories and files. However, if we could treat a disk drive as simply a large collection of data blocks, we could perform useful tasks, such as cloning devices.

The dd program performs this task. It copies blocks of data from one place to another. It uses a unique syntax (for historical reasons) and is usually used this way:

dd if=input_file of=output_file [bs=block_size [count=blocks]]

Let's say we had two USB flash drives of the same size and we wanted to exactly copy the first drive to the second. If we attached both drives to the computer and they are assigned to devices /dev/sdb and /dev/sdc respectively, we could copy everything on the first drive to the second drive with the following:
dd if=/dev/sdb of=/dev/sdc
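Because dd treats its input and output as featureless streams of blocks, the same invocation style works on ordinary files, which gives us a risk-free way to experiment (the filenames here are invented for the demonstration):

```shell
# Clone a small file block-by-block, then verify the copy is identical.
printf 'raw block data' > source.bin
dd if=source.bin of=copy.bin bs=4 status=none   # status=none quiets GNU dd's statistics
cmp -s source.bin copy.bin && echo "identical"
```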

Alternately, if only the first device were attached to the computer, we could copy its contents to an ordinary file for later restoration or copying:
dd if=/dev/sdb of=flash_drive
Warning! The dd command is very powerful. Though its name derives from "data definition," it is sometimes called "destroy disk" because users often mistype either the if or of specifications. Always double-check your input and output specifications before pressing enter!


Creating An Image Copy Of A CD-ROM

If we want to make an iso image of an existing CD-ROM, we can use dd to read all the data blocks off the CD-ROM and copy them to a local file. After inserting the CD and determining its device name (we'll assume /dev/cdrom), we can make the iso file like so:

dd if=/dev/cdrom of=ubuntu.iso

This technique works for data DVDs as well, but will not work for audio CDs, as they do not use a file system for storage. For audio CDs, look at the cdrdao command.

Creating An Image From A Collection Of Files

We can also create an iso image file containing the contents of a directory.
To do this, we first create a directory containing all the files we wish to include in the image and then execute the genisoimage command to create the image file. For example, if we had created a directory called ~/cd-rom-files and filled it with files for our CD-ROM, we could create an image file named cd-rom.iso with the following command:

genisoimage -o cd-rom.iso -R -J ~/cd-rom-files

The "-R" option adds metadata for the Rock Ridge extensions, which allows the use of long filenames and POSIX-style file permissions. Likewise, the "-J" option enables the Joliet extensions, which permit long filenames for Windows.

If you look at on-line tutorials for creating and burning optical media like CD-ROMs and DVDs, you will frequently encounter two programs called mkisofs and cdrecord. These programs were part of a popular package called "cdrtools" authored by Jörg Schilling. In the summer of 2006, Mr. Schilling made a license change to a portion of the cdrtools package which, in the opinion of many in the Linux community, created a license incompatibility with the GNU GPL. As a result, a fork of the cdrtools project was started that now includes replacement programs for cdrecord and mkisofs, named wodim and genisoimage respectively.

Writing CD-ROM Images

After we have an image file, we can burn it onto our optical media.
Most of the commands we will discuss below can be applied to both recordable CD-ROM and DVD media
Mounting An ISO Image Directly

There is a trick that we can use to mount an iso image while it is still on our hard disk and treat it as though it were already on optical media. By adding the "-o loop" option to mount (along with the required "-t iso9660" file system type), we can mount the image file as though it were a device and attach it to the file system tree:

mkdir /mnt/iso_image
mount -t iso9660 -o loop image.iso /mnt/iso_image

In the example above, we created a mount point named /mnt/iso_image and then mounted the image file image.iso at that mount point. After the image is mounted, it can be treated just as though it were a real CD-ROM or DVD. Remember to unmount the image when it is no longer needed.

Blanking A Re-Writable CD-ROM

Re-writable CD-RW media needs to be erased or blanked before it can be reused.
To do this, we can use wodim, specifying the device name for the CD writer and the type of blanking to be performed. The wodim program offers several types of blanking. The most minimal (and fastest) is the "fast" type:
wodim dev=/dev/cdrw blank=fast

Writing An Image
To write an image, we again use wodim, specifying the name of the optical media writer
device and the name of the image file:
wodim dev=/dev/cdrw image.iso

In addition to the device name and image file, wodim supports a very large set of options. Two common ones are "-v" for verbose output and "-dao", which writes the disc in disc-at-once mode. This mode should be used if you are preparing a disc for commercial reproduction. The default mode for wodim is track-at-once, which is useful for recording music tracks.
Summing Up

In this chapter we have looked at the basic storage management tasks. There are, of course, many more. Linux supports a vast array of storage devices and file system schemes. It also offers many features for interoperability with other systems.

Further Reading

Take a look at the man pages of the commands we have covered. Some of them support huge numbers of options and operations.


Extra Credit

It's often useful to verify the integrity of an iso image that we have downloaded. In most cases, a distributor of an iso image will also supply a checksum file. A checksum is the result of an exotic mathematical calculation resulting in a number that represents the content of the target file. If the contents of the file change by even one bit, the resulting checksum will be much different. The most common method of checksum generation uses the md5sum program. When used, md5sum produces a unique hexadecimal number:

md5sum image.iso
34e354760f9bb7fbf85c96f6a3f94ece  image.iso

After you download the image, you should run md5sum against it and compare the results with the value supplied by the distributor.

In addition to checking the integrity of a downloaded file, we can use md5sum to verify newly written optical media. To do this, we first calculate the checksum of the image file and then calculate a checksum for the media. The trick to verifying the media is to limit the calculation to only the portion of the optical media that contains the image. We do this by determining the number of 2048-byte blocks the image contains (optical media is always written in 2048-byte blocks) and reading that many blocks from the media. On some types of media, this is not required. In the example below, we check the integrity of the image file dvd-image.iso and the disc in /dev/dvd. Can you figure out how this works?

md5sum dvd-image.iso; dd if=/dev/dvd bs=2048 count=$(( $(stat -c "%s" dvd-image.iso) / 2048 )) | md5sum
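The same arithmetic can be rehearsed without any optical media at all, by padding a copy of an image file and checksumming only the original number of blocks (the filenames here are invented for the demonstration):

```shell
# Build a fake "image" of exactly 10 blocks of 2048 bytes.
dd if=/dev/urandom of=image.iso bs=2048 count=10 status=none
# Simulate burned media: the image plus trailing padding blocks.
cp image.iso media.img
printf 'trailing padding' >> media.img
# Checksumming only the image's block count reproduces the image checksum.
sum1=$(md5sum < image.iso | cut -d' ' -f1)
blocks=$(( $(stat -c "%s" image.iso) / 2048 ))
sum2=$(dd if=media.img bs=2048 count=$blocks status=none | md5sum | cut -d' ' -f1)
[ "$sum1" = "$sum2" ] && echo "checksums match"
```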


16 – Networking

When it comes to networking, there is probably nothing that cannot be done with Linux. Linux is used to build all sorts of networking systems and appliances, including firewalls, routers, name servers, NAS (Network Attached Storage) boxes, and on and on.

Just as the subject of networking is vast, so are the number of commands that can be used to configure and control it. In this chapter, we will focus on just a few of the most frequently used ones. The commands chosen for examination include those used to monitor networks and those used to transfer files. In addition, we are going to explore the ssh program that is used to perform remote logins.
This chapter will cover:

ping - Send an ICMP ECHO_REQUEST to network hosts
traceroute - Print the route packets trace to a network host
netstat - Print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships
ftp - Internet file transfer program
wget - Non-interactive network downloader
ssh - OpenSSH SSH client (remote login program)

We're going to assume a little background in networking. In this Internet age, everyone using a computer needs a basic understanding of networking concepts. To make full use of this chapter we should be familiar with the following terms:

IP (Internet Protocol) address
Host and domain name
URI (Uniform Resource Identifier)

Please see the "Further Reading" section below for some useful articles regarding these terms.


Examining And Monitoring A Network

Even if you're not the system administrator, it's often helpful to examine the performance and operation of a network.

ping

The most basic network command is ping. The ping command sends a special network packet called an ICMP ECHO_REQUEST to a specified host. Most network devices receiving this packet will reply to it, allowing the network connection to be verified.

Note: It is possible to configure most network devices (including Linux hosts) to ignore these packets. This is usually done for security reasons, to partially obscure a host from a potential attacker. It is also common for firewalls to be configured to block ICMP traffic.

For example, to see if we can reach linuxcommand.org (one of our favorite sites ;-), we can use ping like this:

[me@linuxbox ~]$ ping linuxcommand.org
PING linuxcommand.org (66.35.250.210) 56(84) bytes of data.
64 bytes from vhost.sourceforge.net (66.35.250.210): icmp_seq=1 ttl=43 time=107 ms
... (similar replies for the subsequent packets omitted) ...

--- linuxcommand.org ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 6010ms
rtt min/avg/max/mdev = 105.../...052/108.../...824 ms

After it is interrupted (in this case after the sixth packet) by pressing Ctrl-c, ping prints performance statistics. A properly performing network will exhibit zero percent packet loss. A successful "ping" will indicate that the elements of the network (its interface cards, cabling, routing, and gateways) are in generally good working order.

traceroute

The traceroute program (some systems use the similar tracepath program instead) lists all of the "hops" network traffic takes to get from the local system to a specified host. For example, to see the route taken to reach slashdot.org, we would do this:

[me@linuxbox ~]$ traceroute slashdot.org

The output looks like this:
traceroute to slashdot.org (216.34.181.45), 30 hops max, 40 byte packets
 1  ipcop.localdomain (192.168.1.1)  ...  1.366 ms  ...
 ...
16  slashdot.org (216.34.181.45)  ...  42.016 ms  ...

(Hops 2 through 15, a series of comcast.net and savvis.net routers, are abbreviated here.)

In the output, we can see that connecting from our test system to slashdot.org requires traversing sixteen routers. For routers that provided identifying information, we see their hostnames, IP addresses, and performance data, which includes three samples of round-trip time from the local system to the router. For routers that do not provide identifying information (because of router configuration, network congestion, firewalls, etc.), we see asterisks in place of the data.


netstat

The netstat program is used to examine various network settings and statistics. Through the use of its many options, we can look at a variety of features in our network setup. Using the "-ie" option, we can examine the network interfaces in our system:

[me@linuxbox ~]$ netstat -ie
eth0      Link encap:Ethernet  HWaddr 00:1d:09:9b:99:67
          inet addr:192.168.1...  Bcast:192.168.1.255  Mask:255.255.255.0
          ...
          RX bytes:... TX bytes:261035246 (248... MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          ...
          RX bytes:... (...8 KB)  TX bytes:111490 (108... KB)
In the example above, we see two network interfaces. The first, called eth0, is the Ethernet interface, and the second, called lo, is the loopback interface, a virtual interface that the system uses to "talk to itself." When performing casual network diagnostics, the important things to look for are the presence of the word "UP" in each interface's listing, indicating that the network interface is enabled, and the presence of a valid IP address in the inet addr field. For systems using DHCP (Dynamic Host Configuration Protocol), a valid IP address in this field will verify that the DHCP is working.
Using the "-r" option will display the kernel's network routing table. This shows how the network is configured to send packets from network to network:

[me@linuxbox ~]$ netstat -r
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.1.0     *               255.255.255.0   U         0 0          0 eth0
default         192.168.1.1     0.0.0.0         UG        0 0          0 eth0

In this simple example, we see a typical routing table for a client machine on a LAN (Local Area Network) behind a firewall/router. The first line of the listing shows the destination 192.168.1.0. IP addresses that end in zero refer to networks rather than individual hosts, so this destination means any host on the LAN. The next field, Gateway, is the name or IP address of the gateway (router) used to go from the current host to the destination network. An asterisk in this field indicates that no gateway is needed.

The last line contains the destination default. This means any traffic destined for a network that is not otherwise listed in the table. In our example, we see that the gateway is defined as a router with the address of 192.168.1.1, which presumably knows what to do with the destination traffic.
The netstat program has many options, and we have only looked at a couple. Check out the netstat man page for a complete list.
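As an aside not covered in the text, on Linux the raw data that netstat -r formats is exposed in the /proc/net/route file, which can be inspected directly:

```shell
# The first line of /proc/net/route is a header naming the routing fields
# (Iface, Destination, Gateway, and so on); the rows below it are hex-encoded.
head -1 /proc/net/route
```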


Transporting Files Over A Network

What good is a network unless we know how to move files across it? There are many programs that move data over networks. We will cover two of them now and several more in later sections.

ftp

One of the true "classic" programs, ftp gets its name from the protocol it uses, the File Transfer Protocol.
Most, if not all, web browsers support it, and you often see URIs starting with the protocol ftp://. Before there were web browsers, there was the ftp program. ftp is used to communicate with FTP servers, machines that contain files that can be uploaded and downloaded over a network.

FTP (in its original form) is not secure, because it sends account names and passwords in cleartext. This means that they are not encrypted and anyone sniffing the network can see them. Because of this, almost all FTP done over the Internet is done by anonymous FTP servers. An anonymous server allows anyone to login using the login name "anonymous" and a meaningless password.

In the example below, we show a typical session with the ftp program requesting a file named ubuntu-8.04-desktop-i386.iso located in the /pub/cd_images/Ubuntu-8.04 directory of the anonymous FTP server fileserver:
[me@linuxbox ~]$ ftp fileserver
Connected to fileserver.
220 (vsFTPd 2.0.1)
Name (fileserver:me): anonymous
331 Please specify the password.
Password:
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd pub/cd_images/Ubuntu-8.04
250 Directory successfully changed.
ftp> ls
200 PORT command successful. Consider using PASV.
150 Here comes the directory listing.
...  ubuntu-8.04-desktop-i386.iso
226 Directory send OK.
ftp> lcd Desktop
Local directory now /home/me/Desktop
ftp> get ubuntu-8.04-desktop-i386.iso
local: ubuntu-8.04-desktop-i386.iso remote: ubuntu-8.04-desktop-i386.iso
200 PORT command successful. Consider using PASV.
150 Opening BINARY mode data connection for ubuntu-8.04-desktop-i386.iso (733079552 bytes).
226 File send OK.
733079552 bytes received in 68... secs (...5 kB/s)
ftp> bye

Here is an explanation of the commands entered during this session:

ftp fileserver
Invoke the ftp program and have it connect to the FTP server fileserver.

anonymous
Login name. After the login prompt, a password prompt will appear. Some servers will accept a blank password; others will require a password in the form of an email address. In that case, try something like "user@example.com".

cd pub/cd_images/Ubuntu-8.04
Change to the directory on the remote system containing the desired file. Note that on most anonymous FTP servers, the files for public downloading are found somewhere under the pub directory.

ls
List the directory on the remote system.

lcd Desktop
Change the directory on the local system to ~/Desktop. In the example, the ftp program was invoked when the working directory was ~. This command changes the working directory to ~/Desktop.

get ubuntu-8.04-desktop-i386.iso
Tell the remote system to transfer the file ubuntu-8.04-desktop-i386.iso to the local system. Since the working directory on the local system was changed to ~/Desktop, the file will be downloaded there.

bye
Log off the remote server and end the ftp program session. The commands quit and exit may also be used.
Typing "help" at the ftp> prompt will display a list of the supported commands. Using ftp on a server where sufficient permissions have been granted, it is possible to perform many ordinary file management tasks. It's clumsy, but it does work.


lftp – A Better ftp

ftp is not the only command-line FTP client. In fact, there are many. One of the better (and more popular) ones is lftp by Alexander Lukyanov. It works much like the traditional ftp program but has many additional convenience features, including multiple-protocol support (including HTTP), automatic retry on failed downloads, background processes, tab completion of path names, and many more.

wget

Another popular command-line program for file downloading is wget. It is useful for downloading content from both web and FTP sites. Single files, multiple files, and even entire sites can be downloaded. To download the first page of linuxcommand.org, we could do this:
[me@linuxbox ~]$ wget http://linuxcommand.org/index.php
--11:02:51--  http://linuxcommand.org/index.php
           => `index.php'
Resolving linuxcommand.org... 66.35.250.210
Connecting to linuxcommand.org|66.35.250.210|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

    [ <=>                                  ] 3,120          --.--K/s

11:02:51 (161...) - `index.php' saved [3120]

The program's many options allow wget to recursively download, download files in the background (allowing you to log off but continue downloading), and complete the download of a partially downloaded file. These features are well documented in its better-than-average man page.

Secure Communication With Remote Hosts

For many years, Unix-like operating systems have had the ability to be administered remotely via a network. In the early days, before the general adoption of the Internet, there were a couple of popular programs used to log in to remote hosts: the rlogin and telnet programs. These programs, however, suffer from the same fatal flaw that the ftp program does: they transmit all of their communications (including login names and passwords) in cleartext. This makes them wholly inappropriate for use in the Internet age.

ssh

To address this problem, a new protocol called SSH (Secure Shell) was developed. SSH solves the two basic problems of secure communication with a remote host. First, it authenticates that the remote host is who it says it is (thus preventing so-called "man in the middle" attacks), and second, it encrypts all of the communications between the local and remote hosts.

SSH consists of two parts. An SSH server runs on the remote host, listening for incoming connections on port 22, while an SSH client is used on the local system to communicate with the remote server.

Most Linux distributions ship an implementation of SSH called OpenSSH from the OpenBSD project. Some distributions include both the client and the server packages by default, while others supply only the client. To enable a system to receive remote connections, it must have the OpenSSH-server package installed, configured and running, and (if the system is running or is behind a firewall) it must allow incoming network connections on TCP port 22.

Tip: If you don't have a remote system to connect to but want to try these examples, make sure the OpenSSH-server package is installed on your system and use localhost as the name of the remote host. That way, your machine will create network connections with itself.

The SSH client program used to connect to remote SSH servers is called, appropriately enough, ssh.
To connect to a remote host named remote-sys, we would use the ssh
client program like so:
[me@linuxbox ~]$ ssh remote-sys
The authenticity of host 'remote-sys (192.168.1...)' can't be established.
RSA key fingerprint is 41:ed:7a:df:23:19:bf:3c:a5:17:bc:61:b3:7f:d9:bb.
Are you sure you want to continue connecting (yes/no)?

The first time the connection is attempted, a message is displayed indicating that the authenticity of the remote host cannot be established. This is because the client program has never seen this remote host before. To accept the credentials of the remote host, enter "yes" when prompted. Once the connection is established, the user is prompted for his/her password:
Warning: Permanently added 'remote-sys,192.168.1...' (RSA) to the list of known hosts.
me@remote-sys's password:

After the password is successfully entered, we receive the shell prompt from the remote
system:
Last login: Sat Aug 30 13:00:48 2008
[me@remote-sys ~]$

The remote shell session continues until the user enters the exit command at the remote shell prompt, thereby closing the remote connection. At this point, the local shell session resumes and the local shell prompt reappears.

It is also possible to connect to remote systems using a different username. For example, if the local user "me" had an account named "bob" on a remote system, user me could log in to the account bob on the remote system as follows:

[me@linuxbox ~]$ ssh bob@remote-sys
If the remote host does
not successfully authenticate, the following message appears:
[me@linuxbox ~]$ ssh remote-sys
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle
attack)!
It is also possible that the RSA host key has just been changed.
Please contact your system administrator.
Add correct host key in /home/me/.ssh/known_hosts to get rid of this
message.
Offending key in /home/me/.ssh/known_hosts:1
RSA host key for remote-sys has changed and you have requested strict
checking.
Host key verification failed.


This message is caused by one of two possible situations
...
This is rare, since everybody knows that ssh
alerts the user to this
...
In the interests of security and safety however, the first possibility should not be dismissed out of
hand
...

After it has been determined that the message is due to a benign cause, it is safe to correct
the problem on the client side
...
ssh/known_hosts file
...
ssh/known_hosts:1

This means that line one of the known_hosts file contains the offending key
...

Besides opening a shell session on a remote system, ssh also allows us to execute a single command on a remote system. For example, to execute the ls command on the remote host named remote-sys and have the results listed on the local system:

[me@linuxbox ~]$ ssh remote-sys 'ls *' > dirlist.txt
me@twin4's password:
[me@linuxbox ~]$

Notice the use of the single quotes in the command above. This is done because we do not want the pathname expansion performed on the local machine; rather, we want it to be performed on the remote system. Likewise, if we had wanted the output redirected to a file on the remote machine, we could have placed the redirection operator and the filename within the single quotes:

[me@linuxbox ~]$ ssh remote-sys 'ls * > dirlist.txt'
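The quoting behavior can be observed without a remote host at all; here we stand in for the remote shell with bash -c (purely an illustration of the quoting, not of how ssh itself works):

```shell
# Quoted, the whole string, redirection included, is handed to the other
# shell, just as ssh hands its quoted argument to the remote shell.
cd "$(mktemp -d)"
bash -c 'echo remote-file-list > dirlist.txt'   # redirection happens in the inner shell
cat dirlist.txt
```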
Tunneling With SSH

Part of what happens when you establish a connection with a remote host via SSH is that an encrypted tunnel is created between the local and remote systems. Normally, this tunnel is used to allow commands typed at the local system to be transmitted safely to the remote system and for the results to be transmitted safely back. In addition to this basic function, the SSH protocol allows most types of network traffic to be sent through the encrypted tunnel, creating a sort of VPN (Virtual Private Network) between the local and remote systems.

Perhaps the most common use of this feature is to allow X Window system traffic to be transmitted. On a system running an X server (that is, a machine displaying a GUI), it is possible to launch and run an X client program (a graphical application) on a remote system and have its display appear on the local system. It's easy to do; here's an example: Let's say we are sitting at a Linux system called linuxbox which is running an X server, and we want to run the xload program on a remote system named remote-sys and see the program's graphical output on our local system. We could do this:

[me@linuxbox ~]$ ssh -X remote-sys
[me@remote-sys ~]$ xload

After the xload command is executed on the remote system, its window appears on the local system. On some systems, you may need to use the "-Y" option rather than the "-X" option to do this.
scp And sftp

The OpenSSH package also includes two programs that can make use of an SSH-encrypted tunnel to copy files across the network. The first, scp (secure copy), is used much like the familiar cp program to copy files. The most notable difference is that the source or destination pathnames may be preceded with the name of a remote host, followed by a colon character. For example, if we wanted to copy a document named document.txt from our home directory on the remote system remote-sys to the current directory on our local system, we could do this:

[me@linuxbox ~]$ scp remote-sys:document.txt .
me@remote-sys's password:
document.txt                             100% 5581   ...
[me@linuxbox ~]$

As with ssh, you may apply a username to the beginning of the remote host's name if the desired remote host account name does not match that of the local system:

[me@linuxbox ~]$ scp bob@remote-sys:document.txt .
The second SSH file-copying program is sftp, which, as its name implies, is a secure replacement for the ftp program. sftp works much like the original ftp program that we used earlier; however, instead of transmitting everything in cleartext, it uses an SSH encrypted tunnel. sftp has an important advantage over conventional ftp in that it does not require an FTP server to be running on the remote host. It only requires the SSH server. This means that any remote machine that can connect with the SSH client can also be used as an FTP-like server.
Here is a sample session:
[me@linuxbox ~]$ sftp remote-sys
Connecting to remote-sys...
me@remote-sys's password:
sftp> ls
ubuntu-8.04-desktop-i386.iso
sftp> lcd Desktop
sftp> get ubuntu-8.04-desktop-i386.iso
Fetching /home/me/ubuntu-8.04-desktop-i386.iso to ubuntu-8.04-desktop-i386.iso
/home/me/ubuntu-8.04-desktop-i386.iso    100%  ...  ...4MB/s   01:35
sftp> bye

Tip: The SFTP protocol is supported by many of the graphical file managers found in Linux distributions. Using either Nautilus (GNOME) or Konqueror (KDE), we can enter a URI beginning with sftp:// into the location bar and operate on files stored on a remote system running an SSH server.


An SSH Client For Windows?

Let's say you are sitting at a Windows machine but you need to log in to your Linux server and get some real work done; what do you do? Get an SSH client program for your Windows box, of course! There are a number of these. Most are proprietary, but the PuTTY program is free and open source. The PuTTY program displays a terminal window and allows a Windows user to open an SSH (or telnet) session on a remote host. The program also includes analogs for the scp and sftp programs.

PuTTY is available at http://www.chiark.greenend.org.uk/~sgtatham/putty/

Summing Up

In this chapter, we have surveyed the field of networking tools found on most Linux systems. Since Linux is so widely used in servers and networking appliances, there are many more that can be added by installing additional software. But even with the basic set of tools, it is possible to perform many useful network-related tasks.

Further Reading

Wikipedia contains many good networking articles.
Here are some of the basics:
http://en.wikipedia.org/wiki/Internet_protocol_address
http://en.wikipedia.org/wiki/Host_name
http://en.wikipedia.org/wiki/Uniform_Resource_Identifier

For a broad (albeit dated) look at network administration, the Linux Documentation Project provides the Linux Network Administrator's Guide:

http://tldp.org/LDP/nag2/index.html

17 – Searching For Files

As we have wandered around our Linux system, one thing has become abundantly clear: A typical Linux system has a lot of files! This begs the question, "How do we find things?" We already know that the Linux file system is well organized according to conventions that have been passed down from one generation of Unix-like systems to the next, but the sheer number of files can present a daunting problem.

In this chapter, we will look at two tools that are used to find files on a system. These tools are:


locate – Find files by name

find – Search for files in a directory hierarchy

We will also look at a command that is often used with file-search commands to process the resulting list of files:

xargs – Build and execute command lines from standard input

In addition, we will introduce a couple of commands to assist us in our explorations:

touch – Change file times

stat – Display file or file system status

locate – Find Files The Easy Way

The locate program performs a rapid database search of pathnames and then outputs every name that matches a given substring. Say, for example, we want to find all the programs with names that begin with "zip." Since we are looking for programs, we can assume that the name of the directory containing the programs would end with "bin/". Therefore, we could try to use locate this way to find our files:

[me@linuxbox ~]$ locate bin/zip

The locate program has been around for a number of years, and there are several different variants in common use. The two most common ones found in modern Linux distributions are slocate and mlocate, though they are usually accessed by a symbolic link named locate. The different versions have overlapping option sets. Some versions include regular expression matching (which we'll cover in an upcoming chapter) and wildcard support. Check the man page for locate to determine which version is installed.


Where Does The locate Database Come From?

You may notice that, on some distributions, locate fails to work just after the system is installed, but if you try again the next day, it works fine. What's going on? The locate database is created by another program named updatedb. Usually, it is run periodically as a cron job; that is, a task performed at regular intervals by the cron daemon. Since the database is not updated continuously, you will notice that very recent files do not show up when using locate. To overcome this, it's possible to run the updatedb program manually by becoming the superuser and running updatedb at the prompt.


find – Find Files The Hard Way

While the locate program can find a file based solely on its name, the find program searches a given directory (and its subdirectories) for files based on a variety of attributes. We're going to spend a lot of time with find because it has a bunch of interesting features that we will see again and again when we start to cover programming concepts in later chapters.

In its simplest use, find is given one or more names of directories to search. For example, to produce a list of our home directory:

[me@linuxbox ~]$ find ~

On most active user accounts, this will produce a large list. Since the list is sent to standard output, we can pipe the list into other programs. For instance, we can count the files with wc:

[me@linuxbox ~]$ find ~ | wc -l

find has the ability to identify files that meet specific criteria. It does this through the (slightly strange) application of options, tests, and actions. We'll look at the tests first.

Tests
Let's say that we want a list of directories from our search. To do this, we could add the following test:

[me@linuxbox ~]$ find ~ -type d | wc -l

Adding the test -type d limited the search to directories. Conversely, we could have limited the search to regular files with this test:
[me@linuxbox ~]$ find ~ -type f | wc -l
38737
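The -type tests are easy to try out on a small scratch directory that we build ourselves (the names here are invented for the demonstration):

```shell
# Build a tiny tree: one subdirectory and two regular files.
dir=$(mktemp -d)
mkdir "$dir/subdir"
touch "$dir/file1" "$dir/subdir/file2"
find "$dir" -type d | wc -l    # counts $dir and $dir/subdir
find "$dir" -type f | wc -l    # counts file1 and file2
```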

Here are the common file type tests supported by find:

Table 17-1: find File Types

b – Block special device file
c – Character special device file
d – Directory
f – Regular file
l – Symbolic link

We can also search by file size and filename by adding some additional tests. Let's look for all the regular files that match the wildcard pattern "*.JPG" and are larger than one megabyte:

[me@linuxbox ~]$ find ~ -type f -name "*.JPG" -size +1M | wc -l
840

In this example, we add the -name test followed by the wildcard pattern. Notice how we enclose it in quotes to prevent pathname expansion by the shell. Next, we add the -size test followed by the string "+1M". The leading plus sign indicates that we are looking for files larger than the specified number. A leading minus sign would change the meaning of the string to be smaller than the specified number. No sign means, "match the value exactly." The trailing letter "M" indicates that the unit of measurement is megabytes. The following characters may also be used to specify units:

b – 512-byte blocks. This is the default if no unit is specified.
c – Bytes
w – 2-byte words
k – Kilobytes (units of 1024 bytes)
M – Megabytes (units of 1048576 bytes)
G – Gigabytes (units of 1073741824 bytes)

find supports a large number of different tests. Below is a rundown of the common ones. Note that in cases where a numeric argument is required, the same "+" and "-" notation discussed above can be applied:
-cmin n – Match files or directories whose content or attributes were last modified exactly n minutes ago. To specify less than n minutes ago, use -n, and to specify more than n minutes ago, use +n.

-ctime n – Match files or directories whose contents or attributes were last modified n*24 hours ago.

-group name – Match files or directories belonging to group. name may be expressed as either a group name or as a numeric group ID.

-iname pattern – Like the -name test but case-insensitive.

-inum n – Match files with inode number n. This is helpful for finding all the hard links to a particular inode.

-mtime n – Match files or directories whose contents were last modified n*24 hours ago.

-name pattern – Match files and directories with the specified wildcard pattern.

-newer file – Match files and directories whose contents were modified more recently than the specified file. This is very useful when writing shell scripts that perform file backups. Each time you make a backup, update a file (such as a log), and then use find to determine which files have changed since the last update.

-nouser – Match files and directories that do not belong to a valid user. This can be used to find files belonging to deleted accounts or to detect activity by attackers.

-perm mode – Match files or directories that have permissions set to the specified mode. mode may be expressed in either octal or symbolic notation.

-samefile name – Similar to the -inum test. Matches files that share the same inode number as file name.

-size n – Match files of size n.

-user name – Match files or directories belonging to user name. The user may be expressed as a username or as a numeric user ID.

This is not a complete list. The find man page has the complete rundown.
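A few of these tests can be exercised safely on files we create ourselves (the filenames and sizes here are invented for the demonstration):

```shell
# Create a 2 KB file, an empty file, and a timestamp "marker" file.
dir=$(mktemp -d)
printf '%2048s' ' ' > "$dir/photo.JPG"       # exactly 2048 bytes
touch "$dir/note.txt"
sleep 1
touch "$dir/marker"
sleep 1
touch "$dir/recent.txt"
find "$dir" -name '*.JPG' -size +1k          # larger than one 1K unit
find "$dir" -type f -newer "$dir/marker"     # modified after marker
```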


Operators

Even with all the tests that find provides, we may still need a better way to describe the logical relationships between the tests. For example, what if we needed to determine if all the files and subdirectories in a directory had secure permissions? We would look for all the files with permissions that are not 0600 and the directories with permissions that are not 0700. Fortunately, find provides a way to combine tests using logical operators to create more complex logical relationships. To express this test, we could do the following:

[me@linuxbox ~]$ find ~ \( -type f -not -perm 0600 \) -or \( -type d -not -perm 0700 \)

Yikes! That sure looks weird. What is all this stuff? Actually, the operators are not that complicated once you get to know them. Here is the list:

-and – Match if the tests on both sides of the operator are true. May be shortened to -a. Note that when no operator is present, -and is implied by default.


-or – Match if a test on either side of the operator is true. May be shortened to -o.

-not – Match if the test following the operator is false. May be shortened to an exclamation point (!).

( ) – Groups tests and operators together to form larger expressions. This is used to control the precedence of the logical evaluations. By default, find evaluates from left to right. It is often necessary to override the default evaluation order to obtain the desired result. Even if not needed, it is helpful sometimes to include the grouping characters to improve the readability of the command. Note that since the parentheses characters have special meaning to the shell, they must be quoted when using them on the command line. Usually the backslash character is used to escape them.
...
With our list of operators in hand, let's deconstruct our find command. When viewed
from the uppermost level, we see that our tests are arranged as two groupings separated
by an -or operator:

( expression 1 ) -or ( expression 2 )

This makes sense, since we are searching for files with a certain set of permissions and
for directories with a different set. If we are looking for both files and directories, why do
we use -or instead of -and? Because as find scans through the files and directories,
each one is evaluated to see if it matches the specified tests. We want to know if it is
either a file with bad permissions or a directory with bad permissions. It can't be both at
the same time. So if we expand the grouped expressions, we can see it this way:

( file with bad perms ) -or ( directory with bad perms )

Our next challenge is how to test for "bad permissions." How do we do that? Actually we
don't. What we will test for is "not good permissions," since we know what "good
permissions" are. In the case of files, we define good as 0600; for directories, 0700.

The expression that will test files for "not good" permissions is:

-type f -and -not -perm 0600

and for directories:

-type d -and -not -perm 0700

As noted in the table of operators above, the -and operator can be safely removed, since
it is implied by default. Since the parentheses have special meaning to the shell, we must
escape them to prevent the shell from trying to interpret them. Preceding each one with a
backslash character does the trick.
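As a sanity check, the grouped expression can be rehearsed on a disposable tree. The paths and filenames below are invented for the demonstration:

```shell
#!/bin/sh
# Build a tiny tree with one "good" file and one "bad" file, then run
# the grouped find expression to flag only the badly permissioned item.
set -e
rm -rf /tmp/perm-demo
mkdir -p -m 0700 /tmp/perm-demo/sub
touch /tmp/perm-demo/good /tmp/perm-demo/sub/bad
chmod 0700 /tmp/perm-demo            # "good" directory permissions
chmod 0600 /tmp/perm-demo/good       # "good" file permissions
chmod 0644 /tmp/perm-demo/sub/bad    # "bad" file permissions
find /tmp/perm-demo \( -type f -not -perm 0600 \) -or \( -type d -not -perm 0700 \) > /tmp/perm-demo.out
cat /tmp/perm-demo.out               # lists only /tmp/perm-demo/sub/bad
```

Note that the result file is written outside the tree being searched, so find does not report its own output.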
There is another feature of logical operators that is important to understand. Let's say that
we have two expressions separated by a logical operator:

expr1 -operator expr2

In all cases, expr1 will always be performed; however, the operator will determine if
expr2 is performed. Here's how it works:

Result of expr1    Operator    expr2 is...

True               -and        Always performed
False              -and        Never performed
True               -or         Never performed
False              -or         Always performed

Why does this happen? It's done to improve performance. Take -and, for example. We
know that the expression expr1 -and expr2 cannot be true if the result of expr1 is
false, so there is no point in performing expr2. Likewise, if we have expr1 -or
expr2 and the result of expr1 is true, there is no point in performing expr2, as we
already know the whole expression is true.

OK, so it helps it go faster. But why is this important? It's important because we can rely
on this behavior to control how actions are performed, as we shall see later.


Predefined Actions

Let's get some work done! Having a list of results from our find command is useful, but
what we really want to do is act on the items on the list. Fortunately, find allows actions
to be performed based on the search results. There are a set of predefined actions and
several ways to apply user-defined actions. First, let's look at a few of the predefined
actions:

-delete

Delete the currently matching file.

-ls

Perform the equivalent of ls -dils on the matching file. Output is sent to standard
output.

-print

Output the full pathname of the matching file to standard output. This is the default
action if no other action is specified.

-quit

Quit once a match has been made.

As with the tests, there are many more actions. See the find man page for full details.

In our very first example, find ~ produced a list of every file and directory contained
within our home directory. It produced a list because the -print action is implied if no
other action is specified. Thus, we could also express the command as:

find ~ -print

We can use find to delete files that meet certain criteria. For example, to delete files that
have the file extension .BAK (which is often used to designate backup files), we could
use this command:

find ~ -type f -name '*.BAK' -delete

In this example, every file in the user's home directory (and its subdirectories) is searched
for filenames ending in .BAK. When they are found, they are deleted.

Warning: It should go without saying that you should use extreme caution when using
the -delete action. Always test the command first by substituting the -print action
for -delete to confirm the search results.
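The test-first habit can be rehearsed safely in a scratch directory. The directory and filenames here are invented:

```shell
#!/bin/sh
# Rehearse with -print, then commit to -delete.
set -e
rm -rf /tmp/bak-demo
mkdir /tmp/bak-demo
touch /tmp/bak-demo/notes.txt /tmp/bak-demo/notes.BAK
find /tmp/bak-demo -type f -name '*.BAK' -print   # dry run: shows notes.BAK only
find /tmp/bak-demo -type f -name '*.BAK' -delete
ls /tmp/bak-demo                                  # notes.txt survives
```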
Before we go any further, let's take a closer look at how the actions interact with the
tests. Consider the following command:

find ~ -type f -name '*.BAK' -print

As we have seen, this command identifies every regular file (-type f) whose name
ends with .BAK (-name '*.BAK') and outputs the relative pathname of each matching
file to standard output (-print). However, the reason the command performs the way it
does is determined by the logical relationships between each of the tests and actions.
Remember, there is, by default, an implied -and relationship between each test and
action. We could also express the command this way to make the logical relationships
easier to see:

find ~ -type f -and -name '*.BAK' -and -print

With our command fully expressed, let's look at how the logical operators affect its
execution:

-print is performed only if -type f and -name '*.BAK' are true.

-name '*.BAK' is performed only if -type f is true.

-type f is always performed, since it is the first test/action in an -and relationship.

Since the logical relationship between the tests and actions determines which of them are
performed, we can see that the order of the tests and actions is important. For instance, if
we were to reorder the tests and actions so that the -print action was the first one, the
command would behave much differently:

find ~ -print -and -type f -and -name '*.BAK'

This version of the command will print each file (the -print action always evaluates to
true) and then test for file type and the specified file extension.
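The difference in ordering is easy to observe on a small invented tree:

```shell
#!/bin/sh
# Compare -print placed after the tests with -print placed first.
set -e
rm -rf /tmp/order-demo
mkdir /tmp/order-demo
touch /tmp/order-demo/a.txt /tmp/order-demo/b.BAK
find /tmp/order-demo -type f -name '*.BAK' -print > /tmp/order-demo.match
find /tmp/order-demo -print -type f -name '*.BAK' > /tmp/order-demo.all
wc -l < /tmp/order-demo.match   # 1: only b.BAK passed the tests
wc -l < /tmp/order-demo.all     # 3: -print fired for the directory and both files
```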


User-Defined Actions

In addition to the predefined actions, we can also invoke arbitrary commands. The
traditional way of doing this is with the -exec action. This action works like this:

-exec command {} ;

where command is the name of a command, {} is a symbolic representation of the
current pathname, and the semicolon is a required delimiter indicating the end of the
command. Since the brace and semicolon characters have special meaning to the shell,
they must be quoted or escaped.

It's also possible to execute a user-defined action interactively. By using the -ok action
in place of -exec, the user is prompted before execution of each specified command:

[me@linuxbox ~]$ find ~ -type f -name 'foo*' -ok ls -l '{}' ';'
< ls ... /home/me/bin/foo > ? y
-rwxr-xr-x 1 me me 224 2007-10-29 18:44 /home/me/bin/foo
< ls ... /home/me/foo.txt > ? y
-rw-r--r-- 1 me me 0 2008-09-19 12:53 /home/me/foo.txt

In this example, we search for files with names beginning with the string "foo" and
execute the command ls -l each time one is found. Using the -ok action prompts the
user before the ls command is executed.
Improving Efficiency

When the -exec action is used, it launches a new instance of the specified command
each time a matching file is found. There are times when we might prefer to combine all
of the search results and launch a single instance of the command. For example, rather
than executing commands like this:

ls -l file1
ls -l file2

we may prefer to execute them this way:

ls -l file1 file2

causing the command to be executed only one time rather than multiple times. There
are two ways we can do this: the traditional way, using the external command xargs,
and the alternate way, using a newer feature in find itself. We'll talk about the alternate
way first.

By changing the trailing semicolon character to a plus sign, we activate the ability of
find to combine the results of the search into an argument list for a single execution of
the desired command. Going back to our example, this:

find ~ -type f -name 'foo*' -exec ls -l '{}' ';'
-rwxr-xr-x 1 me me 224 2007-10-29 18:44 /home/me/bin/foo
-rw-r--r-- 1 me me 0 2008-09-19 12:53 /home/me/foo.txt

executes ls each time a matching file is found. By changing the command to:

find ~ -type f -name 'foo*' -exec ls -l '{}' +
-rwxr-xr-x 1 me me 224 2007-10-29 18:44 /home/me/bin/foo
-rw-r--r-- 1 me me 0 2008-09-19 12:53 /home/me/foo.txt

we get the same results, but the system only has to execute the ls command once.


xargs

The xargs command performs an interesting function. It accepts input from standard
input and converts it into an argument list for a specified command. With our example,
we would use it like this:

find ~ -type f -name 'foo*' -print | xargs ls -l
-rwxr-xr-x 1 me me 224 2007-10-29 18:44 /home/me/bin/foo
-rw-r--r-- 1 me me 0 2008-09-19 12:53 /home/me/foo.txt

Here we see the output of the find command piped into xargs, which, in turn,
constructs an argument list for the ls command and then executes it.

Note: While the number of arguments that can be placed into a command line is
quite large, it's not unlimited. It is possible to create commands that are too long for
the shell to accept. When a command line exceeds the maximum length supported
by the system, xargs executes the specified command with the maximum number
of arguments possible and then repeats this process until standard input is exhausted.
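A self-contained run of the pipeline, with invented filenames:

```shell
#!/bin/sh
# Pipe find's results through xargs so ls runs once over all the matches.
set -e
rm -rf /tmp/xargs-demo
mkdir /tmp/xargs-demo
touch /tmp/xargs-demo/foo1 /tmp/xargs-demo/foo2 /tmp/xargs-demo/bar
find /tmp/xargs-demo -type f -name 'foo*' -print | xargs ls -l > /tmp/xargs-demo.out
wc -l < /tmp/xargs-demo.out   # 2: one ls line per matching file
```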


Dealing With Funny Filenames

Unix-like systems allow embedded spaces (and even newlines!) in filenames. This causes
problems for programs like xargs that construct argument lists for other programs. An
embedded space will be treated as a delimiter, and the resulting command will interpret
each space-separated word as a separate argument. To overcome this, find and xargs
allow the optional use of a null character as an argument separator. A null character is
defined in ASCII as the character represented by the number zero (as opposed to, for
example, the space character, which is defined in ASCII as the character represented by
the number 32). The find command provides the action -print0, which produces
null-separated output, and the xargs command has the --null (or -0) option, which
accepts null-separated input. Here's an example:

find ~ -iname '*.jpg' -print0 | xargs --null ls -l

Using this technique, we can ensure that all files, even those containing embedded spaces
in their names, are handled correctly.
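A quick demonstration with an invented filename containing a space:

```shell
#!/bin/sh
# A filename with an embedded space survives the null-separated pipeline.
set -e
rm -rf /tmp/space-demo
mkdir /tmp/space-demo
touch '/tmp/space-demo/holiday photo.jpg'
find /tmp/space-demo -iname '*.jpg' -print0 | xargs -0 ls -l > /tmp/space-demo.out
cat /tmp/space-demo.out
```

With plain `-print | xargs`, the same filename would have been split into two bogus arguments.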


A Return To The Playground

It's time to put find to some (almost) practical use. We'll create a playground and try
out some of what we have learned.

First, let's create a playground with lots of subdirectories and files:

[me@linuxbox ~]$ mkdir -p playground/dir-{001..100}
[me@linuxbox ~]$ touch playground/dir-{001..100}/file-{A..Z}

Marvel in the power of the command line! With these two lines, we created a
playground directory containing 100 subdirectories, each containing 26 empty files. Try
that with the GUI!

The method we employed to accomplish this magic involved a familiar command
(mkdir), an exotic shell expansion (braces), and a new command, touch. By combining
mkdir with the -p option (which causes mkdir to create the parent directories of the
specified paths) and brace expansion, we were able to create the subdirectories.

The touch command is usually used to set or update the access, change, and modify
times of files. However, if a filename argument is that of a nonexistent file, an empty file
is created.

In our playground, we created 100 instances of a file named file-A. Let's find them:

[me@linuxbox ~]$ find playground -type f -name 'file-A'

Note that unlike ls, find does not produce results in sorted order. Its order is determined
by the layout of the storage device. We can confirm that we actually have 100 instances
of the file this way:

[me@linuxbox ~]$ find playground -type f -name 'file-A' | wc -l
100

Next, let's look at finding files based on their modification times. This will be helpful
when creating backups or organizing files in chronological order. To do this, we will first
create a reference file against which we will compare modification times:

[me@linuxbox ~]$ touch playground/timestamp

This creates an empty file named timestamp and sets its modification time to the
current time.
We can verify this by using another handy command, stat, which is a kind of
souped-up version of ls. stat reveals all that the system understands about a file and its
attributes:

[me@linuxbox ~]$ stat playground/timestamp
  File: 'playground/timestamp'
Access: 2008-10-08 15:15:39.000000000 -0400
Modify: 2008-10-08 15:15:39.000000000 -0400

If we touch the file again and then examine it with stat, we will see that the file's
times have been updated:

[me@linuxbox ~]$ touch playground/timestamp
[me@linuxbox ~]$ stat playground/timestamp
  File: 'playground/timestamp'
Access: 2008-10-08 15:23:33.000000000 -0400
Modify: 2008-10-08 15:23:33.000000000 -0400

Next, let's use find to update some of our playground files:

[me@linuxbox ~]$ find playground -type f -name 'file-B' -exec touch '{}' ';'

This updates all files in the playground named file-B. Next we'll use find to identify
the updated files by comparing all the files to our reference file timestamp:

[me@linuxbox ~]$ find playground -type f -newer playground/timestamp

The results contain all 100 instances of file-B. Since we performed a touch on all the
files in the playground named file-B after we updated timestamp, they are now
"newer" than timestamp and thus can be identified with the -newer test.

Finally, let's go back to the bad permissions test we performed earlier and apply it to
playground. With our knowledge of operators and actions, we can add actions to this
command to apply new permissions to the files and directories in our playground:

[me@linuxbox ~]$ find playground \( -type f -not -perm 0600 -exec chmod 0600 '{}' ';' \) -or \( -type d -not -perm 0700 -exec chmod 0700 '{}' ';' \)

On a day-to-day basis, we might find it easier to issue two commands, one for the
directories and one for the files, rather than this one large compound command, but it's
nice to know that we can do it this way. The important point here is to understand how
the operators and actions can be used together to perform useful tasks.
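The timestamp workflow scales down to a tiny tree for experimentation. The paths below are invented:

```shell
#!/bin/sh
# Miniature timestamp workflow: only the file touched after the
# reference file is reported by the -newer test.
set -e
rm -rf /tmp/newer-demo
mkdir /tmp/newer-demo
touch /tmp/newer-demo/file-A /tmp/newer-demo/file-B
touch /tmp/newer-demo/timestamp
sleep 1                            # guarantee a later modification time
touch /tmp/newer-demo/file-B
find /tmp/newer-demo -type f -newer /tmp/newer-demo/timestamp > /tmp/newer-demo.out
cat /tmp/newer-demo.out            # only file-B
```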


Options

Finally, we have the options. The options are used to control the scope of a find search.
They may be included with other tests and actions when constructing find expressions.
Here is a list of the most commonly used ones:

-depth

Direct find to process a directory's files before the directory itself. This option is
automatically applied when the -delete action is specified.

-maxdepth levels

Set the maximum number of levels that find will descend into a directory tree when
performing tests and actions.

-mindepth levels

Set the minimum number of levels that find will descend into a directory tree before
applying tests and actions.

-mount

Direct find not to traverse directories that are mounted on other file systems.

-noleaf

Direct find not to optimize its search based on the assumption that it is searching a
Unix-like file system. This is needed when scanning DOS/Windows file systems and
CD-ROMs.


Summing Up

It's easy to see that locate is as simple as find is complicated. They both have their
uses. Take the time to explore the many features of find. It can, with regular use,
improve your understanding of Linux file system operations.

Further Reading

The locate, updatedb, find, and xargs programs are all part of the GNU Project's
findutils package. The GNU Project provides a website with extensive online
documentation:

http://www.gnu.org/software/findutils/
18 – Archiving And Backup

One of the primary tasks of a computer system's administrator is keeping the system's
data secure. One way this is done is by performing timely backups of the system's files.
Even if you're not a system administrator, it is often useful to make copies of things and
to move large collections of files from place to place and from device to device.

In this chapter, we will look at several common programs that are used to manage
collections of files: the file compression programs gzip and bzip2, the archiving
programs tar and zip, and the file synchronization program rsync.

Compressing Files

Throughout the history of computing, there has been a constant struggle to get the most
data into the smallest available space, whether that space be memory, storage devices, or
network bandwidth. Many of the data services that we take for granted today, such as
portable music players, high-definition television, or broadband Internet, owe their
existence to effective data compression techniques.

Data compression is the process of removing redundancy from data. Let's consider an
imaginary example. Say we had an entirely black picture file with the dimensions of 100
pixels by 100 pixels. In terms of data storage (assuming 24 bits, or 3 bytes per pixel), the
image will occupy 30,000 bytes of storage:

100 * 100 * 3 = 30,000

An image that is all one color contains entirely redundant data. If we were clever, we
could encode the data in such a way as to simply describe the fact that we have a block
of 10,000 black pixels. So, instead of storing a block of data containing 30,000 zeros
(black is usually represented in image files as zero), we could compress the data into the
number 10,000, followed by a zero to represent our data. Such a data compression
scheme is called run-length encoding and is one of the most rudimentary compression
techniques. Today's techniques are much more advanced and complex, but the basic goal
remains the same: get rid of redundant data.

Compression algorithms fall into two general categories: lossless and lossy. Lossless
compression preserves all the data contained in the original. This means that when a file
is restored from a compressed version, the restored file is exactly the same as the
original, uncompressed version. Lossy compression, on the other hand, removes data as
the compression is performed, to allow more compression to be applied. When a lossy
file is restored, it does not match the original version; rather, it is a close approximation.
Examples of lossy compression are JPEG (for images) and MP3 (for music). In our
discussion, we will look exclusively at lossless compression, since most data on
computers cannot tolerate any data loss.


gzip

The gzip program is used to compress one or more files. When executed, it replaces the
original file with a compressed version of the original. The corresponding gunzip
program is used to restore compressed files to their original, uncompressed form. Here is
an example:

[me@linuxbox ~]$ ls -l /etc > foo.txt
[me@linuxbox ~]$ ls -l foo.*
-rw-r--r-- 1 me me 15738 2008-10-14 07:15 foo.txt
[me@linuxbox ~]$ gzip foo.txt
[me@linuxbox ~]$ ls -l foo.*
-rw-r--r-- 1 me me 3230 2008-10-14 07:15 foo.txt.gz
[me@linuxbox ~]$ gunzip foo.txt
[me@linuxbox ~]$ ls -l foo.*
-rw-r--r-- 1 me me 15738 2008-10-14 07:15 foo.txt

In this example, we create a text file named foo.txt from a directory listing. Next, we
run gzip, which replaces the original file with a compressed version named
foo.txt.gz. In the directory listing of foo.*, we see that the original file has been
replaced with the compressed version, and that the compressed version is about one-fifth
the size of the original. We can also see that the compressed file has the same
permissions and timestamp as the original.

Next, we run the gunzip program to uncompress the file. Afterward, we can see that the
compressed version of the file has been replaced with the original, again with the
permissions and timestamp preserved.

gzip has many options.
Here are a few:

-c

Write output to standard output and keep the original files. May also be specified
with --stdout or --to-stdout.

-d

Decompress. This causes gzip to act like gunzip. May also be specified with
--decompress or --uncompress.

-f

Force compression even if a compressed version of the original file already exists.
May also be specified with --force.

-h

Display usage information. May also be specified with --help.

-l

List compression statistics for each file compressed. May also be specified with
--list.

-r

If one or more arguments on the command line are directories, recursively compress
files contained within them. May also be specified with --recursive.

-t

Test the integrity of a compressed file. May also be specified with --test.

-v

Display verbose messages while compressing. May also be specified with
--verbose.

-number

Set the amount of compression. number is an integer in the range of 1 (fastest, least
compression) to 9 (slowest, most compression). The default value is 6.

Going back to our earlier example:

[me@linuxbox ~]$ gzip foo.txt
[me@linuxbox ~]$ gzip -tv foo.txt.gz
foo.txt.gz:      OK
[me@linuxbox ~]$ gzip -d foo.txt.gz

Here, we replaced the file foo.txt with a compressed version named foo.txt.gz.
Next, we tested the integrity of the compressed version, using the -t and -v options.
Finally, we decompressed the file back to its original form.

gzip can also be used in interesting ways via standard input and output:

[me@linuxbox ~]$ ls -l /etc | gzip > foo.txt.gz

This command creates a compressed version of a directory listing.

The gunzip program, which uncompresses gzip files, assumes that filenames end in the
extension .gz, so it's not necessary to specify it, as long as the specified name is not in
conflict with an existing uncompressed file:

[me@linuxbox ~]$ gunzip foo.txt

If our goal were only to view the contents of a compressed text file, we could do this:

[me@linuxbox ~]$ gunzip -c foo.txt | less

Alternately, there is a program supplied with gzip, called zcat, that is equivalent to
gunzip with the -c option. It can be used like the cat command on gzip-compressed
files:

[me@linuxbox ~]$ zcat foo.txt.gz | less

Tip: There is a zless program, too. It performs the same function as the pipeline
above.
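A quick round trip through the pipe-oriented usage, with invented paths:

```shell
#!/bin/sh
# Compress with -c (keeping the original), then read the .gz back with
# zcat and verify that the round trip is lossless.
set -e
rm -f /tmp/gz-demo.txt /tmp/gz-demo.txt.gz /tmp/gz-demo.roundtrip
printf 'line one\nline two\nline two\n' > /tmp/gz-demo.txt
gzip -c /tmp/gz-demo.txt > /tmp/gz-demo.txt.gz
zcat /tmp/gz-demo.txt.gz > /tmp/gz-demo.roundtrip
cmp /tmp/gz-demo.txt /tmp/gz-demo.roundtrip && echo "identical"
```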
bzip2

The bzip2 program is similar to gzip but uses a different compression algorithm,
which achieves higher levels of compression at the cost of compression speed. In most
regards, it works in the same fashion as gzip. A file compressed with bzip2 is denoted
with the extension .bz2:

[me@linuxbox ~]$ ls -l /etc > foo.txt
[me@linuxbox ~]$ ls -l foo.txt
-rw-r--r-- 1 me me 15738 2008-10-17 13:51 foo.txt
[me@linuxbox ~]$ bzip2 foo.txt
[me@linuxbox ~]$ ls -l foo.txt.bz2
-rw-r--r-- 1 me me 2792 2008-10-17 13:51 foo.txt.bz2
[me@linuxbox ~]$ bunzip2 foo.txt.bz2

As we can see, bzip2 can be used the same way as gzip. Note, however, that the
compression level option (-number) has a somewhat different meaning to bzip2. The
corresponding bunzip2 and bzcat programs are used for decompressing files.

bzip2 also comes with the bzip2recover program, which will try to recover damaged
.bz2 files.


Don't Be Compressive Compulsive

I occasionally see people attempting to compress a file that has already been compressed
with an effective compression algorithm, by doing something like this:

$ gzip picture.jpg

Don't do it. You're probably just wasting time and space! If you apply compression to a
file that is already compressed, you will actually end up with a larger file. This is
because all compression techniques involve some overhead that is added to the file to
describe the compression. If you try to compress a file that already contains no redundant
information, the compression will not result in any savings to offset the additional
overhead.
Archiving Files

A companion function to compression is archiving. Archiving is the process of gathering
up many files and bundling them together into a single large file. Archiving is often done
as a part of system backups. It is also used when old data is moved from a system to
some type of long-term storage.

tar

In the Unix-like world of software, the tar program is the classic tool for archiving files.
Its name, short for tape archive, reveals its roots as a tool for making backup tapes.
While it is still used for that traditional task, it is equally adept on other storage devices
as well. We often see filenames that end with the extension .tar or .tgz, which indicate a
"plain" tar archive and a gzipped archive, respectively. A tar archive can consist of a
group of separate files, one or more directory hierarchies, or a mixture of both. The
command syntax works like this:

tar mode[options] pathname...

where mode is one of the following operating modes (only a partial list is shown here;
see the tar man page for a complete list):

c

Create an archive from a list of files and/or directories.

x

Extract an archive.

r

Append specified pathnames to the end of an archive.

t

List the contents of an archive.
tar uses a slightly odd way of expressing options, so we'll need some examples to show
how it works. First, let's re-create our playground from the previous chapter:

[me@linuxbox ~]$ mkdir -p playground/dir-{001..100}
[me@linuxbox ~]$ touch playground/dir-{001..100}/file-{A..Z}

Next, let's create a tar archive of the entire playground:

[me@linuxbox ~]$ tar cf playground.tar playground

This command creates a tar archive named playground.tar that contains the entire
playground directory hierarchy. We can see that the mode and the f option, which is used
to specify the name of the tar archive, may be joined together, and do not require a
leading dash. Note, however, that the mode must always be specified first, before any
other option.

To list the contents of the archive, we can do this:

[me@linuxbox ~]$ tar tf playground.tar

Now, let's extract the playground in a new location. We will do this by creating a new
directory named foo, changing into it, and extracting the tar archive:

[me@linuxbox ~]$ mkdir foo
[me@linuxbox ~]$ cd foo
[me@linuxbox foo]$ tar xf ../playground.tar
[me@linuxbox foo]$ ls
playground

If we examine the contents of ~/foo/playground, we see that the archive was
successfully installed, creating a precise reproduction of the original files. There is one
caveat, however: Unless you are operating as the superuser, files and directories extracted
from archives take on the ownership of the user performing the restoration, rather than
the original owner.
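The create/extract cycle can be verified end to end on a small invented tree:

```shell
#!/bin/sh
# Archive a small tree, extract it in another directory, and verify
# that the copy matches the original.
set -e
rm -rf /tmp/tar-demo
mkdir -p /tmp/tar-demo/playground/dir-001
echo "hello" > /tmp/tar-demo/playground/dir-001/file-A
cd /tmp/tar-demo
tar cf playground.tar playground
mkdir foo
cd foo
tar xf ../playground.tar
cat playground/dir-001/file-A    # prints "hello"
```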
Another interesting behavior of tar is the way it handles pathnames in archives. The
default for pathnames is relative, rather than absolute. tar does this by simply removing
any leading slash from the pathname when creating the archive. To demonstrate, we will
re-create our archive, this time specifying an absolute pathname:

[me@linuxbox foo]$ cd
[me@linuxbox ~]$ tar cf playground2.tar ~/playground

Remember, ~/playground will expand into /home/me/playground when we press
the enter key, so we will get an absolute pathname for our demonstration. Next, we
will extract the archive as before and watch what happens:

[me@linuxbox ~]$ cd foo
[me@linuxbox foo]$ tar xf ../playground2.tar
[me@linuxbox foo]$ ls
home
playground
[me@linuxbox foo]$ ls home
me
[me@linuxbox foo]$ ls home/me
playground

Here we can see that when we extracted our second archive, it re-created the directory
home/me/playground relative to our current working directory, ~/foo, not relative
to the root directory, as would have been the case with an absolute pathname. This may
seem like an odd way for it to work, but it's actually more useful this way, as it allows us
to extract archives to any location rather than being forced to extract them to their
original locations. Repeating the exercise with the inclusion of the verbose option (v)
will give a clearer picture of what's going on.
Let's consider a hypothetical, yet practical, example of tar in action. Imagine we want
to copy the home directory and its contents from one system to another and we have a
large USB hard drive that we can use for the transfer. On our modern Linux system, the
drive is "automagically" mounted in the /media directory. Let's also imagine that the
disk has a volume name of BigDisk when we attach it. To make the tar archive, we
can do this:

[me@linuxbox ~]$ sudo tar cf /media/BigDisk/home.tar /home

After the tar file is written, we unmount the drive and attach it to the second computer.
Again, it is mounted at /media/BigDisk. To extract the archive, we do this:

[me@linuxbox2 ~]$ cd /
[me@linuxbox2 /]$ sudo tar xf /media/BigDisk/home.tar

What's important to see here is that we must first change directory to /, so that the
extraction is relative to the root directory, since all pathnames within the archive are
relative.

When extracting an archive, it's possible to limit what is extracted from the archive. For
example, if we wanted to extract a single file from an archive, it could be done like this:

tar xf archive.tar pathname

By adding the trailing pathname to the command, tar will only restore the specified file.
Multiple pathnames may be specified. Note that the pathname must be the full, exact
relative pathname as stored in the archive. When specifying pathnames, wildcards are
not normally supported; however, the GNU version of tar (which is the version found in
most Linux distributions) supports them with the --wildcards option. Here is an
example using our previous playground2.tar archive:

[me@linuxbox ~]$ cd foo
[me@linuxbox foo]$ tar xf ../playground2.tar --wildcards 'home/me/playground/dir-*/file-A'

This command will extract only files matching the specified pathname, including the
wildcard dir-*.

tar is often used in conjunction with find to produce archives. Here is an example:

[me@linuxbox ~]$ find playground -name 'file-A' -exec tar rf playground.tar '{}' '+'

Here we use find to match all the files in playground named file-A and then, using
the -exec action, we invoke tar in the append mode (r) to add the matching files
to the archive playground.tar.

Using tar with find is a good way of creating incremental backups of a directory tree
or an entire system. By using find to match files newer than a timestamp file, we could
create an archive that only contains files newer than the last archive, assuming that the
timestamp file is updated right after each archive is created.

tar can also make use of both standard input and output. Here is a comprehensive
example:

[me@linuxbox foo]$ cd
[me@linuxbox ~]$ find playground -name 'file-A' | tar cf - --files-from=- | gzip > playground.tgz

In this example, we used the find program to produce a list of matching files and piped
them into tar. If the filename - is specified, it is taken to mean standard input or
output, as needed. (By the way, this convention of using - to represent standard
input/output is used by a number of other programs, too.) The --files-from option
(which may also be specified as -T) causes tar to read its list of pathnames from a file
rather than from the command line. Lastly, the archive produced by tar is piped into
gzip to create the compressed archive playground.tgz. The .tgz extension is the
conventional extension given to gzip-compressed tar files.

While we used the gzip program externally to produce our compressed archive, modern
versions of GNU tar support both gzip and bzip2 compression directly with the use of
the z and j options, respectively. Using our previous example as a base, we can simplify
it this way:

[me@linuxbox ~]$ find playground -name 'file-A' | tar czf playground.tgz -T -

If we had wanted to create a bzip2-compressed archive instead, we could have done this:

[me@linuxbox ~]$ find playground -name 'file-A' | tar cjf playground.tbz -T -

By simply changing the compression option from z to j (and changing the output file's
extension to .tbz to indicate a bzip2-compressed file) we enabled bzip2 compression.
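The z option round trip can be checked on a miniature invented playground:

```shell
#!/bin/sh
# Create a gzip-compressed archive with the z option, driven by a file
# list fed through -T -, then list it back with tzf.
set -e
rm -rf /tmp/tgz-demo
mkdir -p /tmp/tgz-demo/playground
touch /tmp/tgz-demo/playground/file-A
cd /tmp/tgz-demo
find playground -name 'file-A' | tar czf playground.tgz -T -
tar tzf playground.tgz   # playground/file-A
```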
Another interesting use of standard input and output with the tar command involves
transferring files between systems over a network. Imagine that we had two machines
running a Unix-like system equipped with tar and ssh. In such a scenario, we could
transfer a directory from a remote system (named remote-sys for this example) to our
local system:

[me@linuxbox ~]$ mkdir remote-stuff
[me@linuxbox ~]$ cd remote-stuff
[me@linuxbox remote-stuff]$ ssh remote-sys 'tar cf - Documents' | tar xf -

Here we copied a directory named Documents from the remote system remote-sys
to the local system. How did we do this? First, we launched the tar program on the
remote system using ssh. You will recall that ssh allows us to execute a program
remotely on a networked computer and "see" the results on the local system; the
standard output produced on the remote system is sent to the local system for viewing.
We can take advantage of this by having tar create an archive (the c mode) and send it
to standard output, rather than a file (the f option with the dash argument), thereby
transporting the archive over the encrypted tunnel provided by ssh to the local system.
On the local system, we executed tar with the x mode to expand an archive supplied
from standard input (again with the dash argument).


zip

The zip program is both a compression tool and an archiver. The file format used by the
program is familiar to Windows users, as it reads and writes .zip files. In Linux,
however, gzip is the predominant compression program, with bzip2 being a close
second.

In its most basic usage, zip is invoked like this:

zip options zipfile file...

For example, to make a zip archive of our playground, we would do this:

[me@linuxbox ~]$ zip -r playground.zip playground

Unless we include the -r option for recursion, only the playground directory (but
none of its contents) is stored. Although the addition of the extension .zip is automatic,
we will include the file extension for clarity.

During the creation of the zip archive, zip will normally display a series of messages
showing the status of each file added to the archive. zip will add files to the archive
using one of two storage methods: Either it will "store" a file without compression, or it
will "deflate" the file, which performs compression. The numeric value displayed after
the storage method indicates the amount of compression achieved. Since our playground
only contains empty files, no compression is performed on its contents.

Extracting the contents of a zip file is straightforward when using the unzip program:

[me@linuxbox ~]$ cd foo
[me@linuxbox foo]$ unzip ../playground.zip

One thing to note about zip (as opposed to tar) is that if an existing archive is
specified, it is updated rather than replaced. This means that the existing archive is
preserved, but new files are added and matching files are replaced.
Files may be listed and extracted selectively from a zip archive by specifying them to
unzip:

[me@linuxbox ~]$ unzip -l playground.zip playground/dir-087/file-Z
Archive:  playground.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        0  10-05-08 09:25   playground/dir-087/file-Z
 --------                   -------
        0                   1 file
[me@linuxbox ~]$ cd foo
[me@linuxbox foo]$ unzip ../playground.zip playground/dir-087/file-Z
Archive:  ../playground.zip
replace playground/dir-087/file-Z? [y]es, [n]o, [A]ll, [N]one, [r]ename: y
 extracting: playground/dir-087/file-Z

Using the -l option causes unzip to merely list the contents of the archive without
extracting the file. If no file(s) are specified, unzip will list all files in the archive. The
-v option can be added to increase the verbosity of the listing. Note that when the
archive extraction conflicts with an existing file, the user is prompted before the file is
replaced.

Like tar, zip can make use of standard input and output, though its implementation is
somewhat less useful. It is possible to pipe a list of filenames to zip via the -@ option:

[me@linuxbox foo]$ cd
[me@linuxbox ~]$ find playground -name "file-A" | zip -@ file-A.zip

Here we use find to generate a list of files matching the test -name "file-A", and
then pipe the list into zip, which creates the archive file-A.zip containing the
selected files.

zip also supports writing its output to standard output, but its use is limited because very
few programs can make use of the output. Unfortunately, the unzip program does not
accept standard input. This prevents zip and unzip from being used together to
perform network file copying like tar.

zip can, however, accept standard input, so it can be used to compress the output of
other programs:

[me@linuxbox ~]$ ls -l /etc/ | zip ls-etc.zip -
  adding: - (deflated 80%)

In this example, we pipe the output of ls into zip. Like tar, zip interprets the
trailing dash as "use standard input for the input file."

The unzip program allows its output to be sent to standard output when the -p (for
pipe) option is specified:

[me@linuxbox ~]$ unzip -p ls-etc.zip | less

These are some of the basic things that zip and unzip can do. They both have a lot of
options that add to their flexibility, though some are platform-specific to other systems.
The man pages for both zip and unzip are pretty good and contain useful examples.
However, the main use of these programs is for exchanging files with Windows systems,
rather than performing compression and archiving on Linux, where tar and gzip are
greatly preferred.
Synchronizing Files And Directories

A common strategy for maintaining a backup copy of a system involves keeping one or
more directories synchronized with another directory (or directories) located on either
the local system (usually a removable storage device of some kind) or a remote system.
We might, for example, have a local copy of a website under development and
synchronize it from time to time with the "live" copy on a remote web server.

In the Unix-like world, the preferred tool for this task is rsync. This program can
synchronize both local and remote directories by using the rsync remote-update protocol,
which allows rsync to quickly detect the differences between two directories and
perform the minimum amount of copying required to bring them into sync. This makes
rsync very fast and economical to use, compared to other kinds of copy programs.

rsync is invoked like this:

rsync options source destination

where source and destination are one of the following:

- A local file or directory

- A remote file or directory in the form of [user@]host:path

- A remote rsync server specified with a URI of rsync://[user@]host[:port]/path

Note that either the source or the destination must be a local file. Remote-to-remote
copying is not supported.

Let's try rsync out on some local files. First, let's synchronize the playground
directory with a corresponding copy in foo:

[me@linuxbox ~]$ rsync -av playground foo

We've included both the -a option (for archiving, which causes recursion and
preservation of file attributes) and the -v option (verbose output) to make a mirror of the
playground directory within foo. While the command runs, we will see a list of the
files and directories being copied. At the end, we will see a summary message indicating
the amount of copying performed, something like this:

sent 387258 bytes  received ... bytes  ... bytes/sec

If we run the command again, we will see a different result:

[me@linuxbox ~]$ rsync -av playground foo
building file list ... done

sent ... bytes  received ... bytes  45310.00 bytes/sec

Notice that there were no files listed. This is because rsync detected that there were
no differences between ~/playground and ~/foo/playground, and therefore it
didn't need to copy anything. If we update a file in playground and run rsync again:

[me@linuxbox ~]$ touch playground/dir-099/file-Z
[me@linuxbox ~]$ rsync -av playground foo
building file list ... done
playground/dir-099/file-Z
sent 22685 bytes  received 42 bytes  45454.00 bytes/sec

we see that rsync detected the change and copied only the updated file.
As a practical example, let's consider the imaginary external hard drive that we used
earlier with tar. If we attach the drive to our system and, once again, it is mounted at
/media/BigDisk, we can perform a useful system backup by first creating a directory
named /backup on the external drive, and then using rsync to copy the most
important stuff from our system to the external drive:

[me@linuxbox ~]$ mkdir /media/BigDisk/backup
[me@linuxbox ~]$ sudo rsync -av --delete /etc /home /usr/local /media/BigDisk/backup

In this example, we copied the /etc, /home, and /usr/local directories from our
system to our imaginary storage device. We included the --delete option to remove
files that may have existed on the backup device that no longer exist on the source
device (this is irrelevant the first time we make a backup, but will be useful on
subsequent copies). Repeating the procedure of attaching the external drive and running
this rsync command would be a useful (though not ideal) way of keeping a small
system backed up. Of course, an alias would be helpful here, too. We could create an
alias and add it to our .bashrc file to provide this feature:

alias backup='sudo rsync -av --delete /etc /home /usr/local /media/BigDisk/backup'


Using rsync Over A Network

One of the real beauties of rsync is that it can be used to copy files over a network.
After all, the "r" in rsync stands for "remote." Remote copying can be done in one of two
ways. The first way is with another system that has rsync installed, along with a
remote shell program, such as ssh. Let's say we had another system on our local
network with a lot of available hard drive space and we wanted to perform our backup
operation using the remote system instead of an external drive. Assuming that it already
had a directory named /backup where we could deliver our files, we could do this:

[me@linuxbox ~]$ sudo rsync -av --delete --rsh=ssh /etc /home /usr/local remote-sys:/backup

We made two changes to our command to facilitate the network copy. First, we added the
--rsh=ssh option, which instructs rsync to use the ssh program as its remote shell.
In this way, we were able to use an ssh encrypted tunnel to securely transfer the data
from the local system to the remote host. Second, we specified the remote host by
prefixing its name (in this case the remote host is named remote-sys) to the
destination pathname.

The second way that rsync can be used to synchronize files over a network is by using
an rsync server. rsync can be configured to run as a daemon and listen to incoming
requests for synchronization. This is often done to allow mirroring of a remote system.
For example, Red Hat Software maintains a large repository of software packages under
development for its Fedora distribution. It is useful for software testers to mirror this
collection during the testing phase of the distribution release cycle. Since files in the
repository change frequently (often more than once a day), it is desirable to maintain a
local mirror by periodic synchronization, rather than by bulk copying of the repository.
We could mirror one such repository using our local copy of rsync and a remote
rsync server like this:

[me@linuxbox ~]$ mkdir fedora-devel
[me@linuxbox ~]$ rsync -av --delete rsync://rsync.gtlib.gatech.edu/fedora-linux-core/development/i386/os fedora-devel

In this example, we use the URI of the remote rsync server, which consists of a
protocol (rsync://), followed by the remote hostname (rsync.gtlib.gatech.edu),
followed by the pathname of the repository.


Summing Up

We've looked at the common compression and archiving programs used on Linux and
other Unix-like operating systems. For archiving files, the tar/gzip combination is the
preferred method on Unix-like systems, while zip/unzip is used for interoperability
with Windows systems. Finally, we looked at the rsync program (a personal favorite),
which is very handy for efficient synchronization of files and directories across systems.

Further Reading

The man pages for all of the commands discussed here are pretty clear and contain
useful examples. In addition, the GNU Project has a good online manual for its version
of tar:

http://www.gnu.org/software/tar/manual/index.html

19 – Regular Expressions

In the next few chapters, we are going to look at tools used to manipulate text. As we
have seen, text data plays an important role on all Unix-like systems, such as Linux. But
before we can fully appreciate all of the features offered by these tools, we have to first
examine a technology that is frequently associated with the most sophisticated uses of
these tools: regular expressions.

As we have navigated the many features and facilities offered by the command line, we
have encountered some truly arcane shell features and commands. Regular expressions
continue this "tradition" and may be (arguably) the most arcane feature of them all. This
is not to suggest that the time it takes to learn about them is not worth the effort. Quite
the contrary. A good understanding will enable us to perform amazing feats, though their
full value may not be immediately apparent.

What Are Regular Expressions?

Simply put, regular expressions are symbolic notations used to identify patterns in text.
In some ways, they resemble the shell's wildcard method of matching file and
pathnames, but on a much grander scale. Regular expressions are supported by many
command line tools and by most programming languages to facilitate the solution of
text manipulation problems. However, not all regular expressions are the same; they vary
slightly from tool to tool and from programming language to language. For our
discussion, we will limit ourselves to regular expressions as described in the POSIX
standard (which will cover most of the command line tools), as opposed to many
programming languages (most notably Perl), which use slightly larger and richer sets of
notations.

grep

The main program we will use to work with regular expressions is our old pal, grep.
The name "grep" is actually derived from the phrase "global regular expression print," so
we can see that grep has something to do with regular expressions. In essence, grep
searches text files for the occurrence of a specified regular expression and outputs any
line containing a match to standard output.

So far, we have used grep with fixed strings, like so:

[me@linuxbox ~]$ ls /usr/bin | grep zip

This will list all the files in the /usr/bin directory whose names contain the substring
"zip".

The grep program accepts options and arguments this way:

grep [options] regex [file...]

where regex is a regular expression. Here is a list of the commonly used grep
options:

-i

Ignore case. Do not distinguish between uppercase and lowercase characters. May
also be specified --ignore-case.

-v

Invert match. Normally, grep prints lines that contain a match. This option causes
grep to print every line that does not contain a match. May also be specified
--invert-match.

-c

Print the number of matches (or non-matches if the -v option is also specified)
instead of the lines themselves. May also be specified --count.

-l

Print the name of each file that contains a match instead of the lines themselves.
May also be specified --files-with-matches.

-L

Like the -l option, but print only the names of files that do not contain matches.
May also be specified --files-without-match.

-n

Prefix each matching line with the number of the line within the file. May also be
specified --line-number.

-h

For multi-file searches, suppress the output of filenames. May also be specified
--no-filename.

In order to more fully explore grep, let's create some text files to search:

[me@linuxbox ~]$ ls /bin > dirlist-bin.txt
[me@linuxbox ~]$ ls /usr/bin > dirlist-usr-bin.txt
[me@linuxbox ~]$ ls /sbin > dirlist-sbin.txt
[me@linuxbox ~]$ ls /usr/sbin > dirlist-usr-sbin.txt
[me@linuxbox ~]$ ls dirlist*.txt
dirlist-bin.txt      dirlist-sbin.txt
dirlist-usr-bin.txt  dirlist-usr-sbin.txt

We can perform a simple search of our list of files like this:

[me@linuxbox ~]$ grep bzip dirlist*.txt
dirlist-bin.txt:bzip2
dirlist-bin.txt:bzip2recover

In this example, grep searches all of the listed files for the string bzip and finds two
matches, both in the file dirlist-bin.txt. If we were only interested in the list of
files that contained matches rather than the matches themselves, we could specify the -l
option:

[me@linuxbox ~]$ grep -l bzip dirlist*.txt
dirlist-bin.txt

Conversely, if we wanted only to see a list of the files that did not contain a match, we
could do this:

[me@linuxbox ~]$ grep -L bzip dirlist*.txt
dirlist-sbin.txt
dirlist-usr-bin.txt
dirlist-usr-sbin.txt

Metacharacters And Literals

While it may not seem apparent, our grep searches have been using regular expressions
all along, albeit very simple ones. The characters in the string "bzip" are all literal
characters, in that they match themselves. In addition to literals, regular expressions may
also include metacharacters that are used to specify more complex matches. Regular
expression metacharacters consist of the following:

^ $ . [ ] { } - ? * + ( ) | \

All other characters are considered literals, though the backslash character is used in a
few cases to create metasequences, as well as to allow the metacharacters to be escaped
and treated as literals instead of being interpreted as metacharacters.

Note: As we can see, many of the regular expression metacharacters are also characters
that have meaning to the shell when expansion is performed. When we pass regular
expressions containing metacharacters on the command line, it is vital that they be
enclosed in quotes to prevent the shell from attempting to expand them.


The Any Character

The first metacharacter we will look at is the dot or period character, which is used to
match any character. If we include it in a regular expression, it will match any character
in that character position. Here's an example:

[me@linuxbox ~]$ grep -h '.zip' dirlist*.txt
bunzip2
bzip2
bzip2recover
gunzip
gzip
funzip
gpg-zip
preunzip
prezip
prezip-bin
unzip
unzipsfx

We searched for any line in our files that matches the regular expression ".zip". There are
a couple of interesting things to note about the results. Notice that the zip program was
not found. This is because the inclusion of the dot metacharacter in our regular
expression increased the length of the required match to four characters, and because the
name "zip" only contains three, it does not match. Also, if any files in our lists had
contained the file extension .zip, they would have been matched as well, because the
period character in the file extension is treated as "any character," too.

Anchors

In regular expressions, the caret (^) and dollar sign ($) characters are treated as anchors.

This means that they cause the match to occur only if the regular expression is found at
the beginning of the line (^) or at the end of the line ($):

[me@linuxbox ~]$ grep -h '^zip' dirlist*.txt
zip
zipcloak
zipgrep
zipinfo
zipnote
zipsplit
[me@linuxbox ~]$ grep -h 'zip$' dirlist*.txt
gunzip
gzip
funzip
gpg-zip
preunzip
prezip
unzip
zip
[me@linuxbox ~]$ grep -h '^zip$' dirlist*.txt
zip

Here we searched the list of files for the string "zip" located at the beginning of the line,
the end of the line, and on a line where it is at both the beginning and the end of the line
(i.e., by itself on the line). Note that the regular expression '^$' (a beginning and an end
with nothing in between) will match blank lines.
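The anchor behavior can be confirmed on a small invented word list:

```shell
#!/bin/sh
# Anchors at work: "zip" at the start of a line, at the end, and alone.
set -e
printf 'zipcloak\ngunzip\nzip\nbzip2\n' > /tmp/anchor-demo.txt
grep '^zip' /tmp/anchor-demo.txt    # zipcloak, zip
grep 'zip$' /tmp/anchor-demo.txt    # gunzip, zip
grep '^zip$' /tmp/anchor-demo.txt   # zip only
```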

A Crossword Puzzle Helper

My wife loves crossword puzzles and she will sometimes ask me for help with a
particular question. Something like, "what's a five letter word whose third letter is 'j' and
last letter is 'r' that means...?" This kind of question got me thinking.

Did you know that your Linux system contains a dictionary? It does. Take a look in the
/usr/share/dict directory and you might find one, or several. The dictionary files
located there are just long lists of words, one per line, arranged in alphabetical order. On
my system, the words file contains just over 98,500 words. To find possible answers to
the crossword puzzle question above, we could do this:

[me@linuxbox ~]$ grep -i '^..j.r$' /usr/share/dict/words
major

Using this regular expression, we can find all the words in our dictionary file that are
five letters long and have a "j" in the third position and an "r" in the last position.


Bracket Expressions And Character Classes

In addition to matching any character at a given position in our regular expression, we
can also match a single character from a specified set of characters by using bracket
expressions. With bracket expressions, we can specify a set of characters (including
characters that would otherwise be interpreted as metacharacters) to be matched. In this
example, using a two-character set:

[me@linuxbox ~]$ grep -h '[bg]zip' dirlist*.txt
bzip2
bzip2recover
gzip

we match any line that contains the string "bzip" or "gzip".

A set may contain any number of characters, and metacharacters lose their special
meaning when placed within brackets. However, there are two cases in which
metacharacters are used within bracket expressions and have different meanings. The
first is the caret (^), which is used to indicate negation; the second is the dash (-), which
is used to indicate a character range.

Negation

If the first character in a bracket expression is a caret (^), the remaining characters are
taken to be a set of characters that must not be present at the given character position.
We do this by modifying our previous example:

[me@linuxbox ~]$ grep -h '[^bg]zip' dirlist*.txt
bunzip2
gunzip
funzip
gpg-zip
preunzip
prezip
prezip-bin
unzip
unzipsfx

With negation activated, we get a list of files that contain the string "zip" preceded by
any character except "b" or "g". Notice that the file zip was not found. A negated
character set still requires a character at the given position, but the character must not be
a member of the negated set.

The caret character only invokes negation if it is the first character within a bracket
expression; otherwise, it loses its special meaning and becomes an ordinary character in
the set.
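Both forms can be exercised on a small invented list:

```shell
#!/bin/sh
# Bracket expressions: [bg]zip matches lines containing bzip or gzip,
# while the negated [^bg]zip requires some other character before "zip".
set -e
printf 'bzip2\ngzip\nfunzip\nzip\n' > /tmp/bracket-demo.txt
grep '[bg]zip' /tmp/bracket-demo.txt    # bzip2, gzip
grep '[^bg]zip' /tmp/bracket-demo.txt   # funzip; bare "zip" has no preceding character
```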
Traditional Character Ranges

If we wanted to construct a regular expression that would find every file in our lists
beginning with an uppercase letter, we could do this:

[me@linuxbox ~]$ grep -h '^[ABCDEFGHIJKLMNOPQRSTUVWXYZ]' dirlist*.txt

It's just a matter of putting all 26 uppercase letters in a bracket expression. But the idea
of all that typing is deeply troubling, so there is another way:

[me@linuxbox ~]$ grep -h '^[A-Z]' dirlist*.txt
MAKEDEV
ControlPanel
GET
HEAD
POST
X
X11
Xorg
MAKEFLOPPIES
NetworkManager
NetworkManagerDispatcher

By using a three-character range, we can abbreviate the 26 letters. Any range of
characters can be expressed this way, including multiple ranges, such as this expression
that matches all filenames starting with letters and numbers:

[me@linuxbox ~]$ grep -h '^[A-Za-z0-9]' dirlist*.txt

In character ranges, we see that the dash character is treated specially, so how do we
actually include a dash character in a bracket expression? By making it the first character
in the expression. Consider these two examples:

[me@linuxbox ~]$ grep -h '[A-Z]' dirlist*.txt

This will match every filename containing an uppercase letter. While:

[me@linuxbox ~]$ grep -h '[-AZ]' dirlist*.txt

will match every filename containing a dash, or an uppercase "A" or an uppercase "Z".

POSIX Character Classes

The traditional character ranges are an easily understood and effective way to handle the
problem of quickly specifying sets of characters.
Unfortunately, they don't always work. While we have not encountered any problems
with our use of grep so far, we might run into problems using other programs.

Back in Chapter 4, we looked at how wildcards are used to perform pathname expansion.
In that discussion, we said that character ranges could be used in a manner almost
identical to the way they are used in regular expressions, but here's the problem:

[me@linuxbox ~]$ ls /usr/sbin/[ABCDEFGHIJKLMNOPQRSTUVWXYZ]*

(Depending on the Linux distribution, we will get a different list of files. This example
is from Ubuntu.) This command produces the expected result, a list of only the files
whose names begin with an uppercase letter. But with this command:

[me@linuxbox ~]$ ls /usr/sbin/[A-Z]*

we get an entirely different result. Why is that? It's a long story, but here's the short
version: Back when Unix was first developed, it only knew about ASCII characters, and
this feature reflects that fact. In ASCII, the first 32 characters (numbers 0-31) are control
codes (things like tabs, backspaces, and carriage returns). The next 32 (32-63) contain
printable characters, including most punctuation characters and the numerals zero
through nine. The next 32 (numbers 64-95) contain the uppercase letters and a few more
punctuation symbols. The final 32 (numbers 96-127) contain the lowercase letters and
yet more punctuation symbols. Based on this arrangement, systems using ASCII used a
collation order that looked like this:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

rather than proper dictionary order, which is like this:

aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ

As the popularity of Unix spread beyond the United States, there grew a need to support
characters not found in U.S. English. The ASCII table was expanded to use a full eight
bits, adding characters numbers 128-255, which accommodated many more languages.
To support this ability, the POSIX standards introduced a concept called a locale, which
could be adjusted to select the character set needed for a particular location. We can see
the language setting of our system using this command:

[me@linuxbox ~]$ echo $LANG
en_US.UTF-8

With this setting, POSIX-compliant applications will use a dictionary collation order
rather than ASCII order. This explains the behavior of the commands above. A character
range of [A-Z], when interpreted in dictionary order, includes all of the alphabetic
characters except the lowercase "a", hence our results.

To partially work around this problem, the POSIX standard includes a number of
character classes, which provide useful ranges of characters. They are described below:

[:alnum:]

The alphanumeric characters. In ASCII, equivalent to:
[A-Za-z0-9]

[:word:]

The same as [:alnum:], with the addition of the underscore (_) character.

[:alpha:]

The alphabetic characters. In ASCII, equivalent to:
[A-Za-z]

[:blank:]

Includes the space and tab characters.

[:cntrl:]

The ASCII control codes. Includes the ASCII characters 0 through 31 and 127.

[:digit:]

The numerals zero through nine.

[:graph:]

The visible characters. In ASCII, it includes characters 33 through 126.

[:lower:]

The lowercase letters.

[:punct:]

The punctuation characters. In ASCII, equivalent to:
[-!"#$%&'()*+,./:;<=>?@[\\\]_`{|}~]

[:print:]

The printable characters. All the characters in [:graph:] plus the space character.

[:space:]

The whitespace characters, including space, tab, carriage return, newline, vertical
tab, and form feed. In ASCII, equivalent to:
[ \t\r\n\v\f]

[:upper:]

The uppercase characters.

[:xdigit:]

Characters used to express hexadecimal numbers. In ASCII, equivalent to:
[0-9A-Fa-f]

Even with the character classes, there is still no convenient way to express partial ranges,
such as [A-M]. Using character classes, we can repeat our directory listing and see an
improved result:

[me@linuxbox ~]$ ls /usr/sbin/[[:upper:]]*

Remember, however, that this is not an example of a regular expression; rather, it is the
shell performing pathname expansion. We show it here because POSIX character classes
can be used for both.
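A small rehearsal of the class in a regular expression, with invented filenames:

```shell
#!/bin/sh
# A POSIX character class gives a locale-safe way to match uppercase
# initials in a regular expression (and equally in a shell glob).
set -e
rm -rf /tmp/class-demo
mkdir /tmp/class-demo
touch /tmp/class-demo/Alpha /tmp/class-demo/beta /tmp/class-demo/Gamma
ls /tmp/class-demo | grep '^[[:upper:]]' > /tmp/class-demo.out
cat /tmp/class-demo.out   # Alpha and Gamma, regardless of locale
```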
Reverting To Traditional Collation Order

You can opt to have your system use the traditional (ASCII) collation order by changing
the value of the LANG environment variable. As we saw above, the LANG variable
contains the name of the language and character set used in your locale. This value was
originally determined when you selected an installation language as Linux was installed.

To see the locale settings, use the locale command:

[me@linuxbox ~]$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"

To change the locale to use the traditional Unix behaviors, set the LANG variable to
POSIX:

[me@linuxbox ~]$ export LANG=POSIX

Note that this change converts the system to use U.S. English (more specifically, ASCII)
for its character set, so be sure this is really what you want. You can make this change
permanent by adding this line to your .bashrc file:

export LANG=POSIX
POSIX Basic Vs. Extended Regular Expressions

Just when we thought this couldn't get any more confusing, we discover that POSIX also
splits regular expression implementations into two kinds: basic regular expressions
(BRE) and extended regular expressions (ERE). The features we have covered so far are
supported by any application that is POSIX-compliant and implements BRE. Our grep
program is one such program.

What's the difference between BRE and ERE? It's a matter of metacharacters. With BRE,
the following metacharacters are recognized:

^ $ . [ ] *

All other characters are considered literals. With ERE, the following metacharacters (and
their associated functions) are added:

( ) { } ? + |

However (and this is the fun part), the "(", ")", "{", and "}" characters are treated as
metacharacters in BRE if they are escaped with a backslash, whereas with ERE,
preceding any metacharacter with a backslash causes it to be treated as a literal.

Since the features we are going to discuss next are part of ERE, we are going to need to
use a different grep.


POSIX
During the 1980’s, Unix became a very popular commercial operating system, but
by 1988, the Unix world was in turmoil
...
However, in their efforts to
create product differentiation, each manufacturer added proprietary changes and
extensions
...
As always with
proprietary vendors, each was trying to play a winning game of “lock-in” with
their customers
In the mid-1980s, the IEEE began developing a set of standards that would define how
Unix (and Unix-like) systems would perform. These standards, formally known as
IEEE 1003, define the application programming interfaces (APIs), shell and utilities
that are to be found on a standard Unix-like system.
The name “POSIX,”
which stands for Portable Operating System Interface (with the “X” added to the
end for extra snappiness), was suggested by Richard Stallman (yes, that Richard
Stallman), and was adopted by the IEEE
Alternation
The first of the extended regular expression features we will discuss is called alternation,
which is the facility that allows a match to occur from among a set of expressions. Just as
a bracket expression allows a single character to match from a set of specified characters,
alternation allows matches from a set of strings or other regular expressions.
First, let’s try a plain old
string match:
[me@linuxbox ~]$ echo "AAA" | grep AAA
AAA
[me@linuxbox ~]$ echo "BBB" | grep AAA
[me@linuxbox ~]$

A pretty straightforward example, in which we pipe the output of echo into grep and
see the results
...

Now we'll add alternation, signified by the vertical-bar metacharacter:

[me@linuxbox ~]$ echo "AAA" | grep -E 'AAA|BBB'
AAA
[me@linuxbox ~]$ echo "BBB" | grep -E 'AAA|BBB'
BBB
[me@linuxbox ~]$ echo "CCC" | grep -E 'AAA|BBB'
[me@linuxbox ~]$

Here we see the regular expression 'AAA|BBB', which means "match either the string
AAA or the string BBB." Notice that since this is an extended feature, we used the -E
option, and we placed the regular expression in quotes to prevent the shell from
interpreting the vertical-bar metacharacter as a pipe operator.
Alternation is not limited to two choices:
[me@linuxbox ~]$ echo "AAA" | grep -E 'AAA|BBB|CCC'
AAA

To combine alternation with other regular expression elements, we can use () to separate
the alternation:
[me@linuxbox ~]$ grep -Eh '^(bz|gz|zip)' dirlist*.txt

This expression will match the filenames in our lists that begin with either "bz", "gz", or
"zip". Had we left off the parentheses, the meaning of this regular expression:

[me@linuxbox ~]$ grep -Eh '^bz|gz|zip' dirlist*.txt

changes to match any filename that begins with "bz", or contains "gz", or contains "zip".
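The effect of the parentheses can be demonstrated directly with a pair of test strings (a minimal sketch; the sample string "foogz" is invented for illustration):

```shell
# With grouping, ^ applies to the whole alternation; without it, ^ binds
# only to the first branch, so a string merely containing "gz" matches.
echo "foogz" | grep -cE '^(bz|gz)'   # anchored group: no match
echo "foogz" | grep -cE '^bz|gz'     # unanchored "gz" branch: match
```

The -c option makes grep print a match count, which makes the difference between the two expressions easy to see at a glance.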


Quantifiers
Extended regular expressions support several ways to specify the number of times an element is matched.

? - Match An Element Zero Or One Time
This quantifier means, in effect, "make the preceding element optional." Let's say we
wanted to check a phone number for validity and we considered a phone number to be
valid if it matched either of these two forms:

(nnn) nnn-nnnn
nnn nnn-nnnn

where "n" is a numeral. We could construct a regular expression like this:

^\(?[0-9][0-9][0-9]\)? [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]$

In this expression, we follow the parenthesis characters with question marks to indicate
that they are to be matched zero or one time. Again, since the parentheses are normally
metacharacters (in ERE), we precede them with backslashes to cause them to be treated
as literals instead.


* - Match An Element Zero Or More Times
Like the ? metacharacter, the * is used to denote an optional item; however, unlike the ?,
the item may occur any number of times, not just once
Suppose we wanted to see if a string was a sentence; that is, it starts with an uppercase
letter, then contains any number of upper- and lowercase letters and spaces, and ends
with a period. To match this (very crude) definition of a sentence, we could use a regular
expression like this:

[[:upper:]][[:upper:][:lower:] ]*\.

The expression consists of three items: a bracket expression containing the [:upper:]
character class, a bracket expression containing both the [:upper:] and [:lower:]
character classes along with a space, and a period escaped with a backslash. The second
element is trailed with an * metacharacter, so that after the leading uppercase letter in our
sentence, any number of upper and lowercase letters and spaces may follow it and still
match:

[me@linuxbox ~]$ echo "This works." | grep -E '[[:upper:]][[:upper:][:lower:] ]*\.'
This works.
[me@linuxbox ~]$ echo "This Works." | grep -E '[[:upper:]][[:upper:][:lower:] ]*\.'
This Works.
[me@linuxbox ~]$ echo "this does not" | grep -E '[[:upper:]][[:upper:][:lower:] ]*\.'
[me@linuxbox ~]$

The expression matches the first two tests, but not the third, since it lacks the required
leading uppercase character and trailing period.


+ - Match An Element One Or More Times
The + metacharacter works much like the *, except it requires at least one instance of the
preceding element to cause a match
Here is a regular expression using + that will only match lines consisting of groups of
one or more alphabetic characters separated by single spaces:

^([[:alpha:]]+ ?)+$

[me@linuxbox ~]$ echo "This that" | grep -E '^([[:alpha:]]+ ?)+$'
This that
[me@linuxbox ~]$ echo "a b 9" | grep -E '^([[:alpha:]]+ ?)+$'
[me@linuxbox ~]$


{ } - Match An Element A Specific Number Of Times
The { and } metacharacters are used to express minimum and maximum numbers of required matches. They may be specified in four possible ways:

{n}	Match the preceding element if it occurs exactly n times.
{n,m}	Match the preceding element if it occurs at least n times, but no more than m times.
{n,}	Match the preceding element if it occurs n or more times.
{,m}	Match the preceding element if it occurs no more than m times.

Returning to our earlier example, we could use this method of specifying repetitions to
simplify our phone number expression from:

^\(?[0-9][0-9][0-9]\)? [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]$

to:

^\(?[0-9]{3}\)? [0-9]{3}-[0-9]{4}$
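The behavior of the four quantifier forms can be compared side by side with a quick shell sketch (the test strings here are invented for illustration):

```shell
# Compare ERE quantifiers ? * + {n} against the same inputs.
# grep -c prints 1 when the whole line matches, 0 when it does not.
for s in "" "a" "aa" "aaa"; do
  printf '%-5s ?:%s *:%s +:%s {2}:%s\n' "\"$s\"" \
    "$(echo "$s" | grep -cE '^a?$')" \
    "$(echo "$s" | grep -cE '^a*$')" \
    "$(echo "$s" | grep -cE '^a+$')" \
    "$(echo "$s" | grep -cE '^a{2}$')"
done
```

Running the loop shows, for example, that the empty string satisfies ? and * but not + or {2}, while "aa" satisfies everything except ?.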


Putting Regular Expressions To Work
Let’s look at some of the commands we already know and see how they can be used with
regular expressions
Validating A Phone List With grep
In our earlier example, we looked at single phone numbers and checked them for proper
formatting. A more realistic scenario would be checking a list of numbers instead, so let's
make a list. We will do this by invoking a magical incantation. It will be magic because
we have not covered most of the commands involved yet, but worry not: we will get to
them in future chapters.
Here is the incantation:
[me@linuxbox ~]$ for i in {1..10}; do echo "(${RANDOM:0:3}) ${RANDOM:0:3}-${RANDOM:0:4}" >> phonelist.txt; done

This command will produce a file named phonelist.txt containing ten phone numbers.
Each time the command is repeated, another ten numbers are added to the list. If we
examine the contents of the file, however, we see we have a problem:


19 – Regular Expressions
[me@linuxbox ~]$ cat phonelist.txt

Looking at the output, we see that some of the numbers are malformed, which is perfect
for our purposes, since we will use grep to validate them.
One useful method of validation would be to scan the file for invalid numbers and display
the resulting list on the display:
[me@linuxbox ~]$ grep -Ev '^\([0-9]{3}\) [0-9]{3}-[0-9]{4}$' phonelist.txt

Here we use the -v option to produce an inverse match, so that we only output the lines
in the list that do not match the specified expression. The expression itself includes
the anchor metacharacters at each end to ensure that the number has no extra characters at
either end.
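The validation idea can be extended into a small counting sketch. The phonelist.txt name follows the text, but the two sample numbers below are made up for illustration:

```shell
# Write one well-formed and one malformed number, then count lines
# that pass (-c) or fail (-vc) the phone-number pattern.
printf '(555) 123-4567\n(55) 123-4567\n' > phonelist.txt
pattern='^\([0-9]{3}\) [0-9]{3}-[0-9]{4}$'
valid=$(grep -Ec "$pattern" phonelist.txt)
invalid=$(grep -Evc "$pattern" phonelist.txt)
echo "valid: $valid  invalid: $invalid"
```

Combining -v with -c this way gives a quick summary of how dirty a list is before deciding whether it needs cleaning.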


Finding Ugly Filenames With find
The find command supports a test based on a regular expression
...

Whereas grep will print a line when the line contains a string that matches an expression, find requires that the pathname exactly match the regular expression. In the following example, we will use find with a regular expression to find every pathname that
contains any character that is not a member of the following set:

[-_./0-9a-zA-Z]

Such a scan would reveal pathnames that contain embedded spaces and other potentially
offensive characters:

[me@linuxbox ~]$ find . -regex '.*[^-_./0-9a-zA-Z].*'

Due to the requirement for an exact match of the entire pathname, we use .* at both
ends of the expression to match zero or more instances of any character. In the middle of the
expression, we use a negated bracket expression containing our set of acceptable pathname characters.
Searching For Files With locate
The locate program supports both basic (the --regexp option) and extended (the
--regex option) regular expressions. With it, we can perform many of the same operations that we performed earlier with our dirlist files:
[me@linuxbox ~]$ locate --regex 'bin/(bz|gz|zip)'
/bin/bzcat
/bin/bzcmp
/bin/bzdiff
/bin/bzegrep
/bin/bzexe
/bin/bzfgrep
/bin/bzgrep
/bin/bzip2
/bin/bzip2recover
/bin/bzless
/bin/bzmore
/bin/gzexe
/bin/gzip
/usr/bin/zip
/usr/bin/zipcloak
/usr/bin/zipgrep
/usr/bin/zipinfo
/usr/bin/zipnote
/usr/bin/zipsplit

Using alternation, we perform a search for pathnames that contain either bin/bz,
bin/gz, or bin/zip.

Searching For Text In less And vim
less and vim share the same method of searching for text: pressing the / key followed
by a regular expression will perform a search. If we use less to view our phonelist.txt
file:

[me@linuxbox ~]$ less phonelist.txt

and then search for our validation expression, less will highlight the strings that match,
leaving the invalid ones easy to spot. vim, on the other hand, supports basic regular expressions, so parentheses and braces in a search expression are only treated as metacharacters when escaped with a backslash. If your version of vim highlights search matches,
you will see the valid numbers highlighted; if not, try this command mode command:

:set hlsearch

to activate search highlighting.

Note: Depending on your distribution, vim may or may not support text search highlighting. Ubuntu, in particular, supplies a very stripped-down version of vim
by default. On such systems, you may want to use your package manager to install a more complete version of vim.


Summing Up
In this chapter, we've seen a few of the many uses of regular expressions. We can find
even more if we use regular expressions to search for additional applications that use
them. We can do that by searching the man pages:

[me@linuxbox ~]$ cd /usr/share/man/man1
[me@linuxbox man1]$ zgrep -El 'regex|regular expression' *

The zgrep program is a front end for grep that allows it to read compressed files.
In our example, we search the compressed section one man page files in their usual location. The result of this command is a list of files containing either the string "regex" or
the string "regular expression". As we can see, regular expressions show up in a lot of
programs.

There is one feature found in basic regular expressions that we did not cover. Called back
references, this feature will be discussed in the next chapter.

In addition, Wikipedia has good articles on the following background topics:

POSIX: http://en.wikipedia.org/wiki/Posix

ASCII: http://en.wikipedia.org/wiki/Ascii

20 – Text Processing

All Unix-like operating systems rely heavily on text files for several types of data storage, so it makes sense that there are many tools for manipulating text. In this chapter, we
will look at programs that are used to "slice and dice" text.

This chapter will revisit some old friends and introduce us to some new ones:


cat – Concatenate files and print on the standard output

sort – Sort lines of text files

uniq – Report or omit repeated lines

cut – Remove sections from each line of files

paste – Merge lines of files

join – Join lines of two files on a common field

comm – Compare two sorted files line by line

diff – Compare files line by line

patch – Apply a diff file to an original

tr – Translate or delete characters

sed – Stream editor for filtering and transforming text

aspell – Interactive spell checker

Applications Of Text
So far, we have learned a couple of text editors (nano and vim), looked at a bunch of
configuration files, and have witnessed the output of dozens of commands, all in text
But what else is text used for? For many things, it turns out.

Documents
Many people write documents using plain text formats
...
One popular approach is to write a large document in a
text format and then use a markup language to describe the formatting of the finished
document
...


Web Pages
The world’s most popular type of electronic document is probably the web page
...


Email
Email is an intrinsically text-based medium
...
We can see this for ourselves by downloading
an email message and then viewing it in less
...


Printer Output
On Unix-like systems, output destined for a printer is sent as plain text or, if the page
contains graphics, is converted into a text format page description language known as
PostScript, which is then sent to a program that generates the graphic dots to be printed
...
Many of them are designed to solve software development problems
...
Source code, the part of the program the programmer actually writes, is always in
text format
We only touched on them briefly then, but now we will take a closer look at how they
can be used to perform text processing.

cat
The cat program has a number of interesting options. Many of them are used to help
better visualize text content. One example is the -A option, which is used to display
non-printing characters in the text.
There are times when we want to know if control characters are embedded in our otherwise visible text. Common examples are tab characters and hidden carriage returns. Another common situation is a file containing lines of text with trailing spaces.

Let's create a test file using cat as a primitive word processor. To do this, we'll just enter the command cat (along with specifying a file for redirected output) and type our
text, followed by Enter to properly end the line, then Ctrl-d, to indicate to cat that
we have reached end-of-file. In this example, we enter a leading tab character and follow
the line with some trailing spaces:

[me@linuxbox ~]$ cat > foo.txt
	The quick brown fox jumped over the lazy dog.
[me@linuxbox ~]$ cat -A foo.txt
^IThe quick brown fox jumped over the lazy dog.   $
[me@linuxbox ~]$

In the output, we see that the tab character in our text is represented by ^I. This is a
common notation that means "Control-I" which, as it turns out, is the same as a tab character. We also see that a $ appears at the true end of the line, indicating trailing spaces.



Revisiting Some Old Friends

MS-DOS Text Vs. Unix Text
One of the reasons you may want to use cat to look for non-printing characters
in text is to spot hidden carriage returns. Where do hidden carriage returns come
from? DOS and Windows! Unix and DOS don't define the end of a line the same
way in text files. Unix ends a line with a linefeed character (ASCII 10), while
MS-DOS and its derivatives use the sequence carriage return (ASCII 13) and
linefeed to terminate each line of text.

There are several ways to convert files from DOS to Unix format. On many Linux
systems, there are programs called dos2unix and unix2dos, which can convert
text files to and from DOS format. However, if you don't have dos2unix on your
system, don't worry. The process of converting text from DOS to Unix format is
very simple: it simply involves the removal of the offending carriage returns.
That is easily accomplished by a couple of the programs discussed
later in this chapter.
cat also has options that are used to modify text. The two most prominent are -n, which
numbers lines, and -s, which suppresses the output of multiple blank lines. We can
demonstrate thusly:

[me@linuxbox ~]$ cat > foo.txt
The quick brown fox

jumped over the lazy dog.
[me@linuxbox ~]$ cat -ns foo.txt
     1	The quick brown fox
     2
     3	jumped over the lazy dog.

In this example, we create a new version of our foo.txt test file, which contains two
lines of text separated by two blank lines. After processing by cat with the -ns options,
the extra blank line is removed and the remaining lines are numbered.
While this is not much of a process to perform on text, it is a process.

sort
The sort program sorts the contents of standard input, or one or more files specified on
the command line, and sends the results to standard output. Using the same technique that
we used with cat, we can demonstrate processing of standard input directly from the
keyboard:

[me@linuxbox ~]$ sort > foo.txt
c
b
a
[me@linuxbox ~]$ cat foo.txt
a
b
c

After entering the command, we type the letters "c", "b", and "a", followed once again by
Ctrl-d to indicate end-of-file. We then view the resulting file and see that the lines now
appear in sorted order.

Since sort can accept multiple files on the command line as arguments, it is possible to
merge multiple files into a single sorted whole. For example, if we had three text files
and wanted to combine them into a single sorted file, we could do something like this:

sort file1.txt file2.txt file3.txt > final_sorted_list.txt
sort has several interesting options. Here is a partial list:

Table 20-1: Common sort Options
-b, --ignore-leading-blanks	By default, sorting is performed on the entire line, starting with the first character. This option causes sort to ignore leading spaces in lines and calculates sorting based on the first non-whitespace character on the line.
-f, --ignore-case	Make sorting case-insensitive.
-n, --numeric-sort	Perform sorting based on the numeric evaluation of a string. Using this option allows sorting to be performed on numeric values rather than alphabetic values.
-r, --reverse	Sort in reverse order. Results are in descending rather than ascending order.
-k, --key=field1[,field2]	Sort based on a key field located from field1 to field2 rather than the entire line. See discussion below.
-m, --merge	Treat each argument as the name of a presorted file. Merge multiple files into a single sorted result without performing any additional sorting.
-o, --output=file	Send sorted output to file rather than standard output.
-t, --field-separator=char	Define the field-separator character. By default fields are separated by spaces or tabs.


Although most of the options above are pretty self-explanatory, some are not. First, let's
look at the -n option, used for numeric sorting. With this option, it is possible to sort values based on numeric values. We can demonstrate this by sorting the results of the du
command to determine the largest users of disk space. Normally, the du command lists
the results of a summary in pathname order:

[me@linuxbox ~]$ du -s /usr/share/* | head
252
/usr/share/aclocal
96
/usr/share/acpi-support
8
/usr/share/adduser
196
/usr/share/alacarte
344
/usr/share/alsa
8
/usr/share/alsa-base
12488
/usr/share/anthy
8
/usr/share/apmd
21440
/usr/share/app-install
48
/usr/share/application-registry

In this example, we pipe the results into head to limit the results to the first ten lines. We
can produce a numerically sorted list to show the ten largest consumers of space this
way:

[me@linuxbox ~]$ du -s /usr/share/* | sort -nr | head

By using the -nr options, we produce a reverse numerical sort, with the largest values
appearing first in the results. This sort works because the numerical values occur at the
beginning of each line. But what if we want to sort a list based on some value found
within the line? For example, the results of ls -l:

[me@linuxbox ~]$ ls -l /usr/bin | sort -nr -k 5 | head
-rwxr-xr-x 1 root root 8234216 2008-04-07 17:42 inkscape
-rwxr-xr-x 1 root root 3654020 2008-08-26 16:16 quanta
-rwxr-xr-x 1 root root 2928760 2008-09-10 14:31 gdbtui
-rwxr-xr-x 1 root root 2928756 2008-09-10 14:31 gdb
-rwxr-xr-x 1 root root 2602236 2008-10-10 12:56 net
-rwxr-xr-x 1 root root 2304684 2008-10-10 12:56 rpcclient
-rwxr-xr-x 1 root root 2241832 2008-04-04 05:56 aptitude
-rwxr-xr-x 1 root root 2202476 2008-10-10 12:56 smbcacls

Many uses of sort involve the processing of tabular data, such as the results of the ls
command above. If we apply database terminology to the table above, we would say that
each row is a record and that each record consists of multiple fields, such as the file attributes, link count, file size, and filename. sort is able to process individual
fields. In database terms, we are able to specify one or more key fields to use as sort keys.

In the example above, we specify the n and r options to perform a reverse numerical sort
and specify -k 5 to make sort use the fifth field as the key for sorting.

The k option is very interesting and has many features, but first we need to talk about
how sort defines fields. Let's consider a very simple text file consisting of a single line
containing the author's name:

William Shotts

By default, sort sees this line as having two fields. The first field contains the characters "William" and the second field contains the characters "Shotts". This means that
whitespace characters (spaces and tabs) are used as field delimiters and that the delimiters are included in the field when sorting is performed.

Looking again at a line from our ls output, we can see that a line contains eight fields
and that the fifth field is the file size:

-rwxr-xr-x 1 root root 8234216 2008-04-07 17:42 inkscape

For our next series of experiments, let's consider the following file containing the history
of three popular Linux distributions released from 2006 to 2008. Each line in the file has
three fields: the distribution name, the version number, and the date of release in
MM/DD/YYYY format:

SUSE	10.2	12/07/2006
Fedora	10	11/25/2008
SUSE	11.0	06/19/2008
Ubuntu	8.04	04/24/2008
Fedora	8	11/08/2007
SUSE	10.3	10/04/2007
Ubuntu	6.10	10/26/2006
Fedora	7	05/31/2007
Ubuntu	7.10	10/18/2007
Ubuntu	7.04	04/19/2007
SUSE	10.1	05/11/2006
Fedora	6	10/24/2006
Fedora	9	05/13/2008
Ubuntu	6.06	06/01/2006
Ubuntu	8.10	10/30/2008
Fedora	5	03/20/2006

Using a text editor (perhaps vim), we'll enter this data and name the resulting file
distros.txt. Next, we'll try sorting the file and observe the results:

[me@linuxbox ~]$ sort distros.txt
Fedora	10	11/25/2008
Fedora	5	03/20/2006
Fedora	6	10/24/2006
Fedora	7	05/31/2007
Fedora	8	11/08/2007
Fedora	9	05/13/2008
SUSE	10.1	05/11/2006
SUSE	10.2	12/07/2006
SUSE	10.3	10/04/2007
SUSE	11.0	06/19/2008
Ubuntu	6.06	06/01/2006
Ubuntu	6.10	10/26/2006
Ubuntu	7.04	04/19/2007
Ubuntu	7.10	10/18/2007
Ubuntu	8.04	04/24/2008
Ubuntu	8.10	10/30/2008

Well, it mostly worked. The problem occurs in the sorting of the Fedora version numbers.
Since a "1" comes before a "5" in the character set, version "10" ends up at the top while
version "9" falls to the bottom.

To fix this problem, we are going to have to sort on multiple keys. We want to perform an
alphabetic sort on the first field and then a numeric sort on the second field. sort allows
multiple instances of the -k option so that multiple sort keys can be specified. In fact, a
key may include a range of fields. If no range is specified (as has been the case with our
previous examples), sort uses a key that begins with the specified field and extends to
the end of the line. Here is the syntax for our multi-key sort:

[me@linuxbox ~]$ sort --key=1,1 --key=2n distros.txt
Fedora	5	03/20/2006
Fedora	6	10/24/2006
Fedora	7	05/31/2007
Fedora	8	11/08/2007
Fedora	9	05/13/2008
Fedora	10	11/25/2008
SUSE	10.1	05/11/2006
SUSE	10.2	12/07/2006
SUSE	10.3	10/04/2007
SUSE	11.0	06/19/2008
Ubuntu	6.06	06/01/2006
Ubuntu	6.10	10/26/2006
Ubuntu	7.04	04/19/2007
Ubuntu	7.10	10/18/2007
Ubuntu	8.04	04/24/2008
Ubuntu	8.10	10/30/2008

Though we used the long form of the options for clarity, -k 1,1 -k 2n would be
exactly equivalent. In the first instance of the key option, we specified a range of fields to
include in the first key. Since we wanted to limit the sort to just the first field, we specified 1,1, which means "start at field one and end at field one." In the second instance, we
specified 2n, which means that field 2 is the sort key and that the sort should be numeric.
An option letter may be included at the end of a key specifier to indicate the type of sort
to be performed. These option letters are the same as the global options for the sort program: b (ignore leading blanks), n (numeric sort), r (reverse sort), and so on.
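The multiple-key behavior is easy to confirm on a tiny data set (a minimal sketch; the sample lines are made up):

```shell
# Sort by the first field alphabetically, then by the second field
# numerically, so that "10" sorts after "9" within each group.
printf 'Fedora 10\nFedora 9\nFedora 5\n' | sort -k 1,1 -k 2n
```

Without the n modifier on the second key, "10" would sort before "5" and "9", just as it did in the distros.txt example above.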
But wait, there's more! On computers, dates are usually formatted in YYYY-MM-DD order to make chronological sorting easy, but ours are in the American format of MM/DD/YYYY. How can we sort
this list in chronological order? Fortunately, sort provides a way. The key option allows
specification of offsets within fields, so we can define keys within fields:

[me@linuxbox ~]$ sort -k 3.7nbr -k 3.1nbr -k 3.4nbr distros.txt
Fedora	10	11/25/2008
Ubuntu	8.10	10/30/2008
SUSE	11.0	06/19/2008
Fedora	9	05/13/2008
Ubuntu	8.04	04/24/2008
Fedora	8	11/08/2007
Ubuntu	7.10	10/18/2007
SUSE	10.3	10/04/2007
Fedora	7	05/31/2007
Ubuntu	7.04	04/19/2007
SUSE	10.2	12/07/2006
Ubuntu	6.10	10/26/2006
Fedora	6	10/24/2006
Ubuntu	6.06	06/01/2006
SUSE	10.1	05/11/2006
Fedora	5	03/20/2006

By specifying -k 3.7, we instruct sort to use a sort key that begins at the seventh character within the third field, which corresponds to the start of the year. Likewise, we
specify -k 3.1 and -k 3.4 to isolate the month and day portions of the date. We also
add the n and r options to achieve a reverse numeric sort. The b option is included to
suppress the leading spaces (whose numbers vary from line to line, thereby affecting the
outcome of the sort) in the date field.
Some files don't use tabs and spaces as field delimiters, such as the /etc/passwd file,
where fields are delimited by colons. To sort such a file, we can use the -t option to define the field separator. To sort the passwd file on the seventh field (the account's default
shell), we could do this:
[me@linuxbox ~]$ sort -t ':' -k 7 /etc/passwd | head
me:x:1001:1001:Myself,,,:/home/me:/bin/bash
root:x:0:0:root:/root:/bin/bash
dhcp:x:101:102::/nonexistent:/bin/false
gdm:x:106:114:Gnome Display Manager:/var/lib/gdm:/bin/false
hplip:x:104:7:HPLIP system user,,,:/var/run/hplip:/bin/false
klog:x:103:104::/home/klog:/bin/false
messagebus:x:108:119::/var/run/dbus:/bin/false
polkituser:x:110:122:PolicyKit,,,:/var/run/PolicyKit:/bin/false

pulse:x:107:116:PulseAudio daemon,,,:/var/run/pulse:/bin/false

By specifying the colon character as the field separator, we can sort on the seventh field.

uniq
Compared to sort, the uniq program is a lightweight. uniq performs a seemingly
trivial task: when given a sorted file (including standard input), it removes any duplicate
lines and sends the results to standard output. It is often used in conjunction with sort to
clean the output of duplicates.

Let's make a text file to try this out:

[me@linuxbox ~]$ cat > foo.txt
a
b
c
a
b
c

Remember to type Ctrl-d to terminate standard input. Now, if we run uniq on our text
file:

[me@linuxbox ~]$ uniq foo.txt
a
b
c
a
b
c

the results are no different from our original file; the duplicates were not removed. For
uniq to actually do its job, the input must be sorted first:

[me@linuxbox ~]$ sort foo.txt | uniq
a
b
c

This is because uniq only removes duplicate lines that are adjacent to each other.

uniq has several options. Here are the common ones:

Table 20-2: Common uniq Options
-c	Output a list of duplicate lines preceded by the number of times the line occurs.
-d	Only output repeated lines, rather than unique lines.
-f n	Ignore n leading fields in each line. Fields are separated by whitespace as they are in sort; however, unlike sort, uniq has no option for setting an alternate field separator.
-i	Ignore case during the line comparisons.
-s n	Skip (ignore) the leading n characters of each line.
-u	Only output unique lines. This is the default.

Here we see uniq used to report the number of duplicates found in our text file, using
the -c option:

[me@linuxbox ~]$ sort foo.txt | uniq -c
      2 a
      2 b
      2 c
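A common idiom built from these two programs is a frequency count: sort the input so duplicates become adjacent, count them with uniq -c, then sort the counts numerically. A minimal sketch (the sample words are made up):

```shell
# Count word frequencies: sort groups duplicates, uniq -c counts them,
# and the final sort -nr puts the most frequent words first.
printf 'apple\nbanana\napple\ncherry\napple\nbanana\n' |
  sort | uniq -c | sort -nr
```

This three-stage pipeline is a classic way to answer "which lines occur most often?" for log files, word lists, and similar data.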

Slicing And Dicing
The next three programs we will discuss are used to peel columns of text out of files and
recombine them in useful ways
cut
The cut program is used to extract a section of text from a line and output the extracted
section to standard output. It can accept multiple file arguments or input from standard
input.

Specifying the section of the line to be extracted is somewhat awkward and is done using
the following options:

Table 20-3: cut Selection Options
-c char_list	Extract the portion of the line defined by char_list. The list may consist of one or more comma-separated numerical ranges.
-f field_list	Extract one or more fields from the line as defined by field_list. The list may contain one or more fields or field ranges separated by commas.
-d delim_char	When -f is specified, use delim_char as the field delimiter. By default, fields must be separated by a single tab character.
--complement	Extract the entire line of text, except for those portions specified by -c and/or -f.

As we can see, the way cut extracts text is rather inflexible. cut is best used to extract
text from files that are produced by other programs, rather than text directly typed by humans.
We'll take a look at our distros.txt file to see if it is "clean enough" for good use of
cut. If we use cat with the -A option, we can see if the file meets our requirements of
tab-separated fields:

[me@linuxbox ~]$ cat -A distros.txt
SUSE^I10.2^I12/07/2006$
Fedora^I10^I11/25/2008$
SUSE^I11.0^I06/19/2008$
Ubuntu^I8.04^I04/24/2008$
Fedora^I8^I11/08/2007$
SUSE^I10.3^I10/04/2007$
Ubuntu^I6.10^I10/26/2006$
Fedora^I7^I05/31/2007$
Ubuntu^I7.10^I10/18/2007$
Ubuntu^I7.04^I04/19/2007$
SUSE^I10.1^I05/11/2006$
Fedora^I6^I10/24/2006$
Fedora^I9^I05/13/2008$
Ubuntu^I6.06^I06/01/2006$
Ubuntu^I8.10^I10/30/2008$
Fedora^I5^I03/20/2006$

The file looks good. No embedded spaces, just single tab characters between the fields.
Since the file uses tabs rather than spaces, we'll use the -f option to extract the third
field:

[me@linuxbox ~]$ cut -f 3 distros.txt
12/07/2006
11/25/2008
06/19/2008
04/24/2008
11/08/2007
10/04/2007
10/26/2006
05/31/2007
10/18/2007
04/19/2007
05/11/2006
10/24/2006
05/13/2008
06/01/2006
10/30/2008
03/20/2006

Because our distros file is tab-delimited, it is best to use cut to extract fields rather
than characters. This is because when a file is tab-delimited, it is unlikely that the tabs
will align exactly with character positions on a line. In our example above, however, we
now have extracted a field that luckily contains data of identical length, so we can show
how character extraction works by extracting the year from each line:

[me@linuxbox ~]$ cut -f 3 distros.txt | cut -c 7-10

This extracts characters 7 through 10 from each date, which corresponds to the year. The
7-10 notation is an example of a range, in which the first number indicates the starting
position and the second number indicates the ending position.


Expanding Tabs
Our distros.txt file is ideally formatted for extracting fields using cut. But
what if we wanted a file that could be fully manipulated with cut by characters,
rather than fields? This would require us to replace the tab characters within the
file with the corresponding number of spaces. Fortunately, the GNU Coreutils
package includes a tool for that. Named expand, this program accepts either one
or more file arguments or standard input, and outputs the modified text to standard output.

If we process our distros.txt file with expand, we can use cut -c to
extract any range of characters from the file:

[me@linuxbox ~]$ expand distros.txt | cut -c 23-

Coreutils also provides the unexpand program to substitute tabs for spaces.

cut can also work with field delimiters other than the tab character, using the -d option.
Here we will extract the first field from the /etc/passwd file:
[me@linuxbox ~]$ cut -d ':' -f 1 /etc/passwd | head
root
daemon
bin
sys
sync
games
man
lp


mail
news

Using the -d option, we are able to specify the colon character as the field delimiter.
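The same field-extraction idea works with any single-character delimiter. A quick sketch (the sample line below is made up rather than taken from a real passwd file):

```shell
# Extract the first and seventh colon-separated fields, in the style of
# pulling login names and default shells out of /etc/passwd.
printf 'root:x:0:0:root:/root:/bin/bash\nme:x:1001:1001::/home/me:/bin/sh\n' |
  cut -d ':' -f 1,7
```

Note that when multiple fields are selected, cut joins them in the output with the same delimiter character.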
paste
The paste program does the opposite of cut. Rather than extracting a column of text
from a file, it adds one or more columns of text to a file. It does this by reading multiple
files and combining the fields found in each file into a single stream on standard output.
Like cut, paste accepts multiple file arguments and/or standard input. To demonstrate
how paste operates, we will perform some surgery on our distros.txt file to produce a chronological list of releases.

From our earlier work with sort, we will first produce a list of distros sorted by date
and store the result in a file called distros-by-date.txt:

[me@linuxbox ~]$ sort -k 3.7nbr -k 3.1nbr -k 3.4nbr distros.txt > distros-by-date.txt

Next, we will use cut to extract the first two fields from the file (the distro name and
version), and store that result in a file named distros-versions.txt:

[me@linuxbox ~]$ cut -f 1,2 distros-by-date.txt > distros-versions.txt
[me@linuxbox ~]$ head distros-versions.txt
Fedora	10
Ubuntu	8.10
SUSE	11.0
Fedora	9
Ubuntu	8.04
Fedora	8
Ubuntu	7.10
SUSE	10.3
Fedora	7
Ubuntu	7.04

The next step is to extract the release dates and store them in a file named
distros-dates.txt:

[me@linuxbox ~]$ cut -f 3 distros-by-date.txt > distros-dates.txt
[me@linuxbox ~]$ head distros-dates.txt
11/25/2008
10/30/2008
06/19/2008
05/13/2008
04/24/2008
11/08/2007
10/18/2007
10/04/2007
05/31/2007
04/19/2007

We now have the pieces we need. To complete the process, we use paste to put the column
of dates ahead of the distro names and versions, thus creating a chronological list. This is
done simply by using paste and ordering its arguments in the desired arrangement:

[me@linuxbox ~]$ paste distros-dates.txt distros-versions.txt
11/25/2008	Fedora	10
10/30/2008	Ubuntu	8.10
06/19/2008	SUSE	11.0
05/13/2008	Fedora	9
04/24/2008	Ubuntu	8.04
11/08/2007	Fedora	8
10/18/2007	Ubuntu	7.10
10/04/2007	SUSE	10.3
05/31/2007	Fedora	7
04/19/2007	Ubuntu	7.04
12/07/2006	SUSE	10.2
10/26/2006	Ubuntu	6.10
10/24/2006	Fedora	6
06/01/2006	Ubuntu	6.06
05/11/2006	SUSE	10.1
03/20/2006	Fedora	5
join
In some ways, join is like paste in that it adds columns to a file, but it uses a unique
way to do this. A join is an operation usually associated with relational databases where
data from multiple tables with a shared key field is combined to form a desired result.
The join program performs the same operation: it joins data from multiple files based
on a shared key field.

To see how a join operation is used in a relational database, let's imagine a very small
database consisting of two tables, each containing a single record. The first table, called
CUSTOMERS, has three fields: a customer number (CUSTNUM), the customer's first
name (FNAME), and the customer's last name (LNAME):

CUSTNUM		FNAME	LNAME
========	======	======
4681934		John	Smith

The second table is called ORDERS and contains four fields: an order number (ORDERNUM), the customer number (CUSTNUM), the quantity (QUAN), and the item ordered
(ITEM). Note that both tables share the field CUSTNUM. This is important, as it allows
a relationship between the tables. Performing a join operation would allow us to combine
the fields in the two tables to achieve a useful result, such as preparing an invoice. Using
the matching values in the CUSTNUM fields of both tables, a join operation could
produce the following:

FNAME	LNAME	QUAN	ITEM
=====	=====	====	====
John	Smith	1	Blue Widget

To demonstrate the join program, we'll need to make a couple of files with a shared
key. To do this, we will use our distros-by-date.txt file. From it, we will construct two additional files. The first contains the release dates (which will be our shared
key for this demonstration) and the release names:

[me@linuxbox ~]$ cut -f 1,1 distros-by-date.txt > distros-names.txt
[me@linuxbox ~]$ paste distros-dates.txt distros-names.txt > distros-key-names.txt
[me@linuxbox ~]$ head distros-key-names.txt
11/25/2008	Fedora
10/30/2008	Ubuntu
06/19/2008	SUSE
05/13/2008	Fedora
04/24/2008	Ubuntu
11/08/2007	Fedora
10/18/2007	Ubuntu
10/04/2007	SUSE
05/31/2007	Fedora
04/19/2007	Ubuntu

The second file contains the release dates and the version numbers:

[me@linuxbox ~]$ cut -f 2,2 distros-by-date.txt > distros-vernums.txt
[me@linuxbox ~]$ paste distros-dates.txt distros-vernums.txt > distros-key-vernums.txt
[me@linuxbox ~]$ head distros-key-vernums.txt
11/25/2008	10
10/30/2008	8.10
06/19/2008	11.0
05/13/2008	9
04/24/2008	8.04
11/08/2007	8
10/18/2007	7.10
10/04/2007	10.3
05/31/2007	7
04/19/2007	7.04

We now have two files with a shared key (the "release date" field). It is important to point
out that the files must be sorted on the key field for join to work properly.

[me@linuxbox ~]$ join distros-key-names.txt distros-key-vernums.txt | head
11/25/2008 Fedora 10
10/30/2008 Ubuntu 8.10
06/19/2008 SUSE 11.0
05/13/2008 Fedora 9
04/24/2008 Ubuntu 8.04
11/08/2007 Fedora 8
10/18/2007 Ubuntu 7.10
10/04/2007 SUSE 10.3
05/31/2007 Fedora 7
04/19/2007 Ubuntu 7.04

Note also that, by default, join uses whitespace as the input field delimiter and a single
space as the output field delimiter. This behavior can be modified by specifying options.
See the join man page for details.


Comparing Text
It is often useful to compare versions of text files. For system administrators and software developers, this is particularly important. A system administrator may, for example, need
to compare an existing configuration file to a previous version to diagnose a system problem. Likewise, a programmer frequently needs to see what changes have been made to
programs over time.

comm
The comm program compares two text files and displays the lines that are unique to each
one and the lines they have in common. To demonstrate, we will create two nearly identical text files using cat:

[me@linuxbox ~]$ cat > file1.txt
a
b
c
d
[me@linuxbox ~]$ cat > file2.txt
b
c
d
e

Next, we will compare the two files using comm:

[me@linuxbox ~]$ comm file1.txt file2.txt
a
		b
		c
		d
	e

As we can see, comm produces three columns of output. The first column contains lines
unique to the first file argument; the second column, the lines unique to the second file argument; the third column contains the lines shared by both files. comm supports options
in the form -n, where n is either 1, 2 or 3. When used, these options specify which column(s) to suppress. For example, if we only wanted to output the lines shared by both
files, we would suppress the output of columns 1 and 2:

[me@linuxbox ~]$ comm -12 file1.txt file2.txt
b
c
d
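Because comm expects sorted input, it combines naturally with sort when comparing unsorted lists. A small sketch (the list contents and file names are made up):

```shell
# Compare two unsorted lists with comm: sort each into a temporary
# file first, then use -12 to keep only lines common to both.
printf 'c\na\nb\n' | sort > list1.txt
printf 'b\nd\nc\n' | sort > list2.txt
comm -12 list1.txt list2.txt
```

The same pattern with -23 or -13 instead would show the lines found only in the first or only in the second list, respectively.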
diff
Like the comm program, diff is used to detect the differences between files. However,
diff is a much more complex tool, supporting many output formats and the ability to
process large collections of text files at once. diff is often used by software developers
to examine changes between different versions of program source code, and thus has the
ability to recursively examine directories of source code, often referred to as source trees.

One common use for diff is the creation of diff files or patches that are used by programs such as patch (which we'll discuss shortly) to convert one version of a file (or
files) to another version. If we use diff to look at our previous example files:

[me@linuxbox ~]$ diff file1.txt file2.txt
1d0
< a
4a4
> e

we see its default style of output: a terse description of the differences between the two
files. In the default format, each group of changes is preceded by a change command in
the form of range operation range to describe the positions and types of changes required
to convert the first file to the second file:

Table 20-4: diff Change Commands
r1ar2	Append the lines at the position r2 in the second file to the position r1 in the first file.
r1cr2	Change (replace) the lines at position r1 with the lines at the position r2 in the second file.
r1dr2	Delete the lines in the first file at position r1, which would have appeared at range r2 in the second file.

In this format, a range is a comma-separated list of the starting line and the ending line.
Two of the more popular formats are the context format and the unified format. When
viewed using the context format (the -c option), we will see this:

[me@linuxbox ~]$ diff -c file1.txt file2.txt
*** file1.txt	2008-12-23 06:40:13.000000000 -0500
--- file2.txt	2008-12-23 06:40:34.000000000 -0500
***************
*** 1,4 ****
- a
  b
  c
  d
--- 1,4 ----
  b
  c
  d
+ e

The output begins with the names of the two files and their timestamps. The first file is
marked with asterisks and the second file is marked with dashes. Throughout the remainder of the listing, these markers will signify their respective files. Next, we see groups of
changes, including the default number of surrounding context lines. In the first group, we see:

*** 1,4 ****

which indicates lines 1 through 4 in the first file. Later we see:

--- 1,4 ----

which indicates lines 1 through 4 in the second file. Within a change group, lines begin
with one of four indicators:

Table 20-5: diff Context Format Change Indicators
blank	A context line. It does not indicate a difference between the two files.
-	A line deleted. This line will appear in the first file but not in the second file.
+	A line added. This line will appear in the second file but not in the first file.
!	A line changed. The two versions of the line will be displayed, each in its respective section of the change group.
The unified format is similar to the context format but is more concise. It is specified
with the -u option:

[me@linuxbox ~]$ diff -u file1.txt file2.txt
--- file1.txt	2008-12-23 06:40:13.000000000 -0500
+++ file2.txt	2008-12-23 06:40:34.000000000 -0500
@@ -1,4 +1,4 @@
-a
 b
 c
 d
+e

The most notable difference between the context and unified formats is the elimination of
the duplicated lines of context, making the results of the unified format shorter than those
of the context format. In our example, we see file timestamps like those of the context
format, followed by the string @@ -1,4 +1,4 @@. This indicates the lines in the
first file and the lines in the second file described in the change group. Following this are
the lines themselves, with the default three lines of context. Each line starts with one of
three possible characters:

Table 20-6: diff Unified Format Change Indicators
blank	This line is shared by both files.
-	This line was removed from the first file.
+	This line was added to the first file.
patch
The patch program is used to apply changes to text files. It accepts output from diff
and is generally used to convert older versions of files into newer versions. Let's consider
a famous example. The Linux kernel is developed by a large, loosely organized team of
contributors who submit a constant stream of small changes to the source code. The
Linux kernel consists of several million lines of code, while the changes made by one
contributor at one time are quite small. It makes no sense for a contributor to send
each developer an entire kernel source tree each time a small change is made. Instead, a
diff file is submitted. The diff file contains the change from the previous version of the
kernel to the new version with the contributor's changes. The receiver then uses the
patch program to apply the change to their own source tree. Using diff/patch offers
two significant advantages:

1. The diff file is very small, compared to the full size of the source tree.

2. The diff file concisely shows the change being made, allowing reviewers of the patch to quickly evaluate it.

Of course, diff/patch will work on any text file, not just source code. It would be
equally applicable to configuration files or any other text.

To prepare a diff file for use with patch, the GNU documentation (see Further Reading
below) suggests using diff as follows:

diff -Naur old_file new_file > diff_file

Where old_file and new_file are either single files or directories containing files.

Once the diff file has been created, we can apply it to patch the old file into the new file:

patch < diff_file

We'll demonstrate with our test file:

[me@linuxbox ~]$ diff -Naur file1.txt file2.txt > patchfile.txt
[me@linuxbox ~]$ patch < patchfile.txt
patching file file1.txt
[me@linuxbox ~]$ cat file1.txt
b
c
d
e

In this example, we created a diff file named patchfile.txt and then used the
patch program to apply the patch. Note that we did not have to specify a target file to
patch, as the diff file (in unified format) already contains the filenames in the header.
Once the patch is applied, we can see that file1.txt now matches file2.txt.

patch has a large number of options, and there are additional utility programs that can
be used to analyze and edit patches.
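The whole diff/patch round trip can be rehearsed on throwaway files (a sketch; all file names here are made up):

```shell
# Create two versions of a file, capture the difference in unified
# format, then apply it with patch to bring the old copy up to date.
printf 'a\nb\nc\nd\n' > old.txt
printf 'b\nc\nd\ne\n' > new.txt
diff -Naur old.txt new.txt > change.diff
patch < change.diff     # patches old.txt in place
cat old.txt             # now identical to new.txt
```

Because the unified diff header names both files, patch knows which file to modify without being told.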
Editing On The Fly
Our experience with text editors has been largely interactive, meaning that we manually
move a cursor around and then type our changes. However, there are non-interactive ways to
edit text as well. It's possible, for example, to apply a set of changes to multiple files with
a single command.

tr
The tr program is used to transliterate characters. We can think of this as a sort of character-based search-and-replace operation. Transliteration is the process of changing characters from one alphabet to another. For example, converting characters from lowercase to
uppercase is transliteration. We can perform such a conversion with tr as follows:
[me@linuxbox ~]$ echo "lowercase letters" | tr a-z A-Z
LOWERCASE LETTERS

As we can see, tr operates on standard input and outputs its results on standard output.
tr accepts two arguments: a set of characters to convert from and a corresponding set of
characters to convert to. Character sets may be expressed in one of three ways:

1. An enumerated list. For example, ABCDEFGHIJKLMNOPQRSTUVWXYZ

2. A character range. For example, A-Z. Note that this method is sometimes subject to the locale collation-order issues discussed earlier, and thus should be used with caution.

3. POSIX character classes. For example, [:upper:]

In addition to transliteration, tr allows characters to simply be deleted from the input
stream. Earlier in this chapter, we discussed the problem of converting MS-DOS text files
to Unix-style text. To perform this conversion, carriage return characters need to be removed from the end of each line. This can be performed with tr as follows:

tr -d '\r' < dos_file > unix_file

where dos_file is the file to be converted and unix_file is the result. This form of
the command uses the escape sequence \r to represent the carriage return character. To see a
complete list of the sequences and character classes tr supports, try:

[me@linuxbox ~]$ tr --help
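The DOS-to-Unix conversion described above can be verified end to end with a short sketch (the file names here are made up for illustration):

```shell
# Create a two-line file with DOS (CRLF) line endings, strip the
# carriage returns with tr -d, and inspect the result.
printf 'line one\r\nline two\r\n' > dos_file.txt
tr -d '\r' < dos_file.txt > unix_file.txt
# cat -A would now show lines ending in plain $ rather than ^M$.
cat -A unix_file.txt
```

Viewing the converted file with cat -A, as we did earlier in the chapter, confirms that the ^M carriage-return markers are gone.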


ROT13: The Not-So-Secret Decoder Ring
One amusing use of tr is to perform ROT13 encoding of text. ROT13 is a trivial
type of encryption based on a simple substitution cipher. Calling ROT13 "encryption" is being generous; "text obfuscation" is more accurate. It is used sometimes
on text to obscure potentially offensive content. The method simply moves each
character 13 places up the alphabet. Since this is half way up the possible 26 characters, performing the algorithm a second time on the text restores it to its original
form. To perform this encoding with tr:

echo "secret text" | tr a-zA-Z n-za-mN-ZA-M
frperg grkg

Performing the same procedure a second time results in the translation back:

echo "frperg grkg" | tr a-zA-Z n-za-mN-ZA-M
secret text

A number of email programs and Usenet news readers support ROT13 encoding.
Wikipedia contains a good article on the subject:
http://en.wikipedia.org/wiki/ROT13
Using the -s option, tr can “squeeze” (delete) repeated instances of a character:
[me@linuxbox ~]$ echo "aaabbbccc" | tr -s ab
abccc

Here we have a string containing repeated characters. By specifying the set "ab" to tr,
we eliminate the repeated instances of the letters in the set, while leaving the character
that is missing from the set ("c") unchanged. Note that the repeating characters must be
adjoining. If they are not, the squeezing will have no effect.


sed
The name sed is short for stream editor. It performs text editing on a stream of text, either a set of specified files or standard input. sed is a powerful and somewhat complex
program (there are entire books about it), so we will not cover it completely here.

In general, the way sed works is that it is given either a single editing command (on the
command line) or the name of a script file containing multiple commands, and it then
performs these commands upon each line in the stream of text. Here is a very simple example of sed in action:

[me@linuxbox ~]$ echo "front" | sed 's/front/back/'
back

In this example, we produce a one-word stream of text using echo and pipe it into sed.
sed, in turn, carries out the instruction s/front/back/ upon the text in the stream
and produces the output "back" as a result. We can also recognize this command as resembling the "substitution" (search-and-replace) command in vi.

Commands in sed begin with a single letter. In the example above, the substitution command is represented by the letter s and is followed by the search-and-replace strings, separated by the slash character as a delimiter. The choice of the delimiter character is arbitrary. By convention, the slash character is often used, but sed will accept any character
that immediately follows the command as the delimiter. For example:

[me@linuxbox ~]$ echo "front" | sed 's_front_back_'
back

By using the underscore character immediately after the command, it becomes the delimiter. The ability to set the delimiter can be used to make commands more readable, as we
shall see.
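Alternate delimiters pay off when the strings themselves contain slashes, such as pathnames. A small sketch (the paths are made up for illustration):

```shell
# Replace one directory prefix with another; using '_' as the s command
# delimiter avoids having to escape every slash in the paths.
echo "/usr/local/bin/foo" | sed 's_/usr/local_/opt_'
```

The equivalent slash-delimited command would require three escaped slashes and be considerably harder to read.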
Most commands in sed may be preceded by an address, which specifies which line(s) of
the input stream will be edited. If the address is omitted, then the editing command is carried out on every line in the input stream. The simplest form of address is a line number.
We can add one to our example:

[me@linuxbox ~]$ echo "front" | sed '1s/front/back/'
back

Adding the address 1 to our command causes our substitution to be performed on the first
line of our one-line input stream. If we specify another number:

[me@linuxbox ~]$ echo "front" | sed '2s/front/back/'
front

we see that the editing is not carried out, since our input stream does not have a line 2.

Addresses may be expressed in many ways. Here are the most common:

Table 20-7: sed Address Notation
n	A line number, where n is a positive integer.
$	The last line.
/regexp/	Lines matching a POSIX basic regular expression. Note that the regular expression is delimited by slash characters.
addr1,addr2	A range of lines from addr1 to addr2, inclusive.
first~step	Match the line represented by the number first, then each subsequent line at step intervals.
addr1,+n	Match addr1 and the following n lines.
addr!	Match all lines except addr.

We'll demonstrate different kinds of addresses using the distros.txt file from earlier
in this chapter. First, a range of line numbers:

[me@linuxbox ~]$ sed -n '1,5p' distros.txt
SUSE	10.2	12/07/2006
Fedora	10	11/25/2008
SUSE	11.0	06/19/2008
Ubuntu	8.04	04/24/2008
Fedora	8	11/08/2007

In this example, we print a range of lines, starting with line 1 and continuing to line 5. To
do this, we use the p command, which simply causes a matched line to be printed. For
this to be effective however, we must include the option -n (the no auto-print option) to
cause sed not to print every line by default.
Next, let's try a regular expression:

[me@linuxbox ~]$ sed -n '/SUSE/p' distros.txt
SUSE	10.2	12/07/2006
SUSE	11.0	06/19/2008
SUSE	10.3	10/04/2007
SUSE	10.1	05/11/2006

By including the slash-delimited regular expression /SUSE/, we are able to isolate the
lines containing it in much the same manner as grep. Finally, let's try negation by
adding an exclamation point (!) to the address:

[me@linuxbox ~]$ sed -n '/SUSE/!p' distros.txt
Fedora	10	11/25/2008
Ubuntu	8.04	04/24/2008
Fedora	8	11/08/2007
Ubuntu	6.10	10/26/2006
Fedora	7	05/31/2007
Ubuntu	7.10	10/18/2007
Ubuntu	7.04	04/19/2007
Fedora	6	10/24/2006
Fedora	9	05/13/2008
Ubuntu	6.06	06/01/2006
Ubuntu	8.10	10/30/2008
Fedora	5	03/20/2006

Here we see the expected result: all of the lines in the file except the ones matched by the
regular expression.
Here is a more complete list of the basic editing commands:

Table 20-8: sed Basic Editing Commands
=	Output current line number.
a	Append text after the current line.
d	Delete the current line.
i	Insert text in front of the current line.
p	Print the current line. By default, sed prints every line and only edits lines that match a specified address within the file. The default behavior can be overridden by specifying the -n option.
q	Exit sed without processing any more lines. If the -n option is not specified, output the current line.
Q	Exit sed without processing any more lines.
s/regexp/replacement/	Substitute the contents of replacement wherever regexp is found. replacement may include the special character &, which is equivalent to the text matched by regexp. In addition, replacement may include the sequences \1 through \9, which are the contents of the corresponding subexpressions in regexp. After the trailing slash following replacement, an optional flag may be specified to modify the s command's behavior.
y/set1/set2	Perform transliteration by converting characters from set1 to the corresponding characters in set2. Note that unlike tr, sed requires that both sets be of the same length.
We will demonstrate just some of its power by performing an edit on our distros.txt file. We discussed before how the date field in distros.txt is not in a computer-friendly format. While the date is formatted MM/DD/YYYY, it would be better (for ease of sorting) if the format were YYYY-MM-DD. To perform this change on the file by hand would be both time-consuming and error prone, but with sed, this change can be performed in one step:

[me@linuxbox ~]$ sed 's/\([0-9]\{2\}\)\/\([0-9]\{2\}\)\/\([0-9]\{4\}\)$/\3-\1-\2/' distros.txt
SUSE         10.2   2006-12-07
Fedora       10     2008-11-25
SUSE         11.0   2008-06-19
Ubuntu       8.04   2008-04-24
Fedora       8      2007-11-08
SUSE         10.3   2007-10-04
Ubuntu       6.10   2006-10-26
Fedora       7      2007-05-31
Ubuntu       7.10   2007-10-18
Ubuntu       7.04   2007-04-19
SUSE         10.1   2006-05-11
Fedora       6      2006-10-24
Fedora       9      2008-05-13
Ubuntu       6.06   2006-06-01
Ubuntu       8.10   2008-10-30
Fedora       5      2006-03-20
Wow! Now that is an ugly looking command. In just one step, we have changed the date format in our file. It is also a perfect example of why regular expressions are sometimes jokingly referred to as a "write-only" medium. We can write them, but we sometimes cannot read them. Before we are tempted to run away in terror from this command, let's look at how it was constructed. First, we know that the command will have this basic structure:

sed 's/regexp/replacement/' distros.txt
Our next step is to figure out a regular expression that will isolate the date. Since it is in MM/DD/YYYY format and appears at the end of the line, we can use an expression like this:

[0-9]{2}/[0-9]{2}/[0-9]{4}$

which matches two digits, a slash, two digits, a slash, four digits, and the end of line. To construct the replacement, we need a new regular expression feature. This feature is called back references and works like this: if the sequence \n appears in replacement where n is a number from 1 to 9, the sequence will refer to the corresponding subexpression in the preceding regular expression. To create the subexpressions, we simply enclose them in parentheses, like so:

([0-9]{2})/([0-9]{2})/([0-9]{4})$

We now have three subexpressions. The first contains the month, the second contains the day of the month, and the third contains the year, so now we can construct the replacement as follows:

\3-\1-\2

which gives us the year, a dash, the month, a dash, and the day.

Now, our command looks like this:

sed 's/([0-9]{2})/([0-9]{2})/([0-9]{4})$/\3-\1-\2/' distros.txt

We have two remaining problems. The first is that the extra slashes in our regular expression will confuse sed when it tries to interpret the s command. The second is that, since sed by default accepts only basic regular expressions, several characters in our expression (the parentheses and braces) would be taken as literals. We can solve both these problems with a liberal application of backslashes to escape the offending characters:
sed 's/\([0-9]\{2\}\)\/\([0-9]\{2\}\)\/\([0-9]\{4\}\)$/\3-\1-\2/' distros.txt

And so there you have it! Another feature of the s command is the use of optional flags that may follow the replacement string.
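As an aside, GNU sed (and any POSIX-2008 sed) also accepts the -E option, which enables extended regular expressions; combined with an alternate delimiter such as |, it avoids most of the backslashes. This is a sketch of an equivalent command, not the form used above:

```shell
# Extended regexps need no backslashes before ( ) { }, and using | as the
# delimiter means the literal slashes in the date need no escaping either
echo "SUSE 10.2 12/07/2006" | sed -E 's|([0-9]{2})/([0-9]{2})/([0-9]{4})$|\3-\1-\2|'
# SUSE 10.2 2006-12-07
```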
The most important of these is the g flag, which instructs sed to apply the search-and-replace globally to a line, not just to the first instance, which is the default. Here is an example:

[me@linuxbox ~]$ echo "aaabbbccc" | sed 's/b/B/'
aaaBbbccc

We see that the replacement was performed, but only on the first instance of the letter "b". By adding the g flag, we are able to change all the instances:
[me@linuxbox ~]$ echo "aaabbbccc" | sed 's/b/B/g'
aaaBBBccc
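A number may also be used as a flag, limiting the substitution to a particular instance; a small sketch:

```shell
# Replace only the second occurrence of "b" on the line
echo "aaabbbccc" | sed 's/b/B/2'
# aaabBbccc
```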

So far, we have only given sed single commands via the command line. It is also possible to construct more complex commands in a script file using the -f option. To demonstrate, we will use sed with our distros.txt file to build a report. Our report will feature a title at the top, our modified dates, and all the distribution names converted to uppercase. To do this, we will need to write a script, so we'll fire up our text editor and enter the following:

# sed script to produce Linux distributions report

1 i\
\
Linux Distributions Report\

s/\([0-9]\{2\}\)\/\([0-9]\{2\}\)\/\([0-9]\{4\}\)$/\3-\1-\2/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/

We'll save our sed script as distros.sed and run it like this:

[me@linuxbox ~]$ sed -f distros.sed distros.txt
Linux Distributions Report

SUSE         10.2   2006-12-07
FEDORA       10     2008-11-25
SUSE         11.0   2008-06-19
UBUNTU       8.04   2008-04-24
FEDORA       8      2007-11-08
SUSE         10.3   2007-10-04
UBUNTU       6.10   2006-10-26
FEDORA       7      2007-05-31
UBUNTU       7.10   2007-10-18
UBUNTU       7.04   2007-04-19
SUSE         10.1   2006-05-11
FEDORA       6      2006-10-24
FEDORA       9      2008-05-13
UBUNTU       6.06   2006-06-01
UBUNTU       8.10   2008-10-30
FEDORA       5      2006-03-20

As we can see, our script produces the desired results, but how does it do it? Let's take another look at our script (distros.sed), with line numbers added for reference:

1  # sed script to produce Linux distributions report
2
3  1 i\
4  \
5  Linux Distributions Report\
6
7  s/\([0-9]\{2\}\)\/\([0-9]\{2\}\)\/\([0-9]\{4\}\)$/\3-\1-\2/
8  y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/

Line 1 of our script is a comment. Like many configuration files and programming languages on Linux systems, comments begin with the # character and are followed by human-readable text. Comments can be placed anywhere in the script (though not within commands themselves) and are helpful to any humans who might need to identify and/or maintain the script.

Line 2 is a blank line. Like comments, blank lines may be added to improve readability.

Lines 3 through 6 contain text to be inserted at the address 1, the first line of the input. The i command is followed by the sequence backslash-carriage-return, a line-continuation character. This sequence, which can be used in many circumstances including shell scripts, allows a carriage return to be embedded in a stream of text without signaling the interpreter (in this case sed) that the end of the line has been reached. The sixth line of our script is actually the end of our inserted text and ends with a plain carriage return rather than a line-continuation character, signaling the end of the i command. Note that a carriage return must immediately follow the backslash. No intermediary spaces are permitted.

Line 7 is our search-and-replace command. Since it is not preceded by an address, each line in the input stream is subject to its action.

Line 8 performs transliteration of the lowercase letters into uppercase letters. Note that unlike tr, the y command in sed does not support character ranges (for example, [a-z]), nor does it support POSIX character classes.
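The difference is easy to see side by side; a sketch comparing the two (the y command needs every character spelled out, while tr accepts the range form):

```shell
# sed's y command: each character in set1 maps to the character at the
# same position in set2, so both sets must be listed in full
echo "linux" | sed 'y/ln/LN/'
# LiNux

# tr performs the same kind of transliteration but accepts ranges
echo "linux" | tr 'a-z' 'A-Z'
# LINUX
```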


People Who Like sed Also Like...

sed is a very capable program, but it is most often used for simple, one-line tasks rather than long scripts. Many users prefer other tools for larger jobs. The most popular of these are awk and perl. These go beyond mere editing tools and extend into the realm of complete programming languages. perl, in particular, is often used in place of shell scripts for many system-management and administration tasks, as well as being a very popular medium for web development. awk is a little more specialized. Its specific strength is its ability to manipulate tabular data. It resembles sed in that awk programs normally process text files line by line, using a scheme similar to the sed concept of an address followed by an action. While both awk and perl are outside the scope of this book, they are very good skills for the Linux command line user.
aspell – An Interactive Spell Checker

The aspell program is the successor to an earlier program named ispell, and can be used, for the most part, as a drop-in replacement. While aspell is mostly used by other programs that require spell-checking capability, it can also be used very effectively as a stand-alone tool from the command line. It has the ability to intelligently check various types of text files, including HTML documents, C/C++ programs, email messages, and other kinds of specialized texts.

As a practical example, let's create a simple text file named foo.txt containing some deliberate spelling errors:

[me@linuxbox ~]$ cat > foo.txt
The quick brown fox jimped over the laxy dog.

Next, we'll check the file using aspell:

[me@linuxbox ~]$ aspell check foo.txt
As aspell is interactive in the check mode, we will see a screen like this:

The quick brown fox jimped over the laxy dog.

(followed by a list of ten numbered spelling suggestions and a menu of possible actions)

At the top of the display, we see our text with a suspiciously spelled word highlighted. In the middle, we see ten spelling suggestions numbered zero through nine, followed by a list of other possible actions. Finally, at the very bottom, we see a prompt ready to accept our choice.

If we press the 1 key, aspell replaces the offending word with the word "jumped" and moves on to the next misspelled word, which is "laxy." Once aspell has finished, we can examine our file and see that the misspellings have been corrected:

[me@linuxbox ~]$ cat foo.txt
The quick brown fox jumped over the lazy dog.

Unless told otherwise via the command line option --dont-backup, aspell creates a backup file containing the original text by appending the extension .bak to the filename.

Showing off our sed editing prowess, we'll put our spelling mistakes back in so we can reuse our file:

[me@linuxbox ~]$ sed -i 's/lazy/laxy/; s/jumped/jimped/' foo.txt

The sed option -i tells sed to edit the file "in place," meaning that rather than sending the edited output to standard output, it rewrites the file with the changes applied. We also see the ability to place more than one editing command on the line by separating them with a semicolon.
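With GNU sed, -i also accepts an optional suffix; when one is given, the original file is preserved as a backup before the in-place edit. A sketch using a throwaway file:

```shell
# Work in a scratch directory so nothing real is touched
cd "$(mktemp -d)"
echo "the laxy dog" > file

# Edit in place, keeping the original as file.bak
sed -i.bak 's/laxy/lazy/' file

cat file        # the lazy dog
cat file.bak    # the laxy dog
```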
Next, let's see how aspell can deal with a different kind of text file. Using a text editor such as vim (the adventurous may want to try sed), we will add some HTML markup to our file:

<html>
    <head>
        <title>Mispelled HTML file</title>
    </head>
    <body>
        <p>The quick brown fox jimped over the laxy dog.</p>
    </body>
</html>

Now, if we try to spell check our modified file, we run into a problem. If we do it this way:
[me@linuxbox ~]$ aspell check foo.txt

we'll get this:

<html>
    <head>
        <title>Mispelled HTML file</title>
    </head>
    <body>
        <p>The quick brown fox jimped over the laxy dog.</p>
    </body>
</html>

1) HTML                     4) Hamel
2) ht ml                    5) Hamil
3) ht-ml                    6) hotel

i) Ignore                   I) Ignore all
r) Replace                  R) Replace all
a) Add                      l) Add Lower
b) Abort                    x) Exit

?

aspell will see the contents of the HTML tags as misspelled. This problem can be overcome by including the -H (HTML) checking-mode option, like this:

[me@linuxbox ~]$ aspell -H check foo.txt

which will result in this:



<html>
    <head>
        <title>Mispelled HTML file</title>
    </head>
    <body>
        <p>The quick brown fox jimped over the laxy dog.</p>
    </body>
</html>

(this time with the misspelled words in the body text, rather than the markup, highlighted for correction)

In this mode, the contents of HTML tags are ignored and not checked for spelling.

Note: By default, aspell will ignore URLs and email addresses in text. This behavior can be overridden with command line options. It is also possible to specify which markup tags are checked and skipped. See the aspell documentation for details.


Summing Up

In this chapter, we have looked at a few of the many command line tools that operate on text. Admittedly, it may not seem immediately obvious how or why you might use some of these tools on a day-to-day basis, though we have tried to show some semi-practical examples of their use. We will find in later chapters that these tools form the basis of a tool set that is used to solve a host of practical problems. This will be particularly true when we get into shell scripting, where these tools will really show their worth.



Further Reading

From the Coreutils package:
http://www.gnu.org/software/coreutils/manual/coreutils.html#Operating-on-sorted-files
http://www.gnu.org/software/coreutils/manual/coreutils.html#Operating-on-characters

From the Diffutils package:
http://www.gnu.org/software/diffutils/manual/html_mono/diff.html

aspell:
http://aspell.net/man-html/index.html

There are many other online resources for sed, in particular:
http://www.grymoire.com/Unix/Sed.html
http://sed.sourceforge.net/sed1line.txt

Also try googling "sed one liners" and "sed cheat sheets".

Extra Credit

There are a few more interesting text-manipulation commands worth investigating. Among these are: split (split files into pieces), csplit (split files into pieces based on context), and sdiff (side-by-side merge of file differences).

21 – Formatting Output

In this chapter, we continue our look at text-related tools, focusing on programs that are used to format text output, rather than changing the text itself. These tools are often used to prepare text for eventual printing, a subject that we will cover in the next chapter. The programs that we will cover in this chapter include:

nl – Number lines
fold – Wrap each line to a specified length
fmt – A simple text formatter
pr – Prepare text for printing
printf – Format and print data
groff – A document formatting system

Simple Formatting Tools

We'll look at some of the simple formatting tools first. These are mostly single-purpose programs, but they can be used in pipelines and scripts to perform useful work.


nl – Number Lines

The nl program is a rather arcane tool used to perform a simple task. It numbers lines. In its simplest use, it resembles cat -n:

[me@linuxbox ~]$ nl distros.txt | head
     1  SUSE         10.2   12/07/2006
     2  Fedora       10     11/25/2008
     3  SUSE         11.0   06/19/2008
     4  Ubuntu       8.04   04/24/2008
     5  Fedora       8      11/08/2007
     6  SUSE         10.3   10/04/2007
     7  Ubuntu       6.10   10/26/2006
     8  Fedora       7      05/31/2007
     9  Ubuntu       7.10   10/18/2007
    10  Ubuntu       7.04   04/19/2007

Like cat, nl can accept either multiple files as command line arguments, or standard input.

nl supports a concept called "logical pages" when numbering. This allows nl to reset (start over) the numerical sequence when numbering. Using options, it is possible to set the starting number to a specific value and, to a limited extent, its format. A logical page is further broken down into a header, body, and footer. Within each of these sections, line numbering may be reset and/or be assigned a different style. Sections in the text stream are indicated by the presence of some rather odd-looking markup added to the text:

Table 21-1: nl Markup
Markup     Meaning
\:\:\:     Start of logical page header
\:\:       Start of logical page body
\:         Start of logical page footer

Each of the above markup elements must appear alone on its own line.

Here are the common options for nl:

Table 21-2: Common nl Options
Option      Meaning
-b style    Set body numbering to style, where style is one of the following:
            a = number all lines
            t = number only non-blank lines. This is the default.
            n = none
            pregexp = number only lines matching basic regular expression regexp
-f style    Set footer numbering to style. Default is n (none).
-h style    Set header numbering to style. Default is n (none).
-i number   Set page numbering increment to number. Default is one.
-n format   Set numbering format to format, where format is:
            ln = left justified, without leading zeros
            rn = right justified, without leading zeros. This is the default.
            rz = right justified, with leading zeros
-p          Do not reset page numbering at the beginning of each logical page.
-s string   Add string to the end of each line number to create a separator. Default is a single tab character.
-v number   Set first line number of each logical page to number. Default is one.
-w width    Set width of the line number field to width. Default is six.


Admittedly, we probably won't be numbering lines that often, but we can use nl to look at how we can combine multiple tools to perform more complex tasks. Since we will be using nl, it will be useful to include its header/body/footer markup in the sed script from the previous section. Using our text editor, we will change the script as follows and save it as distros-nl.sed:

# sed script to produce Linux distributions report
1 i\
\\:\\:\\:\
\
Linux Distributions Report\
\
Name            Ver.    Released\
----            ----    --------\
\\:\\:
s/\([0-9]\{2\}\)\/\([0-9]\{2\}\)\/\([0-9]\{4\}\)$/\3-\1-\2/
$ a\
\\:\
\
End Of Report

The script now inserts the nl logical page markup and adds a footer at the end of the report. Note that we had to double up the backslashes in our markup, since they are normally interpreted as an escape character by sed.
Next, we'll produce our enhanced report by combining sort, sed, and nl:

[me@linuxbox ~]$ sort -k 1,1 -k 2n distros.txt | sed -f distros-nl.sed | nl

       Linux Distributions Report

       Name            Ver.    Released
       ----            ----    --------
     1 Fedora          5       2006-03-20
     2 Fedora          6       2006-10-24
     3 Fedora          7       2007-05-31
     4 Fedora          8       2007-11-08
     5 Fedora          9       2008-05-13
     6 Fedora          10      2008-11-25
     7 SUSE            10.1    2006-05-11
     8 SUSE            10.2    2006-12-07
     9 SUSE            10.3    2007-10-04
    10 SUSE            11.0    2008-06-19
    11 Ubuntu          6.06    2006-06-01
    12 Ubuntu          6.10    2006-10-26
    13 Ubuntu          7.04    2007-04-19
    14 Ubuntu          7.10    2007-10-18
    15 Ubuntu          8.04    2008-04-24
    16 Ubuntu          8.10    2008-10-30

       End Of Report

First, we sort the list by distribution name and version (fields 1 and 2), then we process the results with sed, adding the report header (including the logical page markup for nl) and footer. Finally, we process the result with nl which, by default, numbers only the lines of the stream that belong to the logical page's body section.

We can repeat the command and experiment with different options for nl.
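A couple of these options can be tried directly on a small stream; a sketch (GNU coreutils nl assumed):

```shell
# Start numbering at 100 and use ") " as the separator. With the default
# body style (t), the blank line is passed through without a number.
printf 'alpha\n\nbravo\n' | nl -v 100 -s ') '
```

Only the two non-blank lines receive numbers (100 and 101).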
fold – Wrap Each Line To A Specified Length

Folding is the process of breaking lines of text at a specified width. Like our other commands, fold accepts either one or more text files or standard input. Let's fold a simple stream of text to see how it works:

[me@linuxbox ~]$ echo "The quick brown fox jumped over the lazy dog." | fold -w 12
The quick br
own fox jump
ed over the
lazy dog.

Here we see fold in action. The text sent by the echo command is broken into segments specified by the -w option. In this example, we specify a line width of 12 characters. If no width is specified, the default is 80 characters. Notice how the lines are broken regardless of word boundaries. The addition of the -s option will cause fold to break the line at the last available space before the line width is reached:

[me@linuxbox ~]$ echo "The quick brown fox jumped over the lazy dog." | fold -w 12 -s
The quick
brown fox
jumped over
the lazy
dog.

fmt – A Simple Text Formatter

The fmt program also folds text, plus a lot more. It accepts either files or standard input and performs paragraph formatting on the text stream. Basically, it fills and joins lines in text while preserving blank lines and indentation.

To demonstrate, we'll need some text. Let's lift some from the fmt info page:

    `fmt' reads from the specified FILE arguments (or standard input if
    none are given), and writes to standard output.

    By default, blank lines, spaces between words, and indentation are
    preserved in the output; successive input lines with different
    indentation are not joined; tabs are expanded on input and
    introduced on output.

    `fmt' prefers breaking lines at the end of a sentence, and tries to
    avoid line breaks after the first word of a sentence or before the
    last word of a sentence.  A "sentence break" is defined as either
    the end of a paragraph or a word ending in any of `.?!', followed
    by two spaces or end of line, ignoring any intervening parentheses
    or quotes.  Like TeX, `fmt' reads entire "paragraphs" before
    choosing line breaks; the algorithm is a variant of that given by
    Donald E. Knuth and Michael F. Plass in the article "Breaking
    Paragraphs Into Lines."

We'll copy this text into our text editor and save the file as fmt-info.txt. Now, let's say we wanted to reformat this text to fit a fifty character wide column. We could do this by processing the file with fmt and the -w option:

[me@linuxbox ~]$ fmt -w 50 fmt-info.txt | head
   `fmt' reads from the specified FILE arguments
(or standard input if
none are given), and writes to standard output.

Well, that's an awkward result. The text itself explains why: successive input lines with different indentation are not joined. So, fmt is preserving the indentation of the first line. Fortunately, fmt provides an option to correct this:

[me@linuxbox ~]$ fmt -cw 50 fmt-info.txt
`fmt' reads from the specified FILE arguments
(or standard input if none are given), and writes
to standard output.

By default, blank lines, spaces between words,
and indentation are preserved in the output;
successive input lines with different indentation
are not joined; tabs are expanded on input and
introduced on output.

`fmt' prefers breaking lines at the end of a
sentence, and tries to avoid line breaks after
the first word of a sentence or before the
last word of a sentence.  A "sentence break" is
defined as either the end of a paragraph or a
word ending in any of `.?!', followed by two
spaces or end of line, ignoring any intervening
parentheses or quotes.  Like TeX, `fmt' reads
entire "paragraphs" before choosing line breaks;
the algorithm is a variant of that given by
Donald E. Knuth and Michael F. Plass in the
article "Breaking Paragraphs Into Lines."

Much better.

fmt has some interesting options:

Table 21-3: fmt Options
Option     Description
-c         Operate in crown margin mode. This preserves the indentation of the first two lines of a paragraph. Subsequent lines are aligned with the indentation of the second line.
-p string  Only format those lines beginning with the prefix string. After formatting, the contents of string are prefixed to each reformatted line. This option can be used to format text in source code comments. For example, any programming language or configuration file that uses a "#" character to delineate a comment could be formatted by specifying -p '# ' so that only the comments will be formatted.
-s         Split-only mode. In this mode, lines will only be split to fit the specified column width. Short lines will not be joined to fill lines. This mode is useful when formatting text such as code where joining is not desired.
-u         Perform uniform spacing. This will apply traditional "typewriter-style" formatting to the text. This means a single space between words and two spaces between sentences.
-w width   Format text to fit within a column width characters wide. The default is 75 characters. Note: fmt actually formats lines slightly shorter than the specified width to allow for line balancing.
The -p option is particularly interesting. With it, we can format selected portions of a file, provided that the lines to be formatted all begin with the same sequence of characters. Many programming languages use a pound sign (#) to indicate the beginning of a comment, and so can be formatted using this option. Let's create a file that simulates a program that uses comments:

[me@linuxbox ~]$ cat > fmt-code.txt
# This file contains code with comments.

# This line is a comment.
# Followed by another comment line.
# And another.

This, on the other hand, is a line of code.
And another line of code.
And another.

Our sample file contains comments which begin with the string "# " (a # followed by a space) and lines of "code" which do not. Now, using fmt, we can format the comments and leave the code untouched:

[me@linuxbox ~]$ fmt -w 50 -p '# ' fmt-code.txt
# This file contains code with comments.

# This line is a comment.  Followed by another
# comment line.  And another.

This, on the other hand, is a line of code.
And another line of code.
And another.

Notice that the adjoining comment lines are joined, while the blank lines and the lines that do not begin with the specified prefix are preserved.
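The -u option is just as simple to demonstrate; a sketch:

```shell
# Runs of spaces between words are squeezed to a single space
echo "The   quick brown    fox" | fmt -u
# The quick brown fox
```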


pr – Format Text For Printing

The pr program is used to paginate text. When printing text, it is often desirable to separate the pages of output with several lines of whitespace, to provide a top margin and a bottom margin for each page. Further, this whitespace can be used to insert a header and footer on each page.

We'll demonstrate pr by formatting our distros.txt file into a series of very short pages (only the first two pages are shown):

[me@linuxbox ~]$ pr -l 15 -w 65 distros.txt

2008-12-11 18:27               distros.txt               Page 1

SUSE         10.2   12/07/2006
Fedora       10     11/25/2008
SUSE         11.0   06/19/2008
Ubuntu       8.04   04/24/2008
Fedora       8      11/08/2007

2008-12-11 18:27               distros.txt               Page 2

SUSE         10.3   10/04/2007
Ubuntu       6.10   10/26/2006
Fedora       7      05/31/2007
Ubuntu       7.10   10/18/2007
Ubuntu       7.04   04/19/2007

In this example, we employ the -l option (for page length) and the -w option (page width) to define a "page" that is 65 columns wide and 15 lines long. pr paginates the contents of the distros.txt file, separates each page with several lines of whitespace and creates a default header containing the file modification time, filename, and page number.
The pr program provides many options to control page layout. We'll take a look at more of them in the next chapter.
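The column behavior is easy to see with a small numbered stream; a sketch using -t (omit headers and trailers):

```shell
# Six input lines arranged into 2 columns yields 3 output rows;
# -t suppresses the header and the trailing blank lines
seq 6 | pr -2 -t
```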
printf – Format And Print Data

Unlike the other commands in this chapter, the printf command is not used for pagination, but rather for formatting data. printf (from the phrase "print formatted") was originally developed for the C programming language and was later implemented in many programming languages, including the shell. So why is it important? Because it is so widely used. In fact, in bash, printf is a builtin.

printf works like this:

printf "format" arguments

The command is given a string containing a description of the formatting (format), which is then applied to a list of arguments. The formatted result is sent to standard output. Here is a trivial example:

[me@linuxbox ~]$ printf "I formatted '%s' as a string.\n" foo
I formatted 'foo' as a string.

The format string may contain literal text, escape sequences (such as \n, a newline character), and sequences beginning with the % character, which are called conversion specifications. In the example above, the conversion specification %s is used to format the string "foo" and place it in the command's output. The s conversion is used to format string data. There are other specifiers for other kinds of data. This table lists the commonly used data types:
Table 21-4: Common printf Data Type Specifiers
Specifier   Description
d           Format a number as a signed decimal integer.
f           Format and output a floating point number.
o           Format an integer as an octal number.
s           Format a string.
x           Format an integer as a hexadecimal number using lowercase a-f where needed.
X           Same as x, but use uppercase letters.
%           Print a literal % symbol (i.e., specify "%%")

We'll demonstrate the effect of each of the conversion specifiers on the string "380":

[me@linuxbox ~]$ printf "%d, %f, %o, %s, %x, %X\n" 380 380 380 380 380 380
380, 380.000000, 574, 380, 17c, 17C

Since we specified six conversion specifiers, we must also supply six arguments for printf to process. The six results show the effect of each specifier.
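The integer conversions are handy for quick base conversions; a sketch:

```shell
# One value rendered as octal, lowercase hex, and uppercase hex
printf "%o %x %X\n" 255 255 255
# 377 ff FF
```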
Several optional components may be added to the conversion specifier to adjust its output. A complete conversion specification may consist of the following:

%[flags][width][.precision]conversion_specification

Multiple optional components, when used, must appear in the order specified above to be properly interpreted. Here is a description of each:

Table 21-5: printf Conversion Specification Components
Component    Description
flags        There are five different flags:
             # – Use the "alternate format" for output. This varies by data type. For o (octal number) conversion, the output is prefixed with 0. For x and X (hexadecimal number) conversions, the output is prefixed with 0x or 0X respectively.
             0 – (zero) Pad the output with zeros.
             - – (dash) Left-align the output. By default, printf right-aligns output.
             ' ' – (space) Produce a leading space for positive numbers.
             + – (plus sign) Sign positive numbers. By default, printf only signs negative numbers.
width        A number specifying the minimum field width.
.precision   For floating point numbers, specify the number of digits of precision to be output after the decimal point. For string conversion, precision specifies the number of characters to output.


Here are some examples of different formats in action:

Table 21-6: printf Conversion Specification Examples
Argument      Format      Result       Notes
380           "%d"        380          Simple formatting of an integer.
380           "%#x"       0x17c        Integer formatted as a hexadecimal number using the "alternate format" flag.
380           "%05d"      00380        Integer formatted with leading zeros (padding) and a minimum field width of five characters.
380           "%05.5f"    380.00000    Number formatted as a floating point number with padding and five decimal places of precision. Since the specified minimum field width (5) is less than the actual width of the formatted number, the padding has no effect.
380           "%010.5f"   0380.00000   By increasing the minimum field width to 10, the padding is now visible.
380           "%+d"       +380         The + flag signs a positive number.
380           "%-d"       380          The - flag left-aligns the formatting.
abcdefghijk   "%5s"       abcdefghijk  String formatted with a minimum field width.
abcdefghijk   "%.5s"      abcde        By applying precision to a string, it is truncated.


Again, printf is used mostly in scripts where it is employed to format tabular data, rather than on the command line directly, but we can still show how it can be used to solve various formatting problems. First, let's output some fields separated by tab characters:

[me@linuxbox ~]$ printf "%s\t%s\t%s\n" str1 str2 str3
str1    str2    str3

By inserting \t (the escape sequence for a tab), we achieve the desired effect. Next, some numbers with neat formatting:

[me@linuxbox ~]$ printf "Line: %05d %15.3f Result: %+15d\n" 1071 3.14156295 32589
Line: 01071           3.142 Result:          +32589

This shows the effect of minimum field width on the spacing of the fields.
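In a script, one format string is typically reused for every row of a table. A sketch with hypothetical item data (note that the shell's printf reuses the format when given more arguments than conversions):

```shell
# A left-aligned 10-character name column and a right-aligned price column
printf "%-10s %8s\n"   "Item" "Price"
printf "%-10s %8.2f\n" "apples" 1.5 "bananas" 0.75
```

The second printf emits two rows, one per name/price pair.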
Document Formatting Systems

These are good for small, simple tasks, but what about larger jobs? One of the reasons that Unix became a popular operating system among technical and scientific users (aside from providing a powerful multitasking, multiuser environment for all kinds of software development) is that it offered tools that could be used to produce many types of documents, particularly scientific and academic publications.
In fact, as the history recounted in the groff documentation relates, document preparation was instrumental in Unix's early development:

    In 1971 the developers wanted to get a PDP-11 for further work on
    the operating system. In order to justify the cost for this system,
    they proposed that they would implement a document formatting system
    for the AT&T patents division. This first formatting program was a
    reimplementation of McIllroy's `roff', written by J. F. Ossanna.

Two main families of document formatters dominate the field: those descended from the original roff program, including nroff and troff, and those based on Donald Knuth's TEX ("tek") typesetting system. And yes, the dropped "E" in the middle is part of its name.

The nroff program is used to format documents for output to devices that use monospaced fonts, such as character terminals and typewriter-style printers. At the time of its introduction, this included nearly all printing devices attached to computers. The later troff program formats documents for output on typesetters, devices used to produce "camera-ready" type for commercial printing. Most computer printers today are able to simulate the output of typesetters. The roff family also includes some other programs that are used to prepare portions of documents, such as tables and mathematical equations.

The TEX system (in stable form) first appeared in 1989 and has, to some degree, displaced troff as the tool of choice for typesetter output. We won't cover TEX here, due both to its complexity (there are entire books about it) and to the fact that it is not installed by default on most modern Linux systems.

Tip: For those interested in installing TEX, check out the texlive package which can be found in most distribution repositories, and the LyX graphical content editor.
groff – A Document Formatting System

groff is a suite of programs containing the GNU implementation of troff. It also includes a script that is used to emulate nroff and the rest of the roff family as well.

While roff and its descendants are used to make formatted documents, they do it in a way that is rather foreign to modern users. Most documents today are produced using word processors that are able to perform both the composition and layout of a document in a single step. Prior to the advent of the graphical word processor, documents were often produced in a two-step process: composition with a text editor, followed by formatting with a processor such as troff. Instructions for the formatting program were embedded into the composed text through the use of a markup language. The modern analog for such a process is the web page, which is composed with a text editor and then rendered by a browser, using HTML as the markup language.

We're not going to cover groff in its entirety, as many elements of its markup language deal with rather arcane details of typography. Instead we will concentrate on its macro packages, which remain in wide use. These macro packages condense many of its low-level commands into a smaller set of high-level commands that make using groff much easier.

For a moment, let's consider the humble man page. It lives in the /usr/share/man directory as a gzip compressed text file. If we were to examine its uncompressed contents, we would see the following (the man page for ls in section 1 is shown):
\" DO NOT MODIFY THIS FILE! It was generated by help2man 1
...


...
10" "User Commands"

...
SH SYNOPSIS

...
[\fIFILE\fR]
...
SH DESCRIPTION

...
PP

Compared to the man page in its normal presentation, we can begin to see a correlation between the markup language and its results:

[me@linuxbox ~]$ man ls | head
LS(1)                        User Commands                       LS(1)

NAME
       ls - list directory contents

SYNOPSIS
       ls [OPTION]... [FILE]...


The reason this is of interest is that man pages are rendered by groff, using the mandoc macro package. In fact, we can simulate the man command with a pipeline that runs groff with ASCII output on the man page source.

groff can produce output in several formats. If no format is specified, PostScript is output by default:

[me@linuxbox ~]$ zcat /usr/share/man/man1/ls.1.gz | groff -mandoc | head
%!PS-Adobe-3.0
%%Creator: groff version 1.18.1
%%CreationDate: Thu Feb  5 13:44:37 2009
%%DocumentNeededResources: font Times-Roman
%%+ font Times-Bold
%%+ font Times-Italic
%%DocumentSuppliedResources: procset grops 1.18.1

We mentioned PostScript briefly above; it is a page description language that is used to describe the contents of a printed page to a typesetter-like device. If we take the output of our command and store it to a file (assuming that we are using a graphical desktop with a Desktop directory):

[me@linuxbox ~]$ zcat /usr/share/man/man1/ls.1.gz | groff -mandoc > ~/Desktop/foo.ps

An icon for the output file should appear on the desktop. By double-clicking the icon, a page viewer should start up and reveal the file in its rendered form. We can also convert the PostScript file into PDF with the ps2pdf program:

[me@linuxbox ~]$ ps2pdf ~/Desktop/foo.ps ~/Desktop/ls.pdf

Tip: Linux systems often include many command line programs for file format conversion. They are often named using the convention format2format. Try using the command ls /usr/bin/*[[:alpha:]]2[[:alpha:]]* to identify them.

For our last exercise with groff, we will revisit our old friend distros.txt once more. This time, we will use the tbl program, which is used to format tables, to typeset our list of Linux distributions.

First, we need to modify our sed script to add the necessary requests that tbl requires. Using our text editor, we will change the script as follows and save it as distros-tbl.sed:

# sed script to produce Linux distributions report
1 i\
.TS\
center box;\
cb s s\
cb cb cb\
l n c.\
Linux Distributions Report\
=\
Name	Version	Released\
_
s/\([0-9]\{2\}\)\/\([0-9]\{2\}\)\/\([0-9]\{4\}\)$/\3-\1-\2/
$ a\
.TE

We'll save the resulting file as distros-tbl.sed. tbl uses the .TS and .TE requests to start and end the table. The rows following the .TS request define global properties of the table which, for our example, are centered horizontally on the page and surrounded by a box. The remaining lines of the definition describe the layout of each table row. Now, if we run our report-generating pipeline again with the new sed script, we'll get the following:
[me@linuxbox ~]$ sort -k 1,1 -k 2n distros.txt | sed -f distros-tbl.sed | groff -t -T ascii 2>/dev/null
                +------------------------------+
                |  Linux Distributions Report  |
                +------------------------------+
                | Name     Version   Released  |
                +------------------------------+
                |Fedora      5      2006-03-20 |
                |Fedora      6      2006-10-24 |
                |Fedora      7      2007-05-31 |
                |Fedora      8      2007-11-08 |
                |Fedora      9      2008-05-13 |
                |Fedora     10      2008-11-25 |
                |SUSE       10.1    2006-05-11 |
                |SUSE       10.2    2006-12-07 |
                |SUSE       10.3    2007-10-04 |
                |SUSE       11.0    2008-06-19 |
                |Ubuntu      6.06   2006-06-01 |
                |Ubuntu      6.10   2006-10-26 |
                |Ubuntu      7.04   2007-04-19 |
                |Ubuntu      7.10   2007-10-18 |
                |Ubuntu      8.04   2008-04-24 |
                |Ubuntu      8.10   2008-10-30 |
                +------------------------------+

Adding the -t option to groff instructs it to pre-process the text stream with tbl. Likewise, the -T option is used to output to ASCII rather than the default output medium, PostScript.

The format of the output is the best we can expect if we are limited to the capabilities of a terminal screen or typewriter-style printer. If we specify PostScript output and graphically view the result, we will get a much more satisfying result:

[me@linuxbox ~]$ sort -k 1,1 -k 2n distros.txt | sed -f distros-tbl.sed | groff -t > ~/Desktop/foo.ps
Figure 5: Viewing The Finished Table

Summing Up

Given that text is so central to the character of Unix-like operating systems, it makes sense that there would be many tools that are used to manipulate and format text. The simple formatting tools find many uses in scripts that produce short documents, while groff and friends can produce whole books. We may never write a technical paper using command line tools (though there are many people who do!), but it's good to know that we could.
Further Reading

Writing Papers with nroff using -me:
http://docs.freebsd.org/44doc/usd/19.memacros/paper.pdf

The groff home page at the GNU Project:
http://www.gnu.org/software/groff/

-me Reference Manual:
http://docs.freebsd.org/44doc/usd/20.meref/paper.pdf

Tbl – A Program to Format Tables:
http://plan9.bell-labs.com/10thEdMan/tbl.pdf

And, of course, try the following articles at Wikipedia:
http://en.wikipedia.org/wiki/TeX
http://en.wikipedia.org/wiki/Donald_Knuth
http://en.wikipedia.org/wiki/Typesetting

22 – Printing

After spending the last couple of chapters manipulating text, it's time to put that text on paper. In this chapter, we'll look at the command line tools that are used to print files and control printer operation. We won't be looking at how to configure printing, as that varies from distribution to distribution and is usually set up automatically during installation.

We will discuss the following commands:

pr – Convert text files for printing
lpr – Print files
a2ps – Format files for printing on a PostScript printer
lpstat – Show printer status information
lpq – Show printer queue status
lprm – Cancel print jobs
A Brief History Of Printing

To fully understand the printing features found in Unix-like operating systems, we must first learn some history. Printing on Unix-like systems goes way back to the beginning of the operating system itself. In those days, printers and how they were used were much different from today. The typical computer user of 1980 worked at a terminal connected to a computer some distance away, and the printer was located near that central computer.

When printers were expensive and centralized, as they often were in the early days of Unix, it was common practice for many users to share a printer. To identify print jobs belonging to a particular user, a banner page displaying the name of the user was often printed at the beginning of each print job. The computer support staff would then load up a cart containing the day's print jobs and deliver them to the individual users.

The printer technology of that era was different in two important respects. First, printers of that period were almost always impact printers, which use a mechanical mechanism to strike a ribbon against the paper to form character impressions on the page. Two of the popular technologies of that time were daisy-wheel printing and dot-matrix printing.

Second, early printers used a fixed set of characters intrinsic to the device itself. For example, a daisy-wheel printer could only print the characters actually molded into the petals of the daisy wheel. This made the printers much like high-speed typewriters. As with most typewriters, they printed using monospaced (fixed width) fonts. Printing was done at fixed positions on the page, and the printable area of a page contained a fixed number of characters. Most printers printed ten characters per inch horizontally and six lines per inch vertically. Using this scheme, a US-letter sheet of paper is 85 characters wide and 66 lines high. Taking a small margin into account on each side, 80 characters was considered the maximum width of a print line. This explains why terminal displays (and our terminal emulators) are normally 80 characters wide.

Data is sent to a typewriter-like printer in a simple stream of bytes containing the characters to be printed. In addition, the low-numbered ASCII control codes provided a means of moving the printer's carriage and paper, using codes for carriage return, line feed, form feed, and so on. Using the control codes, it is possible to achieve limited font effects such as boldface, by having the printer print a character, backspace, and print the character again to get a darker print impression on the page. We can actually witness this if we use nroff to render a man page and examine the output using cat -A:
[me@linuxbox ~]$ zcat /usr/share/man/man1/ls.1.gz | nroff -man | cat -A | head
LS(1)                        User Commands                       LS(1)$
$
$
$
N^HNA^HAM^HME^HE$
       ls - list directory contents$
$
S^HSY^HYN^HNO^HOP^HPS^HSI^HIS^HS$
       l^Hls^Hs [_^HO_^HP_^HT_^HI_^HO_^HN]...  [_^HF_^HI_^HL_^HE]...$
$

The ^H (Control-h) characters are the backspaces used to create the boldface effect.
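We can synthesize such an overstrike sequence ourselves; a sketch using printf and cat's -A option (GNU cat):

```shell
# "N backspace N" was the overstrike trick for printing a bold N;
# cat -A shows the backspace as ^H and marks the end of line with $
printf 'N\bN\n' | cat -A
# N^HN$
```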


Graphical Printers

The development of GUIs led to major changes in printer technology. As computers moved to more picture-based displays, printing moved from character-based to graphical techniques. This was facilitated by the advent of the low-cost laser printer which, instead of printing fixed characters, could print tiny dots anywhere in the printable area of the page. This made printing proportional fonts (like those used by typesetters), and even photographs and high-quality diagrams, possible.

However, moving from a character-based scheme to a graphical scheme presented a formidable technical challenge: the amount of data needed to describe a full page of dots is vastly larger than the simple byte stream of a character printer, and many of the slow computer networks of the day simply could not handle it. It was clear that a clever invention was needed.

That invention turned out to be the page description language (PDL). A page description language is a programming language that describes the contents of a page. Basically it says, "go to this position, draw the character 'a' in 10 point Helvetica, go to this position..." until everything on the page is described. The first major PDL was PostScript from Adobe Systems, which is still in wide use today. The PostScript language is a complete programming language tailored for typography and other kinds of graphics and imaging. It includes built-in support for 35 standard, high-quality fonts, plus the ability to accept additional font definitions at run time. At first, support for PostScript was built into the printers themselves. This solved the data transmission problem: while a typical PostScript program is verbose in comparison to the simple byte stream of character printers, it is much smaller than the number of bytes required to represent the entire printed page as dots.

A PostScript printer accepted a PostScript program as input. The printer contained its own processor and memory and executed a special program called a PostScript interpreter, which read the incoming program and rendered the results into the printer's internal memory, forming the pattern of bits (dots) that would be transferred to the paper. The generic name for this process of rendering something into a large bit pattern (called a bitmap) is raster image processing, or RIP.

As the years went by, both computers and networks became much faster. This allowed the RIP to move from the printer to the host computer, which, in turn, permitted high-quality printers to be much less expensive. Many printers today use the host computer's RIP in exactly this way. They rely on the host computer's RIP to provide a stream of bits to print as dots.


Printing With Linux

Modern Linux systems employ two software suites to perform and manage printing. The first, CUPS (Common Unix Printing System), provides print drivers and print-job management; the second, Ghostscript, a PostScript interpreter, acts as a RIP.

CUPS manages printers by creating and maintaining print queues. As we discussed in our brief history lesson, Unix printing was originally designed to manage a centralized printer shared by multiple users. Since printers are slow by nature, compared to the computers that are feeding them, printing systems need a way to schedule multiple print jobs and keep things organized.

Preparing Files For Printing

As command line users, we are mostly interested in printing text, though it is certainly possible to print other data formats as well.

pr – Convert Text Files For Printing

We looked at pr in the previous chapter. Now we will examine some of its many options used in conjunction with printing. pr is used to adjust text to fit on a specific page size, with optional page headers and margins.


Table 22-1: Common pr Options
Option       Description
-columns     Organize the content of the page into the number of columns specified by columns.
-a           By default, multicolumn output is listed vertically. By adding the -a (across) option, content is listed horizontally.
-D "format"  Format the date displayed in page headers using format. See the man page for the date command for a description of the format string.
-f           Use form feeds rather than carriage returns to separate pages.
-l length    Set page length to length.
-o offset    Create a left margin offset characters wide.
-w width     Set page width to width. Default is 72 characters.
In this example, we will produce a directory listing of /usr/bin and format it into paginated, three-column output using pr:

[me@linuxbox ~]$ ls /usr/bin | pr -3 -w 65 | head

2009-02-18 14:00                                            Page 1

[                    apturl               bsd-write
411toppm             ar                   bsh
a2p                  arecord              btcflash
a2ps                 arecordmidi          bug-buddy
a2ps-lpr-wrapper     ark                  buildhash

Sending A Print Job To A Printer

The CUPS printing suite supports two methods of printing historically used on Unix-like systems. One method, called Berkeley or LPD (used in the Berkeley Software Distribution version of Unix), uses the lpr program, while the other method, called SysV (from the System V version of Unix), uses the lp program. Both programs do roughly the same thing.

lpr – Print Files (Berkeley Style)

The lpr program can be used to send files to the printer. It may also accept standard input, so it can be used in pipelines. For example, to print the results of our multicolumn directory listing above, we could do this:

[me@linuxbox ~]$ ls /usr/bin | pr -3 | lpr

and the report would be sent to the system's default printer. To send the file to a different printer, the -P option can be used. To see a list of printers known to the system:

[me@linuxbox ~]$ lpstat -a

Tip: Many Linux distributions allow you to define a "printer" that outputs files in PDF (Portable Document Format), rather than printing on the physical printer. This is very handy for experimenting with printing commands. Check your printer configuration program to see if it supports this configuration.

Here are some of the common options for lpr:
Table 22-2: Common lpr Options
Option      Description
-# number   Set number of copies to number.
-p          Print each page with a shaded header with the date, time, job name, and page number. This so-called "pretty print" option can be used when printing text files.
-P printer  Specify the name of the printer used for output. If no printer is specified, the system's default printer is used.
-r          Delete files after printing. This would be useful for programs that produce temporary printer-output files.
lp – Print Files (System V Style)

Like lpr, lp is used to send files to the printer. It differs from lpr in that it supports a different (and slightly more sophisticated) option set. Here are the common lp options:

Table 22-3: Common lp Options
Option             Description
-d printer         Set the destination (printer) to printer. If no destination is specified, the system default printer is used.
-o landscape       Set output to landscape orientation.
-o fitplot         Scale the file to fit the page. This is useful when printing images, such as JPEG files.
-o scaling=number  Scale file to number. The value of 100 fills the page.
-o cpi=number      Set the output characters per inch to number. Default is 10.
-o lpi=number      Set the output lines per inch to number. Default is 6.


-o page-bottom=points
-o page-left=points
-o page-right=points
-o page-top=points   Set the page margins. Values are expressed in points, a unit of typographic measurement. There are 72 points to an inch.
-P pages             Specify the list of pages to print. pages may be expressed as a comma-separated list and/or a range.
We'll produce our directory listing again, this time printing 12 characters per inch and 8 lines per inch, with a left margin of one-half inch. Note that we have to adjust the pr options to account for the new page size:

[me@linuxbox ~]$ ls /usr/bin | pr -4 -w 90 -l 88 | lp -o page-left=36 -o cpi=12 -o lpi=8

This pipeline produces a four-column listing using smaller type than the default. The increased number of characters per inch allows more columns to fit on the page.


Another Option: a2ps

The a2ps program is interesting. Its name originally meant "ASCII to PostScript" and it was used to prepare text files for printing on PostScript printers. Over the years, however, the capabilities of the program have grown, and now its name means "Anything to PostScript." While its name suggests a format-conversion program, it is actually a printing program. By default it sends its output to the system's default printer rather than to standard output. The program's default behavior is that of a "pretty printer," meaning that it improves the appearance of output. We can use the program to create a PostScript file on our desktop:

[me@linuxbox ~]$ ls /usr/bin | pr -3 -t | a2ps -o ~/Desktop/ls.ps -L 66
[stdin (plain): 11 pages on 6 sheets]
[Total: 11 pages on 6 sheets] saved into the file `/home/me/Desktop/ls.ps'

Here we filter the stream with pr, using the -t option (omit headers and footers), and then with a2ps, specifying an output file (-o option) and 66 lines per page (-L option) to match the output pagination of pr. If we view the resulting file with a suitable file viewer, we will see this:

Figure 6: Viewing a2ps Output

As we can see, the default output layout is "two up" format. This causes the contents of two pages to be printed on each sheet of paper. a2ps applies nice page headers and footers, too.
Here is a summary:
Table 22-4: a2ps Options
Option                     Description
--center-title text        Set center page title to text.
--columns number           Arrange pages into number columns. Default is 2.
--guess                    Report the types of files given as arguments.
--left-footer text         Set left-page footer to text.
--line-numbers=interval    Number lines of output every interval lines.
--list=topic               Display settings for topic, where topic is one
                           of the following: delegations (external programs
                           that will be used to convert data), encodings,
                           features, variables, media (paper sizes and the
                           like), ppd (PostScript printer descriptions),
                           printers, prologues (portions of code that are
                           prefixed to normal output), stylesheets, and
                           user options.
--right-footer text        Set right-page footer to text.
--rows number              Arrange pages into number rows.
-B                         No page headers.
-f size                    Use size point font.
-l number                  Set characters per line to number. This and the
                           -L option (below) can be used to make files
                           paginated with other programs, such as pr, fit
                           correctly on the page.
-L number                  Set lines per page to number.
-M name                    Use media name.
-n number                  Output number copies of each page.
-o file                    Send output to file. If file is specified as
                           "-", use standard output.
-P printer                 Use printer. If a printer is not specified, the
                           system default printer is used.
-r                         Landscape orientation.
-u text                    Underlay (watermark) pages with text.

a2ps has several more options. During my testing, I noticed different behavior on various distributions. On CentOS 4 and Fedora 10, output defaulted to A4 media, despite the program being configured to use letter-size media by default. This could be overcome by specifying the desired media with the -M option. On Ubuntu 8.04, a2ps performed as configured.

Also note that there is another output formatter that is useful for converting text into PostScript, which can perform many of the same kinds of formatting tricks.


Monitoring And Controlling Print Jobs
As Unix printing systems are designed to handle multiple print jobs from multiple users, CUPS is designed to do the same. Each printer is given a print queue, where jobs are parked until they can be spooled to the printer. CUPS supplies several command line programs that are used to manage printer status and print queues.

lpstat – Display Print System Status
The lpstat program is useful for determining the names and availability of printers on the system.

The commonly useful options include:

Table 22-5: Common lpstat Options
Option            Description
-a [printer...]   Display the state of the printer queue for printer. Note
                  that this is the status of the queue's ability to accept
                  jobs, not the status of the physical printers. If no
                  printers are specified, all print queues are shown.
-p [printer...]   Display the status of the specified physical printer. If
                  no printers are specified, all printers are shown.
-s                Display a status summary.

lpq – Display Printer Queue Status
To see the status of a printer queue, the lpq program is used. This allows us to view the status of the queue and the print jobs it contains. Here is an example of an empty queue for the system's default printer:

[me@linuxbox ~]$ lpq
printer is ready
no entries

If we send a job to the printer and then look at the queue, we will see it listed, along with its owner and job number (603 in the example that follows).

lprm / cancel – Cancel Print Jobs
CUPS supplies two programs that are used to terminate print jobs and remove them from the print queue. One is Berkeley style (lprm) and the other is System V (cancel). They differ slightly in the options they support, but either will do the job. Using our print job above as an example, we could stop the job and remove it this way:

[me@linuxbox ~]$ cancel 603
[me@linuxbox ~]$ lpq
printer is ready
no entries

Each command has options for removing all the jobs belonging to a particular user, a particular printer, and multiple job numbers. Their respective man pages have all the details.


Summing Up
In this chapter, we have seen how the printers of the past influenced the design of the printing systems on Unix-like machines, and how much control is available on the command line to control not only the scheduling and execution of print jobs, but also the various output options.

Further Reading

Wikipedia has good background articles on computer printing, and the CUPS website contains complete documentation for the printing system used by most Linux distributions:

http://www.cups.org/
23 – Compiling Programs
In this chapter, we will look at how to build programs by compiling source code. The availability of source code is the essential freedom that makes Linux possible. The entire ecosystem of Linux development relies on free exchange between developers. For many desktop users, compiling is a lost art. It used to be quite common, but today, distribution providers maintain huge repositories of precompiled binaries, ready to download and use. At the time of this writing, the Debian repository (one of the largest of any of the distributions) contains almost 23,000 packages.

So why compile software? There are two reasons:

1. Availability. Despite the number of precompiled programs in distribution repositories, some distributions may not include all the desired applications. In this case, the only way to get the desired program is to compile it from source.

2. Timeliness. While some distributions specialize in cutting-edge versions of programs, many do not. This means that in order to have the very latest version of a program, compiling is necessary.

Compiling source code can become quite complex and technical, well beyond the reach of many users. However, many compiling tasks are quite easy and involve only a few steps. It all depends on the package. We will look at a very simple case in order to provide an overview of the process and as a starting point for those who wish to undertake further study.

We will introduce one new command:

    make – Utility to maintain programs

What Is Compiling?
Simply put, compiling is the process of translating source code (the human-readable description of a program written by a programmer) into the native language of the computer's processor.

The computer's processor (or CPU) works at a very elemental level, executing programs in what is called machine language. This is a numeric code that describes very small operations, such as "add this byte," "point to this location in memory," or "copy this byte." Each of these instructions is expressed in binary (ones and zeros). The earliest computer programs were written using this numeric code, which made programming slow and painful.

This problem was overcome by the advent of assembly language, which replaced the numeric codes with (slightly) easier to use character mnemonics such as CPY (for copy) and MOV (for move). Programs written in assembly language are processed into machine language by a program called an assembler. Assembly language is still used today for certain specialized programming tasks, such as device drivers and embedded systems.

We next come to what are called high-level programming languages. They are called this because they allow the programmer to be less concerned with the details of what the processor is doing and more with solving the problem at hand. The early ones (developed during the 1950s) include FORTRAN (designed for scientific and technical tasks) and COBOL (designed for business applications). Both are still in limited use today.

While there are many popular programming languages, two predominate. Most programs written for modern systems are written in either C or C++. In the examples to follow, we will be compiling a program written in C.

Programs written in high-level programming languages are converted into machine language by processing them with another program, called a compiler. Some compilers translate high-level instructions into assembly language and then use an assembler to perform the final stage of translation into machine language.

A process often used in conjunction with compiling is called linking. There are many common tasks performed by programs. Take, for instance, opening a file. Many programs perform this task, but it would be wasteful to have each program implement its own file-opening routine. It makes more sense to have a single piece of programming that knows how to open files and to allow all programs that need it to share it. Providing support for common tasks is accomplished by what are called libraries. They contain multiple routines, each performing some common task that multiple programs can share. A program called a linker is used to form the connections between the output of the compiler and the libraries that the compiled program requires. The final result of this process is the executable program file, ready for use.


Are All Programs Compiled?
No. As we have seen, there are programs, such as shell scripts, that do not require compiling. They are executed directly. These are written in what are known as scripting or interpreted languages. These languages have grown in popularity in recent years and include Perl, Python, PHP, Ruby, and many others.

Scripted languages are executed by a special program called an interpreter. An interpreter inputs the program file and reads and executes each instruction contained within it. In general, interpreted programs execute much more slowly than compiled programs. This is because each source code instruction in an interpreted program is translated every time it is carried out, whereas with a compiled program, a source code instruction is only translated once, and this translation is permanently recorded in the final executable file.

So why are interpreted languages so popular? For many programming chores, the results are "fast enough," but the real advantage is that it is generally faster and easier to develop interpreted programs. Programs are usually developed in a repeating cycle of code, compile, test. As a program grows in size, this cycle can become quite long. Interpreted languages remove the compilation step and thus speed up program development.

Compiling A C Program
Let's compile something. Before we do that, however, we're going to need some tools like the compiler, the linker, and make. The C compiler used almost universally in the Linux environment is called gcc (GNU C Compiler). Most distributions do not install gcc by default. We can check to see if the compiler is present like this:

[me@linuxbox ~]$ which gcc
/usr/bin/gcc

The results in this example show that the compiler is installed.

Tip: Your distribution may have a meta-package (a collection of packages) for software development. If so, consider installing it if you intend to compile programs on your system. If your system does not provide a meta-package, try installing the gcc and make packages. On many distributions, this is sufficient to carry out the exercise below.


Obtaining The Source Code
For our compiling exercise, we are going to compile a program from the GNU Project called diction. As programs go, it is fairly small and easy to build. Following convention, we'll first create a directory for our source code named src and then download the source code into it using ftp to connect to the GNU FTP site at ftp.gnu.org:

[me@linuxbox ~]$ mkdir src
[me@linuxbox ~]$ cd src
[me@linuxbox src]$ ftp ftp.gnu.org
220 GNU FTP server ready.
Name (ftp.gnu.org:me): anonymous
...
Remote system type is UNIX.
ftp> cd gnu/diction
250 Directory successfully changed.
ftp> ls
200 PORT command successful. Consider using PASV.
...
-rw-r--r--    1 1003     65534      141062 Sep 17  2007 diction-1.11.tar.gz
ftp> get diction-1.11.tar.gz
local: diction-1.11.tar.gz remote: diction-1.11.tar.gz
200 PORT command successful. Consider using PASV.
...
141062 bytes received
ftp> bye
221 Goodbye.
[me@linuxbox src]$ ls
diction-1.11.tar.gz

Note: Since we are the "maintainer" of this source code while we compile it, we will keep it in ~/src. Source code installed by your distribution will be installed in /usr/src, while source code intended for use by multiple users is usually installed in /usr/local/src.

As we can see, source code is usually supplied in the form of a compressed tar file. After arriving at the ftp site, we examine the list of tar files available and select the newest version for download. Using the get command within ftp, we copy the file from the ftp server to the local machine.

Once the tar file is downloaded, it must be unpacked. This is done with the tar program:

[me@linuxbox src]$ tar xzf diction-1.11.tar.gz
[me@linuxbox src]$ ls
diction-1.11  diction-1.11.tar.gz

Tip: The diction program, like all GNU Project software, follows certain standards for source code packaging. Most other source code available in the Linux ecosystem also follows this standard. One element of the standard is that when the source code tar file is unpacked, a directory will be created which contains the source tree, and that this directory will be named project-x.xx, thus containing both the project's name and its version number. This scheme allows easy installation of multiple versions of the same program. However, it is often a good idea to examine the layout of the tree before unpacking. Some projects will not create the directory, but instead will deliver the files directly into the current directory. This will make a mess in an otherwise well-organized directory. To avoid this, use the following command to examine the contents of the tar file:

tar tzvf tarfile | head

Examining The Source Tree
Unpacking the tar file results in the creation of a new directory, named diction-1.11. This directory contains the source tree. Let's look inside:

[me@linuxbox src]$ cd diction-1.11
[me@linuxbox diction-1.11]$ ls
config.guess  config.h.in  configure  COPYING  de.po  diction.c
diction.spec  diction.spec.in  diction.texi  en  getopt.c  getopt.h
getopt1.c  getopt_int.h  INSTALL  install-sh  Makefile.in  misc.c
misc.h  NEWS  nl  nl.po  README  sentence.c  sentence.h  style.1.in
style.c  test

In it, we see a number of files. Programs belonging to the GNU Project, as well as many others, will supply the documentation files README, INSTALL, NEWS, and COPYING. These files contain the description of the program, information on how to build and install it, and its licensing terms. It is always a good idea to carefully read the README and INSTALL files before attempting to build the program.


The other interesting files in this directory are the ones ending with .c and .h:

[me@linuxbox diction-1.11]$ ls *.c
diction.c  getopt.c  getopt1.c  misc.c  sentence.c  style.c
[me@linuxbox diction-1.11]$ ls *.h
getopt.h  getopt_int.h  misc.h  sentence.h

The .c files contain the programs supplied by the package, split into modules. It is common practice for large programs to be broken into smaller, easier to manage pieces. The source code files are ordinary text and can be examined with less:

[me@linuxbox diction-1.11]$ less diction.c

The .h files are known as header files. These, too, are ordinary text. Header files contain descriptions of the routines included in a source code file or library. In order for the compiler to connect the modules, it must receive a description of all the modules needed to complete the entire program. Near the beginning of the diction.c file, we see this line:

#include "getopt.h"

This instructs the compiler to read the file getopt.h as it reads the source code in diction.c in order to "know" what's in getopt.c. The getopt.c file supplies routines that are shared by more than one program in the package.

Above the include statement for getopt.h, we see some other include statements that use angle brackets, such as these:

#include <stdio.h>
#include <stdlib.h>

These also refer to header files, but they refer to header files that live outside the current source tree. They are supplied by the system to support the compilation of every program. If we look in /usr/include, we can see them:

[me@linuxbox diction-1.11]$ ls /usr/include


Building The Program
Most programs build with a simple, two-command sequence:

./configure
make

The configure program is a shell script which is supplied with the source tree. Its job is to analyze the build environment. Most source code is designed to be portable. That is, it is designed to build on more than one kind of Unix-like system. But to do that, the source code may need to undergo slight adjustments during the build to accommodate differences between systems. configure also checks to see that necessary external tools and components are installed. Let's run configure. Since configure is not located where the shell normally expects programs to be located, we must explicitly tell the shell its location by prefixing the command with ./ to indicate that the program is located in the current working directory:

[me@linuxbox diction-1.11]$ ./configure

configure will output a lot of messages as it tests and configures the build. When it finishes, the output will end something like this:

checking libintl.h usability... yes
checking for libintl.h... yes
checking for library containing gettext...
configure: creating ./config.status
config.status: creating Makefile
config.status: creating diction.1
config.status: creating diction.texi
config.status: creating diction.spec
config.status: creating style.1
config.status: creating config.h
[me@linuxbox diction-1.11]$

What's important here is that there are no error messages. If there were, the configuration failed, and the program will not build until the errors are corrected.

We see configure created several new files in our source directory. The most important one is Makefile. Makefile is a configuration file that instructs the make program exactly how to build the program. Without it, make will refuse to run. Makefile is an ordinary text file, so we can view it:

[me@linuxbox diction-1.11]$ less Makefile

The first part of the makefile defines variables that are substituted in later sections of the makefile. For example, we see the line:

CC=             gcc

which defines the C compiler to be gcc. Later in the makefile, we see one instance where it gets used:

diction:        diction.o sentence.o misc.o getopt.o getopt1.o
                $(CC) -o diction diction.o sentence.o misc.o getopt.o \
                getopt1.o $(LIBS)

A substitution is performed here, and the value $(CC) is replaced by gcc at run time.

Most of the makefile consists of lines that define a target (in this case the executable file diction) and the files on which it is dependent. The remaining lines describe the command(s) needed to create the target from its components. We see in this example that the executable file diction (one of the end products) depends on the existence of diction.o, sentence.o, misc.o, getopt.o, and getopt1.o. Later in the makefile, we see definitions of each of these as targets:

diction.o:      diction.c config.h getopt.h misc.h sentence.h
getopt.o:       getopt.c getopt.h getopt_int.h
getopt1.o:      getopt1.c getopt.h getopt_int.h
misc.o:         misc.c config.h misc.h
sentence.o:     sentence.c config.h misc.h sentence.h
style.o:        style.c config.h misc.h sentence.h

However, we don't see any command specified for them. This is handled by a general rule, elsewhere in the file, that describes the command used to compile any .c file into a .o file. These all seem very complicated. Why not simply list all the steps to compile the parts and be done with it? The answer to this will become clear in a moment.
To build the program, we enter the make command:

[me@linuxbox diction-1.11]$ make

The make program will run, using the contents of Makefile to guide its actions. It will produce a lot of messages.

When it finishes, we will see that all the targets are now present in our directory:

[me@linuxbox diction-1.11]$ ls
config.guess   diction          getopt.c       Makefile      sentence.c
config.h       diction.1        getopt.h       Makefile.in   sentence.h
config.h.in    diction.c        getopt.o       misc.c        sentence.o
config.log     diction.o        getopt1.c      misc.h        style
config.status  diction.spec     getopt1.o      misc.o        style.1
configure      diction.texi     getopt_int.h   nl.mo         style.c
COPYING        de.mo            install-sh     nl.po         style.o
de.po          en_GB.mo         INSTALL        README        test

Among the files, we see diction and style, the executable programs that we set out to build, along with the object (.o) files produced by the compiler.

Congratulations are in order! We just compiled our first programs from source code!

But just out of curiosity, let's run make again:

[me@linuxbox diction-1.11]$ make
make: Nothing to be done for `all'.

It only produces this strange message. What's going on? Why didn't it build the program again? Ah, this is the magic of make. Rather than simply building everything again, make only builds what needs building. With all of the targets present, make determined that there was nothing to do.
We can demonstrate this by deleting one of the targets and running make again to see what it does:

[me@linuxbox diction-1.11]$ rm getopt.o
[me@linuxbox diction-1.11]$ make

We see that make rebuilds it and re-links the diction and style programs, since they depend on the missing module. We have also discovered another important behavior of make: it keeps targets up to date. make insists that targets be newer than their dependencies. This makes perfect sense, as a programmer will often update a bit of source code and then use make to build a new version of the finished product. make ensures that everything that needs building based on the updated code is built.
If we use the touch command to update the modification time of a dependency, we can watch make respond:

[me@linuxbox diction-1.11]$ ls -l diction getopt.c
-rwxr-xr-x 1 me me 37164 2009-03-05 06:14 diction
-rw-r--r-- 1 me me 33125 2007-03-30 17:45 getopt.c
[me@linuxbox diction-1.11]$ touch getopt.c
[me@linuxbox diction-1.11]$ ls -l diction getopt.c
-rwxr-xr-x 1 me me 37164 2009-03-05 06:14 diction
-rw-r--r-- 1 me me 33125 2009-03-05 06:23 getopt.c
[me@linuxbox diction-1.11]$ make
After make runs, we see that it has restored the target to being newer than the dependency:

[me@linuxbox diction-1.11]$ ls -l diction getopt.c
-rwxr-xr-x 1 me me 37164 2009-03-05 06:24 diction
-rw-r--r-- 1 me me 33125 2009-03-05 06:23 getopt.c

The ability of make to intelligently build only what needs building is a great benefit to programmers. While the time savings may not be very apparent with our small project, it is very significant with larger projects.
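The "newer than" comparison that make performs can be reproduced with bash's -nt file test operator. This is a sketch using two hypothetical files, dep and target, standing in for a source file and a build product:

```shell
cd "$(mktemp -d)"
touch dep            # stands in for a source file such as getopt.c
sleep 1
touch target         # stands in for a build product such as diction

# The target is newer than its dependency, so nothing needs to be done.
[ target -nt dep ] && echo "target is up to date"

sleep 1
touch dep            # simulate editing the source file

# Now the dependency is newer, so make would rebuild the target.
[ dep -nt target ] && echo "target must be rebuilt"
```

This is exactly the timestamp logic behind make's decision to rebuild getopt.o after we touched getopt.c.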


Installing The Program
Well-packaged source code will often include a special make target called install. This target will install the final product in a system directory for use. Usually, this directory is /usr/local/bin, the traditional location for locally built software. However, this directory is not normally writable by ordinary users, so we must become the superuser to perform the installation:

[me@linuxbox diction-1.11]$ sudo make install

After we perform the installation, we can check that the program is ready to go:

[me@linuxbox diction-1.11]$ which diction
/usr/local/bin/diction
[me@linuxbox diction-1.11]$ man diction

And there we have it!

Summing Up
In this chapter, we have seen how three simple commands:

./configure
make
make install

can be used to build many source code packages. We have also seen the important role that make plays in the maintenance of programs. The make program can be used for any task that needs to maintain a target/dependency relationship, not just for compiling source code.

Further Reading

The Wikipedia has good articles on compilers and the make program:
http://en.wikipedia.org/wiki/Compiler
http://en.wikipedia.org/wiki/Make_(software)

The GNU Make Manual:
http://www.gnu.org/software/make/manual/html_node/index.html
24 – Writing Your First Script
In the preceding chapters, we have assembled an arsenal of command line tools. While these tools can solve many kinds of computing problems, we are still limited to manually using them one by one on the command line. Wouldn't it be great if we could get the shell to do more of the work? We can. By joining our tools together into programs of our own design, the shell can carry out complex sequences of tasks all by itself. We can enable it to do this by writing shell scripts.

What Are Shell Scripts?
In the simplest terms, a shell script is a file containing a series of commands. The shell reads this file and carries out the commands as though they have been entered directly on the command line.

The shell is somewhat unique, in that it is both a powerful command line interface to the system and a scripting language interpreter. As we will see, most of the things that can be done on the command line can be done in scripts, and most of the things that can be done in scripts can be done on the command line.

We have covered many shell features, but we have focused on those features most often used directly on the command line. The shell also provides a set of features usually (but not always) used when writing programs.

How To Write A Shell Script
To successfully create and run a shell script, we need to do three things:

1. Write a script. Shell scripts are ordinary text files. So we need a text editor to write them. The best text editors will provide syntax highlighting, allowing us to see a color-coded view of the elements of the script. Syntax highlighting will help us spot certain kinds of common errors. vim, gedit, kate, and many other editors are good candidates for writing scripts.

2. Make the script executable. The system is rather fussy about not letting any old text file be treated as a program, and for good reason! We need to set the script file's permissions to allow execution.

3. Put the script somewhere the shell can find it. The shell automatically searches certain directories for executable files when no explicit pathname is specified. For maximum convenience, we will place our scripts in these directories.


Script File Format
In keeping with programming tradition, we'll create a "hello world" program to demonstrate an extremely simple script. So let's fire up our text editors and enter the following script:

#!/bin/bash

# This is our first script.

echo 'Hello World!'

The last line of our script is pretty familiar, just an echo command with a string argument. The second line is also familiar. It looks like a comment that we have seen used in many of the configuration files we have examined and edited. One thing about comments in shell scripts is that they may also appear at the ends of lines. Like many things, this works on the command line, too:

[me@linuxbox ~]$ echo 'Hello World!' # This is a comment too
Hello World!

Though comments are of little use on the command line, they will work.

The first line of our script is a little mysterious. It looks as if it should be a comment, since it starts with #, but it looks too purposeful to be just that. The #! character sequence is, in fact, a special construct called a shebang. The shebang is used to tell the system the name of the interpreter that should be used to execute the script that follows. Every shell script should include this as its first line.

Let's save our script file as hello_world.

Executable Permissions
The next thing we have to do is make our script executable. This is easily done using chmod:

[me@linuxbox ~]$ ls -l hello_world
-rw-r--r-- 1 me   me   63 2009-03-07 10:10 hello_world
[me@linuxbox ~]$ chmod 755 hello_world
[me@linuxbox ~]$ ls -l hello_world
-rwxr-xr-x 1 me   me   63 2009-03-07 10:10 hello_world

There are two common permission settings for scripts; 755 for scripts that everyone can execute, and 700 for scripts that only the owner can execute.
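The two settings can be inspected in octal with stat (a GNU coreutils tool). The file my_script below is a hypothetical stand-in:

```shell
cd "$(mktemp -d)"
touch my_script

chmod 755 my_script       # rwxr-xr-x: everyone may read and execute
stat -c %a my_script      # prints 755

chmod 700 my_script       # rwx------: only the owner may read and execute
stat -c %a my_script      # prints 700
```

The symbolic forms chmod u=rwx,go=rx and chmod u=rwx,go= are equivalent to 755 and 700, respectively.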


Script File Location
With the permissions set, we can now execute our script:

[me@linuxbox ~]$ ./hello_world
Hello World!

In order for the script to run, we must precede the script name with an explicit path. If we don't, we get this:

[me@linuxbox ~]$ hello_world
bash: hello_world: command not found

Why is this? What makes our script different from other programs? As it turns out, nothing. There is nothing wrong with our script. Its location is the problem. To recap, the system searches a list of directories each time it needs to find an executable program, if no explicit path is specified. This is how the system knows to execute /bin/ls when we type ls at the command line. The /bin directory is one of the directories that the system automatically searches. The list of directories is held within an environment variable named PATH. The PATH variable contains a colon-separated list of directories to be searched. If our script were located in any of the directories in the list, our problem would be solved. Most Linux distributions configure the PATH variable to contain a bin directory in the user's home directory, to allow users to execute their own programs. So if we create a ~/bin directory and move our script into it, it should start to work like other programs:

[me@linuxbox ~]$ mkdir bin
[me@linuxbox ~]$ mv hello_world bin
[me@linuxbox ~]$ hello_world
Hello World!

If the PATH variable does not contain the directory, we can easily add it by including this line in our .bashrc file:

export PATH=~/bin:"$PATH"

After this change is made, it will take effect in each new terminal session. To apply the change to the current terminal session, we must have the shell re-read the .bashrc file. This can be done by "sourcing" it:

[me@linuxbox ~]$ . .bashrc

The dot (.) command is a synonym for the source command, a shell builtin which reads a specified file of shell commands and treats it like input from the keyboard. Since the .bashrc file is read each time a new terminal session begins, the change will also be in effect in all future sessions.
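The whole mechanism can be seen end to end in one short session. The sketch below uses a temporary directory standing in for ~/bin, so it leaves the real home directory untouched:

```shell
# Create a stand-in for ~/bin and place a copy of the script inside it.
dir=$(mktemp -d)
printf '#!/bin/bash\n# This is our first script.\necho '\''Hello World!'\''\n' > "$dir/hello_world"
chmod 755 "$dir/hello_world"

# Add the directory to PATH, just as the .bashrc line adds ~/bin.
export PATH="$dir:$PATH"

# The shell can now find the script without an explicit path.
hello_world
```

Because the new directory is listed first in PATH, it is searched before the system directories, which is also why a personal ~/bin can be used to override system commands.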


Good Locations For Scripts
The ~/bin directory is a good place to put scripts intended for personal use. If we write a script that everyone on a system is allowed to use, the traditional location is /usr/local/bin. Scripts intended for use by the system administrator are often located in /usr/local/sbin. In most cases, locally supplied software, whether scripts or compiled programs, should be placed in the /usr/local hierarchy and not in /bin or /usr/bin. These directories are specified by the Linux Filesystem Hierarchy Standard to contain only files supplied and maintained by the Linux distributor.

More Formatting Tricks
One of the key goals of serious script writing is ease of maintenance; that is, the ease with which a script may be modified by its author or others to adapt it to changing needs. Making a script easy to read and understand is one way to facilitate easy maintenance.

Long Option Names
Many of the commands we have studied feature both short and long option names. For instance, the ls command has many options that can be expressed in either short or long form. In the interests of reduced typing, short options are preferred when entering options on the command line, but when writing scripts, long options can provide improved readability.

Indentation And Line Continuation
When employing long commands, readability can be enhanced by spreading the command over several lines. In Chapter 17, we looked at a particularly long example of the find command:

[me@linuxbox ~]$ find playground \( -type f -not -perm 0600 -exec chmod 0600 '{}' ';' \) -or \( -type d -not -perm 0700 -exec chmod 0700 '{}' ';' \)

Obviously, this command is a little hard to figure out at first glance. By using line continuations (backslash-linefeed sequences) and indentation, the same command becomes much easier to understand:

find playground \
    \( \
        -type f \
        -not -perm 0600 \
        -exec chmod 0600 '{}' ';' \
    \) \
    -or \
    \( \
        -type d \
        -not -perm 0700 \
        -exec chmod 0700 '{}' ';' \
    \)

This technique works on the command line, too, though it is seldom used, as it is very awkward to type and edit.


Configuring vim For Script Writing
The vim text editor has many, many configuration settings. There are several common options that can facilitate script writing:

:syntax on
turns on syntax highlighting. With this setting, different elements of shell syntax will be displayed in different colors when viewing a script. This is helpful for identifying certain kinds of programming errors. It looks cool, too. If you have difficulty with the command above, try :set syntax=sh instead.

:set hlsearch
turns on the option to highlight search results. Say we search for the word "echo." With this option on, each instance of the word will be highlighted.

:set tabstop=4
sets the number of columns occupied by a tab character. The default is 8 columns. Setting the value to 4 (which is a common practice) allows long lines to fit more easily on the screen.

:set autoindent
turns on the "auto indent" feature. This causes vim to indent a new line the same amount as the line just typed. This speeds up typing on many kinds of programming constructs. To stop indentation, type Ctrl-d.

These changes can be made permanent by adding these commands (without the leading colon characters) to your ~/.vimrc file.
Summing Up
In this first chapter of scripting, we looked at how scripts are written and made to execute on our system. We also saw how we may use various formatting techniques to improve the readability (and thus, the maintainability) of our scripts. In future chapters, ease of maintenance will come up again and again as a central principle in good script writing.

Further Reading

This Wikipedia article talks more about the shebang mechanism:
http://en.wikipedia.org/wiki/Shebang_(Unix)

For "Hello World" programs and examples in various programming languages, see:
http://en.wikipedia.org/wiki/Hello_world

25 – Starting A Project
Starting with this chapter, we will begin to build a program. The purpose of this project is to see how various shell features are used to create programs and, more importantly, create good programs.

The program we will write is a report generator. It will present various statistics about our system and its status, and will produce this report in HTML format, so we can view it with a web browser.

Programs are usually built up in a series of stages, with each stage adding features and capabilities. The first stage of our program will produce a very minimal HTML page that contains no system information. That will come later.

First Stage: Minimal Document
The first thing we need to know is the format of a well-formed HTML document. It looks like this:

<html>
    <head>
        <title>Page Title</title>
    </head>
    <body>
        Page body.
    </body>
</html>

If we enter this into our text editor and save the file as foo.html, we can use the following URL in Firefox to view the file:

file:///home/username/foo.html

We can write a program to output this document pretty easily. Let's start our text editor and create a new script named sys_info_page:

#!/bin/bash

# Program to output a system information page

echo "<html>"
echo "    <head>"
echo "        <title>Page Title</title>"
echo "    </head>"
echo "    <body>"
echo "        Page body."
echo "    </body>"
echo "</html>"

Our first attempt at this problem contains a shebang, a comment (always a good idea) and a sequence of echo commands, one for each line of output. After saving the file and making it executable, we'll run the program and redirect its output to the file sys_info_page.html, which we can then view with our browser:

[me@linuxbox ~]$ sys_info_page > sys_info_page.html
[me@linuxbox ~]$ firefox sys_info_page.html

When writing programs, it's always a good idea to strive for simplicity and clarity. Maintenance is easier when a program is easy to read and understand, and it also reduces the amount of typing. Our current version of the program works fine, but it could be simpler. We could combine all the echo commands into one, which will certainly make it easier to add more lines to the program's output. So, let's change our program to this:

#!/bin/bash

# Program to output a system information page

echo "<html>
    <head>
        <title>Page Title</title>
    </head>
    <body>
        Page body.
    </body>
</html>"

A quoted string may include newlines, and therefore contain multiple lines of text. The shell will keep reading the text until it encounters the closing quotation mark. It works this way on the command line, too:

[me@linuxbox ~]$ echo "<html>
>     <head>
>         <title>Page Title</title>
>     </head>
>     <body>
>         Page body.
>     </body>
> </html>"

The leading ">" character is the shell prompt contained in the PS2 shell variable. It appears whenever we type a multi-line statement into the shell. This feature is a little obscure right now, but later, when we cover multi-line programming statements, it will turn out to be quite handy.
Second Stage: Adding A Little Data
Now that our program can generate a minimal document, let's put a little data in the report. To do this, we will make the following changes:

#!/bin/bash

# Program to output a system information page

echo "<html>
    <head>
        <title>System Information Report</title>
    </head>
    <body>
        <h1>System Information Report</h1>
    </body>
</html>"

We added a page title and a heading to the body of the report.

Variables And Constants
There is an issue with our script, however. Notice how the string "System Information Report" is repeated? With our tiny script it's not a problem, but let's imagine that our script was really long and we had multiple instances of this string. If we wanted to change the title to something else, we would have to change it in multiple places, which could be a lot of work. What if we could arrange the script so that the string only appeared once and not multiple times? That would make future maintenance of the script much easier.

So, how do we create a variable? Simple, we just use it. When the shell encounters a variable, it automatically creates it. This differs from many programming languages in which variables must be explicitly declared or defined before use. The shell is very lax about this, which can lead to problems. For example, consider this scenario played out on the command line:

[me@linuxbox ~]$ foo="yes"
[me@linuxbox ~]$ echo $foo
yes
[me@linuxbox ~]$ echo $fool
[me@linuxbox ~]$

We first assign the value "yes" to the variable foo, and then display its value with echo. Next we display the value of the variable name misspelled as "fool" and get a blank result. This is because the shell happily created the variable fool when it encountered it, and gave it the default value of nothing, or empty. From this, we learn that we must pay close attention to our spelling! It's also important to understand what really happened in this example. From our previous look at how the shell performs expansions, we know that the command:

[me@linuxbox ~]$ echo $foo

undergoes parameter expansion and results in:

[me@linuxbox ~]$ echo yes

Whereas the command:

[me@linuxbox ~]$ echo $fool

expands into:

[me@linuxbox ~]$ echo

The empty variable expands into nothing! This can play havoc with commands that require arguments. Here's an example:

[me@linuxbox ~]$ foo=foo.txt
[me@linuxbox ~]$ foo1=foo1.txt
[me@linuxbox ~]$ cp $foo $fool
cp: missing destination file operand after `foo.txt'
Try `cp --help' for more information.

We assign values to two variables, foo and foo1. We then perform a cp, but misspell the name of the second argument. After expansion, the cp command is sent only one argument, though it requires two.
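The pitfall is easy to reproduce safely with printf, which makes the empty expansion visible (foo and fool as above):

```shell
foo="yes"

printf 'foo  = [%s]\n' "$foo"    # prints: foo  = [yes]
printf 'fool = [%s]\n' "$fool"   # prints: fool = [] -- the misspelling expands to nothing
```

Wrapping expansions in brackets like this is a handy debugging trick, because an empty or whitespace-only value shows up plainly between the delimiters.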

There are some rules about variable names:

1. Variable names may consist of alphanumeric characters (letters and numbers) and underscore characters.

2. The first character of a variable name must be either a letter or an underscore.

3. Spaces and punctuation symbols are not allowed.

The word "variable" implies a value that changes, and in many applications, variables are used this way. However, the variable in our application, the page title, is used as a constant. A constant is just like a variable in that it has a name and contains a value. The difference is that the value of a constant does not change. In an application that performs geometric calculations, we might define PI as a constant, and assign it the value of 3.1415, instead of using the number literally throughout our program. The shell makes no distinction between variables and constants; they are mostly for the programmer's convenience. A common convention is to use uppercase letters to designate constants and lowercase letters for true variables.
We will modify our script to comply with this convention:

#!/bin/bash

# Program to output a system information page

TITLE="System Information Report For $HOSTNAME"

echo "<html>
    <head>
        <title>$TITLE</title>
    </head>
    <body>
        <h1>$TITLE</h1>
    </body>
</html>"

We also took the opportunity to jazz up our title by adding the value of the shell variable HOSTNAME. This is the network name of the machine.

Note: The shell actually does provide a way to enforce the immutability of constants, through the use of the declare builtin command with the -r (read-only) option. Had we assigned TITLE this way:

declare -r TITLE="Page Title"

the shell would prevent any subsequent assignment to TITLE. This feature is rarely used, but it exists for very formal scripts.

Assigning Values To Variables And Constants
As we have seen, variables are assigned values this way:

variable=value

where variable is the name of the variable and value is a string. Unlike some other programming languages, the shell does not care about the type of data assigned to a variable; it treats them all as strings. You can force the shell to restrict the assignment to integers by using the declare command with the -i option, but, like setting variables as read-only, this is rarely done.

Note that in an assignment, there must be no spaces between the variable name, the equals sign, and the value. So what can the value consist of? Anything that we can expand into a string:

a=z                     # Assign the string "z" to variable a.
b="a string"            # Embedded spaces must be within quotes.
c="a string and $b"     # Other expansions such as variables can be
                        # expanded into the assignment.
d=$(ls -l foo.txt)      # Results of a command.
e="\t\ta string\n"      # Escape sequences such as tabs and newlines.

During expansion, variable names may be surrounded by optional curly braces "{}". This is useful in cases where a variable name becomes ambiguous due to its surrounding context. Here, we try to change the name of a file from myfile to myfile1, using a variable:

[me@linuxbox ~]$ filename="myfile"
[me@linuxbox ~]$ touch $filename
[me@linuxbox ~]$ mv $filename $filename1
mv: missing destination file operand after `myfile'

This attempt fails because the shell interprets the second argument of the mv command as a new (and empty) variable. The problem can be overcome this way:

[me@linuxbox ~]$ mv $filename ${filename}1
[me@linuxbox ~]$ ls
myfile1

By adding the surrounding braces, the shell no longer interprets the trailing 1 as part of the variable name.

We'll take this opportunity to add some data to our report, namely the date and time the report was created and the username of the creator:

#!/bin/bash

# Program to output a system information page

TITLE="System Information Report For $HOSTNAME"
CURRENT_TIME=$(date +"%x %r %Z")
TIMESTAMP="Generated $CURRENT_TIME, by $USER"

echo "<html>
    <head>
        <title>$TITLE</title>
    </head>
    <body>
        <h1>$TITLE</h1>
        <p>$TIMESTAMP</p>
    </body>
</html>"

Here Documents
We've looked at two different methods of outputting our text, both using the echo command. There is a third way called a here document or here script. A here document is an additional form of I/O redirection in which we embed a body of text into our script and feed it into the standard input of a command. It works like this:

command << token
text
token

where command is the name of a command that accepts standard input and token is a string used to indicate the end of the embedded text. We'll modify our script to use a here document:

#!/bin/bash

# Program to output a system information page

TITLE="System Information Report For $HOSTNAME"
CURRENT_TIME=$(date +"%x %r %Z")
TIMESTAMP="Generated $CURRENT_TIME, by $USER"

cat << _EOF_
<html>
    <head>
        <title>$TITLE</title>
    </head>
    <body>
        <h1>$TITLE</h1>
        <p>$TIMESTAMP</p>
    </body>
</html>
_EOF_

Instead of using echo, our script now uses cat and a here document. The string _EOF_ (meaning end of file, a common convention) was selected as the token, and marks the end of the embedded text. Note that the token must appear alone on its line and that there must not be trailing spaces on the line.

So what's the advantage of using a here document? It's mostly the same as echo, except that, by default, single and double quotes within here documents lose their special meaning to the shell. Here is a command line example:
[me@linuxbox ~]$ foo="some text"
[me@linuxbox ~]$ cat << _EOF_
> $foo
> "$foo"
> '$foo'
> \$foo
> _EOF_
some text
"some text"
'some text'
$foo

As we can see, the shell pays no attention to the quotation marks. It treats them as ordinary characters. This allows us to embed quotes freely within a here document. This could turn out to be handy for our report program.

Here documents can be used with any command that accepts standard input. In this example, a here document is used to pass a series of commands to the ftp program in order to retrieve a file from a remote FTP server:

#!/bin/bash

# Script to retrieve a file via FTP

FTP_SERVER=ftp.nl.debian.org
FTP_PATH=/debian/dists/lenny/main/installer-i386/current/images/cdrom
REMOTE_FILE=debian-cd_info.tar.gz

ftp -n << _EOF_
open $FTP_SERVER
user anonymous me@linuxbox
cd $FTP_PATH
hash
get $REMOTE_FILE
bye
_EOF_
ls -l $REMOTE_FILE

If we change the redirection operator from "<<" to "<<-", the shell will ignore leading tab characters in the here document. This allows a here document to be indented, which can improve readability:

#!/bin/bash

# Script to retrieve a file via FTP

FTP_SERVER=ftp.nl.debian.org
FTP_PATH=/debian/dists/lenny/main/installer-i386/current/images/cdrom
REMOTE_FILE=debian-cd_info.tar.gz

ftp -n <<- _EOF_
	open $FTP_SERVER
	user anonymous me@linuxbox
	cd $FTP_PATH
	hash
	get $REMOTE_FILE
	bye
	_EOF_

ls -l $REMOTE_FILE
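Because a here document is simply standard input, any filter can serve as the command. A minimal illustration using wc to count the lines of the embedded text:

```shell
# wc reads the embedded body from standard input and counts its lines.
wc -l << _EOF_
line one
line two
line three
_EOF_
```

This prints 3, exactly as if the three lines had been piped in from a file.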

Summing Up
In this chapter, we started a project that will carry us through the process of building a successful script. We introduced the concept of variables and constants and how they can be employed. They are the first of many applications we will find for parameter expansion. We also looked at how to produce output from our script, and various methods for embedding blocks of text.

Further Reading

For more information about HTML, see the following articles and tutorials:
http://en.wikipedia.org/wiki/Html
http://en.wikipedia.org/wiki/HTML_Programming
http://html.net/


26 – Top-Down Design
As programs get larger and more complex, they become more difficult to design, code and maintain. As with any large project, it is often a good idea to break large, complex tasks into a series of small, simple tasks. Let's imagine that we are trying to describe a common, everyday task, going to the market to buy food, to a person from Mars. We might describe the overall process as the following series of steps:

1. Get in car.
2. Drive to market.
3. Park car.
4. Enter market.
5. Purchase food.
6. Return to car.
7. Drive home.
8. Park car.
9. Enter house.

However, a person from Mars is likely to need more detail. We could further break down the subtask "Park car" into this series of steps:

1. Find parking space.
2. Drive car into space.
3. Turn off motor.
4. Set parking brake.
5. Exit car.
6. Lock car.

The "Turn off motor" subtask could further be broken down into steps including "Turn off ignition," "Remove ignition key," and so on, until every step of the entire process of going to the market has been fully defined.

This process of identifying the top-level steps and developing increasingly detailed views of those steps is called top-down design. This technique allows us to break large complex tasks into many small, simple tasks. Top-down design is a common method of designing programs and one that is well suited to shell programming in particular.

In this chapter, we will use top-down design to further develop our report-generator script.

Shell Functions
Our script currently performs the following steps to generate the HTML document:

1. Open page.
2. Open page header.
3. Set page title.
4. Close page header.
5. Open page body.
6. Output page heading.
7. Output timestamp.
8. Close page body.
9. Close page.
For our next stage of development, we will add some system information tasks to our script. These will include:

- System uptime and load. This is the amount of time since the last shutdown or reboot, and the average number of tasks currently running on the processor over several time intervals.

- Disk space. The overall use of space on the system's storage devices.

- Home space. The amount of storage space being used by each user.


If we had a command for each of these tasks, we could add them to our script simply through command substitution:

#!/bin/bash

# Program to output a system information page

TITLE="System Information Report For $HOSTNAME"
CURRENT_TIME=$(date +"%x %r %Z")
TIMESTAMP="Generated $CURRENT_TIME, by $USER"

cat << _EOF_
<html>
    <head>
        <title>$TITLE</title>
    </head>
    <body>
        <h1>$TITLE</h1>
        <p>$TIMESTAMP</p>
        $(report_uptime)
        $(report_disk_space)
        $(report_home_space)
    </body>
</html>
_EOF_

We could create these additional commands two ways. We could write three separate scripts and place them in a directory listed in our PATH, or we could embed the scripts within our program as shell functions. As we have mentioned before, shell functions are "mini-scripts" that are located inside other scripts and can act as autonomous programs. Shell functions have two syntactic forms. First, the more formal form:

function name {
    commands
    return
}

and the simpler (and generally preferred) form:

name () {
    commands
    return
}

where name is the name of the function and commands is a series of commands contained within the function. Both forms are equivalent and may be used interchangeably. Below we see a script that demonstrates the use of a shell function:

 1  #!/bin/bash
 2
 3  # Shell function demo
 4
 5  function funct {
 6      echo "Step 2"
 7      return
 8  }
 9
10  # Main program starts here
11
12  echo "Step 1"
13  funct
14  echo "Step 3"

As the shell reads the script, it passes over lines 1 through 11, as those lines consist of comments and the function definition. Execution begins at line 12, with an echo command. Line 13 calls the shell function funct, and the shell executes the function just as it would any other command. Program control then moves to line 6, and the second echo command is executed. Line 7 is executed next. Its return command terminates the function and returns control to the program at the line following the function call (line 14), and the final echo command is executed. Note that for function calls to be recognized as shell functions and not interpreted as the names of external programs, shell function definitions must appear in the script before they are called.

We’ll add minimal shell function definitions to our script:
#!/bin/bash
# Program to output a system information page
TITLE="System Information Report For $HOSTNAME"
CURRENT_TIME=$(date +"%x %r %Z")
TIMESTAMP="Generated $CURRENT_TIME, by $USER"
report_uptime () {
return
}
report_disk_space () {
return
}
report_home_space () {
return
}
cat << _EOF_


$TITLE


$TITLE


$TIMESTAMP


$(report_uptime)
$(report_disk_space)
$(report_home_space)

375

26 – Top-Down Design


_EOF_

Shell function names follow the same rules as variables
...
The return command (which is optional) satisfies the requirement
...
In the scripts we have written so far, all the variables (including constants) have been global variables. Global variables maintain their existence throughout the program. This is fine for many things, but it can sometimes complicate the use of shell functions. Inside shell functions, it is often desirable to have local variables. Local variables are only accessible within the shell function in which they are defined, and cease to exist once the shell function terminates.

Having local variables allows the programmer to use variables with names that may already exist, either in the script globally or in other shell functions, without having to worry about potential name conflicts. Local variables are defined by preceding the variable name with the word local. This creates a variable that is local to the shell function in which it is defined.
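A minimal script demonstrating this behavior (a sketch in the spirit of the local-vars example the results below come from):

```shell
#!/bin/bash
# local-vars: demonstrate global vs. local variables
foo=0                       # global variable foo

funct_1 () {
    local foo               # variable foo local to funct_1
    foo=1
    echo "funct_1: foo = $foo"
}

funct_2 () {
    local foo               # variable foo local to funct_2
    foo=2
    echo "funct_2: foo = $foo"
}

echo "global:  foo = $foo"
funct_1
echo "global:  foo = $foo"
funct_2
echo "global:  foo = $foo"
```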
When we run this script, we see the results:

[me@linuxbox ~]$ local-vars
global:  foo = 0
funct_1: foo = 1
global:  foo = 0
funct_2: foo = 2
global:  foo = 0

We see that the assignment of values to the local variable foo within both shell functions has no effect on the value of foo defined outside the functions. This ability allows shell functions to be written so that they remain independent of each other and of the script in which they appear. This is very valuable, as it helps prevent one part of a program from interfering with another. It also allows shell functions to be portable. That is, they may be cut and pasted from script to script, as needed.

Keep Scripts Running
While developing our program, it is useful to keep it in a runnable state. By doing this, and testing frequently, we can detect errors early in the development process. This makes debugging problems much easier. For example, if we run the program, make a small change, then run the program again and find a problem, it's very likely that the most recent change is the source of the problem. By adding empty functions, called stubs in programmer-speak, we can verify the logical flow of our program at an early stage. When constructing a stub, it's a good idea to include something that provides feedback to the programmer, which shows the logical flow is being carried out.
If we change the functions to include some feedback:

report_uptime () {
    echo "Function report_uptime executed."
    return
}
report_disk_space () {
    echo "Function report_disk_space executed."
    return
}
report_home_space () {
    echo "Function report_home_space executed."
    return
}

and run the script again, the page now contains these lines:

Function report_uptime executed.
Function report_disk_space executed.
Function report_home_space executed.




we now see that, in fact, our three functions are being executed.

With our stubs in place and working, it's time to flesh out some of the function code.
First, the report_uptime function:
report_uptime () {
    cat <<- _EOF_
        <H2>System Uptime</H2>
        <PRE>$(uptime)</PRE>
        _EOF_
    return
}

It's pretty straightforward. We use a here document to output a section header and the output of the uptime command, surrounded by <PRE> tags to preserve the formatting of the command.
The report_disk_space function is similar:
report_disk_space () {
    cat <<- _EOF_
        <H2>Disk Space Utilization</H2>
        <PRE>$(df -h)</PRE>
        _EOF_
    return
}

This function uses the df -h command to determine the amount of disk space. Likewise, we could construct the report_home_space function using the du command to measure home directory usage. This, however, is not a complete solution to the problem. While it will work on some systems, it will not work on others. The reason is that many systems set the permissions of home directories to prevent them from being world-readable, which is a reasonable security measure. On these systems, reporting on all users' home directories will only work if our script is run with superuser privileges. A better solution would be to have the script adjust its behavior according to the privileges of the user.
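Anticipating that solution, here is a hedged sketch of what such a privilege-aware function could look like (using plain-text output rather than the script's HTML, to keep the example short; the exact form the book settles on comes later):

```shell
# report_home_space (sketch): adjust behavior to the user's privileges.
# id -u prints the numeric user ID of the effective user; 0 means superuser.
report_home_space () {
    if [ "$(id -u)" -eq 0 ]; then
        echo "Home Space Utilization (All Users)"
        du -sh /home/* 2> /dev/null
    else
        echo "Home Space Utilization ($USER)"
        du -sh "$HOME" 2> /dev/null
    fi
}

report_home_space
```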


Shell Functions In Your .bashrc File
Shell functions make excellent replacements for aliases. Aliases are very limited in the kind of commands and shell features they support, whereas shell functions allow anything that can be scripted. For example, if we liked the report_disk_space shell function that we developed for our script, we could create a similar function named ds for our .bashrc file:

ds () {
    echo "Disk Space Utilization For $HOSTNAME"
    df -h
}

Summing Up
In this chapter, we have introduced a common method of program design called top-down design, and we have seen how shell functions are used to build the stepwise refinement that it requires. We have also seen how local variables can be used to make shell functions independent from one another and from the program in which they are placed. This makes it possible for shell functions to be written in a portable manner and to be reusable by allowing them to be placed in multiple programs; a great time saver.
Further Reading
The Wikipedia has many articles on software design philosophy. Here are a couple of good ones:

http://en.wikipedia.org/wiki/Top-down_design
http://en.wikipedia.org/wiki/Subroutines

27 – Flow Control: Branching With if
In the last chapter, we were presented with a problem. How can we make our report-generator script adapt to the privileges of the user running it? The solution will require us to find a way to "change directions" within our script, based on the results of a test. In programming terms, we need the program to branch.

Let's consider a simple example of logic expressed in pseudocode, a simulation of a computer language intended for human consumption:

X = 5
If X = 5, then:
    Say "X equals 5."
Otherwise:
    Say "X is not equal to 5."

This is an example of a branch. Based on the condition, "Does X = 5?" do one thing, "Say X equals 5," otherwise do another thing, "Say X is not equal to 5."
"
else
echo "x does not equal 5
...
Once, with the value of x set to 5,
which results in the string “equals 5” being output, and the second time with the value of
x set to 0, which results in the string “does not equal 5” being output
The if statement has the following syntax:

if commands; then
    commands
[elif commands; then
    commands...]
[else
    commands]
fi

where commands is a list of commands.
This is a little confusing at first glance. But before we can clear this up, we have to look at how the shell evaluates the success or failure of a command.

Exit Status
Commands (including the scripts and shell functions we write) issue a value to the system when they terminate, called an exit status. This value, which is an integer in the range of 0 to 255, indicates the success or failure of the command's execution. By convention, a value of zero indicates success, and any other value indicates failure. The shell provides a parameter, $?, that we can use to examine the exit status of the most recent command. If we run a command such as ls on an existing directory and then display $?, we see that it is zero; the command executed successfully. If we execute the ls command a second time on a nonexistent file, producing an error, and examine the parameter $? again, it contains a non-zero value, indicating that the command encountered an error.
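A small demonstration of examining $? (the paths here are only examples; the specific non-zero value a failing command returns varies by command and system):

```shell
ls -d / > /dev/null                  # a command that succeeds
status_ok=$?
echo "exit status: $status_ok"       # 0 indicates success

ls -d /no/such/dir 2> /dev/null      # a command that fails
status_bad=$?
echo "exit status: $status_bad"      # non-zero indicates failure
```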

Some commands use different exit status values to provide diagnostics for errors, while many commands simply exit with a value of one when they fail. Man pages often include a section entitled "Exit Status," describing what codes are used. However, a zero always indicates success.

The shell provides two extremely simple builtin commands that do nothing except terminate with either a zero or one exit status. The true command always executes successfully, and the false command always executes unsuccessfully:
[me@linuxbox ~]$ true
[me@linuxbox ~]$ echo $?
0
[me@linuxbox ~]$ false
[me@linuxbox ~]$ echo $?
1

We can use these commands to see how the if statement works. What the if statement really does is evaluate the success or failure of commands:

[me@linuxbox ~]$ if true; then echo "It's true."; fi
It's true.
[me@linuxbox ~]$ if false; then echo "It's true."; fi
[me@linuxbox ~]$

The command echo "It's true." is executed when the command following if executes successfully, and is not executed when the command following if executes unsuccessfully. If a list of commands follows if, the last command in the list is evaluated:

[me@linuxbox ~]$ if false; true; then echo "It's true."; fi
It's true.
[me@linuxbox ~]$ if true; false; then echo "It's true."; fi
[me@linuxbox ~]$
test
By far, the command most frequently used with if is test. The test command performs a variety of checks and comparisons. It has two equivalent forms:

test expression

and the more popular:

[ expression ]

where expression is an expression that is evaluated as either true or false. The test command returns an exit status of zero when the expression is true and a status of one when the expression is false.

File Expressions
The following expressions are used to evaluate the status of files:


Table 27-1: test File Expressions

Expression              Is True If:
file1 -ef file2         file1 and file2 have the same inode numbers (the two filenames refer to the same file by hard linking).
file1 -nt file2         file1 is newer than file2.
file1 -ot file2         file1 is older than file2.
-b file                 file exists and is a block-special (device) file.
-c file                 file exists and is a character-special (device) file.
-d file                 file exists and is a directory.
-e file                 file exists.
-f file                 file exists and is a regular file.
-g file                 file exists and is set-group-ID.
-k file                 file exists and has its "sticky bit" set.
-L file                 file exists and is a symbolic link.
-O file                 file exists and is owned by the effective user ID.
-r file                 file exists and is readable (has read permission for the effective user).
-s file                 file exists and has a length greater than zero.
-t fd                   fd is a file descriptor directed to/from the terminal. This can be used to determine whether standard input/output/error is being redirected.
-u file                 file exists and is setuid.
-w file                 file exists and is writable (has write permission for the effective user).
-x file                 file exists and is executable (has execute/search permission for the effective user).
Here we have a script that demonstrates some of the file expressions:

#!/bin/bash
# test-file: Evaluate the status of a file
FILE=~/.bashrc
if [ -e "$FILE" ]; then
    if [ -f "$FILE" ]; then
        echo "$FILE is a regular file."
    fi
    if [ -d "$FILE" ]; then
        echo "$FILE is a directory."
    fi
    if [ -r "$FILE" ]; then
        echo "$FILE is readable."
    fi
    if [ -w "$FILE" ]; then
        echo "$FILE is writable."
    fi
    if [ -x "$FILE" ]; then
        echo "$FILE is executable/searchable."
    fi
else
    echo "$FILE does not exist"
    exit 1
fi
exit
There are two interesting things to note about this script. First, notice how the parameter $FILE is quoted within the expressions. This is not required, but is a defense against the parameter being empty. If the parameter expansion of $FILE were to result in an empty value, it would cause an error (the operators would be interpreted as non-null strings rather than operators). Using the quotes around the parameter ensures that the operator is always followed by a string, even if the string is empty. Second, notice the presence of the exit commands near the end of the script. The exit command accepts a single, optional argument, which becomes the script's exit status. When no argument is passed, the exit status defaults to the exit status of the last command executed. Using exit in this way allows the script to indicate failure if $FILE expands to the name of a nonexistent file. The exit command appearing on the last line of the script is there as a formality. When a script "runs off the end" (reaches end of file), it terminates with an exit status of the last command executed by default, anyway.

Similarly, shell functions can return an exit status by including an integer argument to the return command. If we were to convert the script above to a shell function to include it in a larger program, we could replace the exit commands with return statements and get the desired behavior:
test_file () {
    # test-file: Evaluate the status of a file
    FILE=~/.bashrc
    if [ -e "$FILE" ]; then
        if [ -f "$FILE" ]; then
            echo "$FILE is a regular file."
        fi
        if [ -d "$FILE" ]; then
            echo "$FILE is a directory."
        fi
        if [ -r "$FILE" ]; then
            echo "$FILE is readable."
        fi
        if [ -w "$FILE" ]; then
            echo "$FILE is writable."
        fi
        if [ -x "$FILE" ]; then
            echo "$FILE is executable/searchable."
        fi
    else
        echo "$FILE does not exist"
        return 1
    fi
}

String Expressions
The following expressions are used to evaluate strings:

Table 27-2: test String Expressions

Expression                  Is True If:
string                      string is not null.
-n string                   The length of string is greater than zero.
-z string                   The length of string is zero.
string1 = string2
string1 == string2          string1 and string2 are equal. Single or double equal signs may be used, but the use of double equal signs is greatly preferred.
string1 != string2          string1 and string2 are not equal.
string1 > string2           string1 sorts after string2.
string1 < string2           string1 sorts before string2.

Warning: the > and < expression operators must be quoted (or escaped with a backslash) when used with test. If they are not, they will be interpreted by the shell as redirection operators, with potentially destructive results. Also note that while the bash documentation states that the sorting order conforms to the collation order of the current locale, it does not. ASCII (POSIX) order is used in versions of bash up to and including 4.0.
Here is a script that incorporates string expressions:

#!/bin/bash
# test-string: evaluate the value of a string
ANSWER=maybe
if [ -z "$ANSWER" ]; then
    echo "There is no answer." >&2
    exit 1
fi
if [ "$ANSWER" = "yes" ]; then
    echo "The answer is YES."
elif [ "$ANSWER" = "no" ]; then
    echo "The answer is NO."
elif [ "$ANSWER" = "maybe" ]; then
    echo "The answer is MAYBE."
else
    echo "The answer is UNKNOWN."
fi

In this script, we evaluate the constant ANSWER.
We first determine if the string is empty. If it is, we terminate the script and set the exit status to one. Notice the redirection that is applied to the echo command. This redirects the error message "There is no answer." to standard error, which is the "proper" thing to do with error messages. If the string is not empty, we evaluate the value of the string to see if it is equal to either "yes," "no," or "maybe." We do this by using elif, which is short for "else if." By using elif, we are able to construct a more complex logical test.

Integer Expressions
The following expressions are used with integers:

Table 27-3: test Integer Expressions

Expression                  Is True If:
integer1 -eq integer2       integer1 is equal to integer2.
integer1 -ne integer2       integer1 is not equal to integer2.
integer1 -le integer2       integer1 is less than or equal to integer2.
integer1 -lt integer2       integer1 is less than integer2.
integer1 -ge integer2       integer1 is greater than or equal to integer2.
integer1 -gt integer2       integer1 is greater than integer2.

Here is a script that demonstrates them:

#!/bin/bash
# test-integer: evaluate the value of an integer
INT=-5
if [ -z "$INT" ]; then
    echo "INT is empty." >&2
    exit 1
fi
if [ $INT -eq 0 ]; then
    echo "INT is zero."
else
    if [ $INT -lt 0 ]; then
        echo "INT is negative."
    else
        echo "INT is positive."
    fi
    if [ $((INT % 2)) -eq 0 ]; then
        echo "INT is even."
    else
        echo "INT is odd."
    fi
fi

The interesting part of the script is how it determines whether an integer is even or odd. By performing a modulo 2 operation on the number, which divides it by two and returns the remainder, it can tell if the number is odd or even.
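The modulo test can be wrapped in a small helper to see it in action (parity is an illustrative name, not from the book):

```shell
# parity: report whether its argument is even or odd using modulo 2
parity () {
    if [ $(($1 % 2)) -eq 0 ]; then
        echo "$1 is even"
    else
        echo "$1 is odd"
    fi
}

parity 16
parity 17
```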


A More Modern Version Of test
Recent versions of bash include a compound command that acts as an enhanced replacement for test. It uses the following syntax:

[[ expression ]]

where, like test, expression is an expression that evaluates to either a true or false result. The [[ ]] command is very similar to test (it supports all of its expressions), but adds an important new string expression:

string1 =~ regex

which returns true if string1 is matched by the extended regular expression regex. This opens up a lot of possibilities for performing such tasks as data validation. In our earlier example of the integer expressions, the script would fail if the constant INT contained anything except an integer. The script needs a way to verify that the constant contains an integer. Using [[ ]] with the =~ string expression operator, we could improve the script this way:
#!/bin/bash
# test-integer2: evaluate the value of an integer
INT=-5
if [[ "$INT" =~ ^-?[0-9]+$ ]]; then
    if [ $INT -eq 0 ]; then
        echo "INT is zero."
    else
        if [ $INT -lt 0 ]; then
            echo "INT is negative."
        else
            echo "INT is positive."
        fi
        if [ $((INT % 2)) -eq 0 ]; then
            echo "INT is even."
        else
            echo "INT is odd."
        fi
    fi
else
    echo "INT is not an integer." >&2
    exit 1
fi

By applying the regular expression, we are able to limit the value of INT to only strings that begin with an optional minus sign, followed by one or more numerals.
This expression also eliminates the possibility of empty values.

Another added feature of [[ ]] is that the == operator supports pattern matching the same way pathname expansion does. For example:

[me@linuxbox ~]$ FILE=foo.bar
[me@linuxbox ~]$ if [[ $FILE == foo.* ]]; then
> echo "$FILE matches pattern 'foo.*'"
> fi
foo.bar matches pattern 'foo.*'

This makes [[ ]] useful for evaluating file and path names.

(( )) - Designed For Integers
In addition to the [[ ]] compound command, bash also provides the (( )) compound command, which is useful for operating on integers. It supports a full set of arithmetic evaluations, a subject we will cover fully in a later chapter.

(( )) is used to perform arithmetic truth tests. An arithmetic truth test results in true if the result of the arithmetic evaluation is non-zero:

[me@linuxbox ~]$ if ((1)); then echo "It is true."; fi
It is true.
[me@linuxbox ~]$ if ((0)); then echo "It is true."; fi
[me@linuxbox ~]$

Using (( )), we can slightly simplify the test-integer2 script like this:

#!/bin/bash
# test-integer2a: evaluate the value of an integer
INT=-5
if [[ "$INT" =~ ^-?[0-9]+$ ]]; then
    if ((INT == 0)); then
        echo "INT is zero."
    else
        if ((INT < 0)); then
            echo "INT is negative."
        else
            echo "INT is positive."
        fi
        if (( ((INT % 2)) == 0)); then
            echo "INT is even."
        else
            echo "INT is odd."
        fi
    fi
else
    echo "INT is not an integer." >&2
    exit 1
fi

Notice that we use less-than and greater-than signs and that == is used to test for equivalence. This is a more natural-looking syntax for working with integers. Notice too, that because the compound command (( )) is part of the shell syntax rather than an ordinary command, and it deals only with integers, it is able to recognize variables by name and does not require expansion to be performed.
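A quick sketch of both points — bare variable names and the natural comparison operators inside (( )):

```shell
count=7
# Inside (( )), variables may be referenced without the $ prefix:
if (( count > 5 )); then
    echo "count is greater than 5"
fi
(( count = count + 1 ))     # arithmetic assignment also works here
echo "count = $count"
```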


Combining Expressions
It's also possible to combine expressions to create more complex evaluations. Expressions are combined by using logical operators. We saw these in Chapter 17, when we learned about the find command. There are three logical operations for test and [[ ]]. They are AND, OR, and NOT. test and [[ ]] use different operators to represent these operations:

Table 27-4: Logical Operators

Operation       test        [[ ]] and (( ))
AND             -a          &&
OR              -o          ||
NOT             !           !

Here's an example of an AND operation. The following script determines if an integer is within a range of values:

#!/bin/bash
# test-integer3: determine if an integer is within a
# specified range of values
MIN_VAL=1
MAX_VAL=100
INT=50
if [[ "$INT" =~ ^-?[0-9]+$ ]]; then
    if [[ $INT -ge $MIN_VAL && $INT -le $MAX_VAL ]]; then
        echo "$INT is within $MIN_VAL to $MAX_VAL."
    else
        echo "$INT is out of range."
    fi
else
    echo "INT is not an integer." >&2
    exit 1
fi

In this script, we determine if the value of integer INT lies between the values of MIN_VAL and MAX_VAL. This is performed by a single use of [[ ]], which includes two expressions separated by the && operator. We could have also coded this using test:

if [ $INT -ge $MIN_VAL -a $INT -le $MAX_VAL ]; then
    echo "$INT is within $MIN_VAL to $MAX_VAL."
else
    echo "$INT is out of range."
fi

The ! negation operator reverses the outcome of an expression. It returns true if an expression is false, and it returns false if an expression is true. In the following script, we modify the logic of our evaluation to find values of INT that are outside the specified range:

#!/bin/bash
# test-integer4: determine if an integer is outside a
# specified range of values
MIN_VAL=1
MAX_VAL=100
INT=50
if [[ "$INT" =~ ^-?[0-9]+$ ]]; then
    if [[ ! ($INT -ge $MIN_VAL && $INT -le $MAX_VAL) ]]; then
        echo "$INT is outside $MIN_VAL to $MAX_VAL."
    else
        echo "$INT is in range."
    fi
else
    echo "INT is not an integer." >&2
    exit 1
fi

We also include parentheses around the expression, for grouping. If these were not included, the negation would only apply to the first expression and not the combination of the two. Coding this with test would be done this way:

if [ ! \( $INT -ge $MIN_VAL -a $INT -le $MAX_VAL \) ]; then
    echo "$INT is outside $MIN_VAL to $MAX_VAL."
else
    echo "$INT is in range."
fi

Since all expressions and operators used by test are treated as command arguments by the shell (unlike [[ ]] and (( )) ), characters that have special meaning to bash, such as <, >, (, and ), must be quoted or escaped. Seeing that test and [[ ]] do roughly the same thing, which is preferable? test is traditional (and part of the POSIX specification), whereas [[ ]] is specific to bash (and a few other modern shells). It's important to know how to use test, since it is very widely used, but [[ ]] is clearly more useful and is easier to code.
Portability Is The Hobgoblin Of Little Minds
If you talk to "real" Unix people, you quickly discover that many of them don't like Linux very much. They regard it as impure and unclean. One tenet of Unix followers is that everything should be "portable." This means that any script you write should be able to run, unchanged, on any Unix-like system. Unix people have good reason to believe this. Having seen what proprietary extensions to commands and shells did to the Unix world before POSIX, they are naturally wary of the effect of Linux on their beloved OS.

But portability has a serious downside. It prevents progress. It requires that things are always done using "lowest common denominator" techniques. In the case of shell programming, it means making everything compatible with sh, the original Bourne shell. This downside is the excuse that proprietary vendors use to justify their proprietary extensions, only they call them "innovations." But they are really just lock-in devices for their customers. The GNU tools, such as bash, have no such restrictions. They encourage portability by supporting standards and by being universally available. You can install bash and the other GNU tools on almost any kind of system, even Windows, without cost. So feel free to use all the features of bash. It's really portable.


Control Operators: Another Way To Branch
bash provides two control operators that can perform branching. The && (AND) and || (OR) operators work like the logical operators in the [[ ]] compound command. This is the syntax:

command1 && command2

and

command1 || command2

It is important to understand the behavior of these. With the && operator, command1 is executed and command2 is executed if, and only if, command1 is successful. With the || operator, command1 is executed and command2 is executed if, and only if, command1 is unsuccessful.

In practical terms, it means that we can do something like this:

[me@linuxbox ~]$ mkdir temp && cd temp

This will create a directory named temp, and if it succeeds, the current working directory will be changed to temp. The second command is attempted only if the mkdir command is successful. This type of construct is very handy for handling errors in scripts, a subject we will discuss more in later chapters.
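A self-contained sketch of both operators (using a scratch directory from mktemp so the example is safe to run anywhere; the directory names are illustrative):

```shell
dir=$(mktemp -d)                    # scratch area for the demonstration

# &&: the second command runs only if the first succeeds
mkdir "$dir/temp" && echo "created $dir/temp"

# ||: the second command runs only if the first fails
[ -d "$dir/temp" ] || echo "not printed: temp exists"
[ -d "$dir/missing" ] || echo "missing does not exist"

rm -r "$dir"                        # clean up
```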


Summing Up
We started this chapter with a question: how could we make our report-generator script adapt to the privileges of the user running it? With our knowledge of if, we can now solve the problem by evaluating the output of the id command inside the report_home_space function. With the -u option, id outputs the numeric user ID number of the effective user. The superuser's ID is always zero, and every other user's is a number greater than zero. Knowing this, we can construct two different here documents, one taking advantage of superuser privileges, and the other, restricted to the user's own home directory.

We are going to take a break from the sys_info_page program for a while, but don't worry. It will be back. In the meantime, we'll cover some topics that we'll need when we resume our work.


Further Reading
There are several sections of the bash man page that provide further detail on the topics covered in this chapter:

Lists (covers the control operators || and &&)
Compound Commands (covers [[ ]], (( )), and if)
CONDITIONAL EXPRESSIONS
SHELL BUILTIN COMMANDS (covers test)

Further, the Wikipedia has a good article on the concept of pseudocode:

http://en.wikipedia.org/wiki/Pseudocode

28 – Reading Keyboard Input
The scripts we have written so far lack a feature common in most computer programs — interactivity. That is, the ability of the program to interact with the user. While many programs don't need to be interactive, some programs benefit from being able to accept input directly from the user. Take, for example, this script from the previous chapter:

#!/bin/bash
# test-integer2: evaluate the value of an integer
INT=-5
if [[ "$INT" =~ ^-?[0-9]+$ ]]; then
    if [ $INT -eq 0 ]; then
        echo "INT is zero."
    else
        if [ $INT -lt 0 ]; then
            echo "INT is negative."
        else
            echo "INT is positive."
        fi
        if [ $((INT % 2)) -eq 0 ]; then
            echo "INT is even."
        else
            echo "INT is odd."
        fi
    fi
else
    echo "INT is not an integer." >&2
    exit 1
fi

Each time we want to change the value of INT, we have to edit the script. It would be much more useful if the script could ask the user for a value. In this chapter, we will begin to look at how we can add interactivity to our programs.
read – Read Values From Standard Input
The read builtin command is used to read a single line of standard input. This command can be used to read keyboard input or, when redirection is employed, a line of data from a file. The command has the following syntax:

read [-options] [variable...]

where options is one or more of the available options listed below and variable is the name of one or more variables used to hold the input value. If no variable name is supplied, the shell variable REPLY contains the line of data.

Basically, read assigns fields from standard input to the specified variables. If we modify our integer evaluation script to use read, it might look like this:

#!/bin/bash
# read-integer: evaluate the value of an integer
echo -n "Please enter an integer -> "
read int
if [[ "$int" =~ ^-?[0-9]+$ ]]; then
    if [ $int -eq 0 ]; then
        echo "$int is zero."
    else
        if [ $int -lt 0 ]; then
            echo "$int is negative."
        else
            echo "$int is positive."
        fi
        if [ $((int % 2)) -eq 0 ]; then
            echo "$int is even."
        else
            echo "$int is odd."
        fi
    fi
else
    echo "Input value is not an integer." >&2
    exit 1
fi

We use echo with the -n option (which suppresses the trailing newline on output) to display a prompt, and then use read to input a value for the variable int. Running this script with an input of 5 results in:

[me@linuxbox ~]$ read-integer
Please enter an integer -> 5
5 is positive.
5 is odd.

read can also assign input to multiple variables, as shown in this script:

#!/bin/bash
# read-multiple: read multiple values into multiple variables
read -p "Enter one or more values > " var1 var2 var3 var4 var5
echo "var1 = '$var1'"
echo "var2 = '$var2'"
echo "var3 = '$var3'"
echo "var4 = '$var4'"
echo "var5 = '$var5'"

In this script, we assign and display up to five values. Notice how read behaves when given different numbers of values:
[me@linuxbox ~]$ read-multiple
Enter one or more values > a b c d e
var1 = 'a'
var2 = 'b'
var3 = 'c'
var4 = 'd'
var5 = 'e'
[me@linuxbox ~]$ read-multiple
Enter one or more values > a
var1 = 'a'
var2 = ''
var3 = ''
var4 = ''
var5 = ''
[me@linuxbox ~]$ read-multiple
Enter one or more values > a b c d e f g
var1 = 'a'
var2 = 'b'
var3 = 'c'
var4 = 'd'
var5 = 'e f g'

If read receives fewer than the expected number of values, the extra variables are empty, while an excessive amount of input results in the final variable containing all of the extra input.

Options
read supports the following options:


Table 28-1: read Options

Option          Description
-a array        Assign the input to array, starting with index zero. We will cover arrays in a later chapter.
-d delimiter    The first character in the string delimiter is used to indicate end of input, rather than a newline character.
-e              Use Readline to handle input. This permits input editing in the same manner as the command line.
-i string       Use string as a default reply if the user simply presses Enter. Requires the -e option.
-n num          Read num characters of input, rather than an entire line.
-p prompt       Display a prompt for input using the string prompt.
-r              Raw mode. Do not interpret backslash characters as escapes.
-s              Silent mode. Do not echo characters to the display as they are typed. This is useful when inputting passwords and other confidential information.
-t seconds      Timeout. Terminate input after seconds. read returns a non-zero exit status if an input times out.
-u fd           Use input from file descriptor fd, rather than standard input.

Using the various options, we can do interesting things with read. For example, with the -p option, we can provide a prompt string:
#!/bin/bash
# read-single: read multiple values into default variable
read -p "Enter one or more values > "
echo "REPLY = '$REPLY'"

With the -t and -s options we can write a script that reads “secret” input and times out
if the input is not completed in a specified time:
#!/bin/bash
# read-secret: input a secret passphrase
if read -t 10 -sp "Enter secret passphrase > " secret_pass; then
echo -e "\nSecret passphrase = '$secret_pass'"
else
echo -e "\nInput timed out" >&2
exit 1
fi

The script prompts the user for a secret passphrase and waits 10 seconds for input. If the entry is not completed within the specified time, the script exits with an error. Since the -s option is included, the characters of the passphrase are not echoed to the display as they are typed.

It's possible to supply the user with a default response using the -e and -i options together:

#!/bin/bash
# read-default: supply a default value if user presses Enter key

read -e -p "What is your user name? " -i $USER
echo "You answered: '$REPLY'"

In this script, we prompt the user to enter his/her user name and use the environment variable USER to provide a default value. When the script is run it displays the default string, and if the user simply presses the Enter key, read assigns the default string to the REPLY variable:

[me@linuxbox ~]$ read-default
What is your user name? me
You answered: 'me'
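Because read takes its input from standard input, these scripts can also be exercised without a keyboard by redirecting a file — a handy trick for testing (the file contents here are just an example):

```shell
tmp=$(mktemp)                   # a temporary file standing in for the keyboard
printf 'me\n' > "$tmp"
read -r reply < "$tmp"          # -r: raw mode, leave backslashes alone
echo "reply = '$reply'"
rm -f "$tmp"
```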

IFS
Normally, the shell performs word splitting on the input provided to read. As we have seen, this means that multiple words separated by one or more spaces become separate items on the input line, and are assigned to separate variables by read. This behavior is configured by a shell variable named IFS (for Internal Field Separator). The default value of IFS contains a space, a tab, and a newline character, each of which will separate items from one another.

We can adjust the value of IFS to control the separation of fields input to read. For example, the /etc/passwd file contains lines of data that use the colon character as a field separator. By changing the value of IFS to a single colon, we can use read to input the contents of /etc/passwd and successfully separate fields into different variables.
Here we have a script that does just that:

#!/bin/bash
# read-ifs: read fields from a file
FILE=/etc/passwd
read -p "Enter a username > " user_name
file_info="$(grep "^$user_name:" $FILE)"
if [ -n "$file_info" ]; then
    IFS=":" read user pw uid gid name home shell <<< "$file_info"
    echo "User =      '$user'"
    echo "UID =       '$uid'"
    echo "GID =       '$gid'"
    echo "Full Name = '$name'"
    echo "Home Dir. = '$home'"
    echo "Shell =     '$shell'"
else
    echo "No such user '$user_name'" >&2
    exit 1
fi

This script prompts the user to enter the username of an account on the system, then displays the different fields found in the user's record in the /etc/passwd file. The script contains two interesting lines. The first is:

file_info=$(grep "^$user_name:" $FILE)

This line assigns the results of a grep command to the variable file_info. The regular expression used by grep assures that the username will only match a single line in the /etc/passwd file.

The second interesting line is this one:
IFS=":" read user pw uid gid name home shell <<< "$file_info"

The line consists of three parts: a variable assignment, a read command with a list of variable names as arguments, and a strange new redirection operator. We'll look at the variable assignment first.

The shell allows one or more variable assignments to take place immediately before a command. These assignments alter the environment for the command that follows. The effect of the assignment is temporary; it changes the environment only for the duration of the command. In our case, the value of IFS is changed to a colon character. Alternately, we could have coded it this way:

OLD_IFS="$IFS"
IFS=":"
read user pw uid gid name home shell <<< "$file_info"
IFS="$OLD_IFS"

where we store the value of IFS, assign a new value, perform the read command, and then restore IFS to its original value. Clearly, placing the variable assignment in front of the command accomplishes the same thing in a more compact way.

The <<< operator indicates a here string. A here string is like a here document, only shorter, consisting of a single string. In our example, the line of data from the /etc/passwd file is fed to the standard input of the read command. We might wonder why this rather oblique method was chosen rather than simply piping the line into read.


You Can’t Pipe read
While the read command normally takes input from standard input, you cannot
do this:
echo "foo" | read

We would expect this to work, but it does not. The command will appear to succeed, but the REPLY variable will always be empty. Why is this?

The explanation has to do with the way the shell handles pipelines. In bash (and other shells such as sh), pipelines create subshells. These are copies of the shell and its environment which are used to execute the command in the pipeline. In our example above, read is executed in a subshell.

Subshells in Unix-like systems create copies of the environment for the processes to use while they execute. When the processes finish, the copy of the environment is destroyed. This means that a subshell can never alter the environment of its parent process. read assigns variables, which then become part of the environment. In the example above, read assigns the value "foo" to the variable REPLY in its subshell's environment, but when the command exits, the subshell and its environment are destroyed, and the effect of the assignment is lost.

Using here strings is one way to work around this behavior. Another method is discussed in Chapter 36.
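The difference is easy to demonstrate. In bash, the piped read runs in a subshell and its assignment evaporates, while a here string keeps read in the current shell:

```shell
unset REPLY
echo "foo" | read               # read succeeds, but only inside a subshell
echo "after pipe: REPLY = '$REPLY'"         # REPLY is still empty

read <<< "foo"                  # here string: read runs in the current shell
echo "after here string: REPLY = '$REPLY'"  # REPLY = 'foo'
```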
Validating Input
With our new ability to accept keyboard input, we face a greater programming challenge: validating input. Very often the difference between a well-written program and a poorly written one lies in the program's ability to deal with the unexpected. Frequently, the unexpected appears in the form of bad input. We've done a little of this with our evaluation programs in the previous chapter, where we checked the values of integers and screened out empty values and non-numeric characters. It is important to perform these kinds of checks every time a program receives input, to guard against invalid data. This is especially important for programs that are shared by multiple users. Omitting these safeguards in the interests of economy might be excused if a program is to be used once and only by the author to perform some special task. Even then, if the program performs dangerous tasks such as deleting files, it would be wise to include data validation, just in case.
Here we have an example program that validates various kinds of input:

#!/bin/bash
# read-validate: validate input
invalid_input () {
    echo "Invalid input '$REPLY'" >&2
    exit 1
}
read -p "Enter a single item > "
# input is empty (invalid)
[[ -z $REPLY ]] && invalid_input
# input is multiple items (invalid)
(( $(echo $REPLY | wc -w) > 1 )) && invalid_input
# is input a valid filename?
if [[ $REPLY =~ ^[-[:alnum:]\._]+$ ]]; then
    echo "'$REPLY' is a valid filename."
    if [[ -e $REPLY ]]; then
        echo "And file '$REPLY' exists."
    else
        echo "However, file '$REPLY' does not exist."
    fi
    # is input a floating point number?
    if [[ $REPLY =~ ^-?[[:digit:]]*\.[[:digit:]]+$ ]]; then
        echo "'$REPLY' is a floating point number."
    else
        echo "'$REPLY' is not a floating point number."
    fi
    # is input an integer?
    if [[ $REPLY =~ ^-?[[:digit:]]+$ ]]; then
        echo "'$REPLY' is an integer."
    else
        echo "'$REPLY' is not an integer."
    fi
else
    echo "The string '$REPLY' is not a valid filename."
fi
This script prompts the user to enter an item. The item is subsequently analyzed to determine its contents. As we can see, the script makes use of many of the concepts covered so far, including shell functions, [[ ]], (( )), the control operator &&, and if, as well as a healthy dose of regular expressions.


Menus
A common type of interactivity is called menu-driven. In menu-driven programs, the user is presented with a list of choices and is asked to choose one. For example, we could imagine a program that presented the following:

Please Select:

1. Display System Information
2. Display Disk Space
3. Display Home Space Utilization
0. Quit

Enter selection [0-3] >

Using what we learned from writing our sys_info_page program, we can construct a menu-driven program to perform the tasks on the above menu:

#!/bin/bash
# read-menu: a menu driven system information program
clear
echo "
Please Select:

1. Display System Information
2. Display Disk Space
3. Display Home Space Utilization
0. Quit
"
read -p "Enter selection [0-3] > "
if [[ $REPLY =~ ^[0-3]$ ]]; then
    if [[ $REPLY == 0 ]]; then
        echo "Program terminated."
        exit
    fi
    if [[ $REPLY == 1 ]]; then
        echo "Hostname: $HOSTNAME"
        uptime
        exit
    fi
    if [[ $REPLY == 2 ]]; then
        df -h
        exit
    fi
    if [[ $REPLY == 3 ]]; then
        if [[ $(id -u) -eq 0 ]]; then
            echo "Home Space Utilization (All Users)"
            du -sh /home/*
        else
            echo "Home Space Utilization ($USER)"
            du -sh $HOME
        fi
        exit
    fi
else
    echo "Invalid entry." >&2
    exit 1
fi

This script is logically divided into two parts. The first part displays the menu and inputs the response from the user. The second part identifies the response and carries out the selected action. Notice the use of the exit command in this script. It is used here to prevent the script from executing unnecessary code after an action has been carried out.


Summing Up
In this chapter, we took our first steps toward interactivity, allowing users to input data into our programs via the keyboard. Using the techniques presented thus far, it is possible to write many useful programs. In the next chapter, we will build on the menu-driven program concept to make it even better.

Extra Credit
As an exercise, rewrite the programs in this chapter using the test command rather than the [[ ]] compound command. Hint: use grep to evaluate the regular expressions and evaluate its exit status. This will be good practice.

Further Reading
The Bash Reference Manual covers the builtin commands, including read:

http://www.gnu.org/software/bash/manual/bashref.html#Bash-Builtins

29 – Flow Control: Looping With while / until
In the previous chapter, we developed a menu-driven program to produce various kinds of system information. The program works, but it still has a significant usability problem. It only executes a single choice and then terminates. It would be better if we could somehow construct the program so that it could repeat the menu display and selection over and over, until the user chooses to exit the program.

In this chapter, we will look at a programming concept called looping, which can be used to make portions of programs repeat. The shell provides three compound commands for looping. We will look at two of them in this chapter, and the third in a later one.


Looping
Daily life is full of repeated activities. Going to work each day, walking the dog, and slicing a carrot are all tasks that involve repeating a series of steps. Let's consider slicing a carrot. If we express this activity in pseudocode, it might look something like this:

1. get cutting board
2. get knife
3. place carrot on cutting board
4. lift knife
5. advance carrot
6. slice carrot
7. if entire carrot sliced, then quit, else go to step 4

Steps 4 through 7 form a loop. The actions within the loop are repeated until the condition, "entire carrot sliced," is reached.


while
bash can express a similar idea. Let's say we wanted to display five numbers in sequential order from one to five. A bash script could be constructed as follows:

#!/bin/bash
# while-count: display a series of numbers
count=1
while [[ $count -le 5 ]]; do
    echo $count
    count=$((count + 1))
done
echo "Finished."

When executed, this script displays the following:

[me@linuxbox ~]$ while-count
1
2
3
4
5
Finished.


The syntax of the while command is:

while commands; do commands; done

Like if, while evaluates the exit status of a list of commands. As long as the exit status is zero, it performs the commands inside the loop. In the script above, the variable count is created and assigned an initial value of 1. The while command evaluates the exit status of the test. As long as the test command returns an exit status of zero, the commands within the loop are executed. At the end of each cycle, the test is repeated. After six iterations of the loop, the value of count has increased to 6, the test command no longer returns an exit status of zero, and the loop terminates. The program continues with the next statement following the loop.

We can use a while loop to improve the read-menu program from the previous chapter:
#!/bin/bash
# while-menu: a menu driven system information program

DELAY=3 # Number of seconds to display results
while [[ $REPLY != 0 ]]; do
clear
cat <<- _EOF_
    Please Select:

    1. Display System Information
    2. Display Disk Space
    3. Display Home Space Utilization
    0. Quit

_EOF_
read -p "Enter selection [0-3] > "
if [[ $REPLY =~ ^[0-3]$ ]]; then
if [[ $REPLY == 1 ]]; then
echo "Hostname: $HOSTNAME"
uptime
sleep $DELAY
fi
if [[ $REPLY == 2 ]]; then
df -h
sleep $DELAY
fi
if [[ $REPLY == 3 ]]; then
if [[ $(id -u) -eq 0 ]]; then
echo "Home Space Utilization (All Users)"
du -sh /home/*
else
echo "Home Space Utilization ($USER)"
du -sh $HOME
fi
sleep $DELAY
fi
else
    echo "Invalid entry."
    sleep $DELAY
fi
done
echo "Program terminated."

By enclosing the menu in a while loop, we are able to have the program repeat the menu display after each selection. The loop continues as long as REPLY is not equal to "0", giving the user the opportunity to make another selection. At the end of each action, a sleep command is executed so the program will pause for a few seconds to allow the results of the selection to be seen before the screen is cleared and the menu is redisplayed. When REPLY is equal to "0", indicating the "quit" selection, the loop terminates and execution continues with the line following done.


Breaking Out Of A Loop
bash provides two builtin commands that can be used to control program flow inside loops. The break command immediately terminates a loop, and program control resumes with the next statement following the loop. The continue command causes the remainder of the loop to be skipped, and program control resumes with the next iteration of the loop. Here we see a version of the while-menu program incorporating both break and continue:

#!/bin/bash
# while-menu2: a menu driven system information program
DELAY=3 # Number of seconds to display results
while true; do
    clear
    cat <<- _EOF_
        Please Select:

        1. Display System Information
        2. Display Disk Space
        3. Display Home Space Utilization
        0. Quit

    _EOF_
    read -p "Enter selection [0-3] > "
    if [[ $REPLY =~ ^[0-3]$ ]]; then
        if [[ $REPLY == 1 ]]; then
            echo "Hostname: $HOSTNAME"
            uptime
            sleep $DELAY
            continue
        fi
        if [[ $REPLY == 2 ]]; then
            df -h
            sleep $DELAY
            continue
        fi
        if [[ $REPLY == 3 ]]; then
            if [[ $(id -u) -eq 0 ]]; then
                echo "Home Space Utilization (All Users)"
                du -sh /home/*
            else
                echo "Home Space Utilization ($USER)"
                du -sh $HOME
            fi
            sleep $DELAY
            continue
        fi
        if [[ $REPLY == 0 ]]; then
            break
        fi
    else
        echo "Invalid entry."
        sleep $DELAY
    fi
done
echo "Program terminated."

In this version of the script, we set up an endless loop (one that never terminates on its own) by using the true command to supply an exit status to while. Since true will always exit with an exit status of zero, the loop will never end. This is a surprisingly common scripting technique. Since the loop will never end on its own, it's up to the programmer to provide some way to break out of the loop when the time is right. In this script, the break command is used to exit the loop when the "0" selection is chosen. The continue command has been included at the end of the other script choices to allow for more efficient execution. By using continue, the script will skip over code that is not needed when a selection is identified. For example, if the "1" selection is chosen and identified, there is no reason to test for the other selections.
until
The until command is much like while, except instead of exiting a loop when a non-zero exit status is encountered, it does the opposite. An until loop continues until it receives a zero exit status. In our while-count script, we continued the loop as long as the value of the count variable was less than or equal to 5. We could get the same result by coding the script with until:

#!/bin/bash
# until-count: display a series of numbers
count=1
until [[ $count -gt 5 ]]; do
    echo $count
    count=$((count + 1))
done
echo "Finished."

By changing the test expression to $count -gt 5, until will terminate the loop at the correct time. The decision of whether to use the while or until loop is usually a matter of choosing the one that allows the clearest test to be written.
Reading Files With Loops
while and until can process standard input. This allows files to be processed with while and until loops. In the following example, we will display the contents of the distros.txt file used in earlier chapters:
#!/bin/bash
# while-read: read lines from a file
while read distro version release; do
printf "Distro: %s\tVersion: %s\tReleased: %s\n" \
$distro \
$version \
$release
done < distros.txt

To redirect a file to the loop, we place the redirection operator after the done statement. The loop will use read to input the fields from the redirected file. The read command will exit after each line is read, with a zero exit status until the end-of-file is reached. At that point, it will exit with a non-zero exit status, thereby terminating the loop. It is also possible to pipe standard input into a loop:

#!/bin/bash
# while-read2: read lines from a file
sort distros.txt | while read distro version release; do
    printf "Distro: %s\tVersion: %s\tReleased: %s\n" \
        $distro \
        $version \
        $release
done

Here we take the output of the sort command and display the stream of text. However, it is important to remember that since a pipe will execute the loop in a subshell, any variables created or assigned within the loop will be lost when the loop terminates.
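We can see that variable loss directly, along with a bash-specific workaround using a here string so the loop runs in the current shell (this sketch is not from the book's example files):

```shell
count=0
printf 'a\nb\nc\n' | while read line; do
    count=$((count + 1))
done
echo "after pipe: count = $count"          # still 0: the loop ran in a subshell

count=0
while read line; do
    count=$((count + 1))
done <<< "$(printf 'a\nb\nc')"
echo "after here string: count = $count"   # 3: the loop ran in the current shell
```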


Summing Up
With the introduction of loops, and our previous encounters with branching, subroutines, and sequences, we have covered the major types of flow control used in programs. bash has some more tricks, but they are refinements on these basic concepts.


Further Reading
The Bash Guide for Beginners from the Linux Documentation Project has some more examples of while loops:

http://tldp.org/LDP/Bash-Beginners-Guide/html/

The Wikipedia has an article on loops, which is part of a larger article on flow control:

http://en.wikipedia.org/wiki/Control_flow#Loops

30 – Troubleshooting
As our scripts become more complex, it's time to take a look at what happens when things go wrong and they don't do what we want. In this chapter, we'll look at some of the common kinds of errors that occur in scripts, and describe a few useful techniques that can be used to track down and eradicate problems.

Syntactic Errors
One general class of errors is syntactic. Syntactic errors involve mistyping some element of shell syntax. In most cases, these kinds of errors will lead to the shell refusing to execute the script. In the following discussions, we will use this script to demonstrate common types of errors:

#!/bin/bash
# trouble: script to demonstrate common errors
number=1
if [ $number = 1 ]; then
    echo "Number is equal to 1."
else
    echo "Number is not equal to 1."
fi

As written, this script runs successfully:

[me@linuxbox ~]$ trouble
Number is equal to 1.

Missing Quotes
If we edit our script and remove the trailing quote from the argument following the first
echo command:
#!/bin/bash
# trouble: script to demonstrate common errors
number=1
if [ $number = 1 ]; then
    echo "Number is equal to 1.
else
    echo "Number is not equal to 1."
fi

watch what happens:

[me@linuxbox ~]$ trouble
/home/me/bin/trouble: line 10: unexpected EOF while looking for matching `"'
/home/me/bin/trouble: line 13: syntax error: unexpected end of file

It generates two errors. Interestingly, the line numbers reported are not where the missing quote was removed, but rather much later in the program. We can see why, if we follow the program after the missing quote. bash will continue looking for the closing quote until it finds one, which it does, immediately after the second echo command. bash becomes very confused after that, and the syntax of the if command is broken because the fi statement is now inside a quoted (but open) string.

In long scripts, this kind of error can be quite hard to find. Using an editor with syntax highlighting will help.

Missing Or Unexpected Tokens
Another common mistake is forgetting to complete a compound command, such as if or while.
Let's look at what happens if we remove the semicolon after the test in the if command:

#!/bin/bash
# trouble: script to demonstrate common errors
number=1
if [ $number = 1 ] then
    echo "Number is equal to 1."
else
    echo "Number is not equal to 1."
fi

The result is this:
[me@linuxbox ~]$ trouble
/home/me/bin/trouble: line 9: syntax error near unexpected token `else'
/home/me/bin/trouble: line 9: `else'

Again, the error message points to an error that occurs later than the actual problem. What happens is really pretty interesting. As we recall, if accepts a list of commands and evaluates the exit code of the last command in the list. In our program, we intend this list to consist of a single command, [, a synonym for test. The [ command takes what follows it as a list of arguments; in our case, four arguments: $number, 1, =, and ]. With the semicolon removed, the word then is added to the list of arguments, which is syntactically legal. The following echo command is legal, too. It's interpreted as another command in the list of commands that if will evaluate for an exit code. The else is encountered next, but it's out of place, since the shell recognizes it as a reserved word (a word that has special meaning to the shell) and not the name of a command, hence the error message.
Unanticipated Expansions
It's possible to have errors that only occur intermittently in a script. Sometimes the script will run fine and other times it will fail because of the results of an expansion. If we restore our missing semicolon and change the value of number to an empty variable, we can demonstrate:

#!/bin/bash
# trouble: script to demonstrate common errors
number=
if [ $number = 1 ]; then
    echo "Number is equal to 1."
else
    echo "Number is not equal to 1."
fi

Running the script with this change results in the output:

[me@linuxbox ~]$ trouble
/home/me/bin/trouble: line 7: [: =: unary operator expected
Number is not equal to 1.

We get this rather cryptic error message, followed by the output of the second echo command. The problem is the expansion of the number variable. When the command:

[ $number = 1 ]

undergoes expansion with number being empty, the result is this:

[ = 1 ]

which is invalid and the error is generated. Further, since the test failed (because of the error), the if command receives a non-zero exit code and acts accordingly, and the second echo command is executed.

This problem can be corrected by adding quotes around the first argument in the test command, so that when expansion occurs the result is an empty string, which is a valid (though false) expression. In addition to empty strings, quotes should be used in cases where a value could expand into multi-word strings, as with filenames containing embedded spaces.
Logical Errors
Unlike syntactic errors, logical errors do not prevent a script from running. The script will run, but it will not produce the desired result, due to a problem with its logic. There are countless numbers of possible logical errors, but here are a few of the most common kinds found in scripts:

1. Incorrect conditional expressions. It's easy to incorrectly code an if/then/else and have the wrong logic carried out. Sometimes the logic will be reversed, or it will be incomplete.

2. "Off by one" errors. When coding loops that employ counters, it is possible to overlook that the loop may require that the counting start with zero, rather than one, for the count to conclude at the correct point. These kinds of errors result in either a loop "going off the end" by counting too far, or else missing the last iteration of the loop by terminating one iteration too soon.

3. Unanticipated situations. Most logical errors result from a program encountering data or situations that were unforeseen by the programmer. This can also include unanticipated expansions, such as a filename that contains embedded spaces that expands into multiple command arguments rather than a single filename.
Defensive Programming
It is important to verify assumptions when programming. This means a careful evaluation of the exit status of programs and commands that are used by a script. Here is an example, based on a true story. An unfortunate system administrator wrote a script to perform a maintenance task on an important server. The script contained the following two lines of code:

cd $dir_name
rm *

There is nothing intrinsically wrong with these two lines, as long as the directory named in the variable, dir_name, exists. But what happens if it does not? In that case, the cd command fails and the script continues to the next line and deletes the files in the current working directory. Not the desired outcome at all!

Let's look at some ways this design could be improved. First, it might be wise to make the execution of rm contingent on the success of cd:

cd $dir_name && rm *

This way, if the cd command fails, the rm command is not carried out. This is better, but still leaves open the possibility that the variable, dir_name, is unset or empty, which would result in the files in the user's home directory being deleted. This could be avoided by also checking that dir_name actually contains the name of an existing directory:

[[ -d $dir_name ]] && cd $dir_name && rm *

Often, it is best to terminate the script with an error when an unexpected situation like this occurs:

if [[ -d $dir_name ]]; then
    if cd $dir_name; then
        rm *
    else
        echo "cannot cd to '$dir_name'" >&2
        exit 1
    fi
else
    echo "no such directory: '$dir_name'" >&2
    exit 1
fi

Here, we check both the name, to see that it is that of an existing directory, and the success of the cd command. If either fails, a descriptive error message is sent to standard error and the script terminates with an exit status of one to indicate a failure.

Verifying Input
A general rule of good programming is that if a program accepts input, it must be able to deal with anything it receives. This usually means that input must be carefully screened to ensure that only valid input is accepted for further processing. We saw an example of this in the previous chapter when we studied the read command. One script contained the following test to verify a menu selection:

[[ $REPLY =~ ^[0-3]$ ]]

This test is very specific. It will only return a zero exit status if the string returned by the user is a numeral in the range of zero to three. Nothing else will be accepted. Sometimes these sorts of tests can be very challenging to write, but the effort is necessary to produce a high quality script.
...
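Such a test is easiest to reuse when wrapped in a small helper. A minimal sketch (the valid_selection function is invented here for illustration and is not part of the book's scripts):

```shell
#!/bin/bash
# validate-selection: sketch of screening input before further processing

# Return 0 only if the argument is a single digit from 0 to 3.
valid_selection () {
    [[ $1 =~ ^[0-3]$ ]]
}

for input in 2 9 "1 2" abc ""; do
    if valid_selection "$input"; then
        echo "'$input' accepted"
    else
        echo "'$input' rejected"
    fi
done
```

Only the first value passes; multi-word strings, out-of-range digits, and empty input are all rejected before they can do any harm.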
Design Is A Function Of Time
When given five minutes to design a device "that kills flies," you design a flyswatter. Given six months, you might come up with a laser-guided "anti-fly system" instead.

The same principle applies to programming. Sometimes a "quick and dirty" script will do if it is only going to be used once and only used by the programmer. That kind of script is common and should be developed quickly to make the effort economical. Such scripts don't need a lot of comments and defensive checks. On the other hand, if a script is intended for production use, that is, a script that will be used over and over for an important task or by multiple users, it needs much more careful development.
Testing
Testing is an important step in every kind of software development, including scripts. There is a saying in the open-source world, "release early, release often," which reflects this fact. By releasing early and often, software gets more exposure to use and testing. Experience has shown that bugs are much easier to find, and much less expensive to fix, if they are found early in the development cycle.

Stubs are a valuable technique to check the progress of our work from the earliest stages of script development. Let's look again at our file-deletion fragment and see how it could be modified for easy testing.
Testing the original fragment of code would be dangerous, since its purpose is to delete files, but we could modify the code to make the test safe:

if [[ -d $dir_name ]]; then
    if cd $dir_name; then
        echo rm * # TESTING
    else
        echo "cannot cd to '$dir_name'" >&2
        exit 1
    fi
else
    echo "no such directory: '$dir_name'" >&2
    exit 1
fi
exit # TESTING

Since the error conditions already output useful messages, we don't have to add any. The most important change is placing an echo in front of the rm command so that the command and its expanded argument list are displayed, rather than executed. This change allows safe execution of the code. At the end of the fragment, we place an exit command to conclude the test and prevent any other part of the script from being carried out. The need for this will vary according to the design of the script.

We also include some comments that act as "markers" for our test-related changes. These can be used to help find and remove the changes when testing is complete.
Test Cases
To perform useful testing, it's important to develop and apply good test cases. This is done by carefully choosing input data or operating conditions that reflect edge and corner cases. In our code fragment (which is very simple), we want to know how the code performs under three specific conditions:

1. dir_name contains the name of an existing directory
2. dir_name contains the name of a nonexistent directory
3. dir_name is empty

By performing the test with each of these conditions, good test coverage is achieved.
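The three conditions above can be exercised mechanically. Here is a sketch (the check_dir wrapper is invented for illustration; the echo keeps the rm harmless, as in the testing version above):

```shell
#!/bin/bash
# test-cases: sketch of exercising the three dir_name conditions safely

check_dir () {
    local dir_name=$1
    if [[ -d $dir_name ]]; then
        if cd "$dir_name"; then
            echo rm * # TESTING: display, don't delete
        else
            echo "cannot cd to '$dir_name'" >&2
            return 1
        fi
    else
        echo "no such directory: '$dir_name'" >&2
        return 1
    fi
}

check_dir /tmp           # 1. existing directory
check_dir /no/such/dir   # 2. nonexistent directory
check_dir ""             # 3. empty value
echo "cases 2 and 3 are expected to report errors"
```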
As with design, testing is a function of time. Not every script feature needs to be extensively tested. It's really a matter of determining what is most important. Since it could be so potentially destructive if it malfunctioned, our code fragment deserves careful consideration during both its design and testing.
Debugging
If testing reveals a problem with a script, the next step is debugging. "A problem" usually means that the script is, in some way, not performing to the programmer's expectations. If this is the case, we need to carefully determine exactly what the script is actually doing and why. Finding bugs can sometimes involve a lot of detective work.

A well-designed script will try to help. It should be programmed defensively, to detect abnormal conditions and provide useful feedback to the user. Sometimes, however, problems are strange and unexpected, and more involved techniques are required.


Finding The Problem Area
In some scripts, particularly long ones, it is sometimes useful to isolate the area of the script that is related to the problem. This won't always be the actual error, but isolation will often provide insights into the actual cause. One technique that can be used to isolate code is "commenting out" sections of a script. By placing comment characters at the beginning of a suspect section, we prevent that code from being executed. Testing can then be performed again, to see if the removal of the code has any impact on the behavior of the bug.

Tracing
Bugs are often cases of unexpected logical flow within a script. That is, portions of the script are either never being executed, or are being executed in the wrong order or at the wrong time. To view the actual flow of the program, we use a technique called tracing.

One tracing method involves placing informative messages in a script that display the location of execution. We can add messages to our code fragment:

echo "preparing to delete files" >&2
if [[ -d $dir_name ]]; then
    if cd $dir_name; then
echo "deleting files" >&2
        rm *
    else
        echo "cannot cd to '$dir_name'" >&2
        exit 1
    fi
else
    echo "no such directory: '$dir_name'" >&2
    exit 1
fi
echo "file deletion complete" >&2

We send the messages to standard error to separate them from normal output. We also do not indent the lines containing the messages, so it is easier to find them when it's time to remove them.

bash also provides a method of tracing, implemented by the -x option and the set command with the -x option.
Using our earlier trouble script, we can activate tracing for the entire script by adding the -x option to the first line:

#!/bin/bash -x
# trouble: script to demonstrate common errors

number=1

if [ $number = 1 ]; then
    echo "Number is equal to 1."
else
    echo "Number is not equal to 1."
fi

When executed, the results look like this:

[me@linuxbox ~]$ trouble
+ number=1
+ '[' 1 = 1 ']'
+ echo 'Number is equal to 1.'
Number is equal to 1.

With tracing enabled, we see the commands performed with expansions applied. The leading plus signs indicate the display of the trace to distinguish them from lines of regular output. The plus sign is the default character for trace output. It is contained in the PS4 (prompt string 4) shell variable. The contents of this variable can be adjusted to make the prompt more useful. Here, we modify the contents of the variable to include the current line number in the script where the trace is performed. Note that single quotes are required to prevent expansion until the prompt is actually used:

[me@linuxbox ~]$ export PS4='$LINENO + '
[me@linuxbox ~]$ trouble
5 + number=1
7 + '[' 1 = 1 ']'
8 + echo 'Number is equal to 1.'
Number is equal to 1.


To perform a trace on a selected portion of a script, rather than the entire script, we can use the set command with the -x option:

#!/bin/bash
# trouble: script to demonstrate common errors

number=1

set -x # Turn on tracing
if [ $number = 1 ]; then
    echo "Number is equal to 1."
else
    echo "Number is not equal to 1."
fi
set +x # Turn off tracing

We use the set command with the -x option to activate tracing and the +x option to deactivate tracing. This technique can be used to examine multiple portions of a troublesome script.


Examining Values During Execution
It is often useful, along with tracing, to display the content of variables to see the internal workings of a script while it is being executed. Applying additional echo statements will usually do the trick. For example, we could add a line such as the following near the top of the trouble script, marked with a comment to facilitate its later identification and removal:

echo "number=$number" # DEBUG

This technique is particularly useful when watching the behavior of loops and arithmetic within scripts.
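As a sketch of that idea applied to a loop (this little accumulator loop is illustrative, not from the book's scripts):

```shell
#!/bin/bash
# loop-debug: sketch of displaying values while a loop runs

total=0
for i in 1 2 3; do
echo "DEBUG: i=$i total=$total" >&2   # temporary, unindented for easy removal
    total=$((total + i))
done
echo "total=$total"
```

The DEBUG lines go to standard error so they never contaminate the script's real output, and their lack of indentation makes them easy to find and delete once the bug is fixed.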
Of course, there are many more
...
Debugging is a fine art that can be developed through experience, both in knowing how to avoid bugs (testing constantly throughout development)
and in finding bugs (effective use of tracing)
...
wooledge
...
org/LDP/abs/html/gotchas
...
gnu
...
html



Eric Raymond's The Art of Unix Programming is a great resource for learning the basic concepts found in well-written Unix programs. Many of these ideas apply to shell scripts:
http://www.faqs.org/docs/artu/ch01s06.html

For really heavy-duty debugging, there is the Bash Debugger:
http://bashdb.sourceforge.net/

The Wikipedia has a couple of short articles on syntactic and logical errors:
http://en.wikipedia.org/wiki/Syntax_error
http://en.wikipedia.org/wiki/Logic_error

31 – Flow Control: Branching With case
In this chapter, we will continue to look at flow control. In Chapter 28, we constructed some simple menus and built the logic used to act on a user's selection. To do this, we used a series of if commands to identify which of the possible choices had been selected. Many programming languages have a multiple-choice mechanism for this purpose.

case
The bash multiple-choice compound command is called case. It has the following syntax:

case word in
    [pattern [| pattern]...) commands ;;]...
esac

If we look at the read-menu program from Chapter 28, we see the logic used to act on a user's selection:
#!/bin/bash
# read-menu: a menu driven system information program

clear
echo "
Please Select:

1. Display System Information
2. Display Disk Space
3. Display Home Space Utilization
0. Quit
"
read -p "Enter selection [0-3] > "

if [[ $REPLY =~ ^[0-3]$ ]]; then
    if [[ $REPLY == 0 ]]; then
        echo "Program terminated."
        exit
    fi
    if [[ $REPLY == 1 ]]; then
        echo "Hostname: $HOSTNAME"
        uptime
        exit
    fi
    if [[ $REPLY == 2 ]]; then
        df -h
        exit
    fi
    if [[ $REPLY == 3 ]]; then
        if [[ $(id -u) -eq 0 ]]; then
            echo "Home Space Utilization (All Users)"
            du -sh /home/*
        else
            echo "Home Space Utilization ($USER)"
            du -sh $HOME
        fi
        exit
    fi
else
    echo "Invalid entry." >&2
    exit 1
fi

Using the case command, we can replace this logic with something more elegant:

#!/bin/bash
# case-menu: a menu driven system information program

clear
echo "
Please Select:

1. Display System Information
2. Display Disk Space
3. Display Home Space Utilization
0. Quit
"
read -p "Enter selection [0-3] > "

case $REPLY in
    0)  echo "Program terminated."
        exit
        ;;
    1)  echo "Hostname: $HOSTNAME"
        uptime
        ;;
    2)  df -h
        ;;
    3)  if [[ $(id -u) -eq 0 ]]; then
            echo "Home Space Utilization (All Users)"
            du -sh /home/*
        else
            echo "Home Space Utilization ($USER)"
            du -sh $HOME
        fi
        ;;
    *)  echo "Invalid entry" >&2
        exit 1
        ;;
esac

The case command looks at the value of word, in our example the value of the REPLY variable, and then attempts to match it against one of the specified patterns. When a match is found, the commands associated with the specified pattern are executed. After a match is found, no further matches are attempted.

Patterns
The patterns used by case are the same as those used by pathname expansion. Patterns are terminated with a ")" character. Here are some valid patterns:

Table 31-1: case Pattern Examples

Pattern         Description
a)              Matches if word equals "a".
[[:alpha:]])    Matches if word is a single alphabetic character.
???)            Matches if word is exactly three characters long.
*.txt)          Matches if word ends with the characters ".txt".
*)              Matches any value of word. It is good practice to include this as the last pattern in a case command, to catch any values of word that did not match a previous pattern; that is, to catch any possible invalid values.

Here is an example of patterns at work:
...
txt)
*)
esac

echo
echo
echo
echo
echo

"is
"is
"is
"is
"is

a single alphabetic character
...
" ;;
three characters long
...
txt'" ;;
something else
...
It is also possible to combine multiple patterns using the vertical bar character as a separator. This creates an "or" conditional pattern. This is useful for things such as handling both upper- and lowercase characters. For example:

#!/bin/bash
# case-menu: a menu driven system information program

clear
echo "
Please Select:

A. Display System Information
B. Display Disk Space
C. Display Home Space Utilization
Q. Quit
"
read -p "Enter selection [A, B, C or Q] > "

case $REPLY in
    q|Q)    echo "Program terminated."
            exit
            ;;
    a|A)    echo "Hostname: $HOSTNAME"
            uptime
            ;;
    b|B)    df -h
            ;;
    c|C)    if [[ $(id -u) -eq 0 ]]; then
                echo "Home Space Utilization (All Users)"
                du -sh /home/*
            else
                echo "Home Space Utilization ($USER)"
                du -sh $HOME
            fi
            ;;
    *)      echo "Invalid entry" >&2
            exit 1
            ;;
esac

Here, we modify the case-menu program to use letters instead of digits for menu selection. Notice how the new patterns allow for entry of both upper- and lowercase letters.
Performing Multiple Actions
In versions of bash prior to 4.0, case allowed only one action to be performed on a successful match. After a successful match, the command would terminate. Here we see a script that tests a character:

#!/bin/bash
# case4-1: test a character

read -n 1 -p "Type a character > "
echo
case $REPLY in
    [[:upper:]])    echo "'$REPLY' is upper case." ;;
    [[:lower:]])    echo "'$REPLY' is lower case." ;;
    [[:alpha:]])    echo "'$REPLY' is alphabetic." ;;
    [[:digit:]])    echo "'$REPLY' is a digit." ;;
    [[:graph:]])    echo "'$REPLY' is a visible character." ;;
    [[:punct:]])    echo "'$REPLY' is a punctuation symbol." ;;
    [[:space:]])    echo "'$REPLY' is a whitespace character." ;;
    [[:xdigit:]])   echo "'$REPLY' is a hexadecimal digit." ;;
esac

Running this script produces this:

[me@linuxbox ~]$ case4-1
Type a character > a
'a' is lower case.

Running the script only matches the first successful pattern. However, the character "a" is both lower case and alphabetic, as well as a hexadecimal digit. Prior to version 4.0 there was no way for case to match more than one test. Modern versions of bash add the ";;&" notation to terminate each action, so now we can do this:

#!/bin/bash
# case4-2: test a character

read -n 1 -p "Type a character > "
echo
case $REPLY in
    [[:upper:]])    echo "'$REPLY' is upper case." ;;&
    [[:lower:]])    echo "'$REPLY' is lower case." ;;&
    [[:alpha:]])    echo "'$REPLY' is alphabetic." ;;&
    [[:digit:]])    echo "'$REPLY' is a digit." ;;&
    [[:graph:]])    echo "'$REPLY' is a visible character." ;;&
    [[:punct:]])    echo "'$REPLY' is a punctuation symbol." ;;&
    [[:space:]])    echo "'$REPLY' is a whitespace character." ;;&
    [[:xdigit:]])   echo "'$REPLY' is a hexadecimal digit." ;;&
esac

When we run this script, we get this:

[me@linuxbox ~]$ case4-2
Type a character > a
'a' is lower case.
'a' is alphabetic.
'a' is a visible character.
'a' is a hexadecimal digit.

The addition of the ";;&" syntax allows case to continue on to the next test rather than simply terminating.


Summing Up
The case command is a handy addition to our bag of programming tricks. As we will see in the next chapter, it's the perfect tool for handling certain types of problems.

Further Reading

The Bash Reference Manual section on Conditional Constructs describes the case command in detail:
http://tiswww.case.edu/php/chet/bash/bashref.html

The Advanced Bash-Scripting Guide provides further examples of case applications:
http://tldp.org/LDP/abs/html/testbranch.html
32 – Positional Parameters
One feature our programs still lack is the ability to accept and process command line options and arguments. In this chapter, we will examine the shell features that allow our programs to get access to the contents of the command line.

Accessing The Command Line
The shell provides a set of variables called positional parameters that contain the individual words on the command line. The variables are named 0 through 9. They can be demonstrated this way:

#!/bin/bash
# posit-param: script to view command line parameters

echo "
\$0 = $0
\$1 = $1
\$2 = $2
\$3 = $3
\$4 = $4
\$5 = $5
\$6 = $6
\$7 = $7
\$8 = $8
\$9 = $9
"

When executed with no command line arguments:

[me@linuxbox ~]$ posit-param
$0 = /home/me/bin/posit-param

$1 =
$2 =
$3 =
$4 =
$5 =
$6 =
$7 =
$8 =
$9 =

Even when no arguments are provided, $0 will always contain the first item appearing on the command line, which is the pathname of the program being executed. The shell actually supports more than nine positional parameters. To specify a number greater than nine, surround the number in braces. For example: ${10}, ${55}, ${211}, and so on.
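A quick way to see why the braces matter (the show_tenth function is a hypothetical illustration; positional parameters work the same inside functions):

```shell
#!/bin/bash
# tenth-param: sketch showing brace syntax for parameters past nine

show_tenth () {
    echo "${10}"   # without braces, "$10" would expand as ${1} followed by the literal "0"
}

show_tenth a b c d e f g h i j   # prints: j
```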


Determining The Number of Arguments
The shell also provides a variable, $#, that yields the number of arguments on the command line:

#!/bin/bash
# posit-param: script to view command line parameters

echo "
Number of arguments: $#
\$0 = $0
\$1 = $1
\$2 = $2
\$3 = $3
\$4 = $4
\$5 = $5
\$6 = $6
\$7 = $7
\$8 = $8
\$9 = $9
"

The result:
[me@linuxbox ~]$ posit-param a b c d
Number of arguments: 4
$0 = /home/me/bin/posit-param
$1 = a
$2 = b
$3 = c
$4 = d
$5 =
$6 =
$7 =
$8 =
$9 =

shift – Getting Access To Many Arguments
But what happens when we give the program a large number of arguments, such as this:

[me@linuxbox ~]$ posit-param *
Number of arguments: 82
$0 = /home/me/bin/posit-param
$1 = addresses.ldif
$2 = bin
$3 = bookmarks.html
$4 = debian-500-i386-netinst.iso
$5 = debian-500-i386-netinst.jigdo
$6 = debian-500-i386-netinst.template
$7 = ...
$8 = ...
$9 = ...

On this example system, the wildcard * expands into 82 arguments. How can we process that many? The shell provides a method, albeit a clumsy one, to do this: the shift command causes all the parameters to "move down one" each time it is executed. In fact, by using shift, it is possible to get by with only one parameter (in addition to $0, which never changes). The value of $# is also reduced by one each time shift is executed. We display the current argument, increment the variable count with each iteration of the loop to provide a running count of the number of arguments processed and, finally, execute a shift to load $1 with the next argument.
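A loop along the lines the paragraph describes might be sketched like this (the variable names follow the description above; wrapping it in a function is an adaptation for illustration):

```shell
#!/bin/bash
# posit-param2: sketch of displaying all arguments using shift

show_args () {
    local count=1
    while [[ $# -gt 0 ]]; do
        echo "Argument $count = $1"
        count=$((count + 1))
        shift   # move the remaining parameters down one position
    done
}

show_args a b c
```

Run against a, b, and c, this prints one numbered line per argument; the loop ends when shift has consumed them all and $# reaches zero.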
By way of example, here is a simple file information program:

#!/bin/bash
# file_info: simple file information program

PROGNAME=$(basename $0)

if [[ -e $1 ]]; then
    echo -e "\nFile Type:"
    file $1
    echo -e "\nFile Status:"
    stat $1
else
    echo "$PROGNAME: usage: $PROGNAME file" >&2
    exit 1
fi

This program displays the file type (determined by the file command) and the file status (from the stat command) of a specified file. One interesting feature of this program is the PROGNAME variable. It is given the value that results from the basename $0 command. The basename command removes the leading portion of a pathname, leaving only the base name of a file. In our example, basename removes the leading portion of the pathname contained in the $0 parameter, the full pathname of our example program. This value is useful when constructing messages such as the usage message at the end of the program. By coding it this way, the script can be renamed and the message automatically adjusts to contain the name of the program.
Positional Parameters In Shell Functions
Just as positional parameters are used to pass arguments to shell scripts, they can also be used to pass arguments to shell functions. To demonstrate, we will convert the file_info script into a shell function:

file_info () {
    # file_info: function to display file information
    if [[ -e $1 ]]; then
        echo -e "\nFile Type:"
        file $1
        echo -e "\nFile Status:"
        stat $1
    else
        echo "$FUNCNAME: usage: $FUNCNAME file" >&2
        return 1
    fi
}

Now, if a script that incorporates the file_info shell function calls the function with a filename argument, the argument will be passed to the function. With this capability, we can write many useful shell functions that can be used not only in scripts, but also within the .bashrc file.

Notice that the PROGNAME variable was changed to the shell variable FUNCNAME. The shell automatically updates this variable to keep track of the currently executed shell function. Note that $0 always contains the full pathname of the first item on the command line (i.e., the name of the program) and does not contain the name of the shell function as we might expect.
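A small sketch of that difference (the function name here is made up for illustration):

```shell
#!/bin/bash
# names-demo: sketch contrasting $0 and FUNCNAME inside a function

report_names () {
    echo "FUNCNAME = $FUNCNAME"   # name of the currently executing function
    echo "\$0 = $0"               # still the pathname used to invoke the script
}

report_names
```

The first line always reports report_names, while the second reports the script's own pathname, no matter which function is running.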


Handling Positional Parameters En Masse
It is sometimes useful to manage all the positional parameters as a group. For example, we might want to write a "wrapper" around another program. This means that we create a script or shell function that simplifies the execution of another program. The wrapper, in this case, supplies a list of arcane command line options and then passes a list of arguments to the lower-level program.

The shell provides two special parameters for this purpose. They both expand into the complete list of positional parameters, but differ in rather subtle ways. They are:

Table 32-1: The * And @ Special Parameters

Parameter   Description
$*          Expands into the list of positional parameters, starting with 1. When surrounded by double quotes, it expands into a double quoted string containing all of the positional parameters, each separated by the first character of the IFS shell variable (by default a space character).
$@          Expands into the list of positional parameters, starting with 1. When surrounded by double quotes, it expands each positional parameter into a separate word surrounded by double quotes.

Here is a script that shows these special parameters in action:
#!/bin/bash
# posit-params3: script to demonstrate $* and $@

print_params () {
    echo "\$1 = $1"
    echo "\$2 = $2"
    echo "\$3 = $3"
    echo "\$4 = $4"
}

pass_params () {
    echo -e "\n" '$* :';      print_params    $*
    echo -e "\n" '"$*" :';    print_params    "$*"
    echo -e "\n" '$@ :';      print_params    $@
    echo -e "\n" '"$@" :';    print_params    "$@"
}

pass_params "word" "words with spaces"

In this rather convoluted program, we create two arguments: "word" and "words with spaces", and pass them to the pass_params function. That function, in turn, passes them on to the print_params function, using each of the four methods available with the special parameters $* and $@. When executed, the script reveals the differences:
[me@linuxbox ~]$ posit-param3
$* :
$1 = word
$2 = words
$3 = with
$4 = spaces
"$*" :
$1 = word words with spaces
$2 =
$3 =
$4 =
$@ :
$1 = word
$2 = words
$3 = with
$4 = spaces
"$@" :
$1 = word
$2 = words with spaces
$3 =
$4 =

With our arguments, both $* and $@ produce a four word result:
word words with spaces

"$*" produces a one word result:
"word words with spaces"

"$@" produces a two word result:
"word" "words with spaces"

which matches our actual intent. The lesson here is that even though the shell provides four ways of getting the list of positional parameters, "$@" is by far the most useful, because it preserves the integrity of each positional parameter.
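Because "$@" preserves each argument intact, it is the natural choice for wrapper functions. A hypothetical sketch (the lsl wrapper is invented for illustration):

```shell
#!/bin/bash
# lsl: hypothetical wrapper around ls that always adds -l

lsl () {
    ls -l "$@"   # "$@" forwards every argument as a separate word, spaces intact
}

lsl /tmp
```

Had we written $@ or "$*" instead, an argument such as "some directory" would be split into two words or glued to its neighbors, and the wrapped command would receive the wrong argument list.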


A More Complete Application
After a long hiatus, we are going to resume work on our sys_info_page program. Our next addition will add several command line options to the program:

Output file. We will add an option to specify a name for a file to contain the program's output. It will be specified as either -f file or --file file.

Interactive mode. This option will prompt the user for an output filename and will determine if the specified file already exists. If it does, the user will be prompted before the existing file is overwritten. This option will be specified by either -i or --interactive.

Help. Either -h or --help may be specified to cause the program to output an informative usage message.

Here is the code needed to implement the command line processing:
usage () {
    echo "$PROGNAME: usage: $PROGNAME [-f file | -i]"
    return
}

# process command line options

interactive=
filename=
while [[ -n $1 ]]; do
    case $1 in
        -f | --file)            shift
                                filename=$1
                                ;;
        -i | --interactive)     interactive=1
                                ;;
        -h | --help)            usage
                                exit
                                ;;
        *)                      usage >&2
                                exit 1
                                ;;
    esac
    shift
done

First, we add a shell function called usage to display a message when the help option is invoked or an unknown option is attempted.

Next, we begin the processing loop. This loop continues while the positional parameter $1 is not empty. At the bottom of the loop, we have a shift command to advance the positional parameters to ensure that the loop will eventually terminate.

Within the loop, we have a case statement that examines the current positional parameter to see if it matches any of the supported choices. If a supported parameter is found, it is acted upon. If not, the usage message is displayed and the script terminates with an error.

The -f parameter is handled in an interesting way. When detected, it causes an additional shift to occur, which advances the positional parameter $1 to the filename argument supplied to the -f option.

We next add the code to implement the interactive mode:
# interactive mode

if [[ -n $interactive ]]; then
    while true; do
        read -p "Enter name of output file: " filename
        if [[ -e $filename ]]; then
            read -p "'$filename' exists. Overwrite? [y/n/q] > "
            case $REPLY in
                Y|y)    break
                        ;;
                Q|q)    echo "Program terminated."
                        exit
                        ;;
                *)      continue
                        ;;
            esac
        elif [[ -z $filename ]]; then
            continue
        else
            break
        fi
    done
fi

When interactive mode is enabled, the script prompts the user for an output filename. If the desired output file already exists, the user is prompted to overwrite, choose another filename, or quit the program. Notice how the case statement only detects if the user chooses to overwrite or quit. Any other choice causes the loop to continue and prompts the user again.
...

In order to implement the output filename feature, we must first convert the existing page-writing code into a shell function, for reasons that will become clear in a moment:

write_html_page () {
    cat <<- _EOF_
        <HTML>
            <HEAD>
                <TITLE>$TITLE</TITLE>
            </HEAD>
            <BODY>
                <H1>$TITLE</H1>
                <P>$TIMESTAMP</P>
                $(report_uptime)
                $(report_disk_space)
                $(report_home_space)
            </BODY>
        </HTML>
    _EOF_
    return
}

# output html page

if [[ -n $filename ]]; then
    if touch $filename && [[ -f $filename ]]; then
        write_html_page > $filename
    else
        echo "$PROGNAME: Cannot write file '$filename'" >&2
        exit 1
    fi
else
    write_html_page
fi

The code that handles the logic of the -f option appears at the end of the listing shown above. In it, we test for the existence of a filename and, if one is found, a test is performed to see if the file is writable. To do this, a touch is performed, followed by a test to determine if the resulting file is a regular file. These two tests take care of situations where an invalid pathname is entered (touch will fail) and, if the file already exists, that it's a regular file.

As we can see, the write_html_page function is called to perform the actual generation of the page. Its output is either directed to standard output (if the variable filename is empty) or redirected to the specified file.


Summing Up
With the addition of positional parameters, we can now write fairly functional scripts. For simple, repetitive tasks, positional parameters make it possible to write very useful shell functions that can be placed in a user's .bashrc file.

Our sys_info_page program has grown in complexity and sophistication. Here is a complete listing, with the most recent changes highlighted:
#!/bin/bash
# sys_info_page: program to output a system information page

PROGNAME=$(basename $0)
TITLE="System Information Report For $HOSTNAME"
CURRENT_TIME=$(date +"%x %r %Z")
TIMESTAMP="Generated $CURRENT_TIME, by $USER"

report_uptime () {
    cat <<- _EOF_
        <H2>System Uptime</H2>
        <PRE>$(uptime)</PRE>
    _EOF_
    return
}

report_disk_space () {
    cat <<- _EOF_
        <H2>Disk Space Utilization</H2>
        <PRE>$(df -h)</PRE>
    _EOF_
    return
}

report_home_space () {
    if [[ $(id -u) -eq 0 ]]; then
        cat <<- _EOF_
            <H2>Home Space Utilization (All Users)</H2>
            <PRE>$(du -sh /home/*)</PRE>
        _EOF_
    else
        cat <<- _EOF_
            <H2>Home Space Utilization ($USER)</H2>
            <PRE>$(du -sh $HOME)</PRE>
        _EOF_
    fi
    return
}

usage () {
    echo "$PROGNAME: usage: $PROGNAME [-f file | -i]"
    return
}

write_html_page () {
    cat <<- _EOF_
        <HTML>
            <HEAD>
                <TITLE>$TITLE</TITLE>
            </HEAD>
            <BODY>
                <H1>$TITLE</H1>
                <P>$TIMESTAMP</P>
                $(report_uptime)
                $(report_disk_space)
                $(report_home_space)
            </BODY>
        </HTML>
    _EOF_
    return
}

# process command line options

interactive=
filename=
while [[ -n $1 ]]; do
    case $1 in
        -f | --file)            shift
                                filename=$1
                                ;;
        -i | --interactive)     interactive=1
                                ;;
        -h | --help)            usage
                                exit
                                ;;
        *)                      usage >&2
                                exit 1
                                ;;
    esac
    shift
done

# interactive mode

if [[ -n $interactive ]]; then
    while true; do
        read -p "Enter name of output file: " filename
        if [[ -e $filename ]]; then
            read -p "'$filename' exists. Overwrite? [y/n/q] > "
            case $REPLY in
                Y|y)    break
                        ;;
                Q|q)    echo "Program terminated."
                        exit
                        ;;
                *)      continue
                        ;;
            esac
        elif [[ -z $filename ]]; then
            continue
        else
            break
        fi
    done
fi

# output html page

if [[ -n $filename ]]; then
    if touch $filename && [[ -f $filename ]]; then
        write_html_page > $filename
    else
        echo "$PROGNAME: Cannot write file '$filename'" >&2
        exit 1
    fi
else
    write_html_page
fi

We're not done yet. There are still more things we can do and improvements we can make.

Further Reading

The Bash Hackers Wiki has a good article on positional parameters:
http://wiki.bash-hackers.org/scripting/posparams

The Bash Reference Manual has an article on the special parameters, including $* and $@:
http://www.gnu.org/software/bash/manual/bashref.html

In addition to the techniques covered in this chapter, bash includes a builtin command called getopts, which can be used for processing command line arguments. It is described in the SHELL BUILTIN COMMANDS section of the bash man page and at the Bash Hackers Wiki:
http://wiki.bash-hackers.org/howto/getopts_tutorial

33 – Flow Control: Looping With for
In this final chapter on flow control, we will look at another of the shell's looping constructs. The for loop differs from the while and until loops in that it provides a means of processing sequences during a loop. This turns out to be very useful when programming. Accordingly, the for loop is a very popular construct in bash scripting.

A for loop is implemented, naturally enough, with the for command. In modern versions of bash, for is available in two forms.


for: Traditional Shell Form
The original for command's syntax is:

for variable [in words]; do
    commands
done

where variable is the name of a variable that will increment during the execution of the loop, words is an optional list of items that will be sequentially assigned to variable, and commands are the commands that are to be executed on each iteration of the loop.

The for command is useful on the command line. We can easily demonstrate how it works:

[me@linuxbox ~]$ for i in A B C D; do echo $i; done
A
B
C
D

In this example, for is given a list of four words: "A", "B", "C", and "D". With a list of four words, the loop is executed four times. Each time the loop is executed, a word is assigned to the variable i. Inside the loop, we have an echo command that displays the value of i to show the assignment. As with the while and until loops, the done keyword closes the loop.
The really interesting thing about for is the number of ways we can create the list of words. For example, through brace expansion:

[me@linuxbox ~]$ for i in {A..D}; do echo $i; done
A
B
C
D

or pathname expansion:

[me@linuxbox ~]$ for i in distros*.txt; do echo $i; done
distros-by-date.txt
distros-key-names.txt
distros-names.txt
distros-vernums.txt
txt

or command substitution:

#!/bin/bash
# longest-word: find longest string in a file

while [[ -n $1 ]]; do
    if [[ -r $1 ]]; then
        max_word=
        max_len=0
        for i in $(strings $1); do
            len=$(echo -n $i | wc -c)
            if (( len > max_len )); then
                max_len=$len
                max_word=$i
            fi
        done
        echo "$1: '$max_word' ($max_len characters)"
    fi
    shift
done

In this example, we look for the longest string found within a file. When given one or more filenames on the command line, this program uses the strings program (which is included in the GNU binutils package) to generate a list of readable text "words" in each file. The for loop processes each word in turn and determines if the current word is the longest found so far. When the loop concludes, the longest word is displayed.
If the optional in words portion of the for command is omitted, for defaults to processing the positional parameters. By omitting the list of words in the for command, the positional parameters are used instead. The use of shift has also been eliminated.
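Applied to the longest-word script, the change the text describes might be sketched as follows (the function wrapper is an adaptation for illustration; for i with no in list iterates over the arguments):

```shell
#!/bin/bash
# longest-word2: sketch of the loop revised to use positional parameters

longest_word () {
    local i j len max_word max_len
    for i; do                        # no "in words": iterate over the arguments
        if [[ -r $i ]]; then
            max_word=
            max_len=0
            for j in $(strings "$i"); do
                len=$(echo -n "$j" | wc -c)
                if (( len > max_len )); then
                    max_len=$len
                    max_word=$j
                fi
            done
            echo "$i: '$max_word' ($max_len characters)"
        fi
    done
}
```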
Why i?
You may have noticed that the variable i was chosen for the loop variable in the examples above. Why? No specific reason actually, besides tradition. The variable used with for can be any valid variable, but i is the most common, followed by j and k.

The basis of this tradition comes from the Fortran programming language. In Fortran, undeclared variables starting with the letters I, J, K, L, and M are automatically typed as integers, while variables beginning with any other letter are typed as reals. This behavior led programmers to use the variables I, J, and K for loop variables, since it was less work to use them when a temporary variable (as loop variables often are) was needed.


for: C Language Form
Recent versions of bash have added a second form of for command syntax, one that resembles the form found in the C programming language. Many other languages support this form, as well:

for (( expression1; expression2; expression3 )); do
    commands
done

where expression1, expression2, and expression3 are arithmetic expressions and commands are the commands to be performed during each iteration of the loop.

In terms of behavior, this form is equivalent to the following construct:

(( expression1 ))
while (( expression2 )); do
    commands
    (( expression3 ))
done

expression1 is used to initialize conditions for the loop, expression2 is used to determine when the loop is finished, and expression3 is carried out at the end of each iteration of the loop.

The C language form of for is useful anytime a numeric sequence is needed.
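For instance, a typical counting loop in this form might look like this brief sketch:

```shell
#!/bin/bash
# simple_counter: demo of the C style for command

for (( i=0; i<5; i=i+1 )); do
    echo $i
done
```

The loop initializes i to zero, runs while i is less than five, and increments i after each iteration, printing the numbers 0 through 4.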


Summing Up
With our knowledge of the for command, we will now apply the final improvements to our sys_info_page script. We will revise the report_home_space function to provide more detail for each user's home directory. We still test for the superuser, but instead of performing the complete set of actions as part of the if, we set some variables used later in a for loop.
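The revised function might be sketched as follows (the exact field widths and report fields are illustrative; the structure follows the description above, with the if setting variables that a for loop then uses):

```shell
#!/bin/bash
# report_home_space: sketch of the for-based revision

report_home_space () {
    local format="%8s%10s%10s\n"
    local i dir_list user_name total_files total_dirs total_size
    if [[ $(id -u) -eq 0 ]]; then
        dir_list=/home/*
        user_name="All Users"
    else
        dir_list=$HOME
        user_name=$USER
    fi
    echo "<H2>Home Space Utilization ($user_name)</H2>"
    for i in $dir_list; do
        total_files=$(find "$i" -type f 2>/dev/null | wc -l)
        total_dirs=$(find "$i" -type d 2>/dev/null | wc -l)
        total_size=$(du -sh "$i" 2>/dev/null | cut -f 1)
        echo "<H3>$i</H3>"
        echo "<PRE>"
        printf "$format" "Dirs" "Files" "Size"
        printf "$format" "----" "-----" "----"
        printf "$format" "$total_dirs" "$total_files" "$total_size"
        echo "</PRE>"
    done
    return
}
```

Because the superuser check now only chooses the directory list and the report title, the same loop body serves both the all-users and single-user cases.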


Further Reading

The Advanced Bash-Scripting Guide has a chapter on loops, with a variety of examples using for:
http://tldp.org/LDP/abs/html/loops.html

The Bash Reference Manual describes the looping compound commands, including for:
http://www.gnu.org/software/bash/manual/bashref.html
34 – Strings And Numbers
Computer programs are all about working with data. In past chapters, we have focused on processing data at the file level. However, many programming problems need to be solved using smaller units of data such as strings and numbers.

In this chapter, we will look at several shell features that are used to manipulate strings and numbers. The shell provides a variety of parameter expansions that perform string operations. In addition to arithmetic expansion (which we touched upon in Chapter 7), there is a common command line program called bc, which performs higher level math.

Parameter Expansion
Though parameter expansion came up earlier, we did not cover it in detail because most parameter expansions are used in scripts rather than on the command line. We have already worked with some forms of parameter expansion; for example, shell variables. The shell provides many more.


Basic Parameters
The simplest form of parameter expansion is reflected in the ordinary use of variables. For example, $a, when expanded, becomes whatever the variable a contains. Simple parameters may also be surrounded by braces:

${a}

This has no effect on the expansion, but is required if the variable is adjacent to other text, which may confuse the shell. In this example, we attempt to create a filename by appending the string _file to the contents of the variable a:

[me@linuxbox ~]$ a="foo"
[me@linuxbox ~]$ echo "$a_file"

If we perform this sequence, the result will be nothing, because the shell will try to expand a variable named a_file rather than a. This problem can be solved by adding braces:

[me@linuxbox ~]$ echo "${a}_file"
foo_file

We have also seen that positional parameters greater than nine can be accessed by surrounding the number in braces. For example, to access the eleventh positional parameter, we can do this:

${11}

Expansions To Manage Empty Variables
Several parameter expansions deal with nonexistent and empty variables. These expansions are handy for handling missing positional parameters and assigning default values to parameters.

${parameter:-word}
If parameter is unset (i.e., does not exist) or is empty, this expansion results in the value of word. If parameter is not empty, the expansion results in the value of parameter.

[me@linuxbox ~]$ foo=
[me@linuxbox ~]$ echo ${foo:-"substitute value if unset"}
substitute value if unset
[me@linuxbox ~]$ echo $foo

[me@linuxbox ~]$ foo=bar
[me@linuxbox ~]$ echo ${foo:-"substitute value if unset"}
bar
[me@linuxbox ~]$ echo $foo
bar

${parameter:=word}
If parameter is unset or empty, this expansion results in the value of word. In addition, the value of word is assigned to parameter. If parameter is not empty, the expansion results in the value of parameter. Note that positional and other special parameters cannot be assigned this way.
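For example (behavior per the := rule above; the sample values are arbitrary):

```shell
#!/bin/bash
# demo of ${parameter:=word}: substitute AND assign a default

foo=
echo "${foo:="default value if unset"}"   # prints the default...
echo "$foo"                               # ...and foo now holds it

foo=bar
echo "${foo:="default value if unset"}"   # foo is set, so its own value wins
echo "$foo"
```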


${parameter:?word}
If parameter is unset or empty, this expansion causes the script to exit with an error, and the contents of word are sent to standard error. If parameter is not empty, the expansion results in the value of parameter.

[me@linuxbox ~]$ foo=
[me@linuxbox ~]$ echo ${foo:?"parameter is empty"}
bash: foo: parameter is empty
[me@linuxbox ~]$ echo $?
1
[me@linuxbox ~]$ foo=bar
[me@linuxbox ~]$ echo ${foo:?"parameter is empty"}
bar
[me@linuxbox ~]$ echo $?
0

${parameter:+word}
If parameter is unset or empty, the expansion results in nothing. If parameter is not empty, the value of word is substituted for parameter; however, the value of parameter is not changed.

[me@linuxbox ~]$ foo=
[me@linuxbox ~]$ echo ${foo:+"substitute value if set"}

[me@linuxbox ~]$ foo=bar
[me@linuxbox ~]$ echo ${foo:+"substitute value if set"}
substitute value if set

Expansions That Return Variable Names
The shell has the ability to return the names of variables. This is used in some rather exotic situations.

${!prefix*}
${!prefix@}
This expansion returns the names of existing variables with names beginning with prefix. According to the bash documentation, both forms of the expansion perform identically.

Here, we list all the variables in the environment with names that begin with BASH:

[me@linuxbox ~]$ echo ${!BASH*}
BASH BASH_ARGC BASH_ARGV BASH_COMMAND BASH_COMPLETION
BASH_COMPLETION_DIR BASH_LINENO BASH_SOURCE BASH_SUBSHELL
BASH_VERSINFO BASH_VERSION

String Operations
There is a large set of expansions that can be used to operate on strings. Many of these expansions are particularly well suited for operations on pathnames.

${#parameter}
expands into the length of the string contained by parameter. Normally, parameter is a string; however, if parameter is either @ or *, then the expansion results in the number of positional parameters.

[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo "'$foo' is ${#foo} characters long."
'This string is long.' is 20 characters long.

${parameter:offset}
${parameter:offset:length}
These expansions are used to extract a portion of the string contained in parameter. The extraction begins at offset characters from the beginning of the string and continues until the end of the string, unless the length is specified.

[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo ${foo:5}
string is long.
[me@linuxbox ~]$ echo ${foo:5:6}
string

If the value of offset is negative, it is taken to mean it starts from the end of the string rather than the beginning. Note that negative values must be preceded by a space to avoid confusion with the ${parameter:-word} expansion. length, if present, must not be less than zero.

If parameter is @, the result of the expansion is length positional parameters, starting at offset.

[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo ${foo: -5}
long.
[me@linuxbox ~]$ echo ${foo: -5:2}
lo

${parameter#pattern}
${parameter##pattern}
These expansions remove a leading portion of the string contained in parameter defined by pattern. pattern is a wildcard pattern like those used in pathname expansion. The difference in the two forms is that the # form removes the shortest match, while the ## form removes the longest match.

[me@linuxbox ~]$ foo=file.txt.zip
[me@linuxbox ~]$ echo ${foo#*.}
txt.zip
[me@linuxbox ~]$ echo ${foo##*.}
zip

${parameter%pattern}
${parameter%%pattern}
These expansions are the same as the # and ## expansions above, except they remove text from the end of the string contained in parameter rather than from the beginning.

[me@linuxbox ~]$ foo=file.txt.zip
[me@linuxbox ~]$ echo ${foo%.*}
file.txt
[me@linuxbox ~]$ echo ${foo%%.*}
file

${parameter/pattern/string}
${parameter//pattern/string}
${parameter/#pattern/string}
${parameter/%pattern/string}
This expansion performs a search-and-replace upon the contents of parameter. If text is found matching wildcard pattern, it is replaced with the contents of string. In the normal form, only the first occurrence of pattern is replaced. In the // form, all occurrences are replaced. The /# form requires that the match occur at the beginning of the string, and the /% form requires the match to occur at the end of the string. In every form, /string may be omitted, causing the text matched by pattern to be deleted.

[me@linuxbox ~]$ foo=JPG.JPG
[me@linuxbox ~]$ echo ${foo/JPG/jpg}
jpg.JPG
[me@linuxbox ~]$ echo ${foo//JPG/jpg}
jpg.jpg
[me@linuxbox ~]$ echo ${foo/#JPG/jpg}
jpg.JPG
[me@linuxbox ~]$ echo ${foo/%JPG/jpg}
JPG.jpg
Parameter expansion is a good thing to know. The string manipulation expansions can be used as substitutes for other common commands such as sed and cut. Expansions improve the efficiency of scripts by eliminating the use of external programs. As an example, we will modify the longest-word program discussed in the previous chapter to use the parameter expansion ${#j} in place of the command substitution $(echo -n $j | wc -c) and its resulting subshell, like so:
#!/bin/bash
# longest-word3: find longest string in a file

for i; do
    if [[ -r $i ]]; then
        max_word=
        max_len=0
        for j in $(strings $i); do
            len=${#j}
            if (( len > max_len )); then
                max_len=$len
                max_word=$j
            fi
        done
        echo "$i: '$max_word' ($max_len characters)"
    fi
done

Next, we will compare the efficiency of the two versions by using the time command:

[me@linuxbox ~]$ time longest-word2 dirlist-usr-bin.txt
dirlist-usr-bin.txt: 'scrollkeeper-get-extended-content-list' (38 characters)

real    0m3.618s
user    0m1.544s
sys     0m1.768s

[me@linuxbox ~]$ time longest-word3 dirlist-usr-bin.txt
dirlist-usr-bin.txt: 'scrollkeeper-get-extended-content-list' (38 characters)

real    0m0.060s
user    0m0.056s
sys     0m0.008s

The original version of the script takes 3.618 seconds to scan the text file, while the new version, using parameter expansion, takes only 0.060 seconds, a very significant improvement.
Case Conversion
bash has four parameter expansions and two options to the declare command to support the uppercase/lowercase conversion of strings.

So what is case conversion good for? Aside from aesthetics, it has an important role in programming. Let's consider the case of a database look-up. Imagine that a user has entered a string into a data input field that we want to look up in a database. It's possible the user will enter the value in all uppercase letters or lowercase letters or a combination of both. We certainly don't want to populate our database with every possible permutation of upper and lowercase spellings. What to do?

A common approach to this problem is to normalize the user's input. That is, convert it into a standardized form before we attempt the database look-up. We can do this by converting all of the characters in the user's input to either lower or uppercase and ensure that the database entries are normalized the same way.

Using declare, we can force a variable to always contain the desired format no matter what is assigned to it:

#!/bin/bash
# ul-declare: demonstrate case conversion via declare

declare -u upper
declare -l lower

if [[ $1 ]]; then
    upper="$1"
    lower="$1"
    echo $upper
    echo $lower
fi

In the above script, we use declare to create two variables, upper and lower. We assign the value of the first command line argument to each of them and then display them. Because of the -u and -l options, whatever value is assigned is converted to upper- or lowercase automatically.

There are four parameter expansions that perform upper/lowercase conversion:

Table 34-1: Case Conversion Parameter Expansions

Format              Result
${parameter,,}      Expand the value of parameter into all lowercase.
${parameter,}       Expand the value of parameter, changing only the first character to lowercase.
${parameter^^}      Expand the value of parameter into all uppercase letters.
${parameter^}       Expand the value of parameter, changing only the first character to uppercase (capitalization).
Here is a script that demonstrates these expansions:

#!/bin/bash
# ul-param: demonstrate case conversion via parameter expansion

if [[ $1 ]]; then
    echo ${1,,}
    echo ${1,}
    echo ${1^^}
    echo ${1^}
fi

Here is the script in action:

[me@linuxbox ~]$ ul-param aBc
abc
aBc
ABC
ABc

Again, we process the first command line argument and output the four variations supported by the parameter expansions.


Arithmetic Evaluation And Expansion
We looked at arithmetic expansion in Chapter 7. It is used to perform various arithmetic operations on integers. Its basic form is:

$((expression))

where expression is a valid arithmetic expression.

This is related to the compound command (( )) used for arithmetic evaluation (truth tests).

In previous chapters, we saw some of the common types of expressions and operators. Here, we will look at a more complete list.
Number Bases
Back in Chapter 9, we got a look at octal (base 8) and hexadecimal (base 16) numbers. In arithmetic expressions, the shell supports integer constants in any base.

Table 34-2: Specifying Different Number Bases

Notation        Description
number          By default, numbers without any notation are treated as decimal (base 10) integers.
0number         Numbers preceded by a zero are considered octal.
0xnumber        Hexadecimal notation
base#number     number is in base

Some examples:

[me@linuxbox ~]$ echo $((0xff))
255
[me@linuxbox ~]$ echo $((2#11111111))
255

In the examples above, we print the value of the hexadecimal number ff (the largest two-digit number) and the largest eight-digit binary (base 2) number.
Unary Operators
There are two unary operators, + and -, which are used to indicate if a number is positive or negative, respectively. For example, -5.

Simple Arithmetic
The ordinary arithmetic operators are supported: + (addition), - (subtraction), * (multiplication), / (integer division), ** (exponentiation), and % (modulo, i.e., remainder).

Since the shell's arithmetic only operates on integers, the results of division are always whole numbers:

[me@linuxbox ~]$ echo $(( 5 / 2 ))
2

This makes the determination of a remainder in a division operation more important:

[me@linuxbox ~]$ echo $(( 5 % 2 ))
1

By using the division and modulo operators, we can determine that 5 divided by 2 results in 2, with a remainder of 1.
Calculating a remainder is useful in loops. It allows an operation to be performed at specified intervals during the loop's execution.

Assignment
Although its uses may not be immediately apparent, arithmetic expressions may perform assignment. We have performed assignment many times, though in a different context. Each time we give a variable a value, we are performing assignment. We can also do it within arithmetic expressions:
[me@linuxbox ~]$ foo=
[me@linuxbox ~]$ echo $foo

[me@linuxbox ~]$ if (( foo = 5 )); then echo "It is true."; fi
It is true.
[me@linuxbox ~]$ echo $foo
5

In the example above, we first assign an empty value to the variable foo and verify that it is indeed empty. Next, we perform an if with the compound command (( foo = 5 )). This process does two interesting things: 1) it assigns the value of 5 to the variable foo, and 2) it evaluates to true because foo was assigned a nonzero value.

Note: It is important to remember the exact meaning of the = in the expression above. A single = performs assignment: foo = 5 says "make foo equal to 5," while == evaluates equivalence: foo == 5 says "does foo equal 5?" This can be very confusing because the test command accepts a single = for string equivalence. This is yet another reason to use the more modern [[ ]] and (( )) compound commands in place of test.
In addition to the =, the shell also provides notations that perform some very useful assignments:
Table 34-4: Assignment Operators

Notation             Description
parameter = value    Simple assignment. Assigns value to parameter.
parameter += value   Addition. Equivalent to parameter = parameter + value.
parameter -= value   Subtraction. Equivalent to parameter = parameter - value.
parameter *= value   Multiplication. Equivalent to parameter = parameter * value.
parameter /= value   Integer division. Equivalent to parameter = parameter / value.
parameter %= value   Modulo. Equivalent to parameter = parameter % value.
parameter++          Variable post-increment. Equivalent to parameter = parameter + 1 (however, see the discussion below).
parameter--          Variable post-decrement. Equivalent to parameter = parameter - 1.
++parameter          Variable pre-increment. Equivalent to parameter = parameter + 1.
--parameter          Variable pre-decrement. Equivalent to parameter = parameter - 1.


These assignment operators provide a convenient shorthand for many common arithmetic
tasks
...
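As a quick sketch of the shorthand operators in action (the variable name n and values are arbitrary):

```shell
#!/bin/bash
# A quick tour of the shorthand assignment operators.
n=10
((n += 5))   # n = n + 5  -> 15
((n /= 2))   # integer division -> 7
((n %= 4))   # remainder of 7 / 4 -> 3
((n++))      # post-increment -> 4
echo $n
```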
This style of notation is taken
from the C programming language and has been incorporated by several other programming languages, including bash
...
While they both
either increment or decrement the parameter by one, the two placements have a subtle
difference
...
If placed after, the operation is performed after
the parameter is returned
...
Here is a
demonstration:
[me@linuxbox ~]$ foo=1
[me@linuxbox ~]$ echo $((foo++))
1
[me@linuxbox ~]$ echo $foo
2

If we assign the value of one to the variable foo and then increment it with the ++ operator placed after the parameter name, foo is returned with the value of one
However, if we look at the value of the variable a second time, we see the incremented value.
If we
place the ++ operator in front of the parameter, we get the more expected behavior:
[me@linuxbox ~]$ foo=1
[me@linuxbox ~]$ echo $((++foo))
2
[me@linuxbox ~]$ echo $foo
2

For most shell applications, prefixing the operator will be the most useful
...
We will make some improvements to our modulo script to tighten it up a bit:
#!/bin/bash

# modulo2 : demonstrate the modulo operator

for ((i = 0; i <= 20; ++i)); do
    if (((i % 5) == 0)); then
        printf "<%d> " $i
    else
        printf "%d " $i
    fi
done
printf "\n"

Bit Operations
One class of operators manipulates numbers in an unusual way. These are the bit operators. They operate on numbers at the bit level. They are used for certain kinds of low level tasks, often involving setting or reading bit-flags.

Table 34-5: Bit Operators

Operator   Description
~          Bitwise negation. Negate all the bits in a number.
<<         Left bitwise shift. Shift all the bits in a number to the left.
>>         Right bitwise shift. Shift all the bits in a number to the right.
&          Bitwise AND. Perform an AND operation on all the bits in two numbers.
|          Bitwise OR. Perform an OR operation on all the bits in two numbers.
^          Bitwise XOR. Perform an exclusive OR operation on all the bits in two numbers.

Here we will demonstrate producing a list of powers of 2, using the left bitwise shift operator:
[me@linuxbox ~]$ for ((i=0;i<8;++i)); do echo $((1<<i)); done
1
2
4
8
16
32
64
128
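A more typical use of the bit operators is packing and testing flags. Here is a minimal sketch (the flag names and values are arbitrary, chosen so each flag occupies its own bit):

```shell
#!/bin/bash
# Sketch: using the bit operators to pack and test flags.
read_flag=1    # binary 001
write_flag=2   # binary 010
exec_flag=4    # binary 100
perms=$((read_flag | exec_flag))   # set two flags: binary 101 = 5
# Bitwise AND isolates one flag; the result is non-zero only if the flag is set.
if (( perms & write_flag )); then has_write=yes; else has_write=no; fi
echo "$perms $has_write"
```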

Logic
As we discovered in Chapter 27, the (( )) compound command supports a variety of
comparison operators
...
Here is
the complete list:
Table 34-6: Comparison Operators

Operator            Description
<=                  Less than or equal to
>=                  Greater than or equal to
<                   Less than
>                   Greater than
==                  Equal to
!=                  Not equal to
&&                  Logical AND
||                  Logical OR
expr1?expr2:expr3   Comparison (ternary) operator. If expression expr1 evaluates to be non-zero (arithmetic true), then expr2; else expr3.


When used for logical operations, expressions follow the rules of arithmetic logic; that is,
expressions that evaluate as zero are considered false, while non-zero expressions are
considered true
The ternary operator deserves special mention. This operator (which is modeled after the one in the C programming language) performs a standalone logical test. It can be used as a kind of if/then/else statement. It acts on three arithmetic expressions (strings won't work), and if the first expression is true (or non-zero) the second expression is performed. Otherwise, the third expression is performed. We can try this on the command line:
[me@linuxbox ~]$ a=0
[me@linuxbox ~]$ ((a<1?++a:--a))
[me@linuxbox ~]$ echo $a
1
[me@linuxbox ~]$ ((a<1?++a:--a))
[me@linuxbox ~]$ echo $a
0

Here we see a ternary operator in action
This example implements a toggle.
Each time
the operator is performed, the value of the variable a switches from zero to one or vice
versa
One note of caution: performing assignment within the ternary expressions is not straightforward. When attempted, bash will declare an error:
[me@linuxbox ~]$ a=0
[me@linuxbox ~]$ ((a<1?a+=1:a-=1))
bash: ((: a<1?a+=1:a-=1: attempted assignment to non-variable (error
token is "-=1")

This problem can be mitigated by surrounding the assignment expression with parentheses:
[me@linuxbox ~]$ ((a<1?(a+=1):(a-=1)))
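Here is a small runnable sketch of the parenthesized-assignment form (the variable name a and the bound of 6 are arbitrary; they are chosen so that neither result is zero, which would make (( )) return a false exit status):

```shell
#!/bin/bash
# Sketch: a ternary toggle with parenthesized assignments.
a=5
((a < 6 ? (a += 1) : (a -= 1)))   # 5 < 6, so the second expression runs: a becomes 6
first=$a
((a < 6 ? (a += 1) : (a -= 1)))   # 6 is not < 6, so the third expression runs: a becomes 5
second=$a
echo "$first $second"
```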

Next, we see a more complete example of using arithmetic operators in a script that produces a simple table of numbers:
#!/bin/bash

# arith-loop: script to demonstrate arithmetic operators

finished=0
a=0
printf "a\ta**2\ta**3\n"
printf "=\t====\t====\n"
until ((finished)); do
    b=$((a**2))
    c=$((a**3))
    printf "%d\t%d\t%d\n" $a $b $c
    ((a<10?++a:(finished=1)))
done

In this script, we implement an until loop based on the value of the finished variable
Initially, the variable is set to zero (arithmetic false), and the loop continues as long as it remains false.
Within the loop, we calculate the square and cube of the counter variable
a
At the end of the loop, the value of the counter variable is evaluated using the ternary operator.
If it is less than
10 (the maximum number of iterations), it is incremented by one, else the variable finished is given the value of one, making finished arithmetically true, thereby terminating the loop
bc – An Arbitrary Precision Calculator Language

We have seen how the shell can handle all types of integer arithmetic, but what if we need to perform higher math or even just use floating point numbers? The answer is, we can't. At least not directly with the shell. To do this, we need to use an external program. There are several approaches we can take. Embedding Perl or AWK programs is one possible solution. Another approach is to use a specialized calculator program. One such program found on most Linux systems is called bc.

The bc program reads a file written in its own C-like language and executes it
A bc script may be a separate file, or it may be read from standard input.
The bc language supports quite a few features including variables, loops, and programmer-defined functions
...
bc is well documented by its
man page
...
We’ll write a bc script to add 2 plus 2:
/* A very simple bc script */
2 + 2

The first line of the script is a comment
...
Comments, which may span multiple lines, begin with /* and
end with */
If we save this script as foo.bc, we can run it this way:

[me@linuxbox ~]$ bc foo.bc
bc 1.06.94
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
4

If we look carefully, we can see the result at the very bottom, after the copyright message
This message can be suppressed with the -q (quiet) option.

bc can also be used interactively:
[me@linuxbox ~]$ bc -q
2 + 2
4
quit

When using bc interactively, we simply type the calculations we wish to perform, and
the results are immediately displayed
To end an interactive session, we type quit.

It is also possible to pass a script to bc via standard input:
[me@linuxbox ~]$ bc < foo.bc
4

The ability to take standard input means that we can use here documents, here strings, and pipes to pass scripts. This is a here string example:
[me@linuxbox ~]$ bc <<< "2+2"
4


An Example Script
As a real-world example, we will construct a script that performs a common calculation, monthly loan payments. In the script below, we use a here document to pass a script to bc:

#!/bin/bash

# loan-calc : script to calculate monthly loan payments

PROGNAME=$(basename $0)

usage () {
    cat << EOF
Usage: $PROGNAME PRINCIPAL INTEREST MONTHS

Where:

PRINCIPAL is the amount of the loan.
INTEREST is the APR as a number (7% = 0.07).
MONTHS is the length of the loan's term.

EOF
}

if (($# != 3)); then
    usage
    exit 1
fi

principal=$1
interest=$2
months=$3

bc << EOF
scale = 10
i = $interest / 12
p = $principal
n = $months
a = p * ((i * ((1 + i) ^ n)) / (((1 + i) ^ n) - 1))
print a, "\n"
EOF

When executed, the results look like this:

[me@linuxbox ~]$ loan-calc 135000 .0775 180
1270.7222490000

This example calculates the monthly payment for a $135,000 loan at 7.75% APR for 180 months (15 years). Notice the precision of the answer.
This is determined by the value
given to the special scale variable in the bc script
A full description of the bc scripting language is provided by the bc man page.
While its mathematical notation is slightly
different from that of the shell (bc more closely resembles C), most of it will be quite familiar, based on what we have learned so far
Summing Up

In this chapter, we looked at some of the many things that can be done with strings and numbers in shell scripts.
As our experience with scripting grows, the ability to effectively manipulate strings and numbers will prove extremely valuable
Extra Credit

While the basic functionality of the loan-calc script is in place, the script is far from complete. For extra credit, try adding full verification of the command line arguments and a better format for the output.

Further Reading

The Bash Reference Manual covers shell parameter expansion:
http://www.gnu.org/software/bash/manual/bashref.html#Shell-Parameter-Expansion

The Bash Hackers Wiki has a good discussion of parameter expansion:
http://wiki.bash-hackers.org/syntax/pe

The Wikipedia has a good article describing bit operations:
http://en.wikipedia.org/wiki/Bit_operation

and an article on ternary operations:
http://en.wikipedia.org/wiki/Ternary_operation

as well as a description of the formula for calculating loan payments used in our loan-calc script:
http://en.wikipedia.org/wiki/Amortization_calculator

35 – Arrays

In the last chapter, we looked at how the shell can manipulate strings and numbers
...

In this chapter, we will look at another kind of data structure called an array, which holds
multiple values
Arrays are a feature of virtually every programming language.
The shell
supports them, too, though in a rather limited fashion
...


What Are Arrays?
Arrays are variables that hold more than one value at a time
...
Let’s consider a spreadsheet as an example
...
It has both rows and columns, and an individual cell in the spreadsheet can
be located according to its row and column address
...
An
array has cells, which are called elements, and each element contains data
...

Most programming languages support multidimensional arrays
...
Many languages support arrays with an arbitrary number of dimensions, though two- and three-dimensional arrays are probably the most commonly used
...
We can think of them as a spreadsheet
with a single column
...
Array support first appeared in bash version 2
The original Unix shell program, sh, did not support arrays at all.


Creating An Array
Array variables are named just like other bash variables, and are created automatically
when they are accessed
Here is an example:

[me@linuxbox ~]$ a[1]=foo
[me@linuxbox ~]$ echo ${a[1]}
foo

Here we see an example of both the assignment and access of an array element. With the first command, element 1 of array a is assigned the value "foo". The second command displays the stored value of element 1. The use of braces in the second command is required to prevent the shell from attempting pathname expansion on the name of the array element.

An array can also be created with the declare command:

[me@linuxbox ~]$ declare -a a

Using the -a option, this example of declare creates the array a.


Assigning Values To An Array
Values may be assigned in one of two ways. Single values may be assigned using the following syntax:

name[subscript]=value

where name is the name of the array and subscript is an integer (or arithmetic expression) greater than or equal to zero. Note that the first element of an array is subscript zero, not one. value may be a string or an integer.

Multiple values may be assigned using the following syntax:

name=(value1 value2 ...)

where name is an array name and value1 value2 ... are values assigned sequentially to elements of the array, starting with element zero.

Let’s consider a simple data-gathering and presentation example
...
From this
data, our script will output a table showing at what hour of the day the files were last
modified
...
This
script, called hours, produces this result:
[me@linuxbox ~]$ hours .
Hour  Files  Hour  Files
----  -----  ----  -----
00    0      12    ...
01    1      13    ...
02    0      14    ...
03    0      15    ...
04    1      16    ...
05    1      17    ...
06    6      18    ...
07    3      19    ...
08    1      20    ...
09    14     21    ...
10    2      22    ...
11    5      23    ...

The script takes the name of a directory as an argument.
It produces a table showing, for each hour of the day (0-23), how many files were last modified
The code is as follows:

#!/bin/bash

# hours : script to count files by modification time

usage () {
    echo "usage: ${0##*/} directory" >&2
}

# Check that argument is a directory
if [[ ! -d $1 ]]; then
    usage
    exit 1
fi

# Initialize array
for i in {0..23}; do hours[i]=0; done

# Collect data
for i in $(stat -c %y "$1"/* | cut -c 12-13); do
    j=${i/#0}
    ((++hours[j]))
    ((++count))
done

# Display data
echo -e "Hour\tFiles\tHour\tFiles"
echo -e "----\t-----\t----\t-----"
for i in {0..11}; do
    j=$((i + 12))
    printf "%02d\t%d\t%02d\t%d\n" $i ${hours[i]} $j ${hours[j]}
done
printf "Total files = %d\n" $count
In the
first section, we check that there is a command line argument and that it is a directory
...

The second section initializes the array hours
...
There is no special requirement to prepare arrays prior to use, but our script
needs to ensure that no element is empty
...
By employing brace expansion ({0..23}), we are able to easily generate a sequence of words for the for command.

The next section gathers the data by running the stat program on each file in the directory
We use the cut command to extract the two-digit hour field from each time stamp.
Inside the loop, we need to
remove leading zeros from the hour field, since the shell will try (and ultimately fail) to
interpret values “00” through “09” as octal numbers (see Table 34-1)
...
Finally, we increment a counter (count) to track the total number of files in the directory
...
We first output a couple of
header lines and then enter a loop that produces two columns of output
Finally, we output the total number of files.

Array Operations
There are many common array operations. Such things as deleting arrays, determining their size, sorting, and so on have many applications in scripting.

Outputting The Entire Contents Of An Array

The subscripts * and @ can be used to access every element in an array. As with positional parameters, the @ notation is the more useful of the two. Here is a demonstration:

[me@linuxbox ~]$ animals=("a dog" "a cat" "a fish")
[me@linuxbox ~]$ for i in ${animals[*]}; do echo $i; done
a
dog
a
cat
a
fish
[me@linuxbox ~]$ for i in ${animals[@]}; do echo $i; done
a
dog
a
cat
a
fish
[me@linuxbox ~]$ for i in "${animals[*]}"; do echo $i; done
a dog a cat a fish
[me@linuxbox ~]$ for i in "${animals[@]}"; do echo $i; done
a dog
a cat
a fish

We create the array animals and assign it three two-word strings. We then execute four loops to see the effect of word-splitting on the array contents. The behavior of the notations ${animals[*]} and ${animals[@]} is identical until they are quoted. The * notation results in a single word containing the array's contents, while the @ notation results in three words, which matches the array's "real" contents.
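The difference is easy to verify programmatically. This sketch counts the words each quoted expansion produces (the array name and contents are arbitrary):

```shell
#!/bin/bash
# Sketch: counting the words produced by quoted * versus quoted @ expansion.
animals=("a dog" "a cat" "a fish")
star_words=0
for i in "${animals[*]}"; do ((++star_words)); done   # one word: "a dog a cat a fish"
at_words=0
for i in "${animals[@]}"; do ((++at_words)); done     # three words, one per element
echo "$star_words $at_words"
```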
Determining The Number Of Array Elements

Using parameter expansion, we can determine the number of elements in an array. This is done in much the same way as finding the length of a string.
Here is an example:

[me@linuxbox ~]$ a[100]=foo
[me@linuxbox ~]$ echo ${#a[@]} # number of array elements
1
[me@linuxbox ~]$ echo ${#a[100]} # length of element 100
3

We create array a and assign the string "foo" to element 100. Next, we examine the number of elements in the array with the ${#a[@]} parameter expansion. Finally, we look at the length of element 100, which contains the string "foo". It is interesting to note that while bash allows arrays to be created this way, it reports only the single element we assigned, not 101 elements. This differs from the behavior of some other languages, in which the unused elements of the array (elements 0-99) would be initialized with empty values and counted.

Finding The Subscripts Used By An Array

Since bash allows arrays to contain "sparse" subscripts, it is sometimes useful to determine which elements actually exist.
This can be done with a parameter expansion using the following forms:
${!array[*]}
${!array[@]}
where array is the name of an array variable
As with positional parameters, the @ form is the more useful of the two, since it expands the subscripts into separate words:

[me@linuxbox ~]$ foo=([2]=a [4]=b [6]=c)
[me@linuxbox ~]$ for i in "${!foo[@]}"; do echo $i; done
2
4
6

Adding Elements To The End Of An Array

Knowing the number of elements in an array is no help if we need to append values to the end of an array. Fortunately, the shell provides us with a solution. By using the += assignment operator, we can automatically append values to the end of an array. Here, we assign three values to the array foo, and then append three more:

[me@linuxbox ~]$ foo=(a b c)
[me@linuxbox ~]$ echo ${foo[@]}
a b c
[me@linuxbox ~]$ foo+=(d e f)
[me@linuxbox ~]$ echo ${foo[@]}
a b c d e f

Sorting An Array

Just as with spreadsheets, it is often necessary to sort the data in a column.
The
shell has no direct way of doing this, but it's not hard to do with a little coding:
#!/bin/bash
# array-sort : Sort an array
a=(f e d c b a)
echo "Original array: ${a[@]}"
a_sorted=($(for i in "${a[@]}"; do echo $i; done | sort))
echo "Sorted array:   ${a_sorted[@]}"

When executed, the script produces this:
[me@linuxbox ~]$ array-sort
Original array: f e d c b a
Sorted array:
a b c d e f

The script operates by copying the contents of the original array (a) into a second array
(a_sorted) with a tricky piece of command substitution
The individual elements of the original array are fed into the sort command, and the resulting output is captured into the elements of the new array.


Deleting An Array
To delete an array, use the unset command:
[me@linuxbox ~]$ foo=(a b c d e f)
[me@linuxbox ~]$ echo ${foo[@]}
a b c d e f

[me@linuxbox ~]$ unset foo
[me@linuxbox ~]$ echo ${foo[@]}
[me@linuxbox ~]$

unset may also be used to delete single array elements:
[me@linuxbox ~]$ foo=(a b c d e f)
[me@linuxbox ~]$ echo ${foo[@]}
a b c d e f
[me@linuxbox ~]$ unset 'foo[2]'
[me@linuxbox ~]$ echo ${foo[@]}
a b d e f

In this example, we delete the third element of the array, subscript 2
The array element must be quoted to prevent the shell from performing pathname expansion on it.

Interestingly, the assignment of an empty value to an array does not empty its contents:
[me@linuxbox ~]$ foo=(a b c d e f)
[me@linuxbox ~]$ foo=
[me@linuxbox ~]$ echo ${foo[@]}
b c d e f

Any reference to an array variable without a subscript refers to element zero of the array:
[me@linuxbox ~]$ foo=(a b c d e f)
[me@linuxbox ~]$ echo ${foo[@]}
a b c d e f
[me@linuxbox ~]$ foo=A
[me@linuxbox ~]$ echo ${foo[@]}
A b c d e f

Associative Arrays
Recent versions of bash now support associative arrays. Associative arrays use strings rather than integers as array indexes. This capability allows interesting new approaches to managing data. For example, we can create an array called colors and use color names as indexes:

declare -A colors
colors["red"]="#ff0000"
colors["green"]="#00ff00"
colors["blue"]="#0000ff"

Unlike integer indexed arrays, which are created by merely referencing them, associative arrays must be created with the declare command using the -A option. Associative array elements are accessed in much the same way as integer indexed arrays:

echo ${colors["blue"]}

In the next chapter, we will look at a script that makes good use of associative arrays to
produce an interesting report
...
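A typical use of an associative array is tallying occurrences of strings. Here is a minimal sketch of the idea (the word list is arbitrary; this requires bash 4 or later):

```shell
#!/bin/bash
# Sketch: counting occurrences with an associative array.
declare -A count
for word in apple pear apple cherry apple pear; do
    ((++count[$word]))   # pre-increment: an unset element starts at zero
done
echo "${count[apple]} ${count[pear]} ${count[cherry]}"
```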
Summing Up

In this chapter, we examined arrays and saw how they are put to work in shell scripts. The bash man page describes additional parameter expansions that operate on arrays. Most of these are rather obscure, but they may provide occasional utility in some special circumstances. Arrays are seldom seen in shell programming. This lack of popularity is unfortunate because arrays are widely used in other programming languages and provide a powerful tool for solving many kinds of programming problems.

Arrays and loops have a natural affinity and are often used together. The

for ((expr; expr; expr))

form of loop is particularly well-suited to calculating array subscripts.

Further Reading

A couple of Wikipedia articles about the data structures found in this chapter:
http://en.wikipedia.org/wiki/Scalar_(computing)
http://en.wikipedia.org/wiki/Associative_array

36 – Exotica
While we
have certainly covered a lot of ground in the previous chapters, there are many bash features that we have not covered
Most of these are fairly obscure. However, there are a few that, while not in common use, are helpful for certain programming problems. In this, our final chapter, we will look at some of them.


Group Commands And Subshells
bash allows commands to be grouped together
This can be done in one of two ways, either with a group command or with a subshell. Here are examples of the syntax of each:

Group command:

{ command1; command2; [command3; ...] }

Subshell:

(command1; command2; [command3; ...])

The two forms differ in that a group command surrounds its commands with braces and a subshell uses parentheses. It is important to note that, due to the way bash implements group commands, the braces must be separated from the commands by a space, and the last command must be terminated with either a semicolon or a newline prior to the closing brace.

So what are group commands and subshells good for? While they have an important difference (which we will get to in a moment), they are both used to manage redirection
Let's consider a script segment that performs redirections on multiple commands:

ls -l > output.txt
echo "Listing of foo.txt" >> output.txt
cat foo.txt >> output.txt

This is pretty straightforward: three commands with their output redirected to a file named output.txt. Using a group command, we could code this as follows:

{ ls -l; echo "Listing of foo.txt"; cat foo.txt; } > output.txt

Using a subshell is similar:

(ls -l; echo "Listing of foo.txt"; cat foo.txt) > output.txt

Using this technique we have saved ourselves some typing, but where a group command or subshell really shines is with pipelines. When constructing a pipeline of commands, it is often useful to combine the results of several commands into a single stream. Group commands and subshells make this easy:

{ ls -l; echo "Listing of foo.txt"; cat foo.txt; } | lpr

Here we have combined the output of our three commands and piped them into the input of lpr to produce a printed report.

In the script that follows, we will use group commands and look at several programming techniques that can be employed in conjunction with associative arrays.
...
At the end of
listing, the script prints a tally of the number of files belonging to each owner and group
...
6
/usr/bin/2to3
/usr/bin/a2p
/usr/bin/abrowser
/usr/bin/aconnect
/usr/bin/acpi_fakekey
/usr/bin/acpi_listen
/usr/bin/add-apt-repository

...


...
In this script we create five arrays as follows:
files contains the names of the files in the directory, indexed by filename
file_group contains the group owner of each file, indexed by filename
file_owner contains the owner of each file, indexed by file name
groups contains the number of files belonging to the indexed group
owners contains the number of files belonging to the indexed owner
Lines 7-10: Checks to see that a valid directory name was passed as a positional parameter
...

Lines 12-20: Loop through the files in the directory
...
Likewise the file name itself is assigned to the files array (line 15)
...

Lines 22-27: The list of files is output
...
This allows for the possibility that a file name may contain embedded
spaces
...
This permits the entire output of the loop to be piped into the sort command
...

Lines 29-40: These two loops are similar to the file list loop except that they use the "${!array[@]}" expansion, which expands into the list of array indexes rather than the list of array elements.
...
Whereas a group command executes all of its commands in the current shell, a subshell (as the name suggests)
executes its commands in a child copy of the current shell
...
When the subshell exits, the copy
of the environment is lost, so any changes made to the subshell’s environment (including
variable assignments) are lost as well.
In most cases, unless a script requires a subshell, group commands are preferable to subshells. Group commands are both faster and require less memory. We saw an example of the subshell environment problem earlier, when we discovered that a read command in a pipeline does not work as we might intuitively expect. To recap, if we construct a pipeline like this:
echo "foo" | read
echo $REPLY

The content of the REPLY variable is always empty because the read command is executed in a subshell, and its copy of REPLY is destroyed when the subshell terminates
...
Fortunately, the shell provides an exotic form of
expansion called process substitution that can be used to work around this problem
Process substitution is expressed in two ways. For processes that produce standard output:

<(list)

or for processes that intake standard input:

>(list)

where list is a list of commands.

To solve our problem with read, we can employ process substitution like this:
read < <(echo "foo")
echo $REPLY


36 – Exotica
Process substitution allows us to treat the output of a subshell as an ordinary file for purposes of redirection
...

Process substitution is often used with loops containing read
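As a minimal runnable sketch of this pattern (the printf pipeline stands in for any multi-command listing, and the loop body simply counts lines), note how the counter survives the loop because no subshell is involved:

```shell
#!/bin/bash
# Sketch: reading a pipeline's output via process substitution, so the
# while loop (and its counter) runs in the current shell, not a subshell.
count=0
while read -r line; do
    ((++count))
done < <(printf 'one\ntwo\nthree\n')
echo $count
```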
Here is an example of a loop using process substitution to read the output of a directory listing:

#!/bin/bash

# pro-sub : demo of process substitution

while read attr links owner group size date time filename; do
    cat << EOF
Filename:   $filename
Size:       $size
Owner:      $owner
Group:      $group
Modified:   $date $time
Links:      $links
Attributes: $attr

EOF
done < <(ls -l | tail -n +2)

The loop executes read for each line of a directory listing. The listing itself is produced on the final line of the script. This line redirects the output of the process substitution into the standard input of the loop. The tail command is included in the process substitution pipeline to eliminate the first line of the listing, which is not needed.
When executed, the script produces output like this:

Filename:   ...ldif
Size:       14540
Owner:      me
Group:      me
Modified:   2009-04-02 11:12
Links:      1
Attributes: -rw-r--r--

Filename:   bin
Size:       4096
Owner:      me
Group:      me
Modified:   2009-07-10 07:31
Links:      2
Attributes: drwxr-xr-x

Filename:   bookmarks
...
Traps

We have seen how programs can respond to signals. We can add this capability to our scripts, too. While the scripts we have written so far have not needed it, larger and more complicated scripts may benefit from having a signal handling routine.

When we design a large, complicated script, it is important to consider what happens if
the user logs off or shuts down the computer while the script is running
When such an event occurs, a signal will be sent to all affected processes.
In turn, the programs representing those processes can perform actions to ensure a proper and orderly termination of
the program
Let's say, for example, that we wrote a script that created a temporary file during its execution.
In the course of good design, we would have the script delete the file
when the script finishes its work
It would also be good design to have the script delete the file if a signal indicates that the program is to be terminated prematurely.

bash provides a mechanism for this purpose known as a trap
Traps are implemented with the appropriately named builtin command, trap. trap uses the following syntax:

trap argument signal [signal...]

where argument is a string which will be read and treated as a command, and signal is the specification of a signal that will trigger the execution of the interpreted command.

Here is a simple example:
#!/bin/bash
# trap-demo : simple signal handling demo

trap "echo 'I am ignoring you.'" SIGINT SIGTERM

for i in {1..5}; do
    echo "Iteration $i of 5"
    sleep 5
done

This script defines a trap that will execute an echo command each time either the SIGINT or SIGTERM signal is received while the script is running
Here is what happens when the user attempts to stop the script by pressing Ctrl-c:

[me@linuxbox ~]$ trap-demo
Iteration 1 of 5
Iteration 2 of 5
I am ignoring you.
Iteration 3 of 5
I am ignoring you.
Iteration 4 of 5
Iteration 5 of 5

Constructing a string to form a useful sequence of commands can be awkward, so it is
common practice to specify a shell function as the command
Here is an example using shell functions as the commands:

#!/bin/bash

# trap-demo2 : simple signal handling demo with functions

exit_on_signal_SIGINT () {
    echo "Script interrupted." 2>&1
    exit 0
}

exit_on_signal_SIGTERM () {
    echo "Script terminated." 2>&1
    exit 0
}

trap exit_on_signal_SIGINT SIGINT
trap exit_on_signal_SIGTERM SIGTERM

for i in {1..5}; do
    echo "Iteration $i of 5"
    sleep 5
done

This script features two trap commands, one for each signal
...
Note the inclusion of an exit command in each of the signal-handling functions
Without an exit, the script would resume after the handler function completes.

When the user presses Ctrl-c during the execution of this script, the results look like
this:
[me@linuxbox ~]$ trap-demo2
Iteration 1 of 5
Iteration 2 of 5
Script interrupted.
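A common real-world use of trap is cleaning up temporary files when a script ends for any reason. Here is a minimal sketch (the temporary file comes from mktemp; the subshell is only there so the EXIT trap fires immediately for demonstration):

```shell
#!/bin/bash
# Sketch: a trap on EXIT that removes a temporary file.
tmpfile=$(mktemp)
(
    trap 'rm -f "$tmpfile"' EXIT   # runs when this subshell exits
    echo "scratch data" > "$tmpfile"
)
# By the time the subshell has exited, the trap has removed the file.
if [ -f "$tmpfile" ]; then remains=yes; else remains=no; fi
echo $remains
```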
Temporary Files

One reason signal handlers are included in scripts is to remove temporary files that the script may create during execution. There is something of an art to naming temporary files. Traditionally, programs on Unix-like systems create their temporary files in the /tmp directory, a shared directory intended for such files.
However, since the directory is shared, this poses certain
security concerns, particularly for programs running with superuser privileges
Aside from setting proper permissions for files exposed to all users of the system, it is important to give temporary files non-predictable filenames. This avoids an exploit known as a temp race attack.
One way of creating a non-predictable (but still descriptive) filename is to do something like this:

tempfile=/tmp/$(basename $0).$$.$RANDOM

This will create a filename consisting of the program’s name, followed by its
process ID (PID), followed by a random integer
Note, however, that the $RANDOM shell variable only returns a value in the range of 1-32767, which is not a very large range in computer terms, so a single instance of the variable is not sufficient to overcome a determined attacker.



A better way is to use the mktemp program (not to be confused with the mktemp
standard library function) to both name and create the temporary file
mktemp accepts a template as an argument that is used to build the filename.
The template should include a series of “X” characters, which are replaced
by a corresponding number of random letters and numbers
The more "X" characters included, the more unique the filename.
Here is an example:
tempfile=$(mktemp /tmp/foobar.$$.XXXXXXXXXX)

The “X” characters in the template are replaced with random letters and numbers
so that the final filename (which, in this example, also includes the expanded
value of the special parameter $$ to obtain the PID) might be something like:
/tmp/foobar.PID.UOZuvM6654

(where PID stands for the actual expanded process ID)

For scripts that are executed by regular users, it may be wise to avoid the use of
the /tmp directory and create a directory for temporary files within the user’s
home directory, with a line of code such as this:
[[ -d $HOME/tmp ]] || mkdir $HOME/tmp

Asynchronous Execution
It is sometimes desirable to perform more than one task at the same time
...

Scripts can be constructed to behave in a multitasking fashion
...
However, when a
series of scripts runs this way, there can be problems keeping the parent and child coordinated
bash has a builtin command to help manage asynchronous execution such as this. The wait command causes a parent script to pause until a specified process (i.e., the child script) finishes.

wait

We will demonstrate the wait command first.
To do this, we will need two scripts, a parent script:

#!/bin/bash
# async-parent : Asynchronous execution demo (parent)
echo "Parent: starting..."

echo "Parent: launching child script..."
async-child &
pid=$!
echo "Parent: child (PID= $pid) launched."

echo "Parent: continuing..."
sleep 2

echo "Parent: pausing to wait for child to finish..."
wait $pid

echo "Parent: child is finished. Continuing..."
echo "Parent: parent is done. Exiting."

And a child script:

#!/bin/bash

# async-child : Asynchronous execution demo (child)

echo "Child: child is running..."
sleep 5
echo "Child: child is done. Exiting."

In this example, we see that the child script is very simple
...
In the parent script, the child script is launched and put into the
background
The process ID of the child script is recorded by assigning the pid variable the value of the $! shell parameter, which always contains the process ID of the last job put into the background.

The parent script continues and then executes a wait command with the PID of the child
process
This causes the parent script to pause until the child script exits, at which point the parent script concludes.

When executed, the parent and child scripts produce the following output:
[me@linuxbox ~]$ async-parent
Parent: starting...
Parent: launching child script...
Parent: child (PID= 6741) launched.
Parent: continuing...
Child: child is running...
Parent: pausing to wait for child to finish...
Child: child is done. Exiting.
Parent: child is finished. Continuing...
Parent: parent is done. Exiting.
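The parent/child pattern above can be condensed into a single self-contained sketch (here the "child" is just a background subshell writing to a temporary file from mktemp):

```shell
#!/bin/bash
# Sketch: launching a background job, recording its PID with $!,
# and pausing with wait until it finishes.
out=$(mktemp)
(sleep 1; echo "child done") > "$out" &
pid=$!
wait "$pid"                 # pause until the background job exits
result=$(cat "$out")
rm -f "$out"
echo "$result"
```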


Named Pipes
In most Unix-like systems, it is possible to create a special type of file called a named
pipe
...
They are not that popular, but they’re good to know
about
...

The most widely used type of client-server system is, of course, a web browser communicating with a web server
...

Named pipes behave like files, but actually form first-in first-out (FIFO) buffers
As with ordinary (unnamed) pipes, data goes in one end and comes out the other.
With named
pipes, it is possible to set up something like this:
process1 > named_pipe
and
process2 < named_pipe
and it will behave as if:
process1 | process2

Setting Up A Named Pipe
First, we must create a named pipe. This is done using the mkfifo command:

[me@linuxbox ~]$ mkfifo pipe1
[me@linuxbox ~]$ ls -l pipe1

Here we use mkfifo to create a named pipe called pipe1. Using ls, we examine the file and see that the first letter in the attributes field is "p", indicating that it is a named pipe.

Using A Named Pipe

To demonstrate how the named pipe works, we will need two terminal windows.
In the first terminal, we enter a simple command and redirect its output to the named pipe:
[me@linuxbox ~]$ ls -l > pipe1

After we press the Enter key, the command will appear to hang
This is because there is nothing receiving data from the other end of the pipe yet.
When this occurs, it is said that
the pipe is blocked
This condition will end when we attach a process to the other end and it begins to read input from the pipe.
Using the second terminal window, we enter
this command:
[me@linuxbox ~]$ cat < pipe1

and the directory listing produced from the first terminal window appears in the second
terminal as the output from the cat command
The ls command in the first terminal completes once it is no longer blocked.
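The same two-terminal exchange can be sketched in one script by putting the writer in the background (the pipe's pathname comes from mktemp -u, which generates an unused name; mkfifo then creates the pipe there):

```shell
#!/bin/bash
# Sketch: two processes communicating through a named pipe.
fifo=$(mktemp -u)
mkfifo "$fifo"
echo "hello through the pipe" > "$fifo" &   # the writer blocks until a reader attaches
msg=$(cat "$fifo")                          # the reader unblocks the writer
wait
rm -f "$fifo"
echo "$msg"
```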


Summing Up
Well, we have completed our journey. The only thing left to do now is practice, practice, practice. Even though we covered a lot of ground in our trek, we barely scratched the surface as far as the command line goes. There are still thousands of command line programs left to discover and enjoy. Start digging around in /usr/bin and you'll see!

Further Reading

The "Compound Commands" section of the bash man page contains a full description of group command and subshell notations. The "EXPANSION" section covers process substitution.

Linux Journal has two good articles on named pipes, available at:
http://www.linuxjournal.com

The Advanced Bash-Scripting Guide also has a discussion of process substitution:
http://tldp.org/LDP/abs/html/process-sub.html
333
absolute pathnames
...
50, 126
aliases
...
160
American Standard Code for Information
Interchange (see ASCII)
...
247
anonymous FTP servers
...
160
ANSI escape codes
...
SYS
...
118
apropos command
...
169
apt-get command
...

aptitude command
...
230
arithmetic expansion
...
70, 453, 464, 467
arithmetic operators
...
391, 464
arrays
...
483
assigning values
...
485, 488
creating
...
484
determine number of elements
...
483
index
...
478
reading variables into
...
484
subscript
...
478

ASCII
...
157
carriage return
...
251, 253, 387
control codes
...
320
linefeed character
...
221
printable characters
...
17
aspell command
...
341
assembly language
...
467
associative arrays
...
496
audio CDs
...
299, 473

B
back references
...

backslash escape sequences
...
156
backups, incremental
...
440
bash
...
48
basic regular expressions 254, 262p
...
473
Berkeley Software Distribution
...
116
binary
...
96
bit operators
...
2, 6
brace expansion
...
381

501

Index
break command
...
39
BSD style
...
182
bugs
...
346
bzip2 command
...
341, 453, 468, 471
C++
...
4
cancel command
...
18, 77p
...
, 266, 298, 330
case compound command
...
462
cat command
...
9, 11
CD-ROMs
...
, 191
cdrecord command
...
192
character classes
...
, 248, 250p
...
27, 249p
...
103
child process
...
92, 105, 356
chown command
...
361
chronological sorting
...
200, 202
client-server architecture
...
341
collation order
...
253, 387
dictionary
...
253
comm command
...
3, 83
command line
...
436
editing
...
67
history
...
xvii, 28
command options
...
73, 75, 451
commands
...
14, 436
determining type
...
44
executable program files
...
99
long options
...
14
comments
...
329, 339
comparison operators
...
341
compiling
...
81
compound commands
...
429
for
...
381
until
...
410
(( ))
...
389, 406
compression algorithms
...
396, 420
configuration files
...
346
constants
...
412
control characters
...
77, 251
control operators
...
394, 406
||
...
109
COPYING
...

in vim
...
80
with X Window System
...
45, 48p
...
62
cp command
...
108p
...
211
crossword puzzles
...
304
CUPS
...
8
cursor movement
...
276, 461

D
daemon programs
...
226
data redundancy
...
389
date command
...
273
dd command
...
166
Debian Style (
...
167
debugging
...
463
defensive programming
...
76, 271, 274
dependencies
...
422p
...
174, 341
device names
...
20
df command
...
342
dictionary collation order
...
284
Digital Restrictions Management (DRM)
...

archiving
...
9
copying
...
28, 34
current working
...
31, 39
hierarchical
...
21, 90, 379
listing
...
30, 36
navigating
...
126
parent
...
126
PWD variable
...
31, 39
renaming
...
7
shared
...
98
synchronizing
...
238
viewing contents
...
177
DISPLAY variable
...
27
dos2unix command
...
75

dpkg command
...
269, 379
Dynamic Host Configuration Protocol (DHCP) 199

E
echo command
...
78
-n option
...
423
EDITOR variable
...
98
effective user ID
...
388
email
...
341
empty variables
...
206
encryption
...
59, 369
endless loop
...
336
environment
...
124
establishing
...
124
login shell
...
124
shell variables
...
127
subshells
...
124
eqn command
...
347
executable program files
...

determining location
...
126
exit command
...
382, 386
expand command
...
67
arithmetic
...
71, 75, 451
command substitution
...
76
errors resulting from
...
84, 86
parameter
...
68, 75, 451
tilde
...
74p
...

arithmetic
...
396, 420
ext3
...
254
Extensible Markup Language
...
383
fdformat command
...
185
fg command
...
498
file command
...
56
file system corruption
...
199
filenames
...
11
embedded spaces in
...
12
hidden
...

access
...
230, 236
attributes
...
91
block special device
...
92
changing owner and group owner
...
91
character special device
...
226
configuration
...
28, 34
copying over a network
...
55
deb
...
31, 39, 218
determining contents
...
20
execution access
...
384
finding
...
11
iso image
...

listing
...
91
moving
...
92
permissions
...
90
regular
...
31, 39
renaming
...
166
shared library
...
127
sticky bit
...
212
synchronizing
...
495
text
...
199, 235, 238
truncating
...
90
viewing contents
...
90
filters
...
211, 234
findutils package
...
361
firewalls
...
498
floppy disks
...

branching
...
429
elif statement
...
413
for compound command
...
450
function statement
...
381
looping
...
406
multiple-choice decisions
...
414
terminating a loop
...
493
until loop
...
410
fmt command
...
4
fold command
...
450
for loop
...
166
Fortran programming language
...
5, 181

Index
Free Software Foundation
...
189
ftp command
...
200, 370
FUNCNAME variable
...
374

G
gcc
...
114, 131
genisoimage command
...
166
getopts command
...
329
gid
...
376
globbing
...
2, 27, 40, 95, 131, 208
gnome-terminal
...
452
GNU C Compiler
...
45, 48p
...
225
GNU Project
...
48
GNU/Linux
...
xvii
grep command
...
318
group commands
...
89
effective group ID
...
89
primary group ID
...
98
GUI
...
227
gzip command
...
176
hard links
...
37
listing
...
63
header files
...
355
help command
...
369

here strings
...
93, 465
hidden files
...
7
high-level programming languages
...

expansion
...
84
history command
...
21
root account
...
90
home directory
...
126
hostname
...
265, 299, 319, 361, 371, 373
Hypertext Markup Language
...
53
id command
...
183
if compound command
...
402
IMCP ECHO_REQUEST
...
234
info files
...
108
init scripts
...
37
INSTALL
...
167
integers
...
70, 473
division
...
388
interactivity
...
402
interpreted languages
...
342
interpreter
...
191p
...
180, 192

J
job control
...
115
jobspec
...
281

505

Index
Joliet extensions
...
137

K
kate command
...
2, 27, 40, 95, 131, 208
kedit command
...
xvi, xixp
...
271
kill command
...
120
killing text
...
318
Konqueror
...
2
kwrite command
...
126, 251, 253
less command
...
202
libraries
...
xxi
line continuation character
...
137
line-continuation character
...
341
linking
...

broken
...
33
hard
...
23, 34
Linux community
...
166
CentOS
...
166p
...
xix, 89, 167, 336
Foresight
...
166
Linspire
...
167
OpenSUSE
...
166
PCLinuxOS
...
167
Slackware
...
xix, 166p
...
167

506

Linux Filesystem Hierarchy Standard
...
xvi, xixp
...
174
literal characters
...
xix
ln command
...
376
locale
...
253
localhost
...
209, 261
logical errors
...
392
logical operators
...
214, 218
login prompt
...
90, 99, 127
long options
...
199
looping
...
420, 466, 469, 486, 492
lossless compression
...
227
lowercase to uppercase conversion
...
332
lpq command
...
331
lprm command
...
336
ls command
...
16
viewing file attributes
...
202
LVM (Logical Volume Manager)
...
340
maintenance
...
347
Makefile
...
45
man pages
...
265, 319
memory
...
109
displaying free
...
111
segmentation violation
...
111

Index
viewing usage
...
111
menu-driven programs
...
81
meta sequences
...
246
metadata
...
28, 34
mkfifo command
...
188, 190
mkisofs command
...
496
mnemonics
...
139
monospaced fonts
...
137
more command
...
178, 192
mount points
...
177
MP3
...
88
multiple-choice decisions
...
88, 108, 496
mv command
...
498
nano command
...
27, 95, 208
netstat command
...
195
anonymous FTP servers
...
199
Dynamic Host Configuration Protocol (DHCP)

...
206
examine network settings and statistics
...
199
firewalls
...
200
Local Area Network
...
199
man in the middle attacks
...
198
secure communication with remote hosts
...
196
tracing the route to a host
...
238
transporting files
...
206
newline character
...
76
NEWS
...
305
nroff command
...
221
number bases
...
93, 465, 481
Ogg Vorbis
...
126
OpenOffice
...
18, xxp
...
203
operators
...
70, 465
assignment
...
419
comparison
...
471
owning files
...
167
package maintainers
...
166
deb
...
deb)
...
169
high-level tools
...
169
low-level tools
...
167
Red Hat Style (
...
167
removing packages
...
166
updating packages
...
166
page description language
...
126
pagers
...
72, 75, 456
parent directory
...
108
passwd command
...
106
paste command
...
183

507

Index
patch command ... 285
PATH variable ... 68, 75, 451
pathnames ... 9
    completion ... 9
PDF ... 42, 243, 299, 341, 473
permissions ... 341
ping command ... 60, 404, 491
in command substitution ... 346, 380, 394
portable ... 321, 331
Portable Operating System Interface ... 436, 457p
... 192, 251, 254p
... 26p
... 253, 257, 289, 299
PostScript ... 313, 329
primary group ID ... 251
printenv command ... 181
printers ... 181
    control codes ... 327
    device names ... 329
    graphical ... 327
    laser ... 314, 455
printing ... 336
    history of ... 337
    monospaced fonts ... 329
    pretty ... 336
    proportional fonts ... 337
    spooling ... 338


    viewing jobs ... 109
process substitution ... 108
background ... 108
controlling ... 115
interrupting ... 115
killing ... 110
parent ... 109
process ID ... 494
signals ... 494
sleeping ... 110
stopping ... 109, 111
zombie ... 422
programmable completion ... 109
PS1 variable ... 363
ps2pdf command ... 426
pseudocode ... 121
PuTTY ... 8
PWD variable ... 341

Q
quoting ... 75
    escape character ... 417
    single quotes ... 176

R
raster image processor ... 398, 408, 414, 422, 491

Readline ... 49, 344
redirection ... 499
    group commands and subshells ... 369
    here strings ... 55
    standard input ... 54
redirection operators ... 57
    &>> ... 59
    <(list) ... 369p
    ... 370
    <<< ... 54
    >(list) ... 55
    | ... 62, 243, 295, 389, 403
anchors ... 263, 294p
... 254, 262p
... 254
relational databases ... 9
release early, release often ... 61
REPLY variable ... 361
repositories ... 375, 386
reusable ... 329
rlogin command ... 31
Rock Ridge extensions ... 318
ROT13 encoding ... 166
rpm command ... 238
rsync remote-update protocol ... 341

S
scalar variables ... 192
scp command ... 86
scripting languages ... 304
searching a file for patterns ... 84
Secure Shell ... 290, 322, 461
set command ... 98
setuid ... 229
sftp command ... 21, 168
shebang ... 42
shell functions ... 354
SHELL variable ... 124
shift command ... 494
signals ... 76
Slackware ... 411
soft link ... 61, 267
sort keys ... 166p
... 135, 357
source tree ... 441, 458
split command ... 203
ssh command ... 88
Stallman, Richard ... 53p
... 57
redirecting to a file ... 53, 370, 398
redirecting ... 53
appending to a file ... 57
redirecting standard error to ... 54
startup files ... 223
sticky bit ... 176
audio CDs ... 179p
... 185
device names ... 177
FAT32 ... 183, 189
formatting ... 179
mount points ... 185
reading and writing directly ... 189
unmounting ... 190
stream editor ...
expressions ... 459
length of ... 461
remove leading portion of ... 460
${parameter:offset:length} ... 459
strings command ... 377, 422
style ... 99
subshells ... 99, 101
Sun Microsystems ... 2, 90, 100, 120
symbolic links ... 38, 40
    listing ... 23
syntax errors ... 354, 359

T
tables ... 271, 317
tail command ... 230
tar command ... 343


targets ... 113
Tatham, Simon ... 318, 322
tee command ... 109
telnet command ... 127
terminal emulators ...
controlling terminal ...
bashrc ... 99
exiting ... 99, 127
TERM variable ... 499
virtual ... 88
terminals ... 160, 318, 327
ternary operator ... 423
test command ... 423
testing ...
TEX ... 17
adjusting line length ... 17
carriage return ... 283
converting MS-DOS to Unix ... 62
cutting ... 275
deleting multiple blank lines ... 284
displaying common lines ... 266
DOS format ... 126
editors ... 279
files ... 61
folding ... 305
formatting for typesetters ... 322
joining ... 267

lowercase to uppercase conversion ... 267, 305
paginating ... 280
preparing for printing ... 61
rendering in PostScript ... 290
searching for patterns ... 61, 267
spell checking ... 294
substituting tabs for spaces ... 278
transliterating characters ... 267
viewing with less ... 130, 264, 288
emacs ... 354
gedit ... 288
kate ... 131
kwrite ... 137
nano ... 131
stream ... 354, 359
vi ... 131, 354, 359
visual ... 69, 75
tload command ... 111
top-down design ... xvi, xxi
touch command ... 239, 349, 446
tr command ... 197
tracing ... 288
traps ... 318
true command ... 109
type command ... 318, 328
TZ variable ... 89, 102, 166, 250, 357

U
umask command ... 181
unalias command ... 419
unary operators ... 279
unexpected token ... 61, 275
Unix ... 331
unix2dos command ... 484
until compound command ... 413
unzip command ... 211
upstream providers ... 373
uptime command ... 176, 190
Usenet ... 125, 127
users ... 89
    changing identity ... 106
    effective user ID ... 90
    identity ... 90
    setting default permissions ... 98
    superuser ... 107
    /etc/passwd ... 90

V
validating input ... 72, 364, 456
assigning values ... 366
declaring ... 124
global ... 376
names ... 478
shell ... 188
vi command ... 263, 359
virtual consoles ... 206
virtual terminals ... 137
vmstat command ... 496

W
wc command ... 265
wget command ... 327
whatis command ... 43, 73
while compound command ... 26, 58, 67, 243, 250
wodim command ... 74pp
... 89
WYSIWYG ... 3, 88, 206

X
xargs command ... 121
xlogo command ... 265

Y
yanking text ... 169

Z
zgrep command ... 236
zless command ... 45


./configure ...
.bash_history ...
.bash_login ...
.bash_profile ...
.bashrc ...
.profile ...
.ssh/known_hosts ... 464, 470

[
[ command ... 20
/bin ... 20
/boot/grub/grub ... 20
/boot/vmlinuz ... 20
/dev/cdrom ... 183
/dev/floppy ... 57
/etc ...
bashrc ... 21
/etc/fstab ... 90
/etc/passwd ... 127, 129
/etc/shadow ... 99
/lib ... 21
/media ... 21
/opt ... 22
/root ... 22
/tmp ... 22
/usr/bin ... 22
/usr/local ... 22, 350, 358
/usr/local/sbin ... 22

/usr/share ... 247
/usr/share/doc ... 23
/var/log ... 23, 64, 183
/var/log/syslog ... 442, 497
$((expression)) ... 483
${!array[*]} ... 459
${!prefix*} ... 459
${parameter,,} ... 463
${parameter:-word} ... 458
${parameter:+word} ... 457
${parameter//pattern/string} ... 461
${parameter/%pattern/string} ... 461
${parameter##pattern} ... 460
${parameter%%pattern} ... 460
${parameter^} ... 463
$@ ... 441, 449
$# ... 441
